SFDV3007 Lab 1: SQL, PL/SQL, and Physical Tuning
1 Introduction
This first lab is an opportunity to refresh your memory of the Commerce laboratory environment, SQL, Oracle PL/SQL, and TOAD. Your tasks will include logging in to Oracle from TOAD, changing your password, creating some tables, defining a few indexes, and (optionally) designing and writing some PL/SQL code (these activities are mainly intended as general revision for students not enrolled in SFDV3008). You will also be introduced to some of the physical-level aspects of Oracle10g, and some of the tools and techniques that you will be using to diagnose and resolve physical performance issues.
Because this is a 300-level course, we will be expecting a little more independent work from you as students. The lab sessions will be one hour and 50 minutes long, but each week’s lab exercises will probably take longer than this to complete. Therefore, you should treat the lab classes as help sessions, and aim to attempt at least some of the work and reading in advance, in your own time. If you encounter any technical or other issues outside your lab time, please let us know, either by e-mail, the discussion board on Blackboard, or in person.
This first lab document also includes an explanation of a number of tools and techniques you can use in Oracle for diagnosing and resolving physical tuning and performance issues. You may need to refer to this material again in the coming labs.
As with SFDV3002, your primary technical resource is the Oracle documentation set. You should already be familiar with the general organisation of the Oracle documentation from last year. You can get to the Oracle documentation through the Other Documents area on the SFDV3007 site on Blackboard, or by pointing your Web browser to **insert documentation URL**. We suggest you bookmark the main page in your browser for future reference.
This week’s lab exercise involves a simple data model for recording details about students at a university and the courses they have enrolled for (see Figure 1). For the sake of simplicity, we are only concerned with enrolments for the current year, and are assuming that all papers are full-year (i.e. not semesterised).
Figure 1. Entity-Relationship Diagram of the Student Enrolment Data Model
2 Physical Tuning in Oracle
Much of the lab work for the first half of this course will involve physically tuning your database: making changes at the physical database level to improve the performance of database operations (especially queries). Recall that the key to being able to do this is physical data independence, which insulates users from the internal complexity of the database. There are a number of tools and resources available in the Oracle environment that can assist you in tuning your database, including:
• the Oracle data dictionary views;
• the SQL EXPLAIN PLAN command;
• the SQL Trace facility; and
• the ROWID pseudocolumn.
These will become more familiar to you over the coming weeks, but we provide the following brief introduction to physical tuning in the meantime. You will find this handout useful to consult during the next few labs.
2.1 The Oracle Data Dictionary
The data dictionary (or system catalog) is a collection of relations containing metadata, that is, data pertaining to the contents of the database. Because the data dictionary is itself a relational database, you can query it using SQL statements as you would with any SQL database. For example, to find out about the tables in the database, you can query the USER_TABLES data dictionary view. The data dictionary also includes information about the physical level of the database, including physical-level attributes of the tables, and data on any indexes, clusters, partitioned tables, extents, and materialised views that have been created.
Initially, some of the attributes in the data dictionary views will be empty (null). The DBMS maintains a certain amount of information about the physical-level characteristics of tables and views, as it helps the optimiser choose efficient physical access paths, but it is prohibitively expensive for the DBMS to maintain comprehensive physical metadata in real time. To gather this extra data, you have a couple of options: the ANALYZE SQL statement, and the GATHER_SCHEMA_STATS built-in PL/SQL procedure. The ANALYZE statement is slightly simpler to run, and is useful if you just want to analyse a single schema object. The GATHER_SCHEMA_STATS procedure, which belongs to the DBMS_Stats package, is more complicated to run, but can be used to analyse all schema objects in a user’s schema. Neither technique produces any direct output, so check the status bar in TOAD to make sure the code has run.
Version 10g of Oracle introduces some degree of automatic analysis of schema objects, so you will sometimes find that a schema object already has full statistics, even though you have not analysed it yourself. However, to guarantee up-to-date statistics when tuning, you should always analyse the objects yourself first, before running EXPLAIN PLAN or using SQL Trace.
Here is a typical example of the ANALYZE statement, which you would run in TOAD’s SQL Window.
    analyze table Student compute statistics;
Here is an example call to GATHER_SCHEMA_STATS. Note that you must specify your own username in the OwnName argument. Since this is an anonymous PL/SQL block, you would run it in the TOAD SQL Window as well.
    begin
      DBMS_Stats.Gather_Schema_Stats(
        OwnName=>'edwch193l',
        Granularity=>'ALL',
        Cascade=>TRUE
      );
    end;
This may take some time to run if your schema contains many large schema objects. Setting the CASCADE parameter to TRUE causes the DBMS to analyse all indexes on any tables it encounters. The GRANULARITY parameter controls the level of detail of the analysis of partitioned tables.
Alternatively, you can use TOAD's built-in facility for analysing schema objects, which you can invoke using the Tools → Analyze All Objects menu item.
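As a rough illustration of how the analysis options relate to the data dictionary, the following sketch analyses one index, gathers statistics for a single table with the DBMS_Stats.Gather_Table_Stats procedure, and then checks a few of the statistics columns in USER_TABLES. It assumes that a Student table with a Student_PK index already exists in your schema; substitute your own object names as required.

```sql
-- Analyse a single index (Student_PK is assumed to exist; use one of your own).
analyze index Student_PK compute statistics;

-- Gather statistics for one table only, including any indexes on it.
begin
  DBMS_Stats.Gather_Table_Stats(
    OwnName => user,        -- the schema you are currently logged in to
    TabName => 'STUDENT',
    Cascade => TRUE
  );
end;
/

-- See which statistics the data dictionary now holds for the table.
select Table_Name, Num_Rows, Blocks, Last_Analyzed
from   User_Tables
where  Table_Name = 'STUDENT';
```

If Last_Analyzed is still null after running this, the statistics were probably not gathered, and it is worth checking the TOAD status bar for errors.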
2.2 The EXPLAIN PLAN Command
The optimiser is an expert system built into a DBMS that decides how to run a particular database operation at the physical level. As most complex operations are queries (SELECT statements), the behaviour of the optimiser is geared towards optimising query performance. You can use the EXPLAIN PLAN SQL command in Oracle to find out what query plan the optimiser is choosing for a given query. This can be used to:
• Identify whether a tuning method such as indexing or replication is being used by the system.
• Determine how much of a performance gain is effected by the tuning method being tested.
• Identify redundant indexes that are no longer effective and should be deleted.
To use the EXPLAIN PLAN command in plain SQL is a little complicated; you can read about how to do this in the Oracle documentation if you need to. In TOAD, however, there is built-in functionality to run EXPLAIN PLAN on the current SQL statement and display the details of the query plan immediately. We recommend you use this feature in carrying out your lab work for INFO 321.
To run EXPLAIN PLAN in TOAD:
1. Click on or highlight the statement you wish to test.
2. Open the SQL-Window menu and click Explain Plan Current SQL. Alternatively, use the keyboard shortcut Ctrl-E.
Suppose we wished to find out the explain plan for the following query:
    select count(*) from Student;
The output from EXPLAIN PLAN for this query would probably resemble the following:
    Operation                      Object Name   Rows   Bytes   Cost
    SELECT STATEMENT Hint=CHOOSE                 1              1
      SORT AGGREGATE                             1
        INDEX FULL SCAN            STUDENT_PK    1
In the Operation column, you can see a list of the physical operations that the optimiser has planned for this query. This is actually a tree of operations: the top-level SELECT statement involves a SORT sub-task, which in turn involves a full index scan. The Object Name column identifies the name of the schema object involved in each step of the query plan, most commonly an index or a table. In this case, the STUDENT_PK index is used in the index full scan operation. Note that the STUDENT table does not appear in the Object Name column for any of the operations: the DBMS does not need to look in the table to perform this query, because the number of rows can be determined more efficiently by using the primary key index.
Also associated with most operations will be an approximation of the number of rows involved, and a cost, which is probably the most important figure, as it indicates the optimiser’s estimate of the amount of effort required for the query plan. A higher cost indicates that the DBMS expects to do more work, so in tuning your database you are striving to lower the cost of common queries as much as possible.
If you find that EXPLAIN PLAN is not returning values for the Bytes and Cost attributes, the data dictionary probably does not contain the relevant physical statistics for the schema objects involved. Use the ANALYZE statement or the GATHER_SCHEMA_STATS procedure to analyse the schema objects, and try again.
A common use of EXPLAIN PLAN is to determine whether a physical tuning method is being used, and how effective it is. This means that you have to examine the cost of the plan before and after the method is implemented. Here is the typical procedure for testing a tuning method using EXPLAIN PLAN:
1. Run GATHER_SCHEMA_STATS to ensure that the data dictionary is up-to-date.
2. Run EXPLAIN PLAN on the statement(s) you expect to be helped by the tuning method in question.
3. Implement the tuning method (create indexes, materialised views, clusters, etc.).
4. Run GATHER_SCHEMA_STATS to ensure that the data dictionary is up-to-date for the new schema objects.
5. Run EXPLAIN PLAN on the statement(s) being tested, and check whether the optimiser is using the new objects, and whether the cost has been reduced.
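For those who want to see what the five-step procedure looks like in plain SQL rather than through TOAD's Ctrl-E shortcut, here is a minimal sketch. It uses the EXPLAIN PLAN FOR statement together with the built-in DBMS_XPlan.Display table function to show the plan. The test query, the Given_Names column, and the index name Student_Given_Names_Idx are illustrative assumptions only, and the sketch assumes that a plan table is available to your account (this is normally the case in Oracle10g).

```sql
-- 1. Make sure the statistics for the current schema are up-to-date.
begin
  DBMS_Stats.Gather_Schema_Stats(OwnName => user, Cascade => TRUE);
end;
/

-- 2. Record and display the query plan before tuning.
explain plan for
  select * from Student where Given_Names = 'Lisa';

select * from table(DBMS_XPlan.Display);

-- 3. Implement the tuning method (a hypothetical index for this test query).
create index Student_Given_Names_Idx on Student (Given_Names);

-- 4. Gather statistics again so that the new index is analysed too.
begin
  DBMS_Stats.Gather_Schema_Stats(OwnName => user, Cascade => TRUE);
end;
/

-- 5. Record and display the plan again, then compare operations and cost.
explain plan for
  select * from Student where Given_Names = 'Lisa';

select * from table(DBMS_XPlan.Display);
```

On a very small table the optimiser may well ignore the new index, so do not be surprised if the two plans turn out to be identical.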
2.3 SQL Trace
SQL Trace is a facility in Oracle that allows a database administrator to account for the database server resources used during a session. By finding out how much CPU time, memory, and disk I/O was involved, performance bottlenecks can be identified and remedied. SQL Trace gives similar information to the EXPLAIN PLAN command, but is more detailed in its summary of resources used, and is intended for tuning the physical hardware rather than the query planning of the optimiser. Because it gives a more complete picture of the I/O performed for your queries and other statements, it can also be useful in testing the effectiveness of your physical tuning strategies.
The output from SQL Trace is written to a file that resides on the database server, which makes it a little less convenient to use than EXPLAIN PLAN. The output file can then be converted into a more readable format using the TKPROF command-line utility. SQL Trace will be used again in future labs.
To use SQL Trace, follow this procedure:
1. Tell Oracle to use your username in the trace file name.
    alter session set tracefile_identifier = 'edwch193';
2. Enable SQL Trace for your current session using the ALTER SESSION command in TOAD:
    alter session set sql_trace = true;
3. Perform the database operations you wish to diagnose. These should be minimal to ensure your results are meaningful. In particular, do not execute the statement(s) you wish to test more than once!
4. Turn off SQL Trace for your session.
    alter session set sql_trace = false;
5. Next, locate the trace file in Windows Explorer. Enter the path **path to SQL Trace share on Oracle server** in the Address field, and authenticate yourself using your commerce domain student usercode (e.g. commerce\edwch193) and your student ID number. Then look for the file containing your usercode, and copy the filename to the clipboard (you will need it later).
6. Run the Windows Command Interpreter (Start → Run..., cmd).
7. In the Windows Command Interpreter, run the tkprof command on your trace file, saving the output to your S: drive (paste the actual filename from the clipboard).
    tkprof **path to SQL Trace share**\o321a_ora_2544_edwch193.trc s:\sqltrace.txt
8. Read and interpret the output from TKPROF.
    notepad s:\sqltrace.txt
9. Delete your trace file from \\info-nts-01\udump321a$.
    del **path to SQL Trace share**\o321a_ora_2544_edwch193.trc
For more comprehensive information on SQL Trace and TKPROF, consult Using SQL Trace and TKPROF in the Oracle Performance Tuning Guide.
2.4 ROWIDs
The final physical-level concept we introduce this week is the ROWID. Although database users perceive database tables simply as collections of rows, which can be arranged in any order, at the physical level the DBMS must maintain strict control over where the data for each row are actually stored on disk. It does this through the use of ROWIDs: unique physical-level addresses that identify where a record is located in the physical database.
Because they identify the exact physical location of records, ROWIDs represent the fastest possible access path to data in the database. To access a record via its ROWID is known as direct access. In practice, however, ROWIDs are seldom known in advance, and they may change over time (particularly when data are physically reorganised), so pure direct access is not normally possible. ROWIDs are, however, the basis of how indexes work.
It is sometimes useful to be able to find out the ROWID associated with a particular row. You can do this by including the ROWID pseudocolumn in the SELECT clause of a SELECT statement. Note that you need to enable the display of ROWIDs in TOAD from the View → Options → Data Grids - Data option first. For example:
    select RowID, ID, Family_Name, Given_Names
    from Student
    order by RowID;

    ROWID              ID FAMILY_NAME GIVEN_NAMES
    ------------------ -- ----------- -----------
    AAAG++AAGAAAAANAAA  1 Jones       Jimbo
    AAAG++AAGAAAAANAAB  2 Simpson     Lisa

    2 rows selected
As you can see from the output, the ROWID values are not exactly self-explanatory! They do actually contain some useful information, which you can decode using the RowID_Info function available on Blackboard. After you have downloaded and run the script in TOAD, you can use it as follows:
    select RowID_Info(RowID), ID, Family_Name, Given_Names
    from Student
    order by RowID;

    ROWID_INFO(ROWID)                        ID FAMILY_NAME GIVEN_NAMES
    ---------------------------------------- -- ----------- -----------
    {Segment=28606, File=6, Block=13, Row=0}  1 Jones       Jimbo
    {Segment=28606, File=6, Block=13, Row=1}  2 Simpson     Lisa

    2 rows selected
This is much more intelligible—clearly, the two rows are being stored in the same database block in the same data file on disk. We will be using ROWIDs to test data partitioning and clustering in the next two labs. Note that systems other than Oracle may use different terminology and functions for dealing with physical row addresses; for example, PostgreSQL refers to them as tuple identifiers and uses the ctid system column to refer to them.
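If the RowID_Info function is not to hand, Oracle's built-in DBMS_RowID package can extract the same kinds of components. The sketch below is only an approximation of what RowID_Info reports (we have not assumed anything about how that function is actually written), and the column aliases are our own. The second query illustrates direct access by ROWID; the ROWID literal is copied from the sample output above and will almost certainly differ in your own schema.

```sql
-- Decode the components of each row's ROWID using the built-in package.
select DBMS_RowID.RowID_Object(RowID)       as Data_Object,
       DBMS_RowID.RowID_Relative_FNo(RowID) as File_No,
       DBMS_RowID.RowID_Block_Number(RowID) as Block_No,
       DBMS_RowID.RowID_Row_Number(RowID)   as Row_No,
       ID, Family_Name, Given_Names
from   Student
order by RowID;

-- Direct access: fetch one row by its physical address.
-- Replace the literal with a ROWID returned by the query above.
select ID, Family_Name, Given_Names
from   Student
where  RowID = chartorowid('AAAG++AAGAAAAANAAA');
```

Running EXPLAIN PLAN on the second query is a quick way to see how the optimiser describes access by ROWID.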
3 Hints for Executing SQL Statements in TOAD
As you may recall from INFO 212, TOAD has some idiosyncrasies when executing SQL code. The following is a summary of the various options for running SQL statements and scripts in TOAD.
To execute a statement when the SQL Editor window contains only that one statement, ensure that the statement contains no blank lines, and that there are blank lines before and after the statement, and then either:
• Press F9
• Type Ctrl-Enter
To execute one statement among many others in the window, either:
• Highlight the statement to be executed and then either:
  ◦ Press F9
  ◦ Click the “Execute Statement” button
• Click within the statement and then either:
  ◦ Type Ctrl-Enter
  ◦ Click the “Run Current Statement” button
  ◦ Type Shift-F9
To execute a series of statements in the SQL Editor window in sequence:
• Ensure that any PL/SQL code has a terminating '/' character.
• Optionally select the range of text you wish to run, and then either:
  ◦ Press F5
  ◦ Click the “Execute all of current window as script” button
In some situations, blank lines and comments will confuse TOAD’s SQL parser. If you get unexplained “invalid character” errors from the server, try using the Strip Code Statement function (Ctrl-P) or deleting any comments. If your code contains statements that contain blank lines, highlighting the statement and executing it as a single statement (F9) will normally succeed. Alternatively, if you wish to use vertical whitespace to improve the readability of your code, use a single-line comment instead of a completely empty line. In general, each SQL statement you write should be ended with a semicolon character (';'). However, when using TOAD, you may need to remove the terminating semicolons at the end of certain SQL statements, notably INSERT and DELETE. Whether you need to do this depends on what mode of execution you are using: when executing as a script, the semicolons are necessary, but when executing individual statements, TOAD may complain if your statements end in a semicolon. Note that this is an issue with TOAD—in general, according to the SQL standard, all SQL statements should be terminated with a semicolon.
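To make these hints concrete, here is a small sketch of a script laid out to be run with F5. It shows semicolons on the plain SQL statements, a terminating '/' after the PL/SQL block, and single-line comments standing in for blank lines inside a statement. The Student and Paper tables are assumed to exist already.

```sql
-- A plain SQL statement, ended with a semicolon.
select count(*) from Paper;

-- A statement broken up for readability: "--" lines replace empty lines,
-- so the statement itself contains no completely blank lines.
select Family_Name,
       Given_Names
--
from   Student
--
order by Family_Name;

-- An anonymous PL/SQL block, followed by the terminating '/' character.
begin
  DBMS_Stats.Gather_Schema_Stats(OwnName => user);
end;
/
```

If you instead run one of these statements individually (F9 or Ctrl-Enter), you may need to drop its trailing semicolon, as described above.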
4 Readings
You will need to do a certain amount of reading to carry out the laboratory exercises in INFO 321. Most of the information you need can be found in the department’s copy of the Oracle Database Documentation Library, which is available within the campus network. The tutor will demonstrate how to get to the most important sections on the Web. The following readings will help you carry out this week’s activities:
• Oracle10g SQL Reference: SQL Statements:
  ◦ CREATE TABLE
  ◦ CREATE INDEX
  ◦ SELECT
  ◦ INSERT
  ◦ ANALYZE
  ◦ EXPLAIN PLAN
• Oracle10g PL/SQL User’s Guide and Reference
• Oracle10g Performance Tuning Guide:
  ◦ I/O Configuration and Design
  ◦ Introduction to the Optimizer
  ◦ Using EXPLAIN PLAN
  ◦ Using SQL Trace
• Oracle10g Database Concepts: ROWID Datatypes
• Relevant lecture and lab notes from INFO 212 (available on Blackboard)
• The TOAD online help
You do not have to read all of the Oracle Tuning documentation references this week—just read the first few subsections of each reference to get a flavour of the concepts. However, you should aim to read most of each reference before the end of week 3.
5 Tasks
5.1 Initial Setup
1. Log into your workstation using your normal student usercode.
2. Launch the TOAD application from the Start Menu, from Course Specific Resources → Databases → TOAD 8.0. If TOAD fails to start up, you may need to install the Oracle Client software manually, by launching Course Specific Resources → Databases → Oracle10g → Install Oracle10g… or SQL*Plus from the Start Menu.
3. Select the Oracle10gR2_home Oracle Home from the Connect Using drop-down box, then select or type isorcl-321a in the Database field. Log in using your normal student usercode with “l” (for “lab work”) appended (you also have an “...a” account for assignment work). Note that your initial password is your student ID number.
4. Change your Oracle password, using the ALTER USER SQL statement. Note that you may need to enclose your password within double quotation marks, because the password is treated as an identifier, not a string. For example:
    alter user bonja007l identified by "Sh@k3nN0tSt1rr3d";
5. Confirm that your new password works by logging in using a new connection. Do not log out of your first session until you have verified that your new password works. Also, please do not forget your new password.
5.2 SQL DDL Revision
**This example is based around the way that papers are specified here at Otago. You will need to modify it to suit your local administrative environment. Paper code and description are fairly obvious. Points are effectively a measure of workload: 1 point is approximately ten hours for a semester (so 18 points = 180 hours across a semester). EFTS is a New Zealand acronym for Equivalent Full Time Student. It indicates what fraction of a full time load (108 points) the paper takes up, and is normally related to the number of points. For example, an 18-point paper is normally 0.15 of a full-time workload, so its EFTS value is 0.15. This means that each student enrolled in that paper counts as 0.15 of a student across all enrolments for the year. You can safely replace these with your own equivalents.**
1. Write SQL code to create a new table called Paper. Your table should contain the following attributes:
• Paper_Code (a 7-character string),
• Paper_Description,
• Points,
• EFTS.
All papers have an integer number of points, but EFTS (Equivalent Full-Time Student) values may require up to four decimal places. The table’s primary key is Paper_Code. In addition, the Points and EFTS attributes must be constrained to be non-null values greater than 0.
As with all INFO 321 lab exercises, be sure to save your SQL code as you go. While you may be able to reverse-engineer an approximation of your original code from the contents of your schema on Oracle, it is much better to keep your original code. Saving one file for the entire schema definition is appropriate.
Execute your CREATE TABLE statement, and correct any syntax errors you may have.
2. Create the Student table, which is to contain the following attributes:
• Student_ID (a 7-digit number),
• Family_Name (a character string of up to 32 characters),
• Given_Names (up to 32 characters),
• Known_Name (up to 16 characters),
• Gender (1 character),
• Total_Points (for storing the total number of points for which the student is enrolled this year).
Your table definition should also include the following constraints:
• Student_ID is the primary key,
• Gender must not be null,
• Gender must be either 'F' or 'M'.
3. Create the Enrolment table. As Enrolment is purely associative, you should be able to work out for yourself what the attributes and primary and foreign key(s) should be.
4. Write an SQL SELECT statement to query the data dictionary and show the details of the tables you have created.
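If your DDL syntax is rusty, the following sketch may serve as a reminder of how column types and named constraints are written in Oracle. The Department and Staff tables are purely hypothetical and are not part of the lab's data model; they simply illustrate fixed- and variable-length strings, numbers with decimal places, and primary key, check, and foreign key constraints.

```sql
-- Hypothetical parent table (not part of the enrolment data model).
create table Department
( Dept_Code   char(4),                  -- fixed-length string
  Dept_Name   varchar2(50)  not null,   -- variable-length string
  Budget      number(10,2)  not null,   -- up to 10 digits, 2 decimal places
  --
  constraint Department_PK primary key (Dept_Code),
  constraint Department_Budget_Positive check (Budget > 0)
);

-- Hypothetical child table that refers to Department.
create table Staff
( Staff_ID    number(6),
  Dept_Code   char(4)       not null,
  --
  constraint Staff_PK primary key (Staff_ID),
  constraint Staff_Department_FK
    foreign key (Dept_Code) references Department (Dept_Code)
);
```

Naming each constraint is optional, but it makes error messages and the data dictionary views much easier to read later on.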
5.3 SQL DML Revision
1. Write INSERT statements to create some new papers, students, and enrolments, with values that will allow you to test the following additional tasks. (Remember that the DESCRIBE command can be helpful when inserting data. It works both in TOAD and in SQL*Plus, and lists all column names and data types for the specified table.)
2. Write a SELECT statement that will show the details of all students whose family name begins with the letter 'M'.
3. Write a SELECT statement that will show the total number of male and female students.
4. Write a SELECT statement to show the number of students enrolled in each paper.
5. Write a SELECT statement to show the name of each student and the paper codes for papers in which they are enrolled. Construct the student name as a single column, with the family name in all capitals. The results should be sorted in order of student name and paper code.
5.4 Indexes and Physical Tuning
1. Create a B-tree index on the Family_Name attribute of Student.
2. Write an SQL SELECT statement to find out the names of all the schema objects in your schema. The data dictionary view to query for this is USER_OBJECTS.
3. Write SQL SELECT statements to show the specific details of the table and index schema objects in your database. These can be found in the USER_TABLES and USER_INDEXES data dictionary views respectively.
4. Write an SQL SELECT statement to find out the details of the physical-level segments that have been created by the DBMS for storing the data in your schema objects. Use the USER_SEGMENTS data dictionary view for this task. Consult “Oracle Exposed”, and Figures 1–1, 1–2, and Table 1–3 in the lecture material for an explanation of how the various logical and physical objects in Oracle are interrelated.
5. In the above questions involving the USER_TABLES and USER_INDEXES data dictionary views, you may have noticed that many of the columns did not contain any data. These missing data are mostly physical attributes of the objects, which are used by the optimiser to estimate efficient query plans. Unless these statistics are known and current, the optimiser will not be able to make well-informed decisions. Run the ANALYZE statement with the COMPUTE STATISTICS option on one of your tables and one of your indexes, and observe how the data in the relevant data dictionary views change.
6. Use the Explain Plan command in TOAD (Ctrl-E) on a suitable test query to find out whether Oracle is using the indexes you created. Why might the optimiser choose not to use an index for a particular query? Is there a way to force the DBMS to use a particular access method?
7. Follow the procedure outlined above for running SQL Trace and TKPROF on your index test query from the previous question. Examine the trace profile output to see what information is provided there.
8. Write a query to show the average size of rows in the Student table (hint: use the data dictionary).
9. Download and create the RowID utilities from the script provided on Blackboard, and then use the RowID_Info(RowID) function call in some queries to find out the physical location of the rows in your tables.
5.5 PL/SQL Revision
1. Write a PL/SQL function that will return the total EFTS value for a given student (i.e. the function should accept a student ID as its only argument).
2. Write a trigger that will update a student’s Total_Points when he or she enrols for or withdraws from a paper.
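As a reminder of the general shape of a stored PL/SQL function, here is a minimal sketch that counts the papers a student is enrolled in. It is deliberately not a solution to either task above. The function name Paper_Count is our own invention, and the sketch assumes that your Enrolment table has a Student_ID column; adjust the names to match your own schema.

```sql
-- Hypothetical example: return the number of enrolments for one student.
create or replace function Paper_Count(p_Student_ID in number)
  return number
is
  v_Count number;
begin
  -- An aggregate without GROUP BY always returns exactly one row,
  -- so SELECT ... INTO is safe here even if the student has no enrolments.
  select count(*)
  into   v_Count
  from   Enrolment
  where  Student_ID = p_Student_ID;
  --
  return v_Count;
end Paper_Count;
/

-- Example call from SQL (the student ID is made up).
select Paper_Count(1234567) from dual;
```

Remember the terminating '/' when running this as a script in TOAD, and check USER_ERRORS if the function is created with compilation errors.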