Wednesday, April 26, 2006

Oracle Tuning

For most efficient access, the order that tables are specified can make a difference. The optimiser will re-order table access based on the indexes available. If access paths on two tables seem equally efficient, Oracle will use the order of tables specified in the FROM clause to determine the execution path.
The rule is to place the table that reduces the number of returned rows by the largest factor last. This table tends to be the one that has most WHERE conditions reducing the data, or it is the smallest table. If in doubt, consult the DBA team.
Just to remind you that this order of tables is important only if you are using Rule Based Optimizer. If that is the case, then table with fewer rows should be listed LAST in the FROM clause (because that should be the driving table, and parser processes table names from right to left).


2.5 WHERE Clause
The bulk of SQL performance tuning relates to the careful specification of the WHERE clause. Each clause within the WHERE clause can be executed with varying degrees of efficiency. The efficiency of a particular WHERE clause is a function of the size of the table, and the indexes that are available on the selecting column(s).
2.5.1 Access Paths and Indexes
To obtain a row from a given table in the database, Oracle uses an access path. An access path is a mechanism of obtaining particular rows. A set of access paths for all tables involved in a query and the internal Oracle operations that join the results from the access paths form a complete execution plan.
Oracle supports many different access paths, of which at least one is available for every row required for each table used within a query. The following table shows the types of access paths supported by Oracle, where ranked number one is the most efficient and ranked 15 is the least efficient. Oracle uses the relative efficiencies in this table as part of the process of determining cost for a particular execution plan..

Rank Condition
1 Single row by ROWID
2 Single row by cluster join (only for clustered table)
3 Single row by hash cluster key with unique or primary key (only for hash clusters)
4 Single row by unique or primary key (where unique index exists)
5 cluster join (for cluster tables)
6 Hash cluster key (for cluster columns)
7 Indexed cluster key (for clustered columns)
8 Composite Index
9 Single column indexes
10 Bounded range searches
11 Unbounded range searches on indexes columns
12 Sort-merge join (join of non-indexed columns)
13 MAX or MIN of indexed column (where these group functions appear in the column list of the SQL statement.
14 ORDER BY on indexed columns
15 Full table scan. This access path is always available.
An index is an additional database structure that is manually created but automatically maintained by Oracle. One or more columns in a single table may be included in a single indexed, and there may be many indexes for a particular table.
An index allows rapid access to particular rows in a table when a WHERE clause selects
* all of the columns referenced by a particular index
* the leading columns of an index created over multiple columns.
Unless an index exists on columns used in the WHERE clause, access paths that use indexes are not available.
Cluster and hash indexes are characteristics of the way that tables are physically built in the database. As we can see from the table listed above that indexes and clustering make the most efficient access paths available to Oracle.
Accesses to non indexed columns in tables can only be made via a full table scan. For large tables, a full table scan will take a long time and is to be avoided. If the application needs to locate records in a large table by a non indexed column, seek advice from the DBA team.

3 Summary
This section summarises the following good practices which application developers should bear in mind when developing SQL scripts:
* Table Aliases - always use table aliases when more than one table is involved in a query. The advantages of this method are to prevent future syntax errors when ambiguously named columns is added to table.The SQL statement is more readable.
* Sequencing of WHERE clause conditions - Note that non-index WHERE clause conditions are evaluated one by one after retrieving the rows. The AND statements are evaluated bottom-up, and the OR statements are evaluated top-down. Always place complex AND conditions in the beginning and placing complex OR conditions at the end.
* Reduce the number of trips to the database - Every time a SQL is executed the statements are parsed, indexes are evaluated, variables are bound, and in a client-server environment, SQL and results are transferred over the network. In order to maintain efficiency developers should minimises the number of statements executed.
* WHERE vs HAVING - HAVING filters the groups after the rows have been fetched, sorted and grouped. WHERE filters the rows when they are fetched. Therefore, the WHERE statement will reduce the overheads involved in reading the data blocks and sorting. This in turn improves the performance.
* EXISTS in place of DISTINCT - EXISTS does not result in sorting and it is more efficient when an index is available. DISTINCT is always sorts the results and eliminates the duplicates.
* NOT EXISTS in place of NOT IN - NOT IN with a sub-query results in an internal sort. Replacing NOT IN (sub-query) with NOT EXISTS (co-related sub-query) may improve performance. Use this method ONLY WHEN an index is available for co-related sub-query.
* Using Indexes - Using indexes improves the performance only when the number of records to be retrieved are about 20% of total records in the table. To make the indexes available always avoid using calculations on indexed columns, and indexed_column <> value or using indexed_column IS NULL statements.

1 comment:

SPoT said...

nice article! Loved the last section.