 |
|
Oracle pagination tips
Oracle Tips by Vikas Rajan Atrey
|
Also see these important notes on
estimating the
maximum results from a paginated Oracle SQL query.
Invariably while developing
data intensive web applications we come across the requirement of
pagination. The term "pagination" refers to the collection of a
definite number of rows per page, along with a mechanism to navigate
back-and-forth between the pages. We also have the concept of
filtering, a technique to cut-short the result set based on some
specified attribute's value. Let's take a closer look at doing pagination
in Oracle SQL.
The number of rows per page
can be specified in two ways:
Also, (for the user's
convenience) options can also be coded to "group" or "sort" the SQL result set,
based on some user specified attributes, on the fly.
The objective of this short
article is to suggest a way to incorporate these two features in order to get
best response time for the end user. We use the terms "Delayed lookup"
and "Guided Search" to explain these two Oracle pagination
techniques.
If we carefully examine the
table list in the ?FROM? clause of a SQL query, each of the tables in the list
can be divided in to one or more of the below three categories.
- Display table -
The table's columns are present in the final result set (the paginated
display).
- Filtering table -
The table is used only to filter the data.
- Lookup table -
The table is being purely used to display the description corresponding to a
code/identifier i.e. the table is a lookup table.
For queries with lookup
tables, we can defer the access of the lookup tables (category #3) and
execute the remaining query. Once we have reduced the result set to only
few dozen records (with all the required grouping and sorting) then only we will
access the lookup tables and fetch the desired columns values.
Structurally, such a deferred
SQL query might look like this. Note that the join into tables "b" and "c"
are deferred until after the main result set is fetched. Again, the lookup
tables ?b? and ?c - are deferred, and they will be joined only with 20 rows
(number of records to be displayed in a page) corresponding to the main result
set.
select
v2.* ,
b.desc ,
c.desc
from
b
,
c ,
(select * from
(
Select rownum rn , v1.*
From
(Select /*+
materialize no_merge
*/
t1.id1 ,
t2.id2, t1.col1 , t2.col2
from
t1 , t2
Where
<join clause>
And
<filter clause>
< group by clause>
<Order by clause>)v1
where
rn >1 and rn <=20) v2
where
v2.id1= b.id1
and
v2.id2=c.id2;
Note the use of the
materialize
hint for keeping the intermediate result set and the
no_merge hint.
It is a well known fact that
oracle can join only two tables at a time (The different join orders are
evaluated by the oracle SQL optimizer, and trace event 10053 or an execution
plan can be used to see which join orders are evaluated by oracle to arrive at
the final join order) .
For example, if there are 5
tables in the FROM clause, then theoretically oracle may evaluate 120 join
orders, a factorial expansion. For a 7-way table join, we have 5,040
possible table join combinations.
After evaluating certain join
orders, the optimizer may decide that it is not worthy to evaluate more join
orders because the query execution time is not going to be impacted much or the
query execution itself will not take that long to justify evaluation of all the
join orders.
The init.ora parameter
optimizer_max_permutations
determines the maximum number of join orders that oracle will evaluate before
deciding the final join order/execution plan. (It becomes a hidden
parameter in Oracle 10g). The setting for optimizer_max_permutations
may also force oracle to stop evaluating further join orders after certain
number of join orders are evaluated resulting into sub optimal execution plan.
It is evident that number of
tables in the ?from ?clause of our main query will get reduced if we use above
technique.
For example, tables b and
c are not any filtering any records, so they can be joined into the main
result set after we have completed the grouping , ordering and pagination
operations (i.e. we have only 20 rows in main result set). This can result
in a very efficient SQL join.
Typically. lookup tables do
not contain more than few thousand records and joining them with only 20 records
will not take much I/O.
Using my Oracle pagination
technique we have reduced the parsing, joining, grouping and sorting work to a
considerable extent, while not compromising the functionality. Hence, we
see much faster query performance.
NOTE: One problem with the
Oracle optimizer is that he cannot always guess the inter-join result set size,
and histograms and dynamic sampling can help. See this important note on
tuning with Oracle histograms.
Also, see
determining the optimal table join order.
Sometimes users may not
remember the exact filter (to get required result set), and specify a format
mask, forcing our paginated application to use more than one wildcard in the
query (e.g. the filter in the query may be: "where name like '%P%'" ).
This rules out any possibility of effective index usage and forces a full-table
scan.
Also, the number of records
satisfying this search might be very huge, creating a burden on database server
as well as on the network (not to mention the poor end user response time).
One obvious solution in this
scenario could be to force user to enter at least certain number of characters
as his search criterion so as to limit number of records satisfying his filter
but that solution will not help as the user really does not know those many
characters corresponding to his search.
A better approach would be to
use guided search which helps the end-user to refine his search
and then fetch the required information. To do this, we first present the
end-user a list of values satisfying his search criterion (without any result
set), and then use that refined search to fetch the required paginated
information for the user. This approach will benefit the user, the application,
Oracle and the entire network.
It is very unlikely that any
business user will be interested to browse huge number of records (satisfying
clauses similar to where name like '%P%') that his unconstrained filter
will fetch. Instead, they are interested only in few selected records but
they can not find proper criteria. Once the end-user is provided with huge
number of records, they can again refine his search to get to his candidate
records.
We are only reversing the
sequence of the steps in his two step search process and in turn user is getting
convenience and much better response time not to mention the reduced load on the
database and the network traffic.
A better, faster application with Oracle
pagination:
The query will be executed in
two parts. The first part of the query will help the user to refine his search.
(e.g. user is searching house online, but they are not sure about the exact
details. So, our first step will be help user to refine their search.
The small query is used in
the first step of the search will access only an index, and it has no need to
evaluate any business logic, causing it to execute much faster. Also the result
set will usually be the single column, with much less data for the application
to send across the network.
If there are only definite
number of permutations (the set of search columns) allowed to be used by the
search then these sets can be pre-populated at EOD (End of the day) or SOD
(Start of the day) activities of the application. We also have the option
of pre-populating these values from Oracle's Index Organized tables (IOTs) for
speedier access.
Now, after the completion of
the step 1, the user has already narrowed down his search and hence the query to
fetch the main result set. At this point, the end-user will be in better
position to use more selective indexes so as to fetch precise result set in much
less time. Also such steps will improve database performance in general.
About the author:
 |
Vikas Rajan Atrey is an experienced
Oracle professional and a Technical Architect employed by Infosys
Technologies Limited in Pune, India. Vikas can be reached at
atrey_vikas at yahoo.com. |