Tuesday, September 23, 2008

Optimizer Mode - FIRST_ROWS Vs ALL_ROWS

What circumstances we use ALL_ROWS and what circumstances we use FIRST_ROWS optimizer mode? This article is written in oracle9i.

First_rows attempts to optimize the query to get the very first row back to the client as fast as possible. This is good for an interactive client server environment where the client runs a query and shows the user the first 10 rows or so and waits for them to page down to get more.

All_rows attempts to optimize the query to get the very last row as fast as possible. This makes sense in a stored procedure for example where the client does not regain control until the stored procedure completes. You don't care if you have to wait to get the first row if the last row gets back to you twice as fast. In a client server/interactive application you may well care about that.

In TOAD or SQL Navigator, When we select the data, it display immediately. But it does not mean that, it is faster. If we scroll down, it might be fetching the data in the background mode. First_rows is best place for OLTP environment. Also in some reporting environment, if user wants to see initial data first and later see the rest of the data, then first_rows is good option. When we run the query in the stored procedure, first_rows would not be a good choice, all_rows is good option here, because, there is no use to fetch the first few records immediatley inside the stored procedure.

Let us demonstrate the FIRST_ROWS/ALL_ROWS optimizer hint.

Oracle9i Enterprise Edition Release 9.2.0.8.0 - 64bit Production
With the Partitioning, OLAP and Oracle Data Mining options
JServer Release 9.2.0.8.0 - Production

SQL> create table testtable as select * from user_objects;

Table created.

SQL> create index idx on testtable(object_type);

Index created.

SQL> EXECUTE DBMS_STATS.GATHER_TABLE_STATS(OWNNAME => 'SCREPORT',TABNAME => 'TESTTABLE',ESTIMATE_PER
CENT => 10, METHOD_OPT => 'FOR ALL COLUMNS SIZE 1', CASCADE => TRUE);

PL/SQL procedure successfully completed.

SQL> select count(*) from testtable;

COUNT(*)
----------
5619712

SQL> select count(*) from testtable where object_type='TABLE';

COUNT(*)
----------
2392064

SQL> set autotrace traceonly exp;

SQL> select /*+ all_rows */ * from testtable where object_type='TABLE';

Execution Plan
----------------------------------------------------------
0 SELECT STATEMENT Optimizer=HINT: ALL_ROWS (Cost=4316 Card=62
3914 Bytes=53032690)

1 0 TABLE ACCESS (FULL) OF 'TESTTABLE' (Cost=4316 Card=623914
Bytes=53032690)

In TESTTABLE table, we have around 5 million records, the above query returns half of the records. Optimizer use full table scan when we use all_rows hint. Because, it needs to read all the rows before it display the data in the screen. The cost for the all_rows is 4316.


SQL> select /*+ first_rows */ * FROM TESTTABLE where object_type='TABLE';

Execution Plan
----------------------------------------------------------
0 SELECT STATEMENT Optimizer=HINT: FIRST_ROWS (Cost=51502 Card
=623914 Bytes=53032690)

1 0 TABLE ACCESS (BY INDEX ROWID) OF 'TESTTABLE' (Cost=51502 C
ard=623914 Bytes=53032690)

2 1 INDEX (RANGE SCAN) OF 'IDX' (NON-UNIQUE) (Cost=1604 Card
=623914)

Optimizer use the index scan when we use the first_rows hint. Because, it use the index scan and reads first few rows to display in the screen, then it reads the rest of the data. The cost is 51502.
so first_rows hint looks faster, but it is really not. In this example, the cost is 12 times more in index scan(first_rows) when we compared to All_rows(full table scan).

When do we use FIRST_ROWS?

To answer this question, we can use first_rows when user want to see the first few rows immediately. It is mostly used in OLTP, some reporting environment.

When do we use ALL_ROWS?

To answer this question, we can use all_rows when user want to process all the rows before we see the output.. Mostly used in OLAP. All_rows use less resource when compared to first_rows.

Important factor in FIRST_ROWS

1. It prefer to use the index scan
2. It prefer to use nested loop join over hash joins. Because, nested loop joins data as selected. but hash join hashes the data in hash table which takes time.
3. Good for OLTP

Important factor in ALL_ROWS

1. It use both index scan & full table scan depends on how many blocks optimizer is reading in the table.
2. Good for OLAP
3. It most likly to use hash join, again depends upon other factors.

No comments: