Sunday, November 8, 2009

Dynamic Sampling in Oracle

I started my career with oracle 7.0 and i remember, in oracle7.x/8.x version, optimizer use Rule based optimizer if one or all of the tables does not have any statistics in the join query. In oracle9i R2, dynamic sampling query is introduced and it has ability to sample the table and collect the better statistics dynamically during the hard parse. This sampling takes place only at hard parse time and is used to dynamically generate better statistics for the optimizer to use, hence the name dynamic sampling.

Oracle optimizer use the default statistic values when table does not have any statistics and dynamic sampling query feature is disabled. The default statistics values are well documented in Performance Tuning Guide.

How do we set Dynamic sampling parameter? There are couple of ways, we can set the parameter.

1. We can set this in session level or instance level.
2. Dynamic sampling query hint in query level. (Hint)

Connected to:
Oracle Database 10g Enterprise Edition Release 10.2.0.3.0 - Production
With the Partitioning, OLAP and Data Mining options

SQL> alter session set OPTIMIZER_DYNAMIC_SAMPLING = 2;

Session altered.

SQL> alter system set OPTIMIZER_DYNAMIC_SAMPLING = 2;

System altered.

SQL> select /*+ dynamic_sampling(emp 0) */ * from scott.emp;

When does dynamic sampling is useful?
1. It is useful when table data is drastically changing and statistic becomes stale very often.
2. Global temporary table is good place to use dynamic sampling feature.

Dynamic sampling Levels? Optimizer Dynamic Sampling has 11 levels between 0 to 10. What are these levels? The answer is pretty straight forward and it is clearly documented in Oracle Performance Tuning Guide. In Oracle9i R2, the default setting for dynamic sampling level is 1. At this setting, the optimizer will tend to use dynamic sampling only in some cases when a table is unanalyzed. This includes the case of global temporary tables without statistics but doesn’t ensure that dynamic sampling will be used against all unanalyzed tables. In Oracle Database 10g Release 1 and above, the default setting is 2. This setting ensures that the optimizer will have a good estimate whenever it optimizes a query that accesses a table with no statistics.

The source of this Article is Asktom. Please read this link and this link has very pretty good information about Dynamic sampling.

Wednesday, November 4, 2009

Restoring old statistics

I see some DBA's, they won't backup the statistics before collect the new statistics. If we don't have backup, and optimizer behaves differently with new statistics, then we are in trouble. If we have old statistic backup, then we can resolve the performance issue by restoring the old statistics. Oracle highly recommends to take the backup of current statistics before we collect new statistics.

Oracle10g has a new feature which will save the statistics before overwriting the new statistics.

There are couple of ways, we can backup the statistics.

1. We can export the statistics in a table and import back when ever we need. I discussed this approach in different thread. This approach is mostly useful when we transfer the statistics from one server to another server. Please refer this thread Post

2. Since Oracle10g, whenever statistics are modified, old version of statistics are saved automatically before overwriting the new statistics. However automatic saving of old statistics will not take place if you use ANALYZE in 10g for statistics collection.

I am going to discuss how the statistics are saved and how do we restore the old statistics in Approach 2.

Where does oracle store the statistics? Oracle DBA_OPTSTAT_OPERATIONS table contains a log history of statistics collected on the database at various level such as table, schema or database. We can restore older statistics anytime by making use of START_TIME and END_TIME values from DBA_OPTSTAT_OPERATIONS table.

How does Oracle maintain the Statistics History? GET_STATS_HISTORY_AVAILABILITY is a Function in DBMS_STATS package that returns oldest timestamp where statistics history is available. Users cannot restore statistics to a time stamp older than the oldest time stamp. Here is the query to find the statistics history starting timestamp.
SQL> select dbms_stats.get_stats_history_availability from dual;

GET_STATS_HISTORY_AVAILABILITY
---------------------------------------------------------------------------
06-OCT-09 09.52.45.351000000 PM -05:00

SQL>

Oracle maintain the history statistics based on the retention value. Oracle Purge the statistics automatically if it exceeds the statistics retention setting. By defualt, oracle purge the statistics if it is older then 31 days. We can also alter this retention period by using DBMS_STATS.ALTER_STATS_HISTORY_RETENTION procedure. PURGE_STATS is procedure in DBMS_STATS package that enables you to manually purge old statistics beyond a time stamp.

SQL> execute dbms_stats.alter_stats_history_retention(20);

PL/SQL procedure successfully completed.

As per the above code, oracle keep 20 days statistics history.

SQL> execute dbms_stats.purge_stats('06-OCT-09 09.52.45.351000000 PM -05:00');

PL/SQL procedure successfully completed.

SQL>

As per the above code, oracle purge the statistics prior to 06-OCT-09.

How do we recover the statistics? Oracle RESTORE_TABLE_STATS procedure restores statistics of a table as of a specified timestamp. It also restores the statistics of associated indexes and columns. Let me take EMP table in scott Schema and restore previous statistics.

Here are the steps to restore the previous timestamp statistics:

Step1: Let us check the past statistics and what date the statistics were collected.

Connected to:
Oracle Database 10g Enterprise Edition Release 10.2.0.3.0 - Production
With the Partitioning, OLAP and Data Mining options

SQL> select STATS_UPDATE_TIME from dba_tab_stats_history where table_name='EMP';

STATS_UPDATE_TIME
---------------------------------------------------------------------------
30-OCT-09 06.25.54.809000 PM -04:00
07-NOV-09 07.23.23.504000 AM -05:00
07-NOV-09 07.27.31.709000 AM -05:00
07-NOV-09 07.29.54.324000 AM -05:00

SQL> select last_analyzed from dba_tables where
2 table_name='EMP' and owner='SCOTT';

LAST_ANAL
---------
11-NOV-09

SQL> select sysdate from dual;

SYSDATE
---------
11-NOV-09

SQL>

Step2 : Let us restore the statistics which we collected on 30-OCT-2009.

SQL> execute dbms_stats.restore_table_stats('SCOTT','EMP','30-OCT-09 06.25.54.809000 PM -04:00');

PL/SQL procedure successfully completed.

SQL> select last_analyzed from dba_tables where
2 table_name='EMP' and owner='SCOTT';

LAST_ANAL
---------
30-OCT-09

SQL>


At what circumstances, do we restore the Statistics? We have performance issue in the database and DB was fine yesterday and since yesterday, there is no major change on database. Oracle10g automatic statistic gathering is turned on and last statistics gathering was close to the time that users started complanining about performance. At this circumstances, may be new statistics could be a culprit. We can restore the old statistics and see if we have any performance issue.

Restriction on restoring the statistics:

1. It does not have ability to restore user-defined statistics.

2. Old statistics will not be saved if we use ANALYZE command to gather statistics. Hence, we can recover the statistics.

Here is the input(Help) to write this topic.

Monday, November 2, 2009

Automatic Statistics Gathering in Oracle10g

Oracle10g has one of the automated feature which is called Automatic Statistics gathering. Oracle gathers statistics on all database objects automatically and maintains those statistics in a regularly scheduled maintenance job. When the DB is created, job is automatically created and started to collect statistics on database object with missing or stale statistics. The job GATHER_STATS_JOB calls the procedure dbms_stats.gather_database_stats_job_proc to gather statistics during the maintenance window.

What does GATHER_DATABASE_STATS_JOB_PROC do? Unfortunately, the package body is wrapped, so without access to the original source code, the implementation details are obscured. This procedure collects statistics on database objects when the object has no previously gathered statistics or the existing statistics are stale because the underlying object has been modified significantly (more than 10% of the rows).

This is an internal procedure, but its operates in a very similar fashion to the DBMS_STATS.GATHER_DATABASE_STATS procedure using the GATHER AUTO option. The primary difference is that the DBMS_STATS.GATHER_DATABASE_STATS_JOB_PROC procedure prioritizes the database objects that require statistics, so that those objects which most need updated statistics are processed first. This ensures that the most-needed statistics are gathered before the maintenance window closes.

How do we enable this feature? Automatic statistics gathering is enabled by default when a database is created, or when a database is upgraded from an earlier database release. The following condition should be met to enable this feature.

1. GATHER_STATS_JOB should be enabled.
2. statistics_level should be TYPICAL or ALL.

We can enable the job as below when the job is already disabled for any reason.

Connected to:
Oracle Database 10g Enterprise Edition Release 10.2.0.3.0 - Production
With the Partitioning, OLAP and Data Mining options

SQL> connect sys/password@orcl as sysdba
Connected.
SQL> BEGIN
2 DBMS_SCHEDULER.ENABLE('GATHER_STATS_JOB');
3 END;
4 /

PL/SQL procedure successfully completed.

SQL>

What time does automatic statistic collection run? By default this job runs within a maintenance window between 10 P.M to 6 A.M week nights and all day on weekends. But in oracle11g, i believe, the maintenance window is not same as Oracle10g.

Do we need to collect statistics in other then the scheduled window? In few Scenario's, it would be required to gather statistics manually.

1. In some cases, we might need to collect the statistics in specific objects other than this maintenance window. If the table data are changed rapidly and optimizer behaves differently due to data changes, then we need to gather statistics manually. We can not wait till the maintenance window open.

2. If the table is bigger and it is failing in the maintenance window for some reason, then you can lock that table during the maintenance window and unlock the table and gather the statistics manually and lock back again.

3. If you have specific table and you want to gather statistics with different option, then you can lock those tables during the maintenance window and gather statistics manually during day time.

Should we change the parameter values in Automatic statistic collection? Oracle recommends not to change the default values for automatic statistics collection. In case, if you need to gather statistics something differently, then lock the table during the maintenance window and unlock the table and gather statistics manually with specfic parameter and lock the table back again.

We have get_param function to find out the parameter values. Also set_param procedure to change the default values for dbms_stats pacakge parameters.

Friday, October 30, 2009

Compute Oracle Index Statistics

Many times, we forget to gather index statistics after creating the index. Optimizer will not use the index effectively if index does not have statistics.

Prior to Oracle9i, we have to issue two commands for creating index and gathering index statistics.

SQL> create index idx on test(object_name);

Index created.

SQL> exec dbms_stats.gather_index_stats(null, 'IDX');

PL/SQL procedure successfully completed.

SQL>

Starting in Oracle 9i, we have a compute statistics clause. We can both creating index and gathering index statistics in one command.

Connected to:
Oracle9i Enterprise Edition Release 9.2.0.8.0 - 64bit Production
With the Partitioning, OLAP and Oracle Data Mining options
JServer Release 9.2.0.8.0 - Production

SQL> create table test as select * from user_objects;

Table created.

SQL> create index idx on test(object_name);

Index created.

SQL> select table_name, num_rows, last_analyzed
2 from user_tables
3 where table_name ='TEST'
4 /

TABLE_NAME NUM_ROWS LAST_ANAL
------------------------------ ---------- ---------
TEST

SQL> drop index idx;

Index dropped.

SQL> create index idx on test(object_name) compute statistics;

Index created.

SQL> select table_name, num_rows, last_analyzed
2 from user_tables
3 where table_name ='TEST'
4 /

TABLE_NAME NUM_ROWS LAST_ANAL
------------------------------ ---------- ---------
TEST 184 31-OCT-09

SQL>

In Oracle10g, we do not need to use compute statistics clause at all. Oracle gather statistics while creating index automatically.

Connected to:
Oracle Database 10g Enterprise Edition Release 10.2.0.3.0 - Production
With the Partitioning, OLAP and Data Mining options

SQL> create index idx on test(object_name);

Index created.

SQL> select num_rows, last_analyzed from user_ind_statistics where index_name ='IDX';

NUM_ROWS LAST_ANAL
---------- ---------
50484 30-OCT-09

SQL>

Statistics Lock in Oracle10g

Oracle10g has one of the useful feature that we can lock the table statistics. When statistics on a table are locked, all the statistics depending on the table, including table statistics, column statistics, histograms and statistics on all dependent indexes, are considered to be locked. No one can gather statistics on a table when it is locked. But ofcourse, we can overwrite the statistics with force option. How does it useful for DBA's on day to day activities? At what circumstances, this feature is useful?

We can use this feature in the following circumstances...

1. There are tables where you want to setup gathering statistics manually. You can stop gathering statistics during the regular schedule by locking the statistics.

2. Some cases, Queries works fine with old statistics. You can avoid gathering statistics at this situation.

3. Some time, tables are bigger and automatic gathering statistics might fail silently. In this scenario, we might need to lock the table and collect the statistics seperately. Refer these links Post1, Post2, Post3

4. Sometime, gathering statistics, creating histograms takes very long time on bigger table and we can avoid such a bigger table while collecting statistics for all the tables in schema or DB level.

5. For some reason, if we want to use any specific parameter to gather statistics on particular table, then we can use this option to lock the statistics and gather statistics in different time.

How do we lock the statistics?

Connected to:Oracle Database 10g Enterprise Edition Release 10.2.0.3.0 - Production
With the Partitioning, OLAP and Data Mining options

SQL> execute dbms_stats.lock_table_stats('SCOTT','EMP');

PL/SQL procedure successfully completed.

SQL> execute dbms_stats.lock_schema_stats('SCOTT');

PL/SQL procedure successfully completed.

SQL>

Index Creation on Locked tables : In oracle10g, when you create index, the statistics also will be generated automatically. (Please refer my another Post for gathering statistics while creating index) . When the table is locked, statistics will not be generated while creating the index. We need to use FORCE option to gather the statistics while creating index for locked objects.

Connected to:
Oracle Database 10g Enterprise Edition Release 10.2.0.3.0 - Production
With the Partitioning, OLAP and Data Mining options

SQL> create table test as select * from dba_objects;

Table created.

SQL> exec dbms_stats.lock_table_stats(null, 'TEST');

PL/SQL procedure successfully completed.

SQL> create index idx on test(object_name);

Index created.

SQL> select num_rows, last_analyzed from user_ind_statistics where index_name ='IDX';

NUM_ROWS LAST_ANAL
---------- ---------

SQL> drop index idx;

Index dropped.

SQL> create index idx on test(object_name) compute statistics;
create index idx on test(object_name) compute statistics
*
ERROR at line 1:
ORA-38029: object statistics are locked

SQL> create index idx on test(object_name);

Index created.

SQL> exec dbms_stats.gather_index_stats(null, 'IDX');
BEGIN dbms_stats.gather_index_stats(null, 'IDX'); END;

*
ERROR at line 1:
ORA-20005: object statistics are locked (stattype = ALL)
ORA-06512: at "SYS.DBMS_STATS", line 10640
ORA-06512: at "SYS.DBMS_STATS", line 10664
ORA-06512: at line 1

SQL> exec dbms_stats.gather_index_stats(null, 'IDX',force=>true);

PL/SQL procedure successfully completed.

SQL> select num_rows, last_analyzed from user_ind_statistics where index_name ='IDX';

NUM_ROWS LAST_ANAL
---------- ---------
50484 30-OCT-09

SQL> alter index idx rebuild compute statistics;
alter index idx rebuild compute statistics
*
ERROR at line 1:
ORA-38029: object statistics are locked

SQL> alter index idx rebuild;

Index altered.

SQL> exec dbms_stats.gather_index_stats(null, 'IDX',force=>true);

PL/SQL procedure successfully completed.

SQL>

Monday, October 26, 2009

PLSQL_OPTIMIZE_LEVEL

I came to Austin for interview and i am on my way to Newjersy. I am hanging in the Austin Airport and i have another three hours for my flight departure. I thought, i write about one of the new Oracle10g optimization parameter PLSQL_OPTIMIZE_LEVEL.

This parameter determine the optimization level to compile the PLSQL code. The higher setting, oracle use more efforts to compile the code. This parameter will eliminate the dead code and moving the code out of the loop which does the same thing for each iteration. This has three valid values, which are 0,1 and 2. But default value for this parameter is 2.

Let us discuss about each value for this parameter. Please note, Oracle has not provided any detail level example for each value of this parameter. So i can't demonstrate exactly what oracle does for each level. Indeed, we can see the performance improvement in each level.

PLSQL_OPTIMIZE_LEVEL = 0 The value 0 works some what as pre 10g release. Oracle documentation says it works better than 9i. Let me write a procedure and run in oracle10g with value 0.

Oracle Database 10g Enterprise Edition Release 10.2.0.3.0 - Production
With the Partitioning, OLAP and Data Mining options

SQL> alter session set plsql_optimize_level =0;

Session altered.

SQL> set serveroutput on
SQL> create or replace procedure test as
2 a integer;
3 b integer;
4 c integer;
5 d integer;
6 v_time integer;
7 begin
8 v_time := Dbms_Utility.GET_CPU_TIME();
9 for j in 1..10000000 loop
10 a:= 100;
11 b:= null;
12 c:= nvl(b,1)+a;
13 end loop;
14 Dbms_Output.Put_Line(Dbms_Utility.GET_CPU_TIME()-v_time);
15 end;
16 /

Procedure created.

SQL> execute test;
770

PL/SQL procedure successfully completed.

The above procedure runs in 770 mseconds in oracle10g with plsql_optimize_level=0.

PLSQL_OPTIMIZE_LEVEL = 1 It eliminates unnecessary computation and exceptions. Since oracle has not given any example for the each value, i guess, it removes the statement b:=NULL in the TEST procedure. It does not make any sense to assign NULL value for each iteration of the loop in the TEST Procedure.

SQL> alter procedure test compile plsql_optimize_level = 1;

Procedure altered.

SQL> execute test;
502

PL/SQL procedure successfully completed.

The above procedure executes in 502 seconds and it is better then the plsql_optimize_level=0.

PLSQL_OPTIMIZE_LEVEL = 2 It moves the unnecessary dead code relatively far from its original location. I guess, it moves the assignment statement out of loop. Since it assigns the same value for each iteration which is not meaningful. Be aware that, some time, oracle takes long time to compile the procedure when we have value 2. Since oracle has to rewrite the code during the compilation stage and not during the execution stage.

SQL> alter procedure test compile plsql_optimize_level = 2;

Procedure altered.

SQL> execute test;
301

PL/SQL procedure successfully completed.

SQL>

The above procedure runs in 301 seconds and performance is far better then the value 1.

Monday, October 19, 2009

How to use histogram in Oracle

I would like to write about Oracle Histogram today. Histogram is very nice feature to help cost based optimizer to make right decision.

What is Histogram? Histograms are feature in CBO and it helps to optimizer to determine how data are skewed(distributed) with in the column. Histogram is good to create for the column which are included in the WHERE clause where the column is highly skewed. Histogram helps to optimizer to decide whether to use an index or full-table scan or help the optimizer determine the fastest table join order.

What are the advantage of Histogram? Histograms are useful in two places.

1. Histograms are useful for Oracle optimizer to choose the right access method in a table.

2. It is also useful for optimizer to decide the correct table join order. When we join multiple tables, histogram helps to minimize the intermediate result set. Since the smaller size of the intermediate result set will improve the performance.

Type of Histograms: Oracle uses two types of histograms for column statistics: height-balanced histograms and frequency histograms.

1. Height - balanced Histograms : The column values are divided into bands so that each band contains approximately the same number of rows. For instances, we have 10 distinct values in the column and only five buckets. It will create height based(Height balanced) histograms and it will evenly spread values through the buckets. A height-based histogram is when there are more distinct values than the number of buckets and the histogram statistics shows a range of rows across the buckets

2. Frequency Histograms : Each value of the column corresponds to a single bucket of the histogram. This is also called value based histogram. Each bucket contains the number of occurrences of that single value. Frequency histograms are automatically created instead of height-balanced histograms when the number of distinct values is less than or equal to the number of histogram buckets specified.

Method_opt Parameter: This is the parameter which tells about creating histogram while collecting the statistics. The default is FOR ALL COLUMNS SIZE AUTO in Oracle10g. But in oracle9i, the default is FOR ALL COLUMN SIZE 1 which will turn off the histogram collection.

FOR ALL [INDEXED HIDDEN] COLUMNS [size_clause]
FOR COLUMNS [size clause] columnattribute [size_clause] [,columnattribute [size_clause]...]

size_clause is defined as size_clause := SIZE {integer REPEAT AUTO SKEWONLY}

integer : Number of histogram buckets. Must be in the range [1,254]

REPEAT : Collects histograms only on the columns that already have histograms.

AUTO : Oracle determines the columns to collect histograms based on data distribution and the workload of the columns. We have a table called sys.col_usage$ that stores information about column usage. dbms_stats use this information to determine whether histogram is required for the columns.

SKEWONLY : Oracle determines the columns to collect histograms based on the data distribution of the columns.

Let me demonstrate how optimizer works with and without histogram as below two scenario. We take the emp table for this demonstration. The table has around 3.6 million records. The table emp_status column is highly skewed. It has two distinct values(Y,N). We have bitmap index on emp_status column.

Scenario 1 Let us generate the statistics without any histogram and see what kind of execution path optimizer is using. Without the histogram, oracle assume that, the data is evenly distributed and optimizer think that, we will have around 1.8 million record for emp_status Y and around another 1.8 million records for emp_status N.

Connected to:
Oracle Database 10g Enterprise Edition Release 10.2.0.3.0 - Production
With the Partitioning, OLAP and Data Mining options

SQL> select count(*),emp_status from scott.emp
2 group by emp_status;

COUNT(*) E
---------- -
1 N
3670016 Y

SQL> execute DBMS_STATS.GATHER_TABLE_STATS(OWNNAME => 'SCOTT', TABNAME => 'EMP',ESTIMATE_PERCENT =>
10, METHOD_OPT => 'FOR ALL COLUMNS SIZE 1',CASCADE => TRUE);

PL/SQL procedure successfully completed.

SQL> select ename from scott.emp where emp_status='Y';

3670016 rows selected.

--------------------------------------------------------------------------
Id Operation Name Rows Bytes Cost (%CPU) Time
--------------------------------------------------------------------------
0 SELECT STATEMENT 1832K 15M 5374 (5) 00:01:05
* 1 TABLE ACCESS FULL EMP 1832K 15M 5374 (5) 00:01:05
--------------------------------------------------------------------------

SQL> select ename from scott.emp where emp_status='N';

--------------------------------------------------------------------------
Id Operation Name Rows Bytes Cost (%CPU) Time
--------------------------------------------------------------------------
0 SELECT STATEMENT 1832K 15M 5374 (5) 00:01:05
* 1 TABLE ACCESS FULL EMP 1832K 15M 5374 (5) 00:01:05
--------------------------------------------------------------------------

Conclusion: Optimizer is using full table scan for the query which returns 3670016 records as well as it using full table scan for query which returns just only one record. This is obvisouly incorrect. This problem will be resolved by collecting histogram. Let us see in the next scenario.

Scenario 2 : Let us generate the statistics with histogram and see what kind of execution path optimizer is using. FOR COLUMN SIZE 2 EMP_STATUS will create two bucket for column emp_status. If we are not sure the distinct number of values in the column, then we can use AUTO option to collect histogram. With this histogram, oracle optimizer knows that, the column emp_status is highly skewed and it has two bucket and one bucket has around 3.6 million records with emp_status Y and another bucket has only one record with emp_status N. Now depends upon the query, optimizer decides whether to use index or Full table scan.

SQL> execute DBMS_STATS.GATHER_TABLE_STATS(OWNNAME => 'SCOTT', TABNAME => 'EMP',ESTIMATE_PERCENT =>
10, METHOD_OPT => 'FOR COLUMNS SIZE 2 EMP_STATUS',CASCADE => TRUE);

PL/SQL procedure successfully completed.

SQL> select ename from scott.emp where emp_status='Y';

3670016 rows selected.

--------------------------------------------------------------------------
Id Operation Name Rows Bytes Cost (%CPU) Time
--------------------------------------------------------------------------
0 SELECT STATEMENT 3681K 31M 5375 (5) 00:01:05
* 1 TABLE ACCESS FULL EMP 3681K 31M 5375 (5) 00:01:05
--------------------------------------------------------------------------

SQL> select ename from scott.emp where emp_status='N';

--------------------------------------------------------------------------
Id Operation Name Rows Bytes Cost (%CPU) Time
--------------------------------------------------------------------------
0 SELECT STATEMENT 1 9 1 (0) 00:00:01
1 TABLE ACCESS BY INDEX ROWID EMP 1 9 1 (0) 00:00:01
2 BITMAP CONVERSION TO ROWIDS
* 3 BITMAP INDEX SINGLE VALUE IDX_EMP
--------------------------------------------------------------------------

Conclusion: Optimizer is using full table scan for the query which returns 3670016 records. At the same time, optimizer is using index scan when for other query which returns one record. This scenario, the optimizer choose the right execution plan based on the query WHERE clause.

Data dictionary objects for Histogram:
user_histograms
user_part_histograms
user_subpart_histograms
user_tab_histograms
user_tab_col_statistics

Thursday, October 8, 2009

Transferring statistics between database

In general, development DB usually will have only portion of the data when we compared to Production database. In such a scenario, when we fix any production issues, obviously we make the changes in Dev DB and test the code and move to Prod DB. While testing the code in Dev DB, if we want to compare the execution plan between Dev and Prod, then we can copy the Prod DB statistics into Dev DB and forcast the optimizer behaviour in development server.

DBMS_STATS has an ability to transfer statistics between servers allowing consistent execution plans between servers with varying amounts of data. This article is tested in oracle10g. Here are the steps to transfer the statistics.

Source database : orcl
Source schema : sales
Target database : oradev
Target schema : sales


Now our goal is to copy the statistics from sales@orcl to sales@ordev.

Let us follow the below steps to copy the statistics from source(production) to target(development). I am running all the steps in System Schema..

step1. First create a stat table in the source database. The statistics table is created in SYSTEM schema.

Oracle Database 10g Enterprise Edition Release 10.2.0.3.0 - Production
With the Partitioning, OLAP and Data Mining options

SQL> connect system/password@orcl
Connected.
SQL> EXEC DBMS_STATS.create_stat_table('SYSTEM','STATS_TABLE');

PL/SQL procedure successfully completed.

SQL>

Step2. Export the sales schema statistics.

SQL> EXEC DBMS_STATS.export_schema_stats('SALES','STATS_TABLE',NULL,'SYSTEM');

PL/SQL procedure successfully completed.

SQL>

Step3. Export the STATS_TABLE by using expdp or exp utility and move the dump file to target(ordev) server.

Step4. Import the dump file into target database by using impdp or imp utility. Here i imported the dump file in system schema at target server.

Step5. Import the statistics into application schema(sales@ordev). Please remember, previous step, we imported the stats_table content into system schema by using impdp method. But this step, we are importing the statistics into relevant data dictionary table by using dbms_stats pacakge.

SQL> EXEC DBMS_STATS.import_schema_stats('SALES','STATS_TABLE',NULL,'SYSTEM');

PL/SQL procedure successfully completed.

SQL>

Step6. Drop the stats_table in target server.

SQL> EXEC DBMS_STATS.drop_stat_table('SYSTEM','STATS_TABLE');

PL/SQL procedure successfully completed.

SQL>

Note : We can follow step1 and step2 to backup the statistics before we gather new statistics. It is always good to backup the statistics before we overwrite the new statistics. In case, if we see any performance problem with new statistics, then we can import the old statistics. This option is very useful to transfer the statistics from one DB to another DB.

In oracle10g, it automatically save the statistics for last 31 days by default. We can restore the past statistics within the database at any time. This option is useful to restore the statistics in the same database. Please refer this Restoring statistics

Wednesday, October 7, 2009

Refreshing Stale Statistics

Oracle optimizer use the statistics information to choose the right path and execute the query efficiently. It is important to maintain the recent statistics to run the reports efficiently. Oracle highly recommends to use DBMS_STATS to gather statistics. Why oracle recommends to use DBMS_STATS? Click here to answer your question. This article is based on Oracle10g.

DBMS_STATS package has wonderful feature that capable of analyzing the stale statistics. I am going to discuss about collecting stale statistics in dbms_stats package.

In general, Gathering statistics consumes lot of resource and CPU time. Once we gathered statistics on a table, we do not need to collect the statistics on the same table until we make reasonable amount of data changes. Let us say, we have a schema called sales. This schema has lot of tables and many table has huge number of records. We schedule to analyze the entire schema every day at 2AM. In day to day DML activities, some of the tables are not having any changes or very minimum changes. In this scenario, we do not need to analyze the tables which are having very minimum changes or no changes. But scheduler automatically start analyzing all the tables in the schema at 2AM every day. This process unnecessarily consuming extra resource and degrade the server performance.

How do we stop analyzing the tables when there is no DML activity or very minimal DML activity? Yes... We can... Oracle introduced feature in DBMS_STATS package where oracle collect statistics on schema level or database level, only when the statistics are stale or out of date.

What is stale statistics? Oracle will record an approximate count of the number of rows that have been inserted,updated, and deleted in a table. The information will be recorded in user_tab_modifications view. When that count reaches a threshold percentage of the number of rows in the table , then the statistics are considered stale. The table monitoring should be enabled for recording the DML changes in user_tab_modification view. In oracle10g, Oracle automatically enable table monitoring and record the DML changes in user_tab_modifications view. Prior to oracle10g, we need to enable the table monitoring manually.

How do we enable table monitoring? In oracle10g, the table monitoring is default when statistic_level parameter is TYPICAL. Prior to Oracle10g, we need to enable table monitoring manually. Prior to Oracle10g, the below command is used to enable or disable the table monitoring. But since Oracle10g, the below command does not have any effect.

ALTER TABLE table_name MONITORING[NOMONITORING]

What is threshold percentage? Oracle automatically determines the threshold. Oracle doesn't officially document the threshold, so the threshold, and the entire algorithm, is subject to change over time.

Let me give an example how to analyze stale statistics :

Oracle Database 10g Enterprise Edition Release 10.2.0.3.0 - Production
With the Partitioning, OLAP and Data Mining options

SQL> BEGIN
2 DBMS_STATS.GATHER_SCHEMA_STATS (
3 ownname => 'SALES',
4 estimate_percent => 20,
5 block_sample => TRUE,
6 method_opt => 'FOR COLUMNS SIZE 10',
7 options => 'GATHER AUTO',
8 cascade => TRUE);
9 END;
10 /

PL/SQL procedure successfully completed.

SQL>

Note : It is very important that, we should use GATHER AUTO or GATHER STALE to analyze the stale statistics. Also table monitoring is mandatory. Since oracle10g, the table monitoring is enabled by default. So we do not need to worry about table monitoring since oracle10g.

This feature is very useful for large and complex databases where refreshing statistics for all objects can cause a heavy drain on server resources.

Thursday, October 1, 2009

Analyze Versus DBMS_STATS

Cost based optimizer is preferred method for oracle optimizer. In order to make good use of the CBO, you need to create accurate statistics. Prior to oracle8i, we use ANALYZE command to gather statistics.

DBMS_STATS package is introduced in oracle8i. Since Oracle8i, Oracle highly recommeds to use DBMS_STATS instead of ANALYZE command. This article is written in oracle10g. I am going to address below topics in this thread....

1. Why oracle recommends to use DBMS_STATS package?
2. What are the advantages of DBMS_STATS compared to ANALYZE?
3. How do we use DBMS_STATS package to analyze the table?
4. What are new features in each version for DBMS_STATS?

Why oracle recommends to use DBMS_STATS since Oracle8i?

1. Gathering statistics can be done in Parallel. This option is not available in ANALYZE command.

2. It is used to collect the stale statistics. I discussed about collecting stale statistics in another topic. Please refer stale statistics to know more about collecting stale statistics.

3. DBMS_STATS is a PLSQL pacakge. So it is easy to call. But ANALYZE does not.

4. It is used to collect statistics for external tables. But ANALYZE does not.

5. DBMS_STATS used to collect system statistics. But ANLAYZE does not.

6. Some time, ANALYZE does not produce accurate statistics. But DBMS_STATS does.

7. We can not use ANLAYZE command to gather statistics for partition or sub partition level. But we can use DBMS_STATS to analyze any specific partition or sub partition. This is especially useful for partition table. We do not need to analyze the Historical data whenever we refresh the current partition.

8. We can transfer statistics from one DB to another DB when we collected statistics through DBMS_STATS. But it can not be done when we use ANALYZE command to collect the statistics. Please refer statistics transfer to know more about trasferring statistics.

What force you to use ANALYZE command in all Oracle versions? ANALYZE can be used to collect the statistics like CHAIN_CNT, AVG_SPACE, and EMPTY_BLOCKS. DBMS_STATS will not collect these statistics. We might need to use ANALYZE in case if we want to see chained rows, average space and empty blocks.

There are several parameter exists for collecting statistics on table level, schema level, database level and system level. But i do not want to explain all the parameters which are already in Oracle help. Still i would like to explain some parameters.

estimate_percent: Percentage of rows or blocks to estimate. The valid rage is 0.000001 to 100. when we pass NULL for this parameter, then it computes. Compute is same as 100% sample. For instance, if we pass 20%, then it takes roughly around 20% of rows or 20% blocks depends on the BLOCK_SAMPLE parameter. This Parameter is used for analyzing on table, index, schema level and database level.

block_sample: This determines whether or not to use random block sampling instead of random row sampling. Block sampling would be slightly less accurate in the case where rows have roughly the same lifecycle and, thus, values are spread non-uniformly throughout the table. In case if you want to drive in deep on this, David Aldridge has a nice article on block sampling. This Parameter is used for analyzing on table, schema level and database level.

method_opt: This parameter tells about histogram in table. It determine which column should have histogram and number of histogram created for the table columns. This Parameter is used for analyzing on table, schema level and database level. Please refer histogram to know more about Histogram in Oracle.

granularity:This parameter is useful when you want to gather statistics on specific partition or sub partition in a table. The valid parameters are ALL, AUTO, GLOBAL, GLOBAL AND PARTITION, PARTITION, SUBPARTITION. This Parameter is used for analyzing on table, index, schema level and database level only if the table or index is partitioned.

no_invalidate: Does not invalidate the dependent cursors or currently parsed SQL statement if set to TRUE. The procedure invalidates the dependent cursors immediately if set to FALSE. Use DBMS_STATS.AUTO_INVALIDATE to have Oracle decide when to invalidate dependent cursors. This Parameter is used for analyzing or deleting statistics on table, index, schema level, database level.

Degree: Degree of parallelism. It has three valid values.

NULL : Oracle takes the value which is specified in degree clause of create or alter table statement.

DBMS_STATS.DEFAULT_DEGREE : It takes the value based on number of CPU's and init parameters.

DBMS_STATS.AUTO_DEGREE : It determines the value automatically. It is either 1 or default degree according to the size of the object.

Options: This parameter is used only for analyzing the data on schema level or DB level. There are multiple values for this parameters. The valid value for this parameters are GATHER, GATHER AUTO, GATHER STALE, GATHER EMPTY, LIST AUTO, LIST STALE, LIST EMPTY. Let me explain these valid values in short. Since these values are important to gather statistics on schema level.

GATHER-Gather statistics for all the objects in a schema or database.

GATHER AUTO-Gather statistics when the statistics are stale or when there is no statistics. It does both GATHER STALE and GATHER EMPTY.

GATHER STALE-Gather statistics only when it is stale. Does not collect when there is no statistics.

GATHER EMPTY-Gather statistics only when no statistics.

LIST AUTO: Returns a list of objects to be processed with GATHER AUTO.

LIST STALE: Returns list of stale objects as determined by looking at the user_tab_modifications

LIST EMPTY: Returns list of objects which currently have no statistics.

Gathering_mode : This parameter is used only for gathering system statistics. The valid modes are NOWORKLOAD, INTERVAL,START and STOP. The default is NOWORKLOAD. The START and STOP is used to stop and start the system statistics.

Example for collecting statistics on table:

DBMS_STATS.GATHER_TABLE_STATS(
OWNNAME => 'TANDEB',
TABNAME => 'CUSTOMER',
PARTNAME => 'PART092009'
GRANULARITY => 'PARTITION',
ESTIMATE_PERCENT => 10,
METHOD_OPT => 'FOR ALL COLUMNS SIZE 1',
CASCADE => TRUE,
NO_INVALIDATE => TRUE);

Example for collecting statistics on Schema:

DBMS_STATS.GATHER_SCHEMA_STATS(
OWNNAME => 'SCOMPPRD',
ESTIMATE_PERCENT => 10,
METHOD_OPT => 'FOR ALL COLUMNS SIZE 1',
OPTIONS => 'GATHER',
CASCADE => TRUE,
NO_VALIDATE => TRUE);


Example for collecting system statistics:

DBMS_STATS.GATHER_SYSTEM_STATS(
GATHERING_MODE => 'INTERVAL',
INTERVAL => 10);

Example for collecting database statistics:

DBMS_STATS.GATHER_DATABASE_STATS(
ESTIMATE_PERCENT => 10,
METHOD_OPT => 'FOR ALL COLUMNS SIZE 1',
CASCADE => TRUE,
NO_VALIDATE => TRUE,
GATHER_SYS => FALSE)


New feature in Oracle9i:

1. Introduced to gather system statistics. Such as I/O and CPU utilization.

2. It can direct the database to select the appropriate sample size to generate accurate statistics. A new value for the ESTIMATE_PERCENT parameter, DBMS_STATS.AUTO_SAMPLE_SIZE will let Oracle decide the sample size necessary to ensure generation of accurate statistics.

3. Oracle9i introduced new values for the size clause in the METHOD_OPT parameter automate the decisions regarding the columns on which histograms need to be created while letting administrators control the factors affecting such decisions. Besides specifying a numeric value for the size clause, administrators have the new options (AUTO, SKEWONLY, REPEAT)

4. Oracle9i introduced new feature to enable or disable the table monitoring in schema level or DB level in one command.

DBMS_STATS.alter_schema_tab_monitoring('MYSCHEMA', TRUE); DBMS_STATS.alter_schema_tab_monitoring('MYSCHEMA', FALSE);

DBMS_STATS.alter_database_tab_monitoring(TRUE); DBMS_STATS.alter_database_tab_monitoring(FALSE);

New feature in Oracle10g:

1. Oracle10g enable table monitoring automatically. Table monitoring is required to collect the stale statisics. We do not need to enable monitoring explicitly. This feature is disabled when statistics_level is BASIC. It enables the table monitoring feature when statistics_level is TYPICAL. ALTER TABLE [NO] MONITORING clauses as well as alter_schema_tab_monitoring and alter_database_tab_monitoring procedures of the dbms_stats package are now obsolete in oracle10g. But still it runs without any error. But there is no effect.

2. Oracle10g introduced two new values for Granularity parameter. These are AUTO and GLOBAL AND PARTITION. This parameter is applicable for analyzing partitioning tables.

AUTO : Oracle collect statistics GLOBAL level, Partition level, and sub-partition level only if sub partition method is LIST. If sub parition is not a LIST, then it collects only GLOBAL, Partition level.

GLOBAL AND PARTITION : Oracle gathers the global and partition level statistics. No sub-partition level statistics are gathered.

3. Oracle10g introduced new value DBMS_STATS.AUTO_DEGREE for Degree parameter. When you specify the auto_degree, Oracle will determine the degree of parallelism automatically. It will be either 1 (serial execution) or default_degree (the system default value based on number of CPUs and initialization parameters), according to the size of the object.

4. Oracle10g has ability to restore the previous statistics. Oracle saves last 31 days statistics by default. We can recover previous days statistics in case, optimizer behaves differently with current statistics. Please refer my another post restoring statistics

5. We can lock the table statistics. This would be helpful if you want to avoid gathering statistics during the maintenance window. Please refer my another post Locking statistics

6. Oracle10g has automatic statistics gathering feature. Oracle gather statistics for the entire database every day during the maintenance window. Please refer my another post automatic statistics gathering

7. The statistics will be collected automatically when we create index. In oracle9i, we need to use compute statistics clause to collect statistics while creating index. Please refer my another post compute index statistics

What is impact when we analyze the tables during the peak hours?

1. Oracle consume more resource when we gather statistics. This will slow down the overall performance in the sever.

2. When statistics are updated for a database object, Oracle invalidates any currently parsed SQL statements that access the object. The next time such a statement executes, the statement is re-parsed and the optimizer automatically chooses a new execution plan based on the new statistics. This will degrade the server performance during the peak hours. But we can control this by using the parameter NO_INVALIDATE. This has three values(TRUE, FALSE, DBMS_STATS.AUTO_INVALIDATE). TRUE will not invalidate the already parsed SQL statement. NO will invalidate parsed SQL statement immediately. AUTO_INVALIDATE decides when to invalidate the already parsed Statement.

Here is the oracle help to know more about dbms_stats procedure. Oracle Help

Tuesday, September 29, 2009

Resetting High Water Mark in Oracle10g

I would like to write how to reset HWM in oracle10g. Prior to Oracle10g, resetting high water mark is painful procedure in busy environment. Oracle made our life easy to reset the High Water Mark in oracle10g. I am not going to discuss about what is HWM. Since i already discussed this in another topic. Please click to read about what is HWM and what are the various options available to reset the HMW.

We have traditional ways to reset the HWM prior to Oracle10g. We might need Table downtime when we use below traditional methods....

1. Imp/exp
2. Alter tablespace move
3. truncate and insert
4. user dbms_redefinition package to copy the table

Okay... Let us talk about resetting HWM in Oracle10g. The tablespace should be ASSM(Automatic segment space Management) enabled to leverge this feature. In oracle10g, we do not need table downtime to reset the HWM. It would be easy to reset the HWM in 24/7 environment.

What does Oracle do while using Oracle10g feature to reset the HMW?

Oracle split this process as two phase. Let me explain what is happening in each phase.

Phase I. Oracle move the rows which are located in the middle or at the end of a segment further more down to the beginning of the segment and make the segment more compact. This shrinking process is kind of delete and insert. But it is not really!!!. This process is moving the data row by row. It acquires a row level lock when the row is moved down to the begining of the segment. The corresponding index data will be handled like any other row level DML. So we do not need to worry about rebuilding the indexes for the row. Also row level lock will happen for very short moment. Before we start this phase, we need to enable row movement. Here is the command to complete the phase I. Here i am using the table called bookings.

SQL> alter table bookings enable row movement;

Table altered.

SQL> alter table bookings shrink space compact;

Table altered

Phase II This step will reset the high water mark. This will acquire table level lock for very short moment while resetting the HWM. Here is the command to accomplish this task.

SQL> alter table bookings shrink space;

Table altered

SQL> alter table bookings shrink space cascade; (it is for all dependent objects as well)

Table altered

If we want to reset the HWM in one go, then below command will accomplish the task. The below command moves the rows and reset the HWM.

SQL> alter table bookings shrink space;

Table altered

What are the advantages of using Oracle10g new feature to reset the HWM?

There are serveral advantages over traditional methods. Let me list the advantages here.

1. It can be done in online. There is a table level lock for very short moment. Traditional methods are not supporting to reset the HWM in online.

2. It does not take extra space while resetting the HWM. If we use traditional method, DBMS_REDEFINITION package use double the amount of space.

3. It does acquire only row level lock while performing majority of the shrinking(moving rows) work. It acquires table level lock only when it resets the HWM which is in phase II. But traditional methods requires table down time for resetting the HWM except using dbms_redefinition package.

4. Index will be maintained and remain usable. But in traditional methods, we need to rebuild the index. Especially when we use ALTER TABLESPACE MOVE command.

5. It can be made in one command(alter table emp shrink space). In traditional method, there are multiple steps.

6. If you are not sure that you can afford table level lock at specific time, then you can do the majority of the shriniking work and later we can reset the HWM. Since table level lock is required only while resetting the HWM. The whole process can be done in two steps as i explained above. This advantage is not available in traditional methods.

What are the restriction of using Oracle10g new feature to reset the HWM?

1. It is only possible in ASSM tablespace
2. Not supporting for clustered tables, tables with column data type LONG

What circumstances we can reset the HWM as two phase?

The table is not frequently used(insert/update/delete) and it can afford to have table level lock, then we can reset the HWM in one go. We can reset the HWM as two steps when the table is used by several people and it is always busy and it does not permit the table level lock even for short moment at specific time. This scenario, we can move the rows to shrink the table. Then in the night time or off peak hours, we can reset the HWM.

Tuesday, September 15, 2009

DBMS_PROFILER

DBMS_PROFILER is introduced in oracle8i. It is powerful tool to find the PLSQL execution time and determine the bottleneck of the program unit. This tool provides information about how many times each line is executing and how much time it is spending to execute for each line of the code. The basic idea behind the profiling is, developers can understand where the code is taking most time and they can detect and optimize the PLSQL code. We use SQL trace to determine the bottleneck for SQL code. But for PLSQL code, we can use dbms_profiler utility to profile the run time behaviour. Steps might very for other oracle versions.

How do we set up dbms_profiler utility?

Please remember, the dbms_profiler setup is not part of Oracle installation. We need to setup manually if we want to profile the PLSQL code. There are five simple steps to configure the dbms_profiler. Let us start configure the profiler. This article is tested in oracle10g R2.

Step1. The dbms_profiler package can be loaded by running the $ORACLE_HOME/rdbms/admin/profload.sql script as the SYS user. Execute profload.sql in sys schema.

Connected to:
Oracle Database 10g Enterprise Edition Release 10.2.0.3.0 - Production
With the Partitioning, OLAP and Data Mining options

SQL> connect sys/password@orcl as sysdba
Connected.
SQL> start c:/oracle/product/10.2.0/db_1/rdbms/admin/profload.sql

Package created.

Grant succeeded.

Synonym created.

Library created.

Package body created.

Testing for correct installation
SYS.DBMS_PROFILER successfully loaded.

PL/SQL procedure successfully completed.

Step2. dbms_profiler package requires some schema objects which should be created in central schema or application schema. Create a new schema called profiler. We can either create a new schema or use existing schema. In this case, i created new schema, named Profiler.

SQL> create user profiler
2 identified by profiler;

User created.

SQL> grant create session,resource,connect to profiler;
Grant succeeded.

SQL>

Step3. Run the $ORACLE_HOME/rdbms/admin/proftab.sql file on the profiler schema to create some schema objects to store profiler information. This proftab.sql file creates below three tables with some other objects.

1.PLSQL_PROFILER_RUNS
2.PLSQL_PROFILER_UNITS
3.PLSQL_PROFILER_DATA

SQL> connect profiler/profiler@orcl
Connected.
SQL> start c:/oracle/product/10.2.0/db_1/rdbms/admin/proftab.sql
drop table plsql_profiler_data cascade constraints
*
ERROR at line 1:
ORA-00942: table or view does not exist

drop table plsql_profiler_units cascade constraints
*
ERROR at line 1:
ORA-00942: table or view does not exist

drop table plsql_profiler_runs cascade constraints
*
ERROR at line 1:
ORA-00942: table or view does not exist

drop sequence plsql_profiler_runnumber
*
ERROR at line 1:
ORA-02289: sequence does not exist

Table created.

Comment created.

Table created.

Comment created.

Table created.

Comment created.

Sequence created.

SQL>

Step4. Connect into profiler schema and grant the below privileges.

GRANT SELECT ON plsql_profiler_runnumber TO PUBLIC;
GRANT SELECT, INSERT, UPDATE, DELETE ON plsql_profiler_data TO PUBLIC;
GRANT SELECT, INSERT, UPDATE, DELETE ON plsql_profiler_units TO PUBLIC;
GRANT SELECT, INSERT, UPDATE, DELETE ON plsql_profiler_runs TO PUBLIC;

SQL> connect profiler/profiler@orcl
Connected.
SQL> GRANT SELECT ON plsql_profiler_runnumber TO PUBLIC;

Grant succeeded.

SQL> GRANT SELECT, INSERT, UPDATE, DELETE ON plsql_profiler_data TO PUBLIC;

Grant succeeded.

SQL> GRANT SELECT, INSERT, UPDATE, DELETE ON plsql_profiler_units TO PUBLIC;

Grant succeeded.

SQL> GRANT SELECT, INSERT, UPDATE, DELETE ON plsql_profiler_runs TO PUBLIC;

Grant succeeded.

SQL>

Step5. Connect into sys schema and grant the below privileges for dbms_profiler package.

CREATE PUBLIC SYNONYM plsql_profiler_runs FOR profiler.plsql_profiler_runs;
CREATE PUBLIC SYNONYM plsql_profiler_units FOR profiler.plsql_profiler_units;
CREATE PUBLIC SYNONYM plsql_profiler_data FOR profiler.plsql_profiler_data;
CREATE PUBLIC SYNONYM plsql_profiler_runnumber FOR profiler.plsql_profiler_runnumber;

SQL> connect sys/password@orcl as sysdba
Connected.
SQL> CREATE PUBLIC SYNONYM plsql_profiler_runs FOR profiler.plsql_profiler_runs;

Synonym created.

SQL> CREATE PUBLIC SYNONYM plsql_profiler_units FOR profiler.plsql_profiler_units;

Synonym created.

SQL> CREATE PUBLIC SYNONYM plsql_profiler_data FOR profiler.plsql_profiler_data;

Synonym created.

SQL> CREATE PUBLIC SYNONYM plsql_profiler_runnumber FOR profiler.plsql_profiler_runnumber;

Synonym created.

SQL>

Once we are successful with five steps, we can start profiling the PLSQL code.

How do we profile the PLSQL Procedure?

Let us create sample procedure and profile the procedure.

1. start profiler
2. run the procedure
3. stop profiler
4. flush data from memory and save into table
5. Analyze the data and see where it is taking more time

Here i create a procedure called do_something in SCOTT schema. You can also profile the procedure which is existing in any schema in the database.

SQL> CREATE OR REPLACE PROCEDURE do_something (p_times IN NUMBER) AS
2 v_cnt NUMBER;
3 BEGIN
4 FOR i IN 1 .. p_times LOOP
5 SELECT count(*) + p_times
6 INTO v_cnt
7 FROM EMP;
8 END LOOP;
9 END;
10 /

Procedure created.

The below unnamed PLSQL code starts the profiler and call the procedure. Once the procedure is executed, it stop the profiler. It flush the data from memory and save into table.

SQL> DECLARE
2 l_result BINARY_INTEGER;
3 BEGIN
4 l_result := DBMS_PROFILER.start_profiler(run_comment => 'do_something: ' SYSDATE);
5 do_something(p_times => 100);
6 l_result := DBMS_PROFILER.stop_profiler;
7 dbms_profiler.flush_data;
8 END;
9 /

PL/SQL procedure successfully completed.

SQL>

Here is query to check the profiler result.

SQL> SET LINESIZE 200
SQL> SET TRIMOUT ON
SQL>
SQL> COLUMN runid FORMAT 99999
SQL> COLUMN run_comment FORMAT A50
SQL> SELECT runid,
2 run_date,
3 run_comment,
4 run_total_time
5 FROM plsql_profiler_runs
6 ORDER BY runid;

RUNID RUN_DATE RUN_COMMENT RUN_TOTAL_TIME
------ --------- -------------------------------------------------- --------------
4 15-SEP-09 do_something: 15-SEP-09 686370753

SQL> SELECT d.line#,
2 d.total_occur,
3 d.total_time
4 FROM plsql_profiler_units u
5 JOIN plsql_profiler_data d ON u.runid = d.runid AND u.unit_number = d.unit_number
6 WHERE u.runid = 4
7 and unit_name='DO_SOMETHING'
8 and unit_owner='SCOTT'
9 and unit_type='PROCEDURE'
10 ORDER BY u.unit_number, d.line#;

LINE# TOTAL_OCCUR TOTAL_TIME
---------- ----------- ----------
1 1 199466
4 101 2247771
5 100 513261322
9 1 85485

SQL> SELECT line ' : ' text
2 FROM all_source
3 WHERE owner = 'SCOTT'
4 AND type = 'PROCEDURE'
5 AND name = 'DO_SOMETHING';

LINE':'TEXT
----------------------------------------------------------------------------------------------------
1 : PROCEDURE do_something (p_times IN NUMBER) AS
2 : v_cnt NUMBER;
3 : BEGIN
4 : FOR i IN 1 .. p_times LOOP
5 : SELECT count(*) + p_times
6 : INTO v_cnt
7 : FROM EMP;
8 : END LOOP;
9 : END;

9 rows selected.

SQL>

Conclusion : The line number 4 runs 101 times and line number 5 runs 100 times. The procedure spends most of the time at line number 5. Now we figured out exactly where it is taking longer time. We can focus on tuning the line 5 in case if we want to....

Where do we use dbms_profiler? TKPROF and Explain plan helps to find where it is taking long time to complete the SQL. Dbms_profiler is useful, if we want to find out which line is consuming most time in entire PLSQL.

The source of this article is oracle-base.

Thursday, September 10, 2009

Oracle SQL Questions for Newbies

I thought, it would be useful if i post some sample SQL questions for newbies. These bunch of SQL questions would be helpful for beginners who wanted to learn Oracle SQL.

Here are some tables which needs to be created to practice SQL questions. Just for learning process, you could use SCOTT schema to practice SQL questions. I am posting 101 Oracle SQL questions with answers.

First step would be, we need to create the below tables and insert relevant data to practice the questions.

Create the below tables:

CREATE TABLE DEPT (
DEPTNO NUMBER(2),
DNAME VARCHAR2(14),
LOC VARCHAR2(13));

INSERT INTO DEPT VALUES (10, 'ACCOUNTING', 'NEW YORK');
INSERT INTO DEPT VALUES (20, 'RESEARCH', 'DALLAS');
INSERT INTO DEPT VALUES (30, 'SALES', 'CHICAGO');
INSERT INTO DEPT VALUES (40, 'OPERATIONS', 'BOSTON');

ALTER TABLE DEPT ADD PRIMARY KEY(DEPTNO);

CREATE TABLE EMP (EMPNO NUMBER(4) NOT NULL,
ENAME VARCHAR2(10),
JOB VARCHAR2(9),
MGR NUMBER(4),
HIREDATE DATE,
SAL NUMBER(7,2),
COMM NUMBER(7,2),
DEPTNO NUMBER(2));

ALTER TABLE EMP ADD CONSTRAINT FK_EMP_01
FOREIGN KEY(DEPTNO) REFERENCES DEPT;

INSERT INTO EMP VALUES (7369, 'SMITH', 'CLERK', 7902, TO_DATE('17-DEC-1980', 'DD-MON-YYYY'), 800, NULL, 20);
INSERT INTO EMP VALUES (7499, 'ALLEN', 'SALESMAN', 7698, TO_DATE('20-FEB-1981', 'DD-MON-YYYY'), 1600, 300, 30);
INSERT INTO EMP VALUES (7521, 'WARD', 'SALESMAN', 7698, TO_DATE('22-FEB-1981', 'DD-MON-YYYY'), 1250, 500, 30);
INSERT INTO EMP VALUES (7566, 'JONES', 'MANAGER', 7839, TO_DATE('2-APR-1981', 'DD-MON-YYYY'), 2975, NULL, 20);
INSERT INTO EMP VALUES (7654, 'MARTIN', 'SALESMAN', 7698,TO_DATE('28-SEP-1981', 'DD-MON-YYYY'), 1250, 1400, 30);
INSERT INTO EMP VALUES (7698, 'BLAKE', 'MANAGER', 7839,TO_DATE('1-MAY-1981', 'DD-MON-YYYY'), 2850, NULL, 30);
INSERT INTO EMP VALUES (7782, 'CLARK', 'MANAGER', 7839,TO_DATE('9-JUN-1981', 'DD-MON-YYYY'), 2450, NULL, 10);
INSERT INTO EMP VALUES (7788, 'SCOTT', 'ANALYST', 7566,TO_DATE('09-DEC-1982', 'DD-MON-YYYY'), 3000, NULL, 20);
INSERT INTO EMP VALUES (7839, 'KING', 'PRESIDENT', NULL,TO_DATE('17-NOV-1981', 'DD-MON-YYYY'), 5000, NULL, 10);
INSERT INTO EMP VALUES (7844, 'TURNER', 'SALESMAN', 7698,TO_DATE('8-SEP-1981', 'DD-MON-YYYY'), 1500, 0, 30);
INSERT INTO EMP VALUES (7876, 'ADAMS', 'CLERK', 7788,TO_DATE('12-JAN-1983', 'DD-MON-YYYY'), 1100, NULL, 20);
INSERT INTO EMP VALUES (7900, 'JAMES', 'CLERK', 7698,TO_DATE('3-DEC-1981', 'DD-MON-YYYY'), 950, NULL, 30);
INSERT INTO EMP VALUES (7902, 'FORD', 'ANALYST', 7566,TO_DATE('3-DEC-1981', 'DD-MON-YYYY'), 3000, NULL, 20);
INSERT INTO EMP VALUES (7934, 'MILLER', 'CLERK', 7782,TO_DATE('23-JAN-1982', 'DD-MON-YYYY'), 1300, NULL, 10);

CREATE TABLE SALGRADE(
GRADE NUMBER(2),
LOSAL NUMBER,
HISAL NUMBER);

INSERT INTO SALGRADE VALUES(1, 700,1200);
INSERT INTO SALGRADE VALUES(2, 1201,1400);
INSERT INTO SALGRADE VALUES(3, 1401,2000);
INSERT INTO SALGRADE VALUES(4, 2001,3000);
INSERT INTO SALGRADE VALUES(5, 3001,9999);

Question & Answers:

1) Display all the records in emp table?
select * from emp;
2) Display all the records in emp table where employee belongs to deptno 10?
select * from emp where deptno = 10

3) Display all the records in emp table where employee does not belong to deptno 30?

select * from emp where deptno != 30;

4) Display total number of records in Emp table?

select count(*) from emp;

5) Display emp table with salary descending order?

select * from emp order by sal desc

6) Display first five records in employee table?

select * from emp where rownum <= 5
7) Display all the records in emp table order by ascending deptno, descending salary?
select * from emp order by deptno asc, sal desc

8) Display all employees those who were joined in year 1981?

select * from emp where to_char(hiredate,'YYYY') = 1981;

9) Display COMM in emp table. Display zero in place of null.

select nvl(comm,0) from emp

10) Display the records in emp table where MGR in 7698,7566 and sal should be greater then 1500

select * from emp where mgr in(7698,7566) and sal > 1500

11) Display all employees where employees hired before 01-JAN-1981

select * from emp where hiredate < '01-JAN-1981'
12) Display all employees with how many years they have been servicing in the company?

select hiredate,round((sysdate-hiredate)/360) as years from emp

13) Display all employees those were not joined in 1981?

select * from emp where to_char(hiredate,'YYYY') != 1981;

14) Display all employees where their hiredate belongs to third quarter?

select * from emp where to_char(hiredate,'Q') = 3;

15) Display all employees where their salary is less then the Ford’s salary?

select * from emp where sal <(select sal from emp where ename='FORD');
16) Display all the records in EMP table along with the rowid?

select ename,rowid from emp;

17) Display all records in EMP table those were joined before SCOTT joined?

select * from emp where hiredate <(select hiredate from emp where ename='SCOTT')
18) Display all employees those who were joined in third quarter of 1981?

select * from emp where to_char(hiredate,'Q') = 3 and to_char(hiredate,'YYYY') = 1981

19) Add 3 months with hiredate in EMP table and display the result?

select hiredate, add_months(hiredate,3) from emp

20) Display the date for next TUESDAY in hiredate column?

select next_day(hiredate,'TUESDAY') from emp;

21) Find the date, 15 days after today’s date.

select sysdate+15 from dual

22) Write a query to display current date?

select sysdate from dual;
select current_date from dual;

23) Display distinct job from emp table?

select distinct job from emp

24) Display all the records in emp table where employee hired after 28-SEP-81 and before 03-DEC-81?

select * from emp where hiredate between '28-SEP-81' and '03-DEC-81'

25) Write a query that displays the employee’s names with the first letter capitalized and all other letters lowercase for all employees whose name starts with J, A, or M

select initcap(ename) from emp where ename like 'J%' or ename like 'A%' or ename like 'M%'

26) Display all jobs that are in department 10. Include the location of department in the output.

select job, loc from emp,dept where emp.deptno = dept.deptno and emp.deptno =10

27) Write a query to display the employee name, department name of all employees who earn a commission

select ename,dname from emp,dept where emp.deptno = dept.deptno and comm is not null;

28) Display the empno, ename, sal, and salary increased by 15%.

select empno, ename, sal actual_sal, (sal * 15/100) as Increased_sal from emp

29) Display ename, sal, grade. Use emp, salgrade table

select ename,sal,grade from emp,salgrade where sal between losal and hisal;

30) Display all employees and corresponding managers

select w.ename,w.sal,m.ename,m.sal from emp w, emp m where w.mgr = m.empno;

31) Display all the departments where employee salary greater then average salary of that department.

select ename,deptno, sal from emp a where sal > (select avg(sal) from emp where emp.deptno = a.deptno) order by deptno;

32) Display all employees whose salary greater then the manager salary?

select w.ename,w.sal,m.ename,m.sal from emp w, emp m where w.mgr = m.empno and w.sal > m.sal

33) Display employees where length of ename is 5

select * from emp where length(ename) =5

34) Display all employees where ename start with J and ends with S

select * from emp where ename like 'J%S'

35) Display all employees where employee does not belong to 10,20,40

select * from emp where deptno not in(10,20,40)

36) Display all employees where jobs does not belong to PRESIDENT and MANAGER?

select * from emp where job not in('PRESIDENT','MANAGER');

37) Display the maximum salary in the emp table

select max(sal) from emp

38) Display average salary for job SALESMAN

select avg(sal) from emp where job = 'SALESMAN'

39) Display all three figures salary in emp table

select * from emp where sal < = 999;
select * from emp where length(sal) = 3;
40) Display all records in emp table for employee who does not receive any commission

select * from emp where comm is not null

41) Display all ename where first character could be anything, but second character should be L?
select * from emp where ename like '_L%'

42) Display nth highest and nth lowest salary in emp table?

SELECT DISTINCT (a.sal) FROM EMP A WHERE &N = (SELECT COUNT (DISTINCT (b.sal)) FROM EMP B WHERE a.sal<=b.sal);
select distinct sal from (select ename,sal,dense_rank() over(order by sal desc) dr from emp) where dr = &x ;
43) Display all the departments where department has 3 employees?

select deptno from dept a where deptno in(select deptno from emp group by deptno having count(*)=3)

44) Display emp name and corresponding subordinates. Use CONNECT BY clause.

select lpad(' ',level+12)+ename from emp connect by prior empno = mgr start with mgr is null

Note: Please replace pipe symbol in the place of + sign for question 44. Pipe symbol is not displaying the blog. This is the reason, i used Plus sign here.

45) Display sum of salary for each department. The output should be in one record

select sum(decode(deptno,10,sal)) dept10, sum(decode(deptno,20,sal)) dept20, sum(decode(deptno,30,sal)) dept30, sum(sal) total_sal from emp

46) Display all department with Minimum salary and maximum salary?

select min(sal),max(sal) from emp;

47) Display all ename, sal, deptno,dname from emp, dept table where all department which has employees as well as department does not have any employees. This query should include non matching rows.

select dname,b.deptno, ename,sal from emp a, dept b where a.deptno(+) = b.deptno;
select dname,b.deptno, ename,sal from emp a right outer join dept b on a.deptno = b.deptno;

48) Display all ename, sal, deptno from emp, dept table where all employees which has matching department as well as employee does not have any departments. This query should include non matching rows.
Note: In the below query, employee will always have matching record in dept table. Emp, dept table may not be good example to answer this question.
select dname,b.deptno, ename,sal from emp a, dept b where a.deptno = b.deptno(+);
select dname,b.deptno, ename,sal from emp a left outer join dept b on a.deptno = b.deptno;
49) Display all ename, sal, deptno from emp, dept table where all employees which has matching and non matching department as well as all departments in dept table which has matching and non matching employees. This query should include non matching rows on both the tables.
Note: In the below query, employee will always have matching record in dept table. Emp, dept table may not be good example to answer this question.
select dname,b.deptno, ename,sal from emp a full outer join dept b on a.deptno = b.deptno
50) Display all ename, empno, dname, loc from emp, dept table without joining two tables
select * from emp,dept;

51) Display all the departments where department does not have any employees

select deptno from dept where not exists(select 1 from emp where emp.deptno = dept.deptno);

select deptno from dept where deptno not in(select deptno from emp);

52) Display all the departments where department does have atleast one employee

select * from dept a where exists(select 1 from emp b where b.deptno = a.deptno)

select * from dept a where deptno in(select deptno from emp b where a.deptno = b.deptno)

53) Display all employees those who are not managers?

select ename from emp a where not exists (select 1 from emp b where b.mgr = a.empno);

select ename from emp a where empno not in (select mgr from emp b where b.mgr = a.empno and mgr is not null);

54) Display ename, deptno from emp table with format of {ename} belongs to {deptno}

select ename+' belongs to '+deptno from emp

Note: Please replace pipe symbol in the place of + sign for question 44. Pipe symbol is not displaying the blog. This is the reason, i used Plus sign here.

55) Display total number of employees hired for 1980,1981,1982. The output should be in one record.

select
count(decode(to_char(hiredate,'YYYY'), 1980,hiredate)) total_hire_1980,
count(decode(to_char(hiredate,'YYYY'), 1981,hiredate)) total_hire_1981,
count(decode(to_char(hiredate,'YYYY'), 1982,hiredate)) total_hire_1982
from emp

56) Display ename, deptno from employee table. Also add another column in the same query and it should display ten for dept 10, twenty for dept 20, thirty for dept 30, fourty for dept 40

select ename,deptno, (case deptno
when 10 then 'Ten'
when 20 then 'Twenty'
when 30 then 'Thirty'
when 40 then 'fourty'
else 'others' end) as dept
from emp

57) Display all the records in emp table. The ename should be lower case. The job first character should be upper case and rest of the character in job field should be lower case.

select lower(ename) as ename, initcap(job) as job from emp
58) Display all employees those who have joined in first week of the month ?

select * from emp where to_char(hiredate,'W') = 1;

59) Display all empoyees those who have joined in the 49th week of the year?

select * from emp where to_char(hiredate,'WW') = 49;

60) Display empno, deptno, salary, salary difference between current record and previous record in emp table. Deptno should be in descending order.

SELECT empno,
ename,
job,
sal,
LAG(sal, 1, 0) OVER (ORDER BY sal) AS sal_prev,
sal - LAG(sal, 1, 0) OVER (ORDER BY sal) AS sal_diff
FROM emp;

61) Create table emp1 and copy the emp table for deptno 10 while creating the table

Create table emp1 as select * from emp where deptno=10

62) Create table emp2 with same structure of emp table. Do not copy the data

create table emp2 as select * from emp where 1=2

63) Insert new record in emp1 table, Merge the emp1 table on emp table.

insert into emp1 values(9999,'PAUL','MANAGER',7839,SYSDATE,8900,NULL,10);

MERGE
INTO emp tgt
USING emp1 src
ON ( src.empno = tgt.empno )
WHEN MATCHED
THEN
UPDATE
SET tgt.ename = src.ename,
tgt.job = src.job,
tgt.mgr = src.mgr,
tgt.hiredate = src.hiredate,
tgt.sal = src.sal,
tgt.deptno = src.deptno
WHEN NOT MATCHED
THEN
Insert(
Tgt.empno,
Tgt.Ename,
Tgt.Job,
Tgt.Mgr,
Tgt.Hiredate,
Tgt.Sal,
Tgt.Comm,
Tgt.Deptno)
values (src.empno,
src.ename,
src.job,
src.mgr,
src.hiredate,
src.sal,
src.comm,
src.deptno);

64) Display all the records for deptno which belongs to employee name JAMES?

select * from emp where deptno in(select deptno from emp where ename = 'JAMES')

65) Display all the records in emp table where salary should be less then or equal to ADAMS salary?

select * from emp where sal <= (select sal from emp where ename='ADAMS')

66) Display all employees those were joined before employee WARD joined?

select * from emp where hiredate < (select hiredate from emp where ename='WARD')

67) Display all subordinate those who are working under BLAKE?

Select ename from emp where mgr = (select empno from emp where ename='BLAKE')

68) Display all subordinate(all levels) for employee BLAKE?

select ename from emp start with empno = (select empno from emp where ename='BLAKE')
connect by prior empno = mgr

69) Display all record in emp table for deptno which belongs to KING's Job?

select * from emp where deptno in(select deptno from emp where job= (select job from emp where ename = 'KING'))

70) Display the employees for empno which belongs to job PRESIDENT?

select * from emp where empno in(select empno from emp where ename in(select ename from emp where JOB = 'PRESIDENT'));

71) Display list of ename those who have joined in Year 81 as MANAGER?

select * from emp where to_char(hiredate,'YYYY') = 1981 and job = 'MANAGER';

72) Display who is making highest commission?

select * from emp where comm = (select max(comm) from emp);

73) Display who is senior most employee? How many years has been working?

select * from emp where trunc(sysdate-hiredate)/365 = (select max(trunc(sysdate-hiredate)/365) from emp);

select * from emp where hiredate =(select min(hiredate) from emp)
74) Display who is most experienced and least experienced employee?

select * from emp where trunc(sysdate-hiredate)/365 = (select min(trunc(sysdate-hiredate)/365) from emp);

select * from emp where hiredate =(select max(hiredate) from emp)

75) Display ename, sal, grade, dname, loc for each employee.

select empno,ename,b.deptno,dname,grade from
emp a,dept b, salgrade c
where a.deptno = b.deptno
and sal between losal and hisal;

76) Display all employee whose location is DALLAS?

SELECT emp.ename, emp.JOB, emp.deptno
FROM emp
WHERE EXISTS
(SELECT 'x'
FROM dept d
WHERE d.DEPTNO = emp.DEPTNO
AND d.LOC = 'DALLAS') ;

select emp.ename, emp.job, emp.deptno
from emp
where deptno in(select deptno from dept where loc='DALLAS');

77) Display ename, job, dname, deptno for each employee by using INLINE view?

SELECT emp.ename,
emp.JOB,
emp.deptno,
dnames.dname
FROM emp
JOIN (select dname, deptno
from dept ) dnames ON emp.deptno = dnames.deptno

78) List ename, job, sal and department of all employees whose salary is not within the salary grade?

select ename, job, sal, dname
from emp, dept
where emp.deptno = dept.deptno
and not exists
(select ‘x’ from salgrade
where emp.sal between losal and hisal);

79) Use EMP and EMP1 table. Query should have only three columns. Display empno,ename,sal from both tables inluding duplicates.
select empno, ename, sal from emp
union all
select empno, ename, sal from emp1

80) Delete emp table for detpno 10 and 20.
delete emp where deptno in(10,20);

81) Delete all employees those are not getting any commission?
delete emp where comm is null;

82) Delete all employees those who employeed more then 28 years
delete emp where trunc(sysdate - hiredate)/365 > 28;

83) Add duplicate records in emp1 table. Delete the duplicate records in emp1 table.
insert into emp1 select * from emp1 where rownum <=1; commit; delete emp1 a where a.rowid <>(select min(b.rowid) from emp1 b where a.empno = b.empno);

84) Delete the employees where employee salary greater then average salary of department salary?
delete emp a where sal > (select avg(sal) from emp where emp.deptno = a.deptno);

85) Delete all employees those who are reporting to BLAKE?
Delete emp where ename in(Select ename from emp where mgr = (select empno from emp where ename='BLAKE'))

86) Delete all levels of employees those who are under BLAKE?
Delete emp where ename in(select ename from emp start with empno = (select empno from emp where ename='BLAKE')
connect by prior empno = mgr)

87) Delete all employees those who are only managers?
delete emp where ename in(select ename from emp a where empno in (select mgr from emp b where b.mgr = a.empno and mgr is not null))

88) Remove the department in dept table where dept does not have any employees?

delete dept where deptno not in(select deptno from emp where deptno is not null)

89) Remove all grade 2 employees in emp table?

delete emp where empno in(select empno from emp,salgrade where sal between losal and hisal and grade = 2)

90) Remove all the employees in SMITH's department

delete emp where deptno = (select deptno from emp where ename = 'SMITH')

91) Remove least paid employee who are reporting to BLAKE ?

delete emp where sal = (select min(sal) from emp where mgr =
(select empno from emp where ename = 'BLAKE')) and
ename in(select ename from emp where mgr =
(select empno from emp where ename = 'BLAKE'))

92) Remove all employees who were joined before SMITH joined?
delete emp where hiredate < (select hiredate from emp where ename='SMITH');
93) Rename the employee name JONES to ANDY


update emp set ename = 'ANDY' where ename = 'JONES'

94) Change the WARD's hiredate to one day ahead

update emp set hiredate = hiredate + 1 where ename = 'WARD'

95) Update MARTIN salary same as SMITH's salary

update emp set sal = (select sal from emp where ename = 'SMITH') where ename='MARTIN'

96) Increase the salary 5% for employee those who are earning commission less then 1000

update emp set sal = sal + (sal * (5/100)) where comm between 0 and 1000

97) Increase 250$ commission for BLAKE's team

update emp set comm = nvl(comm,0)+250 where mgr = (select empno from emp where ename='BLAKE');
98) Increase 100$ for employee who is making more then averge salary of his department?

update emp a set sal = sal + 150 where sal > (select avg(sal) from emp b where b.deptno = a.deptno)
99) Increase 1% salary for employee who is making lowest salary in dept 10

update emp set sal = sal + (sal* 1/100)
where
sal = (select min(sal) from emp where deptno = 10)
and deptno = 10

100) Reduce the commission amount from employee salary for each employee who were joined after ALLEN joined.

update emp set sal = sal - NVL(comm,0)
where empno in(select empno from emp where hiredate > (select hiredate from emp where
ename = 'ALLEN'))

101) Increase commission 10$ for employees those who are located in NEW YORK.

update emp a set comm = NVL(COMM,0) + 10
where deptno = (select deptno from dept where loc='NEW YORK');

Hope you enjoy all these questions... Please let me know if you have any comments or clarification... Feel free to click advertisement to maintain this blog. Thank you for visiting my Bog!!!.

Please refer more questions for PLSQL