Sunday, November 8, 2009

Dynamic Sampling in Oracle

I started my career with oracle 7.0 and i remember, in oracle7.x/8.x version, optimizer use Rule based optimizer if one or all of the tables does not have any statistics in the join query. In oracle9i R2, dynamic sampling query is introduced and it has ability to sample the table and collect the better statistics dynamically during the hard parse. This sampling takes place only at hard parse time and is used to dynamically generate better statistics for the optimizer to use, hence the name dynamic sampling.

Oracle optimizer use the default statistic values when table does not have any statistics and dynamic sampling query feature is disabled. The default statistics values are well documented in Performance Tuning Guide.

How do we set Dynamic sampling parameter? There are couple of ways, we can set the parameter.

1. We can set this in session level or instance level.
2. Dynamic sampling query hint in query level. (Hint)

Connected to:
Oracle Database 10g Enterprise Edition Release 10.2.0.3.0 - Production
With the Partitioning, OLAP and Data Mining options

SQL> alter session set OPTIMIZER_DYNAMIC_SAMPLING = 2;

Session altered.

SQL> alter system set OPTIMIZER_DYNAMIC_SAMPLING = 2;

System altered.

SQL> select /*+ dynamic_sampling(emp 0) */ * from scott.emp;

When does dynamic sampling is useful?
1. It is useful when table data is drastically changing and statistic becomes stale very often.
2. Global temporary table is good place to use dynamic sampling feature.

Dynamic sampling Levels? Optimizer Dynamic Sampling has 11 levels between 0 to 10. What are these levels? The answer is pretty straight forward and it is clearly documented in Oracle Performance Tuning Guide. In Oracle9i R2, the default setting for dynamic sampling level is 1. At this setting, the optimizer will tend to use dynamic sampling only in some cases when a table is unanalyzed. This includes the case of global temporary tables without statistics but doesn’t ensure that dynamic sampling will be used against all unanalyzed tables. In Oracle Database 10g Release 1 and above, the default setting is 2. This setting ensures that the optimizer will have a good estimate whenever it optimizes a query that accesses a table with no statistics.

The source of this Article is Asktom. Please read this link and this link has very pretty good information about Dynamic sampling.

Wednesday, November 4, 2009

Restoring old statistics

I see some DBA's, they won't backup the statistics before collect the new statistics. If we don't have backup, and optimizer behaves differently with new statistics, then we are in trouble. If we have old statistic backup, then we can resolve the performance issue by restoring the old statistics. Oracle highly recommends to take the backup of current statistics before we collect new statistics.

Oracle10g has a new feature which will save the statistics before overwriting the new statistics.

There are couple of ways, we can backup the statistics.

1. We can export the statistics in a table and import back when ever we need. I discussed this approach in different thread. This approach is mostly useful when we transfer the statistics from one server to another server. Please refer this thread Post

2. Since Oracle10g, whenever statistics are modified, old version of statistics are saved automatically before overwriting the new statistics. However automatic saving of old statistics will not take place if you use ANALYZE in 10g for statistics collection.

I am going to discuss how the statistics are saved and how do we restore the old statistics in Approach 2.

Where does oracle store the statistics? Oracle DBA_OPTSTAT_OPERATIONS table contains a log history of statistics collected on the database at various level such as table, schema or database. We can restore older statistics anytime by making use of START_TIME and END_TIME values from DBA_OPTSTAT_OPERATIONS table.

How does Oracle maintain the Statistics History? GET_STATS_HISTORY_AVAILABILITY is a Function in DBMS_STATS package that returns oldest timestamp where statistics history is available. Users cannot restore statistics to a time stamp older than the oldest time stamp. Here is the query to find the statistics history starting timestamp.
SQL> select dbms_stats.get_stats_history_availability from dual;

GET_STATS_HISTORY_AVAILABILITY
---------------------------------------------------------------------------
06-OCT-09 09.52.45.351000000 PM -05:00

SQL>

Oracle maintain the history statistics based on the retention value. Oracle Purge the statistics automatically if it exceeds the statistics retention setting. By defualt, oracle purge the statistics if it is older then 31 days. We can also alter this retention period by using DBMS_STATS.ALTER_STATS_HISTORY_RETENTION procedure. PURGE_STATS is procedure in DBMS_STATS package that enables you to manually purge old statistics beyond a time stamp.

SQL> execute dbms_stats.alter_stats_history_retention(20);

PL/SQL procedure successfully completed.

As per the above code, oracle keep 20 days statistics history.

SQL> execute dbms_stats.purge_stats('06-OCT-09 09.52.45.351000000 PM -05:00');

PL/SQL procedure successfully completed.

SQL>

As per the above code, oracle purge the statistics prior to 06-OCT-09.

How do we recover the statistics? Oracle RESTORE_TABLE_STATS procedure restores statistics of a table as of a specified timestamp. It also restores the statistics of associated indexes and columns. Let me take EMP table in scott Schema and restore previous statistics.

Here are the steps to restore the previous timestamp statistics:

Step1: Let us check the past statistics and what date the statistics were collected.

Connected to:
Oracle Database 10g Enterprise Edition Release 10.2.0.3.0 - Production
With the Partitioning, OLAP and Data Mining options

SQL> select STATS_UPDATE_TIME from dba_tab_stats_history where table_name='EMP';

STATS_UPDATE_TIME
---------------------------------------------------------------------------
30-OCT-09 06.25.54.809000 PM -04:00
07-NOV-09 07.23.23.504000 AM -05:00
07-NOV-09 07.27.31.709000 AM -05:00
07-NOV-09 07.29.54.324000 AM -05:00

SQL> select last_analyzed from dba_tables where
2 table_name='EMP' and owner='SCOTT';

LAST_ANAL
---------
11-NOV-09

SQL> select sysdate from dual;

SYSDATE
---------
11-NOV-09

SQL>

Step2 : Let us restore the statistics which we collected on 30-OCT-2009.

SQL> execute dbms_stats.restore_table_stats('SCOTT','EMP','30-OCT-09 06.25.54.809000 PM -04:00');

PL/SQL procedure successfully completed.

SQL> select last_analyzed from dba_tables where
2 table_name='EMP' and owner='SCOTT';

LAST_ANAL
---------
30-OCT-09

SQL>


At what circumstances, do we restore the Statistics? We have performance issue in the database and DB was fine yesterday and since yesterday, there is no major change on database. Oracle10g automatic statistic gathering is turned on and last statistics gathering was close to the time that users started complanining about performance. At this circumstances, may be new statistics could be a culprit. We can restore the old statistics and see if we have any performance issue.

Restriction on restoring the statistics:

1. It does not have ability to restore user-defined statistics.

2. Old statistics will not be saved if we use ANALYZE command to gather statistics. Hence, we can recover the statistics.

Here is the input(Help) to write this topic.

Monday, November 2, 2009

Automatic Statistics Gathering in Oracle10g

Oracle10g has one of the automated feature which is called Automatic Statistics gathering. Oracle gathers statistics on all database objects automatically and maintains those statistics in a regularly scheduled maintenance job. When the DB is created, job is automatically created and started to collect statistics on database object with missing or stale statistics. The job GATHER_STATS_JOB calls the procedure dbms_stats.gather_database_stats_job_proc to gather statistics during the maintenance window.

What does GATHER_DATABASE_STATS_JOB_PROC do? Unfortunately, the package body is wrapped, so without access to the original source code, the implementation details are obscured. This procedure collects statistics on database objects when the object has no previously gathered statistics or the existing statistics are stale because the underlying object has been modified significantly (more than 10% of the rows).

This is an internal procedure, but its operates in a very similar fashion to the DBMS_STATS.GATHER_DATABASE_STATS procedure using the GATHER AUTO option. The primary difference is that the DBMS_STATS.GATHER_DATABASE_STATS_JOB_PROC procedure prioritizes the database objects that require statistics, so that those objects which most need updated statistics are processed first. This ensures that the most-needed statistics are gathered before the maintenance window closes.

How do we enable this feature? Automatic statistics gathering is enabled by default when a database is created, or when a database is upgraded from an earlier database release. The following condition should be met to enable this feature.

1. GATHER_STATS_JOB should be enabled.
2. statistics_level should be TYPICAL or ALL.

We can enable the job as below when the job is already disabled for any reason.

Connected to:
Oracle Database 10g Enterprise Edition Release 10.2.0.3.0 - Production
With the Partitioning, OLAP and Data Mining options

SQL> connect sys/password@orcl as sysdba
Connected.
SQL> BEGIN
2 DBMS_SCHEDULER.ENABLE('GATHER_STATS_JOB');
3 END;
4 /

PL/SQL procedure successfully completed.

SQL>

What time does automatic statistic collection run? By default this job runs within a maintenance window between 10 P.M to 6 A.M week nights and all day on weekends. But in oracle11g, i believe, the maintenance window is not same as Oracle10g.

Do we need to collect statistics in other then the scheduled window? In few Scenario's, it would be required to gather statistics manually.

1. In some cases, we might need to collect the statistics in specific objects other than this maintenance window. If the table data are changed rapidly and optimizer behaves differently due to data changes, then we need to gather statistics manually. We can not wait till the maintenance window open.

2. If the table is bigger and it is failing in the maintenance window for some reason, then you can lock that table during the maintenance window and unlock the table and gather the statistics manually and lock back again.

3. If you have specific table and you want to gather statistics with different option, then you can lock those tables during the maintenance window and gather statistics manually during day time.

Should we change the parameter values in Automatic statistic collection? Oracle recommends not to change the default values for automatic statistics collection. In case, if you need to gather statistics something differently, then lock the table during the maintenance window and unlock the table and gather statistics manually with specfic parameter and lock the table back again.

We have get_param function to find out the parameter values. Also set_param procedure to change the default values for dbms_stats pacakge parameters.