2
votes

Environment: Oracle database 19C

The table in question has a few number data type columns and one column of CLOB data type. The table is properly indexed and there is a nightly gather stats job as well.

Below are the operations on the table-

  • A PL/SQL batch procedure inserts 4 to 5 million of records from a flat file presented as an external table
  • After the insert operation, another batch process reads the rows and updates some of the columns
  • A daily purge process deletes rows that are no longer needed

My question is - should gather stats be triggered immediately after the insert and/or delete operations on the table ?

Per this Oracle doc Online Statistics Gathering for Bulk Loads, bulk loads only gather online statistics automatically when the object is empty. My process will not benefit from it as the table is not empty when I load data.

But online statistics gathering works for insert into select operations on empty segments using direct path. So next I am going to try append hint. Any thoughts... ?

1
What's the total number of rows in in the table? Optimizer statistics are more about skewedness and percentages than absolute numbers. So while you would almost certainly want to gather stats if you updated 5 million out of 6 million rows, you probably wouldn't want to gather stats every day if there are a billion total rows. Also, is the table partitioned and only certain partitions affected? If so, incremental statistics gathering may help significantly. - Jon Heller
I am only inserting rows and deleting rows. Table is not partitioned but I am considering automatic partition on a column called load_d which is a number field tied to a sequence. I load data from a flat file into the table and every row loaded from the file will have same load_id value. For next file loaded I will generate a new load_id and load data. Please see my comment to the answer below with my test results.I do not see any stale stats even after inserting 40 million rows . - RKA
Looks like Oracle 19C feature High-Frequency Automatic Optimizer Statistics is kicking in. Per doc docs.oracle.com/en/database/oracle/oracle-database/19/tgsql/… High-frequency automatic optimizer statistics collection complements the standard statistics collection job. By default, the collection occurs every 15 minutes, meaning that statistics have less time in which to be stale. Is there a way I can check if High-Frequency Automatic Optimizer Statistics is enabled ? - RKA

1 Answers

0
votes

Before Oracle 12c, it was best practise to gather statistics immediately after a bulk load. However, according to Oracle's SQL Tuning Guide, many applications failed to do so, therefore they automated this for certain operations.

I would recommend to have a look at the dictionary views DBA_TAB_STATISTICS, DBA_IND_STATISTICS and DBA_TAB_MODIFICATIONS and see how your table behaves:

CREATE TABLE t AS SELECT * FROM all_objects;
CREATE INDEX i ON t(object_name);

SELECT table_name, num_rows, stale_stats
  FROM DBA_TAB_STATISTICS WHERE table_name='T'
UNION ALL
SELECT index_name, num_rows, stale_stats 
  FROM DBA_IND_STATISTICS WHERE table_name='T';

TABLE_NAME   NUM_ROWS   STALE_STATS
T               67135   NO
I               67135   NO 

If you insert data, the statistics are marked as stale:

INSERT INTO t SELECT * FROM all_objects;

TABLE_NAME   NUM_ROWS   STALE_STATS
T               67138   YES
I               67138   YES

SELECT inserts, updates, deletes 
  FROM DBA_TAB_MODIFICATIONS 
 WHERE table_name='T';

INSERTS UPDATES DELETES
  67140       0       0

Likewise for updates and delete:

UPDATE t SET object_id = - object_id WHERE object_type='TABLE';
4,449 rows updated.

DELETE FROM t WHERE object_type = 'SYNONYM';
23,120 rows deleted.

INSERTS UPDATES DELETES
  67140    4449   23120

When you gather statistics, stale_stats becomes 'NO' again, and `DBA_TAB_MODIFICATIONS* goes back to zero (or an empty row)

EXEC DBMS_STATS.GATHER_TABLE_STATS(NULL, 'T');

TABLE_NAME   NUM_ROWS   STALE_STATS
T              111158   YES
I              111158   YES

Please note, that `INSERT /*+ APPEND */ gathers only statistics if the table (or partition) is empty. The restriction is documented here.

So, I would recommend in your code, after the inserts, updates and deletes are done, to check if the table(s) appear in USER_TAB_MODIFICATIONS. If the statistics are stale, I'd gather statistics.

I would also look into partitioning. Check if you can insert, update and gather stats in a fresh new partition, which would be a bit faster. And check if you can purge your data by dropping a whole partition, which would be a lot faster.