Automatic Table Maintenance Self-controlled
Using DB2 automatic table maintenance and want to get more control or simply control it yourselves? Still manually checking if a table or index is in need of reorganization and getting tired of it? Then you should read on. I’ve got a possible solution which will gain you more control and will make your life easier.
Ever heard of a stored procedure that can do the reorg checking for you? Maybe you have, but just in case you haven’t, IBM LUW has go two procedures for it, right out of the box. One for checking if a table needs a reorg and another for checking if an index needs a reorg. Both are held in the
SYSPROC schema. The one for checking the tables is called
REORGCHK_TB_STATS and the one for checking indexes
REORGCHK_IX_STATS. See the References section for more detail on these procedures.
I can hear you thinking; Ok that’s great, but how about the automated part of it? Well, the nice thing of procedures is that you can call them from other procedures. So, why not create our own procedure that executes the table and index reorg checks, evaluates the results and then decides to reorganize the table and/or index(es) based on the outcome? Let’s have a look at how we can setup automatic table maintenance, self-controlled.
Setup automatic reorganization procedure
Let’s examine what we need to perform the task at hand.
- We need to check which tables and indexes are in need of a reorg, for this we need to execute the REORGCHK procedures. These procedures will create a session table with the results, one for the reorg table and one for the reorg indexes stored procedure. Because these two procedures return two dynamic result sets, we need to add it to our stored procedure signature too
- Than we need to evaluate the results. Which tables and/or indexes need reorganization. The outcome of the evaluation is easily put into a temporary table together with the reorg command to execute
- Based on the evaluation we reorganize the table or index
- After reorganization it’s a good idea to collect newly statistics
Defining our stored procedure
CREATE PROCEDURE dba.housekeep_base_tables_sp() LANGUAGE SQL SPECIFIC housekeep_base_tables_sp DYNAMIC RESULT SETS 2 BEGIN … END
We add the
DYNAMIC RESULT SETS 2 to the signature because of the two stored procedures were using in it (
Because we want to evaluate and store the reorgchk results we create a temporary table to do so. This table can look like the one below. Please note that this table holds more columns that actually needed for our example. It’s just to give you an idea of what you can store more in an automatic table maintenance stored procedure in which you can evaluate more than just reorganizing a table or not. For instance automating runstats based on data change percentage. See my previous blog Runstats on data change percentage about details on those. Table type could be partitioned, non-partitioned, of a certain size.. all kinds of things that can give you greater control about what’s being executed in the end.
DECLARE GLOBAL TEMPORARY TABLE housekeeptabs ( table_schema VARCHAR(128) NOT NULL , TABLE_NAME VARCHAR(128) NOT NULL , statistics_profile VARCHAR(2048) , db2_reorg_line VARCHAR(256) , db2_runst_line VARCHAR(2048) , card BIGINT , task_type VARCHAR(30) , reorg_needed CHAR(1) , runstats_needed CHAR(1) , data_change_pct DECIMAL(17,2) ) ON COMMIT PRESERVE ROWS NOT LOGGED ON ROLLBACK PRESERVE ROWS WITH REPLACE;
Call REORGCHK procedures
Next we need to call both stored procedures from inside our own
housekeeping_base_tables_sp stored procedure. To achieve this we can put in the following lines
-- Determine reorg status of tables and indexes that need reorg -- This returns a dynamic result set (there for need to set the procedure to return 2 dynamic result sets) CALL SYSPROC.REORGCHK_TB_STATS('T','ALL'); CALL SYSPROC.REORGCHK_IX_STATS('T','ALL');
Filling the temporary table
Because I’m in favor of offline reorgs, I’ve split up the filling of the temporary table into two. One part filling the table for non-partitioned tables and another for partitioned tables.
Non-partitioned tables insert statement
INSERT INTO SESSION.housekeeptabs SELECT TRIM(a.tabschema) AS table_schema , TRIM(a.tabname) AS TABLE_NAME , a.statistics_profile AS statistics_profile , VARCHAR ( 'reorg table' || SPACE(1) || RTRIM (a.tabschema) || '.' || RTRIM (a.tabname) || SPACE(1) ) AS db2_reorg_line , VARCHAR ( 'runstats on table' || SPACE(1) || RTRIM (a.tabschema) || '.' || RTRIM (a.tabname) || SPACE(1) || CASE WHEN a.statistics_profile IS NOT NULL THEN 'use profile ' ELSE 'on all columns with distribution on key columns and detailed indexes all' END ) AS db2_runst_line , a.card , CASE WHEN ( a.card >= #bigTableNrRows) THEN 'BIG_TABLE' ELSE 'NORMAL_TABLE' END , CASE WHEN (a.tabschema, a.tabname) IN ( SELECT TABLE_SCHEMA, TABLE_NAME FROM TABLE ( SELECT TABLE_SCHEMA, TABLE_NAME FROM SESSION.TB_STATS WHERE REORG LIKE '%*%' UNION SELECT DISTINCT TABLE_SCHEMA, TABLE_NAME FROM SESSION.IX_STATS WHERE REORG LIKE '%*%' ) AS TABLES) THEN 'T' ELSE 'F' END AS REORG_NEEDED , CASE WHEN (DBA.MON_DATA_CHANGE_THRESHOLD_EXCEEDED(a.tabschema, a.tabname, a.card, #runstats_data_change_pct_threshold) = TRUE) -- data change threshold exceeded OR (a.type = 'V' AND substr(a.PROPERTY, 13,1) = 'Y' AND a.STATISTICS_PROFILE IS NOT NULL) -- statistical views with statistics profile set OR (a.stats_time IS NOT NULL AND (a.alter_time > a.stats_time)) -- table changed, but no runstats was executed afterwards THEN 'T' ELSE 'F' END AS RUNSTATS_NEEDED , DBA.MON_GET_DATA_CHANGE_PERCENTAGE(a.tabschema, a.tabname, a.card) AS DATA_CHANGE_PCT FROM SYSCAT.TABLES a WHERE (a.type = 'T' OR (a.type = 'V' AND substr(a.PROPERTY, 13,1) = 'Y' AND a.STATISTICS_PROFILE IS NOT NULL)) -- tables and statistical views with statistics profile set AND a.tabschema NOT LIKE ('SYS%') -- Exclude system tables AND a.tabschema NOT LIKE ('ADMIN%') -- Exclude admin explain tables AND a.tabschema != ('DBA'); -- Exclude DBA tables COMMIT;
The insert for partitioned tables is almost the same. The reorg command has to reorg the data partition instead of the table in whole. To be able to do this we need to join the
SYSCAT.DATAPARTITIONS to our
JOIN SYSCAT.DATAPARTITIONS p ON p.tabschema = a.tabschema AND p.tabname = a.tabname
The reorg command changes like this
VARCHAR ( 'reorg table ' || RTRIM(p.tabschema) || '.' || RTRIM(p.tabname) || ' on data partition ' || RTRIM(p.datapartitionname) ) AS db2_reorg_line
Evaluate the results
Evaluation if reorganization is needed is done by in a loop, looping over all tables in the temporary table and then looking at the flag
The whole procedure is displayed below. It is a stripped version of the one I’m using myself. You can add more control to it, add logging into separate tables for back-tracking and history or to display the results in a specially designed maintenance application. The possibilities are endless. Note that the procedure described here is also evaluating runstats collection. This is not further described in this blog. More information about the evaluation of runstats can be found in a blog article I wrote earlier and can be found here. It’s merely there to give you a broader view of the possibilities when writing your own maintenance tasks.
I’m quite conservative in determining if a table or partition needs a reorg. I’m reorganizing a table or partition (including all of its indexes) when the reorg check indicates that the table needs a reorg or when an index of that table needs an index. But you can make this more fine grained and more sophisticated. One can choose to only reorg what is needed. To evaluate you can have a look at the different formula’s that the procedures are returning. A really good article about those formula’s and how they are calculated can be found on the IBM Knowledge Center, but I would certainly advice you to have a look at Ember Crooks DB2Commerce website for an excellent article on these.
Please also note that there is a drawback in regard to partitioned tables. In my database only the last partition of every table is changed, therefor when that last partition needs a reorg we also collect statistics on that table (sadly runstats collection is not available on a data partition but can only be executed for the whole table). Because of this specific behavior at my database the table that holds this last partition is only collecting runstats once when the partition needs reorganization. But if you have multiple partitions in a table that need reorganization, the current solution will collect statistics on the table every time a partition is reorganized. And that’s merely a waste of time. Better solution for such cases are to collect statistics after all partitions in a single table that need reorganization have been reorganized.
Won’t compile trick
And guess what when you try to install the procedure: It won’t compile and will give you the well-known “Don’t exists”
SQL-204. You can get around this by firstly calling the stored procedures just before creating the procedure. Now the DB2 compiler “sees” the two session tables that are created by the two reorg check procedures. Calling of the two stored procedures prior to installing the housekeeping stored procedure is already incorporated in the end result described above.
Being in control about your maintenance tasks is always a good idea. The method described above shows how relatively easy you can control reorganization and collection of statistics in a single automatic table maintenance self-controlled stored procedure. The procedure can be called from a scheduler at the time of your liking, for instance when the load on your system is the lowest, outside of business hours of just on demand. It gives you the flexibility you as a DBA need to be in control.