Automatic CALCTASKDIMS and Empty Tasks

Some Old News

One relatively unsung enhancement to Essbase in 11.1.2.2 was a change to CALCTASKDIMS behavior.  Before 11.1.2.2, CALCTASKDIMS defaulted to a value of 1.  From 11.1.2.2,  Essbase selects a value for CALCTASKDIMS automatically unless overridden by the user with either the CALCTASKDIMS .cfg file setting or the SET CALCTASKDIMS calculation command.

So why am I blogging about this years after 11.1.2.2 came out?  First, there is a good theoretical reason why the above Essbase ‘enhancement’ might have a seriously negative effect on calculation performance, which can be especially surprising when it occurs following a supposed upgrade.  Second, this actually bit a coworker a few days ago, and it’s always satisfying (for me, if not my coworker) when empirical data and theory coincide.

Some Theory

Parallel calculation is explained in the Database Administrator’s Guide here; alternatively, I gave a quick overview in slides 12-17 of this Kscope 2011 presentation.   The important thing to know about CALCTASKDIMS is that it determines how many sparse dimensions are analyzed to create the ‘calculation task schedule’.  For example, CALCTASKDIMS 1 on Sample.Basic will treat each Market as a potential parallel task.  CALCTASKDIMS 2 on Sample.Basic will treat each combination of a Product and a Market as potential parallel tasks.  This means that the number of potential tasks can increase dramatically with the value of CALCTASKDIMS.  Note that CALCTASKDIMS has nothing to do with the number of parallel threads assigned to chew through the tasks in the list (that’s what CALCPARALLEL is for).

In all real-world databases, some of the potential tasks turn out to have no data; these are called ’empty tasks’.  The Database Administrator’s Guide contains the following warning:

If the ratio of empty tasks to the tasks specified in the calculation task schedule is greater than 50% (for example, 291 / 576), parallelism may not be giving you improved performance because of the high sparsity in the data model.

Increasing the value of CALCTASKDIMS will, at best, not reduce the proportion of empty tasks.  In real-world databases it is extremely likely to increase it; in my example above, ‘New York’ may be one of the non-empty tasks when CALCTASKDIMS is 1, but when CALCTASKDIMS is increased to 2, only some of the ‘New York’->Product tasks are non-empty.  This certainly does not mean that increasing CALCTASKDIMS always has a negative affect on performance.  But generally speaking, CALCTASKDIMS should (in my opinion, regardless of version) be selected by analysis and testing.  Oracle have not documented the algorithm used to automatically set CALCTASKDIMS in 11.1.2.2.

Some Practice

My coworker has been involved in a project to upgrade a client from an earlier release, and simultaneously migrate their applications to substantially more powerful (and expensive) hardware.  Naturally, a degree of performance improvement was anticipated.  However, the duration of one particular BSO aggregation increased dramatically.

Sure enough, no CALCTASKDIMS setting was specified in the original calculation script, resulting in an effective CALCTASKDIMS setting (pre-11.1.2.2) of 1.  In freshly-installed 11.1.2.3, Essbase selected CALCTASKDIMS 5.  This produced a calculation task schedule as follows:

OK/INFO - 1012679 - Calculation task schedule [23616,5298,1182,438,126,42,30,18,18,18,12,12,6,6,6].
OK/INFO - 1012680 - Parallelizing using [5] task dimensions.
OK/INFO - 1012681 - Empty tasks [22404,4992,932,350,88,30,24,14,14,14,8,12,6,6,6].

Note that well over 90% of the tasks are empty at most ‘steps’ in the schedule.  Poor performance is therefore, per the DBAG warning, not a surprise.  Forcing CALCTASKDIMS back to 1 improved matters; CALCTASKDIMS 2 and 3 ran even faster.

In summary, the automatically selected CALCTASKDIMS setting in 11.1.2.2 onward is not necessarily optimal; it may even be (much) worse than the CALCTASKDIMS 1 default in pre-11.1.2.2 versions.  With 11.1.1 now out of premier support, most remaining customers will be moving to later versions.  It’s worth watching out for this phenomenon – no one likes to see any aspect of performance get worse after spending money and time on an upgrade!

Leave a Reply

Your email address will not be published. Required fields are marked *