MDX and Missing Suppression

One feature of all Essbase query tools is the ability to suppress (i.e. remove from the result set) #Missing values, in most cases via a simple, on-or-off option that applies to full rows. In MDX, #Missing suppression is slightly more involved, but also has some interesting subtleties that make possible queries that would be harder or impossible (at least, in one step) to achieve using other tools.

In this post I’m going to discuss the use of (all?) the various methods I’ve found for #Missing suppression in MDX. The official documentation certainly doesn’t explicitly mention all of these in the context of #Missing suppression, so I hope that I can offer some tips beyond a regurgitation of the Technical Reference. Something I’m not going to talk about except in the most general terms is the performance implications of the various options. Since, as this post will show, there are many ways to achieve the same query result – and even more legitimate permutations of these techniques combined – that is an area that would warrant further investigation.

Specifically, the available methods (links are to version 11.1.2.3 Technical Reference) are:

Three items are intentionally missing from this list:

First, the MaxL alter session option “set dml_output get_missing_cells” sounds like it should relate to #Missing suppression, but it’s relevant only to cells of date type, text type or to which format strings are applied.

Second, NONEMPTYBLOCK. NONEMPTYBLOCK is not a reliable #Missing suppression tool, but it is discussed below, primarily to explain why!

Third, the NONEMPTYMEMBER and NONEMPTYTUPLE optimization properties. Again, these are discussed only to explain why they don’t help with #Missing suppression.

A final preliminary note about the code and data in my examples.  They are all queries against the ASOsamp.Sample database, loaded with the default “dataload.txt” file, and I have specified “alter session set dml_output alias off;” so that member names are shown rather than aliases.  I’m assuming some familiarity with the structure of the ASOsamp.Sample database, and also sufficient knowledge of MDX to interpret the queries themselves without any explanation.  Specifically, this means understanding the following MDX elements:

  • Keywords SELECT, ON AXIS(), FROM, and WHERE
  • Functions CrossJoin, Descendants, Children and Levels
  • Tuples, e.g. “([Jan], [Sales])”
  • Sets, e.g. {[Jan], [Feb], [Mar]}

There are many good MDX materials online, but an excellent guide specifically for Essbase is William Hodges’ whitepaper from Kscope13. A (free) ODTUG “associate” membership is required. Gary Crisci’s chapter in “Developing Essbase Applications” is another highly-recommended resource.

So, on to the code!

NON EMPTY

The most straightforward and reliable #Missing suppression technique in MDX is to apply the NON EMPTY clause as part of an axis specification.

For example, the output of the following query contains #Missing values. Rows containing #Missing values are shown in bold:

SELECT {[Revenue], [Transactions]} ON AXIS(0), 
       {Descendants([Electronic World])} ON AXIS(1)
  FROM ASOsamp.Sample
 WHERE ([Jan]);

Axis-1              (Revenue)           (Transactions)
+-------------------+-------------------+-------------------
(Electronic World)            542808.16                3270
(072000)                       141483.5                 799
(075477)                       #Missing            #Missing
(078901)                       #Missing            #Missing
(081855)                      116020.43                 755
(086691)                      148911.85                 888
(091543)                      136392.38                 828
(096556)                       #Missing            #Missing

The same query with the NON EMPTY clause applied to the AXIS(1) specification produces the results below, with #Missing rows suppressed:

SELECT {[Revenue], [Transactions]} ON AXIS(0),
       NON EMPTY {Descendants([Electronic World])} ON AXIS(1)
  FROM ASOsamp.Sample
 WHERE ([Jan]);

 Axis-1              (Revenue)           (Transactions)
+-------------------+-------------------+-------------------
 (Electronic World)            542808.16                3270
 (072000)                       141483.5                 799
 (081855)                      116020.43                 755
 (086691)                      148911.85                 888
 (091543)                      136392.38                 828

This is a very simple, two-dimensional query.  It is very common to want more than one dimension on rows. How should NON EMPTY be applied in that case? One (not totally unreasonable) approach might be to add further axes, and apply NON EMPTY to each of them:

SELECT {[Revenue], [Transactions]} ON AXIS(0),
       NON EMPTY {Descendants([Electronic World])} ON AXIS(1),
       NON EMPTY {[Geography].Children} ON AXIS(2)
  FROM ASOsamp.Sample
 WHERE ([Jan]);

Axis-2              Axis-1              (Revenue)           (Transactions)
+-------------------+-------------------+-------------------+-------------------
(North East)        (Electronic World)            116020.43                 755
(North East)        (072000)                       #Missing            #Missing
(North East)        (081855)                      116020.43                 755
(North East)        (086691)                       #Missing            #Missing
(North East)        (091543)                       #Missing            #Missing
(South)             (Electronic World)            426787.73                2515
(South)             (072000)                       141483.5                 799
(South)             (081855)                       #Missing            #Missing
(South)             (086691)                      148911.85                 888
(South)             (091543)                      136392.38                 828

But evidently this doesn’t produce the desired effect; #Missing rows are still appearing in the result set, despite the output being filtered to some degree (compare the list of stores on AXIS(1) with the list in the totally unfiltered query above).  What’s happening is that the members on AXIS(1) and the members on AXIS(2) are being evaluated for “NON EMPTY-ness” separately, in the context of all (i.e. before suppression) members of the remaining axes and WHERE clause.  In this example, if there is non-#Missing data at ([Revenue], [Jan]) or ([Transactions], [Jan]) for a given store on AXIS(1), in any of the potential regions on AXIS(2), that store will not be eliminated by NON EMPTY.  The same applies to each of the regions on AXIS(2), which are evaluated for “NON EMPTY-ness” against all of the potential stores on AXIS(1).

Attempting to suppress #Missing with NON EMPTY on multiple axes is probably quite a common MDX misunderstanding (common enough to have its own Support Document, ID 1277490.1).  In fact, to suppress the #Missing combinations of Store and Geography, the two dimensions must be combined into one axis using a CrossJoin.  As long as the NON EMPTY clause is not used, there is no difference in the data returned by the following two queries1 (the metadata for each row appears differently, with two distinct one-dimensional axes in the first example; and a two-dimensional tuple on a single axis in the second):

SELECT {[Revenue], [Transactions]} ON AXIS(0),
       {[Geography].Children} ON AXIS(1),
       {Descendants([Electronic World])} ON AXIS(2)
  FROM ASOsamp.Sample
 WHERE ([Jan]);

SELECT {[Revenue], [Transactions]} ON AXIS(0),
       {CrossJoin({[Geography].Children}, {Descendants([Electronic World])})} ON AXIS(1)
  FROM ASOsamp.Sample
 WHERE ([Jan]);

However, when the NON EMPTY clause is applied, the CrossJoin behaves differently, providing ‘conventional’ #Missing suppression:

SELECT {[Revenue], [Transactions]} ON AXIS(0),
       NON EMPTY {CrossJoin({[Geography].Children}, {Descendants([Electronic World])})} ON AXIS(1)
  FROM ASOsamp.Sample
 WHERE ([Jan]);

Axis-1              (Revenue)           (Transactions)
+-------------------+-------------------+-------------------
(North East, Electr           116020.43                 755
(North East, 081855           116020.43                 755
(South, Electronic            426787.73                2515
(South, 072000)                141483.5                 799
(South, 086691)               148911.85                 888
(South, 091543)               136392.38                 828

It is worth noting that NON EMPTY can also be used to suppress #Missing columns.  Column suppression is not available in some other Essbase query tools, such as the ‘legacy’ Add-In or Smart View (an option is available but inapplicable to Essbase connections).

NonEmptySubset

If applied to an axis simply for #Missing suppression, the NonEmptySubset function behaves identically to NON EMPTY.  For example:

SELECT {[Revenue], [Transactions]} ON AXIS(0),
       {NonEmptySubset(CrossJoin({[Geography].Children}, {Descendants([Electronic World])}))} ON AXIS(1)
  FROM ASOsamp.Sample
 WHERE ([Jan]);

Axis-1              (Revenue)           (Transactions)
+-------------------+-------------------+-------------------
(North East, Electr           116020.43                 755
(North East, 081855           116020.43                 755
(South, Electronic            426787.73                2515
(South, 072000)                141483.5                 799
(South, 086691)               148911.85                 888
(South, 091543)               136392.38                 828

The function also accepts two additional, optional parameters.  First, an expression against which whatever set is passed can be evaluated, so that, for example, the ([Feb], [Sales]) of only those products having ([Jan], [Sales]) can be returned – potentially, a useful feature.  Second, one or more dimensions can be specified, which will restrict the output of NonEmptySubset.  So a CrossJoin of (e.g.) Geography and Stores can be evaluated for non-#Missing values, while only returning the set of non-Missing Geography members.

Clearly, NonEmptySubset can do some interesting things that go beyond straightforward #Missing suppression, and a full explanation is beyond the scope of this post. Since NonEmptySubset offers additional functionality, my guess is that its performance for “vanilla” #Missing suppression will be inferior to the NON EMPTY axis specification. It would be interesting to benchmark the two.

Leaves

The Leaves function will return all (and only) the leaf-level members that contribute to the value of an upper-level member.  For example:

SELECT {[Revenue], [Transactions]} ON AXIS(0),
       {Leaves([Electronic World])} ON AXIS(1)
  FROM ASOsamp.Sample
 WHERE ([Jan]);

The above query is functionally identical to the following:

SELECT {[Revenue], [Transactions]} ON AXIS(0),
       NON EMPTY {Descendants([Electronic World], [Stores].Levels(0), LEAVES)} ON AXIS(1)
  FROM ASOsamp.Sample
 WHERE ([Jan]);

However, the Leaves function has a number of limitations when compared to NON EMPTY:

  • As the name suggests, it will only return leaf-level members. Note that level-0 shared members that are parents in their primary hierarchies are not leaf-level.
  • It can only be used against ASO databases
  • It can only be used against stored hierarchies. Attempting to use Leaves against a dynamic hierarchy returns Essbase error 1200655

As with NON EMPTY, use of Leaves to suppress #Missing combinations when more than one dimension appears on each row is possible by using the CrossJoin function.  Unlike NON EMPTY, the Leaves function is used inside the CrossJoin.  Confusingly (to me, at least, and not explained in the documentation), Leaves need only be applied to one of the two sets inside the CrossJoin – though it does not matter which, and applying it to both does no harm.  Consequently, all three of the following queries are functionally equivalent, and successfully suppress all #Missing rows.  Note that the LEAVES specification in the Descendants function is not related to the Leaves function, which is shown in bold:

SELECT {[Revenue], [Transactions]} ON AXIS(0),
       {CrossJoin(Leaves([Electronic World]), Descendants([MTD], [Time].Levels(0), LEAVES))} ON AXIS(1)
  FROM ASOsamp.Sample;

SELECT {[Revenue], [Transactions]} ON AXIS(0),
       {CrossJoin({Descendants([Electronic World], [Stores].Levels(0), LEAVES)}, Leaves([MTD]))} ON AXIS(1)
  FROM ASOsamp.Sample;

SELECT {[Revenue], [Transactions]} ON AXIS(0),
       {CrossJoin({Leaves([Electronic World])}, Leaves([MTD]))} ON AXIS(1)
  FROM ASOsamp.Sample;

Filter

The Filter function can be used to test for #Missing values, restricting results accordingly.  There are two ways of testing for #Missing values; via the IsEmpty function, and by comparison to MISSING.  Note that there is no “#” prefix used in MDX, as there would be in BSO member formulas or calculation scripts.

For example:

SELECT {[Revenue], [Transactions]} ON AXIS(0),
       {Filter({[Geography].Children}, NOT IsEmpty([Revenue]))} ON AXIS(1)
  FROM ASOsamp.Sample
 WHERE ([Jan]);

Axis-1 (Revenue) (Transactions)
+-------------------+-------------------+-------------------
(North East) 4195968.65 25097
(South) 3281934.06 19403

SELECT {[Revenue], [Transactions]} ON AXIS(0),
       {Filter({[Geography].Children}, [Revenue] <> MISSING)} ON AXIS(1)
  FROM ASOsamp.Sample
 WHERE ([Jan]);

Axis-1 (Revenue) (Transactions)
+-------------------+-------------------+-------------------
(North East) 4195968.65 25097
(South) 3281934.06 19403

From a functional perspective – performance benchmarking would be an interesting exercise – there is no difference between “IsEmpty([member])” and “[member] <> MISSING”.  As with NonEmptySubset, Filter can be used to do something that NON EMPTY can’t: Restrict on members or tuples other than those being returned in the result set. A set of Products can be filtered on non-#Missing ([Jan], [Sales]), while retrieving ([Feb], [Sales]), for example.  Again, this type of query goes beyond the scope of this post.

NONEMPTYBLOCK

NONEMPTYBLOCK is an undocumented, BSO-only keyword that appears in the MDX generated by Hyperion Planning when the ‘Suppress Missing Block’ form option is selected. It has received wider exposure through Cameron Lackpour’s blog.

NONEMPTYBLOCK restricts the selection on an axis to existing blocks (that is, sparse combinations) and can make a significant improvement to performance. However, NONEMPTYBLOCK is not particularly useful as a #Missing suppression technique in queries: While filtering out non-existent blocks may eliminate some #Missing values in the result set, it is not guaranteed to eliminate all. For example, consider a query against Sample.Basic on ([Dec], [Sales]) for every combination of Product and Market. NONEMPTYBLOCK will eliminate empty sparse combinations where absolutely no data exists for a particular Product / Market. It will not eliminate remaining Product / Market combinations that happen to have no ([Dec], [Sales]), but do have data in other Periods and / or Accounts.

It may be advantageous to combine a NONEMPTYBLOCK sparse axis with a NON EMPTY dense axis to take advantage of the performance improvement offered by NONEMPTYBLOCK, while ensuring complete #Missing suppression.

NONEMPTYMEMBER and NONEMPTYTUPLE

The purpose of NONEMPTYMEMBER / NONEMPTYTUPLE is to improve performance, by making the execution of a time-consuming or complex formula conditional on finding a non-#Missing value at a more quickly evaluated member or tuple. For example, the execution of a complex formula that calculates the net present value of payments on a loan can be made conditional on the current balance of the loan not being #Missing.

It might seem that it could also be leveraged to provide #Missing suppression, but this is not the case. When NONEMPTYMEMBER or NONEMPTYTUPLE prevents the evaluation of a formula / calculated member, that unevaluated formula / calculated member still returns #Missing. For example:

WITH MEMBER [Measures].[NonEmptyMemberRevenue] AS
	  'NONEMPTYMEMBER [Revenue]
		[Revenue]'
SELECT {[Jan]} ON AXIS(0),
       {CrossJoin({[NonEmptyMemberRevenue]}, {Descendants([Electronic World])})} ON AXIS(1)
  FROM ASOsamp.Sample;

 Axis-1              (Jan)
+-------------------+-------------------
 (NonEmptyRevenue, E           542808.16
 (NonEmptyRevenue, 0            141483.5
 (NonEmptyRevenue, 0            #Missing
 (NonEmptyRevenue, 0            #Missing
 (NonEmptyRevenue, 0           116020.43
 (NonEmptyRevenue, 0           148911.85
 (NonEmptyRevenue, 0           136392.38
 (NonEmptyRevenue, 0            #Missing

In Summary

In summary, and probably not surprisingly, NON EMPTY should be your default option. It “just works” – provided you understand the interaction of multiple dimensions.

Against ASO, Leaves will often do better than NON EMPTY for large stored hierarchies (the Technical Reference suggests that the cut-off is around 10,000 members, but testing is recommended!), assuming you only want leaf-level members. Against BSO, try using NONEMPTYBLOCK in preference to NON EMPTY for sparse dimensions, but remember that dense dimensions will still need to be restricted with one of the other techniques.

NonEmptySubset and Filter are included above primarily for completeness, but may be worth testing (for performance) in specific cases. As are combinations – a query that uses Filter to restrict output to rows with e.g. Sales > 10,000 will automatically suppress #Missing, but this does not mean that adding NON EMPTY to the axis will provide no benefit.

And finally (especially for an Add-In veteran) it’s a nice “freebie” that all of the above techniques work as well for columns as for rows.

Show 1 footnote

  1. The axes 2, 3 and so on are technically speaking not rows – MDX provides the aliases “PAGE” and “CHAPTER” for axes 2 and 3 respectively, after which numbers must be used – but MDX flattens these additional axes into rows in order to display results as a single table. This is a different approach to some other Essbase retrieval tools such as the Report Writer, which can create multiple tables for each tuple in a “page” axis

Leave a Reply

Your email address will not be published. Required fields are marked *