This post expands on a concept I touched on a few years back in the context of #Missing suppression (see the section on NONEMPTY in this post) and specifically the “perspective” from which any kind of filter – for #Missing or not – is evaluated. This question on the Essbase Users board got me thinking about the topic again, and the potential MDX ‘gotcha’ it presents.
A First Attempt
Let’s suppose we’re running a query against Sample.Basic. I’m looking at Profit for the Markets in the [East] region, and [Colas] Products. Here’s my first attempt at a query and its results:
SELECT {[Profit]} ON AXIS(0), NON EMPTY CrossJoin({[Colas].Children}, {[East].Children}) ON AXIS(1) FROM [Sample].[Basic]; Axis-1 (Profit) +----------------------------------+---------------- (Cola, New York) 11129 (Cola, Massachusetts) 5105 (Cola, Florida) 942 (Cola, Connecticut) 1217 (Cola, New Hampshire) 367 (Diet Cola, New York) 1114 (Diet Cola, Florida) 1114 (Caffeine Free Cola, New York) 413 (Caffeine Free Cola, New Hampshire 413
What’s Going On?
So far, so good. NON EMPTY is eliminating a bunch of #Missing rows here for us. But let’s say we really only care about seeing our highest performing Product / Market combinations – specifically those where [Profit] exceeds 5000. Let’s try putting a Filter statement into the query:
SELECT {[Profit]} ON AXIS(0), NON EMPTY CrossJoin({[Colas].Children}, Filter({[East].Children}, [Profit] > 5000)) ON AXIS(1) FROM [Sample].[Basic]; Axis-1 (Profit) +----------------------------------+--------------------- (Cola, New York) 11129 (Cola, Massachusetts) 5105 (Cola, Florida) 942 (Diet Cola, New York) 1114 (Diet Cola, Florida) 1114 (Caffeine Free Cola, New York) 413
Well, this is weird. Didn’t we just throw a Filter in there for rows where [Profit] is greater than 5000? And yet there are still four lines in there with [Profit] of less than 5000!
A Question of Perspective
So what’s really happening? The following query looking at the individual Markets but at the root of the Product dimension gives us a clue:
SELECT {[Profit]} ON AXIS(0), {[East].Children} ON AXIS(1) FROM [Sample].[Basic]; Axis-1 (Profit) +---------------+---------------------------------- (New York) 24161 (Massachusetts) 6712 (Florida) 5029 (Connecticut) 3093 (New Hampshire) 1125
It might not be a coincidence that the states where the total [Profit] (for all Products) exceeds 5000 are the ones for which we saw rows in the previous query, i.e. [New York], [Massachusetts], and [Florida]!
The problem is the “perspective” in play when that Filter clause is being evaluated. When we do this…
CrossJoin({[Colas].Children}, Filter({[East].Children}, [Profit] > 5000))
…we are not asking for the combinations of [Colas] Products and [East] Markets where the [Profit] exceeds for 5000. We are asking for the combinations of [Colas] Products with those [East] Markets where the [Profit] for all Products exceeds 5000. The Filter on [East].Children is not aware of its “context” as one side a CrossJoin with [Colas].Children.
What’s the Fix?
So how can we solve the problem? The answer is analogous to the solution in the older post I mentioned above regarding suppressing #Missing values across multiple dimensions in a CrossJoin: Wrap the Filter around the CrossJoin, rather than wrapping the CrossJoin around the Filter. Now the Filter is interpreted as applying to the tuples in the CrossJoin result, and not only to the tuples in one of the two CrossJoin parameters:
SELECT {[Profit]} ON AXIS(0), NON EMPTY Filter(CrossJoin({[Colas].Children}, {[East].Children}), [Profit] > 5000) ON AXIS(1) FROM [Sample].[Basic]; Axis-1 (Profit) +-----------------------+---------------------------------- (Cola, New York) 11129 (Cola, Massachusetts) 5105
Tada! The result set that we really wanted – only those combinations of Product and Market with [Profit] > 5000.