When using bitmap compression, squeezing a given set of data into a denser arrangement doesn’t always result in a proportionate space saving.
The Database Administrator’s Guide contains a section on estimating disk space requirements. Based on the information there, we know that for each block (again, assuming bitmap compression) the space required is made up of three components (see the sketch after this list):
- 72 bytes as a fixed-length header (the ‘block header’ of the dreaded ‘invalid block header’ error)
- A bitmap containing one bit per dense cell in the block
- 8 bytes per non-#Missing value in the block
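Putting those three components together gives a simple formula. Here is a minimal Python sketch of it (the block_bytes name and the example block size are mine, not anything Essbase provides):

```python
def block_bytes(cells, density):
    """Approximate compressed size of one block, per the formula above.

    cells   -- number of dense cells in the block
    density -- fraction of cells holding a non-#Missing value (0.0 to 1.0)
    """
    header = 72                   # fixed-length block header
    bitmap = cells / 8            # one bit per dense cell
    data = 8 * cells * density    # 8 bytes per non-#Missing value
    return header + bitmap + data

# For example, a 10,000-cell block at 0.1% density:
print(block_bytes(10_000, 0.001))  # 1402.0 bytes
```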
It’s important to note that the space consumed by the non-#Missing data itself is always 8 bytes per value: for a given data set, the space required for data does not vary, regardless of block size or density. The only things that change with the dense/sparse configuration are the space required for block headers (driven by the changing number of blocks) and the space required for bitmaps (driven by the changing block size and number of blocks). In other words, the “overhead”.
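To illustrate, here is a toy calculation (the two dense/sparse configurations are hypothetical ones I’ve made up; the constants come from the list above) showing the same million values costing the same 8MB of data under wildly different overhead:

```python
HEADER, VALUE_BYTES = 72, 8        # bytes per block header / per value

values = 1_000_000                 # non-#Missing values in the data set
data = VALUE_BYTES * values        # always 8,000,000 bytes, any config

# (cells per block, block density) for two hypothetical configurations
for cells, density in [(100_000, 0.0001), (1_000, 0.01)]:
    blocks = values / (cells * density)        # 100,000 blocks either way
    overhead = blocks * (HEADER + cells / 8)   # headers + bitmaps
    print(f"{cells:>7,} cells/block: data {data / 1e6:.0f}MB, "
          f"overhead {overhead / 1e6:.1f}MB")

# 100,000 cells/block: data 8MB, overhead 1257.2MB
#   1,000 cells/block: data 8MB, overhead 19.7MB
```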
Running the numbers for most reasonable block sizes (say 10k up), we can calculate the following rules of thumb. I am not a fan of Essbase rules of thumb when they substitute for an understanding of the underlying mechanism, but hopefully in this case I’m providing both. Notice that in the examples below, the relative change in block density is the same in both cases (ten times), but the relative change in space required is very different:
1. Increasing block density has a large effect on space required when starting from a low density
2. Increasing block density has a small effect on space required when starting from a high density
With an initial block density of 0.1%, only ~5% of space is consumed by data, with ~95% going to overhead. Increasing block density ten times to 1% results in overhead falling to ~60%, so a data set that requires ~1GB shrinks to roughly 150MB. By contrast, starting from a 10% density, the same ten-fold increase (to 100%) barely moves the total: data already dominates each block, so there is very little overhead left to eliminate.
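Here is a sketch that reruns both scenarios end to end (the 10,000-cell block size is an assumption of mine; the formula itself doesn’t fix one):

```python
def block_bytes(cells, density):
    # header + bitmap + data, per the formula given earlier
    return 72 + cells / 8 + 8 * cells * density

CELLS = 10_000   # assumed block size, in cells

for before, after in [(0.001, 0.01), (0.1, 1.0)]:
    per_value_before = block_bytes(CELLS, before) / (CELLS * before)
    per_value_after = block_bytes(CELLS, after) / (CELLS * after)
    print(f"{before:.1%} -> {after:.1%} density: "
          f"a 1GB data set becomes ~{per_value_after / per_value_before:.2f}GB")

# 0.1% -> 1.0% density: a 1GB data set becomes ~0.15GB
# 10.0% -> 100.0% density: a 1GB data set becomes ~0.87GB
```

The second line is rule 2 in action: the same ten-fold density increase, but only a ~13% saving, because there was so little overhead to remove in the first place.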