Short note on Line Item and High Cardinality dimensions

One of the quotes I like most I learned from Tom Kyte from Oracle.
It’s one from Artemus Ward:

It ain’t so much the things we don’t know that get us into trouble.
It’s the things you know that just ain’t so.”

This blog post is about one of those cases.

A customer complained about the performance of the index creation process in one of his process chains.

I had a look at the processes messages and found this:

[...]
28.03.2011 06:10:55 SQL: 28.03.2011 06:10:55 ALE_BWP_010
28.03.2011 06:10:55  CREATE BITMAP INDEX
28.03.2011 06:10:55 "SAPBWP"."/BIC/ECUBENAME~040" ON "/BIC/ECUBENAME"
28.03.2011 06:10:55 ( "KEY_SDIMENS1" ) PCTFREE 10 INITRANS
28.03.2011 06:10:55    20 MAXTRANS 255 STORAGE ( INITIAL 16 K NEXT
28.03.2011 06:10:55 81920 K MINEXTENTS 1 MAXEXTENTS UNLIMITED
28.03.2011 06:10:55 PCTINCREASE 0 FREELISTS 1 FREELIST GROUPS 1
28.03.2011 06:10:55 BUFFER_POOL DEFAULT ) LOCAL PARALLEL TABLESPACE
28.03.2011 06:10:55 "PSAPBWP" NOLOGGING COMPUTE STATISTIC
 --- > 6 hours !!!!????
28.03.2011 12:47:30 SQL-END: 28.03.2011 12:47:30 06:36:35
[...]

There was one index that actually took 6 hours and 36 minutes to create, although the table scan was done in parallel on Oracle level.

The other indexes on the same table only took some minutes (except one other that took something around 30 minutes).
So what could be the reason for that?

The next thing I looked up was the number of distinct values for the different dimension KEY columns (since these are the only ones getting indexes on fact tables):

OBJECT                      DIST        NUM_ROWS   (LF) BLK
Table: /BIC/ECUBENAME                   783908640   14192040
 
  KEY_SDIMENS1            445965129              <<<<<<
    KEY_SDIMENS2                 1529
    KEY_SDIMENS3                 3223
    KEY_SDIMENS4                  795
    KEY_SDIMENS5                  439
    KEY_SDIMENS6                28578
    KEY_SDIMENS7              1126484
    KEY_SDIMENS8                83675
    KEY_SDIMENS9                53296
    KEY_SDIMENSA              2465897
    KEY_SDIMENSB                  170
    KEY_SDIMENSC            194594206             <<<<<<
    KEY_SDIMENSD                    1
    KEY_SDIMENSP                    1
    KEY_SDIMENST                  631
    KEY_SDIMENSU                   43

And this was already the hint required to solve this mystery.

Bitmap indexes are extremely efficient during lookup and when used to combine selection criteria.
What they don’t do very good is to handle large numbers of distinct values (always compared to the total number of rows in the table of course).
In this case, the dimension for which the index creation took six hours (column KEY_SDIMENS1) had more than half as much distinct values than the total number of rows in the table.

If you’re used to data modelling in BW you know that it’s recommended not to have that many distinct values in the dimensions and if it cannot be avoided then the dimension should be marked as line item dimension.
This can be done in the Infocube modelling in transaction RSA1 by right-clicking on the dimension and opening the properties dialogue.

Flagging a dimension as line item dimension changes the data model by that it removes the intermediate SID table that is normally put between fact tables and the actual dimension tables (for flexibility reasons) and joins the fact table directly with the dimension table.

It changes this :

[DIMENSION-TAB1] >-----<  (SID-TAB1) >---< [[FACT TABLE]] >---<  (SID-TAB2) >-----< [DIMENSION-TAB2]

to this:

 [DIMENSION-TAB1] >-----< [[FACT TABLE]]  >---<  (SID-TAB2) >-----< [DIMENSION-TAB2]

Besides this change in the data model the index for the line item dimension should not be a BITMAP index anymore, but a B*TREE index instead. Of course, the SAP BW developers know about this and provided a way to avoid the creation of BITMAP indexes on such dimension key columns.

The fallacy the customer and I became victims of was to believe that flagging a dimension as line item automatically includes the setting for the B*Tree index. After all, this totally makes sense, doesn’t it?

Unfortunately, but correctly, the BW developers separated these two functions and provided two flags for infocube dimensions that can be set independently:

[ x ] Line Item
and
[ x ] High Cardinality

Only setting the High Cardinality flag leads to the creation of B*Tree indexes!
Fortunately, this can be done anytime, even when the infocube is already in use and filled with data.

This is of course documented (here) but as often, things mix up in the memory and we end up believing the wrong stuff.

Before I forget to mention it: as far as I know the High Cardinality flag really only does something on Oracle-based BW systems as the Bitmap Indexes are exclusively available on Oracle.
Maybe there are similar features on other platforms, but I’m not aware of them, so I leave them out for now (and avoid believing the wrong stuff once more…).

So, make sure to check your line item dimensions to avoid long indexing runs.

Best regards,

Lars

Leave a Reply