A first (BW) issue…

In the second week working in BW support, I noticed that there was a problem reported by customers again and again.
The customers complained about “database statistics update fails in process chains” or “brconnect run returns errors”.
When checking the relevant log file entries for the actions there was always something like

ERROR at line 1:
ORA-20000: index "SAPR3"."/BIC/F100069~900"  or partition of such index is in unusable state
ORA-06512: at "SYS.DBMS_STATS", line 15181
ORA-06512: at "SYS.DBMS_STATS", line 15203
ORA-06512: at line 1

The more database savvy customers or DBAs sometimes had a quick resolution at hand: to rebuild the index.

And, indeed, after rebuilding the affected index (as I realized it was nearly always a 900 index) the statistics could be correctly gathered.
Problem solved, or not?

Well, only until the next compression run for the infocube happened.

Infocube compression, wait, like infocube.zip or what?

The term ‘compression’ is a bit misleading here and ‘to condense’ seems more appropriate.
To understand this action, we’ve to take a look at how fact table data is managed in SAP BW.
If you’re yet unfamiliar with the BW-star schema (extended star schema) it might be a good idea to check the documentation or the BW related blogs and WIKIs here at SDN.

The fact tables are used to store the actual business data that should be reported on to. In the simplest case, this data is loaded from the source system (e.g. the ERP system) on a regular basis, say daily.
So, we get new data in more or less large chunks or packets.
The BW term for this is ‘request‘.
A data request is a specific chunk of data that only has one thing in common: it has been loaded into the fact table together. [more on this]

Requests and what they are good for

This request wise processing provides several options that are otherwise quite hard to achieve in data warehouse systems:

  • we can load just the data that had been changed since the last load (delta loading)
  • we can check the data after each load for technical and quality issues and decide whether or not it should occur in the reporting
  • in case some of the data was not imported correctly, we can easily delete it again, without impairing the reports that use the already available data.

If you think about it, this means, that it’s totally possible to load information about the same business objects (let’s say: direct sales to high-level customers in southern Germany) can be loaded several times.´

If you load the sales transaction data every day to, later on, report it on a weekly base then you’ll get the sum of all the sales aggregated over a week – and use data of 7 requests work of data (1 request per day, 7 days a week).

But as we see, for our reporting requirement (sales on weekly basis) it’s actually not necessary to keep all the data load packets (requests).
Once we are sure about that the data is technically and quality wise OK, then we might just as well sum up the data to the weekly level, store this and throw away the 7 requests.
This is what the compression of Infocubes pretty much is about.
In SAP BW this is implemented based on two tables:
the F-fact table to which all data load requests go and the E-fact table, which contains the information for the pre-aggregated (condensed) data. [more on this]

Getting closer…

On reporting time, the SAP BW OLAP engine knows about the fact that our data is stored in two tables.
So for every BW query against an Infocube, we usually see TWO nearly identical SQL statements, that only differ in the fact table that is actually used.

Now we have:

  • two tables,
  • we have data that needs to be read and aggregated request-wise from the F-fact table,
  • we have the aggregation result that has to be saved in the E-fact table
  • we’ve data that afterwards needs to be deleted from the F-fact table – otherwise some numbers would be included twice in our report!

Looking at this database requirements, there’s an Oracle feature available, that really is made for it (it really is ;-)): PARTITIONING!

Without partitioning the final deletion of already condensed requests would require to

  • scan at least one full index in order to find all rows matching the requests to be deleted,
  • remove them
  • and update the indexes afterwards.

With partitioning, all we’ve to do is to drop the partitions that contain our request.
That’s the reason why the F-fact tables are always partitioned based on the request dimension (on DBMSes that support partitioning, of course)

So, we can easily get rid of data, when the table is partitioned the right way.

But what about the indexes?

There’s a treat in Oracle available for that as well: local partitioning of indexes.
Simply put this means: for every table partition a corresponding partition is created for the partitioned index.

With this, we don’t even have to rebuild the indexes after dropping a table partition.
All we’ve to do is to also drop the corresponding index partition together with the table partition.
The remaining index will still be completely correct and will still cover all data in the table.

Ok, now we arrived at the start of the problem 😉

All this sounds quite good.
In fact, it’s great!
And (of course) here begins the problem.

This great combination of clever data design, implementation and database feature exploitation only works properly if the indexes really are partitioned exactly as the table is.
So BW has to take care of this since Oracle allows to create indexes with a different partitioning scheme or without partitioning as well.
If this is the case and we drop table partitions, then Oracle would have to read every row of the dropped partition to know every deleted row and take this information to maintain the index data.
Obviously, this would render the partition advantage null and void.
So, Oracle simply flags all indexes for which the same drop partition cannot be performed as UNUSABLE.

Such UNUSABLE indexes can be repaired simply be rebuilding them.
The Oracle cost-based optimizer is clever enough to ignore those indexes (also see Oracle parameter “skip_unusable_indexes“), so queries will not fail because of this.

Except… except we force Oracle to use the broken index by using a hint.

“Where the heck do we do such stuff?” – is that the question you’ve got in mind right now?

Well, we do it every time you run the update statistics job.
And collecting CBO statistics after deleting lots of data from a central fact table in BW is usually done as part of the data loading process (chain).

In Oracle, update statistics means calling a PL/SQL stored procedure in the DBMS_STATS package. And in there, Oracle will run SQL statements like this:

select /*+ no_parallel_index(t,"/BIC/F100069~900") dbms_stats cursor_sharing_exact use_weak_name_resl dynamic_sampling(0) no_monitoring no_expand index(t,"/BIC/F100069~900") */ count(distinct sys_op_lbid(15948,'R',t.rowid)) as nrw ,count(distinct sys_op_lbid(15948,'L',t.rowid)) as nlb ,count(distinct "KEY_IC_0025P") as ndk ,null as clf from "SAPR3"."/BIC/F100069" t where TBL$OR$IDX$PART$NUM("SAPR3"."/BIC/F100069",0,4,0,"ROWID") = :objn

No reason to understand the whole statement for now, but the important part here is the INDEX hint.
With this hint, Oracle is forced to use the index (for which statistics should be collected).
If our index is UNUSABLE at this point in time, then Oracle has no other choice as to report:

ERROR at line 1:
ORA-20000: index "SAPR3"."/BIC/F100069~900"  or partition of such index is in unusable state
ORA-06512: at "SYS.DBMS_STATS", line 15181
ORA-06512: at "SYS.DBMS_STATS", line 15203
ORA-06512: at line 1

LOOOONG story short.

Mashing all this together:

WHEN
        we've a partitioned table with a non-partitioned index
AND
       we drop a partition of the table, so that the non-partitioned index is flagged UNUSABLE
AND
       when we finally run an update statistics on this table,
THEN
       we end up with our ORA-20000 error message.

Why do I tell you this, when BW does take care of the database objects so well?

Because sometimes it fails to do so.
One reason for such a failure is a bug that has been introduced with an SAP Basis SP.
This bug is already fixed with SAP note 1479683 but there are many customers around who haven’t yet installed the note and who are just facing miraculous errors in their info cube compression process chains.

As you can see the connection between the ORA-20000 error message about the unusable index and the real cause is rather a long distance one, although straightforward once you understand it.

The solution (ahem… TATAAA!)

To finally get rid of this problem, I wrote note #1513510.

In there you’ll find that you have to
a) import the fix for the SAP Basis bug (note #1479683)
b) recreate (NOT just rebuild) the indexes

For step b) the easiest and quickest way is to use the same function modules that SAP BW does use for this task.

Excursion… Aggregates are just little infocubes…

In note #1513510 I included an SQL statement to find all partitioned tables for which there are non-partitioned indexes so that the DBA or BW-Poweruser can look up the problematic ones without having to wait for process chains to fail.
The statement looks like this:

select
    /*+
    no_parallel_index(t,"/BIC/F100069~900")
    dbms_stats
    cursor_sharing_exact
    use_weak_name_resl
    dynamic_sampling(0)
    no_monitoring
    no_expand
    index(t,"/BIC/F100069~900")
    */
 count(distinct sys_op_lbid(15948,'R',t.rowid)) as nrw
,count(distinct sys_op_lbid(15948,'L',t.rowid)) as nlb
,count(distinct "KEY_IC_0025P") as ndk
,null as clf
from
    "SAPR3"."/BIC/F100069" t
where
    TBL$OR$IDX$PART$NUM("SAPR3"."/BIC/F100069",0,4,0,"ROWID") = :objn

If you run this statement you may come across tables like /BIC/F100234.
But you don’t have any Infocube named “100234” – so what are those tables about?
They belong to aggregate infocubes. [more on this here]
These are (small) subsets of data that the OLAP processor can choose to deliver the reporting result much quicker. In this respect, aggregates are very much like database indexes.
Since the aggregates really are subsets of the actual large infocube they have an F- and E-fact table as well and the same problem can occur with them as well.

If you now want to know to which infocube a specific aggregate belongs to, you can easily look it up in the RSAGGRDIR table.
For every aggregate, you’ll find an entry within the table that maps the aggregate to the infocube.

Checking this table in SE16 delivers an output similar to this:

Table:          RSDDAGGRDIR

   AGGRUID                   OBJVERS INFOCUBE   AGGRCUBE ...
                                                         ...
   03T89W5IUUEFPARPLPZ29YZF3 A       0BWVC_AGV  100108   ...
   200H0IWIR23WASLOESNT8BKJZ A       0BWVC_AGV  100099   ...
   3VILFVABC6MYNF9R10M0WYVHR A       ICD05      100772   ...
   3VIVFYBTHUCH8HF58HQCRGJXS A       ICD03      100778   ...
   3VIVG4AW8R8BQ0JPRVJWKZJZK A       ICD03      100779   ...
   3VIVG8ZVTWHX3SFLC8ZEQ6RQO A       ICD03      100780   ...
   3VP09ETI53LHVKWQHLL79RK5X A       RSDRICUBE  100032   ...
   40VXFTXRAJ6NNT88CWOEA3LYN A       0BWVC09CP  100071   ...
   40VXFU60ZFEEYGCAFPLSI952N A       0BWVC09CP  100072   ...
   40VXK7M8IUTH0IH052QGOF94F A       0BWVC09CP  100073   ...
   [...]
  

Ok, I hope some of you really made it to the end of this rather lengthy first BW blog post. I’ll try to keep them shorter in the future 😉

regards,

Lars

Leave a Reply