MaxDB ST05-trace fallacy – when sometimes the trace is wrong…

One of the most important analytical tools used to investigate slow running SQL statements is the well-known ST05 – SQL trace.

The idea of it is that the ABAP database interface notes down what SQL statement it sends to the database, how long it took to get the result, what selection criteria were used and so on.

Obviously, a key point here is that the trace contains exactly the SQL that was actually sent to the database.

Recently I came across a speciality that must be considered when using the ST05 trace results on MaxDB databases. Otherwise one will end up with totally wrong execution paths and thus at wrong conclusions for how to improve performance.

Let’s look at the following SQL Explain plan:

We see that the estimated costs are quite high, but this is not what I want to point out today here. Instead, keep an eye on the execution path – it’s a RANGE CONDITION FOR KEY that uses four key columns (MANDT, BZOBJ, KALNR, KALKA).

Next, also a common step in performance analysis, we take the statement, have the ST05 fill in the ?’s with the real variable values and use the ST05 or the SQL Studio to explain the statement again (and modify it eventually):

Now we still see a RANGE CONDITION FOR KEY, but only two columns are used, making the plan even less efficient.
Now – WHAT IS THE REASON FOR THIS?

The values had been taken directly from the ST05 trace. Let’s double check this:

Sure enough, the bind values are there.
Notable, however, is the fact that the values for KALNR and KADKY (now missing in the used KEY COLUMN list) are NUMBER type values.

The leading zeroes in A2/A3 and the date-like information in A5/A6 might give an indication that this might not be totally correct.
Let’s check the table column definition:

Surprisingly we find both columns to be CHARACTER types.
Well, not exactly surprisingly – many number-datatypes from the ABAP world are mapped to character columns.

For the application and the database interface this is rather transparent, but for the database query optimizer, this is an issue as it cannot use non-matching datatypes for KEY or INDEX accesses. If we want to run or explain the statement manually, we have to take care of this.

Therefore, we have to enclose the numbers into apostrophes to mark them as character strings:

And here we are, back with the original execution plan.

Now the real performance analysis can begin!

P.S. although the KADKY column is not used for the execution plan it’s still important to get the data type correctly – otherwise the optimizer has no chance to estimate the selectivity of the conditions correctly.

Leave a Reply