SAP Support case “FOR ALL ENTRIES disaster”

Last week a colleague from the WHM (data warehouse management) support team called me to get some assistance with a customer message he was processing.

The customer complained about short dumps with the infamous “TSV_TNEW_BLOCKS_NO_ROLL_MEMORY” error, which is basically the way an ABAP work process reports that it ran out of memory.

Checking the short dump

Inspecting the short dump a little further gave some hints on what happened:

[...]
|Error analysis
|    The internal table "\\PROGRAM=GP4MHJ3Z311GXR3HBZEIH3F768L\\DATA=IT_AZSDSOO0600"
|     could not be enlarged further.
|
|    Memory location: "Session memory"
|
|    You attempted to extend the data structure for the management of the
|    memory blocks for table "\\PROGRAM=GP4MHJ3Z311GXR3HBZEIH3F768L\\DATA=IT_AZSDSOO06
|    00". However, the 2696208 bytes required for
|    this were no longer available in the specified memory area.
|
|    The amount of memory requested is no longer available.
[...]

OK, apparently it had been tried to load a lot of data into an internal table (IT_AZSDSOO0600) but there hadn’t been enough free memory available for that. The error message mentions that approx. 2.5 MB had been requested, which is not really much, but not how much space was already allocated.

We also found information about what program caused this issue:

[...]
|Information on where terminated
|    Termination occurred in the ABAP program "GP4MHJ3Z311GXR3HBZEIH3F768L" - in
|     "EXPERT_ROUTINE".
|    The main program was "RSBATCH_EXECUTE_PROZESS ".
[...]

The BW-savvy folks already will have noticed that this is a generated ABAP program.
In SAP BW many activities that are specific to a certain InfoSource/Provider (e.g. loading data, transferring data…) are realized by generated ABAP programs.
These generated programs are usually based on a generic template and could often be extended with so-called expert routines to implement customer specific requirements.
As we see in the error message above, this had been done here as well.
The short dump occurred right in the expert routine of the report.

This is also proven by the stack backtrace resp. the list of active ABAP calls:

[...]
|Active Calls/Events
-------------------------------------------------------------------------------------------------
|No.   Ty.          Program                             Include                             Line
|      Name
-------------------------------------------------------------------------------------------------
|    7 METHOD       GP4MHJ3Z311GXR3HBZEIH3F768L         GP4MHJ3Z311GXR3HBZEIH3F768L           557
|      LCL_TRANSFORM=>EXPERT_ROUTINE
|    6 METHOD       GP4MHJ3Z311GXR3HBZEIH3F768L         GP4MHJ3Z311GXR3HBZEIH3F768L           504
|      LCL_TRANSFORM=>EXECUTE
[...]

The dump also nicely includes a snipped from the ABAP source where the error occurred (line 557):

[...]
|Source Code Extract
---------------------------------------------------------------------------
|Line |SourceCde
---------------------------------------------------------------------------
|  545|  METHOD expert_routine.
|  546|*=== Segments ===
|  547|
|  548|    FIELD-SYMBOLS:
|  549|      <SOURCE_FIELDS>    TYPE _ty_s_SC_1.
|  550|
|  551|    DATA:
|  552|      RESULT_FIELDS      TYPE _ty_s_TG_1.
|  553|    ... "insert your code here
|  554|
|  555|    if not SOURCE_PACKAGE is initial.
|  556|
|>>>>>|      select DOC_NUMBER DOC_CATEG REASON_REJ /BIC/ZCPP_NO  GROSS_WGT
|  558|        BASE_UOM NET_VALUE ITEM_CATEG UNIT_OF_WT /BIC/ZCASE_DAT
|  559|      from /BIC/AZSDSOO0600
|  560|              into corresponding fields of table it_AZSDSOO0600
|  561|        for all entries in SOURCE_PACKAGE
|  562|               where /BIC/ZCPP_NO = SOURCE_PACKAGE-/BIC/ZCPP_NO .
[...]

A hot lead

There we have it: A simple SELECT statement should have loaded data from the active
data table of a PSA/ODS into the internal table it_AZSDSOO0600.
If you check and compare this with the information a few paragraphs above, you’ll find that this is the internal table for which not enough memory was available.

So, we have a pretty hot lead here!

The next thing to check is, of course, to check how many rows do we expect to be returned from the source table /BIC/AZSDSOO0600?
The table itself isn’t that big: a quick look at the CBO statistics reveals that the table contains around 268.006 rows.

Could this already be too much?

By reproducing the issue we had been able to get some more details about the size of the internal table before the dump occurs:

[...]
IT_AZSDSOO0600
Table IT_24841[51189146x104]
[...]

WOW!

For some reason, we managed to select 51.189.146 rows from a table that only contains 268.006 rows.
How is that possible?
Besides that, there is also a WHERE condition with a restriction in place, so we would have expected less rows.
Checking the WHERE condition is in fact key to the solution of this problem.

The customer used the FOR ALL ENTRIES clause in this case which is a SQL feature specific to ABAP.
It allows you to fill a internal table, like SOURCE_PACKAGE in this case, with the values you want to select and hand this to the database without the need of making up a IN LIST yourself.
This is a nice treat to the ABAP developer and a good thing for the database, since the database interface layer plays a nice trick with this statement.

Instead of just creating a possibly super large IN LIST, the db interface automatically loops over the internal table and creates many SQL statements with small IN LISTS, like this:

SELECT
[...]
FROM
"/BIC/AZSDSOO0600"
WHERE
"/BIC/ZCPP_NO" IN ( :A0 , :A1 , :A2 , :A3 , :A4 )

During the execution of the ABAP SQL statement the database interface grabs chunks of 5 values from the internal table (SOURCE_PACKAGE) and executes the same statement over and over again, until all values had been used.
Since the same statement is reused very often, the parsing effort on database level is minimized and due to the short IN LIST the memory consumption for the parsing is also kept very small.

As the documentation for the FOR ALL ENTRIES clause explains, the database interface also takes care of making the lines of the total result unique.

So far so good – but still we don’t know why there was a multiplication of result rows in our case.

Maybe checking the SOURCE_PACKAGE list gives additional insight:

[...]
|SOURCE_PACKAGE
|    Table IT_1638[954x1260]
[...]

Obviously there had been quite a number of IN LIST items to be used here.
Divided by our 5 IN LIST items per statement, we can approximate that the SELECT needs to be executed 954/5 -> 190 times (+ 1 time with the remaining 4 items).
Let’s do a quick cross-check and divide our unexpected super large result set by this number:
51.189.146 / 190 => 269.416

Hmm… pretty close to our total number of rows in the table!

At this point it looks like the whole table was read into the internal table with every new chunk of five IN LIST items.
In fact this is exactly what happened here!

To be able to de-duplicate the result set before returning it to the ABAP program, the database interface first has to temporarily store all returned rows. Since there wasn’t enough memory to do this, the de-duplication wasn’t done yet, when the short dump occurred.

But why did we read the whole table over and over in the first place?
This was due to the contents of the SOURCE_PACKAGE table.
In there, the field used to create the IN LIST items (/BIC/ZCPP_NO) was far from unique.
For the vast majority of rows, the field was just the initial value (” “).
Therefore, the IN LISTs created effectively looked like this:

1st chunk (IN LIST values 1 - 5)
---------------------------------
"/BIC/ZCPP_NO" IN ( " "  ,  " "  ,  " "  ,  " "  ,  " "  )
2nd chunk (IN LIST values 6 - 10)
---------------------------------
"/BIC/ZCPP_NO" IN ( " "  ,  " "  ,  " "  ,  " "  ,  " "  )
3rd chunk (IN LIST values 11 -15)
---------------------------------
"/BIC/ZCPP_NO" IN ( " "  ,  " "  ,  " "  ,  " "  ,  " "  )

For each single chunk the repetition of a IN LIST entry would not change a bit of the result set. The duplicate conditions would just be ignored as soon as a row qualifies the first entry of the IN LIST (remember: a IN LIST is a special case of a OR and EQUAL condition).

Checking the source data in this case proved: most of the rows in the source table would be selected by this condition, so running one of the statements would produce a result worth of around 260.000 rows.

And there is our memory “eater”!

Due to the duplicate initial values in the field of the SOURCE_PACKAGE table, the database interface executed statements over and over again which effectively selected the whole table.
In addition to that it had to store all the single result sets into the internal table before removing the duplicates from it.

The solution to this problem is obvious and simple:
Before calling the SELECT … FOR ALL ENTRIES statement, the list of values to be fed into the FOR ALL ENTRIES statement should be made unique.

In this specific case, this would have made the list a lot shorter (down to ca. 20 entries).
Which would not only have prevented the short dump, but also reduced the number of SQL executions on the database from 191 to 5 plus it saved the time and memory to get rid of the unwanted duplicate rows.

As usual, once you know and understand the dependencies the effect is easy to understand – but at first sight I had been pretty surprised myself about a SELECT statement that multiplies the result set.

 

Leave a Reply