‘Mind the gap…’ (#4 – reading query execution plans)

In my mini-series “Mind the gap…” I will try to shed some light on where the little or big differences between MaxDB and Oracle databases are and what to keep in mind when working with them.

Today I’d like to write a bit about the pitfalls you can tap into when trying to apply your query tuning knowledge from Oracle to your MaxDB instance.

Basically, both databases do employ a so-called cost-based optimizer to figure out what access strategy will deliver the results as fast as possible.

For those of you that don’t know what ‘optimizer’ or ‘execution plan’ means, I recommend to grab the documentation of either one of the DBMS and read about it there. Here I will focus on the differences between both systems.

The test schema

To make it easier to compare the output of the explain-statements for both DBMS I use the same test data on both databases.

Actually, it is just the demo schema ‘HOTEL’ that comes with every MaxDB.

Here’s how it looks like from a modelling point of view:


To make things a bit more equal for both databases, I created unique indexes on Oracle wherever there is a primary key on MaxDB.As you see it’s a pretty straightforward schema with no specialities.

Also, still to make it simpler, I will just focus on the CUSTOMER and CITY tables as this should be enough to make the points.

1. Query, single table – get used to the explain commands

select * from city

This first query is really just for getting in touch with the command available to display information about the execution plan.

While for Oracle there is one command ‘EXPLAIN PLAN FOR’ to generate the execution plan and another one ‘SELECT * FROM TABLE(DBMS_XPLAN.DISPLAY);’ to display it, MaxDB directly shows the execution plan as the result of ‘EXPLAIN’ or ‘EXPLAIN JOIN’.

And, yes, there are also other options to generate execution plans for both databases, but the ones mentioned above are the most commonly used ones.

One word about the output of the Oracle EXPLAIN PLAN: when used from within SAP CCMS (ST04/ST05) you’ll see SAPs own representation of the execution plan.

It contains less information but is more graphically oriented – anyhow, since newer releases of the SAP_BASIS also provide the Oracle output formatting I’ll stick with this for this blog.

So why does MaxDB needs two commands if it displays the execution plan automatically? Because the information of both outputs differs.

In fact, MaxDB’s explain facilities needed to be extended and fit to some old coding – so, for now, we need to live with two EXPLAINs to see all information (remark: a completely new implementation of EXPLAIN is on it’s way but likely won’t be available before MaxDB 7.8).

We’ll see what both commands do later on, so let’s get started with the simple query above.

Oracle
SQL> explain plan for
2  select * from city;
Explained.
SQL> select * from table(dbms_xplan.display);
PLAN_TABLE_OUTPUT
--------------------------------------------------------------------------
Plan hash value: 3973142328

————————————————————————–
| Id  | Operation         | Name | Rows  | Bytes | Cost (%CPU)| Time     |
————————————————————————–
|   0 | SELECT STATEMENT  |      |  1765 | 31770 |     5  (20)| 00:00:01 |
|   1 |  TABLE ACCESS FULL| CITY |  1765 | 31770 |     4   (0)| 00:00:01 |
————————————————————————–

8 rows selected.

MaxDB
SCHEMANAME  TABLENAME  COLUMN_OR_INDEX  STRATEGY                                PAGECOUNT
LARS        CITY                        TABLE SCAN                                      11
                                           RESULT IS NOT COPIED , COSTVALUE IS          11
                                        QUERYREWRITE - APPLIED RULES:           
                                           DistinctPullUp                                1

So, what do we see here?

Oracle

Oracle shows us the access strategy (a full table scan), the accessed table (CITY) and for each line, we see the expected number of rows, the bytes to be processed, a cost value and an estimation for the execution time.

An important thing to know here is that the costs basically add up to each level of execution. We’ll see below what ‘level of execution means here’. Anyhow, this level is shown by how far the entry in ‘OPERATION’ is indeed to the right.

Note that we don’t see how many blocks Oracle expects to touch – this information in encoded to the COST for each line. Finally, the ‘TIME’ column is derived from the COSTs and some system-wide performance statistics.

MaxDB

MaxDB, in turn, shows us the used table (CITY in schema LARS) and the chosen operation on the table (TABLE SCAN here as well). Instead of COSTS or TIME we see a PAGES column.

ATTENTION: this DOES NOT contain the estimated number of pages to be accessed at this line, but it shows the SIZE of the table/index accessed by the operation in the current line. Therefore the PAGES of a multi-step execution plan DON’T add up like they do in Oracle.
Instead, the MaxDB explain provides a separate line ‘COSTVALUE IS’. This and only this line gives us information about the estimated effort of the query.

The lines below ‘Queryrewrite…’ are just additional information – the numbers shown in the PAGES column here simply tell how often a specific rewrite rule had been applied to the query. So these aren’t costs either.

One word to the COSTVALUE line: this comes in two fashions:

1) ‘RESULT IS NOT COPIED, COSTVALUE IS’ – this means the result rows are delivered to the client on-the-fly. They aren’t copied to a temporary result buffer.

2) ‘RESULT IS COPIED, COSTVALUE IS’ – the result has bee copied to an internal temporary storage and will be delivered from there to the client when it fetches the data.

Ok, now that we know these basics, let’s move on to a bit more complicated statement:

2. Query, single table with selection criteria

Oracle
SQL> explain plan for

  2  select * from city where name='Dallas';

Explained.
SQL>  select * from table(dbms_xplan.display);

PLAN_TABLE_OUTPUT
------------------------------------------------------------------------------------------------
Plan hash value: 2402407757

-----------------------------------------------------------------------------------------------
| Id  | Operation                   | Name            | Rows  | Bytes | Cost (%CPU)| Time     |
-----------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT            |                 |    21 |   378 |     4   (0)| 00:00:01 |
|   1 |  TABLE ACCESS BY INDEX ROWID| CITY            |    21 |   378 |     4   (0)| 00:00:01 |
|*  2 |   INDEX RANGE SCAN          | I_CITY_NAME_ZIP |    21 |       |     0   (0)| 00:00:01 |
-----------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   2 - access("NAME"='Dallas')

14 rows selected.

MaxDB
explain
select * from city where name='Dallas'
SCHEMANAME  TABLENAME  COLUMN_OR_INDEX  STRATEGY                                PAGECOUNT
LARS        CITY       I_CITY_NAME      EQUAL CONDITION FOR INDEX                       11
                       NAME                  (USED INDEX COLUMN)                
                                           RESULT IS NOT COPIED , COSTVALUE IS           2
                                        QUERYREWRITE - APPLIED RULES:           
                                           DistinctPullUp                                1

Oracle

Here we see a bit better, how the ‘execution level’ works. Oracle first accesses the Index I_CITY_NAME_ZIP (you guessed it – it covers the NAME and ZIP columns). From the index, Oracle gets all ROWIDs that fit for the WHERE condition.

There we also note, that Oracle nicely tells us for each step in the execution plan, what filters and/or access conditions are used. ‘ACCESS’ means here, that Oracle can use the condition to locate the data – just like it does by accessing the index with a search condition. If Oracle has to browse through all the data to find the fitting rows it will print out “filter” in the predicate information.

MaxDB

The MaxDB output also gains information: the name of the index accessed plus all used columns of the index are printed below the table access. Again the PAGES number in the line for the index access does NOT tell us the expected costs but simply show how large the index is in total. There is one exception to this rule (when a KEYACCESS is done) but we leave that out for now.

Up to here, the differences between both explain tools are remarkable but don’t lead to a big difference in understanding a query’s execution strategy.

But up to here we only had a look at the most simple statements there are.

Let’s see how things turn out if we apply some pressure 🙂 and do joins.

3. Query, join

Oracle
SQL> explain plan for
  2   select c.title, c.firstname, c.name, c.zip, ci.name, c.address
  3   from customer c inner join city ci on c.zip=ci.zip;
Explained.
SQL> select * from table(dbms_xplan.display);
PLAN_TABLE_OUTPUT
-------------------------------------------------------------------------------
Plan hash value: 4175067910
-------------------------------------------------------------------------------
| Id  | Operation          | Name     | Rows  | Bytes | Cost (%CPU)| Time     |
-------------------------------------------------------------------------------
|   0 | SELECT STATEMENT   |          |  3731 |   229K|    14   (8)| 00:00:01 |
|*  1 |  HASH JOIN         |          |  3731 |   229K|    14   (8)| 00:00:01 |
|   2 |   TABLE ACCESS FULL| CITY     |  1765 | 28240 |     4   (0)| 00:00:01 |
|   3 |   TABLE ACCESS FULL| CUSTOMER |  3731 |   171K|     9   (0)| 00:00:01 |
-------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
   1 - access("C"."ZIP"="CI"."ZIP")
15 rows selected.
MaxDB
explain
 select c.title, c.firstname, c.name, c.zip, ci.name, c.address
 from customer c inner join city ci on c.zip=ci.zip

TABLENAME  COLUMN_OR_INDEX  STRATEGY                                PAGECOUNT
C                           TABLE SCAN                                      40
CI         ZIP              JOIN VIA KEY COLUMN                             11
TABLE HASHED
NO TEMPORARY RESULTS CREATED
RESULT IS COPIED   , COSTVALUE IS         290

-------------
explain join
 select c.title, c.firstname, c.name, c.zip, ci.name, c.address
 from customer c inner join city ci on c.zip=ci.zip

STRATEGY             SCHEMANAME  TABLENAME  MAX ADDNL. ROWS  MULTIPLIER  REVERSE MULTIPLIER  ADDITIONAL PAGES  ADDITIONAL ROWS  ACCUMULATED_COSTS
C          4480             1           1                   106.666666666667  4480             40
JOIN VIA SINGLE KEY              CI         2574             1           19.9332591768632    248.888888888889  4480             289.846383079439


Oracle

Now we really see how the ‘execution level’ works in Oracle.

The two tables are joined via the HASH JOIN. For this, the first table (the upper one in the explain plan output) is read completely into memory and HASHED.

Then the second (lower) table is read completely as well and the fitting rows (the ones that fulfil the join condition) are found by HASHING the rows against the existing HASH SET.

As you see, both table scans are on the same ‘level of execution’ since both scans deliver rows to the same operation (the HASH JOIN).

You also note that the number of rows does not add up here, but that the HASH JOIN line really contains the number of rows expected to come out of the join. As this demo schema constraints all ZIP values in the CUSTOMER table so that they have to be present in the CITY table and as ZIP values are unique (in fact this is the primary key of the CITY table) Oracle correctly figures that there is one fitting row from the CITY table for each row of the CUSTOMER table. Therefore the HASH JOIN will produce exactly the number of rows of the CUSTOMER table.

The COSTs, in turn, do add up here. So the Query costs are the cost for scanning table CUSTOMER (9) + cost for scanning table CITY (4) + an overhead for performing the actual join operation (1) = 14.

And again we see what predicates Oracle used here to find the fitting rows (yes, a HASH function IS an access method).

MaxDB

The MaxDB output for the same join is somewhat more complicated or less concentrated depending on how you like to view it.

The ‘explain’ statement just informs us about the order of the execution, specifically the JOIN ORDER.
Although all the other information provided by the Oracle ‘explain plan’ can be very useful, the join order is the most important information about join processing
Along with this important information, we get the usual information about the table and index sizes as well as the total cost estimation.

To get more information, like the ones we get from Oracle, we’ve to use the second ‘explain’ command  – ‘explain join’.

This finally gives us the estimation of rows and pages to be processed within each step.

Anyhow, what we can rely on here is mainly the ADDITIONAL ROWS column as the other columns are filled by unrevealed internal calculations that usually don’t match the intuition.

Nevertheless we see that, just like in the Oracle example, a HASH JOIN is chosen and that the number of rows in the final result set equals the number of rows in the CUSTOMER table (the difference to the number of rows in the Oracle example is simply because there ARE more customers in my MaxDB … ).

This time the result set has been copied – which also is incorporated into the cost calculation.

Ok, we’re nearly done. Let’s have a look at the most complicated case (for today ;-)):

4. Query, join with selection criteria

Oracle
SQL> explain plan for
  2   select c.title, c.firstname, c.name, c.zip, ci.name, c.address
  3   from customer c inner join city ci on c.zip=ci.zip
  4   where ci.name ='Dallas';

Explained.

SQL>  select * from table(dbms_xplan.display);

PLAN_TABLE_OUTPUT
————————————————————————————–
Plan hash value: 1286193604

————————————————————————————–
| Id  | Operation          | Name            | Rows  | Bytes | Cost (%CPU)| Time     |
————————————————————————————–
|   0 | SELECT STATEMENT   |                 |    48 |  3024 |    10  (10)| 00:00:01 |
|*  1 |  HASH JOIN         |                 |    48 |  3024 |    10  (10)| 00:00:01 |
|*  2 |   INDEX RANGE SCAN | I_CITY_NAME_ZIP |    21 |   336 |     0   (0)| 00:00:01 |
|   3 |   TABLE ACCESS FULL| CUSTOMER        |  3731 |   171K|     9   (0)| 00:00:01 |
————————————————————————————–

Predicate Information (identified by operation id):
—————————————————

1 – access(“C”.”ZIP”=”CI”.”ZIP”)
2 – access(“CI”.”NAME”=’Dallas’)

16 rows selected.

MaxDB
explain
 select c.title, c.firstname, c.name, c.zip, ci.name, c.address
 from customer c inner join city ci on c.zip=ci.zip
 where ci.name ='Dallas'
TABLENAME  COLUMN_OR_INDEX  STRATEGY                                PAGECOUNT
CI         I_CITY_NAME      EQUAL CONDITION FOR INDEX                       11
                            ONLY INDEX ACCESSED                     
           NAME                  (USED INDEX COLUMN)                
C          I_CUSTOMER_ZIP   JOIN VIA INDEXED COLUMN                         40
           ZIP                   (USED INDEX COLUMN)                
                                 NO TEMPORARY RESULTS CREATED       
                                 RESULT IS COPIED   , COSTVALUE IS           5
------------- 
explain join
 select c.title, c.firstname, c.name, c.zip, ci.name, c.address
 from customer c inner join city ci on c.zip=ci.zip
 where ci.name ='Dallas'

STRATEGY                      SCHEMANAME  TABLENAME  MAX ADDNL. ROWS   MULTIPLIER  REVERSE MULTIPLIER  ADDITIONAL PAGES  ADDITIONAL ROWS   ACCUMULATED_COSTS
CI         25.9194422927963  1           1                   1                 25.9194422927963  1
JOIN VIA MORE THAN ONE FIELD              C          4480              2           1                   2.87993803253292  51.8388845855926  4.47963144734404

Oracle

Ok, the ‘explain plan’ output is not much more complicated now. In fact, even with the new WHERE condition, Oracle has decided for a HASH scan.
Nevertheless, you notice that now table CITY is not touched anymore. Instead, the index I_CITY_NAME_ZIP is read (see the access predicate 2) and since the index contains the ZIP data as well, it is used to build up the HASH SET for the join.

The other join part is now processed like before – a simple full table scan that is hashed to create the result set.

If you wonder, why Oracle decides for HASH JOIN here (although there is an index on ZIP on the CUSTOMER table): it’s because the estimated COSTS are lower.

Remember that this is a rather small table and Oracle uses a multiblock I/O for table scans. So, in this case, it’s faster to simply load all data and do the HASHING than look up ZIP number by ZIP number in the index on CUSTOMER and visit the table for each fitting record.

But attention: just these cost differences can lead to severe performance issues when Oracle gets them wrong. Especially in OLTP system we usually don’t want to see HASH JOINs…

MaxDB

MaxDB decides for a different plan this time: first, it also performs an INDEX ONLY access on the I_CITY_NAME index.

Due to the way how indexes work in MaxDB, every index always contains the full primary key – so this index does also contain the ZIP values.

With this information, the join can already be done. This time via the I_CUSTOMER_ZIP index on the CUSTOMER table.

It’s not written out here, but this join is a nested loop join.

Again we see: the numbers in the PAGES column DON’T add up to the COSTS line.

Instead, we notice that the COSTS are the rounded ‘ACCUMULATED COSTS’ of the ‘explain join’ command.

Cheat Sheet

------------------- 8< ------------ cut here ------------ 8< ------------
Oracle MaxDB
ONE explain statement, result needs to be displayed by:
select * from table(dbms_xplan.display);
TWO explain statements, result is visible immediately.
Dense information,CBO arithmetic can be followed. Information focus on the resulting plan – optimizer arithmetics can barely be followed.
“Costs” add up in the execution plan. “Costs” appear in the “Costs” line ONLY.
All other values in the PAGES column are the statistics of the accessed tables/indexes.
notes to know:

618868 – FAQ: Oracle performance
766349 – FAQ: Oracle SQL optimization
772497 – FAQ: Oracle Hints
176754 – Problems with CBO and RBO

notes to know:

819324 – FAQ: MaxDB SQL optimization
832544 –  FAQ: MaxDB Hints

Documentation to read:

Oracle Documentation – Overview of the Optimizer

Oracle Scratchpad
(Oracle Optimizer and Performance related blog by Jonathan Lewis)

Documentation to read:

MaxDB Wiki

MaxDB Documentation – SQL Optimizer

---------------------- 8< ------------ cut here ------------ 8< --------------------
Former editions of ‘Mind the gap…’:

‘Mind the gap…’ (#3 – Backup/Restore – Part 2)

‘Mind the gap…’ (#3 – Backup/Restore – Part 1)

‘Mind the gap…’ (#2 – Data storage)

‘Mind the gap …’ (1# – Logon mechanisms)

Leave a Reply

This site uses Akismet to reduce spam. Learn how your comment data is processed.