What a big impact a little money can have…

What I dislike in donating

Really, I’m not too big into making donations.

It’s not that I’m greedy and what to keep everything for me.

It’s also not that I’m parsimonious like Mr Scrooge. Actually, I do want to see some of my money put to use for people that need it more than I do.

I sometimes have the feeling that the donations drain somewhere in the administrations of the big charity organizations.
Especially I find it hard to see how my efforts – as small as they are – will do something against large problems like hunger or AIDS or natural disasters.

So, in general, making donations usually leaves me with the feeling of having given too little to be useful or to make a difference. That’s why I don’t like donating.

Anyhow last week a very good friend of mine send me an email. As she knows I’m no big fan of “send-me-money-spam” I was pretty sure that this wasn’t the usual begging crap you get especially before the holiday season starts.

An alternative in sight!

The email was about lending money to people that don’t want it as a present but that want to do business.
They don’t want a gift, but they want to finance their short-term business requirements.

The email was talking about micro-loans.

Micro-loans are very small volume loans usually starting with a few hundred bucks up to a few thousand. Banks often don’t see much business in micro-loans – although they can make all the difference for small businesses.

Micro-loans do help people that are working hard and who are in business.
Anyhow, they might most often not get loans from commercial banks because they often don’t have many securities.
For Banks this is the classic no-go; they don’t even put a few hundred dollars at risk here.

So, actually, micro-loans enable the small-scale entrepreneurs to stay in business and to increase the welfare and the GDP of their country.
They help them to stay in a position to take care of the lives of their families themselves – moving them away from the need for any outside gift giving.

Fortunately, there are organizations around that make it possible for everyone to become a lender for those people. One of them is called *Kiva* and that was the one mentioned in the email I got.

Kiva - loans that change lives

The people at Kiva provide a platform for giving loans.

What is it about?

They’ve everything there: profiles of people (that’s right, it’s about people, not companies!) that need a loan, you see who they are and what they do with your money.

They allow money transfer with the usual credit cards or PayPal – so it’s pretty secure.

Let’s be clear about this: you don’t get any interest for your money, you just lend it.

But to me, this really feels a whole lot better to me than simply give away my money to anonymous charity organizations where I never know what they do with the money and if it will ever reach somebody in need of it.

The loans are limited in time and you get updates on the what they do with your money while it runs. And yes, there IS a risk of default loans and the people of Kiva don’t hide it.

In fact, you can browse through the list of loans and see which ones are still running, which ones have ended and the defaulted ones as well.

Anyhow, you can start giving micro-loans starting from $25 – so where’s the risk?

What’s in there for Kiva?

How do they get the money they need to provide all that? Good question!

In fact, they do need donations which can be made with every loan you give.
But you don’t have to. You can decide independently of your loan whether or not and how much you are willing to donate to this platform.

Ok, as I said, usually I’m really not into the charity thing, but I really liked *this* idea, so I decided to let you know about this.

‘Mind the gap…’ (#4 – reading query execution plans)

In my mini-series “Mind the gap…” I will try to shed some light on where the little or big differences between MaxDB and Oracle databases are and what to keep in mind when working with them.

Today I’d like to write a bit about the pitfalls you can tap into when trying to apply your query tuning knowledge from Oracle to your MaxDB instance.

Basically, both databases do employ a so-called cost-based optimizer to figure out what access strategy will deliver the results as fast as possible.

For those of you that don’t know what ‘optimizer’ or ‘execution plan’ means, I recommend to grab the documentation of either one of the DBMS and read about it there. Here I will focus on the differences between both systems.

The test schema

To make it easier to compare the output of the explain-statements for both DBMS I use the same test data on both databases.

Actually, it is just the demo schema ‘HOTEL’ that comes with every MaxDB.

Here’s how it looks like from a modelling point of view:


To make things a bit more equal for both databases, I created unique indexes on Oracle wherever there is a primary key on MaxDB.As you see it’s a pretty straightforward schema with no specialities.

Also, still to make it simpler, I will just focus on the CUSTOMER and CITY tables as this should be enough to make the points.

1. Query, single table – get used to the explain commands

select * from city

This first query is really just for getting in touch with the command available to display information about the execution plan.

While for Oracle there is one command ‘EXPLAIN PLAN FOR’ to generate the execution plan and another one ‘SELECT * FROM TABLE(DBMS_XPLAN.DISPLAY);’ to display it, MaxDB directly shows the execution plan as the result of ‘EXPLAIN’ or ‘EXPLAIN JOIN’.

And, yes, there are also other options to generate execution plans for both databases, but the ones mentioned above are the most commonly used ones.

One word about the output of the Oracle EXPLAIN PLAN: when used from within SAP CCMS (ST04/ST05) you’ll see SAPs own representation of the execution plan.

It contains less information but is more graphically oriented – anyhow, since newer releases of the SAP_BASIS also provide the Oracle output formatting I’ll stick with this for this blog.

So why does MaxDB needs two commands if it displays the execution plan automatically? Because the information of both outputs differs.

In fact, MaxDB’s explain facilities needed to be extended and fit to some old coding – so, for now, we need to live with two EXPLAINs to see all information (remark: a completely new implementation of EXPLAIN is on it’s way but likely won’t be available before MaxDB 7.8).

We’ll see what both commands do later on, so let’s get started with the simple query above.

Oracle
SQL> explain plan for
2  select * from city;
Explained.
SQL> select * from table(dbms_xplan.display);
PLAN_TABLE_OUTPUT
--------------------------------------------------------------------------
Plan hash value: 3973142328

————————————————————————–
| Id  | Operation         | Name | Rows  | Bytes | Cost (%CPU)| Time     |
————————————————————————–
|   0 | SELECT STATEMENT  |      |  1765 | 31770 |     5  (20)| 00:00:01 |
|   1 |  TABLE ACCESS FULL| CITY |  1765 | 31770 |     4   (0)| 00:00:01 |
————————————————————————–

8 rows selected.

MaxDB
SCHEMANAME  TABLENAME  COLUMN_OR_INDEX  STRATEGY                                PAGECOUNT
LARS        CITY                        TABLE SCAN                                      11
                                           RESULT IS NOT COPIED , COSTVALUE IS          11
                                        QUERYREWRITE - APPLIED RULES:           
                                           DistinctPullUp                                1

So, what do we see here?

Oracle

Oracle shows us the access strategy (a full table scan), the accessed table (CITY) and for each line, we see the expected number of rows, the bytes to be processed, a cost value and an estimation for the execution time.

An important thing to know here is that the costs basically add up to each level of execution. We’ll see below what ‘level of execution means here’. Anyhow, this level is shown by how far the entry in ‘OPERATION’ is indeed to the right.

Note that we don’t see how many blocks Oracle expects to touch – this information in encoded to the COST for each line. Finally, the ‘TIME’ column is derived from the COSTs and some system-wide performance statistics.

MaxDB

MaxDB, in turn, shows us the used table (CITY in schema LARS) and the chosen operation on the table (TABLE SCAN here as well). Instead of COSTS or TIME we see a PAGES column.

ATTENTION: this DOES NOT contain the estimated number of pages to be accessed at this line, but it shows the SIZE of the table/index accessed by the operation in the current line. Therefore the PAGES of a multi-step execution plan DON’T add up like they do in Oracle.
Instead, the MaxDB explain provides a separate line ‘COSTVALUE IS’. This and only this line gives us information about the estimated effort of the query.

The lines below ‘Queryrewrite…’ are just additional information – the numbers shown in the PAGES column here simply tell how often a specific rewrite rule had been applied to the query. So these aren’t costs either.

One word to the COSTVALUE line: this comes in two fashions:

1) ‘RESULT IS NOT COPIED, COSTVALUE IS’ – this means the result rows are delivered to the client on-the-fly. They aren’t copied to a temporary result buffer.

2) ‘RESULT IS COPIED, COSTVALUE IS’ – the result has bee copied to an internal temporary storage and will be delivered from there to the client when it fetches the data.

Ok, now that we know these basics, let’s move on to a bit more complicated statement:

2. Query, single table with selection criteria

Oracle
SQL> explain plan for

  2  select * from city where name='Dallas';

Explained.
SQL>  select * from table(dbms_xplan.display);

PLAN_TABLE_OUTPUT
------------------------------------------------------------------------------------------------
Plan hash value: 2402407757

-----------------------------------------------------------------------------------------------
| Id  | Operation                   | Name            | Rows  | Bytes | Cost (%CPU)| Time     |
-----------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT            |                 |    21 |   378 |     4   (0)| 00:00:01 |
|   1 |  TABLE ACCESS BY INDEX ROWID| CITY            |    21 |   378 |     4   (0)| 00:00:01 |
|*  2 |   INDEX RANGE SCAN          | I_CITY_NAME_ZIP |    21 |       |     0   (0)| 00:00:01 |
-----------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   2 - access("NAME"='Dallas')

14 rows selected.

MaxDB
explain
select * from city where name='Dallas'
SCHEMANAME  TABLENAME  COLUMN_OR_INDEX  STRATEGY                                PAGECOUNT
LARS        CITY       I_CITY_NAME      EQUAL CONDITION FOR INDEX                       11
                       NAME                  (USED INDEX COLUMN)                
                                           RESULT IS NOT COPIED , COSTVALUE IS           2
                                        QUERYREWRITE - APPLIED RULES:           
                                           DistinctPullUp                                1

Oracle

Here we see a bit better, how the ‘execution level’ works. Oracle first accesses the Index I_CITY_NAME_ZIP (you guessed it – it covers the NAME and ZIP columns). From the index, Oracle gets all ROWIDs that fit for the WHERE condition.

There we also note, that Oracle nicely tells us for each step in the execution plan, what filters and/or access conditions are used. ‘ACCESS’ means here, that Oracle can use the condition to locate the data – just like it does by accessing the index with a search condition. If Oracle has to browse through all the data to find the fitting rows it will print out “filter” in the predicate information.

MaxDB

The MaxDB output also gains information: the name of the index accessed plus all used columns of the index are printed below the table access. Again the PAGES number in the line for the index access does NOT tell us the expected costs but simply show how large the index is in total. There is one exception to this rule (when a KEYACCESS is done) but we leave that out for now.

Up to here, the differences between both explain tools are remarkable but don’t lead to a big difference in understanding a query’s execution strategy.

But up to here we only had a look at the most simple statements there are.

Let’s see how things turn out if we apply some pressure 🙂 and do joins.

3. Query, join

Oracle
SQL> explain plan for
  2   select c.title, c.firstname, c.name, c.zip, ci.name, c.address
  3   from customer c inner join city ci on c.zip=ci.zip;
Explained.
SQL> select * from table(dbms_xplan.display);
PLAN_TABLE_OUTPUT
-------------------------------------------------------------------------------
Plan hash value: 4175067910
-------------------------------------------------------------------------------
| Id  | Operation          | Name     | Rows  | Bytes | Cost (%CPU)| Time     |
-------------------------------------------------------------------------------
|   0 | SELECT STATEMENT   |          |  3731 |   229K|    14   (8)| 00:00:01 |
|*  1 |  HASH JOIN         |          |  3731 |   229K|    14   (8)| 00:00:01 |
|   2 |   TABLE ACCESS FULL| CITY     |  1765 | 28240 |     4   (0)| 00:00:01 |
|   3 |   TABLE ACCESS FULL| CUSTOMER |  3731 |   171K|     9   (0)| 00:00:01 |
-------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
   1 - access("C"."ZIP"="CI"."ZIP")
15 rows selected.
MaxDB
explain
 select c.title, c.firstname, c.name, c.zip, ci.name, c.address
 from customer c inner join city ci on c.zip=ci.zip

TABLENAME  COLUMN_OR_INDEX  STRATEGY                                PAGECOUNT
C                           TABLE SCAN                                      40
CI         ZIP              JOIN VIA KEY COLUMN                             11
TABLE HASHED
NO TEMPORARY RESULTS CREATED
RESULT IS COPIED   , COSTVALUE IS         290

-------------
explain join
 select c.title, c.firstname, c.name, c.zip, ci.name, c.address
 from customer c inner join city ci on c.zip=ci.zip

STRATEGY             SCHEMANAME  TABLENAME  MAX ADDNL. ROWS  MULTIPLIER  REVERSE MULTIPLIER  ADDITIONAL PAGES  ADDITIONAL ROWS  ACCUMULATED_COSTS
C          4480             1           1                   106.666666666667  4480             40
JOIN VIA SINGLE KEY              CI         2574             1           19.9332591768632    248.888888888889  4480             289.846383079439


Oracle

Now we really see how the ‘execution level’ works in Oracle.

The two tables are joined via the HASH JOIN. For this, the first table (the upper one in the explain plan output) is read completely into memory and HASHED.

Then the second (lower) table is read completely as well and the fitting rows (the ones that fulfil the join condition) are found by HASHING the rows against the existing HASH SET.

As you see, both table scans are on the same ‘level of execution’ since both scans deliver rows to the same operation (the HASH JOIN).

You also note that the number of rows does not add up here, but that the HASH JOIN line really contains the number of rows expected to come out of the join. As this demo schema constraints all ZIP values in the CUSTOMER table so that they have to be present in the CITY table and as ZIP values are unique (in fact this is the primary key of the CITY table) Oracle correctly figures that there is one fitting row from the CITY table for each row of the CUSTOMER table. Therefore the HASH JOIN will produce exactly the number of rows of the CUSTOMER table.

The COSTs, in turn, do add up here. So the Query costs are the cost for scanning table CUSTOMER (9) + cost for scanning table CITY (4) + an overhead for performing the actual join operation (1) = 14.

And again we see what predicates Oracle used here to find the fitting rows (yes, a HASH function IS an access method).

MaxDB

The MaxDB output for the same join is somewhat more complicated or less concentrated depending on how you like to view it.

The ‘explain’ statement just informs us about the order of the execution, specifically the JOIN ORDER.
Although all the other information provided by the Oracle ‘explain plan’ can be very useful, the join order is the most important information about join processing
Along with this important information, we get the usual information about the table and index sizes as well as the total cost estimation.

To get more information, like the ones we get from Oracle, we’ve to use the second ‘explain’ command  – ‘explain join’.

This finally gives us the estimation of rows and pages to be processed within each step.

Anyhow, what we can rely on here is mainly the ADDITIONAL ROWS column as the other columns are filled by unrevealed internal calculations that usually don’t match the intuition.

Nevertheless we see that, just like in the Oracle example, a HASH JOIN is chosen and that the number of rows in the final result set equals the number of rows in the CUSTOMER table (the difference to the number of rows in the Oracle example is simply because there ARE more customers in my MaxDB … ).

This time the result set has been copied – which also is incorporated into the cost calculation.

Ok, we’re nearly done. Let’s have a look at the most complicated case (for today ;-)):

4. Query, join with selection criteria

Oracle
SQL> explain plan for
  2   select c.title, c.firstname, c.name, c.zip, ci.name, c.address
  3   from customer c inner join city ci on c.zip=ci.zip
  4   where ci.name ='Dallas';

Explained.

SQL>  select * from table(dbms_xplan.display);

PLAN_TABLE_OUTPUT
————————————————————————————–
Plan hash value: 1286193604

————————————————————————————–
| Id  | Operation          | Name            | Rows  | Bytes | Cost (%CPU)| Time     |
————————————————————————————–
|   0 | SELECT STATEMENT   |                 |    48 |  3024 |    10  (10)| 00:00:01 |
|*  1 |  HASH JOIN         |                 |    48 |  3024 |    10  (10)| 00:00:01 |
|*  2 |   INDEX RANGE SCAN | I_CITY_NAME_ZIP |    21 |   336 |     0   (0)| 00:00:01 |
|   3 |   TABLE ACCESS FULL| CUSTOMER        |  3731 |   171K|     9   (0)| 00:00:01 |
————————————————————————————–

Predicate Information (identified by operation id):
—————————————————

1 – access(“C”.”ZIP”=”CI”.”ZIP”)
2 – access(“CI”.”NAME”=’Dallas’)

16 rows selected.

MaxDB
explain
 select c.title, c.firstname, c.name, c.zip, ci.name, c.address
 from customer c inner join city ci on c.zip=ci.zip
 where ci.name ='Dallas'
TABLENAME  COLUMN_OR_INDEX  STRATEGY                                PAGECOUNT
CI         I_CITY_NAME      EQUAL CONDITION FOR INDEX                       11
                            ONLY INDEX ACCESSED                     
           NAME                  (USED INDEX COLUMN)                
C          I_CUSTOMER_ZIP   JOIN VIA INDEXED COLUMN                         40
           ZIP                   (USED INDEX COLUMN)                
                                 NO TEMPORARY RESULTS CREATED       
                                 RESULT IS COPIED   , COSTVALUE IS           5
------------- 
explain join
 select c.title, c.firstname, c.name, c.zip, ci.name, c.address
 from customer c inner join city ci on c.zip=ci.zip
 where ci.name ='Dallas'

STRATEGY                      SCHEMANAME  TABLENAME  MAX ADDNL. ROWS   MULTIPLIER  REVERSE MULTIPLIER  ADDITIONAL PAGES  ADDITIONAL ROWS   ACCUMULATED_COSTS
CI         25.9194422927963  1           1                   1                 25.9194422927963  1
JOIN VIA MORE THAN ONE FIELD              C          4480              2           1                   2.87993803253292  51.8388845855926  4.47963144734404

Oracle

Ok, the ‘explain plan’ output is not much more complicated now. In fact, even with the new WHERE condition, Oracle has decided for a HASH scan.
Nevertheless, you notice that now table CITY is not touched anymore. Instead, the index I_CITY_NAME_ZIP is read (see the access predicate 2) and since the index contains the ZIP data as well, it is used to build up the HASH SET for the join.

The other join part is now processed like before – a simple full table scan that is hashed to create the result set.

If you wonder, why Oracle decides for HASH JOIN here (although there is an index on ZIP on the CUSTOMER table): it’s because the estimated COSTS are lower.

Remember that this is a rather small table and Oracle uses a multiblock I/O for table scans. So, in this case, it’s faster to simply load all data and do the HASHING than look up ZIP number by ZIP number in the index on CUSTOMER and visit the table for each fitting record.

But attention: just these cost differences can lead to severe performance issues when Oracle gets them wrong. Especially in OLTP system we usually don’t want to see HASH JOINs…

MaxDB

MaxDB decides for a different plan this time: first, it also performs an INDEX ONLY access on the I_CITY_NAME index.

Due to the way how indexes work in MaxDB, every index always contains the full primary key – so this index does also contain the ZIP values.

With this information, the join can already be done. This time via the I_CUSTOMER_ZIP index on the CUSTOMER table.

It’s not written out here, but this join is a nested loop join.

Again we see: the numbers in the PAGES column DON’T add up to the COSTS line.

Instead, we notice that the COSTS are the rounded ‘ACCUMULATED COSTS’ of the ‘explain join’ command.

Cheat Sheet

------------------- 8< ------------ cut here ------------ 8< ------------
Oracle MaxDB
ONE explain statement, result needs to be displayed by:
select * from table(dbms_xplan.display);
TWO explain statements, result is visible immediately.
Dense information,CBO arithmetic can be followed. Information focus on the resulting plan – optimizer arithmetics can barely be followed.
“Costs” add up in the execution plan. “Costs” appear in the “Costs” line ONLY.
All other values in the PAGES column are the statistics of the accessed tables/indexes.
notes to know:

618868 – FAQ: Oracle performance
766349 – FAQ: Oracle SQL optimization
772497 – FAQ: Oracle Hints
176754 – Problems with CBO and RBO

notes to know:

819324 – FAQ: MaxDB SQL optimization
832544 –  FAQ: MaxDB Hints

Documentation to read:

Oracle Documentation – Overview of the Optimizer

Oracle Scratchpad
(Oracle Optimizer and Performance related blog by Jonathan Lewis)

Documentation to read:

MaxDB Wiki

MaxDB Documentation – SQL Optimizer

---------------------- 8< ------------ cut here ------------ 8< --------------------
Former editions of ‘Mind the gap…’:

‘Mind the gap…’ (#3 – Backup/Restore – Part 2)

‘Mind the gap…’ (#3 – Backup/Restore – Part 1)

‘Mind the gap…’ (#2 – Data storage)

‘Mind the gap …’ (1# – Logon mechanisms)