A classic in disguise…

In my last blog entry “Starting off with a classic…” I presented a classic issue that may appear when the database software is upgraded.

The problem was that sometimes result sets are delivered to the user in a sorted manner even if no ORDER BY was specified. If the user/developer relies on this order it may be a cause for application errors, as the database don’t guarantee the order if ORDER BY is omitted.

Now I came across a customer message that described a problem after the upgrade from Oracle 9i to Oracle 10g which was quite similiar – but only on the second look.

THE TABLE

To understand this problem it’s handy to have a demo table.
My one looks like this:

create table tt (cdate varchar2(8), info varchar2(10));

For the test, we also need some data:

insert into tt values('20070101', 'TEST-JAN');
insert into tt values('20070104', 'TEST-JAN');
insert into tt values('20070104', 'TEST-JAN');
insert into tt values('20070104', 'TEST-JAN');
insert into tt values('20070104', 'TEST-JAN');
insert into tt values('20070124', 'TEST-JAN');
insert into tt values('20070124', 'TEST-JAN');
insert into tt values('20070124', 'TEST-JAN');
insert into tt values('20070201', 'TEST-FEB');
insert into tt values('20070201', 'TEST-FEB');
insert into tt values('20070203', 'TEST-FEB');
insert into tt values('20070212', 'TEST-FEB');
analyze table tt compute statistics;

Now a quick check, what’s in the table now:

break on cdate skip 1
select * from tt order by cdate asc;
CDATE    INFO
-------- ----------
20070101 TEST-JAN

20070104 TEST-JAN
         TEST-JAN
         TEST-JAN
         TEST-JAN

20070124 TEST-JAN
         TEST-JAN
         TEST-JAN

20070201 TEST-FEB
         TEST-FEB

20070203 TEST-FEB

20070212 TEST-FEB
12 rows selected.

As we see there are several different dates with different numer of entries for the different dates. That’s pretty much like it was on the customer system.

The Query

The customer now wanted to know the newest cdate of the table. That is of course the one that is the highest one.
He complained that after the upgrade to Oracle 10g the database does NOT return the highest cdate value anymore. So, what happened?
What he coded was:

select max(cdate) from tt group by cdate;

The result sets looked like this:

Oracle 9i:
MAX(CDAT
--------
20070101
20070104  >
20070124  >
20070201  >
20070203  >
20070212  >
6 rows selected.
Oracle 10g:
MAX(CDAT
--------
20070203
20070212   >
20070201     <
20070101     <
20070104     >
20070124     >
6 rows selected.

For easier comparisation I marked each line with < or > denoting that the current line cdate is smalle < or bigger > than the preceding one.
So the customer made a correct observation: the result of his query was sorted before the upgrade and is not sorted afterwards. But as we already now, the order of the  result set was not guaranteed since no ORDER BY clause was included.

Anyhow we want to know why this is happening…

The Analysis

With Oracle 10g the query  is executed like this:

set autotrace traceonly exp
select max(cdate) from tt group by cdate;
-------------------------------------------
| Id  | Operation          | Name | Rows  |
-------------------------------------------
|   0 | SELECT STATEMENT   |      |     6 |
|   1 |  HASH GROUP BY     |      |     6 |
|   2 |   TABLE ACCESS FULL| TT   |    12 |
-------------------------------------------

Switching back to Oracle 9i we see that something has changed here:

alter session set optimizer_features_enable='9.2.0';
select max(cdate) from tt group by cdate;
-------------------------------------------
| Id  | Operation          | Name | Rows  |
-------------------------------------------
|   0 | SELECT STATEMENT   |      |     6 |
|   1 |  SORT GROUP BY     |      |     6 |
|   2 |   TABLE ACCESS FULL| TT   |    12 |
-------------------------------------------

While Oracle 9i used a SORT GROUP BY operation to produce the result set Oracle 10g employs a HASH GROUP BY.
As the name already tells, the former one sorts the data for this operation thus producing a sorted resultset. So up to now we know that the result set order changed and why it has changed, but still the question is: why does it produce a wrong result for MAX()?
The answer is simple: it does not produce a wrong result at all.

The Solution

What is wrong in this example is the statement itself. The customer wanted the newest cdate value in the table. But he asked for the highest value for each different value, since he grouped his data over cdate. As he always got back a resultset that included all distinct values for cdate the developer must have scrolled through the result set to get the highest value…

The correct implementation of this question in SQL would of course be this one:

select max(cdate) from tt;

This one would deliver only one single value (SELECT SINGLE!): the highest value for cdate.

No sort order in the result set, no scrolling trough it. Just the query and the result.

As you can see, this “problem” might come up at very different corners of your SQL code.

So better doublecheck it, before assuming a database bug.

Best regards,
Lars

Starting off with a classic…

Every now and then a new version of the database software comes along.
Upgrades of Oracle or MaxDB often include major improvements concerning stability and performance. These improvements can and should be read as: things run different now.
A common example of how different things work and a “classic” issue in support is the following: “We upgraded our database software and now the data is not returned in the same order as before.”

Of course, at first sight, the new and presumely better version of the database software “forgot” how to sort things correctly. Therefore a support message is opened.
But look (and think) again:
If the Statement, that is now giving back data in a different order, does not include the ORDER BY clause, than the database has done nothing wrong.

In each and every database manual and also in the sql standard you’ll find a hint to this:
Unless ORDER BY is supplied, the database does NOT guarantee the order of returned rows.
If your application logic relies on the order, then use ORDER BY.
And really that’s it! Period.
So why can this happen at all?
The answer to this is (most often): a change of data access paths.
Let’s make an example (this time with Oracle 10g):

 

1. We create two tables SMALL and BIG:

 

create table small as (select object_name from dba_objects);
create table big as (select object_name, object_type from dba_objects);

repeat this a few times:
insert into big (select * from big);
Result: there’s now a 1:m relationship between the object_name in SMALL and the object_name in BIG.

2. Create index on BIG and collect cbo statistics:
create index i_big on big (object_name);
analyze table small compute statistics;
analyze table big compute statistics for table for all indexes;

 

 

3. Let’s see which order the data has right now:
select b.* from small s, big b where s.object_name = b.object_name;

 

OBJECT_NAME          OBJECT_TYPE
-------------------- -----------
I_COBJ#              INDEX
C_FILE#_BLOCK#       CLUSTER
SEG$                 TABLE
I_UNDO1              INDEX
I_COL1               INDEX
I_IND1               INDEX
I_COL3               INDEX
I_CDEF2              INDEX
I_USER#              INDEX
[...]

So, this is not ordered anyhow…

 

4. How has this been when the rule based optimizer was used? The data was sorted back then!
select /*+ rule */ b.* from small s, big b where s.object_name = b.object_name;

OBJECT_NAME          OBJECT_TYPE
-------------------- --------------------
I_COBJ#              INDEX
I_COBJ#              INDEX
I_COBJ#              INDEX
I_COBJ#              INDEX
I_COBJ#              INDEX
I_COBJ#              INDEX
I_COBJ#              INDEX
I_COBJ#              INDEX
I_COBJ#              INDEX
I_COBJ#              INDEX
I_COBJ#              INDEX
I_COBJ#              INDEX
I_COBJ#              INDEX
I_COBJ#              INDEX
I_COBJ#              INDEX
I_COBJ#              INDEX
C_FILE#_BLOCK#       CLUSTER
C_FILE#_BLOCK#       CLUSTER
C_FILE#_BLOCK#       CLUSTER
C_FILE#_BLOCK#       CLUSTER
C_FILE#_BLOCK#       CLUSTER
C_FILE#_BLOCK#       CLUSTER
[...]

 

 

5. So why is that ? It’s all about access paths !
explain plan for
select /*+ rule */ b.* 
from small s, big b
where s.object_name = b.object_name;
select * from table(dbms_xplan.display);

PLAN_TABLE_OUTPUT
----------------------------------------------
Plan hash value: 1302638698
----------------------------------------------
| Id  | Operation                    | Name  |
----------------------------------------------
|  0  | SELECT STATEMENT             |       |
|  1  |  TABLE ACCESS BY INDEX ROWID | BIG   |
|  2  |   NESTED LOOPS               |       |
|  3  |    TABLE ACCESS FULL         | SMALL |
|* 4  |    INDEX RANGE SCAN          | I_BIG |
----------------------------------------------

With the rulebased optimizer the available index has been used to make up the join. As a coincidence the data has been delivered the way it was picked up: sorted in the index sort manner.

When the RBO-Usage vanished with Oracle 10g (at the latest), other join strategies have been considered:

 

explain plan for select  b.* from small s, big b where s.object_name = b.object_name;

select * from table(dbms_xplan.display);
------------------------------------
| Id  | Operation          | Name  |
------------------------------------
|   0 | SELECT STATEMENT   |       |
|*  1 |  HASH JOIN         |       |
|   2 |   TABLE ACCESS FULL| SMALL |
|   3 |   TABLE ACCESS FULL| BIG   |
------------------------------------

*) I omitted some information here to make the comparisation easier.

As you can see, now the index is not used at all. The join is done via a HASH join. This hash join does not take care about the way the data is sorted, but is pretty quick for the join itself.
The result is a quicker join but the “order” of rows has completely vanished.

 

6. Now, what do you guess will happen to the access path if we ask for a sorted result?

Index access? Guess again!

 

explain plan for select  b.* from small s, big b where s.object_name = b.object_name order by b.object_name;
-------------------------------------
| Id  | Operation           | Name  |
-------------------------------------
|   0 | SELECT STATEMENT    |       |
|   1 |  MERGE JOIN         |       |
|   2 |   SORT JOIN         |       |
|   3 |    TABLE ACCESS FULL| SMALL |
|*  4 |   SORT JOIN         |       |
|   5 |    TABLE ACCESS FULL| BIG   |
-------------------------------------

A SORT-MERGE-Join is done. This join method does not only care about the sort order, it relies on it.

As these examples are valid for Oracle the same effects can be demonstrated with different versions/releases of MaxDB (former SAP DB).

For the SQL-User (usually the application or report developer) knowing this effect should lead to the conclusio:

IF THE DATA SHOULD BE SORTED, ASK FOR IT. USE ORDER BY.
NO ORDER BY – NO GUARANTEED ORDER.

Best regards,  Lars