MaxDB – On locking mechanisms and how we learn something new each day… Part II

As promised, here’s the second part of my short excursion into the shoals of lock management with MaxDB databases. (The first part can be found here: MaxDB – On locking mechanisms and how we get to know our MaxDB a bit better each day… Part I)

2) When deadlocks go unnoticed…

Ok, big buzzword in the title – I’m sure that just the word ‘deadlock’ will get me the audience this time 😉

Before we start with the actual point, let’s get clear about what is meant by ‘deadlock’.
Among the huge variety of possible locking and hanging situations parallel working systems can get into, deadlocks are very specific.
The point of deadlocks is not that the system is hanging for a long time, but that it is impossible for the processes involved in the deadlock to resolve it by themselves.

Since locking situations can be thought of as (mathematical) graphs, a deadlock can be defined as a closed cycle in such a graph.
The simplest example would look like this:

Process A  [PA] holds Resource A [RA]        Process B  [PB] holds Resource B [RB]

    ([PA],[RA]) <--- lock request of PB ---
                 --- lock request of PA ---> ([PB],[RB])

In this case, PA and PB each wait for the other to release the requested resource. But since both are waiting, neither process can actually release its lock – this is a deadlock.

Of course, deadlocks can be way more complex, involving many resources, more processes and sometimes even multiple application layers (these are really nasty, since there is usually no coherent view of such cross-layer locks).

One advantage of this rather abstract view of deadlocks is that it makes them easier to recognize.
That is what’s behind the deadlock detection feature of current DBMSs.

Whenever a process has to wait for a resource for a longer time (say, one second or so), the DBMS looks for such a deadlock graph and, if it finds one, ‘resolves’ the situation by telling one of the waiting processes that it won’t get the lock.

The general idea behind the feature is of course not to prevent deadlocks.
Deadlocks are usually design errors, bugs in the application program, and those cannot be fixed automatically.

However, it is important for heavy-duty databases to keep running as long as possible.

To make this possible, the deadlock detection and resolution helps a great deal.
Once a deadlock is removed, the whole system can continue its work, while only one transaction gets an error.

So far the story is rather nice, isn’t it?

The DBMS checks for deadlocks and makes sure that the system will stay responsive even if the application designers made a mistake.

Unfortunately, nothing is perfect – and neither is the deadlock detection in MaxDB.
As you may know (or learn now), MaxDB knows different kinds of SQL locks (a small sketch of how each kind typically arises follows the list):

  • Table locks
  • Row locks
  • Dictionary/Catalog locks
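
Just to make this list a bit more tangible, here is a minimal sketch of how each kind of lock typically comes into existence. I’ve written the LOCK statement and the WITH LOCK clause down from memory of the SQL reference, so please double-check the exact syntax for your release; the table and column names are the ones from the examples below.

-- row lock (exclusive), acquired implicitly by DML:
update locka set colb='XY' where cola=1
-- row lock requested explicitly by a reading statement:
select * from locka where cola=1 with lock exclusive
-- table lock requested explicitly:
lock table locka in exclusive mode
-- dictionary/catalogue lock, acquired implicitly by DDL:
alter table locka add (colc varchar(10))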

As long as the deadlock is just between table/row-locks everything works just as expected:

#### Session 1 (Isolation level 1, Autocommit off)
select * from locka
COLA  COLB
1     X
2     X
select * from lockb
COLA  COLB
1     Y
2     Y
update lockb set colb='YX' where cola=1
#### Session 2 (Isolation level 1, Autocommit off)
update locka set colb='XY' where cola=1

 

#### Monitoring session
select session, tablename, lockmode, lockstate, rowidhex from locks
SESSION  TABLENAME  LOCKMODE       LOCKSTATE  ROWIDHEX
8459     LOCKB      row_exclusive  write      00C1100000000...
8460     LOCKA      row_exclusive  write      00C1100000000...
Nothing special up to here – let’s create a deadlock:
#### Session 1
update locka set colb='XY' where cola=1
#### Session 2
update lockb set colb='YX' where cola=1
Auto Commit: Off, SQL Mode: Internal, Isolation Level: Committed
 General error;600 POS(1) Work rolled back
update lockb set colb='YX' where cola=1

*** corrected the update statements 20.10.09 22:02 ***

As we can see, the crosswise row lock requests (an update requires an exclusive row lock) are recognized as a deadlock and one session is rolled back.

Now let’s do this again, but let’s use shared (catalogue) locks as well…

#### Session 1
update lockb set colb='YX' where cola=1
#### Session 2
 update locka set colb='XY' where cola=1
#### Session 1
alter table locka add (colc varchar(10))
--> hangs !
#### Monitoring session
select session, tablename, lockmode, lockstate, rowidhex from locks
SESSION  TABLENAME      LOCKMODE       LOCKSTATE  ROWIDHEX
8459     LOCKA          row_exclusive  write      00C110000000000...
8460     SYS%CAT2       row_share      ?          FFFF00000000000...
8460     SYSDDLHISTORY  row_exclusive  write      00FFFE000000000...
8460     LOCKB          row_exclusive  write      00C110000000000...

 

Wow!

Besides our two already known row_exclusive locks on tables LOCKA and LOCKB we also find one for SYSDDLHISTORY and a row_share lock for SYS%CAT2.

What are those about?
Well, the lock on SYSDDLHISTORY belongs to an insert statement that MaxDB (as of release 7.7) performs automatically whenever a DDL statement is issued.
By that, the SYSDDLHISTORY table ends up containing all committed DDL statements – a neat feature, but one that has nothing to do with what we want to look at here.
The SYS%CAT2 lock, in turn, is the catalogue lock mentioned above.
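
By the way, if you’re curious about what ends up in SYSDDLHISTORY, a plain select is enough to have a look. Just a sketch – I’ve kept it to a select * because the exact column layout may differ between releases, and you may need the appropriate privileges:

#### Monitoring session
select * from sysddlhistory
-- expect one row per committed DDL statement; our ALTER TABLE statements
-- will only show up here once they have been committed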

Now let’s create the deadlock:
#### Session 2
alter table lockb add (colc varchar(10))
--> hangs !
#### Monitoring session
select session, tablename, lockmode, lockstate, rowidhex from locks
SESSION  TABLENAME      LOCKMODE       LOCKSTATE  ROWIDHEX
8459     SYS%CAT2       row_share      ?          FFFF00000000000...
8459     SYSDDLHISTORY  row_exclusive  write      00FFFE000000000...
8459     LOCKA          row_exclusive  write      00C110000000000...
8460     SYS%CAT2       row_share      ?          FFFF00000000000...
8460     SYSDDLHISTORY  row_exclusive  write      00FFFE000000000...
8460     LOCKB          row_exclusive  write      00C110000000000...
select tablename, h_applprocess as holder, h_lockmode,
r_applprocess as requestor, r_reqmode from lock_waits
TABLENAME  HOLDER  H_LOCKMODE     REQUESTOR  R_REQMODE
LOCKB      4132    row_exclusive  1904       sys_exclusive
LOCKA      1904    row_exclusive  4132       sys_exclusive

Now, this is, in fact, a deadlock but MaxDB does not do anything about it.

The reason for that is simple:
The deadlock detection does not include the share locks!

To be precise, for share locks the kernel does not maintain a list of session IDs, but only a single counter.
Based on this counter it is not possible to find out which session is holding or waiting for a specific share lock, and in consequence the kernel cannot tell which task to roll back.
In such a case, one of the user tasks needs to be cancelled manually, or the lock timeout will eventually reject the first request.
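
Since the kernel won’t resolve this for us, a little self-join on LOCK_WAITS can at least help to spot the simple crosswise case manually. Just a sketch, based on the columns shown above; it only catches the two-session case, longer cycles would need a more elaborate query:

#### Monitoring session
select a.tablename, a.h_applprocess as holder, a.r_applprocess as requestor
  from lock_waits a, lock_waits b
 where a.h_applprocess = b.r_applprocess
   and a.r_applprocess = b.h_applprocess
-- returns both halves of a crosswise wait (here the pair 4132 <-> 1904)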

Although this is an ugly limitation of the deadlock detection, it’s not really that bad in day-to-day DB usage.
The reason is simply that there are usually only a few DDL commands running in parallel – especially when it’s not the upgrade weekend.

3) The dead walk – how deleted rows reappear

Ok, one last thing 🙂

It’s a simple effect that I found to be surprising while I was playing around with locks during the ‘research’ phase for this blog.

#### Session 1
select * from locktest
THE_ROW  THE_ROW2
1        ?
10       ?
2        ?
3        ?
4        ?
5        x
6        x
7        x
8        x
9        x
delete from locktest where the_row >='5'
More than one row updated or deleted. Affected Rows:  5
-> SEE: no commit here!
#### Session 2
select * from locktest
THE_ROW  THE_ROW2
1        ?
10       ?
2        ?
3        ?
4        ?

 

Where is the data?

#### Session 1
rollback
#### Session 2
select * from locktest
THE_ROW  THE_ROW2
1        ?
10       ?
2        ?
3        ?
4        ?
5        x
6        x
7        x
8        x
9        x

 

There it is!

This is a really nasty surprise if you come from another DBMS like Oracle.
MaxDB currently (!) does not provide consistent-view concurrency and it does not reconstruct deleted rows.
Since deletions are done in place during statement execution (and not at commit time), the deleted rows are really just gone when the second session looks into the table.
There is nothing to tell the second session to look for old data – the data simply isn’t there anymore.

If your application really relies on a consistent view of the data, without data access phenomena like ‘dirty reads’, ‘non-repeatable reads’ etc., then you either need to use a higher transaction isolation level (and lose scalability by that) or make your application aware of this behaviour.
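
For the isolation level route, here is a rough sketch of the two knobs that come to mind. The CONNECT option and the WITH LOCK clause are written down from memory, user and password are of course placeholders, and the exact blocking behaviour depends on the level you choose – so treat this as a pointer to the documentation rather than a recipe:

#### Session 2 (instead of the plain select above)
-- option 1: connect with a stricter isolation level right away
connect someuser identified by somepassword isolation level 3
select * from locktest
-- option 2: request locks/isolation for a single statement only
select * from locktest with lock isolation level 3
-- with a stricter level the kernel requests shared locks for the read,
-- so the reader is serialized against session 1's open delete instead of
-- silently seeing the rows vanish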

Looking back

As we’ve seen, locking is not really something that is ‘just there’.
It can become pretty important to be able to differentiate between what locking can do for you and what it won’t do.

One important thing I have not mentioned explicitly yet: I’ve only been writing about SQL locks. But MaxDB (like other DBMSs) relies on many different shared resources that need to be protected/serialized as well.

For that task, MaxDB uses B*Tree-locks, critical regions, semaphores & mutexes, filesystem locks and the like.

So there are plenty of topics left to write about…

Resources

For more information on this area of MaxDB please check these resources:

MaxDB Internals Course – Locking

SAP Note #1243937 – FAQ: MaxDB SQL-Locks

MaxDB Documentation – Locks

Marketing

If you’re not already booked for October 27-29 this year and you happen to be in Vienna and keep asking yourself what to do… then get your ticket for SAP TechED 2009 and make sure to attend my MaxDB session!

In addition to the presentation there will be an expert session in the afternoon, where I’ll await your questions – which I hopefully can answer.
It’s session EXP349 MaxDB Q&A, Tuesday, 2:30 P.M., that you should register for.

 

The others

Ok, I admit, I don’t have a very good idea of MS SQL Server.

I do Oracle and MaxDB – that pretty much is it.

Of course, as a database support guy you always need to peek over the fence to the other DBMS (e.g. working on priority Very High messages during weekends) but this has nothing to do with gaining a certain level of real experience with the ‘other’ DBMS.

Although my MS SQL colleague usually sits just at the opposite end of the desk, everybody is busy enough working on his/her own stuff.

That makes me all the happier to have found the following blog about SAP on MS SQL Server:

Running SAP Applications on SQL Server

So if you’re into MS SQL Server you really want to pay this one a visit (as long as you return to good old Oracle and MaxDB afterwards ;-)).

As far as I’m informed the blog is written by several authors, some of them working at SAP in Walldorf most of their time.

Have fun reading!

Lars

MaxDB – On locking mechanisms and how we get to know our MaxDB a bit better each day… Part I

One of the most boring topics within the area of DBMS surely is locking.
It’s also one of the topics that most DB users try to avoid as much as possible.

Still, it is of utmost importance to correctly understand how locking works in the database of your choice, to be able to work successfully with it and without bad surprises.

Here are two cases of MaxDB locking behaviour that are not obvious at all (and also not yet well documented).
Note: the first part “grew” while I was researching it, so I split this blog into two parts. The second part will be released within the next few days.

1) When not to trust count(*)

With the advent of the file directory counters a neat query optimization for the popular ‘select count(*) …‘ was implemented.
Whenever the optimizer realizes that the user just wants to know the total number of rows in a table this number is retrieved from the file directory counter of the table and not from the table itself.

The performance gain is immense: instead of reading potentially thousands of pages and counting the rows found, MaxDB just looks up the counter and returns the number to the user.

But what happens when there is an open, uncommitted transaction changing the table while this query runs?
Let’s see:

####Session 1:
select * from locktest
THE_ROW
-------
1
10
2
3
4
5
6
7
8
9
select count(*) as cnt from locktest
CNT
---
10
explain select count(*) as cnt from locktest
SCHEMANAME  TABLENAME  COLUMN_OR_INDEX  STRATEGY                                PAGECOUNT
------------------------------------------------------------------------------------------
LARS        LOCKTEST                    TABLE SCAN                                       1
                                        COUNT OPTIMIZATION                      
                                             RESULT IS COPIED   , COSTVALUE IS           2
                                        QUERYREWRITE - APPLIED RULES:           
                                           DistinctPullUp                                1
####Session 2:
insert into locktest (select the_row+20 from locktest)
More than one row updated or deleted. Affected Rows:  10
####Session 1:
select count(*) as cnt from locktest
CNT
---
20
####Session 2:
rollback
Statement successfully executed. No Result
####Session 1:
select count(*) as cnt from locktest
CNT
---
10

As we can see, select count(*) returned a count including the additional 10 rows, although they were never committed!
Luckily, the explain tells us about the optimization by printing the “COUNT OPTIMIZATION” line.

A way out?

But how can we force MaxDB to actually deliver the result for committed rows?
Until writing this blog I thought the solution would simply be to trick the optimizer out of the COUNT OPTIMIZATION, e.g. by adding a dummy predicate that will evaluate to TRUE for all rows, let’s say 1=1.

Let’s see how this works:

####Session 1:
explain select count(*) as cnt from locktest where 1=1
SCHEMANAME  TABLENAME  COLUMN_OR_INDEX  STRATEGY                                PAGECOUNT
------------------------------------------------------------------------------------------
LARS        LOCKTEST                    TABLE SCAN                                       1
                                             RESULT IS COPIED   , COSTVALUE IS           3
                                        QUERYREWRITE - APPLIED RULES:           
                                           DistinctPullUp                                1

Ok, the “COUNT OPTIMIZATION” is gone, so we should expect the ‘correct’ counting of data…

####Session 2:
insert into locktest (select the_row+20 from locktest)
More than one row updated or deleted. Affected Rows:  10
####Session 1:
select count(*) as cnt from locktest where 1=1
CNT
---
20

WOW – wait a minute!
Still the same wrong result?
YES!
So I’ve written nonsense above?

Nope, what I wrote is actually true.
So how come that we still see the uncommitted data?

The point here is that there are some more optimizations built in that are anything but obvious.

Clever tricks

When applying the dummy predicate, MaxDB actually does walk through the whole table.
So far the explain is completely correct.

But what happens with the rows that are read is what makes up the odd behaviour.
For each row found, the kernel checks whether the row qualifies against the predicates.
Part of this check is to recognize that only counting is wanted and that no actual data from the table needs to be returned.
So the kernel takes a shortcut: it reports back that the row did not qualify, but increments a counter instead. This is done for all rows in the table.

The key point here is that the kernel only checks the locks on a row once it has been recognized as a “qualified row”.
Since this (obviously old) count(*) optimization takes its shortcut before the lock handling, it always delivers uncommitted-read results.

Phew… weird, isn’t it? But it gets even creepier!

Another way to fool the optimizer out of the set of optimizations is to use “count(column_name)” instead of “count(*)”.
The problem with that is that “count(column_name)” only counts those rows for which the column is not NULL.
This might not be a problem for most SAP tables, but surely can be an issue for others.

Basically, only two “safe” ways exist to force a committed-read count.
In both of the following cases, the kernel really reads all rows and really has to check locking, by looking for entries in the lock lists.
The two ways are:

1) using “count(pk_column_name)”

In our example column “THE_ROW” forms the primary key, so let’s check this counting approach:

####Session 2:
insert into locktest (select the_row+20 from locktest)
More than one row updated or deleted. Affected Rows:  10
####Session 1:
select  count(the_row) as cnt from locktest
CNT
---
10
SCHEMANAME  TABLENAME  COLUMN_OR_INDEX  STRATEGY                                PAGECOUNT
------------------------------------------------------------------------------------------
LARS        LOCKTEST                    TABLE SCAN                                       1
                                             RESULT IS COPIED   , COSTVALUE IS           3
                                        QUERYREWRITE - APPLIED RULES:           
                                           DistinctPullUp                                1

If we change the table and add a new column which might contain NULL values we can observe how fragile this approach can become:

####Session 2:
rollback
alter table locktest add (the_row2 varchar(20))
commit

Now set some values…

update locktest set the_row2='x' where the_row >='5'
More than one row updated or deleted. Affected Rows:  5
commit
select * from locktest
THE_ROW  THE_ROW2
-------  --------
1        ?
10       ?
2        ?
3        ?
4        ?
5        x
6        x
7        x
8        x
9        x

If we now use THE_ROW2 for counting, we get the number of rows where THE_ROW2 is NOT NULL, regardless of the locking state of the rows:

select count(the_row2) as cnt from locktest
CNT
---
5

And, of course, locking is adhered to in this case.
That means that without the commit after the update, a second session doing the count(the_row2) would simply be blocked until the commit has been issued (assuming we work at least in isolation level 1 – committed read), as the sketch below shows.
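
Just to make that blocking visible – a quick sketch of the situation, reusing the statements from above (isolation level 1, autocommit off):

#### Session 2
update locktest set the_row2='x' where the_row >='5'
-- no commit here!
#### Session 1
select count(the_row2) as cnt from locktest
--> hangs until session 2 commits or rolls back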

2) using multiple aggregate-functions at once e.g. “count(*), count(*)”

####Session 2:
insert into locktest (select the_row+20, the_row2 from locktest)
More than one row updated or deleted. Affected Rows:  10
####Session 1:
select count(*) as cnt_1, count(*) as cnt_2 from locktest
CNT_1   CNT_2
-----   -----
   10      10
   
SCHEMANAME  TABLENAME  COLUMN_OR_INDEX  STRATEGY                                PAGECOUNT
------------------------------------------------------------------------------------------
LARS        LOCKTEST                    TABLE SCAN                                       1
                                             RESULT IS COPIED   , COSTVALUE IS           3
                                        QUERYREWRITE - APPLIED RULES:           
                                           DistinctPullUp                                1

 

NOTE: Here, session 1 is also hanging until session 2 commits or rolls back!

That’s it (for now)

That’s it about the count(*) locking effects that may occur with MaxDB.
Quite strange stuff going on here, isn’t it?

So why aren’t there problems with MaxDB and the SAP products all the time?
That’s simply because the NetWeaver application servers use uncommitted read (a.k.a. isolation level 0) all the time!

The data access layer is designed with the knowledge that rows found in the database may already be outdated by the time they are read. Since there is a whole business transaction handling engine implemented in the NetWeaver stack (“above” the DBMS), this really doesn’t matter for the work processes.
Whenever a user really needs to get his hands on a row, the application has to request an enqueue anyhow, and that ensures that the correct values are retrieved and that no other users can change them while the user is working on the data.

And oh, yes, this is just another big argument against direct database access…

continue reading about locking behaviour in part II of this blog, coming up this week.