HANA backup script hint… a little bit more security, please!

+++ UPDATE 23.11.2011 +++

After the release of revision 20 (SPS 03) of the HANA database, some more changes concerning command syntax and system object naming were identified.
The commands in this blog post have been adapted to reflect this.

+++ UPDATE 23.11.2011 +++

+++ UPDATE 24.10.2011 +++
I’ve been informed that the current HANA revisions, starting with rev. 16 (available for download from the SAP Service Marketplace since 10 September), use an altered syntax for taking backups.
So, to make the examples easier to follow in the future, I have changed them in this blog accordingly.
+++ UPDATE 24.10.2011 +++

When you check the SAP In-Memory Database Backup and Recovery Guide on https://help.sap.com/hana you’ll find on page 13 (4.3.2 Using SQL Commands) the first reference to a command line SQL client:
hdbsql.

The example that follows explains how to create a very simple SQL script that can be fed into the hdbsql tool in batch mode.
That way a super simple schedulable backup command is created, e.g. for scheduling via CRON or any other scheduler tool.

This is what we find in the PDF (syntax updated – see update note above):

-----------8<------snip--------8<----------
[...]
Create a file backup.sql with the following content:
connect -n <server name> -i <instance number> -u <user> -p <password>
//
BACKUP DATA USING FILE ('Monday')
//
Store this file with read rights for the crontab user.
Execute the backup by calling the file backup.sql:
/usr/sap/<SID>/HDB<instance number>/exe/hdbsql -I backup.sql
[...]
-----------8<------snip--------8<----------

Of course, this is what most customers and partners do and use in their first HANA implementation.
And in the cases I’ve seen, the SYSTEM user was used for -u <user>.
This basically leads to a situation where a plain-text SQL script file contains the full logon data for the SYSTEM user of your HANA box.

NOT GOOD! NOT AT ALL!

But are there alternatives? Sure there are!

Step 1 (and keep repeating it like a mantra) – Never use the SYSTEM user for plain operation tasks like backups.

Instead, create a backup operator user that is not allowed to do anything other than perform backups.
Let’s check how to do this step by step:

Log in to the HANA box as SYSTEM. You can do all the steps in HANA Studio as well, but since I assume most of you are already familiar with the user management there, I present the command-line version here (I chose a reddish color for the SYSTEM account sessions, just to make them a bit easier to recognize – definitely NOT for decorative purposes…):

C:\Program Files\SAP\hdbclient>hdbsql -U S
Welcome to the SAP HANA Database interactive terminal.

Type:  \h for help with commands
\q to quit

hdbsql HAN=> select user() from dummy
CURRENT_USER
"SYSTEM"

1 row selected (0 usec)

So, I’m logged on as SYSTEM user. Great! Let’s go and create the user for backups!
(+++ 24.10.2011 – changed the logon from -U b to -U S for more clarity +++)

hdbsql HAN=> create user backop password 123
* 412: invalid password layout: minimal password length is [8] SQLSTATE: HY000

OK, I always forget – there is a password rule built into HANA and the minimal password length is 8 characters.
Let’s try again!

hdbsql HAN=> create user backop password 12345678
* 412: invalid password layout: password has to meet the rule ['A1a'] SQLSTATE: HY000

Grrr… I forgot again… this is really something I don’t like about password rules… but anyway: one more try!

hdbsql HAN=> create user backop password Aa12345678
0 rows affected (0 usec)

Success! Yes! Time for the first coffee break! (Ok, just kiddin’ we’re nearly done…)
Let’s connect with this user and try to take a backup!

hdbsql=> \connect -n vml3012 -i 00 -u backop -p Aa12345678
Connected to HAN@vml3012:30015
hdbsql HAN=>
hdbsql HAN=> BACKUP DATA USING FILE ('BACKME')
* 414: user is forced to change password: alter password needed for current user SQLSTATE: HY000

This is also something I already had a few support messages for.
The very first time you login with a specific user to HANA you are forced to change your password.
Unfortunately, there usually is no dialog for this (it exists only in HANA Studio – but which end user has access to that?), so one either needs access to a SQL command line (as we happen to have right now), or somebody must take care of it otherwise, e.g. by implementing a little Excel macro for it.
Anyhow, there should be a better general solution, but for now just move on and change the password:

hdbsql HAN=> alter user backop password Back1234
0 rows affected (0 usec)

Worked like a charm. Next step: take the backup!

hdbsql HAN=> BACKUP DATA USING FILE ('BACKME')
* 258: insufficient privilege: Not authorized SQLSTATE: HY000

Dang! We didn’t provide any of the required system privileges yet.

In a separate command line window I login again as SYSTEM and run the following:

hdbsql -U S

Welcome to the SAP HANA Database interactive terminal.

Type:  \h for help with commands
\q to quit
hdbsql HAN=> grant BACKUP ADMIN to backop
0 rows affected (1 usec)

That was easy cheesy, wasn’t it?
Back to our backop-session and try once again:

hdbsql HAN=> BACKUP DATA USING FILE ('BACKME')
0 rows affected (12 usec)

HOOORAY!

We finally did it! Now we’ve got a safe user that can login to the HANA DB but is only able to create backups.
Very nice.
Still, we would have to put the login data into the backup script file. But we have a second step to go…
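Pulling Step 1 together, the whole setup boils down to two statements, run as SYSTEM (or any user with the USER ADMIN privilege) – the passwords are just the example values from above:

```
-- Create the backup operator and grant the single privilege it needs:
CREATE USER backop PASSWORD Aa12345678;
GRANT BACKUP ADMIN TO backop;

-- On first logon, backop is forced to change the password:
-- ALTER USER backop PASSWORD Back1234;
```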

Step 2 – Create and use a user store entry

The MaxDB veterans amongst the readers of this blog will already have recognized it:
HDBSQL is a ported version of SQLCLI, the command-line SQL client for SAP’s very own MaxDB.
SQLCLI had a way to store user credentials and login data, called XUSER (I even wrote about it way back…).
And guess what: this has been ported as well.
It’s called hdbuserstore now but works pretty much the same way.

ATTENTION: hdbuserstore is part of the HANA client package ONLY. I wasn’t able to find it on a vanilla installation of HANA, so if you want to use it on the server (e.g. for the backup use case I present here) you have to install the HANA client on the server manually.

For this demonstration, I simply used the HANA client on my laptop. To start, let’s review the command line options for this tool:

C:\Program Files\SAP\hdbclient>hdbuserstore.exe -h
Usage: hdbuserprofileadm [options] command [arguments]
Options:
-u <USER>       perform operation for other operating system user
-v              verbose mode, print operations done
-h              this help message
Commands:
HELP
Print help message.
SET <KEY> <ENV> <USERNAME> <PASSWORD>
Set a profile entry.
<KEY>       entry key
<ENV>       connection environment (host and port)
<USERNAME>  user name
<PASSWORD>  password
DELETE <KEY>
Delete entry with key <KEY>.
LIST [<KEY> ...]
List entries of store. The passwords are not shown.

In short, the program allows us to create, display and delete sets of login data.
These sets have a

  • name or <KEY>,
  • the ‘connection environment’ <ENV> which means HANA hostname and port (in my case this is vml3012:30015),
  • of course a <USERNAME>
  • and the <PASSWORD>.

Let’s go straight ahead and create an entry:

hdbuserstore set b vml3012:30015 backop Aa12345678

As you see, I’ve used

  • b as <KEY>,
  • vml3012:30015 as <ENV>,
  • backop as <USERNAME>
  • and Aa12345678 as <PASSWORD>.

I wish the hdbuserstore command provided more feedback when everything goes OK – something along the lines of “OK, profile b successfully created!” – but today it’s more “no news is good news“…

Anyhow, we can review our newly created login profile:

hdbuserstore list
LIST command for all entries is not yet implemented.

I guess we have to provide exactly what we want to see:

hdbuserstore list b
KEY b
ENV : vml3012:30015
USER: backop

Et voilà, there is our entry.

One thing to keep in mind is that these entries are accessible only by the operating system user that created them.
So in order to use this for the CRON-based backup, you’d have to log in as the OS user that cron executes the backup script under and create the login profile there!

Now we put it all together and change the backup script to

BACKUP DATA USING FILE ('Monday')
//

With this file we can now call the backup just like this:

hdbsql -U b -I backup.sql

Again, we don’t get any output here if no errors occur.
This can become a bit annoying, especially when problems occur and the script needs debugging.
For that, you can add the -f parameter (show the SQL command) and/or the -t parameter (print debugging information).
The output then looks like this:

hdbsql -U b -I backup.sql -f
BACKUP DATA USING FILE ('Monday')

or like this:

hdbsql -U b -I backup.sql -f -t
Username  : ""
DB name   : ""
Instance  : "(null)"
server    : ""
xuserkey  : "b"
nullvalue : "?"
fieldsep  : "(null)"
cmdsep    : "//"
CMDTrace  : "TRUE"
autocommit: "TRUE"
noHeader  : "FALSE"
useDBS    : "TRUE"
batch     : "FALSE"
rollbOnErr: "FALSE"
SQL mode  : "(null)"
Command   : "NULL"
BACKUP DATA USING FILE ('Monday')

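For CRON scheduling, the call can be wrapped in a small script. The following is just a sketch under the assumptions from above (store key b, script file backup.sql); the hdbsql binary location may differ on your system and can be overridden via the HDBSQL variable:

```shell
#!/bin/sh
# Sketch of a cron wrapper for the user-store based backup call.
# No credentials appear here - they live in the hdbuserstore entry "b",
# which must exist for the OS user that cron runs this script as.
HDBSQL="${HDBSQL:-hdbsql}"   # path to hdbsql, overridable for testing

run_backup() {
    # -U b : use the hdbuserstore key "b" instead of explicit logon data
    # -I   : execute the statements from the given script file
    "$HDBSQL" -U b -I "${1:-backup.sql}"
}
```

A cron entry would then simply call `run_backup` (e.g. by appending `run_backup "$@"` to the script and scheduling it), redirecting output to a log file so that error messages are not lost.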

That’s it once again!

As always I hope you liked the hands-on style of my blog and that you share your experiences with HANA.

How most BW performance support messages are solved

Now that the fourth quarter of the year has started, everybody seems to be checking the goals that were set for this year and what has already been achieved. Looking back on the messages I worked on, I found a very clear pattern when the message was about BW on Oracle and performance.

Most messages were solved by the very same simple five steps:

  • ensuring the database software is up to date
  • ensuring the database parameters are set as recommended
  • ensuring that the CBO statistics were, in fact, current and correct
  • ensuring that all indexes are present and usable
  • ensuring that the f-fact table doesn’t have too many partitions

Yes, these steps are boring and not very popular.
Recommending them doesn’t require a lot of special analysis, either.
In fact, it’s a rather quick thing to check on these points and tell whether they need to be applied. It reminds me of the flossing discussions with the dentist… Anyhow, let’s review these points briefly, one by one:

Up to date Oracle database software

BW systems regularly present the Oracle query optimizer with highly complex problems to solve.
And the Oracle cost-based optimizer (CBO) is by far the most complex piece of coding in the whole DBMS software.
So, even without the long history of epic optimizer errors (check note #176754), it should be obvious that it’s practically impossible for this code to be bug-free.
For example, one issue that frequently struck with the early 11.2 patches was that the optimizer started to use the infamous MERGE JOIN CARTESIAN (MJC) operation.
Due to bugs and/or outdated statistics, the CBO believed it would get one or zero rows back from a row source, and with that assumption an MJC can seem a good choice.
Once you get more rows than that, you easily end up with PSAPTEMP overflows (ORA-1652) and horrid runtimes.

And sorry, but Oracle 8i, 9i and 10g are all outdated and no longer supported.
See note #1110995 (Extended maintenance for Oracle Version 10.2) on this.
Of course, nobody can stop you from using Oracle 10g; it’s your system landscape, after all.
But be prepared to have to install Oracle 10.2.0.5 PLUS the latest SAP Bundle Patch before any in-depth analysis (e.g. via a CBO trace) is performed.
We’re not chasing bugs in these old versions anymore.
Besides: very likely the problems just won’t be there with the latest patch anyway. It’s pretty mature nowadays.

Database parameters

Well, this pretty much just goes with the paragraph before: without the correct parameter settings that fit the current version and patch of the database software, bugs and issues cannot be avoided, even if they are already known.

Current/fitting CBO statistics

Still, one of the links that I give to my customers most often is the one to my own old blog post

BW on Oracle: a performance hitch I see more and more often…

Even today it seems that there are still many systems where the DBSTATC needs to be cleaned up.
Today I’ll just add a small detail: if you still use report SAP_ANALYZE_ALL_INFOCUBES to collect statistics, you really should stop that.
It basically brings back the DBSTATC entries…

Another aspect of this topic is that the standard heuristic for gathering new statistics is based on a threshold of changed data in a table. Usually, this threshold is 50% of changed (that is: added, updated or deleted) rows. For BW systems this can be suboptimal. For example, time dimension tables might get new data, so that new, say, 0CALMONTH values are suddenly present in the table but missing from the histograms. Relying on the standard heuristic, the CBO won’t notice that, as no new statistics have been gathered. One way to overcome this is to force new statistics on single tables.
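Forcing fresh statistics on a single table could be sketched like this with Oracle’s DBMS_STATS package (the owner and table names below are made-up placeholders – in an SAP system you would normally go through BRtools instead of calling the package directly):

```
-- Refresh optimizer statistics for one time dimension table.
-- 'SAPR3' and '/BIC/DSALESCUBET' are placeholder names.
BEGIN
  DBMS_STATS.GATHER_TABLE_STATS(
    ownname => 'SAPR3',
    tabname => '/BIC/DSALESCUBET',
    cascade => TRUE);  -- also refresh the index statistics
END;
/
```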

Another step to take would be to force the re-collection of statistics for tables that have very old (say, older than a year) statistics.

Unfortunately, there is no such automatism built into BRtools or BW yet, but internal discussions on this have started!
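Until then, candidates with very old statistics can be spotted with a simple query against the data dictionary, for example:

```
-- Tables whose optimizer statistics are older than one year
SELECT owner, table_name, last_analyzed
  FROM dba_tab_statistics
 WHERE last_analyzed < ADD_MONTHS(SYSDATE, -12)
 ORDER BY last_analyzed;
```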

Indexes present and usable

Indexes are critical for the optimizer. They speed up queries, provide data access paths and sometimes even allow leaving out a table access altogether.
But only if they are actually present on the database and usable.
For BW systems, where table partitioning is used by default, it’s important to have the indexes partitioned as well.
Otherwise, partition-wide actions like DROP PARTITION/TRUNCATE PARTITION will render the indexes on the table UNUSABLE or INVALID.
And they cannot be used for queries in this state.
The most common cases are described or at least referenced in note
#1513510 – 900 index in unusable state with SAP Basis 7.00/7.01/7.02
Although I initially wrote the note to cover a specific bug in an SAP Basis SP, it contains extensive explanations, a check SQL statement and links to other related SAP notes.
Make sure to check it out and to get your indexes straight.
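This is not the exact check statement referenced in the note, but a simple query along the same lines would be:

```
-- List unusable indexes and unusable index partitions
SELECT owner, index_name, status
  FROM dba_indexes
 WHERE status = 'UNUSABLE'
UNION ALL
SELECT index_owner, index_name || ' (' || partition_name || ')', status
  FROM dba_ind_partitions
 WHERE status = 'UNUSABLE';
```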

Too many f-table partitions

F-fact tables can be seen as the data inbox of an InfoCube.
The request-wise partitioning allows quick and easy data removal in case the loaded data is not OK.
Usually this shouldn’t happen once your data mart has been properly set up and stabilized.
Anyhow, keeping a good 30 partitions (e.g. a month’s worth of data) won’t do any harm.
But more partitions than that make the optimizer’s work much more difficult.
Also, nearly nobody using business reporting checks on specific data load requests.
Instead, queries are usually built upon a time reference.
The E-fact table can be partitioned by time, and that way the query performance can be improved by factors.

So, make sure to check blog post
Partition count contest!
and note #590370 – Too many uncompressed requests (f table partitions)
and start compressing your cubes today.
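To get a quick overview of which F-fact tables have piled up too many partitions, a query like the following can help (assuming the usual /BIC/F naming convention for the F-fact tables):

```
-- Count partitions per F-fact table; more than ~30 is worth a look
SELECT table_owner, table_name, COUNT(*) AS part_count
  FROM dba_tab_partitions
 WHERE table_name LIKE '/BIC/F%'
 GROUP BY table_owner, table_name
HAVING COUNT(*) > 30
 ORDER BY part_count DESC;
```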

Is this it? Yep, kinda…

Are there other possible performance problems in the BW-on-Oracle world?
Sure there are. But they aren’t nearly as frequent as the ones that are solved for good by implementing the five steps discussed here.
As a little extra, now that you know about these five steps, you can plan and decide when you implement them – you don’t have to wait until you hit a serious performance issue and open a support message for that 🙂

That’s it for now. Have a great weekend!

Cheers, Lars

HANA operational concept released to customers

During the past months, HANA made a fairly big wave in the SAP community. It looked like everybody wanted to get their hands on it and gather some experience with this new piece of technology. With that, tons of questions arose about how to operate such a system.

The recently released note #1599888 – SAP HANA: Operational Concept answers these questions and clearly describes what’s supported and what’s not.

I’m pretty sure that this will make life a lot easier for everybody involved in HANA projects.

A big THANK YOU from my side to the authors of this note!