Anyone who has worked with SQL databases for a while will likely have noticed how beginners struggle with “declarative code” and the idea of “telling the DB what you want, not how to do it”.
In my experience, this issue stems partly from the way SQL is taught – examples are often far too simplistic to cover “advanced” topics – and partly from the very different way of thinking required when one comes from Python, Java, or the whole notion of “telling the computer what to do”.
This often leads to horrendous SQL code and data models that neither give correct results nor deliver them fast.
Despite all of this, I see many questions around tiny optimization options, command variations and “optimizer flags” from beginners all the time.
Astonishingly, this approach of looking for workarounds for problems with the data model understanding occurs regardless of the DBMS platform used.
– Lars Breddemann
What’s often missing in those questions is the broader context: WHY they want the DB to do something and WHAT should eventually be achieved.
Basically, the core question “WHAT should be the outcome, the result?” remains unanswered both on top- and bottom-level.
Maybe this is because it is hard to express this desired outcome in a good way?
I recently read a really good presentation slide deck from Markus Winand that, besides explaining an interesting new SQL standard feature (row pattern matching), makes use of a nice way to illustrate what data should be selected and returned.
The example used is non-trivial and certainly beyond what most SQL 101 courses cover and that’s another positive! The visualization he uses gives a really good grasp on the kind of data and what the result should be.
Maybe this could be a pointer for how to improve your SQL writing in 2019: draw a picture of the data you have and the aspects of it you want, and then build your SQL statement step by step based on this. This is what I do in many cases when I try to understand what a SQL statement does, so the technique works both for constructing a new statement and for deciphering existing ones.
To finish this slightly ranting post, here are some links to presentations and Twitter accounts that I think are worthwhile to read or follow (not exhaustive at all). If you do SQL programming as part of your daily struggle-for-bread work, take some time and check these resources.
About a year ago I answered a question on the SAP Community Q&A site and have it on my posts-to-write list ever since to turn it into a blog post.
I feel that I learn a lot from seeing applied examples of concepts and rules – for example from this great presentation from Kevlin Henney that includes a “talk-through” a refactoring example. So, this is my example for others to review.
The premise of the question is rather common: a piece of code that was used on Oracle should be made to run on SAP HANA. The OP had already managed to translate several parts of the Oracle code to work on SAP HANA but got stuck with an SQLScript error message. The SAP HANA version used by the OP did not support BETWEEN as a check condition in an IF-clause. Current SAP HANA versions meanwhile support this, but the code suffered from much more than just a lack of syntactic sugar. Here’s the original user-defined scalar function:
create FUNCTION "KABIL_PRACTICE"."F_KEEP_DATE"( TDATE DATE, TTODAY DATE)
RETURNS RETVALUE CHAR
SQL SECURITY INVOKER AS
BEGIN
-- PURPOSE: RETURNS T/F WHETHER OR NOT THE DATE MEETS THE CRITERIA TO MAINTAIN DATA.
--
-- VER   DATE       AUTHOR          DESCRIPTION
-- ----- ---------- --------------- ------------------------------------
-- 1.0   6/18/2008                  1. CREATED THIS FUNCTION.
RETVALUE := 'F';
IF :TDATE BETWEEN ADD_DAYS (TO_DATE (:TTODAY, 'YYYY-MM-DD'), -30) AND :TTODAY
   OR TO_CHAR( EXTRACT (MONTH FROM ADD_DAYS (TO_DATE (:TDATE, 'YYYY-MM-DD'), 1))) <>
      TO_CHAR( EXTRACT (MONTH FROM TO_DATE (:TDATE, 'YYYY-MM-DD')))
THEN
    RETVALUE := 'T';
END IF;
END;
Step 1: remove the fluff
First, remove the useless fluff of comments. The change history of the code is captured in the code repository, so keeping a log of changes is pointless here. This implies that there is a code repository, of course. SAP HANA comes with the repository for XS (classic) and git support for XSA out of the box and both options allow for distributed development and change tracking of development resources. Just removing the comments and adding line-breaks and some indentation gets us to this:
create FUNCTION "KABIL_PRACTICE"."F_KEEP_DATE"
    ( TDATE DATE, TTODAY DATE)
RETURNS RETVALUE CHAR
SQL SECURITY INVOKER AS
BEGIN
    RETVALUE := 'F';
    IF :TDATE BETWEEN ADD_DAYS (TO_DATE (:TTODAY, 'YYYY-MM-DD'), -30)
                  AND :TTODAY
       OR TO_CHAR (EXTRACT (MONTH FROM ADD_DAYS
                            (TO_DATE (:TDATE, 'YYYY-MM-DD'), 1)))
          <> TO_CHAR (EXTRACT (MONTH FROM
                               TO_DATE (:TDATE, 'YYYY-MM-DD')))
    THEN
        RETVALUE := 'T';
    END IF;
END;
Step 2: Find out what the code should do
It’s a single IF statement, so the job of this function is to encapsulate the condition check. Let’s focus on that and find out which conditions are checked here:
:TDATE BETWEEN ADD_DAYS (TO_DATE (:TTODAY, 'YYYY-MM-DD'), -30)
       AND :TTODAY

OR

TO_CHAR (EXTRACT (MONTH FROM ADD_DAYS
                  (TO_DATE (:TDATE, 'YYYY-MM-DD'), 1)))
<> TO_CHAR (EXTRACT (MONTH FROM
                     TO_DATE (:TDATE, 'YYYY-MM-DD')))
We see that there are in fact two conditions, of which either one needs to be true for the whole check to succeed. Still, with all the data type conversions in place, it is hard to see what is being checked here. So let’s get rid of them where we can.
Step 3: remove type conversions where possible
Something I see quite a lot in converted code is that date values get converted into text values and back again. This is probably due to practices on Oracle, but on SAP HANA, date values can be used directly without the need for conversion or a data format specification. A date does not have a specific format; a format only comes into play when the value should be printed to the screen. More on this here.
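To illustrate the point that a date value carries no display format, here is a small Python sketch (Python purely as an illustration of the concept, not HANA code):

```python
from datetime import date

# a date value has no inherent display format; a format string only
# comes into play when the value is rendered as text for output
d = date(2018, 6, 18)
print(d.isoformat())           # 2018-06-18
print(d.strftime("%d.%m.%Y"))  # 18.06.2018 - same value, other rendering
```

The same is true for DATE columns in the database: the value is stored format-free, and the format string belongs to the conversion to/from text, not to the date itself.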
:TDATE BETWEEN ADD_DAYS (:TTODAY, -30) AND :TTODAY

OR

EXTRACT (MONTH FROM ADD_DAYS (:TDATE, 1))
<> EXTRACT (MONTH FROM :TDATE)
Now we can see that the first condition checks whether :TDATE is within the past 30 days.
The second condition checks whether the day after :TDATE is still in the same month as :TDATE. In other words, this condition checks whether :TDATE is the last day of the month.
Summarizing, the function returns 'T' (TRUE) when the provided date is either within the last 30 days or when the provided date is the last day of the month.
Checking the Datetime functions in SAP HANA we find a couple of useful functions that we can use to make this code clearer.
Step 4: make use of built-in functions
The LAST_DAY function returns the last day of the month for the date it is used on, saving us the manual steps of extracting the month and checking if the next day would be a different month.
:TDATE BETWEEN ADD_DAYS (:TTODAY, -30) AND :TTODAY
OR :TDATE = LAST_DAY(:TDATE)
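The last-day-of-the-month logic can be sketched in Python to see why the two formulations are equivalent (again, a plain illustration, not HANA code):

```python
from datetime import date, timedelta

def is_last_day_of_month(d: date) -> bool:
    # the day after d falls in a different month exactly when d is the
    # last day of its month - the check the original EXTRACT/ADD_DAYS
    # combination performed, and what LAST_DAY encapsulates
    return (d + timedelta(days=1)).month != d.month

print(is_last_day_of_month(date(2018, 6, 30)))  # True
print(is_last_day_of_month(date(2018, 6, 18)))  # False
```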
Step 5: Pick names that help understanding the code
Removing the TTODAY parameter is a bit tricky, but we should remove it in order to make the function easier to understand.
This parameter could, in fact, be used as a flexible reference or anchor date. In that case, it should be renamed to something like reference_date.
But I have the suspicion that it really is only ever used with today’s date. This allows us to remove the parameter altogether.
We also should pick a better name for both the parameter and the function itself and change the return value to the more commonly used integer type as a substitute for a BOOLEAN.
The original comment indicated that the purpose of the function is to check whether data was still in a time window during which it can be modified.
I am not really happy with dataInEditPeriod for the function name but it sure is more obvious than F_KEEP_DATE. Indicating the object type (‘F‘ for function? ‘SP‘ for stored procedure?) in the object name does not make it easier to understand the code. If I can access the database via SQL I can easily look up what type any object has. The remaining “KEEP_DATE” part is problematic as it refers to the intended usage scenario of the function (determining whether data should be kept). This says something about what the function shall be used for instead of what it does. My proposed alternative dataInEditPeriod tries to convey what the function does on the semantic level of the data model. Here, we are dealing with data that has a time frame in which it can be changed: “EditPeriod” and the function checks if the data is in this period.
As for the parameter, how about checked_day?
Finally, swap the clumsy RETVALUE for RESULT and we get this:
create FUNCTION "dataInEditPeriod" (checked_day DATE)
RETURNS result INTEGER
SQL SECURITY INVOKER AS
BEGIN
    result := 0;
    IF :checked_day BETWEEN ADD_DAYS (current_date, -30)
                        AND current_date
       OR :checked_day = LAST_DAY (:checked_day)
    THEN
        result := 1;
    END IF;
END;
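For readers more at home in an imperative language, the whole cleaned-up check boils down to this Python sketch (illustration only; the optional second argument stands in for current_date so the logic is testable):

```python
from datetime import date, timedelta

def data_in_edit_period(checked_day, today=None):
    # returns 1 when checked_day lies within the last 30 days or is the
    # last day of its month, 0 otherwise - mirrors "dataInEditPeriod"
    today = today or date.today()
    in_window = today - timedelta(days=30) <= checked_day <= today
    is_last_day = (checked_day + timedelta(days=1)).month != checked_day.month
    return 1 if in_window or is_last_day else 0

print(data_in_edit_period(date(2018, 6, 30), today=date(2018, 8, 15)))  # 1
print(data_in_edit_period(date(2018, 6, 18), today=date(2018, 8, 15)))  # 0
```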
Step 6: review how the function gets used
The calling code would then read like this:
if "dataInEditPeriod" (:CC_DATE) = 1
That is not perfect, but I’d say it is heaps easier to understand than the starting position.
A recurring question of folks using SAP HANA is how to only run a DROP command for objects that actually exist in the database. One could argue that the effect of executing DROP does not depend on whether the object to be dropped exists, because after running the command it will not be there either way, but there is a bit more to it.
See, DROP commands – like, say, DROP TABLE – report back an error message when the object cannot be found (e.g. because it does not exist).
drop table bla;
Could not execute 'drop table bla'
SAP DBTech JDBC: : invalid table name: BLA: line 1 col 12 (at pos 11)
This can be annoying when running scripts to set up a database structure which is a common part of product installation and update procedures.
On MS SQL Server there exists (ha ha ha – extra point for the flattest pun) an extension to the DROP command (and to other DDL commands): DROP IF EXISTS
SAP HANA up to now (HANA 2 SPS 03) does not provide such a functionality, but it is easy to come up with a workaround.
The following utility function can be used:
drop function obj_exists;
create function obj_exists (in schema_name NVARCHAR(256)
                          , in object_name NVARCHAR(256))
returns obj_exists int
as
begin
declare _exists int := 0;
select case
          when count(*) > 0 then 1
          else 0
       end into _exists
from objects
where schema_name = IFNULL (:schema_name, current_schema)
and   object_name = IFNULL (:object_name, '');
obj_exists = :_exists;
end;
The function returns 1 if the object exists and 0 if it does not. If the input parameters contain NULL, then the function still works using the CURRENT schema but will likely return 0 as output.
select obj_exists (schema_name => NULL
                 , object_name => 'OBJ_EXISTS') as OBJ_EXISTS
from dummy;
The above worked because the OBJ_EXISTS function was created in the current schema. With that one can easily write a SQL Script that “looks before it fires“.
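The same “look before you fire” pattern can be demonstrated outside of HANA; here is a hypothetical Python/SQLite sketch, where SQLite’s sqlite_master catalog stands in for HANA’s OBJECTS view:

```python
import sqlite3

def obj_exists(con, object_name):
    # look the object up in the catalog instead of just issuing the
    # DROP and catching the "object not found" error
    row = con.execute(
        "select count(*) from sqlite_master where name = ?", (object_name,)
    ).fetchone()
    return 1 if row[0] > 0 else 0

con = sqlite3.connect(":memory:")
con.execute("create table bla (id integer)")
print(obj_exists(con, "bla"))  # 1
if obj_exists(con, "bla"):
    con.execute("drop table bla")
print(obj_exists(con, "bla"))  # 0
```

The guard makes setup scripts idempotent: running them twice no longer produces an error on the DROP.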
There is one predominant reason why databases and database technology get a spot in conversations with “the business”: performance. Understanding system performance and coming up with ideas for improving it typically requires end-to-end understanding, with different levels of detail, of how a query from a client application is processed by SAP HANA. Before long, one looks into “explain plans”, “execution paths”, SQL, PlanViz diagrams, data distributions, table definitions, stack traces and even NUMA process management on the Linux/hypervisor level. That’s a lot of concepts and layered technology, and most of those concepts come with a development history of at least 30 years, if not more. This means that prior knowledge is not only implied but required to get a proper understanding of what concepts are at work in your specific setup. Intuition about how the system works is not your friend here.
But where do you learn about all these things? The product documentation? A university SQL 101 class? Blog posts from strangers on the Internet? Books? Or a mooc with a certificate?
Personally, I’m always happy to find good collections of material, maybe even curated into topics. One of those collections is The Red Book (Readings in Database Systems), which covers all important topics of Database Management Systems. It even features a chapter “3. Techniques Everyone Should Know” in case you really only want to have a quick overview.
I read about this awesome (and free) online book in a blog that I can only recommend: The Morning Paper by Adrian Colyer. He linked to the red book in his post (Database) Techniques Everyone Should Know. There are also many more recent posts about database-related papers, like Learned Index Structures or the design and implementation of modern column-oriented database systems, which explain current developments in DBMS technology that will probably end up as features in one of the big DBMS systems. The great thing about Adrian’s blog is that it is a kind of Cliff’s Notes for computer science papers. One does not have to actually read a paper to get its core ideas; but Adrian always makes the paper available to read for oneself. While I mention this blog, I’d like to say “Thank you!” to Mr Colyer, whom I never met, but from whose outstanding blog I certainly benefitted a lot.
Last Monday I read DJ Adams’ blog post about GraphQL and how this tool might usefully be added to the toolbox of SAP UI5 developers. There is a lively discussion going on about the blog’s topics, most of it around the GraphQL technology and its potential usage in UI5. I added a comment to this, and Mr Adams used it to hook me in for a blog post. So here we are.
As someone who read through the discussion somewhat uninvolved – I am not doing any front-end development – what struck me was its narrow focus. Why was there no question about how this added tool would likely change the development projects and teams? Reading a bit deeper into the linked blog post from Jeff Handley, a lot of the arguments are not about technology, features and implementation details, but about how the tool helps the product development teams do their jobs better. To me, it seems like a rather typical SAP technology discussion, where “THE TOOL”, “THE PLATFORM” or “THE SUITE” is the bit that will make the difference between successful developments and the other ones.
Now, while I am a sucker for great tools as much as the next person, I am in the corner of folks who believe “it’s the photographer that makes the great picture – not the camera”. Technology can help with many problems, but if one does not have a clear understanding of what problem needs solving, simply picking up the latest-greatest-shiniest tool likely means the wrong tool is used and/or used in a wrong way.
One example I mentioned in my comment hinged on the “declarative” nature of GraphQL. I do not know how GraphQL works specifically but I have some experience with how declarative programming works in many SAP shops. The main declarative language present in these organisations is SQL. ABAP programs use it, SAP HANA models have it (in some way) and the analytics teams are constantly building “queries”.
My point is that this typically does not work too well. Invariably I come across data models and queries that show the developers tried to “tell the database what to do”. And that even though the very same developers commonly have neither a proper understanding of how the database works nor of how to find out what it actually does in the first place. Still, “that index needs to be used!” and “this hint is crucial for the performance”!
I cannot claim to know for sure why this is, but I got some suspicions:
often, ABAP/SAP developers moved into their roles horizontally, from either the “tech support” or the “business support” team. That means great familiarity with the organisation’s landscape and processes, but it does not guarantee a firm grasp of the concepts implemented in the technology.
education and learning of abstract concepts are not common activities in SAP development teams, as there is no immediate payoff.
the official SAP technology documentation and training is focussed on functions and features and actively avoids concepts and broader ideas.
This is not to say that it is the development team’s fault – but a lot of the IT technology that SAP has embraced in recent years requires a comprehension of the underlying concepts. Most of those concepts tend to be more involved than computation “mechanics” and cannot be understood correctly by intuition alone. “Learning by example”, which I would bet is the most prevalent learning style in SAP environments, only goes so far when one is trying to get hold of new, non-trivial ideas.
Coming back to the GraphQL discussion: to me, this looks like another tool that will be “used the wrong way without error messages”, if at all. Will the majority of development teams be able to realize the same benefits as the Concur team? I doubt that.
Will the choice of tool make a difference to the outcome for the organisation in a way that makes a difference to the organisation? Probably not.
This gives a lead into another aspect of the discussion around DJ Adams’ blog post: the argument that it would “help to get more developers to SAP platforms” if GraphQL (or any other currently popular tool) were available. When I read claims like that, I wonder whether there are any organisations that use IT as part of their established infrastructure and then go and swap out major parts of it because some developers like the new parts better. In my personal experience, that never happens. Sure, the new marketing hero gets the reporting frontend s/he needs to be successful – but that is closely tied to tangible results. If developers cannot show how the end result will be (very) positively affected in concrete, preferably short, terms, then existing tools tend to stay in place.
Change is hard, slow and expensive and the ticket for this journey really must be worth it. And commonly the new lands look a lot nicer from a good distance.
“[…] This I want to achieve using Graphical or Script Cal view, please help me to solve the issue”
The OP did not bother googling for solutions to this very common problem, nor does (s)he seem to have access to classic SQL development literature, like Joe Celko’s “Thinking in Sets”.
This is one of those “throw the requirement over the fence” questions and really should have been moderated. However, as so often, an eager answerer offered his help and provided the following code to solve the problem:
declare v_i integer;
declare v_count integer;

sel1 = select
          "0CALDAY" as "CALDAY",
          "Yesterday" as "YESTERSDAY",
          sum("DATE_DIFF") AS "DATE_DIFF",
          sum((select 0 from dummy)) as "HOUR_DIFF",
          sum((select 0 from dummy)) as "WRK_HOUR_DIFF"
       from ZTABLE
       -- where "0CALDAY" >= '20180701'
       GROUP BY "0CALDAY", "Yesterday";

select count(*) into v_count from :sel1;

for v_i in 1..v_count do
    WHILE to_date(:sel1.CALDAY[v_i]) < to_date(:sel1.YESTERSDAY[v_i]) do
        if weekday(to_date(:sel1.CALDAY[v_i])) < 5 then
            sel1.WRK_HOUR_DIFF[v_i] = :sel1.WRK_HOUR_DIFF[v_i] + 24;
            sel1.HOUR_DIFF[v_i] = :sel1.HOUR_DIFF[v_i] + 24;
        else
            sel1.HOUR_DIFF[v_i] = :sel1.HOUR_DIFF[v_i] + 24;
        end if;
        sel1.CALDAY[v_i] = ADD_DAYS(to_date(:sel1.CALDAY[v_i]), 1);
    end while;
end for;

select * from :sel1;
Reading through this approach several problems stick out:
this really shouldn’t require SQLScript at all and definitively not a loop
using ‘SELECT 0 from DUMMY’ to generate a constant value of integer 0 is plain wrong
summing constant 0 over a grouping set is plain wrong as well – the result is a constant 0
removing the aggregation allows for removing the grouping altogether: a SELECT DISTINCT would do the trick here
the number of records in sel1 depends solely on the number of different values in ZTABLE.CALDAY – a table that has not been defined in the question
this number of different CALDAY values is used to loop and add 24 hours to the sel1 table variables HOUR_DIFF and WRK_HOUR_DIFF to keep a tally of hours between dates – which could have been done with SAP HANA SQL functions directly
This looks very much like the kind of SQL code written by developers who are not “at home” with SQL and instead try pushing an imperative programming style into their SQL code.
This code is inefficient, slow and hard to maintain. In short: don’t do it like this.
The first approach to this type of common problem really should be not to try to develop the solution yourself, but instead to look for existing standard solutions. The mentioned book uses so-called auxiliary tables for this (see chapter 9.5, Calendar Tables), and so do SAP NetWeaver and the freestyle solution presented in “Want to Calculate WORKDAYS_BETWEEN without TFACS Table?”.
The core of this approach is to maintain a table of all dates that indicates whether or not a given date falls on a weekend. Such a table could even hold the information of whether a given date is a working day or a public holiday.
Given the requirement of “excluding weekends” from counting hours, being able to discern workdays from holidays seems like a closely related requirement.
Let’s stick to the case of simply excluding the weekends, which the OP defined as Saturday and Sunday. When working on such a requirement, it is particularly important to consider the calendar context in which the solution should work. Certain calendars (e.g. the Israeli calendar) have Friday and Saturday as their weekend. When the underlying requirement actually is “hours of working days”, this might become more complicated with different calendars. Consider, for example, that daylight saving time shifts effectively add/subtract hours – commonly over weekends, but sometimes also on working days. The OP hasn’t mentioned anything beyond the description of a mechanical calculation problem, so we cannot know what the underlying requirement really was.
Anyway, a quick solution for the simple counting of hours on days other than Saturday and Sunday would be to use the SAP HANA built-in table “_SYS_BI”.”M_TIME_DIMENSION”. This table provides dates in different formats, such as DATE_SQL, as well as a day-of-the-week number DAY_OF_THE_WEEK_INT (Monday = 0, Sunday = 6).
With data generated in the table, the original question can be answered as “the number of non-weekend days between the given dates, multiplied by 24 hours”:
select count(*) as no_of_workdays,
       count(*) * 24 as no_of_hours
from "_SYS_BI"."M_TIME_DIMENSION"
where day_of_week not in (5, 6)
and   date_sql between to_date('19.07.2018', 'DD.MM.YYYY')
                   and to_date('26.07.2018', 'DD.MM.YYYY');
Note how this differs from the OP’s calculation by one day: the OP miscalculated the hours in the example by not including the last day of the given range. This points to another common trap when dealing with dates and intervals: one needs to be clear about whether interval end dates should be included or excluded.
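The counting logic itself is easy to replicate in Python to double-check the result (a sketch with both interval ends included, as in the query above):

```python
from datetime import date, timedelta

def workdays_between(start, end):
    # count days in [start, end] (both ends included) that are not
    # Saturday or Sunday; Python's weekday() has Monday == 0 ... Sunday == 6
    days = (end - start).days + 1
    return sum(
        1
        for offset in range(days)
        if (start + timedelta(days=offset)).weekday() < 5
    )

n = workdays_between(date(2018, 7, 19), date(2018, 7, 26))
print(n, n * 24)  # 6 workdays -> 144 hours
```

Dropping the last day from the range (19th to 25th) yields 5 workdays, i.e. the OP’s figure.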
The same approach can be taken with graphical calculation views:
count(distinct "DATE_SQL" ) as no_of_workdays,
count(distinct "DATE_SQL" ) * 24 as no_of_hours
The broader view
I mentioned above that this common problem can be broadened to “separate working days from non-working days”. The SAP business applications have been dealing with such problems for many decades now, so it is not surprising that there is indeed a standard solution available. This standard solution comes in the form of “FACTORY CALENDARS”. SAP NetWeaver table TFACS holds one row for each year and each calendar. Every row contains twelve columns, one per month, each holding a “bitfield” represented by ‘1’ and ‘0’ characters, one for each day of the month. A ‘1’ indicates a working day and a ‘0’ denotes a non-working day.
There are some other columns available, but for our purpose, this is what we need to know.
IDENT JAHR MON01 MON02 ... MON12
AU 2018 0111100111110011111001111000111 1100111110011111001111100111 ... 0011111001111100111110010010110
Typically this table is re-used or replicated from the SAP NetWeaver environment so that all date calculations are using the same calendar definitions. If you want to use the table without an SAP NetWeaver system you could use the following commands to create it yourself:
create row table TFACS
( IDENT NVARCHAR(2) not null -- factory calendar
, JAHR NVARCHAR(4) not null -- year stored
, MON01 NVARCHAR(31) not null -- Calendar: days in a month, 1 for working day, 0 for non-working day
, MON02 NVARCHAR(31) not null -- Calendar: days in a month, 1 for working day, 0 for non-working day
, MON03 NVARCHAR(31) not null -- Calendar: days in a month, 1 for working day, 0 for non-working day
, MON04 NVARCHAR(31) not null -- Calendar: days in a month, 1 for working day, 0 for non-working day
, MON05 NVARCHAR(31) not null -- Calendar: days in a month, 1 for working day, 0 for non-working day
, MON06 NVARCHAR(31) not null -- Calendar: days in a month, 1 for working day, 0 for non-working day
, MON07 NVARCHAR(31) not null -- Calendar: days in a month, 1 for working day, 0 for non-working day
, MON08 NVARCHAR(31) not null -- Calendar: days in a month, 1 for working day, 0 for non-working day
, MON09 NVARCHAR(31) not null -- Calendar: days in a month, 1 for working day, 0 for non-working day
, MON10 NVARCHAR(31) not null -- Calendar: days in a month, 1 for working day, 0 for non-working day
, MON11 NVARCHAR(31) not null -- Calendar: days in a month, 1 for working day, 0 for non-working day
, MON12 NVARCHAR(31) not null -- Calendar: days in a month, 1 for working day, 0 for non-working day
, BASIS NVARCHAR(5) not null -- start of factory day numbering
, FENUM NVARCHAR(3) not null -- number of days in year
, WENUM NVARCHAR(3) not null -- number of workdays in year
, LOAD NVARCHAR(1) not null -- load year in buffer flag ('X' = YES)
, primary key (IDENT, JAHR));
As an example I entered one row with the working days of Australia/Victoria in 2018:
insert into tfacs values (
  'AU', '2018'
  --         1         2         3
  -- 1234567890123456789012345678901
  , '0111100111110011111001111000111' -- january
, '1100111110011111001111100111' -- february
, '1100111110011111001111100111100' -- march
, '001111100111110011111001011001' -- april
, '1111001111100111110011111001111' -- may
, '100111110001111001111100111110' -- june
, '0111110011111001111100111110011' -- july
, '1110011111001111100111110011111' -- august
, '001111100111110011111001111100' -- september
, '011110011111001111100111110011' -- october
, '11001111100111110011111001111' -- november
, '0011111001111100111110010010110' -- december
, ' '
, ' '
, ' '
, ' ');
select WORKDAYS_BETWEEN('AU', to_date('19.07.2018', 'DD.MM.YYYY')
                            , to_date('26.07.2018', 'DD.MM.YYYY'))
       as "no of working days"
from dummy;
no of working days
Note that here, the “between” condition is understood as “workdays computed starting on the 19th and finishing on, but not including, the 26th” (see the documentation on this), which seems to match the understanding of the OP.
The benefits of using calendar tables over pure function approaches are clear:
it’s easy to understand which days are considered working days and which days are not
multiple calendars can be maintained and used in parallel (different years, countries, union agreements, operating shifts, etc.)
the SQL/application code does not need to be changed when the calendars need updating
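The bitfield encoding of TFACS is simple to work with. The following Python sketch (using the MON07 string from the example row above) shows both a single-day lookup and a WORKDAYS_BETWEEN-style count with the end day excluded:

```python
# MON07 ('1' = working day, '0' = non-working day) from the example
# TFACS row for calendar 'AU', year 2018
JUL_2018 = "0111110011111001111100111110011"

def is_working_day(month_field, day):
    # day is 1-based, the bitfield string is 0-based
    return month_field[day - 1] == "1"

def workdays_between(month_field, start_day, end_day):
    # end_day excluded, matching the WORKDAYS_BETWEEN semantics above
    return month_field[start_day - 1 : end_day - 1].count("1")

print(is_working_day(JUL_2018, 19))        # True
print(workdays_between(JUL_2018, 19, 26))  # 5
```

A real implementation would of course also pick the right MONnn column by month and handle ranges spanning months and years, but the per-day lookup is just this one-character test.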
There are a couple of topics that are an ongoing source of confusion even for experienced IT professionals and developers. Unicode handling is one example (see here for info), floating point calculations are another one (this is a good primer), and then there is everything to do with time.
Dealing with dates, time, time zones and the like is confusing and counter-intuitive in itself; adding computers to the mix doesn’t make this any better.
This post is going to shine some light on what goes into date/time handling in SAP HANA and tries to answer some questions that I came across recently.
The big picture
The SAP HANA DBMS obviously doesn’t come with its own timekeeper (though I think it would make a nice gift for HANA DBAs) but uses date/time information provided by the Linux operating system whenever it needs to know the current time.
Linux gets its time from different possible sources, starting with the hardware clock (aka BIOS clock) and, if configured, the ntpd(aemon).
The man-pages on this (hwclock, date, tz, adjtimex, /proc/timer_list) are an interesting read.
A super-short version of the story here is: when the computer is started, the hardware clock provides the current time, which is kept running by the battery while the machine is powered off. This time does not have a time zone assigned to it – it is open to interpretation.
At some point, Linux starts up, reads the hardware clock and sets the current system time and date considering the time zone that was declared for this computer when Linux was installed (yes, one can change this later on).
From here on the hardware clock is out of the picture. All system requests concerning current date/time are served from the Linux managed clock(s) – the System time – and likely continuously updated via the ntp-daemon.
It is possible to check the current hardware time (output in ISO 8601 format):
The +1100 is the time zone offset that the system time uses/assumes. The hardware clock does not ‘know’ what timezone it is in, so specifying either --utc or --localtime and checking which one results in the correct local time is crucial here. Once again, the man-page for hwclock has the whole story.
Comparing this to the output of date:
Mon Mar 26 11:05:37 AEDT 2018
we learn the system time is set to use the Australian Eastern time zone.
Another way to check which timezone has been set on system level is to check where the file-system link /etc/localtime points to:
> ls -l /etc/localtime
lrwxrwxrwx 1 root root 39 Mar 12 18:48 /etc/localtime -> /usr/share/zoneinfo/Australia/Melbourne
The /usr/share/zoneinfo folder is the system timezone database directory.
All time zones, including Daylight Saving Time (DST), that the Linux system knows about, can be found here. Since both time zone definitions, as well as DST rules, change regularly (see Time Zone News), these files are commonly updated during Linux updates.
We will come back to the topic of Daylight Saving Time later on – it is yet another can of worms.
The time of the day
Here we are, with Linux booted up and ready to go. We can log in now and ask for the current time and date (as seen above):
Mon Mar 26 18:17:07 AEDT 2018
But what if my user should use a different time zone?
To do that, the Linux shell provides a setting in form of an environment variable named TZ.
Setting the variable to one of the time zones in the timezone database directory allows programs to return the date in the local time of choice:
> export TZ=UTC+0
> echo $TZ
UTC+0
> date
Mon Mar 26 07:18:00 UTC 2018
Any program started from a shell with this setting will now work with the time zone set to UTC.
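The effect of the TZ variable can be reproduced from Python as well (a Unix-only sketch; time.tzset() makes the C library re-read TZ for the current process):

```python
import os
import time

# set the time zone for this process only, much like `export TZ=...`
# does for a shell and its child processes
os.environ["TZ"] = "UTC"
time.tzset()  # Unix only: re-read TZ
print(time.strftime("%Z"))  # UTC

os.environ["TZ"] = "Australia/Melbourne"
time.tzset()
print(time.strftime("%Z"))  # AEDT or AEST, depending on the date
```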
Becoming a profiler
If this setting should be made permanent (just for the current user) the environment variable has to be set in the shell profile of the user.
The shell profile is an auto-start script that contains the commands the shell should execute every time it gets started.
For the common Bourne shell, this script file is called .profile.
As the name starts with a dot, it is a so-called hidden file, but using ls -a in the user’s home directory lists it nevertheless:
> cd ~ # change into the user's home directory
> ls -la # list all files, including the 'hidden' files
drwx------ 1 root root 730 Mar 26 05:53 .
drwxr-xr-x 1 root root 200 Jan 31 00:51 ..
drwx------ 1 root root 16 Jun 13 2017 .hdb
-rw------- 1 root root 146 Mar 26 07:15 .lesshst
drwx------ 1 root root 10 Jun 6 2017 .local
drwx------ 1 root root 34 Sep 11 2017 .mozilla
-rw-r--r-- 1 root root 80 Jan 3 2017 .profile << this one
If the line export TZ=UTC+0 is added to the .profile-file then the new time zone will be set for the user every time the shell is used.
Now, for Linux based SAP systems things are a tad more … flexible. And with “flexible” of course, I mean complicated.
Putting all settings into just a single file would make managing system- or organisation-wide settings rather tedious. Some settings should be user-specific, while others should apply to all users of a system, and so on.
For that, there is the option to call other .profile scripts.
Cutting to the chase, for the SAP HANA operating system user <sid>adm we’re looking at the following profile scripts:
.profile - standard autostart-file for Bourne shell
.login - C-shell version
.bashrc - init script for the BOURNE and KORN Shell
.cshrc - standard profile for the C-shell (same content, but slightly different syntax)
.sapsrc.sh - looks for .sapenv scripts and executes them
.sapsrc.csh - C-shell version
.sapenv_<hostname>.sh - SAP system environment setting - specific to the local host
.sapenv_<hostname>.csh - SAP system environment setting - specific to the local host - C-shell version
.sapenv.sh - SAP system environment setting
.sapenv.csh - C-shell version
.customer.sh - User specific settings
.customer.csh - C-shell version
Most of these files are created by the installation program for SAP HANA.
When we log in as <sid>adm the files get executed in the following order:
(1) .profile
 -> (2) .bashrc
 -> (3) .sapsrc.sh
 -> (4) .sapenv_<hostname>.sh OR .sapenv.sh
 -> (5) HDBSettings.sh (for HANA systems only)
 -> (6) .customer.sh
Looking at this impressive entanglement you might not be too clear about where to put the time zone setting.
The answer, fortunately, is easy, as nearly all the script files come with a warning in a header comment, telling you to not edit it manually.
This leaves the .customer.sh as the target file for all customization including time zone settings and additional paths or environment settings for Smart Data Access (SDA) connections.
Fortunately, somebody put a comment into .profile to make clear that all custom settings will be found in .customer.sh. That way, even the SAP-unaware Linux admin has a fair chance to figure out in which file settings are done.
SAP HANA zones in time
At this point, we can configure the system’s hardware clock, the Linux system clock and the timezone that SAP HANA should use for anything where the “local” timezone is used.
This setting sets the home-timezone for the SAP HANA installation. Important here is to pay attention to the term “installation”.
Since tenant databases became the standard for SAP HANA, it has also become possible to use “strong isolation” mode, separating the SAP HANA processes of different tenant databases at the Linux level. Each tenant database can run in its own Linux user security context.
It is easy now to assume that, since the databases run in their own Linux user accounts, they could each have their own environment with separate timezone settings.
This is, however, not the case, since all tenant databases of an SAP HANA installation share the same nameserver (hdbnameserver) process. And it is the environment of this nameserver process that is shared by all tenant databases.
Therefore, all databases of an SAP HANA installation share the same timezone (as well as the same OS setup for external libraries, like 3rd party DBMS clients, etc.). If one needs SAP HANA databases with different time zones on the same system, a separate installation is necessary. (state as of SAP HANA 2 SPS 02)
What’s it all used for – SAP system timezone handling
All of the mechanisms described so far are in place to provide Linux programs access to a current time that makes sense to the user of the system.
For shared programs like SAP HANA, it is debatable what the current local time should be since users can access the system from around the globe out of completely different time zones.
Whichever timezone you decide upon (often it is simply the local timezone of where the data center is located, the timezone where most of the users are, or UTC), you should probably pick a Standard timezone: the time changes for summer and winter time lead to timestamp gaps and duplications, which can be especially confusing when trying to read log files.
Also, in an SAP system landscape, say with SAP NetWeaver systems based on SAP HANA, the NetWeaver application servers and SAP HANA have to have the same time, including the timezone.
Up to here, all we’ve looked at was telling the local current time. In SAP HANA one can use functions like now(), current_time(), current_date(), etc. to get the current SAP HANA system time. In many cases, there is also a UTC counterpart for these functions, e.g. current_utctime() if your application should rather work with the UTC timestamps.
SAP HANA also provides conversion functions, utctolocal()/localtoutc(), to convert between local time and UTC date/timestamp information.
Note that the documentation strongly discourages storing data in the local time zone. SAP HANA date/timestamp data types do not store the timezone used together with the data, so the application reading the data would need to keep track of that.
Instead, data should be stored as UTC and, if local date information is required, the conversion functions should be used.
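The store-as-UTC, convert-on-display pattern can be sketched outside of SAP HANA as well. This hedged Python example (using the standard zoneinfo module, not HANA’s utctolocal()) stores a UTC timestamp and converts it only for presentation:

```python
from datetime import datetime, timezone
from zoneinfo import ZoneInfo  # Python 3.9+, uses the IANA tz database

# Store the moment in UTC ...
stored = datetime(2018, 3, 26, 7, 18, tzinfo=timezone.utc)

# ... and convert to the user's local time only when displaying it
local = stored.astimezone(ZoneInfo("Australia/Sydney"))
print(local.isoformat())  # 2018-03-26T18:18:00+11:00 (AEDT)
```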
Let’s stop at this point and take a look at what goes into this conversion.
I mentioned above that time zones and the rules for Daylight Saving Time (DST) are in constant change.
How can SAP HANA know what the currently correct mapping is?
In You got the time? I explained that SAP HANA can use built-in defaults or use a special set of tables that contain the current mapping information.
To make sure the timezone data had been loaded without problems, the view M_TIMEZONE_ALERTS can be used.
If SAP HANA is started without the timezone tables loaded, this view returns:
HOST PORT STATUS TIMEZONE_NAME MISMATCH_BEGIN MISMATCH_END
skullbox 31203 TABLES NOT FOUND ? ? ?
The following timezone tables could not be found or are not readable: SYSTEM.TTZD, SYSTEM.TTZDF, SYSTEM.TTZDV, SYSTEM.TTZZ
HOST PORT STATUS TIMEZONE_NAME MISMATCH_BEGIN MISMATCH_END
skullbox 31204 TABLES NOT FOUND ? ? ?
The following timezone tables could not be found or are not readable: SYSTEM.TTZD, SYSTEM.TTZDF, SYSTEM.TTZDV, SYSTEM.TTZZ
By default, SAP HANA looks for these tables in the SYSTEM schema and expects the entries to be relevant for client 001.
If you want to use a different set of tables, you can adapt these configuration parameters:
File            | Section | Parameter                         | Default
indexserver.ini | global  | timezone_dataset                  | sap
indexserver.ini | global  | timezone_default_data_client_name | 001
indexserver.ini | global  | timezone_default_data_schema_name | SYSTEM
Important to note here is that there is only one set of conversions possible per SAP HANA database, even if you have multiple SAP clients in the database.
Likewise important: these tables need to be loaded into every tenant database and the SystemDB.
The tenant DBs cannot use the tables from the SystemDB or from another tenant DB.
Yet another caveat with the SAP HANA timezone conversion functions is that they don’t take a reference date, but always work based on the currently available conversion rules. If your application requires historical timezone conversions, then this needs to be implemented separately.
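SAP HANA’s conversion functions always apply today’s rules, but with a timezone library that takes the reference date into account, the offset clearly depends on the date. A small Python illustration (zoneinfo, not HANA; zone and dates are arbitrary examples):

```python
from datetime import datetime
from zoneinfo import ZoneInfo

berlin = ZoneInfo("Europe/Berlin")

# The UTC offset of the same zone differs depending on the reference date
winter = datetime(2018, 1, 15, 12, 0, tzinfo=berlin)  # CET
summer = datetime(2018, 7, 15, 12, 0, tzinfo=berlin)  # CEST

print(winter.utcoffset(), summer.utcoffset())  # 1:00:00 2:00:00
```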
In order to check what time zones are currently known to SAP HANA the system view TIMEZONES is available:
select * from timezones;
Time zone conversion and Daylight Saving Time (DST)
At a glance, this whole topic of timezone conversion can look very simple. After all, it is just adding or subtracting some hours between two time zones, isn’t it?
To make the life of time zone converters more interesting, Daylight Saving Time has been introduced in many countries. When referring to “local time”, this usually includes any DST rules, while the term “Standard Time” indicates that no DST rules are applied.
The major complication with DST is not so much that the number of hours to be added or subtracted changes based on the (local) day of the year, but that the mapping between timestamps with and without DST is not contiguous.
Often we think of clock-timestamp values as continuously increasing values with no gaps. It’s always one second, and another and another second.
This leads to implicit assumptions like:
every possible timestamp between 00:00:00 and 23:59:59 will occur.
every timestamp will occur only once.
Both of those assumptions are false during the days when time changes from summer to winter time or back again.
When changing from summer to winter time, the clocks are set back, e.g. from 03:00:00 to 02:00:00.
Every second between 02:00:00 and 02:59:59 will happen twice that night.
On the change from winter to summer time, clocks are set forward, jumping from 01:59:59 directly to 03:00:00, skipping all timestamps between 02:00:00 and 02:59:59.
Here we see why it is so important to use standard time for storing timestamps (also in file systems, trace files, etc.): DST makes local timestamps ambiguous.
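Both effects, the skipped hour and the repeated hour, are easy to demonstrate with Python’s zoneinfo module (a sketch, not from the original post; Europe/Berlin switched on 2018-03-25 and 2018-10-28):

```python
from datetime import datetime
from zoneinfo import ZoneInfo

berlin = ZoneInfo("Europe/Berlin")
utc = ZoneInfo("UTC")

# Fall back (2018-10-28): 02:30 local happens twice; 'fold' picks the occurrence
first = datetime(2018, 10, 28, 2, 30, tzinfo=berlin)           # still CEST, UTC+2
second = datetime(2018, 10, 28, 2, 30, fold=1, tzinfo=berlin)  # already CET, UTC+1
print(first.utcoffset(), second.utcoffset())  # 2:00:00 1:00:00

# Spring forward (2018-03-25): 02:30 local never happens;
# converting to UTC and back lands on 03:30 instead
gap = datetime(2018, 3, 25, 2, 30, tzinfo=berlin)
roundtrip = gap.astimezone(utc).astimezone(berlin)
print(roundtrip.time())  # 03:30:00
```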
There are more things wrong with the simplified model of timestamps, e.g. the need for leap seconds to adjust for drifts of the defined day length (one full turn of our planet).
“Counting”-based timekeeping systems like UNIX epoch time come with their own rules for leap years, centuries, etc. – notably, UNIX epoch time simply ignores leap seconds. Covering these is beyond this post, but reading up on it is highly recommended.
More problems on a stack
Knowing all this about timezones and conversions between time zones with and without DST, we should be able to use timestamp data correctly in our programs with SAP HANA.
And yet, questions like “Problem with “summer” time (DST)” come up and even I wasn’t able to immediately pinpoint the cause of the problem.
At face value, the OP found what looked like a bug in SAP HANA’s JDBC driver.
When querying the database via JDBC certain timestamps did not seem to get returned correctly, even though the SAP HANA system timezone was set to UTC:
The answer lies with JAVA, more precisely with the way JDBC handles timestamp data (see also here).
When fetching timestamp data from a database that does not store the timezone with the timestamp data, the JDBC method getTimestamp (int columnIndex, Calendar cal) gets used.
The method returns a java.sql.Timestamp object which in turn is based on java.util.Date. Java handles dates/timestamps as moments (instants) in time that can be expressed in calendar fields (year, month, day of month, etc.). This requires defining the calendar that should be used here.
In our example, without specifying any specific calendar, the JVM uses its default timezone.
If the JVM property user.timezone is not set, then the JVM will try to find the current operating system user’s timezone. If the JVM runs on Linux, that would again be the value of TZ; on MS Windows, it would be the current timezone setting of the Windows user that runs the JVM.
What JDBC is doing basically means: if the JAVA program does not explicitly set the Calendar to be used, then the setting of the client OS is used. A prominent example for this is SAP HANA Studio.
To avoid this implicit, nearly hidden conversion, one can set the JVM parameter -Duser.timezone=UTC in the hdbstudio.ini file.
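The underlying pitfall, a zone-less timestamp being interpreted in whatever timezone the client process happens to run in, can be reproduced without JDBC. In this hedged Python sketch (illustrative values; Sydney was UTC+11 on that date), the same timestamp text yields epoch values 11 hours apart depending on the process’s TZ setting:

```python
import os
import time
from datetime import datetime

# A zone-less timestamp as it might come back from the database
raw = "2018-03-26 07:18:00"

def interpret(tz_name: str) -> float:
    """Parse 'raw' as local wall-clock time under the given TZ setting (Unix only)."""
    os.environ["TZ"] = tz_name
    time.tzset()
    return time.mktime(datetime.strptime(raw, "%Y-%m-%d %H:%M:%S").timetuple())

as_sydney = interpret("Australia/Sydney")
as_utc = interpret("UTC")
print((as_sydney - as_utc) / 3600)  # -11.0: same text, 11 hours apart
```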
For the interested: JAVA comes with its own complete implementation of a timezone database and the conversion data can be – just as in SAP HANA – managed independently from the operating system. (see here, here, here, here, here, here)
Time to end
And here we are now. Two full stacks of timekeeping, time zone management and conversions with and without DST.
What do I take home from this?
To me, the JDBC behaviour was the most surprising effect, so I will probably try to make sure that I set the timezone to UTC whenever I want to avoid automatic conversion (e.g. for server-side data access like SDI adapters, or development/DBA tools like SAP HANA Studio).
For the rest, I think a good rule of thumb is to keep all timezone and time settings the same, especially across SAP systems.
Finally for applications and analytic scenarios that rely heavily on timezone conversion, making sure to fully understand all involved conversion mechanisms is really important.
The development tools and techniques for SAP HANA have been extended and enhanced a couple of times since SAP HANA had first been released in 2011. For apps developed for SAP HANA XS, the new XSA environment means a considerable change in the way application code accesses the database.
One example where this is being done is the SAP Health application Medical Research Insights 2. Starting with FP 5, several new features have been added as XSA apps, while keeping the base application and the underlying Connected Health Platform as XS classic applications.
Christoph’s post covers all necessary steps to enable such cross-application data access, but relies on a prerequisite that can be hard to fulfil. The very first step in the tutorial is to create two roles in the XS classic schema that group all required privileges on the XS classic database objects. One of these roles has to include the privileges WITH GRANT OPTION, which allows granting those privileges on to other users or roles. It is fairly obvious that this is an extremely wide-ranging privilege that can make it very difficult to control who has access to which data and who manages this access. For this reason, the XS classic repository roles don’t allow the use of the GRANT OPTION.
But Christoph’s post does not mention anything about repository roles, so what is the problem here, you might ask.
What one can do and what one cannot do
One point is, that having to create roles manually via SQL makes the handling of the application life-cycle more complex. Ideally one wants to have an XS classic application fully contained in one or more DUs that can be transported (installed, imported) together. The less manual installation activities required, the better.
The more important point, though, is that only the owner of an object or a user with the GRANT OPTION privilege can go on and grant other users privileges with the GRANT OPTION.
In case of an application that uses a .hdbschema repository object to define the application schema as recommended, the owner of that schema is the _SYS_REPO user. This special SAP HANA system user cannot be used to login to the database, so even if one would settle for a manual installation step, there technically is no way to run a GRANT … WITH GRANT OPTION as the _SYS_REPO user.
A way out
So, how can we make _SYS_REPO grant privileges to a user with the GRANT OPTION? Meet “SQLScript procedure with DEFINER RIGHTS“! This setting allows executing the procedure as the user that created it.
For repository procedures, this “creator” is always _SYS_REPO. By now you probably see where this is going.
To have _SYS_REPO grant with GRANT OPTION, we create following repository objects:
external_access_g.hdbrole (see Christoph’s post for the two types of roles for external access). This one is just an empty role file. Don’t get confused by the ‘pred’ prefix in the package name; it is just the name I used for the demo package and has no functional relevance.
external_access.hdbrole This one contains all access privileges without the GRANT OPTION. For the sake of this example, I chose SELECT on the schema.
catalog schema "PRED": SELECT;
PROCEDURE "PRED"."pred.roles::grantSchemaAccess" ( )
    LANGUAGE SQLSCRIPT
    SQL SECURITY DEFINER
    DEFAULT SCHEMA PRED
AS
BEGIN
    exec 'GRANT SELECT on schema PRED to "pred.roles::pred_external_access_g" WITH GRANT OPTION';
END;
Important to note here:
SQL SECURITY DEFINER – this really is core to the approach. Leave it on the default INVOKER and it won’t work.
no READS SQL DATA – we use dynamic SQL in this procedure, which counts as READ/WRITE access.
the GRANT statement is executed via ‘exec’. If you try to put the GRANT into the procedure body directly, you will get the following error message:
Error while activating /pred/roles/grantSchemaAccess.hdbprocedure:[pred.roles:grantSchemaAccess.hdbprocedure] Syntax error in procedure object: feature not supported;
ddl statement 'GRANT SELECT on schema PRED to "pred.roles::pred_external_access_g" WITH GRANT OPTION' is not supported in procedure/function: line 11 col 5 (at pos 288)
(BTW: to whoever decided that it was a good thing to disable selecting text in the console area of the Web-Based Development Workbench editor: it’s not! Being able to copy error message output is rather important, and having to manually change the ‘user-select’ option in the CSS style is rather cumbersome.)
installUser.hdbrole Another role that covers the privilege to run the procedure:
sql object pred.roles::grantSchemaAccess: EXECUTE; //Objecttype: PROCEDURE
With these objects in place, the setup procedure for granting privileges looks like this:
Import the DU with the objects created above into the target system.
Grant the installUser role to the user that performs the installation and setup activities (e.g. INSTALLUSER).
As the INSTALLUSER, execute the procedure:
call "PRED"."pred.roles::grantSchemaAccess"();
Check that the assignment was successful:
select * from granted_privileges where grantee like 'pred.roles::pred_external_access%';
GRANTEE GRANTEE_TYPE GRANTOR OBJECT_TYPE SCHEMA_NAME ... PRIVILEGE IS_GRANTABLE IS_VALID
pred.roles::pred_external_access ROLE _SYS_REPO SCHEMA PRED ... SELECT FALSE TRUE
pred.roles::pred_external_access_g ROLE _SYS_REPO SCHEMA PRED ... SELECT TRUE TRUE
Revoke the role to execute the procedure from the INSTALLUSER again. As this is a setup-time-only activity, the role and the privilege should not be granted to anyone once the setup is done. To see who the role has been granted to, run the following SQL statements:
select * from granted_privileges where object_type = 'PROCEDURE' and object_name = 'pred.roles::grantSchemaAccess';
–> only the role pred.roles::installUser should be returned as grantee here.
select * from granted_roles where role_name = 'pred.roles::installUser';
–> only _SYS_REPO should be returned here.
Once these steps are successfully finished, the remaining procedure is the same as described in Christoph’s blog post, continuing at step 2.
Due to the critical nature of granting WITH GRANT OPTION, one might also consider setting up auditing for the roles and the procedure, so that any use of them gets logged.
A brief twitter exchange triggered some thoughts about expectations and how those determine whether something is a good A, a passable B or a terrible C.
When you want to go somewhere and you do not know the way, you need directions. Somebody or something needs to show you the path to follow. That’s what turn-by-turn navigation on your mobile does. Following this path does not mean you got to know the area. You now know just one way to get from A to B.
A map, on the other hand, provides an overview, context and alternatives but no immediate path.
Using a map means you have to actively engage with the information, see for yourself which roads cannot be taken (maybe you are in a huge truck and the small side roads with wooden bridges are not going to work for you) and eventually plot your own path.
Maps and paths are related, but they are far from being the same.
A useful analogy
This analogy works not just for navigating transportation networks.
If your company wants to build an information system to do X, then usually it will give the job to someone with a path to get there. It does not need to discuss all the alternative technologies or architectures. When X can be achieved reliably, on time and on budget, then that’s usually already a great outcome. Getting such a path (think: project) is commodity shopping.
On the other hand, when you want to do new things or old things in a very different way than before, a path is not the answer to that. This is where you need to work with a map. You need different viewpoints and scenarios to walk through. This situation requires you to discover and create a path. You can buy help for this process but you cannot skip the work.
(You might notice that product or service roadmaps are specifically not maps. They are usually paths with a decreasing level of certainty towards the end. They can provide a heads-up on what might be coming, but they don’t show alternatives, context or overview.)
To avoid disappointment and frustration it is important to be clear about what you expect.
Are you asking questions in user forums to get a map? Do you want blog posts to give you a path from A to B? Do you want to discuss options and discover/create paths?
For the SAP Community Platform getting answers to such questions will be important to avoid further disappointments.
I took a swing at it and the following is my result (a slightly edited version of my original answer).
Alright, let’s do this! (even though I don’t have an idea why it would matter at all…)
So, I’m not a computer scientist, and I’m only a third of the way through “The Annotated Turing” (which seems to be a great book, btw). This means what follows is a layman’s approach to the topic.
First off, I had to understand what is meant by “Turing complete” and how to show that a language has this characteristic.
For that, there are a couple of Wikipedia entries and SO discussions available and I leave that for everyone to read (e.g. here and here).
One of those discussions linked to a presentation that claimed to prove that PostgreSQL SQL with recursive common table expressions (CTEs) is Turing complete. To prove this, the author of the presentation (here) argued that it is enough to show that a language can emulate another language that has already been shown to be Turing complete.
The author’s choice was a cyclic tag system with a specific rule set (rule 110) which apparently has been shown to be Turing complete.
Then the author goes on and implements this cyclic tag system with a recursive common table expression and thereby proves the claim.
So, what does that mean for SAP HANA SQL?
SAP HANA SQL/SQLScript does not support recursive common table expressions (much to the distaste of everyone who tries to handle hierarchies and does not know about SAP HANA’s special hierarchy views and functions (look there)), and it also does not support recursive procedure calls.
Bummer, one might think.
Fortunately, every recursion can be expressed as an iteration (compare here), so I thought, let’s try this cyclic tag system in SQLScript.
This is the result (HANA 1, rev.122.15, on my NUC system). The SQL code is also available in my GitHub repo.
do
begin
    declare prods VARCHAR(4) ARRAY;
    declare currProd, initWord, currWord VARCHAR(300); -- 300 is arbitrary and would be exceeded for more runs
    declare currProdNo integer = 0;
    declare runs, maxruns bigint = 0;

    initWord := '11001';  -- the starting/initial 'word'
    maxruns := 100;       -- a limit to the number of iterations
                          -- rule 110 is suspected to run indefinitely
    prods = ARRAY ('010', '000', '1111'); -- the three 'producer rules' stored in a string array

    currWord := :initWord;
    runs := 0;

    -- dummy table var to monitor output
    tmp = select :runs as RUNS, :currProd as CURRPROD, :currWord as CURRWORD from dummy;

    while (:runs < :maxruns) do
        runs := :runs + 1;
        currProdNo := mod(:runs, 3) + 1; -- pick rule no. 1, 2 or 3 but never 0
                                         -- as SQLScript arrays are 1-based
        currProd := :prods[:currProdNo];

        if (left(:currWord, 1) = '1') then -- add current producer to the 'word'
            currWord := :currWord || :currProd;
        end if;
        currWord := substring(:currWord, 2); -- remove leftmost character

        -- save current state into temp table var
        tmp = select RUNS, CURRPROD, CURRWORD from :tmp
              union all
              select :runs as RUNS, :currProd as CURRPROD, :currWord as CURRWORD from dummy;
    end while;

    select * from :tmp; -- output the table var
end;
That looks suspiciously like the output from the Wikipedia link (above) and does not seem to stop (except for the super-clever maxruns variable).
With that, I’d say, it’s possible to create a cyclic tag system rule 110 with SQLScript.
The tag system is Turing complete, therefore SQLScript must be Turing complete.
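For readers without a HANA system at hand, the iteration can be cross-checked with a few lines of Python (a sketch, not part of the original post; note that this version cycles through the rules starting with the first one, while the SQLScript above starts with the second rule because of its mod()-based indexing):

```python
def cyclic_tag(word: str, productions: list[str], steps: int) -> str:
    """Run a cyclic tag system: cycle through the productions and, whenever
    the leftmost symbol is '1', append the current production; in every
    step the leftmost symbol is removed."""
    for step in range(steps):
        if not word:  # the system halts on the empty word
            break
        prod = productions[step % len(productions)]
        if word[0] == '1':
            word = word[1:] + prod
        else:
            word = word[1:]
    return word

# The producer rules and initial word used in the post
print(cyclic_tag('11001', ['010', '000', '1111'], 6))  # '10000000'
```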
Still, I have no idea why this would matter at all and that’s really all I can say/write about this.