This blog contains experience gained over the years of implementing (and de-implementing) large scale IT applications/software.

Making saptune Actually Work & Patching to v2

Having recently spent some time analysing the performance of a HANA database system, I got down to the depths of Linux device I/O performance on an Azure hosted VM.

There was no reason to suspect any issue, because during the implementation of the VM image build process, we had followed all the relevant SAP notes.
In our case, on SUSE Linux Enterprise Server (SLES) for SAP Applications 12, we were explicitly following SAP Note 1275776 “Linux: Preparing SLES for SAP environments”.
That SAP note explains the difference between sapconf and saptune, and walks you through actually configuring saptune (since it comes automatically with the “for SAP” versions of SLES 12).

Once configured, saptune should apply all the best practices that are encompassed in a number of SAP notes including SAP Note 2205917 “SAP HANA DB: Recommended OS settings for SLES 12 / SLES for SAP Applications 12”, which is itself needed during the HANA DB installation preparation work.
If you follow the note, there are a number of O/S adjustments required for HANA, which can be applied either manually or (as recommended) automatically via saptune, provided the correct saptune profile is selected.

As part of our configuration, we had applied saptune solution profile S4HANA-DBSERVER (also noted in the SUSE documentation for SAP HANA).
This is applied using the standard:

saptune solution apply S4HANA-DBSERVER

You don’t get a lot of feedback from the saptune execution, but the absence of errors (normally) indicates that it has done what was requested.
You can check it has applied the profile by executing:

saptune solution list

The starred item in the returned list is the profile that has been applied.
That’s it.
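If you want to script that check, a minimal sketch could look like the following. The function name applied_solution is my own, and the layout of the list output (a star as a separate first field, followed by the solution name) is an assumption that you should compare against what your saptune version actually prints.

```shell
#!/bin/sh
# Print the name of the starred solution from "saptune solution list" output
# supplied on stdin.
# ASSUMPTION: the star is a separate first field followed by the solution name.
applied_solution() {
    awk '$1 == "*" { print $2 }'
}

# Only call saptune where it is actually installed.
if command -v saptune >/dev/null 2>&1; then
    saptune solution list | applied_solution
fi
```

An empty result would mean no solution is starred, which is worth failing a build check on.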

As part of my troubleshooting I even ran the publicly available script sapconf_saptune_check (see here: https://github.com/scmschmidt/sapconf_saptune_check/blob/master/sapconf_saptune_check ), which just confirmed that saptune was indeed active/enabled and had a valid profile configured.

Back to the task of checking out the performance issue, and you can probably see where this is going now.
On investigation of the actual saptune profile contents, it was possible to see that a large majority of the O/S changes had not been applied.
Specifically, we were not seeing the NOOP scheduler selected for the HANA disk devices.

By executing either of the following, you can check the currently selected scheduler:

grep -H '.*' /sys/block/s??/queue/scheduler

or

cat /sys/block/s??/queue/scheduler

The selected scheduler will be in square brackets.
In my case, I was seeing “[cfq]” for all devices. Not good and not the recommendation from SAP and SUSE.
This setting should be automatically adjusted by the tuned daemon.
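Since the whole point here is to validate rather than trust, the check can be scripted along these lines. This is only a sketch: the function name active_scheduler and the WANTED variable are my own, and the default of noop is an assumption based on the recommendation above (on newer kernels the equivalent scheduler is called none).

```shell
#!/bin/sh
# Extract the scheduler shown in square brackets from a sysfs scheduler line,
# e.g. "noop deadline [cfq]" gives "cfq".
active_scheduler() {
    printf '%s\n' "$1" | sed -n 's/.*\[\([^]]*\)\].*/\1/p'
}

# ASSUMPTION: "noop" is the wanted scheduler; override with WANTED=none if needed.
WANTED="${WANTED:-noop}"

# Warn about every matching block device whose active scheduler differs.
for f in /sys/block/s??/queue/scheduler; do
    [ -r "$f" ] || continue    # the glob did not match or the file is unreadable
    current=$(active_scheduler "$(cat "$f")")
    if [ "$current" != "$WANTED" ]; then
        echo "WARNING: $f uses '$current', expected '$WANTED'"
    fi
done
```

No output means every matched device already uses the wanted scheduler.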

Looking at my version of saptune, I could see it was version 1.1.7 (from the output of the execution of the sapconf_saptune_check script).

Reading some of the recent blog posts from Soeren Schmidt here: https://blogs.sap.com/2019/05/03/a-new-saptune-is-knocking-on-your-door/
I could see that version 2 of saptune was now released.

Downloading the newer version (not installing directly!), reverting the old solution profile, installing the new saptune version and finally re-applying the same profile, confirmed that saptune was the culprit.

The new saptune2 fixed the issue, immediately activating a number of critical O/S changes, including the NOOP scheduler setting on each device.

The moral of the story is that, as well as following the SAP processes, you still need to validate that the tooling has actually done what it says it has done.
The new saptune2 has been incorporated into our build process, plus the configuration check scripts will be specifically checking for it.
However, since the upgrade from saptune1 to saptune2 could cause issues if it just blindly re-applied the “new” profile settings, SUSE have made saptune follow a backwards compatible upgrade process, whereby the O/S settings are retained as they were before the upgrade was executed.

Therefore, as per the links in SAP Note 2816790 “Differences between sapconf and saptune”, the upgrade process for an already applied profile is to revert it prior to the saptune upgrade, then apply the upgrade, then re-apply the profile.
This could therefore not just be rolled out via our standard SLES patching routine. We had to develop an automated script that would specifically pre-patch saptune to saptune2 using the correct procedure, before we embarked on the next SLES patching round.
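The order of operations can be sketched as below. This is not our actual script: the run and upgrade_saptune helpers and the DRY_RUN and SOLUTION variables are hypothetical, and using zypper to pull the new package is an assumption about how the package is delivered in your landscape. By default it only prints what it would do.

```shell
#!/bin/sh
# Sketch of the revert -> upgrade -> re-apply order for an applied solution.
# DRY_RUN=1 (the default here) only prints the commands instead of running them.
DRY_RUN="${DRY_RUN:-1}"
SOLUTION="${SOLUTION:-S4HANA-DBSERVER}"

# Run a command, or just print it when in dry-run mode.
run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "would run: $*"
    else
        "$@"
    fi
}

# Stop at the first failing step so a failed upgrade is never followed by an apply.
upgrade_saptune() {
    run saptune solution revert "$SOLUTION" &&
    run zypper --non-interactive update saptune &&
    run saptune solution apply "$SOLUTION"
}

upgrade_saptune
```

Run it with DRY_RUN=0 only once you are happy with the printed sequence, and follow it with a check of the actual O/S settings.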

As a post-note, you should make yourself familiar with the coming changes to the SLES scheduler settings, with the introduction of the NONE scheduler (see the last of the links below for the blog post).

Useful notes/links:
https://www.suse.com/c/sles-1112-os-tuning-optimisation-guide-part-1/
https://blogs.sap.com/2019/06/25/sapconf-versus-saptune-in-more-detail/
https://blogs.sap.com/2019/05/03/a-new-saptune-is-knocking-on-your-door/
https://www.suse.com/c/noop-now-named-none/

SAP ASE – Blocking Factor Madness

I spent around a day looking into a performance issue with a specific piece of SQL in an ERP 6 system running on an SAP ASE database.
The system had recently been migrated from Oracle, so we were expecting issues with hints; however, this didn’t seem to be an index choice issue.
Look at the following two SQL statements. The first one is from the system experiencing a performance problem:

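[Screenshot: SQL trace showing the prepared statement on the system with the performance problem]

The second picture is from a system where the performance issue doesn’t exist:

[Screenshot: SQL trace showing the prepared statement on the system without the performance problem]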

Can you spot the difference?
Hint: Look at the number of question marks in the prepared statement.
The number of question marks indicates the number of items included in the “IN LIST” of the WHERE clause.

Since the ABAP SQL statement will be interpreted at the Kernel level, there is no way to see any difference in the ABAP layer other than this output from a SQL trace.
The consequence of the first statement (with fewer question marks) is that the SQL statement is executed multiple times in order to query for the same “PERNR” records. This can result in as much as four to five times more effort for the SAP layer, plus the database layer, which adds more to the database response time and a little to the “processing time”.

What impacts the number of question marks? Simple: the “rsdb/*blocking_factor” parameters at the Kernel level adjust how many parameters are fed into the prepared statement in the DBSL layer.
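To get a feel for the effect, here is a small piece of arithmetic. The function name statements_needed and the numbers (100 keys, factors of 5 and 25) are purely illustrative and not taken from the traces or from the SAP note; the only assumption is that the keys are split into IN lists of at most the blocking factor.

```shell
#!/bin/sh
# Number of prepared-statement executions needed to fetch KEYS values when at
# most FACTOR values fit into one IN list (ceiling division).
statements_needed() {
    keys=$1
    factor=$2
    echo $(( (keys + factor - 1) / factor ))
}

statements_needed 100 5     # small blocking factor: prints 20
statements_needed 100 25    # larger blocking factor: prints 4
```

With these made-up values the smaller blocking factor needs five times as many executions for the same set of keys.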

SAP Note 1996340 “SYB: Default RSDB profile parameters for SAP ASE” will provide all the answers.
The SAP note also answered my specific question, which was why one system in the landscape was different. The answer was that the production system (where the problem was seen) had its parameters mostly carried across during a migration from Oracle, whereas a smaller “release” system had its parameters left behind to die with the Oracle database.

As you will read in the SAP note 1996340, these rsdb parameters are pretty essential and should be re-evaluated when changing database platform.

Always re-evaluate all parameters when migrating from one platform to another.
Don’t assume that someone more experienced has set them with some future knowledge of the landscape/setup.

Oracle 11g Methods of Performance Tuning SQL

>90% of upgrade related problems are performance issues after an upgrade.

Source: Oracle Corp

Oracle tools for helping you tune the database:

  • Statspack – FREE – (See note 394937.1)
  • AWR – Diagnostics Pack license required.
  • Real Application Testing (Features: SQL Performance Analyzer & Database Replay) – Real Application Testing option license required.

Since 11g, Oracle recommends that you use the SQL Plan Management tool along with SQL Profiling, instead of storing outlines, fixing stats, using SQL hints, or using the Rule Based Optimiser (desupported).

See spm_white_paper_ow07.pdf for more information.

Java VM 5.0 Default Heap Size

If you have Java 5.0 (1.5), you may wish to know the default heap size for a JVM.
If you don’t specify -Xmx or -Xms to control the heap size, then the defaults are used.

They are described in detail in the strangely titled “Ergonomics in the 5.0 Java Virtual Machine” page here: https://www.oracle.com/technetwork/java/ergo5-140223.html

The page states:
In the J2SE platform version 5.0 a class of machine referred to as a server-class machine has been defined as a machine with
  • 2 or more physical processors
  • 2 or more Gbytes of physical memory

On server-class machines by default the following are selected.
  • Throughput garbage collector
  • Heap sizes
      • initial heap size of 1/64 of physical memory up to 1Gbyte
      • maximum heap size of ¼ of physical memory up to 1Gbyte
  • Server runtime compiler

So if you have a “server class” machine, you could expect your Java 5.0 JVM to utilise a maximum of 1GB of heap with no tuning (-Xmx & -Xms).
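As a worked example of the quoted rule, the sketch below calculates the defaults for a given amount of physical memory. The function name default_heap_mb is my own, and it assumes the machine has already been classed as “server class”; it only reproduces the 1/64 and ¼ fractions with the 1 Gbyte cap from the page, not anything the JVM reports itself.

```shell
#!/bin/sh
# Default heap sizes (in MB) for a server-class machine as quoted above:
# initial = 1/64 of physical memory, maximum = 1/4, both capped at 1 GB.
default_heap_mb() {
    mem_mb=$1
    cap=1024
    initial=$(( mem_mb / 64 ))
    maximum=$(( mem_mb / 4 ))
    if [ "$initial" -gt "$cap" ]; then initial=$cap; fi
    if [ "$maximum" -gt "$cap" ]; then maximum=$cap; fi
    echo "initial=${initial}MB maximum=${maximum}MB"
}

default_heap_mb 2048     # prints initial=32MB maximum=512MB
default_heap_mb 16384    # prints initial=256MB maximum=1024MB
```

Note that the 1GB maximum is only reached once the machine has 4GB or more of physical memory.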

Although it’s not very clear, it seems that a non-“server class” machine would allocate the same as a Java 1.4.2 virtual machine:

In the J2SE platform version 1.4.2 by default the following selections were made
  • Serial garbage collector
  • Heap sizes
      • initial heap size of 4 Mbyte
      • maximum heap size of 64 Mbyte
  • Client runtime compiler

Therefore, a maximum of 64MB of heap would be utilised.

I don’t know if a single dual core CPU is recognised as a “server class” machine, but you can find out what your Java version thinks your machine is by running:

> java -help
Usage: java [-options] class [args...]
(to execute a class)
or java [-options] -jar jarfile [args...]
(to execute a jar file)

where options include:
-d32         use a 32-bit data model if available
-d64         use a 64-bit data model if available
-client       to select the "client" VM
-server      to select the "server" VM
-hotspot    is a synonym for the "client" VM [deprecated]
                 The default VM is server,
                 because you are running on a server-class machine.

-cp <class search path of directo....

As you can see, the output tells you that it thinks you are running on a “server class” machine.

Either way, it’s probably best to use the “-server” command line option to be sure, plus the “-Xmx” option to restrict memory usage if you don’t need a whole 1GB heap.

SAP note 830576 – PGA_AGGREGATE_TARGET on Oracle 10gR2

SAP note 830576 “Parameter Recommendations for Oracle 10g” is quite a popular one for me.
It lists all the SAP recommended Oracle 10g parameter settings for Oracle 10.2.0.4 and 10.2.0.5.
It’s a good point of reference and I’d recommend you implement it as a baseline before tuning the system further.
It has a buddy note, 1289199 “Information About Oracle Parameters”, which describes some of the parameters in more detail.

Unfortunately, there is a major flaw in note 830576. When setting PGA_AGGREGATE_TARGET, the SAP note says 20% of available memory. It fails to mention that this should be 20% of the SGA size, not O/S memory.
The Oracle docs (see My Oracle Support note 153367.1) say that the value should be:

Syntax           PGA_AGGREGATE_TARGET = integer [K | M | G]
Default value    10 MB or 20% of the size of the SGA, whichever is greater
Modifiable       ALTER SYSTEM
Range of values  Minimum: 10 MB
                 Maximum: 4096 GB – 1

The Oracle note goes on to say that when sizing the Oracle database memory areas, you should consider the SGA size first, then assign any spare memory to PGA.
Now in an SAP landscape with a single Central Instance + Dialog Instance on the same server as the database, you may wish to use the SAP 70/30 rule (70% to SAP, 30% to Oracle).

My order of sizing would look something like this:
1. Determine number of users of SAP system.
2. Determine number of DIALOG work processes + Background work processes + Update processes (~= Oracle “processes”).
3. Determine leftover memory for Oracle SGA (split between pools, SAP doesn’t support automatic memory management).
4. Determine leftover memory for PGA + overheads.

If you get to step 4 and you have diddly squat RAM left (hardly any), then consider adding more RAM to your server. Remember, we don’t like paging.
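As a rough illustration of the last two steps, the sketch below splits a server’s RAM using the 70/30 rule and then divides the Oracle share so that the PGA is 20% of the SGA. The function name oracle_memory_split and the 40000 MB figure are made up, and it deliberately ignores work process counts and O/S overheads, so treat it as a starting point for the arithmetic only.

```shell
#!/bin/sh
# Split physical memory (MB) using the 70/30 rule, then divide the Oracle share
# so that PGA = 20% of SGA, i.e. SGA + 0.2 * SGA = Oracle share.
# ASSUMPTION: no allowance is made for overheads or work process memory.
oracle_memory_split() {
    ram_mb=$1
    oracle_mb=$(( ram_mb * 30 / 100 ))   # 30% of RAM goes to Oracle
    sga_mb=$(( oracle_mb * 5 / 6 ))      # same as oracle_mb / 1.2
    pga_mb=$(( oracle_mb - sga_mb ))     # remainder, roughly 20% of the SGA
    echo "oracle=${oracle_mb}MB sga=${sga_mb}MB pga=${pga_mb}MB"
}

oracle_memory_split 40000    # prints oracle=12000MB sga=10000MB pga=2000MB
```

If the PGA figure that falls out is too small for your number of Oracle processes, that is the signal to look at adding RAM.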

SAP note 789011 “FAQ: Oracle Memory Areas” provides a range of SQL statements for checking the actual size of the PGA, since PGA_AGGREGATE_TARGET only tells Oracle what you would like the maximum PGA allocation to be.

When you set PGA_AGGREGATE_TARGET, you also allow Oracle to release PGA memory back to the O/S.  Using the *_AREA_SIZE parameters and setting PGA_AGGREGATE_TARGET to 0, forces a specific size of PGA which does not release the memory to the O/S.

/* Actual PGA consumption */
SELECT VALUE FROM V$PGASTAT WHERE NAME = 'total PGA allocated';

/* Chronological PGA allocation (needs AWR license) */
SELECT SUBSTR(S.END_INTERVAL_TIME, 1, 40) TIME,
               P.VALUE PGA_ALLOCATION
  FROM DBA_HIST_SNAPSHOT S, DBA_HIST_PGASTAT P
WHERE P.NAME = 'total PGA allocated'
    AND S.SNAP_ID = P.SNAP_ID
ORDER BY P.SNAP_ID;

Oracle states:
Memory Area                                                  Dedicated Server   Shared Server
Nature of session memory                                     Private            Shared
Location of the persistent area                              PGA                SGA
Location of part of the runtime area for SELECT statements   PGA                PGA
Location of the runtime area for DML/DDL statements          PGA                PGA

When installing Oracle for SAP, by default it uses DEDICATED server mode (see note 70197).