Scheduler Archives » Musings of an IT Implementor

Having recently spent some time analysing the performance of a HANA database system, I got down to the depths of Linux device I/O performance on an Azure hosted VM.

There was no reason to suspect any issue, because we had followed all the relevant SAP notes when implementing the VM image build process.
In our case, on SUSE Linux Enterprise Server (SLES) for SAP Applications 12, we were explicitly following SAP Note 1275776 “Linux: Preparing SLES for SAP environments”.
That SAP note takes you through the difference between sapconf and saptune, and then through actually configuring saptune (which is installed automatically with the “for SAP” versions of SLES 12).

Once configured, saptune should apply all the best practices that are encompassed in a number of SAP notes including SAP Note 2205917 “SAP HANA DB: Recommended OS settings for SLES 12 / SLES for SAP Applications 12”, which is itself needed during the HANA DB installation preparation work.
The note lists a number of O/S adjustments required for HANA, which can be applied either manually or (as recommended) automatically via saptune, provided the correct saptune profile is selected.

As part of our configuration, we had applied saptune solution profile S4HANA-DBSERVER (also noted in the SUSE documentation for SAP HANA).
This is applied using the standard:

saptune solution apply S4HANA-DBSERVER

You don’t get a lot of feedback from the saptune execution, but the absence of errors (normally) indicates that it has done what was requested.
You can check it has applied the profile by executing:

saptune solution list

The item that is starred in the returned list is the profile that has been applied.
That’s it.
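For a build check, the starred entry can be picked out mechanically rather than by eye. The following is a minimal sketch, not part of saptune itself: the function names are hypothetical, and it assumes the applied solution appears on a line whose first field is `*` followed by the solution name, which may differ between saptune versions.

```shell
#!/bin/sh
# Hypothetical helpers; they only parse text, they do not call saptune.

# Read `saptune solution list` output on stdin and print the name of
# the solution whose line starts with a "*" marker (assumed layout).
applied_solution() {
    awk '$1 == "*" { print $2 }'
}

# Succeed only if the starred solution on stdin equals the name in $1.
solution_is_applied() {
    [ "$(applied_solution)" = "$1" ]
}

# Usage on a live system:
#   saptune solution list | solution_is_applied S4HANA-DBSERVER \
#       || echo "expected profile is not applied"
```

Keeping the parsing separate from the saptune call means the check can be exercised against captured output from each saptune version you support.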

As part of my troubleshooting I even took the trouble of running the publicly available script sapconf_saptune_check (see here: https://github.com/scmschmidt/sapconf_saptune_check/blob/master/sapconf_saptune_check ), which just confirmed that saptune was indeed active/enabled and had a valid profile configured.

Back to the task of checking out the performance issue, and you can probably see where this is going now.
On investigating the actual contents of the saptune profile, it was possible to see that the large majority of its O/S changes had not been applied.
Specifically, we were not seeing the NOOP scheduler selected for the HANA disk devices.

By executing either of the following, you can check the currently selected scheduler:

grep -H '.*' /sys/block/s??/queue/scheduler

cat /sys/block/s??/queue/scheduler

The selected scheduler will be in square brackets.
In my case, I was seeing “[cfq]” for all devices. That is not good, and it is not the recommendation from SAP and SUSE.
This setting should have been adjusted automatically by the tuned daemon.
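The manual check above is easy to turn into something a build or audit script can run. This is a minimal sketch under stated assumptions: the function names are hypothetical, the expected scheduler is passed in by the caller, and the optional second argument (a sysfs root other than /sys/block) exists only so the logic can be tried against a test directory.

```shell
#!/bin/sh
# Print the active scheduler, i.e. the name shown in square brackets,
# from one sysfs scheduler file.
active_scheduler() {
    sed -n 's/.*\[\([^]]*\)\].*/\1/p' "$1"
}

# Report every s?? device whose active scheduler is not the expected one.
# $1 = expected scheduler, $2 = sysfs root (defaults to /sys/block).
# Returns non-zero if at least one device differs.
check_schedulers() {
    expected=$1
    root=${2:-/sys/block}
    rc=0
    for f in "$root"/s??/queue/scheduler; do
        [ -r "$f" ] || continue   # glob matched nothing, or unreadable
        current=$(active_scheduler "$f")
        if [ "$current" != "$expected" ]; then
            echo "MISMATCH $f: $current (expected $expected)"
            rc=1
        fi
    done
    return $rc
}

# Usage: check_schedulers noop || echo "scheduler settings not applied"
```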

Looking at my version of saptune, I could see it was version 1.1.7 (from the output of the execution of the sapconf_saptune_check script).

Reading one of the recent blog posts from Soeren Schmidt (https://blogs.sap.com/2019/05/03/a-new-saptune-is-knocking-on-your-door/), I could see that version 2 of saptune had now been released.

I downloaded the newer version (without installing it directly!), reverted the old solution profile, installed the new saptune version and finally re-applied the same profile. The result confirmed that saptune was the culprit.

The new saptune2 fixed the issue, immediately activating a number of critical O/S changes, including the NOOP scheduler setting on each device.

The moral of the story is that, as well as following the SAP processes, you still need to validate that the tooling has actually done what it says it has done.
The new saptune2 has been incorporated into our build process, plus the configuration check scripts will be specifically checking for it.
However, since the upgrade from saptune1 to saptune2 could cause issues if it just blindly re-applied the “new” profile settings, saptune follows a backwards compatible upgrade process, whereby the O/S settings are retained as they were before the upgrade was executed.

Therefore, as per the links in SAP Note 2816790 “Differences between sapconf and saptune”, the upgrade process for an already applied profile is to revert it prior to the saptune upgrade, then apply the upgrade, then re-apply the profile.
This meant it could not simply be rolled out via our standard SLES patching routine. We had to develop an automated script that would specifically pre-patch saptune to saptune2 using the correct procedure, before we embarked on the next SLES patching round.
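The order of operations described above (download, revert, upgrade, re-apply) can be sketched as follows. This is an illustration and not our actual pre-patch script: the `run` and `upgrade_saptune` names and the `SOLUTION` and `DRY_RUN` variables are hypothetical, and it assumes the newer saptune package is available from a configured zypper repository. It defaults to a dry run that only prints the commands.

```shell
#!/bin/sh
SOLUTION=${SOLUTION:-S4HANA-DBSERVER}   # profile to revert and re-apply
DRY_RUN=${DRY_RUN:-1}                   # 1 = print commands only

# Execute a command, or just print it when DRY_RUN is 1.
run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

# Stop at the first failing step so a failed revert never leads
# to an upgrade with the old profile still applied.
upgrade_saptune() {
    run zypper --non-interactive update --download-only saptune &&
    run saptune solution revert "$SOLUTION" &&
    run zypper --non-interactive update saptune &&
    run saptune solution apply "$SOLUTION"
}

# Usage: upgrade_saptune             (dry run, prints the four steps)
#        DRY_RUN=0 upgrade_saptune   (as root, performs them)
```

Downloading first keeps the window between revert and re-apply as short as possible, since the system runs without the profile's settings in between.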

As a post-note, you should make yourself familiar with the coming changes to the SLES scheduler settings, with the introduction of the NONE scheduler (see the links below for the blog post).

Useful notes/links:
https://www.suse.com/c/sles-1112-os-tuning-optimisation-guide-part-1/
https://blogs.sap.com/2019/06/25/sapconf-versus-saptune-in-more-detail/
https://blogs.sap.com/2019/05/03/a-new-saptune-is-knocking-on-your-door/
https://www.suse.com/c/noop-now-named-none/

Whilst administering a SAP ASE based SAP system, I came across an issue in the ASE Job server error log “JSTASK.log”:

00:140737306879744:140737340581728:2016/02/24 16:50:00.87 worker ct_connect() failed.

00:140737306879744:140737340581728:2016/02/24 16:50:00.87 worker jsj__RunSQLJob: jsd_MakeConnection() failed for user sapsa to server SID

00:140737306879744:140737340581728:2016/02/24 16:50:00.87 worker jsj__RunSQLJob() failed for xid 66430

00:140737317369600:140737340581728:2016/02/24 16:55:00.87 worker Client message: ct_connect(): protocol specific layer: external error: The attempt to connect to the server failed.

The issue was caused by a change of the sapsa user password in which the SAP recommended method of using the hostctrl process wasn’t followed.
The recommended method updates the sapsa user, the secure storage file and also the external login for the Job Server.
This is mentioned at the very end of SAP note 1706410 (although it is suggested that the process in this note is no longer followed to change the passwords).
To fix the issue, follow the final steps in SAP note 1706410:

isql -X -Usapsa -S<SID> -w999

use master
go
sp_helpexternlogin
go

Server      Login  Externlogin
----------  -----  -----------
SYB_JSTASK  sapsa  sapsa

Drop the SYB_JSTASK entry:

exec sp_dropexternlogin SYB_JSTASK, sapsa
go

Re-create it with the new password:

exec sp_addexternlogin SYB_JSTASK, sapsa, sapsa, '<new sapsa password>'
go

This should fix the issue.
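If the fix has to be repeated across several systems, the two statements can be generated rather than typed. The sketch below is only an illustration: the `externlogin_sql` function name and the file path in the usage comment are hypothetical. It doubles any single quote in the password so that the SQL string literal stays valid.

```shell
#!/bin/sh
# Print the SQL that drops and re-creates the SYB_JSTASK external login.
# $1 = new sapsa password.
externlogin_sql() {
    # Double embedded single quotes for the SQL string literal.
    pw=$(printf '%s' "$1" | sed "s/'/''/g")
    cat <<EOF
use master
go
exec sp_dropexternlogin SYB_JSTASK, sapsa
go
exec sp_addexternlogin SYB_JSTASK, sapsa, sapsa, '$pw'
go
EOF
}

# Usage (the file holds a password, so restrict and remove it):
#   umask 077
#   externlogin_sql "$NEW_PW" > /tmp/jstask_fix.sql
#   isql -X -Usapsa -S<SID> -w999 -i /tmp/jstask_fix.sql
#   rm -f /tmp/jstask_fix.sql
```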

Tag: Scheduler

Making saptune Actually Work & Patching to v2


SAP ASE Job Server Error
