HANA Archives » Musings of an IT Implementor

Power BI Desktop Single-Sign-On to SAP HANA

This post is all about Single-Sign-On (SSO) from Microsoft Power BI Desktop to SAP HANA database 2.0.
When you are first given the task of designing and/or setting this up, there are a few questions to ask before you start.
In this post I compare the two methods for single sign-on, Kerberos and SAML, and cover the use of the Power BI Gateway (also known as the “On-Premises Data Gateway”).

Index:

Questions to Ask
On-Premises Data Gateway or no On-Premises Data Gateway
HANA Integration Point
Kerberos or SAML
Power BI Direct to HANA using Kerberos (SSO2DB)
Power BI via On-Premise Data Gateway to HANA using Kerberos
Power BI via On-Premise Data Gateway to HANA using SAML
Troubleshooting
References

Questions to Ask

Before setting up SSO from Power BI Desktop to SAP HANA, you should ask these questions:

Define what the Power BI Desktop end-user will be doing:
Are the end users creating reports or are they consuming already published reports?

End users that are creating new reports will need a direct Power BI Desktop to HANA connection with SSO (a.k.a SSO2DB). This will need to use Kerberos because SAML is not supported.
End users that are consuming already published reports can use the On-Premises Data Gateway with SSO to access and execute the reports from Power BI Desktop. The On-Premises Data Gateway can use Kerberos or SAML.
Define where Power BI Desktop will be running:
Do end-users all have Windows accounts in the same domain?

For direct to HANA connections with SSO, Kerberos is used and requires the end-user to be signed into Windows with a Windows account on the machine where Power BI Desktop is running.
If the end-user does not have a Windows account (or they sign into Windows with a different, untrusted domain), they can enter Windows credentials into the login box inside Power BI Desktop (this is not quite so seamless). They will still need an AD account, and it must be one that is federated with the domain to which SAP HANA has been added (HANA gets its own service account).
If using On-Premises Data Gateway, define how many HANA systems will be connected to it:
Is the On-Prem Data Gateway needing to connect to multiple HANA systems?

When connecting On-Premises Data Gateway to HANA using SAML for SSO, there is a one-to-many relationship between the SAML key and certificate generated for the On-Premises Data Gateway and the HANA systems it connects to. The On-Premises Data Gateway can only use one certificate, and this has to be deployed and trusted on all the HANA systems that it will be connecting to. Therefore, you really need an On-Premises Data Gateway for each HANA environment (or at least one for Production and one for Non-Production) to allow proper testing and better security for production.
If planning to use Kerberos for SSO, identify corporate security policies & settings for Active Directory (AD) service accounts:
Do AD service accounts require AES256 bit encryption?
What are the server names and domains of the required domain Key Distribution Centre (KDC) server(s)?
What will be the full UPN of the user account when using Kerberos?

When AD service accounts have AES256 bit encryption, it changes the process for setting up the keytab file that is placed onto the SAP HANA server.
The KDC and domain information will be needed for the configuration of the HANA server’s krb5_hdb.conf file.
The AD administrators should be asked for the above information.
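
To show how those answers feed into the Kerberos client configuration on the HANA server, here is a minimal sketch that prints a krb5-style configuration from a realm, a KDC host and an AES256 flag. The realm CORP.NET, the KDC name dc01.corp.net and the function name are placeholders of mine, and the settings your domain actually needs should come from the AD administrators and the SAP notes in the references.

```shell
# Sketch only: print a minimal krb5-style config for the HANA server.
# Plain POSIX shell, so it also runs under ksh. All values are placeholders.
make_krb5_conf() {
    krb_realm="$1"      # AD realm in upper case, e.g. CORP.NET
    krb_kdc="$2"        # FQDN of a domain KDC server
    krb_use_aes="$3"    # "yes" if the AD service account enforces AES256

    # Lower-case DNS domain that maps to the realm.
    krb_domain="$(printf '%s' "$krb_realm" | tr '[:upper:]' '[:lower:]')"

    printf '[libdefaults]\n'
    printf '    default_realm = %s\n' "$krb_realm"
    if [ "$krb_use_aes" = "yes" ]; then
        printf '    default_tkt_enctypes = aes256-cts-hmac-sha1-96\n'
        printf '    default_tgs_enctypes = aes256-cts-hmac-sha1-96\n'
        printf '    permitted_enctypes = aes256-cts-hmac-sha1-96\n'
    fi
    printf '\n[realms]\n'
    printf '    %s = {\n' "$krb_realm"
    printf '        kdc = %s\n' "$krb_kdc"
    printf '    }\n'
    printf '\n[domain_realm]\n'
    printf '    .%s = %s\n' "$krb_domain" "$krb_realm"
    printf '    %s = %s\n' "$krb_domain" "$krb_realm"
}

# Example: review the output before copying anything into krb5_hdb.conf.
make_krb5_conf "CORP.NET" "dc01.corp.net" "yes"
```

Passing "no" as the third argument leaves the encryption type lines out, which is the only difference this sketch makes for the AES256 question.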

On-Premises Data Gateway or no On-Premises Data Gateway

You can use the On-Premises Data Gateway (Power BI Gateway) for accessing the data in “on-premise” systems. This includes HANA databases. The gateway acts as a kind of reverse proxy because it connects out to Microsoft from inside the customer’s network (where it is hosted).

The Gateway provides a distribution (publishing) framework where reports can be curated and published for access by many users.
End-users can connect from their Power BI Desktop (installed on their local computer) to the On-Premises Data Gateway *over the internet*.

Without the On-Premises Data Gateway, each Power BI Desktop end-user will need a direct connection to the SAP HANA database. It is recommended that this is performed over a VPN connection, or for the end-user to be physically in a corporate office on the LAN/WAN. In the future the Azure “v-Net” connection option may support SAP HANA connections if you happen to host your SAP HANA in Microsoft Azure.
NOTE: In the below, we could be using Azure AD or classic Active Directory Domain Services.

HANA Integration Point

Before we continue we need to highlight that the Power BI Desktop and On-Premises Data Gateway connect to the SAP HANA database indexserver for SSO via both Kerberos and SAML.
No changes are required to HANA XSA (Extended Application Services, Advanced) for these integrations. This is not the same integration that you may read about in other guides (especially guides related to HANA’s analytical capabilities).

Kerberos or SAML

Whether to use Kerberos or SAML is really up to your organisation’s preferences and capabilities.
Microsoft and SAP recommend SAML as a modern approach to Single-Sign-On.

SAML de-couples the SAP HANA system from the identity provider and is simpler to use, with potentially fewer firewall changes.
Be aware, the On-Premises Data Gateway can only use one certificate for SAML for all the HANA databases it talks to.
When using SAML, the On-Premises Data Gateway connection to HANA needs securing with TLS, otherwise the SAML assertion (kind of like a certificate) would be sent unencrypted.

On the other hand, Kerberos provides a centralised identity management approach and is much more rigid in design, with a few more steps involved in the setup. It is also a much older protocol with its own set of vulnerabilities, but it comes without the requirement to set up TLS (although TLS is still recommended).

If you need to have Power BI Desktop connecting directly to SAP HANA (a.k.a SSO2DB), then, as of writing, this can only use Kerberos for single-sign-on. Kerberos delegation is not needed in this scenario.
For connections from Power BI Desktop via the On-Premises Data Gateway to SAP HANA, either Kerberos or SAML can be used.
When using the On-Premises Data Gateway with SAML, the On-Premises Data Gateway becomes the SAML identity provider (IdP).
When using the On-Premises Data Gateway with Kerberos, the On-Premises Data Gateway will use Kerberos delegation on behalf of the end-user.
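
The choice above can be summarised as a small lookup. This is only a sketch of the rules as described in this post; the path labels "direct" and "gateway" and the function name are made up for the example.

```shell
# Sketch: list the SSO methods described above for each connection path.
# "direct" and "gateway" are labels invented for this example.
sso_methods_for() {
    case "$1" in
        direct)
            # Power BI Desktop straight to HANA (SSO2DB), no delegation needed.
            printf 'kerberos\n'
            ;;
        gateway)
            # Via the On-Premises Data Gateway: delegated Kerberos or SAML.
            printf 'kerberos-delegation\nsaml\n'
            ;;
        *)
            printf 'unknown connection path: %s\n' "$1" >&2
            return 1
            ;;
    esac
}

sso_methods_for direct
sso_methods_for gateway
```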

Power BI Direct to HANA via Kerberos (SSO2DB)

NOTE: This is also known as SSO2DB.

The first thing to note about connecting Power BI directly to SAP HANA using Kerberos for single-sign-on, is that your BASIS team will need to work with the Microsoft Active Directory (AD) team.
It is possible for the AD team to delegate a proportion of the work to the BASIS team by creating a separate (dedicated) organisational unit (OU) and applying permissions that allow the BASIS team to use their Windows accounts to manage the AD entities created in this new OU.

Here is how the architecture will look for a direct connection from Power BI to SAP HANA via Kerberos:

Process Flow:

User opens Power BI (or Excel).
User connects to SAP HANA database using a Windows authentication account (authenticates via Azure AD in this example).
Kerberos authentication token (ticket) is forwarded to SAP HANA during the HANA logon process.
HANA decrypts the token using the keytab file, which contains the key for the stored service principal name (SPN), and maps the decrypted Windows account name (UPN) to the HANA DB account.

There is no requirement for Kerberos delegation in this setup.

For the above setup to work, there are some required steps and some optional steps:

Required: Install SAP HANA client
The main requirement is that the SAP HANA client is to be installed onto the end-user’s computer (where Power BI desktop is running). For SAP administrators, you will note that this HANA client will also need to be included in your frequent patching & maintenance routines to ensure it is aligned with the version of SAP HANA in use.
Recommended: Install SAPCRYPTOLIB
As well as the requirement for the SAP HANA client, it is recommended that you secure the connection to SAP HANA using TLS.
For this, you will need the SAPCRYPTOLIB also installing into the HANA client location on the end-user’s machine.
This set of libraries allows TLS to be used to encrypt the connection, which is part of your “data-in-transit” security posture.
You will also therefore need a SAP Personal Security Environment (PSE) file placing onto the end-user’s machine along with the SAPCRYPTOLIB.
These libraries will also need to be included in your frequent patching & maintenance routines to ensure it is aligned with the version of SAPCRYPTOLIB in use on the SAP HANA server.
Required: Define Env Variable SECUDIR
So that the HANA client knows where the SAPCRYPTOLIB libraries (DLLs) have been deployed (if they are being deployed), you should set a SYSTEM environment variable called “SECUDIR” to point to the location of the SAPCRYPTOLIB files.
Optional: Enable “Server Validation”
An optional step is to enable “Server Validation” on the connection properties. It is recommended to enable this, because without server validation, it is not possible to know that the target SAP HANA server that has been connected to, is to be trusted with the Kerberos ticket that will be sent during logon.
This also serves as a method of helping to restrict who can connect to which servers, by un-trusting specific servers (maybe old sandbox ones).
For “Server Validation” to work, the PSE file which is located in the HANA client directory on the end-user’s computer, will need to be populated with the public TLS certificate(s) of the SAP HANA system(s) the end-user will be connecting to and these certificates will need to contain the FQDN that has been used to initiate the connection (e.g. my-virtual-db-hostname.corp.net).
Required: Configure Kerberos on HANA server
The krb5_hdb.conf is configured on the HANA server, according to your AD domain setup and whether AES256 is needed for the AD service account.
Once krb5_hdb.conf is configured, the AD service account can be tested at the Linux level using the required kinit and ktutil tools.
The Kerberos keytab can only be created once the AD service account has been created and the required SPN(s) mapped. The method of creating this changes depending on whether AES256 encryption is needed on the service account.
When using AES256 bit encryption, you cannot simply rotate the key in the keytab. You will need to take an interruption to SSO connectivity while you update the password in AD, generate a new keytab key and update the keytab on the HANA system.
The SAP documentation speaks of not needing to restart HANA, but this was not the case on all systems, for whatever reason. Be prepared for HANA restarts, or place the files into the /etc folder (changing names and permissions accordingly) until a restart can be done.
An important point is host name resolution. When you set up the Kerberos keytab, the SPNs you are told to create are prefixed with “hdb/server-host”. When authentication tracing is enabled on HANA at “debug” level, you can see the hostname detection in the trace files. HANA finds its hostname, then finds every canonical name for it from DNS, then looks for matching entries in the keytab file. Obviously it has an order, but from what I’ve seen you can get it to match on any canonical name, even if the entry in DNS is uppercase and the keytab is lowercase.
Required: Map HANA User to UPN
In the HANA system, the database user account(s) need their “External ID” setting to the UPN that is passed in the Kerberos ticket. The UPN may not be what you expect: you may imagine it to be “user.name@corp.net”, but in actual fact it may use the actual domain name, “user.name@REALM”. Testing and tracing in the HANA system with the auth trace turned on will reveal the UPN to you.
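
For the Kerberos configuration step, the checks at the Linux level could look something like the sketch below. It builds the SPN in the “hdb/server-host” form and then uses the standard MIT Kerberos tools against the keytab. The host name, realm, keytab path and function names are placeholders, and whether kinit accepts the SPN directly depends on how the AD service account was set up, so treat that part as an assumption.

```shell
# Sketch: build the SPN the keytab should contain, then check the keytab.
# Host, realm and keytab path are placeholders for illustration only.
build_hdb_spn() {
    # $1 = host name as resolved from DNS, $2 = AD realm (upper-cased here)
    spn_realm="$(printf '%s' "$2" | tr '[:lower:]' '[:upper:]')"
    printf 'hdb/%s@%s\n' "$1" "$spn_realm"
}

check_keytab() {
    # $1 = keytab file, $2 = SPN. Intended to be run as the HANA OS user.
    klist -k -e "$1" || return 1        # list principals and encryption types
    kinit -k -t "$1" "$2" || return 1   # obtain a ticket using the keytab key
    kvno "$2"                           # request a service ticket, prints the kvno
}

build_hdb_spn "my-virtual-db-hostname.corp.net" "corp.net"

# Not run here because it needs a real keytab and KDC:
# check_keytab /path/to/krb5_hdb.keytab "$(build_hdb_spn my-virtual-db-hostname.corp.net CORP.NET)"
```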

All of the above software and files can be packaged up and distributed to the end-user’s computer using orchestration tools such as SCCM.

Power BI via On-Premise Data Gateway to HANA using Kerberos

Connecting Power BI via the On-Premise Data Gateway to SAP HANA using Kerberos for single-sign-on will need to use something called Kerberos delegation. This delegation technique allows the On-Premise Data Gateway to impersonate the source user account when accessing the target SAP HANA system. It is similar to lending your credit card to your partner (not your PIN, just your card, allowing them to make contactless payments up to a predefined value).
Again, the AD team will need to be involved in a similar way to the “direct to HANA via Kerberos” method.
In this setup, the On-Premise Data Gateway must be running as a domain service user (for delegation to be allowed).

As well as the AD team, you will also need to involve the Power BI administrators (or someone to configure the On-Premise Data Gateway) as some specific changes will need to be made on the gateway machine.

Here is how the architecture will look for a connection from Power BI via the On-Premise Data Gateway to SAP HANA using Kerberos for SSO:

Process Flow:

User login to Power BI Desktop.
Authentication via Azure AD (in this example).
User accesses query/connection for SAP HANA configured and published from the On-prem Data Gateway.
On-prem Data Gateway receives UPN and switches context to impersonate the end-user (account delegation), getting the token from AD and sending on to the HANA system.
HANA decrypts token using keytab file which contains the key for the stored SPN and maps the decrypted Windows account name (UPN) to the HANA DB account.

For the above setup to work, there are some required steps and some optional steps:

Required: Install SAP HANA client
The main requirement is that the SAP HANA client is to be installed onto the On-Premise Data Gateway machine. For SAP administrators, you will note that this client will also need to be included in your frequent patching & maintenance routines to ensure it is aligned with the version of SAP HANA in use.
Recommended: Install SAPCRYPTOLIB
As well as the requirement for the SAP HANA client, it is recommended that you secure the connection to SAP HANA using TLS.
For this, you will need the SAPCRYPTOLIB also installing into the HANA client location on the On-Premise Data Gateway machine.
This set of libraries allows TLS to be used to encrypt the connection, which is part of your “data-in-transit” security posture.
You will also therefore need a SAP Personal Security Environment (PSE) file placing onto the On-Premise Data Gateway machine along with the SAPCRYPTOLIB.
These libraries will also need to be included in your frequent patching & maintenance routines to ensure it is aligned with the version of SAPCRYPTOLIB in use on the SAP HANA server.
Required: Define Env Variable SECUDIR
So that the HANA client knows where the SAPCRYPTOLIB libraries (DLLs) have been deployed (if they are being deployed), you should set a SYSTEM environment variable called “SECUDIR” to point to the location of the SAPCRYPTOLIB files.
Optional: Enable “Server Validation”
An optional step is to enable “Server Validation” on the connection properties. It is recommended to enable this, because without server validation, it is not possible to know that the target SAP HANA server that has been connected to, is to be trusted with the Kerberos ticket that will be sent during logon.
For “Server Validation” to work, the PSE file which is located in the HANA client directory on the On-Premise Data Gateway machine, will need to be populated with the public TLS certificate(s) of the SAP HANA system(s) being connected to and these certificates will need to contain the FQDN that has been used to initiate the connection (e.g. my-virtual-db-hostname.corp.net).
Although this is optional, I suspect there is a bug in the On-Premise Data Gateway software, since it does not seem possible to use the “test connection” facility without enabling “Server Validation”.
Required: Configure Kerberos on HANA server
The krb5_hdb.conf is configured according to your AD domain setup and whether AES256 is needed for the AD service account.
Once krb5_hdb.conf is configured, the AD service account can be tested at the Linux level using the required kinit and ktutil tools.
The Kerberos keytab can only be created once the AD service account has been created and the required SPN(s) mapped. The method of creating this changes depending on whether AES256 encryption is needed on the service account.
Required: Map HANA User to UPN
In the HANA system, the database user account(s) need their “External ID” setting to the UPN that is passed in the Kerberos ticket. The UPN may not be what you expect: you may imagine it to be “user.name@corp.net”, but in actual fact it may use the actual domain name, “user.name@REALM”. Testing and tracing in the HANA system with the auth trace turned on will reveal the UPN to you.
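
As a rough sketch of the user mapping step, the function below prints the two HANA SQL statements that enable Kerberos for a database user and attach the external identity. The user JBLOGGS, the UPN and the hdbuserstore key ADMINKEY are hypothetical; use the UPN exactly as it appears in the HANA authentication trace.

```shell
# Sketch: print the SQL that enables Kerberos for a HANA user and maps the UPN.
# The user name, UPN and hdbuserstore key below are hypothetical.
kerberos_mapping_sql() {
    # $1 = HANA database user, $2 = UPN as shown in the HANA auth trace
    printf 'ALTER USER %s ENABLE KERBEROS;\n' "$1"
    printf "ALTER USER %s ADD IDENTITY '%s' FOR KERBEROS;\n" "$1" "$2"
}

kerberos_mapping_sql "JBLOGGS" "joe.bloggs@CORP.NET"

# To apply the statements they could be piped into hdbsql, for example:
# kerberos_mapping_sql "JBLOGGS" "joe.bloggs@CORP.NET" | hdbsql -U ADMINKEY
```

Printing the statements first makes it easy to review them (or hand them to whoever administers the database users) before anything is changed.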

In a High Availability cluster with 2 nodes for the On-Premise Data Gateway, both nodes will need the same files and config.

Power BI via On-Premise Data Gateway to HANA using SAML

Connecting Power BI via the On-Premise Data Gateway to SAP HANA using SAML is the simplest setup because it de-couples the HANA system from Azure AD, with the On-Premise Data Gateway becoming the identity provider in this scenario.
There is no need to make changes to AD and in this setup, the On-Premise Data Gateway service can be running as a local computer account.
You will need to involve the Power BI administrators (or someone to configure the On-Premise Data Gateway) as some specific changes will need to be made on the gateway machine.

One important point to note about this setup: The On-Premise Data Gateway can only use one certificate to connect to HANA. This means if you have more than one HANA system, they will need to all trust the same On-Premise Data Gateway certificate.
This is a limitation in the configuration file of the On-Premise Data Gateway.

Here is how the architecture will look for a connection from Power BI via the On-Premise Data Gateway to SAP HANA using SAML for SSO:

Process Flow:

User login to Power BI Desktop.
Authentication via Azure AD.
User accesses query/connection for SAP HANA configured and published from the On-prem Data Gateway.
On-prem Data Gateway receives UPN and generates a SAML assertion.
The Gateway signs the SAML assertion, including the target user account details, using the IdP key and sends it to the HANA DB server over TLS. HANA validates the signature using the IdP public key, then maps the target user to a DB user ID and performs the query work.

For the above setup to work, there are some required steps and some optional steps:

Required: Install SAP HANA client
The main requirement is that the SAP HANA client is to be installed onto the On-Premise Data Gateway machine. For SAP administrators, you will note that this client will also need to be included in your frequent patching & maintenance routines to ensure it is aligned with the version of SAP HANA in use.
Recommended: Install SAPCRYPTOLIB
As well as the requirement for the SAP HANA client, it is recommended that you secure the connection to SAP HANA using TLS.
For this, you will need the SAPCRYPTOLIB also installing into the HANA client location on the On-Premise Data Gateway machine.
This set of libraries allows TLS to be used to encrypt the connection, which is part of your “data-in-transit” security posture.
You will also therefore need a SAP Personal Security Environment (PSE) file placing onto the On-Premise Data Gateway machine along with the SAPCRYPTOLIB.
These libraries will also need to be included in your frequent patching & maintenance routines to ensure it is aligned with the version of SAPCRYPTOLIB in use on the SAP HANA server.
Required: Define Env Variable SECUDIR
So that the HANA client knows where the SAPCRYPTOLIB libraries (DLLs) have been deployed (if they are being deployed), you should set a SYSTEM environment variable called “SECUDIR” to point to the location of the SAPCRYPTOLIB files.
Optional: Enable “Server Validation”
An optional step is to enable “Server Validation” on the connection properties. It is recommended to enable this, because without server validation, it is not possible to know that the target SAP HANA server that has been connected to, is to be trusted with the SAML assertion that will be sent during logon.
For “Server Validation” to work, the PSE file which is located in the HANA client directory on the On-Premise Data Gateway machine, will need to be populated with the public TLS certificate(s) of the SAP HANA system(s) being connected to and these certificates will need to contain the FQDN that has been used to initiate the connection (e.g. my-virtual-db-hostname.corp.net).
Although this is optional, I suspect there is a bug in the On-Premise Data Gateway software, since it does not seem possible to use the “test connection” facility without enabling “Server Validation”.
Required: Create HANA SAML Provider
In the HANA system, a new SAML provider needs creating and assigning the IdP certificate that is to be trusted.
Required: Map HANA User to UPN
In the HANA system, the database user account(s) need their account enabling for SAML authentication and mapping to an allowed provider IdP.
The UPN may not be the same as the UPN used for any Kerberos setup. You may imagine it to be “user.name@REALM”, but in actual fact it may use the actual domain name, “user.name@corp.net”. Testing and tracing in the HANA system with the auth trace turned on will reveal the UPN to you.
Once a provider is mapped, the user’s HANA account needs updating with their external ID (the UPN).
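
The HANA side of the SAML steps above can be sketched as three SQL statements: create the provider, enable SAML on the user, then map the external identity. The provider name, the subject and issuer values, the user and the UPN below are all made-up examples. Importing the IdP certificate into the HANA trust store is a separate step that is not shown here.

```shell
# Sketch: print the SQL for the SAML provider and the user mapping.
# Every name and DN passed in below is a made-up example.
saml_setup_sql() {
    # $1 = provider name, $2 = certificate subject, $3 = certificate issuer,
    # $4 = HANA database user, $5 = external identity (the UPN)
    printf "CREATE SAML PROVIDER %s WITH SUBJECT '%s' ISSUER '%s';\n" "$1" "$2" "$3"
    printf 'ALTER USER %s ENABLE SAML;\n' "$4"
    printf "ALTER USER %s ADD IDENTITY '%s' FOR SAML PROVIDER %s;\n" "$4" "$5" "$1"
}

saml_setup_sql "PBI_GATEWAY" "CN=pbi-gateway-idp" "CN=pbi-gateway-idp" \
    "JBLOGGS" "joe.bloggs@corp.net"
```

With a single self-signed certificate the subject and issuer are the same, which is why the example passes the same value twice.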

In a High Availability cluster with 2 nodes for the On-Premise Data Gateway, both nodes will need the same files.

A new private key and certificate will need to be generated for the On-Premise Data Gateway. Whilst the Microsoft documentation for SAML setup shows using OpenSSL to create the certificate, it is entirely possible to do this in PowerShell (see my other post here which will save you much hassle 😉 ).

Another step that the Microsoft documentation has, is to create a Certificate Authority key, then create a signing request for a new non-CA key. This is just not required with SAML. A certificate chain is not needed and HANA does not verify the chain.
Instead just create a CA key and certificate (again see my other post here). If you use my linked PowerShell method you don’t even need to manually transfer keys around, just create and import into the Microsoft Certificate Store (for local computer).
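
If you do stay with OpenSSL rather than PowerShell, the single self-signed key and certificate described above could be produced roughly as in the sketch below. This is not the Microsoft procedure and not my PowerShell method; the file names, common name and validity period are placeholders. Getting the result into the Windows certificate store on the gateway machine is a further step that is not shown.

```shell
# Sketch: create one self-signed key and certificate for the gateway IdP.
# File names, common name and validity are placeholders.
make_idp_cert() {
    # $1 = output directory, $2 = common name, $3 = validity in days
    openssl req -x509 -newkey rsa:2048 -nodes \
        -keyout "$1/idp-key.pem" -out "$1/idp-cert.pem" \
        -days "$3" -subj "/CN=$2" 2>/dev/null
}

# Example call (writes two files into the given directory):
# make_idp_cert /secure/location "pbi-gateway-idp" 365
```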

In the Microsoft documentation there are a couple of additional errors/omissions that may catch you out:

The On-Premise Data Gateway configuration file is prefixed with “Microsoft.”. This was missing in the documentation.
It should be: Microsoft.PowerBI.DataMovement.Pipeline.GatewayCore.dll.config
The thumbprint of the certificate should be in lowercase. It is not known if this is actually required, but ad-hoc Google searching revealed some customers were not able to get it to work with an uppercase certificate thumbprint.
When adding the certificate thumbprint to the Gateway config file, the file is XML format.
This means you need to change the closing tag of the “setting” element and add a child “value” element.
Overall it should look like this for the thumbprint:

<setting name="SapHanaSAMLCertThumbprint" serializeAs="String">
<value>the-thumbprint-here</value>
</setting>
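
Since the case of the thumbprint may matter, and a thumbprint copied from a certificate dialog often carries spaces or other stray characters, a small helper to clean it up before pasting it into the config file might look like this. The function name and the sample value are mine and purely illustrative.

```shell
# Sketch: reduce a copied thumbprint to lower-case hex digits only.
normalise_thumbprint() {
    # Delete everything that is not a hex digit, then lower-case the rest.
    printf '%s' "$1" | tr -cd '0-9A-Fa-f' | tr '[:upper:]' '[:lower:]'
    printf '\n'
}

normalise_thumbprint "AB 12 CD 34 EF 56"
```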

Troubleshooting

During the setup process, do not expect it to be straightforward.
From experience, the following areas will cause issues:

Knowledge of the Active Directory KDC servers.
Setting up the Kerberos configuration on the HANA server will need answers from the AD administrators.
Knowledge of the AD domain federation.
Setting up the Kerberos configuration on the HANA server will need answers from the AD administrators.
Knowledge of public key cryptography.
Creation of the IdP SAML keys is tricky and the documentation shows a convoluted method with added confusion.
Lack of accurate documentation.
Some of the Microsoft documentation is not correct or accurate enough.
On-Premise Data Gateway trace log files
These are difficult to get at as they have to be downloaded and un-zipped each time.
The HANA system fails to find the Kerberos config and keytab, with the only resolution being to place them in /etc or perform a full HANA system restart.

My best advice is:

Test the Kerberos setup on the HANA server using the kinit and other tools. If using Kerberos this must work and will report “valid” during the testing of the kvno (key version number).
Use the On-Premise Data Gateway trace logs (if using On-Premise Data Gateway).
Once you are sure that it has selected the IdP certificate and is trying to talk to HANA, then switch to the HANA traces.
Use the HANA authentication trace with the “debug” setting, then check the traces.
This is useful once you know that the On-Premise Data Gateway is actually trying to talk to HANA (if using On-Premise Data Gateway), or if you are using SSO2DB use these traces straight away.
These traces will tell you the decoded UPN and whether HANA has found an appropriate user account mapping (or SAML provider if using SAML).
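
Switching the trace on and off can be scripted as well. The sketch below only prints the statements; the section and key names ('trace', 'authentication') are my assumption of the usual setting, so check them against the SAP notes in the references before running anything. The hdbuserstore key ADMINKEY is hypothetical.

```shell
# Sketch: print the SQL to raise or remove the HANA authentication trace.
# The ini section and key names are an assumption, verify them first.
auth_trace_sql() {
    # $1 = "on" to set the trace level to debug, anything else to remove it
    if [ "$1" = "on" ]; then
        printf '%s\n' "ALTER SYSTEM ALTER CONFIGURATION ('indexserver.ini','SYSTEM') SET ('trace','authentication') = 'debug' WITH RECONFIGURE;"
    else
        printf '%s\n' "ALTER SYSTEM ALTER CONFIGURATION ('indexserver.ini','SYSTEM') UNSET ('trace','authentication') WITH RECONFIGURE;"
    fi
}

auth_trace_sql on
auth_trace_sql off

# To apply, the output could be piped into hdbsql, for example:
# auth_trace_sql on | hdbsql -U ADMINKEY
```

Remember to remove the debug setting once you have the UPN and mapping details, as debug traces grow quickly.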

Thanks for reading and good luck!

References:

https://learn.microsoft.com/en-us/data-integration/gateway/service-gateway-onprem-indepth

https://learn.microsoft.com/en-us/power-bi/guidance/whitepaper-powerbi-security#vnet-connectivity-preview--coming-soon

SAP Note 2093286 – Migration from OpenSSL to CommonCryptoLib

SAP Note 2303807 – SAP HANA Smart Data Access: SSO with Kerberos and Microsoft Windows Active Directory

SAP Note 1837331 – HowTo configure Kerberos SSO to SAP HANA DB using Microsoft Windows Active Directory

https://learn.microsoft.com/en-us/power-bi/connect-data/service-gateway-sso-kerberos-sap-hana

https://learn.microsoft.com/en-us/power-bi/connect-data/service-gateway-sso-saml

https://en.wikipedia.org/wiki/Kerberos_(protocol)

https://en.wikipedia.org/wiki/Security_Assertion_Markup_Language

https://help.sap.com/docs/SAP_HANA_PLATFORM/b3ee5778bc2e4a089d3299b82ec762a7/1885fad82df943c2a1974f5da0eed66d.html?version=2.0.03&locale=1885fad82df943c2a1974f5da0eed66d.html

https://help.sap.com/docs/SAP_HANA_PLATFORM/6b94445c94ae495c83a19646e7c3fd56/c786f2cfd976101493dfdf14cf9bcfb1.html?version=2.0.03

https://help.sap.com/docs/SAP_HANA_PLATFORM/b3ee5778bc2e4a089d3299b82ec762a7/db6db355bb571014b56eb25057daec5f.html?version=2.0.03&locale=1885fad82df943c2a1974f5da0eed66d.html

https://social.technet.microsoft.com/wiki/contents/articles/36470.active-directory-using-kerberos-keytabs-to-integrate-non-windows-systems.aspx

Using Single Sign-on with the Power BI Gateway

https://blogs.sap.com/2020/03/22/sap-bi-platform-saml-sso-to-hana-database/

Korn Shell Calling SAP HANA – Hello Hello!

“So you’ve automated SAP HANA stuff huh?
What tools do you use?
Python? Chef? Puppet? Ansible? DSC/Powershell?”

No. I use Korn shell. ¯\_(ツ)_/¯

Me, Trying to Support Korn…

I find Korn shell is a lowest common denominator across many Linux/Unix systems, and also extremely simple to support.
It does exactly what I need.

For me it’s readable, fairly self-explanatory, easily editable and dependable.

…and Failing?

But I do know what you’re saying:

there’s no built in version control
it’s not easy to debug
it’s definitely not very cool 🤓
you can’t easily do offline development
my Grandad told me about something called Korn shell ?

Have you Considered

If you have no budget for tools, then you can start automating by using what you already have. Why not.
Don’t wait for the right tool, start somewhere and only then will you understand what works for you and doesn’t work for you.

Sometimes it’s not about budget. There are companies out there that do not allow certain sets of utilities and tools to be installed on servers, because they can provide too much help to hackers. Guess what they do allow? Yep, Korn shell is allowed.

Let’s Get Started

Here’s a sample function to run some SQL in HANA by calling the hdbsql (delivered with the HANA client) and return the output:

#!/bin/ksh
function run_sql {
   typeset -i l_inst="${1}"
   typeset l_db="${2}"
   typeset l_user="${3}"
   typeset l_pw="${4}"
   typeset l_col_sep="${5}"
   typeset l_sql="${6}"
   typeset l_output=""
   typeset l_auth=""
   typeset -i l_ret=0

   # If a password was supplied, log on with user name and password (-u/-p).
   # If it is blank, treat l_user as an hdbuserstore key (-U).
   if [[ -n "${l_pw}" && "${l_pw}" != " " ]] ; then
      l_auth="-u ${l_user} -p ${l_pw}"
   else
      l_auth="-U ${l_user}"
   fi

   # Feed the SQL to hdbsql through a here-document and capture the output.
   # The EOF1 terminator must sit alone at the start of its own line.
   l_output="$(hdbsql -quiet -x -z -a -C -j -i ${l_inst} ${l_auth} -d ${l_db} -F "${l_col_sep}" <<EOF1 2>>/tmp/some-script.log
${l_sql};
quit
EOF1
)"
   # Capture the hdbsql return code straight after the command substitution.
   l_ret=$?

   # For HANA 1.0 we need to trim the first 6 lines of output, because it doesn't understand "-quiet".
   #if [[ "$(check_major_version)" -lt 2 ]] ; then
   #   l_output="$(print "${l_output}" | tail -n +7)"
   #fi

   print "${l_output}"
   return $l_ret
}

To call the above function, we then just do (in the same script):

l_result="$(run_sql "10" "SystemDB" "SYSTEM" "SYSTEMPW" " " "ALTER SYSTEM ALTER CONFIGURATION ('global.ini', 'SYSTEM') SET ('persistence','log_mode')='overwrite' WITH RECONFIGURE")"

We are passing in the HANA instance number 10, you can use whatever your instance number is.

We can check the function return code (did the function return cleanly) like so:

if [[ $? -ne 0 ]] ; then 
   print "FAILED" 
   exit 1; 
fi

Here’s what we’re passing in our call to hdbsql (you can find this output by calling “hdbsql –help”):

-i instance number of the database engine
-d name of the database to connect
-U use credentials from user store
-u user name to connect
-p password to connect
-x run quietly (no messages, only query output)
-quiet Do not print the welcome screen
-F use <separator> as the field separator (default: ‘,’)
-C suppress escape output format
-j switch the page by page scroll output off
-Q show each column on a separate line
-a do not print any header for SELECT commands

If you wanted to return a value, then the “l_result” variable would contain the output.

Ideally, the function we wrote would be put into a chunk of modular code that could be referenced time and again from other Korn shell scripts.

You would also be looking to create some sets of standard functions for logging of messages to help with debugging. You can make it as complex as you wish.
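
As a starting point for those logging functions, here is a very small sketch. The function name, the log levels and the default log file (the same /tmp/some-script.log used by the function above) are just choices for this example.

```shell
# Sketch: a minimal logging helper for automation scripts.
# LOG_FILE can be overridden from the environment before the script runs.
LOG_FILE="${LOG_FILE:-/tmp/some-script.log}"

log_msg() {
    # $1 = level (e.g. INFO, WARN, ERROR), remaining arguments = message text
    log_level="$1"
    shift
    printf '%s [%s] %s\n' "$(date '+%Y-%m-%d %H:%M:%S')" "$log_level" "$*" >> "$LOG_FILE"
}

log_msg INFO "starting HANA configuration run"
```

Each call appends one timestamped line, so the log can be followed with tail while a script is running.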

In the call to “run_sql” we pass a column separator.
I usually like to use a “|” (pipe), then parse the returned values using the “awk” utility like so:

l_result="$(run_sql "10" "SystemDB" "SYSTEM" "SYSTEMPW" "|" "SELECT file_name,layer_name,section,key, value FROM SYS.M_INIFILE_CONTENTS WHERE layer_name='SYSTEM'")"

echo "${l_result}" | /bin/awk -F'|' '{ print $1" "$2" "$3 }'

When we execute the script we get the first 3 columns like so:

daemon.ini SYSTEM daemon 
diserver.ini SYSTEM communication 
global.ini SYSTEM auditing 
configuration global.ini SYSTEM 
backup global.ini SYSTEM
...

Obviously we don’t really embed the password in the script; it gets passed in.
You can either pass it in using the command line parameter method (./myscript.ksh someparam) or via the Linux environment variables (export myparam=something; ./myscript.ksh).
If you want, you can even pipe it in (echo "myparam" | ./myscript.ksh) and “read” it into a variable.
You can also take a look at the “expect” utility to automate command line input.
Also, take a look at the “hdbuserstore” utility to store credentials for automation scripts (remember to set appropriately restrictive privileges on these database users).
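
As a rough sketch, the three ways of passing the password in could be combined into one helper. The function name “get_password” and the environment variable “HDB_PASSWORD” are my own inventions for this example, not anything HANA defines:

```shell
#!/bin/sh
# get_password: hypothetical helper that resolves the password from
#   1) the first argument, 2) the HDB_PASSWORD environment variable,
#   3) a single line read from standard input.
get_password() {
    if [ -n "${1:-}" ]; then
        printf '%s\n' "$1"
    elif [ -n "${HDB_PASSWORD:-}" ]; then
        printf '%s\n' "${HDB_PASSWORD}"
    else
        # -r keeps backslashes intact; IFS= keeps leading/trailing spaces.
        IFS= read -r l_pw || return 1
        printf '%s\n' "${l_pw}"
    fi
}

# Example usage (the value is obviously made up):
echo "not-a-real-password" | get_password
```

Keep in mind that a password given as a command line parameter is usually visible to other users in the process list, which is one reason to prefer the pipe or “hdbuserstore”.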

That’s all there is to it for you to get started using Korn shell to call HANA.

Cookies, SAP Analytics Cloud and CORS in Netweaver & HANA

Back in 2019 (now designated as 2019AC – Anno-Covid19), I wrote a post explaining in simple terms what CORS is and how it can affect a SAP landscape.
In that post I showed a simple “on-premise” setup using Fiori, a back-end system and how a Web Dispatcher can help alleviate CORS issues without needing too much complexity.
This post is about a recent CORS related issue that impacts access to back-end SAP data repositories.

Back To The Future

If we hit the “Fast-Forward” button to 2020MC (Mid-Covid19), CORS is now an extremely important technical setup to enable Web Browser based user interfaces to be served from Internet based SAP SaaS services (like SAP Analytics Cloud) and communicate with back-end on-premise/private data sources such as SAP BW systems or SAP HANA databases.

We see that CORS is going to become ever more important going forward, since Web Browser based user interfaces will become more abundant (due to the increase of SaaS products) for these types of back-end data access. The old world of installing a software application on-premise takes too much time and effort to keep up with changing technology.
Using SaaS applications as user interfaces to on-premise data allows a far more agile delivery of user functionality.

The next generation of Web Interfaces will be capable of processing ever larger data sets, with richer capabilities and more in-built intelligence. We’re talking about the Web Browser being a central hub of cross-connected Web Based services.
Imagine, one “web application” that needs a connection to a SaaS product that provides the analytical interface and version management, a connection to one or more back-end data repositories, a connection to a separate SaaS product for AI data analysis and pattern matching (deep insights), a connection to a separate SaaS product for content management (publishing), a connection to a separate SaaS product for marketing and customer engagement.

All of that, from one central web “origin” will mean CORS will become critical to prevent unwanted connections and data leaks. The Web Browser is already the target of many cyber security exploits, therefore staying secure is extremely important, but security is always at the expense of functionality.

IETF Is On It

The Internet Engineering Task Force already has this in hand. It published the Web Origin Concept on which CORS is built (tools.ietf.org/html/rfc6454).
The Web Origin Concept is constantly evolving to provide features for usability and also security. Way back in 2016 an update to RFC 6265 was proposed, to enhance the HTTP state management mechanism, which is commonly known to you and me as “cookies”.

This amendment (the RFC details are here: tools.ietf.org/html/draft-ietf-httpbis-cookie-same-site-00) was the SameSite attribute that can be set for cookies.
Even in this RFC, you can see that it actually attributes the idea of “samedomain-cookies” back to Mozilla, in 2011. So this is not really a “new” security feature, it’s a long time coming!

The Deal With SAC

The “problem” that has brought me back around to CORS, is recent experience with a CORS issue and SAP Analytics Cloud (SAC).
The issue led me to a blog post by Dong Pan of SAP Canada in Feb 2020 and a recent blog post by Ian Henry, also of SAP in Aug 2020.

Dong Pan wrote quite a long technical blog post on how to fix or work-around the full introduction of the SameSite cookie attribute in Google Chrome version 80 when using SAP Analytics Cloud (SAC).

Ian Henry’s post is also based on the same set of solutions that Dong Pan wrote about, but his issue was accessing a backend HANA XS Engine via Web Dispatcher.

The problem in both cases is that SAP Analytics Cloud (SAC) uses the Web Browser as a middleman to create a “Live Connection” back to an “on-premise” data repository (such as SAP BW or SAP S/4HANA), but the back-end SAP Netweaver/SAP ABAP Platform stack/HANA XS engine, that hosts the “on-premise” data repository does not apply the “SameSite” attribute to cookies that it creates.

You can read Dong Pan’s blog post here: www.sapanalytics.cloud/direct-live-connections-in-sap-analytics-cloud-and-samesite-cookies/
You can read Ian Henry’s blog post here: https://blogs.sap.com/2020/08/26/how-to-fix-google-chrome-samesite-cookie-issue-with-sac-and-hana/

By not applying the “SameSite” attribute to the cookie, Google Chrome browsers of version 80+ will not allow SAC to establish a full session to the back-end system.
You will see an HTTP 400 “session expired” error when viewing the HTTP browser traffic, because SAC tries to establish the connection to the back-end, but no back-end system cookies are allowed to be visible to SAC. Therefore SAC thinks you have no session to the back-end.

How to See the Problem

You will need to be proficient at tracing HTTP requests to be able to capture the problem, but it looks like the following in the HTTP response from the back-end system:

You will see (in Google Chrome) two yellow warning triangles on the “set-cookie” headers in the response from the back-end during the call to “GetServerInfo” to establish the actual connection.
The call is the GET for URL “/sap/bw/ina/GetServerInfo?sap-client=xxx&sap-language=EN&sap-sessionviaurl=X“, with the sap-sessionviaurl in the query-string being the key part.
The text when you hover over the yellow triangle is: “This Set-Cookie didn’t specify a “SameSite” attribute and was defaulted to “SameSite=Lax,” and was blocked because it came from a cross-site response which was not the response to a top-level navigation. The Set-Cookie had to have been set with “SameSite=None” to enable cross-site usage.“.
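
If you have saved the response headers to a file (from the browser trace, for example), the same check can be roughed out in shell. This is a minimal sketch: the function name and the sample header values are made up for illustration, and it simply lists the cookies that were set without any “SameSite” attribute:

```shell
#!/bin/sh
# cookies_missing_samesite: reads HTTP response headers on stdin and prints
# the name of every cookie whose Set-Cookie header has no SameSite attribute.
cookies_missing_samesite() {
    awk '
        tolower($0) ~ /^set-cookie:/ && tolower($0) !~ /samesite=/ {
            line = $0
            sub(/^[^:]*:[ \t]*/, "", line)   # drop the header name
            sub(/=.*/, "", line)             # keep only the cookie name
            print line
        }
    '
}

# Hypothetical sample headers: only the first cookie lacks SameSite.
printf '%s\n' \
    'Set-Cookie: sap-usercontext=sap-client=100; path=/' \
    'Set-Cookie: EXAMPLE_SESSION=abc123; path=/; SameSite=None; Secure' \
    'Content-Type: application/json' | cookies_missing_samesite
```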

The Fix(es)

SAP Netweaver (or SAP ABAP Platform) needs some code fixes to add the required cookie attribute “SameSite”.

A workaround (it is a workaround) is possible by using the rewrite module capability of the Internet Communication Management (ICM) or using a rewrite rule in a Web Dispatcher, to re-write the responses and include a generic “SameSite” attribute on each cookie.
This is a workaround for a reason, because using the rewrite method causes unnecessary extra work in the ICM (or Web Dispatcher) for every request (matched or not matched) by the rewrite engine.
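
To make the effect of the workaround concrete, here is a small shell sketch of the transformation only. It is *not* ICM or Web Dispatcher rewrite syntax (see the linked blog posts for the real rules); the function name and sample headers are made up. It appends “SameSite=None; Secure” to any “Set-Cookie” header that has no “SameSite” attribute, which is what Chrome 80+ expects for cross-site usage:

```shell
#!/bin/sh
# add_samesite: reads HTTP response headers on stdin and appends
# "; SameSite=None; Secure" to Set-Cookie headers lacking a SameSite attribute.
# All other lines are passed through unchanged.
add_samesite() {
    awk '
        tolower($0) ~ /^set-cookie:/ && tolower($0) !~ /samesite=/ {
            sub(/[ \t\r]*$/, "")              # trim trailing whitespace / CR
            print $0 "; SameSite=None; Secure"
            next
        }
        { print }
    '
}

# Hypothetical sample response headers.
printf '%s\n' \
    'Set-Cookie: sap-usercontext=sap-client=100; path=/' \
    'Content-Type: application/json' | add_samesite
```

Notice that every line has to be inspected, matched or not, which is the same extra work described above for the rewrite engine.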

It’s always better (more secure, more efficient) to apply the code fix to Netweaver (or ABAP Platform) so the “SameSite” attribute is added at the point of the cookie creation.
For HANA XS, a patch will need to be applied (if it ever gets fixed in XS, since XS is soon to be deprecated).
With the workaround, we are forcing a setting onto cookies outside of the creation process of those cookies.

Don’t get me wrong, I’m not saying that the workaround should not be used. In some cases it will be the only way to fix this problem in some older SAP systems. I’m just pointing out that there are consequences and it’s not ideal.

Dong Pan and Ian Henry have done a good job of providing options for fixing this in a way that should work for 99% of cases.

Is There a Pretty Picture?

This is something I always find useful when I try and work something through in my mind.
I’ve adjusted my original CORS diagram to include an overview of how I think this “SameSite” attribute issue can be imagined.
Hopefully it will help.

We see the following architecture setup with SAC and its domain “sapanalytics.cloud”, issuing CORS requests to back-end system BE2, which sits in domain “corp.net”:

Using the above picture for reference, we can now show where the “SameSite” issue occurs in the processing of the “Resource Response” when it comes back to the browser from the BE2 back-end system:

The blocking, by the Chrome Web browser, of the cookies set by the back-end system in domain “corp.net”, means that from the point of view of SAC, no session was established.
There are a couple more “Request”, “Response” exchanges, before the usual HTTP Authorization header is sent from SAC, but at that point it’s really too late as the returned SAP SSO cookie will also be blocked.

At this point you could see a number of different error messages in SAC, but in the Chrome debugging you will see no HTTP errors because the actual HTTP request/response mechanism is working and HTTP content is being returned. It’s just that SAC will know it does not have a session established, because it will not be finding the usual cookies that it would expect from a successfully established session.

Hopefully I’ve helped explain what was already a highly technical topic, in a more visual way and helped convey the problem and the solution.

Useful Links:

The Web Origin Concept – RFC6454 tools.ietf.org/html/rfc6454
RFC 6265 amendment draft: tools.ietf.org/html/draft-ietf-httpbis-rfc6265bis-03
CORS In a SAP Context – www.it-implementor.co.uk/2019/06/cors-in-sap-netweaver-landscape.html
SAP Analytics Cloud SameSite Cookie Issue – www.sapanalytics.cloud/direct-live-connections-in-sap-analytics-cloud-and-samesite-cookies/
SameSite Cookies Issue with SAC and HANA – blogs.sap.com/2020/08/26/how-to-fix-google-chrome-samesite-cookie-issue-with-sac-and-hana

Analysing & Reducing HANA Backup Catalog Records

In honour of DBA Appreciation Day today 3rd July, I’ve written a small piece on a menial but crucial task that HANA database administrators may wish to check. It’s very easy to overlook but the impact can be quite amazing.

HANA Transaction Logging

In “normal” log mode (for recoverability), the HANA database, like Oracle, has an automatic transaction log backup process, which is responsible for backing up transaction log segments so that the HANA log volume disk space can be re-used by new transactions.
No free disk space in the HANA log volume means the database will hang until free space becomes available.

It is strongly recommended by SAP to have your HANA database in log mode “normal”, since this offers the point-in-time recovery capability through the use of the transaction log backups.

By default a transaction log backup will be triggered automatically by HANA every time a log segment becomes full or if the timeout for an individual service is hit, whichever of those is sooner.
This is known as “immediate” interval mode.

I’m not going to go into the differences of the various interval options and the pros and cons of each since this is highly scenario specific. A lot of companies have small HANA databases and are quite happy with the default options. Some companies have high throughput, super low latency requirements, and would be tuning the log backup process for maximum throughput, while other companies want minimal data-loss and adjust the parameters to ensure that transactions are backed up off the machine as soon as possible.

The SITREP

In the specific situation that I encountered, I have a small HANA database of around 200GB in memory, serving an SAP Solution Manager 7.2 system (so it has 2x tenant databases plus the SystemDB).

The settings are such that all databases run in log_mode “normal” with consolidated log backups enabled in “immediate” mode and a max_log_backup_size of 16GB (the default, but specified).

All backups are written to a specific disk area, before being pushed off the VM to an Azure Storage Account.

The Issue

I noticed that the local disk area was becoming quite full where the HANA database backups are written. Out of context you might have said it’s normal for an increase of activity in the system, but I know that this system is not doing anything at all (it’s a test system for testing Solution Manager patches and nobody was using it).

What Was Causing the Disk Usage?

Looking at the disk backup file system, I could easily see at the O/S level, that the HANA database log backups were the reason for the extra space usage.
Narrowing that down even further, I could be specific enough to see that the SYSTEMDB was to blame.

The SYSTEMDB in a very lightly used HANA database should not be transacting enough to have a day-to-day noticeable increase in log backup disk usage.
This was no ordinary increase!
I was looking at a total HANA database size on disk of ~120GB (SYSTEMDB plus 2x TenantDBs), and yet I was seeing ~200GB of transaction log backups per day from just the SYSTEMDB.

Drilling down further into the log backup directory for the SYSTEMDB, I could see the name of the log backup files and their sizes.
I was looking at log backup files of 2.8GB in size every ~10 to ~15 minutes.
The files that were biggest were….

… log_backup_0_0_0_0.<unix epoch time>
That’s right, the backup catalog backups!
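
If you want to repeat that drill-down at the O/S level, a sketch like the following counts the catalog backup files in a directory and adds up their sizes. The function name is my own, and the directory you pass in is whatever your SYSTEMDB log backup location happens to be:

```shell
#!/bin/sh
# catalog_backup_bytes: prints "<file count> <total bytes>" for the
# backup catalog backups (log_backup_0_0_0_0.*) found under directory $1.
catalog_backup_bytes() {
    find "$1" -type f -name 'log_backup_0_0_0_0.*' -exec wc -c {} \; |
        awk '{ total += $1; files++ } END { printf "%d %d\n", files, total }'
}

# Example usage against the current directory:
catalog_backup_bytes .
```

Running it before and after any housekeeping gives a quick measure of how much of the log backup area is taken up by the catalog backups.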

Whenever HANA writes a backup, whether it is a complete data backup, or a transaction log backup, it also writes a backup of the backup catalog.
This is extremely useful if you have to restore a system and need to know about the backups that have taken place.
By default, the backup catalog backups are accumulated, which means that HANA doesn’t need to write out multiple backups of the backup catalog for each log backup (remember, we have 2x tenantDBs).

Why Were Catalog Backup Files So Big?

The catalog backups include the entire backup catalog.
This means every prior backup is in the backup file, so by default the backup catalog backup file will increase in size at each backup, unless you do some housekeeping of the backup catalog records.

My task was to write some SQL to check the backup catalog to see how many backup catalog records existed, for what type of backups, in which database and how old they were.

I came up with the following SQL:

--- Breakdown of age of backup records in months, by type of record.
SELECT smbc.DATABASE_NAME,
smbc.ENTRY_TYPE_NAME,
MONTHS_BETWEEN(smbc.SYS_START_TIME, CURRENT_DATE) as AGE_MONTHS,
COUNT(MONTHS_BETWEEN(smbc.SYS_START_TIME, CURRENT_DATE)) RECORDS,
t_smbc.YOUNGEST_BACKUP_ID
FROM	"SYS_DATABASES"."M_BACKUP_CATALOG" AS smbc,
		(SELECT xmbc.DATABASE_NAME, 
				xmbc.ENTRY_TYPE_NAME, 
				MONTHS_BETWEEN(xmbc.SYS_START_TIME, CURRENT_DATE) as AGE_MONTHS, 
				max (xmbc.BACKUP_ID) as YOUNGEST_BACKUP_ID 
				FROM "SYS_DATABASES"."M_BACKUP_CATALOG" xmbc 
				GROUP BY xmbc.DATABASE_NAME, 
						xmbc.ENTRY_TYPE_NAME, 
						MONTHS_BETWEEN(xmbc.SYS_START_TIME, CURRENT_DATE) 
		) as t_smbc 
WHERE t_smbc.DATABASE_NAME = smbc.DATABASE_NAME 
AND t_smbc.ENTRY_TYPE_NAME = smbc.ENTRY_TYPE_NAME 
AND t_smbc.AGE_MONTHS = MONTHS_BETWEEN(smbc.SYS_START_TIME, CURRENT_DATE) 
GROUP BY 	smbc.DATABASE_NAME, 
			smbc.ENTRY_TYPE_NAME, 
			MONTHS_BETWEEN(smbc.SYS_START_TIME, CURRENT_DATE), 
			t_smbc.YOUNGEST_BACKUP_ID 
ORDER BY DATABASE_NAME, 
		AGE_MONTHS DESC,
		RECORDS

The key points to note are:

I use the SYS_DATABASES.M_BACKUP_CATALOG view in the SYSTEMDB to see across all databases in the HANA system instead of checking in each one.
For each database, the SQL outputs:
– type of backup (complete or log).
– age in months of the backup.
– number of backup records in that age group.
– youngest backup id for that age group (so I can do some cleanup).

An example execution is:

(NOTE: I made a mistake with the last column name, it’s correct in the SQL now – YOUNGEST_BACKUP_ID)

You can see that the SQL execution took only 3.8 seconds.
Based on my output, I could immediately see one problem, I had backup records from 6 months ago in the SYSTEMDB!

All of these records would be backed up on every transaction log backup.
For whatever reason, the backup process was not able to honour the “BACKUP CATALOG DELETE” which was meant to keep the catalog to less than 1 month of records.
I still cannot adequately explain why this had failed. The same process is used on other HANA databases and none had exhibited the same issue.

I can only presume something was preventing the deletion somehow, since in the next few steps you will see that I was able to use the exact same process with no reported issues.
For reference this is HANA 2.0 SPS04 rev47, patched all the way from SPS02 rev23.

Resolving the Issue

How did I resolve the issue? I simply re-ran the catalog deletion that was already running after each backup.
I was able to use the backup ID from the YOUNGEST_BACKUP_ID column to reduce the backup records.

In the SYSTEMDB:

BACKUP CATALOG DELETE ALL BEFORE BACKUP_ID xxxxxxxx

Then for each TenantDB (still in the SYSTEMDB):

BACKUP CATALOG DELETE FOR <TENANTDB> ALL BEFORE BACKUP_ID xxxxxxxx
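
If you end up scripting this, a small helper can build the two statement variants and refuse anything that is not a numeric backup ID. This is only a sketch: the function name and the tenant name “TD1” are made up, and the resulting statement still has to be run through hdbsql (or whatever SQL client you use) against the SYSTEMDB:

```shell
#!/bin/sh
# catalog_delete_sql: prints the BACKUP CATALOG DELETE statement for
#   $1 = backup ID (digits only), $2 = optional tenant database name.
# Returns 1 without printing a statement if the backup ID is not numeric.
catalog_delete_sql() {
    case "${1:-}" in
        ''|*[!0-9]*)
            echo "backup ID must be numeric" >&2
            return 1
            ;;
    esac
    if [ -n "${2:-}" ]; then
        printf 'BACKUP CATALOG DELETE FOR %s ALL BEFORE BACKUP_ID %s\n' "$2" "$1"
    else
        printf 'BACKUP CATALOG DELETE ALL BEFORE BACKUP_ID %s\n' "$1"
    fi
}

# Example usage with a hypothetical tenant name and backup ID:
catalog_delete_sql 1590747286179 TD1
```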

At the end of the first DELETE execution *in the first Tenant*, I re-ran the initial SQL query to check and this was the output:

We now only have 1 backup record, which was the youngest record in that age group for that first tenant database (compare to screenshot of first execution of the SQL query with backup id 1,590,747,286,179).
Crucially we have far fewer log backups for that tenant. We’ve gone down from 2247 to 495.
Nice!
I then progressed to do the delete in the SYSTEMDB and other TenantDB of this HANA system.

Checking the Results

As a final check, I was able to compare the log backup file sizes:

The catalog backup in file “log_backup_0_0_0_0.nnnnnnn” at 09:16 is before the cleanup and is 2.7GB in size.
Whereas the catalog backup in “log_backup_0_0_0_0.nnnnnnn” at 09:29 is after the cleanup and is only 76KB in size.
An absolutely massive reduction!

How do we know that file “log_backup_0_0_0_0.nnnnnnn” is a catalog backup?
Because we can check using the Linux “strings” command to see the file string contents.
Way further down the listing it says it is a catalog backup, but I thought it was more interesting to see the “MAGIC” of Berlin:

UPDATE: August 2020 – SAP note 2962726 has been released which contains some standard SQL to help remove failed backup entries from the catalog.

Summary

Check your HANA backup catalog backup sizes.
Ensure you have alerting on file systems (if doing backups to disk).
Double check the backup catalog record age.
Give tons of freebies and thanks to your DBAs on DBA Appreciation Day!

Useful Links

Enable and Disable Automatic Log Backup
https://help.sap.com/viewer/6b94445c94ae495c83a19646e7c3fd56/2.0.05/en-US/241c0f0020b2492fb93a69a40b1b1b9a.html

Accumulated Backups of the Backup Catalog
https://help.sap.com/viewer/6b94445c94ae495c83a19646e7c3fd56/2.0.05/en-US/3def15378b954aac85f2b93bb3f85a49.html

Log Modes
https://help.sap.com/viewer/6b94445c94ae495c83a19646e7c3fd56/2.0.05/en-US/c486a0a3bb571014ab46c0633224f02f.html

Consolidated Log Backups
https://help.sap.com/viewer/6b94445c94ae495c83a19646e7c3fd56/2.0.05/en-US/653b5c6d5f9d41808011a5bd0fac6709.html

Azure Disk Cache Settings for an SAP Database on Linux

One of your go-live tasks once you have built a VM in Azure, should be to ensure that the Azure disk cache settings on the Linux VM data disks, are set correctly in accordance with the Microsoft recommended settings.
In this post I explain the disk cache options and how they apply to SAP and especially to SAP databases such as SAP ASE and SAP HANA, to ensure you get optimum performance.

What Are the Azure Disk Cache Settings?

In Microsoft Azure you can configure different disk cache settings on data disks that are attached to a VM.
NOTE: You do not need to consider changing the O/S root disk cache settings, as by default they are applied as per the Azure recommendations.

Only specific VMs and specific disks (Standard or Premium Storage) have the ability to use caching.
If you use Azure Standard storage, the cache is provided by local disks on the physical server hosting your Linux VM.
If you use Azure Premium storage, the cache is provided by a combination of RAM and local SSD on the physical server hosting your Linux VM.

There are 3 different Azure disk cache settings:

None
ReadOnly (or “read-only”)
ReadWrite (or “read/write”)

The cache settings can influence the performance and also the consistency of the data written to the Azure storage service where your data disks are stored.

Cache Setting: None

By specifying “None” as the cache setting, no caching is used and a write operation at the VM O/S level is confirmed as completed once the data is written to the storage service.
All read operations for data not already in the VM O/S file system cache, will be read from the storage service.

Cache Setting: ReadOnly

By specifying “ReadOnly” as the cache setting, a write operation at the VM O/S level is confirmed as completed once the data is written to the storage service.
All read operations for data not already in the VM O/S file system cache, will be read from the read cache on the underlying physical machine, before being read from the storage service.

Cache Setting: ReadWrite

By specifying “ReadWrite” as the cache setting, a write operation at the VM O/S level is confirmed as completed once the data is written to the cache on the underlying physical machine.
All read operations for data not already in the VM O/S file system cache, will be read from the read cache on the underlying physical machine, before being read from the storage service.

Where Do We Configure the Disk Cache Settings?

The disk cache settings are configured in Azure against the VM (in the Disks settings), since the disk cache is both physical host and VM series dependent. It is *not* configured against the disk resource itself, as explained in my previous blog post: Listing Azure VM DataDisks and Cache Settings Using Azure Portal JMESPATH & Bash

Any Recommendations for Disk Cache Settings?

There are specific recommendations for Azure disk cache settings, especially when running SAP and especially when running databases like SAP ASE or SAP HANA.

In general, the rules are:

Disk Usage	Azure Disk Cache Setting
Root O/S disk (/)	ReadWrite – ALWAYS!
HANA Shared	ReadOnly
ASE Home (/sybase/<SID>)	ReadOnly
Database Data	HANA=None, ASE=ReadOnly
Database Log	None

The above settings for SAP ASE have been obtained from SAP note 2367194 (SQL Server is same as ASE) and from the general deployment guide here: https://docs.microsoft.com/en-us/azure/virtual-machines/workloads/sap/dbms_guide_general
The use of write caching on the ASE home is optional, you could choose ReadOnly, it would help protect the ASE config file in a very specific scenario. It is envisaged that using ASE 16.0 with SRS/HADR you would have a separate data disk for the Replication Server data (I’ll talk about this in another post).

The above settings for HANA have been taken from the updated guide here: https://docs.microsoft.com/en-us/azure/virtual-machines/workloads/sap/hana-vm-operations-storage which is designed to meet the KPIs mentioned in SAP note 2762990.
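
For a landscape validation script, the table above can be encoded as a simple lookup. The sketch below is just one way to do it: the function name and the role keywords (root, hana_shared, ase_home, data, log) are my own labels, not Azure or SAP terms, and the values are copied straight from the table:

```shell
#!/bin/sh
# recommended_cache: prints the Azure disk cache setting from the table above.
#   $1 = disk role: root | hana_shared | ase_home | data | log
#   $2 = database type: hana | ase (only relevant for the "data" role)
# Returns 1 for a combination that is not in the table.
recommended_cache() {
    case "${1:-}" in
        root)        echo "ReadWrite" ;;
        hana_shared) echo "ReadOnly" ;;
        ase_home)    echo "ReadOnly" ;;
        log)         echo "None" ;;
        data)
            case "${2:-}" in
                hana) echo "None" ;;
                ase)  echo "ReadOnly" ;;
                *)    return 1 ;;
            esac
            ;;
        *) return 1 ;;
    esac
}

# Example usage:
recommended_cache data hana
```

The output of this lookup can then be compared with the actual cache setting reported for each disk.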

The reason for not using a write cache every time, is because an issue at the physical host level, affecting the cache, could cause the application (e.g database) to think it has committed data, when it actually isn’t written to disk. This is not good for databases, especially if the issue affects the transaction/redo log area. Data loss could occur.

It’s worth noting that this cache “issue” has always been true of every caching technology ever created, on which databases run. Storage tech vendors try to mitigate this by putting batteries into the storage appliances, but since the write cache in Azure is at the physical host level, there’s just no guarantee that when the VM O/S thinks the write operation has committed to disk, that it has actually been written to disk.

How About Write Accelerator?

There are specific Azure VM series (currently the M-series) that support something known as “Write Accelerator”.
This is an extra VM level setting for Premium Storage disks attached to M-series VMs.

Enabling the Write Accelerator setting is a requirement by Microsoft for production SAP HANA transaction log disks on M-Series VMs. This setting enables the Azure VM to meet the SAP HANA key performance indicators in note 2762990. Azure Write Accelerator is designed to provide lower latency write times on Premium Storage.

You should ensure that the Write Accelerator setting is enabled where appropriate, for your HANA database transaction log disks. You can check if it is enabled following my previous blog post: Listing Azure VM DataDisks and Cache Settings Using Azure Portal JMESPATH & Bash

I’ve tried my best to find more detailed information on how the Write Accelerator feature is actually provided, but unfortunately it seems very elusive. Robert Boban (of Microsoft) commented on a LinkedIn post here: “It is special caching impl. for M-Series VM to fulfill SAP HANA req. for <1ms latency between VM and storage layer.“.

Check the IOPS

Once you have configured your disks and the cache settings, you should ensure that you test the IOPS achieved using the Microsoft recommended process.
You can follow similar steps as my previous post: Recreating SAP ASE Database I/O Workload using Fio on Azure

As mentioned in other places in the Microsoft documentation and SAP notes such as 2367194, you need to ensure that you choose the correct size and series of VM to ensure that you align the required VM maximum IOPS with the intended amount of data disks and their potential IOPS maximum. Otherwise you could hit the VM max IOPS before touching the disk IOPS maximum.
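
The arithmetic behind that check is simple enough to sketch in shell. The function name is made up and the IOPS figures in the example are invented round numbers, not real Azure VM or disk limits, so look up the actual values for your VM size and disk types:

```shell
#!/bin/sh
# check_vm_iops: compares the VM maximum IOPS with the sum of the disk maximums.
#   $1 = VM maximum IOPS, remaining arguments = maximum IOPS of each data disk
check_vm_iops() {
    l_vm_max="$1"
    shift
    l_total=0
    for l_disk in "$@"; do
        l_total=$((l_total + l_disk))
    done
    if [ "${l_total}" -gt "${l_vm_max}" ]; then
        # The disks could deliver more than the VM allows: the VM is the limit.
        echo "VM_LIMITED ${l_total} > ${l_vm_max}"
    else
        echo "OK ${l_total} <= ${l_vm_max}"
    fi
}

# Example usage with hypothetical numbers: one VM cap, two data disks.
check_vm_iops 6400 5000 5000
```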

Enable Accelerated Networking

Since the storage is itself connected to your VM via the network, you should ensure that Accelerated Networking is enabled in your VM’s Network Settings:

Checking Cache Settings Directly on the VM

As per my previous post Checking Azure Disk Cache Settings on a Linux VM in Shell, you can actually check the Azure disk cache settings on the VM itself. You can do it manually, or write a script (better option for whole landscape validation).

Summary:

I discussed the two types of storage (standard or premium) that offer disk caching, plus where in Azure you need to change the setting.
The table provided a list of cache settings for both SAP ASE and SAP HANA databases and their data disk areas, based on available best-practices.

I mentioned Write Accelerator for HANA transaction log disks and ensuring that you enable Accelerated Networking.
Also provided was a link to my previous post about running a check of IOPS for your data disks, as recommended by Microsoft as part of your go-live checks.

A final mention was made of another post of mine, with a great way of checking the disk cache settings across the VMs in the landscape.

Useful Links:

Windows File Cache

https://docs.microsoft.com/en-us/azure/virtual-machines/linux/premium-storage-performance

https://docs.microsoft.com/en-us/azure/virtual-machines/windows/how-to-enable-write-accelerator

https://docs.microsoft.com/en-us/azure/virtual-machines/workloads/sap/hana-vm-operations-storage#production-storage-solution-with-azure-write-accelerator-for-azure-m-series-virtual-machines

https://petri.com/digging-into-azure-vm-disk-performance-features

https://techcommunity.microsoft.com/t5/running-sap-applications-on-the/sap-on-azure-general-update-march-2019/ba-p/377456

https://docs.microsoft.com/en-us/azure/virtual-machines/workloads/sap/dbms_guide_general

https://docs.microsoft.com/en-us/azure/virtual-machines/workloads/sap/hana-vm-operations-storage

SAP Note 2762990 – How to interpret the report of HWCCT File System Test

SAP Note 2367194 – Use of Azure Premium SSD Storage for SAP DBMS Instance
