Programming Archives » Musings of an IT Implementor

Create a SAML Key & Cert in Powershell for PowerBI to HANA SSO

You’ve seen the Microsoft documentation, you want to create a private key and a certificate that can be used with a SAML assertion for single-sign-on between Microsoft PowerBI Gateway and a SAP HANA 2.0 database.

Where’s the problem?

The problem, is that the Microsoft documentation simply says to use “openssl”. Not only that, but it suggests that you create a CA (Certificate Authority) key and certificate first, then create a separate key and certificate for the actual IdP (which in this setup is the PowerBI Gateway), and sign it with the new CA certificate, creating a chain.

This is just not needed in SAML. In a SAML assertion, the certificate used for signature doesn’t need to be signed/issued by anyone in particular because the chain is not validated. The lifetime of the certificate can be loooong, because there is no chain. In the case of a compromised key, the IdP key and certificate would need to be re-created and the new certificate distributed to the target systems (service providers like SAP HANA database).

In this post I’m actually addressing two issues with the Microsoft documentation (I’ve raised one on the github page, but alas it has not been closed yet).

No CA is needed. A self-signed IdP certificate works perfectly fine.
OpenSSL is not needed. We can do it all in Powershell (with an evil laugh added in for extra evilness).
We don’t need to export the key and certificate into a PFX file, then import it into the Windows certificate store.

Show me the Code

After fiddling around with various iterations, I found that the following Powershell code produces a key and certificate that is accepted by PowerBI Gateway (the key uses a valid provider).
The certificate is also accepted by SAP HANA once you’ve imported it into the new Identity Provider that you create in the HANA Cockpit (or HANA Studio if you must – tut tut tut).

Code to create a private key, plus certificate valid for 10 years from today and store it in the Windows “LocalMachine” certificates store in the “Personal” folder on the PowerBI Gateway:

$params = @{
Subject = "CN=PowerBI HANA IdP,OU=MyDept,O=MyCompany,L=MyTown,ST=MyShire,C=GB"
KeyAlgorithm = 'RSA'
KeyLength = 2048
NotAfter = $(Get-Date).AddMonths(120)
FriendlyName = "PowerBI HANA IdP Cert"
KeyFriendlyName = "PowerBI HANA IdP Cert"
HashAlgorithm = 'SHA256'
KeyUsage = 'None'
SuppressOid = '2.5.29.37'
CertStoreLocation = 'Cert:\LocalMachine\My'
Provider = 'Microsoft Enhanced RSA and AES Cryptographic Provider'
KeySpec = 'KeyExchange'
TextExtension = @("2.5.29.19={text}CA=true")
}

$cert = New-SelfSignedCertificate @params

The code above generates an x.509 certificate that contains “CA=true” in the BasicConstraints section.
This is the only part that may not actually be needed. I was really tired after all the investigation. Think of the time I have just saved you here!
If you don’t do this part and it still works, let me know.

Without the “Provider” parameter specifically set to “Microsoft Enhanced RSA and AES Cryptographic Provider” the PowerBI Gateway would not select the key during the SSO connection test.
Instead it would throw an error like so in the GatewayErrors log:

InnerToString=<ccon>System.Security.Cryptography.CryptographicException: Invalid provider type specified.

I had to use the “SuppressOid” parameter to prevent the KeyUsage being added, even if I specifically set “KeyUsage = ‘None'”.
This is really just cosmetic, because I wanted the certificate to look as identical to the openssl generated certificate. If you don’t do this part and it still works, let me know.

I believe that the OpenSSL equivalent command lines are:

openssl req -x509 -sha256 -days 3650 -newkey rsa:2048 -keyout IdP_Key.pem -out IdP_Cert.pem -subj "/C=GB/ST=MyShire/L=MyTown/O=MyCompany/OU=MyDept/CN=PowerBI HANA IdP"
[enter my password]

openssl pkcs12 -export -out IdP_PFX.pfx -in IdP_Cert.pem -inkey IdP_Key.pem -passin pass:my-password -passout pass:my-password

By doing all the work in Powershell, there is no need to enter passwords, export files and transfer files (if you don’t have openssl.exe on the PowerBI Gateway).

You may also be interested in:

PowerShell Encrypt / Decrypt OpenSSL AES256 CBC

A few months back I had a Korn shell script which used OpenSSL to encrypt some text using AES 256 CBC.
I managed, through the power of stackoverflow.com and various other blogs, to write a Java routine to perform the exact same encrypt/decrypt.
This allowed me to encrypt in Korn on Linux and decrypt in Java which was running inside a SAP Netweaver application server, or the other way around (encrypt in Java and decrypt in Korn using OpenSSL).

About 2 months after that, I needed the same set of routines to be written in PowerShell, allowing the same encrypted text to be encrypted on Linux with OpenSSL and decrypted on Windows in PowerShell (no need for OpenSSL).

I forked the PowerShell code which did the initial encryption and wrote the decryption routine which I’ve published as a Github gist here:

https://gist.github.com/Darryl-G/d1039c2407262cb6d735c3e7a730ee86

You may also be interested in:

Capture HTTP POST Using Simple Python Script

In this post I show a simple and quick way to capture a basic HTTP POST using Python to provide a basic HTTP Web Server with cgi capability in just a few lines of code and in most cases, it is executable on almost any Python capable server.

Why I Used This Code

I used this successfully to test an interface which wasn’t particularly clear exactly what data it was going to POST to a target web server.
Usually a developer could use real developer tools to do this analysis, however, the server doing the POST is POSTing to another server, and both servers sit behind firewalls. It was far quicker to create something simple that could be executed direct on the server.

What Does the Code Do?

The code simply creates a HTTP web server on the server on which it is executed.
The web server serves the content that exists in the directory structures from the current working directory and below.
The web server is also able to execute CGI scripts written in Python and stored in the cgi-bin subdirectory.

The Code

#!/usr/bin/python 
# 
import sys, os, cgi, cgitb

# Enable easy error reporting output. 
cgitb.enable()

# Create a custom log output to print to stdout and a log file. 
class CustomLogger(object): 
   def __init__(self): 
      self.terminal = sys.stdout 
      self.log = open("logfile.log", "a")

   def write(self, message): 
      self.terminal.write(message) 
      self.log.write(message)

   def flush(self): 
      pass

# Swap stdout for our custom log class. 
sys.stdout = CustomLogger() 
sys.stderr = sys.stdout

# Call the standard CGI test. 
cgi.test()

### END OF SCRIPT ###

Deploying the Code

To install the code, we need to create a new temporary directory on our host server, I used ssh to do this:

mkdir -p /tmp/dmg/cgi-bin

Put the code into the file called form.py in the cgi-bin directory:

cd /tmp/dmg/cgi-bin

vi form.py
[insert code then press shift-ZZ]

chmod 755 form.py

Still on an ssh session, switch back to the dmg directory and execute the Python CGI handler to listen on port 8080:

cd /tmp/dmg

python -m CGIHTTPServer 8080

Call your HTTP tool to POST to the address:
http://<your-server>:8080/cgi-bin/form.py

If the tool returns output, then you will see the output on your ssh session screen.
The output response from the CGI script is also stored in the /tmp/dmg/logfile.log file.

To quit/end the HTTP web server, simply press CTRL+C multiple times until you are returned to the command prompt.

The output will look like:

Content-type: text/html

Current Working Directory:
/tmp/dmg

Command Line Arguments:
['/tmp/dmg/cgi-bin/form.py', '']

Form Contents:
parameter: <type 'instance'>
MiniFieldStorage('parameter', 'test')

Shell Environment:
...

You will see the POST content in the “Form Contents” section of the output.
The values of fields are pre-fixed with “MiniFieldStorage“.

Also included in the output, is the execution environment which contains the environment variables that contain CGI related variables and their respective values such as HTTP_METHOD.

A Test Form

You can also deploy a simple form in order to test the CGI capability manually from a web browser (although this was not required in my case).
The form is simple HTML that POSTs two text input fields to our form.py CGI script:

<html><body>
<div style="text-align: center;">
<h1>Test Form</h1>
<form action="/cgi-bin/form.py" method="POST">
f1 : <input style="text-align: center;" name="f1" type="text" />
f2 : <input style="text-align: center;" name="f2" type="text" />
<input type="submit" value="Submit" />
</form>
</div>
</body></html>

The form should be saved to a new file called index.html in the /tmp/dmg directory.
You can then manually access the test web server using http://<your-server>:8080 and you will see the form.
Enter two values into the form and click submit, to see the output from your CGI script.

Korn Shell Calling SAP HANA – Hello Hello!

“So you’ve automated SAP HANA stuff huh?
What tools do you use?
Python? Chef? Puppet? Ansible? DSC/Powershell?“

No. I use Korn shell. ¯\_(ツ)_/¯

Me, Trying to Support Korn…

I find Korn shell is a lowest common denominator across many Linux/Unix systems, and also extremely simple to support.
It does exactly what I need.

For me it’s readable, fairly self-explanatory, easily editable and dependable.

…and Failing?

But I do know what you’re saying:

there’s no built in version control
it’s not easy to debug
it’s definitely not very cool 🤓
you can’t easily do offline development
my Grandad told me about something called Korn shell ?

Have you Considered

If you have no budget for tools, then you can start automating by using what you already have. Why not.
Don’t wait for the right tool, start somewhere and only then will you understand what works for you and doesn’t work for you.

Sometimes it’s not about budget. There are companies out there that do not allow certain sets of utilities and tools to be installed on servers, because they can provide too much help to hackers. Guess what they do allow? Yep, Korn shell is allowed.

Let’s Get Started

Here’s a sample function to run some SQL in HANA by calling the hdbsql (delivered with the HANA client) and return the output:

#!/bin/ksh
function run_sql {
   typeset -i l_inst="${1}" 
   typeset l_db="${2}" 
   typeset l_user="${3}" 
   typeset l_pw="${4}" 
   typeset l_col_sep="${5}" 
   typeset l_sql="${6}" 
   typeset l_output="" 
   typeset l_auth="" 
   typeset -i l_ret=0

   # Check if PW is blank, then use hdbuserstore (-U). 
   if [[ -n "${l_pw}" && "${l_pw}" != " " ]] ; then 
      l_auth="-u ${l_user} -p ${l_pw}" 
    else l_auth="-U ${l_user}" 
   fi

   l_output="$(hdbsql -quiet -x -z -a -C -j -i ${l_inst} ${l_auth} -d ${l_db} -F "${l_col_sep}"<<-EOF1 2>>/tmp/some-script.log 
		${l_sql}; 
		quit 
EOF1 )"
   
   l_ret=$?

   # For HANA 1.0 we need to trim the first 6 lines of output, because it doesn't understand "-quiet". 
   #if [[ "$(check_major_version)" -lt 2 ]] ; then 
      # l_output="$(print "${l_output}"| tail -n +7)" 
   #fi

   print "${l_output}" 
   return $l_ret 

}

To call the above function, we then just do (in the same script):

l_result="$(run_sql "10" "SystemDB" "SYSTEM" "SYSTEMPW" " " "ALTER SYSTEM ALTER CONFIGURATION ('global.ini', 'SYSTEM') SET ('persistence','log_mode')='overwrite' WITH RECONFIGURE")"

We are passing in the HANA instance number 10, you can use whatever your instance number is.

We can check the function return code (did the function return cleanly) like so:

if [[ $? -ne 0 ]] ; then 
   print "FAILED" 
   exit 1; 
fi

Here’s what we’re passing in our call to hdbsql (you can find this output by calling “hdbsql –help”):

-i instance number of the database engine
-d name of the database to connect
-U use credentials from user store
-u user name to connect
-p password to connect
-x run quietly (no messages, only query output)
-quiet Do not print the welcome screen
-F use as the field separator (default: ‘,’)
-C suppress escape output format
-j switch the page by page scroll output off
-Q show each column on a separate line
-a do not print any header for SELECT commands

If you wanted to return a value, then the “l_result” variable would contain the output.

Ideally, the function we wrote would be put into a chunk of modular code that could be referenced time and again from other Korn shell scripts.

You would also be looking to create some sets of standard functions for logging of messages to help with debugging. You can make it as complex as you wish.

In the call to “run_sql” we pass a column separator.
I usually like to use a “|” (pipe), then parse the returned values using the “awk” utility like so:

l_result="$(run_sql "10" "SystemDB" "SYSTEM" "SYSTEMPW" "|" "SELECT file_name,layer_name,section,key, value FROM SYS.M_INIFILE_CONTENTS WHERE layer_name='SYSTEM'")"

echo "${l_result}" | /bin/awk -F'|' '{ print $2" "$3" "$4 }'

When we execute the script we get the first 3 columns like so:

daemon.ini SYSTEM daemon 
diserver.ini SYSTEM communication 
global.ini SYSTEM auditing 
configuration global.ini SYSTEM 
backup global.ini SYSTEM
...

Obviously we don’t really embed the password in the script; it gets passed in.
You can either pass it in using the command line parameter method (./myscript.ksh someparam) or via the Linux environment variables (export myparam=something; ./myscript.ksh).
If you want you can even pipe it in (echo “myparam”| ./myscript.ksh) and “read” it into a variable.
You can also take a look at the “expect” utility to automate command line input.
Also, take a look at the “hdbuserstore” utility to store credentials for automation scripts (remember to set appropriatly secure privs on these database users).

That’s all there is to it for you to get started using Korn shell to call HANA.

You may also be interested in:

Checking Azure Disk Cache Settings on a Linux VM in Shell

In a previous blog post, I ended the post by showing how you can use the Azure Enhanced Monitoring for Linux to obtain the disk cache settings.
Except, as we found, it doesn’t easily allow you to relate the Linux O/S disk device names and volume groups, to the Azure data disk names.

You can read the previous post here: Listing Azure VM DataDisks and Cache Settings Using Azure Portal JMESPATH & Bash

In this short post, I pick up where I left off and outline a method that will allow you to correlate the O/S volume group name, with the Linux O/S disk devices and correlate those Linux disk devices with the Azure data disk names, and finally, the Azure data disks with their disk cache settings.

Using the method I will show you, you will see how easily you can verify that the disk cache settings are consistent for all disks that make up a single volume group (very important), and also be able to easily associate those volume groups with the type of usage of the underlying Azure disks (e.g. is it for database data, logs or executable binaries).

1. Check If AEM Is Installed

Our first step is to check if the Azure Enhanced Monitoring for Linux (AEM) extension is installed on the Azure VM.
This extension is required, for your VM to be supported by SAP.

We use standard Linux command line to check for the extension on the VM:

ls -1 /var/lib/waagent/Microsoft.OSTCExtensions.AzureEnhancedMonitorForLinux-*/config/0.settings

The listing should return at least 1 file called “0.settings”.
If you don’t have this and you don’t have a directory starting with “Microsoft.OSTCExtensions.AzureEnhancedMonitorForLinux-“, then you don’t have AEM and you should get it installed following standard Microsoft documentation.

2. Get the Number of Disks Known to AEM

We need to know how many disks AEM knows about:

grep -c 'disk;Caching;' /var/lib/AzureEnhancedMonitor/PerfCounters

3. Get the Number of SCSI Disks Known to Linux

We need to know how many disks Linux knows about (we exclude the root disk /dev/sda):

lsscsi --size --size | grep -cv '/dev/sda'

4. Compare Disk Counts

Compare the disks quantity from AEM and from Linux. They should be the same. This is the number of data disks attached to the VM.

If you have a lower number from the AEM PerfCounters file, then you may be suffering the effects of an Azure bug in the AEM extension which is unable to handle more than 9 data disks.
Do you have more than 9 data disks?

At this point if you do not have matching numbers, then you will not be able to continue, as the AEM output is vital in the next steps.

Mapping Disks to the Cache Settings

Once we know our AEM PerfCounters file contains all our data disks, we are now ready to map the physical volumes (on our disk devices) to the cache settings. On the Linux VM:

pvs -o "pv_name,vg_name" --separator=' ' --noheadings

Your output should be a list of disks and their volume groups like so (based on our diagram earlier in the post):

/dev/sdc vg_data
/dev/sdd vg_data

Next we look for a line in the AEM PerfCounters file that contains that disk device name, to get the cache setting:

awk -F';' '/;disk;Caching;/ { sub(/\/dev\//,"",$4); printf "/dev/%s %s\n", tolower($4), tolower($6) }' /var/lib/AzureEnhancedMonitor/PerfCounters

The output will be the Linux disk device name and the Azure data disk cache setting:

/dev/sdc none
/dev/sdd none

For each line of disks from the cache setting, we can now see what volume group it belongs to.
Example: /dev/sdc is vg_data and the disk in Azure has a cache setting of “none”.

If there are multiple disks in the volume group, they all must have the same cache setting applied!

Finally, we look for the device name in the PerfCounters file again, to get the name of the Azure disk:

NOTE: Below is looking specifically for “sdc”.

awk -F';' '/;Phys. Disc to Storage Mapping;sdc;/ { print $6 }' /var/lib/AzureEnhancedMonitor/PerfCounters

The output will be like so:

None sapserver01-datadisk1
None sapserver01-datadisk2

We can ignore the first column output (“None”) in the above, it’s not needed.

Summary

If you package the AEM disk count check and the subsequent AEM PerfCounters AWK scripts into one neat script with the required loops, then you can get the output similar to this, in one call:

/dev/sdd none vg_data sapserver01-datadisk2
/dev/sdc none vg_data sapserver01-datadisk1
/dev/sda readwrite

Based on the above output, I can see that my vg_data volume group disks (sdc & sdd) all have the correct setting for Azure data disk caching in Azure for a HANA database data disk location.

Taking a step further, if you have intelligently named your volume group names, you then also check in your script, the cache setting based on the name of the volume group to determine if it is correct, or not.
You can then embed this validation script into a “custom validation” within SAP LaMa and it will alert you automatically if your VM disk cache settings are not correct.

You may be wondering, why not do all this from the Azure Portal?
Well, the answer to that is that you don’t know what Linux VM volume groups those Azure disks are used by, unless you have tagged them or named them intelligently in Azure.