This blog contains experience gained over the years of implementing (and de-implementing) large scale IT applications/software.

Checking Azure Disk Cache Settings on a Linux VM in Shell

In a previous blog post, I ended by showing how you can use the Azure Enhanced Monitoring for Linux extension to obtain the disk cache settings.
Except, as we found, it doesn’t easily allow you to relate the Linux O/S disk device names and volume groups to the Azure data disk names.

You can read the previous post here: Listing Azure VM DataDisks and Cache Settings Using Azure Portal JMESPATH & Bash

In this short post, I pick up where I left off and outline a method that lets you correlate the O/S volume group names with the Linux O/S disk devices, those Linux disk devices with the Azure data disk names, and finally the Azure data disks with their disk cache settings.

Using this method, you can easily verify that the disk cache settings are consistent for all disks that make up a single volume group (very important), and also associate those volume groups with the usage type of the underlying Azure disks (e.g. database data, logs or executable binaries).

1. Check If AEM Is Installed

Our first step is to check if the Azure Enhanced Monitoring for Linux (AEM) extension is installed on the Azure VM.
This extension is required for your VM to be supported by SAP.

We use standard Linux command line tools to check for the extension on the VM:

ls -1 /var/lib/waagent/Microsoft.OSTCExtensions.AzureEnhancedMonitorForLinux-*/config/0.settings

The listing should return at least 1 file called “0.settings”.
If this file does not exist and you don’t have a directory starting with “Microsoft.OSTCExtensions.AzureEnhancedMonitorForLinux-”, then you don’t have AEM and you should install it following the standard Microsoft documentation.
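
If you want to script this check, a minimal sketch (using the same default waagent path as above) could look like this:

# Report whether the AEM extension config file can be found.
if ls /var/lib/waagent/Microsoft.OSTCExtensions.AzureEnhancedMonitorForLinux-*/config/0.settings >/dev/null 2>&1; then
   echo "AEM extension appears to be installed."
else
   echo "AEM extension not found - install it before continuing."
fi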

2. Get the Number of Disks Known to AEM

We need to know how many disks AEM knows about:

grep -c 'disk;Caching;' /var/lib/AzureEnhancedMonitor/PerfCounters

3. Get the Number of SCSI Disks Known to Linux

We need to know how many disks Linux knows about (we exclude the root disk /dev/sda):

lsscsi --size | grep -cv '/dev/sda'

4. Compare Disk Counts

Compare the disk counts from AEM and from Linux. They should be the same. This is the number of data disks attached to the VM.

If the AEM PerfCounters file reports a lower number, you may be hitting a bug in the AEM extension that prevents it from handling more than 9 data disks.
Do you have more than 9 data disks?

If the numbers do not match at this point, you will not be able to continue, as the AEM output is vital to the next steps.
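
As a sketch, steps 2 to 4 can be combined into a quick comparison (assuming the AEM PerfCounters file path and the commands shown above):

# Count data disks known to AEM and to Linux (excluding the root disk /dev/sda).
AEM_DISKS=$(grep -c 'disk;Caching;' /var/lib/AzureEnhancedMonitor/PerfCounters)
OS_DISKS=$(lsscsi --size | grep -cv '/dev/sda')

if [ "$AEM_DISKS" -ne "$OS_DISKS" ]; then
   echo "Mismatch: AEM reports $AEM_DISKS data disks, Linux sees $OS_DISKS."
else
   echo "OK: $AEM_DISKS data disks visible to both AEM and Linux."
fi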

5. Map Disks to the Cache Settings

Once we know our AEM PerfCounters file contains all our data disks, we are now ready to map the physical volumes (on our disk devices) to the cache settings. On the Linux VM:

pvs -o "pv_name,vg_name" --separator=' ' --noheadings

Your output should be a list of disks and their volume groups, like this example:

/dev/sdc vg_data
/dev/sdd vg_data

Next, we extract the disk device name and cache setting from each “disk;Caching” line in the AEM PerfCounters file:

awk -F';' '/;disk;Caching;/ { sub(/\/dev\//,"",$4); printf "/dev/%s %s\n", tolower($4), tolower($6) }' /var/lib/AzureEnhancedMonitor/PerfCounters

The output will be the Linux disk device name and the Azure data disk cache setting:

/dev/sdc none
/dev/sdd none

For each disk device in the cache-setting output, we can now see which volume group it belongs to.
Example: /dev/sdc is vg_data and the disk in Azure has a cache setting of “none”.

If there are multiple disks in the volume group, they all must have the same cache setting applied!
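
To make that consistency check easier, here is a minimal sketch (assuming the same PerfCounters field layout as the awk command above) that prints each volume group with its disk devices and their cache settings, sorted so that any mismatch within a volume group stands out:

# For every physical volume, look up its Azure cache setting in the PerfCounters file.
pvs -o "pv_name,vg_name" --separator=' ' --noheadings | while read -r PV VG; do
   DEV=$(basename "$PV")
   CACHE=$(awk -F';' -v dev="$DEV" '/;disk;Caching;/ { sub(/\/dev\//,"",$4); if (tolower($4) == dev) print tolower($6) }' /var/lib/AzureEnhancedMonitor/PerfCounters)
   echo "$VG $PV ${CACHE:-unknown}"
done | sort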

Finally, we look for the device name in the PerfCounters file again, to get the name of the Azure disk:

NOTE: The example below looks specifically for “sdc”.

awk -F';' '/;Phys. Disc to Storage Mapping;sdc;/ { print $6 }' /var/lib/AzureEnhancedMonitor/PerfCounters

The output will be like so (here shown for both example disks):

None sapserver01-datadisk1
None sapserver01-datadisk2

We can ignore the first column (“None”) in the above output; it’s not needed.
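
Rather than querying one device at a time, a small loop (again assuming the PerfCounters layout shown above) can resolve the Azure disk name for every physical volume reported by pvs:

# Map each Linux disk device used by LVM to its Azure data disk name.
for DEV in $(pvs -o "pv_name" --noheadings | awk '{ sub(/\/dev\//,""); print }'); do
   NAME=$(awk -F';' -v dev="$DEV" '$0 ~ (";Phys. Disc to Storage Mapping;" dev ";") { print $6 }' /var/lib/AzureEnhancedMonitor/PerfCounters)
   echo "/dev/$DEV $NAME"
done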

Summary

If you package the AEM disk count check and the subsequent AEM PerfCounters AWK scripts into one neat script with the required loops, then you can get output similar to this in one call:

/dev/sdd none vg_data sapserver01-datadisk2
/dev/sdc none vg_data sapserver01-datadisk1
/dev/sda readwrite

Based on the above output, I can see that my vg_data volume group disks (sdc & sdd) all have the correct Azure data disk cache setting for a HANA database data disk location.

Taking this a step further, if you have named your volume groups intelligently, your script can also check the cache setting against the volume group name to determine whether or not it is correct.
You can then embed this validation script into a “custom validation” within SAP LaMa and it will alert you automatically if your VM disk cache settings are not correct.
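
A minimal sketch of that kind of name-based validation is shown below. The mapping of volume group names to expected cache settings is purely an example and an assumption on my part; adjust it to your own naming convention and to the current SAP/Azure recommendations:

# Compare each disk's actual cache setting against the setting expected for its volume group name.
pvs -o "pv_name,vg_name" --separator=' ' --noheadings | while read -r PV VG; do
   DEV=$(basename "$PV")
   CACHE=$(awk -F';' -v dev="$DEV" '/;disk;Caching;/ { sub(/\/dev\//,"",$4); if (tolower($4) == dev) print tolower($6) }' /var/lib/AzureEnhancedMonitor/PerfCounters)
   case "$VG" in
      *data*|*log*) EXPECTED="none" ;;      # example: HANA data/log volume groups
      *)            EXPECTED="readonly" ;;  # example only - set your own default
   esac
   if [ "$CACHE" != "$EXPECTED" ]; then
      echo "WARNING: $PV ($VG) has cache setting '$CACHE', expected '$EXPECTED'."
   fi
done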

You may be wondering, why not do all this from the Azure Portal?
Well, the answer is that you don’t know which Linux volume groups those Azure disks are used by, unless you have tagged or named them intelligently in Azure.

Finding Your SAP F&R Version

The Forecasting & Replenishment offering from SAP runs on SAP SCM.
SAP originally bought the F&R binary calculation engine from a Swiss company called SAF. This was integrated into the SCM platform and is called through an RFC connection.

If you’re planning an upgrade, you need to be able to easily identify which version you’re running.

There are two areas to check:
– The SAP SCM version.
– The SAF (FRP) binary version.

Check in SPAM (or in SAP GUI, System -> Status -> Component Version) for the SAP SCM Server version.

Note that:
SCM 7.02 (EHP 2) = F&R 5.2.
SCM 7.01 (EHP 1) = F&R 5.1.

Checking the SAF binary must be done at the operating system level.
The usual location is either “/usr/sap/<SID>/SYS/global/frp/bin” or “/usr/sap/<SID>/FRP/bin”.

As the <sid>adm user, simply call the “safcnfg” binary with the “-version” command line option:
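
For example (using the shorter of the two locations mentioned above; your system may use the global directory instead, and the version output itself is not shown here):

> cd /usr/sap/<SID>/FRP/bin
> ./safcnfg -version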

[Screenshot: SAP F&R SAF binary version output]

See SAP note 1487615 for details on where to find FRP binary patches.

Finally, you should note that SAP SCM is itself a Business Suite software package, like SAP ERP. Therefore, it is not classed as a HUB or Sidecar landscape pattern, but instead as a source business system. This means that there is no real dependency link to the SAP ERP version (providing you’re on the same technology platform level, e.g. 7.31).
You do need to consider that the interface from ERP to F&R may need some SAP notes applied, some of which may be better delivered through an SPS upgrade rather than note upon note.

See application component SCM-FRE-ERP.

RMAN 10.2 Block Corruption Checking – Physical, Logical or Both

It’s an old topic, so I won’t dwell on the actual requirements or the process.

However, what I was not certain about was whether RMAN in 10.2 (10gR2) would perform both physical *and* logical corruption checking if you use the command:

RMAN> BACKUP VALIDATE CHECK LOGICAL DATABASE;

I kept finding various documents with wording like that found here: https://docs.oracle.com/cd/B19306_01/backup.102/b14191/rcmbackp.htm#i1006353

“For example, you can validate that all database files and archived redo logs can be backed up by running a command as follows:

RMAN> BACKUP VALIDATE DATABASE ARCHIVELOG ALL;

This form of the command would check for physical corruption. To check for logical corruption,

RMAN> BACKUP VALIDATE CHECK LOGICAL DATABASE ARCHIVELOG ALL;"

It took a while, but I found the original document from Oracle here: https://docs.oracle.com/cd/B19306_01/backup.102/b14191/rcmconc1.htm#i1008614

Right at the bottom, it confirms that ordinarily “BACKUP VALIDATE DATABASE;” would check for physical corruption.
The additional keywords “CHECK LOGICAL” will check for logical corruption *in addition* to physical corruption.

So RMAN doesn’t need to be run twice with different validate command combinations.
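
As a related tip (this is my own addition, not part of the Oracle documentation quoted above): any corrupt blocks that the validate run does find should be recorded in the V$DATABASE_BLOCK_CORRUPTION view, so a quick follow-up query shows whether there is anything to act on:

SQL> SELECT * FROM V$DATABASE_BLOCK_CORRUPTION;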

HowTo: Check HANA LM Is Running

Scenario: You want to check if the SAP HANA Lifecycle Manager is running/installed.

The SAP HANA Lifecycle Manager is installed separately from the HANA DB and runs in its own Java VM.
It’s installed by default into the “/usr/sap/hlm_bootstraps” directory and occupies ~700MB of disk space.

The HLM is not usually started with the instance. It gets started when you call it from HANA Studio, or when you manually start it from the Linux command line using the bootstrap-hlm.sh script located in “/usr/sap/hlm_bootstraps/<SID>/HLM”.
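
Before starting it, you can quickly check whether HLM is installed and whether it is already running. This is a simple sketch assuming the default installation path and that the HLM Java VM is started from the bootstrap directory:

# Is HLM installed in the default location?
ls -d /usr/sap/hlm_bootstraps/*/HLM

# Is an HLM process already running? (the [h] avoids matching the grep itself)
ps -ef | grep -i '[h]lm_bootstraps'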

From HANA Studio, right-click the HANA instance as the SYSTEM user, then select “Lifecycle Management”.


From the command line on the Linux server, as the <sid>adm Linux user:

> cd /usr/sap/hlm_bootstraps/H10/HLM
> ./bootstrap-hlm.sh

You will be dropped into the OSGi (Open Services Gateway initiative, see here: https://www.osgi.org/) command line.