This blog contains experience gained over the years of implementing (and de-implementing) large scale IT applications/software.

Uplifting & Expanding Linux LVM Managed Disks in Azure

One of the great things about public cloud is the ability to simply increase the data disk space for your systems.
In the on-premise, non-virtualised world, you would have needed to physically add more disk or swap the disk for a bigger unit.
Even in the on-premise virtualised world, you may have needed more actual disk to expand the virtual hard disk.

Sometimes those disks can be storing data files for databases.
In which case, if you have followed the Azure architecture best practices, then you will be using LVM or some other volume management layer on top of the raw disk device. This will give you more flexibility and performance (through striping).

In this guide I show how to increase the disk size by uplifting the data disks in Azure, then resizing the disk devices in Linux, allowing us eventually to grow the XFS file system to a larger size.
I will discuss good reasons to uplift vs adding extra data disks.

My Initial Setup

In this step-by-step, we will be using Linux (SUSE Linux Enterprise Server 12) with LVM as our volume management software.
The assumption is that you have already created your data disks (2 of them) and striped across those disks with a single logical volume.
(Remember, the striping gives you double the IOPS when reading/writing the data to/from the disks).

In my simple example, I used the following to create my LVM setup:

  1. Add 2x data disks of size 128GB (I used 2x S10) to a VM running SLES 12 using the Azure Portal.
  2. Create the physical volumes (mine were sdd and sde on LUNs 1 and 2):
    pvcreate /dev/sdd
    pvcreate /dev/sde

  3. Create the volume group:
    vgcreate volTMP /dev/sdd /dev/sde
  4. Create the striped logical volume using all the space:
    lvcreate -i 2 -l 100%FREE -n lvTMP1 volTMP
  5. Create the file system using XFS:
    mkfs.xfs /dev/mapper/volTMP-lvTMP1
  6. Mount the file system to a new mount point:
    mkdir /BIGSTRIPEDDISK
    mount /dev/mapper/volTMP-lvTMP1 /BIGSTRIPEDDISK

In Azure my setup looked like the below (I already had 1 data disk, so I added 2 more):

In the VM, we can see the file system is mounted and has a size of 256GB (2x 128GB disks):

You can double check the striping using the lvdisplay command with “-m” flag:
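For example, using the logical volume from my setup above, a quick check would be:

# The segment output should show the extents striped across the 2 physical volumes
lvdisplay -m /dev/volTMP/lvTMP1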

Once the disk was setup, I then created a simple text file with ASCII text inside:

I also used “dd” to create a large 255GB file (leaving 1GB free):

dd if=/dev/zero of=./mybigfile.data bs=1024k count=261120

The disk usage is now close to 100%:

I ran a checksum on the large file:

Value is: 3494419206

With the checksum completed (it took a few minutes), I now have a way of checking that my file is the same before/after the disk resize, plus the cksum tool will force reading of the whole file (checking for filesystem I/O issues).
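For reference, the checksum above was produced with the standard cksum tool, along these lines (the first value in the output is the checksum itself):

cksum ./mybigfile.data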

Increasing the Data Disk Size

Within the Azure portal, we first need to stop the VM:

Once stopped, we can go to each of the two data disks and uplift from an S10 (in my example) to an S15 (256GB):

We can now start the VM up again:
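For completeness, if you prefer the Azure CLI to the portal, the stop, uplift and start sequence looks roughly like this (the resource group, VM and disk names below are placeholders for illustration):

# Deallocate the VM so the disks can be changed
az vm deallocate --resource-group myRG --name myVM
# Increase each data disk to 256GB (which moves it from S10 to S15)
az disk update --resource-group myRG --name myVM-datadisk1 --size-gb 256
az disk update --resource-group myRG --name myVM-datadisk2 --size-gb 256
# Start the VM again
az vm start --resource-group myRG --name myVM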

When the VM is running again, we can log in and check.
Our file system is the same size:

We check with the LVM command “pvdisplay” to display one of the physical disks, and we can see that the size has not changed, it is still 128GB:

We need to make LVM re-scan the disk to make it aware of the new increased size. We use the pvresize command:
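In my case, for the first disk, the command is simply:

# Tell LVM to re-read the (now larger) underlying device size
pvresize /dev/sdd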

Re-checking the disk using pvdisplay, and we can see it has increased to 256GB in size:

We do the same for the /dev/sde disk:

Once the physical disks are resized (in the eyes of LVM), we can now check the volume group:
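The volume group check is just:

# Look at the "Free PE / Size" row in the output
vgdisplay volTMP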

We have now got 256GB of free space (see row: “Free PE / Size”) in our volume group.

To allow our file system to get this space, the logical volume within the volume group needs to be expanded into the free space.
We use the lvresize command to make our logical volume use all free space in the volume group “+100%FREE”:
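In my example the command looks like this:

# Extend the logical volume into all remaining free space in the volume group
lvresize -l +100%FREE /dev/volTMP/lvTMP1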

NOTE: It is also possible to specify an exact size should you want to be specific.

Our file system is still only 256GB in size, until we resize it.
For XFS file systems, we use the xfs_growfs command as follows:
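In my example, run against the mounted file system:

# Grow the XFS file system to fill the (now larger) logical volume
xfs_growfs /BIGSTRIPEDDISK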

Checking the file system now shows we have 512GB of space in total (50% free):

Are my files still present? Yes:

Let’s check the contents of my text file:

Finally, I validate that my big data file has not been corrupted:

Value is: 3494419206

What is the Alternative to Uplifting?

Instead of uplifting the existing data disks, it is possible to increase the amount of storage in my volume group, by adding two new additional disks.
To prevent performance issues, these new disks should be of the same scale level (S10) as the existing disks.
You should definitely not mix disk types within a logical volume, so to prevent this, you should not mix them in a volume group either (even though you could technically separate them at the logical volume level).

Is there a good reason when to add more disks? When you are going to create a new logical volume, it is ideal to keep the data on separate physical disks to help avoid data-loss (from a lost/deleted disk).
There are also performance reasons to have additional Linux devices, since parameters such as queue depth affect the Linux device level. The Linux O/S can effectively issue more simultaneous read requests since additional data disks are additional devices.

As we have seen, uplifting a disk tier requires the VM to be taken offline (or the disk detached). Adding additional new disks, on the other hand, can be done entirely online. However, adding new disks does pose a slight issue if you have one large striped logical volume, since the quantity of new disks needs to match the existing stripe layout (e.g. if you have 2 disks striped, you should add another 2 disks to increase the volume), and the striping will only ever be balanced over the quantity of disks in the original stripe set, not over the new total (e.g. the data remains striped over 2 disks, even if you have added another 2 to make 4).

Is there a good reason not to add more disks? When you know you could exceed the VM data disk count limitations. Each VM has a limit; the bigger the VM, the bigger the limit.

Another reason is when you know that you always leave a small proportion of disk space free. Adding more disks that will only ever be at most 80% used is more wasteful than upscaling an existing set of disks that will only ever be 80% used.

Summary

Using the power of Azure, we have increased the data disk sizes of our VM.
The increase needed the VM to be stopped and started again while the disks were uplifted from an S10 (128GB) to an S15 (256GB).

Once uplifted, we had to make LVM aware of the new disk sizes by using the pvresize command, then the free space in the volume group was given to the logical volume and finally the file system was grown.
To maintain the logical volume stripe, both disks were uplifted, then both disks needed the pvresize command running.
We validated the before and after state of our ASCII and data files and they were not corrupted.

Finally, we looked at the alternative to uplifting and saw that uplifting may not be appropriate in all cases.
Adding new disks can be done online.

Best Disk Topology for SAP ASE Databases on Azure

Maybe you are considering migration of on-premise SAP ASE databases to Microsoft Azure, or you may be considering migrating from your existing database vendor to SAP ASE on Azure.
Either way, you will benefit from understanding a good, practical disk topology for SAP ASE on Azure.

In this post, I show how you can optimise use of the SAP ASE, Linux and Azure technical layers to provide a balanced approach to disk use, considering both performance and disk (ASE device) management.

The Different Layers

In an ASE on Linux on Azure (IaaS) setup, you have the following layers:

  • Azure Storage Services
  • Azure Data Disk Cache Settings
  • Linux Physical Disks
  • Linux Logical Volumes
  • Linux File Systems
  • ASE Database Data Devices
  • ASE Instance

Each layer has different options around tuning and setup, which I will highlight below.

Azure Storage Services

Starting at the bottom of the diagram we need to consider the Azure Disk Storage that we wish to use.
There are 2 design considerations here:

  • size of disk space required.
  • performance of disk device.

For performance, you are more than likely tied by the SAP requirements for running SAP on Azure.
Currently these require a minimum of Premium SSD storage, since it provides a guaranteed SLA. However, as of June 2020, Standard SSD was also given an SLA by Microsoft, potentially paving the way for cheaper disk (when certified by SAP) provided that it meets your SLA expectations.

Generally, the size of disk determines the performance (IOPS and MBps), but this can also be influenced by the quantity of data disk devices.
For example, by using 2 data disks striped together you can double the available IOPS. The IOPS are an important factor for databases, especially on high throughput database systems.

When considering multiple data disks, you also need to remember that each VM has limitations. There is a VM level IOPS limit, a VM level throughput limit (megabytes per second) plus a limit to the number of data disks that can be attached. These limit values are different for different Azure VM types.

Also remember that in Linux, each disk device has its own set of queues and buffers. Making use of multiple Linux disk devices (which translates directly to the number of Azure data disks) usually means better performance.

Essentials:

  • Choose minimum of Premium SSD (until Standard SSD is supported by SAP).
  • Spread database space requirements over multiple data disks.
  • Be aware of the VM level limits.

Azure Data Disk Cache Settings

Correct configuration of the Azure data disk cache settings on the Azure VM can help with performance and is an easy step to complete.
I have already documented the best practice Azure Disk Cache settings for ASE on Azure in a previous post.

Essentials:

  • Correctly set Azure VM disk cache settings on Azure data disks at the point of creation.

Use LVM For Managing Disks

Always use a logical volume manager, instead of formatting the Linux physical disk devices directly.
This allows the most flexibility for growing, shrinking and striping the disks for size and performance.

You should stripe the data logical volumes with a minimum of 2 physical disks and a maximum stripe size of 128KB (test it!). This fits within the window of testing that Microsoft have performed in order to achieve the designated IOPS for the underlying disk. It’s also the maximum size that ASE will read at. Depending on your DB read/write profile, you may choose a smaller stripe size such as 64KB, but it depends on the amount of Large I/O and pre-fetch. When reading the Microsoft documents, consider ASE to be the same as MS SQL Server (they are from the same code lineage).

Stripe the transaction log logical volume(s) with a smaller stripe size, maybe start at 32KB and go lower but test it (remember HANA is 2KB stripe size for log volumes, but HANA uses Azure WriteAccelerator).
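As a hedged illustration only (the volume group and logical volume names here are made up, not a prescription), creating the data and log logical volumes with those stripe sizes could look something like this:

# Data LV: striped over 2 disks with a 128KB stripe size
lvcreate -i 2 -I 128 -l 100%FREE -n lv_sapdata_1 vg_sapdata
# Log LV: striped over 2 disks with a smaller 32KB stripe size
lvcreate -i 2 -I 32 -l 100%FREE -n lv_saplog_1 vg_saplog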

Essentials:

  • Use LVM to create volume groups and logical volumes.
  • Stripe the data logical volumes with (max) 128KB stripe size & test it.

Use XFS File System

You can essentially choose to use your preferred file system format; there are no restrictions – see note 405827.
However, if you already run or are planning to run HANA databases in your landscape, then choosing XFS for ASE will make your landscape architecture simpler, because HANA is recommended to run on an XFS file system (when on local disk) on Linux; again see SAP note 405827.

Where possible you will need to explicitly disable any Linux file system write barrier caching, because Azure will be handling the caching for you.
In SUSE Linux this is the “nobarrier” setting on the mount options of the XFS partition and for EXT4 partitions it is option “barrier=0”.
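As a sketch (the device and mount point names are just examples), the corresponding /etc/fstab entry for an XFS data file system might look like:

# XFS mount with the write barrier disabled
/dev/mapper/vg_sapdata-lv_sapdata_1  /sapdata_1  xfs  defaults,nobarrier  0 0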

Essentials:

  • Choose disk file system wisely.
  • Disable write barriers.

Correctly Partition ASE

For SAP ASE, you should segregate the disk partitions of the database to prevent certain system-specific databases or logging areas from filling other disk locations and causing a general database system crash.

If you are using database replication (maybe SAP Replication Server a.k.a HADR for ASE), then you will have additional replication queue disk requirements, which should also be segregated.

A simple but flexible example layout is:

Volume Group   Logical Volume       Mount Point      Description
vg_ase         lv_ase<SID>          /sybase/<SID>    For ASE binaries
vg_sapdata     lv_sapdata<SID>_1    ./sapdata_1      One for each ASE device for SAP SID database.
vg_saplog      lv_saplog<SID>_1     ./saplog_1       One for each log device for SAP SID database.
vg_asedata     lv_asesec<SID>       ./sybsecurity    ASE security database.
               lv_asesyst<SID>      ./sybsystem      ASE system databases (master, sybmgmtdb).
               lv_saptemp<SID>      ./saptemp        The SAP SID temp database.
               lv_asetemp<SID>      ./sybtemp        The ASE temp database.
               lv_asediag<SID>      ./sapdiag        The ASE saptools database.
vg_asehadr     lv_repdata<SID>      ./repdata        The HADR queue location.
vg_backups     lv_backups<SID>      ./backups        Disk backup location.

The above will allow each disk partition usage type to be separately expanded, but more importantly, it allows specific Azure data disk cache settings to be applied to the right locations.
For instance, you can use read-write caching on the vg_ase volume group disks, because that location is only for storing binaries, text logs and config files for the ASE instance. The vg_asedata contains all the small ASE system databases, which will not use too much space, but could still benefit from read caching on the data disks.

TIP: Depending on the size of your database, you may decide to also separate the saptemp database into its own volume group. If you use HADR you may benefit from doing this.

You may not need the backups disk area if you are using a backup utility, but you may benefit from a scratch area of disk for system copies or emergency dumps.

You should choose a good naming standard for volume groups and logical volumes, because this will help you during the check phase, where you can script the checking of disk partitioning and cache settings.

Essentials:

  • Segregate disk partitions correctly.
  • Use a good naming standard for volume groups and LVs.
  • Remember the underlying cache settings on those affected disks.

Add Whole New ASE Devices

Follow the usual SAP ASE database practices of adding additional ASE data devices on additional file system partitions sapdata_2, sapdata_3 etc.
Do not be tempted to constantly (or automatically) expand the ASE device on sapdata_1 by adding new disks; you will find this difficult to maintain, because striped logical volumes need at least 2 disks in the stripe set.
It gets complicated and is not easy to shrink back from.

When you add new disks to an existing volume group and then expand an existing lv_sapdata<SID>_n logical volume, it is not as clean as adding a whole new logical volume (e.g. lv_sapdata<SID>_n+1) and then adding a whole new ASE data device.
The old problem of shrinking data devices is more easily solved by being able to drop a whole ASE device, instead of trying to shrink one.

NOTE: The Microsoft notes suggest enabling automatic DB expansion, but on Azure I don’t think it makes sense from a DB administration perspective.
Yes, by adding a new ASE device, as data ages you may end up with “hot” devices, but you can always move specific devices around and add more underlying disks and re-stripe etc. Keep the layout flexible.

Essentials:

  • Add new disks to new logical volumes (sapdata_n+1).
  • Add big whole new ASE devices to the new LVs.

Summary:

We’ve been through each of the layers in detail and now we can summarise as follows:

  • Choose a minimum of Premium SSD.
  • Spread database space requirements over multiple data disks.
  • Correctly set Azure VM disk cache settings on Azure data disks at the point of creation.
  • Use LVM to create volume groups and logical volumes.
  • Stripe the logical volumes with (max) 128KB stripe size & test it.
  • Choose disk file system wisely.
  • Disable write barriers.
  • Segregate disk partitions correctly.
  • Use a good naming standard for volume groups (and LVs).
  • Remember the underlying cache settings on those affected disks.
  • Add new disks to new logical volumes (sapdata_n+1).
  • Add big whole new ASE devices to the new LVs.

Useful Links:

Azure Disk Cache Settings for an SAP Database on Linux

One of your go-live tasks, once you have built a VM in Azure, should be to ensure that the Azure disk cache settings on the Linux VM data disks are set correctly, in accordance with the Microsoft recommended settings.
In this post I explain the disk cache options and how they apply to SAP and especially to SAP databases such as SAP ASE and SAP HANA, to ensure you get optimum performance.

What Are the Azure Disk Cache Settings?

In Microsoft Azure you can configure different disk cache settings on data disks that are attached to a VM.
NOTE: You do not need to consider changing the O/S root disk cache settings, as by default they are applied as per the Azure recommendations.

Only specific VMs and specific disks (Standard or Premium Storage) have the ability to use caching.
If you use Azure Standard storage, the cache is provided by local disks on the physical server hosting your Linux VM.
If you use Azure Premium storage, the cache is provided by a combination of RAM and local SSD on the physical server hosting your Linux VM.

There are 3 different Azure disk cache settings:

  • None
  • ReadOnly (or “read-only”)
  • ReadWrite (or “read/write”)

The cache settings can influence the performance and also the consistency of the data written to the Azure storage service where your data disks are stored.

Cache Setting: None

By specifying “None” as the cache setting, no caching is used and a write operation at the VM O/S level is confirmed as completed once the data is written to the storage service.
All read operations for data not already in the VM O/S file system cache, will be read from the storage service.

Cache Setting: ReadOnly

By specifying “ReadOnly” as the cache setting, a write operation at the VM O/S level is confirmed as completed once the data is written to the storage service.
All read operations for data not already in the VM O/S file system cache, will be read from the read cache on the underlying physical machine, before being read from the storage service.

Cache Setting: ReadWrite

By specifying “ReadWrite” as the cache setting, a write operation at the VM O/S level is confirmed as completed once the data is written to the cache on the underlying physical machine.
All read operations for data not already in the VM O/S file system cache, will be read from the read cache on the underlying physical machine, before being read from the storage service.

Where Do We Configure the Disk Cache Settings?

The disk cache settings are configured in Azure against the VM (in the Disks settings), since the disk cache is both physical host and VM series dependent. It is *not* configured against the disk resource itself, as explained in my previous blog post: Listing Azure VM DataDisks and Cache Settings Using Azure Portal JMESPATH & Bash
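If you want to script the check or the change rather than click through the portal, one possible approach (a sketch, not the only way; the resource group, VM name and disk index are placeholders) is the generic --set mechanism of az vm update:

# Set the caching of the first data disk in the VM's storageProfile to ReadOnly
az vm update --resource-group myRG --name myVM --set storageProfile.dataDisks[0].caching=ReadOnly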

Any Recommendations for Disk Cache Settings?

There are specific recommendations for Azure disk cache settings, especially when running SAP and especially when running databases like SAP ASE or SAP HANA.

In general, the rules are:

Disk Usage                  Azure Disk Cache Setting
Root O/S disk (/)           ReadWrite – ALWAYS!
HANA Shared                 ReadOnly
ASE Home (/sybase/<SID>)    ReadOnly
Database Data               HANA=None, ASE=ReadOnly
Database Log                None

The above settings for SAP ASE have been obtained from SAP note 2367194 (SQL Server is same as ASE) and from the general deployment guide here: https://docs.microsoft.com/en-us/azure/virtual-machines/workloads/sap/dbms_guide_general
The use of write caching on the ASE home is optional; you could choose ReadOnly, which would help protect the ASE config file in a very specific scenario. It is envisaged that, using ASE 16.0 with SRS/HADR, you would have a separate data disk for the Replication Server data (I’ll talk about this in another post).

The above settings for HANA have been taken from the updated guide here: https://docs.microsoft.com/en-us/azure/virtual-machines/workloads/sap/hana-vm-operations-storage which is designed to meet the KPIs mentioned in SAP note 2762990.

The reason for not using a write cache every time is that an issue at the physical host level, affecting the cache, could cause the application (e.g. a database) to think it has committed data when it has not actually been written to disk. This is not good for databases, especially if the issue affects the transaction/redo log area. Data loss could occur.

It’s worth noting that this cache “issue” has always been true of every caching technology ever created, on which databases run. Storage tech vendors try to mitigate this by putting batteries into the storage appliances, but since the write cache in Azure is at the physical host level, there’s just no guarantee that when the VM O/S thinks the write operation has committed to disk, that it has actually been written to disk.

How About Write Accelerator?

There are specific Azure VM series (currently the M-series) that support something known as “Write Accelerator”.
This is an extra VM level setting for Premium Storage disks attached to M-series VMs.

Enabling the Write Accelerator setting is a requirement by Microsoft for production SAP HANA transaction log disks on M-Series VMs. This setting enables the Azure VM to meet the SAP HANA key performance indicators in note 2762990. Azure Write Accelerator is designed to provide lower latency write times on Premium Storage.

You should ensure that the Write Accelerator setting is enabled where appropriate, for your HANA database transaction log disks. You can check if it is enabled following my previous blog post: Listing Azure VM DataDisks and Cache Settings Using Azure Portal JMESPATH & Bash
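If you prefer the command line, my understanding is that the Azure CLI can toggle it per LUN with something like the following (the resource group, VM name and LUN are placeholders, so treat this as an assumption to verify against the current az documentation):

# Enable Write Accelerator on the data disk at LUN 1 (e.g. the HANA log disk)
az vm update --resource-group myRG --name myVM --write-accelerator 1=true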

I’ve tried my best to find more detailed information on how the Write Accelerator feature is actually provided, but unfortunately it seems very elusive. Robert Boban (of Microsoft) commented on a LinkedIn post here: “It is special caching impl. for M-Series VM to fulfill SAP HANA req. for <1ms latency between VM and storage layer.“.

Check the IOPS

Once you have configured your disks and the cache settings, you should ensure that you test the IOPS achieved using the Microsoft recommended process.
You can follow similar steps as my previous post: Recreating SAP ASE Database I/O Workload using Fio on Azure

As mentioned in other places in the Microsoft documentation and SAP notes such as 2367194, you need to ensure that you choose the correct size and series of VM to ensure that you align the required VM maximum IOPS with the intended amount of data disks and their potential IOPS maximum. Otherwise you could hit the VM max IOPS before touching the disk IOPS maximum.
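As a very rough sketch (the file path, block size, queue depth and runtime are illustrative assumptions, not the Microsoft-prescribed test profile), an fio run against a data disk file system could look like:

# 60 second random read test with a 16KB block size and a queue depth of 32
fio --name=readtest --filename=/sapdata_1/fio.test --rw=randread --bs=16k --iodepth=32 --size=4G --runtime=60 --time_based --direct=1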

Enable Accelerated Networking

Since the storage is itself connected to your VM via the network, you should ensure that Accelerated Networking is enabled in your VM’s Network Settings:

Checking Cache Settings Directly on the VM

As per my previous post Checking Azure Disk Cache Settings on a Linux VM in Shell, you can actually check the Azure disk cache settings on the VM itself. You can do it manually, or write a script (better option for whole landscape validation).

Summary:

I discussed the two types of storage (standard or premium) that offer disk caching, plus where in Azure you need to change the setting.
The table provided a list of cache settings for both SAP ASE and SAP HANA databases and their data disk areas, based on available best-practices.

I mentioned Write Accelerator for HANA transaction log disks and ensuring that you enable Accelerated Networking.
Also provided was a link to my previous post about running a check of IOPS for your data disks, as recommended by Microsoft as part of your go-live checks.

A final mention was made of another post of mine, with a great way of checking the disk cache settings across the VMs in the landscape.

Useful Links:

Windows File Cache

https://docs.microsoft.com/en-us/azure/virtual-machines/linux/premium-storage-performance

https://docs.microsoft.com/en-us/azure/virtual-machines/windows/how-to-enable-write-accelerator

https://docs.microsoft.com/en-us/azure/virtual-machines/workloads/sap/hana-vm-operations-storage#production-storage-solution-with-azure-write-accelerator-for-azure-m-series-virtual-machines

https://petri.com/digging-into-azure-vm-disk-performance-features

https://techcommunity.microsoft.com/t5/running-sap-applications-on-the/sap-on-azure-general-update-march-2019/ba-p/377456

https://docs.microsoft.com/en-us/azure/virtual-machines/workloads/sap/dbms_guide_general

https://docs.microsoft.com/en-us/azure/virtual-machines/workloads/sap/hana-vm-operations-storage

SAP Note 2762990 – How to interpret the report of HWCCT File System Test

SAP Note 2367194 – Use of Azure Premium SSD Storage for SAP DBMS Instance

Checking Azure Disk Cache Settings on a Linux VM in Shell

In a previous blog post, I ended the post by showing how you can use the Azure Enhanced Monitoring for Linux to obtain the disk cache settings.
Except, as we found, it doesn’t easily allow you to relate the Linux O/S disk device names and volume groups, to the Azure data disk names.

You can read the previous post here: Listing Azure VM DataDisks and Cache Settings Using Azure Portal JMESPATH & Bash

In this short post, I pick up where I left off and outline a method that will allow you to correlate the O/S volume group name, with the Linux O/S disk devices and correlate those Linux disk devices with the Azure data disk names, and finally, the Azure data disks with their disk cache settings.

Using the method I will show you, you will see how easily you can verify that the disk cache settings are consistent for all disks that make up a single volume group (very important), and also be able to easily associate those volume groups with the type of usage of the underlying Azure disks (e.g. is it for database data, logs or executable binaries).

1. Check If AEM Is Installed

Our first step is to check if the Azure Enhanced Monitoring for Linux (AEM) extension is installed on the Azure VM.
This extension is required, for your VM to be supported by SAP.

We use standard Linux command line to check for the extension on the VM:

ls -1 /var/lib/waagent/Microsoft.OSTCExtensions.AzureEnhancedMonitorForLinux-*/config/0.settings

The listing should return at least 1 file called “0.settings”.
If you don’t have this and you don’t have a directory starting with “Microsoft.OSTCExtensions.AzureEnhancedMonitorForLinux-“, then you don’t have AEM and you should get it installed following standard Microsoft documentation.

2. Get the Number of Disks Known to AEM

We need to know how many disks AEM knows about:

grep -c 'disk;Caching;' /var/lib/AzureEnhancedMonitor/PerfCounters

3. Get the Number of SCSI Disks Known to Linux

We need to know how many disks Linux knows about (we exclude the root disk /dev/sda):

lsscsi --size | grep -cv '/dev/sda'

4. Compare Disk Counts

Compare the disk quantities reported by AEM and by Linux. They should be the same. This is the number of data disks attached to the VM.

If you have a lower number from the AEM PerfCounters file, then you may be suffering the effects of an Azure bug in the AEM extension which is unable to handle more than 9 data disks.
Do you have more than 9 data disks?

At this point if you do not have matching numbers, then you will not be able to continue, as the AEM output is vital in the next steps.

Mapping Disks to the Cache Settings

Once we know our AEM PerfCounters file contains all our data disks, we are now ready to map the physical volumes (on our disk devices) to the cache settings. On the Linux VM:

pvs -o "pv_name,vg_name" --separator=' ' --noheadings

Your output should be a list of disks and their volume groups like so (based on our diagram earlier in the post):

/dev/sdc vg_data
/dev/sdd vg_data

Next we look for a line in the AEM PerfCounters file that contains that disk device name, to get the cache setting:

awk -F';' '/;disk;Caching;/ { sub(/\/dev\//,"",$4); printf "/dev/%s %s\n", tolower($4), tolower($6) }' /var/lib/AzureEnhancedMonitor/PerfCounters

The output will be the Linux disk device name and the Azure data disk cache setting:

/dev/sdc none
/dev/sdd none

For each disk device in the cache setting output, we can now see which volume group it belongs to.
Example: /dev/sdc is vg_data and the disk in Azure has a cache setting of “none”.

If there are multiple disks in the volume group, they all must have the same cache setting applied!

Finally, we look for the device name in the PerfCounters file again, to get the name of the Azure disk:

NOTE: Below is looking specifically for “sdc”.

awk -F';' '/;Phys. Disc to Storage Mapping;sdc;/ { print $6 }' /var/lib/AzureEnhancedMonitor/PerfCounters

The output will be like so:

None sapserver01-datadisk1
None sapserver01-datadisk2

We can ignore the first column output (“None”) in the above, it’s not needed.

Summary

If you package the AEM disk count check and the subsequent AEM PerfCounters AWK scripts into one neat script with the required loops, then you can get the output similar to this, in one call:

/dev/sdd none vg_data sapserver01-datadisk2
/dev/sdc none vg_data sapserver01-datadisk1
/dev/sda readwrite

Based on the above output, I can see that my vg_data volume group disks (sdc & sdd) all have the correct setting for Azure data disk caching in Azure for a HANA database data disk location.

Taking it a step further, if you have intelligently named your volume groups, your script can then also check the cache setting based on the name of the volume group, to determine whether or not it is correct.
You can then embed this validation script into a “custom validation” within SAP LaMa and it will alert you automatically if your VM disk cache settings are not correct.
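For illustration only, a minimal sketch of such a wrapper script (assuming the AEM PerfCounters format shown above, with no error handling) could look like this:

#!/bin/bash
# Sketch: map each LVM physical volume to its volume group,
# its Azure disk cache setting and its Azure data disk name (via AEM).
PERF=/var/lib/AzureEnhancedMonitor/PerfCounters

pvs -o "pv_name,vg_name" --separator=' ' --noheadings | while read pv vg; do
   dev=$(basename "$pv")
   cache=$(awk -F';' -v d="$dev" '/;disk;Caching;/ { sub(/\/dev\//,"",$4); if (tolower($4)==d) print tolower($6) }' "$PERF")
   azname=$(awk -F';' -v d="$dev" 'index($0, ";Phys. Disc to Storage Mapping;" d ";") { print $6 }' "$PERF")
   echo "$pv $cache $vg $azname"
done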

You may be wondering, why not do all this from the Azure Portal?
Well, the answer to that is that you don’t know what Linux VM volume groups those Azure disks are used by, unless you have tagged them or named them intelligently in Azure.

With SAP LaMa you can auto-save on your Azure Managed Disk Costs

By now, most people know that after you’ve moved your SAP landscape to the cloud, you could save hosting costs by shutting down SAP system VMs when they are expected to not be used.
(There are caveats around this as it depends on whether you’re paying for reserved instances).

But did you know there’s also an extra saving that can be had in the cloud?

For SAP to support your SAP systems in Microsoft Azure, you must use Premium tier storage.
The reason for this is primarily because Premium tier storage comes with an SLA from Microsoft, which means you are expected to receive a certain level of performance from those disks.
However, you pay more for this SLA and the proposed performance, which is quite correct when you’re using the disk. But what about when you’re not using the disk?

Right now, in the Azure “West Europe” region, a Premium tier P10 disk (SSD, 128GiB in size with 500 IOPS and 100MB/s throughput), will cost you £16.16 per month, excluding any deals and discounts (such as Azure Managed Disk Reservation).
The P10 is probably the work-horse of the majority of mid-sized server estates. Microsoft recommend a P10 as the Linux root disk for SUSE Linux based HANA database M-Series Azure VMs.

At the other end, the cost of a Standard tier E10 disk (SSD, 128GiB with 500 IOPS and 60MB/s throughput) is £7.16 per month, with the only performance difference being the throughput and the SLA:

So for the same size disk, although with lower throughput, we pay £9 per month less (55% less). I am going to say this saving is roughly 30 pence per day.

(There is one caveat and that is for standard SSD disks like the E10, you pay a transaction fee of 0.1 pence (£0.001) on the disk for every 10,000 256KiB I/O operations.
However, we will see that this transaction fee will not impact us and our saving, in a moment.)

Here’s how we can save money on this Premium managed disk.

In Microsoft Azure, you can change the disk tier from Premium to Standard, when the VM on which the disk is attached is shutdown (deallocated).
It’s simple, you just use the Azure Portal to change the disk configuration once the VM is shutdown.

While this is nice for just a couple of disks, this is not something you’re going to want to do on a regular basis.
Don’t forget, before you start the VM you need to switch the disk tier back to Premium (to retain your SAP support).
So for mass-changes, you may want to use PowerShell to adjust the disks before starting the VMs.
This itself could become a bit of a burden, since you now lose the ability to mass-power-on VMs from the Azure Portal completely.  You would need to use PowerShell all of the time, or setup an Azure based operation schedule (a.k.a. Power Automate – previously Microsoft Flow).
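As a rough sketch (the resource group and disk name are placeholders), the per-disk tier switch itself is just one call each way with the Azure CLI, once the VM is deallocated:

# Drop the disk to Standard SSD while the VM is deallocated...
az disk update --resource-group myRG --name myVM-datadisk1 --sku StandardSSD_LRS
# ...and switch it back to Premium SSD before starting the VM again
az disk update --resource-group myRG --name myVM-datadisk1 --sku Premium_LRS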

This is where SAP Landscape Manager (LaMa) really comes into its own.
With SAP LaMa, your BASIS team can:

  • Perform the start-up & shutdown of the SAP relevant VMs.
  • Perform the start-up & shutdown of the SAP systems on the VMs once they have been started (or the reverse).
  • Use the inbuilt scheduling capability of SAP LaMa to schedule the VM and SAP system operations (full automation of start-up and shutdown operations of the whole stack).

The security capabilities of Azure, coupled with SAP LaMa mean that the BASIS team can only perform specific VM related operations on the SAP VMs. Which gives the cloud Ops team peace of mind.

Now for the best bit.
To be able to save money on managed disk costs in Azure, the SAP BASIS administrator has to merely tick a tickbox in the SAP LaMa cloud provider settings, to “Change Storage Type to save costs”:

The next time the VM is de-allocated, SAP LaMa automatically changes the disk configuration in Azure, to a lower cost disk tier.
As we mentioned earlier, since the start/stop is controlled by SAP LaMa, it knows to switch the disk back to Premium tier during the start-up operation.

How simple is that!

As mentioned, there are some complications around any reservation payments for managed disk, so you need to understand what you’re paying for, before just enabling the tick-box!

Here are my very basic calcs for our P10/E10 disk combination example:

  • Weekends per year: 52
  • Saving per weekend: 60 pence
  • Total possible saving per year for 1 disk if it was unused every weekend: £31.20

Now let’s imagine that saving opportunity was applied across your 100 server estate, whereby every server had at least 1x P10 disk.
You can’t shutdown production, because it’s 24/7, but you don’t do development & testing round-the-clock and you have no international locations, so we are going to imagine our SAP estate is maybe 70% applicable to this saving opportunity. That’s 70 servers x £31.20 equals a saving of £2,184 per year on managed disk, by ticking a tickbox.

These are obviously just best guesses, but it shows how costs can build up and can also be reduced.

Happy ticking.