This blog contains experience gained over the years of implementing (and de-implementing) large scale IT applications/software.

Listing & Tagging Orphaned Azure Disks using Azure CLI from Bash Cloud Shell

Nobody likes wastage, but it happens.
Finding orphaned Azure disks is super simple using the Cloud Shell.

In this post I show how you can use the CLI, with a little Bash scripting in the Cloud Shell, to find unattached disks and also tag them for future removal.
There are plenty of PowerShell examples in the PowerShell Runbook gallery, but I want to show a CLI example.
I then go on to show how this could be scheduled, but it’s not as simple as first thought.

Using Bash Cloud Shell & the Az Command (CLI)

We can use the az command (CLI) in a bash Cloud Shell to list the disks, then use JMESPath to filter and find the ones that are not attached (“Unattached”) to any VM:

az disk list --query "[?diskState=='Unattached'].{name:name,state:diskState,size:diskSizeGb,sku:sku.name,tag:to_string(tags)}" -o tsv

ase01_datadisk_2        Unattached      256     Standard_LRS    null
ase01_OsDisk_1_8e31587ee6604463ada5167f91e6345f Unattached      30      StandardSSD_LRS null

Notice I have also included the “tags” column with some processing.
If you wanted to search for disks that are not attached, and filter by tag, then you can do the following.
First I apply a tag; I’m going to create a tag called “testTag”, with a value “testTagValue”:

az disk update --name ase01_datadisk_2 --resource-group UK-West --set tags.testTag=testTagValue

Now I have set a value for “testTag”, let’s query based on that specific tag, for “Unattached” disks:

az disk list --query "[?diskState=='Unattached'&&tags.testTag=='testTagValue'].{name:name,state:diskState,size:diskSizeGb,sku:sku.name,tag:to_string(tags)}" -o tsv

ase01_datadisk_2        Unattached      256     Standard_LRS    {"testTag":"testTagValue"}

You can search for a specific tag value, for a value containing a given string, or alternatively for disks where the tag is not set at all (null).
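For example, here are two hedged variations on the same query pattern (the tag name and value are just illustrative): the first matches disks whose tags contain a given string, the second matches disks where "testTag" has not been set at all (the JMESPath null literal uses backticks, which must be escaped inside the Bash double quotes):

az disk list --query "[?diskState=='Unattached'&&contains(to_string(tags),'testTagValue')].{name:name,tag:to_string(tags)}" -o tsv

az disk list --query "[?diskState=='Unattached'&&tags.testTag==\`null\`].{name:name,tag:to_string(tags)}" -o tsv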

Adjusting the Disks

Now we have a list of disks output from our query, we could just delete them.
But it’s probably better to tag them for deletion, allowing a deletion at some point after a more detailed review. I would look to produce a report that could go to a CAB review, ready for deletion.

Here’s how we can use the power of the Cloud Shell, the Azure CLI and Bash scripting to tag those disks that are Unattached:

updated=0
failed=0
tabchar="$(printf "\t")"
deldate="$(date -d "+30 days" "+%Y%m%d")"
while read -r line
do
   diskname="${line%%${tabchar}*}"
   rgname="${line##*${tabchar}}"
   echo -n "Updating: $diskname ... "
   az disk update --name ${diskname} --resource-group ${rgname} --set tags.deleteDate=${deldate} >/dev/null
   if [[ $? -eq 0 ]] ; then 
      echo "[SUCCESS]"
      (( updated++ ))
    else
      echo "[FAILED]"
      (( failed++ ))
   fi
done < <(az disk list --query "[?diskState=='Unattached'].{name:name,rg:resourceGroup}" -o tsv)
printf "### SUMMARY ### \n Updated: %s\n Failed: %s\n" $updated $failed

Let’s look at the above script line by line:

  1. We define the variable to hold our updated count.
  2. We define the variable to hold our failed count.
  3. We capture the character that represents a TAB, because the CLI TSV output is tab separated.
  4. We calculate the deletion date: today + 30 days in the format yyyymmdd.
  5. The start of the “while” loop that reads the disk list a line at a time.
  6. The start of the loop body.
  7. We get the disk name out of the line by splitting the line by TAB from the left.
  8. We get the disk resource group out of the line by splitting the line by TAB from the right.
  9. A little output text to say which disk is being worked on.
  10. The call to the CLI to update the disk, applying the tag “deleteDate” with the value calculated earlier.
  11. We detect the successful (or not) execution of the CLI command.
  12. Output “success” for a successful execution.
  13. Update the success variable count.
  14. Alternatively, if we failed.
  15. Output “failed” for a failed execution.
  16. Update the failed variable count.
  17. Closure of the “if” statement.
  18. Closure of the loop; the loop is fed by the CLI query via process substitution, so the counters updated inside it are still visible afterwards.
  19. Output a summary count of success vs failed.

Here is the execution sample:

Reporting on the deleteDate Tag

Now we have our disks tagged ready for deletion, we might want to find disks that have a deleteDate older than a specific date. These disks would then be ripe for deletion.
Here’s how we can do that:

az disk list --query "[?tags.deleteDate<'20210402'].{name:name,rg:resourceGroup,tag:to_string(tags)}" -o tsv

We can even go as far as using Bash to inject the current date into the query, thereby allowing us to look for any disks due for deletion from today without adjusting the query each time:

az disk list --query "[?tags.deleteDate<'$(date "+%Y%m%d")'].{name:name,rg:resourceGroup,tag:to_string(tags)}" -o tsv

How About Scheduled Execution?

Well, using the CLI and the Bash shell inside an Azure Automation account is not possible right now.
Instead, we would need to convert our scripted CLI disk update example to PowerShell to make it Runbook compatible; we can then schedule it inside Azure Automation as a PowerShell Runbook.
The code needs to change slightly because we need to use the automation “RunAs Account” feature to connect from our Runbook, but it looks very much like our CLI code in Bash:

IMPORTANT: All the “simple” help guides on creating a PowerShell Runbook fail to mention the need to actually import the required PowerShell modules that you will need to run your code. In the Automation Account, look at the “Modules” section. For the code below you need 2 additional modules: Az.Accounts and Az.Compute, which you can import from the Gallery. Seems simple, but as I said, not obvious.

$Conn = Get-AutomationConnection -Name 'AzureRunAsConnection'
try{
    $auth = Connect-AzAccount -ServicePrincipal -Tenant $Conn.TenantID -ApplicationId $Conn.ApplicationID -CertificateThumbprint $Conn.CertificateThumbprint
} Catch {
    throw $_.Exception
}
$updated=0
$failed=0
$deldate=(get-date -Format "yyyyMMdd" -date $([datetime]::parseexact($(get-date -format 'yyyyMMdd'), 'yyyyMMdd', $null)).AddDays(+30))
$updateConfig = New-AzDiskUpdateConfig -tag @{ "deleteDate" = "$deldate" }
Get-AzDisk |? { $_.DiskState -eq "Unattached" } |% { 
   write-output "Updating: $($_.Name) ..."
   Update-AzDisk -ResourceGroupName $_.ResourceGroupName -DiskName $_.Name -DiskUpdate $updateConfig >$null 2>&1
   if ( $? -eq $true ) { 
      echo "   [SUCCESS]"
      $updated++
    }
    else {
      echo "   [FAILED]"
      $failed++
   }
}
echo "### SUMMARY ### `n Updated: ${updated}`n Failed: ${failed}`n"

With the above code in a PowerShell Runbook, we can test it:

Obviously if you will be scheduling the code above, then you may wish to change it slightly so that it excludes any orphaned disks already with a deleteDate tag.

Summary

Using the CLI, Cloud Shell and some Bash scripting, we have a simple mechanism to tag unattached disks, then use that tag to report on disks due for deletion after a specific date (or we could use today’s date).
This is a great solution for those with Bash shell scripting skills and as shown it is reasonably simple.

We have also looked at the possibility of scheduling the code and found that it is not possible for CLI. Instead a PowerShell Runbook is a possible solution that allows the scheduling of PowerShell code.
PowerShell could be your pain point, but it really isn’t that far from shell.

HowTo: Install Azure Enhanced Monitoring for Linux for SAP

One SAP support prerequisite for running SAP on Azure is that you must have Azure Enhanced Monitoring for Linux installed on the Azure Linux VMs where your SAP application runs (including DB servers). Details are in SAP note 2015553.

In this brief post I show how to check if it is already installed, then how to install it, without needing to install the PowerShell Azure Cmdlets.

What is Azure Enhanced Monitoring for Linux?

Azure Enhanced Monitoring for Linux (AEM) is an Azure VM extension installed onto the target Linux VM.
The extension uses the Azure Instance Agent to pull additional telemetry information down onto the local VM, and places it into a file on the Linux file system called /var/lib/AzureEnhancedMonitor/PerfCounters.

This special file is plain ASCII text containing semicolon-separated data.
You can use Linux command line utilities to query information from the file (it’s readable by any user).
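As a quick illustration (a sketch only, since the exact field layout can vary by extension version), you can peek at the raw counters directly from the shell by splitting the records on the semicolons:

tr ';' '\n' < /var/lib/AzureEnhancedMonitor/PerfCounters | head -20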

The file is parsed by the SAP Host Agent (also installed on every SAP VM) and made available in the monitoring memory segment used by the Netweaver ABAP stack, with the data being visible in transaction ST06 (OS06).

How to Check if AEM Is Installed

There are a number of ways to check if Azure Enhanced Monitoring for Linux is installed on a VM:

  • Inside the VM in Linux we can check for the existence of file: “/var/lib/AzureEnhancedMonitor/PerfCounters”
  • Inside the VM in Linux we can check the extension home dir exists: “/var/lib/waagent/Microsoft.OSTCExtensions.AzureEnhancedMonitorForLinux-*”
  • In the Azure Portal, we can check the status of the extension on the VM:
  • In the Azure Cloud Shell, we can either Test or Get the AEM Extension to see if it is installed:
Get-AzVMAEMExtension -ResourceGroupName <RG-NAME> -VMName <VM-Name>
Test-AzVMAEMExtension -ResourceGroupName <RG-NAME> -VMName <VM-Name>
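From inside the VM, the first two checks in the list above translate to simple shell commands (the version suffix on the extension directory will vary):

ls -l /var/lib/AzureEnhancedMonitor/PerfCounters
ls -d /var/lib/waagent/Microsoft.OSTCExtensions.AzureEnhancedMonitorForLinux-*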

Installing AEM

There are two ways to install the Azure Enhanced Monitoring for Linux extension into a VM:

  • Using local PowerShell (on your computer) with the Azure Cmdlets installed.
    You will need to have the rights on the local machine to perform the install of the Azure Cmdlets.
    I will not cover this method as it is quite tedious to setup and the chances are that your PowerShell is locked down by your company and will not allow you to install the required Cmdlets.
  • Using Powershell in the Azure Portal Cloud Shell.
    This has all the required Cmdlets already installed, but to setup the Cloud Shell you will need rights in Azure to be able to create a Storage Account to use for your shell home location.

Out of the two options, I usually opt for the Cloud Shell. Once you have it setup, you will find you can use it for many other things and access it from anywhere!
In this post I will be using Cloud Shell to do the installation.

To install the AEM extension, we use Powershell commands to do the following sequence of tasks:

  • Obtain our subscription context.
  • Deploy the extension to the specific VM in the subscription.

Let’s start the Cloud Shell (NOTE: You will need a Storage Account for the Cloud Shell to work).
Go to the Azure Portal and click the button on the button bar:

Make sure that you are in a PowerShell Shell:

We may need to switch to a specific subscription.
We can list all subscriptions by calling Get-AzSubscription and selecting the Id property:

Get-AzSubscription | Select-Object Id

We can then set the context of our Cloud Shell to the specific subscription Id as follows:

$context = Get-AzSubscription -SubscriptionId '<SubscriptionID>'
Set-AzContext -SubscriptionObject $context

Once the code has executed, we can check if the AEM extension is already installed:

Get-AzVMAEMExtension -ResourceGroupName <RG-NAME> -VMName <VM-Name>

If the AEM extension is already installed, then we will see output being returned from the Get command:

ResourceGroupName       : UK-West
VMName                  : vm01
Name                    : AzureEnhancedMonitorForLinux
Location                : ukwest
Etag                    : null
Publisher               : Microsoft.OSTCExtensions
ExtensionType           : AzureEnhancedMonitorForLinux
TypeHandlerVersion      : 3.0
Id                      : /subscriptions/mybigid/resourceGroups/UK-West/providers/Microsoft.Compute/virtualMachines
                          /vm01/extensions/AzureEnhancedMonitorForLinux
PublicSettings          : {
                            "cfg": [
                              {
                                "key": "vmsize",
                                "value": "Standard_D4s_v3"
                              },
                              {
                                "key": "vm.role",
                                "value": "IaaS"
                              },
                              {
                                "key": "vm.memory.isovercommitted",
                                "value": 0
                              },
                              {
                                "key": "vm.cpu.isovercommitted",
                                "value": 0
                              },
                              {
                                "key": "script.version",
                                "value": "3.0.0.0"
                              },
                              {
                                "key": "verbose",
                                "value": "0"
                              },
                              {
                                "key": "href",
                                "value": "http://aka.ms/sapaem"
                              },
                              {
                                "key": "vm.sla.throughput",
                                "value": 96
                              },
                              {
                                "key": "vm.sla.iops",
                                "value": 6400
                              },
                              {
                                "key": "wad.isenabled",
                                "value": 0
                              }
                            ]
                          }
ProtectedSettings       :
ProvisioningState       : Succeeded
Statuses                :
SubStatuses             :
AutoUpgradeMinorVersion : True
ForceUpdateTag          : 637516905202791108
EnableAutomaticUpgrade  :


If the AEM extension is not installed, no output will be seen from the “Get” command.
We can then install the AEM extension with the “Set-AzVMAEMExtension” command as follows:

Set-AzVMAEMExtension -ResourceGroupName <RG-NAME> -VMName <VM-Name>

The extension should be installed successfully.
If you need to remove it, you can use the “Remove-AzVMAEMExtension” command.

There is a “Test” command that you can call to test the AEM:

Test-AzVMAEMExtension -ResourceGroupName <RG-NAME> -VMName <VM-Name>

Finally, if you want to see the additional command line options, then use the standard “Get-Help” as follows:

Get-Help Set-AzVMAEMExtension -Full

Issues with AEM

There’s one known issue with Azure Enhanced Monitoring for Linux: the number of data disks reported in the PerfCounters file seems to be limited to 9.
This means that if you have more than 9 data disks, the performance data may not be visible in the file and therefore not visible in SAP.
It’s possible a fix is on the way.

Uplifting & Expanding Linux LVM Managed Disks in Azure

One of the great things about public cloud is the potential to simply increase the data disk space for your systems.
In the on-premise, non-virtualised world, you would have needed to physically add more disk or swap the disk for a bigger unit.
Even in the on-premise virtualised world, you may have needed more actual disk to expand the virtual hard disk.

Sometimes those disks store data files for databases.
In that case, if you have followed the Azure architecture best practices, you will be using LVM or some other volume management layer on top of the raw disk devices. This gives you more flexibility and performance (through striping).

In this guide I show how to increase the disk size by uplifting the data disks in Azure, then resizing the disk devices in Linux, allowing us eventually to grow the XFS file system to a larger size.
I will discuss good reasons to uplift vs adding extra data disks.

My Initial Setup

In this step-by-step, we will be using Linux (SUSE Enterprise Linux Server 12) with LVM as our volume management software.
The assumption is that you have already created your data disks (2 of them) and striped across those disks with a single logical volume.
(Remember, the striping gives you double the IOPS when reading/writing the data to/from the disks).

In my simple example, I used the following to create my LVM setup:

  1. Add 2x data disks of size 128GB (I used 2x S10) to a VM running SLES 12 using the Azure Portal.
  2. Create the physical volumes (mine were sdd and sde on LUNs 1 and 2):
    pvcreate /dev/sdd
    pvcreate /dev/sde

  3. Create the volume group:
    vgcreate volTMP /dev/sdd /dev/sde
  4. Create the striped logical volume using all the space (2 stripes, one per disk):
    lvcreate -i 2 -l 100%FREE -n lvTMP1 volTMP
  5. Create the file system using XFS:
    mkfs.xfs /dev/mapper/volTMP-lvTMP1
  6. Mount the file system to a new mount point:
    mkdir /BIGSTRIPEDDISK
    mount /dev/mapper/volTMP-lvTMP1 /BIGSTRIPEDDISK

In Azure my setup looked like the below (I already had 1 data disk, so I added 2 more):

In the VM, we can see the file system is mounted and has a size of 256GB (2x 128GB disks):
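For example, using df against the mount point created above:

df -h /BIGSTRIPEDDISK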

You can double check the striping using the lvdisplay command with the “-m” flag:
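Using the names from my setup, that looks like:

lvdisplay -m /dev/volTMP/lvTMP1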

Once the disk was setup, I then created a simple text file with ASCII text inside:

I also used “dd” to create a large 255GB file (leaving 1GB free):

dd if=/dev/zero of=./mybigfile.data bs=1024k count=261120

The disk usage is now close to 100%:

I ran a checksum on the large file:
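The check itself was simply the cksum utility (mentioned below) run against the file created with dd above:

cksum mybigfile.data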

Value is: 3494419206

With the checksum completed (it took a few minutes), I now have a way of checking that my file is the same before/after the disk resize, plus the cksum tool will force reading of the whole file (checking for filesystem I/O issues).

Increasing the Data Disk Size

Within the Azure portal, we first need to stop the VM:

Once stopped, we can go to each of the two data disks and uplift from an S10 (in my example) to an S15 (256GB):
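If you prefer the CLI to the portal, the same uplift can be sketched with az disk update (resource group and disk names are placeholders; for Standard HDD the tier follows the size, so setting 256GB moves the disk to S15):

az disk update --resource-group <resource-group> --name <data-disk-name> --size-gb 256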

We can now start the VM up again:

When the VM is running again, we can log in and check.
Our file system is the same size:

We check with the LVM command “pvdisplay” to display one of the physical disks, and we can see that the size has not changed; it is still 128GB:
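Using the device names from my setup, the check is:

pvdisplay /dev/sdd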

We need to make LVM re-scan the disk to make it aware of the new increased size. We use the pvresize command:
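With the device names from my setup, that is:

pvresize /dev/sdd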

Re-checking the disk using pvdisplay, and we can see it has increased to 256GB in size:

We do the same for the /dev/sde disk:

Once the physical disks are resized (in the eyes of LVM), we can now check the volume group:
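For example:

vgdisplay volTMP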

We have now got 256GB of free space (see row: “Free PE / Size”) in our volume group.

To allow our file system to get this space, the logical volume within the volume group needs to be expanded into the free space.
We use the lvresize command to make our logical volume use all free space in the volume group “+100%FREE”:
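With the names from my setup, that is:

lvresize -l +100%FREE /dev/volTMP/lvTMP1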

NOTE: It is also possible to specify an exact size should you want to be specific.

Our file system is still only 256GB in size, until we resize it.
For XFS file systems, we use the xfs_growfs command as follows:
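Using the mount point from my setup:

xfs_growfs /BIGSTRIPEDDISK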

Checking the file system now shows it has grown to 512GB (50% free):

Are my files still present? Yes:

Let’s check the contents of my text file:

Finally, I validate that my big data file has not been corrupted:

Value is: 3494419206

What is the Alternative to Uplifting?

Instead of uplifting the existing data disks, it is possible to increase the amount of storage in my volume group by adding two new additional disks.
To prevent performance issues, these new disks should be of the same scale level (S10) as the existing disks.
You should definitely not be mixing disk types in a logical volume, so to prevent this, you should not mix them in a volume group (even though you could technically separate them at the logical volume level).

Is there a good reason to add more disks? Yes: when you are going to create a new logical volume, it is ideal to keep the data on separate physical disks to help avoid data loss (from a lost/deleted disk).
There are also performance reasons to have additional Linux devices, since parameters such as queue depth affect the Linux device level. The Linux O/S can effectively issue more simultaneous read requests since additional data disks are additional devices.

As we have seen, uplifting a disk tier requires taking the VM offline or detaching the disk. Adding additional new disks, on the other hand, can be done entirely online. Adding new disks does pose a slight issue if you have one large striped logical volume, since the quantity of new disks needs to match the existing stripe layout (e.g. if you have 2 disks striped, you should add another 2 disks to increase the volume), and the striping balance will only ever be over the quantity of disks in the original stripe set, not over the new quantity of disks (e.g. striped over 2 disks, even if you have added another 2 to make 4).

Is there a good reason not to add more disks? Yes: when you know you could exceed the VM data disk count limitations. Each VM has a limit; the bigger the VM, the bigger the limit.

Another reason not to add disks is when you know that you always leave a proportion of disk space free. Adding more disks that will only ever be a maximum of 80% used is more wasteful compared to upscaling an existing set of disks that will only ever be 80% used.

Summary

Using the power of Azure, we have increased the data disk sizes of our VM.
The increase needed the VM to be stopped and started again while the disks were uplifted from an S10 (128GB) to an S15 (256GB).

Once uplifted, we had to make LVM aware of the new disk sizes by using the pvresize command, then the free space in the volume group was given to the logical volume and finally the file system was grown.
To maintain the logical volume stripe, both disks were uplifted, then both disks needed the pvresize command running.
We validated the before and after state of our ASCII and data files and they were not corrupted.

Finally, we looked at the alternative to uplifting and saw that uplifting may not be appropriate in all cases.
Adding new disks can be done online.

Cleaning Up /tmp for SAP On SLES On Azure

In an Azure SLES 12 Linux VM the default installation image mounts the /tmp file system as a regular file system off the root (/). Historically, for many Unix/Linux environments, this is not “normal”.

In this post I will discuss what the impact is for this irregular setup of /tmp and what you can do to work around it, ensuring SAP continues to work as usual.

What is “Normal”?

In traditional Linux installations the /tmp file system is usually mounted as a temporary file system (tmpfs), which means it would be cleaned on O/S reboot.
This has been the case for many years. There’s a recent post here that highlights 1994 for Solaris.
Plus, you can find a detailed explanation of tmpfs here: https://www.kernel.org/doc/html/latest/filesystems/tmpfs.html

With the default SLES setup, any files placed into /tmp will not automatically be cleaned up on reboot and, as per the previous links, there can be performance reasons to use a tmpfs.

From memory, there are differing standards on what /tmp should be used for (again see here), and it is possible that the traditional setup is no longer following a newly agreed standard. I really am not certain why SLES does not mount /tmp as tmpfs.
All I know from over 20 years of working with different Unix/Linux products, is that it is generally accepted that /tmp is a dumping ground, that gets cleaned on reboot and from what I can see, SAP think the same.

What is the Impact of /tmp Not Being tmpfs?

When /tmp is not a tmpfs and is not cleaned on reboot, it can cause issues when using software that expects some form of clean-up to be performed.
When I look at some of the SAP systems in Azure on SLES 12, I see a build up of files in the /tmp directory, which results in the need for a scripted job to clean them up on a periodic cycle.

If some of the more prolific files are not cleaned up regularly, then they can build up into many thousands of files. While this shouldn’t impact the day-to-day running of the SAP system, it can impact some ad-hoc operations such as patching the SAP system or the database.
The reason is that sometimes the patching tools write out files to the /tmp area, then crudely perform a “ls” to list files or find files in that location. If there are many thousands of files, then those listing operations can fail or be delayed.
A perfect example is the patching of the SAP ASE database, which can be affected by thousands of files in the /tmp location.

Finally, with the /tmp directory mounted off the root disk, any filling of /tmp will fill your root disk and this will bring your VM to a halt pretty quickly! Be careful!

What Sort of Files Exists in /tmp ?

In the list below, I am looking specifically at SAP related files and some files that are culprits for building up in the /tmp directory.

File Name Pattern | Description
.saphostagent_nnnnn | SAP Host Agent run files.
.sapicmnnn | SAP ICM run file.
.sapstartsrv##_sapstartsrv.log | SAP Instance Agent run file.
.sapstreamnnnn | SAP IPC files.
.theagentlives.tmp | Owned by the Sybase O/S user; related to the SAP ASE instance. Maybe JS Agent.
ctisql_* | Temporary iSQL executions using sybctrl.
sap_jvm_nnnn_nnnnnn, sapjvm_profiling_server_nnnn_nnnnn, sap_jvm_monitoringboard_nnnn_nnnnn | SAP JVM execution.
sapinst_instdir | From an execution of SWPM (contains sapinst).
saplg* | Owned by sapadm; part of the SAP Instance Agent logon ticket generated from the Hostagent.
sb* | From an ASE installation.
tmp* | Owned by root, lots and lots; possibly Azure agent related, as they contain the text “Windows Azure CRP Certificate Generator” when passed through a base64 decoder.
tmp.* | Lots and lots; seem to be Kerberos related.

How Can We Clean Up these Files?

The most common way is to use a script.
Within the script will be a “find” statement, which finds the specific files and removes each one.
It needs to be done this way because, if there are too many files, trying to do “rm /tmp/tmp*” will expand to an argument list that is too long for the shell to execute; it will either error (“Argument list too long”) or no files will be removed.

The script will need to be executed as root frequently (maybe weekly or even daily) to ensure that the file quantities are kept consistently low. This can be achieved using an enterprise scheduler or a crontab on each server.

Here’s an example of how to clean up the /tmp/tmp* files with a very specific criteria. The files are removed if they are:

  • located in the /tmp directory
  • with a name length of at least 7 chars beginning with ‘tmp’ followed by A-z or 0-9 at least 4 times.
  • last modified more than 7 days ago.
  • owned by root, with a group of root.
find /tmp -type 'f' -regextype posix-awk -regex '/tmp/tmp[A-z0-9]{4,}' -mtime +7 -user root -group root -delete -print

The above will remove the files due to the “-delete”. To test it, just remove the “-delete”.
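To run this on a schedule with cron, a root crontab entry might look like the following sketch (the script path /root/scripts/clean_tmp.sh is just an illustrative name for a script wrapping the find command above):

# Run the /tmp clean-up script every Sunday at 02:00
0 2 * * 0  /root/scripts/clean_tmp.sh >> /var/log/clean_tmp.log 2>&1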

In summary, you should check how /tmp is setup in your VMs, and then check the files that are created in /tmp.

Best Disk Topology for SAP ASE Databases on Azure

Maybe you are considering migration of on-premise SAP ASE databases to Microsoft Azure, or you may be considering migrating from your existing database vendor to SAP ASE on Azure.
Either way, you will benefit from understanding a good, practical disk topology for SAP ASE on Azure.

In this post, I show how you can optimise use of the SAP ASE, Linux and Azure technical layers to provide a balanced approach to disk use, considering both performance and disk (ASE device) management.

The Different Layers

In an ASE on Linux on Azure (IaaS) setup, you have the following layers:

  • Azure Storage Services
  • Azure Data Disk Cache Settings
  • Linux Physical Disks
  • Linux Logical Volumes
  • Linux File Systems
  • ASE Database Data Devices
  • ASE Instance

Each layer has different options around tuning and setup, which I will highlight below.

Azure Storage Services

Starting at the bottom of the diagram we need to consider the Azure Disk Storage that we wish to use.
There are 2 design considerations here:

  • size of disk space required.
  • performance of disk device.

For performance, you are more than likely tied by the SAP requirements for running SAP on Azure.
Currently these require a minimum of Premium SSD storage, since it provides a guaranteed SLA. However, as of June 2020, Standard SSD was also given an SLA by Microsoft, potentially paving the way for cheaper disk (when certified by SAP) provided that it meets your SLA expectations.

Generally, the size of disk determines the performance (IOPS and MBps), but this can also be influenced by the quantity of data disk devices.
For example, by using 2 data disks striped together you can double the available IOPS. The IOPS are an important factor for databases, especially on high throughput database systems.

When considering multiple data disks, you also need to remember that each VM has limitations. There is a VM level IOPS limit, a VM level throughput limit (megabytes per second) plus a limit to the number of data disks that can be attached. These limit values are different for different Azure VM types.

Also remember that in Linux, each disk device has its own set of queues and buffers. Making use of multiple Linux disk devices (which translates directly to the number of Azure data disks) usually means better performance.

Essentials:

  • Choose minimum of Premium SSD (until Standard SSD is supported by SAP).
  • Spread database space requirements over multiple data disks.
  • Be aware of the VM level limits.

Azure Data Disk Cache Settings

Correct configuration of the Azure data disk cache settings on the Azure VM can help with performance and is an easy step to complete.
I have already documented the best practice Azure Disk Cache settings for ASE on Azure in a previous post.

Essentials:

  • Correctly set Azure VM disk cache settings on Azure data disks at the point of creation.

Use LVM For Managing Disks

Always use a logical volume manager, instead of formatting the Linux physical disk devices directly.
This allows the most flexibility for growing, shrinking and striping the disks for size and performance.

You should stripe the data logical volumes with a minimum of 2 physical disks and a maximum stripe size of 128KB (test it!). This fits within the window of testing that Microsoft have performed in order to achieve the designated IOPS for the underlying disk. It’s also the maximum size that ASE will read at. Depending on your DB read/write profile, you may choose a smaller stripe size such as 64KB, but it depends on the amount of Large I/O and pre-fetch. When reading the Microsoft documents, consider ASE to be the same as MS SQL Server (they are from the same code lineage).

Stripe the transaction log logical volume(s) with a smaller stripe size, maybe start at 32KB and go lower but test it (remember HANA is 2KB stripe size for log volumes, but HANA uses Azure WriteAccelerator).
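As a sketch of what that striping looks like in LVM (assuming 2 physical disks per volume group, 128KB stripes for data and 32KB for the transaction log as discussed above; the volume group and LV names follow the naming pattern used further below, and ${SID} is a placeholder for your SAP system ID):

lvcreate -i 2 -I 128 -l 100%FREE -n lv_sapdata${SID}_1 vg_sapdata
lvcreate -i 2 -I 32 -l 100%FREE -n lv_saplog${SID}_1 vg_saplog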

Essentials:

  • Use LVM to create volume groups and logical volumes.
  • Stripe the data logical volumes with (max) 128KB stripe size & test it.

Use XFS File System

You can essentially choose to use your preferred file system format; there are no restrictions – see note 405827.
However, if you already run or are planning to run HANA databases in your landscape, then choosing XFS for ASE will make your landscape architecture simpler, because HANA is recommended to run on an XFS file system (when on local disk) on Linux; again see SAP note 405827.

Where possible you will need to explicitly disable any Linux file system write barrier caching, because Azure will be handling the caching for you.
In SUSE Linux this is the “nobarrier” setting on the mount options of the XFS partition and for EXT4 partitions it is option “barrier=0”.
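As an illustration (device paths and mount points are placeholders only), the corresponding /etc/fstab entries might look like this:

/dev/vg_sapdata/lv_sapdataSID_1   /your/xfs/mountpoint    xfs    defaults,nobarrier   0 0
/dev/vg_example/lv_example        /your/ext4/mountpoint   ext4   defaults,barrier=0   0 0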

Essentials:

  • Choose disk file system wisely.
  • Disable write barriers.

Correctly Partition ASE

For SAP ASE, you should segregate the disk partitions of the database to prevent certain system-specific databases or logging areas from filling other disk locations and causing a general database system crash.

If you are using database replication (maybe SAP Replication Server a.k.a HADR for ASE), then you will have additional replication queue disk requirements, which should also be segregated.

A simple but flexible example layout is:

Volume Group | Logical Volume | Mount Point | Description
vg_ase | lv_ase<SID> | /sybase/<SID> | For ASE binaries.
vg_sapdata | lv_sapdata<SID>_1 | ./sapdata_1 | One for each ASE device for the SAP SID database.
vg_saplog | lv_saplog<SID>_1 | ./saplog_1 | One for each log device for the SAP SID database.
vg_asedata | lv_asesec<SID> | ./sybsecurity | ASE security database.
vg_asedata | lv_asesyst<SID> | ./sybsystem | ASE system databases (master, sybmgmtdb).
vg_asedata | lv_saptemp<SID> | ./saptemp | The SAP SID temp database.
vg_asedata | lv_asetemp<SID> | ./sybtemp | The ASE temp database.
vg_asedata | lv_asediag<SID> | ./sapdiag | The ASE saptools database.
vg_asehadr | lv_repdata<SID> | ./repdata | The HADR queue location.
vg_backups | lv_backups<SID> | ./backups | Disk backup location.

The above will allow each disk partition usage type to be separately expanded, but more importantly, it allows specific Azure data disk cache settings to be applied to the right locations.
For instance, you can use read-write caching on the vg_ase volume group disks, because that location is only for storing binaries, text logs and config files for the ASE instance. The vg_asedata contains all the small ASE system databases, which will not use too much space, but could still benefit from read caching on the data disks.

TIP: Depending on the size of your database, you may decide to also separate the saptemp database into its own volume group. If you use HADR you may benefit from doing this.

You may not need the backups disk area if you are using a backup utility, but you may benefit from a scratch area of disk for system copies or emergency dumps.

You should choose a good naming standard for volume groups and logical volumes, because this will help you during the check phase, where you can script the checking of disk partitioning and cache settings.
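For example, a first scripted check of the naming standard and striping can be as simple as the standard LVM reporting commands (a sketch; add further columns as needed):

lvs -o vg_name,lv_name,lv_size,stripes,stripesize --units g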

Essentials:

  • Segregate disk partitions correctly.
  • Use a good naming standard for volume groups and LVs.
  • Remember the underlying cache settings on those affected disks.

Add Whole New ASE Devices

Follow the usual SAP ASE database practices of adding additional ASE data devices on additional file system partitions sapdata_2, sapdata_3 etc.
Do not be tempted to constantly (or automatically) expand the ASE device on sapdata_1 by adding new disks; you will find this difficult to maintain because striped logical volumes need at least 2 disks in the stripe set.
It will get complicated and is not easy to shrink back from this.

When you add new disks to an existing volume group and then expand an existing lv_sapdata<SID>_n logical volume, it is not as clean as adding a whole new logical volume (e.g. lv_sapdata<SID>_n+1) and then adding a whole new ASE data device.
The old problem of shrinking data devices is more easily solved by being able to drop a whole ASE device, instead of trying to shrink one.

NOTE: The Microsoft notes suggest enabling automatic DB expansion, but on Azure I don’t think it makes sense from a DB administration perspective.
Yes, by adding a new ASE device, as data ages you may end up with “hot” devices, but you can always move specific devices around and add more underlying disks and re-stripe etc. Keep the layout flexible.

Essentials:

  • Add new disks to new logical volumes (sapdata_n+1).
  • Add big whole new ASE devices to the new LVs.

Summary:

We’ve been through each of the layers in detail and now we can summarise as follows:

  • Choose a minimum of Premium SSD.
  • Spread database space requirements over multiple data disks.
  • Correctly set Azure VM disk cache settings on Azure data disks at the point of creation.
  • Use LVM to create volume groups and logical volumes.
  • Stripe the logical volumes with (max) 128KB stripe size & test it.
  • Choose disk file system wisely.
  • Disable write barriers.
  • Segregate disk partitions correctly.
  • Use a good naming standard for volume groups (and LVs).
  • Remember the underlying cache settings on those affected disks.
  • Add new disks to new logical volumes (sapdata_n+1).
  • Add big whole new ASE devices to the new LVs.

Useful Links: