This blog contains experience gained over the years of implementing (and de-implementing) large scale IT applications/software.

Listing Azure VM DataDisks and Cache Settings Using Azure Portal JMESPATH & Bash

As part of a SAP HANA deployment, there are a set of recommendations around the Azure VM disk caching settings and the use of the Azure VM WriteAccelerator.
These features should be applied to the SAP HANA database data volume and log volume disks to ensure optimum performance of the database I/O operations.

This post is not about the cache settings, but about how it’s possible to gather the required information about the current settings across your landscape.

There are 3 main methods available to an infrastructure person, to see the current Azure VM disk cache settings.
I will discuss these method below.

1, Using the Azure Portal

You can use the Azure Portal to locate the VM you are interested in, then checking the disks, and looking on each disk.
You can only see the disk cache settings under the VM view inside the Azure Portal.

While slightly counter intuitive (you would expect to see the same under the “Disks” view), it’s because the disk cache feature is provided for by the VM onto which the disks are bound, therefore it’s tied to the VM view.

2, Using the Azure CLI

Using the Azure CLI (bash or powershell) to find the disks and get the settings.

This is by far the most common approach for anyone managing a large estate. It uses the existing Azure API layers and the Azure CLI to query your Azure subscription, return the data in JSON format and parse it.
The actual query is written in JMESPATH (https://jmespath.org/) and is similar to XPath (for XML).

A couple of sample queries in BASH (my favourite shell):

List all VM names:

az vm list --query [].name -o table

List VM names, powerstate, vmsize, O/S and RG:

az vm list --show-details --query '[].{name:name, state:powerState, OS:storageProfile.osDisk.osType, Type:hardwareProfile.vmSize, rg:resourceGroup, diskName:storageProfile.dataDisks.name, diskLUN:storageProfile.dataDisks.lun, diskCaching:storageProfile.dataDisks.caching, diskSizeG:storageProfile.dataDisks.diskSizeGb, WAEnabled:storageProfile.dataDisks.writeAcceleratorEnabled }' -o table

List all VMs with names ending d01 or d02 or d03, then pull out the data disk details and whether the WriteAccelerator is enabled:

az vm list --query "[?ends_with(name,'d01')||ends_with(name,'d02')||ends_with(name,'d03')]|[].storageProfile.dataDisks[].[lun,name,caching,diskSizeGb,writeAcceleratorEnabled]" -o tsv

To execute the above, simply launch the Cloud Shell and select “Bash” in the Azure Portal:

Then paste in the query and hit return:

3, A Most Obscure Method.

Since SAP require you to have the “Enhanced Monitoring for Linux” (OEM) agent extension installed, you can obtain the disk details directly on each VM.

For Linux VMs, the OEM creates a special text file for performance counters, which is used by the Saposcol (remember that) for use by SAP diagnostic agents, ABAP stacks and other tools.

Using a simple piece of awk scripting, we can pull out the disk cache settings from the file like so:

awk -F';' '/;disk;Caching;/ { sub(//dev//,"",$4); printf "/dev/%s %sn", tolower($4), tolower($6) }' /var/lib/AzureEnhancedMonitor/PerfCounters

There’s a lot more information in the text file (/var/lib/AzureEnhancedMonitor/PerfCounters) and my later post Checking Azure Disk Cache Settings on a Linux VM in Shell, I show how you can pull out the complete mapping between Linux disk devices, disk volume groups, Azure disk names and the disk caching settings, like so:

Useful Links

HowTo: OEL/RHEL 5.7 Create New VolGroup and LVol for new disk

If you need to add an additional mount point onto a RHEL or OEL Linux server, here’s how to do it using the logical volume manager for maximum flexibility:

We assume that you’ve added a new physical disk and that it’s called /dev/sdb.

First check size of the device to ensure you’ve got the correct one:

# fdisk -l /dev/sdb

Now create a new primary partition on the disk:

# fdisk /dev/sdb
n        (new partition)
p        (primary partition)
1        (partition number)
<return> for 1st block
<return> for last block
w       (write config)
q       (quit)
Check you can see the new partition:

# ls -la /dev/sdb*

(You should see /dev/sdb1)

Now ensure that you create a new physical volume that the volume manager can see:

# pvcreate /dev/sdb1

Physical volume "/dev/sdb1" successfully created

Create a new Volume Group containing the new physcial partition:

# vgcreate VolGroup01 /dev/sdb1

Volume group "VolGroup01" successfully created

Create a new logical volume inside the volume group:

# lvcreate -L 480GB -n LogVol01 VolGroup01

Logical volume "LogVol01" created

Format the new logical volume using EXT3 (you can choose which version of EXT you want):

# mkfs -t ext3 /dev/VolGroup01/LogVol02

Now you just need to mount the partition up.

Estimating Oracle Export Space Requirements

To estimate a full Oracle export space requirements, you can use DataPump.
The below command estimates for a full export of the database.
It makes use of the ESTIMATE_ONLY option on the expdp command line.

First you need to create your directory in the database:

> sqlplus / as sysdba

SQL> create directory dp_dump as '/your_path';

SQL> quit;

> expdp "/ as sysdba" full=y directory=DP_EXPORT logfile=full_exp_estimate.log estimate_only=yes
...

Total estimation using BLOCKS method: 2.418 GB

Job "SYS"."SYS_EXPORT_FULL_01" successfully completed at 18:50:48

The method above runs through each object to be exported and calculates the number of blocks relevant to the block size of each object, that will be required on disk.

If you have a large database and your statistics are up-to-date, then you could use the additional “ESTIMATE=STATISTICS” option, which uses the data gathered from the Oracle statistics collections to estimate the space required. This is a lot quicker but needs accurate stats.

The example above took 1 min.

With the “ESTIMATE=STATISTICS” option, it took 46 seconds, but estimated only 991.3 MB would be required (half as much as the BLOCKS method).  There’s obviously some missing stats on objects in my DB.