
Recovering From a Deleted Data Disk with XFS on LVM in Azure

It’s quite a hefty long title, and it still doesn’t quite convey what I want to write about in the post.
This post is about a specific situation that can occur on a Linux VM whose data disks are part of a Logical Volume Manager (LVM) managed file system: you may have accidentally deleted one of those data disks, or restored the VM from a backup taken with “selective disk backup” enabled, and are now missing a data disk.

In this post I show how to recover the unbootable VM using a rescue VM, then repair the volume by adding a new data disk and eventually repairing the LVM volume group and the XFS file system.

The Setup

In our setup, we have a SLES 12 VM (the victim) with the following disk architecture:

I actually have 3 data disks, but we will only be working with 2 of them.
The 2 data disk LUNs map to Linux physical disks /dev/sdd and /dev/sde and are part of volume group volTMP, which contains a logical volume lvTMP1 striped over the two disks and on lvTMP1 is an XFS file system mounted as “/BIGSTRIPEDDISK”.
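To make the layout a little clearer, this is roughly what it looks like from inside Linux (an illustrative sketch of lsblk output based on the description above; your device names and sizes will differ):

lsblk /dev/sdd /dev/sde

NAME                MAJ:MIN RM  SIZE RO TYPE MOUNTPOINT
sdd                   8:48   0   32G  0 disk
└─volTMP-lvTMP1     254:0    0   64G  0 lvm  /BIGSTRIPEDDISK
sde                   8:64   0   32G  0 disk
└─volTMP-lvTMP1     254:0    0   64G  0 lvm  /BIGSTRIPEDDISK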

I actually created this setup as part of this post here, so you can follow the instructions on that post to get to the same state if you wish.

I also have, ready to start up, an Ubuntu VM created using a basic Azure VM type (it’s a B1s) and an Azure Ubuntu Server 18 LTS image.
This will be my rescue VM. It’s small, light and fast to boot up.
You don’t have to use an Ubuntu VM, but you will need another VM that is running Linux and able to mount the file system type that you use for your root file system (mine is ext4).

We Do the Damage

In this scenario we are deleting one of the data disks of the SLES 12 VM, from inside the Azure Portal.
The same situation could occur if you restored a VM from backup, but the VM had “selective disk backup” enabled, and restored with missing data disks.

The first thing we do, with the VM already shutdown, is remove the data disk (LUN2) from the Portal:

NOTE: We are not actually deleting the disk here in our test setup. It just detaches it from the VM. But imagine that we did detach and delete it completely.

Save the change:

We then start the VM:

The VM May Not Boot

Depending on your file system mount options and your O/S (I’m using SLES 12), by default the Linux VM will refuse to boot fully.
It will actually get stuck trying to mount the file system /BIGSTRIPEDDISK because the data disk is now missing (we deleted it!).

NOTE: If you have “nofail” in the fstab mount options, then your VM may boot normally, with the file system missing. You’re lucky. Skip ahead to the section on adding a new data disk (after section “Swap O/S Disk”).

The Linux O/S will go into recovery mode. If you have Boot Diagnostics enabled, you can verify this in the “Serial Console” within Azure Portal on the VM resource details screen.
In recovery mode, you are prompted to enter the root password to give you access to a basic shell. However, when deploying from Azure images, you don’t get a root user password, so you won’t know it!

If you don’t have Boot Diagnostics enabled, then you will be waiting some minutes until the VM boot hits a timeout and the Azure Portal informs you that it failed to start:

In either of the above cases, you may end up at this same point. The VM will not boot due to the failed disk.

What we need to do to recover from this situation and allow our SLES 12 VM to boot, is to comment out the failed file system from the /etc/fstab file on the SLES 12 VM’s O/S disk.
This will involve the use of the handy “swap O/S disk” button in the Azure Portal.

Create an Image of the O/S Disk

We have to create a snapshot image of the existing SLES 12 VM O/S disk, because we cannot detach the O/S disk from the existing VM.

Locate the SLES 12 VM in the Portal and click its O/S disk:

Click the “Create Snapshot” button, then give the snapshot a useful name:

I used standard HDD (cheaper), but you can choose SSD if you wish:

Click to go to the snapshot once it has been created:

We now have an image of the O/S disk, which we can use to create a new O/S disk.

Create New Disk from Image

We will create a new managed disk from the image of the O/S disk.
This will allow us to mount it on our Ubuntu VM (the rescue VM).

From the Azure Portal, create a brand new disk of the same size and specification as the original O/S disk.
NOTE: The Ubuntu VM is limited and may not support higher performing disk types like Ultra Disk, in which case you may need to create the new disk with a lower performance tier.

Select the image you created as the source and give the new disk a recognisable name:
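If you prefer the command line for these two steps, the Azure CLI equivalent would be something along these lines (a rough sketch; the snapshot and disk names are placeholders of my own choosing):

# snapshot the existing SLES 12 O/S disk
az snapshot create --resource-group <RG-NAME> --name suse-os-disk-snap --source <OS-DISK-NAME-OR-ID>

# create a new managed disk from that snapshot
az disk create --resource-group <RG-NAME> --name suse-os-disk-copy --source suse-os-disk-snap --sku StandardSSD_LRS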

Attach New Disk to Rescue VM

We now attach the new disk to the rescue VM (my Ubuntu VM) from the “disks” section of the Ubuntu VM resource:

It’s the first data disk, so is going on LUN 0:

Mounting the Disk on Rescue VM

Start the rescue VM (Ubuntu) if it is not already started, log onto the VM and either as root or using “sudo” check the disk devices present by running “lsblk”:

In my example the new disk is visible as /dev/sdc.
Because the disk is an O/S disk, it has partitions (it’s not a whole disk). For this reason, we have to mount the specific partition that the root (“/”) file system was mounted from.
In my case I can easily see that it is partition 4 (sdc4), because it is the largest partition on the /dev/sdc disk at 28.8G in size.

We have to create a location to mount the partition (“mkdir /mnt/suse_os_disk”) then mount partition 4 from sdc using the “mount” command:
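In my case the commands look like this (the device and partition come from my lsblk output above; check your own before mounting):

# create the mount point and mount partition 4 of the copied O/S disk
mkdir /mnt/suse_os_disk
mount /dev/sdc4 /mnt/suse_os_disk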

The mount command is intelligent enough to know what file system is on the disk.

Adjust Fstab File

With the new disk mounted on the rescue VM, we can use our favourite text editor to adjust the fstab file and comment out the affected file system, to prevent it from being mounted.

vi /mnt/suse_os_disk/etc/fstab

We comment out /BIGSTRIPEDDISK :
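The affected line in the fstab then looks something like this (the device path and mount options shown here are illustrative; the important part is the leading “#”):

#/dev/volTMP/lvTMP1   /BIGSTRIPEDDISK   xfs   defaults   0 0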

Save the file changes.

We can now safely unmount the disk and then disconnect it from the rescue VM:
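For example:

umount /mnt/suse_os_disk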

From the Azure Portal, we then detach the disk from the rescue VM:

Swap O/S Disk

In Azure Portal, go to the SLES 12 VM and in the disks view of the VM, click the “Swap OS disk” button:

Select the new disk that we have just unmounted from the rescue VM:

Start the SLES VM and it will boot off the new disk:

The VM will boot up successfully.
Great stuff. All that effort and so far we have a booting VM.
We still have the initial problem: we deleted one of our data disks. We need to create a new data disk.

Add New Data Disk

In the Azure Portal, on the SLES VM, add a new data disk with the same specification as the original.
You can guess if you are not sure, but remember that it should be the same tier and size as the other disks in the striped LVM logical volume.

Save the change:

Repair Volume Group

With the new data disk added, we can now start the process of repairing the volume group.

We execute a pvscan to list physical volumes on the VM:
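The output will look something like the following (an illustrative reconstruction, not a verbatim capture; the key point is the warning about a missing device and the “[unknown]” physical volume, along with its old UUID):

pvscan

  WARNING: Device for PV <previous uuid> not found or rejected by a filter.
  PV /dev/sdd    VG volTMP   lvm2 [32.00 GiB / 0    free]
  PV [unknown]   VG volTMP   lvm2 [32.00 GiB / 0    free]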

In the above we can see that LVM is reporting a missing physical volume. This is the one we deleted.

Using “lsblk” we can see the new device right at the end, it’s /dev/sde:

We can create the new physical volume and apply the previous UUID to the disk, to make LVM think this is the same disk, then we get LVM to write the configuration backup to the new disk.

First, let’s check what LVM configuration backups we have for our volume group:

ls -ltr /etc/lvm/archive/volTMP*

We choose the latest one available before we lost the disk:

We can now re-create the physical volume, applying the previously used UUID and LVM configuration (metadata):

pvcreate --uuid '<previous missing uuid from the pvscan output>' \
 --restorefile /etc/lvm/archive/volTMP_<latest>.vg /dev/sde

Now we tell LVM to restore itself into a working order using the configurations available on the disks:
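The usual command for this step is vgcfgrestore (a hedged sketch of the invocation, using the same archive file we chose above):

# restore the volume group metadata for volTMP from the chosen archive file
vgcfgrestore -f /etc/lvm/archive/volTMP_<latest>.vg volTMP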

Let’s check the status of our logical volume that exists in the volTMP volume group:
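A quick way to check is “lvs”; the output below is illustrative (the fifth character of the Attr column shows “a” when the logical volume is active):

lvs -o lv_name,vg_name,lv_attr,lv_size volTMP

  LV     VG     Attr       LSize
  lvTMP1 volTMP -wi------- 64.00g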

In the above we notice that the “a” (active) flag is not set; the logical volume is therefore not yet active.
Let’s activate it:

lvchange -ay /dev/volTMP/lvTMP1

You can see that it is now active. Great!
We have repaired LVM. We no longer get any warnings about missing disks when executing the LVM related commands like “pvs, lvs, vgs”.

Repair File System

If we were to try and mount the file system /BIGSTRIPEDDISK, it would show an XFS error, because our new disk does not yet have a file system on it.
The file system is in a strange status, because 50% of the blocks are on the disk that was not deleted, and 50% are non-existent, because they were on the disk that was deleted.
So we actually have to repair the file system.
Instead of repairing, we could have chosen to just apply a new file system with mkfs.xfs, but let’s do a repair and see what the process is.

xfs_repair -L /dev/mapper/volTMP-lvTMP1

We can now edit the fstab and uncomment our file system /BIGSTRIPEDDISK:

Finally, we try and mount the file system:
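Since the file system is back in the fstab, a simple mount by path is all that is needed:

mount /BIGSTRIPEDDISK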

It worked, and it was a clean mount. Nice.

Where Are My Files

With our repaired file system mounted, we dive in and look for files:

Ah yes! It’s clean!
No files will exist because we lost the disk. The LVM striping that we use is for performance, not redundancy, which means when you have to re-create the disk and repair the file system, all files will be lost.

Summary:

  • Actually deleting data disks is not simple in the Azure Portal. Microsoft have done a good job to try and prevent you from doing it by mistake, but it is still possible to do it by accident and also through code.
  • Turn on boot diagnostics on your VMs, it helps to see what is going on during boot.
  • Add “nofail” to the mount options for the data disks on Debian based systems. This will allow them to boot even with missing data disks.
  • When a data disk that is actively mounted at Linux boot goes missing, the VM may not boot at all.
    You could reset all your root account passwords and securely store them, which would allow you to enter recovery mode, but this is not something that most companies do.
    Be prepared and have a rescue VM ready to start up. This is the best option and could help in a number of scenarios.
  • Once the VM is booting again, we can use LVM to simply restore the state of the volumes and file systems. We don’t need to re-create the LVM setup.
  • In a striped logical volume, we stripe for performance, not redundancy; you will lose data if you lose one of the data disks.
  • Using the “selective disk backup” feature saves backup vault space, but it means you will need to use this process to restore the volume groups for missing disks! Be wary and plan ahead!
  • Test backup & restore processes!

In another blog post, I will show how to automate the root disk snapshot and disk creation followed by attaching to another VM. We will have a single script that can be run to automate the whole process. This is useful to help fix other issues such as when you have enabled Linux HugePages with more memory than the VM has!

New SAP ASE Audit Logging Destination in 16.0.4

Let’s face it, auditing in SAP ASE 16 is difficult to configure due to the requirement to produce your own stored procedure and correctly size the rotating table setup with multiple database segments for each of the multiple audit tables. Once configured, you then had the realisation that to obtain the records, you needed to extract them from the database somehow; then there was the problem of who does this task, what privileges they need, whether they themselves should be audited, etc.

Good news! With the introduction of ASE 16.0 SP04, configuring auditing in the SAP ASE database just got dramatically easier and way more useable!

Introducing “audit trail type”

In previous versions of the SAP ASE database, database audit records were stored only in the sybsecurity database in a specific set of tables that you had to size and create and rotate yourself (or with a stored procedure).

Once the records were in the database, it was then up to you to define how and when those records would be analysed.
How complex the extraction process would be depended on whether your SIEM tool supported direct ODBC/JDBC access to SAP ASE or not.

In SP04 a new parameter was introduced called “audit trail type” where you can now set the audit store to be “syslog”.
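Changing the destination is then a single configuration change (a sketch; check the SP04 documentation for the exact procedure and whether auditing needs to be restarted):

-- send audit records to the O/S syslog daemon instead of the sybsecurity tables
sp_configure 'audit trail type', 0, 'syslog'
go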

When setting the store to be “syslog”, the audit records are pushed out to the Linux syslogd daemon (or rsyslogd or syslog-ng) and written to the O/S defined location according to the configuration of syslogd:

Each audit record gets a tag/program name of “SAP_ASE_AUDIT”, which means you can define a custom syslogd log file to hold the records, and also then specify a custom rotation should you wish.
Your syslogd logs may already be pulled into your SIEM tools, in which case you will simply need to classify and store those records for analysis.
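As an example, on a system using rsyslogd, a small drop-in rule can route those records to their own file (an illustrative sketch; the file names below are my own choice):

# /etc/rsyslog.d/10-sap-ase-audit.conf
:programname, isequal, "SAP_ASE_AUDIT"    /var/log/sap_ase_audit.log
& stop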

With the new parameter set to “syslog” and the audit records being stored as files on the file system, you will need to ensure that the file system has adequate space and establish a comfortable file retention (logrotate) configuration, so that the audit records do not fill the file system (which would prevent persistence of additional audit records).
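A matching logrotate entry keeps the growth under control (again just a sketch, with made-up retention values, using the same illustrative file name as above):

# /etc/logrotate.d/sap_ase_audit
/var/log/sap_ase_audit.log {
    weekly
    rotate 8
    compress
    missingok
    notifempty
}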

Of course, should you enjoy torture, you can always go ahead and continue to use the database to store the audit records. Simply setting the new parameter “audit trail type” to “table” will store the audit records in the database, just like in previous versions of ASE.

Useful Links

What’s new in ASE 16.0.4

Parameter: audit trail type

Listing & Tagging Orphaned Azure Disks using Azure CLI from Bash Cloud Shell

Nobody likes wastage, but it happens.
Finding orphaned Azure disks is super simple using the Cloud Shell.

In this post I show how you can use the CLI, with a little Bash scripting in the Cloud Shell, to find unattached disks and also tag them for future removal.
There are plenty of PowerShell examples in the PowerShell Runbook gallery, but I want to show a CLI example.
I then go on to show how this could be scheduled, but it’s not as simple as first thought.

Using Bash Cloud Shell & the Az Command (CLI)

We can use the az command (CLI) in a bash Cloud Shell to list the disks, then use JMESPath to filter and find the ones that are not attached (“Unattached”) to any VM:

az disk list --query "[?diskState=='Unattached'].{name:name,state:diskState,size:diskSizeGb,sku:sku.name,tag:to_string(tags)}" -o tsv

ase01_datadisk_2        Unattached      256     Standard_LRS    null
ase01_OsDisk_1_8e31587ee6604463ada5167f91e6345f Unattached      30      StandardSSD_LRS null

Notice I have also included the “tags” column with some processing.
If you wanted to search for disks that are not attached, and filter by tag, then you can do the following.
First I apply a tag; I’m going to create a tag called “testTag”, with a value “testTagValue”:

az disk update --name ase01_datadisk_2 --resource-group UK-West --set tags.testTag=testTagValue

Now I have set a value for “testTag”, let’s query based on that specific tag, for “Unattached” disks:

az disk list --query "[?diskState=='Unattached'&&tags.testTag=='testTagValue'].{name:name,state:diskState,size:diskSizeGb,sku:sku.name,tag:to_string(tags)}" -o tsv

ase01_datadisk_2        Unattached      256     Standard_LRS    {"testTag":"testTagValue"}

You can search for a specific value, contains a part of a value, or alternatively, search for a value that is “null”.
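For example, something like the following should cover the last two cases (a sketch using the same tag name as before):

# disks where the tag value contains part of a string
az disk list --query "[?diskState=='Unattached'&&contains(to_string(tags.testTag),'testTagVal')].name" -o tsv

# disks where the tag is not set at all (to_string turns a missing value into the string 'null')
az disk list --query "[?diskState=='Unattached'&&to_string(tags.testTag)=='null'].name" -o tsv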

Adjusting the Disks

Now that we have a list of disks output from our query, we could just delete them.
But it’s probably better to tag them for deletion, allowing a deletion at some point after a more detailed review. I would look to produce a report that could go to a CAB review, ready for deletion.

Here’s how we can use the power of the Cloud Shell, the Azure CLI and Bash scripting to tag those disks that are Unattached:

updated=0
failed=0
tabchar="$(printf "\t")"
deldate="$(date -d "+30 days" "+%Y%m%d")"
while read -r line
do
   diskname="${line%%${tabchar}*}"
   rgname="${line##*${tabchar}}"
   echo -n "Updating: $diskname ... "
   az disk update --name ${diskname} --resource-group ${rgname} --set tags.deleteDate=${deldate} >/dev/null
   if [[ $? -eq 0 ]] ; then
      echo "[SUCCESS]"
      (( updated++ ))
    else
      echo "[FAILED]"
      (( failed++ ))
   fi
done < <(az disk list --query "[?diskState=='Unattached'].{name:name,rg:resourceGroup}" -o tsv)
printf "### SUMMARY ### \n Updated: %s\n Failed: %s\n" $updated $failed

Let’s look at the above script in detail below:

  1. We define the variable to hold our updated count.
  2. We define the variable to hold our failed count.
  3. We capture the character that represents a TAB character, which we will use to split the query output.
  4. We calculate the deletion date as today + 30 days, in the format yyyymmdd.
  5. We start a “while” loop that reads the query output one line at a time.
  6. The start of the loop body.
  7. We get the disk name out of the line by splitting the line by TAB from the left.
  8. We get the disk resource group out of the line by splitting the line by TAB from the right.
  9. A little output text to say which disk is being worked on.
  10. The call to the CLI to update the disk, applying the tag “deleteDate” with the calculated deletion date.
  11. We detect the successful (or not) execution of the CLI command.
  12. Output “success” for a successful execution.
  13. Update the success variable count.
  14. Alternatively, if we failed.
  15. Output “failed” for a failed execution.
  16. Update the failed variable count.
  17. Closure of the “if” statement.
  18. Closure of the loop, which is fed by the CLI query listing the unattached disks (feeding the loop this way keeps the counters in the current shell).
  19. Output a summary count of success vs failed.

Here is the execution sample:

Reporting on the deleteDate Tag

Now we have our disks tagged ready for deletion, we might want to find disks that have a deleteDate older than a specific date. These disks would then be ripe for deletion.
Here’s how we can do that:

az disk list --query "[?tags.deleteDate<'20210402'].{name:name,rg:resourceGroup,tag:to_string(tags)}" -o tsv

We can even go as far as using Bash to inject the current date into the query, thereby allowing us to look for any disks due for deletion from today without adjusting the query each time:

az disk list --query "[?tags.deleteDate<'$(date "+%Y%m%d")'].{name:name,rg:resourceGroup,tag:to_string(tags)}" -o tsv

How About Scheduled Execution?

Well, using the CLI and the Bash shell inside an Azure Automation account is not possible right now.
Instead, we would need to convert our scripted CLI example to PowerShell to make it Runbook compatible, then schedule it inside an Azure Automation account as a PowerShell Runbook.
The code needs to change slightly because we need to use the automation “RunAs Account” feature to connect from our Runbook, but it looks very much like our CLI code in Bash:

IMPORTANT: All the “simple” help guides on creating a PowerShell Runbook fail to mention the need to actually import the required PowerShell modules that you will need to run your code. In the Automation Account section, look at the “modules”. For the code below you need 2 additional modules: Az.Accounts and Az.Compute which you can import from the Gallery. Seems simple, but as I said, not obvious.

$Conn = Get-AutomationConnection -Name 'AzureRunAsConnection'
try{
    $auth = Connect-AzAccount -ServicePrincipal -Tenant $Conn.TenantID -ApplicationId $Conn.ApplicationID -CertificateThumbprint $Conn.CertificateThumbprint
} Catch {
    throw $_.Exception
}
$updated=0
$failed=0
$deldate=(get-date -Format "yyyyMMdd" -date $([datetime]::parseexact($(get-date -format 'yyyyMMdd'), 'yyyyMMdd', $null)).AddDays(+30))
$updateConfig = New-AzDiskUpdateConfig -tag @{ "deleteDate" = "$deldate" }
Get-AzDisk |? { $_.DiskState -eq "Unattached" } |% { 
   write-output "Updating: $($_.Name) ..."
   Update-AzDisk -ResourceGroupName $_.ResourceGroupName -DiskName $_.Name -DiskUpdate $updateConfig >$null 2>&1
   if ( $? -eq $true ) { 
      echo "   [SUCCESS]"
      $updated++
    }
    else {
      echo "   [FAILED]"
      $failed++
   }
}
echo "### SUMMARY ### `n Updated: ${updated}`n Failed: ${failed}`n"

With the above code in a PowerShell Runbook, we can test it:

Obviously if you will be scheduling the code above, then you may wish to change it slightly so that it excludes any orphaned disks already with a deleteDate tag.

Summary

Using the CLI, Cloud Shell and some Bash scripting, we have a simple mechanism to tag unattached disks, then use that tag to report on disks due for deletion after a specific date (or we could use today’s date).
This is a great solution for those with Bash shell scripting skills and as shown it is reasonably simple.

We have also looked at the possibility of scheduling the code and found that it is not possible for CLI. Instead a PowerShell Runbook is a possible solution that allows the scheduling of PowerShell code.
PowerShell could be your pain point, but it really isn’t that far from shell.

SAP Netweaver ICM Fast Channel Architecture

SAP Netweaver has been around for many, many years now. In fact we have had very nearly 20 years of Netweaver.
Back in March 2001, SAP acquired TopTier and went on to use TopTier’s application as the underpinning to the SAP Netweaver application server (WebAS).
Now this would not have been the Netweaver Java stack, that was to come later in the form of WebAS 6.30.
My point is, you would imagine by now that Netweaver is known inside and out by most BASIS professionals, but this is just not the case. It’s a complex and very capable application server, and there are things that we merely know of and things that we know in detail.
One of the things that seems to be little known is the FCA and its role within the ICM of the Netweaver Java stack.

In this post I want to explain the function of the SAP Netweaver Internet Communication Manager (ICM) Fast Channel Architecture (FCA) and how this is responsible for routing the HTTP communications to your Netweaver Java stack.

As usual, a little context will help set the scene.

A History of Netweaver Java

Before Netweaver 7.1, the Java stack did not have an Internet Communication Manager (ICM). This was reserved only for the Netweaver ABAP stack.
Instead, these old Netweaver Java versions had additional Java nodes (JVMs) called dispatcher nodes (in addition to the server0 node).

The dispatcher node was responsible for receiving and dispatching the inbound HTTP requests to the server nodes of the instance.

The ICM Was Added

Since Netweaver 7.1, the Java stack was given the ICM, which runs from the Kernel binaries, instead of a JVM.


The benefits of this change were:

  • Faster startup and response time (Kernel is C++ compiled binary code).
  • Smaller memory requirements.
  • Same ICM in Netweaver ABAP and Netweaver Java (same Kernel DB independent part).
  • Use of profile files for configuration (for SSL, security, memory params) instead of ConfigTool.

Identifying the FCA

We know the ICM is visible as a separate binary executable process at the operating system level.
In Windows we see “icman.exe” and in Unix/Linux we see “icman”.
At execution, the icman program reads the instance profile to determine its configuration.

The Fast Channel Architecture (FCA) is a specific, dedicated set of memory pipes (MPIs) in the shared memory region, accessible by both the ICM and the Java server nodes and used as a method of super fast inter-process communication between the ICM and the Java server nodes.
In Linux, shared memory segments are visible using the “ipcs -m” command. In Windows these are memory mapped files and you cannot see them so easily; you would need a 3rd party tool.

Using shared memory and the concept of memory pipes avoids the need for the data in an HTTP request/response to be sent from the ICM to the Java Server node. Instead of sending the actual data, a simple memory pointer can be sent (smaller and consistent in size), telling the Java Server node where to look in memory for the data.
Effectively what this means is that the shared memory area for the MPIs, sits logically between the ICM and the Java Server nodes.

According to the Netweaver AS Java documentation, the FCA is itself just another MPI, that acts as a FIFO queue.
The HTTP requests coming into the ICM via a TCP port, travel through a regular (anonymous) MPI, before the ICM dispatches the request into a specific FCA queue.
If you have two server nodes on your Java stack (server0 and server1), then the ICM will query the server node to determine the back-end load, then push the request to the specific FCA queue of the target server node that has capacity to handle the request.
Therefore, if you have two server nodes, you will have a dedicated FCA queue for each.
It is the responsibility of the Java server node, to create the FCA queue in the ICM shared memory during start-up.

Once the HTTP request (or rather, the memory pointer to the request) hits the FCA, it becomes the responsibility of the Java server node to pull the request off the queue into a thread for processing.
Inside the Java Server node, these threads are known as the FCA threads or HTTP Worker Threads.
If you run a SAP PI/PO system, then you may already be familiar with these threads and their configuration.
You may have seen these threads when running thread dumps for SAP support incidents.

There are two methods to actually see the FCA Queues:

  • Within the SAP ICM Web Administration page.
  • Using the “icmon” command line tool.

We can call the icmon tool as follows:

icmon pf=<path-to-instance-profile>

then from the menu select "m"
then from the menu select "y"

Once the MPI list is dumped (option “y”), the FCA queues are visible at the end of the output:

...
MPI<174>: 4d50494d 'ANON' 11 50 0 0 0 0(4996) 1(30001) 1(30001)
MPI<173>: 4d50494d 'ANON' 10 50 0 0 0 0(4996) 1(30001) 1(30001)
MPI<60>: 4d50494d 'TS1_00_1234650_HTTP_WAIT' 5 -1 20 0 0 0(4996) 1(10002) 0(-1)
MPI<5f>: 4d50494d 'TS1_00_1234650_HTTP' 4 -1 20 0 0 0(4996) 1(10002) 1(30001)
MPI<58>: 4d50494d 'TS1_00_1234651_HTTP_WAIT' 2 -1 20 0 4406 0(4996) 1(10003) 0(-1)
MPI<57>: 4d50494d 'TS1_00_1234651_HTTP' 7 -1 20 0 0 0(4996) 1(10003) 1(30001)
MPI<52>: 4d50494d 'TS1_00_1234650_P4' 6 -1 20 0 0 0(4996) 1(10002) 1(30001)
MPI<4d>: 4d50494d 'TS1_00_1234651_P4' 3 -1 20 0 0 0(4996) 1(10003) 1(30001)
MPI<4>: 4d50494d 'ANON' 1 1 0 0 0 0(4996) 1(30001) 1(30001)
MPI<2>: 4d50494d 'ANON' 0 1 0 0 0 0(4996) 1(30001) 1(30001)
 
    q - quit
    m - menue 

NOTE: For those interested, the 4d 50 49 4d at the beginning of each line, translates from HEX to ASCII as “MPIM”.

In my example, you can see I have 2 Java server nodes registered at this ICM: 1234650 and 1234651.
You will notice that there are 3 queues for each Java server node.
The P4 queue is self explanatory: it is used to talk to the Java server node on its P4 port (SAP proprietary protocol) and is probably used to acquire capacity/load information from the server node.
Of the other 2 queues, one queue is the “WAIT” queue and is where (I think) the inbound requests (destined to the Java server node) are held, before they enter the other request queue which is where (I think) the Java server node is waiting to process the requests.
(There is not a great deal of documentation on the above, but I have seen instances where the WAIT queue fills, which makes me believe it’s a holding area).

In the dev_icm trace we can also see the joining of the server nodes to the ICM for the HTTP protocol (other protocols are supported, such as Telnet, P4):

[Thr 140608759801600] Wed Mar 17 22:59:32:934 2021
[Thr 140608759801600] JNCMIHttpCallLBListener: node 1234650, service Http joins load balancing
[Thr 140608759801600] HttpJ2EELbPut: server 1234650 started protocol HTTP, attached to request queue TS1_00_1234650_HTTP
[Thr 140608759801600] JNCMIHttpMsPutLogon: set http logon port (port:50000) (lbcount: 2)
[Thr 140608759801600] JNCMIHttpMsPutLogon: set https logon port (port:50001) (lbcount: 2)

In the Java server node developer trace files (e.g. dev_server0 and dev_server1), we can see the name of the node (JNODE_10002 for server0) which is also visible in the dev_icm trace output in column 10:

F [Thr 139637668607872] Wed Mar 17 22:53:49 2021
F [Thr 139637668607872] JSFSetLocalAddr: using NI defaults for bind()
I [Thr 139637668607872] MtxInit: JNODE_10002 0 2

The relevant dev_icm output:

MPI<60>: 4d50494d 'TS1_00_1234650_HTTP_WAIT' 5 -1 20 0 0 0(4996) 1(10002) 0(-1)
MPI<5f>: 4d50494d 'TS1_00_1234650_HTTP' 4 -1 20 0 0 0(4996) 1(10002) 1(30001)

Sizing the FCA

The size of the FCA is not directly configurable.
Instead, we can configure the size of the shared memory area (total area) for all the MPIs using parameter “mpi/total_size_MB“, then from this total size, the maximum possible size of any individual MPI is fixed to 25% of the total area size.

In later Netweaver versions (7.40+), it is not recommended to adjust “mpi/total_size_MB“, instead, adjust the “icm/max_conn” parameter, which is then used to calculate “mpi/total_size_MB“.
The internal formula is described as:
mpi/total_size_MB = min(0.06 * $(icm/max_conn) + 50, 2000)
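As a worked example (using a hypothetical value, since I have not shown my actual icm/max_conn setting): with icm/max_conn = 20000, the formula gives min(0.06 * 20000 + 50, 2000) = min(1250, 2000) = 1250 MB, which matches the “total size MB=1250” visible in the trace extract further below.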

There is another undocumented (apart from SAP notes) parameter, which can allow you to increase the max size of an MPI. However it means any one MPI can consume more of the total area than the default 25%.
It is therefore not advised to be adjusted.

We can see the value of the parameter “mpi/total_size_MB” in the ICM developer trace file (dev_icm) during its start-up. This is useful as it shows us the calculation based on the formula mentioned above.
We are looking at “total size MB” right at the end of the line:

[Thr 140610607359872] MPI init, created: pipes=40010 buffers=19985 reserved=5995 quota=10%, buffer size=65536, total size MB=1250

Common FCA Errors

There is a dedicated set of SAP notes for FCA errors, such as 1867119.
Based on the architecture, we can see that they describe issues with throughput (through the FCA Queue) and issues in the Java server node threads causing the FCA Queues to fill.
They also cover the sizing of the MPIs and the number of worker threads (for high throughput scenarios).

In my experience the following types of FCA errors can be seen in the Java server developer traces “dev_server<n>” files:

  • “-3” error: The Java server node is unable to put a response back onto the FCA Queue, probably because the MPI area is full from a full FCA Queue. This can happen if one of the Java server node HTTP Worker threads has become stuck (waiting) for resources or for the database.
    As you will see from my previous diagram, a full MPI area will then start to affect HTTP access to both Java server nodes as they share the ICM (it’s a single point of failure).
  • “-7” error: This affects one individual Java server node and prevents it from pulling requests off the FCA queue in a timely manner. This specific issue is usually a timeout mismatch between the HTTP provider and the ICM.

Both of the above errors look similar, but one is a lack of resources in the Java stack and the other is a full FCA Queue (in shared memory) due to inaction (stuck threads) in the Java stack.
The “-7” error can therefore present itself as an issue in the ICM or in the Java stack, but it is usually a problem in the Java stack that causes it to close the connection early.

Summary

There you have it, the simple FCA queue that serves HTTP requests to your Java Server nodes.
We learned:

  • Netweaver Java was given the ICM in 7.1 onwards.
  • The ICM in the Netweaver Java and ABAP stacks is the same binary.
  • The ICM uses shared memory for the MPIs.
  • The shared memory area is sized via the “mpi/total_size_MB” parameter, whose value is derived from the “icm/max_conn” parameter in NW 7.40+.
  • The FCA queues are MPIs.
  • Only memory pointers are passed through the FCA Queues.
  • The Java server nodes are responsible for creating the FCA queues in the ICM shared memory.
  • There are 2 FCA queues for each server node.
  • The developer traces store information about the size of the ICM shared memory and the registration of the Java Server nodes to a queue.
  • There are a known set of errors that can occur and are documented in SAP notes.

Useful SAP References

  • SAP Note 1867119 – No more memory for FCA
  • SAP Note 2417488 – Resource leak for MPI buffers in FCA communication
  • SAP Note 1945745 – How to increase HTTP Worker (FCA) threads in PI
  • SAP Note 2579836 – AS Java system has performance problem – FCAException – Best practices and tuning recommendations.
  • SAP Note 2997765 – AS Java system has performance problem – FCAException – Best practices for analysis
  • SAP Note 2276273 – AS Java – How to identify the largest MPI buffer consumer by MPI dump

HowTo: Install Azure Enhanced Monitoring for Linux for SAP

One SAP support prerequisite for running SAP on Azure, is that you must have Azure Enhanced Monitoring for Linux installed onto the Azure Linux VMs where your SAP application runs (including DB servers). Details are in SAP note 2015553.

In this brief post I show how to check if it is already installed, then how to install it, without needing to install the Powershell Azure Cmdlets.

What is Azure Enhanced Monitoring for Linux?

Azure Enhanced Monitoring for Linux (AEM) is an Azure VM extension installed onto the target Linux VM.
The extension uses the Azure Instance Agent to pull additional telemetry information down onto the local VM, and places it into a file on the Linux file system called /var/lib/AzureEnhancedMonitor/PerfCounters.

This special file is pure ASCII text with data inside that is semi-colon separated.
You can use Linux command line utilities to query information from the file (it’s readable by any user).
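For example (the field positions picked out here are purely illustrative, as I am not documenting the exact record layout):

# the file is plain ASCII, one counter per line, fields separated by semi-colons
head -5 /var/lib/AzureEnhancedMonitor/PerfCounters

# pull out a few fields from each record for easier reading
awk -F';' '{ print $2, $3, $5, $6 }' /var/lib/AzureEnhancedMonitor/PerfCounters | head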

The file is parsed by the SAP Host Agent (also installed on every SAP VM) and made available in the monitoring memory segment used by the Netweaver ABAP stack, with the data being visible in transaction ST06 (OS06).

How to Check if AEM Is Installed

There are a number of ways to check if Azure Enhanced Monitoring for Linux is installed on a VM:

  • Inside the VM in Linux we can check for the existence of file: “/var/lib/AzureEnhancedMonitor/PerfCounters”
  • Inside the VM in Linux we can check the extension home dir exists: “/var/lib/waagent/Microsoft.OSTCExtensions.AzureEnhancedMonitorForLinux-*”
  • In the Azure Portal, we can check the status of the extension in the Azure Portal:
  • In the Azure Cloud Shell, we can either Test or Get the AEM Extension to see if it is installed:
Get-AzVMAEMExtension -ResourceGroupName <RG-NAME> -VMName <VM-Name>
Test-AzVMAEMExtension -ResourceGroupName <RG-NAME> -VMName <VM-Name>

Installing AEM

There are two ways to install the Azure Enhanced Monitoring for Linux extension into a VM:

  • Using local PowerShell (on your computer) with the Azure Cmdlets installed.
    You will need to have the rights on the local machine to perform the install of the Azure Cmdlets.
    I will not cover this method as it is quite tedious to setup and the chances are that your PowerShell is locked down by your company and will not allow you to install the required Cmdlets.
  • Using Powershell in the Azure Portal Cloud Shell.
    This has all the required Cmdlets already installed, but to setup the Cloud Shell you will need rights in Azure to be able to create a Storage Account to use for your shell home location.

Out of the two options, I usually opt for the Cloud Shell. Once you have it setup, you will find you can use it for many other things and access it from anywhere!
In this post I will be using Cloud Shell to do the installation.

To install the AEM extension, we use Powershell commands to do the following sequence of tasks:

  • Obtain our subscription context.
  • Deploy the extension to the specific VM in the subscription.

Let’s start the Cloud Shell (NOTE: You will need a Storage Account for the Cloud Shell to work).
Go to the Azure Portal and click the button on the button bar:

Make sure that you are in a PowerShell Shell:

We may need to switch to a specific subscription.
We can list all subscriptions by calling Get-AzSubscription and filtering on the Id property:

Get-AzSubscription | Select-Object Id

We can then set the context of our Cloud Shell to the specific subscription Id as follows:

$context = Get-AzSubscription -SubscriptionId '<SubscriptionID>'
Set-AzContext -SubscriptionObject $context

Once the code has executed, we can check if the AEM extension is already installed:

Get-AzVMAEMExtension -ResourceGroupName <RG-NAME> -VMName <VM-Name>

If the AEM extension is already installed, then we will see output being returned from the Get command:

ResourceGroupName       : UK-West
VMName                  : vm01
Name                    : AzureEnhancedMonitorForLinux
Location                : ukwest
Etag                    : null
Publisher               : Microsoft.OSTCExtensions
ExtensionType           : AzureEnhancedMonitorForLinux
TypeHandlerVersion      : 3.0
Id                      : /subscriptions/mybigid/resourceGroups/UK-West/providers/Microsoft.Compute/virtualMachines
                          /vm01/extensions/AzureEnhancedMonitorForLinux
PublicSettings          : {
                            "cfg": [
                              {
                                "key": "vmsize",
                                "value": "Standard_D4s_v3"
                              },
                              {
                                "key": "vm.role",
                                "value": "IaaS"
                              },
                              {
                                "key": "vm.memory.isovercommitted",
                                "value": 0
                              },
                              {
                                "key": "vm.cpu.isovercommitted",
                                "value": 0
                              },
                              {
                                "key": "script.version",
                                "value": "3.0.0.0"
                              },
                              {
                                "key": "verbose",
                                "value": "0"
                              },
                              {
                                "key": "href",
                                "value": "http://aka.ms/sapaem"
                              },
                              {
                                "key": "vm.sla.throughput",
                                "value": 96
                              },
                              {
                                "key": "vm.sla.iops",
                                "value": 6400
                              },
                              {
                                "key": "wad.isenabled",
                                "value": 0
                              }
                            ]
                          }
ProtectedSettings       :
ProvisioningState       : Succeeded
Statuses                :
SubStatuses             :
AutoUpgradeMinorVersion : True
ForceUpdateTag          : 637516905202791108
EnableAutomaticUpgrade  :


If the AEM extension is not installed, no output will be seen from the “Get” command.
We can then install the AEM extension with the “Set-AzVMAEMExtension” command as follows:

Set-AzVMAEMExtension -ResourceGroupName <RG-NAME> -VMName <VM-Name>

The extension should be installed successfully.
If you need to remove it, you can use the “Remove-AzVMAEMExtension” command.

There is a “Test” command that you can call to test the AEM:

Test-AzVMAEMExtension -ResourceGroupName <RG-NAME> -VMName <VM-Name>

Finally, if you want to see the additional command line options, then use the standard “Get-Help” as follows:

Get-Help Set-AzVMAEMExtension -Full

Issues with AEM

There’s one known issue with Azure Enhanced Monitoring for Linux: the number of data disks reported in the PerfCounters file seems to be limited to 9.
This means that if you have more than 9 data disks, the performance data may not be visible in the file and therefore not visible in SAP.
It’s possible a fix is on the way.