This blog contains experience gained over the years of implementing (and de-implementing) large scale IT applications/software.

Create a SAML Key & Cert in Powershell for PowerBI to HANA SSO

You’ve seen the Microsoft documentation and you want to create a private key and a certificate that can be used with a SAML assertion for single sign-on between Microsoft PowerBI Gateway and a SAP HANA 2.0 database.

Where’s the problem?

The problem is that the Microsoft documentation simply says to use “openssl”. Not only that, but it suggests that you create a CA (Certificate Authority) key and certificate first, then create a separate key and certificate for the actual IdP (which in this setup is the PowerBI Gateway), and sign it with the new CA certificate, creating a chain.

This is just not needed in SAML. In a SAML assertion, the certificate used for signature doesn’t need to be signed/issued by anyone in particular because the chain is not validated. The lifetime of the certificate can be loooong, because there is no chain. In the case of a compromised key, the IdP key and certificate would need to be re-created and the new certificate distributed to the target systems (service providers like SAP HANA database).

In this post I’m actually addressing three issues with the Microsoft documentation (I’ve raised one on the GitHub page, but alas it has not been closed yet).

  1. No CA is needed. A self-signed IdP certificate works perfectly fine.
  2. OpenSSL is not needed. We can do it all in Powershell (with an evil laugh added in for extra evilness).
  3. We don’t need to export the key and certificate into a PFX file, then import it into the Windows certificate store.

Show me the Code

After fiddling around with various iterations, I found that the following Powershell code produces a key and certificate that is accepted by PowerBI Gateway (the key uses a valid provider).
The certificate is also accepted by SAP HANA once you’ve imported it into the new Identity Provider that you create in the HANA Cockpit (or HANA Studio if you must – tut tut tut).

Code to create a private key, plus certificate valid for 10 years from today and store it in the Windows “LocalMachine” certificates store in the “Personal” folder on the PowerBI Gateway:

# Parameters for New-SelfSignedCertificate (splatted below).
$params = @{
    Subject           = "CN=PowerBI HANA IdP,OU=MyDept,O=MyCompany,L=MyTown,ST=MyShire,C=GB"
    KeyAlgorithm      = 'RSA'
    KeyLength         = 2048
    NotAfter          = $(Get-Date).AddMonths(120)    # valid for 10 years from today
    FriendlyName      = "PowerBI HANA IdP Cert"
    KeyFriendlyName   = "PowerBI HANA IdP Cert"
    HashAlgorithm     = 'SHA256'
    KeyUsage          = 'None'
    SuppressOid       = '2.5.29.37'
    CertStoreLocation = 'Cert:\LocalMachine\My'       # LocalMachine store, "Personal" folder
    Provider          = 'Microsoft Enhanced RSA and AES Cryptographic Provider'   # required by PowerBI Gateway
    KeySpec           = 'KeyExchange'
    TextExtension     = @("2.5.29.19={text}CA=true")  # BasicConstraints: CA=true
}

$cert = New-SelfSignedCertificate @params

The code above generates an X.509 certificate that contains “CA=true” in the BasicConstraints section.
This is the only part that may not actually be needed. I was really tired after all the investigation. Think of the time I have just saved you here!
If you don’t do this part and it still works, let me know.

Without the “Provider” parameter specifically set to “Microsoft Enhanced RSA and AES Cryptographic Provider” the PowerBI Gateway would not select the key during the SSO connection test.
Instead it would throw an error like so in the GatewayErrors log:

InnerToString=<ccon>System.Security.Cryptography.CryptographicException: Invalid provider type specified.

I had to use the “SuppressOid” parameter with OID 2.5.29.37 (the Enhanced Key Usage extension) to prevent the default usages being added, even when I specifically set “KeyUsage = 'None'”.
This is really just cosmetic, because I wanted the certificate to look as close as possible to the openssl generated certificate. If you don’t do this part and it still works, let me know.

I believe that the OpenSSL equivalent command lines are:

openssl req -x509 -sha256 -days 3650 -newkey rsa:2048 -keyout IdP_Key.pem -out IdP_Cert.pem -subj "/C=GB/ST=MyShire/L=MyTown/O=MyCompany/OU=MyDept/CN=PowerBI HANA IdP"
[enter my password]

openssl pkcs12 -export -out IdP_PFX.pfx -in IdP_Cert.pem -inkey IdP_Key.pem -passin pass:my-password -passout pass:my-password

By doing all the work in Powershell, there is no need to enter passwords, export files and transfer files (if you don’t have openssl.exe on the PowerBI Gateway).

Cluster Config Issue for SAP ERS Instance

Running SAP Netweaver A(SCS)/ERS in a Pacemaker cluster in Azure, AWS or GCP or potentially even on-premise?

Be aware, there is a configuration issue in the current version of the Microsoft, AWS, GCP and SUSE documentation for the Pacemaker cluster configuration (on SLES with non-native SystemD Startup Framework) for the SAP Enqueue Replication Server (ERS) instance primitive in an ENSA1 (classic Enqueue) architecture.


Having double checked with both the SAP and SUSE documentation (I don’t have access to check RHEL), I believe that the SAP certified SUSE cluster design is correct, but that the instructions to configure it are not in line with the SAP recommendations.

In this post I explain the issue, what the impact is and how to potentially correct it.


NOTE: This is for SLES systems that are not using the new “Native Startup Framework for SystemD” Services for SAP, see here.

Don’t put your SAP system at risk, get a big coffee and let me explain below.

SAP ASCS and High Availability

The Highly Available (HA) cluster configuration for the SAP ABAP Central Services (ASCS) instance is critical to successful operation of the SAP system, with the SAP Enqueue (EN) process being the guardian of the SAP application level logical locks (locks on business objects) and the SAP Enqueue Replication Server (ERS) instance being the guarantor for the EN when a cluster failover occurs.

In a two-node HA SAP ASCS/ERS setup, the EN process is on the first server node and the ERS instance is on the second node.
The EN process (within the ASCS instance) is replicating to the ERS instance (or rather, the ERS is pulling from the EN process on a loop).


If the EN process goes down, the surrounding cluster is usually configured to fail over the ASCS instance to the secondary cluster node where the ERS instance is running. The EN process will then take ownership of the replica in-memory lock table:

What is an ideal Architecture design?

In an ideal design, according to the SUSE documentation here (again, I’m sure RHEL is similar, but if someone can verify that would be great), the ASCS and ERS instances are installed on cluster controlled disk storage on both cluster nodes in the cluster:

We mount the /sapmnt (and potentially /usr/sap/SID/SYS) file system from NFS storage, but these file systems are *not* cluster controlled file systems.
The above design ensures that the ASCS and ERS instances have access to their “local” file system areas before they are started by the cluster. In the event of a failover from node A to node B, the cluster ensures the relevant file system area is present before starting the SAP instance.

We can confirm this by looking at the SAP official filesystem layout design for an HA system here:

What is Microsoft’s Design in Azure?

Let’s look at the cluster design on Microsoft’s documentation here:

It clearly shows that /usr/sap/SID/ASCS and /usr/sap/SID/ERS are being stored on the HA NFS storage.
So this matches with the SAP design.

What is Amazon’s design in AWS?

If we look at the documentation provided by Amazon here:

We can see that they are using EFS (Elastic File System) for the ASCS and ERS instance locations, which is mounted using the NFS protocol.
So the design is the same as Microsoft’s and again this matches the SAP design.

What is Google’s design in GCP?

If we look at the documentation provided by Google, the diagram doesn’t show clearly how the filesystems are provided for the ASCS and ERS, so we have to look further into the configuration step here:

The above shows the preparation of the NFS storage.

Later in the process we see in the cluster configuration that the ASCS and ERS file systems are cluster controlled:

and

The above is going to mount /usr/sap/SID/ASCS## or /usr/sap/SID/ERS## and they will be cluster controlled.
Therefore the GCP design is also matching the SAP, Azure and AWS designs.

Where is the Configuration Issue?

So far we have:

  • Understood that /sapmnt is not a cluster controlled file system.
  • Established that the Azure, AWS, GCP and SUSE documentation are in alignment regarding the cluster controlled file systems for the ASCS and the ERS.

Now we need to pay closer attention to the cluster configuration of the SAP instances themselves.

The Pacemaker cluster configuration of a SAP instance involves 3 (or more) different cluster resources: Networking, Storage and SAP instance, with the “SAP Instance” resource being the actual running SAP software process(es).

Within the cluster the “SAP Instance” Resource Agent (RA) is actually called “SAPInstance” and in the cluster configuration it takes a number of parameters specific to the SAP instance that it is controlling.
One of these parameters is called “START_PROFILE” which should point to the SAP instance profile.

The SAP instance profile file is an executable script (on Linux) that contains all the required commands and settings to start (and stop) the SAP instance in the correct way and also contains required configuration for the instance once it has started. It is needed by the SAP Instance Agent (sapstartsrv) and the executable that is the actual SAP instance (ASCS binaries: msg_server and enserver, ERS binary: enrepserver).
Without the profile file, the SAP Instance Agent cannot operate a SAP instance and the process that is the instance itself is unable to read any required parameter settings.

Usually, the SAP Instance Agent knows where to find the instance profile because at server startup, the sapinit script (triggered through either systemd unit-file or the old Sys-V start scripts) will execute the entries in the file /usr/sap/sapservices.
These entries call the SAP Instance Agent for each SAP instance and they pass the location of the start profile.

Here’s a diagram from a prior blog post which shows what I mean:

In our two-node cluster setup example, after a node is started, we will see 2 SAP Instance Agents running, one for ASCS and one for ERS. This happens no matter what the cluster configuration is. The instance agents are always started and they always start with the profile file specified in the /usr/sap/sapservices file.
NOTE: This changes in the latest cluster setup in SLES 15, which is a pure SystemD controlled SAP Instance Agent start process.

The /usr/sap/sapservices file is created at installation time. So it contains the entries that the SAP Software Provisioning Manager has created.
The ASCS instance entry in the sapservices file looks like so:

LD_LIBRARY_PATH=/usr/sap/SID/ASCS01/exe:$LD_LIBRARY_PATH; export LD_LIBRARY_PATH; /usr/sap/SID/ASCS01/exe/sapstartsrv pf=/usr/sap/SID/SYS/profile/SID_ASCS01_myhost -D -u sidadm

But the ERS instance entry looks slightly different:

LD_LIBRARY_PATH=/usr/sap/SID/ERS11/exe:$LD_LIBRARY_PATH; export LD_LIBRARY_PATH; /usr/sap/SID/ERS11/exe/sapstartsrv pf=/usr/sap/SID/ERS11/profile/SID_ERS11_myhost -D -u sidadm

If we compare the “pf=” (profile) parameter entry between ASCS and ERS after installation, you will notice the specified ERS instance profile location is not using the location accessible under the link /usr/sap/SID/SYS.
Since we have already investigated the file system layout, we know that the “SYS” location only contains links, which point to other locations. In this case, the ASCS is looking for sub-directory “profile”, which is a link to directory /sapmnt/SID/profile.
The ERS on the other hand, is using a local copy of the profile.
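
To make the comparison concrete, here is a minimal shell sketch that pulls the “pf=” value out of sapservices-style lines. The helper name pf_of is my own invention (it is not part of any SAP tooling), and the two sample lines are simply the entries shown above.

```shell
# pf_of: print the value of the pf= argument found in a sapservices-style line.
# Hypothetical helper for illustration only.
pf_of() {
    printf '%s\n' "$1" | tr ' ' '\n' | sed -n 's/^pf=//p'
}

# The two entries from this post (single quotes keep $LD_LIBRARY_PATH literal).
ascs_line='LD_LIBRARY_PATH=/usr/sap/SID/ASCS01/exe:$LD_LIBRARY_PATH; export LD_LIBRARY_PATH; /usr/sap/SID/ASCS01/exe/sapstartsrv pf=/usr/sap/SID/SYS/profile/SID_ASCS01_myhost -D -u sidadm'
ers_line='LD_LIBRARY_PATH=/usr/sap/SID/ERS11/exe:$LD_LIBRARY_PATH; export LD_LIBRARY_PATH; /usr/sap/SID/ERS11/exe/sapstartsrv pf=/usr/sap/SID/ERS11/profile/SID_ERS11_myhost -D -u sidadm'

echo "ASCS profile: $(pf_of "$ascs_line")"
echo "ERS profile:  $(pf_of "$ers_line")"
```

The ASCS line yields a path under the SYS link, the ERS line a path under the instance’s own directory.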

This is something that usually would go unnoticed, but the ERS must be using a local copy of the profile for a reason?
Looking at SAP notes, we find SAP note 2954193, which explains that an ERS instance in an ENSA1 architecture should be started using a local instance profile file:

Important part: “this configuration must not be changed“.
Very interesting. It doesn’t go into any further detail, but from what we have understood about the criticality of the SAP ASCS and ERS instances, we have to assume that SAP have tested a specific failure scenario (maybe failure of sapmnt) and deemed it necessary to ensure that the ERS instance profile is always available.
I can’t think of any other reason (maybe you can?).

The next question: how does that ERS profile get created in that local “profile” location? It’s not something the other instances do.
After some testing it would appear that the “.lst” file in the sapmnt location is used by the SAP Instance Agent to determine which files to copy at instance startup:

It is important to notice that the DEFAULT.PFL is also copied by the above process.
Make sure you don’t go removing that “.lst” file from “/sapmnt/SID/profile”, otherwise those local profiles will be outdated!

To summarise in a nice diagram, this setup is BAD:

This is GOOD:

What about sapservices?

When we discussed the start process of the server, we mentioned that the SAP Instance Agent is always started from the /usr/sap/sapservices file settings. We also noted how in the /usr/sap/sapservices file, the settings for the ERS profile file location are correct.
So why would the cluster affect the profile location of the ERS at all?
It’s a good question, and the answer is not a simple explanation because it requires a specific circumstance to happen in the lifecycle of the cluster.

Here’s the specific circumstance:

  • Cluster starts, the ERS Instance Agent was already running and so it has the correct profile.
  • We can run “ps -ef | grep ERS” and we would see the “er” process has the correct “pf=/path-to-profile” and correctly pointing to the local copy of the profile.
  • If the ERS instance somehow is now terminated (example: “rm /tmp/.sapstream50023”) then the cluster will restart the whole SAP Instance Agent of the ERS (without a cluster failover).
  • At this point, the cluster starts the ERS Instance Agent with the wrong profile location, and the “er” binary now inherits this when it starts. This will be in place until the next complete shutdown of the ERS Instance Agent.

As you can see, it’s not an easy situation to detect, because from an outside perspective, the ERS died and was successfully restarted.
Except it was restarted with the incorrect profile location.
If a subsequent failure happens to the sapmnt file system, this would render the ERS at risk (we don’t know the exact risk because it is not mentioned in the referenced SAP note that we noted earlier).
What is more, the ERS instance is not monitorable using SAP Solution Manager (out-of-the-box), you would need to create your own monitoring element for it.
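
Because the situation is hard to spot from the outside, a small check can help. The following is only a sketch: the helper name classify_ers_profile is hypothetical, and it assumes the local copy always lives under /usr/sap/<SID>/ERS<nn>/profile/ as in the sapservices entry shown earlier.

```shell
# classify_ers_profile: report whether an ERS profile path is the local copy
# (/usr/sap/<SID>/ERS<nn>/profile/...) or anything else, such as a path under
# /sapmnt or the /usr/sap/<SID>/SYS/profile link. Hypothetical helper.
classify_ers_profile() {
    case "$1" in
        /usr/sap/*/ERS[0-9][0-9]/profile/*) echo "local" ;;
        *) echo "not-local" ;;
    esac
}

# Example with the paths discussed in this post:
classify_ers_profile /usr/sap/SID/ERS11/profile/SID_ERS11_myhost   # prints: local
classify_ers_profile /sapmnt/SID/profile/SID_ERS11_myhost          # prints: not-local
```

On a live system you would feed it the “pf=” value seen in the “ps -ef | grep ERS” output for the “er” process.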

Which Documentation has this Issue?

Now we know there is a difference required for the ERS instance profile location, we need to go back and look at how the cluster configurations have been performed, because of “this configuration must not be changed”!

Let’s look first at the SUSE documentation here:

Alright, the above would seem to show that “/sapmnt” is used for the ERS.
That’s not good as it doesn’t comply with the SAP note.

How about the Microsoft documentation for Azure:

No that’s also using /sapmnt for the ERS profile location. That’s not right either.

Looking at the AWS document now:

Still /sapmnt for the ERS.

Finally, let’s look at the GCP document:

This one is a little harder, but essentially, the proposed variable “PATH_TO_PROFILE” looks like it is bound to the same one as the ASCS instance defined just above it, so I’m going to say, it’s going to be “/sapmnt” because when you try and change it on one, it forces the same on the other:

We can say that all the documentation available for the main hyperscalers provides an incorrect configuration of the cluster, which could cause the ERS to operate in a way that is strongly not recommended by SAP.

Should we correct the issue and How can we correct the issue?

I have reported my finding to both Microsoft and SUSE, so I would expect them to validate.
However, in the past when providing such feedback, the relevant SAP note has been updated to exclude or invalidate the information altogether, rather than instigating the effort of fixing or adjusting any incorrect configuration documentation.
That’s just the way it is and it’s not my product, so I have no say in the solution, I can only report on what I know is correct at the time.

If you would like to correct the issue using the information known at this point in time, then the steps to validate that the ERS is operating and configured in the correct way are provided at a high level below:

  1. Check the cluster configuration for the ERS instance to ensure it is using the local copy of the instance profile.
  2. Check the current profile location used by the running ERS Instance Agent (on both nodes in the cluster).
  3. Double check someone has not adjusted the /usr/sap/sapservices file incorrectly (on both nodes in the cluster).
  4. Check that the instance profile “.lst” file exists in the /sapmnt/SID/profile directory, so that the ERS Instance Agent can copy the latest versions of the profile and the DEFAULT.PFL to the local directory location when it next starts.
  5. Check for any differences between the current local profile files and the files in the /sapmnt/SID/profile directory and consider executing the “sapcpe” process manually.
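
For step 5, a read-only comparison could be sketched as below. The helper name stale_profiles is hypothetical, and the example paths assume the directory layout used in this post (adjust the SID and instance number for your system).

```shell
# stale_profiles: list files in the local profile directory ($1) that are
# missing from, or differ from, the shared profile directory ($2).
# Hypothetical helper; it only reads files and changes nothing.
stale_profiles() {
    for f in "$1"/*; do
        [ -f "$f" ] || continue
        name=$(basename "$f")
        if ! cmp -s "$f" "$2/$name" 2>/dev/null; then
            echo "$name"
        fi
    done
}

# Example (left commented out, paths are assumptions based on this post):
# stale_profiles /usr/sap/SID/ERS11/profile /sapmnt/SID/profile
```

No output means the local copies match the files in the shared directory.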

Thanks for reading.

Preventing File System Corruption from Halting Boot Up of SLES in Azure

When you create a Linux VM in Azure, you don’t get to know the “root” user password.
By default, if a Linux VM detects journaled file system corruption at boot it will go into recovery mode, requiring the root password to be able to fix it.
Without the root password, the only other way to fix the issue is copying the O/S disk, mounting on another VM and fixing the issue.
If you don’t have Azure Boot Diagnostics enabled, you might not even know what the problem is! The VM will just appear to not boot.

In this post I show a simple way to prevent Linux distributions (I use SLES) from failing boot up due to file system corruption. Our example is an XFS file system just like in my previous post.
XFS is journaled and will check the integrity on mounting. If there are problems with the file system then Linux will fail to mount it, which will cause the O/S boot up process to stall.

In a production system, you can imagine the scenario where a simple restart of a VM causes an hour long downtime (or longer).

NOTE: In my scenario there is no Linux device encryption, which could make the job of repair even harder, and makes it all the more important to prevent boot failure.

Preventing Boot Failure

To prevent our corrupt XFS file system from halting boot, we just need to add a single option to the mount options in file /etc/fstab.
We use the “nofail” option.

We could just go and write this straight out to the fstab file and expect it to work.
However, we can test it first to make sure that it is:

  • supported on your version/distribution of Linux.
  • supported for your file system type (mine is XFS).

We could use the “-f” (fake) mount option to the “mount” command, but in testing I cannot get this to actually show an error when it is passed an invalid mount point option.
Instead, let’s actually mount the file system to check if “nofail” is accepted.

As the root user (or with sudo) get the current mount options for your file system (the one you will be applying “nofail” to):

grep BIG /etc/fstab

/dev/volTMP/lvTMP1 /BIGSTRIPEDDISK xfs defaults 0 0

I can see that my /BIGSTRIPEDDISK is mounted from a volume group and has the “defaults” mount options. Yours may be different.
We can now create a new mount point location and temporarily mount the file system adding the “nofail” option to test it is accepted (adjust the mount options using your current mount point settings):

mkdir /mnt/tempmount
mount -o defaults,nofail /dev/volTMP/lvTMP1 /mnt/tempmount

If you got an error or warning, then the file system type or your Linux distribution does not support the use of “nofail”. Maybe check the man page for an equivalent option (“man mount”).

If you didn’t get an error, then you know that you can successfully apply the “nofail” option to the end of the options column (column number 4) in the fstab:

vi /etc/fstab
...
/dev/volTMP/lvTMP1 /BIGSTRIPEDDISK xfs defaults,nofail 0 0
...
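
If you would rather not edit the file by hand, the same change can be sketched with awk. This is a minimal example, not a hardened tool: the helper name add_nofail is my own, it writes to standard output rather than touching /etc/fstab, and awk rebuilds the changed line with single spaces between the columns.

```shell
# add_nofail: read fstab content on stdin and print it with ",nofail" appended
# to the options column (column 4) of the entry whose mount point is $1.
# Comment lines and entries that already contain nofail are left untouched.
add_nofail() {
    awk -v mp="$1" '
        $1 !~ /^#/ && $2 == mp && $4 !~ /(^|,)nofail(,|$)/ { $4 = $4 ",nofail" }
        { print }
    '
}

# Example using the entry from this post:
echo '/dev/volTMP/lvTMP1 /BIGSTRIPEDDISK xfs defaults 0 0' | add_nofail /BIGSTRIPEDDISK
```

On a real system you could run “add_nofail /BIGSTRIPEDDISK < /etc/fstab > /tmp/fstab.new”, review the result with diff, and only then copy it into place.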

Once applied, it is recommended that you always verify boot related changes, by taking some downtime to restart the machine. There is nothing worse than applying a change and not testing it.

With “nofail” in place, the next time the O/S boots and tries to mount the file system, it will ignore any error and move forward in the boot process, whether there are issues with the integrity or the device is missing altogether.
There is obviously a small consequence of this: file systems may not be mounted after a boot has completed.
It is possible to mitigate this problem with monitoring (scripts that monitor file system free space, for example) or other checks after boot.
Of course there is also a second option to all of this: set the root user password on new VMs and store it in your secure password location. You can use a 16 character random string like those generated from a password manager.
You will also need to ensure that you can use the Azure Serial Console to get to the VM command line, because in some configurations, security practices can indirectly prevent this.
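
As an example of the kind of check after boot mentioned above, here is a minimal sketch that lists fstab mount points that are not currently mounted. The helper name unmounted_entries is hypothetical, and the script is deliberately simple: it does not handle “noauto” entries or mount points containing spaces.

```shell
# unmounted_entries: print mount points listed in an fstab-format file ($1)
# that do not appear in a /proc/mounts-format file ($2).
# Swap entries, "none" mount points and comment lines are skipped.
unmounted_entries() {
    awk '
        FILENAME == ARGV[1] { mounted[$2] = 1; next }
        $1 ~ /^#/ || NF < 3 { next }
        $2 == "none" || $2 == "swap" || $3 == "swap" { next }
        !($2 in mounted) { print $2 }
    ' "$2" "$1"
}

# On a live system (read-only check, left commented out here):
# unmounted_entries /etc/fstab /proc/mounts
```

Any output from the check is a file system that “nofail” allowed the boot process to skip.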

HowTo: Calculate SAP ASE HADR Replication Agent Buffer Pool Maximum Size

Scenario: In a SAP ASE HADR system, the primary ASE database log contains the following warning:
“RepAgent(X): Autoscaling: Warning: The maximum configured buffer pool size:8 is less than the allocated buffer pool size:200. The maximum is set to 200”.

You would like to know what a good value for “buffer pool maximum size” is, and how to tell whether enough memory is assigned to the Rep Agent.

If we reference here: https://help.sap.com/viewer/075940003f1549159206fcc89d020515/16.0.3.8/en-US/fe0b1842bd1c1014858c88846330cc94.html
We can see that the “buffer pool” number is actually the number of buffers, which is also known as the “number of packages”.

If your current parameter values are as follows:
“buffer pool maximum size” = 200
“stream buffer size” = 1572864 bytes (1.5MB)
“rep agent memory size” = 500MB

We can calculate the amount of memory needed for the Rep Agent like so:
max 200 packages x 1.5MB = 300MB of Rep Agent memory, which fits within the 500MB configured for “rep agent memory size”.

Now we know how to work forwards in the calculation, we can work backwards, starting with the Rep Agent memory:
300MB of Rep Agent memory / 1.5MB = 200 packages (buffer pool max size).
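
The same arithmetic as a small shell sketch, using the example values above. The variable names are my own, and the backwards calculation assumes (for illustration only) that all of the configured Rep Agent memory is available for buffers.

```shell
# Values taken from the example above.
stream_buffer_bytes=1572864      # "stream buffer size" (1.5MB)
buffer_pool_max=200              # "buffer pool maximum size" (packages)
rep_agent_mem_mb=500             # "rep agent memory size"

mb=$((1024 * 1024))

# Forwards: memory needed for the maximum number of packages.
needed_mb=$((buffer_pool_max * stream_buffer_bytes / mb))
echo "Memory needed: ${needed_mb}MB"                                  # 300MB

# Backwards: the most packages the configured memory could hold
# (integer division, so the result is rounded down).
max_packages=$((rep_agent_mem_mb * mb / stream_buffer_bytes))
echo "Packages that fit in ${rep_agent_mem_mb}MB: ${max_packages}"    # 333
```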

Recovering From a Deleted Data Disk with XFS on LVM in Azure

It’s quite a hefty long title, and it still doesn’t quite convey what I want to write about in the post.
This post is about a specific situation where you have accidentally deleted a data disk, or recovered a VM that had “selective disk backup” enabled, and you are now missing a data disk from a Linux VM where that disk was part of a Logical Volume Manager (LVM) managed file system.

In this post I show how to recover the unbootable VM using a rescue VM, then repair the volume by adding a new data disk and eventually repairing the LVM volume group and the XFS file system.

The Setup

In our setup, we have a SLES 12 VM (the victim) with the following disk architecture:

I actually have 3 data disks, but we will only be working with 2 of them.
The 2 data disk LUNs map to Linux physical disks /dev/sdd and /dev/sde and are part of volume group volTMP, which contains a logical volume lvTMP1 striped over the two disks and on lvTMP1 is an XFS file system mounted as “/BIGSTRIPEDDISK”.

I actually created this setup as part of this post here, so you can follow the instructions on that post to get to the same state if you wish.

I also have, ready to start up, an Ubuntu VM created using a basic Azure VM type (it’s a B1s) and an Azure Ubuntu Server 18 LTS image.
This will be my rescue VM. It’s small, light and fast to boot up.
You don’t have to use an Ubuntu VM, but you will need another VM that is running Linux and able to mount the file systems that you use for your root file systems (mine is ext4).

We Do the Damage

In this scenario we are deleting one of the data disks of the SLES 12 VM, from inside the Azure Portal.
The same situation could occur if you restored a VM from backup, but the VM had “selective disk backup” enabled, and restored with missing data disks.

The first thing we do, with the VM already shutdown, is remove the data disk (LUN2) from the Portal:

NOTE: We are not actually deleting the disk here in our test setup. It just detaches it from the VM. But imagine that we did detach and delete it completely.

Save the change:

We then start the VM:

The VM May Not Boot

Depending on your file system mount options and your O/S (I’m using SLES 12), by default the Linux VM will refuse to boot fully.
It will actually get stuck trying to mount the file system /BIGSTRIPEDDISK because the data disk is now missing (we deleted it!).

NOTE: If you have “nofail” in the fstab mount options, then your VM may boot normally, with the file system missing. You’re lucky. Skip through to the section on adding a new data disk (after section “Swap O/S Disk”).

The Linux O/S will go into recovery mode. If you have Boot Diagnostics enabled, you can verify this in the “Serial Console” within Azure Portal on the VM resource details screen.
In recovery mode, you are prompted to enter the root password to give you access to a basic shell. However, when deploying from Azure images, you don’t get a root user password, so you won’t know it!

If you don’t have Boot Diagnostics enabled, then you will be waiting some minutes until the VM boot hits a timeout and Azure Portal informs you it failed to start:

In either of the above cases, you may end up at this same point. The VM will not boot due to the failed disk.

What we need to do to recover from this situation and allow our SLES 12 VM to boot, is to comment out the failed file system from the /etc/fstab file on the SLES 12 VM’s O/S disk.
This will involve the use of the handy “swap O/S disk” button in the Azure Portal.

Create an Image of the O/S Disk

We have to create a snapshot image of the existing SLES 12 VM O/S disk, because we cannot detach the O/S disk from the existing VM.

Locate the SLES 12 VM in the Portal and click its O/S disk:

Click the “Create Snapshot” button, then give the snapshot a useful name:

I used standard HDD (cheaper), but you can choose SSD if you wish:

Click to go to the snapshot once it has been created:

We now have an image of the O/S disk, which we can use to create a new O/S disk.

Create New Disk from Image

We will create a new managed disk from the image of the O/S disk.
This will allow us to mount it on our Ubuntu VM (the rescue VM).

From the Azure Portal create a brand new disk with the same size and specification as the original O/S disk.
NOTE: The Ubuntu VM is limited and may not support higher performing disk types like Ultra Disk. In which case you may need to create the new disk as a lower performance disk.

Select the image you created as the source and give the new disk a recognisable name:

Attach New Disk to Rescue VM

We now attach the new disk to the rescue VM (my Ubuntu VM) from the “disks” section of the Ubuntu VM resource:

It’s the first data disk, so is going on LUN 0:

Mounting the Disk on Rescue VM

Start the rescue VM (Ubuntu) if it is not already started, log onto the VM and either as root or using “sudo” check the disk devices present by running “lsblk”:

In my example the new disk is visible as /dev/sdc.
Because the disk is an O/S disk, it has partitions (it’s not a whole disk). For this reason, we have to mount the specific partition that the root (“/”) file system was mounted from.
In my case I can easily see that it is partition 4 (sdc4), because it is the largest partition on the /dev/sdc disk at 28.8G in size.

We have to create a location to mount the partition (“mkdir /mnt/suse_os_disk”) then mount partition 4 from sdc using the “mount” command:

The mount command is intelligent enough to know what file system is on the disk.

Adjust Fstab File

With the new disk mounted on the rescue VM, we can use our favourite text editor to adjust the fstab file and comment out the affected file system, to prevent it from being mounted.

vi /mnt/suse_os_disk/etc/fstab

We comment out /BIGSTRIPEDDISK :

Save the file changes.
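
If you prefer not to edit by hand, the change can be sketched with awk instead. The helper name comment_out_mount is hypothetical, and it prints to standard output rather than modifying the file, so you can review the result first.

```shell
# comment_out_mount: print fstab content from stdin with the entry for mount
# point $1 commented out. All other lines pass through unchanged.
comment_out_mount() {
    awk -v mp="$1" '
        $1 !~ /^#/ && $2 == mp { print "#" $0; next }
        { print }
    '
}

# Example using the entry from this post:
echo '/dev/volTMP/lvTMP1 /BIGSTRIPEDDISK xfs defaults 0 0' | comment_out_mount /BIGSTRIPEDDISK
```

On the rescue VM that would be “comment_out_mount /BIGSTRIPEDDISK < /mnt/suse_os_disk/etc/fstab > /tmp/fstab.new”, followed by a review and a copy back over the original.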

We can now safely unmount the disk and then disconnect it from the rescue VM:

From the Azure Portal, we detach the new data disk from the rescue VM (the disk itself is kept, because we need it for the O/S disk swap):

Swap O/S Disk

In Azure Portal, go to the SLES 12 VM and in the disks view of the VM, click the “Swap OS disk” button:

Select the new disk that we have just unmounted from the rescue VM:

Start the SLES VM and it will boot off the new disk:

The VM will boot up successfully.
Great stuff. All that effort and so far we have a booting VM.
We still have the initial problem: we deleted one of our data disks. We need to create a new data disk.

Add New Data Disk

In Azure Portal on the SLES VM, create a new data disk to the same specification as it existed originally.
You can guess if you are not sure, but you have to remember that it should be the same tier and size as other disks in a striped LVM logical volume.

Save the change:

Repair Volume Group

With the new data disk added, we can now start the process of repairing the volume group.

We execute a pvscan to list physical volumes on the VM:

In the above we can see that LVM is reporting a missing physical volume. This is the one we deleted.

Using “lsblk” we can see the new device right at the end, it’s /dev/sde:

We can create the new physical volume and apply the previous UUID to the disk, to make LVM think this is the same disk, then we get LVM to write the configuration backup to the new disk.

First, let’s check what LVM configuration backups we have for our volume group:

ls -ltr /etc/lvm/archive/volTMP*

We choose the latest one available before we lost the disk:

We can now re-create the physical volume, applying the previously used UUID and LVM configuration (metadata):

pvcreate --uuid '<previous missing uuid from the pvscan output>' \
 --restorefile /etc/lvm/archive/volTMP_<latest>.vg /dev/sde

Now we tell LVM to restore itself into a working order using the configurations available on the disks:
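
A minimal sketch of this step, assuming the standard LVM “vgcfgrestore” command and the archive directory listed earlier. The helper name latest_vg_archive is hypothetical, and it simply picks the most recently modified archive file, so check that the file it returns really predates the loss of the disk.

```shell
# latest_vg_archive: print the most recently modified archive file for volume
# group $2 in directory $1. Hypothetical helper; assumes file names without
# spaces, which is the case for LVM archive files.
latest_vg_archive() {
    ls -t "$1"/"$2"_*.vg 2>/dev/null | head -n 1
}

# On the repaired VM, as root (left commented out so nothing runs by accident):
# archive=$(latest_vg_archive /etc/lvm/archive volTMP)
# vgcfgrestore -f "$archive" volTMP
```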

Let’s check the status of our logical volume that exists in the volTMP volume group:

In the above we notice that the “a” (active) flag is not set, the logical volume is therefore not yet active.
Let’s activate it:

lvchange -ay /dev/volTMP/lvTMP1

You can see that it is now active. Great!
We have repaired LVM. We no longer get any warnings about missing disks when executing the LVM related commands like “pvs, lvs, vgs”.

Repair File System

If we were to try and mount the file system /BIGSTRIPEDDISK, it would show an XFS error, because our new disk does not yet have a file system on it.
The file system is in a strange status, because 50% of the blocks are on the disk that was not deleted, and 50% are non-existent, because they were on the disk that was deleted.
So we actually have to repair the file system.
Instead of repairing, we could have chosen to just apply a new file system with mkfs.xfs, but let’s do a repair and see what the process is.

xfs_repair -L /dev/mapper/volTMP-lvTMP1

We can now edit the fstab and uncomment our file system /BIGSTRIPEDDISK:

Finally, we try and mount the file system:

It worked, and it was a clean mount. Nice.

Where Are My Files

With our repaired file system mounted, we dive in and look for files:

Ah yes! It’s clean!
No files will exist because we lost the disk. The LVM striping that we use is for performance, not redundancy, which means when you have to re-create the disk and repair the file system, all files will be lost.

Summary:

  • Actually deleting data disks is not simple in the Azure Portal. Microsoft have done a good job to try and prevent you from doing it by mistake, but it is still possible to do it by accident and also through code.
  • Turn on boot diagnostics on your VMs, it helps to see what is going on during boot.
  • Add “nofail” to the mount options for the data disks on your Linux systems. This will allow them to boot even with missing data disks.
  • When a data disk that is mounted at Linux boot goes missing, the VM may not boot at all.
    You could reset all your root account passwords and securely store them, which would allow you to enter recovery mode, but this is not something that most companies do.
    Be prepared and have a rescue VM ready to start up. This is the best option and could help in a number of scenarios.
  • Once booting again, we can use LVM to help simply restore the state of the volumes and file systems. We don’t need to re-create the LVM setup.
  • In a striped logical volume, we stripe for performance, not redundancy, you will lose data if you lose one of the data disks of a striped logical volume.
  • Using the “selective disk backup” feature saves backup vault space, but it means you will need to use this process to restore the volume groups for missing disks! Be wary and plan ahead!
  • Test backup & restore processes!

In another blog post, I will show how to automate the root disk snapshot and disk creation followed by attaching to another VM. We will have a single script that can be run to automate the whole process. This is useful to help fix other issues such as when you have enabled Linux HugePages with more memory than the VM has!