This blog contains experience gained over the years of implementing (and de-implementing) large scale IT applications/software.

SAP’s Deeper Partnership with Red Hat

An announcement back in February 2023 from Walldorf tells us of a “deepening” partnership between SAP and the Enterprise Linux Operating System vendor Red Hat.

They already have a long history together, with the SAP LinuxLab including a Red Hat technical team to ensure SAP on Red Hat Linux works and performs as it should.
 

Here are the lines of significance from the SAP news article: https://news.sap.com/2023/02/red-hat-and-sap-deepen-partnership/

…SAP is boosting support for the RISE with SAP solution using Red Hat Enterprise Linux as the preferred operating system for net new business for RISE with SAP solution deployments.

The platform builds on this trust by offering a consistent, reliable foundation for SAP software deployments, providing a standard Linux backbone to support SAP customers across hybrid and multi-cloud environments.

…building on Red Hat’s scalable, flexible, open hybrid cloud infrastructure.

…SAP’s internal IT environments and SAP Enterprise Cloud Services adopting Red Hat Enterprise Linux can gain greater flexibility to address modern and future technology requirements.

“…Red Hat Enterprise Linux offers enhanced performance capabilities to support RISE with SAP solution deployments across cloud environments…

There are a lot of points to cover and, as always, a little history is useful.
Grab a bagel (that’s what Americans eat, right?), put some Obatzda cheese on it (it’s German; I’m trying to tie the eating to the subject of this article) and settle in for a read.

Who is Red Hat?

You can read all about Red Hat on Wikipedia here: https://en.wikipedia.org/wiki/Red_Hat , but suffice to say:

  • It has been owned by IBM since 2019.
  • It owns Ansible.
  • It owns Red Hat Enterprise Linux CoreOS (RHCOS), the production Linux Operating System beneath the container platform OpenShift.  RHCOS is built on the same kernel as Red Hat Enterprise Linux (RHEL).

What is RISE with SAP?

There are many views on why “RISE with SAP” came to fruition and who it benefits, but the official line is that RISE with SAP is a solution designed to support the needs of the customer’s business in any industry, with SAP responsible for the holistic service level agreement (SLA), cloud operations and technical support, and the partner (insert any Global SI) providing sales, consulting and application managed services (AMS).

…SAP is boosting support for the RISE with SAP solution using Red Hat Enterprise Linux as the preferred operating system for net new business for RISE with SAP solution deployments.

When the article talks about “net new” that just means any brand new RISE subscriptions.

Notice that one of the significant lines I pulled out of the article says:

…providing a standard Linux backbone to support SAP customers across hybrid and multi-cloud environments.

Since SAP are doing the hosting, the “multi-cloud” part is probably referring to SAP’s own hybrid and multi-cloud setup, i.e. SAP’s own datacentres and also the hyperscalers.

An enticing option that comes as part of the RISE deal (depending on the customer spend) is SAP Business Technology Platform (BTP).
SAP BTP is a PaaS solution under a subscription model, in which SAP customers can combine and deploy curated SAP services from SAP or third-parties, or use services to code their own solutions in a variety of languages including SAP’s proprietary ABAP language.

The SAP BTP environments are hybrid and multi-cloud, as they are hosted in Cloud Foundry (the newest) or Neo (currently sunsetting), with these being run from a combination of SAP’s own datacentres and/or the main hyperscalers (Cloud Foundry).  There are two other environments: Kyma, a microservices runtime based on Kubernetes, and the ABAP environment, which is hosted in Cloud Foundry.

To conclude this section, I suggest that the described “net new business” is actually internal business inside SAP and not directly the hosting of customers’ S/4HANA systems.  In fact, S/4HANA is only very loosely mentioned in the article, which leads me to believe that this announcement is purely about BTP and other surrounding services.

SAP HANA and Compute Power

In one of the statements from SAP on this “deepening” partnership, we see:

“…Red Hat Enterprise Linux offers enhanced performance capabilities to support RISE with SAP solution deployments across cloud environments…”

I can’t see anything specifically mentioned about how Red Hat’s Linux operating system is more performant than SUSE, other than an article from 2019 where a SAP Business Warehouse (BW) on HANA system (maybe, could be BW/4HANA, difficult to tell) holds a world record.

See here for more:  https://www.redhat.com/en/resources/red-hat-enterprise-linux-for-sap-solutions-datasheet   which links to here:  https://www.redhat.com/en/blog/red-hat-enterprise-linux-intels-newest-xeon-processors-posts-record-performance-results-across-wide-range-industry-benchmarks?source=blogchannel

The thing to note about those claims are that:

  • This was based on a 2nd Gen Intel Xeon (3rd Gen is already available).
  • The CPU used Intel Advanced Vector Extensions 512 (AVX-512) instruction set, which Intel says arrived in 3rd Gen chips (is the Red Hat article quoting the wrong chip generation?).
  • Generally we run HANA on hyperscalers on Intel Skylake or Cascade Lake CPUs.  Only HANA on bare metal may allow the newer Xeon generations.
  • The Red Hat Linux Operating System version was 7.2 for the world record, but 7.9 is the latest 7.x minor release and 9.0 is out now.  Also, 7.2 is now only supported for older versions of HANA 2.0 (up to SPS03).
  • Intel Optane DC (Intel’s non-volatile memory persistence technology) was used in the world record, but in 2022 it was announced as defunct (superseded by another initiative).
  • 2019 was the year that the IBM acquisition of Red Hat concluded.  Coincidence?

My summary of this section is that I don’t believe performance is the reason for any switch by SAP from (mainly) SUSE to Red Hat.  The one article of relevance that I can find seems just too old and outdated.

What I think is that the announcement from SAP is referring to something other than the Linux Operating System alone.

Red Hat’s Scalable, Flexible, Open Hybrid Cloud Infrastructure

Maybe we need to look past the Red Hat Linux Operating System itself and at the infrastructure ecosystem that the Operating System is part of.

…building on Red Hat’s scalable, flexible, open hybrid cloud infrastructure.

When the article talks about “open” we are inclined to think about Open Source, freely available or even open APIs (sometimes just having APIs can make something “open”).

In my mind, something that can run seamlessly almost anywhere on hybrid cloud would involve containers.  Containers provide scalability (scale-out) and flexibility (multiple environments offered).

Let me introduce you to OpenShift.  Yeah, it’s got “open” in the name.

See here for a wiki article:  https://en.wikipedia.org/wiki/OpenShift

As a summary of OpenShift, the Red Hat Enterprise Linux CoreOS (RHCOS) underpins the OpenShift hybrid cloud platform and RHCOS uses the same kernel as Red Hat Enterprise Linux.

The orchestration of OpenShift containers is done using Kubernetes and Red Hat is the second largest contributor to Kubernetes after Google (Red Hat is a platinum member: https://www.cncf.io/about/members/).

I think you might be able to see where we are heading in this section.

Could SAP be adopting OpenShift internally for its future container hosting platform strategy?

IBM Cloud deprecated support for Cloud Foundry in mid-2022.  As suspected, Red Hat OpenShift is one of the touted solutions to replace it: https://cloud.ibm.com/docs/cloud-foundry-public?topic=cloud-foundry-public-deprecation#dep_nextsteps

Need greater efficiency and revolutionary delivery? Red Hat OpenShift on IBM Cloud might be your solution.

The above quote on the IBM Cloud site does provide some hint that operating Cloud Foundry platform services at scale could be less efficient and less innovative compared to Red Hat OpenShift.


Maybe this is something that, internally, SAP have also concluded?

What Does SUSE Offer to Compete with Red Hat and its OpenShift Offering?

The SUSE Linux Enterprise Server (SLES) Operating System has been a solid foundation for running SAP systems.

Similar to Red Hat, SUSE has a varied portfolio of products in the Linux container technology space.
Rancher (which came with the acquisition of Rancher Labs) is one of those products: an open source container management platform, similar to Red Hat’s OpenShift, that makes Kubernetes easier to manage, especially as the number of clusters and containers grows.

SUSE is also a contributor to Kubernetes (it is a silver member).

The SUSE Rancher product is open-armed, in that it embraces many different operating systems and a number of licensing options, whereas Red Hat OpenShift supports only Red Hat CoreOS and requires a Red Hat subscription.

While being open is a good thing, it also adds complexity; Red Hat’s CoreOS is a purpose-built Operating System with all the required features, and it would appear to have a simpler method of deployment and maintenance.

It’s possible that SAP’s announcement comes after some internal evaluation of the two products, with Red Hat’s being favoured.

Conclusions

We’ve looked at the article from the SAP site where the new “deeper” partnership with Red Hat was announced.

I think I ruled out performance as a reason for the Operating System change.  The article just didn’t have enough depth for my liking.

I have speculated on how this SAP and Red Hat partnership could be about the internal SAP hosting of PaaS and maybe SaaS related systems and not directly related to hosting of customer’s S/4HANA systems.

What we could be looking at, is the next generation of hosting platform for SAP BTP or possibly SAP S/4HANA Cloud public edition.
Red Hat’s OpenShift platform, underpinned with the Red Hat CoreOS and the Red Hat tools to monitor, automate and orchestrate, could all combine to provide a solid accompaniment to solve SAP’s internal strategic issues.

It’s one of the platforms chosen by IBM Cloud (a no brainer for them really), with the justification that Cloud Foundry was no longer the strategic platform.

The announcement has no impact on the certification of SUSE for running S/4HANA and therefore should not affect any customer decisions during their RISE with SAP journey for their S/4HANA systems.

Resources:

https://news.sap.com/2023/02/red-hat-and-sap-deepen-partnership/
https://blogs.sap.com/2019/07/15/evolution-of-sap-cloud-platform-retirement-of-sap-managed-backing-services/
https://blogs.sap.com/2023/06/14/farewell-neo-sap-btp-multi-cloud-environment-the-deployment-environment-of-choice/
https://me.sap.com/notes/2235581
https://learn.microsoft.com/en-us/azure/virtual-machines/mv2-series
https://learn.microsoft.com/en-us/azure/virtual-machines/sizes-compute
https://www.intel.com/content/www/us/en/architecture-and-technology/avx-512-solution-brief.html
https://www.redhat.com/en/resources/red-hat-enterprise-linux-for-sap-solutions-datasheet
https://www.redhat.com/en/blog/red-hat-enterprise-linux-intels-newest-xeon-processors-posts-record-performance-results-across-wide-range-industry-benchmarks
https://docs.openshift.com/container-platform/4.8/architecture/architecture-rhcos.html#rhcos-key-features_architecture-rhcos
https://www.anandtech.com/show/14146/intel-xeon-scalable-cascade-lake-deep-dive-now-with-optane
https://www.sap.com/products/erp/s4hana.html
https://en.wikipedia.org/wiki/Red_Hat
https://en.wikipedia.org/wiki/Rancher_Labs
https://en.wikipedia.org/wiki/OpenStack
https://en.wikipedia.org/wiki/OpenShift
https://en.wikipedia.org/wiki/Cloud_Foundry
https://en.wikipedia.org/wiki/3D_XPoint
https://www.ibm.com/support/pages/sap-s4hana-red-hat-openshift-container-platform-business-perspective-cloud-hosting-provider
https://cloud.ibm.com/docs/cloud-foundry-public?topic=cloud-foundry-public-deprecation
https://www.cncf.io/about/members/

Cluster Config Issue for SAP ERS Instance

Running SAP Netweaver A(SCS)/ERS in a Pacemaker cluster in Azure, AWS or GCP or potentially even on-premise?

Be aware, there is a configuration issue in the current version of the Microsoft, AWS, GCP and SUSE documentation for the Pacemaker cluster configuration (on SLES with non-native SystemD Startup Framework) for the SAP Enqueue Replication Server (ERS) instance primitive in an ENSA1 (classic Enqueue) architecture.


Having double checked with both the SAP and SUSE documentation (I don’t have access to check RHEL), I believe that the SAP certified SUSE cluster design is correct, but that the instructions to configure it are not in line with the SAP recommendations.

In this post I explain the issue, what the impact is and how to potentially correct it.


NOTE: This is for SLES systems that are not using the new “Native Startup Framework for SystemD” Services for SAP, see here.

Don’t put your SAP system at risk, get a big coffee and let me explain below.

SAP ASCS and High Availability

The Highly Available (HA) cluster configuration for the SAP ABAP Central Services (ASCS) instance is critical to successful operation of the SAP system, with the SAP Enqueue (EN) process being the guardian of the SAP application level logical locks (locks on business objects) and the SAP Enqueue Replication Server (ERS) instance being the guarantor for the EN when a cluster failover occurs.

In a two-node HA SAP ASCS/ERS setup, the EN process is on the first server node and the ERS instance is on the second node.
The EN process (within the ASCS instance) is replicating to the ERS instance (or rather, the ERS is pulling from the EN process on a loop).


If the EN process goes down, the surrounding cluster is usually configured to fail over the ASCS instance to the secondary cluster node where the ERS instance is running. The EN process will then take ownership of the replica in-memory lock table:

What is an ideal Architecture design?

In an ideal design, according to the SUSE documentation here (again, I’m sure RHEL is similar, but if someone can verify that would be great), the ASCS and ERS instances are installed on cluster controlled disk storage available to both nodes in the cluster:

We mount the /sapmnt (and potentially /usr/sap/SID/SYS) file system from NFS storage, but these file systems are *not* cluster controlled file systems.
The above design ensures that the ASCS and ERS instances have access to their “local” file system areas before they are started by the cluster. In the event of a failover from node A to node B, the cluster ensures the relevant file system area is present before starting the SAP instance.

We can confirm this by looking at the SAP official filesystem layout design for an HA system here:

What is Microsoft’s Design in Azure?

Let’s look at the cluster design on Microsoft’s documentation here:

It clearly shows that /usr/sap/SID/ASCS and /usr/sap/SID/ERS are stored on the HA NFS storage.
So this matches with the SAP design.

What is Amazon’s design in AWS?

If we look at the documentation provided by Amazon here:

We can see that they are using EFS (Elastic File System) for the ASCS and ERS instance locations, which is mounted using the NFS protocol.
So the design is the same as Microsoft’s and again this matches the SAP design.

What is Google’s design in GCP?

If we look at the documentation provided by Google, the diagram doesn’t show clearly how the filesystems are provided for the ASCS and ERS, so we have to look further into the configuration step here:

The above shows the preparation of the NFS storage.

Later in the process we see in the cluster configuration that the ASCS and ERS file systems are cluster controlled:

and

The above is going to mount /usr/sap/SID/ASCS## or /usr/sap/SID/ERS## and they will be cluster controlled.
Therefore the GCP design is also matching the SAP, Azure and AWS designs.

Where is the Configuration Issue?

So far we have:

  • Understood that /sapmnt is not a cluster controlled file system.
  • Established that Azure, AWS, GCP and the SUSE documentation are in alignment regarding the cluster controlled file systems for the ASCS and the ERS.

Now we need to pay closer attention to the cluster configuration of the SAP instances themselves.

The Pacemaker cluster configuration of a SAP instance involves 3 (or more) different cluster resources: Networking, Storage and SAP instance. With the “SAP Instance” resource being the actual running SAP software process(es).

Within the cluster, the “SAP Instance” Resource Agent (RA) is actually called “SAPInstance” and in the cluster configuration it takes a number of parameters specific to the SAP instance that it is controlling.
One of these parameters is called “START_PROFILE” which should point to the SAP instance profile.
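
As an illustration only, an ERS primitive configured to use the local instance profile might look like the sketch below. This uses crm shell syntax with a hypothetical SID “SID”, instance number 11 and virtual hostname “myhost” (the same naming as the sapservices examples later in this post); the resource name, operations and timeouts are assumptions and will differ per the vendor guides:

primitive rsc_sap_SID_ERS11 ocf:heartbeat:SAPInstance \
  op monitor interval=11 timeout=60 on-fail=restart \
  params InstanceName=SID_ERS11_myhost \
    START_PROFILE="/usr/sap/SID/ERS11/profile/SID_ERS11_myhost" \
    AUTOMATIC_RECOVER=false IS_ERS=true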

The SAP instance profile file is an executable script (on Linux) that contains all the required commands and settings to start (and stop) the SAP instance in the correct way and also contains required configuration for the instance once it has started. It is needed by the SAP Instance Agent (sapstartsrv) and the executable that is the actual SAP instance (ASCS binaries: msg_server and enserver, ERS binary: enrepserver).
Without the profile file, the SAP Instance Agent cannot operate a SAP instance and the process that is the instance itself is unable to read any required parameter settings.

Usually, the SAP Instance Agent knows where to find the instance profile because at server startup, the sapinit script (triggered through either systemd unit-file or the old Sys-V start scripts) will execute the entries in the file /usr/sap/sapservices.
These entries call the SAP Instance Agent for each SAP instance and they pass the location of the start profile.

Here’s a diagram from a prior blog post which shows what I mean:

In our two-node cluster setup example, after a node is started, we will see 2 SAP Instance Agents running, one for the ASCS and one for the ERS. This happens no matter what the cluster configuration is. The instance agents are always started and they always start with the profile file specified in the /usr/sap/sapservices file.
NOTE: This changes in the latest cluster setup in SLES 15, which is a pure SystemD controlled SAP Instance Agent start process.

The /usr/sap/sapservices file is created at installation time. So it contains the entries that the SAP Software Provisioning Manager has created.
The ASCS instance entry in the sapservices file looks like this:

LD_LIBRARY_PATH=/usr/sap/SID/ASCS01/exe:$LD_LIBRARY_PATH; export LD_LIBRARY_PATH; /usr/sap/SID/ASCS01/exe/sapstartsrv pf=/usr/sap/SID/SYS/profile/SID_ASCS01_myhost -D -u sidadm

But the ERS instance entry looks slightly different:

LD_LIBRARY_PATH=/usr/sap/SID/ERS11/exe:$LD_LIBRARY_PATH; export LD_LIBRARY_PATH; /usr/sap/SID/ERS11/exe/sapstartsrv pf=/usr/sap/SID/ERS11/profile/SID_ERS11_myhost -D -u sidadm

If we compare the “pf=” (profile) parameter entry between ASCS and ERS after installation, you will notice the specified ERS instance profile location is not using the location accessible under the link /usr/sap/SID/SYS.
Since we have already investigated the file system layout, we know that the “SYS” location only contains links, which point to other locations. In this case, the ASCS is looking for sub-directory “profile”, which is a link to directory /sapmnt/SID/profile.
The ERS on the other hand, is using a local copy of the profile.

This is something that usually would go unnoticed, but the ERS must be using a local copy of the profile for a reason?
Looking at SAP notes, we find SAP note 2954193, which explains that an ERS instance in an ENSA1 architecture should be started using a local instance profile file:

Important part: “this configuration must not be changed“.
Very interesting. It doesn’t go into any further detail, but from what we have understood about the criticality of the SAP ASCS and ERS instances, we have to assume that SAP have tested a specific failure scenario (maybe failure of sapmnt) and deemed it necessary to ensure that the ERS instance profile is always available.
I can’t think of any other reason (maybe you can?).

The next question is: how does that ERS profile get created in that local “profile” location? It’s not something the other instances do.
After some testing it would appear that the “.lst” file in the sapmnt location is used by the SAP Instance Agent to determine which files to copy at instance startup:

It is important to notice that the DEFAULT.PFL is also copied by the above process.
Make sure you don’t go removing that “.lst” file from “/sapmnt/SID/profile”, otherwise those local profiles will be outdated!

To summarise in a nice diagram, this setup is BAD:

This is GOOD:

What about sapservices?

When we discussed the server start process, we mentioned that the SAP Instance Agent is always started from the /usr/sap/sapservices file settings. We also noted that in the /usr/sap/sapservices file, the setting for the ERS profile file location is correct.
So why would the cluster affect the profile location of the ERS at all?
It’s a good question, and the answer is not a simple explanation because it requires a specific circumstance to happen in the lifecycle of the cluster.

Here’s the specific circumstance:

  • Cluster starts, the ERS Instance Agent was already running and so it has the correct profile.
  • We can run “ps -ef | grep ERS” and we would see the “er” process has the correct “pf=/path-to-profile” and correctly pointing to the local copy of the profile.
  • If the ERS instance somehow is now terminated (example: “rm /tmp/.sapstream50023”) then the cluster will restart the whole SAP Instance Agent of the ERS (without a cluster failover).
  • At this point, the cluster starts the ERS Instance Agent with the wrong profile location, and the “er” binary now inherits this when it starts. This will be in place until the next complete shutdown of the ERS Instance Agent.

As you can see, it’s not an easy situation to detect, because from an outside perspective, the ERS died and was successfully restarted.
Except it was restarted with the incorrect profile location.
If a subsequent failure happens to the sapmnt file system, this would render the ERS at risk (we don’t know the exact risk because it is not mentioned in the referenced SAP note that we noted earlier).
What is more, the ERS instance is not monitorable using SAP Solution Manager (out-of-the-box); you would need to create your own monitoring element for it.

Which Documentation has this Issue?

Now we know there is a difference required for the ERS instance profile location, we need to go back and look at how the cluster configurations have been performed, because “this configuration must not be changed”!

Let’s look first at the SUSE documentation here:

Alright, the above would seem to show that “/sapmnt” is used for the ERS.
That’s not good as it doesn’t comply with the SAP note.

How about the Microsoft documentation for Azure:

No, that’s also using /sapmnt for the ERS profile location. That’s not right either.

Looking at the AWS document now:

Still /sapmnt for the ERS.

Finally, let’s look at the GCP document:

This one is a little harder, but essentially, the proposed variable “PATH_TO_PROFILE” looks like it is bound to the same one as the ASCS instance defined just above it, so I’m going to say, it’s going to be “/sapmnt” because when you try and change it on one, it forces the same on the other:

We can say that all documentation available for the main hyperscalers provides an incorrect configuration of the cluster, which could cause the ERS to operate in a way that is strongly not recommended by SAP.

Should We Correct the Issue and How Can We Correct It?

I have reported my finding to both Microsoft and SUSE, so I would expect them to validate.
However, in the past when providing such feedback, the relevant SAP note has been updated to exclude or invalidate the information altogether, rather than instigating the effort of fixing or adjusting any incorrect configuration documentation.
That’s just the way it is and it’s not my product, so I have no say in the solution, I can only report on what I know is correct at the time.

If you would like to correct the issue using the information known at this point in time, then the steps to validate that the ERS is operating and configured in the correct way are provided at a high level below:

  1. Check the cluster configuration for the ERS instance to ensure it is using the local copy of the instance profile.
  2. Check the current profile location used by the running ERS Instance Agent (on both nodes in the cluster).
  3. Double check someone has not adjusted the /usr/sap/sapservices file incorrectly (on both nodes in the cluster).
  4. Check that the instance profile “.lst” file exists in the /sapmnt/SID/profile directory, so that the ERS Instance Agent can copy the latest versions of the profile and the DEFAULT.PFL to the local directory location when it next starts.
  5. Check for any differences between the current local profile files and the files in the /sapmnt/SID/profile directory and consider executing the “sapcpe” process manually.
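
As a rough command-line sketch of those checks (assuming SID “SID”, ERS instance number 11 and virtual hostname “myhost” as in the earlier sapservices example; adjust the names and paths for your own system):

# 1. The cluster ERS primitive should reference the local profile copy:
crm configure show | grep START_PROFILE

# 2. The running ERS Instance Agent and "er" process should show the local profile in "pf=":
ps -ef | grep ERS11 | grep 'pf='

# 3. The sapservices entry for the ERS should also point to the local profile:
grep ERS11 /usr/sap/sapservices

# 4. The ".lst" file should still exist, so the profiles can be re-copied locally at instance start:
ls -l /sapmnt/SID/profile/*.lst

# 5. Compare the local profile with the central copy in /sapmnt:
diff /usr/sap/SID/ERS11/profile/SID_ERS11_myhost /sapmnt/SID/profile/SID_ERS11_myhost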

Thanks for reading.

Preventing File System Corruption from Halting Boot Up of SLES in Azure

When you create a Linux VM in Azure, you don’t get to know the “root” user password.
By default, if a Linux VM detects journaled file system corruption at boot, it will go into recovery mode, requiring the root password to be able to fix it.
Without the root password, the only other way to fix the issue is to copy the O/S disk, mount it on another VM and fix the issue there.
If you don’t have Azure Boot Diagnostics enabled, you might not even know what the problem is! The VM will just appear to not boot.

In this post I show a simple way to prevent a Linux distribution (I use SLES) from failing to boot due to file system corruption. Our example is an XFS file system, just like in my previous post.
XFS is journaled and will check the integrity on mounting. If there are problems with the file system then Linux will fail to mount it, which will cause the O/S boot up process to stall.

In a production system, you can imagine the scenario where a simple restart of a VM causes an hour long downtime (or longer).

NOTE: In my scenario there is no Linux device encryption. Encryption would make the job of repair even harder, making it all the more important to prevent boot failure.

Preventing Boot Failure

To prevent our corrupt XFS file system from halting boot, we just need to add a single option to the mount options in the /etc/fstab file.
We use the “nofail” option.

We could just go and write this straight out to the fstab file and expect it to work.
However, we can test it first to make sure that it is:

  • supported on your version/distribution of Linux.
  • supported for your file system type (mine is XFS).

We could use the “-f” (fake) option of the “mount” command, but in testing I could not get this to actually show an error when it is passed an invalid mount option.
Instead, let’s actually mount the file system to check if “nofail” is accepted.

As the root user (or with sudo) get the current mount options for your file system (the one you will be applying “nofail” to):

grep BIG /etc/fstab

/dev/volTMP/lvTMP1 /BIGSTRIPEDDISK xfs defaults 0 0

I can see that my /BIGSTRIPEDDISK is mounted from a volume group and has the “defaults” mount options. Yours may be different.
We can now create a new mount point location and temporarily mount the file system adding the “nofail” option to test it is accepted (adjust the mount options using your current mount point settings):

mkdir /mnt/tempmount
mount -o defaults,nofail /dev/volTMP/lvTMP1 /mnt/tempmount

If you got an error or warning, then the file system type or your Linux distribution does not support the use of “nofail”. Maybe check the man page for an equivalent option (“man mount”).

If you didn’t get an error, then you know that you can successfully apply the “nofail” option to the end of the options column (column number 4) in the fstab:

vi /etc/fstab
...
/dev/volTMP/lvTMP1 /BIGSTRIPEDDISK xfs defaults,nofail 0 0
...

Once applied, it is recommended that you always verify boot related changes, by taking some downtime to restart the machine. There is nothing worse than applying a change and not testing it.

With “nofail” in place, the next time the O/S boots and the file system is mounted, the O/S will move forward in the boot process and ignore the error if there are issues with the file system integrity or even if the device is missing.
There is obviously a small consequence of this: file systems may not be mounted after a boot has completed.
It is possible to mitigate this problem with monitoring (scripts that monitor file system free space, for example) or other checks after boot.
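
For example, a minimal post-boot check (a sketch only; how you alert on the output is up to you) could warn about any fstab entry carrying “nofail” that is not actually mounted:

# warn about any fstab entry with the "nofail" option that is not currently mounted
awk '$1 !~ /^#/ && $4 ~ /nofail/ {print $2}' /etc/fstab | while read -r mp; do
  mountpoint -q "$mp" || echo "WARNING: $mp is not mounted"
done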
Of course, there is also a second option to all of this: set the root user password on new VMs and store it in your secure password location. You can use a 16-character random string like those generated by a password manager.
You will also need to ensure that you can use the Azure Serial Console to get to the VM command line, because in some configurations, security practices can indirectly prevent this.

Uplifting & Expanding Linux LVM Managed Disks in Azure

One of the great things about public cloud, is the potential to simply increase the data disk space for your systems.
In the on-premise, non-virtualised world, you would have needed to physically add more disk or swap the disk for a bigger unit.
Even in the on-premise virtualised world, you may have needed more actual disk to expand the virtual hard disk.

Sometimes those disks can be storing data files for databases.
In which case, if you have followed the Azure architecture best practices, then you will be using LVM or some other volume management layer on top of the raw disk device. This will give you more flexibility and performance (through striping).

In this guide I show how to increase the disk size by uplifting the data disks in Azure, then resizing the disk devices in Linux, allowing us eventually to grow the XFS file system to a larger size.
I will discuss good reasons to uplift vs adding extra data disks.

My Initial Setup

In this step-by-step, we will be using Linux (SUSE Linux Enterprise Server 12) with LVM as our volume management software.
The assumption is that you have already created your data disks (2 of them) and striped across those disks with a single logical volume.
(Remember, the striping gives you double the IOPS when reading/writing the data to/from the disks).

In my simple example, I used the following to create my LVM setup:

  1. Add 2x data disks of size 128GB (I used 2x S10) to a VM running SLES 12 using the Azure Portal.
  2. Create the physical volumes (mine were sdd and sde on LUNs 1 and 2):
    pvcreate /dev/sdd
    pvcreate /dev/sde

  3. Create the volume group:
    vgcreate volTMP /dev/sdd /dev/sde
  4. Create the striped logical volume across the 2 disks using all the space:
    lvcreate -i 2 -l +100%FREE -n lvTMP1 volTMP
  5. Create the file system using XFS:
    mkfs.xfs /dev/mapper/volTMP-lvTMP1
  6. Mount the file system to a new mount point:
    mkdir /BIGSTRIPEDDISK
    mount /dev/mapper/volTMP-lvTMP1 /BIGSTRIPEDDISK

In Azure my setup looked like the below (I already had 1 data disk, so I added 2 more):

In the VM, we can see the file system is mounted and has a size of 256GB (2x 128GB disks):

You can double check the striping using the lvdisplay command with “-m” flag:
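
For example, using the volume group and logical volume names created above:

lvdisplay -m /dev/volTMP/lvTMP1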

Once the disk was set up, I then created a simple text file with ASCII text inside:

I also used “dd” to create a large 255GB file (leaving 1GB free):

dd if=/dev/zero of=./mybigfile.data bs=1024k count=261120

The disk usage is now close to 100%:

I ran a checksum on the large file:

Value is: 3494419206

With the checksum completed (it took a few minutes), I now have a way of checking that my file is the same before/after the disk resize, plus the cksum tool will force reading of the whole file (checking for filesystem I/O issues).

Increasing the Data Disk Size

Within the Azure portal, we first need to stop the VM:

Once stopped, we can go to each of the two data disks and uplift from an S10 (in my example) to an S15 (256GB):

We can now start the VM up again:

When the VM is running again, we can log in and check.
Our file system is the same size:

We check with the LVM command “pvdisplay” to display one of the physical disks, and we can see that the size has not changed; it is still 128GB:

We need to make LVM re-scan the disk to make it aware of the new increased size. We use the pvresize command:

Re-checking the disk using pvdisplay, and we can see it has increased to 256GB in size:

We do the same for the /dev/sde disk:
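
For reference, the commands look like this (device names from my setup; yours may differ):

pvresize /dev/sdd
pvresize /dev/sde

# confirm the new size is seen by LVM
pvdisplay /dev/sdd
pvdisplay /dev/sde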

Once the physical disks are resized (in the eyes of LVM), we can now check the volume group:

We have now got 256GB of free space (see row: “Free PE / Size”) in our volume group.

To allow our file system to get this space, the logical volume within the volume group needs to be expanded into the free space.
We use the lvresize command to make our logical volume use all free space in the volume group “+100%FREE”:

NOTE: It is also possible to specify an exact size should you want to be specific.
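
For reference, using the logical volume path from earlier, the command looks like this:

lvresize -l +100%FREE /dev/volTMP/lvTMP1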

Our file system is still only 256GB in size, until we resize it.
For XFS file systems, we use the xfs_growfs command as follows:
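
In my case, the command is run against the mount point created earlier:

xfs_growfs /BIGSTRIPEDDISK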

Checking the file system now shows it is 512GB in size, with around 50% free:

Are my files still present? Yes:

Let’s check the contents of my text file:

Finally, I validate that my big data file has not been corrupted:

Value is: 3494419206

What is the Alternative to Uplifting?

Instead of uplifting the existing data disks, it is possible to increase the amount of storage in my volume group, by adding two new additional disks.
To prevent performance issues, these new disks should be of the same scale level (S10) as the existing disks.
You should definitely not be mixing disk types in a logical volume, so to prevent this, you should not mix them in a volume group (even though you could technically separate them at the logical volume level).

Is there a good reason when to add more disks? When you are going to create a new logical volume, it is ideal to keep the data on separate physical disks to help avoid data-loss (from a lost/deleted disk).
There are also performance reasons to have additional Linux devices, since parameters such as queue depth affect the Linux device level. The Linux O/S can effectively issue more simultaneous read requests since additional data disks are additional devices.

As we have seen, to uplift a disk tier you will need to take the VM offline or detach the disk. Adding additional new disks, on the other hand, can be done entirely online. Adding new disks does pose a slight issue if you have one large striped logical volume, since the number of new disks needs to match the existing stripe layout (e.g. if you have 2 disks striped, you should add another 2 disks to increase the volume), and the striping will only ever be across the quantity of disks in the original stripe set, not across the new total (e.g. striped over 2 disks, even if you have added another 2 to make 4).

Is there a good reason when not to add more disks? When you know you could exceed the VM data disk count limitations. Each VM size has a limit: the bigger the VM, the bigger the limit.

When you know that you always leave a small proportion of disk space free. Adding more disks that will only ever be at most 80% used is more wasteful than upscaling an existing set of disks that will only ever be 80% used.

Summary

Using the power of Azure, we have increased the data disk sizes of our VM.
The increase needed the VM to be stopped and started again while the disks were uplifted from an S10 (128GB) to an S15 (256GB).

Once uplifted, we had to make LVM aware of the new disk sizes by using the pvresize command, then the free space in the volume group was given to the logical volume and finally the file system was grown.
To maintain the logical volume stripe, both disks were uplifted, then both disks needed the pvresize command running.
We validated the before and after state of our ASCII and data files and they were not corrupted.

Finally, we looked at the alternative to uplifting and saw that uplifting may not be appropriate in all cases.
Adding new disks can be done online.

Cleaning Up /tmp for SAP On SLES On Azure

In an Azure SLES 12 Linux VM, the default installation image has /tmp as a regular directory on the root (/) file system. Historically, for many Unix/Linux environments, this is not “normal”.

In this post I will discuss what the impact is for this irregular setup of /tmp and what you can do to work around it, ensuring SAP continues to work as usual.

What is “Normal”?

In traditional Linux installations the /tmp file system is usually mounted as a temporary file system (tmpfs), which means it would be cleaned on O/S reboot.
This has been the case for many years. There’s a recent post here that highlights 1994 for Solaris.
Plus, you can find a detailed explanation of tmpfs here: https://www.kernel.org/doc/html/latest/filesystems/tmpfs.html

With the default SLES setup, any files placed into /tmp will not automatically be cleaned up on reboot, and as per the previous links, there can be performance reasons to use a tmpfs.

From memory, there are differing standards on what /tmp should be used for (again see here), and it is possible that the traditional setup is no longer following a newly agreed standard. I really am not certain why SLES does not mount /tmp as tmpfs.
All I know from over 20 years of working with different Unix/Linux products is that it is generally accepted that /tmp is a dumping ground that gets cleaned on reboot, and from what I can see, SAP think the same.

What is the Impact of /tmp Not Being tmpfs?

When you have /tmp and it is not cleaned on reboot and is not a tmpfs, then it can cause issues when using software that expects some form of clean up to be performed.
When I look at some of the SAP systems in Azure on SLES 12, I see a build up of files in the /tmp directory, which results in the need for a scripted job to clean them up on a periodic cycle.

If some of the more prolific files are not cleaned up regularly, then they can build up into many thousands of files. While this shouldn’t impact the day-to-day running of the SAP system, it can impact some ad-hoc operations such as patching the SAP system or the database.
The reason is that sometimes the patching tools write out files to the /tmp area, then crudely perform a “ls” to list files or find files in that location. If there are many thousands of files, then those listing operations can fail or be delayed.
A perfect example is the patching of the SAP ASE database, which can be affected by thousands of files in the /tmp location.

Finally, with the /tmp directory mounted off the root disk, any filling of /tmp will fill your root disk and this will bring your VM to a halt pretty quickly! Be careful!

What Sort of Files Exists in /tmp ?

In the list below, I am looking specifically at SAP related files and some files that are culprits for building up in the /tmp directory.

File Name Pattern - Description

.saphostagent_nnnnn - SAP Host Agent run files.
.sapicmnnn - SAP ICM run file.
.sapstartsrv##_sapstartsrv.log - SAP Instance Agent run file.
.sapstreamnnnn - SAP IPC files.
.theagentlives.tmp - Owned by the Sybase O/S user, related to the SAP ASE instance. Maybe the JS Agent.
ctisql_* - Temporary iSQL executions using sybctrl.
sap_jvm_nnnn_nnnnnn, sapjvm_profiling_server_nnnn_nnnnn, sap_jvm_monitoringboard_nnnn_nnnnn - SAP JVM execution.
sapinst_instdir - From an execution of SWPM (contains sapinst).
saplg* - Owned by sapadm; part of the SAP Instance Agent logon ticket generated from the Host Agent.
sb* - From an ASE installation.
tmp* - Owned by root, lots and lots; possibly Azure agent related as they contain the text “Windows Azure CRP Certificate Generator” when passed through a base64 decoder.
tmp.* - Lots and lots; seem to be Kerberos related.

How Can We Clean Up these Files?

The most common way is to use a script.
Within the script will be a “find” statement, which finds the specific files and removes each one.
It needs to be done this way because, if there are too many files, trying to do “rm /tmp/tmp*” will expand to an argument list that is too long for the shell to pass to the command (“Argument list too long”), so it will either error or remove nothing at all.

The script will need to be executed as root frequently (maybe weekly or even daily) to ensure that the file quantities are kept consistently low. This can be achieved using an enterprise scheduler or a crontab on each server.
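
For example, a root crontab entry along these lines could run a clean-up script daily at 02:00 (the script path and log file below are placeholders for illustration):

# run the /tmp clean-up script daily at 02:00 (added via "crontab -e" as root)
0 2 * * *  /usr/local/bin/clean_sap_tmp.sh >> /var/log/clean_sap_tmp.log 2>&1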

Here’s an example of how to clean up the /tmp/tmp* files with a very specific criteria. The files are removed if they are:

  • located in the /tmp directory
  • with a name length of at least 7 chars beginning with ‘tmp’ followed by A-z or 0-9 at least 4 times.
  • last modified more than 7 days ago.
  • owned by root, with a group of root.

find /tmp -type f -regextype posix-awk -regex '/tmp/tmp[A-z0-9]{4,}' -mtime +7 -user root -group root -delete -print

The above will remove the files due to the “-delete”. To test it, just remove the “-delete”.

In summary, you should check how /tmp is set up in your VMs, and then check the files that are created in /tmp.