This blog contains experience gained over the years of implementing (and de-implementing) large scale IT applications/software.

Simple in-Cloud SAP LaMa DR Setup

When running the SAP Landscape Management tool (LaMa) in the cloud, you need to be aware of the tool’s importance in your SAP landscape in the context of disaster recovery (DR).

In this post I will highlight the DR strategies for hosting SAP LaMa with your favourite cloud provider.

What is SAP LaMa?

For those not yet accustomed to SAP LaMa, it is SAP’s complete SAP/non-SAP landscape management and orchestration tool for both on-premise and cloud.

SAP LaMa comes in two guises:

  • Standard Edition
  • Enterprise Edition

The Enterprise edition comes with many additional features, but crucially, it includes the “Cloud Connectors” for all the mainstream cloud vendors.
A “Cloud Connector” allows seamless start/stop/provisioning of cloud hosted VMs.

Using SAP LaMa to execute a pre-configured, ordered startup of VMs and the applications on those VMs can be a huge time saving during a disaster.

What Installation Patterns Can We Use with SAP LaMa?

SAP LaMa is a software component installed inside a standard SAP Netweaver Java stack. Therefore, you may use the standard Netweaver Java installation patterns such as single-system or distributed system.
SAP LaMa will work in either pattern.

What is a Normal Installation Pattern in the Cloud?

In the cloud (e.g. Azure, GCP, AWS etc), when installing SAP Netweaver, you would usually want to use the distributed system architecture pattern, to prevent a single VM outage from disrupting the SAP Netweaver application too much. The distributed system pattern is preferred because you have slightly less control over the patching of the physical host systems, so it affords you that little bit of extra up-time.

This usually means having: a Web Dispatcher tier, at least 2 application servers in the application tier, the Central Services (SCS) instance having failover and using Enqueue Replication Server (ERS), plus database replication technology on the database tier.


How is DR catered for in SAP LaMa?

For large organisations with business critical SAP systems like SAP S/4HANA, SAP ECC etc, you would usually have a “hot” DR database server (i.e. running and actively replicating from the primary database) in your designated DR cloud region.
This means there is minimal data loss, as the DR database is mere minutes behind the primary database in transactional consistency.
The application tier and Web Dispatcher tier would use the cloud provider’s VM replication technology (e.g. in Azure this is called Azure Site Recovery), ensuring that the application patching and config is also replicated.

I would designate the above pattern as a “hot” DR architecture pattern.

For SAP LaMa the situation is slightly more flexible because:

  1. It is not business critical, only operations critical.
  2. The database is only a repository for configuration and monitoring data. Therefore, transactional data loss is not critical.
    In fact, the configuration data in SAP LaMa can be exported into a single XML file and re-imported into another LaMa system.

Due to the above, we have some different options that we can explore for Disaster Recovery.
Excluding the “hot” DR architecture pattern, we could classify the DR architecture pattern options for SAP LaMa as “restore”, “cold”, “cool” and finally “warm”. (These are my own designators, you can call them what you like really).

What is a “restore” DR pattern for SAP LaMa?

A “restore” DR setup for SAP LaMa is when you have no pre-existing VM in your cloud DR region. Instead, you replicate your VM level backups into a geo-replicated backup storage service (in Azure this would be a geo-redundant Recovery Services vault used by Azure Backup).

In this setup, during a DR scenario, the VM backups from your primary region would need to be accessible to restore to a newly built VM in the DR region.

This is the most cost-friendly option, but there is a significant disadvantage: your system administrators will not have the benefit of LaMa to see the current state of the landscape, and they will not be able to make use of its start/stop technology.

Instead they will need a detailed DR runbook with start/stop commands and system/VM startup priority, to be able to start your critical systems in a DR scenario. You are also placing your trust in the VM backup and restore capability to get LaMa back online.

The VM backup timing could actually be an issue depending on the state of the running database at the time of backup. Therefore, you may need to also replicate and restore the database backup itself.

During a DR scenario, the pressure will be immense and time will be short.

Cost: $
Effort: !!!! (mainly all during DR)
Bonus: 0

What is a “cold” DR pattern for SAP LaMa?

A “cold” DR setup for SAP LaMa is when you have a duplicate SAP LaMa system installed in the DR cloud region, but the duplicate system is completely shut down, including the VM(s).

In this setup, during a DR scenario, the VM would need to be started using the cloud provider tools (or other method) and then the SAP LaMa system would be started.

Once running, the latest backup of the LaMa configuration would need restoring (it’s an XML file) and the cloud connectors would need connecting to the cloud provider. After connecting to the cloud provider, LaMa can then be used to start/provision the other software components of the SAP landscape into the DR cloud region.

Compared to the “restore” pattern, we can have our DR LaMa system up and running and start using it to start the VMs and applications in a pre-defined DR operation template (like a runbook).
However, we need a process in place to export the configuration from the primary LaMa system and back up that export, so that the configuration file is available during a DR scenario.

In Azure, for example, we would store the configuration file export on a geo-replicated file storage service that is accessible from multiple regions. We also have the associated hosting costs and the required patching/maintenance of the DR VM and LaMa system. As an added bonus, this pattern allows us to apply patches first to the DR LaMa system, which could remove the need for a Development LaMa system.
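
As a rough sketch (the storage account, container and file names are hypothetical), pushing each configuration export to geo-replicated storage with the Azure CLI could look like this:

# one-off: create a geo-redundant (GRS) storage account and container for the exports
az storage account create -g rg-sap-ops -n stlamaconfig --sku Standard_GRS
az storage container create --account-name stlamaconfig --name lama-exports --auth-mode login
# after each export from the primary LaMa system, upload the XML file
az storage blob upload --account-name stlamaconfig --container-name lama-exports \
   --name lama-config-$(date +%Y%m%d).xml --file /tmp/lama-config-export.xml --auth-mode login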

Cost: $$
Effort: !!! (some during DR, patching)
Bonus: +

What is a “cool” DR pattern for SAP LaMa?

A “cool” DR setup for SAP LaMa is when you have a duplicate SAP LaMa system installed in the DR cloud region, where the duplicate system is frequently started (maybe daily) and the configuration synchronised with the primary SAP LaMa system.

The synchronisation could be using the in-built configuration synchronisation of the LaMa software layer, or it could be a simple automated configuration file import from a shared file location where the configuration file has previously been exported from the primary LaMa system.
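
As a trivial sketch of the file-based variant (the mount point and paths are hypothetical), the staging step could be as simple as a scheduled copy on the DR LaMa host, with the import then performed from that staging location:

# crontab entry on the DR LaMa host: fetch the newest primary export each morning
30 6 * * * cp /mnt/lama-exports/lama-config-latest.xml /usr/sap/lama-config-staging/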

In this setup, during a DR scenario, the VM *may* need to be started (depending on when the failure happens) using the cloud provider tools (or other method), and then the SAP LaMa system *may* need to be started. Once running, the latest backup of the LaMa configuration would probably not need restoring (it’s an XML file), because your business critical systems would already exist and be configured as a result of the frequent synchronisation. The cloud connectors would need connecting to the cloud provider.
After connecting to the cloud provider, LaMa can then be used to start/provision the other software components of the SAP landscape into the DR cloud region.

Compared to the “cold” pattern, we save a little time because the frequent configuration synchronisation is already done. We can also choose to keep a process in place to export and back up the configuration from the primary LaMa system, should we want that configuration file available as a fallback.
There is an obvious cost to the frequent starting of the VM, since you pay while the VM is running.

As an added bonus, this pattern allows us to apply patches first to the DR LaMa system, which could remove the need for a Development LaMa system.

Cost: $$$
Effort: !! (a little during DR, patching)
Bonus: +

What is a “warm” DR pattern for SAP LaMa?

A “warm” DR setup for SAP LaMa is when you have a duplicate SAP LaMa system installed in the DR cloud region, where the duplicate system is constantly running with frequent (could be hourly) synchronisation with the primary SAP LaMa system.
The synchronisation could be using the in-built configuration synchronisation in the LaMa software component, or it could be a simple automated file import from a shared file location where the configuration file has been exported from the primary LaMa system.

In this setup, during a DR scenario, the cloud connectors would need connecting to the cloud provider. After connecting, LaMa can then be used to start/provision the other software components of the SAP landscape into the DR cloud region.

Like the “cool” pattern, we get an added bonus that this pattern allows us to apply patches first to the DR LaMa system, which could remove the need for a Development LaMa system.

Compared to the other patterns, we gain the immediate advantage of being able to start/stop VMs and SAP systems in the DR region. However, there is a constant cost for the VM to be running (if using a PAYG VM pricing model).

Cost: $$$$
Effort: ! (hardly any during DR, patching)
Bonus: +

Summary

Depending on your strategy, you may choose to stick to your existing architecture patterns.

You could choose to use a “hot” DR pattern, and ensure that your DR LaMa system is kept in sync with the primary.
However, for the most risk averse, I would be inclined to calculate the costs/benefits for the “warm” pattern.
A “warm” pattern also means you could forgo the distributed system installation pattern for the DR system, choosing the more cost-effective single-system pattern and removing the extra complexity of database-level replication.

For SMEs, I would favour the “cool” pattern. This could remove the need for a Development system, allowing patching to be tested on the DR system instead. I feel it represents the middle ground between using the technology and the cost.

Patching SAP LaMa 3.0 to SP17

On the 9th of November 2020 SAP released support package 17 of SAP Landscape Management 3.0.
If you already run SAP LaMa 3.0 SP11 and above, then you can quite easily patch to SP17 by installing the 3 SCA files into your existing Netweaver 7.5 Java stack.

However, things are never so easy, as I will show.

Required Netweaver Stack

Before you can patch SAP LaMa you must always read the support package release note.
For SP17, it is SAP note 2908399 in component “BC-VCM-LVM”.

In the SAP note, it states that a minimum of Netweaver 7.5 SP15 is required for LaMa 3.0 SP17, with a recommendation of Netweaver 7.5 SP17.

That’s good for me: I have Netweaver 7.5 SP16, so I should be good to patch with no issues. Right?
No. After applying the 3 SCA files for LaMa 3.0 SP17, the Netweaver stack starts and stops successfully, but when I try to log into LaMa I see the message “SAP Landscape Management is loading, please wait…” on the screen, and it does not progress any further.
When accessing Netweaver Administrator, it works perfectly.

The Error

For the sake of clarity, I also took a look at the Java stack log viewer and I could see an error:

“error binding ExecuteCustom/RMI …”, which didn’t mean a lot to me and produced no results in SAP notes.

The error record details mentioned: “com.sap.tc.vcm.engine.operation.handler.customop.CustomRMIOperationHandler.CustomRMIOperationProvider” in application: “sap.com/tc~vcm~engine~app“.

None of the above produced any SAP notes that looked vaguely related.

Let Groundhog Day Commence

I’ve been working with LVM and LaMa for a while now. When I actually checked how long, I was surprised to see my knowledge went back to 2014.
I was sure at the back of my mind there was a slight recollection of this same issue.

I started searching the SAP notes and with this recollection of a problem in mind, I decided to search for the exact message that was staring at me on the LaMa post-login screen.
It was a direct hit on the SAP note search.
SAP note 2662354 “SAP Landscape Management is loading, please wait…” is an old SAP note for SAP LaMa 3.0 SP07, back in 2018.
The SAP note described the exact same symptoms, and the failure to progress into LaMa past the loading screen.

Inside SAP note 2662354, it referenced the support package release note 2542159 “SAP Landscape Management 3.0 SP07” which states: “Install at least SAP NetWeaver Application Server 7.5 for Java Support Package 11. If you use a lower Support Package, you have to update the SAPUI5 component“.

It was all coming back to me now. In the past, to apply LaMa 3.0 SP07, you needed to patch the NW stack, or an alternative was to simply apply a higher SAPUI5 software component (SCA).

SAP UI5 – Skipping Ahead

Once I understood the potential solution (apply a later SAPUI5 SCA), I needed to validate what I had already validated in the past.
Was it still supported to apply a later SAPUI5 software component to a SAP Netweaver 7.5 Java stack?

In SAP Note 2541677 “How to switch SAPUI5 versions in NW Java 7.50 SP07 and higher SPs“, it confirms that from SAPUI5 7.5 SP07, more than one version of the UI5 library will be included. The SCA effectively becomes cumulative as each SAPUI5 version is released.

More importantly, the note says: “SAP recommends that you always implement the latest released Support Package. You are save [sic] to apply UI5 patches of higher SPs to your systems, as there as [sic] no direct dependencies.“.

That is exactly the confirmation I was looking for.

Patching SAPUI5

My Netweaver stack level is SP16. The recommended Netweaver stack level (based on that LaMa SP17 note) is SP17.
That left two options which could fix the problem:

  • The latest SCA patch level for SAPUI5 7.5 SP16
    or
  • The latest SCA patch level for SAPUI5 7.5 SP17 (taken from the NW 7.5 SP17 stack).

I decided that I would take the “slowly, slowly” approach and patch to the latest SAPUI5 7.5 SP16 patch first.
After patching and restarting the Netweaver stack, I still had the exact same problem.

Moving onto the second option, I applied the latest SAPUI5 7.5 SP17 patch level (UISAPUI5JAVA17P_18-80000708.SCA).
After patching and restarting the Netweaver stack, the issue was finally fixed!

As of Nov 11th, there is still no official documentation for this process.

Improving the Patching Efficiency

During the above problem resolution, I did not use SUM to apply the patches.
When patching SAP LaMa we are (usually) talking about only 3 software component archives.
For this reason, I prefer to patch using the Telnet deployment method.

As the Linux <sid>adm user on the JPAS host, log into the NW Telnet server port:

telnet 127.0.0.1 5##08

[enter administrator user/pw]

deploy /<path-to>/<SCA-file>

The Telnet deployment order for SP17 is as follows (a worked example session is shown after the list):

  • VCMCR4E17_0-70001063.SCA
  • VCM17_0-70001062.SCA
  • VCM_ENT17_0-70001064.SCA
  • [the SAPUI5 7.5 SP17 patch – if you are on NW SP16]
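
For example, assuming a central instance number of 08 (which makes the Telnet port 50808) and the SCA files staged in a hypothetical /mnt/patches directory, the full session would look something like this:

telnet 127.0.0.1 50808

[enter administrator user/pw]

deploy /mnt/patches/VCMCR4E17_0-70001063.SCA
deploy /mnt/patches/VCM17_0-70001062.SCA
deploy /mnt/patches/VCM_ENT17_0-70001064.SCA
deploy /mnt/patches/UISAPUI5JAVA17P_18-80000708.SCA

[close the session]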

Once deployed, the NW stack needs a full restart.

Summary

  • Patching SAP LaMa should be simple, sometimes it has issues.
  • LaMa depends on the SAPUI5 component version.
  • You may need to patch SAPUI5 to make LaMa work.
  • SAPUI5 support packages include prior versions (after 7.5 SP07).
  • SAP permits you to use a higher SP of SAPUI5 compared to the NW stack SP level.
  • It is possible to use Telnet to deploy the patches, providing you follow the correct order of deployment.

Useful Links
  • SAP Note 2908399 “SAP Landscape Management 3.0 SP17” v7
  • SAP Note 2662354 “SAP Landscape Management is loading, please wait…” v1
  • SAP Note 2541677 “How to switch SAPUI5 versions in NW Java 7.50 SP07 and higher SPs” v7
  • SAP Note 2542159 “SAP Landscape Management 3.0 SP07” v6

Critical SAP Host Agent Security Changes in PL47 – PermissionPolicy

The SAP Host Agent is a critical part of the SAP landscape infrastructure, used to control and, importantly, help automate some aspects of SAP systems.
Generally, writing custom scripts for the Host Agent has been easy.
With experience, it’s easy to see how the Host Agent could be abused in a way that allows highly privileged access to the server host, if certain security considerations are not implemented.

As of the SAP Host Agent 7.21 PL47, the security of the SAP Host Agent and the way that it executes custom scripts is changing.
In this post I will describe how this could break a few things.

What Can The Host Agent Be Used For?

In my experience I have used the Host Agent for the following:

  • Detecting SAP instances on a server host.
  • Patching SAP instances on a server host.
  • Starting/Stopping SAP instances on a server host.
  • Executing scripts on a server host.

Some of the above actions have been performed directly on the server, from SAP BPA (Business Process Automation), from scripts or from tools like Postman, and a lot of the time from SAP LaMa (Landscape Management).

See a previous post for a more detailed example: How an Azure hosted SAP LaMa Controlled SAP System Starts Up

In the majority of cases I have been calling custom scripts, written to perform specific tasks on the target server host.
The scripts are generally hosted in a central location, accessible from all server hosts, which makes it simple to call whichever script is needed.

To be able to execute a custom script, a Host Agent operation descriptor file must be deployed into the operations.d directory of the Host Agent executable directory (usually /usr/sap/hostctrl/exe or C:\Program Files\SAP\hostctrl\exe).
The descriptor allows the Host Agent to understand how to execute the custom script. It contains, for example, the target platform (Windows/Linux), the name and path of the target script, the operating system user needed to execute the script, and any parameters.
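
As a rough illustration, a minimal descriptor for a custom Linux script might look like the following (the operation name, script path and parameter are hypothetical; check the SAP Host Agent documentation for the full set of attributes):

Name: ZCustomCheck
Description: Run a custom landscape check script
Command: /sapmnt/scripts/zcustom_check.sh $[PARAM1]
Username: root
Platform: Unix
ResultConverter: flat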

On Linux, the descriptor can be specified to execute the target script as any operating system user on Linux, including the root user.
For this reason, the Host Agent and its installation directory location are owned by the root user, and all files are only modifiable by the root user.

On Windows it is more secure by default.
The Windows security mechanisms prevent the Host Agent from executing any script as any user other than the Computer SYSTEM user (this is the user that the Host Agent executes as). NOTE: I have a workaround for this which I have developed.

Even though the Host Agent installation location and descriptor location and files are not necessarily easily modified, the weakest link in the security chain is the target script/executable and the location of the target script/executable.

What is Changing With Patch Level 47?

From June 2020, with the introduction of Host Agent 7.21 PL47, a new set of security requirements (the PermissionPolicy) is introduced, making the Host Agent more secure when executing custom scripts.

In fact, the changes were introduced before PL47, probably in PL44 or PL45, as I remember seeing the PermissionPolicy check output in a previous trace file. It was evidently disabled by default in those prior patch levels.

The main changes introduced by the new PermissionPolicy are listed below (a quick compliance check is sketched after the list):

  • The target script and its directory must be owned by the same user as is specified in the descriptor file for the execution of the script, or it should be executable by the root user (Linux).
  • The script’s source directory must be writeable by this same user or root (Linux), or be writeable by the primary group of the user.
  • If the script is located on an NFS share, “root squash” must be disabled.
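
As a rough compliance check (the script path and executing user are hypothetical), you can verify a target script against these requirements from the shell before applying the patch:

# hypothetical descriptor settings: Username: sidadm, Command: /sapmnt/scripts/zcustom_check.sh
ls -ld /sapmnt/scripts                   # check directory ownership and write permissions
ls -l /sapmnt/scripts/zcustom_check.sh   # script owner should match the descriptor user (or root)
# fix ownership/permissions if required (run as root):
chown sidadm:sapsys /sapmnt/scripts/zcustom_check.sh
chmod 750 /sapmnt/scripts/zcustom_check.sh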

What Is Impacted By the New PermissionPolicy Change?

  • Any descriptor in the Host Agent operations.d directory will be impacted.
  • Any target script will be checked by the new Host Agent security policy.
  • Only Linux/Unix servers will be affected due to the way that Windows security works (as mentioned before).

Because the new security policy affects Linux and affects any descriptor, this will also have a direct impact on some SAP HANA HSR operations performed from SAP LaMa, plus impact any custom operations that you have created.

By default the new security policy is enabled in the Host Agent as soon as you apply patch level 47.

How to Minimise Disruption?

A lot of customers implement the Host Agent auto-update feature, which saves significant effort when applying the frequent SAP Host Agent patches across the entire landscape.

The auto-update feature has one downside: it’s too easy to apply a patch to the whole landscape without reading the SAP notes to discover what the patch contains or changes. Make sure you always read the notes, and make sure your auto-update architecture is designed to allow selective roll-out of Host Agent patches to a portion of your landscape at a time (not the whole landscape in one go).

See here for a brief overview of SAP Host Agent auto-update.

SAP note 2932953 mentions a method of adjusting the descriptor file to disable the new PermissionPolicy setting completely.
However, this needs pro-active adjustment, since some of the affected operations may only be used in a HANA HSR failover scenario (you will not know they don’t work until you need to use them).

Disabling the new security policy is obviously not a long term solution, since it could be enforced in the future.

Remember: Make your desired PermissionPolicy changes to your descriptor files before you apply the Host Agent patch.

HowTo: Show Current Role of a HA SAP Cloud Connector

If you have installed the SAP Cloud Connector, you will know that out-of-the-box it is capable of providing a High Availability feature at the application layer.

Essentially, you can install 2x SAP Cloud Connectors on 2x VMs and then they can be paired so that one acts as Master and one as “Shadow” (secondary).

The Shadow instance connects to the Master to replicate the required configuration.

If you decide to patch the Cloud Connector (everything needs patching right?!), then you can simply patch the Shadow instance, trigger a failover then patch the old Master.

There is only one complication: it’s not “easy” to see which instance is acting in which role unless you log into the web administration console.

You can go through the log files to see which instance has taken over the Master role at some point, but this is not easy and doesn’t lend itself to being scripted for automated detection of the current role.

Here’s a nice easy way to detect the current role, which could be used (for example) as part of a custom instance monitor script for SAP LaMa automation of the Cloud Connector:

awk '/<haRole>/ { match($1,/<haRole>(.*)<\/haRole>/,role); if (role[1] != "" ) { print role[1]; exit } }' /opt/sap/scc/scc_config/scc_config.ini

The output will be either “shadow” or “master”.

I use awk a lot of the time for pattern group matching because I like its simplicity; it’s a powerful tool and deserves its very long O’Reilly book.

Here’s what that single code line is doing:

  • awk – the call to the program binary.
  • ' – starts the inline AWK script (prevents interpretation by the shell).
  • /<haRole>/ – match every line that contains the <haRole> tag.
  • { – on each matching line, execute this block of code (closed with “}”).
  • match($1, – run the match against the 1st space-delimited field on the line.
  • /<haRole>(.*)<\/haRole>/, – capture any text “.*” between the <haRole> tags.
  • role) – store the match results in a new array called “role” (this three-argument form of match() is a gawk extension).
  • if (role[1] != "" ) – check that the capture group matched; role[0] holds the complete matched string and role[1] holds the captured text.
  • { print role[1]; exit } – print the captured text and exit.
  • }' – close the block and the AWK script.
  • /opt/sap/scc/scc_config/scc_config.ini – the name of the input file for AWK to scan.

It’s a nice simple way of checking the current role, and can be embedded into a shell script for easy execution.
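
As a minimal sketch (assuming the default /opt/sap/scc installation path; the script name is hypothetical), a wrapper suitable for something like a LaMa custom instance monitor might look like this:

#!/bin/bash
# scc_harole.sh - print the current HA role of the local SAP Cloud Connector.
CONFIG=/opt/sap/scc/scc_config/scc_config.ini
if [ ! -r "$CONFIG" ]; then
   echo "ERROR: cannot read $CONFIG" >&2
   exit 2
fi
ROLE=$(awk '/<haRole>/ { match($1,/<haRole>(.*)<\/haRole>/,role); if (role[1] != "" ) { print role[1]; exit } }' "$CONFIG")
case "$ROLE" in
   master|shadow) echo "$ROLE" ;;                 # exit 0: role determined
   *) echo "ERROR: role not found" >&2; exit 1 ;;
esac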

With SAP LaMa you can auto-save on your Azure Managed Disk Costs

By now, most people know that after you’ve moved your SAP landscape to the cloud, you could save hosting costs by shutting down SAP system VMs when they are expected to not be used.
(There are caveats around this as it depends on whether you’re paying for reserved instances).

But did you know there’s also an extra saving that can be had in the cloud?

For SAP to support your SAP systems in Microsoft Azure, you must use Premium tier storage.
The reason for this is primarily because Premium tier storage comes with an SLA from Microsoft, which means you are expected to receive a certain level of performance from those disks.
However, you pay more for this SLA and the promised performance, which is quite correct when you’re using the disk. But what about when you’re not using the disk?

Right now, in the Azure “West Europe” region, a Premium tier P10 disk (SSD, 128GiB in size with 500 IOPS and 100MB/s throughput), will cost you £16.16 per month, excluding any deals and discounts (such as Azure Managed Disk Reservation).
The P10 is probably the work-horse of the majority of mid-sized server estates. Microsoft recommend a P10 as the Linux root disk for SUSE Linux based HANA database M-Series Azure VMs.

At the other end, the cost of a Standard tier E10 disk (SSD, 128GiB with 500 IOPS and 60MB/s throughput) is £7.16 per month, with the only performance differences being the throughput and the SLA.

So for the same size disk, although with lower throughput, we pay £9 per month less (55% less). I am going to say this saving is roughly 30 pence per day.

(There is one caveat: for Standard SSD disks like the E10, you pay a transaction fee of 0.1 pence (£0.001) for every 10,000 256KiB I/O operations.
However, we will see in a moment that this transaction fee will not impact us and our saving.)

Here’s how we can save money on this Premium managed disk.

In Microsoft Azure, you can change the disk tier from Premium to Standard when the VM to which the disk is attached is shut down (deallocated).
It’s simple, you just use the Azure Portal to change the disk configuration once the VM is shutdown.

While this is nice for just a couple of disks, this is not something you’re going to want to do on a regular basis.
Don’t forget, before you start the VM you need to switch the disk tier back to Premium (to retain your SAP support).
So for mass-changes, you may want to use PowerShell to adjust the disks before starting the VMs.
This itself could become a bit of a burden, since you would then lose the ability to mass-power-on VMs from the Azure Portal completely. You would need to use PowerShell all of the time, or set up an Azure based operation schedule (such as Power Automate, previously Microsoft Flow). A scripted sketch of the disk tier switch is shown below.
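
As a rough sketch of the scripted route (shown here with the Azure CLI rather than PowerShell; the resource group and VM names are hypothetical), the tier switch could look like this:

# assumed names: resource group rg-sap-dev, VM vm-sapapp01
RG=rg-sap-dev
VM=vm-sapapp01
# the VM must be deallocated before a disk SKU can be changed
az vm deallocate --resource-group "$RG" --name "$VM"
# drop every data disk to Standard SSD while the VM is parked
for ID in $(az vm show -g "$RG" -n "$VM" --query "storageProfile.dataDisks[].managedDisk.id" -o tsv); do
   az disk update --ids "$ID" --sku StandardSSD_LRS
done
# ...and before start-up, switch back to Premium (to retain SAP support)
for ID in $(az vm show -g "$RG" -n "$VM" --query "storageProfile.dataDisks[].managedDisk.id" -o tsv); do
   az disk update --ids "$ID" --sku Premium_LRS
done
az vm start --resource-group "$RG" --name "$VM"

This is exactly the sequence that SAP LaMa can automate for you, as described next.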

This is where SAP Landscape Management (LaMa) really comes into its own.
With SAP LaMa, your BASIS team can:

  • Perform the start-up & shutdown of the SAP relevant VMs.
  • Perform the start-up & shutdown of the SAP systems on the VMs once they have been started (or the reverse).
  • Use the inbuilt scheduling capability of SAP LaMa to schedule the VM and SAP system operations (full automation of start-up and shutdown operations of the whole stack).

The security capabilities of Azure, coupled with SAP LaMa, mean that the BASIS team can only perform specific VM related operations on the SAP VMs, which gives the cloud Ops team peace of mind.

Now for the best bit.
To be able to save money on managed disk costs in Azure, the SAP BASIS administrator merely has to tick a tickbox in the SAP LaMa cloud provider settings, “Change Storage Type to save costs”.

The next time the VM is de-allocated, SAP LaMa automatically changes the disk configuration in Azure, to a lower cost disk tier.
As we mentioned earlier, since the start/stop is controlled by SAP LaMa, it knows to switch the disk back to Premium tier during the start-up operation.

How simple is that!

As mentioned, there are some complications around any reservation payments for managed disk, so you need to understand what you’re paying for, before just enabling the tick-box!

Here are my very basic calcs for our P10/E10 disk combination example:

  • Weekends per year: 52
  • Saving per weekend: 60 pence
  • Total possible saving per year for 1 disk if it was unused every weekend: £31.20

Now let’s imagine that saving opportunity was applied across your 100 server estate, whereby every server had at least 1x P10 disk.
You can’t shutdown production, because it’s 24/7, but you don’t do development & testing round-the-clock and you have no international locations, so we are going to imagine our SAP estate is maybe 70% applicable to this saving opportunity. That’s 70 servers x £31.20 equals a saving of £2,184 per year on managed disk, by ticking a tickbox.

These are obviously just best guesses, but it shows how costs can build up and can also be reduced.

Happy ticking.