This blog contains experience gained over the years of implementing (and de-implementing) large scale IT applications/software.

SAP ICM/Web Dispatcher CLI Param Change Error -1

This specific error/situation has zero results in a Google search, so it’s worth documenting.
You’re welcome!

Scenario:

You’re trying to use the icmon or wdispmon to modify a profile parameter value, but during the process you receive an error for the specific parameter.
Example:

Reading value for parameter icm/HTTP/redirect failed (-1)

The error is reported and the menu just loops to the same menu you were on before.
Inside the trace file (dev_webdisp in our example), we see:

*** ERROR => IcmHandleMonAdmMsg: unknown protocol/service: HTTP/, opcode ICM_COM_OP_GET_SUBHDL_PARM. [icxxmsg.c    3440]

Notice that the “protocol” above is “HTTP”. It is trying to use HTTP to talk to the webdisp/icm.
If no HTTP port is present then it fails.

Solution:

Add an HTTP port using the web administration GUI.
In my case, after adding port 80## to the instance, I was able to make the change manually using wdispmon. Neither step needed a restart of the Web Dispatcher. 🙂

Cluster Config Issue for SAP ERS Instance

Running SAP Netweaver A(SCS)/ERS in a Pacemaker cluster in Azure, AWS or GCP or potentially even on-premise?

Be aware, there is a configuration issue in the current version of the Microsoft, AWS, GCP and SUSE documentation for the Pacemaker cluster configuration (on SLES with non-native SystemD Startup Framework) for the SAP Enqueue Replication Server (ERS) instance primitive in an ENSA1 (classic Enqueue) architecture.


Having double checked with both the SAP and SUSE documentation (I don’t have access to check RHEL) I believe that the SAP certified SUSE cluster design is correct, but that the instructions to configure it are not in line with the SAP recommendations.

In this post I explain the issue, what the impact is and how to potentially correct it.


NOTE: This is for SLES systems that are not using the new “Native Startup Framework for SystemD” Services for SAP, see here.

Don’t put your SAP system at risk, get a big coffee and let me explain below.

SAP ASCS and High Availability

The Highly Available (HA) cluster configuration for the SAP ABAP Central Services (ASCS) instance is critical to successful operation of the SAP system, with the SAP Enqueue (EN) process being the guardian of the SAP application level logical locks (locks on business objects) and the SAP Enqueue Replication Server (ERS) instance being the guarantor for the EN when a cluster failover occurs.

In a two-node HA SAP ASCS/ERS setup, the EN process is on the first server node and the ERS instance is on the second node.
The EN process (within the ASCS instance) is replicating to the ERS instance (or rather, the ERS is pulling from the EN process on a loop).


If the EN process goes down, the surrounding cluster is usually configured to fail over the ASCS instance to the secondary cluster node where the ERS instance is running. The EN process will then take ownership of the replica in-memory lock table:

What is an ideal Architecture design?

In an ideal design, according to the SUSE documentation (again, I’m sure RHEL is similar, but if someone can verify that would be great), the ASCS and ERS instances are installed on cluster controlled disk storage on both nodes in the cluster:

We mount the /sapmnt (and potentially /usr/sap/SID/SYS) file system from NFS storage, but these file systems are *not* cluster controlled file systems.
The above design ensures that the ASCS and ERS instances have access to their “local” file system areas before they are started by the cluster. In the event of a failover from node A to node B, the cluster ensures the relevant file system area is present before starting the SAP instance.

We can confirm this by looking at the SAP official filesystem layout design for an HA system here:

What is Microsoft’s Design in Azure?

Let’s look at the cluster design on Microsoft’s documentation here:

It clearly shows that /usr/sap/SID/ASCS and /usr/sap/SID/ERS are stored on the HA NFS storage.
So this matches with the SAP design.

What is Amazon’s design in AWS?

If we look at the documentation provided by Amazon here:

We can see that they are using EFS (Elastic File System) for the ASCS and ERS instance locations, which is mounted using the NFS protocol.
So the design is the same as Microsoft’s and again this matches the SAP design.

What is Google’s design in GCP?

If we look at the documentation provided by Google, the diagram doesn’t show clearly how the filesystems are provided for the ASCS and ERS, so we have to look further into the configuration step here:

The above shows the preparation of the NFS storage.

Later in the process we see in the cluster configuration that the ASCS and ERS file systems are cluster controlled:

and

The above is going to mount /usr/sap/SID/ASCS## or /usr/sap/SID/ERS## and they will be cluster controlled.
Therefore the GCP design is also matching the SAP, Azure and AWS designs.

Where is the Configuration Issue?

So far we have:

  • Understood that /sapmnt is not a cluster controlled file system.
  • Established that Azure, AWS, GCP and the SUSE documentation are in alignment regarding the cluster controlled file systems for the ASCS and the ERS.

Now we need to pay closer attention to the cluster configuration of the SAP instances themselves.

The Pacemaker cluster configuration of a SAP instance involves 3 (or more) different cluster resources: Networking, Storage and SAP instance, with the “SAP Instance” resource being the actual running SAP software process(es).

Within the cluster the “SAP Instance” Resource Agent (RA) is actually called “SAPInstance” and in the cluster configuration it takes a number of parameters specific to the SAP instance that it is controlling.
One of these parameters is called “START_PROFILE” which should point to the SAP instance profile.

The SAP instance profile file is an executable script (on Linux) that contains all the required commands and settings to start (and stop) the SAP instance in the correct way and also contains required configuration for the instance once it has started. It is needed by the SAP Instance Agent (sapstartsrv) and the executable that is the actual SAP instance (ASCS binaries: msg_server and enserver, ERS binary: enrepserver).
Without the profile file, the SAP Instance Agent cannot operate a SAP instance and the process that is the instance itself is unable to read any required parameter settings.

Usually, the SAP Instance Agent knows where to find the instance profile because at server startup, the sapinit script (triggered through either systemd unit-file or the old Sys-V start scripts) will execute the entries in the file /usr/sap/sapservices.
These entries call the SAP Instance Agent for each SAP instance and they pass the location of the start profile.

Here’s a diagram from a prior blog post which shows what I mean:

In our two-node cluster setup example, after a node is started, we will see 2 SAP Instance Agents running, one for ASCS and one for ERS. This happens no matter what the cluster configuration is. The instance agents are always started and they always start with the profile file specified in the /usr/sap/sapservices file.
NOTE: This changes in the latest cluster setup in SLES 15, which is a pure SystemD controlled SAP Instance Agent start process.

The /usr/sap/sapservices file is created at installation time. So it contains the entries that the SAP Software Provisioning Manager has created.
The ASCS instance entry in the sapservices file looks like so:

LD_LIBRARY_PATH=/usr/sap/SID/ASCS01/exe:$LD_LIBRARY_PATH; export LD_LIBRARY_PATH; /usr/sap/SID/ASCS01/exe/sapstartsrv pf=/usr/sap/SID/SYS/profile/SID_ASCS01_myhost -D -u sidadm

But the ERS instance entry looks slightly different:

LD_LIBRARY_PATH=/usr/sap/SID/ERS11/exe:$LD_LIBRARY_PATH; export LD_LIBRARY_PATH; /usr/sap/SID/ERS11/exe/sapstartsrv pf=/usr/sap/SID/ERS11/profile/SID_ERS11_myhost -D -u sidadm

If we compare the “pf=” (profile) parameter entry between ASCS and ERS after installation, you will notice the specified ERS instance profile location is not using the location accessible under the link /usr/sap/SID/SYS.
Since we have already investigated the file system layout, we know that the “SYS” location only contains links, which point to other locations. In this case, the ASCS is looking for sub-directory “profile”, which is a link to directory /sapmnt/SID/profile.
The ERS on the other hand, is using a local copy of the profile.
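The difference between the two entries can also be checked with a few lines of Python. This is only a minimal sketch: the helper names are my own (hypothetical), the two sample lines are the sapservices entries quoted above, and "shared" here simply means a path that resolves to sapmnt, either directly or through the SYS link.

```python
import re

# Matches the pf=<path> argument passed to sapstartsrv.
PF_PATTERN = re.compile(r"\bpf=(\S+)")


def profile_path(command_line):
    """Return the value of the pf= argument, or None if there is none."""
    match = PF_PATTERN.search(command_line)
    return match.group(1) if match else None


def uses_shared_profile(path, sid):
    """True when the profile is reached via sapmnt (directly or via the SYS link)."""
    shared_prefixes = (f"/sapmnt/{sid}/profile/", f"/usr/sap/{sid}/SYS/profile/")
    return path.startswith(shared_prefixes)


ascs_entry = (
    "LD_LIBRARY_PATH=/usr/sap/SID/ASCS01/exe:$LD_LIBRARY_PATH; export LD_LIBRARY_PATH; "
    "/usr/sap/SID/ASCS01/exe/sapstartsrv pf=/usr/sap/SID/SYS/profile/SID_ASCS01_myhost -D -u sidadm"
)
ers_entry = (
    "LD_LIBRARY_PATH=/usr/sap/SID/ERS11/exe:$LD_LIBRARY_PATH; export LD_LIBRARY_PATH; "
    "/usr/sap/SID/ERS11/exe/sapstartsrv pf=/usr/sap/SID/ERS11/profile/SID_ERS11_myhost -D -u sidadm"
)

for entry in (ascs_entry, ers_entry):
    path = profile_path(entry)
    kind = "shared (sapmnt)" if uses_shared_profile(path, "SID") else "local copy"
    print(path, "->", kind)
```

The same profile_path() helper works on any command line that contains a pf= argument, so it could equally be fed the output of "ps -ef" for a running instance.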

This is something that usually would go unnoticed, but the ERS must be using a local copy of the profile for a reason?
Looking at SAP notes, we find SAP note 2954193, which explains that an ERS instance in an ENSA1 architecture should be started using a local instance profile file:

Important part: “this configuration must not be changed“.
Very interesting. It doesn’t go into any further detail, but from what we have understood about the criticality of the SAP ASCS and ERS instances, we have to assume that SAP have tested a specific failure scenario (maybe failure of sapmnt) and deemed it necessary to ensure that the ERS instance profile is always available.
I can’t think of any other reason (maybe you can?).

The next question: how does that ERS profile get created in that local “profile” location? It’s not something the other instances do.
After some testing it would appear that the “.lst” file in the sapmnt location is used by the SAP Instance Agent to determine which files to copy at instance startup:

It is important to notice that the DEFAULT.PFL is also copied by the above process.
Make sure you don’t go removing that “.lst” file from “/sapmnt/SID/profile”, otherwise those local profiles will be outdated!

To summarise in a nice diagram, this setup is BAD:

This is GOOD:

What about sapservices?

When we discussed the start process of the server, we mentioned that the SAP Instance Agent is always started from the /usr/sap/sapservices file settings. We also noted how in the /usr/sap/sapservices file, the settings for the ERS profile file location are correct.
So why would the cluster affect the profile location of the ERS at all?
It’s a good question, and the answer is not a simple explanation because it requires a specific circumstance to happen in the lifecycle of the cluster.

Here’s the specific circumstance:

  • Cluster starts, the ERS Instance Agent was already running and so it has the correct profile.
  • We can run “ps -ef | grep ERS” and we would see the “er” process has the correct “pf=/path-to-profile” and correctly pointing to the local copy of the profile.
  • If the ERS instance somehow is now terminated (example: “rm /tmp/.sapstream50023”) then the cluster will restart the whole SAP Instance Agent of the ERS (without a cluster failover).
  • At this point, the cluster starts the ERS Instance Agent with the wrong profile location, and the “er” binary now inherits this when it starts. This will be in place until the next complete shutdown of the ERS Instance Agent.

As you can see, it’s not an easy situation to detect, because from an outside perspective, the ERS died and was successfully restarted.
Except it was restarted with the incorrect profile location.
If a subsequent failure happens to the sapmnt file system, this would render the ERS at risk (we don’t know the exact risk because it is not mentioned in the referenced SAP note that we noted earlier).
What is more, the ERS instance is not monitorable using SAP Solution Manager (out-of-the-box); you would need to create your own monitoring element for it.

Which Documentation has this Issue?

Now we know there is a difference required for the ERS instance profile location, we need to go back and look at how the cluster configurations have been performed, because of “this configuration must not be changed”!

Let’s look first at the SUSE documentation here:

Alright, the above would seem to show that “/sapmnt” is used for the ERS.
That’s not good as it doesn’t comply with the SAP note.

How about the Microsoft documentation for Azure:

No that’s also using /sapmnt for the ERS profile location. That’s not right either.

Looking at the AWS document now:

Still /sapmnt for the ERS.

Finally, let’s look at the GCP document:

This one is a little harder, but essentially, the proposed variable “PATH_TO_PROFILE” looks like it is bound to the same one as the ASCS instance defined just above it, so I’m going to say, it’s going to be “/sapmnt” because when you try and change it on one, it forces the same on the other:

We can say that all documentation available for the main hyperscalers provides an incorrect configuration of the cluster, which could cause the ERS to operate in a way that is strongly not recommended by SAP.

Should we correct the issue and How can we correct the issue?

I have reported my finding to both Microsoft and SUSE, so I would expect them to validate.
However, in the past when providing such feedback, the relevant SAP note has been updated to exclude or invalidate the information altogether, rather than instigating the effort of fixing or adjusting any incorrect configuration documentation.
That’s just the way it is and it’s not my product, so I have no say in the solution, I can only report on what I know is correct at the time.

If you would like to correct the issue using the information known at this point in time, then the steps to be taken to validate that the ERS is operating and configured in the correct way are provided at a high level below:

  1. Check the cluster configuration for the ERS instance to ensure it is using the local copy of the instance profile.
  2. Check the current profile location used by the running ERS Instance Agent (on both nodes in the cluster).
  3. Double check someone has not adjusted the /usr/sap/sapservices file incorrectly (on both nodes in the cluster).
  4. Check that the instance profile “.lst” file exists in the /sapmnt/SID/profile directory, so that the ERS Instance Agent can copy the latest versions of the profile and the DEFAULT.PFL to the local directory location when it next starts.
  5. Check for any differences between the current local profile files and the files in the /sapmnt/SID/profile directory and consider executing the “sapcpe” process manually.
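Step 5 above can be sketched in Python as a byte-for-byte comparison of the shared and local profile files. This is a rough illustration only: the function name is hypothetical, the directory and file names in the usage comment are taken from the examples earlier in this post, and it assumes the listed files exist in the shared directory.

```python
import filecmp
from pathlib import Path


def outdated_local_profiles(shared_dir, local_dir, names):
    """Return the names whose local copy is missing or differs from the shared copy.

    Assumes every name exists in shared_dir; filecmp.cmp raises an error otherwise.
    """
    stale = []
    for name in names:
        shared_file = Path(shared_dir) / name
        local_file = Path(local_dir) / name
        # shallow=False forces a comparison of the file contents, not just the metadata.
        if not local_file.is_file() or not filecmp.cmp(shared_file, local_file, shallow=False):
            stale.append(name)
    return stale


# Example call (paths follow the examples in this post, adjust for your system):
# outdated_local_profiles("/sapmnt/SID/profile", "/usr/sap/SID/ERS11/profile",
#                         ["SID_ERS11_myhost", "DEFAULT.PFL"])
```

An empty list means the local copies match the files in sapmnt; anything else is a candidate for a manual “sapcpe” run.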

Thanks for reading.

SAP Netweaver ICM Fast Channel Architecture

SAP Netweaver has been around for many, many years now. In fact we have had very nearly 20 years of Netweaver.
Back in March 2001, SAP acquired TopTier and went on to use TopTier’s application as the underpinning to the SAP Netweaver application server (WebAS).
Now this would not have been the Netweaver Java stack, that was to come later in the form of WebAS 6.30.
My point is, you would imagine by now that Netweaver is known inside and out by most BASIS professionals, but this is just not the case. It’s a complex and very capable application server and there are things that we know and things that we know in detail.
One of the things that seems to be little known is the FCA and its role within the ICM of the Netweaver Java stack.

In this post I want to explain the function of the SAP Netweaver Internet Communication Manager (ICM) Fast Channel Architecture (FCA) and how this is responsible for routing the HTTP communications to your Netweaver Java stack.

As usual, a little context will help set the scene.

A History of Netweaver Java

Before Netweaver 7.1, the Java stack did not have an Internet Communication Manager (ICM). This was reserved only for the Netweaver ABAP stack.
Instead, these old Netweaver Java versions had additional Java nodes (JVMs) called dispatcher nodes (in addition to the server0 node).

The dispatcher node was responsible for receiving and dispatching the inbound HTTP requests to the server nodes of the instance.

The ICM Was Added

Since Netweaver 7.1, the Java stack was given the ICM, which runs from the Kernel binaries, instead of a JVM.


The benefits of this change were:

  • Faster startup and response time (Kernel is C++ compiled binary code).
  • Smaller memory requirements.
  • Same ICM in Netweaver ABAP and Netweaver Java (same Kernel DB independent part).
  • Use of profile files for configuration (for SSL, security, memory params) instead of ConfigTool.

Identifying the FCA

We know the ICM is visible as a separate binary executable process at the operating system level.
In Windows we see “icman.exe” and in Unix/Linux we see “icman”.
At execution, the icman program reads the instance profile to determine its configuration.

The Fast Channel Architecture (FCA) is a specific, dedicated set of memory pipes (MPIs) in the shared memory region, accessible by both the ICM and the Java server nodes and used as a method of super fast inter-process communication between the ICM and the Java server nodes.
In Linux, shared memory segments are visible using the “ipcs -m” command. In Windows these are memory mapped files and you cannot see them so easily; you would need a 3rd party tool.

Using shared memory and the concept of memory pipes avoids the need for the data in an HTTP request/response to be sent from the ICM to the Java Server node. Instead of sending the actual data, a simple memory pointer can be sent (smaller and consistent in size), telling the Java Server node where to look in memory for the data.
Effectively what this means is that the shared memory area for the MPIs, sits logically between the ICM and the Java Server nodes.
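To make the “pass a pointer, not the data” idea concrete, here is a toy sketch in Python. It is purely conceptual and is not how the ICM is implemented: the SharedArea class, its size and the sample requests are all made up for illustration. The payload is written once into a shared buffer, and only a small (offset, length) reference travels through the FIFO queue.

```python
from collections import deque


class SharedArea:
    """Toy stand-in for the MPI shared memory region (not SAP's implementation)."""

    def __init__(self, size):
        self.buffer = bytearray(size)
        self.next_free = 0

    def write(self, payload):
        """Store the payload once and return a small (offset, length) reference."""
        end = self.next_free + len(payload)
        if end > len(self.buffer):
            raise MemoryError("shared area is full")
        offset = self.next_free
        self.buffer[offset:end] = payload
        self.next_free = end
        return offset, len(payload)

    def read(self, reference):
        """Read the payload back using only the reference."""
        offset, length = reference
        return bytes(self.buffer[offset:offset + length])


area = SharedArea(1024)
fca_queue = deque()  # FIFO queue that carries references only

# "ICM" side: write each request into the shared area, queue the reference.
for request in (b"GET /a HTTP/1.1", b"GET /b HTTP/1.1"):
    fca_queue.append(area.write(request))

# "Server node" side: pull the oldest reference and read the data in place.
first_request = area.read(fca_queue.popleft())
print(first_request)
```

Each queue entry stays the same small size regardless of how large the request is, which is the point made above.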

According to the Netweaver AS Java documentation, the FCA is itself just another MPI, that acts as a FIFO queue.
The HTTP requests coming into the ICM via a TCP port, travel through a regular (anonymous) MPI, before the ICM dispatches the request into a specific FCA queue.
If you have two server nodes on your Java stack (server0 and server1), then the ICM will query the server node to determine the back-end load, then push the request to the specific FCA queue of the target server node that has capacity to handle the request.
Therefore, if you have two server nodes, you will have a dedicated FCA queue for each.
It is the responsibility of the Java server node, to create the FCA queue in the ICM shared memory during start-up.

Once the HTTP request (or rather, the memory pointer to the request) hits the FCA, it becomes the responsibility of the Java server node to pull the request off the queue into a thread for processing.
Inside the Java Server node, these threads are known as the FCA threads or HTTP Worker Threads.
If you run a SAP PI/PO system, then you may already be familiar with these threads and their configuration.
You may have seen these threads when running thread dumps for SAP support incidents.

There are two methods to actually see the FCA Queues:

  • Within the SAP ICM Web Administration page.
  • Using the “icmon” command line tool.

We can call the icmon tool as follows:

icmon pf=<path-to-instance-profile>

then from the menu select "m"
then from the menu select "y"

Once the MPI list is dumped (option “y”), the FCA queues are visible at the end of the output:

...
MPI<174>: 4d50494d 'ANON' 11 50 0 0 0 0(4996) 1(30001) 1(30001)
MPI<173>: 4d50494d 'ANON' 10 50 0 0 0 0(4996) 1(30001) 1(30001)
MPI<60>: 4d50494d 'TS1_00_1234650_HTTP_WAIT' 5 -1 20 0 0 0(4996) 1(10002) 0(-1)
MPI<5f>: 4d50494d 'TS1_00_1234650_HTTP' 4 -1 20 0 0 0(4996) 1(10002) 1(30001)
MPI<58>: 4d50494d 'TS1_00_1234651_HTTP_WAIT' 2 -1 20 0 4406 0(4996) 1(10003) 0(-1)
MPI<57>: 4d50494d 'TS1_00_1234651_HTTP' 7 -1 20 0 0 0(4996) 1(10003) 1(30001)
MPI<52>: 4d50494d 'TS1_00_1234650_P4' 6 -1 20 0 0 0(4996) 1(10002) 1(30001)
MPI<4d>: 4d50494d 'TS1_00_1234651_P4' 3 -1 20 0 0 0(4996) 1(10003) 1(30001)
MPI<4>: 4d50494d 'ANON' 1 1 0 0 0 0(4996) 1(30001) 1(30001)
MPI<2>: 4d50494d 'ANON' 0 1 0 0 0 0(4996) 1(30001) 1(30001)
 
    q - quit
    m - menue 

NOTE: For those interested, the 4d 50 49 4d at the beginning of each line, translates from HEX to ASCII as “MPIM”.
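If you want to verify that conversion yourself, a one-liner in Python does it:

```python
# Decode the hex string seen at the start of each MPI line into ASCII text.
tag = bytes.fromhex("4d50494d").decode("ascii")
print(tag)  # prints MPIM
```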

In my example, you can see I have 2 Java server nodes registered at this ICM: 1234650 and 1234651.
You will notice that there are 3 queues for each Java server node.
The P4 queue is self explanatory, it is used to talk to the Java server node on its P4 port (SAP proprietary protocol) and is probably used to acquire capacity/load information from the server node.
Of the other 2 queues, one queue is the “WAIT” queue and is where (I think) the inbound requests (destined to the Java server node) are held, before they enter the other request queue which is where (I think) the Java server node is waiting to process the requests.
(There is not a great deal of documentation on the above, but I have seen instances where the WAIT queue fills, which makes me believe it’s a holding area).
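For longer dumps, the named queues can be grouped per server node with a small Python sketch. The naming scheme used in the regular expression (SID, instance number, node ID, then the queue type) is my assumption, inferred only from the dump above and not from any SAP documentation; the function name is hypothetical.

```python
import re
from collections import defaultdict

# The MPI dump lines from the icmon output shown above.
DUMP = """\
MPI<174>: 4d50494d 'ANON' 11 50 0 0 0 0(4996) 1(30001) 1(30001)
MPI<173>: 4d50494d 'ANON' 10 50 0 0 0 0(4996) 1(30001) 1(30001)
MPI<60>: 4d50494d 'TS1_00_1234650_HTTP_WAIT' 5 -1 20 0 0 0(4996) 1(10002) 0(-1)
MPI<5f>: 4d50494d 'TS1_00_1234650_HTTP' 4 -1 20 0 0 0(4996) 1(10002) 1(30001)
MPI<58>: 4d50494d 'TS1_00_1234651_HTTP_WAIT' 2 -1 20 0 4406 0(4996) 1(10003) 0(-1)
MPI<57>: 4d50494d 'TS1_00_1234651_HTTP' 7 -1 20 0 0 0(4996) 1(10003) 1(30001)
MPI<52>: 4d50494d 'TS1_00_1234650_P4' 6 -1 20 0 0 0(4996) 1(10002) 1(30001)
MPI<4d>: 4d50494d 'TS1_00_1234651_P4' 3 -1 20 0 0 0(4996) 1(10003) 1(30001)
MPI<4>: 4d50494d 'ANON' 1 1 0 0 0 0(4996) 1(30001) 1(30001)
MPI<2>: 4d50494d 'ANON' 0 1 0 0 0 0(4996) 1(30001) 1(30001)
"""

# Assumed naming scheme: '<SID>_<instance no>_<node id>_<queue type>'
QUEUE_NAME = re.compile(r"'(?P<sid>[A-Z0-9]{3})_(?P<inst>\d{2})_(?P<node>\d+)_(?P<queue>\w+)'")


def queues_by_node(dump_text):
    """Group the named (non-anonymous) queues by Java server node ID."""
    nodes = defaultdict(list)
    for line in dump_text.splitlines():
        match = QUEUE_NAME.search(line)
        if match:  # 'ANON' pipes do not match and are skipped
            nodes[match.group("node")].append(match.group("queue"))
    return dict(nodes)


for node, queues in queues_by_node(DUMP).items():
    print(node, sorted(queues))
```

For the dump above this lists the HTTP, HTTP_WAIT and P4 queues under each of the two node IDs.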

In the dev_icm trace we can also see the joining of the server nodes to the ICM for the HTTP protocol (other protocols are supported, such as Telnet, P4):

[Thr 140608759801600] Wed Mar 17 22:59:32:934 2021
[Thr 140608759801600] JNCMIHttpCallLBListener: node 1234650, service Http joins load balancing
[Thr 140608759801600] HttpJ2EELbPut: server 1234650 started protocol HTTP, attached to request queue TS1_00_1234650_HTTP
[Thr 140608759801600] JNCMIHttpMsPutLogon: set http logon port (port:50000) (lbcount: 2)
[Thr 140608759801600] JNCMIHttpMsPutLogon: set https logon port (port:50001) (lbcount: 2)

In the Java server node developer trace files (e.g. dev_server0 and dev_server1), we can see the name of the node (JNODE_10002 for server0) which is also visible in the dev_icm trace output in column 10:

F [Thr 139637668607872] Wed Mar 17 22:53:49 2021
F [Thr 139637668607872] JSFSetLocalAddr: using NI defaults for bind()
I [Thr 139637668607872] MtxInit: JNODE_10002 0 2

The relevant dev_icm output:

MPI<60>: 4d50494d 'TS1_00_1234650_HTTP_WAIT' 5 -1 20 0 0 0(4996) 1(10002) 0(-1)
MPI<5f>: 4d50494d 'TS1_00_1234650_HTTP' 4 -1 20 0 0 0(4996) 1(10002) 1(30001)

Sizing the FCA

The size of the FCA is not directly configurable.
Instead, we can configure the size of the shared memory area (total area) for all the MPIs using parameter “mpi/total_size_MB”. From this total size, the maximum possible size of any individual MPI is fixed to 25% of the total area size.

In later Netweaver versions (7.40+), it is not recommended to adjust “mpi/total_size_MB” directly. Instead, adjust the “icm/max_conn” parameter, which is then used to calculate “mpi/total_size_MB”.
The internal formula is described as:
mpi/total_size_MB = min(0.06 * $(icm/max_conn) + 50, 2000)
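The formula is easy to play with in Python. This is just the arithmetic from the line above plus the default 25% limit per MPI mentioned earlier; the function names are my own.

```python
def mpi_total_size_mb(icm_max_conn):
    """mpi/total_size_MB as derived from icm/max_conn, capped at 2000 MB."""
    return min(0.06 * icm_max_conn + 50, 2000)


def max_single_mpi_mb(total_size_mb, share=0.25):
    """Largest size one MPI can reach with the default 25% limit."""
    return total_size_mb * share


for max_conn in (500, 20000, 100000):
    total = mpi_total_size_mb(max_conn)
    print(max_conn, round(total), round(max_single_mpi_mb(total), 1))
```

With icm/max_conn at 20000 the total works out at 1250 MB, and the 2000 MB cap is reached from 32500 connections upwards.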

There is another undocumented (apart from SAP notes) parameter, which can allow you to increase the max size of an MPI. However it means any one MPI can consume more of the total area than the default 25%.
Adjusting it is therefore not advised.

We can see the value of the parameter “mpi/total_size_MB” in the ICM developer trace file (dev_icm) during its start up. This is useful as it shows us the calculation based on the formula mentioned above.
We are looking at “total size MB” right at the end of the line:

[Thr 140610607359872] MPI init, created: pipes=40010 buffers=19985 reserved=5995 quota=10%, buffer size=65536, total size MB=1250
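As a quick sanity check, the trace line can be parsed in Python and the formula run in reverse. This is a sketch that assumes the line format shown above; the function name is hypothetical, and the reverse calculation only makes sense while the total is below the 2000 MB cap.

```python
import re

TRACE_LINE = (
    "[Thr 140610607359872] MPI init, created: pipes=40010 buffers=19985 "
    "reserved=5995 quota=10%, buffer size=65536, total size MB=1250"
)


def parse_mpi_init(line):
    """Extract the numeric key=value pairs that follow 'MPI init, created:'."""
    _, _, tail = line.partition("MPI init, created:")
    pairs = re.findall(r"([A-Za-z ]+?)=(\d+)", tail)
    return {key.strip(): int(value) for key, value in pairs}


info = parse_mpi_init(TRACE_LINE)
# Reverse of: total = 0.06 * icm/max_conn + 50 (valid only below the cap)
implied_max_conn = round((info["total size MB"] - 50) / 0.06)
print(info)
print(implied_max_conn)
```

For the line above, the implied icm/max_conn value is 20000.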

Common FCA Errors

There is a dedicated set of SAP notes for FCA errors, such as 1867119.
Based on the architecture we can see that they describe issues with throughput (through the FCA Queue), and with issues in the Java server node threads causing the FCA Queues to fill.
They also show issues with sizing of the MPIs, and the number of the worker threads (for high throughput scenarios).

In my experience the following types of FCA errors can be seen in the Java server developer traces “dev_server<n>” files:

  • “-3” error: The Java server node is unable to put a response back onto the FCA Queue, probably because the MPI area is full from a full FCA Queue. This can happen if one of the Java server node HTTP Worker threads has become stuck (waiting) for resources or for the database.
    As you will see from my previous diagram, a full MPI area will then start to affect HTTP access to both Java server nodes as they share the ICM (it’s a single point of failure).
  • “-7” error: This affects one individual Java server node and prevents it from pulling requests off the FCA queue in a timely manner. This specific issue is usually a timeout mismatch between the HTTP provider and the ICM.

Both of the above errors look similar, but one is a lack of resources in the Java stack and the other is a full FCA Queue (in shared memory) due to inaction (stuck threads) in the Java stack.
The “-7” error can therefore present itself as an issue in the ICM or in the Java stack, but it is usually a problem in the Java stack that causes it to close the connection early.

Summary

There you have it, the simple FCA queue that serves HTTP requests to your Java Server nodes.
We learned:

  • Netweaver Java was given the ICM in 7.1 onwards.
  • The ICM in the Netweaver Java and ABAP stacks is the same binary.
  • The ICM uses shared memory for the MPIs.
  • The size of the shared memory area is set by the parameter mpi/total_size_MB, whose value is in turn calculated from icm/max_conn (in NW 7.40+).
  • The FCA queues are MPIs.
  • Only memory pointers are passed through the FCA Queues.
  • The Java server nodes are responsible for creating the FCA queues in the ICM shared memory.
  • There are 2 HTTP FCA queues (request and WAIT) for each server node, plus a P4 queue.
  • The developer traces store information about the size of the ICM shared memory and the registration of the Java Server nodes to a queue.
  • There is a known set of errors that can occur and are documented in SAP notes.

Useful SAP References

  • SAP Note 1867119 – No more memory for FCA
  • SAP Note 2417488 – Resource leak for MPI buffers in FCA communication
  • SAP Note 1945745 – How to increase HTTP Worker (FCA) threads in PI
  • SAP Note 2579836 – AS Java system has performance problem – FCAException – Best practices and tuning recommendations.
  • SAP Note 2997765 – AS Java system has performance problem – FCAException – Best practices for analysis
  • SAP Note 2276273 – AS Java – How to identify the largest MPI buffer consumer by MPI dump

HowTo: Check Netweaver 7.02 Secure Store Keyphrase

For Netweaver 7.1 and above, SAP provide a Java class that you can use to check the Secure Store keyphrase.
See SAP note 1895736 “Check if secure store keyphrase is correct”.
However, in the older Netweaver 7.02, the Java check function does not exist.

In this post I provide a simple way to check the keyphrase without making any destructive changes in Netweaver AS Java 7.02.

Why Check the Keyphrase?

Being able to check the Netweaver AS Java Secure Store keyphrase is useful when setting up SAP ASE HADR. The Software Provisioning Manager requests the keyphrase when installing the companion database on the standby/DR server.

The Check Process

In NW 7.02, you can use the following method, to check that you have the correct keyphrase for the Secure Store.
The method does not cause any outage or overwrite anything.
It is completely non-destructive, so you can run it as many times as you need.
I guess in a way it could also be used as a brute force method of guessing the keyphrase.

As the <sid>adm Linux user on the Java Central Instance, we first set up some useful variables:

setenv SLTOOLS /sapmnt/${SAPSYSTEMNAME}/global/sltools
setenv LIB ${SLTOOLS}/sharedlib
setenv IAIK ${SLTOOLS}/../security/lib/tools

Now we can call the java code that allows us to create a temporary Secure Store using the same keyphrase that we think is the real Secure Store keyphrase:
NOTE: We change “thepw” for the keyphrase that we think is correct.

/usr/sap/${SAPSYSTEMNAME}/J*/exe/sapjvm_*/bin/java -classpath "${LIB}/tc_sec_secstorefs.jar:${LIB}/exception.jar:${IAIK}/iaik_jce.jar:${LIB}/logging.jar" com.sap.security.core.server.secstorefs.SecStoreFS create -s ${SAPSYSTEMNAME} -f /tmp/${SAPSYSTEMNAME}sec.properties -k /tmp/${SAPSYSTEMNAME}sec.key -enc -p "thepw"

The output of the command above is 2 files in the /tmp folder, called <SID>sec.key and <SID>sec.properties (the file names given with the -k and -f options).
We now compare the checksum of the new temporary key file to the current system Secure Store key file (in our case this is called SecStore.key):

cksum /sapmnt/${SAPSYSTEMNAME}/global/security/data/SecStore.key
cksum /tmp/${SAPSYSTEMNAME}sec.key


If both the checksum values are the same, then you have the correct keyphrase.
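If you prefer to script the comparison, the same idea can be sketched in Python. Note that this swaps in a SHA-256 digest instead of the CRC that cksum calculates; either way the question is simply whether the two files have identical content. The function names are hypothetical and the paths in the usage comment follow the commands above.

```python
import hashlib


def file_digest(path):
    """Return the SHA-256 digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()


def same_content(path_a, path_b):
    """True when both files have identical content."""
    return file_digest(path_a) == file_digest(path_b)


# Example call (replace SID with your system ID):
# same_content("/sapmnt/SID/global/security/data/SecStore.key", "/tmp/SIDsec.key")
```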

Cookies, SAP Analytics Cloud and CORS in Netweaver & HANA

Back in 2019 (now designated as 2019AC – Anno-Covid19), I wrote a post explaining in simple terms what CORS is and how it can affect a SAP landscape.
In that post I showed a simple “on-premise” setup using Fiori, a back-end system and how a Web Dispatcher can help alleviate CORS issues without needing too much complexity.
This post is about a recent CORS related issue that impacts access to back-end SAP data repositories.

Back To The Future

If we hit the “Fast-Forward” button to 2020MC (Mid-Covid19), CORS is now an extremely important technical setup to enable Web Browser based user interfaces to be served from Internet based SAP SaaS services (like SAP Analytics Cloud) and communicate with back-end on-premise/private data sources such as SAP BW systems or SAP HANA databases.

We see that CORS is going to become ever more important going forward, since Web Browser based user interfaces will become more abundant (due to the increase of SaaS products) for the types of back-end data access. The old world of installing a software application on-premise takes too much time and effort to keep up with changing technology.
Using SaaS applications as user interfaces to on-premise data allows a far more agile delivery of user functionality.

The next generation of Web Interfaces will be capable of processing ever larger data sets, with richer capabilities and more in-built intelligence. We’re talking about the Web Browser being a central hub of cross-connected Web Based services.
Imagine, one “web application” that needs a connection to a SaaS product that provides the analytical interface and version management, a connection to one or more back-end data repositories, a connection to a separate SaaS product for AI data analysis and pattern matching (deep insights), a connection to a separate SaaS product for content management (publishing), a connection to a separate SaaS product for marketing and customer engagement.

All of that, from one central web “origin” will mean CORS will become critical to prevent unwanted connections and data leaks. The Web Browser is already the target of many cyber security exploits, therefore staying secure is extremely important, but security is always at the expense of functionality.

IETF Is On It

The Internet Engineering Task Force already have this in hand. That’s how we have the Web Origin Concept (tools.ietf.org/html/rfc6454), which CORS builds on, in the first place.
The Web Origin Concept is constantly evolving to provide features for usability and also security. Way back in 2016 an update to RFC 6265 was proposed, to enhance the HTTP state management mechanism, which is commonly known to you and me as “cookies”.

This amendment (the draft details are here: tools.ietf.org/html/draft-ietf-httpbis-cookie-same-site-00) introduced the SameSite attribute that can be set for cookies.
Even in this draft, you can see that it actually attributes the idea of “samedomain-cookies” back to Mozilla, in 2011. So this is not really a “new” security feature, it’s a long time coming!

The Deal With SAC

The “problem” that has brought me back around to CORS, is recent experience with a CORS issue and SAP Analytics Cloud (SAC).
The issue led me to a blog post by Dong Pan of SAP Canada in Feb 2020 and a recent blog post by Ian Henry, also of SAP in Aug 2020.

Dong Pan wrote quite a long technical blog post on how to fix or work-around the full introduction of the SameSite cookie attribute in Google Chrome version 80 when using SAP Analytics Cloud (SAC).

Ian Henry’s post is also based on the same set of solutions that Dong Pan wrote about, but his issue was accessing a backend HANA XS Engine via Web Dispatcher.

The problem in both cases is that SAP Analytics Cloud (SAC) uses the Web Browser as a middleman to create a “Live Connection” back to an “on-premise” data repository (such as SAP BW or SAP S/4HANA). The back-end SAP Netweaver/SAP ABAP Platform stack/HANA XS engine that hosts the “on-premise” data repository does not apply the “SameSite” attribute to the cookies that it creates.

You can read Dong Pan’s blog post here: www.sapanalytics.cloud/direct-live-connections-in-sap-analytics-cloud-and-samesite-cookies/
You can read Ian Henry’s blog post here: https://blogs.sap.com/2020/08/26/how-to-fix-google-chrome-samesite-cookie-issue-with-sac-and-hana/

Because the back-end does not apply the “SameSite” attribute to the cookie, Google Chrome browsers of version 80+ will not allow SAC to establish a full session to the back-end system.
You will see an HTTP 400 “session expired” error when viewing the HTTP browser traffic, because SAC tries to establish the connection to the back-end, but no back-end system cookies are allowed to be visible to SAC. Therefore SAC thinks you have no session to the back-end.
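
The browser-side rule behind this can be sketched in a few lines of Python. This is only a simplified model of the Chrome 80+ behaviour, not Chrome’s actual code: the function names are my own, the cookie values are made up, and whether a response is cross-site or a top-level navigation is passed in as a flag rather than worked out.

```python
def parse_set_cookie_attributes(header_value):
    """Split a Set-Cookie value into the cookie name and its attributes."""
    parts = [part.strip() for part in header_value.split(";")]
    name = parts[0].split("=", 1)[0]
    attrs = {}
    for part in parts[1:]:
        if not part:
            continue
        key, _, value = part.partition("=")
        attrs[key.strip().lower()] = value.strip()
    return name, attrs


def chrome80_accepts(header_value, cross_site, top_level_navigation):
    """Simplified model: would the browser store this cookie?"""
    _, attrs = parse_set_cookie_attributes(header_value)
    # A cookie without a SameSite attribute is treated as SameSite=Lax.
    same_site = attrs.get("samesite", "lax").lower()
    if same_site == "none":
        # SameSite=None is only honoured together with the Secure attribute.
        return "secure" in attrs
    # Lax/Strict cookies from a cross-site response are only stored when
    # that response belongs to a top-level navigation.
    return (not cross_site) or top_level_navigation


# Hypothetical back-end cookie, set during a cross-site call made by SAC.
plain = "sap-usercontext=sap-client=100; path=/"
fixed = plain + "; SameSite=None; Secure"

print(chrome80_accepts(plain, cross_site=True, top_level_navigation=False))
print(chrome80_accepts(fixed, cross_site=True, top_level_navigation=False))
```

The first call models the cookie as the back-end sends it today, the second models the same cookie once the attribute has been added.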

How to See the Problem

You will need to be proficient at tracing HTTP requests to be able to capture the problem, but it looks like the following in the HTTP response from the back-end system:

You will see (in Google Chrome) two yellow warning triangles on the “set-cookie” headers in the response from the back-end during the call to “GetServerInfo” to establish the actual connection.
The call is the GET for URL “/sap/bw/ina/GetServerInfo?sap-client=xxx&sap-language=EN&sap-sessionviaurl=X”, with the sap-sessionviaurl in the query-string being the key part.
The text when you hover over the yellow triangle is: “This Set-Cookie didn’t specify a “SameSite” attribute and was defaulted to “SameSite=Lax,” and was blocked because it came from a cross-site response which was not the response to a top-level navigation. The Set-Cookie had to have been set with “SameSite=None” to enable cross-site usage.”
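
If you export the captured traffic from the Chrome developer tools as a HAR file, a short Python sketch like the one below can list the offending cookies for you. It assumes the standard HAR layout (log, entries, request, response, headers). The function names, the host name and the cookie values in the sample are hypothetical, and it splits on newlines in case the export joins several Set-Cookie values into one header entry.

```python
import json


def load_har(path):
    """Load a HAR file exported from the browser developer tools."""
    with open(path, encoding="utf-8") as handle:
        return json.load(handle)


def cookies_missing_samesite(har, url_fragment="/sap/bw/ina/GetServerInfo"):
    """Return (url, cookie name) pairs for Set-Cookie headers without SameSite."""
    findings = []
    for entry in har["log"]["entries"]:
        url = entry["request"]["url"]
        if url_fragment not in url:
            continue
        for header in entry["response"]["headers"]:
            if header["name"].lower() != "set-cookie":
                continue
            for value in header["value"].split("\n"):
                compact = value.lower().replace(" ", "")
                if value.strip() and "samesite=" not in compact:
                    findings.append((url, value.split("=", 1)[0].strip()))
    return findings


# Made-up sample in HAR shape: one cookie without SameSite, one with it.
SAMPLE_HAR = {
    "log": {
        "entries": [
            {
                "request": {
                    "url": "https://be2.corp.net/sap/bw/ina/GetServerInfo?sap-client=100"
                },
                "response": {
                    "headers": [
                        {"name": "content-type", "value": "application/json"},
                        {"name": "set-cookie",
                         "value": "sap-usercontext=sap-client=100; path=/"},
                        {"name": "set-cookie",
                         "value": "SAP_SESSIONID_BE2_100=abc; path=/; SameSite=None; Secure"},
                    ]
                },
            }
        ]
    }
}

for url, cookie in cookies_missing_samesite(SAMPLE_HAR):
    print(cookie, "has no SameSite attribute in the response from", url)
```

For a real trace you would pass the result of load_har() with the path to your own export instead of SAMPLE_HAR.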

The Fix(es)

SAP Netweaver (or SAP ABAP Platform) needs some code fixes to add the required cookie attribute “SameSite”.

A workaround (and it is only a workaround) is possible by using the rewrite module capability of the Internet Communication Manager (ICM) or using a rewrite rule in a Web Dispatcher, to re-write the responses and include a generic “SameSite” attribute on each cookie.
It is a workaround for a reason: using the rewrite method causes unnecessary extra work in the ICM (or Web Dispatcher), because the rewrite engine has to evaluate every request (matched or not matched).
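
To show what such a rewrite does to a response, here is a minimal Python sketch of the transformation. It is not ICM or Web Dispatcher rewrite syntax (for the real rules, see the two blog posts linked above), the function names are my own, and I’m assuming the attribute to add is “SameSite=None” together with “Secure”, since Chrome only honours “SameSite=None” on secure cookies.

```python
import re

HAS_SAMESITE = re.compile(r";\s*samesite\s*=", re.IGNORECASE)
HAS_SECURE = re.compile(r";\s*secure\s*(;|$)", re.IGNORECASE)


def add_samesite_none(set_cookie_value):
    """Append SameSite=None (and Secure) to a cookie that lacks SameSite."""
    if HAS_SAMESITE.search(set_cookie_value):
        # The application already decided; leave the cookie alone.
        return set_cookie_value
    rewritten = set_cookie_value.rstrip().rstrip(";")
    if not HAS_SECURE.search(rewritten):
        rewritten += "; Secure"
    return rewritten + "; SameSite=None"


def rewrite_response_headers(headers):
    """Every header of every response is inspected, cookie or not."""
    return [
        (name, add_samesite_none(value) if name.lower() == "set-cookie" else value)
        for name, value in headers
    ]


# Made-up response headers from a back-end system.
response = [
    ("Content-Type", "application/json"),
    ("Set-Cookie", "MYSAPSSO2=abc; path=/; HttpOnly"),
]
for name, value in rewrite_response_headers(response):
    print(f"{name}: {value}")
```

Note that the loop has to look at every header of every response, even when there is no cookie to change. Also, because “Secure” is added, the cookies will only travel over HTTPS.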

It’s always better (more secure, more efficient) to apply the code fix to Netweaver (or ABAP Platform) so the “SameSite” attribute is added at the point of the cookie creation.
For HANA XS, a patch will need to be applied (if it ever gets fixed in XS, since XS is soon to be deprecated).
With the workaround, we are forcing a setting onto cookies outside of the creation process of those cookies.

Don’t get me wrong, I’m not saying that the workaround should not be used. In some cases it will be the only way to fix this problem in some older SAP systems. I’m just pointing out that there are consequences and it’s not ideal.

Dong Pan and Ian Henry have done a good job of providing options for fixing this in a way that should work for 99% of cases.

Is There a Pretty Picture?

This is something I always find useful when I try and work something through in my mind.
I’ve adjusted my original CORS diagram to include an overview of how I think this “SameSite” attribute issue can be imagined.
Hopefully it will help.

We see the following architecture setup with SAC and its domain “sapanalytics.cloud”, issuing CORS requests to back-end system BE2, which sits in domain “corp.net”:

Using the above picture for reference, we can now show where the “SameSite” issue occurs in the processing of the “Resource Response” when it comes back to the browser from the BE2 back-end system:

The blocking, by the Chrome Web browser, of the cookies set by the back-end system in domain “corp.net”, means that from the point of view of SAC, no session was established.
There are a couple more “Request”, “Response” exchanges, before the usual HTTP Authorization header is sent from SAC, but at that point it’s really too late as the returned SAP SSO cookie will also be blocked.

At this point you could see a number of different error messages in SAC, but in the Chrome debugging tools you will see no HTTP errors, because the actual HTTP request/response mechanism is working and HTTP content is being returned. It’s just that SAC will know it does not have a session established, because it will not find the usual cookies that it would expect from a successfully established session.

Hopefully I’ve helped explain what was already a highly technical topic, in a more visual way and helped convey the problem and the solution.


Useful Links: