This blog contains experience gained over the years of implementing (and de-implementing) large scale IT applications/software.

SUSE Cloud-Netconfig and Azure VMs – Dynamic Network Configuration

What is SUSE Cloud-Netconfig:
Within the SUSE SLES 12 (and openSUSE) operating system lies a piece of functionality called Cloud-Netconfig.
It is provided as part of the System/Management group of packages.

The Cloud-Netconfig software consists of a set of shell functions and init scripts that are responsible for control of the network interfaces on the SUSE VM when running inside of a cloud framework such as Microsoft Azure.
The core code is part of the SUSE-Enceladus project (code & documents for use with public cloud) and hosted on GitHub here: https://github.com/SUSE-Enceladus/cloud-netconfig.
Cloud-Netconfig requires the sysconfig-netconfig package, as it essentially provides a netconfig module.
Upon installation, the Cloud-Netconfig module is prepended to the front of the netconfig module list like this: NETCONFIG_MODULES_ORDER="cloud-netconfig dns-resolver dns-bind dns-dnsmasq nis ntp-runtime".

What Cloud-Netconfig does:
As with every public cloud platform, a deployed VM is allocated and booted with the configuration for the networking provided by the cloud platform, outside of the VM.
In order to provide the usual networking devices and modules inside the VM with the required configuration information, the VM must know about its environment and be able to make a call out to the cloud platform.
This is where Cloud-Netconfig does its work.
The Cloud-Netconfig code will be called at boot time from the standard SUSE Linux init process (systemd).
It has the ability to detect the cloud platform that it is running within and make the necessary calls to obtain the networking configuration.
Once it has the configuration, it persists it into the usual network configuration files inside the /sysconfig/network/scripts and /netconfig.d/cloud-netconfig locations.
The configuration files are then used by the wicked service to adjust the networking configuration of the VM accordingly.

What information does Cloud-Netconfig obtain:
Cloud-Netconfig has the ability to influence the following aspects of networking inside the VM.
– DHCP.
– DNS.
– IPv4.
– IPv6.
– Hostname.
– MAC address.

All of the above information is obtained and can be persisted and updated accordingly.

What is the impact of changing the networking configuration of a VM in Azure Portal:
Changing the configuration of the SUSE VM within Azure (for example, changing the DNS server list) will trigger an update inside the VM via the Cloud-Netconfig module.
This happens because Cloud-Netconfig is able to poll the Azure VM Instance metadata service (see my previous blog post on the Azure VM Instance metadata service).
If the information has changed since the last poll, then the networking changes are instigated.
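To illustrate the poll-and-compare idea, here is a minimal Python sketch. It is not the Cloud-Netconfig code (which is a set of shell functions); the api-version value and the function names are my own assumptions, and the JSON layout follows the network section returned by the Azure VM Instance metadata service.

```python
import json
import urllib.request

# Azure VM Instance metadata service endpoint; the api-version is an assumption.
IMDS_URL = "http://169.254.169.254/metadata/instance/network?api-version=2017-08-01"


def fetch_network_metadata(timeout=2.0):
    """Query the metadata service from inside the VM (no proxy, Metadata header set)."""
    opener = urllib.request.build_opener(urllib.request.ProxyHandler({}))
    request = urllib.request.Request(IMDS_URL, headers={"Metadata": "true"})
    with opener.open(request, timeout=timeout) as response:
        return json.load(response)


def private_ipv4_by_mac(metadata):
    """Map each interface MAC address to its list of private IPv4 addresses."""
    result = {}
    for nic in metadata.get("interface", []):
        addresses = nic.get("ipv4", {}).get("ipAddress", [])
        result[nic.get("macAddress", "")] = [a["privateIpAddress"] for a in addresses]
    return result


def has_changed(previous, current):
    """Return True when the address layout differs from the last poll."""
    return private_ipv4_by_mac(previous) != private_ipv4_by_mac(current)
```

Calling fetch_network_metadata() only works from inside an Azure VM; the two helper functions can be tried anywhere against a saved copy of the JSON.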

What happens if a network interface is to remain static:
If you do not want Cloud-Netconfig to manage a network interface, you can disable its management of that interface.
Simply adjust the interface's network configuration file in /etc/sysconfig/network and set the variable CLOUD_NETCONFIG_MANAGE=no.
This will prevent future adjustments to this network interface.
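If you have several interfaces and want to check which ones have been opted out, a small Python sketch like the one below can read the setting back. The function names are hypothetical, and the sample file content in the usage note (an ifcfg-style file for a second interface) is only an illustration.

```python
import shlex


def read_sysconfig(text):
    """Parse KEY='value' lines from a sysconfig-style file into a dict."""
    values = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, raw = line.partition("=")
        # shlex strips the shell quoting and any trailing comment.
        parts = shlex.split(raw, comments=True)
        values[key.strip()] = parts[0] if parts else ""
    return values


def management_disabled(ifcfg_text):
    """True only when CLOUD_NETCONFIG_MANAGE is explicitly set to 'no'."""
    value = read_sysconfig(ifcfg_text).get("CLOUD_NETCONFIG_MANAGE", "")
    return value.lower() == "no"
```

For example, management_disabled("BOOTPROTO='dhcp'\nCLOUD_NETCONFIG_MANAGE='no'\n") returns True, while a file without the variable returns False.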

How does Cloud-Netconfig interact with Wicked:
SUSE SLES 12 uses the Wicked network manager.
The Cloud-Netconfig scripts adjust the network configuration files in the /sysconfig/network/scripts location. Wicked detects the changes and makes the necessary adjustments (e.g. interfaces brought online, IP addresses assigned or DNS server lists updated).
As soon as the network configuration files have been written by Cloud-Netconfig, this is where the interaction ends.
From this point the usual netconfig services take over (wicked and nanny – for detecting the carrier on the interface).

What happens in the event of a VM primary IP address change:
If the primary IP address of the VM is adjusted in Azure, then the same process as before takes place.
The interface is brought down and then brought back up again by wicked.
This means that in an Azure Site Recovery replicated VM, should you activate the replica, the VM will boot and Cloud-Netconfig will automatically adjust the network configuration to that provided by Azure, even though this VM only contained the config for the previous hosting location (region or zone).
This significantly speeds up your failover process during a DR situation.

Are there any issues with this dynamic network config capability:
Yes, I have seen a number of issues.
In SLES 12 SP3 I have seen issues whereby a delay in the provision of the Azure VM Instance metadata during the boot cycle has caused the VM to lose sight of any secondary IP addresses assigned to the VM in Azure.
On tracing, the problem seemed to originate from a slowness in the full startup of the Azure Linux agent, possibly due to boot diagnostics being enabled.  At the time of writing, a SLES patch for this is still awaited.

I have also seen a “problem” whereby an incorrect entry inside the /etc/hosts file can cause the reconfiguration of the VM’s hostname.
Quite surprising.  This caused other issues with custom SAP deployment scripts, as the hostname was being relied on to follow a specific intelligent naming convention, when instead it was being changed to a temporary hostname for resolution during an installation of SAP using the Software Provisioning Manager.

How can I debug the Cloud-Netconfig scripts:
According to the manuals, debug logging can be enabled through the standard DEBUG="yes" and WICKED_DEBUG="all" variables in config file /etc/sysconfig/network/config.
However, having cast an eye over the scripts and functions inside the Cloud-Netconfig module, these settings don’t seem to be picked up, and insufficient logging is produced, especially around the polling of the Azure VM Instance metadata service.
I found that when debugging I had to actually resort to adjusting the function script functions.cloud-netconfig.

Additional information:
https://www.suse.com/c/multi-nic-cloud-netconfig-ec2-azure/
https://www.suse.com/documentation/sles-12/singlehtml/book_sle_admin/book_sle_admin.html
https://github.com/SUSE-Enceladus/cloud-netconfig
https://www.suse.com/media/presentation/wicked.pdf
https://github.com/openSUSE/wicked

HANA Studio – Diagnosis Mode Connection Overload

Be careful when using HANA Studio in Diagnosis Mode with the refresh interval set to a low value.
When set to 5 seconds (the default), the number of connections opened to the HANA DB is one every 5 seconds:

[Image: HANA Diagnosis Mode refresh interval]

If you check the number of connections with a tool such as TCPView or Process Monitor, you will see a very high number of ESTABLISHED connections over time:

[Image: HANA client connections established]

Note that the HANA DB SQL port is 3<xx>15.
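If you don't have TCPView to hand, the count can be approximated from saved "netstat -an" output. The Python sketch below is only an illustration: the function names are my own, and it assumes the Windows netstat column layout (Proto, Local Address, Foreign Address, State).

```python
def hana_sql_port(instance_number):
    """Return the SQL port 3<xx>15 for a two-digit HANA instance number."""
    if not 0 <= instance_number <= 99:
        raise ValueError("instance number must be between 00 and 99")
    return int(f"3{instance_number:02d}15")


def count_established(netstat_output, port):
    """Count ESTABLISHED TCP connections whose remote address ends with :port."""
    suffix = f":{port}"
    count = 0
    for line in netstat_output.splitlines():
        fields = line.split()
        if (
            len(fields) >= 4
            and fields[0] == "TCP"
            and fields[2].endswith(suffix)
            and fields[3] == "ESTABLISHED"
        ):
            count += 1
    return count
```

Running the count a few times while Diagnosis Mode is open should show whether the number of connections keeps climbing.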

Under certain heavy network load, you could be causing more strain on your PC, the network and the HANA server.

Simply increase the refresh interval (so that the refresh happens less often); this gives your PC time to close off the unwanted connections before new ones are created, and reduces your CPU consumption.

Network Port Test Using SAP NIPING

Some companies have additional security policies that remove the Telnet application from the local desktop PCs.
This can prove difficult for SAP BASIS people trying to test if a specific network port is reachable, since Telnet is a perfect way of testing if a server port is accessible, or being blocked by a firewall.
Instead, you can use the NIPING tool (Network Interface Ping) supplied with the SAP Frontend installation on the desktop PC.

Check if you have NIPING.exe; it should exist in the default install location: “C:\Program Files\SAP\FrontEnd\SAPgui”.

You have to call NIPING from a command prompt:

C:\> cd C:\Program Files\SAP\FrontEnd\SAPgui

There are two command line options that are useful when calling NIPING.
The command line option “-R” tells NIPING to attempt a RAW TCP connection.
Option “-P” specifies that NIPING should output slightly more detail.

If you need to test if a network port is available, you need to use the RAW option.
You don’t care what application layer protocol is required (SMTP, HTTP, Telnet, SSH); you just want it to try and open a bare TCP connection to the specified port and see what happens.

To use NIPING with the RAW option:

> niping -c -H <dest host>  -S <dest port>  -R  -P

You will get some fairly detailed output.
What you are looking for is a return code (RC) of “-6” and “ERROR connection to partner ‘xxx.xxx.xxx.xxx:pppp’ broken“.

The RC of “-6” indicates that NIPING was able to open the TCP connection (NIPCONNECT) successfully, but it was not able to receive (NILREAD)  because the server closed the connection when we didn’t send any information (it was a raw connection).

If you receive an RC of “-10” and “ERROR partner xxx.xxx.xxx.xxx:pppp not reached” this indicates that NIPING was not able to even establish a basic TCP connection (NIPCONNECT) to the server host and port.
You may not have a network route to the server, the server IP may be invalid, the port may not be listening on the server, or a firewall may be blocking you; there are many other possible reasons.
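If neither Telnet nor NIPING is available but Python is, the same bare TCP test can be sketched with the standard socket module. This is not NIPING and does not produce its return codes; the function name is my own, and it only reports whether the TCP connection could be opened.

```python
import socket


def tcp_port_reachable(host, port, timeout=3.0):
    """Try to open a bare TCP connection; True if the connection is established."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, unreachable hosts and DNS failures.
        return False
```

A result of True corresponds to the situation where NIPING managed to connect (the RC “-6” case above), and False to the “not reached” case.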

If you are simply trying to connect to a known SAP system dispatcher port (for SAP GUI connections), then using NIPING without the RAW option will perform an RFC connection to the SAP system dispatcher, if possible:

> niping -c -H <dest host>  -S 32<SAP instance number>

When you use NIPING without the RAW option, it will return success (“Connect to server o.k.“) if it can successfully connect to the SAP system dispatcher.  It will always complain about “bytes_written <> bytes_read“, so ignore this error.

You should note that connecting to the SAP ABAP message server (36xx) will return an “ERROR connection to partner xxx.xxx.xxx.xxx:pppp broken” and RC “-6” (just like a RAW connection) if it was successful.
The reason is that this is not a straight RFC connection that supports NIPING.  It’s meant to hand-off to a specific dispatcher or other tasks, but not ping.
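If you test many systems, the command lines above can be generated rather than typed. The Python sketch below only assembles the arguments already shown in this post; the function names are hypothetical, and it assumes the dispatcher port is 32 followed by the two-digit instance number and the message server port is 36 followed by the same two digits.

```python
def dispatcher_port(instance_number):
    """Dispatcher port: 32 followed by the two-digit instance number."""
    return f"32{instance_number:02d}"


def message_server_port(instance_number):
    """ABAP message server port: 36 followed by the two-digit instance number."""
    return f"36{instance_number:02d}"


def niping_client_args(host, port, raw=False):
    """Build the niping client command line as an argument list."""
    args = ["niping", "-c", "-H", host, "-S", str(port)]
    if raw:
        # -R requests a raw TCP connection, -P asks for more detailed output.
        args += ["-R", "-P"]
    return args
```

The resulting list can be passed to subprocess.run() from the SAPgui directory, for example niping_client_args("saphost01", dispatcher_port(0)), where saphost01 is a made-up host name.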

Network with OEL 5.7 x86_64 install in Hyper-V

When installing Oracle Enterprise Linux 5.7 x86_64 in a Hyper-V 2012 VM, the Linux networking refuses to work with the Hyper-V “Legacy driver” if you have the UEK (Unbreakable Enterprise Kernel) enabled and more than one vCPU.

First, you should always ensure that you add the Hyper-V “Legacy Network driver” to the VM container at the VM creation time to ensure that it will work when you come to install OEL in the VM.

Then, to get around the problem with the networking and vCPUs, disable the UEK and shut down the VM; you can then add more than one vCPU to the VM.

Connecting SAP Netweaver ABAP Stack To SAP SLD

If you’ve ever quickly needed to know the points to check for customising settings connecting your SAP Netweaver *ABAP* system to an SLD (System Landscape Directory), then here they are:

Transaction: SLDAPICUST
Transaction: RZ70
Transaction: SM59 – TCP/IP connections:
    SAPSLDAPI (used by the ABAP API)
    LCRSAPRFC (used to read the exchange profile)

Here’s the SAP Help article: https://help.sap.com/saphelp_nw04s/helpdata/en/be/6e0f41218ff023e10000000a155106/content.htm