This blog contains experience gained over the years of implementing (and de-implementing) large scale IT applications/software.

Uplifting & Expanding Linux LVM Managed Disks in Azure

One of the great things about public cloud is the ability to simply increase the data disk space for your systems.
In the on-premise, non-virtualised world, you would have needed to physically add more disk or swap the disk for a bigger unit.
Even in the on-premise virtualised world, you may have needed more actual disk to expand the virtual hard disk.

Sometimes those disks store the data files for databases.
In that case, if you have followed the Azure architecture best practices, you will be using LVM or some other volume management layer on top of the raw disk device. This gives you more flexibility and better performance (through striping).

In this guide I show how to increase the disk size by uplifting the data disks in Azure, then resizing the disk devices in Linux, allowing us eventually to grow the XFS file system to a larger size.
I will discuss good reasons to uplift vs adding extra data disks.

My Initial Setup

In this step-by-step, we will be using Linux (SUSE Linux Enterprise Server 12) with LVM as our volume management software.
The assumption is that you have already created your data disks (2 of them) and striped across those disks with a single logical volume.
(Remember, striping across 2 disks can give you up to double the IOPS when reading/writing the data to/from the disks).

In my simple example, I used the following to create my LVM setup:

  1. Add 2x data disks of size 128GB (I used 2x S10) to a VM running SLES 12 using the Azure Portal.
  2. Create the physical volumes (mine were sdd and sde on LUNs 1 and 2):
    pvcreate /dev/sdd
    pvcreate /dev/sde

  3. Create the volume group:
    vgcreate volTMP /dev/sdd /dev/sde
  4. Create the striped logical volume (2 stripes, one per disk) using all the space:
    lvcreate -i 2 -l 100%FREE -n lvTMP1 volTMP
  5. Create the file system using XFS:
    mkfs.xfs /dev/mapper/volTMP-lvTMP1
  6. Mount the file system to a new mount point:
    mkdir /BIGSTRIPEDDISK
    mount /dev/mapper/volTMP-lvTMP1 /BIGSTRIPEDDISK

In Azure my setup looked like the below (I already had 1 data disk, so I added 2 more):

In the VM, we can see the file system is mounted and has a size of 256GB (2x 128GB disks):

You can double check the striping using the lvdisplay command with “-m” flag:

Once the disk was set up, I then created a simple text file with ASCII text inside:

I also used “dd” to create a large 255GB file (leaving 1GB free):

dd if=/dev/zero of=./mybigfile.data bs=1024k count=261120
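As a quick sanity check of the dd arguments above, the arithmetic can be sketched in Python. This is only an illustration; it assumes that dd treats "bs=1024k" as blocks of 1 MiB and that the "GB" figures in this post are really GiB. The function name is my own.

```python
# Sanity check of the dd arguments: bs=1024k is assumed to mean 1 MiB per block.
BLOCK_SIZE_BYTES = 1024 * 1024   # bs=1024k
BLOCK_COUNT = 261120             # count=261120


def dd_output_size_gib(block_size_bytes: int, block_count: int) -> float:
    """Return the size in GiB of the file that dd would write."""
    return block_size_bytes * block_count / 1024 ** 3


if __name__ == "__main__":
    # 261120 blocks of 1 MiB is 255 GiB, leaving about 1 GiB free on a 256 GiB volume.
    print(dd_output_size_gib(BLOCK_SIZE_BYTES, BLOCK_COUNT))
```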

The disk usage is now close to 100%:

I ran a checksum on the large file:

Value is: 3494419206

With the checksum completed (it took a few minutes), I now have a way of checking that my file is the same before/after the disk resize, plus the cksum tool will force reading of the whole file (checking for filesystem I/O issues).
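If you would rather script the before/after comparison, the same idea can be sketched in Python. Please treat this as a rough equivalent only: it uses zlib's CRC-32, which is not the same CRC variant as the POSIX cksum tool, so the number it prints will not match the cksum value shown in this post. The function name and the chunk size are my own choices.

```python
import zlib


def file_crc32(path: str, chunk_size: int = 1024 * 1024) -> int:
    """Read the whole file in chunks and return its CRC-32 as an unsigned integer.

    Reading every byte also forces the file system to serve the whole file,
    which is the same side benefit that running cksum gives.
    """
    crc = 0
    with open(path, "rb") as handle:
        while True:
            chunk = handle.read(chunk_size)
            if not chunk:
                break
            # Feed the running CRC back in so the result covers the whole file.
            crc = zlib.crc32(chunk, crc)
    return crc & 0xFFFFFFFF
```

Run it once before the resize and once after, then compare the two numbers.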

Increasing the Data Disk Size

Within the Azure portal, we first need to stop the VM:

Once stopped, we can go to each of the two data disks and uplift from an S10 (in my example) to an S15 (256GB):

We can now start the VM up again:

When the VM is running again, we can log in and check.
Our file system is the same size:

We check with the LVM command “pvdisplay” to display one of the physical disks, and we can see that the size has not changed; it is still 128GB:

We need to make LVM re-scan the disk to make it aware of the new increased size. We use the pvresize command:

Re-checking the disk using pvdisplay, we can see it has increased to 256GB in size:

We do the same for the /dev/sde disk:

Once the physical disks are resized (in the eyes of LVM), we can now check the volume group:

We have now got 256GB of free space (see row: “Free PE / Size”) in our volume group.

To allow our file system to get this space, the logical volume within the volume group needs to be expanded into the free space.
We use the lvresize command to make our logical volume use all free space in the volume group “+100%FREE”:

NOTE: It is also possible to specify an exact size should you want to be specific.

Our file system is still only 256GB in size, until we resize it.
For XFS file systems, we use the xfs_growfs command as follows:

Checking the file system now shows that it is 512GB in size (50% free):

Are my files still present? Yes:

Let’s check the contents of my text file:

Finally, I validate that my big data file has not been corrupted:

Value is: 3494419206
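To recap the resize steps in this section, here is a small Python sketch that builds the three commands in order and, by default, only prints them (a dry run). The device names, volume group, logical volume and mount point are taken from my example setup; the helper function names are hypothetical, and you should only switch off the dry run once you are happy with what is printed.

```python
import subprocess


def build_resize_commands(pv_devices, volume_group, logical_volume, mount_point):
    """Return the LVM/XFS commands, in the order they need to be run."""
    # 1. Make LVM re-read the size of every uplifted physical volume.
    commands = [["pvresize", device] for device in pv_devices]
    # 2. Grow the logical volume into all of the free space in the volume group.
    commands.append(
        ["lvresize", "-l", "+100%FREE", f"/dev/{volume_group}/{logical_volume}"]
    )
    # 3. Grow the mounted XFS file system to fill the logical volume.
    commands.append(["xfs_growfs", mount_point])
    return commands


def run_commands(commands, dry_run=True):
    """Print each command and, unless dry_run is set, execute it as well."""
    for command in commands:
        print(" ".join(command))
        if not dry_run:
            subprocess.run(command, check=True)


if __name__ == "__main__":
    run_commands(
        build_resize_commands(
            ["/dev/sdd", "/dev/sde"], "volTMP", "lvTMP1", "/BIGSTRIPEDDISK"
        )
    )
```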

What is the Alternative to Uplifting?

Instead of uplifting the existing data disks, it is possible to increase the amount of storage in my volume group, by adding two new additional disks.
To prevent performance issues, these new disks should be of the same scale level (S10) as the existing disks.
You should definitely not be mixing disk types in a logical volume, so to prevent this, you should not mix them in a volume group (even though you could technically separate them at the logical volume level).

Is there a good reason to add more disks? Yes, when you are going to create a new logical volume. It is ideal to keep the data on separate physical disks to help avoid data-loss (from a lost/deleted disk).
There are also performance reasons to have additional Linux devices, since parameters such as queue depth affect the Linux device level. The Linux O/S can effectively issue more simultaneous read requests since additional data disks are additional devices.

As we have seen, uplifting a disk tier means you will need to take the VM offline or detach the disk. Adding new disks, on the other hand, can be done entirely online.
Adding new disks does pose a slight issue if you have one large logical volume with striping. The quantity of new disks needs to match the existing stripe layout (e.g. if you have 2 disks striped, you should add another 2 disks to increase the volume). The striping balance will also only ever be over the quantity of disks in the original striping, not over the new total (e.g. still striped over 2 disks, even if you have added another 2 to make 4).

Is there a good reason not to add more disks? Yes, when you know you could exceed the VM data disk count limit. Each VM size has a limit; the bigger the VM, the bigger the limit.

Another reason is when you know that you always leave a small proportion of disk space free. Adding more disks which will only ever be max 80% used is more wasteful compared to upscaling an existing set of disks which will only ever be 80% used.

Summary

Using the power of Azure, we have increased the data disk sizes of our VM.
The increase needed the VM to be stopped and started again while the disks were uplifted from an S10 (128GB) to an S15 (256GB).

Once uplifted, we had to make LVM aware of the new disk sizes by using the pvresize command, then the free space in the volume group was given to the logical volume and finally the file system was grown.
To maintain the logical volume stripe, both disks were uplifted, then both disks needed the pvresize command running.
We validated the before and after state of our ASCII and data files and they were not corrupted.

Finally, we looked at the alternative to uplifting and saw that uplifting may not be appropriate in all cases.
Adding new disks can be done online.

Cleaning Up /tmp for SAP On SLES On Azure

In an Azure SLES 12 Linux VM the default installation image mounts the /tmp file system as a regular file system off the root (/). Historically, for many Unix/Linux environments, this is not “normal”.

In this post I will discuss what the impact is for this irregular setup of /tmp and what you can do to work around it, ensuring SAP continues to work as usual.

What is “Normal”?

In traditional Linux installations the /tmp file system is usually mounted as a temporary file system (tmpfs), which means it would be cleaned on O/S reboot.
This has been the case for many years. There’s a recent post here that highlights 1994 for Solaris.
Plus, you can find a detailed explanation of tmpfs here: https://www.kernel.org/doc/html/latest/filesystems/tmpfs.html

With the default SLES setup, any files placed into /tmp will not automatically be cleaned up on reboot and, as per the previous links, there can be performance reasons to use a tmpfs.

From memory, there are differing standards on what /tmp should be used for (again see here), and it is possible that the traditional setup is no longer following a newly agreed standard. I really am not certain why SLES does not mount /tmp as tmpfs.
All I know from over 20 years of working with different Unix/Linux products, is that it is generally accepted that /tmp is a dumping ground, that gets cleaned on reboot and from what I can see, SAP think the same.

What is the Impact of /tmp Not Being tmpfs?

When you have /tmp and it is not cleaned on reboot and is not a tmpfs, then it can cause issues when using software that expects some form of clean up to be performed.
When I look at some of the SAP systems in Azure on SLES 12, I see a build up of files in the /tmp directory, which results in the need for a scripted job to clean them up on a periodic cycle.

If some of the more prolific files are not cleaned up regularly, then they can build up into many thousands of files. While this shouldn’t impact the day-to-day running of the SAP system, it can impact some ad-hoc operations such as patching the SAP system or the database.
The reason is that sometimes the patching tools write out files to the /tmp area, then crudely perform a “ls” to list files or find files in that location. If there are many thousands of files, then those listing operations can fail or be delayed.
A perfect example is the patching of the SAP ASE database, which can be affected by thousands of files in the /tmp location.

Finally, with the /tmp directory mounted off the root disk, any filling of /tmp will fill your root disk and this will bring your VM to a halt pretty quickly! Be careful!

What Sort of Files Exist in /tmp?

In the list below, I am looking specifically at SAP related files and some files that are culprits for building up in the /tmp directory.

File Name Pattern                     Description
.saphostagent_nnnnn                   SAP Host Agent run files.
.sapicmnnn                            SAP ICM run file.
.sapstartsrv##_sapstartsrv.log        SAP Instance Agent run file.
.sapstreamnnnn                        SAP IPC files.
.theagentlives.tmp                    Owned by Sybase O/S user, is related to SAP ASE instance. Maybe JS Agent.
ctisql_*                              Temporary iSQL executions using sybctrl.
sap_jvm_nnnn_nnnnnn                   SAP JVM execution.
sapjvm_profiling_server_nnnn_nnnnn    SAP JVM execution.
sap_jvm_monitoringboard_nnnn_nnnnn    SAP JVM execution.
sapinst_instdir                       From an execution of SWPM (contains sapinst).
saplg*                                Owned by sapadm and are part of the SAP Instance Agent logon ticket generated from the Hostagent.
sb*                                   From an ASE installation.
tmp*                                  Owned by root, lots and lots, possibly Azure agent related as they contain the text “Windows Azure CRP Certificate Generator” when passed through a base64 decoder.
tmp.*                                 Lots and lots, seem to be Kerberos related.

How Can We Clean Up these Files?

The most common way is to use a script.
Within the script will be a “find” statement, which finds the specific files and removes each one.
It needs to be done this way, because if there are too many files, then “rm /tmp/tmp*” makes the shell expand the wildcard into one very long argument list. That list can exceed the system limit on the length of a command's arguments, in which case the command fails with an “Argument list too long” error and no files will be removed.

The script will need to be executed as root frequently (maybe weekly or even daily) to ensure that the file quantities are kept consistently low. This can be achieved using an enterprise scheduler or a crontab on each server.

Here’s an example of how to clean up the /tmp/tmp* files with very specific criteria. The files are removed if they are:

  • located in the /tmp directory
  • named with at least 7 chars, beginning with ‘tmp’ followed by at least 4 characters from A-Z, a-z or 0-9.
  • last modified more than 7 days ago.
  • owned by root, with a group of root.

find /tmp -type f -regextype posix-awk -regex '/tmp/tmp[A-Za-z0-9]{4,}' -mtime +7 -user root -group root -delete -print

The above will remove the files due to the “-delete”. To test it, just remove the “-delete”.
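If you prefer to drive the clean-up from a script rather than a one-line find, the same criteria could be sketched in Python as below. This is an approximation, not a drop-in replacement: it only looks at the top level of the directory, and it treats "more than 7 days" as strictly older than 7 x 24 hours, whereas find rounds the age down to whole days. The function names and parameters are my own, and deletion is off by default so that you can review the printed list first.

```python
import os
import re
import time

# 'tmp' followed by at least 4 characters from A-Z, a-z or 0-9 (whole name must match).
TMP_NAME = re.compile(r"tmp[A-Za-z0-9]{4,}")


def find_stale_tmp_files(directory="/tmp", min_age_days=7, uid=0, gid=0, now=None):
    """Return the paths of regular files in `directory` that match all the criteria."""
    now = time.time() if now is None else now
    cutoff = now - min_age_days * 86400
    matches = []
    with os.scandir(directory) as entries:
        for entry in entries:
            if not TMP_NAME.fullmatch(entry.name):
                continue
            if not entry.is_file(follow_symlinks=False):
                continue
            info = entry.stat(follow_symlinks=False)
            # Owned by the expected user and group (root is uid 0 / gid 0).
            if info.st_mtime < cutoff and info.st_uid == uid and info.st_gid == gid:
                matches.append(entry.path)
    return sorted(matches)


def clean_tmp(directory="/tmp", delete=False, **criteria):
    """Print every matching file and remove it only when delete=True."""
    for path in find_stale_tmp_files(directory, **criteria):
        print(path)
        if delete:
            os.remove(path)
```

Calling clean_tmp() with no arguments is the equivalent of running the find command without the “-delete”.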

In summary, you should check how /tmp is setup in your VMs, and then check the files that are created in /tmp.

HowTo: Extract SAP PI/PO Message Payload from SAP ASE DB

Sometimes you may need to directly extract the SAP PO message payload from the underlying database tables such as BC_MSG_LOG in an SAP ASE 16.0 database.
This could also potentially be called: extracting hex encoded ASCII data from an ASE image column, because the SAP PO tables use an ASE image data type to store the payload as an off-row LOB.

There are plenty of examples for doing this extraction in Oracle, but in ASE it is not so easy because the message size could be larger than that supported by the page size of ASE (usually 16k in an ASE for BusSuite).
This means you won’t be able to store it into a T-SQL variable and use the ASE functions.

Instead, we can use the below simple approach to extract the raw hex, and then use Python 2 to convert it to ASCII:

First, execute the selection SQL using isql at the Linux command prompt on the database server:

isql -USAPSR3DB -S<SID> -w999 -X

select MSG_BYTES
from SAPSR3DB.BC_MSG_LOG
where MSG_ID='<YOUR MSG ID>'
and DIRECTION='OUTBOUND'
and LOG_LOCATION='MS'

go

The output will consist of hexadecimal output, which starts with “0x” and should look something like this:

0x2d2d5341505f6

Copy & paste into a text file on the Linux server (use your favourite text editor) and call the file data.txt.

Edit the data.txt file and remove the first “0x” characters from the data.
Remove all newlines and carriage returns in the file.

Now create a simple Python script to read the data from our file data.txt and translate from hex to ASCII then print to the screen:

with open('data.txt', 'r') as file:
    data = file.read()
print data.decode('hex')

Run the Python script:

python ./myscript.py

The output should contain a header and a footer which start with: “--SAP_”.
If you get an error from the Python script, then it could be because there are additional newlines or carriage returns in the data.txt file.
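If you only have Python 3 available, the "decode('hex')" call used in the script will not work, because that codec was removed from strings. A minimal Python 3 sketch of the same idea is shown below. It also strips the leading “0x” and any newlines or carriage returns for you, so the manual editing of data.txt becomes optional. The function names are my own, and I am assuming the payload is plain ASCII, as it was in my case.

```python
def decode_hex_payload(raw: str, encoding: str = "ascii") -> str:
    """Strip all whitespace and an optional 0x prefix, then decode hex to text."""
    # split() with no arguments drops spaces, newlines and carriage returns.
    cleaned = "".join(raw.split())
    if cleaned.lower().startswith("0x"):
        cleaned = cleaned[2:]
    # bytes.fromhex raises ValueError if the hex string is damaged or has an odd length.
    return bytes.fromhex(cleaned).decode(encoding, errors="replace")


def decode_hex_file(path: str = "data.txt") -> str:
    """Read the copied isql output from a file and return the decoded payload."""
    with open(path, "r") as handle:
        return decode_hex_payload(handle.read())
```

Printing the result of decode_hex_file() should give the same output as the Python 2 script. A ValueError usually means part of the hex output was lost during the copy & paste.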

SAP ASE HADR Overview – Part5

In this multi-part post, I’m explaining the basics behind how SAP Replication Server works when replicating from a SAP ASE database to a SAP ASE database as part of an HADR (ASE always-on) setup for a SAP Business Suite system.
The post will be based on SAP ASE (Adaptive Server Enterprise) 16.0 HADR with SAP Replication Server (SRS) 16.0.

In Part 1 we started with:

  • What is SRS.
  • The basic premise of HADR with SRS.
  • What a transaction is.

In Part 2 we went on to discuss:

  • What is the ASE transaction log.
  • Which databases are replicated.
  • How do transactions move to the SRS.

In Part 3 we covered:

  • What is the Active SRS.
  • What are the Key Parts of SRS.
  • Replication Synchronisation Modes
  • Open Transaction Handling
  • What are the Implications of Replication Delay

In Part 4 we stepped through the replication of Bob’s data change and saw how the transactional data was replicated first to the SRS and eventually to the companion (secondary) database.

This is the last part of my ASE 16.0 HADR mini-series, and in this final part I will discuss possible issues that can impact an ASE 16.0 HADR system, which might be useful when planning operational acceptance testing.

Companion ASE Unavailable

When the companion (secondary) database is unavailable for whatever reason (undergoing maintenance, it’s broken or for other reasons), then the replicated transactions will still continue to move through the SRS until they get to the outbound queue (OBQ).
The assumption is that the active SRS (on the same server as the companion DB) is still up and working.

In the OBQ the transactions will wait until the companion is available again.
As soon as the companion is available, the transactions will move from the OBQ into the companion.
The primary database will be unaffected during this time and transactions will continue through the SRS until the OBQ becomes full.

If the OBQ fills up, then transactions will start to accumulate in the inbound queue (IBQ).

If the companion database is down for a long period of time, you may need to make a decision:

  • Increase the stable queue partition space to hold more transactions.
    With the hope that the companion can be brought back online.
  • Disable replication, removing the Secondary Truncation Point (STP) from the primary database and acknowledging that the companion will need re-materialisation to bring it back in-sync with primary.

Inbound Queue Full

When the inbound queue becomes full, the transactions will start to build up into the simple persistent queue (SPQ).
You should note that the IBQ, by default, is only allowed to use up to 70% of the stable queue partition size. The rest is for the OBQ. So “full” is not actually 100% of the queue space.
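To make the 70% figure concrete, here is a tiny Python sketch of the arithmetic. The 10240 MB partition size is purely a made-up example value, the function name is hypothetical, and the 70% default is simply the figure mentioned above; check your own SRS configuration for the real numbers.

```python
def queue_limits_mb(partition_size_mb: int, ibq_percent: int = 70):
    """Return (IBQ limit, space left for the OBQ) in MB for a stable queue partition."""
    ibq_limit = partition_size_mb * ibq_percent // 100
    return ibq_limit, partition_size_mb - ibq_limit


if __name__ == "__main__":
    # Hypothetical 10 GB partition: the IBQ counts as "full" at 7168 MB, not 10240 MB.
    print(queue_limits_mb(10240))
```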

There can be two common reasons for the IBQ to fill:

  1. The OBQ is also full due to an issue with the connection to the companion ASE database, or the companion is unavailable.
    or
  2. There is an open transaction in the IBQ and the SRS is waiting for the “commit” or “rollback” command to come through from the SPQ for the open transaction.

To resolve the full IBQ, you are going to need to establish which of the two issues is occurring.
An easy way to do this is to check the OBQ fill level.
If transactions are moving from the OBQ to the companion, then the issue is an open transaction.

If an open transaction has caused the IBQ to fill, then the “commit” or “rollback” transaction could now be stuck in the SPQ. Since there is no space in the IBQ, the SRS is also unable to process the SPQ records, which leaves the IBQ open transaction in a stalemate situation.
You will need to make a decision:

  • Add more space to the stable queues to increase the IBQ size.
    or
  • Increase the proportion of stable queue size that the IBQ can use (if OBQ is empty).
    or
  • Zap (remove) the open transaction from the IBQ (will mean data-loss on companion so a rematerialise may be needed).

Normally, you can just add more space by adding another partition to the stable queues, hopefully resolve the issue, then remove the extra space again. How much is needed? Nobody will know.
However, if you have to zap the open transaction, then make sure you dump the queue contents out first, so you can see what DML was open. You can then make a decision on how the missing transaction will affect the companion database integrity (this could negate the need for rematerialisation).

During this problematic period, the SPQ has remained functional, which has meant that the primary database has been able to continue to send transactions to the active SRS and therefore allowed it to continue to commit data in a timely manner. The primary database will have no issues.

Simple Persistent Queue Full

This is probably the most serious of the scenarios.
Once the SPQ becomes full, it immediately impacts the Replication Agent running in the primary ASE database.

Transactions are unable to move from the primary database transaction log to the active SRS.
You will start to get urgent warnings in the primary ASE database error log, informing you that the SPQ is full and that the buffers in the primary Replication Agent are full.

You will also see that the Replication Agent will be producing error messages and informing you that it has switched from synchronous to asynchronous replication mode.

The integrity of your production ASE database is now at risk!

It is possible you can add more space to the SPQ if you think you can resolve the problem in the IBQ and have the time to wait for the IBQ to empty!

You should note that this scenario has the same symptoms as when the active SRS is not working at all. If the SPQ is not available, then you need to troubleshoot the SRS error log. It’s also possible you may have issues with the Replication Agent in the primary ASE.

Primary Transaction Log Full

If your active SRS has now filled up to the SPQ, your primary ASE database is now at serious risk.
With the Replication Agent unable to move the Secondary Truncation Point (STP), the transaction log of the primary ASE will start to fill up.
You will not be able to release the segments, even with a transaction log dump, because they are still needed by the Replication Agent.

You have the following options available:

  • Add more space to the transaction log by adding a new device (recommended, instead of expanding the existing device).
    This will give you more time to maybe fix the SRS.
    or
  • Disable replication, which removes the STP and allows the transaction log to be dumped. A rematerialise of the companion will be needed at a later date.
    or
  • If the transaction log is already full, then even trying to disable replication will not work (no transactions will be permitted).
    You will have to use DBCC to remove (ignore) the “LTM” (Log Transfer Manager) truncation point, which will have a similar outcome to disabling replication.

At this point, if all else fails and your transaction log is full, DO NOT RESTART THE PRIMARY ASE!
If you restart the primary ASE with a full transaction log, then you will be in a world of pain trying to get it to start up again.
You need to stop all processing (it won’t be working anyway), then try and resolve the issue.

Summary

This concludes the 5 part mini-series on SAP ASE 16.0 HADR.
I hope it has given you enough of an overview to be able to explain a typical setup, the way replication occurs and the most common issues that put your production ASE database at risk.

Feel free to contact me with any questions and feedback.

SAP ASE HADR Overview – Part4

In this multi-part post, I’m explaining the basics behind how SAP Replication Server works when replicating from a SAP ASE database to a SAP ASE database as part of an HADR (ASE always-on) setup for a SAP Business Suite system.
The post will be based on SAP ASE (Adaptive Server Enterprise) 16.0 HADR with SAP Replication Server (SRS) 16.0.

In Part 1 we started with:

  • What is SRS.
  • The basic premise of HADR with SRS.
  • What a transaction is.

In Part 2 we went on to discuss:

  • What is the ASE transaction log.
  • Which databases are replicated.
  • How do transactions move to the SRS.

In Part 3 we covered:

  • What is the Active SRS.
  • What are the Key Parts of SRS.
  • Replication Synchronisation Modes
  • Open Transaction Handling
  • What are the Implications of Replication Delay

In this penultimate part we step through the process of replication for an individual transaction, looking at how each of the previously discussed components plays its part along the way.

Step 1 – Begin Transaction

In the first step, our SAP Business Suite application is being used by Bob, our end-user.
Bob has a screen open and saves a change to an item of data.
The Business Suite application calls the Netweaver application stack to persist the change to the data according to the application’s dictionary model.
The dictionary dictates the relationship of the business object (e.g. an invoice) to the relational database tables.
The Netweaver code uses the SAP Kernel supplied libraries to save the change to the necessary system database tables.

There could be many tables involved in Bob’s one change.
It would not be right to update just one table without also updating the others that are related. Otherwise we would not have consistency.
For this reason, a single database transaction can include many data manipulation language (DML) statements.
By grouping the DML statements for the related table updates into a single database transaction, the SAP Kernel can enforce consistency.

In our example, our transaction will include updates to 2 different tables: tableA and tableB.
To designate the start of the database transaction, the SAP Kernel calls the database library to “BEGIN TRANSACTION”.

The effect of the database call to “BEGIN TRANSACTION” is that a new transaction log record is opened:

Step 2 – Replication Agent – Open Transaction

Once a transaction has started (opened), the ASE HADR Replication Agent will see it.
Remember, we discussed the Replication Agent in Part 2.

The Replication Agent sends the transaction data across the network to the target SAP Replication Server (SRS). It knows where to send the data because of the configuration applied to the Replication Agent during the HADR setup process.

The SRS receives the data from the Replication Agent and writes it to the Simple Persistent Queue (SPQ), then sends an acknowledgement back to the Replication Agent.

Step 3a – Update Tables – DML

So far, all that has happened is a new transaction has been opened.
Now Netweaver will apply the required DML to the transaction:

UPDATE tableA SET column1='ABC'
UPDATE tableB SET column1='123'

The Kernel will apply the required DML to the opened transaction, this will also update the transaction log, which will be seen by the Replication Agent and sent across to the SRS as before.

You will notice that at this point, we are still using transaction log space, but we are also consuming space in the SPQ.

At this step, if one of the required “UPDATE” statements was to fail, then the whole transaction could be cancelled (a rollback) and no changes would be permanently made to any of the tables.
This is one of the requirements of the ACID principles.
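The all-or-nothing behaviour is easy to try out for yourself. The sketch below uses Python's built-in sqlite3 module and an in-memory SQLite database as a stand-in, so it is not SAP ASE and not the SAP Kernel's database library. The table names match our example, while the NOT NULL constraint, the helper function names and the 'old' starting values are my own additions to force one of the updates to fail.

```python
import sqlite3


def make_demo_db():
    """Create an in-memory database with tableA and tableB, one row each."""
    # isolation_level=None lets us issue BEGIN/COMMIT/ROLLBACK ourselves.
    conn = sqlite3.connect(":memory:", isolation_level=None)
    conn.execute("CREATE TABLE tableA (column1 TEXT NOT NULL)")
    conn.execute("CREATE TABLE tableB (column1 TEXT NOT NULL)")
    conn.execute("INSERT INTO tableA VALUES ('old')")
    conn.execute("INSERT INTO tableB VALUES ('old')")
    return conn


def update_both_tables(conn, value_a, value_b):
    """Apply both updates in one transaction; undo both if either one fails."""
    try:
        conn.execute("BEGIN")
        conn.execute("UPDATE tableA SET column1 = ?", (value_a,))
        conn.execute("UPDATE tableB SET column1 = ?", (value_b,))
        conn.execute("COMMIT")
        return True
    except sqlite3.Error:
        # One statement failed, so cancel the whole group of changes.
        conn.execute("ROLLBACK")
        return False
```

Passing None as the second value breaks the NOT NULL constraint on tableB, and after the rollback tableA still holds its original value even though its own update had succeeded.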

Step 3b – The SRS Inbound Queue

At the same time as the DML is being applied to the open transaction in step 3a, the SRS continues to process the open transaction.

Inside the SRS, there are various component modules that process the incoming transactions from the SPQ, placing them in the correct order and compacting them (grouping) into larger, more efficient transactions.
Once this initial processing has completed, the new transaction is placed into the inbound queue (IBQ).

Something you will notice, is that we now have consumed space in:

  • Primary database transaction log.
  • Simple Persistent Queue.
  • Inbound Queue.

Once the transaction is safely persisted into the IBQ, the record in the SPQ is now free for re-use.

Step 4 – End Transaction

In steps 3a and 3b, we see the opening transaction record and the DML move across to the SRS.
At this point in time, Bob’s changes are still not replicated to the companion (secondary/standby) database.
In fact, Bob’s changes are not even visible to other users of the primary database, because Bob’s changes are not yet committed.

Once all the DML in Bob’s transaction has been applied at the primary database successfully (still not committed), then the SAP Kernel can issue the “END TRANSACTION”.
This signifies that this group of changes are finished.
After the “END TRANSACTION”, the SAP Kernel can issue one of two things; a “COMMIT” or a “ROLLBACK”.
In our case, the Kernel issues a “COMMIT”.

The “COMMIT” on the primary database is now performed and, at the same time, the “COMMIT” record is also sent by the Replication Agent to the SPQ of the SRS.

I can hear the DB experts gasp at this point!
Yes, in Part 3 I mentioned that the commit is not allowed on the primary until after the Replication Agent has successfully sent the commit to the SPQ. In actual fact, this is not a hard and fast rule. The Replication Agent will attempt to send the commit record to the SPQ; it will wait for a given amount of time, before switching to asynchronous replication mode (see Part 3 for a description of this mode).
The commit to the primary database is therefore allowed to happen, even if it has not yet been acknowledged by the SPQ. This is the trade-off between performance and protection. The HADR solution has flexibility that allows a configurable amount of replication delay before synchronous replication is switched to asynchronous.

The acknowledgement that the “COMMIT” record has been successfully stored in the SRS SPQ, allows the primary database Replication Agent to move the Secondary Truncation Point (STP) forward and release the transaction log record, which allows it to be freed when the next “DUMP TRANSACTION” (transaction log backup) is performed.

The primary database data change becomes visible to all users of the database.
Bob’s screen returns a message telling him that his data is saved.

Step 5 – The SRS Outbound Queue

Inside the SRS IBQ, our transaction is now complete: we have a “BEGIN” and an “END”, a “COMMIT” and some DML in between that contains the updates to the required tables.

Once the “COMMIT” record is seen by the Inbound Queue (IBQ) of the SRS, then the SRS will process the re-packaged transaction from the IBQ to the Outbound Queue (OBQ).

This processing could involve adjusting the SQL language used, if the target database is not ASE.

From the OBQ, the Data Server Interface (DSI) component of the SRS, applies the transaction to the target database.

Finally, the replicated transaction data is applied to the target secondary database and harmony is achieved.
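The flow through the queues in steps 2 to 5 can be sketched as a toy model in Python. To be clear, this is my own simplification for illustration and has nothing to do with how SRS is actually implemented: the class, the method names and the record layout of (kind, transaction id, payload) are all hypothetical, and there is no ordering, compaction or persistence logic.

```python
from collections import deque


class ToyReplicationServer:
    """Toy model of the SPQ -> IBQ -> OBQ flow for replicated transactions."""

    def __init__(self):
        self.spq = deque()   # simple persistent queue: records as they arrive
        self.ibq = {}        # inbound queue: open transactions keyed by id
        self.obq = deque()   # outbound queue: complete transactions
        self.applied = []    # transactions applied to the companion database

    def receive(self, record):
        """The Replication Agent delivers a record; the SRS acknowledges it."""
        self.spq.append(record)
        return "ack"

    def process_spq(self):
        """Move records from the SPQ into the IBQ, grouped by transaction."""
        while self.spq:
            kind, txid, payload = self.spq.popleft()
            self.ibq.setdefault(txid, []).append((kind, payload))
            if kind == "COMMIT":
                # Only a committed transaction moves on to the OBQ.
                self.obq.append((txid, self.ibq.pop(txid)))
            elif kind == "ROLLBACK":
                # A rolled back transaction is never replicated.
                self.ibq.pop(txid)

    def apply_obq(self, companion_available=True):
        """DSI step: apply queued transactions if the companion is reachable."""
        while companion_available and self.obq:
            self.applied.append(self.obq.popleft())
```

Feeding it a “BEGIN”, two “DML” records and a “COMMIT” for the same transaction id shows the transaction sitting in the IBQ until the commit arrives, then waiting in the OBQ for as long as apply_obq is called with companion_available=False.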

Bob’s View

Throughout this whole replication cycle, Bob had no idea that his data was being replicated to the target secondary database hundreds of kilometres away.
The response time of the database to the SAP Kernel was only slightly impacted by the addition of the Replication Agent processing time, plus the network transfer time to the active SRS, plus the processing and persistence time to the SPQ of the SRS.

As well as the response time, we noticed how the storage requirements of the SRS are bigger than the actual transaction log of the primary database due to the way the transaction is processed/transformed and re-processed through the SRS, then queued for application to the target database.

In the final part 5, I will discuss some common issues that can occur with HADR, allowing you to comprehensively plan your operational acceptance testing.