This blog contains experience gained over the years of implementing (and de-implementing) large scale IT applications/software.

SAP ASE HADR Overview – Part 2

In this multi-part post, I’m explaining the basics behind how SAP Replication Server works when replicating from a SAP ASE database to a SAP ASE database as part of an HADR (ASE always-on) setup for a SAP Business Suite system.
The post will be based on SAP ASE (Adaptive Server Enterprise) 16.0 HADR with SAP Replication Server (SRS) 16.0.

A recap of what we discussed in Part 1:

  • What is SRS.
  • The basic premise of HADR with SRS.
  • What a transaction is.

In this part we discuss the ASE transaction log and how important it is to the working of the database. We then discuss how transactions are replicated to the active SRS and touch on the internals of the SRS.

What is the ASE Transaction Log?

The ASE transaction log is the most active part of an ASE database.
When we talk about database ACID principles, the transaction log is what helps deliver those things. It is simply a log of transactions applied to a database.
The transaction log is where changes to the database are first made; it is the register used to ensure that the database can be recovered to a consistent point-in-time.
Some of the seven ASE databases hosted in an ASE 16.0 instance do not need this point-in-time recovery capability, so they have a special setting that simply truncates the records of the transaction log frequently. The setting/option is called “trunc log on chkpt” and should *never* be applied to the SID database (the Business Suite database) in production.
The SID database holds our vital business data, and needs to be recoverable right up to the last transaction that had a final state (committed or rolled-back) applied on that database.

At regular intervals, data changes in the transaction log that have reached a final state are applied to the database data devices (the database’s main persistence).
This process is known as a checkpoint.
In those ASE databases with option “trunc log on chkpt” turned on, a checkpoint causes the used transaction log entries to be simply truncated (cleared), with no dump required.
In ASE databases with “trunc log on chkpt” turned off (like the SID database), after a checkpoint has occurred, the used transaction log segments are marked as freeable, and these freeable segments can become available for re-use only once a transaction log backup (dump) is performed to a backup disk or device.
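
As a minimal illustration (not taken from any SAP note; “saptools” is used here purely as an example of a non-critical database), the current options of a database can be listed with sp_helpdb, and the “trunc log on chkpt” option can be toggled with sp_dboption from the master database:

use master
go

-- Show the database options currently set (look for "trunc log on chkpt")
sp_helpdb saptools
go

-- Example only: enable automatic log truncation on a non-critical database.
-- Never do this for the SID (Business Suite) database in production.
sp_dboption saptools, 'trunc log on chkpt', true
go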

This is a cycle that allows the database to continue to be used for making changes to the database data while also protecting the register of transactions in case of a database crash or even complete corruption.

If the transaction log ever becomes full, then the database is unable to allow new changes and no new transactions can be started until space is made available in the transaction log. It is therefore very important for regular database transaction log dumps to be performed.
Not only to free space in the transaction log, but also to protect the database in the event that the transaction log is lost or corrupted. For added protection, these transaction dump file(s) can also be moved off to a separate location, away from the database server itself. This is a function of almost every backup tool and, as you will find, a requirement of almost every database that abides by the ACID principles.
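
For illustration, a transaction log dump of the SID database to a file on disk looks roughly like the sketch below (the <SID> placeholder and the file path are just examples; your backup tool or dump device names will differ):

use master
go

-- Dump (back up) the transaction log of the <SID> database to a file on disk.
-- Once the dump completes, the freeable log segments can be re-used.
dump transaction <SID> to '/backup/<SID>/<SID>_tranlog.trn'
go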

Which Databases are Replicated?

Each ASE instance hosts multiple ASE databases.
In a Business Suite system, the ASE 16.0 instance hosts 7 databases (plus 2 temporary databases and a model database):

  • master
  • sybsystemdb
  • sybmgmtdb
  • sybsecurity
  • sybsystemprocs
  • saptools
  • SID

In a HADR system, we usually only replicate the master database and the SID database. This is because the master database holds vital information about the user logins, roles and privileges, which need to be the same on the target system. The SID database is the database for the SAP Business Suite system; it holds our business data, which we need to protect.

How do Transactions Move to the SRS?

Inside the replicated primary ASE databases, there exists a Replication Agent process that runs in its own process thread.
The Replication Agent scans the transaction log of the database it serves and sends the transactions across to the active SAP Replication Server (SRS).
In the active SRS a matching receiver thread (Rep Agent User component thread) receives the transactions and pushes them into the Simple Persistent Queue (SPQ) on the disk of the active SRS server.

Subsequent components inside the active SRS process the transactions from the SPQ and eventually push them into the secondary database.
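
To see what the Replication Agent of a primary database is doing, ASE provides the sp_help_rep_agent procedure; a simple sketch (the <SID> database name is a placeholder) is:

use master
go

-- Show the Replication Agent scanner status for the <SID> database,
-- including where it currently is in the transaction log scan.
sp_help_rep_agent <SID>, 'scan'
go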

SAP ASE HADR Overview – Part 1

In this multi-part post, I’m going to attempt to explain (mainly for my own understanding – as usual), the basics behind how SAP Replication Server works when replicating from a SAP ASE database to a SAP ASE database as part of an HADR (ASE always-on) setup for a SAP Business Suite system.
The post will be based on SAP ASE (Adaptive Server Enterprise) 16.0 HADR with SAP Replication Server (SRS) 16.0.

‘And what is the use of a book,’ thought Alice, ‘without pictures or conversation?’
The quote is from Alice in Wonderland, and I think you will find a picture is most definitely worth a thousand words when it comes to understanding the rabbit hole that is SAP Replication Server.

Hold onto your hats, it is about to get mighty bumpy!

What is SAP Replication Server (SRS)?

Replication Server was originally a Sybase product, ingested by SAP when it bought Sybase.
SAP Replication Server (SRS) is not database specific; it supports a number of source and target database systems such as SAP ASE, SAP HANA, Oracle and SQL Anywhere. Because of this heterogeneous database support, SRS is quite a complex product, offering a multitude of replication scenarios.
In fact, the SRS product is the underpinning to a number of other SAP products such as SAP Landscape Transformation (SLT) and the Near-Zero Downtime (NZDT) option for database migrations.

SRS can be used to provide HA and/or DR for databases, but it can also be used to produce active-active setups with multi-regional replicas of databases for improved local access times.
This is somewhat of an exceptional case and most definitely rare with SAP Business Suite applications.

In this post, we will be using the simple example of just a primary and a secondary (companion) database.
This is known as either SAP ASE HADR or ASE always-on and is the most common use case. In SRS language our secondary database is referred to as the “companion” database.

With SRS it is also possible to have a three tiered architecture, primary, secondary and tertiary. In this three tier setup the secondary is known as the companion and the disaster recovery (DR) database is known as the “DR node” or tertiary database. You would usually use a three tier HADR architecture setup if you want HA in a primary datacentre or cloud region and also a separate DR in a secondary datacentre or cloud region.

There are multiple replication options: synchronous, near-synchronous or asynchronous, depending on your latency between source and target databases, your required RPO and also depending on your required use of HADR.

SRS is the recommended option for SAP ASE database replication.

What is the Basic Premise of HADR with SRS?

For SAP systems with SAP ASE HADR (always-on), the SRS provides replication of “transactions” from the source database(s) to secondary (companion) and/or DR database(s).

In a SAP landscape, the SAP system is configured to fail-over its connection to the companion in the event of a database failover. No cluster is needed for the database network connectivity, because the ASE database driver (dbsl) is “HA aware”.

Inside the primary database, a process called the Replication Agent is responsible for sending the transactions to the Replication Server on the companion (secondary) server, and it is configured in Stream Replication mode. This is the only supported mode in HADR.
NOTE: Stream Replication is also known as “ci” (Component Interface) throughout the SRS administration manuals. There are many occasions where you will need to know this information.

In “ci” mode, the SRS proprietary language Log Transfer Language (LTL) is not used.
With “ci” mode, there are three possible synchronisation modes: synchronous, near-synchronous or asynchronous.

With SRS, the primary and secondary databases are NOT the same database (regarding layout, size, blocks), unlike HANA System Replication.
They are their own databases requiring all the usual care and attention that would be applied to the primary. Such as frequent transaction log backups, health checks etc.

With SRS enabled, database transactions that are started on the primary database are replicated to the secondary while still in the open state.

What is a Transaction?

A transaction is an ATOMIC unit of work with a beginning and an end, with work performed in between.


Each transaction can have one of two final states. It is either committed (saved) to the database, or it can be rolled-back (undone).

A transaction that is not yet in a final state is called an “open” transaction. Any transaction in the “open” state is still in progress and occupies space in the database transaction (tran) log. Space used by an open transaction in the tran log cannot be re-used by other transactions.
In a HADR system, the oldest open transaction executing on the primary database is usually what is responsible for the position of the Secondary Truncation Point (STP) in the primary database.
The STP is a marker point placed into the primary database transaction log by the Replication Agent and is used to determine how far the companion database has caught up (i.e. it marks the oldest transaction that is not yet confirmed as committed on the companion database).
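
Both the oldest open transaction per database and the replication (secondary) truncation point can be seen in the master..syslogshold table; a rough check looks like this (the row named “$replication_truncation_point” is the STP held by the Replication Agent):

use master
go

-- One row per database for the oldest open transaction, plus a row named
-- "$replication_truncation_point" where a Replication Agent holds the STP.
select dbid, spid, starttime, name
from master..syslogshold
order by starttime
go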

That’s it for part 1.
In part 2 we will go into the internals of SRS and how transactions are replicated through it.

HowTo: Call sp_dump_history with Purge in ASE 16.0 SP03

Due to a bug in ASE 16.0 SP03 PL08 (and maybe earlier versions) the sp_dump_history call does not work properly.
It just returns an empty screen.
When you look at the procedure code, you will see that it fails to find the tempdb for whatever reason.

We can use the sp_dump_history procedure to look at the dump history (backup history) for an ASE database.
From ASE 15.7 onwards we can also use it to purge the history, reducing the size of the dumphist file and in some cases improving the performance of backups themselves (see SAP note 2753062 for details on issues).

By prefixing the tempdb onto the call from the master database, you can get sp_dump_history to work as it should.

Below is an example that purges all backup records from the dump history file (dumphist) from before January 29th 2020 @ 15:00:00:

use master
go

tempdb..sp_dump_history @operation='purge', @until_time="january 29,2020 15:00:00:00"
go

Tip: In Linux the correct date format can be obtained with the date command:

date "+%b %d,%Y %T"

As another tip, you can use the sp_helptext procedure to look at the source code of any stored procedures. Take a look on help.sap.com for the parameters for sp_helptext.
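
For example, the source of sp_dump_history itself can be displayed like this (system procedures live in the sybsystemprocs database):

use sybsystemprocs
go

-- Display the source code of the sp_dump_history system procedure
sp_helptext sp_dump_history
go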

ASE Issue During Setuphadr Removes Logins on Companion

There’s a specific scenario that can occur during the setup of SAP ASE HADR, which can leave the companion (secondary) database with no logins or roles. You will be unable to log into it. It will be completely unusable.

This issue could apply to any ASE 16.0 SP03 version where auditing is turned on (the default for PL06 and above), but I have seen this in two systems, one on ASE SP03 PL08 and one on PL08-HF1.

The symptoms are seen during setuphadr in the step for materialisation of the master database; the setuphadr screen output doesn’t change. Eventually it displays a timeout.
When doing a “ps” on the primary ASE server, you can see a number of BCP processes that appear to be “stuck”. They are connecting to the companion database.
Inside the repdata directory location, the BCP data and out files are present but they do not change.

When attempting to restart the setuphadr, the process fails as it cannot log into the companion ASE.
When trying manually to log into the companion ASE using isql, you are unable to log in as sapsa or sapsso. You can log in as “sa” only, but you have no roles/privileges to do anything.
You check the list of logins and roles in the companion master DB (table master..syslogins) and the tables are empty; all logins have been removed!
You may also see login errors mentioned in the RMA log file on the companion server (the active SRS) with “SQLState: ZZZZZ“.

From what I can see, the cause of the issue is ASE auditing. It is enabled on the companion DB (the sp_configure parameter “auditing” has a value of “1”).
The audit table, the sybsecurity database and/or the sybsecurity database tran log is full or nearly full.
This prevents the BCP processes from successfully loading the table data from the primary.
It doesn’t seem to prevent the truncation of the companion master DB table data, leaving it with no table data for roles and logins in the master DB.

The workaround that I have found works most effectively is to completely disable ASE auditing (auditing = 0) on the companion ASE instance, preventing the issue from happening in the first place (a sketch is shown below).
There are a couple of parameters that can change the way auditing works, and maybe adjusting the filtering to prevent auditing of system users would also solve the problem, but this is the setup of replication, so I don’t see why you would want that process audited at all. You can always re-enable auditing after the HADR setup is completed.
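
A rough sketch of the workaround (run as a user with sso_role on the companion ASE):

-- Disable ASE auditing on the companion instance before running setuphadr
sp_configure 'auditing', 0
go

-- Once the HADR setup has completed, auditing can be switched back on:
-- sp_configure 'auditing', 1
-- go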

Another prevention tip is to ensure that the sybsecurity database and its tran log have free space before the setuphadr process is started.
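
A quick check before starting setuphadr could look like the following sketch (truncating the audit table is an example only and assumes you do not need to keep the existing audit records; your active audit table may not be sysaudits_01):

-- Check free space in the sybsecurity database and its transaction log
sp_helpdb sybsecurity
go

-- Example only: clear down an audit table if it is full and the existing
-- audit records are not needed
truncate table sybsecurity..sysaudits_01
go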

What if you have already hit the above issue and are looking for help right now?
If the above issue is hit, then the companion master DB could be restored from backup, except you cannot log into it to start the LOAD command.
You could copy the master.dat from the primary, but the inbuilt GUID that gets set on DB creation would then be identical to the primary’s (the companion and primary should be different).

This leaves two possible options.

Option 1:

  • Re-create the companion master db following SAP note 2631728 “How to rebuild master device and master database from backup – SAP ASE”.
  • Restore the master DB from backup (this could be a backup of the primary master DB).
  • If a backup from the primary was used, then you may need to re-create the SID database (to be on the safe side), using the DDLGEN commands.

Option 2:

  • Blast the companion ASE away and re-install using SWPM.

Good luck, and make sure you take regular backups 🙂

Tuning SAP ASE – Decompressing Small Lookup Tables

By default, for SAP applications based on SAP Netweaver, SAP ASE 16.0 will compress the data pages within the tables inside the database.
In some scenarios this will not help with performance.

In this brief post I explain the scenario, correct a SAP note, and show you how to check and adjust the tables in a safe way (do not just follow the note; you could corrupt the table data).

Data Page Compression

In ASE 16.0, table data pages can be compressed.

For SAP Netweaver based SAP systems such as SAP ERP, the use of table data page compression can help improve performance in two ways:

  1. Reduced I/O
    Reading and writing compressed data pages to disk is more efficient. It’s a bit like copying a 4MB zip file on your computer instead of an uncompressed 40MB text file!
  2. Reduced Memory Consumption
    In memory (in the data cache), the data pages can be held in compressed form and may not need to be uncompressed in order to satisfy certain queries.

The above sounds good and indeed it is good.
But there are some borderline cases where compression on a table may not be beneficial.

The Drawback with Compression

Compression can affect the response time of queries in very specific circumstances.

If you have “small” tables and those tables are accessed frequently with SQL queries that perform full table scans, then the extra CPU time required to decompress the table contents in the data cache can impact the response time.

How do We Identify Candidate Tables?

If there are tables that are adversely affected by having compression turned on, how do we identify those tables?

We can follow SAP note 1775764, which provides a nice simple SQL statement to list likely candidates. Great!

However, the provided SQL statement actually does not do what it should!
Oh dear!
Instead, the SQL incorrectly lists tables that are definitely not good candidates for decompression. In fact, some of them are the complete opposite of good candidates!

The SQL taken from the SAP note is:

--- DO NOT RUN THIS IT IS WRONG!!! ---
use <SID>
go
select top 10 ObjectName, 
LogicalReads 
from master..monOpenObjectActivity 
where DBID = db_id() 
and LogicalReads > 1000000 
-->   and data_pages(DBID,0)) < 6400      <-- WRONG LINE HERE! 
and object_attr(DBName || '.SAPSR3.' || ObjectName,'compression') <> 'none' 
order by LogicalReads desc
go

You will see in the above SQL taken from the SAP note that the “data_pages” function call has two parameters: the ID of the current database and an object ID. Except that the object ID on line 9 has been hard-coded to “0”.

This causes a value of “0” to be returned from the “data_pages” call, which is obviously always less than 6400.

Essentially, any table that has compression enabled, with high numbers of LogicalReads (table scans in the data cache), will be reported as a candidate! How confusing.

Correcting the SQL

We can see that we need to correct the SQL.
Here’s how:

--- This is the corrected statement ---
use <SID>
go
select top 10 ObjectName, 
LogicalReads 
from master..monOpenObjectActivity 
where DBID = db_id() 
and LogicalReads > 1000000 
and data_pages(DBID,ObjectID) < 6400
and object_attr(DBName || '.SAPSR3.' || ObjectName,'compression') <> 'none' 
order by LogicalReads desc
go

It’s simple enough: we just substitute the incorrect “0” on line 9 with the column “ObjectID” from the monOpenObjectActivity table we are selecting from.
This correction then allows the correct output from the call to function “data_pages”.
With the correction in place, tables with a data page count of less than 6400 pages (with 16KB pages, that means tables smaller than 100MB), with compression enabled and a high number of LogicalReads, will be listed as candidates. Yay!
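
If you want to sanity check an individual table before deciding, something like the sketch below shows the page count and current compression setting (the table name ZMYTABLE and the <SID> placeholder are hypothetical examples):

use <SID>
go

-- Page count and compression attribute for a single table (placeholder name)
select data_pages(db_id(), object_id('SAPSR3.ZMYTABLE')) as page_count,
       object_attr('<SID>.SAPSR3.ZMYTABLE', 'compression') as compression_setting
go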

Adjusting the Candidate Tables

Once you have your table list, you might now wish to turn off compression.
This is simple enough.
You can use the additional SQL supplied in the SAP note:

--- WARNING: check SAP note 2614712 first! ---
use <SAPSID>
go
setuser 'SAPSR3'
set quoted_identifier on
set chained on
go
alter table SAPSR3.<tablename> set compression = none
reorg rebuild SAPSR3.<tablename> [with online]  <-- check 2614712 first!
commit
go

As you can see, you will need to perform a REORG on the table.
The SQL statement specifies the optional “with online” option.
BE WARNED: there are lots of SAP notes about issues with REORG ONLINE, so please ensure that you check SAP note 2614712 first! Shame on SAP for not mentioning this in the SAP note!

Checking the Result

We’ve gone through the effort of adjusting tables that we think might benefit from having compression removed, but how do we now confirm that we have positively impacted the response time?

Well that’s a difficult one. I’ll leave that for another post.

Summary

The summary for this post, is a little bit “Boris Johnson”:

  • Follow the SAP notes.
  • But don’t follow the SAP notes.
  • Check and double check what is output from commands, don’t just take the output as being correct!
  • Check your results.
  • Compressing tables is a good thing.
  • Compressing specific tables may not be a good thing.