This blog contains experience gained over the years of implementing (and de-implementing) large scale IT applications/software.

SAP PI/PO Performance Statistics Data Extract and Analysis using Excel & Power Query

End-to-End performance analysis of a specific SAP PO/PI interface is a tricky business using the out-of-the-box tools. For example, you might be thinking that SAP Solution Manager can provide this detail, but it is not able to correlate inbound SAP Web Dispatcher HTTP logs or Netweaver Java ICM logs to the PI/PO interface.
Instead you need to implement your own monitoring to bring these components together, unless you implement a complete tool like AppDynamics.

In this post I will show how to extract the SAP PI/PO performance data and import into Excel, using a Power Query to transform it from the XML format into something that we can report on.
After seeing my example, you can take the same design away and, using PowerBI, construct a dashboard for reporting. You can also choose to incorporate the HTTP logs, to give you the end-to-end performance analysis by interface.

This post is similar to another set of blog posts I published, showing how to pull data about your virtual machines directly from Azure, see: List Your Azure VMs in Excel. That other post used Power Query to transform JSON.

Data Sources

Before we go and start Excel, we need to understand where our data will be coming from.
There are 4 locations that we can report on in a SAP Netweaver AS Java stack running SAP PI/PO:

  • SAP Web Dispatcher (load balancer across multiple App instances).
  • SAP ICM (load balancer across multiple NW java server nodes).
  • SAP NW HTTP Provider.
  • SAP PI/PO performance data servlet.

The last item is the key one, which will require some Power Query goodness to transform the data.
We use the 2014 blog post from Vadim Klimov to see how to pull this data direct from PI/PO using the servlet “PerformanceDataQueryServlet”.

For the Web Dispatcher, the ICM and the HTTP Provider, we really need to choose just one of those points to collect the log information.
Since our source system is handing over processing to “us” at the Web Dispatcher, that would be the logical location to collect the HTTP logs.
However, some companies use a load balancing appliance like an F5 at the entry-point, in which case, you would be better gathering the HTTP logs from each of the ICM server processes.

The reason for using the HTTP logs from the front end of the architecture stack is that you want to capture any HTTP 50x messages caused by unavailability of the back-end parts.
For example, if SAP Netweaver is down, then the Web Disp logs would show a HTTP 503 (service unavailable).
If the PO application is not started inside SAP Netweaver, then the ICM logs would show a HTTP 503.
You want to collect the logs from the closest point of the handover between your source system and the Netweaver stack.

For the HTTP log sources, we have a little bit of an issue.
In most cases, logging is not enabled in Web Dispatcher and ICM configurations. To enable the logging we need to look at the parameter “icm/HTTP/logging_<xx>“.
Ideally what we need for the log format is: %h %u %t “%r” %s %H %L
This will give:

  • %h = Source IP.
  • %u = BASIC Auth username or cert common name.
  • %t = Date/time.
  • %r = The request with query string !IMPORTANT!
  • %s = The HTTP response code e.g. 200 or 500 etc.
  • %H = The name of the server host.
  • %L = Response time in milliseconds.

The log file should be switched DAILY (SWITCHTF=day) to prevent it from growing too large.
We will need to transform the data in the log, but we can do this in Power Query or in a script at source.
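For illustration, the logging parameter could look something like the line below (a sketch only; verify the sub-options and format specifiers against the SAP documentation for your kernel release before using it):

icm/HTTP/logging_0 = PREFIX=/, LOGFILE=http_access.log, LOGFORMAT=%h %u %t "%r" %s %H %L, SWITCHTF=day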

Data Transfer

We now need to understand how we will transfer the data from the data source to Excel (or PowerBI).
Natively, Excel (and PowerBI) can query a HTTP target to obtain data in a variety of formats.
Since XML is supported with no problems, this enables us to call the PI/PO PerformanceDataQueryServlet directly from Excel.

For those feeling adventurous, the HTTP logs can actually be picked up by Azure Log Analytics. You may need to adjust the date/time format with a script, but then it will be possible to have them stored in your workspace for pulling into PowerBI.

Alternatively, you will need to place the HTTP logs into a storage location on a regular interval, somewhere accessible from Excel/PowerBI. This could be Sharepoint or an Azure Storage Account.
Another option is to have them placed into a location that serves HTTP, such as the SAP instance agent, as I have shown before. For the Web Dispatcher you may have 2 logs (in an active/active setup); for the ICM you will have a log file for each application server instance.
By naming the log files in an intelligent manner, you can ensure that your Power Query can always find the files (e.g. don’t include the date/time in the file name).

Data Acquisition

With your data accessible via HTTP, you can use Excel or PowerBI to process it.
In this example, we will go through the processing for the PerformanceDataQueryServlet, since that is the hardest to process in its raw XML format, with multiple nested tables. The nested tables are the reason we use Power Query to transform it.

Open Excel and create a new workbook, then select the “Data” tab:

Click “Get Data” and select “From Other Sources”, then click “Blank Query”:

Click “Advanced Editor”:

Remove any existing text from the query box:

At this point, we can paste in the code necessary to obtain our data, but first we need to understand the URL composition correctly.
For this we can refer to the 2014 blog post from Vadim Klimov to understand the required parameters.

Here’s my sample HTTP call:
https://sapts1app1:50001/mdt/performancedataqueryservlet?component=af.ts1.sapts1db01&begin=2021-01-10T00:00:00.000Z&end=2021-01-11T00:00:00.000Z

This can be broken down as follows:

  • https://sapts1app1:50001 – Host address of the PO App server (this can be any one of the Java instances).
  • /mdt/performancedataqueryservlet – URI of the PerformanceDataQueryServlet.
  • component=af.ts1.sapts1db01 – The name of our AEX (Advanced Adapter Engine Extended).
  • begin=2021-01-10T00:00:00.000Z – The begin time of our data selection period.
  • end=2021-01-11T00:00:00.000Z – The end time of our data selection period.

Something you will notice about our URL is that we are selecting a whole 24-hour window; the sample response further down shows it comes back with the DAILY period type (aggregated by day).
We don’t really have much choice with the PerformanceDataQueryServlet, as we can only choose from the MINUTE, HOURLY or DAILY period types, with aggregation levels of 15 minutes, 1 hour or 1 day respectively.

If we decide not to pull the data via HTTP, then the XML returned from the HTTP call can be pre-saved to a flat file and loaded from there.
Here’s my sample data returned from the call to the PerformanceDataQueryServlet:

<?xml version="1.0" encoding="UTF-8" ?>
<PerformanceDataQueryResults xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:noNamespaceSchemaLocation="http://sapts1app1:50001/mdt/monitor/PerformanceDataQuery.xsd">
<Result>
<Code>OK</Code>
<Details>OK</Details>
<Text>Successfully&#x20;read&#x20;performance&#x20;data.</Text>
<Component>af.ts1.sapts1db01</Component>
<PeriodType>DAILY</PeriodType>
<BeginTime timezone="UTC">2021-01-10&#x20;00&#x3a;00&#x3a;00.0</BeginTime>
<EndTime timezone="UTC">2021-01-11&#x20;00&#x3a;00&#x3a;00.0</EndTime>
</Result>
<Data>
<ColumnNames>
<Column>INBOUND_CHANNEL</Column>
<Column>OUTBOUND_CHANNEL</Column>
<Column>DIRECTION</Column>
<Column>DELIVERY_SEMANTICS</Column>
<Column>SERVER_NODE</Column>
<Column>FROM_PARTY_NAME</Column>
<Column>FROM_SERVICE_NAME</Column>
<Column>TO_PARTY_NAME</Column>
<Column>TO_SERVICE_NAME</Column>
<Column>ACTION_NAME</Column>
<Column>ACTION_TYPE</Column>
<Column>SCENARIO_IDENTIFIER</Column>
<Column>MESSAGE_COUNTER</Column>
<Column>MAX_MESSAGE_SIZE</Column>
<Column>MIN_MESSAGE_SIZE</Column>
<Column>AVG_MESSAGE_SIZE</Column>
<Column>MAX_RETRY_COUNTER</Column>
<Column>MIN_RETRY_COUNTER</Column>
<Column>AVG_RETRY_COUNTER</Column>
<Column>AVG_PROCESSING_TIME</Column>
<Column>TOTAL_PROCESSING_TIME</Column>
<Column>MEASURING_POINTS</Column>
</ColumnNames>
<DataRows>
<Row>
<Entry>HTTPSender_TS1</Entry>
<Entry>SOAPReceiver_Proxy</Entry>
<Entry>OUTBOUND</Entry>
<Entry>BE</Entry>
<Entry>1234567</Entry>
<Entry></Entry>
<Entry>BSR_SAP</Entry>
<Entry></Entry>
<Entry>TS_SERVER</Entry>
<Entry>ProcessingIn</Entry>
<Entry>urn&#x3a;in2db.com&#x3a;ERP&#x3a;Test</Entry>
<Entry>&#x7c;BSR_SAP&#x7c;ProcessingOut&#x7c;&#x7c;</Entry>
<Entry>3</Entry>
<Entry>2396</Entry>
<Entry>1391</Entry>
<Entry>1817</Entry>
<Entry>0</Entry>
<Entry>0</Entry>
<Entry>0.00</Entry>
<Entry>524476</Entry>
<Entry>1573428</Entry>
<Entry> <MeasuringPoints><MP> <Name>MS&#x3a;module_in&#x3a;CallSapAdapter</Name> <Sequence>1</Sequence> <Max>394</Max> <Avg>349</Avg> <Min>261</Min></MP><MP> <Name>MS&#x3a;stage&#x3a;SI</Name> <Sequence>2</Sequence> <Max>12</Max> <Avg>9</Avg> <Min>8</Min></MP><MP> <Name>MS&#x3a;stage&#x3a;BI</Name> <Sequence>3</Sequence> <Max>73</Max> <Avg>60</Avg> <Min>52</Min></MP><MP> <Name>MS&#x3a;stage&#x3a;VI</Name> <Sequence>4</Sequence> <Max>12</Max> <Avg>8</Avg> <Min>7</Min></MP><MP> <Name>MS&#x3a;stage&#x3a;MS</Name> <Sequence>5</Sequence> <Max>1266</Max> <Avg>1050</Avg> <Min>771</Min></MP><MP> <Name>MS&#x3a;Message_Put_In_Store</Name> <Sequence>6</Sequence> <Max>155</Max> <Avg>112</Avg> <Min>90</Min></MP><MP> <Name>MS&#x3a;Message_Put_In_Disp_Queue</Name> <Sequence>7</Sequence> <Max>2328</Max> <Avg>836</Avg> <Min>82</Min></MP><MP> <Name>MS&#x3a;Message_Wait_In_Disp_Queue</Name> <Sequence>8</Sequence> <Max>1445</Max> <Avg>630</Avg> <Min>203</Min></MP><MP> <Name>MS&#x3a;Message_Put_In_Queue</Name> <Sequence>9</Sequence> <Max>44</Max> <Avg>42</Avg> <Min>42</Min></MP><MP> <Name>MS&#x3a;Message_Wait_In_Queue</Name> <Sequence>10</Sequence> <Max>323</Max> <Avg>263</Avg> <Min>195</Min></MP><MP> <Name>MS&#x3a;Message_Update_Status</Name> <Sequence>11</Sequence> <Max>233</Max> <Avg>166</Avg> <Min>128</Min></MP><MP> <Name>MS&#x3a;stage&#x3a;AM</Name> <Sequence>12</Sequence> <Max>114891</Max> <Avg>41811</Avg> <Min>2755</Min></MP><MP> <Name>MS&#x3a;stage&#x3a;SO</Name> <Sequence>13</Sequence> <Max>59</Max> <Avg>40</Avg> <Min>24</Min></MP><MP> <Name>MS&#x3a;stage&#x3a;VO</Name> <Sequence>14</Sequence> <Max>44</Max> <Avg>33</Avg> <Min>25</Min></MP><MP> <Name>MS&#x3a;stage&#x3a;AT</Name> <Sequence>15</Sequence> <Max>468</Max> <Avg>364</Avg> <Min>304</Min></MP><MP> <Name>MS&#x3a;module_out&#x3a;sap.com&#x2f;com.sap.aii.af.soapadapter&#x2f;XISOAPAdapterBean</Name> <Sequence>16</Sequence> <Max>1008279</Max> <Avg>478000</Avg> <Min>131434</Min></MP><MP> <Name>MS&#x3a;Resp&#x3a;stage&#x3a;BI</Name> <Sequence>17</Sequence> <Max>575</Max> <Avg>481</Avg> <Min>395</Min></MP><MP> <Name>MS&#x3a;Resp&#x3a;Message_Put_In_Store</Name> <Sequence>18</Sequence> <Max>157</Max> <Avg>136</Avg> <Min>121</Min></MP><MP> <Name>MS&#x3a;Resp&#x3a;Message_Update_Status</Name> <Sequence>19</Sequence> <Max>89</Max> <Avg>86</Avg> <Min>81</Min></MP> </MeasuringPoints></Entry>
</Row>
<Row>
<Entry>SOAP_Sender</Entry>
<Entry>SOAPReceiver_Proxy</Entry>
<Entry>OUTBOUND</Entry>
<Entry>EO</Entry>
<Entry>1234567</Entry>
<Entry></Entry>
<Entry>BSR_SAP</Entry>
<Entry></Entry>
<Entry>TS_SERVER</Entry>
<Entry>Confirmation_In</Entry>
<Entry>http&#x3a;&#x2f;&#x2f;sap.com&#x2f;xi&#x2f;IS-U&#x2f;Global2</Entry>
<Entry>&#x7c;BSR_SAP&#x7c;Confirmation_Out&#x7c;&#x7c;</Entry>
<Entry>48</Entry>
<Entry>672</Entry>
<Entry>672</Entry>
<Entry>672</Entry>
<Entry>0</Entry>
<Entry>0</Entry>
<Entry>0.00</Entry>
<Entry>89338</Entry>
<Entry>4288227</Entry>
<Entry> <MeasuringPoints><MP> <Name>MS&#x3a;SOAPHandler.processSOAPtoXMB</Name> <Sequence>1</Sequence> <Max>488</Max> <Avg>296</Avg> <Min>190</Min></MP><MP> <Name>MS&#x3a;module_in&#x3a;CallSapAdapter</Name> <Sequence>2</Sequence> <Max>521</Max> <Avg>211</Avg> <Min>144</Min></MP><MP> <Name>MS&#x3a;stage&#x3a;SI</Name> <Sequence>3</Sequence> <Max>55</Max> <Avg>6</Avg> <Min>5</Min></MP><MP> <Name>MS&#x3a;stage&#x3a;BI</Name> <Sequence>4</Sequence> <Max>195</Max> <Avg>37</Avg> <Min>26</Min></MP><MP> <Name>MS&#x3a;stage&#x3a;VI</Name> <Sequence>5</Sequence> <Max>28</Max> <Avg>5</Avg> <Min>4</Min></MP><MP> <Name>MS&#x3a;stage&#x3a;MS</Name> <Sequence>6</Sequence> <Max>7495</Max> <Avg>2675</Avg> <Min>1340</Min></MP><MP> <Name>MS&#x3a;Message_Put_In_Store</Name> <Sequence>7</Sequence> <Max>28648</Max> <Avg>8891</Avg> <Min>6457</Min></MP><MP> <Name>MS&#x3a;Message_Put_In_Disp_Queue</Name> <Sequence>8</Sequence> <Max>12290</Max> <Avg>6102</Avg> <Min>3558</Min></MP><MP> <Name>MS&#x3a;Message_Put_In_Queue</Name> <Sequence>9</Sequence> <Max>191</Max> <Avg>46</Avg> <Min>21</Min></MP><MP> <Name>MS&#x3a;Message_Wait_In_Queue</Name> <Sequence>10</Sequence> <Max>401</Max> <Avg>229</Avg> <Min>153</Min></MP><MP> <Name>MS&#x3a;Message_Wait_In_Disp_Queue</Name> <Sequence>11</Sequence> <Max>18855</Max> <Avg>5289</Avg> <Min>8</Min></MP><MP> <Name>MS&#x3a;Message_Update_Status</Name> <Sequence>12</Sequence> <Max>25237</Max> <Avg>9398</Avg> <Min>5056</Min></MP><MP> <Name>MS&#x3a;stage&#x3a;AM</Name> <Sequence>13</Sequence> <Max>390</Max> <Avg>183</Avg> <Min>124</Min></MP><MP> <Name>MS&#x3a;stage&#x3a;SO</Name> <Sequence>14</Sequence> <Max>102</Max> <Avg>17</Avg> <Min>16</Min></MP><MP> <Name>MS&#x3a;stage&#x3a;VO</Name> <Sequence>15</Sequence> <Max>155</Max> <Avg>22</Avg> <Min>17</Min></MP><MP> <Name>MS&#x3a;stage&#x3a;AT</Name> <Sequence>16</Sequence> <Max>1813</Max> <Avg>332</Avg> <Min>205</Min></MP><MP> <Name>MS&#x3a;module_out&#x3a;sap.com&#x2f;com.sap.aii.af.soapadapter&#x2f;XISOAPAdapterBean</Name> <Sequence>17</Sequence> <Max>91602</Max> <Avg>55588</Avg> <Min>46038</Min></MP> </MeasuringPoints></Entry>
</Row>
</DataRows>
</Data>

</PerformanceDataQueryResults>

The XML data is complex and contains nested tables for the “MeasuringPoint” elements. This is not something that is possible to extract using the Excel data import GUI alone. You will need to use my code 😉
In the code there are a couple of steps that do the required pre-processing (transpose, fill up, then remove some redundant parts), returning the data in a format that lets you report on all the “MeasuringPoints” if you need them.
Could the above be done in another tool? Probably. But everyone has Excel.

Let’s put my Power Query code into the Excel query editor:

let
    // Uncomment to use a URL source,
    // Source = Xml.Tables(Web.Contents("https://sapts1app1:50001/mdt/performancedataqueryservlet?component=af.ts1.sapts1db01&begin=2021-01-10T00:00:00.000Z&end=2021-01-11T00:00:00.000Z")), 
    Source = Xml.Tables(File.Contents("C:\Users\darryl\Documents\Projects\po-perf-metrics\performancedataqueryservlet-1.xml")),
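    // Drill down through the XML structure: the <Data> element, then <DataRows>, then the <Row> records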
    Data = Source{1}[Table],
    DataRows = Data{1}[Table],
    Row = DataRows{0}[Table],
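    // For each <Row>, transpose its 22 <Entry> values into columns (Column1..Column22),
    // fill the nested MeasuringPoints table up into Column22, then drop the leftover last row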
    #"Expanded Entry" = Table.TransformColumns(Row, {"Entry", each Table.RemoveLastN(Table.FillUp(Table.Transpose(_), {"Column22"}),1)}),
    #"Expanded Entry1" = Table.ExpandTableColumn(#"Expanded Entry", "Entry", {"Column1", "Column2", "Column3", "Column4", "Column5", "Column6", "Column7", "Column8", "Column9", "Column10", "Column11", "Column12", "Column13", "Column14", "Column15", "Column16", "Column17", "Column18", "Column19", "Column20", "Column21", "Column22"}, {"Entry.Column1", "Entry.Column2", "Entry.Column3", "Entry.Column4", "Entry.Column5", "Entry.Column6", "Entry.Column7", "Entry.Column8", "Entry.Column9", "Entry.Column10", "Entry.Column11", "Entry.Column12", "Entry.Column13", "Entry.Column14", "Entry.Column15", "Entry.Column16", "Entry.Column17", "Entry.Column18", "Entry.Column19", "Entry.Column20", "Entry.Column21", "Entry.Column22"}),
    #"Renamed Columns" = Table.RenameColumns(#"Expanded Entry1",{{"Entry.Column1", "INBOUND_CHANNEL"}, {"Entry.Column2", "OUTBOUND_CHANNEL"}, {"Entry.Column3", "DIRECTION"}, {"Entry.Column4", "DELIVERY_SEMANTICS"}, {"Entry.Column5", "SERVER_NODE"}, {"Entry.Column6", "FROM_PARTY_NAME"}, {"Entry.Column7", "FROM_SERVICE_NAME"}, {"Entry.Column8", "TO_PARTY_NAME"}, {"Entry.Column9", "TO_SERVICE_NAME"}, {"Entry.Column10", "ACTION_NAME"}, {"Entry.Column11", "ACTION_TYPE"}, {"Entry.Column12", "SCENARIO_IDENTIFIER"}, {"Entry.Column13", "MESSAGE_COUNTER"}, {"Entry.Column14", "MAX_MESSAGE_SIZE"}, {"Entry.Column15", "MIN_MESSAGE_SIZE"}, {"Entry.Column16", "AVG_MESSAGE_SIZE"}, {"Entry.Column17", "MAX_RETRY_COUNTER"}, {"Entry.Column18", "MIN_RETRY_COUNTER"}, {"Entry.Column19", "AVG_RETRY_COUNTER"}, {"Entry.Column20", "AVG_PROCESSING_TIME"}, {"Entry.Column21", "TOTAL_PROCESSING_TIME"}}),
    #"Expanded MP" = Table.ExpandTableColumn(#"Renamed Columns", "Entry.Column22", {"MP"}, {"Entry.Column22.MP"})
in
    #"Expanded MP"

In the code above, you will notice the “Source=” step is using a local file. You can uncomment the “Web” source and comment out the “File” source if you are pulling the data directly via HTTP.

With the Power Query code entered into the editor, check there are no syntax errors and click “Done”:

When querying the data directly over HTTP you will need to edit the credentials at this point.
In the credentials screen, enter the “Basic” username and password to use.
The data will be displayed.
In my sample I see two rows of data:

At the initial top-level, you will see we have the Average Processing Time (in milliseconds) for each interface:

We also have an additional column at the end which contains embedded tables of additional metric data for each specific stage of processing within PI/PO:

By clicking the double arrows at the top of the header for the “Entry.Column22.MP” column, we can expand the embedded table (should you wish to), and you will see that it presents the following additional columns of data:

When you click “OK” it adds those columns to the main list, but note that it also creates additional rows: one row for each measuring point in the embedded table:
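If you prefer to script that expansion rather than clicking through the GUI, one extra step at the end of the query should do it (a sketch; the nested column names Name, Sequence, Max, Avg and Min are taken from the XML sample above):

    #"Expanded MP Detail" = Table.ExpandTableColumn(#"Expanded MP", "Entry.Column22.MP",
        {"Name", "Sequence", "Max", "Avg", "Min"},
        {"MP.Name", "MP.Sequence", "MP.Max", "MP.Avg", "MP.Min"})

Remember to add a comma after the #"Expanded MP" step and change the final "in" line to return #"Expanded MP Detail".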

With the above data expanded, we can really produce some nice graphs.
Here’s an example showing the breakdown of average response time for each of those processing stages.
First I put them into a pivot table and apply an average to the “Avg” column for each of the “Name” column values:

Then I create a pie chart for the data and we can report on which processing stage inside PI/PO is consuming the most time:

By applying additional graphs and filters we could report on individual interfaces’ overall response times, then allow drill-down into the specifics.

Any Potential Issues?

There is one big caveat with the above process of pulling the data from the servlet.
The servlet is extracting data from a memory cache area inside PI/PO.
This cache is an LRU (least recently used) cache, meaning it has a limited size and older entries get overwritten when it becomes full.
The quantity of data is therefore limited.

It is possible that you could switch on the database persistence (logging) of successful messages in PI/PO, but this has a detrimental impact to message throughput performance and is not recommended by SAP for production systems.

To try and get around the cache limitations, my advice would be to extract the data using the smallest granular frequency that the servlet allows (MINUTE), and save it to a disk file in a location accessible from Excel.
Another method could be to use Microsoft Power Automate (previously MS Flow) to pull the file into Sharepoint or other storage.
By extracting the data frequently, you are trying to ensure that you have it stored before the cache is cleared, but also you are building a time-series from which you could potentially include into your reporting tool.
A time-series would allow you to scroll through data in windows of at least 15 mins in size. Not bad.
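As an illustration of the “save to disk and build a time-series” idea, here is a rough Power Query sketch that combines many saved servlet extracts from one folder into a single table (the folder path and the .xml naming are my own assumptions):

let
    Files    = Folder.Files("C:\po-perf-metrics\extracts"),
    XmlOnly  = Table.SelectRows(Files, each Text.Lower([Extension]) = ".xml"),
    // Re-use the navigation from the main query: <Data> -> <DataRows> -> <Row>
    AddRows  = Table.AddColumn(XmlOnly, "Rows",
                   each Xml.Tables([Content]){1}[Table]{1}[Table]{0}[Table]),
    // Keep the file name so that each extract's time window can be derived from it
    Combined = Table.ExpandTableColumn(Table.SelectColumns(AddRows, {"Name", "Rows"}),
                   "Rows", {"Entry"}, {"Entry"})
in
    Combined

From the "Entry" column onwards, the same transpose/expand/rename steps as in the main query apply.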

Summary

We identified the important areas of data collection in PI/PO (and Netweaver in general) to allow response times to be seen.
We also noted that HTTP response codes such as 503 should be obtained from the first outermost point of the Netweaver stack (if possible).

We saw an example of using the “PerformanceDataQueryServlet” to pull data from the PI/PO memory cache and transformed it using Power Query to allow detailed reporting on the response times.
I created a demonstration graph from a pivot table using my sample data, which showed a possible drill-down in the response time of individual processing stages within SAP PI/PO.

Hopefully I have given you some ideas on how you can solve your PI/PO performance reporting requirement.

Tuning SAP ASE – Decompressing Small Lookup Tables

By default, for SAP applications based on SAP Netweaver, SAP ASE 16.0 will compress the data pages within the tables inside the database.
In some scenarios this will not help with performance.

In this brief post I explain the scenario, correct a SAP note, and show you how to check and adjust the tables in a safe way (do not just follow the note blindly; you could corrupt the table data).

Data Page Compression

In ASE 16.0, table data pages can be compressed.

For SAP Netweaver based SAP systems such as SAP ERP, the use of table data page compression can help improve performance in two ways:

  1. Reduced I/O
    Reading and writing compressed data pages to and from disk is more efficient. Imagine that it’s a bit like copying a 4MB zip file on your computer, instead of an uncompressed 40MB text file!
  2. Reduced Memory Consumption
    In memory (in the data cache), the data pages can be held in compressed form and may not need to be uncompressed in order to satisfy certain queries.

The above sounds good and indeed it is good.
But there are some borderline cases where compression on a table may not be beneficial.

The Drawback with Compression

In some cases, compression can affect the response time of queries in very specific circumstances.

If you have “small” tables and those tables are accessed frequently with SQL queries that perform full table scans, then the extra CPU time required to decompress the table contents in the data cache can impact the response time.

How do We Identify Candidate Tables?

If there are tables which are adversely affected by having compression turned on, how do we identify those tables?

We can follow SAP note 1775764, which provides a nice simple SQL statement to list likely candidates. Great!

However, the provided SQL statement actually does not do what it should!
Oh dear!
Instead, the SQL incorrectly lists tables that are definitely not good candidates for decompression. In fact, some of them are the complete opposite of good candidates!

The SQL taken from the SAP note is:

--- DO NOT RUN THIS IT IS WRONG!!! ---
use <SID>
go
select top 10 ObjectName, 
LogicalReads 
from master..monOpenObjectActivity 
where DBID = db_id() 
and LogicalReads > 1000000 
-->   and data_pages(DBID,0)) < 6400      <-- WRONG LINE HERE! 
and object_attr(DBName || '.SAPSR3.' || ObjectName,'compression') <> 'none' 
order by LogicalReads desc
go

You will see that in the above SQL, taken from the SAP note, the “data_pages” function call has 2 parameters: the ID of the current database and an object id. Except that the object id (on the marked line) has been hard-coded to “0”.

This causes a value of “0” to be returned from the “data_pages” call, which is obviously always less than 6400.

Essentially, any table that has compression enabled, with high numbers of LogicalReads (table scans in the data cache), will be reported as a candidate! How confusing.

Correcting the SQL

We can see that we need to correct the SQL.
Here’s how:

--- This is the corrected statement ---
use <SID>
go
select top 10 ObjectName, 
LogicalReads 
from master..monOpenObjectActivity 
where DBID = db_id() 
and LogicalReads > 1000000 
and data_pages(DBID, ObjectID) < 6400
and object_attr(DBName || '.SAPSR3.' || ObjectName,'compression') <> 'none' 
order by LogicalReads desc
go

It’s simple enough: we just substitute the incorrect “0” with the “ObjectID” column from the monOpenObjectActivity table we are selecting from.
This correction then allows the correct output from the call to function “data_pages“.
With the correction in place, tables with a data page count of less than 6400 pages (with 16KB pages, that means tables smaller than 100MB), with compression enabled and a high number of LogicalReads, will be listed as candidates. Yay!

Adjusting the Candidate Tables

Once you have your table list, you might now wish to turn off compression.
This is simple enough.
You can use the additional SQL supplied in the SAP note:

--- WARNING: check SAP note 2614712 first! ---
use <SAPSID>
go
setuser 'SAPSR3'
set quoted_identifier on
set chained on
go
alter table SAPSR3.<tablename> set compression = none
reorg rebuild SAPSR3.<tablename> [with online]   -- check SAP note 2614712 first!
commit
go

As you can see, you will need to perform a REORG on the table.
The SQL statement specifies the optional “with online” option.
BE WARNED: There are lots of SAP notes about issues with REORG ONLINE, please ensure that you check SAP note 2614712 first! Shame on SAP for not mentioning this in the SAP note!
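Once the rebuild has completed, a quick sanity check is to re-run the object_attr function used in the selection SQL, which should now return “none” for the adjusted table (a sketch):

use <SID>
go
select object_attr('<SID>.SAPSR3.<tablename>', 'compression')
go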

Checking the Result

We’ve gone through the effort of adjusting tables we think that might benefit from no-compression, but how do we now confirm that we have positively impacted the response time?

Well that’s a difficult one. I’ll leave that for another post.

Summary

The summary for this post is a little bit “Boris Johnson”:

  • Follow the SAP notes.
  • But don’t follow the SAP notes.
  • Check and double check what is output from commands, don’t just take the output as being correct!
  • Check your results.
  • Compressing tables is a good thing.
  • Compressing specific tables may not be a good thing.

Dropping Empty SAP BW Table Partitions in SAP ASE

In a SAP BW 7.4 on SAP ASE database, table partitions are used as a method of storing data, primarily for query performance but also for object (table) management.

In this post I show a simple way to identify tables with many empty partitions, so that you can more quickly drop those empty partitions.
Fewer empty partitions reduce the downtime of database migrations, and can also increase the performance of business-as-usual queries.

Partitioning in SAP ASE

To use SAP BW on SAP ASE, the “partitioning” license needs to be bought/included in the database license.
The license is automatically included in the runtime license for ASE for SAP Business Suite.

SAP note 2187579 “SYB: Pros and cons of physical partitioning of fact tables” lists all of the benefits and the options of partitioning for ASE 15.7 and ASE 16.0.

During normal business usage, the database can use less effort to obtain data from a partitioned table, when the partition key column is used as a predicate in the query.
This is because the database knows exactly where the data is held.
It’s in its own table partition and is therefore more manageable.

A good analogy is to imagine that you have two holding pens, one with all cats, and one with all dogs. The partition key is “animal type” and each holding pen is a partition.
Both holding pens together make the table.
If we wanted to get all data from the table where the occupant was a cat, we simply go to the pen with all the cats and read the data.

Now imagine that we had 3 partitions that make up our table, but one of those partitions was actually empty.
In some cases, depending on the database settings, certain types of data queries will still scan for data in that empty partition.
These additional scans do not take a huge amount of time individually, but it does cause extra effort nevertheless.

Now upscale our scenario to a large multi-terabyte SAP BW system and a BW FACT table with thousands of partitions.
If thousands of those partitions were empty and we were accessing all records of the table (a full table scan), this would give a noticeable delay before the query could return the results.
For this exact reason, SAP provide a BW report.

The Standard SAP BW Report

The standard ABAP report SAP_DROP_EMPTY_FPARTITIONS is specifically for the FACT tables of a BW system, and the ASE Best Practices document recommends running it before a BW system migration/copy is performed.

By reducing the empty partitions, we also reduce the system export time. Easy winner.

The problem with the SAP standard report is that you will need to go through each individual BW info-cube and execute the report in “show” mode.
This is really, really painfully slow.

A Better Way

Instead of the standard report, I decided to go straight to the DB layer and use SQL.
The example below is for SAP ASE 16.0 (should work on 15.7 also):

select distinct 
       convert(varchar(20),so.name) as tabname, 
       t_spc.numparts-1 as num_parts, 
       t_spn.numparts-1 as num_emptyparts 
from sysobjects so, 
     (select sp1.id, 
             count(sp1.partitionid) as numparts 
      from syspartitions sp1 
      where sp1.indid = 0 
      group by sp1.id 
     ) as t_spc,
     (select sp2.id, 
            count(sp2.partitionid) as numparts 
      from syspartitions sp2, 
           systabstats sts 
      where sp2.indid = 0 
        and sp2.partitionid = sts.partitionid
        and sts.indid = 0 
        and sts.rowcnt = 0 
      group by sp2.id 
     ) as t_spn 
where so.name like '/BIC/F%' 
  and so.id = t_spc.id 
  and so.id = t_spn.id 
  and so.loginame = 'SAPSR3' 
  and t_spn.numparts > 1 
order by t_spn.numparts asc, so.name

It’s fairly crude because it restricts the tables to those owned by SAPSR3 (change this if your schema/owner is different) and it is looking for FACT tables by their name (“/BIC/F*”) which may not be conclusive.

Below is an example output of the SQL versus the report SAP_DROP_EMPTY_FPARTITIONS in “show” mode:

You can see we are only 1 count out (I’ve corrected it in the SQL now), but essentially we get a full list of the tables on which we can have the most impact!

Let’s look at a sample SELECT statement against that table:

We used the following SQL:

set statistics time on 
select count(*) from [SAPSR3./BIC/FZPSC0201]
go

Execution time on that statement was 25.9 seconds (elapsed time of 25931 ms).
We spent 2 seconds parsing and compiling the SQL statement (lots of partitions probably doesn’t help this either).
Since the CPU time is only 7 seconds, we have to assume that I/O was the reason for the delay while ASE scanned over the partitions.

Dropping The Partitions

Let’s go ahead and actually drop those empty partitions using the standard ABAP report SAP_DROP_EMPTY_FPARTITIONS mentioned above.

NOTE: See SAP note 1974523 “SYB: Adaption of SAP_DROP_EMPTY_FPARTITIONS for SAP ASE” for more details on how to use the report.


We need to run this in the background, because dropping over 1,000 partitions will take a while.

Once dropped, we can re-run our select statement:

Total elapsed time is now down to just 6 seconds.
Admittedly there could be some time saving due to the data cache and plan cache already being populated for this table, so I ran the ASE stored procedure sp_unbindcache, which seemed to have done something.
Then I re-ran the query:

Being unsure if the unbind worked or not (I could not bounce the DB to be absolutely sure), I’m going to just accept that we have improved the result by dropping those empty partitions.
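For reference, the shape of the sp_unbindcache call is roughly as follows (a sketch only; the database and table names are the ones from the example above, and sp_helpcache can be used first to see what, if anything, is actually bound):

sp_unbindcache '<SID>', 'SAPSR3./BIC/FZPSC0201'
go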

Analysing & Reducing HANA Backup Catalog Records

In honour of DBA Appreciation Day today 3rd July, I’ve written a small piece on a menial but crucial task that HANA database administrators may wish to check. It’s very easy to overlook but the impact can be quite amazing.

HANA Transaction Logging

In “normal” log mode (for recoverability), the HANA database, like Oracle, has an automatic transaction log backup process, which is responsible for backing up transaction log segments so that the HANA log volume disk space can be re-used by new transactions.
No free disk space in the HANA log volume, means the database will hang, until free space becomes available.

It is strongly recommended by SAP, to have your HANA database in log mode “normal”, since this offers the point-in-time recovery capability through the use of the transaction log backups.

By default a transaction log backup will be triggered automatically by HANA every time a log segment becomes full or if the timeout for an individual service is hit, whichever of those is sooner.
This is known as “immediate” interval mode.

I’m not going to go into the differences of the various interval options and the pros and cons of each since this is highly scenario specific. A lot of companies have small HANA databases and are quite happy with the default options. Some companies have high throughput, super low latency requirements, and would be tuning the log backup process for maximum throughput, while other companies want minimal data-loss and adjust the parameters to ensure that transactions are backed up off the machine as soon as possible.

The SITREP

In this specific situation that I encountered, I have a small HANA database of around ~200GB in memory, serving a SAP Solution Manager 7.2 system (so it has 2x tenant databases plus the SystemDB).

The settings are such that all databases run in log_mode “normal” with consolidated log backups enabled in “immediate” mode and a max_log_backup_size of 16GB (the default, but specified).
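For reference, these settings live in global.ini under the [persistence] section; a minimal sketch is shown below (values are illustrative only, and the consolidated/interval-mode log backup options have their own parameters, see the Useful Links at the end of this post):

[persistence]
log_mode = normal
enable_auto_log_backup = yes
log_backup_timeout_s = 900
max_log_backup_size = 16384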

All backups are written to a specific disk area, before being pushed off the VM to an Azure Storage Account.

The Issue

I noticed that the local disk area was becoming quite full where the HANA database backups are written. Out of context you might have put it down to an increase of activity in the system, but I knew that this system was not doing anything at all (it’s a test system for testing Solution Manager patches and nobody was using it).

What Was Causing the Disk Usage?

Looking at the disk backup file system, I could easily see at the O/S level, that the HANA database log backups were the reason for the extra space usage.
Narrowing that down even further, I could be specific enough to see that the SYSTEMDB was to blame.

The SYSTEMDB in a very lightly used HANA database should not be transacting enough to have a day-to-day noticeable increase in log backup disk usage.
This was no ordinary increase!
I was looking at a total HANA database size on disk of ~120GB (SYSTEMDB plus 2x TenantDBs), and yet I was seeing ~200GB of transaction log backups per day from just the SYSTEMDB.

Drilling down further into the log backup directory for the SYSTEMDB, I could see the name of the log backup files and their sizes.
I was looking at log backup files of 2.8GB in size every ~10 to ~15 minutes.
The files that were biggest were….

… log_backup_0_0_0_0.<unix epoch time>
That’s right, the backup catalog backups!

Whenever HANA writes a backup, whether it is a complete data backup, or a transaction log backup, it also writes a backup of the backup catalog.
This is extremely useful if you have to restore a system and need to know about the backups that have taken place.
By default, the backup catalog backups are accumulated, which means that HANA doesn’t need to write out multiple backups of the backup catalog for each log backup (remember, we have 2x tenantDBs).

Why Were Catalog Backup Files So Big?

The catalog backups include the entire backup catalog.
This means every prior backup is in the backup file, so by default the backup catalog backup file will increase in size at each backup, unless you do some housekeeping of the backup catalog records.

My task was to write some SQL to check the backup catalog to see how many backup catalog records existed, for what type of backups, in which database and how old they were.

I came up with the following SQL:

--- Breakdown of age of backup records in months, by type of record.
SELECT smbc.DATABASE_NAME,
smbc.ENTRY_TYPE_NAME,
MONTHS_BETWEEN(smbc.SYS_START_TIME, CURRENT_DATE) as AGE_MONTHS,
COUNT(MONTHS_BETWEEN(smbc.SYS_START_TIME, CURRENT_DATE)) RECORDS,
t_smbc.YOUNGEST_BACKUP_ID
FROM	"SYS_DATABASES"."M_BACKUP_CATALOG" AS smbc,
		(SELECT xmbc.DATABASE_NAME, 
				xmbc.ENTRY_TYPE_NAME, 
				MONTHS_BETWEEN(xmbc.SYS_START_TIME, CURRENT_DATE) as AGE_MONTHS, 
				max (xmbc.BACKUP_ID) as YOUNGEST_BACKUP_ID 
				FROM "SYS_DATABASES"."M_BACKUP_CATALOG" xmbc 
				GROUP BY xmbc.DATABASE_NAME, 
						xmbc.ENTRY_TYPE_NAME, 
						MONTHS_BETWEEN(xmbc.SYS_START_TIME, CURRENT_DATE) 
		) as t_smbc 
WHERE t_smbc.DATABASE_NAME = smbc.DATABASE_NAME 
AND t_smbc.ENTRY_TYPE_NAME = smbc.ENTRY_TYPE_NAME 
AND t_smbc.AGE_MONTHS = MONTHS_BETWEEN(smbc.SYS_START_TIME, CURRENT_DATE) 
GROUP BY 	smbc.DATABASE_NAME, 
			smbc.ENTRY_TYPE_NAME, 
			MONTHS_BETWEEN(smbc.SYS_START_TIME, CURRENT_DATE), 
			t_smbc.YOUNGEST_BACKUP_ID 
ORDER BY DATABASE_NAME, 
		AGE_MONTHS DESC,
		RECORDS

The key points to note are:

  • I use the SYS_DATABASES.M_BACKUP_CATALOG view in the SYSTEMDB to see across all databases in the HANA system instead of checking in each one.
  • For each database, the SQL outputs:
    – type of backup (complete or log).
    – age in months of the backup.
    – number of backup records in that age group.
    – youngest backup id for that age group (so I can do some cleanup).

An example execution is:

(NOTE: I made a mistake with the last column name, it’s correct in the SQL now – YOUNGEST_BACKUP_ID)

You can see that the SQL execution took only 3.8 seconds.
Based on my output, I could immediately see one problem: I had backup records from 6 months ago in the SYSTEMDB!

All of these records would be backed up on every transaction log backup.
For whatever reason, the backup process was not able to honour the “BACKUP CATALOG DELETE” which was meant to keep the catalog to less than 1 month of records.
I still cannot adequately explain why this had failed. The same process is used on other HANA databases and none had exhibited the same issue.

I can only presume something was preventing the deletion somehow, since in the next few steps you will see that I was able to use the exact same process with no reported issues.
For reference this is HANA 2.0 SPS04 rev47, patched all the way from SPS02 rev23.

Resolving the Issue

How did I resolve the issue? I simply re-ran the catalog deletion that was already running after each backup.
I was able to use the backup ID from the YOUNGEST_BACKUP_ID column to reduce the backup records.

In the SYSTEMDB:

BACKUP CATALOG DELETE ALL BEFORE BACKUP_ID xxxxxxxx

Then for each TenantDB (still in the SYSTEMDB):

BACKUP CATALOG DELETE FOR <TENANTDB> ALL BEFORE BACKUP_ID xxxxxxxx

At the end of the first DELETE execution *in the first Tenant*, I re-ran the initial SQL query to check and this was the output:

We now only have 1 backup record, which was the youngest record in that age group for that first tenant database (compare to the screenshot of the first execution of the SQL query, with backup id 1,590,747,286,179).
Crucially, we have far fewer log backup records for that tenant. We’ve gone down from 2247 to 495.
Nice!
I then progressed to do the delete in the SYSTEMDB and other TenantDB of this HANA system.

Checking the Results

As a final check, I was able to compare the log backup file sizes:

The catalog backup in file “log_backup_0_0_0_0.nnnnnnn” at 09:16 is before the cleanup and is 2.7GB in size.
Whereas the catalog backup in “log_backup_0_0_0_0.nnnnnnn” at 09:29 is after the cleanup and is only 76KB in size.
An absolutely massive reduction!

How do we know that file “log_backup_0_0_0_0.nnnnnnn” is a catalog backup?
Because we can check using the Linux “strings” command to see the file string contents.
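For example, something along these lines (keeping the epoch suffix as a placeholder):

strings log_backup_0_0_0_0.<unix epoch time> | less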
Way further down the listing it says it is a catalog backup, but I thought it was more interesting to see the “MAGIC” of Berlin:

UPDATE: August 2020 – SAP note 2962726 has been released which contains some standard SQL to help remove failed backup entries from the catalog.

Summary

  • Check your HANA backup catalog backup sizes.
  • Ensure you have alerting on file systems (if doing backups to disk).
  • Double check the backup catalog record age.
  • Give tons of freebies and thanks to your DBAs on DBA Appreciation Day!

Useful Links

Enable and Disable Automatic Log Backup
https://help.sap.com/viewer/6b94445c94ae495c83a19646e7c3fd56/2.0.05/en-US/241c0f0020b2492fb93a69a40b1b1b9a.html

Accumulated Backups of the Backup Catalog
https://help.sap.com/viewer/6b94445c94ae495c83a19646e7c3fd56/2.0.05/en-US/3def15378b954aac85f2b93bb3f85a49.html

Log Modes
https://help.sap.com/viewer/6b94445c94ae495c83a19646e7c3fd56/2.0.05/en-US/c486a0a3bb571014ab46c0633224f02f.html

Consolidated Log Backups
https://help.sap.com/viewer/6b94445c94ae495c83a19646e7c3fd56/2.0.05/en-US/653b5c6d5f9d41808011a5bd0fac6709.html

SAP ASE – Blocking Factor Madness

I spent around a day looking into a performance issue with a specific piece of SQL in an ERP 6 system running on a SAP ASE database.
The system has recently been migrated from Oracle, so we were expecting issues with hints; however, this didn’t seem to be an index choice issue.
Look at the following two SQL statements, the first one is the system experiencing a performance problem:


The second picture is a system where the performance issue doesn’t exist:


Can you spot the difference?
Hint: Look at the number of question marks in the prepared statement.
The number of question marks indicates the number of items included in the “IN LIST” of the WHERE clause.

Since the ABAP SQL statement will be interpreted at the Kernel level, there is no way to see any difference in the ABAP layer other than this output from a SQL trace.
The consequence of the first statement (with fewer question marks) is that the SQL statement is executed multiple times in order to query for the same “PERNR” records. This can result in as much as four to five times more effort for the SAP layer, plus the database layer, which adds more to the database response time and a little to the “processing time”.

What impacts the number of question marks? Simple: the “rsdb/*blocking_factor” parameters at the Kernel level adjust how many values are fed into the prepared statement in the DBSL layer, as sketched below.
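As a rough illustration, these are the kinds of profile parameters to compare between the systems (names only; take the recommended values from SAP Note 1996340 mentioned below):

rsdb/max_blocking_factor     = <value from SAP Note 1996340>
rsdb/max_in_blocking_factor  = <value from SAP Note 1996340>
rsdb/min_blocking_factor     = <value from SAP Note 1996340>
rsdb/min_in_blocking_factor  = <value from SAP Note 1996340>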

SAP Note 1996340 (SYB: Default RSDB profile parameters for SAP ASE) will provide all the answers.
The SAP note also answered my specific issue, which was why one system in the landscape was different. The answer was that the production system (where the problem was seen) had its parameters mostly carried across during a migration from Oracle, whereas a smaller “release” system had its parameters left behind to die with the Oracle database.

As you will read in the SAP note 1996340, these rsdb parameters are pretty essential and should be re-evaluated when changing database platform.

Always re-evaluate all parameters when migrating from one platform to another.
Don’t assume that someone more experienced has set them with some future knowledge of the landscape/setup.