This blog contains experience gained over the years of implementing (and de-implementing) large scale IT applications/software.

SAP PI/PO Performance Statistics Data Extract and Analysis using Excel & Power Query

End-to-End performance analysis of a specific SAP PO/PI interface is a tricky business using the out-of-the-box tools. For example, you might be thinking that SAP Solution Manager can provide this detail, but it is not able to correlate inbound SAP Web Dispatcher HTTP logs or Netweaver Java ICM logs to the PI/PO interface.
Instead you need to implement your own monitoring to bring these components together, unless you implement a complete tool like AppDynamics.

In this post I will show how to extract the SAP PI/PO performance data and import into Excel, using a Power Query to transform it from the XML format into something that we can report on.
After seeing my example, you can take the same design away and, using PowerBI, construct a dashboard for reporting. You can also choose to incorporate the HTTP logs, to give you an end-to-end performance analysis by interface.

This post is similar to another set of blog posts I published, showing how to pull data about your virtual machines directly from Azure, see: List Your Azure VMs in Excel. That other post used Power Query to transform JSON.

Data Sources

Before we go and start Excel, we need to understand where our data will be coming from.
There are 4 locations that we can report on in a SAP Netweaver AS Java stack running SAP PI/PO:

  • SAP Web Dispatcher (load balancer across multiple App instances).
  • SAP ICM (load balancer across multiple NW java server nodes).
  • SAP NW HTTP Provider.
  • SAP PI/PO performance data servlet.

The last item is the key one, which will require some Power Query goodness to transform the data.
We use the 2014 blog post from Vadim Klimov to see how to pull this data direct from PI/PO using the servlet “PerformanceDataQueryServlet”.

For the Web Dispatcher, the ICM and the HTTP Provider, we really need to choose just one of those points to collect the log information.
Since our source system is handing over processing to “us” at the Web Dispatcher, then that would be the logical location to collect the HTTP logs.
However, some companies use a load balancing appliance like an F5 at the entry-point, in which case you would be better off gathering the HTTP logs from each of the ICM server processes.

The reason for using the HTTP logs from the front-end of the architecture stack is that you want to capture any HTTP 50x responses caused by unavailability of the back-end parts.
For example, if SAP Netweaver is down, then the Web Disp logs would show a HTTP 503 (service unavailable).
If the PO application is not started inside SAP Netweaver, then the ICM logs would show a HTTP 503.
You want to collect the logs from the closest point of the handover between your source system and the Netweaver stack.

For the HTTP log sources, we have a little bit of an issue.
In most cases, logging is not enabled in Web Dispatcher and ICM configurations. To enable the logging we need to look at the parameter “icm/HTTP/logging_<xx>“.
Ideally what we need for the log format is: %h %u %t “%r” %s %H %L
This will give:

  • %h = Source IP.
  • %u = BASIC Auth username or cert common name.
  • %t = Date/time.
  • %r = The request with query string !IMPORTANT!
  • %s = The HTTP response code e.g. 200 or 500 etc.
  • %H = The name of the server host.
  • %L = Response time in milliseconds.

The log file should be switched DAILY (SWITCHTF=day) to prevent it from growing too large.
We will need to transform the data in the log, but we can do this in Power Query or in a script at source.
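For illustration, a matching profile parameter could look something like this (a sketch only; sub-parameter names and support vary by Web Dispatcher/ICM release, so check the SAP documentation for your version). Note that LOGFORMAT is placed last because its value contains spaces:

icm/HTTP/logging_0 = PREFIX=/, LOGFILE=$(DIR_LOGGING)/http_access_%y%m%d.log, SWITCHTF=day, LOGFORMAT=%h %u %t "%r" %s %H %L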

Data Transfer

We now need to understand how we will transfer the data from the data source to Excel (or PowerBI).
Natively, Excel (and PowerBI) can query a HTTP target to obtain data in a variety of formats.
Since XML is supported with no problems, this enables us to call the PI/PO PerformanceDataQueryServlet directly from Excel.

For those feeling adventurous, the HTTP logs can actually be picked up by Azure Log Analytics. You may need to adjust the date/time format with a script, but then it will be possible to have them stored in your workspace for pulling into PowerBI.

Alternatively, you will need to place the HTTP logs into a storage location at a regular interval, somewhere accessible from Excel/PowerBI. This could be SharePoint or an Azure Storage Account.
Another option is to have them placed into a location that serves HTTP, such as the SAP instance agent, like I have shown before. For the Web Dispatcher you may have 2 logs (in an active/active setup); for the ICM you will have a log file for each application server instance.
By naming the log files in an intelligent manner, you can ensure that your Power Query can always find the files (e.g. don’t include the date/time in the file name).

Data Acquisition

With your data accessible via HTTP, you can use Excel or PowerBI to process it.
In this example, we will go through the processing for the PerformanceDataQueryServlet, since that is the hardest to process in its raw XML format, with multiple nested tables. The nested tables are the reason we use Power Query to transform it.

Open Excel and create a new workbook, then select the “Data” tab:

Click “Get Data” and select “From Other Sources”, then click “Blank Query”:

Click “Advanced Editor”:

Remove any existing text from the query box:

At this point, we can paste in the code necessary to obtain our data, but first we need to understand the URL composition correctly.
For this we can refer to the 2014 blog post from Vadim Klimov to understand the required parameters.

Here’s my sample HTTP call:
https://sapts1app1:50001/mdt/performancedataqueryservlet?component=af.ts1.sapts1db01&begin=2021-01-10T00:00:00.000Z&end=2021-01-11T00:00:00.000Z

This can be broken down as follows:

  • https://sapts1app1:50001 – Host address of the PO app server (this can be any one of the Java instances).
  • /mdt/performancedataqueryservlet – URI for the PerformanceDataQueryServlet.
  • component=af.ts1.sapts1db01 – The name of our AAEX.
  • begin=2021-01-10T00:00:00.000Z – The begin time of our data selection period.
  • end=2021-01-11T00:00:00.000Z – The end time of our data selection period.

Something you will notice about our URL is that we are using the HOURLY data selection period, listing data for a 24 hour period aggregated by hour.
We don’t really have much choice with the PerformanceDataQueryServlet, as we can only choose from MINUTE, HOURLY or DAILY with aggregation levels being 15mins, 1hour or 1day.
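As a side note, if you want to avoid hard-coding the selection period, the begin/end parameters can be derived in Power Query itself. Here is a minimal sketch (my own illustration, variable names are hypothetical) that builds the servlet URL for a 24 hour window from a given date:

let
    component = "af.ts1.sapts1db01",
    day = #date(2021, 1, 10),
    // Begin at midnight UTC on the chosen day, end at midnight the next day.
    begin = Date.ToText(day, "yyyy-MM-dd") & "T00:00:00.000Z",
    end = Date.ToText(Date.AddDays(day, 1), "yyyy-MM-dd") & "T00:00:00.000Z",
    url = "https://sapts1app1:50001/mdt/performancedataqueryservlet?component=" & component & "&begin=" & begin & "&end=" & end
in
    url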

If we decide not to pull the data via HTTP directly from Excel, we can save the HTTP response to a flat XML file and point Excel at that instead.
Here’s my sample data returned from the call to the PerformanceDataQueryServlet:

<?xml version="1.0" encoding="UTF-8" ?>
<PerformanceDataQueryResults xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:noNamespaceSchemaLocation="http://sapts1app1:50001/mdt/monitor/PerformanceDataQuery.xsd">
<Result>
<Code>OK</Code>
<Details>OK</Details>
<Text>Successfully&#x20;read&#x20;performance&#x20;data.</Text>
<Component>af.ts1.sapts1db01</Component>
<PeriodType>DAILY</PeriodType>
<BeginTime timezone="UTC">2021-01-10&#x20;00&#x3a;00&#x3a;00.0</BeginTime>
<EndTime timezone="UTC">2021-01-11&#x20;00&#x3a;00&#x3a;00.0</EndTime>
</Result>
<Data>
<ColumnNames>
<Column>INBOUND_CHANNEL</Column>
<Column>OUTBOUND_CHANNEL</Column>
<Column>DIRECTION</Column>
<Column>DELIVERY_SEMANTICS</Column>
<Column>SERVER_NODE</Column>
<Column>FROM_PARTY_NAME</Column>
<Column>FROM_SERVICE_NAME</Column>
<Column>TO_PARTY_NAME</Column>
<Column>TO_SERVICE_NAME</Column>
<Column>ACTION_NAME</Column>
<Column>ACTION_TYPE</Column>
<Column>SCENARIO_IDENTIFIER</Column>
<Column>MESSAGE_COUNTER</Column>
<Column>MAX_MESSAGE_SIZE</Column>
<Column>MIN_MESSAGE_SIZE</Column>
<Column>AVG_MESSAGE_SIZE</Column>
<Column>MAX_RETRY_COUNTER</Column>
<Column>MIN_RETRY_COUNTER</Column>
<Column>AVG_RETRY_COUNTER</Column>
<Column>AVG_PROCESSING_TIME</Column>
<Column>TOTAL_PROCESSING_TIME</Column>
<Column>MEASURING_POINTS</Column>
</ColumnNames>
<DataRows>
<Row>
<Entry>HTTPSender_TS1</Entry>
<Entry>SOAPReceiver_Proxy</Entry>
<Entry>OUTBOUND</Entry>
<Entry>BE</Entry>
<Entry>1234567</Entry>
<Entry></Entry>
<Entry>BSR_SAP</Entry>
<Entry></Entry>
<Entry>TS_SERVER</Entry>
<Entry>ProcessingIn</Entry>
<Entry>urn&#x3a;in2db.com&#x3a;ERP&#x3a;Test</Entry>
<Entry>&#x7c;BSR_SAP&#x7c;ProcessingOut&#x7c;&#x7c;</Entry>
<Entry>3</Entry>
<Entry>2396</Entry>
<Entry>1391</Entry>
<Entry>1817</Entry>
<Entry>0</Entry>
<Entry>0</Entry>
<Entry>0.00</Entry>
<Entry>524476</Entry>
<Entry>1573428</Entry>
<Entry> <MeasuringPoints><MP> <Name>MS&#x3a;module_in&#x3a;CallSapAdapter</Name> <Sequence>1</Sequence> <Max>394</Max> <Avg>349</Avg> <Min>261</Min></MP><MP> <Name>MS&#x3a;stage&#x3a;SI</Name> <Sequence>2</Sequence> <Max>12</Max> <Avg>9</Avg> <Min>8</Min></MP><MP> <Name>MS&#x3a;stage&#x3a;BI</Name> <Sequence>3</Sequence> <Max>73</Max> <Avg>60</Avg> <Min>52</Min></MP><MP> <Name>MS&#x3a;stage&#x3a;VI</Name> <Sequence>4</Sequence> <Max>12</Max> <Avg>8</Avg> <Min>7</Min></MP><MP> <Name>MS&#x3a;stage&#x3a;MS</Name> <Sequence>5</Sequence> <Max>1266</Max> <Avg>1050</Avg> <Min>771</Min></MP><MP> <Name>MS&#x3a;Message_Put_In_Store</Name> <Sequence>6</Sequence> <Max>155</Max> <Avg>112</Avg> <Min>90</Min></MP><MP> <Name>MS&#x3a;Message_Put_In_Disp_Queue</Name> <Sequence>7</Sequence> <Max>2328</Max> <Avg>836</Avg> <Min>82</Min></MP><MP> <Name>MS&#x3a;Message_Wait_In_Disp_Queue</Name> <Sequence>8</Sequence> <Max>1445</Max> <Avg>630</Avg> <Min>203</Min></MP><MP> <Name>MS&#x3a;Message_Put_In_Queue</Name> <Sequence>9</Sequence> <Max>44</Max> <Avg>42</Avg> <Min>42</Min></MP><MP> <Name>MS&#x3a;Message_Wait_In_Queue</Name> <Sequence>10</Sequence> <Max>323</Max> <Avg>263</Avg> <Min>195</Min></MP><MP> <Name>MS&#x3a;Message_Update_Status</Name> <Sequence>11</Sequence> <Max>233</Max> <Avg>166</Avg> <Min>128</Min></MP><MP> <Name>MS&#x3a;stage&#x3a;AM</Name> <Sequence>12</Sequence> <Max>114891</Max> <Avg>41811</Avg> <Min>2755</Min></MP><MP> <Name>MS&#x3a;stage&#x3a;SO</Name> <Sequence>13</Sequence> <Max>59</Max> <Avg>40</Avg> <Min>24</Min></MP><MP> <Name>MS&#x3a;stage&#x3a;VO</Name> <Sequence>14</Sequence> <Max>44</Max> <Avg>33</Avg> <Min>25</Min></MP><MP> <Name>MS&#x3a;stage&#x3a;AT</Name> <Sequence>15</Sequence> <Max>468</Max> <Avg>364</Avg> <Min>304</Min></MP><MP> <Name>MS&#x3a;module_out&#x3a;sap.com&#x2f;com.sap.aii.af.soapadapter&#x2f;XISOAPAdapterBean</Name> <Sequence>16</Sequence> <Max>1008279</Max> <Avg>478000</Avg> <Min>131434</Min></MP><MP> <Name>MS&#x3a;Resp&#x3a;stage&#x3a;BI</Name> <Sequence>17</Sequence> <Max>575</Max> <Avg>481</Avg> <Min>395</Min></MP><MP> <Name>MS&#x3a;Resp&#x3a;Message_Put_In_Store</Name> <Sequence>18</Sequence> <Max>157</Max> <Avg>136</Avg> <Min>121</Min></MP><MP> <Name>MS&#x3a;Resp&#x3a;Message_Update_Status</Name> <Sequence>19</Sequence> <Max>89</Max> <Avg>86</Avg> <Min>81</Min></MP> </MeasuringPoints></Entry>
</Row>
<Row>
<Entry>SOAP_Sender</Entry>
<Entry>SOAPReceiver_Proxy</Entry>
<Entry>OUTBOUND</Entry>
<Entry>EO</Entry>
<Entry>1234567</Entry>
<Entry></Entry>
<Entry>BSR_SAP</Entry>
<Entry></Entry>
<Entry>TS_SERVER</Entry>
<Entry>Confirmation_In</Entry>
<Entry>http&#x3a;&#x2f;&#x2f;sap.com&#x2f;xi&#x2f;IS-U&#x2f;Global2</Entry>
<Entry>&#x7c;BSR_SAP&#x7c;Confirmation_Out&#x7c;&#x7c;</Entry>
<Entry>48</Entry>
<Entry>672</Entry>
<Entry>672</Entry>
<Entry>672</Entry>
<Entry>0</Entry>
<Entry>0</Entry>
<Entry>0.00</Entry>
<Entry>89338</Entry>
<Entry>4288227</Entry>
<Entry> <MeasuringPoints><MP> <Name>MS&#x3a;SOAPHandler.processSOAPtoXMB</Name> <Sequence>1</Sequence> <Max>488</Max> <Avg>296</Avg> <Min>190</Min></MP><MP> <Name>MS&#x3a;module_in&#x3a;CallSapAdapter</Name> <Sequence>2</Sequence> <Max>521</Max> <Avg>211</Avg> <Min>144</Min></MP><MP> <Name>MS&#x3a;stage&#x3a;SI</Name> <Sequence>3</Sequence> <Max>55</Max> <Avg>6</Avg> <Min>5</Min></MP><MP> <Name>MS&#x3a;stage&#x3a;BI</Name> <Sequence>4</Sequence> <Max>195</Max> <Avg>37</Avg> <Min>26</Min></MP><MP> <Name>MS&#x3a;stage&#x3a;VI</Name> <Sequence>5</Sequence> <Max>28</Max> <Avg>5</Avg> <Min>4</Min></MP><MP> <Name>MS&#x3a;stage&#x3a;MS</Name> <Sequence>6</Sequence> <Max>7495</Max> <Avg>2675</Avg> <Min>1340</Min></MP><MP> <Name>MS&#x3a;Message_Put_In_Store</Name> <Sequence>7</Sequence> <Max>28648</Max> <Avg>8891</Avg> <Min>6457</Min></MP><MP> <Name>MS&#x3a;Message_Put_In_Disp_Queue</Name> <Sequence>8</Sequence> <Max>12290</Max> <Avg>6102</Avg> <Min>3558</Min></MP><MP> <Name>MS&#x3a;Message_Put_In_Queue</Name> <Sequence>9</Sequence> <Max>191</Max> <Avg>46</Avg> <Min>21</Min></MP><MP> <Name>MS&#x3a;Message_Wait_In_Queue</Name> <Sequence>10</Sequence> <Max>401</Max> <Avg>229</Avg> <Min>153</Min></MP><MP> <Name>MS&#x3a;Message_Wait_In_Disp_Queue</Name> <Sequence>11</Sequence> <Max>18855</Max> <Avg>5289</Avg> <Min>8</Min></MP><MP> <Name>MS&#x3a;Message_Update_Status</Name> <Sequence>12</Sequence> <Max>25237</Max> <Avg>9398</Avg> <Min>5056</Min></MP><MP> <Name>MS&#x3a;stage&#x3a;AM</Name> <Sequence>13</Sequence> <Max>390</Max> <Avg>183</Avg> <Min>124</Min></MP><MP> <Name>MS&#x3a;stage&#x3a;SO</Name> <Sequence>14</Sequence> <Max>102</Max> <Avg>17</Avg> <Min>16</Min></MP><MP> <Name>MS&#x3a;stage&#x3a;VO</Name> <Sequence>15</Sequence> <Max>155</Max> <Avg>22</Avg> <Min>17</Min></MP><MP> <Name>MS&#x3a;stage&#x3a;AT</Name> <Sequence>16</Sequence> <Max>1813</Max> <Avg>332</Avg> <Min>205</Min></MP><MP> <Name>MS&#x3a;module_out&#x3a;sap.com&#x2f;com.sap.aii.af.soapadapter&#x2f;XISOAPAdapterBean</Name> <Sequence>17</Sequence> <Max>91602</Max> <Avg>55588</Avg> <Min>46038</Min></MP> </MeasuringPoints></Entry>
</Row>
</DataRows>
</Data>

</PerformanceDataQueryResults>

The XML data is complex and contains nested tables for the “MeasuringPoint” elements. This is not something that is possible to extract using the Excel data import GUI alone. You will need to use my code 😉
The code contains two steps that do the required pre-processing (a transpose and fill-up of each nested table, followed by a column expansion), returning the data in the required format so that you can report on all the “MeasuringPoints” if you need them.
Could the above be done in another tool? Probably. But everyone has Excel.

Let’s put my Power Query code into the Excel query editor:

let
    // Uncomment to use a URL source,
    // Source = Xml.Tables(Web.Contents("https://sapts1app1:50001/mdt/performancedataqueryservlet?component=af.ts1.sapts1db01&begin=2021-01-10T00:00:00.000Z&end=2021-01-11T00:00:00.000Z")), 
    Source = Xml.Tables(File.Contents("C:\Users\darryl\Documents\Projects\po-perf-metrics\performancedataqueryservlet-1.xml")),
    Data = Source{1}[Table],
    DataRows = Data{1}[Table],
    Row = DataRows{0}[Table],
    #"Expanded Entry" = Table.TransformColumns(Row, {"Entry", each Table.RemoveLastN(Table.FillUp(Table.Transpose(_), {"Column22"}),1)}),
    #"Expanded Entry1" = Table.ExpandTableColumn(#"Expanded Entry", "Entry", {"Column1", "Column2", "Column3", "Column4", "Column5", "Column6", "Column7", "Column8", "Column9", "Column10", "Column11", "Column12", "Column13", "Column14", "Column15", "Column16", "Column17", "Column18", "Column19", "Column20", "Column21", "Column22"}, {"Entry.Column1", "Entry.Column2", "Entry.Column3", "Entry.Column4", "Entry.Column5", "Entry.Column6", "Entry.Column7", "Entry.Column8", "Entry.Column9", "Entry.Column10", "Entry.Column11", "Entry.Column12", "Entry.Column13", "Entry.Column14", "Entry.Column15", "Entry.Column16", "Entry.Column17", "Entry.Column18", "Entry.Column19", "Entry.Column20", "Entry.Column21", "Entry.Column22"}),
    #"Renamed Columns" = Table.RenameColumns(#"Expanded Entry1",{{"Entry.Column1", "INBOUND_CHANNEL"}, {"Entry.Column2", "OUTBOUND_CHANNEL"}, {"Entry.Column3", "DIRECTION"}, {"Entry.Column4", "DELIVERY_SEMANTICS"}, {"Entry.Column5", "SERVER_NODE"}, {"Entry.Column6", "FROM_PARTY_NAME"}, {"Entry.Column7", "FROM_SERVICE_NAME"}, {"Entry.Column8", "TO_PARTY_NAME"}, {"Entry.Column9", "TO_SERVICE_NAME"}, {"Entry.Column10", "ACTION_NAME"}, {"Entry.Column11", "ACTION_TYPE"}, {"Entry.Column12", "SCENARIO_IDENTIFIER"}, {"Entry.Column13", "MESSAGE_COUNTER"}, {"Entry.Column14", "MAX_MESSAGE_SIZE"}, {"Entry.Column15", "MIN_MESSAGE_SIZE"}, {"Entry.Column16", "AVG_MESSAGE_SIZE"}, {"Entry.Column17", "MAX_RETRY_COUNTER"}, {"Entry.Column18", "MIN_RETRY_COUNTER"}, {"Entry.Column19", "AVG_RETRY_COUNTER"}, {"Entry.Column20", "AVG_PROCESSING_TIME"}, {"Entry.Column21", "TOTAL_PROCESSING_TIME"}}),
    #"Expanded MP" = Table.ExpandTableColumn(#"Renamed Columns", "Entry.Column22", {"MP"}, {"Entry.Column22.MP"})
in
    #"Expanded MP"

In the code above, you will notice the “Source=” is using a local file. You can uncomment the “Web” source and comment out the “File” source if you are pulling the data direct via HTTP.

With the Power Query code entered into the editor, check there are no syntax errors and click “Done”:

When querying the data directly over HTTP you will need to edit the credentials at this point.
In the credentials screen, enter the “Basic” username and password to use.
The data will be displayed.
In my sample I see two rows of data:

At the top level, you will see we have the Average Processing Time (in milliseconds) for each interface:

We also have an additional column at the end which contains embedded tables of additional metric data for each specific stage of processing within PI/PO:

By clicking the double arrows in the header of the “Entry.Column22.MP” column, we can expand the embedded table (should you wish to), and you will see that it presents the following additional columns of data:

When you click “OK” it adds those columns to the main list, but it will also create additional rows: one for each row of the nested table that has been expanded:
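If you prefer to perform this expansion in the query code itself rather than through the GUI, an equivalent extra step appended to the query might look like this (a sketch; the nested column names are assumed from the sample XML above):

    #"Expanded MP Detail" = Table.ExpandTableColumn(#"Expanded MP", "Entry.Column22.MP", {"Name", "Sequence", "Max", "Avg", "Min"}, {"MP.Name", "MP.Sequence", "MP.Max", "MP.Avg", "MP.Min"})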

With the above data expanded, we can really produce some nice graphs.
Here’s an example showing the breakdown of average response time for each of those processing stages.
First I put them into a pivot table and apply an average to the “Avg” column for each of the “Name” column values:

Then I create a pie chart for the data and we can report on which processing stage inside PI/PO is consuming the most time:

By applying additional graphs and filters we could report on individual interfaces’ overall response times, then allow drill-down into the specifics.
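If you would rather do that aggregation in Power Query instead of a pivot table, a grouping step along these lines could work (a sketch, assuming the expanded MP columns from the step above, and that the values arrive as text and need converting):

    #"Avg by Stage" = Table.Group(#"Expanded MP Detail", {"MP.Name"}, {{"AvgTime", each List.Average(List.Transform([MP.Avg], Number.From)), type number}})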

Any Potential Issues?

There is one big caveat with the above process of pulling the data from the servlet.
The servlet is extracting data from a memory cache area inside PI/PO.
This cache is an LRU cache, meaning it has a limited size and gets overwritten when it becomes full.
The quantity of data is therefore limited.

It is possible that you could switch on the database persistence (logging) of successful messages in PI/PO, but this has a detrimental impact to message throughput performance and is not recommended by SAP for production systems.

To try and get around the cache limitations, my advice would be to extract the data using the smallest granular frequency that the servlet allows (MINUTE), and save it to a disk file in a location accessible from Excel.
Another method could be to use Microsoft Power Automate (previously MS Flow) to pull the file into SharePoint or other storage.
By extracting the data frequently, you are trying to ensure that you have it stored before the cache is cleared, but also you are building a time-series from which you could potentially include into your reporting tool.
A time-series would allow you to scroll through data in windows of at least 15 mins in size. Not bad.
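As a rough sketch of what the store-and-combine approach could look like in Power Query (assuming the servlet responses are saved as .xml files into a single folder; the folder path is hypothetical):

let
    // Pick up every saved servlet response from the folder.
    Source = Folder.Files("C:\po-perf-metrics\extracts"),
    XmlOnly = Table.SelectRows(Source, each [Extension] = ".xml"),
    // Parse each file; the resulting nested tables can then be expanded
    // and transformed using the same steps as the main query above.
    Parsed = Table.AddColumn(XmlOnly, "Xml", each Xml.Tables([Content]))
in
    Parsed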

Summary

We identified the important areas of data collection in PI/PO (and Netweaver in general) to allow response times to be seen.
We also noted that HTTP response codes such as 503 should be obtained from the first outermost point of the Netweaver stack (if possible).

We saw an example of using the “PerformanceDataQueryServlet” to pull data from the PI/PO memory cache and transformed it using Power Query to allow detailed reporting on the response times.
I created a demonstration graph from a pivot table using my sample data, which showed a possible drill-down in the response time of individual processing stages within SAP PI/PO.

Hopefully I have given you some ideas for how you can solve your PI/PO performance reporting requirement.

List Your Azure VMs in Excel – Part 3

The third and final part to the trilogy.
I show you how to list your Azure VMs in Excel O365 using Power Query, and this time we enhance the code even further using the Power Query function List.Accumulate, to accumulate the returned list of VMs from each subscription into one big list.

Like an episode of a tacky day-time television show, it’s always worth wasting some words on a recap.
So, in-case you’re not aware, here’s what has happened in this saga so far:

List Your VMs in Excel – part1
The first part of the trilogy, showed how to create a new Power Query in Excel O365, and how to enter the code, which generated a basic VM list from Azure.

List Your VMs in Excel – part2
The second part enhanced the code slightly, parameterising more of the text strings.
It also introduced the ability to get the powerState of each VM from Azure, allowing you to see in Excel, which VMs were running and which were deallocated.

By applying the code changes from this post, you will no longer need multiple Power Query queries, going from 1 per Azure subscription, to just 1 query for all subscriptions.

What’s New?

As mentioned, the code now includes the use of the Power Query “List.Accumulate” function to combine the lists of VMs from multiple subscriptions.

I’d never really used Power Query before, so I had no idea how to loop over a list of values and execute a function on each iteration. After a bit of searching I found that the List.Accumulate function did exactly what I needed, with minimal coding.

Here’s a pretty comprehensive description of how List.Accumulate works: https://datachant.com/2016/06/02/power-query-list-accumulate-unleashed/
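As a trivial illustration of the mechanics (my own example, not from that post): List.Accumulate folds a list into a single value by feeding each item, together with the running state, through the accumulator function:

// Starts from the seed 0 and adds each list item to the running state.
// Result: 10.
List.Accumulate({1, 2, 3, 4}, 0, (state, current) => state + current)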

What you will notice is that our code has the following properties:

  • subscriptions = Our defined list of Azure subscriptions.
  • {} = Our blank list as a seed.
  • List.Combine = Executed for each entry in our subscriptions list.

List.Accumulate(subscriptions, 
                {},
                (state,current)=>
                              List.Combine({state,FnGeneratedVMList(current)}))

We define the subscriptions list right at the start of the code:

subscriptions = {"[your subscription1]","[your subscription2]"} as list, 

To make our code work with the List.Accumulate, we have changed “GeneratedVMList” into a function (“FnGeneratedVMList”) that returns a list, instead of a static list expression:

// FnGeneratedVMList pages through the subscription and gets the VM lists. 
FnGeneratedVMList = (subscription as text) as list => 
 List.Generate( ()=>[i=0, 
  res = FnGetOnePage(endPoint & "/subscriptions/" & subscription & "/providers/Microsoft.Compute/virtualMachines?api-version=" & apiVersion)], 
  each [i] < iterations and [res][Data] <> null, 
  each [i=[i]+1, 
  res = FnGetOnePage([res][Next])], 
  each [res][Data]),

The main benefit of the function is that it can now be passed a parameter “subscription“, which is used to adjust the URI to the Azure API for the correct subscription.

The End Result

The end result of our changes is the following code:

let 
 iterations = 10 as number, 
 // Max Number of Pages of VMs. 
 endPoint = "https://management.azure.com" as text, 
 subscriptions = {"[your subscription1]","[your subscription2]"} as list, 
 apiVersion = "2019-07-01" as text,
 
// FnGetOnePage is the function that performs an import of single page. 
// The page consists of a record with the data and the URL in the 
// fields data and next. Other Web APIs hold the data and cursor in different formats 
// but the principle is the same. 
FnGetOnePage = (url) as record => 
 let Source = Json.Document(Web.Contents(url)), 
 data = try Source[value] otherwise null, 
 next = try Source[nextLink] otherwise null, 
 res = [Data=data, Next=next] 
in 
 res,
 
// FnGetVMdisplayStatus gets the instanceView object for the passed VM ID 
// then parses out the displayStatus from one of two possible locations. 
FnGetVMdisplayStatus = (idURI) as text => 
 let Source = Json.Document(Web.Contents(endPoint & idURI & "/instanceView?api-version=" & apiVersion)), 
 statuses = Source[statuses], 
 vmDisplayStatus1 = try statuses{1}[displayStatus] otherwise "", 
 vmDisplayStatus2 = try statuses{2}[displayStatus] otherwise "", 
 vmDisplayStatus = vmDisplayStatus1 & vmDisplayStatus2 
in
 vmDisplayStatus,

// FnGeneratedVMList pages through the subscription and gets the VM lists. 
FnGeneratedVMList = (subscription as text) as list => 
 List.Generate( ()=>[i=0, 
  res = FnGetOnePage(endPoint & "/subscriptions/" & subscription & "/providers/Microsoft.Compute/virtualMachines?api-version=" & apiVersion)], 
  each [i] < iterations and [res][Data] <> null, 
  each [i=[i]+1, 
  res = FnGetOnePage([res][Next])], 
  each [res][Data]),

// SubscriptionsVMList combines the returned lists using the Accumulator. 
SubscriptionsVMList = 
 List.Accumulate(subscriptions, {},(state,current)=>List.Combine({state,FnGeneratedVMList(current)})),
 
#"VMListTable" = Table.FromList(SubscriptionsVMList, Splitter.SplitByNothing(), null, null, ExtraValues.Error), 
#"Expanded-VMListTable-Column1" = Table.ExpandListColumn(#"VMListTable", "Column1"), 
#"VMdetail-list" = Table.ExpandRecordColumn(#"Expanded-VMListTable-Column1", "Column1", {"name","id"}), 
#"VMdetail-list-with-displayStatus" = Table.AddColumn(#"VMdetail-list", "displayStatus", try each FnGetVMdisplayStatus([id]) otherwise "??") 
in 
#"VMdetail-list-with-displayStatus"

You will need to adjust [your subscription1] and [your subscription2] with your subscriptions, adding any additional subscriptions in the same format.

If you only have one subscription, you can still use this code, just remove one of the items from the subscriptions list, like so:

subscriptions = {"[your subscription1]"} as list,

Summary

Hopefully you can see how you can enhance the code slightly to include more VM details if you need them.
As an example, this line here can be adjusted to include additional columns as well as “name” and “id” (for example: “location“), as per the VM list API call (see the first part of this trilogy for more detail):

#"VMdetail-list" = Table.ExpandRecordColumn(#"Expanded-VMListTable-Column1", "Column1", {"name","id"}), 

You can now enjoy your data in Excel.

Update 23-June: Due to popular demand, here is how the above code can be changed to include the “location” and “vmSize” fields, which will depict the location and series/size of the VM.
We make a small change to the last few lines like so:

#"VMListTable" = Table.FromList(SubscriptionsVMList, Splitter.SplitByNothing(), null, null, ExtraValues.Error), 
"Expanded-VMListTable-Column1" = Table.ExpandListColumn(#"VMListTable", "Column1"),
#"VMdetail-list" = Table.ExpandRecordColumn(#"Expanded-VMListTable-Column1", "Column1", {"name","id", "location", "properties"}),
#"VMdetail-list-properties" = Table.ExpandRecordColumn(#"VMdetail-list", "properties", {"hardwareProfile"}, {"hardwareProfile"}),
#"VMdetail-list-properties-hardwareProfile" = Table.ExpandRecordColumn(#"VMdetail-list-properties", "hardwareProfile", {"vmSize"}, {"vmSize"}),
#"VMdetail-list-with-displayStatus" = Table.AddColumn(#"VMdetail-list-properties-hardwareProfile", "displayStatus", try each FnGetVMdisplayStatus([id]) otherwise "??"),
in 
#"VMdetail-list-with-displayStatus"

List Your Azure VMs in Excel – Part Deux

If you’ve been following, I recently showed you how to get a list of Azure hosted VMs into Microsoft Excel (O365) using Power Query.

You can find the original post here: List Your Azure VMs in Excel

The list was basic, listing only the VM names.
I promised a follow-up post on how to get the power state of each VM, which would show whether it was running, stopped or deallocated.
Here is the follow-up.

Can We Just Adjust our API Call?

If you look at the Azure REST API specification for the Compute collection, you will see that it looks like a simple enough task to get the VM Power State as part of the call to VirtualMachines ListAll (see here https://docs.microsoft.com/en-us/rest/api/compute/virtualmachines/listall).

However, all is not well when applying the “statusOnly” URI parameter to the call. It generates an error in Excel when called via “Json.Document”, and Googling around throws up some interesting comments from others with the same issue.

In short, it doesn’t seem to work the way the API docs say it should, and there appears to be no other way apart from looping on the “VirtualMachineListResult” array that you get back from the “ListAll” call, then making a separate call to get the VM details from the VMInstance view.

So, this is exactly what I did but from within Power Query 🙂

What Changes Were Needed?

As part of the changes to our code, we had to include the VM ID, so that we could use it to call the VMInstance view.
To do this we included a new function FnGetVMdisplayStatus:

// FnGetVMdisplayStatus gets the instanceView object for the passed VM ID 
// then parses out the displayStatus from one of two possible locations. 
FnGetVMdisplayStatus = (idURI) as text => 
 let Source = Json.Document(Web.Contents(endPoint & idURI & "/instanceView?api-version=" & apiVersion)), 
 statuses = Source[statuses], 
 vmDisplayStatus1 = try statuses{1}[displayStatus] otherwise "", 
 vmDisplayStatus2 = try statuses{2}[displayStatus] otherwise "", 
 vmDisplayStatus = vmDisplayStatus1 & vmDisplayStatus2 
in
 vmDisplayStatus,

The new function uses the VM ID, passed in as the function parameter idURI, to formulate the API call URL.
You can see that we’ve also had to parameterise the URL elements throughout, to reduce duplication.

 endPoint = "https://management.azure.com" as text, 
 subscription = "[your subscription]" as text,
 apiVersion = "2019-07-01" as text,

The power state of the VM (we use the JSON field displayStatus) is returned in one of two items inside the JSON statuses array. Since there is no guarantee which item in the array will contain the displayStatus, we check both:

 vmDisplayStatus1 = try statuses{1}[displayStatus] otherwise "", 
 vmDisplayStatus2 = try statuses{2}[displayStatus] otherwise "", 
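For reference, the statuses array in the instanceView response typically carries both a provisioning state and a power state entry, which is why we check more than one position. Abbreviated, and with illustrative values, it looks something like:

"statuses": [
  { "code": "ProvisioningState/succeeded", "displayStatus": "Provisioning succeeded" },
  { "code": "PowerState/running", "displayStatus": "VM running" }
]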

Finally, we augment our final output table, with an additional column, which will include the call to FnGetVMdisplayStatus to get the related Power State for the VM, using the current table row’s ID column value:

#"VMdetail-list-with-displayStatus" = Table.AddColumn(#"VMdetail-list", "displayStatus", try each FnGetVMdisplayStatus([id]) otherwise "??")

The End Result

The end result of our modifications means that we now have an additional column in our output table, which is populated by our additional function call to the Azure API to get the VMInstance view, parse it and extract the displayStatus.

Once again, thanks to Gil Raviv for the original code with the pagination technique (available here: datachant.com/2016/06/27/cursor-based-pagination-power-query/).

NOTE: I am a Power Query novice, so if YOU (Jon 😉 ) have any tips on how to make this better, neater, faster, please leave a comment and I will test any recommendations.

let 
 iterations = 10 as number,
 // Max Number of Pages of VMs. 
 endPoint = "https://management.azure.com" as text, 
 subscription = "[your subscription]" as text, 
 apiVersion = "2019-07-01" as text, 
 vmListUrl = endPoint & "/subscriptions/" & subscription & "/providers/Microsoft.Compute/virtualMachines?api-version=" & apiVersion as text,
 
// FnGetOnePage is the function that performs an import of single page. 
// The page consists of a record with the data and the URL in the 
// fields data and next. Other Web APIs hold the data and cursor in different formats 
// but the principle is the same. 
FnGetOnePage = (url) as record => 
 let Source = Json.Document(Web.Contents(url)), 
 data = try Source[value] otherwise null, 
 next = try Source[nextLink] otherwise null, 
 res = [Data=data, Next=next] 
in
 res,

// FnGetVMdisplayStatus gets the instanceView object for the passed VM ID 
// then parses out the displayStatus from one of two possible locations. 
FnGetVMdisplayStatus = (idURI) as text => 
 let Source = Json.Document(Web.Contents(endPoint & idURI & "/instanceView?api-version=" & apiVersion)), 
 statuses = Source[statuses], 
 vmDisplayStatus1 = try statuses{1}[displayStatus] otherwise "", 
 vmDisplayStatus2 = try statuses{2}[displayStatus] otherwise "", 
 vmDisplayStatus = vmDisplayStatus1 & vmDisplayStatus2 
in
 vmDisplayStatus,

// GeneratedVMList is the function to page through the subscriptions and get the VM lists. 
GeneratedVMList = 
 List.Generate( ()=>[i=0, res = FnGetOnePage(vmListUrl)], 
  each [i] < iterations and [res][Data] <> null, 
  each [i=[i]+1, 
  res = FnGetOnePage([res][Next])], 
  each [res][Data]),

#"VMListTable" = Table.FromList(GeneratedVMList, Splitter.SplitByNothing(), null, null, ExtraValues.Error), 
#"Expanded-VMListTable-Column1" = Table.ExpandListColumn(#"VMListTable", "Column1"), 
#"VMdetail-list" = Table.ExpandRecordColumn(#"Expanded-VMListTable-Column1", "Column1", {"name","id"}), 
#"VMdetail-list-with-displayStatus" = Table.AddColumn(#"VMdetail-list", "displayStatus", try each FnGetVMdisplayStatus([id]) otherwise "??") 
in 
#"VMdetail-list-with-displayStatus"

You need to replace [your subscription] with your actual Azure subscription ID.

You can insert the code into an Excel O365 workbook by following the steps in my original post here:

List Your Azure VMs in Excel

Once you have entered the code, you will get back as a result the VM name, the id URI and the displayStatus, which reflects the power status of the VM.

That’s Amazing, What Next?

Can we enhance this code even further? Yes we can.
Right now, if you have multiple subscriptions, you will need to create multiple Power Query queries and then combine them into another query with the other queries as sources. This is quite cumbersome.

In part 3, I will show how you can provide a list of multiple subscriptions and get all the results in just one query. Much neater.

List Your Azure VMs in Excel

In this post I would like to show how you can use an Excel Power Query inside the latest version of Excel, to be able to dynamically list your Azure VMs in Excel.

You are probably wondering why you would need to do such a thing.
Well, if you want to validate that all VMs in Azure have been picked up by your SAP LaMa system, then this is a fairly easy way to perform that validation: combine the SAP LaMa list and the Azure list, then compare them using an Excel VLOOKUP, for example.
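For example (hypothetical sheet and column names), a formula like the following placed against each LaMa VM name would flag entries missing from the Azure list:

=IF(ISNA(VLOOKUP(A2,AzureVMs!A:A,1,FALSE)),"missing","ok")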

Prerequisites

  • You’re going to need the latest version of Excel (O365).
  • You will also need read access to your Azure subscription (if you can log into the Azure portal and see VMs, then that should be good enough).

Create a Workbook

Open Excel and create a new blank workbook.
Select the “Data” tab:

Click “Get Data” and select “From Other Sources”, then click “Blank Query”:

Click “Advanced Editor”:

Remove any existing text from the query box:

Secret Sauce – The Code

We modified some cool Power Query code provided by Gil Raviv (available here: datachant.com/2016/06/27/cursor-based-pagination-power-query/).

Instead of querying Facebook like in Gil’s example, we change it to the URL of our Azure subscription and the specific Compute API for Virtual Machines in Azure (API details are here: docs.microsoft.com/en-us/rest/api/compute/virtualmachines/listall).

Adjust the code below, changing:
[YOUR SUBSCRIPTION] = Your Azure subscription ID.

You may also need to increase the “iterations” parameter if you have more than say 900 VMs.

 let
 iterations = 10, // Max Number of Pages of VMs, you may need more depending on how many 100s of VMs you have. 
 url = "https://management.azure.com/subscriptions/[YOUR SUBSCRIPTION]/providers/Microsoft.Compute/virtualMachines?api-version=2019-07-01", 

 // FnGetPage is the function that performs an import of single page. 
 // The page consists of a record with the data and the URL in the 
 // fields data and next. Other Web APIs hold the data and cursor in different formats 
 // but the principle is the same. 
 FnGetOnePage = (url) as record => let Source = Json.Document(Web.Contents(url)), 
 data = try Source[value] otherwise null, 
 next = try Source[nextLink] otherwise null, 
 res = [Data=data, Next=next] in res, 
 GeneratedList = List.Generate( ()=>[i=0, res = FnGetOnePage(url)], 
 each [i] < iterations and [res][Data] <> null, each [i=[i]+1, 
 res = FnGetOnePage([res][Next])], 
 each [res][Data]),

 #"Converted to Table" = Table.FromList(GeneratedList, Splitter.SplitByNothing(), null, null, ExtraValues.Error), 
 #"Expanded Column1" = Table.ExpandListColumn(#"Converted to Table", "Column1"), 
 #"Expanded Column2" = Table.ExpandRecordColumn(#"Expanded Column1", "Column1", {"name"}, {"VM-Name"}) 
 in 
 #"Expanded Column2"

Paste your modified code into the query box:

Check there are no syntax errors and click “Done”:

Click “Edit Credentials”:

Select “Organizational account” and select the first entry that contains your subscription ID, then click “Sign in”:

Sign in with your username and password for the Azure Portal.
Click “Connect” once signed in.

You will then see your VMs listed in the query output:

Click “Close & Load”:

The query is now embedded into a new worksheet:

That’s it.
For now this is a basic VM listing.
You may be wanting to extract more information about the VMs in the list, maybe the powerState, or the resourceGroup.
I’ll be showing you how to do this in the second post here.