SAP HANA Archives » Page 3 of 6 » Musings of an IT Implementor

HowTo: Install SAP HANA 2.0 in a VM in less than 30 minutes – Part #1

For the original post back in 2014 we used SAP HANA 1.0 sps07 and installed into a Virtual Machine running SUSE Linux 11.
Things have moved on since 2014: we have now seen the arrival of HANA 2.0, with its multi-tenant database feature and new HANA Cockpit, and SUSE Linux Enterprise 12, with systemd replacing the old SysV init scripts.
I decided it was time to update the post…
Scenario: You want to prototype something for a new HANA 2.0 database. We can use the power of a virtual machine to get a HANA 2.0 database up and running in less than 30 minutes.
Well, it was supposed to be 30 minutes, and it sure can be 30 minutes, providing you have the right (fast) equipment to hand.
Remember, this is not “here’s the standard install process” hand-holding – this is “let’s get it installed and use it!”
Here’s how…

What you’ll need:
– SAP HANA In Memory DB 2.0 install media from SAP Software Download Centre.
This can be the Platform Edition (for native HANA systems) or the Enterprise Edition (for S/4HANA or BW/4HANA or any other x/4HANA).
I also cheated a little in my process: I downloaded the “Installation / Patch” package for the HANA database, which contains the entire latest code line and installer but is much smaller.
In my example I use IMDB_SERVER20_012_3-80002031.SAR which is ~3.5GB in size.

– The SUSE Linux for SAP v12 sp02 or sp03 (recommended) install media (ISO).
This is free to download from https://download.suse.com (you will need to register an account with SUSE, but you don’t need a license). It is ~3.6GB in size and you only need the first DVD (DVD1).

– A valid license for the HANA database (platform edition or enterprise edition).

– SAP HANA Studio installed on a PC which can access the virtual HANA server you’re going to create (the Studio install media is contained within the full HANA install media DVD, or you can download it separately from SAP Software Download Centre).
In my example, I’m using IMC_STUDIO2_212_3-80000323.SAR (should be the same revision as the database) which is 734MB in size.
NOTE: The later revisions of HANA come with the HANA Cockpit built-in (web based), so you may not need HANA Studio; it depends on what you want to do with it. See SAP note 2185556 for more details.

– A host machine to host the virtual machine. You need at least 20GB of RAM, although if you configure your pagefile (in Windows) on SSD or flash, you could get away with 16GB (I did !!!).
– SAP notes access. Specifically to read/check SAP notes 1984787, 2205917 & 1944799.
– A downloaded version of SAPCAR.exe on your PC (if, like me, you will be using the VMWare shared folders option to present your downloaded media to the guest O/S).

What we’re going to do:
– We’ll create a basic SUSE Linux for SAP 12 SP3 virtual machine. You can use any host OS; I’m using Windows 7 64-bit and VMWare Workstation Player v14.
– Because most people are using VMs to maximise infrastructure, we’ll go through a couple of steps to really reduce the O/S memory footprint and for efficiency we use SSH and the text mode installer for HANA. We get this whole thing running in less than 16GB of RAM in the end.
– We’ll install a basic HANA 2.0 database (in multi-tenant mode – this is the future). Initially we only get the SYSTEM DB, then we create a new tenant DB afterwards.

START THE CLOCK!

Create your basic VM for SUSE Enterprise Linux (I’m using SUSE Linux 12 for SAP SP3).
It will need the following resources:
– More than 16GB of RAM (initially 24GB for installation) on the physical host machine.
– 8GB of disk for the O/S.
– 50GB of disk for the basic HANA DB with nothing in it, plus the installed software.
– 20GB of disk on the physical host for swapping (if you don’t have 24GB of RAM).
– 2 CPUs if you can spare the cores.
– A hostname and fully qualified domain name.
– Some form of networking (use “Bridged” if you need to access this across the network, I will be using “Host-Only”).
Let’s create the VM and set the CDROM to point to the SUSE Linux 12 SP3 install DVD ISO file:

We choose to do the install later to avoid the VMWare “EasyInstall” feature:

Set the initial hard disk to have 8GB and store it in one big file (it’s up to you really):

Now customise the hardware:
Set the RAM to 24GB or more (you really need 24GB of RAM, but I have only 16GB and will be ready for some serious swapping). After installation, at a minimum the VM should have 18GB of RAM for day-to-day running:

Give the VM at least 2 cores:

Set the CD/DVD to use the SUSE Linux installation ISO you downloaded:

Use bridged networking if you need to access over the network, but only if you have DHCP enabled or you’re a network guru. I’m using “Host-only”:

I also removed the Sound Card and Printer.
Summary:

Start the VM.

We’re off.
The SUSE install took 12.5 minutes in my testing on a core i5 (unfortunately only 3rd gen 🙁 )
That is it for Part #1 of this guide.
>> Continue on to Part #2 for the completion of our HANA 2.0 install.

Encrypting HANA Data Volumes

Out of the box, the HANA database is not encrypted.
For those businesses that demand encryption, from within HANA Studio you can activate the encryption of the HANA data volumes.
The HANA documentation supplied with SPS7 recommends encrypting the data volumes as part of the HANA installation process. This is due to the “copy-on-write” method used to persist data on disk: an already used database may still have unencrypted data pages in it, even after encryption is enabled.

You should note that the term “data volumes” means the location of the database files for the indexserver and the statistics server, which is usually something like “/hana/data/<SID>/mnt0000<n>/hdb0000<n>/*.dat”.

Tip: You can check which values of “<n>” will be used by opening the “Landscape Overview” tab and then the “Volumes” tab within it.

Prerequisites:
– SAP recommend changing the SSFS encryption key after installation of the HANA system. This key is used to store the data volume encryption key and others.
– Disk space of the data volume areas (we discuss this below anyway).
– Take a cold backup of the HANA DB area and config files.
– Access to the HANA Studio (you can do this with SQL access and hdbsql, but I don’t show this).
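
For the disk space prerequisite, it helps to know how large the data volume files are before you start. Here is a minimal sketch in Python, assuming the directory layout quoted earlier (/hana/data/<SID>/mnt0000<n>/hdb0000<n>/*.dat). The function names data_volume_usage and report are my own, not part of any HANA tooling, and I am assuming the file size reported by the O/S corresponds to the allocated size rather than the used size shown in HANA Studio.

```python
import glob
import os


def data_volume_usage(data_root):
    """Return {path: size_in_bytes} for every *.dat file under data_root.

    data_root is assumed to look like /hana/data/<SID>, with the data files
    sitting in mnt0000<n>/hdb0000<n>/ as described in the post.
    """
    pattern = os.path.join(data_root, "mnt*", "hdb*", "*.dat")
    return {path: os.path.getsize(path) for path in sorted(glob.glob(pattern))}


def report(data_root):
    """Print each data file with its size in MB and return the total in bytes."""
    sizes = data_volume_usage(data_root)
    for path, size in sizes.items():
        print(f"{size / 1024 ** 2:10.1f} MB  {path}")
    total = sum(sizes.values())
    print(f"{total / 1024 ** 2:10.1f} MB  total")
    return total
```

Run as the <sid>adm user, a call such as report("/hana/data/H10") would list the files for a system with the SID H10 (H10 is only an example SID here). Compare the total against the free space on the file system before ticking the encryption box.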

Let’s begin:

From within HANA Studio, open the “Security” tab and navigate to the “Data Volume Encryption” tab.
You will notice that encryption is not already enabled.
Tick the “Activate encryption of data volumes” tick box, then click the “Execute” button:

HANA volume encryption activation

The encryption process will start immediately (“Encryption Pending” then “Encryption Running”):

HANA volume encryption pending

The status is updated when each server process finishes:

HANA volume encryption running

Once the encryption process has completed successfully, the tick box is enabled again so that you can decrypt the volumes should you wish to reverse the process:

HANA volume encryption encrypted

My small (empty) HANA DB at SPS7 took approximately 10 minutes to encrypt.
The data volume sizes increased as part of the encryption process.
The indexserver data volume went from 324MB used to 341MB used:

HANA volume encryption indexserver size

Of more importance is the dramatic increase in the allocated size (Total Size).
It’s gone from 513MB to 896MB! You will need to be aware of the disk space you may need before enabling encryption.

The statistics server is smaller and went from 80MB used to 101MB used:

HANA volume encryption statistics server size

Again, notice that we’ve got an increased allocation from 320MB to 400MB.
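
To put those figures in proportion, here is a small worked calculation in Python. It uses only the sizes quoted above from my small, empty test system, so treat the percentages as one data point rather than a rule; the helper name growth_percent is just for illustration.

```python
def growth_percent(before_mb, after_mb):
    """Percentage increase from before_mb to after_mb."""
    return (after_mb - before_mb) / before_mb * 100.0


# Sizes in MB (before, after) quoted in the post for the SPS7 test system.
volumes = {
    "indexserver": {"used": (324, 341), "total": (513, 896)},
    "statistics server": {"used": (80, 101), "total": (320, 400)},
}

for name, sizes in volumes.items():
    for kind, (before, after) in sizes.items():
        print(f"{name:18s} {kind:5s} {before:4d} -> {after:4d} MB "
              f"(+{growth_percent(before, after):.1f}%)")

# Combined change in allocated size across both data volumes.
total_before = sum(v["total"][0] for v in volumes.values())
total_after = sum(v["total"][1] for v in volumes.values())
print(f"combined allocation +{growth_percent(total_before, total_after):.1f}%")
```

The allocated size grows far more than the used size: roughly 75% for the indexserver and 25% for the statistics server, or about 56% across the two volumes combined.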

Validate the encryption status using an SQL console:

SELECT * FROM M_PERSISTENCE_ENCRYPTION_STATUS;

What are the implications for encryption?

Well, the data volumes have been encrypted, but we’ve not seen anything about encrypting the logs or the backups.
Also, to preserve the performance of the HANA system in memory, the data is decrypted when read from the disk into memory.

Is encrypting the HANA DB transaction logs feasible? Probably not. There are various whitepapers detailing the issues of encrypting data stored on SSDs. Since the majority of high-performance appliance resellers use SSDs for the HANA DB transaction logs, software-layer encryption on SSDs is not worth the effort and would probably reduce the performance of the database. Instead, SSD hardware-layer encryption may prove usable, but this is only worthwhile if you think that the SSD could be physically stolen. SSD hardware encryption doesn’t prevent an intruder at the O/S level from seeing the data.

Is encrypting the HANA DB backups (data and logs) feasible? Yes, this is definitely something that should be employed. There are 3rd party backup tools that will take a copy of the HANA backup files, then encrypt and store them elsewhere. But these tools may not support secure wipe, so the disk location of the backup files would potentially still contain the data. Using backup tools certified for SAP with the BackInt interface would be better, since the data doesn’t touch the disks; it’s piped straight to the backup device via the “network”.

There is possibly some slight performance impact from encrypting the data volumes. The data files are written to at least every 5 minutes. They are also read from during HANA start up. These I/O operations will run at the same speed as before, but additional software routines are involved to perform the encryption and decryption, which will use slightly more memory and add a slight delay each time data is read from or written to the disk.

Be aware of bugs in the backup/restore process.
I’ve seen mention of bugs in other software products where an encrypted backup is void due to a software bug. Ensure that you test your solution thoroughly.

SAP recommend that you change the Page Encryption Key (used to encrypt the data volumes) regularly as per your organisation’s standards. This may yet again increase the allocated size.

Summary:
– Data volume encryption is easy and fast in HANA. There’s not really any reason to not implement it. Beware of implementing in an already populated database and ensure you change the keys regularly.
– Backups and transaction log volumes are not encrypted and for the logs there’s a good reason why you may not want to.
– Performance in certain scenarios could be affected slightly.
– You should attempt to implement a supported Backint capable backup product with encryption, since the backups with this method don’t touch the unencrypted disks.
– Encryption can be performed and validated at the SQL command level.
– Be aware that data volume encryption in HANA will require more disk space to be allocated to the data volumes. In my test the allocated size grew by about 75% for the indexserver (513MB to 896MB) and 25% for the statistics server (320MB to 400MB), roughly 56% overall.

HANA DB Re-initialisation Without Reinstall

Scenario: You have a small test system or PoC system and you want to revert or recreate the HANA DB like you’ve just opened the box and installed it from new, but without the hassle of the reinstall.

This is completely possible using the hdbnsutil command line program.
With HANA shut down, connect to the HANA server via SSH as the <sid>adm Linux user, then run the hdbnsutil command line program as follows:

hana01:/usr/sap/H10/HDB10/exe> hdbnsutil -initTopology
checking for inactive nameserver …
nameserver hana01:31001 not responding.
creating persistence …
run as transaction master writing initial topology…
writing initial license: status check = 2
done 

As you will see, it recreates the persistence layer (database) and also re-creates the license.
You will need to reinstall your HANA system license after the re-initialisation process is complete.
On a slow system, the process took approximately 2 minutes.

Due to the size of my system, I am unable to tell you if this process destroys any specific configuration. From what I can tell, the existing global.ini, nameserver.ini and indexserver.ini are kept.

You should also note that the SYSTEM user password is reset to its default value of “master”.
Plus, if you have enabled encryption, the reinitialised data volumes will be re-encrypted unless you untick the checkbox on the Data Volume Encryption tab inside the Security tab prior to reinitialisation.

SAP HANA Backup Allocation Failed

During a HANA backup, you get an “Allocation Failed” error.
This is caused by a lack of memory. If possible, increase the memory available to HANA by increasing the free memory at the O/S level (e.g. shutting down other HANA instances), or increase the global allocation limit.

HANA OOM Error Tip #1 – Partition Tables Correctly

If your HANA system is regularly experiencing OOM (Out Of Memory) errors, then there are a number of things that you can do to try and reduce memory consumption.

Tip #1: Partition Large Tables Correctly
If there are large Column Store tables in your HANA system, you should partition them.
Whilst this is an absolute must in a HANA scale-out scenario (for enabling multi-node parallelism), it might not be so obvious that it can also help in a single node HANA system.
Partitioning a column table means that only the required partitions of the table are loaded into memory when accessed, you would think.

Partitioning a large table into smaller chunks will therefore help to reduce the memory usage of the table during SQL queries and also during updates.
During updates, each partition gets its own delta cache area.

Choosing how to partition a table is slightly more difficult and will depend on whether the table is a custom table, SAP standard table or other.
Plus, you will need to know which queries or updates are executed against the table, and how. A period of monitoring is suggested, to enable you to collect the information required to make a decision.

One thing you should try to do is partition the table using the most logical, selective columns.
Read on for a simple example with a twist!

A simple example: a single-node HANA system has 1 large column table, T1.
The table is partitioned into ranges based on the date column INVOICEDATE:

CREATE COLUMN TABLE "DARRYL"."T1" ("INVOICEREF" VARCHAR(1) NOT NULL ,
     "INVOICEDATE" DAYDATE CS_DAYDATE NOT NULL ) UNLOAD PRIORITY 5 AUTO MERGE WITH PARAMETERS ('PARTITION_SPEC' = 'RANGE year(INVOICEDATE) 2000-2001,2001-2002,2002-2003,*')
;
CREATE UNIQUE INDEX "I1" ON "DARRYL"."T1" ( "INVOICEREF" ASC ) NONLEAF PARTIAL KEY LENGTH 1;

As you can see, I’ve created 3 partitions by year: 2000 to 2001, 2001 to 2002 and 2002 to 2003.
This will actually create 4 partitions: year 2000, year 2001, year 2002 and year <OTHER>.

HANA Table Distribution

Insert 5 records into the table:

insert into darryl.t1 (INVOICEREF,INVOICEDATE) values('1','2000-01-01');
insert into darryl.t1 (INVOICEREF,INVOICEDATE) values('2','2001-01-01');
insert into darryl.t1 (INVOICEREF,INVOICEDATE) values('3','2002-01-01');
insert into darryl.t1 (INVOICEREF,INVOICEDATE) values('4','2003-01-01');
insert into darryl.t1 (INVOICEREF,INVOICEDATE) values('5','2004-01-01');

Inside the Table Distribution tab, you will now see that the records have been inserted according to their values into the respective partitions (see Raw Record Count field on the right):

HANA Table Distribution

The last two records for year 2004 and 2003 are in the fourth partition.
You can also see that each partition has a Delta Size, and that the Delta Size for the fourth partition, which holds the most records, is larger than that of the other partitions.
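
The way the five rows land in the four partitions can be sketched in a few lines of Python. This is only an illustration of the range logic, not how HANA implements it. I am assuming each range includes its lower bound and excludes its upper bound, which matches where the test rows ended up; the function name partition_for is hypothetical.

```python
from datetime import date

# Range bounds taken from the PARTITION_SPEC in the CREATE statement.
# Assumption: lower bound inclusive, upper bound exclusive.
RANGES = [(2000, 2001), (2001, 2002), (2002, 2003)]


def partition_for(invoice_date):
    """Return the 1-based partition number; 4 is the '*' (others) partition."""
    year = invoice_date.year
    for number, (low, high) in enumerate(RANGES, start=1):
        if low <= year < high:
            return number
    return len(RANGES) + 1


# The five test rows inserted above: INVOICEREF -> INVOICEDATE.
rows = {
    "1": date(2000, 1, 1),
    "2": date(2001, 1, 1),
    "3": date(2002, 1, 1),
    "4": date(2003, 1, 1),
    "5": date(2004, 1, 1),
}

for ref, invoice_date in rows.items():
    print(ref, invoice_date.isoformat(), "-> partition", partition_for(invoice_date))
```

Rows 1 to 3 each get their own partition, while rows 4 and 5 (2003 and 2004) both fall through to the fourth, catch-all partition.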
Unload the table from memory:

HANA Unload Table from memory

Refreshing the Table Distribution tab now shows the table partitions to have zero size in memory:

HANA Table Distribution

Now select the records for the years 2003 and 2004 only:

select * from darryl.t1 where invoicedate in ('2004-01-01','2003-01-01')

Refreshing the Table Distribution tab now shows the table to have non-zero size in memory for ALL partitions!

HANA Table Distribution

All of the records from all of the partitions appear to be loaded!
What went wrong?
Well, it’s simple, we didn’t create an index on the column INVOICEDATE.
This forced HANA to scan through the entire table to access the required records, meaning that it needed to load them all into memory.

Let’s create an index on INVOICEDATE (named “I2”, since “I1” is already used by the index on INVOICEREF):

CREATE UNIQUE INDEX "I2" ON "DARRYL"."T1" ( "INVOICEDATE" ASC ) NONLEAF PARTIAL KEY LENGTH 1;

Unload the table from memory:

HANA Unload Table from memory

Refreshing the Table Distribution tab now shows the tables to have zero size in memory:

HANA Table Distribution

Now re-execute the SELECT statement:

select * from darryl.t1 where invoicedate in ('2004-01-01','2003-01-01')

Once again, on the Table Distribution tab, we can see that it has accessed all partitions, AGAIN!:

HANA Table Distribution

What went wrong this time? Well, HANA doesn’t yet have any statistics on the table data, so it simply ignored the index.
If you now unload the table from memory once again (we haven’t done anything else):

HANA Unload Table from memory

Now re-execute the SELECT statement:

select * from darryl.t1 where invoicedate in ('2004-01-01','2003-01-01')

Look at the Table Distribution tab:

HANA Table Distribution

You can see that HANA has now only accessed the final partition of the table. The other partitions have not been loaded into memory.
At first I thought this behaviour might be due to statistics, so I tried removing them from the table T1 (drop statistics on T1;). Then I retried the process of unloading and re-running the query. This had no effect: HANA still went straight to the fourth partition.
This left me with one other option, the Plan Cache.

I cleared the Plan Cache using:

ALTER SYSTEM CLEAR SQL PLAN CACHE

I then re-ran the test by unloading the table from memory:

HANA Unload Table from memory

Re-executing the SQL SELECT:

select * from darryl.t1 where invoicedate in ('2004-01-01','2003-01-01')

Look at the Table Distribution tab:

HANA Table Distribution

Bingo!
The Plan Cache was storing some form of execution plan statistics that meant that it was accessing the fourth partition straight away.
Each time the table is unloaded, the statistics from the existing Plan Cache remain and are re-used upon next execution of the query, which means HANA is able to go straight to the fourth partition.

Summary:
Partitioning is a great way of parallelising access to a table in HANA.
It also serves to help reduce memory by only loading specific partitions into memory when they are required.
In order to effectively use partitioning, you need to partition on an indexed column.
The initial access of a newly partitioned table with a new index does not deliver the benefits of partitioning; these only appear from the second access of the table, due to the Plan Cache. A method/process of pre-loading the execution plan/statistics into the cache is required.