This blog contains experience gained over the years of implementing (and de-implementing) large scale IT applications/software.

Preventing File System Corruption from Halting Boot Up of SLES in Azure

When you create a Linux VM in Azure, you don’t get to know the “root” user password.
By default, if a Linux VM detects journaled file system corruption at boot, it will drop into recovery mode, which requires the root password before you can fix anything.
Without the root password, the only other way to fix the issue is to copy the O/S disk, mount it on another VM and repair it there.
If you don’t have Azure Boot Diagnostics enabled, you might not even know what the problem is! The VM will just appear to not boot.
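
If Boot Diagnostics is not already enabled, it can be switched on from the portal or with the Azure CLI. A minimal sketch, assuming placeholder resource group and VM names (older CLI versions may also require a --storage account URI):

az vm boot-diagnostics enable --resource-group myResourceGroup --name myVM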

In this post I show a simple way to prevent a Linux VM (I use SLES) from failing to boot due to file system corruption. Our example is an XFS file system, just like in my previous post.
XFS is journaled and checks its integrity on mounting. If there are problems with the file system, Linux will fail to mount it, which causes the O/S boot up process to stall.
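
If the corruption has already happened, the usual fix for XFS is to run xfs_repair against the underlying device while the file system is unmounted (or from the other VM the disk was attached to). A rough sketch, using the device and mount point from the example later in this post:

umount /BIGSTRIPEDDISK
xfs_repair /dev/volTMP/lvTMP1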

In a production system, you can imagine the scenario where a simple restart of a VM causes an hour long downtime (or longer).

NOTE: In my scenario there is no Linux device encryption, which could make the job of repair even harder, and makes it all the more important to prevent boot failure.

Preventing Boot Failure

To prevent our corrupt XFS file system from halting boot, we just need to add a single option to the mount options in file /etc/fstab: the "nofail" option.

We could just go and write this straight out to the fstab file and expect it to work.
However, we can test it first to make sure that it is:

  • supported on your version/distribution of Linux.
  • supported for your file system type (mine is XFS).

We could use the "-f" (fake) option of the "mount" command, but in testing I could not get this to actually show an error when it is passed an invalid mount option.
Instead, let’s actually mount the file system to check if “nofail” is accepted.

As the root user (or with sudo) get the current mount options for your file system (the one you will be applying “nofail” to):

grep BIG /etc/fstab

/dev/volTMP/lvTMP1 /BIGSTRIPEDDISK xfs defaults 0 0

I can see that my /BIGSTRIPEDDISK is mounted from a volume group and has the “defaults” mount options. Yours may be different.
We can now create a new mount point location and temporarily mount the file system, adding the "nofail" option, to test that it is accepted (adjust the mount options to match your current fstab settings):

mkdir /mnt/tempmount
mount -o defaults,nofail /dev/volTMP/lvTMP1 /mnt/tempmount

If you got an error or warning, then the file system type or your Linux distribution does not support the use of “nofail”. Maybe check the man page for an equivalent option (“man mount”).

If you didn’t get an error, then you know that you can successfully apply the “nofail” option to the end of the options column (column number 4) in the fstab:

vi /etc/fstab
...
/dev/volTMP/lvTMP1 /BIGSTRIPEDDISK xfs defaults,nofail 0 0
...
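
The temporary test mount from earlier is no longer needed, so unmount it and remove the directory:

umount /mnt/tempmount
rmdir /mnt/tempmount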

Once applied, it is recommended that you always verify boot related changes by taking some downtime to restart the machine. There is nothing worse than applying a change and not testing it.
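
If you want a quick sanity check of the new fstab entry before the restart, recent versions of util-linux include a verify mode in findmnt (availability depends on your distribution); it parses /etc/fstab and warns about problems such as missing mount points or unknown file system types:

findmnt --verify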

With "nofail" in place, the next time the O/S boots and tries to mount the file system, if there is an integrity issue or even if the device is missing, the O/S will move forward in the boot process and ignore the error.
There is obviously a small consequence of this: file systems may not be mounted once the boot has completed.
It is possible to mitigate this problem with monitoring (scripts that monitor file system free space, for example) or other checks after boot, like the rough sketch after this paragraph.
Of course, there is also a second option to all of this: set the root user password on new VMs and store it in your secure password location. You can use a 16 character random string like those generated by a password manager.
You will also need to ensure that you can use the Azure Serial Console to get to the VM command line, because in some configurations security practices can indirectly prevent this.
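
A very rough sketch of such a post-boot check (the mount point and the alerting method are placeholders, adjust them to your own monitoring setup):

#!/bin/bash
# Alert if an expected file system is not mounted after boot.
for mp in /BIGSTRIPEDDISK; do
  if ! mountpoint -q "$mp"; then
    echo "WARNING: $mp is not mounted" | logger -t fsmount-check
  fi
done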

SUSE Linux 12 – Kernel 4.4.73 – Boot Hang – BTRFS Issue

I had a VMWare guest running SUSE Linux 12 SP3 64bit (kernel 4.4.73).
One day after a power outage, the VM failed to boot.
It would arrive at the SUSE Linux “lizard” splash screen and then just hang.

I noticed prior to this error that the SUSE 12 operating system creates its root partition inside a logical volume called "/dev/system/root", which is then formatted as a BTRFS filesystem.

At this point I decided that I must have a corrupt disk block.
I launched the VM with the CDROM attached and pointing at the SUSE 12 installation ISO file.
While the VM starts, you need to press F2 to get to the "BIOS" boot options and make the CDROM bootable before the hard disks.

Once the installation cdrom was booting, I selected “Recovery” from the SUSE menu.
This drops you into a recovery session with access to the BTRFS filesystem check tools.

Following a fair amount of Google action, I discovered I could run a “check” of the BTRFS file system (much like the old fsck on EXT file systems).

Since I already knew the device name for the root file system, things were pretty easy:

# btrfs check /dev/system/root
Checking filesystem on /dev/system/root

found 5274484736 bytes used err is 0

Looks like the command worked, but it is showing no errors.
So I tried to mount the partition:

# mkdir /old_root
# mount -t btrfs /dev/system/root /old_root

At this point the whole VM hung again!
I had to restart the whole process.
So there was definitely an issue with the BTRFS filesystem on the root partition.

Starting the VM again and re-entering the recovery mode of SUSE, I decided to try and mount the partition in recovery mode:

# mkdir /old_root
# mount -t btrfs /dev/system/root /old_root -o ro,recovery

It worked!
No problems.  Weird.
So I unmounted and tried to re-mount in read-write mode again:

# umount /old_root
# mount -t btrfs /dev/system/root /old_root

BAM! The VM hung again.

Starting the VM again and re-entering the recovery mode of SUSE, I decided to just run the btrfs check command with the "--repair" option (although the documentation says this should be a last resort).

# btrfs check --repair /dev/system/root
enabling repair mode
Checking filesystem on /dev/system/root
UUID: a09b7c3c-9d33-4195-af6e-9519fe550694
checking extents
Fixed 0 roots.
checking free space cache
cache and super generation don’t match, space cache will be invalidated
checking fs roots
checking csums
checking root refs
found 5274484736 bytes used err is 0
total csum bytes: 4909484
total tree bytes: 236126208
total fs tree bytes: 215973888
total extent tree bytes: 13647872
btree space waste bytes: 38681887
file data blocks allocated: 5186543616

Maybe this cache problem that it fixed is the issue.

# mkdir /old_root
# mount -t btrfs /dev/system/root /old_root

Yay!
So, weird problem fixed.
Maybe this is a kernel level issue and later kernels have a patch; I'm not sure. It's not my primary concern to fix this as I don't plan on having many power outages, but if this was my production system then I might be more concerned and motivated.
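
If the stale space cache really was the culprit, then on a similar occasion a less drastic first step might be to mount the file system once with the btrfs "clear_cache" option, which rebuilds the free space cache, before reaching for btrfs check --repair:

# mount -t btrfs -o clear_cache /dev/system/root /old_root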

Corrupt OEL 5.7 ISO Prevents Boot into Installer

I ran into this little problem whilst trying to install OEL 5.7 into a Hyper-V environment.

"Kernel panic - not syncing: VFS: Unable to mount root fs on unknown-block(xxx,xx)".

I tried all manner of different parameters with "linux xxxxx", as recommended by the installer.
None of these worked.
It looked like the Hyper-V drivers weren’t working at first.

So I re-downloaded the OEL 5.7 ISO and re-attached it to the VM's CD-ROM drive.
Then it worked!
It must have been a corrupt OEL 5.7 ISO that prevented booting into the installer/setup under Hyper-V.
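
A lesson learned: verify the downloaded ISO against the checksum published on the download page before attaching it to the VM. The filename below is just a placeholder; use whichever checksum type (MD5/SHA-1/SHA-256) the download page actually publishes:

sha256sum OEL-5.7-x86_64-dvd.iso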