
Oct 22 2015

Acropolis Virtual Machine High Availability – Part II

This is the second part of the Acropolis Virtual Machine High Availability (VMHA) blog series; you can find the first part here. This part covers how it is determined that a virtual machine needs to be restarted on another Acropolis Hypervisor (AHV) host in the Nutanix Acropolis Cluster. The post walks through what happens when an AHV Host fails and when an AHV Host is network partitioned.

I’ll include the picture below in this blog post as well (it was part of the initial VMHA blog post, which you can find here) since it’s critical for understanding the naming convention used for an Acropolis Cluster with AHV.

[Figure: naming conventions for an Acropolis Cluster with AHV]

To understand what happens during an AHV Host failure, or when an AHV Host is network partitioned, we must first understand how the health checks that determine whether an AHV Host is up or down within the Nutanix AHV Cluster are conducted.

Acropolis, as described in this blog post, is the name of a framework that includes several features, and the process delivering this framework is also called Acropolis.

The Acropolis logical construct uses a master/slave setup where one Controller Virtual Machine (CVM) is the master and the rest of the CVMs in the Nutanix AHV Cluster are slaves.

A critical component involved in VMHA is the CVM Stargate process, which handles IO in the Distributed Storage Fabric (DSF). Read more about the Stargate process here.

Failure detection

There is ongoing communication, once every second, between the Acropolis Master and each AHV Host’s libvirtd process (which you can read more about here). The failover process is initiated when this communication fails and is not reestablished within 4 seconds.

[Figure: Acropolis Master health checks against each AHV Host’s libvirtd process]
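
To make the detection logic concrete, below is a minimal Python sketch of such a polling loop. It is purely illustrative: the Host class, its reachable flag, and the start_failover callback are assumptions made for the example, not Nutanix internals.

```python
import time

HEALTH_CHECK_INTERVAL = 1.0  # the Master checks each host's libvirtd once per second
RECONNECT_WINDOW = 4.0       # failover starts if contact is not reestablished in 4 s

class Host:
    """Stand-in for an AHV host; 'reachable' simulates whether libvirtd responds."""
    def __init__(self, name):
        self.name = name
        self.reachable = True

def monitor(host, start_failover):
    """Poll libvirtd every second; trigger failover after 4 s without contact."""
    last_ok = time.monotonic()
    while True:
        if host.reachable:
            last_ok = time.monotonic()        # a healthy reply resets the window
        elif time.monotonic() - last_ok >= RECONNECT_WINDOW:
            start_failover(host)              # hand off to the failover sequence
            return
        time.sleep(HEALTH_CHECK_INTERVAL)
```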

Failure Scenario I – Acropolis Master is alive and a remote AHV host fails

The time it takes before the VMs are restarted is presented as the worst case scenario. The 20 second wait between T24 and T44 (for the Stargate acknowledgements) usually takes only a few seconds.

Time in sec   Event
Normal        The Acropolis Master successfully completes the health checks against every remote AHV Host’s libvirtd process.
T0            The Acropolis Master loses network connectivity to the remote AHV Host’s libvirtd process.
T4            The Acropolis Master starts a 20 second timeout.
T24           The Acropolis Master instructs all CVM Stargate processes to block IO from the AHV Host it cannot reach, then waits for all remote CVM Stargate processes to acknowledge the IO block.
T44           All VMs are restarted. The Acropolis Master is responsible for distributing the VM start requests to the available AHV Hosts.
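
The key property of this sequence is ordering: the VMs are restarted only after every Stargate has acknowledged the IO block. Below is a minimal, hypothetical Python sketch of that ordering; the Stargate class, the block_io_from method, and the round-robin placement are illustrative assumptions, not the actual Nutanix implementation.

```python
class Stargate:
    """Stand-in for a CVM Stargate process (illustrative, not a Nutanix API)."""
    def block_io_from(self, host):
        print(f"Stargate: blocking IO from {host}")
        return True                                      # acknowledge the IO fence

def fail_over(stargates, failed_host, vms, healthy_hosts):
    # T24: fence the failed host's storage access on every CVM.
    acks = [sg.block_io_from(failed_host) for sg in stargates]
    if not all(acks):
        raise RuntimeError("not every Stargate acknowledged the IO block")
    # T44: only after all acknowledgements is it safe to restart the VMs.
    for i, vm in enumerate(vms):
        target = healthy_hosts[i % len(healthy_hosts)]   # naive placement for the example
        print(f"restarting {vm} on {target}")

fail_over([Stargate(), Stargate()], "ahv-3", ["vm-a", "vm-b"], ["ahv-1", "ahv-2"])
```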

Failure Scenario II – Acropolis Master is alive and a remote AHV host is network partitioned

The major difference between Failure Scenario I and Failure Scenario II is that the network partitioned AHV Host(s) can actually still be running its VMs.

However, since we deny the network partitioned AHV Host access to the VMs’ virtual disks, the VMs on that host will fail 45 seconds after the first IO failure. This means we will not end up with multiple copies of the same VM running.

The time it takes before the VMs are restarted is presented as the worst case scenario. The 20 second wait between T24 and T44 (for the Stargate acknowledgements) usually takes only a few seconds.

Time in sec   Event
Normal        The Acropolis Master successfully completes the health checks against every remote AHV Host’s libvirtd process.
T0            The Acropolis Master loses network connectivity to the remote AHV Host’s libvirtd process.
T4            The Acropolis Master starts a 20 second timeout.
T24           The Acropolis Master instructs all CVM Stargate processes to block IO from the AHV Host it cannot reach, then waits for all remote CVM Stargate processes to acknowledge the IO block. Since all IOs are blocked, the VMs on the network partitioned AHV Host cannot make any progress, so it is safe to continue. Those VMs will commit suicide 45 seconds after their first failed IO request.
T44           All VMs are restarted. The Acropolis Master is responsible for distributing the VM start requests to the available AHV Hosts.
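
This 45 second rule is what rules out a split-brain situation. Below is a minimal Python sketch of how a per-VM IO watchdog on the partitioned host could behave; the Vm class and its io_ok flag are assumptions made for the example, not Nutanix internals.

```python
import time

IO_FAILURE_GRACE = 45.0  # a VM is killed 45 s after its first failed IO

class Vm:
    """Stand-in for a VM on the partitioned host; 'io_ok' flips to False
    once the remote Stargate processes start rejecting its disk IO."""
    def __init__(self, name):
        self.name = name
        self.io_ok = True

    def kill(self):
        print(f"{self.name}: no IO progress for {IO_FAILURE_GRACE:.0f} s, powering off")

def io_watchdog(vm, clock=time.monotonic):
    first_failure = None
    while True:
        if vm.io_ok:
            first_failure = None                 # successful IO resets the timer
        else:
            first_failure = first_failure or clock()
            if clock() - first_failure >= IO_FAILURE_GRACE:
                vm.kill()                        # ensures no duplicate VM survives
                return
        time.sleep(1.0)
```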

Failure Scenario III – Acropolis Master fails

The last scenario I want to describe is what happens when the Acropolis Master fails. The failure can be caused by any of the following:

  • Failure of the CVM where the Acropolis Master runs.
  • Failure of the AHV Host where the CVM running the Acropolis Master lives.
  • Network partition of the AHV Host where the CVM running the Acropolis Master lives.

The time it takes before the VMs are restarted is presented as the worst case scenario. The 20 second wait between T40 and T60 (for the Stargate acknowledgements) usually takes only a few seconds.

Time in sec   Event
Normal        The Acropolis Master successfully completes the health checks against every remote AHV Host’s libvirtd process.
T0            The Acropolis Master fails.
T20           A new Acropolis Master is elected among the CVMs on the available AHV Hosts.
T40           The new Acropolis Master instructs all CVM Stargate processes to block IO from the AHV Host where the original Acropolis Master ran, then waits for all remote CVM Stargate processes to acknowledge the IO block.
T60           All VMs are restarted. The new Acropolis Master is responsible for distributing the VM start requests to the available AHV Hosts.
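
Since every surviving CVM must agree on the same new Master, the election needs a deterministic outcome. The Python sketch below illustrates the idea with the simplest possible rule, picking the lowest CVM name; how the real election is implemented inside the cluster is outside the scope of this sketch, so treat it purely as an illustration.

```python
ELECTION_TIMEOUT = 20.0  # T0 -> T20: time before a new Master takes over

def elect_master(cvms, failed_master):
    """Deterministic election sketch: every survivor computes the same result."""
    survivors = [cvm for cvm in cvms if cvm != failed_master]
    return min(survivors)                        # e.g. lowest CVM name wins

cvms = ["cvm-1", "cvm-2", "cvm-3"]
print(elect_master(cvms, "cvm-1"))               # -> cvm-2 becomes the new Master
```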

Conclusion

Without enabling any feature at all, your Acropolis Cluster with AHV is protected against an AHV Host or CVM failure. The only configuration decision required is whether you need a guarantee that all VMs can be restarted in case of an AHV Host failure.

The time it takes before VMHA starts failed VMs, because of an AHV Host failure or an AHV Host being network partitioned, varies, but the first VMs are usually powered on after a maximum of 44 – 64 seconds. The number of power-on tasks per AHV Host and the maximum number of parallel tasks in an Acropolis Cluster with AHV also come into play. That will be covered in another blog post.
