Chapter 1 - Introduction to vSAN 6.2
This chapter introduces you to the world of the software-defined datacenter, but with a focus on the storage aspect. The chapter covers the basic premise of the software-defined datacenter and then delves deeper to cover the concept of software-defined storage and associated solutions such as the server storage-area network (Server SAN) and hyper-converged infrastructure solutions.
VMworld, VMware’s annual conference, introduced VMware’s vision for the software-defined datacenter (SDDC) in 2012. The SDDC is VMware’s architecture for the public and private clouds where all pillars of the datacenter—computing, storage, and networking (and the associated services)—are virtualized. Virtualizing datacenter components enables the IT team to be more flexible. If you lower operational complexity and cost while increasing availability and agility, you ultimately lower the time to market for new services.
To achieve all of that, virtualization of components by itself is not sufficient. The platform used must be capable of being installed and configured in a fully automated fashion. More importantly, the platform should enable you to manage and monitor your infrastructure in a smart and less operationally intense manner. That is what the SDDC is all about! Raghu Raghuram (VMware senior vice president) captured it in a single sentence: The essence of the software-defined datacenter is “abstract, pool, and automate.”
Abstraction, pooling, and automation are all achieved by introducing an additional layer on top of the physical resources. This layer is usually referred to as a virtualization layer. Everyone reading this book is probably familiar with the leading product for compute virtualization, VMware vSphere. Fewer people are probably familiar with network virtualization, sometimes referred to as software-defined networking (SDN) solutions. VMware offers a solution named NSX that is based on the solution built by the acquired company Nicira. NSX does for networking what vSphere does for compute. These layers do not just virtualize the physical resources but also allow you to pool them and provide you with an application programming interface (API) that enables you to automate all operational aspects.
Automation is not just about scripting, however. A significant part of the automation of virtual machine (VM) provisioning (and its associated resources) is achieved through policy-based management. Predefined policies allow you to provision VMs in a quick, easy, consistent, and repeatable manner. The resource characteristics specified on a resource pool or a vApp container exemplify a compute policy. These characteristics enable you to quantify resource policies for compute in terms of reservation, limit, and priority. Network policies can range from security to quality of service (QoS). Unfortunately, storage has thus far been limited to the characteristics provided by the physical storage device, which in many cases did not meet the expectations and requirements of customers.
This book examines the storage component of VMware’s SDDC. More specifically, the book covers how a product called vSAN fits into this vision. You will learn how it has been implemented and integrated within the current platform, how you can leverage its capabilities, and some of the lower-level implementation details. Before going further, though, it helps to have a generic understanding of where vSAN fits into the bigger software-defined storage picture.
Software-defined storage is a term that has been used and abused by many vendors. Because software-defined storage is currently defined in so many different ways, consider the following quote from VMware:
Software Defined Storage is the automation and pooling of storage through a software control plane, and the ability to provide storage from industry standard servers. This offers a significant simplification to the way storage is provisioned and managed, and also paves the way for storage on industry standard servers at a fraction of the cost. (Source: http://cto.vmware.com/vmwares-strategy-for-software-defined-storage/)
A software-defined storage product is a solution that abstracts the hardware and allows you to easily pool all resources and provide them to the consumer using a user-friendly user interface (UI) and API. A software-defined storage solution allows you to both scale up and scale out, without increasing the operational effort.
Many hold that software-defined storage is about moving functionality from the traditional storage devices to the host. This trend was started by virtualized versions of storage devices such as HP’s StoreVirtual VSA and evolved into solutions that were built to run on many different hardware platforms. One example of such a solution is Nexenta. These solutions were the start of a new era.
Hyper-Convergence/Server SAN Solutions
Over the last few years there have been many debates around what hyper-converged is versus a server SAN solution. In our opinion, the big difference between the two is the level of integration with the platform it is running on and the delivery model. When it comes to the delivery model, there are two distinct flavors:
- Appliance based
- Software only
An appliance-based solution is one where the hardware and the software are sold and delivered as a single bundle. It comes preinstalled with a hypervisor and usually requires little to no effort to configure. There is also typically deep integration with the platform it sits on (or in). This can range from leveraging the provided storage APIs to providing extensive data services to being embedded in the hypervisor.
In all of these cases, local storage is aggregated into a large shared pool by leveraging a virtual storage appliance or a kernel-based storage stack. Typical examples of appliance-based solutions available today include Nutanix, SimpliVity, and of course vSAN. Ask a general audience what a typical “hyper-converged appliance” looks like and the answer will usually be: a 2U form factor with four hosts. However, hyper-convergence is not about a form factor in our opinion. It is about combining different components into a single solution that is easy to install, configure, manage, and monitor. It is fair to say, however, that traditionally most hyper-converged platforms were delivered in a 2U form factor with four hosts. Figure 1.1 shows what these appliances looked like, but make no mistake: these are just generic x86 servers.
Figure 1.1 - Commonly used hardware by hyper-converged storage vendors
You might ask, “If these are generic x86 servers with hypervisors installed and a virtual storage appliance or a kernel storage stack, what are the benefits over a traditional storage system?” The benefits of a hyper-converged platform are as follows:
- Time to market is short, less than 1 hour to install and deploy
- Ease of management and integration
- Able to scale out, both capacity and performance wise
- Lower total cost of acquisition compared to traditional environments
These solutions are sold as a single stock keeping unit (SKU), and typically a single point of contact for support is provided. This can make support discussions much easier. However, a hurdle for many companies is the fact that these solutions are tied to hardware and specific configurations. The hardware used by hyper-converged vendors often does not come from the hardware supplier you already prefer. This can lead to operational challenges when it comes to updating/patching or even cabling and racking. In addition, a trust issue exists. Some people swear by server Vendor X and would never want to touch any other brand, whereas others won’t come close to server Vendor X. Fortunately, most hyper-converged vendors these days offer the ability to buy their solution through different server hardware vendors. If that does not provide sufficient flexibility, this is where software-based storage solutions come into play.
Software-only storage solutions come in two flavors. The most common solution today is the virtual storage appliance (VSA). VSA solutions are deployed as a VM on top of a hypervisor installed on physical hardware. VSAs allow you to pool underlying physical resources into a shared storage device. Examples of VSAs include Maxta, HP’s StoreVirtual VSA, and EMC ScaleIO. The big advantage of software-only solutions is that you can usually leverage existing hardware as long as it is on the hardware compatibility list (HCL). In the majority of cases, the HCL is similar to what the underlying hypervisor supports, except for key components like disk controllers and flash devices.
vSAN is also a software-only solution, but vSAN differs significantly from the VSAs listed. vSAN sits in a different layer and is not a VSA-based solution. On top of that, vSAN is typically combined with hardware by a vendor of choice. Hence, VMware refers to vSAN as a hyper-converged software solution, as it is literally the enabler of many hyper-converged offerings.
VMware’s plan for software-defined storage is to focus on a set of VMware initiatives related to local storage, shared storage, and storage/data services. In essence, VMware wants to make vSphere a platform for storage services.
Historically, storage was something that was configured and deployed at the start of a project, and was not changed during its life cycle. If there was a need to change some characteristics or features of the logical unit number (LUN) or volume being leveraged by VMs, in many cases the original LUN or volume was deleted and a new volume with the required features or characteristics was created. This was a very intrusive, risky, and time-consuming operation due to the requirement to migrate workloads between LUNs or volumes, which may have taken weeks to coordinate.
With software-defined storage, VM storage requirements can be dynamically instantiated. There is no need to repurpose LUNs or volumes. VM workloads and requirements may change over time, and the underlying storage can be adapted to the workload at any time. vSAN aims to provide storage services and service-level agreement automation through a software layer on the hosts that integrates with, abstracts, and pools the underlying hardware.
A key factor for software-defined storage is storage policy-based management (SPBM). SPBM can be thought of as the next generation of VMware’s storage profiles feature that was introduced with vSphere 5.0. Where the initial focus of storage profiles was more about ensuring VMs were provisioned to the correct storage device, in vSphere 6.x SPBM is a critical component of how VMware is implementing software-defined storage.
Using SPBM and vSphere APIs, the underlying storage technology surfaces an abstracted pool of storage space with various capabilities that is presented to vSphere administrators for VM provisioning. The capabilities can relate to performance, availability, or storage services such as thin provisioning, compression, replication, and more. A vSphere administrator can then create a VM storage policy (or profile) using a subset of the capabilities that are required by the application running in the VM. At deployment time, the vSphere administrator selects a VM storage policy. SPBM pushes the policy down to the storage layer, and the datastores that can satisfy the requirements placed in the policy are made available for selection. This means that the VM is always instantiated on the appropriate underlying storage based on the requirements in the VM storage policy, and that the VM is provisioned with just the right amount of resources and the required services from the abstracted pool of storage resources.
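As a mental model of this matching step, the compatibility check can be sketched as a simple filter. This is illustrative only; the capability names below are invented for the example and are not actual SPBM identifiers.

```python
# Minimal sketch of SPBM-style policy matching (illustrative capability
# names, not actual SPBM identifiers or APIs).

def compatible_datastores(policy, datastores):
    """Return the datastores that advertise every capability the policy requires."""
    return [ds for ds in datastores
            if all(ds["capabilities"].get(key) == value
                   for key, value in policy["requirements"].items())]

policy = {"name": "gold",
          "requirements": {"flash_read_cache": True, "replication": True}}
datastores = [
    {"name": "vsanDatastore",
     "capabilities": {"flash_read_cache": True, "replication": True}},
    {"name": "nfs-01",
     "capabilities": {"flash_read_cache": False, "replication": True}},
]

# Only datastores satisfying every requirement are offered for selection.
print([ds["name"] for ds in compatible_datastores(policy, datastores)])
# prints ['vsanDatastore']
```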
Should the VM’s workload, availability requirement, or I/O pattern change over time, it is simply a matter of applying a new VM storage policy with requirements and characteristics that reflect the new workload to that specific VM, or even virtual disk. The policy is then seamlessly applied without any manual intervention from the administrator (in contrast to many legacy storage systems, where a manual migration of VMs or virtual disks to a different datastore would be required). vSAN has been developed to seamlessly integrate with vSphere and the SPBM functionality it offers.
What Is vSAN?
vSAN is a storage solution from VMware, released as a beta in 2013, made generally available to the public in March 2014, and updated to version 6.2 in March 2016. vSAN is fully integrated with vSphere. It is an object-based storage system and a platform for VM storage policies that aims to simplify VM storage placement decisions for vSphere administrators. It fully supports and is integrated with core vSphere features such as vSphere high availability (HA), vSphere Distributed Resource Scheduler (DRS), and vMotion, as illustrated in Figure 1.2.
Figure 1.2 - Simple overview of a vSAN cluster
vSAN’s goal is to provide both resiliency and scale-out storage functionality. It can also be thought of in the context of QoS in so far as VM storage policies can be created that define the level of performance and availability required on a per-VM, or even virtual disk, basis.
vSAN is a software-based distributed storage solution that is built directly into the hypervisor. Although not a virtual appliance like many of the other solutions out there, vSAN can best be thought of as a kernel-based solution that is included with the hypervisor. Technically, however, this is not completely accurate, because components critical for performance and responsiveness, such as the data path and clustering, are in the kernel, while other components that collectively can be considered part of the “control plane” are implemented as native user-space agents. Nevertheless, with vSAN there is no need to install anything other than the software you are already familiar with: VMware vSphere.
vSAN is about simplicity, and when we say simplicity, we do mean simplicity. Want to try out vSAN? It is truly as simple as creating a VMkernel network interface card (NIC) for vSAN traffic and enabling it on a cluster level, as shown in Figure 1.3. Of course, there are certain recommendations and requirements to optimize your experience, as described in further detail in Chapter 2, “vSAN Prerequisites and Requirements for Deployment.”
Figure 1.3 - Two-click enablement
Now that you know it is easy to use and simple to configure, what are the benefits of a solution like vSAN? What are the key selling points?
- Software defined: Use industry standard hardware
- Flexible: Scale as needed and when needed, both scale up and scale out
- Simple: Ridiculously easy to manage and operate
- Automated: Per-VM and disk policy-based management
- Hyper-converged: Enables you to create dense/building-block-style solutions
That sounds compelling, doesn’t it? Where does vSAN fit, you may ask? What are the use cases, and are there situations where it does not fit today? Today the use cases are as follows:
- Business critical apps: A stable storage platform with all the data services required to run business-critical workloads, whether that is Microsoft Exchange, SQL Server, Oracle, etc.
- Virtual desktops: A scale-out model using predictable and repeatable infrastructure blocks lowers costs and simplifies operations.
- Test and dev: Avoids acquisition of expensive storage (lowers total cost of ownership [TCO]), fast time to provision.
- Management or DMZ infrastructure: Fully isolated, resulting in increased security and no dependencies on the resources it potentially manages.
- Disaster recovery target: Inexpensive disaster recovery solution, enabled through a feature like vSphere replication that allows you to replicate to any storage platform.
- Remote office/branch office (ROBO): With the ability to start with as little as two hosts, centrally managed, vSAN is the ideal fit for ROBO environments.
- Stretched cluster: Providing very high availability across remote sites for a wide range of potential workloads.
Now that you know what vSAN is and that it is ready for any type of workload, let’s have a brief look at what was introduced in terms of functionality with each release.
- vSAN 1.0: March 2014
- Initial release
- vSAN 6.0: March 2015
- 64 host cluster scalability
- All-flash configurations
- 2x performance increase for hybrid configurations
- New snapshot mechanism
- Enhanced cloning mechanism
- Fault domain/rack awareness
- vSAN 6.1: September 2015
- Stretched clustering across sites with a maximum of 5 ms RTT (round-trip time)
- 2-node vSAN for remote office, branch office (ROBO) solutions
- vRealize operations management pack
- vSphere replication—5 minutes RPO
- Health monitoring
- vSAN 6.2: March 2016
- RAID 5 and 6 over the network (erasure coding)
- Space efficiency (deduplication and compression)
- QoS–IOPS limits
- Software checksums
- IPv6 support
- Performance monitoring
Hopefully, that gives a quick overview of all the capabilities introduced and available in each of the releases. There are many items listed, but that does not mean vSAN is complex to configure, manage, and monitor. Let’s take a look from an administrator’s perspective: What does vSAN look like?
What Does vSAN Look Like to an Administrator?
When vSAN is enabled, a single shared datastore is presented to all hosts that are part of the vSAN-enabled cluster. This is the strength of vSAN; it is presented as a datastore. Just like any other storage solution out there, this datastore can be used as a destination for VMs and all associated components, such as virtual disks, swap files, and VM configuration files. When you deploy a new VM, you will see the familiar interface and a list of available datastores, including your vSAN-based datastore, as shown in Figure 1.4.
Figure 1.4 - Just a normal datastore
This vSAN datastore is formed out of host local storage resources. Typically, all hosts within a vSAN-enabled cluster will contribute performance (flash) and capacity (magnetic disks or flash) to this shared datastore. This means that when your cluster grows, your datastore will grow with it. vSAN is what is called a scale-out storage system (adding hosts to a cluster), but also allows scaling up (adding resources to a host).
Each host that wants to contribute storage capacity to the vSAN cluster will require at least one flash device and one capacity device (magnetic disk or flash). At a minimum, vSAN requires three hosts in your cluster to contribute storage (or two hosts if you decide to use a witness host, which is a common configuration for ROBO); other hosts in your cluster could leverage these storage resources without contributing storage resources to the cluster itself. Figure 1.5 shows a cluster that has four hosts, of which three (esxi-01, esxi-02, and esxi-03) contribute storage and a fourth does not contribute but only consumes storage resources. Although it is technically possible to have a non-uniform cluster in which a host does not contribute storage, VMware highly recommends creating a uniform cluster and having all hosts contribute storage for overall better utilization, performance, and availability.
Figure 1.5 - Nonuniform vSAN cluster example
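The minimum-contribution rule described above can be expressed as a small sketch. This is a simplified model of the stated requirements, not an actual vSAN validation routine.

```python
# Simplified model of the vSAN minimum-contribution rule (not an actual
# vSAN API): at least three hosts must contribute storage, or two hosts
# plus a dedicated witness host in a ROBO configuration.

def cluster_is_valid(contributing_hosts, witness_host=False):
    """Check whether enough hosts contribute storage to form a vSAN cluster."""
    if witness_host:
        return contributing_hosts >= 2   # 2-node ROBO with witness
    return contributing_hosts >= 3       # standard minimum

print(cluster_is_valid(3))                     # True: standard minimum
print(cluster_is_valid(2, witness_host=True))  # True: 2-node ROBO
print(cluster_is_valid(2))                     # False: too few contributors
```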
Today’s boundary for vSAN in terms of both size and connectivity is a vSphere cluster. This means that vSAN supports single clusters/datastores of up to 64 hosts, but of course a single vCenter Server instance can manage many 64-host clusters. It is common practice for most customers, however, to limit their clusters to a maximum size of around 20 hosts, for operational considerations such as the time it takes to update a full cluster. Each host can run a supported maximum of 200 VMs, with a combined maximum of 6,400 VMs in a 64-host vSAN cluster. As you can imagine with a storage system at this scale, performance and responsiveness are of the utmost importance. vSAN was designed to take advantage of flash to provide the experience users expect in today’s world. Flash resources are used for all writes, and depending on the type of hardware configuration used (all-flash or hybrid), reads will typically also be served from flash.
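As a quick sanity check of these maximums, note that the supported VM count is bounded both per host and per cluster, so the per-cluster ceiling dominates at full scale. A back-of-the-envelope sketch based on the figures quoted above:

```python
# Back-of-the-envelope check of the quoted vSAN 6.2 maximums:
# 64 hosts per cluster, 200 VMs per host, 6,400 VMs per cluster.

MAX_HOSTS_PER_CLUSTER = 64
MAX_VMS_PER_HOST = 200
MAX_VMS_PER_CLUSTER = 6_400

def supported_vms(hosts):
    """Supported VM count is limited by both the per-host and per-cluster maximums."""
    hosts = min(hosts, MAX_HOSTS_PER_CLUSTER)
    return min(hosts * MAX_VMS_PER_HOST, MAX_VMS_PER_CLUSTER)

print(supported_vms(16))  # 3200: the per-host limit dominates
print(supported_vms(64))  # 6400: the cluster-wide limit dominates
```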
To ensure VMs can be deployed with certain characteristics, vSAN enables you to set policies on a per-virtual disk or a per-VM basis. These policies help you meet the defined service level objectives (SLOs) for your workload. These can be performance-related characteristics such as read caching or disk striping, but can also be availability-related characteristics that ensure strategic replica placement of your VM’s disks (and other important files).
If you have worked with VM storage policies in the past, you might now wonder whether all VMs stored on the same vSAN datastore will need to have the same VM storage policy assigned. The answer is no. vSAN allows you to have different policies for VMs provisioned to the same datastore and even different policies for disks from the same VM.
As stated earlier, by leveraging policies, the level of resiliency can be configured on a per-virtual disk granular level. How many hosts and disks a mirror copy will reside on depends on the selected policy. Because vSAN can use mirror copies (RAID-1) or erasure coding (RAID-5/6) defined by policy to provide resiliency, it does not require a local RAID set. In other words, hosts contributing to vSAN storage capacity should simply provide a set of disks to vSAN.
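To make the capacity impact of these protection choices concrete, the following sketch encodes the raw-capacity multipliers documented for vSAN 6.2: RAID-1 mirroring consumes a full extra copy per failure tolerated, while RAID-5/6 erasure coding trades that for parity overhead. The function is illustrative, not part of any vSAN API.

```python
# Raw-capacity multipliers for the vSAN 6.2 protection schemes mentioned
# above (FTT = number of failures to tolerate). Illustrative sketch only.

def capacity_multiplier(scheme, ftt):
    """Raw capacity consumed per unit of usable capacity."""
    if scheme == "RAID-1":                # full mirror copies
        return ftt + 1                    # FTT=1 -> 2x, FTT=2 -> 3x, FTT=3 -> 4x
    if scheme == "RAID-5" and ftt == 1:
        return 4 / 3                      # 3 data + 1 parity components
    if scheme == "RAID-6" and ftt == 2:
        return 3 / 2                      # 4 data + 2 parity components
    raise ValueError("unsupported scheme/FTT combination")

print(capacity_multiplier("RAID-1", 1))  # 2
print(capacity_multiplier("RAID-6", 2))  # 1.5
```

For the same FTT=1, a 100 GB disk therefore consumes roughly 200 GB raw with RAID-1 but only about 133 GB raw with RAID-5, which is the space-efficiency argument behind the erasure-coding feature introduced in vSAN 6.2.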
Whether you have defined a policy to tolerate a single host failure or, for instance, a policy that will tolerate up to three hosts failing, vSAN will ensure that enough replicas of your objects are created. The following example illustrates how this is an important aspect of vSAN and one of the major differentiators between vSAN and most other virtual storage solutions out there.
EXAMPLE: We have configured a policy that can tolerate one failure and created a new virtual disk. We have chosen to go with number of failures to tolerate = 1, which results in a RAID-1 configuration. This means that vSAN will create two identical storage objects and a witness. The witness is a component tied to the VM that allows vSAN to determine who should win ownership in the case of a failure. If you are familiar with clustering technologies, think of the witness as a quorum object that will arbitrate ownership in the event of a failure. Figure 1.6 may help clarify these sometimes difficult-to-understand concepts. This figure illustrates, at a high level, what this looks like for a VM with a virtual disk that can tolerate one failure. This can be the failure of a host, NIC, disk, or flash device, for instance.
Figure 1.6 - vSAN failures to tolerate
In Figure 1.6, the VM’s compute resides on the first host (esxi-01) and its virtual disks reside on the other hosts (esxi-02 and esxi-03) in the cluster. In this scenario, the vSAN network is used for storage I/O, allowing the VM to freely move around the cluster without the need for storage components to be migrated with the compute. This does, however, introduce the first requirement for implementing vSAN: vSAN requires at a minimum one dedicated 1 Gbps NIC port, but VMware recommends 10 GbE for the vSAN network.
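The component layout from the example can be summarized as a simplified model. Note the assumption, flagged here, of one witness component per failure tolerated; in practice the witness count can vary, but the host requirement of 2n + 1 for RAID-1 with FTT = n holds.

```python
# Simplified model of the RAID-1 replica/witness layout described above.
# Assumption: one witness per failure tolerated (actual witness counts
# can vary); the 2n + 1 host requirement for RAID-1 is as documented.

def raid1_layout(ftt):
    """Return (replicas, witnesses, total components, hosts required) for FTT=ftt."""
    replicas = ftt + 1            # full data copies
    witnesses = ftt               # quorum tie-breakers (simplified)
    components = replicas + witnesses
    hosts_required = 2 * ftt + 1  # each component lands on a distinct host
    return replicas, witnesses, components, hosts_required

print(raid1_layout(1))  # (2, 1, 3, 3): the FTT=1 example in the text
```

For FTT=1, this matches the example: two identical storage objects plus one witness, spread across three hosts.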
Yes, this might still sound complex, but in all fairness, vSAN masks away all the complexity, as you will learn as you progress through the various chapters in this book.
To conclude, VMware vSAN is a market-leading, hypervisor-based distributed storage platform that enables convergence of compute and storage resources, typically referred to as hyper-converged software. It enables you to define VM-level granular SLOs through policy-based management. It allows you to control availability and performance in a way never seen before, simply and efficiently.
This chapter just scratched the surface. Now it’s time to take it to the next level. Chapter 2 describes the requirements for installing and configuring vSAN.