Articles like this usually start with some basic terminology, so let me start with context on what it took to build a cloud internally. Companies that invested in some sort of virtualization technology years ago didn't know they were building private clouds; that hyped buzzword came later.
They just realized that they could get more flexibility and higher server density by abstracting the HW layer from the actual x86 machine, adding an additional virtualization layer (the hypervisor).
Does it make sense to you? No? OK, let's look at the big picture.
|HW abstraction of a type 1 (bare metal) hypervisor|
By looking at the picture you can see that the actual components of the physical HW are not directly accessible to the operating systems (there are scenarios where paravirtualization is used and this is not completely true, but we will get to that later). Instead, virtual HW is presented to the OS, and the hypervisor maps it to the physical HW. By doing this you can easily add, remove and modify virtual HW without physically touching the server in your server room. For certain OSes like Windows Server 2003, Windows Server 2008 Enterprise or SLES 11 you can even do it without downtime.
VMware did a great job bringing this to the x86 platform. I'm pointing this out because virtualization, high availability and resource scheduling were here long before VMware: IBM and mainframe-class machines in general have had their own implementations since the 70's. The important thing is that x86 is the most available and cost-effective platform out there, with a huge portfolio of machines and configurations. You can check hundreds of vendors for compatibility with VMware hypervisors, giving you a plethora of options when choosing your virtualization platform.
So much for the history. VMware offers two enterprise-class hypervisors, both type 1, which means they are installed on bare metal HW without any requirement for a host operating system.
ESX - the once-mainstream platform with a full service console, currently being replaced by the stripped-down version called ESXi. The last available version was 4.1U3; newer releases no longer contain ESX.
ESXi - a hypervisor with a very small installation footprint and no console on the server itself, designed to be managed through infrastructure management tools like vCenter Server, PowerCLI etc. The current version is 5.1.
Both versions fully support 64-bit and can be combined in the same environment.
You may wonder what is so magical about it. You have your HW abstracted, so you can install 2, 3, 4, maybe 5 independent OSes on one physical server. So what? You are just putting all your eggs into one basket: if your server fails, it will drag all the guest OSes down with it.
That's where VMware clustering kicks in.
All available VMware vSphere versions allow you to create basic high availability clusters.
vSphere is the umbrella name for all the functions of VMware's enterprise virtualization products. There is a comparison available so you can pick the correct licensing model.
What you can expect from a VMware cluster is that if one of your ESX(i) hosts fails, all its guest OSes will be restarted on the remaining ESX(i) machines in the same cluster. Internal resource management (DRS) will also, according to your preferences, try to equalize usage across the cluster nodes.
To create a cluster you need to fulfil certain prerequisites.
1.) You need shared storage (iSCSI or Fibre Channel SAN, FCoE, or NAS/NFS) so all your hosts can see the same storage at all times.
2.) The HW should be the same model and configuration. There are native functions (like EVC for CPU compatibility) that can help with mixed HW, but in a real-life scenario you should prefer exactly the same HW, from the CPU through the NICs to the memory size.
Depending on your license you will have several clustering functions at hand.
vMotion - lets you move your virtual machines around within the same cluster; it transfers the VM together with the content of its memory (the memory state) to a different ESX(i) without an outage.
svMotion - does the same, but for the VM's configuration and virtual disk files. You can, for example, free up an over-utilized datastore and move the machine to a less used one.
HA - in case of an ESX(i) failure it restarts the guest VMs on another ESX(i) in the same cluster. There is a whole science behind how the nodes in a cluster detect a failure, but this article is called basic building blocks, so no advanced terms here yet :) There is a brilliant book available describing the whole magic.
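To get the idea, here is the HA restart behavior sketched in a few lines of Python. This is my own simplified model with made-up host names and a naive capacity check, not VMware's actual admission control or failure detection:

```python
# Toy model of HA failover: when a host fails, its VMs are restarted
# on the surviving hosts that still have free memory. Real HA uses
# heartbeats, admission control and restart priorities.

def failover(hosts, failed):
    """hosts: {name: {"capacity": GB, "vms": {vm_name: GB}}}; mutates hosts."""
    orphans = hosts.pop(failed)["vms"]  # VMs that went down with the host
    # restart the biggest VMs first, while capacity is still plentiful
    for vm, mem in sorted(orphans.items(), key=lambda x: -x[1]):
        # pick the surviving host with the most free memory
        target = max(hosts, key=lambda h: hosts[h]["capacity"] - sum(hosts[h]["vms"].values()))
        free = hosts[target]["capacity"] - sum(hosts[target]["vms"].values())
        if mem > free:
            raise RuntimeError(f"no capacity left to restart {vm}")
        hosts[target]["vms"][vm] = mem  # the VM boots here, so there IS downtime
    return hosts

cluster = {
    "esx1": {"capacity": 64, "vms": {"web": 8, "db": 16}},
    "esx2": {"capacity": 64, "vms": {"app": 8}},
    "esx3": {"capacity": 64, "vms": {}},
}
failover(cluster, "esx1")  # "db" lands on esx3, "web" on esx2
```

Note that the orphaned VMs are restarted, not resumed; that distinction is exactly what FT (below) removes.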
DRS - monitors resource usage in real time and, when needed, moves (vMotions) machines from one ESX(i) to another. Very useful for keeping the same level of utilization on all cluster nodes.
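Again a simplified Python sketch of the idea: a greedy balancer that keeps vMotioning the smallest VM off the busiest host. This is purely illustrative and nothing like the real cost/benefit calculation DRS performs:

```python
# Toy DRS: equalize memory load across hosts by moving the smallest
# VM from the busiest host to the least busy one.

def load(host):
    """Total memory (GB) of the VMs running on a host."""
    return sum(host["vms"].values())

def balance(hosts, threshold=8):
    """Greedy toy balancer; returns the list of (vm, src, dst) moves."""
    moves = []
    while True:
        busiest = max(hosts, key=lambda h: load(hosts[h]))
        idlest = min(hosts, key=lambda h: load(hosts[h]))
        diff = load(hosts[busiest]) - load(hosts[idlest])
        if diff <= threshold or not hosts[busiest]["vms"]:
            return moves
        vm = min(hosts[busiest]["vms"], key=hosts[busiest]["vms"].get)
        mem = hosts[busiest]["vms"][vm]
        if abs(diff - 2 * mem) >= diff:  # the move would not reduce the spread
            return moves
        del hosts[busiest]["vms"][vm]
        hosts[idlest]["vms"][vm] = mem   # simulated vMotion
        moves.append((vm, busiest, idlest))

cluster = {"esx1": {"vms": {"a": 16, "b": 8, "c": 8}},
           "esx2": {"vms": {}}}
moves = balance(cluster)  # two vMotions, ending with 16 GB on each host
```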
FT - fault tolerance. Even though HA takes care of restarting guest VMs on the remaining ESX(i) machines in case of a failure, that still means downtime for the affected VM (the machine needs to boot up). FT goes further: once turned on, it creates a linked shadow VM, so in case of a failure there is no outage at all. The CPU operations are shadowed on the secondary VM, which is immediately available.
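Conceptually you can think of FT like this toy model, where every operation is replayed on a shadow copy so the takeover needs no boot. Purely illustrative; the real lockstep mechanism is far more involved:

```python
# Toy FT: feed every operation to both the primary and a shadow VM
# so their state stays identical at all times.
class MirroredVM:
    def __init__(self):
        self.primary = []   # state of the running VM (list of applied ops)
        self.shadow = []    # state of the linked shadow VM on another host

    def execute(self, op):
        self.primary.append(op)  # executed on the primary
        self.shadow.append(op)   # replayed in lockstep on the shadow

    def fail_primary(self):
        # the shadow already holds identical state: instant takeover, no reboot
        self.primary = None
        return self.shadow

vm = MirroredVM()
vm.execute("write x=1")
vm.execute("write y=2")
state = vm.fail_primary()  # state is ["write x=1", "write y=2"]
```

Contrast this with the HA case above, where the VM's memory state is lost and the guest has to boot from disk.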
dvSwitch - distributed virtual switch. All networking in the cluster is virtualized: you can have two or more physical NICs connected to the same virtual switch (for redundancy and bandwidth). A vSwitch is set up per ESX(i) and allows you to have several port groups serving several VLANs; each guest VM can have multiple virtual NICs connected to multiple VLANs. What a dvSwitch does is extend this vSwitch configuration across multiple ESX(i) hosts within the same cluster.
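The difference between per-host vSwitches and a dvSwitch can be modelled as a simple data structure. All names, VLAN IDs and uplinks here are made up for illustration:

```python
# Port group name -> VLAN ID, the configuration we want everywhere.
port_groups = {"Management": 10, "Production": 20, "DMZ": 30}

# Standard vSwitch: the same config must be duplicated on every host
# (and kept in sync by hand).
standard = {host: dict(port_groups) for host in ("esx1", "esx2", "esx3")}

# dvSwitch: one shared config object referenced by every host, with
# two physical NIC uplinks per host for redundancy and bandwidth.
dvswitch = {
    "port_groups": port_groups,
    "hosts": ["esx1", "esx2", "esx3"],
    "uplinks": ["vmnic0", "vmnic1"],
}

# A guest vNIC just points at a port group; after a vMotion the VLAN
# mapping on the destination host is identical by construction.
vm_nic = {"vm": "web01", "port_group": "Production"}
vlan = dvswitch["port_groups"][vm_nic["port_group"]]  # 20
```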
storage API - if supported by your SAN vendor and model, you can offload specific operations (usually ones demanding heavy CPU usage) to the storage array and spare some CPU cycles. For example, a storage vMotion can be fully managed and performed on the array, with the ESX(i) issuing an XCOPY command to move the virtual machine's vmdk files from one datastore (LUN) to another.
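The offload idea in a nutshell, as a hypothetical sketch (this is not the actual storage API interface, just the decision it boils down to):

```python
# Toy sketch of hardware offload: if the array supports the copy
# primitive, the host asks the array to move the data itself instead
# of reading and writing every block through the hypervisor.

def storage_vmotion(vmdk_blocks, array_supports_xcopy):
    if array_supports_xcopy:
        # one command to the array, near-zero host CPU and I/O
        return "issued XCOPY to array"
    # fallback: the host's software data mover copies block by block
    copied = [b for b in vmdk_blocks]
    return f"host copied {len(copied)} blocks"

storage_vmotion(range(1000), True)    # offloaded to the array
storage_vmotion(range(1000), False)   # burns host cycles instead
```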
There are other functions of more or less value, but the ones I described are the basic building blocks of every virtual infrastructure built on the VMware vSphere product.
I will focus on management options in Part 2.