I am very pleased to announce that I have been awarded the vExpert award from VMware for 2014.
The vExpert award is given to individuals who make a considerable effort within the community to share their expertise with others.
A vExpert is someone who is not necessarily a technical expert or even an expert in all things VMware, but rather someone who goes above and beyond their day job in the community to develop a platform of influence both publicly in books, blogs, online forums, and VMware User Groups; and privately inside customers and VMware partners.
If you have ever wondered how to create your own ESXi image complete with any drivers you need, such as storage controller or network card drivers, you can use the PowerCLI Image Builder to achieve this.
This is helpful if the standard ESXi image is causing a purple screen of death (PSOD) when trying to boot up ESXi.
To do this you will need a copy of PowerCLI; you can download v5.5 from the VMware website.
Once installed, set the execution policy to remote signed (Set-ExecutionPolicy RemoteSigned).
Now you are ready to start building your custom image following the steps below.
Import the VMware software and vendor depot:
Add-EsxSoftwareDepot C:\Name_of_ESXi_Offline_Bundle.zip
Find the name of the driver we added:
Get-EsxSoftwarePackage -Vendor Vendor_Name
Find the name of the image profile we want to copy. This will list all the image profiles available within the offline bundle:
Get-EsxImageProfile | Select Name
Copy the image profile that we need and give it a name:
New-EsxImageProfile -CloneProfile ESXi-5.5.x.x-standard -Name New_Name
Add the VIB to the image, where <name> is the name of the driver (e.g. qlogic-nic):
Add-EsxSoftwarePackage -ImageProfile New_Name -SoftwarePackage <name>
Export to a new .ISO:
Export-EsxImageProfile -ImageProfile New_Name -ExportToIso -FilePath "C:\custom.imagebuilder.iso"
Boot from the ISO and you have your own custom ESXi image.
When an operating system is installed directly onto physical hardware in a non-virtualised environment, it has direct access to the memory installed in the system, and memory requests, or pages, always have a 1:1 mapping to the physical RAM. This means that if 4GB of RAM is installed, and the operating system supports that much memory, the full 4GB is available to the operating system as soon as it is requested. Most operating systems will support the full 4GB, especially if they are 64-bit.
When an application within the operating system makes a memory page request, it requests the page from the operating system, which in turn passes a free page to the application, so it can perform its tasks. This is performed seamlessly.
The hypervisor adds an extra level of indirection. The hypervisor maps the guest physical memory addresses to the machine, or host physical memory addresses. This gives the hypervisor memory management abilities that are transparent to the guest operating system. It is these memory management techniques that allow for memory overcommitment.
To get a good understanding of memory behaviour within a virtualised environment, let’s focus on three key areas.
With an operating system running in a virtual machine, the memory requested by an application is called virtual memory, the memory that appears to be installed in the virtual machine is called physical memory, and the hypervisor adds an additional layer called machine memory.
To help define the interoperability of memory between the physical RAM installed in the server and the individual applications within each virtual machine, the three key memory levels are also described as follows.
Host physical memory – The memory that is visible to the hypervisor as available on the system. (Also called machine memory.)
Guest physical memory – The memory that is visible to the guest operating system running in the virtual machine. Guest physical memory is backed by host physical memory, which means the hypervisor provides a mapping from guest to host memory.
Guest virtual memory – A contiguous virtual address space presented by the guest operating system to applications. It is the memory that is visible to the applications running inside the virtual machine.
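To make the three levels concrete, here is a minimal Python sketch (illustrative only; the addresses and structure names are invented) of how a guest virtual page resolves through both mapping layers:

```python
# Illustrative sketch only (not VMware code): the two mapping layers that turn
# a guest virtual address into a host machine address.

# Guest virtual page -> guest physical page (maintained by the guest OS)
guest_page_table = {0x1000: 0x5000}

# Guest physical page -> host machine page (maintained by the hypervisor)
machine_map = {0x5000: 0x9000}

def translate(guest_virtual):
    """Resolve a guest virtual page through both layers of indirection."""
    guest_physical = guest_page_table[guest_virtual]
    return machine_map[guest_physical]

print(hex(translate(0x1000)))  # -> 0x9000
```

The guest only ever sees the first mapping; the second is what gives the hypervisor room to overcommit.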
To help understand how these layers inter-operate, look at the following diagram.
Virtual memory creates a uniform memory address space in which the operating system maps application virtual memory addresses to physical memory addresses. This gives the operating system memory management abilities that are transparent to the application.
Hypervisor Memory Management
Memory pages within a virtualised environment have to negotiate an additional layer, the hypervisor. The hypervisor creates a contiguous addressable memory space for a virtual machine. This memory space has the same basic properties as the virtual address space that the guest operating system presents to the applications running on it. This allows the hypervisor to run multiple virtual machines simultaneously while protecting the memory of each virtual machine from being accessed by others.
The virtual machine monitor (VMM) controls each virtual machine’s memory allocation. The VMM does this using software-based memory virtualization.
The VMM for each virtual machine maintains a memory mapping for the memory pages contained inside the guest operating system. The mapping is from the guest physical pages to the host physical pages. The host physical memory pages are also called machine pages.
This memory mapping technique is maintained through a Physical Memory Mapping Data (PMAP) structure.
Each virtual machine sees its memory as a contiguous addressable memory space. The underlying physical machine memory however may not be contiguous. This is because it may be running more than one virtual machine at any one time and it is sharing the memory out amongst the VMs.
The VMM sits between the guest physical memory and the Memory Management Unit (MMU) on the CPU so that the actual CPU cache on the processor is not updated directly by the virtual machine.
The hypervisor maintains the virtual-to-machine page mappings in a shadow page table. The shadow page table is responsible for maintaining consistency between the PMAP and the guest virtual to host physical machine mappings.
The shadow page table is also maintained by the virtual machine monitor. (VMM)
Each processor in the physical machine uses the Translation Lookaside Buffer (TLB) in the processor cache for the direct virtual-to-machine mapping updates. These updates come from the shadow page tables.
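The role of the shadow page table can be sketched as a cache of the composed mapping. The structures and names below are hypothetical, not the real VMM implementation:

```python
# Sketch of the shadow page table idea: the composed guest-virtual -> machine
# mapping is cached so the hardware TLB can be filled without walking both
# tables on every access. All structures and addresses here are invented.

guest_page_table = {0x1000: 0x5000, 0x2000: 0x6000}  # guest virtual -> guest physical
pmap = {0x5000: 0x9000, 0x6000: 0xA000}              # guest physical -> machine (PMAP)
shadow = {}                                          # guest virtual -> machine cache

def lookup(guest_virtual):
    """Return the machine page, filling the shadow table on a miss."""
    if guest_virtual not in shadow:
        guest_physical = guest_page_table[guest_virtual]  # walk the guest's table
        shadow[guest_virtual] = pmap[guest_physical]      # compose and cache
    return shadow[guest_virtual]

lookup(0x1000)              # first access fills the shadow entry
print(hex(shadow[0x1000]))  # -> 0x9000
```

Keeping this cache consistent whenever the guest changes its own page tables is exactly the overhead that hardware-assisted memory virtualisation removes.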
Some CPUs support hardware-assisted memory virtualisation; AMD SVM-V and Intel Xeon 5500 series CPUs support it. These CPUs have two paging tables, one for the virtual-to-physical translations and one for the physical-to-machine translations.
Hardware assisted memory virtualisation eliminates the overhead associated with software virtualisation, namely the overhead associated with keeping shadow page tables synchronised with guest page tables, as it uses two layers of page tables in hardware that are synchronized using the processor hardware.
One thing to note with hardware-assisted memory virtualisation is that the TLB miss latency is significantly higher. As a result, workloads with a small amount of page table activity will not suffer under software virtualisation, whereas workloads with a lot of page table activity are likely to benefit from hardware assistance.
Application Memory Management
An application starts with no memory; it allocates memory through a syscall to the operating system. The application frees memory voluntarily when it is no longer needed, through an explicit memory allocation interface with the operating system.
Operating System Memory Management
As far as the operating system is concerned it owns all the physical memory allocated to it, because it has no memory allocation interface with the hardware, only with the virtual machine monitor. It does not explicitly allocate or free physical memory; it tracks in-use and available memory by maintaining a free list and an allocated list of physical memory. Memory is either free or allocated depending on which list it resides on. These lists themselves live in memory.
Virtual Machine Memory Allocation
When a virtual machine starts up it has no physical memory allocated to it. As it starts up it ‘touches’ memory space, as it does this the hypervisor allocates it physical memory. With some operating systems this can actually mean the entire amount of memory allocated to the virtual machine is called into active memory as soon as the operating system starts, as is typically seen with Microsoft Windows operating systems.
Transparent Page Sharing (TPS)
Transparent Page Sharing is on-the-fly de-duplication of memory pages: the host looks for identical copies of a page and deletes all but one, giving the impression that more memory is available to the virtual machines. This scanning is performed when the host is idle.
Hosts that are configured with AMD-RVI or Intel-EPT hardware-assist CPUs are able to take advantage of large memory pages, where the host backs guest physical memory pages with host physical memory pages in 2MB pages rather than the standard 4KB pages used when large pages are unavailable. This results in fewer TLB misses and therefore better performance. There is a trade-off though: large memory pages will not be shared, as the chance of finding two identical 2MB pages is low and the overhead of a bit-by-bit comparison of 2MB pages is far greater than for 4KB pages.
Large memory pages may still be broken down into smaller 4KB pages during times of contention as the host will generate 4KB hashes for the 2MB large memory pages so that when the host is swapping memory it can use these hashes to share the memory.
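The core idea of page sharing can be sketched in a few lines of Python. This is a toy model, not the VMkernel scanner, but it shows the hash-then-confirm approach described above:

```python
# A toy model of transparent page sharing (illustrative only): hash each guest
# page, confirm a match bit-by-bit, and back identical pages with one copy.

import hashlib

def share_pages(pages):
    """pages: list of 4KB bytes objects. Returns (unique, mapping, saved)."""
    seen = {}      # content digest -> index into unique "machine" pages
    unique = []    # the de-duplicated machine pages
    mapping = []   # for each guest page, the backing machine page index
    for page in pages:
        digest = hashlib.sha1(page).digest()
        if digest in seen and unique[seen[digest]] == page:  # hash hit, then full compare
            mapping.append(seen[digest])                     # share the existing copy
        else:
            seen[digest] = len(unique)
            unique.append(page)                              # first copy is kept
            mapping.append(seen[digest])
    return unique, mapping, len(pages) - len(unique)

pages = [b"\x00" * 4096, b"\x00" * 4096, b"app data" + b"\x00" * 4088]
unique, mapping, saved = share_pages(pages)
print(saved)  # -> 1 (the two zero pages now share one machine page)
```

The bit-by-bit confirmation after a hash match is what makes the 2MB large-page case so expensive compared to 4KB pages.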
It is possible to configure advanced settings on the host for page sharing: Mem.ShareScanTime sets the time to scan the virtual machine's memory, Mem.ShareScanGHz sets the maximum number of scanned pages per second on the host, and Mem.ShareRateMax sets the maximum number of per-virtual-machine scanned pages.
Use resxtop or esxtop to monitor current transparent page sharing activity via the PSHARE field, available in the memory view. See below.
You can disable TPS on a particular VM by configuring the advanced setting Sched.mem.Pshare.enable=false.
An ESXi host has no idea how much memory is allocated within a virtual machine, only what the virtual machine has requested. As more virtual machines are added to a host there are subsequently more memory requests and the amount of free memory may become low. This is where ballooning is used.
Provided VMware tools is installed, the ESXi host will load the balloon driver (vmmemctl) inside the guest operating system as a custom device driver.
The balloon driver communicates directly with the hypervisor on the host. During times of contention the hypervisor tells the balloon driver that memory is contended, and the driver 'inflates like a balloon' by requesting memory from the guest. This memory is then 'pinned' by the hypervisor and mapped into host physical memory as free memory available for other guest operating systems to use. The pinned memory is configured so that the guest OS will not swap the pinned pages out to disk. If the guest OS requests access to the pinned memory pages, it will be allocated additional memory by the host as per a normal memory request. Only when the host 'deflates' the balloon will the guest physical memory pages become available to the guest again.
Ballooning is a good thing to have, as it allows the guest operating system to decide how much of its memory to free up, rather than the hypervisor, which doesn't know when a guest OS has finished accessing memory.
Take a look at the figure below to see how ballooning works. The VM has one memory page in use by an application and two idle pages that have been pinned by the hypervisor so that they can be claimed by another operating system.
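As a rough mental model (illustrative only, not the real vmmemctl driver), ballooning can be sketched as simple accounting:

```python
# A toy accounting model of ballooning (illustrative only, not the real
# vmmemctl driver): inflating the balloon pins guest pages so the hypervisor
# can hand the backing machine memory to other VMs.

class VM:
    def __init__(self, granted_mb):
        self.granted = granted_mb   # guest physical memory granted to the VM
        self.balloon = 0            # MB currently pinned by the balloon

    def inflate(self, mb):
        """Hypervisor asks the balloon driver to claim mb of guest memory."""
        self.balloon += mb
        return mb                   # machine memory now free for other VMs

    def deflate(self, mb):
        """Pressure subsides: return pinned pages to the guest."""
        freed = min(mb, self.balloon)
        self.balloon -= freed
        return freed

vm = VM(granted_mb=4096)
vm.inflate(512)   # host reclaims 512 MB from the guest under contention
vm.deflate(512)   # contention ends; the guest gets its pages back
```

The key point the model captures is that the guest, not the hypervisor, chooses which pages end up in the balloon, which is why ballooning is preferable to hypervisor swapping.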
The memory transfer between guest physical memory and the host swap device is referred to as hypervisor swapping and is driven by the hypervisor.
The memory transfer between the guest physical memory and the guest swap device is referred to as guest-level paging and is driven by the guest operating system.
Host level swapping occurs when the host is under memory contention. It is transparent to the virtual machine.
The hypervisor will swap random pieces of memory without a concern as to what that piece of memory is doing at that time. It can potentially swap out currently active memory. When swapping, all segments belonging to a process are moved to the swap area. The process is chosen if it’s not expected to be run for a while. Before the process can run again it must be copied back into host physical memory.
Memory compression acts as a last line of defence against host swapping: memory pages that would normally be swapped out to disk are instead compressed into a cache held in host memory. Rather than sending memory pages out to the comparatively slow disk, they are kept compressed in local host memory, which is significantly faster.
Only memory pages that are being sent for swapping and can be compressed by a factor of 50% or more are compressed; otherwise they are written out to the host-level swap file.
Because of this memory compression will only occur when the host is under contention and performing host-level swapping.
Memory compression is sized at 10% of the configured allocated memory on a virtual machine by default to prevent excessive memory pressure on the host as compression size needs to be accounted for on every VM.
You can configure a different value than 10% with the following advanced setting on the virtual machine. Mem.MemZipMaxPct.
When the compression cache is full, the first compressed memory page will be decompressed and swapped out to the host-level swap device.
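The 50% rule can be sketched with Python's zlib. This is illustrative only; the VMkernel does not use zlib, and the cache and swap structures here are invented:

```python
# Illustrative sketch of the 50% compression rule (not VMkernel code).

import os
import zlib

def place_page(page, cache, swap):
    """Compress a page bound for swap; keep it in the in-memory compression
    cache only if it shrinks to 50% or less, otherwise send it to the
    host-level swap file."""
    compressed = zlib.compress(page)
    if len(compressed) <= len(page) // 2:
        cache.append(compressed)   # stays in fast host memory
        return "compressed"
    swap.append(page)              # falls through to the slow disk swap file
    return "swapped"

cache, swap = [], []
print(place_page(b"A" * 4096, cache, swap))       # repetitive data -> "compressed"
print(place_page(os.urandom(4096), cache, swap))  # random data -> "swapped"
```

Pages full of repeated or zeroed data compress easily and stay in memory; already-random data gains nothing from compression and goes to disk as before.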
Someone once likened esxtop to windows task manager on steroids. When you have a look at the monitoring options available you can see why someone would think that.
Esxtop is based on the 'nix top system tool, used to monitor running applications, services and system processes. It is an interactive task manager that can be used to monitor the finest-grained performance metrics of an ESX(i) host.
Esxtop works on ESX, and its cousin remote esxtop (resxtop) is for ESXi, although resxtop will work on ESX as well.
To run esxtop from ESX all you need to do is type esxtop from the command prompt.
To run resxtop you need to set up a vMA appliance and then run resxtop --server <hostname or IP address>, entering the username and password when prompted. You don't need root access credentials to view resxtop counters; you can use vCenter Server credentials.
The commands for esxtop and resxtop are basically the same so I will describe them as one.
Monitoring with resxtop collects more data than monitoring with esxtop, so CPU usage may be higher. That doesn't mean esxtop isn't resource intensive, quite the opposite: it can use a lot of CPU cycles when monitoring a large environment, so if possible limit the number of fields displayed. Fields are also known as columns, and entities as rows. You can also limit CPU consumption by locking entities, limiting the display to a single entity using l.
Esxtop uses worlds and groups as the entities to show CPU usage. A world is an ESX Server VMkernel schedulable entity, similar to a process or thread in other operating systems. It can represent a virtual machine, a component of the VMkernel or the service console. A group contains multiple worlds.
Counter values
Esxtop doesn't create performance metrics; it derives them from raw counters exported in the VMkernel system info nodes (VSI nodes). Esxtop can show new counters on older ESX systems if the raw counters are present in the VMkernel. (i.e. 4.0 can display 4.1 counters)
Many raw counters have static values that don't change with time. Many others increment monotonically; esxtop reports the delta for these counters over a given refresh interval. For instance, counters such as CMDS/s, packets transmitted/s and READS/s display information captured every second.
By default counters are displayed for the group. In group view the counters are cumulative, whereas in expanded view counters are normalised per entity. Because of the cumulative stats, the percentage displayed can often exceed 100%. To view expanded stats, press e.
Esxtop takes snapshots, every 5 seconds by default. To display a metric it takes two snapshots and compares the two to display the difference. The minimum snapshot interval is 2 seconds. Metrics such as %USED and %RUN show the CPU occupancy delta between successive snapshots.
The manual for (r)esxtop is full of useful examples, and unlike normal commands run on ESXi it will work if you run it from the vMA in the same way that it does from ESX.
To use it, just type man esxtop or man resxtop.
Commands for (r)esxtop can be split into two distinct types: running commands and interactive commands.
These are commands that get placed at the end of the initial call of (r)esxtop.
Example #esxtop -d 05
This command would set a delay on the refresh rate of 5 seconds.
Running command examples
-d – sets delay.
-o – sets order of columns.
-b – batch mode. I will explain this in further detail below.
These are commands that work on a key press once (r)esxtop is up and running. If in any doubt of which command to use just hit h for help.
Interactive command examples
c – CPU resource utilisation.
m – memory resource utilisation.
d – storage (disk) adapter resource utilisation.
u – storage device resource utilisation.
v – storage VM resource utilisation.
f – displays a panel for adding or removing statistics columns to or from the current panel.
n – network resource utilisation.
h – help.
o – displays a panel for changing the order of statistics.
i – interrupt resource utilisation.
p – power resource utilisation.
q – quit.
s – delay of updates in seconds. (can use fractional numbers)
w – write current setup to the config file. (.esxtop4rc)
From the field order panel, accessible by pressing o, you can move a field to the left by pressing the corresponding uppercase letter and to the right by pressing the corresponding lowercase letter. The currently active fields are shown in uppercase letters, and with an asterisk in the field selection screen to show they are selected.
Batch mode (-b)
Batch mode is used to create a CSV file which is compatible with Microsoft perfmon and ESXplot. For reading performance files, ESXplot is quicker than perfmon. CSV compatibility requires a fixed number of columns on every row; for this reason, statistics of VM (world) instances that appear after batch mode has started are not collected. Only counters that are specified in the configuration file are collected; using the -a option collects all counters. Counters are named slightly differently to be compatible with perfmon.
To use batch mode select the columns you want in interactive mode and then save with W, then run #esxtop -b -n 10 > esxtopfilename.csv
Options
-a – all.
-b – Batch mode.
-c – User defined configuration file.
-d – Delay between statistics snapshots. (minimum 2 seconds)
-n – Number of iterations before exiting.
To read the batch mode output file, load it in Windows perfmon:
(1) Run perfmon
(2) Type “Ctrl + L” to view log data
(3) Add the file to the “Log files” and click OK
(4) Choose the counters to show the performance data.
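If you prefer scripting to perfmon, the batch-mode CSV can also be read with a few lines of Python. The hostname and counter name below are invented for illustration; real batch files use perfmon-style "\\host\group\counter" column headers:

```python
# Reading an esxtop batch-mode (-b) CSV without perfmon or ESXplot.
# The sample data below stands in for a real batch output file.

import csv
import io

sample = io.StringIO(
    '"(PDH-CSV 4.0)","\\\\esx01\\Physical Cpu(0)\\% Util Time"\n'
    '"01/01/2014 10:00:00","12.5"\n'
    '"01/01/2014 10:00:05","80.0"\n'
)

rows = list(csv.reader(sample))
header, samples = rows[0], rows[1:]
util = [float(r[1]) for r in samples]  # one value per snapshot interval
print(max(util))  # peak of the sampled counter -> 80.0
```

For a real file, replace the StringIO sample with open("esxtopfilename.csv") and pick the column index you need from the header row.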
Esxtop reads its default configuration from a .esxtop4rc file. This file contains 8 lines. The first 7 lines are upper- and lowercase letters specifying which fields appear for each panel; the defaults cover CPU, memory, storage adapter, storage device, virtual machine storage, network and interrupt. The 8th line contains other options. You can save configuration files to change the default view within esxtop.
Replay mode interprets data collected by issuing the vm-support command and plays back the information as esxtop statistics. Replay mode accepts interactive commands until no more snapshots are collected by vm-support. Replay mode does not process the output of batch mode.
To use replay mode run #vm-support -S -i 5 -d 60
untar the file using #tar -xf /root/esx*.tgz
then run #esxtop -R root/vm-support*
Commands
-S – Snapshot mode, prompts for the delay between updates, in seconds.
-R – Path to the vm-support collected snapshot's directory.
A CPU load average of 1.00 means full utilisation of all CPUs. A load of 2.00 means the host needs twice as many physical CPUs as are currently available, and likewise 0.50 means half are being utilised.
PCPU USED(%) – Physical hardware execution context. Can be a physical CPU core if hyperthreading is unavailable or disabled, or a logical CPU (LCPU, i.e. SMT thread) if hyperthreading is enabled. This displays the PCPU percentage of CPU usage, averaged over all PCPUs.
PCPU UTIL(%) – Physical CPU utilised (real time). Indicates how much time the PCPU was busy, in an unhalted state, in the last snapshot duration. Might differ from PCPU USED(%) due to power management technologies or hyperthreading.
If hyper threading is enabled these figures can be different, likewise if the frequency of the PCPU is changed due to power management these figures can also be adjusted.
As an example if PCPU USED(%) is 100 and PCPU UTIL(%) is 50 this is because hyper threading is splitting the load across the two PCPUs. If you then look in the vSphere client you may notice that CPU usage is 100%. This is because the vSphere client will double the statistics if hyperthreading is enabled.
In a dual core system, each PCPU is charged by the CPU scheduler half of the elapsed time when both PCPUs are busy.
CCPU(%) – Total CPU time as reported by ESX service console. (Not applicable to ESXi)
us – Percentage user time.
sy – Percentage system time.
id – Percentage idle time.
wa – Percentage wait time.
cs/sec – Context switches per second recorded by the ESX Service Console.
CPU panel statistics (c)
Fields
ID – Resource pool or VM ID of the running world's resource pool or VM, or world ID of the running world.
GID – Resource pool ID of the running world's resource pool or VM.
NAME – err… name.
NWLD – Number of members in a running world's resource pool or VM.
%USED – CPU core cycles used.
%RUN – CPU scheduled time.
%SYS – Time spent in the ESX(i) VMkernel on behalf of the resource pool, VM or world to process interrupts.
%WAIT – Time spent in the blocked or busy wait state.
%RDY – Time the world was ready to run but waiting for a CPU to become available.
High %RDY and high %USED can imply CPU overcommitment.
%IDLE – As it says. Subtract this from %WAIT to see time waiting for an event. WAIT-IDLE can be used to estimate guest I/O wait time.
%MLMTD (max limited) – Time the VMkernel didn't run the world because it would violate the limit settings on the resource pool, VM or world.
%SWPWT – Wait time for swap memory.
CPU ALLOC – CPU allocation. Set of CPU statistics made up of the following. (For a world the % are the % of one physical CPU core)
AMIN – Attribute reservation.
AMAX – Attribute limit.
ASHRS – Attribute shares.
SUMMARY STATS – Only applies to worlds.
CPU – Which CPU esxtop was running on.
HTQ – Indicates whether a world is currently quarantined or not. (Y or N)
TIMER/s – Timer rate for this world.
%OVRLP – Time spent on behalf of a different resource pool/VM or world while the local was scheduled. Not included in %SYS.
%CSTP – Time the vCPUs of a VM spent in the co-stopped state, waiting to be co-started. This gives an indication of the co-scheduling overhead incurred by the VM. If this value is high and the CPU ready time is also high, it indicates that the VM has too many vCPUs. If low, then any performance problems should be attributed to other issues, and not to the co-scheduling of the VM's vCPUs.
Single key display settings
e – expand. Displays utilisation broken down by individual worlds belonging to a resource pool or VM. All %'s are for individual worlds of a single physical CPU.
u – Sort by %USED column.
r – Sort by %RDY.
n – Sort by GID. (default)
v – VM instances only.
l – Length of NAME column.
CPU clock frequency scaling
%USED – CPU usage with reference to the base core frequency, i.e. the actual CPU value in MHz.
%UTIL – CPU utilisation with reference to the current clock frequency. (displayed as %)
%RUN – Total CPU scheduled time, displayed as %. If using turbo boost this will show greater than 100%.
%UTIL may differ from %USED if turbo boost is enabled. To better define this: %USED is measured against the base core frequency, while %UTIL is measured against the current clock frequency. If a CPU has a base core clock speed of 1GHz and turbo boost runs it at 1.5GHz, a fully busy PCPU shows %USED as 150% while %UTIL shows 100%. Conversely, if power management scales the clock down to 500MHz, a fully busy PCPU shows %UTIL as 100% but %USED as only 50%. Consider this when monitoring these stats.
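The relationship can be sketched with a small function. This is a simplified model of the idea, an assumption rather than the exact esxtop arithmetic:

```python
# Simplified model: %UTIL is time-based (fraction of time unhalted), while
# %USED scales that time by the current vs. base clock frequency.

def pcpu_used(util_pct, current_mhz, base_mhz):
    """%USED for a PCPU that was unhalted util_pct of the time."""
    return util_pct * current_mhz / base_mhz

print(pcpu_used(100, 1500, 1000))  # fully busy with 1.5GHz turbo -> 150.0
print(pcpu_used(100, 500, 1000))   # fully busy but clocked down  -> 50.0
```

The same %UTIL reading can therefore correspond to very different %USED values depending on the clock frequency at the time.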
Interrupt panel (i)
VECTOR – Interrupt vector ID.
COUNT/s – Interrupts per second on CPU x
TIME/int – Average processing time per interrupt. (in micro seconds)
TIME_x – Average processing time on CPU. (in micro seconds)
DEVICES – Devices that use the interrupt vector. If the interrupt vector is not enabled, the name is shown in < > brackets.
The following counters and statistics assume a basic understanding of memory management in a virtualised environment. Check out my Understanding virtual machine memory guide for a brief overview of memory management.
Memory screen (m)
PMEM(MB) – Machine memory statistics.
Total – Yup you guessed it, total.
COS – Amount allocated to the service console.
VMK – Machine memory being used by the ESX(i) VMkernel.
Other – Everything else.
Free – Machine memory free.
VMKMEM(MB) – Statistics for the VMkernel in MB.
Managed – Total amount.
Min free – Minimum amount of machine memory VMKernel aims to keep free.
RSVD – Reserved by resource pools.
USVD – Total unreserved.
State – Values are high, soft, hard, low. (pressure states)
COSMEM(MB) – Statistics as reported by the service console.
Free – Amount of idle memory.
Swap_t – Total swap configured.
Swap_f – Swap free.
r/s – Rate at which memory is swapped in from disk.
w/s – Rate at which memory is swapped to disk.
NUMA(MB) – Only shown if running on a NUMA server.
PSHARE (MB) – Page sharing.
shared – Shared memory.
common – Across all worlds.
saving – Saved due to transparent page sharing.
SWAP(MB)
curr – Current.
target – What the ESX(i) system expects the swap usage to be.
r/s – swapped from disk.
w/s – swapped to disk.
MEMCTL(MB) – Balloon statistics.
curr – Amount reclaimed.
target – Amount the host attempts to reclaim using the balloon driver, vmmemctl.
max – Maximum amount the host can reclaim using vmmemctl.
Fields
AMIN – Memory reservation.
AMAX – Memory limit. A value of -1 means unlimited.
ASHRS – Memory shares.
NHN – Current home node for resource pool or VM. (NUMA only)
NRMEM (MB) – Current amount of remote memory allocated. (NUMA only)
N% L – Current % of memory allocated to the VM or resource pool that’s local.
MEMSZ (MB) – Amount of physical memory allocated to a resource pool or VM.
GRANT (MB) – Guest memory mapped.
SZTGT (MB) – Amount the VMkernel wants to allocate.
TCHD (MB) – Working set estimate.
%ACTV – % guest physical memory referenced by the guest.
%ACTVS – Slow moving version of the above.
%ACTVF – Fast moving.
%ACTVN – Estimation. (This is intended for VMware use only)
MCTL – Memory balloon drive installed or not. (Y/N)
MCTLSZ (MB) – Amount of physical memory reclaimed by ballooning.
MCTLTGT (MB) – Amount the host attempts to reclaim by ballooning.
MCTLMAX (MB) – Maximum that can be reclaimed by ballooning.
SWCUR (MB) – Current swap.
Interactive
m – Sort by group mapped column.
b – Sort by group Memch column.
n – Sort by group GID column. (default)
v – Display VM instances only.
l – Display length of the NAME column.
The network stats are arranged per port of a virtual switch.
PORT-ID identifies the port and DNAME shows the virtual switch name. UPLINK indicates whether the port is an uplink. If the port is an uplink, i.e., UPLINK is ‘Y’, USED-BY shows the physical NIC name. If the port is connected by a virtual NIC, i.e., UPLINK is ‘N’, USED-BY shows the port client name.
Network panel (n)
Fields
PORT-ID – Port ID.
UPLINK – Uplink enabled.(Y or N)
UP – Guess what.
SPEED – Link speed in Mb.
FDUPLX – Full duplex.
USED-BY – VM device port user.
DTYP – Virtual network device type. (H=hub, S=switch)
DNAME – Device name.
Interactive
T – Sort by Mb transmitted.
R – Sort by Mb received.
t – Packets transmitted.
r – Packets received.
N – Port-ID. (default)
L – Length of DNAME column.
Storage Panels
d – disk adapter.
u – disk device. (also includes NFS if ESX(i) host is 4.0 Update 2 or later)
v – disk VM.
An I/O request from an application in a virtual machine traverses through multiple levels of queues, each associated with a resource of some sort, whether that is the guest OS, the VMkernel or the physical storage. Each queue has an associated latency.
Esxtop shows the storage statistics in three different screens; the adapter screen, device screen and vm screen.
By default data is rolled up to the highest level possible for each screen. On the adapter screen the statistics are aggregated per storage adapter by default, but they can be expanded to display data per storage channel, target, path or world using a LUN.
On the device screen statistics are aggregated per storage device by default and on the VM screen, statistics are aggregated on a per-group basis. One VM has one corresponding group, so they are equivalent to per-vm statistics. Use interactive command V to show only statistics related to VMs.
AQLEN – The storage adapter queue depth.
LQLEN – The LUN queue depth.
WQLEN – The World queue depth.
ACTV – The number of commands in the ESX Server VMkernel that are currently active.
QUED – The number of commands queued.
LOAD – The ratio of the sum of VMKernel active commands and VMKernel queued commands to the queue depth.
%USD – The percentage of queue depth used by ESX Server VMKernel active commands.
%USD = ACTV / QLEN * 100%
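Both queue formulas can be checked with a quick sketch (the numbers are invented):

```python
# The two queue formulas described above, with invented values.

def usd_pct(actv, qlen):
    """%USD = ACTV / QLEN * 100"""
    return actv / qlen * 100.0

def load(actv, qued, qlen):
    """LOAD = (active + queued commands) / queue depth; > 1 means queuing."""
    return (actv + qued) / qlen

print(usd_pct(16, 32))   # -> 50.0
print(load(16, 16, 32))  # -> 1.0 (queue depth exactly saturated)
```

A LOAD above 1 means commands are waiting in the queue, which is when the QUED counter starts climbing.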
I/O throughput statistics
CMDS/s – Number of commands issued per second.
READS/s – Number of read commands issued per second.
WRITES/s – Number of write commands issued per second.
MBREAD/s – MB read per second.
MBWRTN/s – MB written per second.
I/O latencies are measured per SCSI command, so they are not affected by the refresh interval. Reported latencies are average values for all the SCSI commands issued within the refresh interval window. Reported average latencies can differ between screens (adapter, LUN, VM), since each screen accounts for a different group of I/Os.
This group of counters report latency values. These are under the labels GAVG, KAVG and DAVG. GAVG is the sum of DAVG and KAVG.
GAVG – Round-trip latency that the guest sees for all I/O requests sent to the virtual storage device. (should be under 25)
KAVG – Latencies due to the ESX kernel's command processing; should be small in comparison to DAVG.
DAVG – Latency seen at the device driver level; includes the round trip between the HBA and the storage. (should be 2 or less)
QAVG – Average queue latency. QAVG is part of KAVG. (should be zero)
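The relationships between these counters can be sketched with invented millisecond values:

```python
# The latency relationships described above, with invented millisecond
# values: GAVG is the sum of KAVG and DAVG, and QAVG is contained in KAVG.

davg = 10.0          # device round trip (HBA <-> array)
qavg = 0.4           # queueing portion of the kernel time
kavg = qavg + 0.1    # total time spent in the VMkernel
gavg = davg + kavg   # what the guest observes

assert qavg <= kavg
print(round(gavg, 1))  # guest-observed latency -> 10.5
```

In practice this means a large gap between GAVG and DAVG points at the VMkernel (KAVG, and within it QAVG) rather than the storage array.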
CID – Channel ID.
TID – Target ID.
LID – LUN ID.
Interactive
e – Expand/rollup storage adapter statistics.
p – Same as e but doesn't roll up to adapter statistics.
a – Expand/rollup storage channel statistics.
t – Expand/rollup storage target statistics.
r – Sort by READ/s.
w – Sort by WRITES/s.
R – Sort by MBREADS/s.
T – Sort by MBWRTN/s.
This guide is written with ESXi 4.1 Update 1 in mind; however, it will work with any version from 3.5 onwards.
First off you will require the vSphere CLI. This is a free download available to everyone with a valid VMware login; if you don’t have one you can easily register for a new one. Download it from the VMware website.
Power off all VMs or vMotion them to another host, then place the host in maintenance mode. (Right-click on the host and select Enter Maintenance Mode.)
The upgrade package contains two update bulletins: the esxupdate bulletin and the upgrade bulletin. Both need to be installed by running the commands below on the computer with the vSphere CLI installed.
Ensure these commands are run from this directory: C:\Program Files\VMware\VMware vSphere CLI\bin>
vihostupdate.pl --server <hostname or IP address> -i -b <patch location and zip file name> -B ESXi410-GA-esxupdate
When prompted, enter the root username and password.
vihostupdate.pl --server <hostname or IP address> -i -b <patch location and zip file name> ESXi410-GA
If following the vSphere upgrade guide you may notice that this last command fails with this error message
No matching bulletin or VIB was found in the metadata. No Bulletin or VIB found with ID ‘ESXi410-GA’.
This is because it has an extra -B in it. If you run the command listed above it will work.
Finally type the following to confirm successful installation.
vihostupdate.pl --server <hostname or IP address> --query
Reboot the host to complete the installation. Don’t forget to take it out of maintenance mode!
VMware View 4.6 is out, and with it come new features. A full list of improvements is available here.
In the words of VMware, VMware View is the leading desktop virtualisation solution. It provides a virtualised desktop infrastructure that can leverage existing virtual infrastructure to deliver a cost-effective, centrally managed desktop deployment.
VMware View offers the ability for desktop administrators to virtualize the operating system, applications, and user data and deliver modern desktops to end-users.
VMware View Manager is an enterprise-class virtual desktop manager, and a critical component of VMware View.
IT administrators use VMware View Manager as a central point of control for providing end users with secure, flexible access to their virtual desktops and applications, leveraging tight integration with VMware vSphere to help customers deliver desktops as a secure, managed service. Extremely scalable and robust, a single instance of VMware View Manager can broker and monitor tens of thousands of virtual desktops at once, using the intuitive web-based administrative interface for creating and updating desktop images, managing user data, enforcing global policies, and more.
Ok, so that’s the official description, but how does it all fit together?
VMware View is made up of the following core components.
View Manager Components
VMware View Connection Server – Manages secure access to virtual desktops and works with VMware vCenter Server to provide advanced management capabilities.
VMware View Agent – Provides session management and single sign-on capabilities.
VMware View Client – Enables end users on PCs and thin clients to connect to their virtual desktops through the VMware View Connection Server. Use View Client with Local Mode to access virtual desktops even when disconnected, without compromising on IT policies.
VMware vCenter Server with View Composer – Enables administrators to make configuration settings, manage virtual desktops, and set desktop entitlements and application assignments.
View Transfer Server – Transfers desktops to client PCs and laptops for offline mode.
View Security Server – A View Security Server (in a DMZ) is also an option; it allows RDP and PCoIP connections over the WAN.
This diagram from the VMware Visio templates depicts a typical View deployment, taking advantage of View Linked Clones with Offline Mode, ThinApp and PCoIP.
vCenter Server – View Composer installed (cannot use IIS or be a domain controller)
View Connection server, preferably two (cannot have any other View roles, use IIS or be a domain controller)
View transfer server for Linked-Clones with Offline Mode (Cannot have any other roles. Can be a physical server)
Database server for events and View Composer database
Optional View Security Server for WAN RDP and PCoIP connectivity
View Composer is installed on the vCenter Server. It provides storage-saving linked clones, rapid desktop deployment, quick updates, patch management and tiered storage options.
View Composer can utilise QuickPrep or Sysprep, system automation tools for creating unique operating system instances in Microsoft Active Directory.
Changes to the master images can be sent out to all linked clones by running a recompose operation. Running a refresh operation on a linked clone synchronises it with the master image.
This is useful if users are experiencing issues with their linked clone, as it is a way of setting the clone back to default.
Each linked-clone user can have their own persistent data disk containing all of their unique user data, documents and settings.
Linked-Clones with Offline Mode
A linked clone is made from a snapshot of the parent. All files available on the parent at the moment of the snapshot continue to remain available to the linked clone. On-going changes to the virtual disk of the parent do not affect the linked clone, and changes to the disk of the linked clone do not affect the parent. This provides a secure master template machine that can be used to create additional clones.
A linked clone must have access to the parent. Without access to the parent, a linked clone is disabled.
Offline mode allows users to check out the desktop and use it on a PC or laptop, for instance when travelling on a train and then check it back in and synchronise the changes when returning to the office.
ThinApp simplifies application delivery by encapsulating applications in portable packages that can be deployed to many end point devices while isolating applications from each other and the underlying operating system.
ThinApp virtualizes applications by encapsulating application files and registry into a single ThinApp package that can be deployed, managed and updated independently from the underlying operating system (OS). The virtualized applications do not make any changes to the underlying OS and continue to behave the same across different configurations for compatibility, consistent end-user experiences, and ease of management.
PCoIP supports WAN connections with as little as 100kbps of peak bandwidth and up to 250ms of latency; however, I recommend a minimum 1Mbps upload speed across the WAN with less than 150ms of latency.
The average bandwidth of a PCoIP session for an active office worker is typically in the 80–150kbps range, and drops to nearly zero when the session is idle.
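For rough capacity planning, the per-session figures above can be turned into an aggregate WAN estimate. This is an illustrative sketch only; it pessimistically assumes every user is active simultaneously at the top of the quoted range:

```python
def wan_bandwidth_mbps(users, per_user_kbps=150):
    """Aggregate PCoIP bandwidth if all users are active at once.
    150kbps is the top of the 80-150kbps active-worker range above."""
    return users * per_user_kbps / 1000.0

# 100 concurrent office workers at the worst-case per-session rate.
print(wan_bandwidth_mbps(100))  # 15.0 Mbps
```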
It is recommended that the infrastructure uses an offload card, as PCoIP rendering is fairly resource intensive on the hosting server.
A PCoIP security gateway removes the need for a VPN connection. This became available in the latest VMware View 4.6 release.
Modern thin-client devices, such as the zero clients from Wyse, are designed specifically for connecting to a virtual desktop environment. These devices support PCoIP out of the box, with no major configuration required to connect them to the virtual desktop infrastructure.
vShield Endpoint provides an API to allow third party anti-virus vendors a way of scanning machines at the Hypervisor level, rather than at the individual virtual machine level, removing unnecessary load from the individual clients.
In the future this will be the standard way anti-virus scanning is performed in virtual desktop infrastructure, and in server infrastructure too. The only current offering is from Trend Micro and is limited to scanning 15 machines per virtual appliance, but future offerings from other providers may support more virtual machines.
vShield Endpoint is included in the cost of VMware View Premier.
ThinPrint allows a view client to utilise the print devices installed on the connecting client machine so that a user can seamlessly print to their default local printer without having to install any drivers.
VMware View is available using two licensing models, Enterprise and Premier. The differences between the two are illustrated in the table below.
Windows 7 requires a KMS server for activation when automatically provisioning desktops. This can be a Windows Server 2003, 2008 or 2008 R2 server; however, the following caveats apply.
A KMS host must have at least 5 servers checked in before server activation occurs, or 25 Windows 7 or Vista machines checked in before client activation occurs.
Windows Server 2008 is not supported as a KMS host to activate Windows 7 and Office 2010.
A patch is available to allow activation of Windows 7 client machines. (A Windows Server 2008 R2 KMS key is required.)
A patch is not available to allow activation of Office 2010 clients.
Hardware requirements will vary depending on individual circumstances; however, the figures below can be used as a ballpark guideline.
A View infrastructure to support 30 or 100 users will require the following core components.
30 users – Two ESXi hosts minimum, ideally three: two for workstations and one for servers (an existing virtual infrastructure will do for the servers). Approx. 32GB RAM, dual core, for 30 VMs.
100 users – Four ESXi hosts: three for workstations and one for servers. Approx. 48GB RAM, dual core, an R710, for 35 VMs.
To leverage the advanced VMware features HA and DRS, shared central storage is required.
This can be achieved using a storage area network. (SAN)
Windows virtual machines require more memory with each passing release, and software demands on memory are growing all the time. In a virtual environment it is quite simple to increase the amount of memory granted to a virtual machine, especially with features such as hot add: the ability to dynamically increase the amount of memory available to a virtual machine while it is still powered on.
Hot add has a few requirements.
From a licensing point of view, you need at least vSphere Advanced, Enterprise or Enterprise Plus.
The chart below illustrates the Windows Operating Systems that support Hot Add.
* Server must be rebooted for memory to be registered in Windows.
Adding additional RAM to a virtual machine is so easy these days that it is important not to overlook the effect that increasing virtual machine memory has on the host and on the virtual machine datastore.
Every Gigabyte of virtual machine memory demands more memory overhead from the ESX host. The overhead on the host is illustrated in the table below.
As you can see, the total amount of RAM the host requires to power your current and future virtual machines can increase dramatically when you start adding more RAM to them, especially when also adding more vCPUs. This is also true of the multi-core vCPUs in vSphere 4.1, as explained by Jason Boche.
To put this into perspective, if you have an ESX host with two virtual machines running 2 vCPUs and 4GB RAM each, you would need to allow at least 10.1GB of RAM.
Depending on their load, the virtual machines may require the full 4GB assigned to them, leaving no additional memory available; this is not something I would recommend. I would instead assign at least 12GB to this configuration, probably more, so that there is room for expansion at a later date or for adding additional virtual machines.
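The arithmetic above can be sketched as follows. The per-VM overhead and hypervisor reservation figures used here are assumptions chosen to reproduce the 10.1GB example, not values taken from the overhead table:

```python
def host_ram_gb(vm_count, ram_per_vm_gb, overhead_per_vm_gb, hypervisor_gb):
    """Minimum host RAM: configured VM memory plus per-VM overhead,
    plus memory reserved for the hypervisor itself. Overhead grows with
    both configured RAM and vCPU count."""
    return vm_count * (ram_per_vm_gb + overhead_per_vm_gb) + hypervisor_gb

# Illustrative only: two 2-vCPU/4GB VMs, with an assumed ~0.3GB overhead
# each and ~1.5GB reserved for ESX, matches the 10.1GB quoted above.
print(host_ram_gb(2, 4, 0.3, 1.5))  # 10.1
```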
Storage is another consideration when increasing virtual machine memory. The virtual machine swap file (.vswp) is equal in size to the virtual machine's RAM. Ensure this is taken into account when deciding what size datastore is required to store the virtual machines. The alternative is to specify a separate, fast-access datastore for storing the virtual machine swap files; if possible, use SSD drives.
This can be set by following the instructions for Storing a virtual machine swap file in a location other than the default on the VMware website.
With careful planning, running out of memory or storage should never happen if you follow VMware best practices for creating virtual machines. More information can be found in the vSphere 4.1 Performance Best Practices guide.
Additional information on VMFS datastore storage allocation can be found here
As technology progresses, storage requirements grow. It seems to be a never-ending pattern. I remember only a few years ago the maximum configurable LUN size of 2TB seemed huge. Now it is common for many LUN carvings to make up tens of terabytes of SAN storage.
The downside to all this extra storage is the demand for larger virtual machine disks; before you know it, the VMFS datastores fill up.
This is something we are all aware of, and something we can avoid with enough planning done ahead of time. (Preventing the datastore filling up, not stopping the demand for more space, that is!)
Before adding any additional virtual machine drives it is important to ensure that enough free space is available for the virtual machines already setup.
To calculate the minimum free space required, use the following formula, courtesy of ProfessionalVMware.
(Total virtual machine VMDK disk sizes + (Total virtual machine RAM sizes × 1.1)) × 1.1 + 12GB
This formula gives the minimum size the VMFS datastore needs to be. Once you have worked that out, deduct it from the total capacity of the datastore to see how much space can be used for additional drives, rather than just adding disks until the vSphere Server complains it is running out of free space.
This allows enough for the local installation of ESX(i), an additional 10% for snapshots, plus an additional 10% for overhead. (12GB for an ESXi install is a little excessive, but I would still recommend leaving this much space, as it will be needed before you know it.)
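The formula can be sketched in Python. The function mirrors the grouping of the formula exactly as written above; the sample inputs are invented:

```python
def min_datastore_gb(total_vmdk_gb, total_vm_ram_gb):
    """Minimum VMFS datastore size (ProfessionalVMware formula quoted above):
    VMDK sizes plus swap files (equal to VM RAM, with 10% margin), then
    10% for snapshots/overhead, plus 12GB for a local ESX(i) install."""
    return (total_vmdk_gb + total_vm_ram_gb * 1.1) * 1.1 + 12

# Illustrative only: 500GB of VMDKs and 64GB of total VM RAM.
print(round(min_datastore_gb(500, 64), 1))  # 639.4
```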
This formula can prove useful when planning how much storage space is required for a P2V migration; it helps you manage expectations, so that you are fully aware from the beginning of how much free space you have available in the VMFS datastore.
This is a recommended minimum; you may need to leave more free space depending on your requirements. ISO files, templates etc. will also need to be taken into account.
Following the calculations, you may find that the requirement for free space has been met but you are still getting alarms in the vSphere Client saying you are running out of free space.
The alarms within the vSphere Client are set to send a warning when 75% of the datastore is in use, and an error when 85% is in use.
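These default thresholds can be expressed as a small helper. This is a sketch; the function name and return labels are my own, not part of the vSphere API:

```python
def datastore_alarm(capacity_gb, used_gb, warn=0.75, error=0.85):
    """Mirror the default vSphere datastore-usage alarm thresholds:
    warning at 75% used, error at 85% used."""
    usage = used_gb / capacity_gb
    if usage >= error:
        return "error"
    if usage >= warn:
        return "warning"
    return "ok"

# A 1TB datastore with 800GB used trips the 75% warning threshold.
print(datastore_alarm(1000, 800))  # warning
```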
These thresholds can be adjusted if required by selecting the top-level object and opening the Alarms tab within the vSphere Client.