Tag : esx

Esxtop Guide

Someone once likened esxtop to Windows Task Manager on steroids. When you look at the monitoring options available you can see why.

Esxtop is based on the ‘nix top system tool, used to monitor running applications, services and system processes. It is an interactive task manager that can be used to monitor the finest-grained performance metrics of the ESX(i) host.

Esxtop works on ESX, and its cousin remote esxtop (resxtop) is for ESXi, although resxtop will work against ESX as well.

To run esxtop from ESX all you need to do is type esxtop at the command prompt.
To run resxtop you need to set up a vMA appliance and then run resxtop --server <hostname>, entering the username and password when prompted. You don’t need root access credentials to view resxtop counters; you can use vCenter Server credentials.

The commands for esxtop and resxtop are basically the same so I will describe them as one.

Monitoring with resxtop collects more data than monitoring with esxtop, so CPU usage may be higher. This doesn’t mean that esxtop isn’t resource intensive, quite the opposite. It can use a lot of CPU cycles when monitoring a large environment, so if possible limit the number of fields displayed. Fields are also known as columns; entities are rows. You can also limit CPU consumption by restricting the display to a single entity using the l (lock) command.

Esxtop uses worlds and groups as the entities to show CPU usage. A world is an ESX Server VMkernel schedulable entity, similar to a process or thread in other operating systems. It can represent a virtual machine, a component of the VMkernel or the service console.  A group contains multiple worlds.

Counter values
esxtop doesn’t create performance metrics; it derives them from raw counters exported in the VMkernel system info nodes (VSI nodes). Esxtop can show new counters on older ESX systems if the raw counters are present in the VMkernel (e.g. on a 4.0 host it can display 4.1 counters).
Many raw counters have static values that don’t change with time, while many others increment monotonically; esxtop reports the delta for these counters over a given refresh interval. For instance the counters for CMDS/sec, packets transmitted/sec and READS/sec display information captured every second.

Counter normalisation
By default counters are displayed for the group. In group view the counters are cumulative, whereas in expanded view counters are normalised per entity. Because of the cumulative stats, the percentages displayed can often exceed 100%. To view expanded stats, press e.

esxtop takes snapshots, every 5 seconds by default. To display a metric it takes two snapshots and reports the difference between them. The minimum snapshot interval is 2 seconds. Metrics such as %USED and %RUN show the CPU occupancy delta between successive snapshots.
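As a sketch (not esxtop source code), this is how a per-second rate such as CMDS/s is derived from two snapshots of a monotonically increasing raw counter; the function and figures are illustrative only.

```python
def rate_per_second(prev_count, curr_count, interval_s):
    """Delta between two counter snapshots divided by the refresh interval."""
    if interval_s < 2:  # esxtop's minimum snapshot interval is 2 seconds
        raise ValueError("minimum snapshot interval is 2 seconds")
    return (curr_count - prev_count) / interval_s

# e.g. 12,000 commands at the first snapshot, 13,500 five seconds later
print(rate_per_second(12000, 13500, 5))  # 300.0 -> reported as 300 CMDS/s
```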

The manual for (r)esxtop is full of useful examples, and unlike normal commands run on ESXi it will work if you run it from the vMA in the same way that it does from ESX.

To use it, just type man esxtop or man resxtop.

Commands for (r)esxtop can be split into two distinct types: running commands and interactive commands.

Running commands
These are commands that get placed at the end of the initial call of (r)esxtop.
#esxtop -d 5

This command would set a delay on the refresh rate of 5 seconds.

Running commands examples
-d – sets delay.
-o – sets order of Columns.
-b – batch mode – I will explain this in further detail below.

Interactive commands
These are commands that work on a key press once (r)esxtop is up and running. If in any doubt about which command to use, just hit h for help.

Interactive commands examples
c – CPU resource utilisation.
m – memory resource utilisation.
d – storage (disk) adapter resource utilisation.
u – storage device resource utilisation.
v – storage VM resource utilisation.
f – displays a panel for adding or removing statistics columns to or from the current panel.
n – network resource utilisation.
h – help.
o – displays a panel for changing the order of statistics.
i – interrupt resource utilisation.
p – power resource utilisation.
q – quit.
s – delay of updates in seconds. (can use fractional numbers)
W – write current setup to the config file. (.esxtop4rc)

From the field order panel, accessible by pressing o, you can move a field to the left by pressing the corresponding uppercase letter and to the right by pressing the corresponding lowercase letter.  The currently active fields are shown in uppercase letters, and with an asterisk to show they are selected in the field order selection screen.

Batch mode (-b)
Batch mode is used to create a CSV file which is compatible with Microsoft perfmon and ESXplot. For reading performance files ESXplot is quicker than perfmon. CSV compatibility requires a fixed number of columns on every row; for this reason, statistics of VM (world) instances that appear after batch mode has started are not collected. Only counters that are specified in the configuration file are collected, unless the -a option is used to collect all counters. Counters are named slightly differently to be compatible with perfmon.
To use batch mode, select the columns you want in interactive mode, save with W, then run
#esxtop -b -n 10 > esxtopfilename.csv

-a – all.
-b – Batch mode.
-c – User defined configuration file.
-d – Delay between statistics snapshots. (minimum 2 seconds)
-n – Number of iterations before exiting.

One way to read the batch mode output file is to load it in Windows perfmon:
(1) Run perfmon
(2) Type “Ctrl + L” to view log data
(3) Add the file to the “Log files” and click OK
(4) Choose the counters to show the performance data.
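If you would rather process the CSV yourself, a hypothetical sketch follows. Real esxtop batch headers are long perfmon-style paths such as "\\host\Physical Cpu(0)\% Util Time"; the sample data below is simplified for illustration.

```python
import csv
import io

def load_batch_csv(f):
    """Return one dict per sample row, keyed by the CSV column headers."""
    return list(csv.DictReader(f))

# With a real capture: rows = load_batch_csv(open("esxtopfilename.csv", newline=""))
sample = io.StringIO("Time,CPU Util\n10:00:00,42.5\n10:00:05,43.0\n")
rows = load_batch_csv(sample)
print(rows[1]["CPU Util"])  # 43.0
```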

Esxtop reads its default configuration from a .esxtop4rc file. This file contains 8 lines. The first 7 lines hold upper and lowercase letters specifying which fields appear in the CPU, memory, storage adapter, storage device, virtual machine storage, network and interrupt panels. The 8th line contains other options. You can save configuration files to change the default view within esxtop.

Replay mode
Replay mode interprets data collected by issuing the vm-support command and plays the information back as esxtop statistics.  Replay mode accepts interactive commands until no more snapshots collected by vm-support remain.  Replay mode does not process the output of batch mode.

To use replay mode run
#vm-support -S -i 5 -d 60

untar the file using
#tar -xf /root/esx*.tgz

then run
#esxtop -R /root/vm-support*

-S – Snapshot mode, prompts for the delay between updates, in seconds.
-R – Path to the vm-support collected snapshot’s directory.

A CPU load average of 1.00 means full utilisation of all CPUs. A load of 2.00 means the host requires twice as many physical CPUs as are currently available; likewise 0.50 means half are being utilised.
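The load average line can be read as a ratio of CPU demand to capacity. A small illustrative helper (the wording and thresholds here are mine, not esxtop's):

```python
def describe_load(load_avg):
    """Interpret an esxtop CPU load average (1.00 = all PCPUs fully used)."""
    if load_avg < 1.0:
        return "host CPUs are %d%% utilised" % round(load_avg * 100)
    return "demand is %.1fx available physical CPU" % load_avg

print(describe_load(0.50))  # host CPUs are 50% utilised
print(describe_load(2.00))  # demand is 2.0x available physical CPU
```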

CPU Screen
PCPU USED(%) – Physical hardware execution context. A PCPU can be a physical CPU core if hyperthreading is unavailable or disabled, or a logical CPU (SMT thread) if hyperthreading is enabled. This displays the percentage of CPU usage per PCPU, and the average over all PCPUs.
PCPU UTIL(%) – Physical CPU utilised. (real time) Indicates how much time the PCPU was busy, in an unhalted state, in the last snapshot duration. Might differ from PCPU USED(%) due to power management technologies or hyperthreading.

If hyper threading is enabled these figures can be different, likewise if the frequency of the PCPU is changed due to power management these figures can also be adjusted.

As an example, if PCPU USED(%) is 100 and PCPU UTIL(%) is 50, this is because hyperthreading is splitting the load across the two logical PCPUs. If you then look in the vSphere client you may notice that CPU usage is 100%; this is because the vSphere client doubles the statistics if hyperthreading is enabled.
In a dual core system, each PCPU is charged by the CPU scheduler half of the elapsed time when both PCPUs are busy.

CCPU(%) – Total CPU time as reported by ESX service console. (Not applicable to ESXi)
us – Percentage user time.
sy – Percentage system time.
id – Percentage idle time.
wa – Percentage wait time.
cs/sec – Context switches per second recorded by the ESX Service Console.

CPU panel statistics (c)
ID – Resource pool ID or VM ID of the running world’s resource pool or VM, or world ID of the running world.
GID – Resource pool ID of the running world’s resource pool or VM.
NAME – Name of the running world’s resource pool or VM.
NWLD – Number of members in the running world’s resource pool or VM.
%USED – CPU core cycles used.
%RUN – CPU scheduled time.
%SYS – Time spent in the ESX(i) VMkernel on behalf of the resource pool, VM or world to process interrupts and perform other system activities.
%WAIT – Time spent in the blocked or busy wait state.
%RDY – Time the world was ready to run but was waiting for the CPU scheduler to give it processor time.

High %RDY and high %USED can imply CPU overcommitment.

Additional fields
%IDLE – As it says. Subtract this from %WAIT to see the time spent waiting for an event; %WAIT - %IDLE can be used to estimate guest I/O wait time.
%MLMTD (max limited) – Time the VMkernel deliberately didn’t run the world because that would violate the limit settings on the resource pool, VM or world.
%SWPWT – Time spent waiting for swapped memory.

CPU ALLOC – CPU allocation.  Set of CPU statistics made up of the following. (For a world the % are the % of one physical CPU core)
AMIN – Attribute reservation.
AMAX – Attribute limit.
ASHRS – Attribute shares.

SUMMARY STATS – Only applies to worlds.
CPU – Which physical or logical CPU the world was running on when esxtop took the snapshot.
HTQ – Indicates whether a world is currently quarantined or not. (Y or N)
TIMER/s – Timer rate for this world.
%OVRLP – Time spent on behalf of a different resource pool/VM or world while the local was scheduled. Not included in %SYS.
%CSTP – Time the vCPUs of a VM spent in the co-stopped state, waiting to be co-started. This gives an indication of the co-scheduling overhead incurred by the VM. If this value is high and the CPU ready time is also high, it suggests the VM has too many vCPUs. If low, any performance problems should be attributed to other issues, not to the co-scheduling of the VM’s vCPUs.

Single key display settings
e – expand. Displays utilisation broken down by the individual worlds belonging to a resource pool or VM. All percentages are for the individual worlds, as a percentage of a single physical CPU.
u – Sort by %USED column.
r – Sort by %RDY.
n – Sort by GID. (default)
v – VM instances only.
l – Length of NAME column.

CPU clock frequency scaling
%USED – CPU usage with reference to the base core frequency, i.e. the actual CPU value in Mhz.
%UTIL – CPU utilisation with reference to the current clock frequency. (displayed as %)
%RUN – Total CPU scheduled time. Displayed as %. If using turbo boost will show greater than 100%.

%UTIL may differ from %USED if turbo boost is enabled. To illustrate: if a CPU has a base core clock speed of 1Ghz and turbo boost raises it to 1.5Ghz, a fully busy core shows %USED of 150% (measured against the base frequency) while %UTIL shows 100% (measured against the current frequency). If that core then does only 1Ghz of work while still running at 1.5Ghz, %USED is 100% but %UTIL is roughly 67%, not 100%. Consider this when monitoring these stats.
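The example above can be worked through in code. The two helpers simply restate the definitions (%USED against the base frequency, %UTIL against the current frequency); the MHz figures are the assumed ones from the example.

```python
def pcpu_used_pct(used_mhz, base_mhz):
    """%USED: work done, measured against the base core frequency."""
    return used_mhz / base_mhz * 100

def pcpu_util_pct(used_mhz, current_mhz):
    """%UTIL: work done, measured against the current (possibly boosted) frequency."""
    return used_mhz / current_mhz * 100

# 1 GHz base core, boosted to 1.5 GHz, fully busy:
print(pcpu_used_pct(1500, 1000))         # 150.0 -> %USED shows 150%
print(pcpu_util_pct(1500, 1500))         # 100.0 -> %UTIL shows 100%

# Same core doing only 1 GHz of work while running at 1.5 GHz:
print(pcpu_used_pct(1000, 1000))         # 100.0
print(round(pcpu_util_pct(1000, 1500)))  # 67
```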

Interrupt panel (i)
VECTOR – Interrupt vector ID.
COUNT/s – Interrupts per second on CPU x
TIME/int – Average processing time per interrupt. (in micro seconds)
TIME_x – Average processing time on CPU. (in micro seconds)
DEVICES – Devices that use the interrupt vector. If the interrupt vector is not enabled for the device, its name is shown in angle brackets < >.


The following counters and statistics assume a basic understanding of memory management in a virtualised environment.  Check out my Understanding virtual machine memory guide for a brief overview of memory management.

Memory screen (m)
PMEM(MB) – Machine memory statistics for the host.
Total – Yup you guessed it, total machine memory.
COS – Amount allocated to the service console.
VMK – Machine memory being used by the ESX(i) VMkernel.
Other – Everything else.
Free – Machine memory free
VMKMEM(MB) – Statistics for VMkernel in MB.
Managed – Total amount.
Min free – Minimum amount of machine memory VMKernel aims to keep free.
RSVD – Reserved by resource pools.
USVD – Total unreserved.
State – Values are high, soft, hard, low. (Pressure states)
COSMEM(MB) – Statistics as reported by the service console.
Free – Amount of idle memory.
Swap_t – Total swap configured.
Swap_f – Swap free.
r/s – Rate at which memory is swapped in from disk.
w/s – Rate at which memory is swapped to disk.
NUMA(MB) – Only if running on a NUMA server.
PSHARE (MB) – Page sharing.
shared – Shared memory.
common – Across all worlds.
saving – Saved due to transparent page sharing.
SWAP(MB) – Swap usage statistics.
curr – Current swap usage.
target – What the ESX(i) system expects the swap usage to be.
r/s – swapped from disk.
w/s – swapped to disk.
MEM CTL(MB) – Balloon statistics.
curr – Amount reclaimed.
target – Amount the host attempts to reclaim using the balloon driver, vmmemctl.
max – Maximum amount the host can reclaim using vmmemctl.


MEM ALLOC – Memory allocation. Set of memory statistics made up of the following.
AMIN – Memory reservation.
AMAX – Memory limit. A value if -1 means unlimited.
ASHRS – Memory shares.
NHN – Current home node for resource pool or VM. (NUMA only)
NRMEM (MB) – Current amount of remote memory allocated. (NUMA only)
N% L – Current % of memory allocated to the VM or resource pool that’s local.
MEMSZ (MB) – Amount of physical memory allocated to a resource pool or VM.
GRANT (MB) – Guest memory mapped.
SZTGT (MB) – Amount the VMkernel wants to allocate.
TCHD (MB) – Working set estimate.
%ACTV – % guest physical memory referenced by the guest.
%ACTVS – Slow moving version of the above.
%ACTVF – Fast moving.
%ACTVN – Estimation. (This is intended for VMware use only)
MCTL – Memory balloon driver installed or not. (Y/N)
MCTLSZ (MB) – Amount of physical memory reclaimed by ballooning.
MCTLTGT (MB) – Amount of physical memory the host attempts to reclaim by ballooning.
MCTLMAX (MB) – Maximum that can be reclaimed by ballooning.
SWCUR (MB) – Current swap.

m – Sort by group mapped column.
b – Sort by group Memctl column.
n – sort by group GID column. (Default)
v – Display VM instances only.
l – Display length of the NAME column.


The network stats are arranged per port of a virtual switch.

PORT-ID identifies the port and DNAME shows the virtual switch name. UPLINK indicates whether the port is an uplink. If the port is an uplink, i.e., UPLINK is ‘Y’, USED-BY shows the physical NIC name. If the port is connected by a virtual NIC, i.e., UPLINK is ‘N’, USED-BY shows the port client name.

Network panel (n)
PORT-ID – Port ID.
UPLINK – Uplink enabled.(Y or N)
UP – Whether the link is up. (Y or N)
SPEED – Link speed in megabits per second.
FDUPLX – Full duplex.
USED-BY – The port user: the physical NIC name if the port is an uplink, otherwise the port client name, e.g. the VM.
DTYP – Virtual network device type. (H=hub, S=switch)
DNAME – Device name.

T – Sort by Mb transmitted.
R – Sort by Mb received.
t – Sort by packets transmitted.
r – Sort by packets received.
N – Sort by PORT-ID. (default)
L – Length of DNAME column.


Storage Panels
d – disk adapter.
u – disk device. (also includes NFS if the ESX(i) host is 4.0 Update 2 or later)
v – disk VM.

An I/O request from an application in a virtual machine traverses through multiple levels of queues, each associated with a resource of some sort, whether that is the guest OS, the VMkernel or the physical storage.  Each queue has an associated latency.
Esxtop shows the storage statistics in three different screens; the adapter screen, device screen and vm screen.
By default data is rolled up to the highest level possible for each screen.  On the adapter screen the statistics are aggregated per storage adapter by default, but they can be expanded to display data per storage channel, target, path or world using a LUN.
On the device screen statistics are aggregated per storage device by default, and on the VM screen statistics are aggregated on a per-group basis.  One VM has one corresponding group, so these are equivalent to per-VM statistics. Use the interactive command V to show only statistics related to VMs.

Queue Statistics
AQLEN – The storage adapter queue depth.
LQLEN – The LUN queue depth.
WQLEN – The World queue depth.
ACTV – The number of commands in the ESX Server VMkernel that are currently active.
QUED – The number of commands queued.
LOAD – The ratio of the sum of VMKernel active commands and VMKernel queued commands to the queue depth.
%USD – The percentage of queue depth used by ESX Server VMKernel active commands.

%USD = ACTV / QLEN * 100%
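The %USD and LOAD definitions above can be sketched as two small helpers; QLEN here stands for AQLEN, LQLEN or WQLEN depending on which line of the storage screen you are reading, and the numbers are made-up inputs.

```python
def pct_used(actv, qlen):
    """%USD: percentage of the queue depth occupied by active commands."""
    return actv / qlen * 100

def load_ratio(actv, qued, qlen):
    """LOAD: (active + queued) / queue depth; above 1.0 commands are queueing."""
    return (actv + qued) / qlen

print(pct_used(16, 32))        # 50.0
print(load_ratio(16, 16, 32))  # 1.0
```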

I/O throughput statistics
CMDS/s – Number of commands issued per second.
READS/s – Number of read commands issued per second.
WRITES/s – Number of write commands issued per second.
MBREAD/s – MB reads per second.
MBWRTN/s – MB written per second.

I/O latencies
I/O latencies are measured per SCSI command, so they are not affected by the refresh interval. Reported latencies are average values for all the SCSI commands issued within the refresh interval window. Reported average latencies can differ between screens (adapter, LUN, VM) since each screen accounts for a different group of I/Os.

Latency statistics
This group of counters report latency values.  These are under the labels GAVG, KAVG and DAVG.  GAVG is the sum of DAVG and KAVG.

GAVG – Round-trip latency that the guest sees for all I/O requests sent to the virtual storage device. (should be under 25 ms)
KAVG – Latency added by the ESX VMkernel. Should be small in comparison to DAVG. (should be 2 ms or less)
DAVG – Latency seen at the device driver level. Includes the round-trip time between the HBA and the storage.
QAVG – Average queue latency. QAVG is part of KAVG. (should be zero)

Storage adapter
CID – Channel ID.
TID – Target ID.

e – Expand/rollup storage adapter statistics.
p – Same as e but doesn’t roll up to adapter statistics.
a – Expand/rollup storage channel statistics.
t – Expand/rollup storage target statistics.
r – Sort by READS/s.
w – Sort by WRITES/s.
R – sort by MBREADS/s.
T – Sort by MBWRTN/s.


Further information can be found at Duncan Eppings blog and in the excellent VMware communities post on Interpreting esxtop Statistics.

vMotion CPU Compatibility

vMotion has quite a few requirements that need to be in place before it will work correctly. Here is a list of the key requirements for vMotion to work.

  • Each host must be correctly licensed
  • Each host must meet shared storage requirements
  • Each host must meet the networking requirements
  • Each host’s CPUs must be from the same compatible family


When configuring vMotion between hosts I would recommend keeping to one brand of server per cluster, i.e. Dell, HP or IBM. Also always ensure that these servers are compatible with each other; you can confirm this by speaking to the server manufacturer.
A very important item to consider is to always ensure you are using the latest BIOS version on each of your hosts.

Ensuring that the CPUs are compatible with each other is essential for vMotion to work successfully. This is because the host that the virtual machine migrates to has to be capable of carrying on any instructions that the first host was running.
If a virtual machine is successfully running an application on one host and you migrate it to another host without these capabilities, the application would most likely crash, possibly even the whole server. This is why vMotion compatibility is required between hosts before you can migrate a running virtual machine.

It is user-level instructions that bypass the virtualisation layer, such as the Streaming SIMD Extensions (SSE, SSE2, SSSE3, SSE4.1) and Advanced Encryption Standard (AES) instruction sets, that can differ greatly between CPU models and families of processors, and so can cause application instability after the migration.

Always ensure that all hardware is on the VMware compatibility guide.
To confirm compatibility between same family CPU models check the charts below.

This is a chart from Dell showing which Intel CPU’s support vMotion.

This second chart also from Dell illustrates which AMD processors support vMotion

Further information on vMotion requirements between hosts can be found in the vSphere Datacenter Administration Guide

Virtual Machine Memory Overhead

Windows virtual machines require more memory with each passing release, and software demands on memory are becoming larger all the time.  In a virtual environment it is quite simple to increase the amount of virtual memory granted to a virtual machine, especially with features such as hot add: the ability to dynamically increase the amount of memory available to the virtual machine while it is still turned on.

Hot add has a few requirements.
From a licensing point of view you need vSphere Advanced, Enterprise or Enterprise Plus as a minimum.
The chart below illustrates the Windows Operating Systems that support Hot Add.

* Server must be rebooted for memory to be registered in Windows.

Adding additional RAM to a virtual machine is so easy these days that it is important not to overlook the effects that increasing the virtual machine memory has on the host and the virtual machine datastore.
Every Gigabyte of virtual machine memory demands more memory overhead from the ESX host.  The overhead on the host is illustrated in the table below.

As you can see, the total amount of RAM the host requires to power your current and future virtual machines can increase dramatically when you start adding more RAM to your virtual machines, especially when adding more vCPUs.  This is also true of the multi-core vCPUs in vSphere 4.1, as explained by Jason Boche.

To put this into perspective, if you have a host with ESX installed and two virtual machines running 2 vCPUs and 4GB RAM each, you would need to allow 10.1GB RAM as a minimum.

1.6GB (ESX)+ 4GB (VM1)+ 242.51MB (Overhead VM1)+4GB (VM2)+ 242.51MB (Overhead VM2)=10.1GB
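The arithmetic above can be reproduced as a quick sketch. The figures are the ones from the example; real per-VM overhead varies by vSphere version, vCPU count and configured RAM.

```python
ESX_GB = 1.6                   # ESX service console / hypervisor allowance
VM_RAM_GB = 4.0                # RAM assigned to each VM
OVERHEAD_GB = 242.51 / 1024    # quoted overhead for a 2 vCPU / 4 GB VM, in GB

# Two identical VMs plus the host's own footprint
total = ESX_GB + 2 * (VM_RAM_GB + OVERHEAD_GB)
print(round(total, 1))  # 10.1
```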

Depending on the load on the virtual machines, they may require up to the full 4GB assigned to them, which leaves no additional memory available.  This is not something I would recommend.  I would instead assign at least 12GB to this configuration, probably more, so that there is room for expansion at a later date or for adding additional virtual machines.

Storage is another consideration when upping the amount of virtual machine memory.  The virtual machine swap file, .vswp, is equal in size to the virtual machine RAM installed. Ensure this is taken into account when thinking about what size datastore is required to store the virtual machines.  The alternative is to specify a separate, fast-access datastore for storing the virtual machine swap file.  If possible use SSD drives.
This can be set by following the instructions for Storing a virtual machine swap file in a location other than the default on the VMware website.

With careful planning, running out of memory or storage should never happen if you follow VMware best practices for creating virtual machines.  More information can be found in the vSphere 4.1 Performance Best Practices guide.
Additional information on VMFS datastore storage allocation can be found here

Hot add support chart (* = server must be rebooted for memory to be registered in Windows):

Windows Server 2008 Enterprise x64 – supported
Windows Server 2008 Enterprise x86 – supported
Windows Server 2008 Standard x64 – supported *
Windows Server 2008 Standard x86 – supported *
Windows Server 2003 Enterprise x64 – supported
Windows Server 2003 Enterprise x86 – supported
Windows Server 2003 Standard x64 – not supported
Windows Server 2003 Standard x86 – not supported

VMFS Datastore Free Space Calculations

As technology progresses, storage requirements grow.  It seems to be a never-ending pattern.  I remember only a few years ago the maximum configurable LUN size of 2TB seemed huge; now it is common to have many LUN carvings making up tens of terabytes of SAN storage.
The downside to all this extra storage is the demand for larger virtual machine disks, and you soon find that the VMFS datastores get filled up in no time.
This is something we are all aware of, and it is something we can avoid with enough planning done ahead of time.  (Preventing it filling up, not stopping demand for more space, that is!)

Before adding any additional virtual machine drives it is important to ensure that enough free space is available for the virtual machines already setup.
In order to calculate the minimum free space required, use the following formula courtesy of ProfessionalVMware.

(Total Virtual machine VMDK disk sizes + Total Virtual Machine RAM Sizes * 1.1)*1.1 + 12GB

This formula can be used to work out what size the VMFS datastore needs to be. Once you work that out you can deduct this from the total available space on the VMFS datastore to see how much space can be used for additional drives without resorting to just adding disks until the vSphere Server complains it is running out of free space.
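As a sketch, the formula translates to a small helper: disk sizes plus RAM (for the .vswp files) with 10% for snapshots, 10% for overhead and 12GB for the ESX(i) installation. The two-VM figures below are made-up inputs, all in GB.

```python
def min_datastore_gb(vmdk_sizes_gb, ram_sizes_gb):
    """ProfessionalVMware minimum free space formula, all figures in GB."""
    return (sum(vmdk_sizes_gb) + sum(ram_sizes_gb) * 1.1) * 1.1 + 12

# Two VMs: 40 GB and 60 GB disks, 4 GB RAM each
needed = min_datastore_gb([40, 60], [4, 4])
print(round(needed, 1))  # 131.7
```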

This will allow enough for the local installation of ESX(i) and an additional 10% for snapshots, plus an additional 10% for overhead.  (12GB for an ESXi install is a little excessive but I would still recommend leaving this much space as it will be required before you know it.)

ProfessionalVMware have provided this handy Excel spreadsheet for working this out for you.

This formula can prove useful when planning how much storage space is required when performing a P2V migration.  This way you can be sure to manage expectations so that you are fully aware from the beginning how much free space you have available in the VMFS datastore.
This is a recommended minimum, you may need to leave more free space depending on the requirements.  ISO files, templates etc will also need to be taken into account.

Following the calculations you may find that the requirements for free space have been met but you are still getting alarms in the vSphere Client saying you are running out of free space.
The alarms within the vSphere Client are set to send a warning when 75% of the datastore is in use, and an error when 85% is in use.
This can be adjusted if required by clicking the top level and selecting the Alarms tab within the vSphere Client.
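The default thresholds mentioned above can be sketched as a simple check (the function and labels are illustrative, not part of vSphere):

```python
def datastore_alarm(capacity_gb, used_gb):
    """Default vSphere datastore alarms: warning at 75% used, error at 85%."""
    pct = used_gb / capacity_gb * 100
    if pct >= 85:
        return "error"
    if pct >= 75:
        return "warning"
    return "ok"

print(datastore_alarm(500, 300))  # ok (60% used)
print(datastore_alarm(500, 400))  # warning (80% used)
print(datastore_alarm(500, 450))  # error (90% used)
```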

VMware NIC Trunking Design

Having read various books, articles, white papers and best practice guides I have found it difficult to find consistently good advice on vNetwork and physical switch teaming design so I thought I would write my own based on what I have tested and configured myself.

To begin with I must say I am no networking expert and may not cover some of the advanced features of switches, but I will provide links for further reference where appropriate.


The basics

Each physical ESX(i) host has at least one physical NIC (pNIC) which is called an uplink.

Each uplink is known to the ESX(i) host as a vmnic.

Each vmnic is connected to a virtual switch (vSwitch).

Each virtual machine on the ESX(i) host has at least one virtual NIC (vNIC) which is connected to the vSwitch.

The virtual machine is only aware of the vNIC, only the vSwitch is aware of the uplink to vNIC relationship.

This setup offers a one to one relationship between the virtual machine (VM) connected to the vNIC and the pNIC connected to the physical switch port, as illustrated below.

When adding another virtual machine a second vNIC is added, this in turn is connected to the vSwitch and they share that same pNIC and the physical port the pNIC is connected to on the physical switch (pSwitch).

When adding more physical NICs we then have additional options with network teaming.


NIC Teaming

NIC teaming offers us the option to use connection based load balancing, which is balanced by the number of connections and not on the amount of traffic flowing over the network.

This load balancing can provide resilience on our connections by monitoring the links: if a link goes down, whether it’s the physical NIC or the physical port on the switch, it will resend that traffic over the remaining uplinks so that no traffic is lost.  It is also possible to use multiple physical switches provided they are all on the same broadcast domain.  What it will not do is allow you to send traffic over multiple uplinks at once, unless you configure the physical switches correctly.

There are four options with NIC teaming, although the fourth is not really a teaming option:

  1. Port-based NIC teaming
  2. MAC address-based NIC teaming
  3. IP hash-based NIC teaming
  4. Explicit failover

Port-based NIC teaming

Route based on the originating virtual port ID, or port-based NIC teaming as it is commonly known, will do as it says and route the network traffic based on the virtual port on the vSwitch that it came from.   This type of teaming doesn’t allow traffic to be spread across multiple uplinks: it keeps a one-to-one relationship between the virtual machine and the uplink port when sending and receiving to all network devices.  This can lead to a problem where the number of physical uplinks exceeds the number of virtual NICs, as you would then end up with uplinks that don’t do anything.  As such, the only time I would recommend using this type of teaming is when the number of virtual NICs exceeds the number of physical uplinks.

MAC address-based NIC teaming

Route based on source MAC hash, or MAC address-based NIC teaming, routes traffic based on the originating vNIC’s MAC address.  This works in a similar way to port-based NIC teaming in that it will send its network traffic over only one uplink.  Again, the only time I would recommend using this type of teaming is when the number of virtual NICs exceeds the number of physical uplinks.

IP hash-based NIC teaming

Route based on IP hash, or IP hash-based NIC teaming, works differently from the other types of teaming.  It takes the source and destination IP address and creates a hash.  It can work on multiple uplinks per VM and spread a VM’s traffic across multiple uplinks when sending data to multiple network destinations.

Although IP hash-based teaming can utilise multiple uplinks, it will only use one uplink per session.  This means that if you are sending a lot of data between one virtual machine and another server, that traffic will only transfer over one uplink.  Using IP hash-based teaming we can then use teaming or trunking options on the physical switches.  (depending on the switch type)  IP hash requires EtherChannel (again depending on switch type), which for all other teaming types should be disabled.
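The IP hash scheme is commonly described as an XOR of the low-order bytes of the source and destination IP addresses, taken modulo the number of uplinks. The sketch below is a simplified illustration of that idea, not the ESX(i) implementation, and the addresses are made up.

```python
def select_uplink(src_ip, dst_ip, n_uplinks):
    """Pick an uplink index from the last octets of source and destination IPs."""
    src_lsb = int(src_ip.rsplit(".", 1)[1])
    dst_lsb = int(dst_ip.rsplit(".", 1)[1])
    return (src_lsb ^ dst_lsb) % n_uplinks

# One VM talking to two destinations can land on two different uplinks...
print(select_uplink("10.0.0.10", "10.0.0.20", 2))  # 0
print(select_uplink("10.0.0.10", "10.0.0.21", 2))  # 1
# ...but a single source/destination pair always hashes to the same uplink,
# which is why one session never exceeds the bandwidth of one link.
```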

Explicit failover

This allows you to override the default failover order of the uplinks.  The only time I can see this being useful is if the uplinks are connected to multiple physical switches and you want to use them in a particular order, or if you think a pNIC in the ESX(i) host is not working correctly.  If you use this setting it is best to configure those vmnics or adapters as standby adapters, as active adapters will be used from the highest in the order downwards.



The other options

Network failover detection

There are two options for failover detection: link status only and beacon probing.  Link status only monitors the status of the link, to ensure that a connection is available on both ends of the network cable; if it becomes disconnected the uplink is marked as unusable and traffic is sent over the remaining NICs.  Beacon probing sends a beacon out on all uplinks in the team.  This includes checking that the port on the pSwitch is available and is not being blocked by configuration or switch issues.  Further information is available on page 44 of the ESXi configuration guide.  Do not set this to beacon probing if using route based on IP hash.


Notify switches

This should be left set to yes (default) to minimise route table reconfiguration time on the pSwitches.  Do not use this when configuring Microsoft NLB in unicast mode.


Failback

Failback will re-enable a failed uplink once it is working correctly again, and move back the traffic that was being sent over the standby uplink.  Best practice is to leave this set to yes unless you are using IP-based storage, because if the link were to go up and down quickly it could have a negative impact on iSCSI traffic performance.

Incoming traffic is controlled by the pSwitch routing the traffic to the ESX(i) host, hence the ESX(i) host has no control over which physical NIC traffic arrives on. As multiple NICs will be accepting traffic, the pSwitch will use whichever one it wants.

Load balancing on incoming traffic can be achieved by using and configuring a suitable pSwitch.

pSwitch configuration

The topics covered so far describe egress NIC teaming; with physical switches we gain the added benefit of ingress NIC teaming.

Various vendors support teaming on their physical switches, however quite a few call trunking "teaming" and vice-versa.

From the switches I have configured I would recommend the following.

All Switches

A lot of people recommend disabling Spanning Tree Protocol (STP) because vSwitches don't require it: a vSwitch knows the MAC address of every vNIC connected to it and cannot form loops itself.  I have found that the best practice is to enable STP and set the ports to Portfast.  Without Portfast there can be a delay while the pSwitch relearns the MAC addresses during convergence, which can take 30-50 seconds.  Without STP enabled there is a chance of loops not being detected on the pSwitch.

802.3ad & LACP

Link Aggregation Control Protocol (LACP) is a dynamic protocol for building link aggregation groups (LAGs): it makes other switches aware of the multiple links and combines them into one logical unit.  It also monitors those links, and if a failure is detected it removes that link from the logical unit.

VMware doesn't support LACP.  However VMware does support static IEEE 802.3ad link aggregation, which can be achieved by configuring a static trunk group.  The disadvantage of this is that a static aggregate is not actively monitored: if one of those links degrades in a way link status doesn't detect, 802.3ad static will continue to send traffic down that link.


Dell switches

Set Portfast using

spanning-tree portfast

To configure follow my Dell switch aggregation guide

Further information on Dell switches is available through the product manuals.

Cisco switches

Set Portfast using

spanning-tree portfast (for an access port)

spanning-tree portfast trunk (for a trunk port)

Set static EtherChannel

Further information is available through the Sample configuration of EtherChannel / Link aggregation with ESX and Cisco/HP switches
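As a sketch only (the interface range and channel-group number are assumptions for illustration, not from the guide above), a static EtherChannel suitable for ESX(i)'s route based on IP-hash might be configured on a Cisco IOS switch like this:

```
interface range GigabitEthernet0/1 - 2
 channel-group 1 mode on
 spanning-tree portfast
```

! "mode on" creates a static aggregate with no LACP negotiation, which is what the vSwitch expects; "mode active" or "mode passive" (LACP) will not work with a standard vSwitch.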

HP switches

Set Portfast using

spanning-tree portfast (for an access port)

spanning-tree portfast trunk (for a trunk port)

Set static LACP trunk using

trunk < port-list > < trk1 … trk60 > < trunk | lacp >
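For example (the port numbers and trunk group are assumed for illustration), to aggregate ports 1 and 2 into trunk group Trk1 as a static LACP trunk:

```
trunk 1-2 trk1 lacp
```

Substituting the final keyword trunk instead of lacp would create a plain static (non-LACP) trunk on the same ports.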

Further information is available through the Sample configuration of EtherChannel / Link aggregation with ESX and Cisco/HP switches



ESX upgrade guide using putty

01:On a Windows box, download the patch bundle directly from VMware. This will be a .zip file.

02:On a Windows box with the vSphere client installed, download and install FastSCP. Create a folder called updates under /var and copy the upgrade files into /var/updates on the ESX host.
03:Obtain local console access, or SSH (putty.exe), to the ESX 4 host that the bundle file was uploaded to, and confirm there is enough free space by running vdf -h

Note: The directory /var/updates is used in this document, but any directory on a partition with adequate free space could be substituted.

04:Verify that the patch bundles aren’t already installed (or if they are required), using the command:

esxupdate query
05:Use the vSphere client to put the ESX 4 host in maintenance mode. Alternatively, use the command:

vimsh -n -e /hostsvc/maintenance_mode_enter
06:In environments with multiple hosts, shared storage and VMotion, the virtual machines can be migrated off and left powered on.

For environments without VMotion, or for single hosts, the following commands can be used to list and then shut down the virtual machines.

vmware-cmd -l

vmware-cmd <full path to .vmx file> stop soft
07:To determine which bulletins in the bundle are applicable to this ESX 4 host, use the command:

esxupdate --bundle file:///var/updates/upgrade_file_name.zip scan
08:To check VIB signature, dependencies, and bulletin order without doing any patching (a dry run), use the command:

esxupdate --bundle file:///var/updates/upgrade_file_name.zip stage
09:If the stage (dry run) found no problems, then the bundle can be installed using the command:

esxupdate --bundle file:///var/updates/upgrade_file_name.zip update
10:If prompted to reboot, use the command:

reboot

Note: Not all patches will require an ESX host reboot.

11:After the system boots, verify patch bundles were installed with the command:

esxupdate query
12:If applicable, take the ESX host out of maintenance mode with the command:

vimsh -n -e /hostsvc/maintenance_mode_exit
13:If applicable, restart virtual machines using the vSphere client or the following command:

vmware-cmd <full path to .vmx file> start

Note: Automatic virtual machine startup will not work if the ESX host is powered on in maintenance mode.
14:Delete the bundle zip file from the /var/updates folder, using the command:

rm /var/updates/*.zip
15:Verify that host disk free space is still acceptable, using the command:

vdf -h
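Putting steps 03-15 together, here is a minimal sketch of the whole sequence as run on the ESX service console. The bundle filename is a placeholder (keep whatever name your downloaded zip has), and these commands exist only on an ESX 4 host, so treat this as a transcript to adapt rather than a script to copy blindly:

```shell
# Assumes the patch bundle has already been copied to /var/updates
BUNDLE=file:///var/updates/upgrade_file_name.zip

vdf -h                                          # check free space first
esxupdate query                                 # see what is already installed
vimsh -n -e /hostsvc/maintenance_mode_enter     # enter maintenance mode

esxupdate --bundle $BUNDLE scan                 # which bulletins apply?
esxupdate --bundle $BUNDLE stage                # dry run: signatures and dependencies
esxupdate --bundle $BUNDLE update               # install the bundle

# reboot here only if the patch requires it, then:
esxupdate query                                 # verify the bundles installed
vimsh -n -e /hostsvc/maintenance_mode_exit      # leave maintenance mode
rm /var/updates/*.zip                           # clean up the bundle
vdf -h                                          # confirm free space is acceptable
```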