VMware NIC Trunking Design


Having read various books, articles, white papers and best practice guides, I have found it difficult to find consistently good advice on vNetwork and physical switch teaming design, so I thought I would write my own based on what I have tested and configured myself.

To begin with I must say I am no networking expert and may not cover some of the advanced features of switches, but I will provide links for further reference where appropriate.

 

The basics

Each physical ESX(i) host has at least one physical NIC (pNIC) which is called an uplink.

Each uplink is known to the ESX(i) host as a vmnic.

Each vmnic is connected to a virtual switch (vSwitch).

Each virtual machine on the ESX(i) host has at least one virtual NIC (vNIC) which is connected to the vSwitch.

The virtual machine is only aware of the vNIC; only the vSwitch is aware of the uplink-to-vNIC relationship.

This setup offers a one-to-one relationship between the virtual machine (VM) connected to the vNIC and the pNIC connected to the physical switch port, as illustrated below.

When adding another virtual machine, a second vNIC is added; this in turn is connected to the vSwitch, and the two virtual machines share the same pNIC and the physical port that pNIC is connected to on the physical switch (pSwitch).

When adding more physical NICs we then have additional options for network teaming.

 

NIC Teaming

NIC teaming offers us the option of connection-based load balancing, which balances by the number of connections rather than by the amount of traffic flowing over the network.

This load balancing also provides resilience on our connections by monitoring the links: if a link goes down, whether it's the physical NIC or the physical port on the switch, traffic is resent over the remaining uplinks so that no traffic is lost.  It is also possible to use multiple physical switches, provided they are all in the same broadcast domain.  What it will not do is allow you to send traffic over multiple uplinks at once, unless you configure the physical switches correctly.

There are four options with NIC teaming, although the fourth is not really a teaming option:

  1. Port-based NIC teaming
  2. MAC address-based NIC teaming
  3. IP hash-based NIC teaming
  4. Explicit failover

Port-based NIC teaming

Route based on the originating virtual port ID, or port-based NIC teaming as it is commonly known, does as it says and routes network traffic based on the virtual port on the vSwitch that it came from.   This type of teaming doesn't allow traffic from a single virtual port to be spread across multiple uplinks.  It keeps a one-to-one relationship between the virtual machine and the uplink port when sending and receiving to all network devices.  This can lead to a problem where the number of physical ports exceeds the number of virtual ports, as you would then end up with uplinks that don't do anything.  As such, the only time I would recommend using this type of teaming is when the number of virtual NICs exceeds the number of physical uplinks.
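
For reference, here is a minimal sketch of setting this policy from the command line, assuming ESXi 5.x esxcli and a standard vSwitch named vSwitch0 (on ESX(i) 4.x the same settings are made in the vSphere Client under the vSwitch properties, NIC Teaming tab):

# Show the current teaming and failover policy for vSwitch0
esxcli network vswitch standard policy failover get -v vSwitch0

# Route based on the originating virtual port ID
esxcli network vswitch standard policy failover set -v vSwitch0 --load-balancing portid
# Other accepted values: mac (MAC hash), iphash (IP hash), explicit (explicit failover)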

MAC address-based NIC teaming

Route based on MAC hash, or MAC address-based NIC teaming, chooses the uplink from a hash of the originating vNIC's MAC address.  This works in a similar way to port-based NIC teaming in that each vNIC sends its network traffic over only one uplink.  Again, the only time I would recommend using this type of teaming is when the number of virtual NICs exceeds the number of physical uplinks.

IP hash-based NIC teaming

Route based on IP hash, or IP hash-based NIC teaming, works differently from the other types of teaming.  It takes the source and destination IP address and creates a hash.  It can work over multiple uplinks per VM and spread the VM's traffic across multiple uplinks when sending data to multiple network destinations.

Although IP hash-based teaming can utilise multiple uplinks, it will only use one uplink per session.  This means that if you are sending a lot of data between one virtual machine and another server, that traffic will only travel over one uplink.  Using IP hash-based teaming we can then use the teaming or trunking options on the physical switches (depending on the switch type).  IP hash requires an EtherChannel (again, depending on switch type), which should be disabled for all the other teaming policies.

Explicit failover

This allows you to override the default failover order of the uplinks.  The only time I can see this being useful is if the uplinks are connected to multiple physical switches and you want to use them in a particular order, or if you think a pNIC in the ESX(i) host is not working correctly.  If you use this setting it is best to configure those vmnics (adapters) as standby adapters, as active adapters are always used from the highest in the order downwards.
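
Sticking with the esxcli sketch above (ESXi 5.x; the vSwitch and vmnic names are only examples), an explicit failover order with vmnic0 active and vmnic1 on standby would look something like this:

# Explicit failover order: vmnic0 active, vmnic1 standby
esxcli network vswitch standard policy failover set -v vSwitch0 --load-balancing explicit --active-uplinks vmnic0 --standby-uplinks vmnic1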

 

 

The other options

Network failover detection

There are two options for failover detection: link status only and beacon probing.  Link status only monitors the status of the link, to ensure that a connection is available on both ends of the network cable; if it becomes disconnected the uplink is marked as unusable and traffic is sent over the remaining NICs.  Beacon probing sends a beacon out on all uplinks in the team, which also checks that the port on the pSwitch is available and is not being blocked by configuration or switch issues.  Further information is available on page 44 of the ESXi configuration guide.  Do not set this to beacon probing if using route based on IP hash.
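
Using the same esxcli command as above (ESXi 5.x, vSwitch0 as an example), the detection method is a single switch between link and beacon:

# link = link status only (default), beacon = beacon probing
esxcli network vswitch standard policy failover set -v vSwitch0 --failure-detection beacon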

 

Notify switches

This should be left set to yes (the default) to minimise the time the pSwitches take to update their lookup tables when traffic moves to a different uplink.  Do not use this when configuring Microsoft NLB in unicast mode.
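
As a sketch (same esxcli assumptions as above), this is a simple true/false flag:

# Set to false only for the Microsoft NLB unicast case described above
esxcli network vswitch standard policy failover set -v vSwitch0 --notify-switches false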

Failback

Failback will re-enable the failed uplink once it is working correctly again and move back the traffic that was being sent over the standby uplink.  Best practice is to leave this set to yes unless you are using IP-based storage, because if the link were to go up and down quickly it could have a negative impact on iSCSI traffic performance.
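
Again as an esxcli sketch (ESXi 5.x, vSwitch0 as an example), failback can be switched off on a vSwitch carrying IP-based storage:

esxcli network vswitch standard policy failover set -v vSwitch0 --failback false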

Incoming traffic is controlled by the pSwitch routing the traffic to the ESX(i) host, so the ESX(i) host has no control over which physical NIC the traffic arrives on. As multiple NICs can accept traffic, the pSwitch will use whichever one it chooses.

Load balancing on incoming traffic can be achieved by using and configuring a suitable pSwitch.

pSwitch configuration

The topics covered so far describe egress NIC teaming; with the right physical switch configuration we gain the added benefit of ingress NIC teaming.

Various vendors support teaming on their physical switches; however, quite a few call teaming "trunking" and vice versa, so check the terminology for your switch.

From the switches I have configured I would recommend the following.

All Switches

A lot of people recommend disabling Spanning Tree Protocol (STP) as vSwitches don't require it, since a vSwitch already knows the MAC address of every vNIC connected to it.  I have found that the best practice is to leave STP enabled and set the ESX(i)-facing ports to Portfast.  Without Portfast enabled there can be a delay during convergence while the pSwitch relearns the MAC addresses, which can take 30-50 seconds.  Without STP enabled there is a chance of loops going undetected on the pSwitch.

802.3ad & LACP

Link Aggregation Control Protocol (LACP) is a protocol for dynamically building a link aggregation group (LAG): it makes the switches at either end aware of the multiple links and combines them into a single logical unit.  It also monitors those links, and if a failure is detected it removes that link from the logical unit.

VMware doesn't support LACP.  However, VMware does support IEEE 802.3ad link aggregation, which can be achieved by configuring a static LACP trunk group or a static trunk.  The disadvantage of a static configuration is that the links are not monitored by a protocol, so a fault that doesn't bring the link down will not remove it from the group and traffic can still be sent down that link.

 

Dell switches

Set Portfast using

spanning-tree portfast

To configure follow my Dell switch aggregation guide
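
As a rough illustration only (PowerConnect-style syntax; port naming and the exact commands vary by model, so treat this as a sketch and check the manual for your switch), the two relevant pieces are Portfast on the ESX(i)-facing ports and a static (non-LACP) port-channel for IP hash teaming:

! Portfast on an ESX(i)-facing port that is not part of an aggregation group
interface ethernet 1/g1
spanning-tree portfast

! Static (non-LACP) link aggregation group across two ports, for IP hash teaming
interface range ethernet 1/g3-1/g4
channel-group 1 mode on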

Further information on Dell switches is available through the product manuals.

Cisco switches

Set Portfast using

spanning-tree portfast (for an access port)

spanning-tree portfast trunk (for a trunk port)

Set EtherChannel
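
As a sketch (Catalyst IOS syntax; the interface and channel-group numbers are just examples), a static EtherChannel suitable for IP hash teaming looks something like this:

interface port-channel 1
switchport mode trunk

interface range GigabitEthernet0/1 - 2
switchport mode trunk
channel-group 1 mode on
! mode on = static EtherChannel; do not use PAgP/LACP modes, as ESX(i) will not negotiate them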

Further information is available through the Sample configuration of EtherChannel / Link aggregation with ESX and Cisco/HP switches

HP switches

Set Portfast using

spanning-tree portfast (for an access port)

spanning-tree portfast trunk (for a trunk port)

Set static LACP trunk using

trunk < port-list > < trk1 … trk60 > < trunk | lacp >
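
For example, to build a static (non-protocol) trunk out of ports 1 and 2, which is what the IP hash policy expects (port numbers are illustrative):

trunk 1-2 trk1 trunk

Using lacp as the final keyword instead would create an LACP trunk, which ESX(i) will not negotiate.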

Further information is available through the Sample configuration of EtherChannel / Link aggregation with ESX and Cisco/HP switches

 

 

21 comments

  • Paul

    Hello,

    Great post, thanks for taking the time to write it. A couple of questions if I may:

    1) You mentioned that VMware does not support LACP. Do you have a reference for this?
    2) At the end of the article you show how to configure LACP for various brands of physical switches, but earlier you said this wasn’t a supported option. Can you please clarify that point for me?

    • admin

      Hi Paul,

      Thanks for your comments.

      Link aggregation can be configured as either dynamic or static. Dynamic configuration is supported using the IEEE 802.3ad standard, which is known as Link Aggregation Control Protocol (LACP). So you have two types, LACP dynamic and LACP static. VMware does support IEEE 802.3ad static, as shown in this vSphere 4.0 article on the VMware site

      I hope this answers your questions.

  • Bruce He

    Hi Simon

    Thanks for your great work!

    I have a question about teaming on VMware. Since there are several physical NICs connected to the physical switch, are there ways to observe the detailed traffic on each physical NIC? For example, traffic from VM A to B is on NIC-A, while VM C to D is on NIC-B.

    • Simon Greaves

      Hi Bruce,

      You can view virtual machine traffic stats on the performance tab of the vSwitch or virtual machines. For detailed information you would need to use a packet capture device of some sort, such as Wireshark.

      Regards,

      Simon

  • 30ma

    I have 25 VLANs on a 3750 switch and I want to present them to an ESXi 5 server:
    interface vlan 3001
    ip address 10.128.21.254
    !
    interface vlan 3002
    ip address 10.128.22.254
    !
    .
    .
    .
    interface vlan 3025
    ip address 10.128.45.254

    • Simon Greaves

      Hi 30ma,

      If these 25 VLANs are for virtual machine traffic, you can just create a portgroup for each one on the Standard or Distributed Switch within vCenter, give each portgroup the appropriate VLAN tag, and then add the network interface of each VM to the appropriate portgroup.
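
      For example, from the ESXi shell the equivalent on a Standard Switch would be (ESXi 5.x esxcli; vSwitch0 and the portgroup names are just examples):

      # Create a portgroup and tag it with VLAN 3001, then repeat for 3002-3025
      esxcli network vswitch standard portgroup add -p VLAN3001 -v vSwitch0
      esxcli network vswitch standard portgroup set -p VLAN3001 --vlan-id 3001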

      Hope this helps!

      Simon

  • David

    What if I am working with two physical Cisco switches that are not the same type nor are they stacked: 1GB and 10GB switches

    I want to configure vmnics on the ESX host to failover (not load balance traffic) from the 10GB to the 1GB switch in the event of a catastrophic failure in the 10GB switch.

    Basically, two vmnics are connected from the ESX host, one to each of the switches.

    Can that be accomplished or do I have to rely on Cisco port trunking on a single switch (or switch stack) to accomplish this?

  • Simon Greaves

    Hi David,

    You can easily configure the 1Gb NIC to act as the failover vmnic by editing the portgroup settings: on the NIC teaming tab, tick the override vSwitch failover order box, then make sure the 10Gb NIC is in the active adapters section and the 1Gb NIC is in the standby adapters section.
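
    The same override can also be scripted if you prefer (ESXi 5.x esxcli; the portgroup and vmnic names below are only examples):

    # Per-portgroup override: 10Gb uplink active, 1Gb uplink standby
    esxcli network vswitch standard portgroup policy failover set -p "VM Network" --active-uplinks vmnic0 --standby-uplinks vmnic1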

    This is a good way to minimise single-point-of-failure risk without having to purchase expensive 10Gb NICs for the failover port. Obviously note that the 1Gb NIC will perform much more slowly than the 10Gb port, so make sure this won’t cause any issues for the traffic flowing over the failed-over NIC.

    Also note that active/standby is not supported on iSCSI traffic. This will work for NFS or management/virtual machine traffic as long as the physical switch ports are configured to allow any relevant VLANs in case of failover.

    Thanks for reading!

    Simon

  • David

    ooooo… iSCSI not supported? Then it won’t work for us because the 10GB is handling the iSCSI traffic.


  • wayne gordon

    Hi,

    Your guide “Dell switch aggregation guide” link appears to have been removed.

    Currently we have 3x hosts with 6x NICs assigned to the LAN, but only 1x NIC seems to take any traffic even though all the NICs are added to the vSwitch team as active/active.

    Many thanks in advance
