
VMware Virtual Machine Memory Guide

Memory Virtualisation Basics

When an operating system is installed directly onto physical hardware in a non-virtualised environment, it has direct access to the memory installed in the system, and memory pages always have a 1:1 mapping to physical RAM. This means that if 4GB of RAM is installed, and the operating system supports that much memory, the full 4GB is available to the operating system as soon as it is requested.  Most operating systems will support the full 4GB, especially 64-bit operating systems.
When an application within the operating system makes a memory page request, it requests the page from the operating system, which in turn passes a free page to the application so it can perform its tasks.  This happens seamlessly.

The hypervisor adds an extra level of indirection.  The hypervisor maps the guest physical memory addresses to the machine, or host physical memory addresses.  This gives the hypervisor memory management abilities that are transparent to the guest operating system. It is these memory management techniques that allow for memory overcommitment.

To get a good understanding of memory behaviour within a virtualised environment, let’s focus on three key areas.

  • Memory Terminology
  • Memory Management
  • Memory Reclamation

Memory Terminology

With an operating system running in a virtual machine, the memory requested by applications is called virtual memory, the memory that appears to be installed in the virtual machine is called physical memory, and the hypervisor adds an additional layer called machine memory.

To help define how memory maps between the physical RAM installed in the server and the individual applications within each virtual machine, the three key memory levels are also described as follows.
Host physical memory – Refers to the memory that is visible to the hypervisor as available on the system.  (Also called machine memory.)
Guest physical memory – Refers to the memory that is visible to the guest operating system running in the virtual machine.  Guest physical memory is backed by host physical memory, which means the hypervisor provides a mapping from guest memory to host memory.
Guest virtual memory – Refers to a contiguous virtual address space presented by the guest operating system to applications. It is the memory that is visible to the applications running inside the virtual machine.
To help understand how these layers inter-operate, look at the following diagram.

Host Physical to Guest Physical to Guest Virtual Memory

Virtual memory creates a uniform memory address space for the operating system by mapping application virtual memory addresses to physical memory addresses.  This gives the operating system memory management abilities that are transparent to the application.

Memory Management

Hypervisor Memory Management

Memory pages within a virtualised environment have to negotiate an additional layer, the hypervisor.  The hypervisor creates a contiguous addressable memory space for a virtual machine. This memory space has the same basic properties as the virtual address space that the guest operating system presents to the applications running on it. This allows the hypervisor to run multiple virtual machines simultaneously while protecting the memory of each virtual machine from being accessed by others.

The virtual machine monitor (VMM) controls each virtual machine’s memory allocation. The VMM does this using software-based memory virtualisation.
The VMM for each virtual machine maintains a memory mapping for the memory pages contained inside the guest operating system: the mapping is from the guest physical pages to the host physical pages. The host physical memory pages are also called machine pages.
This memory mapping is maintained through a Physical Memory Mapping Data (PMAP) structure.

Each virtual machine sees its memory as a contiguous addressable memory space. The underlying physical machine memory, however, may not be contiguous, because the host may be running more than one virtual machine at a time and is sharing the memory out amongst the VMs.
The VMM sits between the guest physical memory and the Memory Management Unit (MMU) on the CPU, so that the actual MMU on the processor is not updated directly by the virtual machine.
The hypervisor maintains the guest virtual-to-machine page mappings in a shadow page table, which keeps those mappings consistent with the PMAP.

The shadow page tables are also maintained by the virtual machine monitor (VMM).

Shadow Page Tables and PMAP

Each processor in the physical machine uses the Translation Lookaside Buffer (TLB) in the processor cache for the direct virtual-to-machine mappings. These mappings come from the shadow page tables.
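To make the mapping layers concrete, here is a minimal sketch (illustrative Python only, not VMware code; the page numbers and function name are invented for the example) of how a shadow page table can be derived by composing the guest page table (guest virtual to guest physical) with the PMAP (guest physical to machine):

# Illustrative model: compose the guest page table with the pmap to get the
# guest virtual -> machine mappings held in the shadow page table, which the
# TLB can then cache directly.
guest_page_table = {0x1000: 0x5000, 0x2000: 0x6000}   # guest virtual -> guest physical
pmap = {0x5000: 0x9000, 0x6000: 0xA000}               # guest physical -> machine (host physical)

def build_shadow_page_table(guest_pt, pmap):
    """Shadow page table: guest virtual -> machine pages, kept consistent
    with both the guest page table and the pmap by the VMM."""
    return {gva: pmap[gpa] for gva, gpa in guest_pt.items() if gpa in pmap}

print(build_shadow_page_table(guest_page_table, pmap))
# {4096: 36864, 8192: 40960}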

Some CPUs support hardware-assisted memory virtualisation. AMD SVM-V (RVI) and the Intel Xeon 5500 series (EPT) CPUs support it. These CPUs maintain two layers of page tables: one for the virtual-to-physical translations and one for the physical-to-machine translations.

Hardware-assisted memory virtualisation eliminates the overhead associated with software virtualisation, namely the overhead of keeping the shadow page tables synchronised with the guest page tables, because it uses two layers of page tables in hardware that are synchronised by the processor itself.

One thing to note with hardware-assisted memory virtualisation is that the TLB miss latency is significantly higher. As a result, workloads with a small amount of page table activity are not adversely affected by software virtualisation, whereas workloads with a lot of page table activity are likely to benefit from hardware assistance.

Application Memory Management

An application starts with no memory; it allocates memory through system calls to the operating system, and it voluntarily frees memory it no longer needs through the same explicit memory allocation interface with the operating system.

Operating System Memory Management

As far as the operating system is concerned, it owns all the physical memory allocated to it, because it has no memory allocation interface with the hardware, only with the virtual machine monitor.  It does not explicitly allocate or free physical memory; instead it tracks in-use and available memory by maintaining a free list and an allocated list of physical memory.  A page is either free or allocated depending on which list it resides on, and these lists themselves live in memory.
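As a rough illustration (a simplified Python model, not how any real kernel is implemented; the names and page numbers are invented), the free/allocated bookkeeping can be pictured like this:

free_pages = [0, 1, 2, 3, 4, 5, 6, 7]   # page numbers the OS considers free
allocated_pages = []                     # page numbers handed out to applications

def os_alloc_page():
    page = free_pages.pop(0)             # take a page off the free list
    allocated_pages.append(page)
    return page

def os_free_page(page):
    allocated_pages.remove(page)
    free_pages.append(page)              # the page stays "owned" by the OS either way

page = os_alloc_page()
os_free_page(page)
print(free_pages, allocated_pages)       # all pages back on the free list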

Virtual Machine Memory Allocation

When a virtual machine starts up it has no physical memory allocated to it.  As it boots it ‘touches’ memory, and as it does so the hypervisor allocates physical memory to it.  With some operating systems this can actually mean the entire amount of memory allocated to the virtual machine is called into active memory as soon as the operating system starts, as is typically seen with Microsoft Windows operating systems.

Memory Reclamation

Transparent Page Sharing (TPS)

Transparent Page Sharing is on-the-fly de-duplication of memory pages: the host looks for identical copies of memory pages and keeps only one copy, giving the impression that more memory is available to the virtual machines.  This scanning is performed when the host is idle.
Hosts configured with AMD-RVI or Intel-EPT hardware-assist CPUs are able to take advantage of large memory pages, where the host backs guest physical memory pages with host physical memory in 2MB pages rather than the standard 4KB pages used when large pages are not available. Because there are fewer TLB misses, this achieves better performance. There is a trade-off, though: large memory pages will not be shared, because the chance of finding two 2MB pages that are identical is low and the overhead of doing a bit-by-bit comparison of 2MB pages is much greater than for 4KB pages.

Large memory pages may still be broken down into smaller 4KB pages during times of contention: the host generates 4KB hashes for the 2MB large memory pages so that when it starts swapping memory it can use these hashes to break the large pages down and share the 4KB pages.
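As a minimal sketch of the idea (illustrative Python only; real TPS scans 4KB machine pages in the background and, after a hash match, does a full comparison before collapsing the copies — the function and variable names here are invented):

import hashlib

def share_pages(pages):
    """pages: list of bytes objects, one per guest physical page.
    Returns a mapping of page index -> index of the single copy actually kept."""
    seen = {}       # hash -> (index, contents) of the first copy kept
    mapping = {}
    for i, page in enumerate(pages):
        digest = hashlib.sha1(page).digest()
        if digest in seen and seen[digest][1] == page:   # bit-by-bit confirmation
            mapping[i] = seen[digest][0]                  # share the existing copy
        else:
            seen[digest] = (i, page)
            mapping[i] = i
    return mapping

pages = [b"A" * 4096, b"B" * 4096, b"A" * 4096]
print(share_pages(pages))   # {0: 0, 1: 1, 2: 0} -> the third page shares the first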

It is possible to configure advanced settings on the host to control page sharing: Mem.ShareScanTime sets the time to scan a virtual machine’s memory, Mem.ShareScanGHz sets the maximum number of scanned pages per second on the host, and Mem.ShareRateMax sets the maximum number of scanned pages per virtual machine.

Use resxtop or esxtop and view the PSHARE field to monitor current transparent page sharing activity.  This is available in the memory view.  See below.

PSHARE

You can disable TPS on a particular VM by configuring the advanced setting sched.mem.pshare.enable=FALSE.

Ballooning

An ESXi host has no idea how memory is being used within a virtual machine, only what the virtual machine has requested. As more virtual machines are added to a host there are more memory requests, and the amount of free host memory may become low. This is where ballooning is used.
Provided VMware Tools is installed, the ESXi host loads the balloon driver (vmmemctl) inside the guest operating system as a custom device driver.
The balloon driver communicates directly with the hypervisor on the host. During times of contention the hypervisor tells the driver that memory is contended, and the driver ‘inflates like a balloon’ by requesting memory from the guest operating system. The memory handed to the balloon driver is ‘pinned’ by the hypervisor, and the host physical memory that backed it is freed up for other guest operating systems to use.  The pinned pages are configured so that the guest OS will not swap them out to disk.  If the guest OS with the pinned pages later needs more memory, it will be allocated additional memory by the host as per a normal memory request. Only when the host ‘deflates’ the balloon do the pinned guest physical memory pages become available to the guest again.

Ballooning is a good thing to have because it lets the guest operating system decide which of its memory to give up, rather than the hypervisor, which does not know when a guest OS has finished with a piece of memory.
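As a simplified model (illustrative Python, not the vmmemctl driver; the page numbers and names are invented), inflating the balloon moves guest pages onto a pinned list and releases the machine memory that was backing them:

guest_free_pages = list(range(16))                        # guest physical pages on the free list
balloon_pinned = []                                       # pages handed to the balloon driver
host_backing = {p: f"machine-{p}" for p in range(16)}     # guest physical -> machine page

def inflate(n_pages):
    for _ in range(n_pages):
        page = guest_free_pages.pop()     # balloon driver requests memory from the guest
        balloon_pinned.append(page)       # guest pins it so it will never be swapped out
        del host_backing[page]            # hypervisor frees the backing machine page

def deflate():
    while balloon_pinned:
        page = balloon_pinned.pop()       # pages are re-backed on the next guest access
        guest_free_pages.append(page)

inflate(4)
print(len(guest_free_pages), len(host_backing))   # 12 12 -> 4 machine pages reclaimed
deflate()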

Take a look at the figure below and you can see how ballooning works. The VM has one memory page in use by an application and two idle pages that have been pinned by the hypervisor so that they can be claimed by another operating system.

Balloon Driver In Action

Swapping

The memory transfer between guest physical memory and the host swap device is referred to as hypervisor swapping and is driven by the hypervisor.
The memory transfer between the guest physical memory and the guest swap device is referred to as guest-level paging and is driven by the guest operating system.
Host-level swapping occurs when the host is under memory contention. It is transparent to the virtual machine.
The hypervisor will swap out essentially random pieces of memory, without regard for what that memory is doing at the time; it can potentially swap out currently active memory. When swapping, all segments belonging to a process are moved to the swap area, and a process is chosen if it is not expected to run for a while.  Before the process can run again it must be copied back into host physical memory.

Compression

Memory compression acts as a last line of defence against host swapping: memory pages that would normally be swapped out to disk are instead compressed into a compression cache in host memory.  Rather than sending memory pages out to the comparatively slow disk, they are kept compressed in local memory within the host, which is significantly faster.

Only memory pages that are on their way to being swapped and that can be compressed by a factor of 50% or more are compressed; otherwise they are written out to the host-level swap file.
Because of this, memory compression only occurs when the host is under contention and performing host-level swapping.

The compression cache is sized at 10% of a virtual machine’s configured memory by default, to prevent excessive memory pressure on the host, as the compression cache has to be accounted for on every VM.
You can configure a value other than 10% with the advanced host setting Mem.MemZipMaxPct.

When the compression cache is full, the first compressed memory page is decompressed and swapped out to the host-level swap file.
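A minimal sketch of that decision path (illustrative Python, not ESXi code; the 50% threshold and the evict-to-swap behaviour follow the description above, and everything else, including the names, is invented for the example):

import zlib

PAGE_SIZE = 4096
compression_cache = {}   # guest physical page number -> compressed bytes

def swap_out(page_number, contents):
    print(f"page {page_number} written to the host-level swap file")

def reclaim_page(page_number, contents, cache_limit_pages):
    """Compress a page headed for host swap; cache it only if it halves in size."""
    compressed = zlib.compress(contents)
    if len(compressed) > PAGE_SIZE // 2:
        swap_out(page_number, contents)            # poor ratio: straight to disk
        return
    if len(compression_cache) >= cache_limit_pages:
        oldest, data = next(iter(compression_cache.items()))
        del compression_cache[oldest]
        swap_out(oldest, zlib.decompress(data))    # cache full: evict the oldest entry
    compression_cache[page_number] = compressed

reclaim_page(1, b"\x00" * PAGE_SIZE, cache_limit_pages=2)   # compresses well -> cached
print(list(compression_cache))                              # [1]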

Virtual Machine Memory Overhead

Windows virtual machines require more memory with each passing release, and software demands on memory keep growing.  In a virtual environment it is quite simple to increase the amount of memory granted to a virtual machine, especially with features such as hot add, the ability to dynamically increase the amount of memory available to the virtual machine while it is powered on.

Hot add has a few requirements.
From a licensing point of view you need vSphere Advanced, Enterprise or Enterprise Plus as a minimum.
The chart below illustrates the Windows operating systems that support hot add.

Windows Server 2008 Enterprise x64 – Supported
Windows Server 2008 Enterprise x86 – Supported
Windows Server 2008 Standard x64 – Supported *
Windows Server 2008 Standard x86 – Supported *
Windows Server 2003 Enterprise x64 – Supported
Windows Server 2003 Enterprise x86 – Supported
Windows Server 2003 Standard x64 – Not supported
Windows Server 2003 Standard x86 – Not supported

* Server must be rebooted for memory to be registered in Windows.

Adding additional RAM to a virtual machine is so easy these days that it is important not to overlook the effects that increasing the virtual machine memory can have on the host and on the virtual machine datastore.
Every gigabyte of virtual machine memory demands more memory overhead from the ESX host.  The overhead on the host is illustrated in the table below.


As you can see, the total amount of RAM the host requires to power your virtual machines (and future ones) can increase dramatically when you start adding more RAM to your virtual machines, especially when also adding more vCPUs.  This is also true with the multi-core vCPUs in vSphere 4.1, as explained by Jason Boche.

To put this into perspective, if you have a host running ESX with two virtual machines, each with 2 vCPUs and 4GB RAM, you would need to allow 10.1GB of RAM as a minimum.

1.6GB (ESX) + 4GB (VM1) + 242.51MB (overhead VM1) + 4GB (VM2) + 242.51MB (overhead VM2) = 10.1GB
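The same sum, expressed as a small Python helper (a sketch only; the 1.6GB service console figure and the 242.51MB per-VM overhead are the example values above, and the real overhead depends on each VM’s vCPU count and configured RAM):

MB_PER_GB = 1024

def min_host_ram_gb(vm_ram_gb, vm_overhead_mb, esx_overhead_gb=1.6):
    """Minimum host RAM in GB: ESX overhead plus each VM's RAM and memory overhead."""
    total = esx_overhead_gb
    for ram_gb, overhead_mb in zip(vm_ram_gb, vm_overhead_mb):
        total += ram_gb + overhead_mb / MB_PER_GB
    return total

# Two VMs, each 2 vCPUs / 4GB RAM with 242.51MB of overhead:
print(round(min_host_ram_gb([4, 4], [242.51, 242.51]), 1))   # 10.1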

Depending on the load on the virtual machines, they may require up to the full 4GB assigned to them, which leaves no additional memory available.  This is not something I would recommend.  I would instead assign at least 12GB to this configuration, and probably more, so that there is room for expansion at a later date or for adding additional virtual machines.

Storage is another consideration when increasing the amount of virtual machine memory.  The virtual machine swap file, .vswp, is equal in size to the RAM configured for the virtual machine (less any memory reservation). Ensure this is taken into account when thinking about what size datastore is required to store the virtual machines.  The alternative is to specify a separate, fast-access datastore for storing the virtual machine swap files; if possible use SSD drives.
This can be set by following the instructions for Storing a virtual machine swap file in a location other than the default on the VMware website.

With careful planning, running out of memory or storage should never happen if you follow VMware best practices for creating virtual machines.  More information can be found in the vSphere 4.1 Performance Best Practices guide.
Additional information on VMFS datastore storage allocation can be found here.

VMFS Datastore Free Space Calculations

As technology progresses, storage requirements grow.  It seems to be a never-ending pattern.  I remember only a few years ago the maximum configurable LUN size of 2TB seemed huge.  Now it is common to have many LUN carvings making up tens of terabytes of SAN storage.
The downside to all this extra storage is the demand for larger virtual machine disks, and then you find that the VMFS datastores get filled up in no time.
This is something we are all aware of, and it is something we can avoid with enough planning done ahead of time.  (Preventing it filling up, not stopping the demand for more space, that is!)

Before adding any additional virtual machine drives it is important to ensure that enough free space is available for the virtual machines already set up.
To calculate the minimum free space required, use the following formula, courtesy of ProfessionalVMware.

(Total virtual machine VMDK disk sizes + (Total virtual machine RAM sizes × 1.1)) × 1.1 + 12GB

This formula can be used to work out what size the VMFS datastore needs to be. Once you have worked that out, you can deduct it from the total available space on the VMFS datastore to see how much space can be used for additional drives, without resorting to just adding disks until the vSphere server complains that it is running out of free space.

This allows enough for the local installation of ESX(i), an additional 10% for snapshots, plus an additional 10% for overhead.  (12GB for an ESXi install is a little excessive, but I would still recommend leaving this much space as it will be required before you know it.)
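Here is the formula as a small Python helper (a sketch only; the parameter names are invented, and the two 1.1 factors and the 12GB figure are taken straight from the formula and the explanation above):

def min_datastore_gb(total_vmdk_gb, total_vm_ram_gb):
    """Minimum VMFS datastore size in GB, per the formula above:
    (total VMDK sizes + total VM RAM x 1.1) x 1.1 + 12GB.
    The RAM term covers the per-VM .vswp swap files; the 10% uplifts are for
    snapshots and overhead, and the 12GB is for a local ESX(i) installation."""
    return (total_vmdk_gb + total_vm_ram_gb * 1.1) * 1.1 + 12

# Example: 500GB of VMDK files and 64GB of configured virtual machine RAM.
print(round(min_datastore_gb(500, 64), 1))   # 639.4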

ProfessionalVMware have provided a handy Excel spreadsheet for working this out for you.

This formula can prove useful when planning how much storage space is required when performing a P2V migration.  This way you can manage expectations, because you are fully aware from the beginning how much free space you have available in the VMFS datastore.
This is a recommended minimum; you may need to leave more free space depending on your requirements.  ISO files, templates and so on will also need to be taken into account.

Following the calculations you may find that the requirements for free space have been met but you are still getting alarms in the vSphere Client saying you are running out of free space.
The alarms within the vSphere Client are set to raise a warning when 75% of the datastore is in use, and an error when 85% is in use.
This can be adjusted if required by selecting the top-level object and opening the Alarms tab within the vSphere Client.
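As a quick check (a trivial Python sketch; the 75% and 85% defaults are the values mentioned above, and both thresholds are adjustable):

def datastore_alarm(capacity_gb, used_gb, warn_pct=75, error_pct=85):
    """Mirror the default vSphere datastore usage alarms."""
    used_pct = used_gb / capacity_gb * 100
    if used_pct >= error_pct:
        return "error"
    if used_pct >= warn_pct:
        return "warning"
    return "ok"

print(datastore_alarm(500, 400))   # 80% used -> "warning"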

Understanding virtual machine memory

Memory in a virtualised environment is split into three memory types.

Virtual Memory – Allocated by applications through system calls to the operating system.  This works at the application level in the same way on virtual machines as on physical machines.

Physical Memory – Managed at the OS level.  In simplistic terms the OS uses an ‘allocated’ and a ‘free’ list: when an application asks for memory, the OS moves pages from the free list to the allocated list.

Machine Memory – The memory actually physically installed in the host, managed at the hypervisor level.

VM memory allocation starts with no memory; the hypervisor then allocates machine memory to back the guest physical memory as it is touched.  When this memory is later released, the freed memory remains with the guest OS and is not returned to the hypervisor.

A more in-depth way of looking at this is that memory is generally administered by what is known as software-based memory virtualisation.
Each virtual machine’s (VM) memory is controlled by the virtual machine monitor (VMM).
The VMM for each VM maintains a mapping from the guest OS’s memory pages, called the physical pages, to the memory pages of the underlying host machine, called the machine pages.

Each VM sees a contiguous, zero-based addressable memory space; however, the underlying machine memory may not be contiguous, as the host may be running more than one VM at a time.
The VMM intercepts virtual machine instructions that manipulate guest operating system memory management structures so that the actual memory management unit (MMU) on the processor is not updated directly by the virtual machine.
The ESX/ESXi host maintains the virtual-to-machine page mappings in a shadow page table that is kept up to date with the physical-to-machine mappings (maintained by the VMM).

The processor uses the translation lookaside buffer (TLB) on the processor cache for the direct virtual-to-physical machine mappings.
Some CPUs support hardware-assisted memory virtualisation.
AMD SVM-V and the Intel Xeon 5500 series support it. These CPUs have two page tables:
one for the virtual-to-physical translations
one for the physical-to-machine translations
Although hardware-assisted memory virtualisation eliminates the overhead associated with software virtualisation, namely the overhead of keeping shadow page tables synchronised with guest page tables, the TLB miss latency is significantly higher. As a result, workloads with a small amount of page table activity are not adversely affected by software virtualisation, whereas workloads with a lot of page table activity are likely to benefit from hardware assistance.

Transparent memory sharing
Transparent page sharing is on-the-fly de-duplication of memory by looking for identical copies of memory and keeping only one copy, giving the impression that more memory is available to the virtual machine. You can set the scan rate with Mem.ShareScanTime and Mem.ShareScanGHz in the advanced options.
Disable it per VM with sched.mem.pshare.enable set to FALSE.
Use resxtop and esxtop to view PSHARE field of the interactive mode in the memory view.

Virtual machine memory in the vSphere client
Performance tab
Memory Granted – Amount of physical memory mapped to machine memory
Memory Shared – Amount of physical memory whose mapped machine memory has multiple pieces of physical memory mapped to it.
Memory Consumed – Amount of machine memory that has physical memory mapped to it.
Memory Shared Common – Amount of machine memory with multiple pieces of physical memory mapped to it.

Resource Pools
When creating resource pools the system uses admission control to make sure that you cannot allocate what isn’t available.

Reservations
Resources are considered reserved regardless of whether VMs are associated with the pool or not.

Expandable Reservations
The pool can use resources from its parent or ancestors if this check box is selected.

When you move a VM into a resource pool its existing reservation and limits do not change. However shares will reflect the total number of shares in the new resource pool.

Reservation types
Fixed – A predefined limit is imposed and cannot be exceeded
Expandable (default) – The resource pool can borrow resources from its parent resource pool if the parent has an expandable reservation. This setting does not allow you to exceed any reservation or limit settings that have been imposed.
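As a simplified model of how admission control and expandable reservations interact (illustrative Python only, not how vCenter implements it; the class and field names are invented), a child pool first uses its own unreserved capacity and only borrows the shortfall from its parent when it is expandable:

class ResourcePool:
    def __init__(self, reservation_mb, expandable=True, parent=None):
        self.reservation_mb = reservation_mb    # capacity reserved for this pool
        self.expandable = expandable
        self.parent = parent
        self.reserved_mb = 0                    # already promised to VMs / child pools

    def admit(self, request_mb):
        """Admission control: return True if request_mb of reservation fits."""
        free = self.reservation_mb - self.reserved_mb
        if request_mb <= free:
            self.reserved_mb += request_mb
            return True
        # Fixed pools stop here; expandable pools try to borrow the shortfall.
        if self.expandable and self.parent and self.parent.admit(request_mb - free):
            self.reserved_mb += request_mb
            return True
        return False

parent = ResourcePool(8192, expandable=False)              # e.g. an 8GB parent pool
child = ResourcePool(2048, expandable=True, parent=parent)
print(child.admit(3072))    # True: 1024MB of the request is borrowed from the parent
print(parent.reserved_mb)   # 1024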

Further information can be found in the vSphere resource management guide.

Virtual Machine configuration and maximums

Virtual machines are made up of the following files.
vmname.vmx – Config file
vmname.vmdk – Describes the virtual disk characteristics
vmname-flat.vmdk – (hidden by default) Contains the data
vmname.nvram – VM BIOS
vmname.log – log file
vmware#.log – VMware log file
vmname.vswp – Virtual machine swap file on the ESX(i) host
vmname.vmsd – snapshot descriptor file

Limits per VM
Can have up to 8 vCPUs
255GB RAM
2TB-512B Disk size
4 IDE devices (1 controller)
2 FDD (1 controller)
10 vNICs
20 USB devices (1 controller)
3 parallel devices
4 serial devices
4 SCSI adapters
15 SCSI targets per adapter
60 SCSI targets per VM
40 concurrent remote console connections per VM

VM vNIC types
vlance – AKA PCnet32. Supported by most 32-bit OSs. 100Mbps max
vmxnet – Better performance
flexible – Either one of the above
e1000 – High performance
enhanced vmxnet – Enhanced performance, includes support for jumbo frames
vmxnet3 – Builds on vmxnet. Allows you to scale TCP/IP traffic flow in Windows Server 2008 (receive size scalability)

Paravirtualisation allows the VM to communicate directly with the hypervisor for better performance

Virtual machine hardware versions 4 & 7; version 7 is required for newer guest OS support and the newer networking adapter types