Capacity Management in Virtualised Environments

In recent years, the increased popularity of server virtualisation has introduced considerable complexity into the Capacity Management process. This article is an abridged version of a Capacitas whitepaper that describes possible solutions to some of the challenges that may be encountered when conducting capacity management of services based on server virtualisation technologies. The paper also reveals a number of important areas where virtualisation is not able to deliver the magnitude of cost savings that many organisations initially expect.

Workload Insulation
Server virtualisation technologies form a subset of a broader spectrum of workload insulation options, as shown in the diagram below.

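[Figure: spectrum of workload insulation options, from dedicated servers to shared servers with no workload insulation]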

Workload insulation is the method by which services are isolated to minimise their potential impact (in terms of performance, availability etc.) on each other.

The use of completely dedicated servers for each service offers the highest level of workload insulation, but with no flexibility to share resources between workloads. Shared servers with no workload insulation offer the most flexibility to share server resources at the expense of workload isolation, meaning some workloads could negatively impact the performance of others. Various server virtualisation choices strike a balance between these two extremes.

With the exception of Sun Solaris Containers, the majority of server virtualisation solutions are virtual machine-based. Examples include VMware ESX Server, Microsoft Hyper-V, Sun xVM and IBM PowerVM.

Monitoring

A key challenge when implementing server virtualisation is ensuring that monitoring data collected for virtualised operating systems is correct.

Traditional resource utilisation metrics that are used in non-virtualised environments may no longer be meaningful in virtualised ones. For example, because of the way in which virtualised operating systems measure time under ESX Server, utilisation metrics obtained from within VMware guest operating systems, including % CPU time, are over-reported.

Ultimately, decisions based on analysis of incorrect server monitoring data may result in wasted investment in unnecessary upgrades or service-impacting performance problems.

Consolidation Ratios
The consolidation ratio of a server virtualisation solution refers to the number of guest operating systems running on a single host. For estates with multiple host servers, the average consolidation ratio may be calculated by dividing the total number of guest operating systems by the total number of host servers.

When planning a move to a virtualised infrastructure it is not appropriate to use vendors’ consolidation ratios as a basis for estimating the final number of servers that will be required. This is because many applications may not be good candidates for virtualisation and may need to remain on dedicated servers. Possible reasons not to virtualise are discussed later in this paper.

When planning to virtualise, it is the ratio of the current number of servers to the final number of servers that is important, as this determines the cost savings achievable through a reduction in the total number of servers. For example, consider an estate of 70 dedicated servers, of which 10 are not good candidates for virtualisation. If the remaining 60 servers can be virtualised at a 15:1 consolidation ratio, they require 4 host servers, giving a final estate of 14 servers (4 hosts plus 10 dedicated servers). The ratio of servers before and after virtualisation is therefore 70:14, or 5:1, rather than 15:1.

Hence, when planning a virtualisation exercise it is possible to underestimate the final total number of required servers by a factor of three or more. It is very important not to apply the vendors’ quoted best consolidation ratio to your entire server estate.
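
The arithmetic in the 70-server example can be sketched in a few lines of Python. This is a minimal illustration only; the function names are hypothetical and the figures are those quoted in the example above.

```python
import math


def average_consolidation_ratio(total_guests: int, total_hosts: int) -> float:
    """Average number of guest operating systems per host server."""
    return total_guests / total_hosts


def servers_after_virtualisation(total_servers: int,
                                 non_candidates: int,
                                 consolidation_ratio: int) -> int:
    """Final server count: hosts for the virtualised servers plus the
    servers that must remain dedicated."""
    candidates = total_servers - non_candidates
    # Round up, because a partially filled host is still a whole server.
    hosts = math.ceil(candidates / consolidation_ratio)
    return hosts + non_candidates


def estate_reduction_ratio(total_servers: int,
                           non_candidates: int,
                           consolidation_ratio: int) -> float:
    """Ratio of servers before virtualisation to servers after."""
    final = servers_after_virtualisation(
        total_servers, non_candidates, consolidation_ratio)
    return total_servers / final


# Figures from the example: 70 servers, 10 unsuitable, 15:1 on the rest.
print(servers_after_virtualisation(70, 10, 15))  # 14
print(estate_reduction_ratio(70, 10, 15))        # 5.0
```

Applying the consolidation ratio only to the candidate servers, and adding the dedicated servers back in, is what separates the 5:1 estate-wide figure from the 15:1 ratio on the virtualised subset.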

Planning to Consolidate
Determining which of your existing servers are best suited to share resources is a non-trivial exercise that requires significant capacity management expertise.

When planning to consolidate servers using server virtualisation there are a number of potential barriers that may limit the ability of an organisation to virtualise its environments:

1. It is often necessary to implement storage virtualisation through provision of a SAN as an enabler for server virtualisation

2. Any existing servers that are to be consolidated using virtualisation onto the same server must be co-located

3. Although some virtualisation technologies do support heterogeneous guest operating systems, it is more typical to virtualise servers onto similar hosts (Windows on VMware, Solaris in Containers etc.)

4. It may not be possible to virtualise production, test, development and UAT servers onto shared host servers due to rules regarding server hosting in the data centre

Reasons not to Virtualise
Although virtualisation may deliver many benefits, there are a number of further reasons why some servers may not be good candidates for virtualisation:

1. Some third-party software suppliers do not support their products when running in a virtualised environment or may have software licensing models that make costs prohibitively expensive when virtualising

2. Solutions that already employ clustering technologies are not good candidates for consolidation

3. Numerous servers running identical workloads on commodity hardware (such as web server farms) do not benefit from virtualisation

Workload Seasonality
Seasonality describes fluctuations in the usage (and hence resource requirements) of a service or workload over time.

Workloads may exhibit seasonality across various time intervals, such as annual, monthly, weekly, daily or hourly profiles.

When considering the consolidation of two or more systems onto a shared server (or cluster) using virtualisation it is important to consider workload seasonality in order to understand when peak resource utilisations will coincide.

[Table: CPU requirements of the five candidate production servers]

For example, consider the five production servers shown in the table above, which are candidates for server virtualisation. The average % CPU time across a fixed measurement period indicates a total requirement of 31 MHz. However, even in this highly simplified example, examination of the daily profile by hour (shown in the chart below) indicates that a minimum of 56 MHz is in fact required to support the combined workloads.

[Chart: combined daily CPU profile by hour for the five candidate servers]

Analysis of further seasonal data may be expected to reveal other peak periods. Note that it is necessary to consider the impact of seasonality on all resource types and not just % CPU time. A common pitfall is to conduct feasibility analysis over too narrow a time period, thereby missing annual seasonality.

Failure to adequately consider the seasonality of candidate workloads represents a serious risk of insufficient host server capacity, potentially leading to service-impacting performance problems and costly unplanned upgrades.
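
The gap between sizing on averages and sizing on coinciding peaks can be illustrated with a short Python sketch. The profiles below are invented for illustration (three workloads sampled in six four-hour slots) and are not the data from the table or chart referred to above; the function names are likewise hypothetical.

```python
from statistics import mean

# Hypothetical CPU demand in MHz for six four-hour slots of one day.
profiles = {
    "batch":   [30, 30, 6, 6, 6, 6],    # busy overnight
    "web":     [3, 3, 24, 30, 24, 6],   # busy during the day
    "reports": [3, 3, 24, 6, 6, 6],     # busy in the morning
}


def sum_of_averages(profiles: dict) -> float:
    """Capacity estimate obtained by adding each workload's average demand."""
    return sum(mean(values) for values in profiles.values())


def combined_peak(profiles: dict) -> tuple:
    """Return (slot index, demand) for the slot with the highest combined
    demand across all workloads."""
    totals = [sum(slot) for slot in zip(*profiles.values())]
    peak = max(totals)
    return totals.index(peak), peak


print(sum_of_averages(profiles))  # 37
print(combined_peak(profiles))    # (2, 54)
```

With these made-up figures, sizing on averages suggests 37 MHz, whereas the combined workloads need 54 MHz in the busiest slot. The same comparison would need repeating for each resource type and over a period long enough to capture annual peaks.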

Memory Requirements
One common misconception when planning a move towards a virtualised infrastructure concerns the extent to which existing server hardware can be re-used. Target platforms for migrated systems typically require large amounts of memory, often exceeding the upper limit that can be installed in the existing hardware and necessitating the purchase of new servers that support large memory configurations.

A formal capacity planning exercise is recommended to determine the optimum memory configuration for host servers prior to purchase.
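
As a rough illustration of the memory check described above, the Python sketch below adds up guest memory requirements and compares the total with the maximum memory an existing server can hold. The figures, the overhead allowance and the function names are all hypothetical assumptions, and the sketch assumes no memory overcommitment; a real capacity planning exercise would use measured values.

```python
def required_host_memory_gb(guest_memory_gb: list,
                            overhead_gb: float = 0.0) -> float:
    """Memory a host needs: the sum of guest allocations plus an assumed
    allowance for virtualisation overhead."""
    return sum(guest_memory_gb) + overhead_gb


def can_reuse_server(guest_memory_gb: list,
                     max_installable_gb: float,
                     overhead_gb: float = 0.0) -> bool:
    """True if an existing server's maximum memory configuration is large
    enough to host all of the given guests."""
    needed = required_host_memory_gb(guest_memory_gb, overhead_gb)
    return needed <= max_installable_gb


# Hypothetical case: 15 guests of 4 GB each, 2 GB overhead allowance,
# and an existing server that supports at most 32 GB.
guests = [4.0] * 15
print(required_host_memory_gb(guests, overhead_gb=2.0))                    # 62.0
print(can_reuse_server(guests, max_installable_gb=32.0, overhead_gb=2.0))  # False
```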

Resource Management
Resource management is the technique whereby workloads are managed to allow further control over the appropriation of shared server hardware resources. Examples include resource capping, the creation of processor sets or the distribution of processor resources according to CPU shares.

Resource management may be used in conjunction with server virtualisation, in which case each guest operating system is treated as a workload. When implementing resource management it is important to ensure that the rules are documented, preferably in the Configuration Management System (CMS). A lack of understanding of any resource management rules that are being applied can seriously limit the value of capacity management.
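
To show why undocumented rules matter to the capacity planner, the Python sketch below models two of the techniques mentioned above, CPU shares and resource capping, in their simplest form. It is an illustrative assumption rather than the algorithm of any particular hypervisor: all workloads are assumed to be contending for the CPU at once, capacity freed by a cap is not redistributed, and the names and figures are hypothetical.

```python
def allocate_by_shares(capacity_mhz: float, shares: dict) -> dict:
    """Split host CPU capacity between workloads in proportion to their
    shares, assuming every workload is demanding CPU at the same time."""
    total_shares = sum(shares.values())
    return {name: capacity_mhz * value / total_shares
            for name, value in shares.items()}


def apply_caps(allocation: dict, caps: dict) -> dict:
    """Limit each workload to its cap, if one is defined. Capacity freed
    by a cap is NOT redistributed in this simplified sketch."""
    return {name: min(amount, caps.get(name, amount))
            for name, amount in allocation.items()}


# Hypothetical host with 3000 MHz of CPU and three guest workloads.
shares = {"web": 2000, "db": 1000, "batch": 1000}
allocation = allocate_by_shares(3000, shares)
print(allocation)  # {'web': 1500.0, 'db': 750.0, 'batch': 750.0}

capped = apply_caps(allocation, {"batch": 500})
print(capped)      # {'web': 1500.0, 'db': 750.0, 'batch': 500}
```

Under these assumptions the batch guest can never use more than 500 MHz, however idle the host is. An analyst who sees only its utilisation data, and not the rule, could misread that ceiling as a lack of demand.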
