Storage Best Practices for Hosted Virtual Desktops
 
27 September 2010

Robert E. Passmore

Gartner RAS Core Research Note G00206668
 

This research examines storage best practices for hosted virtual desktop deployments.





Overview



There are features found in some storage arrays that can greatly enhance hosted virtual desktop (HVD) projects. HVD projects that adhere to the storage best practices in this research will yield improved results.

Key Findings
  • Users deploying HVD projects are often shocked by the apparent cost of storage.

  • Users who have deployed the technologies described in this research have been able to achieve usable storage area network (SAN)/network-attached storage (NAS) storage costs approaching the costs of consumer disks used in actual desktops.

  • Cache in the storage array is key to adequate performance during boot and login storms.

Recommendations
  • Seek at least one storage vendor that can deliver the capabilities described in this research.

  • Create a scenario/design around that vendor to provide a benchmark to evaluate other vendors and designs.




Analysis



With the rise of HVD implementations, many IT departments have been confronted by questions about how to deploy storage as part of making HVD viable. Other users have discovered that there is a set of storage features that can greatly enhance the success and viability of HVD deployments. Some of the questions include:

  • How can I simplify the creation and maintenance of thousands of disk images?

  • How can I drive per-seat storage costs near to the same level as the original desktops?

  • How can I provide performance to withstand boot, login and virus scan storms?

  • How can I provide availability, data integrity and recovery appropriate for all those desktops?

This research identifies storage best practices for addressing these questions.




How can I simplify the creation and maintenance of thousands of disk images?

Just because many physical desktops are virtualized onto a single server does not mean that all the desktops' disk images go away in an HVD world. However, creating and maintaining them individually, as is done for physical desktops, would be a considerable amount of work. Some storage arrays provide an answer to this problem: using snapshots to spawn desktop images from a single master disk image, or at least from a greatly reduced number of master images.

The master image is created in a manner similar to that used by most organizations that supply their employees with a controlled, common "golden" image. But instead of installing the image on multiple machines, it is stored once on the storage array. Writable snapshots are then taken, one for each intended virtual desktop. Each snapshot is then presented to a virtual desktop using the HVD server (VMware View, Citrix XenDesktop, etc.). Inside the storage array, each snapshot volume contains the master image as a read-only segment, linked to space for any information written from the desktop. Initially during boot, this information will be operating system (OS) and file system boot/configuration data, but as the user creates and loads files, these too are placed in the individual snapshot space. Since there is one snapshot image for each virtual desktop, the individual's work is isolated from others, but the physical master image is shared virtually across all the snapshots. This means that, when it comes time to patch or otherwise update the master image, it only has to be done once for it to automatically become a part of all the snapshot images. Of course, it often won't take effect until the virtual desktop reboots to bring the changes into memory. An additional bonus of simplifying the maintenance of images is that the storage required is greatly reduced from the physical case, where every image requires the full amount of disk.
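
The snapshot arrangement described above can be illustrated with a toy model. This is only a sketch under stated assumptions: the class names `MasterImage` and `WritableSnapshot`, the block-number dictionaries and the sample data are all hypothetical, and real arrays implement this inside their firmware in vendor-specific ways.

```python
class MasterImage:
    """Hypothetical read-only golden image, stored once (block number -> data)."""

    def __init__(self, blocks):
        self.blocks = dict(blocks)

    def patch(self, block_no, data):
        # Patching the master once changes what every snapshot sees
        # for blocks that the snapshot has not overwritten itself.
        self.blocks[block_no] = data


class WritableSnapshot:
    """Hypothetical per-desktop view: shared master plus a private delta."""

    def __init__(self, master):
        self.master = master
        self.delta = {}  # only blocks written by this desktop

    def write(self, block_no, data):
        self.delta[block_no] = data

    def read(self, block_no):
        # Private writes win; everything else comes from the shared master.
        if block_no in self.delta:
            return self.delta[block_no]
        return self.master.blocks[block_no]


master = MasterImage({0: "bootloader", 1: "os-v1", 2: "apps-v1"})
desktops = [WritableSnapshot(master) for _ in range(3)]

desktops[0].write(3, "user-report.doc")  # stays in desktop 0's private space
master.patch(1, "os-v2")                 # one patch, visible to all desktops

for i, d in enumerate(desktops):
    print(i, d.read(1), len(d.delta))
```

In this toy model, a block that a desktop has already overwritten shadows the master's copy, so how a given array handles patches to such blocks is something to confirm with the vendor.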

Action Item: IT administrators need to ensure that their business processes around HVD image creation keep the number of images at a manageable level (ideally, no more than 10).




How can I drive per-seat storage costs near to the same level as the original desktops?

Servers for HVD use NAS- or SAN-based storage, which has a much higher cost per raw gigabyte than the consumer-grade hard drives inside desktops. Even when the poor utilization of desktop disks is taken into account, a simple transfer of all the data to NAS or SAN storage would result in substantially higher costs. Spawning images using a master (as described previously) saves considerable storage (1,000 virtual desktops using a single disk image would be a 1,000:1 reduction). But what about all the user data, which, in most situations, will end up taking much more space in the individual snapshot than the master?

Two technologies found in some modern storage arrays can help address this issue. The first is thin provisioning, which eliminates the overprovisioning of capacity and the reprovisioning labor associated with manual provisioning. This can save 30% to 50% on storage. The second technology is deduplication. As its name implies, any duplicate bit stream found across all the desktop images is eliminated, and the first copy found is the only one stored. Deduplication further reduces the size of the master image, but, even better, can reduce the physical space required to store all the user data by as much as 50% to 90%. Users deploying all three of these techniques (snapshot-spawned images, thin provisioning and deduplication) have reported actually achieving usable storage costs that are near desktop disk levels, which makes all the difference in justifying HVD projects.
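
To see how these savings combine, a rough per-seat estimate can be worked through as follows. This is a back-of-the-envelope sketch, not a sizing method: the function name, the 20 GB image and allocation sizes, and the assumption that the thin provisioning and deduplication savings simply multiply are all illustrative. Only the percentage ranges come from the text above.

```python
def physical_gb_per_seat(master_gb, user_gb_allocated, seats,
                         thin_saving, dedupe_saving):
    """Estimate physical array capacity consumed per virtual desktop.

    Assumes (hypothetically) one shared master image, and that the
    thin provisioning and deduplication savings apply in sequence
    to the per-user allocation.
    """
    master_share = master_gb / seats                      # image stored once
    after_thin = user_gb_allocated * (1 - thin_saving)    # unused space removed
    after_dedupe = after_thin * (1 - dedupe_saving)       # duplicates removed
    return master_share + after_dedupe


# Conservative and optimistic ends of the ranges quoted in the text.
conservative = physical_gb_per_seat(20, 20, 1000, thin_saving=0.30, dedupe_saving=0.50)
optimistic = physical_gb_per_seat(20, 20, 1000, thin_saving=0.50, dedupe_saving=0.90)

print(f"conservative: {conservative:.2f} GB per seat")
print(f"optimistic:   {optimistic:.2f} GB per seat")
```

Multiplying the result by the array's cost per usable gigabyte gives a per-seat figure that can be compared with the price of a desktop disk.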

Action Item: While matching PC hard-disk pricing is unlikely, organizations must use thin provisioning and deduplication to get pricing reasonably close (within 25%).




How can I provide performance to withstand boot, login and virus-scan storms?

A downside of the practices described so far is that they reduce storage costs by reducing the number of disk drives, which, in turn, reduces the input/output (I/O) throughput capability of the system. While the I/O rates of individual desktops vary with activity, application and OS, the good news is that groups of desktops, even thousands of them, produce relatively small average numbers of I/Os during normal operating time. These I/Os also tend to be randomly submitted, yielding few significant peaks in activity. The bad news is that certain activities tend to produce high I/O activity at certain times. For example, all users boot their desktops first thing in the morning, all users log back in immediately after lunch and virus scans all run during off hours. The storage technology that addresses this issue is cache. Traditionally involving RAM, cache today might involve flash, or a mix of RAM and flash. In either case, the first boot, login, etc., pulls the necessary data off the disk into cache, where it stays until the storm is over. The result is that the virtual desktops see much faster performance than they would if the same activities were served from physical disks. And because the contents of cache automatically update to adapt to changing activity, the amount required is relatively small and, therefore, affordable, and does not require additional administrative activity.
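
The effect of cache during a boot storm can be illustrated with a small simulation. This is a simplified sketch: it assumes a least-recently-used (LRU) replacement policy and that every desktop reads the same boot blocks from a shared master image in the same order. The class and function names and all the sizes are hypothetical, and real array caches are considerably more sophisticated.

```python
from collections import OrderedDict


class LRUCache:
    """Minimal LRU block cache that counts hits and misses."""

    def __init__(self, capacity_blocks):
        self.capacity = capacity_blocks
        self.blocks = OrderedDict()
        self.hits = 0
        self.misses = 0

    def read(self, block_no):
        if block_no in self.blocks:
            self.blocks.move_to_end(block_no)    # mark as most recently used
            self.hits += 1
        else:
            self.misses += 1                     # this read goes to disk
            self.blocks[block_no] = True
            if len(self.blocks) > self.capacity:
                self.blocks.popitem(last=False)  # evict least recently used


def simulate_boot_storm(desktops, boot_blocks, cache_blocks):
    """Every desktop reads the same boot blocks, one desktop after another."""
    cache = LRUCache(cache_blocks)
    for _ in range(desktops):
        for block_no in range(boot_blocks):
            cache.read(block_no)
    return cache.hits, cache.misses


# Cache large enough for the boot working set: only the first boot hits disk.
hits, misses = simulate_boot_storm(desktops=1000, boot_blocks=500, cache_blocks=512)
print(hits, misses)

# Cache smaller than the working set: sequential reads evict blocks before reuse.
small_hits, small_misses = simulate_boot_storm(desktops=10, boot_blocks=500, cache_blocks=400)
print(small_hits, small_misses)
```

Under these assumptions, the cache only needs to hold the shared boot working set, not a copy per desktop, which is why a relatively small amount can absorb the storm. The second run shows the opposite case, where an undersized cache provides no benefit.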

Solid-state disk (SSD) drives are based on flash, and, in theory, could be used to produce a performance boost. If manually provisioned, however, the SSD capacity will likely need to be much larger than cache to give similar results. So far, software that automates placement while also responding to short-term activities like boot storms is in short supply.

Action Item: To minimize the impact of high HVD disk activity, storage infrastructures need to be configured with sufficient cache (using SSDs, flash, etc.). IT administrators also need to tune Windows so that disk-intensive activities are properly configured, and selected I/O-intensive desktops should not be virtualized.




How can I provide availability, data integrity and recovery appropriate for all those desktops?

Modern storage systems create high availability and data integrity through redundancy and nondisruptive maintenance processes. There should be no single points of failure in the hardware, and all upgrade processes, including firmware upgrades, should be nondisruptive. Since disk drives are usually the highest-volume component in the array, they are the most frequently failing component, despite a very high mean time between failures (MTBF). For today's large drives, it should be possible to survive two disk failures without losing data; the most economical algorithm and best practice is RAID 6, or its equivalent.
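
The point about drive volume and MTBF can be made concrete with some simple arithmetic. The sketch below assumes a constant failure rate (so expected failures scale linearly with drive count and hours in service) and uses illustrative figures for drive count, MTBF and RAID group size. None of these numbers come from this research.

```python
HOURS_PER_YEAR = 24 * 365  # 8,760 hours


def expected_failures_per_year(drive_count, mtbf_hours):
    """Expected drive failures per year, assuming a constant failure rate."""
    return drive_count * HOURS_PER_YEAR / mtbf_hours


def raid6_usable_tb(drives_per_group, drive_tb):
    """Usable capacity of one RAID 6 group: two drives' worth goes to parity."""
    if drives_per_group < 4:
        raise ValueError("RAID 6 needs at least four drives per group")
    return (drives_per_group - 2) * drive_tb


# Illustrative values only: 200 drives rated at 1,000,000 hours MTBF.
failures = expected_failures_per_year(drive_count=200, mtbf_hours=1_000_000)
usable = raid6_usable_tb(drives_per_group=8, drive_tb=1.0)

print(f"expected failures per year: {failures:.3f}")
print(f"usable TB per 8-drive RAID 6 group: {usable}")
```

Even with a very high per-drive MTBF, an array with a few hundred drives should expect failures every year, which is why surviving a second failure during a rebuild matters.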

Logical failures (e.g., human error and software errors) are traditionally protected against by backup to tape or disk, and these methods work for HVD, but are applied to the server, not to the individual desktops. The result is recovery time objectives (RTOs) proportional to the size of the data, and recovery point objectives (RPOs) tied to the frequency of backup, which is historically once every 24 hours. Substantial improvements to both RTO and RPO can be accomplished, along with high probability of recovery, by using a snapshot train in the primary array. This requires an array that can take snapshots of snapshots (the snapshot-spawned desktop images), but the recovery time is typically the time to mount the snapshot, which is independent of image size, and leads to much faster data access. The RPO with this method is the snapshot interval, typically set between one and four hours. The incremental storage required for these snapshots is typically about 25% of the space required by a deduplicating, disk-based backup target, which is another cost savings.
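
The relationship between the snapshot interval and the RPO can be sketched in a few lines. The function names, the two-hour interval and the timestamps below are hypothetical, chosen only to illustrate that the data lost is the time since the last snapshot and is therefore bounded by the interval.

```python
from datetime import datetime, timedelta


def snapshot_train(start, end, interval):
    """Return the snapshot times from start to end at a fixed interval."""
    times = []
    t = start
    while t <= end:
        times.append(t)
        t += interval
    return times


def recovery_point(snapshots, failure_time):
    """Pick the latest snapshot taken at or before the failure."""
    candidates = [s for s in snapshots if s <= failure_time]
    return max(candidates) if candidates else None


interval = timedelta(hours=2)                   # within the one-to-four-hour range
start = datetime(2010, 9, 27, 8, 0)
train = snapshot_train(start, start + timedelta(hours=8), interval)

failure = datetime(2010, 9, 27, 13, 30)         # hypothetical logical failure
point = recovery_point(train, failure)
data_loss = failure - point                     # work done since that snapshot

print("recover from:", point)
print("data lost:", data_loss)
```

With a nightly backup, the same failure could lose most of a working day; with the snapshot train, the loss is limited to the time since the most recent snapshot.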

True disaster protection requires replication of the data to a second site, and, although there is a variety of replication technologies, the best method for HVD is snapshot-driven replication. It has the advantage of creating and maintaining an identical snapshot train at the second site, as well as the latest copy of the data. This means that, in the event of a major disaster, the user has the option to pick an arbitrary point in the past for recovery.

Many storage vendors not only feature these techniques, but make deployment a simple matter of turning on the features and choosing a few policy settings.

Action Item: To avoid simultaneous outages of all virtual desktops, users should specify, configure, and operate storage for the highest levels of data protection and availability. Technology and practice for recovery from logical failures must be selected for appropriate RPO and RTO.




Bottom Line

Users evaluating the possibility of HVD deployments should, in the early stages, seek at least one storage vendor that can deliver the capabilities described in this research. Then, create a scenario/design around that vendor to provide a benchmark to evaluate other vendors and design options as they appear. A proof of concept (POC) may be required to predict actual capacity, performance and cost, but, in many cases, the storage vendor will be able to provide the data without a full POC. Users who follow this advice will have a much higher probability of successful payback than those who don't.


© 2010 Gartner, Inc. and/or its affiliates. All rights reserved. Gartner is a registered trademark of Gartner, Inc. or its affiliates. Reproduction and distribution of this publication in any form without prior written permission is forbidden. The information contained herein has been obtained from sources believed to be reliable. Gartner disclaims all warranties as to the accuracy, completeness or adequacy of such information. Although Gartner's research may discuss legal issues related to the information technology business, Gartner does not provide legal advice or services and its research should not be construed or used as such. Gartner shall have no liability for errors, omissions or inadequacies in the information contained herein or for interpretations thereof. The opinions expressed herein are subject to change without notice.