VMware Performance and Deduplication
Last week I talked about deduplication and its advantages for VMware environments. One reader asked, “If the savings are as great as you’re saying, why isn’t deduplication in VMware environments everywhere today?” On the surface, I think the answer is simple: the key challenge to VMware deduplication is performance. IT organizations have come to understand how deduplication saves on storage costs, but they are generally concerned about any resulting performance impact. The good news is that Albireo is specifically designed for highly efficient primary storage deduplication.
Albireo’s unique design provides efficient deduplication without any performance penalty because Albireo runs entirely out of the read path. Albireo’s role is to provide deduplication advice to existing storage subsystems. Simply and elegantly, Albireo takes advantage of the built-in capabilities of modern file system and block virtualization layers provided by all mainstream storage vendors. With this design, Albireo creates no more performance impact than when common features such as clones and snapshots are in use on the storage system.
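To make the idea of "deduplication advice" concrete, here is a minimal sketch of a fingerprint index that only tells the caller whether a chunk has been seen before and where. It is an illustration under stated assumptions, not Albireo's actual API: the class name `AdviceIndex`, the `advise` and `ingest` functions, the in-memory dictionary, and the choice of SHA-256 are all hypothetical. The 4 KB chunk size mirrors the small chunk size cited in this post.

```python
import hashlib

CHUNK_SIZE = 4096  # 4 KB fixed-size chunks (assumption for this sketch)


class AdviceIndex:
    """Hypothetical in-memory fingerprint index; not Albireo's real interface."""

    def __init__(self):
        # Maps a chunk fingerprint to the address where it was first seen.
        self._seen = {}

    def advise(self, address, chunk):
        """Return the address of an identical earlier chunk, or None if new."""
        fingerprint = hashlib.sha256(chunk).digest()
        existing = self._seen.get(fingerprint)
        if existing is None:
            self._seen[fingerprint] = address
        return existing


def ingest(index, data, chunk_size=CHUNK_SIZE):
    """Split data into fixed-size chunks and collect (offset, duplicate_of) advice."""
    advice = []
    for offset in range(0, len(data), chunk_size):
        chunk = data[offset:offset + chunk_size]
        duplicate_of = index.advise(offset, chunk)
        if duplicate_of is not None:
            advice.append((offset, duplicate_of))
    return advice


# Three chunks, where the third repeats the first.
sample = b"A" * CHUNK_SIZE + b"B" * CHUNK_SIZE + b"A" * CHUNK_SIZE
result = ingest(AdviceIndex(), sample)
```

Note that the index never touches stored data or serves reads. In this sketch it only returns advice, and acting on that advice (for example, by sharing a block the way a clone or snapshot would) is left to the storage system.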
So what about ingestion speeds? This is where Albireo shines. The grid-based capabilities of the Albireo index allow it to be scaled across multiple independent CPU resources to provide deduplication advice on ingest at breakneck speeds. While Albireo is rated in excess of 140 MB/sec per processor core on a 3.0 GHz Intel Xeon even when operating with a small 4 KB chunk size, performance is even greater with larger chunk sizes. With 64 KB chunks, we’ve seen Albireo grid-based deduplication scale to support ingest speeds over 5.5 GB/sec. These speeds are more than adequate to meet the needs of even the most intensive VMware environments. As a result, Albireo avoids the performance impact and added latency that have been identified in early-stage deduplication backup offerings.
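For a rough sense of scale, the figures above can be turned into a back-of-the-envelope core count. This sketch assumes throughput scales linearly with cores and that 1 GB = 1000 MB; neither assumption comes from the measurements quoted here, and the function name `cores_needed` is made up for illustration. Because the 140 MB/sec figure is the 4 KB chunk rating, the result is a conservative upper bound for larger chunk sizes.

```python
import math

PER_CORE_MB_S = 140    # rated per-core throughput at 4 KB chunks (from the post)
TARGET_MB_S = 5500     # 5.5 GB/sec, assuming 1 GB = 1000 MB


def cores_needed(target_mb_s, per_core_mb_s=PER_CORE_MB_S):
    """Cores required to reach a target ingest rate, assuming linear scaling."""
    return math.ceil(target_mb_s / per_core_mb_s)


upper_bound = cores_needed(TARGET_MB_S)
```

Under these assumptions, about 40 cores at the 4 KB rate would cover 5.5 GB/sec. Since per-core throughput is higher with 64 KB chunks, a real grid would likely need fewer.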
What’s more, Albireo allows storage vendors to give their customers choice. With Albireo, an OEM can choose to implement deduplication inline, in parallel, and/or as a post-process event, providing the greatest flexibility for performance-intensive environments while simplifying manageability. This means that on the latest powerful storage equipment, deduplication can be performed inline with no performance impact, while on more resource-constrained storage hardware, customers simply schedule deduplication to run during off hours. Either way, there is no performance impact to the user environment.
Finally, I want to talk a little about a way in which deduplication can actually improve read performance. While Albireo itself doesn’t sit in the read path, it does provide advice that allows blocks in the read path to be shared. Like VMware Transparent Page Sharing (TPS), Albireo looks at new data and provides advice on how to eliminate duplicate blocks using existing capabilities of modern storage systems. These existing capabilities keep a single copy of each deduplicated block in the read cache, reducing the amount of memory needed for caching the operating systems and applications stored in VMware environments. The result is improved overall performance when reading these shared blocks. This becomes especially important in virtual desktop deployments, where many users logging in at the same time can result in boot storms. In these environments, deduplication can reduce the impact of boot storms by up to 50%.
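The cache effect can be illustrated with a toy model. It assumes a read cache keyed by physical block, so that logical blocks mapped to the same physical block share one cache entry. The function `cache_footprint`, the three-VM layout, and the four-block OS image are all hypothetical, and the numbers are not measurements.

```python
def cache_footprint(block_maps):
    """Count distinct physical blocks a cache must hold to serve every VM.

    Each element of block_maps is one VM's mapping of logical block number
    to physical block identifier.
    """
    cached = set()
    for block_map in block_maps:
        cached.update(block_map.values())
    return len(cached)


VMS = 3
OS_BLOCKS = 4  # logical blocks in the (identical) OS image of each VM

# Without deduplication, every VM has its own physical copy of each block.
without_dedup = [{lb: (vm, lb) for lb in range(OS_BLOCKS)} for vm in range(VMS)]

# With deduplication, identical logical blocks point at one shared physical block.
with_dedup = [{lb: ("shared", lb) for lb in range(OS_BLOCKS)} for vm in range(VMS)]

before = cache_footprint(without_dedup)
after = cache_footprint(with_dedup)
```

In this toy case the cache needs 12 entries without sharing and 4 with it, which is why many desktops booting the same image at once put less pressure on cache when their blocks are shared.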