Increasing Density of Red Hat Storage with Permabit VDO
Last week Permabit announced a collaboration with Red Hat to support our Virtual Data Optimizer (VDO) software under Red Hat Storage. As part of this collaboration, Permabit Labs tested and certified VDO data reduction for Linux® with both Red Hat Ceph Storage and Red Hat Gluster Storage software. The purpose of this testing was to demonstrate how data center IT organizations can use the combination to reduce storage footprint in their private and hybrid cloud storage environments.
In addition to verifying that VDO is compatible with Red Hat Storage, Permabit Labs looked at space savings benefits from data reduction. This post will focus on the data reduction results we saw with VDO and Red Hat Ceph Storage.
Our data reduction testing focused on three standard datasets we’d collected in the past to help us understand how VDO interacts with distributed open storage that spans multiple devices. It’s important to realize that VDO data reduction occurs at the 4 KB block level, whereas Ceph chunking depends on user configuration; our initial capacity savings tests used a 4 MB object size for Ceph. To evaluate data reduction benefits we assembled a 575 GB dataset of 1151 files, made up of the following subsets:
- Datasets from the Federal Communications Commission that measure broadband usage in the United States, consisting of 277 files accounting for 178 GB of highly compressible data. In standalone testing with VDO, this set saw 45% space savings, primarily from compression.
- A set of VM images consisting of 82 files accounting for 145 GB. This dataset was developed in consultation with ESG Labs for a report on Permabit’s Albireo SDK. This set saw 97% space savings with VDO, primarily from data deduplication.
- A set of RMAN backups consisting of 792 files accounting for 252 GB, developed to evaluate VDO space savings with Oracle RMAN. This set saw 84% savings from a combination of deduplication and compression.
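As a quick sanity check on the figures above, a one-liner can confirm the subset totals and convert each savings percentage into the equivalent reduction ratio (ratio = 1 / (1 − savings)); the subset names are just labels for the three sets described above:

```shell
# Sanity-check the dataset totals and convert each subset's space
# savings percentage into an equivalent data reduction ratio.
awk 'BEGIN {
  printf "total size:  %d GB\n", 178 + 145 + 252   # FCC + VM + RMAN sizes
  printf "total files: %d\n",    277 + 82 + 792    # per-subset file counts

  split("FCC VM RMAN", name)
  split("0.45 0.97 0.84", sav)                     # observed space savings
  for (i = 1; i <= 3; i++)
    printf "%-4s %.0f%% savings -> %.2f:1\n",
           name[i], sav[i] * 100, 1 / (1 - sav[i])
}'
```

The 97% savings on the VM subset, for example, works out to better than a 33:1 reduction ratio, which is why that subset dominates the best-case results below.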
To test data reduction efficiency on this dataset with VDO and Red Hat Ceph Storage, we started with a Ceph cluster of 9 servers with 3 OSDs each, for a total of 27 OSDs. VDO is a Linux device mapper module, so each OSD in the Ceph cluster was configured to sit on top of a VDO device.
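Concretely, stacking one OSD on VDO looks roughly like the following. This is a minimal sketch, not our exact test procedure: the device name (/dev/sdb), volume name, and logical size are illustrative, and it assumes the `vdo` manager CLI and `ceph-volume` are available; the same steps would be repeated for each OSD.

```shell
# Create a VDO volume on the raw disk. The logical size is
# thin-provisioned larger than the physical disk to expose the
# expected data reduction headroom (the 10T figure is illustrative).
vdo create --name=vdo_osd0 --device=/dev/sdb --vdoLogicalSize=10T

# Hand the resulting device-mapper device to Ceph as an OSD, so all
# writes to that OSD pass through VDO's dedupe/compression layer.
ceph-volume lvm create --data /dev/mapper/vdo_osd0
```

Because VDO sits below the OSD, Ceph itself needs no configuration changes; it simply sees a block device.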
We first configured the Ceph cluster for 2 replicas and wrote the datasets described above. Overall we saw 61% savings, a 2.5:1 data reduction ratio, and 90% savings, or 10:1, from the VM subset alone.
We then reconfigured the cluster to use erasure coding. In this second case, objects are split into smaller chunks, so there were fewer full 4 KB blocks to benefit from deduplication and compression and we expected lower results. Here we saw 50% savings, a 2:1 data reduction ratio, and 80% savings, or 5:1, with the VM subset alone.
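To put the two configurations side by side, the savings translate into raw-capacity footprints roughly as follows. This is a back-of-the-envelope sketch, not a measured result: it assumes the observed savings apply uniformly across OSDs, and the 4+2 erasure-coding profile (1.5x raw overhead) is a hypothetical example, since the profile is a user choice.

```shell
# Rough raw-capacity footprint for the 575 GB test dataset under
# 2-replica vs. a hypothetical 4+2 erasure-coded pool, with and
# without the VDO savings observed in each case.
awk 'BEGIN {
  logical = 575                            # GB written in the test

  # 2-replica pool: every object is stored twice.
  rep_raw      = logical * 2
  rep_with_vdo = rep_raw * (1 - 0.61)      # 61% savings observed

  # Erasure coding, e.g. k=4,m=2: 6 chunks per 4 data chunks = 1.5x.
  ec_raw      = logical * 1.5
  ec_with_vdo = ec_raw * (1 - 0.50)        # 50% savings observed

  printf "2-replica: %4d GB raw -> %6.1f GB with VDO\n", rep_raw, rep_with_vdo
  printf "EC (4+2):  %4.0f GB raw -> %6.1f GB with VDO\n", ec_raw, ec_with_vdo
}'
```

Even with the lower per-block savings under erasure coding, both layouts end up well under the dataset's logical size in raw capacity consumed.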
The results demonstrate how VDO, combined with Red Hat Ceph Storage, can be used to reduce storage footprint and maximize data center density. There is also the added benefit of lowering ongoing costs of power and cooling. Data reduction has become a key requirement for today’s Linux hyperscale solutions, enabling businesses to optimize their storage consumption and minimize the need for expensive data center expansion. The combination of VDO and Red Hat Ceph Storage clearly does a terrific job of addressing this requirement and gives businesses the added edge they need in optimizing storage in the cloud.