Albireo – Storage Optimization Realized
In my last post, I gave the history of Albireo and mentioned that we came to recognize seven key attributes as absolute requirements for an integrated primary deduplication solution.
First, Albireo supports block, file, and the new unified (or converged) storage platforms. By addressing all types of primary storage, we avoid leaving huge amounts of users’ data unoptimized. Additionally, next-generation storage platforms put block and file data on the same underlying storage, and Albireo makes it possible to identify and deduplicate data across both.
Next, we uniquely provide the ability to scale deduplication across a pool of storage many petabytes in size, instead of limiting deduplication to smaller islands of a few terabytes. This is critical to delivering high rates of deduplication.
Further, Albireo delivers sub-file, content-aware deduplication. Whole-file single instancing just doesn’t cut it for common primary data files, like Office documents or virtual system images, because two files that differ by even a single byte are stored in full. Albireo can identify optimal boundaries in a variety of file types and then deduplicate segments as small as the storage can support. This granularity delivers industry-leading deduplication efficiency to our partners.
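To make the difference between whole-file single instancing and sub-file deduplication concrete, here is a minimal Python sketch. It is not Albireo code: the 4 KB segment size, the function names, and the two sample "images" are all assumptions for illustration, and it uses simple fixed-size segments rather than the content-aware boundary selection described above.

```python
import hashlib

SEGMENT_SIZE = 4096  # assumed segment size; the post does not specify one


def segments(data: bytes, size: int = SEGMENT_SIZE):
    """Split data into fixed-size segments (the last one may be shorter)."""
    return [data[i:i + size] for i in range(0, len(data), size)]


def stored_bytes_whole_file(files):
    """Whole-file single instancing: keep one copy of each distinct file."""
    unique = {hashlib.sha256(f).digest(): len(f) for f in files}
    return sum(unique.values())


def stored_bytes_sub_file(files):
    """Sub-file deduplication: keep one copy of each distinct segment."""
    unique = {}
    for f in files:
        for seg in segments(f):
            unique[hashlib.sha256(seg).digest()] = len(seg)
    return sum(unique.values())


# Two hypothetical system images that share a base and differ in one segment.
base = bytes([1]) * SEGMENT_SIZE + bytes([2]) * SEGMENT_SIZE
image_a = base + bytes([3]) * SEGMENT_SIZE
image_b = base + bytes([4]) * SEGMENT_SIZE

whole = stored_bytes_whole_file([image_a, image_b])
sub = stored_bytes_sub_file([image_a, image_b])
print(whole, sub)
```

Because the two images are not byte-for-byte identical, whole-file single instancing keeps both in full (six segments), while the sub-file approach keeps only the four distinct segments.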
As I explained in my last post, Albireo is successful because it is an embedded, integrated solution. Because it integrates directly with primary storage vendors’ technology, it leverages all their existing R&D. We also make it possible to integrate Albireo as inline, post-process, or parallel deduplication, whichever best matches the underlying storage platform. This means that there are no rough edges where features of the underlying storage are lost; the deduplication is transparent and automatic. Users may not even know that their storage is using Albireo for deduplication, except for the levels of savings and performance far beyond what anyone has seen before.
Finally, because Albireo is delivered as a software toolkit, it is integrated outside of the storage read path. Our technology solves the hardest part of deduplication, namely sub-file duplicate identification, and then leverages the vendor’s existing file and block system metadata to eliminate duplicates. As a result, read operations only look at the file or block metadata, with no need to consult our indexes, so we have no impact on read performance, functionality, or data integrity. Even if our software is turned off, all user data remains accessible. This is unique in the industry.
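The idea of keeping duplicate identification out of the read path can be sketched in a few lines of Python. This is a toy model under stated assumptions, not the Albireo toolkit or any vendor's API: `DuplicateIndex` and `Volume` are hypothetical names, the index is a plain in-memory dictionary, and hash collisions and freeing of overwritten blocks are ignored.

```python
import hashlib


class DuplicateIndex:
    """Hypothetical stand-in for a duplicate-identification index."""

    def __init__(self):
        self._seen = {}  # content hash -> physical block id

    def lookup_or_record(self, data: bytes, physical_id: int) -> int:
        """Return the id of an identical known block, else record this one."""
        key = hashlib.sha256(data).digest()
        return self._seen.setdefault(key, physical_id)


class Volume:
    """Toy volume whose own metadata maps logical to physical blocks."""

    def __init__(self, index=None):
        self.index = index    # optional; consulted only on the write path
        self.block_map = {}   # logical address -> physical block id
        self.physical = {}    # physical block id -> data
        self._next_id = 0

    def write(self, addr: int, data: bytes) -> None:
        new_id = self._next_id
        target = self.index.lookup_or_record(data, new_id) if self.index else new_id
        if target == new_id:
            # No duplicate known: store the data as a new physical block.
            self.physical[new_id] = data
            self._next_id += 1
        # Duplicate or not, the volume's own metadata records the mapping.
        self.block_map[addr] = target

    def read(self, addr: int) -> bytes:
        # Reads use only the volume's metadata, never the index.
        return self.physical[self.block_map[addr]]


vol = Volume(index=DuplicateIndex())
vol.write(0, b"A")
vol.write(1, b"A")  # duplicate: mapped to the existing physical block
vol.write(2, b"B")

vol.index = None    # "turn off" the deduplication software
print(vol.read(1), len(vol.physical))
```

After the index is removed, every read still succeeds, because the logical-to-physical mapping lives entirely in the volume's own metadata; new writes simply stop being deduplicated.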
I’m extremely excited to now be able to talk publicly about Albireo and the deduplication benefits it provides to existing primary storage technologies. We’ve been focused on this for the past year, and the success of our partners’ integration efforts has confirmed the ease of integration and technological power of the Albireo toolkit. Through our partners, users will be using Albireo soon, and I’m sure that they’ll be pleased with the results.