Left Lane Driving and Primary Storage Optimization
I grew up in the Detroit area, and I often go back to automobiles when I think about buying behaviors. Some time ago I determined that engine technologies had advanced enough that I could make fuel efficiency and clean emissions my primary criteria for new vehicle purchases. The first efficient, clean vehicle I bought was a clean diesel for one of my daughters. The car gets better than 50 miles to the gallon at highway speeds; it is peppy around town, has a full complement of airbags and top safety ratings; it is comfortable, seats five, looks good, and she loves it. I recently bought a hybrid, and it, too, is outstanding. It is fast, smooth, safe and as comfortable a car as I have ever had.
My daughter and I accepted no performance reduction (0-60 mph, braking, driving range), no safety rating decline (these are solid vehicles, not the size of a postage stamp) and no feature hit (comfort, curb appeal, capacity, cooling, sound, guidance, sunroof) to drive efficient and clean vehicles.
New technologies like these take broad hold when you sacrifice nothing to attain efficiency. My daughter and I can safely and comfortably drive in the left lane, and we can do it in our fuel-efficient and clean vehicles.
I believe there are many parallels to the adoption of primary data optimization. We all want it. We all need it. If deduped primary storage were as fast, as feature-rich and as safe as non-optimized storage, we would have it already. Let’s examine each of these key adoption hurdles and how they can be addressed in data-optimized primary storage.
Maintain High Performance – Customers demand storage performance. They have built their entire production environments and workflows around high I/O rates. Large primary storage companies like EMC, HDS and HP and smaller high-performance NAS companies like BlueArc, Isilon and Symantec have invested hundreds of millions in R&D focused on performance to enable customers to process and store petabytes of data at near line speed. I/O speed, like 0-60 acceleration, is here to stay. So, to be broadly adopted, primary dedupe cannot degrade I/O performance.
Here are two ways to solve this challenge. First, data optimization cannot be a “bump in the wire”; it must be out of the data path. Second, it must be flexible enough to be deployed as a parallel or post-process operation, depending on the workload, platform, and required performance characteristics of the storage.
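To make the post-process idea concrete, here is a minimal sketch in Python. It is only an illustration under stated assumptions, not how any particular product works: the function name `find_duplicate_blocks`, the fixed 4 KB block size and the use of SHA-256 are all hypothetical choices. The scan runs after the data has been written, only reads the blocks, and reports which ones are duplicates, so nothing sits in the write path.

```python
import hashlib


def find_duplicate_blocks(blocks):
    """Scan already-written blocks and report duplicates without altering them.

    Hypothetical helper for illustration only. Returns a dict that maps the
    index of each duplicate block to the index of the first block holding
    the same content.
    """
    seen = {}        # digest -> index of the first block with that content
    duplicates = {}  # index of a duplicate block -> index of the first copy
    for i, block in enumerate(blocks):
        digest = hashlib.sha256(block).digest()
        if digest in seen:
            duplicates[i] = seen[digest]
        else:
            seen[digest] = i
    return duplicates


# Four 4 KB blocks (block size is an assumption); two are repeats.
blocks = [b"A" * 4096, b"B" * 4096, b"A" * 4096, b"B" * 4096]
duplicates = find_duplicate_blocks(blocks)
```

In this sketch the storage stack would decide what to do with the report, for example by sharing one physical copy between the duplicate blocks, while the scan itself leaves the stored data untouched.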
Preserve Features – We like our bells and whistles. Storage providers have made large investments to differentiate their primary offerings, each with its own flavor of snapshots, replication, provisioning and many other features. Once an enterprise adopts these features, it won’t give them up. However, inline deduplication can mask many of these features or render them inoperable.
Again, the answer is to deploy dedupe out of the data path. When deployed out of the data path, deduplication optimizes many features and also shrinks the storage footprint. So, you can have the sunroof, heated seats and full-size vehicle as well as fuel efficiency.
Assure Data Safety – Customers assume primary storage will protect their data and, again, storage companies have made huge R&D investments to protect information integrity. Just as I wasn’t willing to have my daughter drive an unsafe “toy car” to achieve fuel efficiency, data safety in a deduped system cannot be compromised. Any “bump in the wire” solution introduces a single point of failure to the storage system. But wait a minute, it gets worse. Many solutions are not only in the data path, but they also alter the data structures. Enterprises must run from any solution that “hydrates or re-hydrates” data. When data is altered, it becomes inaccessible if the application that altered it is ever unavailable or down. The point is simple: in a deduped primary system, the data must remain unaltered and data safety must be managed by the storage stack. Anything less is a step backwards in data protection.
Up to now, adopting primary deduplication took primary storage from the left lane to the breakdown lane. Today, I’m pleased to announce our revolutionary OEM Data Optimization SDK, Albireo. Permabit is delivering Albireo to storage OEMs who will bring it to market in their products beginning later this year. Albireo is the result of nearly ten years of R&D focused on advancing deduplication technologies. When our CTO (and founder), Jered Floyd, conceived of Albireo, it was with a singular mission: to deliver data optimization for primary storage with no performance reduction, no data safety decline and no feature hit. Albireo does all this, leverages all standard storage architectures and produces best-in-class deduplication rates.
That is Albireo. Data Optimization for Primary Storage – without compromise.
Embrace efficiency.