OEM Data Optimization Solutions
Well, once again we see one of our competitors in the deduplication market leveraging Permabit strategy and content. First it was Symantec with “Dedupe Everywhere,” and this time it is HP using our concept of “Dedupe 2.0” in their recent StoreOnce announcements. Just to get the facts straight, we launched our Dedupe 2.0 campaign back in August 2009 when we introduced www.dedupe2.com. For us, Dedupe 1.0 was all about backup. Dedupe 2.0 is “dedupe everywhere” (backup, primary, SSD, cloud, applications, OS, anywhere…). One and a half years and hundreds of HP visits to our website later, HP has just announced…(drumroll) “Dedupe 2.0,” and it’s what? Still backup? How retro! (more…)
Some decisions are easier than others.
Picking a dedupe vendor can be obvious too.
Permabit, the better way to dedupe!
There are two approaches to where deduplication technology gets deployed: at the source and at the target. While each has its benefits, the real issue is identifying which is the better choice for your application. When implementing a deduplication technology, a decision needs to be made as to whether data is processed at the source location (i.e., before data is transmitted over the wire) or at the target destination (i.e., after the data reaches the destination). There are pros and cons to both of these approaches. (more…)

Not all athletes perform at the same level.
Dedupe solutions perform at different rates too.
Permabit, the better way to dedupe!
Deduplication can be implemented in a number of different ways depending on the needs of the technology provider. Each approach has trade-offs that can affect cost and performance. Let’s explore them:
Post Process – With post-process deduplication, new data is first stored on a storage device and then analyzed for duplication at a later time. The benefit is that there is no need to wait for the hash calculations and lookup to complete before storing the data, so incoming data is not delayed and the deduplication work is not visible to the end user. The trade-off, however, is that a significant amount of disk capacity is consumed to house the pre-processed data (caching). This leaves less capacity for actual long-term storage, diminishing storage efficiency and raising the overall cost. What happens when you start running out of the disk space you have allocated for the cache? How can you predict when the process will be completed? IT introduced a new term, the “dedupe window,” to describe the time it takes for a post-process effort to complete. When this technique is applied to backup, users have to make sure the dedupe process completes before the next backup cycle begins, or they will fall behind and never catch up. Post-process is typically implemented for deduplication technologies that are not fast enough to avoid an impact on real-time performance.
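To make the “never catching up” concern concrete, here is a minimal sketch that tracks how much un-deduplicated data is left over after each backup cycle. The function name, its parameters, and all of the numbers are hypothetical and chosen only for illustration; they do not describe any particular product.

```python
def backlog_after_cycles(ingest_gb, dedupe_gb_per_hour, window_hours, cycles):
    """Return the un-deduplicated backlog (in GB) left after each backup cycle.

    All inputs are illustrative assumptions:
      ingest_gb          - new data landed in the cache per backup cycle
      dedupe_gb_per_hour - throughput of the post-process pass
      window_hours       - hours available before the next backup starts
    """
    backlog = 0.0
    history = []
    for _ in range(cycles):
        backlog += ingest_gb                           # backup lands raw data
        processed = dedupe_gb_per_hour * window_hours  # most the pass can do
        backlog = max(0.0, backlog - processed)        # leftover carries over
        history.append(backlog)
    return history


# A pass that is too slow for the window falls further behind every cycle.
too_slow = backlog_after_cycles(1000, 100, 8, cycles=3)
# A pass that fits inside the window clears the cache each time.
fast_enough = backlog_after_cycles(1000, 150, 8, cycles=3)
print(too_slow)     # [200.0, 400.0, 600.0]
print(fast_enough)  # [0.0, 0.0, 0.0]
```

In this toy model the leftover backlog is also cache capacity that cannot be released, so a pass that overruns its window costs both time and disk space.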
Inline – With inline deduplication, the hash calculations are performed on the target device in real time, as the data enters the device. The benefit of inline deduplication over post-process deduplication is that it requires less storage: duplicate data is never written in the first place, and there is no need to set aside storage for data caching.
On the negative side, it is frequently argued that because hash calculations and lookups take time, inline deduplication can slow data ingestion and reduce the throughput of the device. Since many storage vendors spend millions of dollars optimizing their storage, anything that impedes that efficiency is not looked upon favorably.
With the multicore processors available in storage devices today and the emergence of intelligent hashing, this is much less of a concern. Given the compute power available for the hash calculations and the index lookup for duplicates, plus the optimization that intelligent hashing brings, data can be processed far more efficiently and the need for post-process can be eliminated. Inline deduplication has now become the preferred method.
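The capacity difference between the two approaches can be sketched in a few lines of Python. This is a toy model under stated assumptions, not how any vendor implements deduplication: the fixed 4 KB chunk size, the SHA-256 hash, the in-memory dictionary index, and the class and function names (`InlineStore`, `PostProcessStore`, `dedupe_into`, `stored_bytes`) are all hypothetical.

```python
import hashlib

CHUNK_SIZE = 4096  # assumed fixed-size chunks, for illustration only


def dedupe_into(index: dict, data: bytes) -> None:
    """Hash each chunk and keep only chunks the index has not seen."""
    for offset in range(0, len(data), CHUNK_SIZE):
        chunk = data[offset:offset + CHUNK_SIZE]
        digest = hashlib.sha256(chunk).hexdigest()
        if digest not in index:          # index lookup for duplicates
            index[digest] = chunk


def stored_bytes(index: dict) -> int:
    """Capacity consumed by the unique chunks held in the index."""
    return sum(len(chunk) for chunk in index.values())


class InlineStore:
    """Deduplicates on the write path, so duplicates never reach disk."""

    def __init__(self):
        self.index = {}
        self.peak_bytes = 0

    def write(self, data: bytes) -> None:
        dedupe_into(self.index, data)
        self.peak_bytes = max(self.peak_bytes, stored_bytes(self.index))


class PostProcessStore:
    """Lands raw data in a staging cache and deduplicates it later."""

    def __init__(self):
        self.index = {}
        self.staging = []   # raw writes waiting for the dedupe pass
        self.peak_bytes = 0

    def used_bytes(self) -> int:
        return stored_bytes(self.index) + sum(len(d) for d in self.staging)

    def write(self, data: bytes) -> None:
        self.staging.append(data)        # no hashing on the write path
        self.peak_bytes = max(self.peak_bytes, self.used_bytes())

    def run_dedupe_pass(self) -> None:
        while self.staging:
            dedupe_into(self.index, self.staging.pop(0))
            self.peak_bytes = max(self.peak_bytes, self.used_bytes())


backup = b"A" * CHUNK_SIZE * 3 + b"B" * CHUNK_SIZE  # 4 chunks, 2 unique
inline, post = InlineStore(), PostProcessStore()
for _ in range(2):           # two backup cycles of the same data
    inline.write(backup)
    post.write(backup)
post.run_dedupe_pass()
print(inline.peak_bytes, post.peak_bytes)  # 8192 32768
```

Both stores end up holding the same 8,192 bytes of unique data, but the post-process store needed four times that capacity at its peak because the raw backups sat in the cache until the pass ran. The sketch does not model the ingest latency of the hashing itself, which is the cost the inline approach pays in exchange.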
Better Dedupe = inline dedupe.
In my next post I’ll look at Source vs. Target deduplication….