Permabits and Petabytes: OEM Data Optimization for Next-Generation Storage

Archive for July, 2008

Statistical Demons Lurk Everywhere

In my last post I wrote about why hash collisions are fundamentally not something to be concerned about in a deduplicating storage system that uses SHA-2 hashes. Now I'll use the same logic that other vendors use to attack hash-based systems to demonstrate that their own systems may corrupt data even more frequently than hash collisions would! Hash-based systems are criticized based on the possible occurrence of an incredibly statistically unlikely event...

What do Hash Collisions Really Mean?

When considering deduplication technologies, other vendors and some analysts bring up the bogeyman of hash collisions. Jon Toigo touched upon it again in a recent post to his blog, and Alex McDonald from NetApp brought it up in response to a recent post that Mark Twomey made. So, what is a hash collision, and is it really a concern for data safety in a system like Permabit Enterprise Archive? For the long explanation on cryptographic hashes and...