In my last post I explained why hash collisions are not a practical concern in a deduplicating storage system that uses SHA-2 hashes. Now I'll apply the same logic that other vendors use to attack hash-based systems to show that their own systems may corrupt data even more often than a hash collision would!
Hash-based systems are criticized over the possible occurrence of a statistically very unlikely event...
When discussing deduplication technologies, other vendors and some analysts bring up the bogeyman of hash collisions. Jon Toigo touched on it again in a recent post on his blog, and Alex McDonald from NetApp raised it in response to a recent post by Mark Twomey.
So, what is a hash collision, and is it really a concern for data safety in a system like Permabit Enterprise Archive?
For the long explanation on cryptographic hashes and...