The Albireo SDK is a deduplication development system for hardware, software, and service providers who want to upgrade the data reduction capabilities of their existing products without reducing performance or degrading functionality. It is currently available for 64-bit Windows and Linux operating environments and has also been ported to proprietary operating systems on an as-needed basis. The SDK provides a Deduplication Index that can be deployed in primary, archive, and backup storage within on-premises, cloud, or hybrid environments.
Key Features and Benefits
|Feature|Benefit|
|---|---|
|Deduplication Index|Enables the fastest and most scalable deduplication available. Deduplicates across petabytes of storage while performing 200,000 operations per second per core, with an average latency of less than 5 microseconds per operation.|
|Block and Stream APIs|Leverages existing storage architectures, protecting existing investments in storage technology.|
|Content-aware Segmentation|Further optimizes capacity based on data type.|
|Target and Source APIs|Simplifies bandwidth optimization on premises, in the cloud, and in hybrid environments.|
|Grid Index|Works with existing multi-node storage configurations. Offers linear scale-out of performance and capacity as servers are added.|
How it Works
The SDK’s Deduplication Index is designed to operate outside the storage application software as an independent duplicate advisory service. The API is called at write time and provides recommendations to the vendor’s storage software based on previously seen data chunks.
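To use the API, the storage vendor must first implement an indirection layer (a mapping from logical addresses to physical storage locations) and reference counting (so that shared data is freed only when nothing refers to it any longer). With these mechanisms in place, integration can be completed in a matter of weeks using the six C API calls the SDK provides.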
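The two prerequisites can be illustrated with a minimal sketch. Everything here is hypothetical and belongs to the vendor's side of the integration: the `block_map_t` structure, the function names, and the fixed table sizes are assumptions made for illustration, and none of them are part of the Albireo API.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define MAX_LOGICAL  16          /* logical block addresses seen by clients */
#define MAX_PHYSICAL 16          /* physical blocks actually stored */
#define UNMAPPED     UINT32_MAX  /* marker for a logical block with no data */

/* Hypothetical vendor-side structure, not an Albireo type. */
typedef struct {
    uint32_t map[MAX_LOGICAL];        /* indirection: logical -> physical */
    uint32_t refcount[MAX_PHYSICAL];  /* logical references per physical block */
} block_map_t;

void block_map_init(block_map_t *bm) {
    for (size_t i = 0; i < MAX_LOGICAL; i++) bm->map[i] = UNMAPPED;
    for (size_t i = 0; i < MAX_PHYSICAL; i++) bm->refcount[i] = 0;
}

/* Drop the reference held by a logical block.
 * Returns 1 if the physical block is now unreferenced and can be freed,
 * otherwise 0. Callers must pass an in-range logical address. */
int block_map_unmap(block_map_t *bm, uint32_t logical) {
    uint32_t phys = bm->map[logical];
    if (phys == UNMAPPED) return 0;
    assert(bm->refcount[phys] > 0);
    bm->map[logical] = UNMAPPED;
    return --bm->refcount[phys] == 0;
}

/* Point a logical block at a physical block, for example after the index
 * advises that the newly written data duplicates what is stored there. */
void block_map_map(block_map_t *bm, uint32_t logical, uint32_t physical) {
    if (bm->map[logical] == physical) return;  /* already unified */
    block_map_unmap(bm, logical);              /* release any old mapping */
    bm->map[logical] = physical;
    bm->refcount[physical]++;
}
```

With a structure like this, acting on a duplicate recommendation reduces to one `block_map_map` call, and the physical block is reclaimed only when `block_map_unmap` reports that the last reference is gone.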
The SDK process operates as follows:
- OEM software passes new data and internal placement information (e.g., filename, inode, and offset, or LUN and block) to the Albireo Block or Stream API.
- Content-aware segmentation optionally breaks larger objects into variable-sized chunks.
- Unique content fingerprints are computed.
- Patented indexing technologies determine whether the chunk has been previously seen.
- Previous placement information is pushed asynchronously to the OEM software for file, block, or extent unification.
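The steps above can be sketched as a toy advisory index. This is an illustration under loud assumptions, not the SDK's implementation: `dedup_post`, `dedup_index_t`, `placement_t`, and `record_advice` are hypothetical names, FNV-1a stands in for the real fingerprint function, a small in-memory hash table stands in for the patented index, segmentation is skipped (the caller supplies chunks), and the advice callback runs synchronously rather than asynchronously.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define INDEX_SLOTS 1024  /* toy capacity; must be a power of two */

/* Hypothetical placement record supplied by the storage software. */
typedef struct { uint64_t lun; uint64_t block; } placement_t;

/* Toy index: open addressing with linear probing. */
typedef struct {
    uint64_t    fingerprint[INDEX_SLOTS];
    placement_t placement[INDEX_SLOTS];
    uint8_t     used[INDEX_SLOTS];
} dedup_index_t;

/* Called when a chunk matches data seen earlier. */
typedef void (*advice_fn)(placement_t new_loc, placement_t prior_loc, void *ctx);

/* FNV-1a 64-bit hash: an illustrative stand-in only. It is not
 * collision-resistant and should not be used for real deduplication. */
static uint64_t fnv1a64(const uint8_t *data, size_t len) {
    uint64_t h = 0xcbf29ce484222325ULL;
    for (size_t i = 0; i < len; i++) {
        h ^= data[i];
        h *= 0x100000001b3ULL;
    }
    return h;
}

void dedup_index_init(dedup_index_t *ix) { memset(ix, 0, sizeof *ix); }

/* Post one chunk together with its placement.
 * Returns 1 if the chunk was seen before (advice delivered via `advise`),
 * 0 if it is new and has been recorded, -1 if the toy index is full. */
int dedup_post(dedup_index_t *ix, const uint8_t *chunk, size_t len,
               placement_t loc, advice_fn advise, void *ctx) {
    uint64_t fp = fnv1a64(chunk, len);
    size_t slot = (size_t)(fp & (INDEX_SLOTS - 1));
    for (size_t probes = 0; probes < INDEX_SLOTS; probes++) {
        if (!ix->used[slot]) {                 /* first sighting: remember it */
            ix->used[slot] = 1;
            ix->fingerprint[slot] = fp;
            ix->placement[slot] = loc;
            return 0;
        }
        if (ix->fingerprint[slot] == fp) {     /* seen before: advise caller */
            if (advise) advise(loc, ix->placement[slot], ctx);
            return 1;
        }
        slot = (slot + 1) & (INDEX_SLOTS - 1); /* linear probe */
    }
    return -1;
}

/* Example callback: store the prior placement in a caller-owned record. */
void record_advice(placement_t new_loc, placement_t prior_loc, void *ctx) {
    (void)new_loc;
    *(placement_t *)ctx = prior_loc;
}
```

The index only advises. Deciding whether to act on a match, and verifying the data if the fingerprint is not trusted on its own, remains the responsibility of the storage software, which keeps the sketch consistent with the advisory-service design described earlier.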