Permabit Albireo SANblox™ FAQ


What is Albireo SANblox™?

Albireo SANblox is the storage industry’s first inline data efficiency appliance for Fibre Channel SANs.
A breakthrough in data reduction technologies, SANblox enables any SAN deployment to deliver 6X more capacity at 85% less cost, with up to 400% more performance and latency as low as 300 microseconds, giving traditional arrays the equivalent of all-flash array performance at a fraction of the cost.

Based on Permabit’s Albireo data deduplication and HIOPS Compression™ software, SANblox also incorporates Red Hat® Enterprise Linux, Red Hat Enterprise Linux High Availability and Fibre Channel software from Emulex®. These components are integrated with a set of Permabit management tools that simplify administration of the solution and enhance supportability. The product is delivered as a 1U form factor appliance using hardware from Permabit or one of its OEM partners.

What market(s) does SANblox serve?

SANblox serves the $20B medium and large enterprise markets where Fibre Channel SAN storage is widely adopted and has broad horizontal application. SANblox provides immediate benefits for software products that store large amounts of redundant data such as virtual server farms and VDI solutions.

In addition, SANblox can provide benefits for software that produces compressible data such as traditional databases (OLTP, data warehouses), analytic applications, and “big data” applications including those that process and store system logs and sensor data. These applications are the backbone of the finance, healthcare, education, retail, manufacturing and energy industries.

What makes SANblox unique?

SANblox is the only high performance data efficiency solution on the market to supply both inline deduplication and compression to any Fibre Channel array. It is unique in that deduplication and compression are designed to work in concert so that the most appropriate optimization technology is used on the appropriate data set.

Because SANblox sits as a separate appliance on the SAN, it can be applied to data sets that will benefit from optimization technologies, while remaining out of the data path for those that won’t. The modular approach of SANblox also leverages existing management features of the array (such as thin provisioning, snapshots, and replication) to minimize the number of new administration tasks required.

Unlike gateway products that perform write-back caching to mask poor performance, SANblox always commits data to storage before a write is acknowledged.  Writes are never acknowledged before being persistently stored, and both data and metadata are maintained in the enterprise storage device. This improves data safety because once a write completes, data is immediately under the protection of the array.
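The difference between the two acknowledgement policies can be illustrated with a minimal Python sketch. All class and method names here (BackingStore, WriteThroughGateway, WriteBackGateway) are hypothetical and exist only for illustration; this is not SANblox code and says nothing about how the product is implemented internally.

```python
class BackingStore:
    """Stands in for the enterprise array (hypothetical, in-memory only)."""

    def __init__(self):
        self.blocks = {}

    def persist(self, lba, data):
        self.blocks[lba] = data


class WriteThroughGateway:
    """Acknowledges a write only after the backing store holds the data."""

    def __init__(self, store):
        self.store = store

    def write(self, lba, data):
        self.store.persist(lba, data)  # commit first
        return "ack"                   # then acknowledge


class WriteBackGateway:
    """Acknowledges immediately and commits later (the approach contrasted above)."""

    def __init__(self, store):
        self.store = store
        self.pending = {}

    def write(self, lba, data):
        self.pending[lba] = data       # held only in the gateway's cache
        return "ack"                   # acknowledged before it is persistent

    def flush(self):
        for lba, data in self.pending.items():
            self.store.persist(lba, data)
        self.pending.clear()
```

In this model, a failure of the write-back gateway before flush() would lose acknowledged data, whereas the write-through gateway never holds acknowledged data that the backing store does not already have.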

Other SAN gateway products such as the IBM SVC and NetApp V-Series either do not support deduplication or are not designed for use with the heavy random workloads found in many mission critical application environments.

What advantage does having inline data efficiency give SANblox?

With inline data efficiency, storage savings from deduplication and compression are realized immediately.  Key advantages of an inline approach:

  • Requires less processing load – inline avoids costly double-pass dedupe/compression.
  • Increases performance for hybrid-array and disk-array caches – inline operation expands the effective size of DRAM/flash cache so more workloads hit higher-performing fast media.
  • Decreases costly multiple writes – writes only unique and compressed blocks inline for immediate storage savings.
  • Enables high-change workloads that bring post-process systems to their knees (e.g. what EMC refers to as an unstable condition) – inline is real-time transaction ready.
  • Provides predictable performance and minimal latency – inline enables you to meet SLAs.
  • Protects data – inline enables immediate replication for DR of reduced data.

In contrast, with post-process data efficiency, which is used by companies such as Pure Storage, data is written unmodified to media and data reduction techniques are applied at a later point in time. Post-process techniques write to disk or flash twice: first, when the data is originally written, and again later after data efficiency is applied.  The result is increased storage consumption, wear on flash-based storage media and duplicate operational load on processors.
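The write amplification described above can be put into rough numbers. The sketch below is a simplified model, not a measurement: it assumes post-process systems write every logical block once in raw form and then write the reduced form once more, and it uses the 6:1 reduction ratio quoted elsewhere in this FAQ. The function names are hypothetical.

```python
def media_writes_inline(logical_blocks, reduction_ratio):
    """Blocks written to media when data is reduced before it lands."""
    return logical_blocks / reduction_ratio


def media_writes_post_process(logical_blocks, reduction_ratio):
    """Blocks written when raw data lands first and is rewritten in reduced form later."""
    return logical_blocks + logical_blocks / reduction_ratio


# Illustrative workload: 600 logical blocks at an assumed 6:1 reduction ratio.
inline_writes = media_writes_inline(600, 6)
post_process_writes = media_writes_post_process(600, 6)
print(inline_writes, post_process_writes)
```

Under these assumptions the post-process path writes seven times as many blocks to media as the inline path for the same logical data, which is the source of the extra flash wear and processor load.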

What advantage does SANblox provide by combining deduplication and compression?

[Figure: Reduction Rates]

Deduplication is a technique for reducing the consumption of storage resources by eliminating multiple copies of duplicate data. Deduplication is distinct from compression in that it operates at a much larger granularity and across much larger datasets than compression.

Compression operates at a “micro” level, removing duplicate information within a file. When combined, the two technologies yield greater potential savings than either technology on its own, and the results are additive, as depicted in the graphic above. The graphic also shows examples of potential physical space and data reduction rates for different types of data.
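How the two techniques stack can be shown with a small Python sketch using only the standard library. It is an illustration of the general idea, not Permabit's Albireo or HIOPS Compression algorithms: the 4 KB block size, the SHA-256 fingerprint, the use of zlib, and the function name reduce_blocks are all assumptions made for this example.

```python
import hashlib
import zlib

BLOCK_SIZE = 4096  # assumed block size, for illustration only


def reduce_blocks(data: bytes):
    """Return (logical_bytes, bytes_after_dedup, bytes_after_dedup_and_compression)."""
    store = {}        # fingerprint -> compressed copy of each unique block
    unique_raw = 0    # bytes remaining after deduplication alone
    for offset in range(0, len(data), BLOCK_SIZE):
        block = data[offset:offset + BLOCK_SIZE]
        fingerprint = hashlib.sha256(block).digest()
        if fingerprint not in store:          # deduplication: keep one copy
            store[fingerprint] = zlib.compress(block)  # compression: shrink that copy
            unique_raw += len(block)
    stored = sum(len(c) for c in store.values())
    return len(data), unique_raw, stored


# Two distinct, highly repetitive 4 KB blocks; the first is stored eight times.
block_a = (b"sensor=42 status=ok\n" * 205)[:BLOCK_SIZE]
block_b = (b"sensor=17 status=hot\n" * 200)[:BLOCK_SIZE]
logical, deduped, stored = reduce_blocks(block_a * 8 + block_b * 2)
print(logical, deduped, stored)
```

Deduplication removes the repeated blocks, and compression then shrinks the unique blocks that remain, so the final stored size is smaller than what either step achieves alone on this data.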

Does SANblox create a bottleneck because it sits in front of the SAN?

SANblox delivers up to 230,000 random 4K IOPS. SANblox data efficiency expands the effective size of the cache, typically by 6:1.  This increases cache hit rates and overall IOPS.  In one test, the cache hit rate increased from 80% to 95%, and corresponding performance actually increased by 400%.
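Why a modest-looking change in hit rate can have a large effect is easier to see with a simple average-latency model. The sketch below is illustrative only: the cache and back-end latencies are assumed values chosen for the example, not figures from the test mentioned above, and the function name is hypothetical.

```python
def average_latency_us(hit_rate, cache_latency_us, backend_latency_us):
    """Expected latency per I/O for a given cache hit rate."""
    return hit_rate * cache_latency_us + (1.0 - hit_rate) * backend_latency_us


CACHE_US = 100.0      # assumed latency of a cache hit
BACKEND_US = 10000.0  # assumed latency of a miss served from disk

before = average_latency_us(0.80, CACHE_US, BACKEND_US)
after = average_latency_us(0.95, CACHE_US, BACKEND_US)

# Misses fall from 20% of I/Os to 5% of I/Os.
miss_reduction = (1.0 - 0.80) / (1.0 - 0.95)
latency_improvement = before / after
print(miss_reduction, latency_improvement)
```

Raising the hit rate from 80% to 95% cuts the number of misses by a factor of four. With the assumed latencies, average latency improves by about 3.5 times, and the improvement approaches four times as the cost of a miss grows relative to a hit.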

Does SANblox performance scale?

Multiple units can be combined to increase performance.

If a SANblox appliance fails, will data be lost on the SAN?

No. Unlike gateway products that perform write-back caching to mask poor performance, SANblox always commits data to storage before a write is acknowledged.  Writes are never acknowledged before being persistently stored, and both data and metadata are maintained in the enterprise storage device. This improves data safety because once a write completes, data is immediately under the protection of the array. SANblox is configured in HA pairs only to provide high availability for the application itself.

What hardware does SANblox work with?

SANblox works with any new or existing Fibre Channel array. SANblox has been tested with leading block-based storage arrays including: EMC VMAX, EMC VNX, Dell SC800 series, HDS HUS, HDS VSP, NetApp E-Series, NEC M-Series and Huawei OceanStor T-series among others.  Brocade and Cisco Fibre Channel switches are supported as is connectivity with both Emulex and QLogic HBAs.

What data management software does SANblox work with?

SANblox is designed to be as transparent as possible in the SAN.  Data management software running on the back-end array will continue to function as-is.  This includes features like thin provisioning, snapshots and replication.  Applications see any LUN presented by SANblox as just another block device and will continue to function as normal.  Software running on the application server that is used to initiate snapshot, clone or replication operations on the enterprise array can be directed to operate directly on the aggregate pool of enterprise array LUNs being optimized by SANblox.

What software products are being certified to work with SANblox?

SANblox certifications are in process for the following software applications:

  1. VMware vSphere
  2. VMware View
  3. Oracle RDBMS
  4. Microsoft SQL Server
  5. Microsoft Windows Server 2008 R2 and 2012 (including Hyper-V)
  6. Red Hat Enterprise Linux

Where is SANblox available?

SANblox is available from OEM partners and resellers.

How is SANblox deployed?

SANblox Deployment

SANblox is delivered as a pair of 1U appliances that are connected to a Fibre Channel SAN. SANblox acts as both initiator and target.  During initial deployment, one or more thinly provisioned enterprise storage LUNs are provisioned for use by SANblox.  The LUNs will present a total of up to 256 TB of logical (thin) storage for use by SANblox.  SANblox aggregates these thin LUNs and manages them as a global pool of optimized storage. The storage optimization pool is then carved up and provisioned as SANblox-optimized LUNs for consumption by applications.  Each SANblox system is deployed as an HA pair for maximum data protection and availability.

What kind of performance can I deliver with SANblox?

SANblox excels at delivering high performance for enterprise storage applications that require high IOPS and low latency.  SANblox performance is limited by the capabilities of the underlying storage.  The table below shows performance metrics using SANblox with flash-based storage exported from a generic Intel-based Linux system.

Performance Metric               Result
Sequential read throughput       1045 MB/s
Sequential write throughput      800 MB/s
Random read performance          230,000 4K IOPS
Random write performance         111,000 4K IOPS
Random 70/30 R/W performance     180,000 4K IOPS
Min read latency                 300 usec
Min write latency*               400 usec

*Average overhead for write latency is 1.2X baseline (without SANblox) at a queue depth of 256.

The table below shows performance of 24 disks in a SAN attached storage array from a leading vendor using hard drive shelves and the RAM cache of the array.

Performance Metric               Result
Sequential read throughput       550 MB/s
Sequential write throughput      580 MB/s
Random read performance          9,000 4K IOPS
Random write performance         32,000 4K IOPS
Random 70/30 R/W performance     20,000 4K IOPS