WAN Optimized Replication of Backup Datasets Using Stream-Informed Delta Compression

Venue: FAST'12
Category: Delta compression

1. Summary

Motivation of this paper

Stream-Informed Delta Compression

Implementation and Evaluation

  1. sketch cache size

  2. delta encoding

    1. 3.55x storage savings compared with deduplication alone
  3. multi-level vs 1-level delta

    1. a delta storage system may only support 1- or 2-level delta encoding to bound decode time
    2. 1-level delta is a reasonable approximation to multi-level
  4. sketch index vs Stream-Informed sketch cache

    1. using a cache with two or more super-features achieves greater compression than an index on a single super-feature
  5. interaction of delta and local compression

    1. the authors see significant advantages in using both techniques in combination with deduplication
  6. WAN replication improvement

    1. effective replication throughput is 1-2 orders of magnitude higher than the raw network throughput
  7. performance characteristics

    1. each chunk stored in a container also has a sketch added to the metadata section (less than 20 bytes)

    2. overhead

      1. the extra cost comes from computing sketches on the write path and from reading similar base chunks to perform delta compression

2. Strength (Contributions of the paper)

3. Weakness (Limitations of the paper)

4. Some Insights (Future work)