Privacy-Preserving Data Deduplication on Trusted Processors

Venue: CLOUD'17
Category: Deduplication, SGX


1. Summary

Motivation of this paper

Revealing ownership and equality information of the outsourced data to untrustworthy parties has serious privacy implications.

SGX Deduplication

  1. SGX provisions protected execution environments (a.k.a. enclaves). Each enclave is associated with a region of physical memory.

This region is limited to roughly 90 MB of usable memory. Code and data inside the enclave are protected by the processor; the enclave may access the enclave memory as well as memory outside the enclave region.

  1. Enclaves cannot directly invoke OS-provided services such as I/O.

A communication channel between the enclave code and the untrusted environment (e.g., the OS) is therefore required to service OS-provided functions.

  1. Each SGX processor has a unique secret burned into its fuses.

This secret is referred to as the seal key (a conceptual sketch follows this list).
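The notes summarize this only at a high level. As a rough conceptual sketch (not the actual SGX hardware key-derivation flow, and with all names hypothetical), the seal key can be viewed as binding the fused processor secret to the identity of the requesting enclave:

```python
import hmac, hashlib, os

# Hypothetical stand-in for the secret fused into the processor;
# in real SGX this secret never leaves the hardware.
PROCESSOR_SECRET = os.urandom(32)

def derive_seal_key(enclave_measurement: bytes) -> bytes:
    """Conceptual sketch: bind the processor secret to the
    identity (measurement) of the requesting enclave."""
    return hmac.new(PROCESSOR_SECRET, enclave_measurement, hashlib.sha256).digest()

# Data sealed under this key can only be unsealed by the same enclave
# on the same processor, since both inputs are needed to re-derive it.
seal_key = derive_seal_key(hashlib.sha256(b"enclave code pages").digest())
```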

The proposed design offers bandwidth savings comparable to client-side deduplication solutions, but does not admit the leakage wherein a client can learn whether a file is already stored on the server. The storage server and the enterprise proxies are equipped with SGX-enabled processors.

The attacker may control the untrusted environment (e.g., the OS), but is assumed unable to violate SGX guarantees.

  1. CUpload (client)

The client interacts with PEnclave and uses a blind signature scheme to obtain the file encryption key; files are encrypted with deterministic encryption and deduplicated at the file level. Key requests are rate-limited to mitigate online brute-force attacks (see the blind-signature sketch after this list).

  1. PDedup (proxy)

The proxy performs a privacy-preserving compaction to remove duplicates, then uploads the deduplicated data to the storage server. To prevent traffic analysis, it pads the traffic from PEnclave to SEnclave following differential privacy:

the Laplace noise is translated into a corresponding number of dummy chunks (see the padding sketch after this list).
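The notes do not name the exact blind-signature instantiation; a minimal sketch using RSA blind signatures for server-aided key derivation (in the style of DupLESS) might look as follows. All function and variable names are illustrative:

```python
import hashlib, secrets
from cryptography.hazmat.primitives.asymmetric import rsa

# Hypothetical RSA key pair held by PEnclave; d never leaves the enclave.
sk = rsa.generate_private_key(public_exponent=65537, key_size=2048)
pub = sk.public_key().public_numbers()
n, e, d = pub.n, pub.e, sk.private_numbers().d

def client_blind(file_bytes: bytes):
    """Client: hash the file and blind it so PEnclave learns nothing about it."""
    m = int.from_bytes(hashlib.sha256(file_bytes).digest(), "big") % n
    r = secrets.randbelow(n - 2) + 2            # blinding factor, gcd(r, n) = 1 w.h.p.
    return m, r, (m * pow(r, e, n)) % n

def penclave_sign(blinded: int) -> int:
    """PEnclave: sign the blinded hash; rate-limiting would be enforced here."""
    return pow(blinded, d, n)

def client_unblind(sig_blinded: int, m: int, r: int) -> bytes:
    """Client: strip the blinding factor and derive the file encryption key."""
    sig = (sig_blinded * pow(r, -1, n)) % n     # equals m^d mod n
    return hashlib.sha256(sig.to_bytes((n.bit_length() + 7) // 8, "big")).digest()

m, r, blinded = client_blind(b"file contents")
key = client_unblind(penclave_sign(blinded), m, r)
```

Because the derived key depends only on the file content (and PEnclave's signing key), two clients holding the same file obtain the same key, which is what makes the deterministic encryption deduplicable.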
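For the differentially private padding, here is a minimal sketch of translating Laplace noise into dummy chunks; the parameter names (sensitivity, epsilon, offset) are assumptions, not taken from the paper:

```python
import math, random

def laplace_noise(scale: float) -> float:
    """Sample Laplace(0, scale) via the inverse CDF."""
    u = random.random()
    while u == 0.0:                  # avoid log(0) on the boundary
        u = random.random()
    u -= 0.5
    return -scale * math.copysign(1.0, u) * math.log(1.0 - 2.0 * abs(u))

def padded_chunk_count(real_chunks: int, sensitivity: float,
                       epsilon: float, offset: int) -> int:
    """Translate Laplace noise into a number of dummy chunks.
    A fixed offset keeps the padding one-sided so that no real
    chunk ever has to be dropped to match the padded count."""
    noise = laplace_noise(sensitivity / epsilon)
    dummies = max(0, int(round(noise)) + offset)
    return real_chunks + dummies

# e.g., pad an upload of 100 deduplicated chunks before sending to SEnclave
total = padded_chunk_count(real_chunks=100, sensitivity=1.0, epsilon=0.5, offset=10)
```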

Implementation and Evaluation

  1. Upload latency (varying file size)

Compared against uploading plaintext files of the same size.

  1. Latency breakdown

Encryption time and key-derivation time.

2. Strength (Contributions of the paper)

  1. It proposes a three-tier deduplication architecture.

The design saves bandwidth, yet does not admit the client-side deduplication leakage on file existence.

  1. It leverages SGX in the proxy and the storage server to protect the confidentiality, ownership, and equality information of the outsourced data against various adversaries.
  2. It implements a prototype and conducts experiments showing that the system incurs low overhead

in comparison with conventional deduplication solutions.

3. Weakness (Limitations of the paper)

  1. This method sacrifices storage efficiency, since it considers only two-stage deduplication and file-level deduplication, neither of which is exact deduplication. The paper could provide a deeper analysis quantifying the resulting loss in deduplication ratio.

4. Some Useful Insights

  1. This paper argues that encryption only protects the confidentiality of files at rest; sensitive information can still be inferred from their metadata (e.g., ownership and equality information).

An open question: how to combine SGX with encrypted deduplication?

  1. This paper uses privacy-preserving compaction, rather than index-table lookup, to realize data deduplication. This offers a new way to achieve deduplication in an SGX environment (see the compaction sketch at the end of this section).
  2. This paper mentions the issue that

successful retrieval of an outsourced record reveals its ownership information.

This leakage could be further prevented by using Oblivious RAM (ORAM) or Private Information Retrieval (PIR).

An open question is how to mitigate their non-trivial performance overhead.
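As a closing illustration of the compaction-based insight above, here is a minimal non-oblivious sketch: sort chunk fingerprints so duplicates become adjacent, mark them, and compact. The paper's enclave version would use data-oblivious sorting and compaction so that memory access patterns do not reveal which chunks are duplicates; the code below only conveys the idea:

```python
import hashlib

def dedup_by_compaction(chunks):
    """Deduplicate by sorting fingerprints and dropping adjacent duplicates.
    Inside an enclave, the sort and compaction would be data-oblivious
    (e.g., a bitonic sorting network) to hide which chunks match."""
    tagged = sorted((hashlib.sha256(c).digest(), i) for i, c in enumerate(chunks))
    keep = [True] * len(chunks)
    for k in range(1, len(tagged)):
        if tagged[k][0] == tagged[k - 1][0]:   # same fingerprint as predecessor
            keep[tagged[k][1]] = False         # mark later occurrence as duplicate
    return [c for i, c in enumerate(chunks) if keep[i]]

unique = dedup_by_compaction([b"a", b"b", b"a", b"c", b"b"])  # [b"a", b"b", b"c"]
```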