| Venue | Category |
| --- | --- |
| IEEE CNS'18 | Secure Deduplication |
# Tapping the Potential: Secure Chunk-based Deduplication of Encrypted Data for Cloud Backup

1. Summary
    - Motivation of this paper
    - Tapping the Potential
    - Implementation and Evaluation
2. Strength (Contributions of the paper)
3. Weakness (Limitations of the paper)
4. Future Works
Existing secure deduplication designs are at odds with real-world dedupe requirements in terms of security and performance: they pay little attention to the challenges and practical requirements of chunk-based deduplication (CBD).
The practical requirements:
- How to reduce the risk of information leakage with minimal impact on the underlying deduplication routine?
- Can the key generation be sped up while still ensuring an effective deduplication function?
- Any secure chunk-level dedupe design should provide read performance on par with plaintext CBD practice.
This paper intends to solve those three challenges.
Defeating the security would outright incapacitate the deduplication; this is the desired asymmetry between security and performance, achieved with only minimal dedupe performance loss.
It assumes that semantic security can be achieved for unpredictable data with convergent encryption (CE).
All the communication channels between clients, the key server, and the backup storage system are secure.
An ideal functionality:
- Input: the client provides a chunk; the key server provides a chosen secret
- Output: the client obtains the chunk key; the key server obtains a sign; the storage server obtains the ciphertext of the chunk
Goal: a probabilistic polynomial-time (PPT) adversary cannot distinguish the real-world execution of the proposed scheme from the ideal-world execution of this functionality.
The key server learns nothing about the client's input or the algorithm output; the client cannot infer the key server's secret.
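These two properties are what a DupLESS-style oblivious key generation based on blind RSA signatures provides; the sketch below is an illustration under that assumption (the paper's TCP-based randomized protocol differs in details, and the toy Mersenne-prime RSA parameters here are NOT secure):

```python
# Illustrative blind-RSA oblivious key generation (NOT the paper's exact
# protocol; toy parameters, do not use in practice).
import hashlib
import secrets

# Key server's RSA key pair (Mersenne primes as insecure toy factors).
p = 2**127 - 1
q = 2**89 - 1
n = p * q
e = 65537
d = pow(e, -1, (p - 1) * (q - 1))

def H(data, n):
    # Hash a chunk into Z_n (illustrative full-domain hash).
    return int.from_bytes(hashlib.sha256(data).digest(), "big") % n

def client_blind(chunk, n, e):
    # Client blinds the chunk hash so the server never sees it.
    h = H(chunk, n)
    r = secrets.randbelow(n - 2) + 2
    return h, r, (h * pow(r, e, n)) % n   # blinded value sent to server

def server_eval(x, n, d):
    # Server signs blindly with its secret exponent; x looks random to it.
    return pow(x, d, n)

def client_unblind(y, r, h, n, e):
    # Client removes the blinding and verifies the server's response.
    sig = (y * pow(r, -1, n)) % n         # sig = h^d mod n
    assert pow(sig, e, n) == h
    return hashlib.sha256(sig.to_bytes((n.bit_length() + 7) // 8, "big")).digest()
```

Two clients holding the same chunk derive the same key (so ciphertexts still deduplicate), while the server only ever sees blinded values and never learns the chunk or the resulting key.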
Some points:
The key design is that it uses multiple pairs of RSA keys, so that given any set of compromised clients out of all the clients in the system, the adversary can only infer a bounded fraction of the clients' data on the storage servers.
The parameter can be tweaked to accommodate the real network scale; the number of compromised machines depends on the company's size, among other factors.
- U: the size of data that cannot be deduplicated across all the users
- D: the size of data that has been deduplicated
- degradation ratio: defined in terms of U and D

This shows that when U significantly outsizes D, i.e., when the workload is deduplication-unfriendly, selecting a small or moderate parameter will not introduce an obvious performance penalty.
Optimizations:
- accelerate the key generation
- allow a duplicate chunk copy under one secret to be kept in storage without referring it to an existing copy under another secret in an old container
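The relaxation above amounts to a dedup index keyed per secret; the following is a minimal sketch under assumed data structures (the class and names are hypothetical, not the paper's code):

```python
# Toy per-secret dedup index: duplicates are detected only within the
# same key-server secret, so a chunk arriving under a new secret is
# stored again instead of referencing a copy in an old container.
import hashlib

class PerSecretIndex:
    def __init__(self):
        self.index = {}  # (secret_id, fingerprint) -> container reference

    def ingest(self, secret_id, chunk):
        fp = hashlib.sha256(chunk).digest()
        key = (secret_id, fp)
        if key in self.index:
            return "dedup", self.index[key]   # duplicate under same secret
        ref = len(self.index)                 # stand-in for a container address
        self.index[key] = ref
        return "store", ref                   # stored anew (possibly a cross-secret duplicate)
```

Storing a few cross-secret duplicates trades a little capacity for never having to chase references into old containers written under a different secret.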
To further improve the read performance, a reconstruction-aware chunk placement mechanism enforces spatial locality: chunks under the same key-server secret are stored close to each other. At the cost of a slight loss of deduplication to achieve the desired security objectives, the chunk fragmentation level for user backups is also reduced.
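The placement idea can be sketched as grouping chunks into per-secret containers (a minimal sketch; the container capacity and function names are assumptions for illustration):

```python
# Locality-enforcing chunk placement: chunks encrypted under the same
# key-server secret are appended to the same open container, so a
# restore touches fewer containers.
from collections import defaultdict

CONTAINER_CAP = 4  # chunks per container (illustrative)

def place_chunks(chunks):
    """chunks: iterable of (secret_id, chunk_bytes).
    Returns a list of (secret_id, chunk_list) containers, each holding
    only chunks that share one key-server secret."""
    open_containers = defaultdict(list)
    sealed = []
    for secret_id, chunk in chunks:
        bucket = open_containers[secret_id]
        bucket.append(chunk)
        if len(bucket) == CONTAINER_CAP:
            sealed.append((secret_id, bucket[:]))  # seal a full container
            bucket.clear()
    for secret_id, bucket in open_containers.items():
        if bucket:
            sealed.append((secret_id, bucket))     # flush partial containers
    return sealed
```

Because every container is single-secret, a restore of one user's backup reads sequential runs of chunks instead of hopping across containers interleaved by other secrets.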
Implementation:
- TCP-based randomized oblivious key generation protocol: Python
- Deduplication simulator: C
- Restore simulator: C