@SOSP'13 @Multi-cloud
SPANStore: Cost-Effective Geo-Replicated Storage Spanning Multiple Cloud Services
1. Summary
Motivation of this paper:
SPANStore
Implementation and Evaluation
2. Strength (Contributions of the paper)
3. Weakness (Limitations of the paper)
4. Future Works
Existing problems:
E.g. an application needs to replicate data in each data center to ensure low-latency access for users in different locations.
The challenge is to satisfy the application's latency goals and consistency requirements at low cost.
This paper designs and implements a key-value store that presents a unified view of storage services in several geographically distributed data centers. Its goal is to minimize the cost incurred by latency-sensitive application providers.
SPANStore: spans the data centers of multiple cloud service providers.
Key question: how to trade off latency against storage costs and expenses.
SPANStore Architecture:
The application issues PUT and GET requests for objects to a SPANStore library that the application links to.
The Placement Manager collects a summary of the application's workload and latencies from remote data centers in each epoch, and then computes the replication policies.
Cost model: storage service cost = storage cost + request cost + data transfer cost
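The cost model above can be written out as a small function. This is only a sketch of the arithmetic: the `Pricing` class, its field names, and the per-epoch units are assumptions made for illustration, not taken from the paper or from any provider's price sheet.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Pricing:
    """Hypothetical price sheet for one storage service (values are made up)."""
    storage_per_gb: float   # $ per GB stored per epoch
    put_per_request: float  # $ per PUT request
    get_per_request: float  # $ per GET request
    egress_per_gb: float    # $ per GB transferred out of the data center


def storage_service_cost(p, stored_gb, puts, gets, egress_gb):
    """Storage service cost = storage cost + request cost + data transfer cost."""
    storage_cost = p.storage_per_gb * stored_gb
    request_cost = p.put_per_request * puts + p.get_per_request * gets
    transfer_cost = p.egress_per_gb * egress_gb
    return storage_cost + request_cost + transfer_cost
```

Comparing this value across candidate replica placements is what lets a placement decision favor cheaper services, as long as the latency goals are still met.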
To determine the replication policy for all objects with the same access set, consider two factors:
application requirements
workload properties
Leverage application knowledge of sharing patterns (e.g. Dropbox and Google Docs know which users share a file).
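Grouping objects by access set can be sketched as follows, so that one replication policy is computed per group rather than per object. The input format (a list of `(object_key, data_center)` pairs) and the function name are assumptions for illustration only.

```python
from collections import defaultdict


def group_by_access_set(access_log):
    """Group object keys by the set of data centers that access them.

    access_log: iterable of (object_key, data_center) pairs (hypothetical format).
    Returns a dict mapping frozenset(data centers) -> sorted list of object keys.
    """
    # First pass: collect the access set of every object.
    access_sets = defaultdict(set)
    for key, data_center in access_log:
        access_sets[key].add(data_center)

    # Second pass: bucket objects that share the same access set.
    groups = defaultdict(list)
    for key, data_centers in access_sets.items():
        groups[frozenset(data_centers)].append(key)
    return {s: sorted(keys) for s, keys in groups.items()}
```

For example, two files shared by the same users in the US and EU end up in one group and receive the same policy.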
Capitalize on the discrepancies in pricing across different cloud services: relay updates to the replicas via another data center that has cheaper pricing for upload bandwidth.
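The relaying idea can be illustrated with a simplified cost comparison. This sketch assumes the relay is one of the replicas, that only upload bandwidth is charged, and that latency bounds are ignored; the function name and the price table are hypothetical.

```python
def cheapest_propagation(src, replicas, upload_price, size_gb):
    """Pick the cheaper of direct fan-out and relaying via one replica.

    upload_price: dict mapping data center -> $ per GB uploaded (made-up values).
    Returns (plan, cost), where plan is ("direct", None) or ("relay", data_center).
    """
    targets = [r for r in replicas if r != src]

    # Direct: the source uploads one copy to every other replica.
    best_plan = ("direct", None)
    best_cost = upload_price[src] * size_gb * len(targets)

    # Relay: the source uploads one copy to the relay, which uploads the rest.
    for relay in targets:
        cost = (upload_price[src] * size_gb
                + upload_price[relay] * size_gb * (len(targets) - 1))
        if cost < best_cost:
            best_plan, best_cost = ("relay", relay), cost
    return best_plan, best_cost
```

With a source that charges 4.0 per GB and a replica that charges 1.0 per GB, sending one copy to the cheap replica and letting it fan out costs less than uploading every copy from the source. A real placement decision would also have to check that the extra hop stays within the application's latency goals.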
1) PMan (the Placement Manager)
2) a client library that applications can link to
3) an XMLRPC server that is run in every VM run by SPANStore
4) a memcached cluster to store in-memory metadata
Evaluate the cost savings
Verify application requirements