A Peer-to-Peer File System
OSCAR LAB

Overview
– A short introduction to peer-to-peer (P2P) systems
– Ivy: a read/write P2P file system (OSDI'02)

What is P2P?
– An architecture of equals (as opposed to client/server): each peer/node acts as a client, a server, and a router
– Harnesses the aggregate resources (e.g., CPUs, memory, disk capacity) of the peers/nodes

What is P2P? Technical trends
– Increasing processing power of PCs
– Decreasing cost and increasing capacity of disk space
– Widespread penetration of broadband
– Together, these create a huge pool of available latent resources

P2P Systems
– Centralized: a centralized directory service
  – E.g., Napster
  – Limits scalability and poses a single point of failure
– Decentralized and unstructured
  – No precise control over the network topology or data placement
  – E.g., Gnutella
  – Controlled message flooding, which limits scalability

P2P Systems
– Decentralized and structured: tightly control the network topology and data placement
  – Loosely structured: Freenet (file placement is based on hints)
  – Highly structured: Pastry, Chord, Tapestry, and CAN

Decentralized and Highly Structured P2P Systems
– Precise control of the network topology and data placement
– A distributed hash table (DHT)
  – Each node has a host-ID (hash of its public key or IP address)
  – Each file/object has a file-ID (hash of the file pathname)
  – Both files and nodes are mapped into the same hashed ID space
  – Basic interface: put(key, value) and get(key) (see the sketch below)

Decentralized and Highly Structured P2P Systems
– A location and routing infrastructure
  – Application-level: routed by an ID, not an IP address
  – Routing efficiency: O(log N)
– Advantages
  – Good scalability (O(log N) routing hops and routing-table size)
  – Reliability
  – Self-maintenance (node addition/removal)
  – Good performance (compared to other P2P systems)
– Issues
  – Routing performance (compared to IP routing)
  – Security
  – Other issues ...
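The put/get pair above is the whole contract a DHT exposes to applications. Below is a minimal sketch of the idea, assuming a Chord-style ring; the ChordRing class, node addresses, and the choice of SHA-1 are illustrative, not the API of any particular system. A real DHT would resolve the successor by routing through O(log N) peers rather than consulting a local table.

```python
import hashlib
from bisect import bisect_left

def sha1_id(data: bytes) -> int:
    """Map a node address or a file key into the circular ID space."""
    return int.from_bytes(hashlib.sha1(data).digest(), "big")

class ChordRing:
    """Toy DHT: nodes and keys share one hashed ID space; each key is
    stored on its successor, the first node clockwise from the key's ID."""

    def __init__(self, node_addrs):
        # host-ID = hash of the node's address (slides: public key or IP)
        self.nodes = sorted((sha1_id(a.encode()), a) for a in node_addrs)
        self.store = {addr: {} for addr in node_addrs}

    def _successor(self, key_id: int) -> str:
        ids = [node_id for node_id, _ in self.nodes]
        i = bisect_left(ids, key_id) % len(self.nodes)  # wrap around the ring
        return self.nodes[i][1]

    def put(self, key: bytes, value: bytes) -> None:
        self.store[self._successor(sha1_id(key))][key] = value

    def get(self, key: bytes):
        return self.store[self._successor(sha1_id(key))].get(key)

ring = ChordRing(["10.0.0.1", "10.0.0.2", "10.0.0.3"])
ring.put(b"/home/alice/notes.txt", b"hello")       # file-ID = hash of pathname
assert ring.get(b"/home/alice/notes.txt") == b"hello"
```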
P2P Applications
– Content delivery systems
– Application-level multicast
– Publishing/file-sharing systems
– P2P storage systems (e.g., PAST, CFS, OceanStore)
– P2P file systems

Ivy: A Read/Write P2P File System
– Introduction
– Design Issues
– Performance Evaluation
– Summary

Introduction
Challenges:
– Previous P2P systems are read-only or allow only a single writer, so multiple writers raise file system consistency issues
– Unreliable participants make locking unattractive (for consistency)
– Undoing/ignoring untrusted participants' modifications
– Security over the untrusted storage of other nodes
– Resolving update conflicts caused by network partitions
– High availability vs. strong consistency

Design Issues
– DHash infrastructure
– Log-based metadata and data
– NFS-like file system

DHash
– A distributed P2P hash table
– Stores the participants' logs
– Basic operations: put(key, value) and get(key)
  – E.g., key = content-hash of a log record, value = the log record

Log Data Structure (figures)
– One log per participant
– A log contains all of one participant's modifications (log records) to file system data and metadata
  – Each log record is a content-hash block
  – Each participant appends log records only to its own log, but reads from all participants' logs
  – An untrusted participant's modifications can be ignored simply by not reading its log

Using the Log: appending a log record
– Derive a log record from an NFS request
– Point its prev field at the last record
– Insert the new log record into DHash
– Sign a new log-head pointing to the new log record
– Insert the new log-head into DHash

Using the Log: file system creation
– Create a new log with an End record
– An Inode record with a random i-number for the root directory
– A log-head
– Use the root i-number as the NFS root file handle

Using the Log: file creation and file read
– File creation
  – Request: create(directory i-number, file name)
  – Append an Inode record with a new random i-number
  – Append a Link record
  – Return the i-number to the NFS client as the file handle
  – A subsequent write to the file creates a Write record
– File read
  – Request: read(i-number, offset, length)
  – Scan the logs, accumulating data from Write records that overlap the requested range, while ignoring data hidden by SetAttr records that indicate file truncation

Using the Log: lookup and attributes
– File name lookup
  – Request: open(directory i-number, file name)
  – Scan the logs for a matching Link record
  – Encountering a matching Unlink record first indicates that the file does not exist
– File attributes
  – File length, mtime, ctime, etc.
  – Scan the logs to incrementally compute the attributes

User Cooperation: Views
– View: the set of logs comprising a file system
– View block
  – A DHash content-hash block containing pointers to all log-heads in the view
  – Contains the root directory i-number
  – Property: immutable (different file systems have different view blocks)
– A file system is named by the content-hash key of its view block, as in the self-certifying file system (SFS)

Combining Logs
– Problem: concurrent updates result in conflicts; how should log records be ordered?
– Solution: a version vector in each log record (see the sketch below)
  – Detects update conflicts
  – E.g., (A:5, B:7) < (A:6, B:7): compatible
  – (A:5, B:7) vs. (A:6, B:6): concurrent version vectors; order them by comparing the public keys of the two logs
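A minimal sketch of the version-vector comparison just described; the dict representation and the record field names (vv, pubkey) are illustrative.

```python
def compare(u: dict, v: dict) -> str:
    """Compare two version vectors mapping participant -> sequence number.
    Returns '<', '>', or '=' when one dominates, 'concurrent' on a conflict."""
    keys = set(u) | set(v)
    u_le_v = all(u.get(k, 0) <= v.get(k, 0) for k in keys)
    v_le_u = all(v.get(k, 0) <= u.get(k, 0) for k in keys)
    if u_le_v and v_le_u:
        return "="
    if u_le_v:
        return "<"
    if v_le_u:
        return ">"
    return "concurrent"

def order(rec_u: dict, rec_v: dict) -> int:
    """Total order over log records: use the version vectors when comparable;
    break ties between concurrent records deterministically by comparing the
    public keys of the two logs, as the slide above describes."""
    c = compare(rec_u["vv"], rec_v["vv"])
    if c == "concurrent":
        return -1 if rec_u["pubkey"] < rec_v["pubkey"] else 1
    return {"<": -1, "=": 0, ">": 1}[c]

# The examples from the slide:
assert compare({"A": 5, "B": 7}, {"A": 6, "B": 7}) == "<"           # compatible
assert compare({"A": 5, "B": 7}, {"A": 6, "B": 6}) == "concurrent"  # conflict
```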
Snapshots
– Problem: answering a request may require traversing the entire log (high overhead and inefficiency)
– Solution: snapshots
  – Avoid traversing the entire log
  – A consistent state of the file system
  – Private per participant; constructed periodically
  – Stored in DHash, sharing contents among snapshots
  – Contains a file map, a set of i-nodes, and some data blocks (see Figure 2, the snapshot data structure)

Snapshots
– Building snapshots: apply all log records newer than the previous snapshot
– Using snapshots
  – First traverse the log records newer than the current snapshot
  – If this cannot satisfy the NFS request, search further in the current snapshot
  – Mutually-trusted participants can share snapshots

Cache Consistency
– Most updates are immediately visible
  – Store the new log record and update the new log-head before replying to an NFS request
  – Query the latest log-heads for the latest updates upon each NFS operation
– Modified close-to-open consistency for file reads/writes (see the backup sketch after the last slide)
  – open(): fetch all log-heads for subsequent reads/writes
  – write(): write data to the local cache, deferring the write to DHash
  – close(): push the log records (if any writes occurred), then update the log-head

Exclusive Create
– Requirement: creation of directory entries must be exclusive
  – Some applications use this semantics to implement locks
– Solution: ...

Partitioned Updates
– Close-to-open consistency is guaranteed only if the network is fully connected
– What if the network partitions?
  – Maximize availability (by allowing concurrent updates)
  – Compromise consistency
  – After the partition heals, detect conflicts using version vectors
  – Resolve conflicts with an application-level resolver (as in Harp)

Security and Integrity
– Form another view to exclude bad/misbehaving/malicious participants
– Use content-hash keys and public keys to protect data integrity

Evaluation
– Goal: understand the cost of Ivy's design in terms of network latency and cryptographic operations
– Workload: Modified Andrew Benchmark (MAB), run over a WAN
– Many logs, one writer: the number of logs has relatively little impact, because Ivy fetches log-heads/log records in parallel
– Many DHash servers: more impact, since more messages are required to fetch log records
– Many writers: more impact, since each participant has to fetch the other participants' newly logged updates

Summary
– Log-based data/metadata avoids locking
– Close-to-open consistency
– Trades high availability against strong consistency: allows concurrent updates, then detects and resolves update conflicts
– Performance: 2-3 times slower than NFS
– Limitations?
  – Small scale: limited by the number of logs
  – Hard to hide wide-area network latency

Thanks
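Backup: a sketch of the modified close-to-open consistency flow from the Cache Consistency slide. The Log and IvyFile classes and all method names are illustrative stand-ins (the real Ivy is an NFS loopback server backed by DHash), and "publishing" a log-head here just mutates local state where Ivy would sign the head and put() it into DHash.

```python
class Log:
    """One participant's log: an append-only chain of records plus a mutable
    log-head naming the newest published record (stand-in for DHash blocks)."""
    def __init__(self, name):
        self.name, self.records, self.head = name, [], None

    def append(self, record):
        # prev points at the previous record in this log (None if first)
        record["prev"] = len(self.records) - 1 if self.records else None
        self.records.append(record)

    def publish_head(self):
        self.head = len(self.records) - 1   # stands in for sign + DHash put

class IvyFile:
    """Close-to-open flow: open() fetches all log-heads once, write() only
    buffers in the local cache, close() pushes records and the log-head."""
    def __init__(self, view_logs, my_log, inumber):
        self.view_logs, self.my_log, self.inumber = view_logs, my_log, inumber
        self.heads, self.buffer = {}, []

    def open(self):
        # All reads/writes in this open-close session see these heads only.
        self.heads = {log.name: log.head for log in self.view_logs}

    def write(self, offset, data):
        self.buffer.append((offset, data))  # deferred: cache-only for now

    def close(self):
        for offset, data in self.buffer:    # one Write record per buffered write
            self.my_log.append({"type": "Write", "inum": self.inumber,
                                "offset": offset, "data": data})
        self.buffer.clear()
        self.my_log.publish_head()          # now visible to other participants

alice = Log("alice")
f = IvyFile(view_logs=[alice], my_log=alice, inumber=42)
f.open(); f.write(0, b"hello"); f.close()
assert alice.head == 0 and alice.records[0]["type"] == "Write"
```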