A Peer-to-Peer File System
OSCAR LAB
Overview
• A short introduction to peer-to-peer (P2P) systems
• Ivy: a read/write P2P file system (OSDI ’02)
What is P2P?
• An architecture of equals (as opposed to client/server); each peer/node acts as
  – Client
  – Server
  – Router
• Harnesses aggregate resources (e.g., CPUs, memory, disk capacity) among peers/nodes
What is P2P?
• Technical trends
  – Increasing processing power of PCs
  – Decreasing cost and increasing capacity of disk space
  – Widespread penetration of broadband
• Creation of a huge pool of available latent resources
P2P Systems
• Centralized: have a centralized directory service
  – E.g., Napster
  – Limits scalability and poses a single point of failure
• Decentralized and unstructured
  – No precise control over the network topology or data placement
  – E.g., Gnutella
  – Controlled message flooding, limiting scalability
P2P Systems
• Decentralized and structured
  – Tightly control the network topology and data placement
  – Loosely structured: Freenet (file placement is based on hints)
  – Highly structured: Pastry, Chord, Tapestry, and CAN
Decentralized and Highly Structured P2P Systems
• Precise control of the network topology and data placement
• A distributed hash table (DHash)
  – Each node has a host ID (hash of its public key or IP address)
  – Each file/object has a file ID (hash of the file pathname)
  – Both files and nodes are mapped into the same DHash ID space
  – Basic interface (see the sketch below):
      put(key, value)
      get(key)
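A minimal sketch of this put/get interface, assuming SHA-1 content-hash keys as DHash uses; the class and its single in-memory dict are hypothetical stand-ins for a real table spread across Chord nodes:

```python
import hashlib

# Hypothetical in-memory stand-in for a DHash-style table; a real DHash
# spreads blocks across the nodes of a Chord ring rather than one dict.
class DHTSketch:
    def __init__(self):
        self._store = {}                      # key (hex digest) -> value (bytes)

    def put(self, key: str, value: bytes) -> None:
        self._store[key] = value

    def get(self, key: str) -> bytes:
        return self._store[key]

def content_hash(value: bytes) -> str:
    # DHash names immutable blocks by the SHA-1 hash of their content.
    return hashlib.sha1(value).hexdigest()

dht = DHTSketch()
block = b"an immutable file block"
dht.put(content_hash(block), block)
assert dht.get(content_hash(block)) == block
```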
Decentralized and Highly Structured P2P Systems
• A location and routing infrastructure
  – Application-level, routed by an ID, not an IP address
  – Routing efficiency: O(log N)
• Advantages
  – Good scalability (O(log N) routing efficiency and routing-table size)
  – Reliability
  – Self-maintenance (node addition/removal)
  – Good performance (compared to other P2P systems)
• Issues
  – Routing performance (compared to IP routing)
  – Security
  – Other issues …
P2P Applications
• Content delivery systems
• Application-level multicast
• Publishing/file-sharing systems
• P2P storage systems (e.g., PAST, CFS, OceanStore)
• P2P file systems
Ivy: A Read/Write P2P File System
• Introduction
• Design Issues
• Performance Evaluation
• Summary
Introduction
• Challenges:
  – Previous P2P systems are either read-only or single-writer, so multiple writers pose a file-system consistency issue
  – Unreliable participants make locking unattractive (for consistency)
  – Undoing/ignoring untrusted participants’ modifications
  – Security over untrusted storage on nodes
  – Resolving update conflicts due to network partitions
  – High availability vs. strong consistency
Design Issues
• DHash infrastructure
• Log-based metadata and data
• NFS-like file system
DHash
• A distributed P2P hash table
• Stores participants’ logs
• Basic operations
  – put(key, value)
  – get(key)
  – E.g., key = content-hash of a log record, value = the log record
Log Data Structure
• One log per participant
• A log contains all of one participant’s modifications (log records) to file system data and metadata (see the sketch below)
  – Each log record is a content-hash block
  – Each participant appends log records only to its own log, but reads from all participants’ logs
  – Ignore an untrusted participant’s modifications by not reading its log
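A sketch of these two structures, assuming Python dataclasses; the field names are illustrative, but the record types and the prev content-hash pointer follow the design described above:

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class LogRecord:
    # Each record is stored in DHash under the content-hash of its bytes.
    rec_type: str        # "Inode", "Write", "Link", "Unlink", "SetAttr", or "End"
    prev: Optional[str]  # content-hash key of the previous record; None for End
    payload: dict        # type-specific fields, e.g. i-number, offset, data

@dataclass
class LogHead:
    # The one mutable block per participant: stored under the participant's
    # public key and signed with the matching private key.
    latest: str          # content-hash key of the newest log record
    signature: bytes     # signature over the head's contents (sketch only)
```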
Log Data Structure
[figure slides: log-head pointing to a chain of content-hash log records]
Using the Log
• Appending a log record (see the sketch below)
  – Derive a log record from an NFS request
  – Its prev field points to the last record
  – Insert the new log record into DHash
  – Sign a new log-head pointing to the new log record
  – Insert the new log-head into DHash
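Put together, appending might look like the following sketch; the helper names are hypothetical, sign() is a placeholder for a real public-key signature, and dht is the put/get stand-in from the earlier sketch. The five steps mirror the list above:

```python
import hashlib, pickle

def sign(private_key: bytes, message: str) -> bytes:
    # Placeholder for a real public-key signature over the log-head.
    return hashlib.sha1(private_key + message.encode()).digest()

def append_record(dht, head, private_key, rec_type, payload):
    # 1. Derive a log record from the NFS request.
    record = {"type": rec_type, "payload": payload,
              "prev": head["latest"]}                 # 2. prev -> last record
    blob = pickle.dumps(record)
    key = hashlib.sha1(blob).hexdigest()
    dht.put(key, blob)                                # 3. record into DHash
    new_head = {"pubkey": head["pubkey"],             # 4. sign a new log-head
                "latest": key,
                "sig": sign(private_key, key)}
    dht.put(head["pubkey"], pickle.dumps(new_head))   # 5. head into DHash
    return new_head
```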
Using the Log
• File system creation
  – Create a new log with an End record
  – An Inode record with a random i-number for the root directory
  – A log-head
  – Use the root i-number as the NFS root file handle
Using the Log
• File creation
  – Request: create(directory i-number, file name)
  – An Inode record with a new random i-number
  – A Link record
  – Return the i-number to the NFS client as a file handle
  – On writes to the file, create Write records
• File read (see the sketch below)
  – Request: read(i-number, offset, length)
  – Scan the logs, accumulating data from Write records that overlap the requested range, while ignoring data hidden by SetAttr records that indicate file truncation
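A sketch of this read path, with records represented as dicts as in the earlier sketches and the combined logs already ordered newest-first; the truncation handling is my rendering of the "ignore hidden data" rule:

```python
def read_range(records, inum, offset, length):
    """Newest-first scan of the combined logs; unwritten bytes read as zero."""
    buf = bytearray(length)
    filled = [False] * length
    for rec in records:                      # newest record first
        if rec.get("inum") != inum:
            continue
        if rec["type"] == "SetAttr" and "size" in rec:
            # Truncation hides older writes at or past the new size; mark
            # those bytes resolved so older Write records can't fill them.
            for i in range(length):
                if offset + i >= rec["size"]:
                    filled[i] = True
        elif rec["type"] == "Write":
            w_off, data = rec["offset"], rec["data"]
            for i in range(length):
                pos = offset + i
                if not filled[i] and w_off <= pos < w_off + len(data):
                    buf[i] = data[pos - w_off]
                    filled[i] = True
        if all(filled):
            break                            # every requested byte resolved
    return bytes(buf)
```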
Using the Log
• File name lookup (see the sketch below)
  – Request: open(directory i-number, file name)
  – Scan the logs for a corresponding Link record
  – Encountering a corresponding Unlink record first indicates that the file doesn’t exist
• File attributes
  – File length, mtime, ctime, etc.
  – Scan the logs to incrementally compute attributes
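Lookup follows the same newest-first scan; a minimal sketch with record shapes as before:

```python
def lookup(records, dir_inum, name):
    # Newest-first scan: the most recent Link/Unlink record for this
    # (directory, name) pair decides whether the file exists.
    for rec in records:
        if rec.get("dir_inum") == dir_inum and rec.get("name") == name:
            if rec["type"] == "Link":
                return rec["inum"]       # found: the file's i-number
            if rec["type"] == "Unlink":
                return None              # unlinked: the file doesn't exist
    return None                          # no record: the file never existed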
User Cooperation: Views
• View: the set of logs comprising a file system
• View block (see the sketch below)
  – A DHash content-hash block containing pointers to all log-heads in the view
  – Contains the root directory i-number
  – Key property: immutable (different file systems have different view blocks)
• Name a file system with the content-hash key of its view block, as in the self-certifying file system (SFS)
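A sketch of building and naming a view block; the field names and serialization are assumptions, but the key point holds: the block is immutable, so its content-hash both stores it and names the file system:

```python
import hashlib, pickle

def make_view(dht, log_head_keys, root_inum):
    # Pointers to all log-heads in the view, plus the root i-number.
    view = {"log_heads": sorted(log_head_keys), "root": root_inum}
    blob = pickle.dumps(view)
    fs_name = hashlib.sha1(blob).hexdigest()   # self-certifying name
    dht.put(fs_name, blob)
    return fs_name
```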
Combining Logs
• Problem:
  – Concurrent updates result in conflicts; how to order log records?
• Solution: a version vector in each log record (see the sketch below)
  – Detects update conflicts
  – E.g., (A:5, B:7) < (A:6, B:7): compatible
  – (A:5, B:7) vs. (A:6, B:6): concurrent version vectors; order them by comparing the public keys of the two logs
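A sketch of version-vector comparison matching the examples above; vectors are dicts mapping a log's ID to its counter:

```python
def compare(u, v):
    """Return -1 or 1 for ordered version vectors, 0 for concurrent/equal."""
    keys = set(u) | set(v)
    u_le_v = all(u.get(k, 0) <= v.get(k, 0) for k in keys)
    v_le_u = all(v.get(k, 0) <= u.get(k, 0) for k in keys)
    if u_le_v and not v_le_u:
        return -1                      # u happened before v: compatible
    if v_le_u and not u_le_v:
        return 1                       # v happened before u: compatible
    return 0                           # equal or concurrent

assert compare({"A": 5, "B": 7}, {"A": 6, "B": 7}) == -1   # compatible
assert compare({"A": 5, "B": 7}, {"A": 6, "B": 6}) == 0    # concurrent:
# order such records deterministically, e.g. by the logs' public keys
```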
Snapshots
• Problem?
  – Have to traverse the entire log to answer requests (high overhead and inefficiency)
• Solution: snapshots
  – Avoid traversing the entire log
  – A consistent state of the file system
  – Private per participant; periodically constructed
  – Stored in DHash, sharing contents among snapshots
  – Contains a file map, a set of i-nodes, and some data blocks (see Figure 2)
Snapshot Data Structure
[figure slide: file map, i-nodes, and data blocks (Figure 2 of the paper)]
Snapshots
• Building snapshots
  – Apply all log records newer than the previous snapshot
• Using snapshots (see the sketch below)
  – First traverse the log records newer than the current snapshot
  – If these can’t fulfill an NFS request, search further in the current snapshot
  – Mutually trusting participants can share snapshots
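A thin sketch of this lookup order; the snapshot is stood in for by a plain dict, and scan_new_records() is a stand-in for the newest-first log scans sketched earlier:

```python
def scan_new_records(records, inum):
    # Stand-in for the newest-first log scan, restricted to records
    # appended since the current snapshot was built.
    for rec in records:
        if rec.get("inum") == inum:
            return rec
    return None

def answer_request(new_records, snapshot, inum):
    # First consult log records newer than the snapshot; fall back to
    # the snapshot only when they can't answer the request.
    rec = scan_new_records(new_records, inum)
    if rec is not None:
        return rec
    return snapshot.get(inum)    # snapshot sketched as a dict: inum -> state
```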
Cache Consistency
• Most updates are immediately visible
  – Store the new log record and update the log-head before replying to an NFS request
  – Query the latest log-heads for the latest updates upon each NFS operation
• Modified close-to-open consistency for file reads/writes (see the sketch below)
  – open() → fetch all log-heads for subsequent reads/writes
  – write() → write data to the local cache; defer writing the data to DHash
  – close() → push log records (if any writes occurred), update the log-head
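A sketch of this discipline as a hypothetical client class; dht is the put/get stand-in from earlier, the participant's log-head is assumed to be stored under its public key, and signing is omitted:

```python
import hashlib, pickle

class IvyClientSketch:
    """Hypothetical client illustrating modified close-to-open consistency."""
    def __init__(self, dht, view, my_pubkey):
        self.dht, self.view, self.me = dht, view, my_pubkey
        self.pending = []                    # log records deferred by write()

    def open(self):
        # open(): fetch all log-heads so subsequent reads/writes see
        # every update made before this open.
        self.heads = {k: self.dht.get(k) for k in self.view["log_heads"]}

    def write(self, inum, offset, data):
        # write(): buffer the update locally; don't contact DHash yet.
        self.pending.append({"type": "Write", "inum": inum,
                             "offset": offset, "data": data})

    def close(self):
        # close(): push the buffered records into DHash, then publish a
        # new log-head (unsigned here; a real head is signed).
        latest = pickle.loads(self.dht.get(self.me))["latest"]
        for rec in self.pending:
            rec["prev"] = latest
            blob = pickle.dumps(rec)
            latest = hashlib.sha1(blob).hexdigest()
            self.dht.put(latest, blob)
        self.dht.put(self.me, pickle.dumps({"latest": latest}))
        self.pending.clear()
```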
Exclusive Create
• Requirement: creation of directory entries must be exclusive
  – Some applications use this semantics to implement locks
• Solution:
Partitioned Updates
• Close-to-open consistency is guaranteed only if the network is fully connected
• What if the network is partitioned?
  – Maximize availability (by allowing concurrent updates)
  – Compromise consistency
  – After the partition heals, use version vectors to detect conflicting updates
  – An application-level resolver to fix conflicts (Harp)
Security and Integrity
• Form another view to exclude bad/misbehaving/malicious participants
• Use content-hash keys and signed public-key blocks to protect data integrity (see the sketch below)
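A sketch of the two integrity checks, assuming SHA-1 content hashing as in DHash; check_signature() is a placeholder, not a real verification routine:

```python
import hashlib

def check_signature(pubkey: bytes, message: bytes, signature: bytes) -> bool:
    # Placeholder only; a real system verifies a public-key signature.
    return signature == hashlib.sha1(pubkey + message).digest()

def verify_content_block(key: str, value: bytes) -> bool:
    # A content-hash block is valid iff its bytes hash to its key, so a
    # malicious DHash server cannot substitute different data.
    return hashlib.sha1(value).hexdigest() == key

def verify_log_head(pubkey: bytes, head_bytes: bytes, signature: bytes) -> bool:
    # A log-head is stored under its owner's public key and must carry a
    # signature that verifies against that key.
    return check_signature(pubkey, head_bytes, signature)
```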
Evaluation
• Goal: understand the cost of Ivy’s design in terms of network latency and cryptographic operations
• Workload: Modified Andrew Benchmark (MAB)
• Performance in a WAN
Many Logs, One Writer
• The number of logs has relatively little impact
  – Because Ivy fetches the log-heads/log records in parallel
Many DHash Servers
• More impact, since more messages are required to fetch log records
Many Writers
• More impact: Ivy has to fetch other participants’ newly logged updates
Summary
• Log-based data/metadata, avoiding the use of locking
• Close-to-open consistency
• Tradeoff between high availability and strong consistency
• Allows concurrent updates; detects and resolves update conflicts
• Performance: 2–3 times slower than NFS
• Limitations?
  – Small scale: limited by the number of logs
  – Hard to hide wide-area network latency
Thanks