Peer-to-Peer Approaches to Grid Resource Discovery
Ann Chervenak
University of Southern California Information Sciences Institute
Joint work with Shishir Bharathi, Min Cai

Resource Discovery in Grids
• Applications running in wide area distributed environments or Grids need information about resources and services
• Resource information has an impact on
– Scheduling decisions
– Replica selection
– Planning for workflow execution (e.g., Pegasus)
• Resource discovery services need to:
– Be highly available, fault tolerant, reliable
– Be highly scalable
– Provide flexible query interfaces
• Typical Grid resource discovery services are organized as hierarchies of index services
• Applying P2P techniques offers the promise of self-organization, self-healing, and improved scalability

Outline
• Resource Discovery in Grids: current approaches
• Peer-to-Peer Resource Discovery
• Applying P2P techniques to Grid Resource Discovery
– Unstructured P2P Information Service (joint work with Shishir Bharathi)
– Structured P2P Replica Location Service (joint work with Min Cai)
• Summary and Future Work

Typical Grid Resource Discovery
Currently, most services for resource discovery in Grids are query-based index services
• One or more indexes aggregate information about resources
• Often distributed using a hierarchical structure
• Service provides a front-end query interface
• Database back end stores resource information
• Service responds to queries by identifying resources that match desired properties
• Scalable
– Hold large amounts of resource information
– Support high query rates
• Often specialized for particular types of resource information and/or queries
– Specialized query APIs, resource schemas, etc.

Index-based Discovery Services
• Globus Monitoring and Discovery System (MDS)
– Hierarchical, distributed Index Service
– Aggregates information about resources (CPUs, storage systems, etc.)
– Answers queries for resources with specified properties
– Typically close to resources, whose information may change frequently (e.g., index co-located with a cluster)

Index-based Discovery Services (cont.)
• Globus Replica Location Service (RLS)
– Hierarchical, distributed index
– Provides mappings from logical names for data to physical locations of replicas
– [Figure: Replica Location Index nodes (RLIs) aggregating from Local Replica Catalogs (LRCs)]
• Metadata Catalog Service (MCS)
– Centralized database of metadata attributes associated with data items
– Answers queries for resources with specified metadata characteristics
• Storage Resource Broker MCAT
– Centralized (or partitioned) catalog with metadata, replica location and resource information

Challenges for Resource Discovery Services
• Grids are growing larger
– Increasing number of resources and resource discovery service instances
• Organization of resource discovery services is challenging
– Creation/maintenance of efficient hierarchies
– Avoiding hotspots, eliminating update cycles
• Much of the configuration and maintenance of these services is done manually
– Few capabilities for self-configuration or self-healing
– Limits scalability
– Makes services complex to deploy and maintain
• Goal: Use peer-to-peer techniques to make services more self-configuring, reliable and scalable
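As a point of reference for the index-based services described above, the following is a minimal, hypothetical sketch of a query-based resource index that aggregates records from lower-level indexes and answers attribute queries. The class and attribute names are illustrative assumptions, not the MDS or RLS implementation.

```python
# Minimal sketch of a query-based, hierarchical resource index.
# Illustrative only -- not the Globus MDS implementation; all names are hypothetical.

class ResourceIndex:
    def __init__(self):
        self.records = {}        # resource name -> {attribute: value}
        self.children = []       # lower-level indexes that report into this one

    def register(self, name, attributes):
        """A resource (or information provider) publishes its current attributes."""
        self.records[name] = dict(attributes)

    def add_child(self, child_index):
        """Build a hierarchy: this index aggregates from child indexes."""
        self.children.append(child_index)

    def query(self, predicate):
        """Return all resources (here and in children) whose attributes satisfy predicate."""
        matches = {n: a for n, a in self.records.items() if predicate(a)}
        for child in self.children:
            matches.update(child.query(predicate))
        return matches

# Example: find GridFTP servers with load below 3.0
site = ResourceIndex()
site.register("skynet-login.isi.edu", {"service": "GridFTP", "port": 2811, "load": 2.55})
top = ResourceIndex()
top.add_child(site)
print(top.query(lambda a: a.get("service") == "GridFTP" and a.get("load", 99.0) < 3.0))
```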
Peer-to-Peer Systems
• Service instances create an overlay network
• Queries and responses are forwarded/routed in the overlay
• Structured overlay
– Chord, Pastry, etc.
– Distributed Hash Table (DHT) based
– Effective when storing/retrieving <key, value> pairs
– Strong bounds on performance
• Unstructured overlay
– Gnutella, KaZaA
– Effective when querying on attributes
– Not DHT based
– Flooding algorithms
• Hybrid approaches also possible
• Good scalability, self-organization, reliability, self-healing

Structured P2P Networks
• Maintain a structured overlay network among peers and use message routing
• Basic functionality: lookup(key), which returns the identity of the node storing the object with that key
• Often based on a Distributed Hash Table (DHT)
• Objects are associated with a key that can be produced by hashing the object name
• Nodes have identifiers that share the same space as keys
• Each node is responsible for storing a range of keys and corresponding objects
• Nodes maintain an overlay network, with each node having several other nodes as neighbors
• When a lookup(key) request is issued from one node, the lookup message is routed through the overlay network to the node responsible for the key

Structured P2P Networks (cont.)
• Different DHT systems construct a variety of overlay networks and employ different routing algorithms
• They can guarantee to finish a lookup operation in O(log N) or O(dN^(1/d)) hops
• Each node only maintains information about O(log N) or d neighbors for an N-node network (where d is the dimension of the hypercube organization of the network)
• So DHT systems provide good scalability as well as fault tolerance
• DHT systems include Pastry, Chord, CAN and Koorde

Example: Chord Structured P2P System
• Chord algorithm proposed by Stoica, et al.
• Chord uses a one-dimensional circular identifier space, modulo 2^m, for both node identifiers and object keys
• Every node in Chord is assigned a unique m-bit identifier by hashing its IP address and port number
• All nodes self-organize into a ring topology based on their node identifiers in the circular space
• Each object is also assigned a unique m-bit identifier called its object key
• Object keys are assigned to nodes by using consistent hashing
– Key k is assigned to the first node whose identifier is equal to or follows the identifier of k in the circular space
– This node stores the object with key k and is called its successor node

An Example of Chord Network
[Figure: Chord ring with nodes N4, N8, N20, N24, N40, N48, N54, N60 and keys Key18, Key31, Key52]

Chord Structured P2P System (cont.)
• Each Chord node maintains two sets of neighbors: its successors and its fingers
• Successor nodes immediately follow the node in the identifier space
• Finger nodes are spaced exponentially around the identifier space
• Each node has a constant number of successors and at most m fingers
• The i-th finger of the node with identifier n is the first node that succeeds n by at least 2^(i-1) on the identifier circle, where 1 <= i <= m
• The first finger (i = 1) is the immediate successor of n

An Example of Chord Network
[Figure: Finger table of node N4 in the ring above — N4+1, N4+2, N4+4 => N8; N4+8, N4+16 => N20; N4+32 => N40]
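The identifier, successor, and finger-table rules above can be sketched in a few lines. The hash function, ring size m = 6, and helper names below are illustrative assumptions chosen to reproduce the small example ring, not the actual Chord implementation.

```python
import hashlib

M = 6                      # identifier space of 2^M points (64 IDs, as in the small example ring)

def chord_id(name, m=M):
    """Hash a node address or object name into the m-bit circular identifier space."""
    return int(hashlib.sha1(name.encode()).hexdigest(), 16) % (2 ** m)

def successor(key, node_ids):
    """First node whose identifier is equal to or follows the key on the circle."""
    ring = sorted(node_ids)
    for n in ring:
        if n >= key:
            return n
    return ring[0]          # wrap around the ring

def finger_table(n, node_ids, m=M):
    """The i-th finger of node n is the successor of (n + 2^(i-1)) mod 2^m, for 1 <= i <= m."""
    return [successor((n + 2 ** (i - 1)) % (2 ** m), node_ids) for i in range(1, m + 1)]

nodes = [4, 8, 20, 24, 40, 48, 54, 60]        # node identifiers from the example ring
print(chord_id("skynet.isi.edu:2811"))         # how a node ID would be derived from address:port
print(successor(52, nodes))                    # key 52 is stored on node 54
print(finger_table(4, nodes))                  # fingers of N4: [8, 8, 8, 20, 20, 40]
```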
Chord (cont.)
• When node n wants to look up the object with key k, it routes a lookup request toward the successor node of key k
• If the successor node is far away from n, node n forwards the request to the finger node whose identifier most immediately precedes the successor node of key k
• By repeating this process, the request gets closer and closer to the successor node
• Eventually, the successor node receives the lookup request for the object with key k, finds the object locally and sends the result back to node n
• Each hop from one node to the next covers at least half the remaining identifier space (clockwise) between that node and the successor node of key k
• Number of routing hops for a lookup is O(log N) for a Chord network with N nodes
– Insertion time is also O(log N)
• Each node maintains pointers to O(log N) neighbors

An Example of Chord Network
[Figure: lookup(key52) issued at node N4 is forwarded via finger nodes, using N4's finger table, and reaches N54, the successor of key 52]

Unstructured P2P Systems
• An unstructured P2P network usually does not impose any constraints on links between nodes in the system
• Choice of neighbors to peer with is less restrictive and is often probabilistic or randomized
• Unstructured overlays do not create associations between nodes or links in the system and the information stored in those nodes
– Do not require that information adhere to a particular format or be tied to the structure of the overlay
– Unlike DHT systems that store <key, value> pairs and hash on the key value
• Information is usually stored only at the node where it was generated, or is replicated in a probabilistic manner
• Query-response pathways are also not well defined
• Queries are propagated in the system using flooding-based algorithms
– Responses are routed back on the same path as the queries

Unstructured P2P Networks (cont.)
• Cannot provide guarantees on query performance
– Don't know the number of hops taken by a query message to reach a node that can answer the query
– Unlike structured overlays
• Cannot guarantee that results will be returned if they exist in the network
– The time-to-live field in the message dictates how far the message travels in the network
– Message may not reach all nodes
• Applications must be capable of dealing with these issues as they do with other failure modes

Unstructured P2P Networks (cont.)
• Examples of unstructured P2P systems: Napster, Gnutella and KaZaA
• Successful in Internet file sharing applications
• Allow peers to host content, discover content on other peers, and download that content
• Popular in the Internet community despite known disadvantages:
– Vulnerability of central indexes in Napster
– High network loads imposed by Gnutella's flooding algorithms
• Optimizations of unstructured systems have been developed based on file and query distributions and on the use of replication and caching
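The flooding behavior and TTL limitation described above can be illustrated with a small, Gnutella-style sketch. The Peer class, message fields, and duplicate-suppression scheme are assumptions for illustration, not any specific protocol.

```python
# Sketch of TTL-limited query flooding in an unstructured overlay.
# Illustrative only; field names and duplicate suppression are assumptions.

import uuid

class Peer:
    def __init__(self, name, local_content):
        self.name = name
        self.local_content = set(local_content)   # items this peer can answer for
        self.neighbors = []                        # randomly chosen peers
        self.seen = set()                          # message IDs already handled

    def query(self, item, ttl=3):
        """Originate a query and collect responses flooded back along the query path."""
        return self.handle(msg_id=str(uuid.uuid4()), item=item, ttl=ttl)

    def handle(self, msg_id, item, ttl):
        if msg_id in self.seen:                    # drop duplicates arriving on other paths
            return []
        self.seen.add(msg_id)
        responses = [self.name] if item in self.local_content else []
        if ttl > 0:                                # TTL limits how far the flood travels;
            for peer in self.neighbors:            # results beyond that horizon are missed
                responses += peer.handle(msg_id, item, ttl - 1)
        return responses

a, b, c = Peer("A", []), Peer("B", ["gridftp"]), Peer("C", ["gridftp"])
a.neighbors, b.neighbors = [b], [c]
print(a.query("gridftp", ttl=1))    # ['B'] -- C also has the item but lies beyond the TTL horizon
```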
Outline
• Resource Discovery in Grids: current approaches
• Peer-to-Peer Resource Discovery
• Applying P2P techniques to Grid Resource Discovery
– Unstructured P2P Information Service (joint work with Shishir Bharathi)
– Structured P2P Replica Location Service (joint work with Min Cai)
• Summary and Future Work

Applying Peer-to-Peer Techniques to Grid Resource Discovery
• P2P technologies successful in Internet file sharing applications
– Gnutella, KaZaA, Napster, etc.
– Allow peers to host content, discover content and download it
• Grid resource discovery services have similar requirements
• Would like to use peer-to-peer technologies for resource discovery in Grids
– Improved configuration, scalability, etc.
• Convergence of P2P and Grid has been predicted
– But not yet a reality…

Challenges in Applying P2P Technologies to Grids
• Performance
– May require multiple network hops to resolve a query
– Some P2P overlays distribute resource information widely (e.g., structured overlays)
– Still want to make use of specialized Grid indexes
• Security issues
– Access to resources and to information about resources may need to be controlled
– Need a security model that allows us to use P2P safely
• Practical issues
– It has taken several years to make Grid discovery services scalable and stable
– To support greater scalability, need further improvements in simple and dynamic deployment

P2P Grid Information Service
• Explore organizing an existing grid information service (the GT4 Index Service) as a peer-to-peer system
• Background
– Grid Information Services
– GT4 Index Service
• P2P Index Service design
– Issues and design choices
– Optimizations
• Implementation
• Experimental results
"Design of a Scalable Peer-to-Peer Information System Using the GT4 Index Service", Shishir Bharathi and Ann Chervenak, CCGrid 2007 Conference, May 2007

Grid Information Services
• Collect information about resources
– skynet.isi.edu has 96 nodes
– GridFTP runs on port 2811
– Avg. load on skynet-login.isi.edu is 2.55
• Aggregate information from multiple types of resources and services
• Queried by schedulers, workflow planners, and clients that need information about current resource state
• Process different types of queries
– What port does GridFTP run on?
  • One response expected
– What servers have load < 3.0?
  • Multiple responses expected; information gathering step

Organization of GT4 Indexes
• Globus Toolkit Version 4 Index Service
– Part of the Monitoring and Discovery System (MDS)
– Aggregates information about resources, responds to queries
– Issues: designing efficient hierarchies, avoiding hot spots, avoiding update cycles, scaling with the amount of information
– Dealing with multiple administrative domains

GT4 Index Service
• WS-RF service, part of the Monitoring and Discovery System (MDS)
• Information Providers generate resource information in XML format
– E.g., Hawkeye, Ganglia, GridFTP
• Aggregator sources aggregate information
• Index Service publishes aggregated information in a single Resource Property document
• Processes XPath queries and returns matching XML content
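To make the Resource Property / XPath model concrete, here is a small sketch of querying an aggregated XML document. The element names and document layout are invented for illustration and are not the actual MDS4 resource-property schema.

```python
# Sketch: resource information as an XML document, queried with XPath-style expressions.
# Hypothetical document structure; not the GLUE/MDS4 schema.

import xml.etree.ElementTree as ET

resource_properties = ET.fromstring("""
<IndexRP>
  <Host name="skynet.isi.edu" nodes="96" load="2.55"/>
  <Host name="griddev.isi.edu" nodes="8" load="3.40"/>
  <Service name="GridFTP" host="skynet.isi.edu" port="2811"/>
</IndexRP>
""")

# "Return one" style query: what port does GridFTP run on?
gridftp = resource_properties.find(".//Service[@name='GridFTP']")
print(gridftp.get("port"))                                   # 2811

# "Return all" style query: which hosts have load < 3.0?
# ElementTree's XPath subset has no numeric predicates, so the comparison is done in Python;
# a full XPath engine, as used by the Index Service, can express it in the query itself.
low_load = [h.get("name") for h in resource_properties.findall(".//Host")
            if float(h.get("load")) < 3.0]
print(low_load)                                              # ['skynet.isi.edu']
```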
Design of P2P GT4 Index Service
• Modified GT4 Index Services to create a P2P Index Service
• P2P Indexes organize themselves into an unstructured overlay
• Each P2P Index can be accessed both via the overlay and as a standalone GT4 Index Service

Design of P2P GT4 Index Service (cont.)
• Overlay used only for self-organization and for forwarding and routing of queries and responses
– Continue to use specialized MDS4 Index Services that aggregate resource information
• Resource information is not stored or updated via the overlay
– Each P2P Index acquires resource information via an out-of-band mechanism
– Policies may dictate when and how indexes are updated
• Resource information is not replicated via the overlay
– May change quickly, so replication is not effective
• Separate "storing of information" from "querying for information"

Design Issue: Security Model
• Grid services typically impose a strict security model
– Client and server go through mutual authentication and authorization
• P2P systems impose a less strict security model
– Access to information and access to resources via the same overlay
• Separation of "access to information" from "access to resources"
– User authenticates at any node and queries for information
  • Trusted at the VO level to access information
  • E.g., find a compute resource that can execute a job
– User accesses the resource directly and not through the overlay
  • Involves mutual authentication and authorization
  • Trusted at the individual site level to access resources
  • E.g., submit the job to the resource directly

Design Issue: Choice of Overlay
• Our choice: unstructured overlays
– Easy overlay management
– Suitable for storing arbitrary content
– Support VO-defined topologies
– Previous work mostly in file-sharing applications
• Why not structured overlays?
– Well researched in the context of information services. However…
– Not ideal for storing XML resource information
– Policy restrictions may prevent nodes from storing/indexing information generated at other nodes

Issues for Unstructured Overlays
• Unstructured overlay + no replication + flooding algorithm means…
– Cannot guarantee an answer will be found
– Depends on the max-hops field
– No guarantees on the number of hops taken to reach an answer
• Exponential growth in the number of messages sent
– Need to optimize message forwarding to counter this explosion
• Typical message handling
– Process the query locally; if no result is found, forward to peers
– Reduces the number of messages sent
– BUT slow, if the client expects multiple responses

Optimization: Query Caching with Probabilistic Forwarding
• Goal: Reduce the number of messages sent
– Replication/caching is the most popular technique
• Cannot cache query responses
– Information may change quickly, policy restrictions, etc.
• Can cache queries
• Query caching with probabilistic forwarding
– Cache information about which nodes responded to a query
– If a node responded earlier to the same query, deterministically forward the query to that node
– Forward the query to other nodes with low probability
  • Identifies nodes that have been updated with new information
  • Sets up caches along duplicate paths correctly
– Similar to other learning-based and probabilistic approaches
• Effective for applications that may issue queries repeatedly (e.g., the Pegasus workflow planner)

Optimization: Early Forwarding
• Goal: Improve performance of attribute-based queries
– The process-and-forward model may be slow
• Distinguish between "Return one" vs. "Return all" semantics
• Return one – explicit queries
– What is the load on skynet.isi.edu?
– Requires a single response
– Process the query locally before forwarding, to reduce messages
• Return all – attribute-based queries
– What sites have load < 3.0?
– Likely to be multiple responses
– Forward the query before processing locally, to reduce response time ("early forwarding")
• Tag QUERY messages with hints ("Return one" or "Return all") that indicate whether to do early forwarding
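The two optimizations above can be combined into a single forwarding decision at each node. The sketch below is illustrative only; the cache structure, forwarding probability, and hint names are assumptions rather than the P2P Index Service implementation.

```python
# Sketch of the forwarding decision at one P2P Index node.
# Assumed: cache keyed by query text, fixed forwarding probability, RETURN_ONE/RETURN_ALL hint.

import random

FORWARD_PROBABILITY = 0.1        # low-probability forwarding to discover newly updated peers

class P2PIndexNode:
    def __init__(self, name, local_index, peers):
        self.name = name
        self.local_index = local_index        # query string -> local result (or absent)
        self.peers = peers
        self.query_cache = {}                 # query string -> names of nodes that answered it before

    def handle_query(self, query, hint, ttl):
        results = []
        local = self.local_index.get(query)
        if hint == "RETURN_ONE":
            # Process locally first; forward only if there is no local answer (fewer messages).
            if local is not None:
                return [(self.name, local)]
            results += self.forward(query, hint, ttl)
        else:  # RETURN_ALL
            # Early forwarding: send the query on before doing local processing,
            # so responses from other nodes are not delayed by local work.
            results += self.forward(query, hint, ttl)
            if local is not None:
                results.append((self.name, local))
        return results

    def forward(self, query, hint, ttl):
        if ttl <= 0:
            return []
        results = []
        known_responders = self.query_cache.get(query, set())
        for peer in self.peers:
            # Deterministically re-ask peers that answered this query before;
            # otherwise forward with low probability to limit the message explosion.
            if peer.name in known_responders or random.random() < FORWARD_PROBABILITY:
                answers = peer.handle_query(query, hint, ttl - 1)
                results += answers
                self.query_cache.setdefault(query, set()).update(n for n, _ in answers)
        return results

# Usage: node A has cached that node B answered this query before.
a = P2PIndexNode("indexA", {}, [])
b = P2PIndexNode("indexB", {"load(skynet.isi.edu)": 2.55}, [])
a.peers = [b]
a.query_cache["load(skynet.isi.edu)"] = {"indexB"}
print(a.handle_query("load(skynet.isi.edu)", "RETURN_ONE", ttl=2))   # [('indexB', 2.55)]
```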
Implementation of P2P Index Service
• Layered implementation
– A P2P Resource component maintains the overlay and processes messages
– An IndexServiceResource component processes queries
– "Almost" plug-and-play
• Gnutella-like message forwarding
• Updated using the standard MDS aggregator framework and not through the overlay
• Query Caching and Early Forwarding optimizations

Experiments
Experimental set-up
• Evaluate the overhead of applying a P2P overlay
• Evaluate wide-area performance
• Test beds: LAN at ISI, PlanetLab
– Mostly comparison tests on small networks
Applications
• Pegasus (a workflow planning application)
– Site and transformation catalogs used by Pegasus
• Simple random query client
– Artificial records
Metrics
• Time taken by Pegasus to generate a plan
• Query rates

Experiments: Overhead of P2P Layer
• Comparison of a single Index Service and a single P2P Index Service – an indicator of the P2P Index Service layer overhead
[Graph: time taken by Pegasus to generate a plan (seconds, 0–600) vs. number of data sets in the index (20–160), for the Index Service and the P2P Index Service]
• Pegasus planning a 100-job workflow
• Query overhead reasonably constant as the number of datasets in the index increases

Experiments: Wide Area Performance
• Scalability tests on 3 networks: 1000 datasets at each index, 2 peers per index
[Graph: queries/minute (0–300) vs. number of clients (4 threads each, 1–64 clients) for World WAN, US WAN and LAN]
• WAN measurements on the PlanetLab test bed
• 8 nodes, 2 peers per node, up to 256 concurrent client threads
• Query rates in the "World" WAN slightly higher than in the "US" WAN
– Higher load on the US PlanetLab nodes
– Query processing is compute intensive

P2P Index Service: Conclusions
• P2P organization of the GT4 Information Service
– Low overhead from adding a P2P layer
• Key design features:
– Separation of storage of information from querying for information (overlay used only to forward queries)
– Separation of access to information from access to resources (security model: choose what is exposed at the VO level and apply additional restrictions at the resource level)
• Simple optimizations help address issues with flooding (results not shown here)
– Query caching with probabilistic forwarding
– Early forwarding
• Future work
– Scale to larger sizes
– P2P version of the Replica Location Service using the overlay
– Experiment with replicating relatively static information

Outline
• Resource Discovery in Grids: current approaches
• Peer-to-Peer Resource Discovery
• Applying P2P techniques to Grid Resource Discovery
– P2P Information Service (joint work with Shishir Bharathi)
– P2P Replica Location Service (joint work with Min Cai)
• Summary and Future Work

Peer-to-Peer Replica Location Service
• Implemented a P2P Replica Location Service based on:
– Globus Toolkit Version 3.0 RLS
– A Chord structured peer-to-peer overlay network
"A Peer-to-Peer Replica Location Service Based on A Distributed Hash Table," Min Cai, Ann Chervenak, Martin Frank, Proceedings of SC2004 Conference, November 2004.
"Applying Peer-to-Peer Techniques to Grid Replica Location Services," Ann L. Chervenak, Min Cai, Journal of Grid Computing, 2006.
The Globus Replica Location Service
• The existing Globus Replica Location Service (RLS) is a distributed registry service that records the locations of data copies and allows discovery of replicas
• Maintains mappings between logical identifiers and target names
• Local Replica Catalogs (LRCs) contain logical-to-target mappings
• Replica Location Index nodes (RLIs) aggregate information about LRCs
• Soft-state updates are sent from LRCs to RLIs
[Figure: Replica Location Index nodes (RLIs) above Local Replica Catalogs (LRCs)]

Motivation for a Peer-to-Peer RLS
• Each RLS deployment is statically configured
– If an upper-level RLI fails, the lower-level LRCs need to be manually redirected
• More automated and flexible membership management is desirable for:
– Larger deployments
– Dynamic environments where servers frequently join and leave
• We use a peer-to-peer approach to provide a distributed RLI index for {logical-name, LRC} mappings
• Consistent with our security model: resource discovery at the RLI level, stricter security at the LRC level
• In P2P RLS, we replicate mappings, unlike in P2P MDS
– Easier to hash on a logical name than on arbitrary XML content
– Mappings are much less dynamic than resource information

P2P Replica Location Service (P-RLS) Design
• A P-RLS server consists of:
– An unchanged Local Replica Catalog (LRC) to maintain consistent {logical-name, target-name} mappings
– A peer-to-peer Replica Location Index node (P-RLI)
• The P-RLS design uses a Chord overlay network to self-organize P-RLI servers
– Chord is a distributed hash table that supports scalable key insertion and lookup
– Each node has log(N) neighbors in a network of N nodes
– A key is stored on its successor node (the first node with an ID equal to or greater than the key)
– Key insertion and lookup in log(N) hops
– Stabilization algorithm for overlay construction and topology repair

P-RLS Design (cont.)
• Uses the Chord algorithm to store mappings of logical names to LRC sites
– Generates the Chord key for a logical name by applying the SHA1 hash function
– Stores {logical-name, LRC} mappings on the P-RLI successor node of the mapping's key
• When a P-RLI node receives a query for the LRC(s) that store mappings for a logical name:
– It answers the query if it contains the logical-name-to-LRC mapping(s)
– If not, it routes the query to the successor node that contains the mappings
• Clients then query LRCs directly for mappings from logical names to replica locations
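A minimal sketch of the mapping placement just described: hash the logical name with SHA1 into the Chord key space, store the {logical-name, LRC} mapping on the key's successor, and (anticipating the successor-replication scheme discussed later) optionally replicate it on k successors. The ring size, node identifiers, and function names are illustrative assumptions, not the P-RLS code.

```python
# Sketch: placing and querying {logical-name, LRC} mappings on a Chord-style ring.
# Ring size, hashing, and data structures are illustrative assumptions.

import hashlib
from collections import defaultdict

M = 16                                   # assumed m-bit identifier space

def key_for(logical_name):
    """P-RLS derives the Chord key by applying SHA1 to the logical name."""
    return int(hashlib.sha1(logical_name.encode()).hexdigest(), 16) % (2 ** M)

def successors(key, node_ids, k=1):
    """The k nodes that follow the key on the identifier circle (root node first)."""
    ring = sorted(node_ids)
    start = next((i for i, n in enumerate(ring) if n >= key), 0)
    return [ring[(start + i) % len(ring)] for i in range(k)]

nodes = [4211, 18954, 30520, 41200, 55903]          # hypothetical P-RLI node identifiers
store = defaultdict(dict)                           # node id -> {logical name -> set of LRC URLs}

def rli_add(logical_name, lrc_url, replication_factor=2):
    """Store the mapping on the root successor and on k-1 further successors
    (successor replication), so it survives individual P-RLI failures."""
    for node in successors(key_for(logical_name), nodes, k=replication_factor):
        store[node].setdefault(logical_name, set()).add(lrc_url)

def rli_get_lrc(logical_name):
    """Route a lookup to the successor of the key and return the LRCs holding the mapping.
    The client then queries those LRCs directly for the physical replica locations."""
    root = successors(key_for(logical_name), nodes, k=1)[0]
    return store[root].get(logical_name, set())

rli_add("lfn1001", "rlsn://lrc1001")
print(rli_get_lrc("lfn1001"))                       # {'rlsn://lrc1001'}
```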
An Example of P-RLS Network
[Figure: SHA1("lfn1000") = 18, SHA1("lfn1001") = 52, SHA1("lfn1002") = 31; a client issues rli_get_lrc("lfn1001") at node N4, and the lookup(key52) is routed through the ring to N54, which holds <lfn1001, rlsn://lrc1001>; N20 holds <lfn1000, rlsn://lrc1000> and N40 holds <lfn1002, rlsn://lrc1002>]

P-RLS Implementation
• Implemented a prototype of P-RLS
• Extends the RLS implementation in Globus Toolkit 3.0
• Each P-RLS node consists of an unchanged LRC server and a peer-to-peer P-RLI server
• The P-RLI server implements the Chord protocol operations, including join, update, query, successor, probing and stabilization
• The LRC, RLI and Chord protocols are implemented on top of the RLS RPC layer
[Figure: P-RLS architecture — each P-RLS node contains an LRC server and a P-RLI server; the P-RLI server layers the LRC, RLI and Chord protocols (successor, join, update, query, probing, stabilization) over the RLS RPC layer; P-RLI servers connect to each other via the Chord network, and clients use the RLS client API]

P-RLS Performance
• P-RLS network runs on a 16-node cluster
• 1000 updates (add operations) on each node; updates overwrite existing mappings, with a maximum of 1000 mappings in the network
• Update latencies increase on a log scale with the number of nodes
[Graph: update latency (ms, 0–6) vs. number of nodes (1–15)]
P-RLS Measurements (cont.)
• Query latencies with 100,000 and 1 million mappings
• The total number of mappings has little effect on query times
– Uses a hash table to index mappings on each P-RLI node
• Query times increase on a log scale with the number of nodes
[Graph: query latency (ms, 0–5) vs. number of nodes (1–15), for 100,000 and 1,000,000 preloaded mappings]

Successor Replication in P-RLS
• Need better reliability when P-RLI nodes fail or leave
– Replicate mappings so they are not lost
• Min Cai proposes Adaptive Successor Replication:
– Extends the previously described scheme
– Replicates each mapping on the k successor nodes of the root node
• Provides reliability despite P-RLI failures
– No mappings are lost unless all k successors fail simultaneously
• Distributes mappings more evenly among P-RLI nodes as the replication factor increases
– Improves load balancing for popular mappings

Summary: P-RLS Work
• Implemented a P2P Replica Location Service based on:
– Globus Toolkit Version 3.0 RLS
– A Chord structured peer-to-peer overlay network
• Measured the performance of our P-RLS system with up to 15 nodes
– Query and update latencies increase at a rate of O(log N) with the size of the P-RLS network
• Simulated the performance of larger P-RLS networks
• Replication of mappings results in a more even distribution of mappings among nodes
• The successor replication scheme provides query load balancing for popular mappings

Related Work: P2P and Grids
Other approaches to applying P2P to Grid services
• GLARE
– Mumtaz Siddiqui et al., University of Innsbruck
– Structured P2P approach to Grid information services
• P2P Replica Location Service
– Matei Ripeanu et al., University of British Columbia
– Uses Bloom filters and an unstructured P2P network for a replica location service
Structured peer-to-peer networks
• Examples: Chord, CAN, Tapestry, Pastry
Unstructured peer-to-peer networks
• Examples: Gnutella, KaZaA, Gia

Summary
• Resource discovery services in Grids are well suited to P2P approaches
– Similar in function to Internet file sharing applications
• P2P approaches are attractive because
– The scale of Grids is growing larger
– Organization of hierarchical resource discovery services is challenging, especially at large scale
– Need self-configuration, self-healing
• Security, performance and practical issues remain to be overcome
• Two systems implemented and evaluated
– P2P Information Service: uses an unstructured overlay to support resource information specified in XML
– P2P Replica Location Service: uses a structured overlay to distribute mappings among indexes, with bounded query response

Future Work
• Continued research
– Larger scale
– Different overlay schemes
– Additional optimizations to improve query performance and reduce message counts
• Incorporate P2P techniques into real-world Grid resource discovery services
– Make the GT4-based peer-to-peer overlay available as an open source contribution to the Globus Toolkit
– Release the P2P Index Service component for MDS4
– Tested a version of the Replica Location Service using the unstructured GT4 overlay
– Additional improvements to services for easy, dynamic deployment to make these approaches practical