Peer-to-Peer Approaches to Grid Resource Discovery
Ann Chervenak
University of Southern California Information Sciences Institute
Joint work with Shishir Bharathi, Min Cai

Resource Discovery in Grids
• Applications running in wide area distributed environments or Grids need information about resources and services
• Resource information has an impact on
– Scheduling decisions
– Replica selection
– Planning for workflow execution (e.g., Pegasus)
• Resource discovery services need to:
– Be highly available, fault tolerant, reliable
– Be highly scalable
– Provide flexible query interfaces
• Typical Grid resource discovery services are organized as hierarchies of index services
• Applying P2P techniques offers the promise of self-organization, self-healing, and improved scalability

Outline
• Resource Discovery in Grids: current approaches
• Peer-to-Peer Resource Discovery
• Applying P2P techniques to Grid Resource Discovery
– Unstructured P2P Information Service (joint work with Shishir Bharathi)
– Structured P2P Replica Location Service (joint work with Min Cai)
• Summary and Future Work

Typical Grid Resource Discovery
Currently, most services for resource discovery in Grids are query-based index services
• One or more indexes aggregate information about resources
• Often distributed using a hierarchical structure
• Service provides a front-end query interface
• Database back end stores resource information
• Service responds to queries by identifying resources that match desired properties
• Scalable
– Hold large amounts of resource information
– Support high query rates
• Often specialized for particular types of resource information and/or queries
– Specialized query APIs, resource schemas, etc.

Index-based Discovery Services
• Globus Monitoring and Discovery System (MDS)
– Hierarchical, distributed Index Service
– Aggregates information about resources (CPUs, storage systems, etc.)
– Answers queries for resources with specified properties
– Typically close to resources, whose information may change frequently (e.g., index co-located with a cluster)

Index-based Discovery Services (cont.)
• Globus Replica Location Service (RLS)
– Hierarchical, distributed index
– Provides mappings from logical names for data to physical locations of replicas
– [Figure: Replica Location Index nodes (RLIs) aggregating from Local Replica Catalogs (LRCs)]
• Metadata Catalog Service (MCS)
– Centralized database of metadata attributes associated with data items
– Answers queries for resources with specified metadata characteristics
• Storage Resource Broker MCAT
– Centralized (or partitioned) catalog with metadata, replica location and resource information

Challenges for Resource Discovery Services
• Grids are growing larger
– Increasing number of resources and resource discovery service instances
• Organization of resource discovery services is challenging
– Creation/maintenance of efficient hierarchies
– Avoiding hotspots, eliminating update cycles
• Much of the configuration and maintenance of these services is done manually
– Few capabilities for self-configuration or self-healing
– Limits scalability
– Makes services complex to deploy and maintain
• Goal: Use peer-to-peer techniques to make services more self-configuring, reliable and scalable
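As a point of reference for the index-based services described above, the following is a minimal, hypothetical sketch of a query-based resource index that aggregates records from lower-level indexes and answers attribute queries. The class and attribute names are illustrative assumptions, not the MDS or RLS implementation.

```python
# Minimal sketch of a query-based, hierarchical resource index.
# Illustrative only -- not the Globus MDS implementation; all names are hypothetical.

class ResourceIndex:
    def __init__(self):
        self.records = {}        # resource name -> {attribute: value}
        self.children = []       # lower-level indexes that report into this one

    def register(self, name, attributes):
        """A resource (or information provider) publishes its current attributes."""
        self.records[name] = dict(attributes)

    def add_child(self, child_index):
        """Build a hierarchy: this index aggregates from child indexes."""
        self.children.append(child_index)

    def query(self, predicate):
        """Return all resources (here and in children) whose attributes satisfy predicate."""
        matches = {n: a for n, a in self.records.items() if predicate(a)}
        for child in self.children:
            matches.update(child.query(predicate))
        return matches

# Example: find GridFTP servers with load below 3.0
site = ResourceIndex()
site.register("skynet-login.isi.edu", {"service": "GridFTP", "port": 2811, "load": 2.55})
top = ResourceIndex()
top.add_child(site)
print(top.query(lambda a: a.get("service") == "GridFTP" and a.get("load", 99.0) < 3.0))
```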
Peer-to-Peer Systems
• Service instances create an overlay network
• Queries and responses are forwarded/routed in the overlay
• Structured overlay
– Chord, Pastry, etc.
– Distributed Hash Table (DHT) based
– Effective when storing/retrieving <key, value> pairs
– Strong bounds on performance
• Unstructured overlay
– Gnutella, KaZaA
– Effective when querying on attributes
– Not DHT based
– Flooding algorithms
• Hybrid approaches also possible
• Good scalability, self-organization, reliability, self-healing

Structured P2P Networks
• Maintain a structured overlay network among peers and use message routing
• Basic functionality: lookup(key), which returns the identity of the node storing the object with that key
• Often based on a Distributed Hash Table (DHT)
• Objects are associated with a key that can be produced by hashing the object name
• Nodes have identifiers that share the same space as keys
• Each node is responsible for storing a range of keys and corresponding objects
• Nodes maintain an overlay network, with each node having several other nodes as neighbors
• When a lookup(key) request is issued from one node, the lookup message is routed through the overlay network to the node responsible for the key

Structured P2P Networks (cont.)
• Different DHT systems construct a variety of overlay networks and employ different routing algorithms
• They can guarantee to finish a lookup operation in O(log N) or O(dN^(1/d)) hops
• Each node only maintains information about O(log N) or d neighbors for an N-node network (where d is the dimension of the hypercube organization of the network)
• So DHT systems provide good scalability as well as fault tolerance
• DHT systems include Pastry, Chord, CAN and Koorde

Example: Chord Structured P2P System
• Chord algorithm proposed by Stoica, et al.
• Chord uses a one-dimensional circular identifier space, modulo 2^m, for both node identifiers and object keys
• Every node in Chord is assigned a unique m-bit identifier by hashing its IP address and port number
• All nodes self-organize into a ring topology based on their node identifiers in the circular space
• Each object is also assigned a unique m-bit identifier called its object key
• Object keys are assigned to nodes by using consistent hashing
– Key k is assigned to the first node whose identifier is equal to or follows the identifier of k in the circular space
– This node stores the object with key k and is called its successor node

An Example of Chord Network
[Figure: Chord ring with nodes N4, N8, N20, N24, N40, N48, N54, N60 and keys Key18, Key31, Key52]

Chord Structured P2P System (cont.)
• Each Chord node maintains two sets of neighbors: its successors and its fingers
• Successor nodes immediately follow the node in the identifier space
• Finger nodes are spaced exponentially around the identifier space
• Each node has a constant number of successors and at most m fingers
• The i-th finger of the node with identifier n is the first node that succeeds n by at least 2^(i-1) on the identifier circle, where 1 <= i <= m
• The first finger (i = 1) is the immediate successor of n

An Example of Chord Network
[Figure: Finger table of node N4 in the ring above — N4+1, N4+2, N4+4 => N8; N4+8, N4+16 => N20; N4+32 => N40]
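The identifier, successor, and finger-table rules above can be sketched in a few lines. The hash function, ring size m = 6, and helper names below are illustrative assumptions chosen to reproduce the small example ring, not the actual Chord implementation.

```python
import hashlib

M = 6                      # identifier space of 2^M points (64 IDs, as in the small example ring)

def chord_id(name, m=M):
    """Hash a node address or object name into the m-bit circular identifier space."""
    return int(hashlib.sha1(name.encode()).hexdigest(), 16) % (2 ** m)

def successor(key, node_ids):
    """First node whose identifier is equal to or follows the key on the circle."""
    ring = sorted(node_ids)
    for n in ring:
        if n >= key:
            return n
    return ring[0]          # wrap around the ring

def finger_table(n, node_ids, m=M):
    """The i-th finger of node n is the successor of (n + 2^(i-1)) mod 2^m, for 1 <= i <= m."""
    return [successor((n + 2 ** (i - 1)) % (2 ** m), node_ids) for i in range(1, m + 1)]

nodes = [4, 8, 20, 24, 40, 48, 54, 60]        # node identifiers from the example ring
print(chord_id("skynet.isi.edu:2811"))         # how a node ID would be derived from address:port
print(successor(52, nodes))                    # key 52 is stored on node 54
print(finger_table(4, nodes))                  # fingers of N4: [8, 8, 8, 20, 20, 40]
```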
Chord (cont.)
• When node n wants to look up the object with key k, it routes a lookup request toward the successor node of key k
• If the successor node is far away from n, node n forwards the request to the finger node whose identifier most immediately precedes the successor node of key k
• By repeating this process, the request gets closer and closer to the successor node
• Eventually, the successor node receives the lookup request for the object with key k, finds the object locally and sends the result back to node n
• Each hop from one node to the next covers at least half the remaining identifier space (clockwise) between that node and the successor node of key k
• Number of routing hops for a lookup is O(log N) for a Chord network with N nodes
– Insertion time is also O(log N)
• Each node maintains pointers to O(log N) neighbors

An Example of Chord Network
[Figure: lookup(key52) issued at node N4 is forwarded via finger nodes, using N4's finger table, and reaches N54, the successor of key 52]

Unstructured P2P Systems
• An unstructured P2P network usually does not impose any constraints on links between nodes in the system
• Choice of neighbors to peer with is less restrictive and is often probabilistic or randomized
• Unstructured overlays do not create associations between nodes or links in the system and the information stored in those nodes
– Do not require that information adhere to a particular format or be tied to the structure of the overlay
– Unlike DHT systems that store <key, value> pairs and hash on the key value
• Information is usually stored only at the node where it was generated, or is replicated in a probabilistic manner
• Query-response pathways are also not well defined
• Queries are propagated in the system using flooding-based algorithms
– Responses are routed back on the same path as the queries

Unstructured P2P Networks (cont.)
• Cannot provide guarantees on query performance
– Don't know the number of hops taken by a query message to reach a node that can answer the query
– Unlike structured overlays
• Cannot guarantee that results will be returned if they exist in the network
– The time-to-live field in the message dictates how far the message travels in the network
– Message may not reach all nodes
• Applications must be capable of dealing with these issues as they do with other failure modes

Unstructured P2P Networks (cont.)
• Examples of unstructured P2P systems: Napster, Gnutella and KaZaA
• Successful in Internet file sharing applications
• Allow peers to host content, discover content on other peers, and download that content
• Popular in the Internet community despite known disadvantages:
– Vulnerability of central indexes in Napster
– High network loads imposed by Gnutella's flooding algorithms
• Optimizations of unstructured systems have been developed based on file and query distributions and on the use of replication and caching
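The flooding behavior and TTL limitation described above can be illustrated with a small, Gnutella-style sketch. The Peer class, message fields, and duplicate-suppression scheme are assumptions for illustration, not any specific protocol.

```python
# Sketch of TTL-limited query flooding in an unstructured overlay.
# Illustrative only; field names and duplicate suppression are assumptions.

import uuid

class Peer:
    def __init__(self, name, local_content):
        self.name = name
        self.local_content = set(local_content)   # items this peer can answer for
        self.neighbors = []                        # randomly chosen peers
        self.seen = set()                          # message IDs already handled

    def query(self, item, ttl=3):
        """Originate a query and collect responses flooded back along the query path."""
        return self.handle(msg_id=str(uuid.uuid4()), item=item, ttl=ttl)

    def handle(self, msg_id, item, ttl):
        if msg_id in self.seen:                    # drop duplicates arriving on other paths
            return []
        self.seen.add(msg_id)
        responses = [self.name] if item in self.local_content else []
        if ttl > 0:                                # TTL limits how far the flood travels;
            for peer in self.neighbors:            # results beyond that horizon are missed
                responses += peer.handle(msg_id, item, ttl - 1)
        return responses

a, b, c = Peer("A", []), Peer("B", ["gridftp"]), Peer("C", ["gridftp"])
a.neighbors, b.neighbors = [b], [c]
print(a.query("gridftp", ttl=1))    # ['B'] -- C also has the item but lies beyond the TTL horizon
```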
Outline
• Resource Discovery in Grids: current approaches
• Peer-to-Peer Resource Discovery
• Applying P2P techniques to Grid Resource Discovery
– Unstructured P2P Information Service (joint work with Shishir Bharathi)
– Structured P2P Replica Location Service (joint work with Min Cai)
• Summary and Future Work

Applying Peer-to-Peer Techniques to Grid Resource Discovery
• P2P technologies successful in Internet file sharing applications
– Gnutella, KaZaA, Napster, etc.
– Allow peers to host content, discover content and download it
• Grid resource discovery services have similar requirements
• Would like to use peer-to-peer technologies for resource discovery in Grids
– Improved configuration, scalability, etc.
• Convergence of P2P and Grid has been predicted
– But not yet a reality…

Challenges in Applying P2P Technologies to Grids
• Performance
– May require multiple network hops to resolve a query
– Some P2P overlays distribute resource information widely (e.g., structured overlays)
– Still want to make use of specialized Grid indexes
• Security issues
– Access to resources and to information about resources may need to be controlled
– Need a security model that allows us to use P2P safely
• Practical issues
– It has taken several years to make Grid discovery services scalable and stable
– To support greater scalability, need further improvements in simple and dynamic deployment

P2P Grid Information Service
• Explore organizing an existing grid information service (the GT4 Index Service) as a peer-to-peer system
• Background
– Grid Information Services
– GT4 Index Service
• P2P Index Service design
– Issues and design choices
– Optimizations
• Implementation
• Experimental results
"Design of a Scalable Peer-to-Peer Information System Using the GT4 Index Service", Shishir Bharathi and Ann Chervenak, CCGrid 2007 Conference, May 2007

Grid Information Services
• Collect information about resources
– skynet.isi.edu has 96 nodes
– GridFTP runs on port 2811
– Avg. load on skynet-login.isi.edu is 2.55
• Aggregate information from multiple types of resources and services
• Queried by schedulers, workflow planners, and clients that need information about current resource state
• Process different types of queries
– What port does GridFTP run on?
  • One response expected
– What servers have load < 3.0?
  • Multiple responses expected; information gathering step

Organization of GT4 Indexes
• Globus Toolkit Version 4 Index Service
– Part of the Monitoring and Discovery System (MDS)
– Aggregates information about resources, responds to queries
– Issues: designing efficient hierarchies, avoiding hot spots, avoiding update cycles, scaling with the amount of information
– Dealing with multiple administrative domains

GT4 Index Service
• WS-RF service, part of the Monitoring and Discovery System (MDS)
• Information Providers generate resource information in XML format
– E.g., Hawkeye, Ganglia, GridFTP
• Aggregator sources aggregate information
• Index Service publishes aggregated information in a single Resource Property document
• Processes XPath queries and returns matching XML content
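To make the Resource Property / XPath model concrete, here is a small sketch of querying an aggregated XML document. The element names and document layout are invented for illustration and are not the actual MDS4 resource-property schema.

```python
# Sketch: resource information as an XML document, queried with XPath-style expressions.
# Hypothetical document structure; not the GLUE/MDS4 schema.

import xml.etree.ElementTree as ET

resource_properties = ET.fromstring("""
<IndexRP>
  <Host name="skynet.isi.edu" nodes="96" load="2.55"/>
  <Host name="griddev.isi.edu" nodes="8" load="3.40"/>
  <Service name="GridFTP" host="skynet.isi.edu" port="2811"/>
</IndexRP>
""")

# "Return one" style query: what port does GridFTP run on?
gridftp = resource_properties.find(".//Service[@name='GridFTP']")
print(gridftp.get("port"))                                   # 2811

# "Return all" style query: which hosts have load < 3.0?
# ElementTree's XPath subset has no numeric predicates, so the comparison is done in Python;
# a full XPath engine, as used by the Index Service, can express it in the query itself.
low_load = [h.get("name") for h in resource_properties.findall(".//Host")
            if float(h.get("load")) < 3.0]
print(low_load)                                              # ['skynet.isi.edu']
```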
Design of P2P GT4 Index Service
• Modified GT4 Index Services to create a P2P Index Service
• P2P Indexes organize themselves into an unstructured overlay
• Each P2P Index can be accessed both via the overlay and as a standalone GT4 Index Service

Design of P2P GT4 Index Service (cont.)
• Overlay used only for self-organization and for forwarding and routing of queries and responses
– Continue to use specialized MDS4 Index Services that aggregate resource information
• Resource information is not stored or updated via the overlay
– Each P2P Index acquires resource information via an out-of-band mechanism
– Policies may dictate when and how indexes are updated
• Resource information is not replicated via the overlay
– May change quickly, so replication is not effective
• Separate "storing of information" from "querying for information"

Design Issue: Security Model
• Grid services typically impose a strict security model
– Client and server go through mutual authentication and authorization
• P2P systems impose a less strict security model
– Access to information and access to resources via the same overlay
• Separation of "access to information" from "access to resources"
– User authenticates at any node and queries for information
  • Trusted at the VO level to access information
  • E.g., find a compute resource that can execute a job
– User accesses the resource directly and not through the overlay
  • Involves mutual authentication and authorization
  • Trusted at the individual site level to access resources
  • E.g., submit the job to the resource directly

Design Issue: Choice of Overlay
• Our choice: unstructured overlays
– Easy overlay management
– Suitable for storing arbitrary content
– Support VO-defined topologies
– Previous work mostly in file-sharing applications
• Why not structured overlays?
– Well researched in the context of information services. However…
– Not ideal for storing XML resource information
– Policy restrictions may prevent nodes from storing/indexing information generated at other nodes

Issues for Unstructured Overlays
• Unstructured overlay + no replication + flooding algorithm means…
– Cannot guarantee an answer will be found
– Depends on the max-hops field
– No guarantees on the number of hops taken to reach an answer
• Exponential growth in the number of messages sent
– Need to optimize message forwarding to counter this explosion
• Typical message handling
– Process the query locally; if no result is found, forward to peers
– Reduces the number of messages sent
– BUT slow, if the client expects multiple responses

Optimization: Query Caching with Probabilistic Forwarding
• Goal: Reduce the number of messages sent
– Replication/caching is the most popular technique
• Cannot cache query responses
– Information may change quickly, policy restrictions, etc.
• Can cache queries
• Query caching with probabilistic forwarding
– Cache information about which nodes responded to a query
– If a node responded earlier to the same query, deterministically forward the query to that node
– Forward the query to other nodes with low probability
  • Identifies nodes that have been updated with new information
  • Sets up caches along duplicate paths correctly
– Similar to other learning-based and probabilistic approaches
• Effective for applications that may issue queries repeatedly (e.g., the Pegasus workflow planner)

Optimization: Early Forwarding
• Goal: Improve performance of attribute-based queries
– The process-and-forward model may be slow
• Distinguish between "Return one" vs. "Return all" semantics
• Return one – explicit queries
– What is the load on skynet.isi.edu?
– Requires a single response
– Process the query locally before forwarding, to reduce messages
• Return all – attribute-based queries
– What sites have load < 3.0?
– Likely to be multiple responses
– Forward the query before processing locally, to reduce response time ("early forwarding")
• Tag QUERY messages with hints ("Return one" or "Return all") that indicate whether to do early forwarding
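The two optimizations above can be combined into a single forwarding decision at each node. The sketch below is illustrative only; the cache structure, forwarding probability, and hint names are assumptions rather than the P2P Index Service implementation.

```python
# Sketch of the forwarding decision at one P2P Index node.
# Assumed: cache keyed by query text, fixed forwarding probability, RETURN_ONE/RETURN_ALL hint.

import random

FORWARD_PROBABILITY = 0.1        # low-probability forwarding to discover newly updated peers

class P2PIndexNode:
    def __init__(self, name, local_index, peers):
        self.name = name
        self.local_index = local_index        # query string -> local result (or absent)
        self.peers = peers
        self.query_cache = {}                 # query string -> names of nodes that answered it before

    def handle_query(self, query, hint, ttl):
        results = []
        local = self.local_index.get(query)
        if hint == "RETURN_ONE":
            # Process locally first; forward only if there is no local answer (fewer messages).
            if local is not None:
                return [(self.name, local)]
            results += self.forward(query, hint, ttl)
        else:  # RETURN_ALL
            # Early forwarding: send the query on before doing local processing,
            # so responses from other nodes are not delayed by local work.
            results += self.forward(query, hint, ttl)
            if local is not None:
                results.append((self.name, local))
        return results

    def forward(self, query, hint, ttl):
        if ttl <= 0:
            return []
        results = []
        known_responders = self.query_cache.get(query, set())
        for peer in self.peers:
            # Deterministically re-ask peers that answered this query before;
            # otherwise forward with low probability to limit the message explosion.
            if peer.name in known_responders or random.random() < FORWARD_PROBABILITY:
                answers = peer.handle_query(query, hint, ttl - 1)
                results += answers
                self.query_cache.setdefault(query, set()).update(n for n, _ in answers)
        return results

# Usage: node A has cached that node B answered this query before.
a = P2PIndexNode("indexA", {}, [])
b = P2PIndexNode("indexB", {"load(skynet.isi.edu)": 2.55}, [])
a.peers = [b]
a.query_cache["load(skynet.isi.edu)"] = {"indexB"}
print(a.handle_query("load(skynet.isi.edu)", "RETURN_ONE", ttl=2))   # [('indexB', 2.55)]
```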
Implementation of P2P Index Service
• Layered implementation
– A P2P Resource component maintains the overlay and processes messages
– An IndexServiceResource component processes queries
– "Almost" plug-and-play
• Gnutella-like message forwarding
• Updated using the standard MDS aggregator framework and not through the overlay
• Query Caching and Early Forwarding optimizations

Experiments
Experimental set-up
• Evaluate the overhead of applying a P2P overlay
• Evaluate wide-area performance
• Test beds: LAN at ISI, PlanetLab
– Mostly comparison tests on small networks
Applications
• Pegasus (a workflow planning application)
– Site and transformation catalogs used by Pegasus
• Simple random query client
– Artificial records
Metrics
• Time taken by Pegasus to generate a plan
• Query rates

Experiments: Overhead of P2P Layer
• Comparison of a single Index Service and a single P2P Index Service – an indicator of the P2P Index Service layer overhead
[Graph: time taken by Pegasus to generate a plan (seconds, 0–600) vs. number of data sets in the index (20–160), for the Index Service and the P2P Index Service]
• Pegasus planning a 100-job workflow
• Query overhead reasonably constant as the number of datasets in the index increases

Experiments: Wide Area Performance
• Scalability tests on 3 networks: 1000 datasets at each index, 2 peers per index
[Graph: queries/minute (0–300) vs. number of clients (4 threads each, 1–64 clients) for World WAN, US WAN and LAN]
• WAN measurements on the PlanetLab test bed
• 8 nodes, 2 peers per node, up to 256 concurrent client threads
• Query rates in the "World" WAN slightly higher than in the "US" WAN
– Higher load on the US PlanetLab nodes
– Query processing is compute intensive

P2P Index Service: Conclusions
• P2P organization of the GT4 Information Service
– Low overhead from adding a P2P layer
• Key design features:
– Separation of storage of information from querying for information (overlay used only to forward queries)
– Separation of access to information from access to resources (security model: choose what is exposed at the VO level and apply additional restrictions at the resource level)
• Simple optimizations help address issues with flooding (results not shown here)
– Query caching with probabilistic forwarding
– Early forwarding
• Future work
– Scale to larger sizes
– P2P version of the Replica Location Service using the overlay
– Experiment with replicating relatively static information

Outline
• Resource Discovery in Grids: current approaches
• Peer-to-Peer Resource Discovery
• Applying P2P techniques to Grid Resource Discovery
– P2P Information Service (joint work with Shishir Bharathi)
– P2P Replica Location Service (joint work with Min Cai)
• Summary and Future Work

Peer-to-Peer Replica Location Service
• Implemented a P2P Replica Location Service based on:
– Globus Toolkit Version 3.0 RLS
– A Chord structured peer-to-peer overlay network
"A Peer-to-Peer Replica Location Service Based on A Distributed Hash Table," Min Cai, Ann Chervenak, Martin Frank, Proceedings of SC2004 Conference, November 2004.
"Applying Peer-to-Peer Techniques to Grid Replica Location Services," Ann L. Chervenak, Min Cai, Journal of Grid Computing, 2006.
The Globus Replica Location Service
• The existing Globus Replica Location Service (RLS) is a distributed registry service that records the locations of data copies and allows discovery of replicas
• Maintains mappings between logical identifiers and target names
• Local Replica Catalogs (LRCs) contain logical-to-target mappings
• Replica Location Index nodes (RLIs) aggregate information about LRCs
• Soft-state updates are sent from LRCs to RLIs
[Figure: Replica Location Index nodes (RLIs) above Local Replica Catalogs (LRCs)]

Motivation for a Peer-to-Peer RLS
• Each RLS deployment is statically configured
– If an upper-level RLI fails, the lower-level LRCs need to be manually redirected
• More automated and flexible membership management is desirable for:
– Larger deployments
– Dynamic environments where servers frequently join and leave
• We use a peer-to-peer approach to provide a distributed RLI index for {logical-name, LRC} mappings
• Consistent with our security model: resource discovery at the RLI level, stricter security at the LRC level
• In P2P RLS, we replicate mappings, unlike in P2P MDS
– Easier to hash on a logical name than on arbitrary XML content
– Mappings are much less dynamic than resource information

P2P Replica Location Service (P-RLS) Design
• A P-RLS server consists of:
– An unchanged Local Replica Catalog (LRC) to maintain consistent {logical-name, target-name} mappings
– A peer-to-peer Replica Location Index node (P-RLI)
• The P-RLS design uses a Chord overlay network to self-organize P-RLI servers
– Chord is a distributed hash table that supports scalable key insertion and lookup
– Each node has log(N) neighbors in a network of N nodes
– A key is stored on its successor node (the first node with an ID equal to or greater than the key)
– Key insertion and lookup in log(N) hops
– Stabilization algorithm for overlay construction and topology repair

P-RLS Design (cont.)
• Uses the Chord algorithm to store mappings of logical names to LRC sites
– Generates the Chord key for a logical name by applying the SHA1 hash function
– Stores {logical-name, LRC} mappings on the P-RLI successor node of the mapping's key
• When a P-RLI node receives a query for the LRC(s) that store mappings for a logical name:
– It answers the query if it contains the logical-name-to-LRC mapping(s)
– If not, it routes the query to the successor node that contains the mappings
• Clients then query LRCs directly for mappings from logical names to replica locations
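A minimal sketch of the mapping placement just described: hash the logical name with SHA1 into the Chord key space, store the {logical-name, LRC} mapping on the key's successor, and (anticipating the successor-replication scheme discussed later) optionally replicate it on k successors. The ring size, node identifiers, and function names are illustrative assumptions, not the P-RLS code.

```python
# Sketch: placing and querying {logical-name, LRC} mappings on a Chord-style ring.
# Ring size, hashing, and data structures are illustrative assumptions.

import hashlib
from collections import defaultdict

M = 16                                   # assumed m-bit identifier space

def key_for(logical_name):
    """P-RLS derives the Chord key by applying SHA1 to the logical name."""
    return int(hashlib.sha1(logical_name.encode()).hexdigest(), 16) % (2 ** M)

def successors(key, node_ids, k=1):
    """The k nodes that follow the key on the identifier circle (root node first)."""
    ring = sorted(node_ids)
    start = next((i for i, n in enumerate(ring) if n >= key), 0)
    return [ring[(start + i) % len(ring)] for i in range(k)]

nodes = [4211, 18954, 30520, 41200, 55903]          # hypothetical P-RLI node identifiers
store = defaultdict(dict)                           # node id -> {logical name -> set of LRC URLs}

def rli_add(logical_name, lrc_url, replication_factor=2):
    """Store the mapping on the root successor and on k-1 further successors
    (successor replication), so it survives individual P-RLI failures."""
    for node in successors(key_for(logical_name), nodes, k=replication_factor):
        store[node].setdefault(logical_name, set()).add(lrc_url)

def rli_get_lrc(logical_name):
    """Route a lookup to the successor of the key and return the LRCs holding the mapping.
    The client then queries those LRCs directly for the physical replica locations."""
    root = successors(key_for(logical_name), nodes, k=1)[0]
    return store[root].get(logical_name, set())

rli_add("lfn1001", "rlsn://lrc1001")
print(rli_get_lrc("lfn1001"))                       # {'rlsn://lrc1001'}
```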
An Example of P-RLS Network
[Figure: SHA1("lfn1000") = 18, SHA1("lfn1001") = 52, SHA1("lfn1002") = 31; a client issues rli_get_lrc("lfn1001") at node N4, and the lookup(key52) is routed through the ring to N54, which holds <lfn1001, rlsn://lrc1001>; N20 holds <lfn1000, rlsn://lrc1000> and N40 holds <lfn1002, rlsn://lrc1002>]

P-RLS Implementation
• Implemented a prototype of P-RLS
• Extends the RLS implementation in Globus Toolkit 3.0
• Each P-RLS node consists of an unchanged LRC server and a peer-to-peer P-RLI server
• The P-RLI server implements the Chord protocol operations, including join, update, query, successor, probing and stabilization
• The LRC, RLI and Chord protocols are implemented on top of the RLS RPC layer
[Figure: P-RLS architecture — each P-RLS node contains an LRC server and a P-RLI server; the P-RLI server layers the LRC, RLI and Chord protocols (successor, join, update, query, probing, stabilization) over the RLS RPC layer; P-RLI servers connect to each other via the Chord network, and clients use the RLS client API]

P-RLS Performance
• P-RLS network runs on a 16-node cluster
• 1000 updates (add operations) on each node; updates overwrite existing mappings, with a maximum of 1000 mappings in the network
• Update latencies increase on a log scale with the number of nodes
[Graph: update latency (ms, 0–6) vs. number of nodes (1–15)]
P-RLS Measurements (cont.)
• Query latencies with 100,000 and 1 million mappings
• The total number of mappings has little effect on query times
– Uses a hash table to index mappings on each P-RLI node
• Query times increase on a log scale with the number of nodes
[Graph: query latency (ms, 0–5) vs. number of nodes (1–15), for 100,000 and 1,000,000 preloaded mappings]

Successor Replication in P-RLS
• Need better reliability when P-RLI nodes fail or leave
– Replicate mappings so they are not lost
• Min Cai proposes Adaptive Successor Replication:
– Extends the previously described scheme
– Replicates each mapping on the k successor nodes of the root node
• Provides reliability despite P-RLI failures
– No mappings are lost unless all k successors fail simultaneously
• Distributes mappings more evenly among P-RLI nodes as the replication factor increases
– Improves load balancing for popular mappings

Summary: P-RLS Work
• Implemented a P2P Replica Location Service based on:
– Globus Toolkit Version 3.0 RLS
– A Chord structured peer-to-peer overlay network
• Measured the performance of our P-RLS system with up to 15 nodes
– Query and update latencies increase at a rate of O(log N) with the size of the P-RLS network
• Simulated the performance of larger P-RLS networks
• Replication of mappings results in a more even distribution of mappings among nodes
• The successor replication scheme provides query load balancing for popular mappings

Related Work: P2P and Grids
Other approaches to applying P2P to Grid services
• GLARE
– Mumtaz Siddiqui et al., University of Innsbruck
– Structured P2P approach to Grid information services
• P2P Replica Location Service
– Matei Ripeanu et al., University of British Columbia
– Uses Bloom filters and an unstructured P2P network for a replica location service
Structured peer-to-peer networks
• Examples: Chord, CAN, Tapestry, Pastry
Unstructured peer-to-peer networks
• Examples: Gnutella, KaZaA, Gia

Summary
• Resource discovery services in Grids are well suited to P2P approaches
– Similar in function to Internet file sharing applications
• P2P approaches are attractive because
– The scale of Grids is growing larger
– Organization of hierarchical resource discovery services is challenging, especially at large scale
– Need self-configuration, self-healing
• Security, performance and practical issues remain to be overcome
• Two systems implemented and evaluated
– P2P Information Service: uses an unstructured overlay to support resource information specified in XML
– P2P Replica Location Service: uses a structured overlay to distribute mappings among indexes, with bounded query response

Future Work
• Continued research
– Larger scale
– Different overlay schemes
– Additional optimizations to improve query performance and reduce message counts
• Incorporate P2P techniques into real-world Grid resource discovery services
– Make the GT4-based peer-to-peer overlay available as an open source contribution to the Globus Toolkit
– Release the P2P Index Service component for MDS4
– Tested a version of the Replica Location Service using the unstructured GT4 overlay
– Additional improvements to services for easy, dynamic deployment to make these approaches practical