Survey
<1> A Peer-to-Peer Approach to Resource Discovery in Grid Environments (HPDC'02, by U. of Chicago)
Gisik Kwon
Nov. 18, 2002

Motivation
• The two broad classes of resource discovery systems, Grid and P2P, will eventually converge on the same goal
  – Large-scale, decentralized, self-configuring systems with complex functionality
• So a general guideline is needed for designing resource discovery systems
  – Proposes 4 axes (components) that guide the design of any resource discovery architecture
  – Presents an emulated environment and a preliminary performance evaluation with simulations

How: 4 axes (components) to be considered
• Membership protocol
  – How new nodes join and learn about the network
• Overlay construction function
  – Selects the set of active collaborators from the local membership list
• Preprocessing
  – Off-line preparation for better search performance
  – Pre-fetching (O) rather than caching (X)
  – E.g., dissemination of resource descriptions
• Request processing
  – Local processing: look up the requested resource in local information, process aggregated resources, ...
  – Remote processing: the request propagation rule

Evaluation: modeling the Grid environment (4 parameters)
• Resource information distribution and density
  – Some organizations share a large number of resources, others just a few
  – Some resources are common, others unique
• Resource information dynamism
  – Highly variable (CPU load) or static (CPU type)
• Request distribution
  – Pattern of users' requests
  – E.g., Zipf distribution, uniform, ...
• Peer participation
  – Varies over time much more in P2P than in Grid

Evaluation: preliminary experimental setup
• Optimistic Grid model
  – Static resource attributes, constant peer participation, no failures
• Passive membership protocol
  – A new node learns about the network out-of-band
  – When a new node makes contact, the membership list is enriched
• Overlay function accepts an unlimited number of neighbors
• No preprocessing
• Request processing
  – Perfect matching
  – 4 request propagation strategies

Evaluation: request propagation strategies (a minimal sketch follows the Grid vs. P2P comparison below)
• Random walk: choose a neighbor at random
• Learning-based: choose a neighbor that answered similar requests previously
• Best-neighbor: choose the neighbor that answered the largest number of requests
• Learning-based + best-neighbor: learning-based first, otherwise best-neighbor

Evaluation (figures)
• User requests vs. resource distribution
• Quantitative estimation of resource location costs

Grid vs. P2P
• Scale: Grid has thousands of peers and hundreds of users; P2P has hundreds of thousands of peers and users
• Centralization: Grid has some (centralized repositories for shared data); P2P has less than Grid, or no central point
• Functionality: Grid is complex (program execution, R/W access to data, resource monitoring, ...); P2P is specialized (file sharing, parallel computations)
• Participation: Grid is stable (long, predefined periods of time); P2P is unpredictable
• User behavior: Grid shows some homogeneity (incentives, policies); P2P is uneven (e.g., free riding in Gnutella)
• Nodes: Grid nodes are powerful with good connections; P2P nodes are highly heterogeneous
• System administration: Grid yes; P2P no
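To make the four propagation strategies concrete, here is a minimal Python sketch of how a node might pick the next hop under each strategy. The Node class, its bookkeeping fields, and the method names are illustrative assumptions, not code from the paper.

```python
import random

# Illustrative sketch (not the paper's code) of the four request-propagation
# strategies: each method returns the neighbor to forward a request to, based
# on bookkeeping the node could keep locally about its neighbors.

class Node:
    def __init__(self, neighbors):
        # neighbors: dict mapping neighbor id -> {"answered": int, "types": set}
        self.neighbors = neighbors

    def random_walk(self):
        # Forward to a uniformly random neighbor.
        return random.choice(list(self.neighbors))

    def learning_based(self, request_type):
        # Forward to a neighbor that previously answered a similar request, if any.
        candidates = [n for n, info in self.neighbors.items()
                      if request_type in info["types"]]
        return random.choice(candidates) if candidates else None

    def best_neighbor(self):
        # Forward to the neighbor that has answered the most requests so far.
        return max(self.neighbors, key=lambda n: self.neighbors[n]["answered"])

    def learning_then_best(self, request_type):
        # Combined strategy: learning-based first, best-neighbor otherwise.
        return self.learning_based(request_type) or self.best_neighbor()
```

For example, a node with neighbors A (answered 3 requests, including ones of type "cpu") and B (answered 7, none similar) would forward a "cpu" request to A under learning_then_best, but a "memory" request would fall back to best-neighbor and go to B.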
<2> Adaptive Replication in Peer-to-Peer Systems (ICS'02, by UMD)
Gisik Kwon
Nov. 18, 2002

Motivation
• Recent P2P systems handle uniform query demand well
• But demand can be heavily skewed
• So a lightweight, adaptive, system-neutral replication mechanism is needed to handle skewed demand
  – Proposes LAR
  – Evaluated on the Chord and TerraDir systems with simulation

How
• LAR uses two kinds of soft state: caches and replicas
  – Why "soft": they are created and destroyed by local decision, without any coordination with the item's home
• Caches
  – Consist of <data item label, item's home, home's address (IP), a set of known replica locations>
  – Use an LRU replacement strategy
• Replicas
  – Contain the item data itself
  – Plus more state: the home's address (IP), the home's neighbors, and known replicas
  – Must be advertised

How: load balancing by creating replicas (a minimal sketch follows at the end of this section)
• Each time a packet is routed, the current load l_i of server s_i is checked
• Load is defined as the number of messages sent to a server during a time unit
• If l_i > l_max (overloaded): s_i creates a replica at s_j (if l_i > l_j)
• If l_low <= l_i <= l_hi (highly loaded): s_i creates a replica at s_j only if l_j <= l_low
• After a replica is created, it is disseminated
  – 2/32 policy: at most 2 advertisements per message, 32 per server

Evaluation: setup
• Based on the Chord simulator
  – Single network hop = 25 ms
• Query distribution
  – Follows a Poisson distribution
  – Average input rate = 500 queries/sec
  – Default skewed input, "90-1": 90% of queries target one item, the other 10% are randomly distributed
• 1K servers, 32K data items
• l_max = 10/sec, l_hi = 0.75 * l_max, l_low = 0.3 * l_max
• Queue length = 32
• Default load window size = 2 sec
• Dissemination policy = 2/32

Results (figures)
• Static vs. adaptive replication
• Load balancing (l_hi = 0.75)
• Parameter sensitivity
• Scalability

Chord lookup example (figure)
• Ring with nodes N32, N40, N52, N60, N70, N79, N80, N85, N102, N113; finger tables shown for N32, N70, and N80
• Node 32, lookup(82): 32 -> 70 -> 80 -> 85

TerraDir (figure)
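As a concrete illustration of the replica-creation rule summarized in the load-balancing slide above, here is a minimal Python sketch. The Server class, the function name, and the way loads are supplied are assumptions for illustration; the threshold values follow the slide (l_max = 10/sec, l_hi = 0.75 * l_max, l_low = 0.3 * l_max). This is not the paper's implementation.

```python
# Minimal sketch of LAR's replica-creation rule as summarized above (not the
# paper's code). Loads are messages received per time unit, measured over a
# load window; thresholds follow the slide's evaluation setup.

L_MAX = 10.0
L_HI = 0.75 * L_MAX
L_LOW = 0.3 * L_MAX

class Server:
    def __init__(self, name, load=0.0):
        self.name = name
        self.load = load          # messages/sec over the current load window
        self.replicas = set()     # labels of items replicated onto this server

def maybe_replicate(s_i, s_j, item_label):
    """Decide whether s_i should create a replica of item_label at s_j.

    Called each time s_i routes a message for item_label toward s_j.
    """
    if s_i.load > L_MAX and s_i.load > s_j.load:
        # Overloaded: shed load onto any less-loaded server.
        s_j.replicas.add(item_label)
        return True
    if L_LOW <= s_i.load <= L_HI and s_j.load <= L_LOW:
        # Highly loaded: replicate only onto a lightly loaded server.
        s_j.replicas.add(item_label)
        return True
    return False

# Example: an overloaded server sheds load to a lighter neighbor.
a, b = Server("a", load=12.0), Server("b", load=4.0)
maybe_replicate(a, b, "item-42")   # True; "item-42" is now replicated at b
```

A fuller implementation would also advertise each new replica under the 2/32 dissemination policy described above (at most 2 advertisements per message, 32 per server); that bookkeeping is omitted here.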