Performance Issues in P2P File Sharing Systems
Krishna Kant
Ravi Iyer
Vijay Tewari
Intel Corporation
(With contributions from Peter King, Heriot-Watt Univ.)
www.intel.com/labs
Outline
• Part I: P2P Computing
  - Overview of P2P applications
  - Overview of distributed computing frameworks
  - P2P services & their requirements
  - New research issues introduced by P2P
• Part II: Performance Study
  - Issues in network modeling
  - P2P file sharing issues
  - Introduce a tool and some sample results
  - Additional issues to investigate
April 14, 2002
Kant, Iyer & Tewari, Performance Issues in P2P file-sharing systems
2
P2P Beginnings
• Interest kindled by distributed file-sharing applications
  - Napster: Mediated digital music swapping. (http://www.napster.com)
[Figure: Napster lookup. (1) Peer A asks the mediator "Where is X?"; (2) the mediator answers "Peer B has it"; (3) Peer A copies X directly from Peer B.]
P2P Beginnings
• Gnutella: Fully distributed file sharing. (http://gnutella.wego.com)
• Freenet: Distributed file sharing with anonymity and key-based search. (http://freenet.sourceforge.net)
[Figure: Gnutella/Freenet search among Peers A, B, C and D, steps numbered 1 to 6. Peer A's query "Where is File (Key) X?" is forwarded from peer to peer; the reply "C: I have it." travels back to Peer A, which then fetches File X with "GET File (Key) X (HTTP)".]
We had them already!
• Using idle CPU cycles on home PCs, e.g., SETI@home
  - Involves scanning radio telescope data for signs of extraterrestrial life.
  - Chunks of data are downloaded by home PCs and processed, and the results are returned to the coordinator.
  - Similar schemes used for other heavy-duty computational problems.
• Idle disk and main memory on workstations exploited in a number of network of workstations (NOW) projects.
[Figure: A master sends raw data to Peers 1 to 4; each peer does the data crunching and returns processed data to the master.]
Newer Applications
• P2P streaming media distribution
  - CenterSpan (C-Star Multisource Peer Streaming)
    - Mediated, secure P2P platform for distributing digital content.
    - Partition content and encrypt each segment. Distribute segments amongst peers. Redundant distribution for reliability.
    - Download segments from local cache, peers or seed servers.
    - http://www.centerspan.com
  - vTrails
    - vtCaster: At stream source. Creates network topology tree based on end users (vtPass client software).
    - Dynamically optimizes tree.
    - Content distributed in a tiered manner.
    - http://www.vtrails.com
Newer Applications
• P2P Collaboration Networks
  - A variety of applications: telemedicine, military planning, videoconferencing, document editing.
  - A group of peers discover one another and form an ad hoc network.
  - Peers set up communication channels & distribute objects.
  - Peers do arbitrary real-time computation, perhaps involving multiparty synchronization.
• Example: Groove (http://www.groove.net)
  - Real-time, small-group interaction and collaboration.
  - Fundamental notion of a “shared space”.
  - Each member of the group owns a copy of the “shared space”.
  - Changes made to the “shared space” by one user are propagated to all others (store and forward if some member is offline).
  - Secure platform (PKI for authentication, end-to-end encryption, digitally signed components).
So, what is P2P?
• Hype: A new paradigm that can
  - Unlock vast idle computing power of the Internet, and
  - Provide unlimited performance scaling.
• Skeptic’s view: Nothing new, just distributed computing “rediscovered” or made fashionable.
• Reality: Distributed computing on a large scale
  - No longer limited to a single LAN or a single domain.
  - Autonomous nodes, no controlling/managing authority.
  - Heterogeneous nodes intermittently connected via links of varying speed and reliability.
• A tentative definition:
  - An uncoordinated dynamic network (peers can come & go as they please).
  - No central controlling or managing authority.
  - A node can act both as a “client” and as a “server”.
P2P Platforms
• Legion, University of Virginia; now owned by Avaki Corp.
• Globe, Vrije Univ., Netherlands
• Globus, developed by a consortium including Argonne Natl. Lab and USC’s Information Sciences Institute.
• JXTA, open source P2P effort started by Sun Microsystems.
• .NET by Microsoft Corp.
• WebOS, University of Washington
• Magi, Endeavors Technology
• Groove Networks
• PAST (Rice), OceanStore (Berkeley) (persistent storage)
• CAN (content addressable network)
• CHORD (P2P lookup service)
• Several others not mentioned here.
Avaki (Legion)
• Objective: Wide-area O/S functionality via distributed objects.
• Middleware infrastructure for distributed resource sharing in a mutually distrustful environment.
• Global O/S services built on top of local O/S.
*Source: Peer-to-Peer Computing by David Barkai (Intel Press)
Avaki (Legion)
• Naming: LOID (Location-Independent Object ID), current object address & object name.
• Persistent object space: generalization of a file system (manages files, classes, hosts, etc.).
• Communication: RPC-like, except that the results can be forwarded to the real consumer directly.
• Security: RSA keys are a part of LOIDs; encryption, authentication and digesting provided.
• Local autonomy: Objects call local O/S services for all management, protection and scheduling.
• Active objects: Objects represent both processes and methods.
• Overall:
  - A comprehensive WAN O/S for distributed computing.
  - Not targeted as a general P2P enabler.
Globe
• Objective: Another model for WAN O/S.
• Distributed passive object model. Processes are separate entities that bind to objects.
• Each object consists of 4 subobjects:
  - Semantics subobject for functionality.
  - Communication subobject for inter-object communication.
  - Replication subobject for replica handling, including consistency maintenance.
  - Control subobject for control flow within the object.
• Binding to an object includes two steps:
  - Name & location lookup and contact address creation.
  - Selecting an implementation of the interface.
• Overall:
  - Similar to Legion, except that processes and objects are not tightly integrated.
Globus
• Objective: Grid computing, integration of existing services.
• Defines a collection of services, e.g.,
  - Service discovery protocol
  - Resource location & availability protocol
  - Resource replication service
  - Performance monitoring service
• Any service can be defined and becomes part of the “system”.
• Higher-level services can be built on top of basic ones.
• Preserves site autonomy. Existing legacy services can be offered unaltered.
• Overall:
  - Provides excellent reusability of existing services.
  - Unconstrained toolbox approach => difficult to join two “islands”.
JXTA
• Objective: A low-level framework to support P2P applications:
  - Avoids any reference to specific policies or usage models.
  - Not targeted for any specific language, O/S, runtime environment, or networking model.
  - All exchanges are XML based.
• Base concepts for:
  - Peers & peer groups: An arbitrary grouping of peers; group members share resources & services.
  - Pipes: Unidirectional, asynchronous communication channels. A peer can dynamically connect/disconnect to any existing pipe within the peer group.
  - Advertisements: A “properties” record needed for name resolution, availability, etc. Specified as an XML document.
  - Messages: Arbitrary sized, w/ source and destination addresses in URI form.
• At the highest abstraction, defines a set of protocols using the base concepts:
  - Peer Discovery Protocol: Discovery of peers, resources, peer groups, etc.
  - Peer Resolver Protocol
  - Peer Information Protocol
  - Peer Membership Protocol
  - Pipe Binding Protocol
  - Peer Endpoint Protocol
JXTA
Source: White Paper on Project JXTA: A Technology Overview by Li Gong
Microsoft .NET in the context of P2P
• Objective: An enabler of general XML/SOAP-based web services.
• Message transfer via SOAP (Simple Object Access Protocol) over HTTP.
• Kerberos-based user authentication.
• Extensive class library.
• Emphasizes global user authentication via the Passport service (user distinct from the device being used).
• HailStorm supports personal services which can be accessed via SOAP from any entity.
MAGI
• Enabler for collaborative business applications.
*Source: Peer-to-Peer Computing by David Barkai (Intel Press)
Magi
• Magi: Micro-Apache Generic Interface, an extension of the Apache project.
• Superset of HTTP using
  - WebDAV: Web distributed authoring & versioning protocol, which provides locking services, discovery & assignment services, etc. for web documents.
  - SWAP (simple workflow access protocol), which supports interaction between running services (e.g., notification, monitoring, remote stop/synchronization, etc.)
• Intended for servers; client interface is HTTP.
WebOS
• Objective: WAN O/S that can dynamically push functionality to various nodes depending on loading.
• Outgrowth of the Berkeley NOW (network of workstations) project.
• Consists of a number of components
  - Global naming: Mapping a service to multiple nodes, load balancing & failover.
  - Wide-area file system (with transparent caching and cache coherency).
  - Security & authentication w/ fine-grain capability control.
  - Process control: Support for remote process execution.
• Project no longer active; parts of it are being used elsewhere.
• Overall: Dynamic configurability useful for P2P environment.
Groove
• Groove (http://www.groove.net)
• Real-time, small-group interaction and collaboration.
• Fundamental notion of a “shared space”
  - Each member of the group owns a copy of the “shared space”.
  - Changes made to the “shared space” by one member are propagated to each member of the group (store and forward if some member is offline).
• Platform is secure.
  - PKI for user authentication.
  - End-to-end encryption.
  - Groove components are digitally signed.
• Specifically a Windows-based implementation.
Requirements for P2P Applications
• Local autonomy: No control or management by a central authority.
• Scalability: Support collaboration of an arbitrarily large number of nodes.
• Security & Privacy: All accesses are authenticated and authorized.
• Fault Tolerance: Assured progress with up to k failures anywhere.
• Interoperability: Any peer that follows the protocol can participate irrespective of platform, OS, etc.
• Responsiveness: Satisfy the latency expectations of the application.
• Non-imposing: Allows the machine user full resource usage whenever desired without affecting responsiveness.
• Simplicity: Setting up a P2P application or participating in one should require a minimum of manual intervention.
• Auto-optimization: Ability to dynamically reconfigure the application (no. of nodes, functionality, etc.)
• Extensibility: Dynamic addition of functionality.
Some P2P Services
• Network services
  - Enable communication directly and via firewalls and in the face of intermittent connectivity.
  - Naming, discovery and membership protocols.
• Data and metadata services
  - Generic mechanism for publishing and obtaining metadata for various resources (devices, CPU, memory, files, etc.)
• Event and exception management services (publish and subscribe model)
• Low-level file and storage services
• Security services
  - Key distribution, authentication, encryption.
• Advanced services:
  - Digital rights management.
  - Administration, auditing and resource management services.
  - High-level file services akin to a virtual file system.
  - User and group management services.
  - Replication and migration services.
From Services to possible Layers
[Figure: layer stack with the layers Location Independent Services; Sharable Resources; Naming, Discovery, Directory; Administration, Monitoring; Identity, Presence, Community; Security; Availability; Communications. "Policies" and "Standards" appear as labels alongside the stack.]
Callouts shown for the lower layers:
• Security: authorization, integrity, privacy, web of trust, certification, DRM
• Availability: availability from unreliable components; replication; striping; failover; guaranteed message queuing
• Communications: transport and data protocols for interoperability; common protocols (IP, IPv6, sockets, http, XML, SOAP, . . .); NAT and firewall solutions; roaming, intermittent connectivity
From Services to possible Layers
[Figure: layer stack with the layers Location Independent Services; Sharable Resources; Naming, Discovery, Directory; Administration, Monitoring; Identity, Presence, Community; Security; Availability; Communications; plus the labels "Policies" and "Standards". Callouts annotate the upper layers.]
Callouts shown:
• Local autonomy; int. allocation of resources; self administration (reliable whole from unreliable parts)
• CPU, storage, memory; bandwidth; I/O devices
• Payment tracking
• Capability discovery; metadata management; discovery & location of peers, services, resources, users
• Name space management
• Resource monitoring
• User / group identity; authentication
• Persistence: beyond a session; across multiple devices
P2P Research Issues
• Communication:
  - Communicating with peers behind NAT devices and firewalls.
  - Naming and addressing peers that do not have DNS entries.
  - Coping with intermittent connectivity & presence (e.g., queued transfers).
• Security and protection
  - Authentication of users independent of devices.
  - Digital rights management.
  - Access control in a mutually suspicious environment (host machine & resident foreign objects cannot trust one another).
• Topological mapping:
  - A P2P network is typically an ad hoc overlay network.
  - Usually a severe mismatch between application communication pattern and physical topology.
  - For planned collaborations, need to reduce this mismatch.
P2P Research Issues
• Unobtrusive use by machine owner
  - A mechanism to measure & control resource usage.
  - Low-latency service handoff protocols to allow machine owner takeover.
  - On-demand task migration w/o breaking the application.
• Information location and retrieval
  - Efficient distributed information location & need-based content migration.
• Intelligent object retrieval
  - Retrieval by properties rather than URL.
  - Need distributed indexing mechanisms.
  - Directing searches to more promising and less loaded nodes.
  - Intelligent caching of search results.
• Architectural features
  - Efficiently propagate requests & responses w/o much CPU involvement.
  - Squelch duplicate, orphaned or very late responses.
  - Stitch traffic from multiple paths to reduce latency or losses for real-time applications.
Scalability Issues
• Many problems are well studied in the distributed systems context, but need to be revisited.
• Need scalability to a huge number of peers (e.g., 100M):
  - Peer state management for a huge number of peers.
  - Discovery and presence management w/ an essentially infinite set of potential peers.
  - Certificate management and authentication for a huge user base over a varied set of devices.
  - Geographically distributed load balancing.
  - Multiparty synchronization and communication.
Part II: Performance Study
Goals:
1. Define a performance model including
   - Network model
   - File storage and access model
2. Introduce a tool and discuss sample results.
www.intel.com/labs
P2P Network Characteristics
• Desirable characteristics
  - Adequate representation of the ad hoc nature of the network.
  - Expected to contain a few special sites (well-known, content rich, substantial resources, etc.)
  - Heavy-tailed nature of connectivity.
• Other issues
  - Dynamic changes to the network
    - Direct modeling not required if rate of change << request rate.
    - Metadata consistency issues still need to be considered.
  - Mapping of virtual P2P network on physical network
    - P2P applications generally don’t pay attention to mapping.
    - “Virtual links” between P2P neighbors are essentially statistically identical.
    - Better modeling is possible, but difficult to calibrate.
P2P Node & Link Models
• Consider a 3-tier model for nodes
  - tier-1: Well-known, resource-rich, always on & part of network.
    - Similar to traditional server nodes (globally known sites in Gnutella).
    - Henceforth called distinguished nodes.
  - tier-2: “Hub” nodes (reasonably resource rich & mostly on)
    - Contribute storage/files in addition to requesting them.
    - May join/leave the network, but at time-scale >> req-response time.
    - Henceforth called undistinguished nodes.
  - tier-3: Infrequently connected or primarily “client” functionality
    - No need to represent these explicitly in the network.
    - Requests/responses from these appear to originate from the tier-1/2 nodes that they home on.
• A very simple link model
  - Physical topology ignored; each “link” treated like a single pipe.
    => Links uninteresting from a topological perspective.
P2P Network Model
• Use a random graph model to represent topology.
  - Traditional G(n,p) RG model too simplistic.
• Use a 2-tier non-uniform model built as follows:
  - Start with a degree-Kd regular graph of Nd dist. nodes.
  - Add Nu undistinguished nodes sequentially as follows:
    - The new node connects to K other nodes.
    - K: const or an integer-valued RV in range 1..Kmax.
    - Each connection targets an undistinguished node with prob qu (this may not be possible for the first Kmax nodes).
    - Dist. node target: uniform distribution over all dist. nodes.
    - Undist. node target: Zipf(α) over existing undist. nodes.
    - At most one connection allowed between any pair of nodes.
• α controls the decay rate of nodal degree.
  - α = 0 => uniform dist. => very slow decay. Used here for simplicity.
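The construction steps above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the authors' tool: the regular graph is built as a ring lattice (which needs an even Kd smaller than Nd), K is drawn uniformly from 1..Kmax, and the Zipf rank of an undistinguished node is taken to be its order of arrival. The function name `build_p2p_graph` and all parameter names are hypothetical.

```python
import random

def build_p2p_graph(n_dist, k_dist, n_undist, k_max, q_undist, alpha=0.0, seed=0):
    """Return {node: set(neighbours)}; nodes 0..n_dist-1 are distinguished."""
    assert k_dist % 2 == 0 and k_dist < n_dist  # ring-lattice assumption
    rng = random.Random(seed)
    adj = {i: set() for i in range(n_dist)}
    # Degree-k_dist regular graph: link each distinguished node to its
    # k_dist/2 nearest ring neighbours on each side.
    for i in range(n_dist):
        for step in range(1, k_dist // 2 + 1):
            j = (i + step) % n_dist
            adj[i].add(j)
            adj[j].add(i)
    undist = []  # undistinguished nodes in order of arrival
    for new in range(n_dist, n_dist + n_undist):
        adj[new] = set()
        k = rng.randint(1, k_max)  # K uniform on 1..Kmax (assumption)
        attempts = 0
        while len(adj[new]) < k and attempts < 100 * k:
            attempts += 1
            if undist and rng.random() < q_undist:
                # Zipf(alpha) over existing undistinguished nodes, rank = arrival order
                weights = [1.0 / rank ** alpha for rank in range(1, len(undist) + 1)]
                target = rng.choices(undist, weights=weights)[0]
            else:
                target = rng.randrange(n_dist)  # uniform over distinguished nodes
            if target not in adj[new]:  # at most one connection per pair
                adj[new].add(target)
                adj[target].add(new)
        undist.append(new)
    return adj

graph = build_p2p_graph(n_dist=10, k_dist=4, n_undist=90, k_max=4, q_undist=0.5)
print(len(graph), "nodes")
```

With alpha = 0 the weights are all equal, so the undistinguished target is chosen uniformly, matching the simplification used on the slide.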
Graph Construction steps
[Figure: four panels showing the construction: initial graph with dist. nodes only; add undist. node 1; add undist. node 2; add undist. node 3.]
Topological properties
• Some network properties can be derived analytically.
• Outline of analysis (see http://kkant.ccwebhost.com/download.htm)
• Degree distribution:
  - Distinguished nodes are at level 0; each new node defines a new level.
  - Pn(l2,l): Prob(level l node has degree n when current level = l2)
  - Get recurrence eqns for Pn(l2,l) & hence its PGF f(z | l2,l).
  - Get avg degree Dat(l2,l) at level l when current level = l2.
  - Can be adapted for computing the undistinguished degree of a node.
• No. of nodes reached in h hops:
  - Rh matrix: Rh(i,j) is the prob of reaching level i from level j in exactly h hops.
  - Compute Rh(i,j) by enumerating all unique paths of length h.
  - Compute G(l2,h), avg no. of nodes reached in h hops starting from a level l2 node.
• Request and response traffic at a level l node:
  - nreqs = no. of requests reaching undist. nodes in h hops: $n_{reqs} = 1 + \sum_h G_u(l_2,h)$
  - $n_{resps} = 1 + \sum_h h\,G(l_2,h)$, since a response from h hops away goes through h nodes.
• Nodal utilization & node engineering:
  - Easy to ensure that nodal utilizations do not exceed some limits.
  - Queuing properties generally intractable; explored via simulation.
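For a single concrete graph, the hop-count quantities can be measured directly with a breadth-first search. The sketch below is an empirical counterpart to the analytic G(l2,h), which is an average over the random-graph model; here the counts come from one instance only. It reads the per-hop term as the number of nodes first reached at exactly hop h, which is an interpretation rather than something the slide states. The helper names and the 8-node ring used as input are hypothetical.

```python
from collections import deque

def nodes_per_hop(adj, start, max_hops):
    """Return [G(1), ..., G(max_hops)]: nodes first reached at each hop."""
    dist = {start: 0}
    queue = deque([start])
    counts = [0] * (max_hops + 1)
    while queue:
        u = queue.popleft()
        if dist[u] == max_hops:
            continue  # hop limit reached, do not expand further
        for v in adj[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                counts[dist[v]] += 1
                queue.append(v)
    return counts[1:]

def response_traffic(per_hop):
    # 1 + sum_h h * G(h): a response from h hops away passes through h nodes
    return 1 + sum(h * g for h, g in enumerate(per_hop, start=1))

# Toy input: a ring of 8 nodes
ring = {i: {(i - 1) % 8, (i + 1) % 8} for i in range(8)}
per_hop = nodes_per_hop(ring, 0, 3)
print(per_hop, response_traffic(per_hop))  # [2, 2, 2] 13
```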
Sample Results - 100 nodes
Parameters: 10 dist nodes, 90 undist nodes; Init_deg = 4; Undist deg = 1..4 (uniform distribution); undist connection probs 0.05, 0.50, 0.95.

undist  no_of  nodes    undist   resps   traf
prob    hops   reached  reached  /node   /node
0.05    1        5.9      3.3      4.9     6.1
0.05    2       55.2     44.5    103.6   146.5
0.05    3       99.1     85.8    235.2   320.5
0.05    4      100       90.0    238.8   328.8
0.05    5      100       90.0    238.8   328.8
0.50    1        5.9      4.3      4.9     8.4
0.50    2       34.3     23.8     61.7    82.3
0.50    3       91.0     73.9    231.7   304.0
0.50    4       99.9     89.4    267.5   356.9
0.50    5      100       89.6    267.7   357.3
0.95    1        5.9      5.3      4.9    10.6
0.95    2       28.6     22.6     50.3    73.6
0.95    3       76.7     63.8    194.6   258.4
0.95    4       98.5     87.4    281.8   369.2
0.95    5       99.7     89.3    287.8   377.2
Sample Results - 500 nodes
Parameters: 50 dist nodes, 450 undist nodes; Init_deg = 4; Undist deg = 1..4 (uniform distribution); undist connection probs 0.05, 0.50, 0.95.

undist  no_of  nodes    undist   resps    traf
prob    hops   reached  reached  /node    /node
0.05    1        6.0      3.6       5.0      6.2
0.05    2      243.7    232.7     480.5    711.5
0.05    3      499.7    488.6    1248.4   1737.0
0.05    4      500.0    490.0    1249.6   1739.6
0.50    1        6.0      4.7       5.0      8.5
0.50    2       95.7     84.2     184.3    264.6
0.50    3      483.5    465.1    1347.8   1812.4
0.50    4      500.0    490.0    1413.9   1903.9
0.95    1        6.0      5.8       5.0     10.7
0.95    2       35.1     29.1      63.2     91.7
0.95    3      163.5    137.1     448.3    582.4
0.95    4      405.7    367.7    1417.2   1782.7
Simulation of Random Graphs
• Simulating a random graph is a hard problem
  - The model represents a large number of possible topologies.
  - Too many instances to simulate explicitly and then average the results.
  - Example: 2 dist & 3 undist nodes, each connects to 2 nodes => 6 distinct topologies.
• Possible approaches to simulation:
  - Average case analysis.
  - Constrained model (limit the number of instances).
  - Direct simulation of probabilistic model.
Average case analysis
• Intended environment
  - To study performance of an “average” network defined by the RG model.
  - No dynamic changes to the topology possible.
• Graph construction
  - Start with the regular graph of distinguished nodes (as usual).
  - For adding undist. nodes, work with only the avg. connectivities Kd & Ku for an incoming node.
  - Always connect to the existing node with min. connectivity.
  - ⌊Kd⌋ & ⌈Kd⌉ can be used successively to handle non-integer Kd values (similarly for Ku).
• Characteristics/issues
  - Simple, only one graph to deal with in simulation.
  - Gives correct avg. reachability and avg. nodal utilizations.
  - All queuing metrics (including avg. response time) are underestimated.
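A small sketch of this deterministic construction follows. It assumes that a non-integer average connectivity is handled by alternating between the integer values just below and just above it, and that ties between equally connected nodes are broken by node id; neither detail is spelled out in the slides. The helper names and the 4-node starting ring are hypothetical.

```python
import math

def spread_degrees(k_avg, n):
    """n integer degrees, each floor(k_avg) or ceil(k_avg), summing to floor(k_avg * n)."""
    return [math.floor(k_avg * i) - math.floor(k_avg * (i - 1)) for i in range(1, n + 1)]

def connect_to_least_loaded(adj, new, candidates, k):
    """Link `new` to the k candidates with the fewest edges (ties broken by node id)."""
    for target in sorted(candidates, key=lambda v: (len(adj[v]), v))[:k]:
        adj[new].add(target)
        adj[target].add(new)

# Toy run: a 4-node ring of distinguished nodes, then 4 undistinguished nodes
# with an average of 1.5 links each to distinguished nodes.
adj = {i: {(i - 1) % 4, (i + 1) % 4} for i in range(4)}
dist_nodes = list(range(4))
for new, k in zip(range(4, 8), spread_degrees(1.5, 4)):
    adj[new] = set()
    connect_to_least_loaded(adj, new, dist_nodes, k)
print(sorted(len(adj[v]) for v in dist_nodes))  # [3, 3, 4, 4]
```

Because every new node attaches to the least connected targets, the degrees stay as even as possible, which is why a single graph suffices for this approach.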
Constrained Connectivity
• Intended environment
  - To capture the most likely scenarios of connectivity.
  - Accommodate both static topology and slowly changing topology.
• Graph construction and simulation
  - For the entering level l2 node, analytically estimate Dat(l2,l) at all l.
  - Allow connection to a level l node only if degree(l) falls in the range (min..max) × Dat(l2,l).
  - Found that min = 0.5 and max = 1.5 is quite adequate.
  - Generate a limited set (~100) of instances of the graph.
  - During simulation, each query randomly selects one instance.
• Characteristics/issues
  - Avoids highly asymmetric topologies => queuing properties may be underestimated.
  - All generated instances are given equal weight. Relative weights can be estimated, but this is very expensive.
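The degree constraint amounts to a simple filter on candidate targets. The sketch below assumes the expected degrees Dat(l2,l) have already been computed elsewhere and are passed in as a dictionary; the function name and the sample numbers are hypothetical.

```python
def allowed_targets(degree, expected_degree, min_mult=0.5, max_mult=1.5):
    """Nodes whose current degree lies within [min_mult, max_mult] x expected degree."""
    return [
        node
        for node, d in degree.items()
        if min_mult * expected_degree[node] <= d <= max_mult * expected_degree[node]
    ]

current = {0: 4, 1: 9, 2: 2}           # current degrees of three existing nodes
expected = {0: 5.0, 1: 5.0, 2: 5.0}    # analytic averages (made-up values)
print(allowed_targets(current, expected))  # [0]
```

Node 1 is rejected as too well connected and node 2 as too sparsely connected, which is how the method avoids highly asymmetric instances.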
Probabilistic Graph Emulation
• Intended environment
  - To study overall performance when topology is defined by the RG model.
  - Accommodate fast-changing or unstable topologies.
• Method:
  - For each node i, estimate the relative prob qij of having an edge to node j ≠ i.
  - A query received from node k at node i is sent to node j with prob qij/(1-qik).
  - This virtual topology for the query is used to return responses as well.
• Characteristics/issues
  - Method depends on analytic calculation of edge probabilities to neighbors.
  - A single simulation automatically visits various instances in the correct proportion.
  - No explicit control over which instances are visited => reliable results may take a very long time.
  - Very difficult to handle complex operations (e.g., file migration).
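The forwarding rule can be illustrated as follows, reading the qij at node i as relative probabilities that sum to 1 over all other nodes (an assumption about the normalization). Excluding the sender k and dividing by 1 - qik then gives a proper distribution over the remaining nodes. The function name and the three-neighbor example are hypothetical.

```python
def forward_probs(q_i, sender):
    """q_i: {j: q_ij} relative edge probabilities at node i (assumed to sum to 1)."""
    scale = 1.0 - q_i.get(sender, 0.0)
    return {j: q / scale for j, q in q_i.items() if j != sender}

q_i = {"a": 0.5, "b": 0.3, "c": 0.2}
probs = forward_probs(q_i, sender="a")
print(probs)  # b gets about 0.6, c about 0.4
```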
File Size & access distribution
• Using a 2-segment model:
  - Small sizes: Distribution generally irregular; uniform is a reasonable model.
  - Pareto tail with decay rate $1 < \alpha < 2$ is quite reasonable.
• Adopted distribution:
  - Uniform dist. in the small-size range 400 bytes to 4 KB.
  - Pareto distribution with a min value of 4 KB and mean of 40 KB => $\alpha = 1.11$.
  - 40 KB mean is typical for web pages, but too small for MP3 files.
• “File category” provides a link between file size and its “popularity”. Needed to model the higher access rate of small files.
  - Chose 9 categories (equally spaced in log domain), with boundaries
    400B, 1.265KB, 4KB, 12.65KB, 40KB, 126.5KB, 400KB, 1.265MB, 4MB, 12.65MB
• File access distribution:
  - Across categories, distribution specified by a discrete mass function:
    (0.07, 0.14, 0.2018, 0.20, 0.14, 0.098, 0.0686, 0.048, 0.0336)
  - This increases linearly first and then decays geometrically w/ factor 0.7.
  - Within each category, assume uniform access distribution.
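The numbers on this slide can be checked with a few lines of arithmetic. The sketch uses the standard Pareto mean, mean = α·min/(α - 1), solved for α, and regenerates the category boundaries as 400·10^(i/2) bytes; the variable and function names are hypothetical.

```python
def pareto_shape(min_size, mean_size):
    # mean = alpha * min / (alpha - 1)  =>  alpha = mean / (mean - min)
    return mean_size / (mean_size - min_size)

alpha = pareto_shape(4.0, 40.0)           # sizes in KB
print(round(alpha, 2))                    # 1.11

# Category boundaries equally spaced in the log domain (half a decade apart)
boundaries = [400 * 10 ** (i / 2) for i in range(10)]
print([round(b) for b in boundaries[:3]])  # [400, 1265, 4000]

category_prob = [0.07, 0.14, 0.2018, 0.20, 0.14, 0.098, 0.0686, 0.048, 0.0336]
print(round(sum(category_prob), 6))       # 1.0
# Successive ratios in the geometric tail, each close to 0.7
tail_ratios = [category_prob[i + 1] / category_prob[i] for i in range(3, 8)]
print([round(r, 2) for r in tail_ratios])
```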
File Copy parameters
• Each search in a P2P network may result in multiple “hits”.
• Need only the dist. of hits; precise modeling of the search mechanism not needed.
• Use file copies for this:
  - Each file has C copies in the range (1..Cmax) with a given distribution.
  - A file is now identified by the triplet (category, file_no, copy_no), where file_no is a unique id (e.g., sequence no.) of files in a category.
• This allows the following capabilities:
  - Unique searches specified by the file-id triplet.
  - Non-unique searches specified by (category, file_no).
  - Replication control and fault-tolerant operation.
• File copy parameters:
  - Distribution may be related to the nature of the file (not considered here).
  - Separate distributions allowed for files allocated to dist & undist nodes.
  - Assuming a triangular distribution with Cmax = 20 and mode Cmode = 5 for all nodes => mean no. of copies = 8.667.
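The quoted mean follows from the triangular-distribution formula (min + max + mode)/3 with min = 1. The sketch below checks it and draws sample copy counts using Python's standard `random.triangular`; rounding a continuous sample to an integer is an assumption made here, and the tool's own integer variant may behave differently. The function names are hypothetical.

```python
import random

def triangular_mean(low, high, mode):
    return (low + high + mode) / 3.0

def sample_copies(rng, c_max=20, c_mode=5):
    # Assumption: round a continuous triangular sample and clamp to 1..c_max
    return max(1, min(c_max, round(rng.triangular(1, c_max, c_mode))))

print(round(triangular_mean(1, 20, 5), 3))  # 8.667
rng = random.Random(0)
print([sample_copies(rng) for _ in range(5)])
```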
File Assignment to Nodes
• Assignment of copies to nodes:
  - Assign copies at a fixed distance so as to distribute them evenly across the network.
  - Apply an offset for each round of copy assignment to avoid bunching up.
  - Do not assign more than one copy of a file to a node.
• Algorithm: loop over all files
    n_copies = triangular_rv(1, Cmax, Cmode);       // Generate random no of copies
    distance = n_nodes / n_copies;                  // Distance for copy allocation
    offset = 1 + n_nodes / no_files;                // If too few files, get an offset to avoid bunching
    tot_offset = (tot_offset + offset) % n_nodes;
    node_no = tot_offset;                           // Node for the assignment of first copy
    for (copy_no = 0; copy_no < n_copies; copy_no++) {
        assign_file(node_no, file_no, size);
        node_no = (node_no + distance) % n_nodes;   // Next node for assignment
        if (copy_no < n_copies - 1 && node_no == (tot_offset + wraps) % n_nodes) {
            node_no = (node_no + 1) % n_nodes; wraps++;
        }
    } // loop over copies
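A runnable Python port of the pseudocode makes it easy to check that no node receives two copies of the same file. The port assumes integer division, a `wraps` counter reset for every file, and a `tot_offset` that starts at 0; the pseudocode leaves these initializations open. Instead of calling `assign_file`, it simply records the chosen nodes, and the copy counts are passed in rather than drawn at random.

```python
def assign_copies(n_nodes, copies_per_file):
    """Return {file_no: [node, ...]} using evenly spaced copies and a per-file offset."""
    no_files = len(copies_per_file)
    offset = 1 + n_nodes // no_files   # shifts the starting node for each file
    tot_offset = 0                     # assumed initial value
    placement = {}
    for file_no, n_copies in enumerate(copies_per_file):
        distance = n_nodes // n_copies          # spacing between copies
        tot_offset = (tot_offset + offset) % n_nodes
        node_no, wraps, nodes = tot_offset, 0, []
        for copy_no in range(n_copies):
            nodes.append(node_no)               # stands in for assign_file(...)
            node_no = (node_no + distance) % n_nodes
            # If we are back at the starting node, step one node further
            if copy_no < n_copies - 1 and node_no == (tot_offset + wraps) % n_nodes:
                node_no = (node_no + 1) % n_nodes
                wraps += 1
        placement[file_no] = nodes
    return placement

placement = assign_copies(100, [5, 20, 8])
print(placement[0])  # [34, 54, 74, 94, 14]
```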
Query Characteristics
• Assumptions:
  - No queries (searches) started from distinguished nodes, since these nodes are essentially “servers”.
  - Identical query arrival process at each undistinguished node.
• Arrival process model
  - An on-off process with identical Pareto distribution for on & off periods: $P(X > x) = (x/T)^{-\gamma}$ for $x > T$
  - Assume $T = 12$ secs and $\gamma = 1.4$, which gives $E(X) = 30$ secs.
  - Const. inter-arrival time of 4 secs during the on-period, no traffic during the off-period.
  - Total traffic at a node is the superposition of arrivals from all reachable nodes.
  - Approx. a self-similar process with Hurst parameter $H = (3 - \gamma)/2 = 0.8$ when the no. of reachable nodes is large.
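The arrival process for one node can be sketched as below. Period lengths are drawn by inverse-transform sampling from a Pareto tail with a negative exponent (an assumed reading of the formula), and each source is assumed to start in an on-period at time 0. The function names are hypothetical; the Hurst value is just the slide's formula evaluated.

```python
import random

def pareto_period(rng, t_min=12.0, gamma=1.4):
    # Inverse transform for P(X > x) = (x / T)^(-gamma), x > T
    u = 1.0 - rng.random()            # u lies in (0, 1]
    return t_min * u ** (-1.0 / gamma)

def on_off_arrivals(rng, horizon, gap=4.0):
    """Arrival times in [0, horizon): one query every `gap` secs during on-periods."""
    t, on, times = 0.0, True, []
    while t < horizon:
        period = pareto_period(rng)
        if on:
            arrival = t
            while arrival < min(t + period, horizon):
                times.append(arrival)
                arrival += gap
        t += period
        on = not on                   # alternate on and off periods
    return times

def hurst(gamma):
    return (3.0 - gamma) / 2.0

arrivals = on_off_arrivals(random.Random(2), horizon=600.0)
print(len(arrivals), "queries in 10 minutes; H =", round(hurst(1.4), 2))
```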

• Query properties:
  - Each query specifies a file (category, file_no) w/ given access characteristics.
  - Shown results do not specify copy_no => multiple hits possible for each query.
  - Query percolates for h “hops” (h = 3 can cover 90% of nodes for the chosen graph).
  - If a query arrives at a node more than once, it is not propagated.
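Hop-limited propagation with duplicate suppression can be sketched as a breadth-first flood. The sketch assumes a node does not send the query back to the neighbor it came from, and it counts one message per forwarded copy; both are modeling choices made here for illustration. The function name and the 6-node ring are hypothetical.

```python
from collections import deque

def flood_query(adj, origin, holders, max_hops):
    """Return (sorted nodes reporting a hit, number of query messages sent)."""
    seen = {origin}
    frontier = deque([(origin, None, 0)])   # (node, node it came from, hops so far)
    hits, messages = [], 0
    while frontier:
        node, parent, hops = frontier.popleft()
        if node != origin and node in holders:
            hits.append(node)
        if hops == max_hops:
            continue                         # hop limit reached
        for nbr in adj[node]:
            if nbr == parent:
                continue                     # do not send the query straight back
            messages += 1
            if nbr not in seen:              # a repeated arrival is not propagated
                seen.add(nbr)
                frontier.append((nbr, node, hops + 1))
    return sorted(hits), messages

ring = {i: {(i - 1) % 6, (i + 1) % 6} for i in range(6)}
print(flood_query(ring, origin=0, holders={2, 3}, max_hops=2))  # ([2], 4)
print(flood_query(ring, origin=0, holders={2, 3}, max_hops=3))  # ([2, 3], 6)
```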
File Retrieval
• Query response:
  - A query reaching a node generates a found/not found response, which travels backwards along the search path.
  - Querying node runs a timer Tu; all responses after the timeout are ignored.
  - Currently no concept of retrying the timed-out requests.
  - Requests and responses may be culled if response time exceeds a limit.
  - Distribution of Tu: Triangular in the range (3, 14) secs with mean 8.0 secs.
• File retrieval:
  - Randomly choose one of the positively responding nodes for file retrieval.
  - Requested file(s) are obtained directly (i.e., do not follow the response path).
  - Retrieved file may be optionally cached at the requesting node.
• File cache flushing
  - Used as an indirect modeling of dynamic changes in tier-3 nodes.
  - A cache flush represents a tier-3 user disconnecting and being replaced by another statistically identical tier-3 node.
  - No. of cycles before cache flushing: Zipf with min = 30, max = 120 and α = 1.0.
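Two details of this slide lend themselves to a short sketch: the mode implied by the stated triangular range and mean, and the choice of a download source among the on-time positive responses. The response tuples and function names are hypothetical illustrations, not the tool's data structures.

```python
import random

def implied_mode(low, high, mean):
    # Triangular mean = (low + high + mode) / 3, solved for the mode
    return 3.0 * mean - low - high

def pick_source(responses, timeout, rng):
    """responses: list of (node, found, response_time); late responses are ignored."""
    on_time = [node for node, found, rt in responses if found and rt <= timeout]
    return rng.choice(on_time) if on_time else None

print(implied_mode(3.0, 14.0, 8.0))  # 7.0

responses = [("a", True, 2.0), ("b", True, 9.0), ("c", False, 1.0)]
print(pick_source(responses, timeout=8.0, rng=random.Random(0)))  # a
```

Here "b" answers positively but after the timer expires and "c" answers in time but negatively, so only "a" is eligible.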
Service time modeling
• Node service
  - Each query & response needs service at each node visited.
  - File transfer needs service on both ends & has two parts:
    - A basic service time (indep. of file size, given by a distribution).
    - A file-size dependent component.
  - Each node implements 3 priority levels for efficient processing
    - Low: queries, Medium: file transfers, High: response processing.
  - Overall queue size constrained to avoid long queuing delays.
• Link service
  - Link service time also has two components:
    - A basic service time (indep. of transfer size, given by a distribution).
    - Size-dependent part determined from link bit rate.
  - Link bit rate taken as 3 KB/sec (an estimate of the real-life rate on the Internet).
  - Links are pure delay servers (assuming P2P traffic << total traffic).
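The two-part service time and the three priority levels can be sketched as follows. The sketch takes 3 KB/sec to mean 3000 bytes/sec, uses a fixed base time where the model draws it from a distribution, and bounds the queue at 50 entries; all three are assumptions for illustration, and the class and function names are hypothetical.

```python
import heapq
import itertools

LINK_RATE = 3000.0  # bytes/sec, reading the slide's 3 KB/sec as 3000 bytes/sec

def link_service_time(size_bytes, base_time):
    # Basic (size-independent) part plus size-dependent part
    return base_time + size_bytes / LINK_RATE

PRIORITY = {"response": 0, "transfer": 1, "query": 2}  # smaller value is served first

class NodeQueue:
    """Bounded priority queue: responses before file transfers before queries."""
    def __init__(self, max_depth=50):
        self._heap, self._seq, self.max_depth = [], itertools.count(), max_depth

    def push(self, kind, item):
        if len(self._heap) >= self.max_depth:
            return False  # queue full: the caller drops the job
        heapq.heappush(self._heap, (PRIORITY[kind], next(self._seq), item))
        return True

    def pop(self):
        return heapq.heappop(self._heap)[2]

q = NodeQueue()
for kind, item in [("query", "q1"), ("transfer", "t1"), ("response", "r1")]:
    q.push(kind, item)
print([q.pop() for _ in range(3)])                   # ['r1', 't1', 'q1']
print(round(link_service_time(40_000, 0.010), 2))    # 13.34 secs for a 40 KB file
```

The sequence counter keeps jobs of equal priority in arrival order, so the heap never has to compare the payloads themselves.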
P2P Simulation Tool (FSST)
 Developed a file sharing simulation tool (FSST) with following functionality
 Generation of random graphs instances w/ constrained degree.
 Simultaneous simulation of multiple graphs.
 Flexible specification of various network & file parameters.
 Unique & non-unique file searches. Optional culling of requests & responses.
 Queuing and service at nodes and links.
 File transfers, file caching, and cache flushing.
 Features currently unavailable
 Automatic propagation of files through the network.
 Explicit modeling of user retry behavior.
 Dynamic changes to the network.
 Mapping between P2P network and physical network.
 Tool specifics:
 Written in C/C++. Uses Sim++ package as simulation engine.
 Input interface common w/ Geist (demonstrated at this conf.).
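To make the "random graph instances w/ constrained degree" item concrete, here is a minimal rejection-based sketch: random node pairs are proposed, and a pair is rejected if the edge already exists or if either endpoint is at its degree limit. This is not the tool's generator, which distinguishes node classes and takes further parameters. All names and the attempt limit are assumptions.

```python
import random


def random_graph_with_max_degree(n_nodes, n_edges, max_degree, rng):
    """Return an adjacency dict for an undirected graph with every degree <= max_degree."""
    adj = {v: set() for v in range(n_nodes)}
    edges = 0
    attempts = 0
    max_attempts = 100 * n_edges  # give up eventually if the target cannot be reached
    while edges < n_edges and attempts < max_attempts:
        attempts += 1
        u, v = rng.sample(range(n_nodes), 2)  # two distinct nodes, so no self-loops
        if v in adj[u]:
            continue  # edge already present
        if len(adj[u]) >= max_degree or len(adj[v]) >= max_degree:
            continue  # would violate the degree constraint
        adj[u].add(v)
        adj[v].add(u)
        edges += 1
    return adj


# Several independent instances, one per seed, as in a multi-graph simulation run.
graphs = [random_graph_with_max_degree(100, 200, 6, random.Random(seed)) for seed in range(3)]
```

Simulating several independently generated instances and averaging the results reduces the dependence of the measurements on any single topology.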
Sample input file
num_graphs = 100;                          # Number of graphs simulated
max_deg_mult = 1.5; min_deg_mult = 0.5;    # multipliers to get min & max degrees
num_d_nodes = 10; num_u_nodes = 90;        # No of dist/undist nodes
num_d_edges = 2; num_u_edges = 4;          # Initial no of edges for dist/undist node
undist_node_prob = 0.50;                   # Prob of connecting to a undist node
num_hops = 3;                              # number of hops each message
n_categories = 10;                         # Total no of size categories
category_boundary =
  {400, 1265, 4000, 1.265e4, 4.0e4, 1.265e5, 4.0e5, 1.265e6, 4.0e6, 1.265e7};
category_prob = {0.07, 0.14, 0.2018, 0.20, 0.14, 0.098, 0.0686, 0.048, 0.0336, 0.0};
                                           # Relative prob of each category bucket.
d_file_size = {400, 4000, 1.265e7, 4.0e4, 0.0, 0.0, 0.0};   # Dist. file size parms
                                           # min_unif, max_unif, max, mean, unif_prob, alpha, beta
u_file_size = d_file_size;                 # Undist file size parms
d_copies_parms = {Triangle_int, 1, 20, 5, 0};   # number of file copies at dist. nodes
u_copies_parms = {Triangle_int, 1, 20, 5, 0};   # No of file copies at undist nodes
num_files = {500, 1000};                   # No of files at dist/undist nodes
filestore_size = {2.0e8, 3.2e7};           # File cache size at dist/undist nodes
queue_depth = {50, 50};                    # Max queue length allowed
max_cached_file_size = 80000;              # Max file size that is cached
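As a rough illustration of how the size categories might be used, the sketch below draws a file size by first picking a category and then a size within it. It assumes that the ten boundaries delimit nine buckets, that the trailing 0.0 probability belongs to an unused tenth category, and that sizes are uniform within a bucket. None of this is confirmed by the slides, and the function name is hypothetical.

```python
import random

CATEGORY_BOUNDARY = [400, 1265, 4000, 1.265e4, 4.0e4, 1.265e5, 4.0e5, 1.265e6, 4.0e6, 1.265e7]
CATEGORY_PROB = [0.07, 0.14, 0.2018, 0.20, 0.14, 0.098, 0.0686, 0.048, 0.0336, 0.0]


def draw_file_size(rng):
    """Pick a size bucket by its probability, then a uniform size inside it (assumed)."""
    n_buckets = len(CATEGORY_BOUNDARY) - 1          # nine buckets between ten boundaries
    weights = CATEGORY_PROB[:n_buckets]             # the last entry (0.0) is never used
    i = rng.choices(range(n_buckets), weights=weights, k=1)[0]
    return rng.uniform(CATEGORY_BOUNDARY[i], CATEGORY_BOUNDARY[i + 1])


rng = random.Random(3)
sizes = [draw_file_size(rng) for _ in range(1000)]
```

The listed probabilities sum to 1, and successive boundaries grow by a factor of about 3.16 (the square root of 10), so the categories are evenly spaced on a logarithmic size scale.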
Sample input file (contd)
srch_stime_parms = {Exponential, 0.010, 0.1, 0.015, 0};     # CPU time for searching and search propagation (no local hit)
local_srch_stime = {Exponential, 0.002, 0.050, 0.00225, 0}; # CPU time for search in local cache (local hit)
rel_cpu_speed = {1.0, 1.0};                                 # CPU speeds of dist/undist nodes
link_bandwidth = 3.0e3;                                     # Link BW in bytes/sec
link_stime_parms = {Exponential, 0.01, 0.20, 0.015, 0};     # Link service time
search_priority = low; response_priority = high;            # Rel. priorities of query & resp.
get_priority = medium; put_priority = medium;               # Rel. priority of file gets & puts
put_stime_parms = {Exponential, 0.003, 0.1, 0.005, 0};      # CPU time for file put
per_byte_proc_time = 15e-7;                                 # time for processing files
resp_stime_parms = {Exponential, 0.002, 0.1, 0.004, 0};     # resp proc CPU time
int_arrival_time = 4;                                       # Inter-arrival time during on period
on_period_parms = {Pareto, 12, 1200, 30, 0};                # On period for req. arrivals
num_user_on_cycles = {Zipf, 30, 120, 0, 1};                 # num cycles before a cache flush
timer_threshold = {Triangle, 3, 14, 7, 0};                  # Elapsed time for link traversal
simulation_warmup_time = 30000;
simulation_run_time = 120000;
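The input format shown on these two slides is regular enough to read with a few lines of code. The sketch below is a hypothetical reader, not the parser shared with Geist. It assumes '#' starts a comment, ';' ends a statement, braces enclose a comma-separated list, and a bare name refers either to an earlier parameter (as in u_file_size = d_file_size) or to a symbolic constant such as Exponential.

```python
def _parse_scalar(token, params):
    """Convert a token to int or float; otherwise resolve it as a reference or a symbol."""
    try:
        return int(token)
    except ValueError:
        pass
    try:
        return float(token)
    except ValueError:
        pass
    return params.get(token, token)


def _parse_value(value, params):
    """Parse either a brace-enclosed list or a single scalar."""
    if value.startswith("{") and value.endswith("}"):
        return [_parse_scalar(tok.strip(), params) for tok in value[1:-1].split(",")]
    return _parse_scalar(value, params)


def parse_fsst_input(text):
    """Parse 'key = value;' statements, ignoring '#' comments, into a dict."""
    no_comments = "\n".join(line.split("#", 1)[0] for line in text.splitlines())
    params = {}
    for stmt in no_comments.split(";"):
        stmt = stmt.strip()
        if not stmt or "=" not in stmt:
            continue
        key, value = (part.strip() for part in stmt.split("=", 1))
        params[key] = _parse_value(value, params)
    return params


sample = """
num_hops = 3;   # number of hops each message
search_priority = low; response_priority = high;
d_file_size = {400, 4000, 1.265e7, 4.0e4, 0.0, 0.0, 0.0};
u_file_size = d_file_size;
timer_threshold = {Triangle, 3, 14, 7, 0};
"""
params = parse_fsst_input(sample)
```

Splitting on ';' rather than on line breaks handles both several assignments on one line and a value that continues onto the next line, as with category_boundary.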
Sample Results from FSST (1)
Node Utilization and Queue Lengths as a function of #hops

                                 Hops = 1   Hops = 2   Hops = 3   Hops = 4
Node Utilization (Dist Node)     8.0%       44.3%      86.0%      96.1%
Node Utilization (Other Nodes)   5.2%       25.9%      50.6%      58.9%
Queue Length (Dist Nodes)        1.013      2.357      14.581     30.134
Queue Length (Other Nodes)       1.048      2.285      5.179      6.972

Reachability and Response Rate

                                 Hops = 1   Hops = 2   Hops = 3   Hops = 4
Num Responses Per Request        6.58       51.16      84.17      80.70
% Unexpired Responses            99.39%     98.93%     99.27%     99.42%
% Expired Responses              0.61%      1.07%      0.73%      0.58%
Num Dropped Msgs Per Request     0.00       0.37       7.30       27.59

[Chart: % Successful Searches as #Hops Increase. X-axis: Number of Hops (1 to 4). Y-axis: % Successful Requests (0% to 60%). Series: % Successful Requests, % Requests served locally, % Requests served remotely.]

Observations:
 % successful requests saturates beyond 3 hops due to increased queuing and dropped messages.
 Local cache hit rate changes minimally as a function of the number of hops.
 Node utilization is significant at hops >= 3.
Sample Results from FSST (2)
Impact of the Caching Option Selected

                                 Cache All   Cache < 40K   No Caching
Node Utilization (Dist Node)     86.0%       89.8%         92.8%
Node Utilization (Other Nodes)   50.6%       49.9%         52.2%
Queue Length (Dist Nodes)        14.581      18.15         21.226
Queue Length (Other Nodes)       5.179       3.952         4.161
Num Responses Per Request        84.17       89.27         93.93
% Unexpired Responses            99.27%      100.00%       100.00%
% Expired Responses              0.73%       0.00%         0.00%
Num Dropped Msgs Per Request     7.30        4.42          5.43

[Chart: Impact of File Caching. X-axis: Caching Option (Cache All, Cache < 40K, No Caching). Y-axis: % Successful Searches (0% to 70%). Series: % Requests served remotely, % Requests served locally.]

Observations:
 Caching < 40K (avg file size) seems to provide the highest hit ratio for searches.
 Expired responses are negligible (perhaps need better parameterization).
 Node utilization and queue length at the distinguished nodes increase moderately as less caching is performed.
Sample Results from FSST (3)
Impact of the File Store Size at Non-Distinguished Nodes

                                 FS = 16M   FS = 32M
Node Utilization (Dist Node)     86.0%      80.0%
Node Utilization (Other Nodes)   50.6%      46.6%
Queue Length (Dist Nodes)        14.581     11.518
Queue Length (Other Nodes)       5.179      4.703
Num Responses Per Request        84.17      77.54
% Unexpired Responses            99.27%     99.33%
% Expired Responses              0.73%      0.67%
Num Dropped Msgs Per Request     7.30       6.12

[Chart: Impact of File Store Size. X-axis: File Store Size (FS = 16M, FS = 32M). Y-axis: % Successful Searches (0% to 80%). Series: % Requests served locally, % Requests served remotely.]

Observations:
Increasing the file store size improves the performance scenario considerably:
 Node utilization decreases
 Queue Length reduces
 Search hit ratio improves.
The average no of responses per request reduces somewhat because more local hits occur.
Sample Results from FSST (4)
Small vs. large queue depth at the nodes

                                 QL = 50    QL = 1000
Node Utilization (Dist Node)     85.4%      86.9%
Node Utilization (Other Nodes)   49.3%      50.4%
Queue Length (Dist Nodes)        14.255     16.125
Queue Length (Other Nodes)       4.636      6.571
Num Responses Per Request        83.67      87.62
% Unexpired Responses            99.59%     98.93%
% Expired Responses              0.41%      1.07%
Num Dropped Msgs Per Request     5.53       0.00
% Successful Requests            47.60%     47.89%
% Requests served locally        9.92%      9.96%
% Requests served remotely       37.67%     37.94%

Impact of More Powerful Distinguished Nodes

                                 Base       DNodePow*2
Node Utilization (Dist Node)     86.0%      51.3%
Node Utilization (Other Nodes)   50.6%      49.4%
Queue Length (Dist Nodes)        14.581     3.504
Queue Length (Other Nodes)       5.179      5.447
Num Responses Per Request        84.17      82.33
% Unexpired Responses            99.27%     99.39%
% Expired Responses              0.73%      0.61%
Num Dropped Msgs Per Request     7.30       9.81
% Successful Requests            51.93%     51.62%
% Requests served locally        8.10%      8.14%
% Requests served remotely       43.83%     43.49%

Observations:
 Increasing the queue depth ensures no dropping of requests BUT does not impact the success rate or node utilization much.
 Making the distinguished nodes more powerful seems to have no impact other than the obvious reduction in utilization at distinguished nodes.
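A quick arithmetic check on the figures above: in each column, the percentage of successful requests should equal the locally served plus the remotely served percentages, up to rounding. The sketch below copies the values from the two tables on this slide; the helper name is hypothetical.

```python
# (successful %, served locally %, served remotely %) per configuration, from the slide.
RESULTS = {
    "QL = 50": (47.60, 9.92, 37.67),
    "QL = 1000": (47.89, 9.96, 37.94),
    "Base": (51.93, 8.10, 43.83),
    "DNodePow*2": (51.62, 8.14, 43.49),
}


def decomposition_gap(successful, local, remote):
    """Difference between the reported total and local + remote, in percentage points."""
    return round(successful - (local + remote), 2)


gaps = {name: decomposition_gap(*row) for name, row in RESULTS.items()}
```

All four gaps are within 0.01 percentage points, so the three success metrics in these tables are mutually consistent.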
Sample Results from FSST (5)
Effect of Caching / Flushing Switches

                                 C / FL     C / No FL   No C / No FL   No C / FL
Node Utilization (Dist Node)     86.0%      85.4%       87.8%          92.8%
Node Utilization (Other Nodes)   50.6%      49.3%       52.5%          52.2%
Queue Length (Dist Nodes)        14.581     14.255      16.053         21.226
Queue Length (Other Nodes)       5.179      4.636       6.067          4.161
Num Responses Per Request        84.17      83.67       83.40          93.93
% Unexpired Responses            99.27%     99.59%      98.63%         100.00%
% Expired Responses              0.73%      0.41%       1.37%          0.00%
Num Dropped Msgs Per Request     7.30       5.53        11.06          5.43
% Successful Requests            51.93%     47.60%      92.33%         28.81%
% Requests served locally        8.10%      9.92%       6.33%          0.00%
% Requests served remotely       43.83%     37.67%      86.00%         28.81%

Effect of Enforcing Message Expiry in Network

                                 Base w/ inf queue   + EXPIRY = 8s
Node Utilization (Dist Node)     88.3%               88.0%
Node Utilization (Other Nodes)   52.4%               52.2%
Queue Length (Dist Nodes)        17.556              17.21
Queue Length (Other Nodes)       8.578               8.377
Num Responses Per Request        89.10               88.88
% Unexpired Responses            97.98%              98.31%
% Expired Responses              2.02%               1.69%
Num Dropped Msgs Per Request     0.00                0.44
% Successful Requests            52.41%              52.15%
% Requests served locally        8.29%               8.30%
% Requests served remotely       44.12%              43.85%

Observations:
 When flushing and caching are both turned off, the search hit ratio is the best (because files do not get replaced & lost).
 Enforcing message expiry makes very little difference to the results (when using the average timer threshold value as the message expiry threshold).
Conclusions & Future Work
 Summary of covered material:
 Introduced major developments relevant to P2P computing.
 Introduced sample middleware functionality to support P2P applications.
 Discussed major research issues to be resolved.
 Proposed a random graph model for P2P networks and studied its properties.
 Studied some performance issues for P2P deployments using detailed simulation of file-sharing applications.
 Future P2P Performance Work
 Various strategies for automated file propagation through the network.
 Intelligent caching and invalidation of search results.
 Key based file location (hashing + searching).
 Dynamic changes to network and file-sets stored at nodes.
 Mapping of virtual network over a physical network to obtain more realistic link delays.
 Various ways of culling unnecessary requests and responses.
Relevant sites: P2P Applications
 Napster (http://www.napster.com)
 Gnutella (http://gnutella.wego.com)
 Freenet (http://freenet.sourceforge.net)
 JXTA (http://www.jxta.org)
 Avaki Corp (http://www.avaki.com)
 Legion (http://legion.virginia.edu)
 Globe (http://www.cs.vu.nl/~steen/globe)
 Globus (http://www.globus.org)
 Microsoft .Net (http://www.microsoft.com/net)
 CenterSpan (http://www.centerspan.com)
 vTrails (http://www.vtrails.com)
 SETI@Home (http://setiathome.ssl.berkeley.edu)
 CAN (http://www.acm.org/sigcomm/sigcomm2001/p13.html)
 CHORD (http://www.pdos.lcs.mit.edu/chord)
 PASTRY (http://research.microsoft.com/~antr/Pastry)
Relevant Sites: Modeling Issues
 File-sharing networks
 Intl workshop on P2P (http://www.cs.rice.edu/Conferences/IPTPS02/)
 Jovanovic et al (U/Cinn), Scalability issues in Gnutella (http://www.ececs.uc.edu/~mjovanov/Research/paper.html)
 Adar & Huberman (HP), Free riding on Gnutella (http://www.firstmonday.dk/issues/issue5_10/adar)
 Ripeanu (U/Chicago), Peer-to-Peer Architecture Case Study: Gnutella Network (http://www.cs.uchicago.edu/research/publications/techreports/TR-2001-26)
 Internet graph models
 Kumar et al (IBM), Web as a Graph (http://www.almaden.ibm.com/cs/k53/algo.html)
 Aiello et al (AT&T/UCSD), A random graph model for massive graphs (http://math.ucsd.edu/~llu/random_abs.html)
 Taxonomy
 Kant, Iyer & Tewari, A classification framework for P2P technologies (http://kkant.ccwebhost.com/download.html)