On the Impact of P2P File Sharing Traffic
Restrictions on User Perceived Performance
Ricardo Lopes Pereira, Teresa Vazão
Instituto Superior Técnico
Av. Prof. Dr. Cavaco Silva, 2744-016 Porto Salvo, Portugal
Abstract— Peer to Peer (P2P) File Sharing (FS) applications
represent, today, the major traffic source on the Internet. Unlike
other traffic types, such as HTTP, where the major traffic
sources are identifiable, peers are, by definition, spread over the
Internet, making it hard for ISPs to architect their networks
to accommodate Peer to Peer traffic. In order to alleviate the
impact of P2P FS traffic on other applications, ISPs often resort
to price strategies based on traffic or traffic shaping techniques,
in order to restrict P2P FS applications usage. The success of
these initiatives is often limited, as they may aggravate other
customers who do not use P2P FS or be circumvented by some
of the P2P FS applications, which try to misrepresent their traffic
as belonging to other applications.
In this paper we study the impact that different methods for
P2P FS traffic reduction have on the traffic carried by ISPs
and on the download performance perceived by P2P FS users.
Through simulation we compared the usage of traffic shaping
with the more recent techniques Biased Neighbour Selection
(BNS) and Adaptive Search Radius (ASR). We observed that
traffic shaping provides ISPs with the fewest traffic savings,
especially when compared to the price paid by P2P FS users. BNS
and ASR provide similar benefits for users and ISPs. We’ve also
observed that BNS and ASR are complementary technologies,
which may be combined in order to achieve higher efficiency.
I. INTRODUCTION
Peer-to-Peer (P2P) File Sharing (FS) applications are widely
disseminated. Their popularity is such that several sources
indicate that they are responsible for up to 70% of the Internet
traffic [1], [2]. Contrary to what was expected by some,
the availability of commercial music and video download
services and the legal pressure imposed by copyright owners’
associations, such as RIAA, did not cause P2P FS adoption to
slow down.
P2P FS traffic exhibits a behaviour very distinct from that
of traditional applications such as email or HTTP, one that
most ISPs’ networks were not designed to handle [1], [2]:
Upstream/downstream ratio: Applications such as HTTP
exhibit a very high download/upload ratio, which has been
used to plan many networks. P2P FS users have incentives
to upload as much as they download, and some peers may
even operate only as uploaders (seeders).
Time of day usage patterns: ISPs expect their user population to follow certain behaviour patterns. For instance,
home users are only expected to use the network at
evening and during weekends. P2P FS applications are
often left unattended, running 24 hours a day.
Traffic sources: Email traffic originates mostly from
servers within an ISP’s own network. HTTP traffic comes
primarily from local proxies or from a few well known
popular sites. P2P traffic can originate anywhere. A peer
might decide to download a file part from a peer within
the same ISP or from across the world.
Over-Subscription ratios: A typical HTTP user will download a page and then spend some time reading it. This
allows ISPs to use large over-subscription ratios, as most
of the time users aren’t utilising the bandwidth made
available to them. The download of a large movie file via
P2P may take many hours, while it will only take about
two hours to view it. P2P FS users will download/upload
almost continuously.
P2P FS traffic increases ISPs' transit and peering costs while
affecting the performance of other applications, which can
alienate customers, increasing customer churn. Upgrading the
network to higher bandwidth is not an option as the increased
cost would not be compensated by additional revenue. Furthermore, as P2P FS applications are designed to consume as
much bandwidth as possible, the new bandwidth would rapidly
be consumed. Blocking P2P FS traffic isn’t an option either, as
P2P FS applications are one of the drivers behind broadband
adoption. Doing it would most likely result in the loss of large
numbers of subscribers.
The increasing usage of P2P FS technology for independent
and commercial music, video and software distribution means
that the problem is here to stay, and ISPs and P2P FS users
must find ways to balance their goals.
The main contribution of this paper is a simulation study
of some of the options available for minimising the impact
of P2P FS traffic on the Internet and on ISPs’ business.
We compare the tradeoff between the traffic savings reaped
by an ISP and the performance impact suffered by its P2P-using
customers when using three different techniques: traffic
shaping, Biased Neighbour Selection (BNS) and Adaptive
Search Radius (ASR). We’ve found that these technologies
are complementary, and also studied the effect of combining
them. The second contribution is the utilisation of a simulation
model which implements the full dynamics of a real P2P
protocol (eDonkey), taking into account all the effects at the
network, transport and application layers, instead of a simple
abstraction.
In the next section the traffic saving techniques are discussed, along with other methods for reducing the cost incurred by ISPs with P2P FS. Section III presents the simulation
study and its findings. Section IV presents the conclusions and
plans for future work.
II. COST REDUCTION TECHNIQUES
It has been observed that P2P FS shows strong locality
properties, both in terms of network topology and geography,
suggesting that self-caching mechanisms could be enacted by
forcing peers to download from nearby peers instead of distant
ones [3], [4]. Self-caching would potentially result in better
performance for users and lower costs for ISPs.
BNS and ASR are two techniques that exploit these locality
properties. Traffic shaping is a more traditional approach,
which limits the amount of bandwidth made available to P2P
applications.
A. Biased neighbour selection
Biased Neighbour Selection is a technique proposed for use
with the BitTorrent protocol [5]. It consists of using a transparent redirector, which intercepts the communication from
peers inside an ISP’s network with the tracker. The sources
(peers) provided by the tracker are filtered and replaced in
order to force the peer to communicate mainly with other peers
within the ISP’s network. Only a few peers from outside the
ISP’s network are provided, to ensure that the download will
complete successfully.
BNS requires ISPs to run the query interceptors at the
edge of their networks, using deep packet inspection in order
to intercept communications to the tracker/server and other
peers (due to peer source exchange and the growing use of
Kademlia). It is a solution that each ISP may choose to use
on its own, limiting the amount of traffic exchanged with other
ISPs while benefiting users, who are expected to download
faster from within their own ISP’s network.
BNS reduces the costs incurred by an ISP when there are
enough sources for the file within its network. Otherwise it
provides no advantages, as the files will have to be downloaded
from the outside. Furthermore it also fails to reduce the amount
of upload traffic, as it doesn’t prevent outside peers from
downloading from the peers within the ISP’s network.
The basic concept of BNS had already been suggested
before in studies conducted on traffic traces of Kazaa and
BitTorrent, where byte hit ratios of up to 63% were estimated
to be achievable by exploiting locality properties within an ISP
[6], [7].
B. Adaptive Search Radius
ASR consists of a peer selection algorithm which can
be used with eMule, BitTorrent or other P2P file sharing
applications [8]. Instead of downloading from any peer, ASR
selects a subset of the peers it knows, the nearest ones (fewer
IP network hops), allowing the file download to be performed
with a smaller impact on the Internet.
ASR uses file availability as a metric. This is defined as
the number of contactable peers sharing the rarest file part the
peer doesn’t already have. As uploaders are contacted, their
distance (in network hops) and the file parts they share are determined. Minimum and maximum file availability thresholds
are defined as constant values in the ASR algorithm.
An ASR peer maintains a value for the Search Radius of
every file, the maximum distance (in network hops) a peer
may be in order to be considered. After learning (or updating)
which file parts a peer shares, the file availability is calculated.
If the file availability is larger than the maximum threshold, the
search radius is reduced by one hop while it remains larger
than the minimum threshold. If all peers within the search
radius have been contacted but file availability remains below
the minimum threshold, the search radius is increased one hop.
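The radius update rule described above can be sketched as follows. This is a minimal illustration and not the authors' implementation: the names `Peer`, `file_availability` and `update_search_radius` are hypothetical, availability is assumed to be counted only over peers inside the current radius, and the radius is assumed to shrink by a single hop per update, and only if availability inside the smaller radius stays above the minimum threshold.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Peer:
    """A known source for the file (hypothetical structure)."""
    name: str
    hops: int          # distance from the downloader, in IP network hops
    parts: frozenset   # indices of the file parts this peer shares


def file_availability(peers, missing_parts, radius):
    """Count the peers within `radius` that share the rarest missing part."""
    in_range = [p for p in peers if p.hops <= radius]
    return min(
        sum(1 for p in in_range if part in p.parts)
        for part in missing_parts
    )


def update_search_radius(radius, peers, missing_parts,
                         min_avail, max_avail, all_contacted):
    """Return the new search radius after peer information was updated."""
    if not missing_parts:
        return radius  # download complete, nothing to adapt
    avail = file_availability(peers, missing_parts, radius)
    if avail > max_avail and radius > 1:
        # Assumption: shrink by one hop only if availability inside the
        # smaller radius stays above the minimum threshold.
        if file_availability(peers, missing_parts, radius - 1) > min_avail:
            return radius - 1
    elif avail < min_avail and all_contacted:
        # Every peer in range was contacted and sources are still scarce.
        return radius + 1
    return radius
```

With the thresholds used later in the evaluation (minimum 3, maximum 6), a peer that knows eight sources within five hops would shrink its radius to four hops as long as more than three sources remain inside it.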
ASR will result in the use of file sources local to the
ISP's network when these are available. When contacting peers
outside an ISP’s network, the closest ones will be used.
Figure 1 depicts the behaviour of a P2P file sharing client
(node H) while downloading a file. Filled circles represent
peers, empty ones represent routers. Filled lines represent links
not being used by file transfers to peer H, while dotted lines
represent links crossed by file transfer traffic. Node H, through
the use of a file sharing P2P protocol, has learned that nodes
A to I share the wanted file.
In figure 1(a), peer H does not use ASR to filter out distant
peers, downloading from all of them. Therefore, download
traffic crosses most links, contributing to their congestion.
Figure 1(b) shows what happens when ASR is used. In this
example, it is assumed that ASR calculated a search radius of
4 network hops. This means that peers F (4 hops), G (2 hops)
and I (3 hops), together provide the required minimum number
of sources for all the pieces peer H still has to download. As
ASR peers refrain from downloading from peers outside the
search radius for each file, information travels fewer network
hops, releasing capacity on all other links. Peer H still needs
to download a full copy of the file, but the download, being
performed from close peers, impacts fewer Internet links.
The decreased number of peers to download from is compensated by having to wait less to start downloading, as upload
queues are shorter. Since each downloading peer will restrict
the set of peers it downloads from, each uploader will have to
satisfy fewer requests.
ASR should result in lower P2P traffic crossing ISPs' boundaries while providing users with faster downloads. However,
this is not a method that ISPs can deploy on their own. Instead
it would have to be built into P2P software.
C. Traffic shaping
In order to control the impact of P2P FS traffic on their
networks, many ISPs have deployed traffic shaping equipment.
This performs deep packet inspection on incoming or outgoing
traffic, in order to determine the traffic flows which correspond
to P2P FS sessions. Having identified P2P FS traffic, QoS
techniques may be applied to it. ISPs may choose to limit the
aggregate rate of P2P FS traffic, the rate of individual users’
traffic or provide all P2P FS traffic with a less than best effort
treatment. This allows ISPs to limit the effect P2P FS has on
their networks, benefiting other applications, without requiring
additional investments in bandwidth.
Fig. 1. Links crossed by file download traffic by node H: (a) without ASR; (b) with ASR using search radius of 4
However, P2P FS traffic is difficult to identify as developers
continuously try to circumvent detection. This means that
some P2P FS traffic will always get past unidentified. The
risk for false positives also exists. Furthermore, the processing
required for deep packet inspections may cause some delay on
all traffic [2]. This may alienate some subscribers.
Traffic shaping affects all P2P FS traffic equally. Even
though users may not have the right to complain that their
illegal downloads are slow, they will protest when it affects
their rightful downloading activities [1]. This may cause legal
problems for ISPs, which may be accused of favouring other
commercial content distribution methods, especially when
network neutrality is being discussed in some countries.
D. Other methods
Some ISPs have abandoned unlimited traffic subscription
plans or increased their prices. Users are now offered different subscription plans, with different monthly bandwidth
allowances. Extra traffic is charged at high rates, which
discourages heavy P2P FS usage. This transfers the costs
associated with P2P FS traffic back to the users responsible for
the traffic. However, it also impacts other heavy users, not only
P2P FS users. Furthermore, it overlooks that P2P FS traffic
has been one of the drivers behind broadband adoption. This
method will only work when there is no competition offering
unlimited plans, as otherwise it will result in subscribers
leaving for the competition.
ISPs have also taken advantage of P2P FS protocols’ reciprocity, which provides peers with download speeds correlated
to their upload speed. Many ISPs restrict the upload speed well
beyond that imposed by the access technology used to connect
the subscribers. One such example is the use of ADSL, where
even though the maximum download speed is often used, only
a fraction of the possible upload speed is offered.
P2P FS cache is a concept similar to HTTP caches, with
which ISPs are familiar. Caches intercept P2P connections
going outside the ISP’s network and impersonate remote peers.
If the content is present in the cache, it will be served locally,
otherwise the remote peer will be contacted by the cache,
which will keep a copy of the content while delivering it to
the requesting peer. Byte hit ratios as high as 80% are claimed
by P2P caching solution vendors [1]. The use of caches
provides P2P FS users with similar or better performance,
while releasing bandwidth for other applications, allowing
ISPs to avoid or delay investments into network upgrades.
However, running a cache may implicate an ISP in illegal
sharing of copyrighted content, subjecting it to legal problems.
Also, it has been observed that large video files are the fastest
growing file type in P2P networks, representing the largest
(65%) portion of bytes transferred. These files are the ones a
cache system should address. However, their size and number
require too large a storage system for caching to be easily
performed outside the P2P network [6].
III. EVALUATION
A. Simulation scenario
We evaluated BNS, ASR and traffic shaping using the
SSFNet 2.0 network simulator, which provides the layer 2, 3
and 4 protocols [9]. On it, we implemented the eDonkey/eMule protocol.
We used the GT-ITM topology generator to create a transit-stub network with 350 routers and 840 P2P FS peers [10].
Hosts are connected to their routers using 0.5 and 1 Mb/s links.
Stub routers are connected among themselves using 5 and 10
Mb/s links. Stub networks are connected to each other and to
transit networks using 2 and 5 Mb/s links. Transit routers are
connected using 10, 20 and 50 Mb/s links. Transit networks
are connected using 10 and 20 Mb/s links.
We simulated the distribution of a 100MB file, seeded by
7 peers to the other 834. This depicts a situation where a
company uses P2P FS to distribute some content (for instance
e-learning) to their various offices, or a flash crowd where
hosts continue to share after finishing their downloads.
Each time a peer asked the server for file sources, it would
reply with at most 50 peers. When using BNS, of these 50 peers,
up to 40, if available, would be within its own ISP.
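The biased reply just described can be sketched as a simple filter over the sources known to the server. This is an illustration under stated assumptions, not the simulator's code: the function name `biased_source_list`, the `(peer_id, isp_id)` tuples and the random choice among candidates are hypothetical; only the limits of 50 sources per reply and up to 40 from the requester's own ISP come from the text.

```python
import random


def biased_source_list(candidates, requester_isp,
                       max_sources=50, max_internal=40, rng=random):
    """Pick sources for one reply, preferring peers in the requester's ISP.

    candidates: list of (peer_id, isp_id) tuples known to the server,
    assumed not to include the requesting peer itself.
    """
    internal = [p for p in candidates if p[1] == requester_isp]
    external = [p for p in candidates if p[1] != requester_isp]

    # Up to `max_internal` sources come from inside the ISP, if available.
    chosen_internal = rng.sample(internal, min(max_internal, len(internal)))

    # Assumption: the remaining slots are filled with outside peers,
    # which keeps the download able to complete when local sources are few.
    remaining = max_sources - len(chosen_internal)
    chosen_external = rng.sample(external, min(remaining, len(external)))
    return chosen_internal + chosen_external
```

When fewer than 40 local sources exist, the reply degrades gracefully towards the plain behaviour, which matches the observation that BNS offers no savings for files that are not available inside the ISP.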
ASR was configured to increase its search radius when
the file availability was below 3 and to reduce it when file
availability was greater than 6.
Traffic shaping was used by limiting the aggregate P2P FS
bandwidth permitted on the stub-stub and stub-transit links,
using a bit-bucket. We experimented with limiting the available P2P
FS bandwidth to 4, 3, 2, 1 and 0.5 Mb/s.
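One common way to cap an aggregate rate is a token bucket, sketched below. Whether this matches the bucket mechanism used in the simulation is an assumption; the class name `TokenBucket` and its parameters are hypothetical, and a real shaper would typically queue a non-conforming packet rather than simply reject it.

```python
class TokenBucket:
    """Minimal token bucket: tokens are bits, refilled at `rate_bps`."""

    def __init__(self, rate_bps, burst_bits):
        self.rate_bps = rate_bps      # long-term rate limit, bits per second
        self.capacity = burst_bits    # maximum burst size, in bits
        self.tokens = burst_bits      # the bucket starts full
        self.last = 0.0               # time of the previous update, seconds

    def allow(self, now, packet_bits):
        """Return True if a packet of `packet_bits` conforms at time `now`."""
        elapsed = now - self.last
        self.last = now
        # Refill according to elapsed time, never above the capacity.
        self.tokens = min(self.capacity, self.tokens + elapsed * self.rate_bps)
        if packet_bits <= self.tokens:
            self.tokens -= packet_bits
            return True
        return False
```

A 0.5 Mb/s limit on a stub-transit link would correspond to `TokenBucket(rate_bps=500_000, burst_bits=...)`, with the burst size left as a tuning choice that the text does not specify.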
We also decided to combine the use of both ASR and BNS,
since these are not conflicting technologies. The use of
BNS will allow ASR to converge on close-by peers more
quickly, as the peers it learns about will be mostly within
the same ISP. From the point of view of BNS, the use of ASR
will allow the peer to choose the closest peers from within the
ISP, instead of using a random set. Also, while BNS is unable
to provide traffic savings when there are no sources within the
same ISP, ASR will download from the closer peers.
B. Result analysis
Table I shows the main metrics gathered from the experiments. We can observe that ASR, BNS and their
combination provide significant improvements on all metrics
over the use of plain eDonkey/eMule, both from the point of
view of the ISPs and of the subscriber.
From the subscriber perspective, any of the alternatives,
alone, results in noticeably faster downloads, with BNS having
a slight advantage. Their combination provides even better
results. This means that users would not feel alienated by any
of these technologies but rather have the feeling that the ISP
embraces P2P FS, which could result in increased customer
loyalty.
From the point of view of a transit ISP, both ASR and
BNS result in very significant traffic savings, which could be
translated into better and cheaper services for their regional
ISP customers. The combination of ASR and BNS results in
the release of more than two thirds of the used bandwidth.
From the point of view of the regional ISP, which provides
the service for the P2P FS users, the use of ASR or BNS
results in a reduction of the intra-ISP (stub) traffic and in an
even more significant reduction of the traffic exchanged with
the other ISPs (inter-ISP). The combined use of ASR and BNS
results in even more savings, especially on the inter-ISP traffic,
which is reduced to less than a third. The large reduction in
inter-ISP traffic would allow ISPs to significantly reduce their
peering and transit costs, which could provide a competitive
advantage.
TABLE I
COMPARING THE DIFFERENT TECHNIQUES
                             Plain     ASR       BNS       ASR/BNS
Avg. Download Time (s)       1552      1439      1434      1389
Transit Traffic (GB)         264       160       165       77
Stub Traffic (GB)            553       452       502       387
Inter-ISP Traffic (GB)       170       115       106       53
Number of Connections (K)    150       81        124       72
Avg. Hops Crossed            9.76      7.31      7.98      5.55
Efficiency (%)               24.57     32.80     30.06     43.25
Also significant is the reduction in terms of number of
connections necessary to distribute the file. Here we see that
ASR has an advantage over BNS, but not over their combined
use. Even though the number of connections affects primarily
the peers, being that TCP is an end-to-end protocol, the fewer
connections to keep track of allow for better scaling of any
deep packet inspection equipment used by complementary
techniques.
We can also observe that the average number of network
hops crossed by each P2P FS data packet is reduced by the
use of BNS and even more by the use of ASR. Their combined
use yields the best results.
The last metric used was efficiency, where we measure the
amount of traffic which crossed every network link against
the ideal minimum traffic required to copy the file from
the sources to all the other peers. The minimum traffic was
calculated by determining the number of links present in
the minimum spanning tree rooted at all the sources, which
encompassed all the downloading peers. The number of links
was then multiplied by the size of the file. It would be
impossible for any protocol to reach 100% efficiency as this
does not take into account any of the protocol overheads
suffered by the real protocols (IP, TCP, eDonkey/eMule).
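The efficiency metric can be written as a ratio of the ideal traffic to the traffic actually carried. The sketch below states it in code and then back-computes the ideal traffic implied by each column of Table I. Two things are assumptions and not taken from the text: the function names, and the choice to use the sum of the transit and stub traffic columns as the total traffic over all links.

```python
def efficiency_percent(min_links, file_size_gb, total_link_traffic_gb):
    """Ideal traffic (tree links x file size) over the traffic carried."""
    ideal_gb = min_links * file_size_gb
    return 100.0 * ideal_gb / total_link_traffic_gb


def implied_ideal_traffic_gb(efficiency_pct, total_link_traffic_gb):
    """Invert the metric: ideal traffic implied by a reported efficiency."""
    return efficiency_pct / 100.0 * total_link_traffic_gb


# (transit GB, stub GB, efficiency %) copied from Table I.
TABLE_I = {
    "Plain":   (264, 553, 24.57),
    "ASR":     (160, 452, 32.80),
    "BNS":     (165, 502, 30.06),
    "ASR/BNS": (77, 387, 43.25),
}

# Assumption: total traffic over all links = transit + stub traffic.
implied = {
    name: implied_ideal_traffic_gb(eff, transit + stub)
    for name, (transit, stub, eff) in TABLE_I.items()
}
for name, ideal in implied.items():
    print(f"{name:8s} implied ideal traffic: {ideal:6.1f} GB")
```

Under that assumption all four techniques imply roughly the same ideal traffic, about 200 GB, which is what one would expect since the ideal value depends only on the topology and the file size and not on the technique.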
Once again we can observe that the use of ASR or BNS
is advantageous, with ASR having an edge over BNS. Their
combination is, once again, very beneficial.
Figure 2 shows the evolution of the amount of inter-ISP
traffic used when the allowed inter-ISP bandwidth is varied
by traffic shaping. We can observe that significant savings are
only accomplished when the bandwidth is reduced to 1Mb/s.
The reduction of the bandwidth from 1 to 0.5 Mb/s results
in a faster decrease in the traffic consumption. The different
techniques maintain their relative positions with the different
bandwidths, except for 0.5Mb/s, where ASR provides much
greater savings than BNS, indicating that ASR copes better
with traffic shaping.
Figure 3 shows the behaviour of the average download
time achieved by the several techniques under different traffic
shaping restrictions. It is noticeable that users pay a high
price for the use of traffic shaping. The plain eDonkey/eMule
protocol would not allow ISPs to reduce P2P FS traffic below
the 2Mb/s, in this particular scenario, or they would witness
massive subscriber churn. However, only below this mark
were traffic savings noticeable. BNS shows a much better
behaviour, but it would also prevent the use of 0.5Mb/s, and
even 1Mb/s could be risky. ASR shows a behaviour very close
to that of ASR and BNS combined, and both methods would
allow P2P FS traffic to be limited to 1Mb/s without causing
major annoyance to subscribers.
Figure 4 shows the evolution of the traffic efficiency of the
several techniques under different traffic shaping restraints. It
is observable that the relative positions are always maintained.
However, we can see that ASR combined with BNS and,
especially, ASR by itself, react better to severe bandwidth
restrictions, increasing their efficiency. Under the 0.5 Mb/s
limit, the performance of the plain eDonkey/eMule protocol degraded,
as the reduced bandwidth becomes insufficient for both data
and control messages, reducing each peer's capacity to discover
new, closer-by peers.
Fig. 2. Amount of traffic exchanged among ISPs versus allowed bandwidth (x-axis: aggregate P2P FS traffic limit, Mb/s; y-axis: inter-ISP P2P FS traffic, GB; curves: Plain, ASR, BNS, ASR/BNS)
Fig. 3. Behaviour of download time with different bandwidths (x-axis: aggregate P2P FS traffic limit, Mb/s; y-axis: average download duration, s; curves: Plain, ASR, BNS, ASR/BNS)
IV. CONCLUSIONS
We analysed the impact of three different bandwidth saving
techniques, which have the potential to reduce ISPs' costs
with P2P FS traffic: Biased Neighbour Selection, Adaptive
Search Radius and Traffic Shaping. We’ve used a network
simulator and a faithful implementation of the eDonkey/eMule
protocol to analyse the impact of every combination of the
above techniques on the traffic carried and exchanged by an
ISP and on the download time perceived by subscribers.
We’ve concluded that the techniques are complementary,
and that they may be combined, all three, in order to achieve
the best results.
Traffic shaping, of the three the only technique which is
widely deployed, proved to be the least effective. Not only
does it provide modest traffic reduction, but it also renders P2P FS
unusable, due to the very high download times.
Fig. 4. Efficiency of the several techniques under different bandwidths (x-axis: aggregate P2P FS traffic limit, Mb/s; y-axis: efficiency, %; curves: Plain, ASR, BNS, ASR/BNS)
ASR and BNS, being very different by design, provide
similar results. BNS provided slightly faster downloads and
slightly higher inter-ISP traffic savings which result in more
immediate advantages for ISPs. However, ASR generates
slightly greater savings for the transit ISPs, which could result
in cheaper transit rates for regional ISPs. ASR also behaves
better when combined with extreme traffic shaping, providing
greater inter-ISP traffic savings and faster downloads.
The best results were, under every circumstance, achieved
by the combination of ASR and BNS.
REFERENCES
[1] PeerApp. (2007, Mar.) Comparing P2P Solutions. White Paper.
[Online]. Available: http://www.peerapp.com/docs/ComparingP2P.pdf
[2] Sandvine. (2004) Meeting the Challenge of Today’s Evasive P2P
Traffic. White Paper. [Online]. Available: http://www.sandvine.com/
general/getfile.asp?FILEID=16
[3] J. Chu, K. Labonte, and B. N. Levine, “Availability and Locality
Measurements of Peer-to-Peer File Systems,” in ITCom: Scalability and
Traffic Control in IP Networks, 2002.
[4] A. Klemm, C. Lindemann, M. K. Vernon, and O. P. Waldhorst, “Characterizing the query behavior in peer-to-peer file sharing systems.” in
Internet Measurement Conference, A. Lombardo and J. F. Kurose, Eds.
ACM, 2004, pp. 55–67.
[5] R. Bindal, P. Cao, W. Chan, J. Medval, G. Suwala, T. Bates, and
A. Zhangan, “Improving Traffic Locality in BitTorrent via Biased
Neighbor Selection,” in ICDCS, July 2006.
[6] P. K. Gummadi, R. J. Dunn, S. Saroiu, S. D. Gribble, H. M. Levy, and
J. Zahorjan, “Measurement, modeling, and analysis of a peer-to-peer
file-sharing workload.” in SOSP, 2003, pp. 314–329.
[7] T. Karagiannis, P. Rodriguez, and D. Papagiannaki, “Should ISPs
fear Peer-Assisted Content Distribution?” in ACM SIGCOMM/USENIX
IMC’05, Oct. 2005.
[8] R. L. Pereira, T. Vazão, and R. Rodrigues, “Adaptive Search Radius
- Lowering Internet P2P File-Sharing Traffic through Self-Restraint,”
in The 6th IEEE International Symposium on Network Computing and
Applications (IEEE NCA07), 2007.
[9] J. H. Cowie, D. M. Nicol, and A. T. Ogielski, “Modeling the Global
Internet,” Computing in Science & Engineering, vol. 1, no. 1, pp. 42–50,
1999.
[10] E. W. Zegura, K. L. Calvert, and S. Bhattacharjee, “How to model an
internetwork,” in INFOCOM, 1996, pp. 594–602.