Peer to Peer Networking
Network Models
=> Mainframe
- Ex: Terminal
- User needs a direct connection to the mainframe
- Secure
- Account driven, administrator controlled
- Batch process oriented
- Data storage on the server only
Network Models
=> Client/Server
- Ex: WWW
- User interface, business rules
- Backend database
- Data storage on the server, client
- N-tier architecture
- Hierarchical
Network Models
=> Distributed Architecture
- Ex: BitTorrent
- Tasks may be parallel or autonomous
- Computation is done at the edges
- Geographically distributed through a single interface
- Data storage is distributed
Generations of P2P
- 1st Generation: Centralized file list
  Napster
  Whoever controls the central file list is legally responsible
- 2nd Generation: Decentralized file lists
  Gnutella, FastTrack
  Improvements: optimizations of decentralized search
- 3rd Generation: No file lists
  Freenet, WASTE, Entropy, MUTE
  Anonymity built in
The Good, Bad, and Ugly of P2P
The Good
Security based on social contract
Free exchange of ideas
Everyone’s computer can contribute to the greater good
The Bad
Avoids most security: Can be used for piracy
The Ugly
Often targeted by RIAA and others for piracy
Peer-to-Peer Concepts
Bootstrapping
Finding peers to connect to
Peer Discovery
Finding other peers in the system
Content Location
Finding a peer with the desired content
Content Delivery
Downloading from selected peer or peers
Napster
- Bootstrapping & Peer Discovery
  Centralized server
- Content Location
  Tell server IP address & filenames
  Send query to server, which returns a list of peers
- Content Delivery
  Download from a single peer
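The centralized content-location step can be sketched as a small in-memory directory. This is a minimal illustration, not Napster's actual protocol: the class name `CentralIndex`, its methods, the substring matching rule, and the example addresses are all assumptions made for the sketch.

```python
from collections import defaultdict


class CentralIndex:
    """Toy central directory: maps filenames to the peers that announced them."""

    def __init__(self):
        self._peers_by_file = defaultdict(set)

    def register(self, peer_addr, filenames):
        # A peer tells the server its address and the files it shares.
        for name in filenames:
            self._peers_by_file[name.lower()].add(peer_addr)

    def unregister(self, peer_addr):
        # Remove a disconnected peer from every file entry.
        for peers in self._peers_by_file.values():
            peers.discard(peer_addr)

    def query(self, keyword):
        # Return {filename: sorted peer list} for filenames containing the keyword.
        keyword = keyword.lower()
        return {
            name: sorted(peers)
            for name, peers in self._peers_by_file.items()
            if keyword in name and peers
        }


index = CentralIndex()
index.register("10.0.0.1:6699", ["Song_A.mp3", "song_b.mp3"])
index.register("10.0.0.2:6699", ["song_a.mp3"])
hits = index.query("song_a")
```

The server only stores names and addresses; the download itself would then go directly to one of the returned peers.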
Napster in court
- Napster claimed they were not infringing copyright because they were not storing any songs
- Shut down by court injunction because the case against them was likely to succeed
  Napster users likely guilty of direct copyright infringement: copying of a work by another
  Napster likely guilty of contributory infringement because they learned of the infringement and failed to purge the materials from their system
  Napster likely guilty of vicarious infringement because they supervised or controlled the party engaging in infringing activity and had a financial interest in the activities
Gnutella
- Peer-to-peer networking: applications connect to peer applications
- Focus: decentralized method of searching for files
- Each application instance serves to:
  - store selected files
  - route queries (file searches) from and to its neighboring peers
  - respond to queries (serve file) if file stored locally
- Gnutella history:
  - 3/14/00: release by AOL, almost immediately withdrawn
  - too late: 10K users managed to download
Gnutella
Bootstrapping
First time: connect to a peer that you heard about outside of Gnutella
Keep a cache of peers discovered for later use
Peer Discovery
Try to always be connected to a fixed number of peers
Send ping message, flooded to neighbors
Respond to ping with pong
Pong contains IP address, port, # files, # KB shared
Gnutella: Content Location
Searching by flooding:
- If you don’t have the file you want, query 7 of your partners.
- If they don’t have it, they contact 7 of their partners, for a maximum hop count of 10.
- Requests are flooded, but there is no tree structure.
- No looping, but packets may be received twice.
- No prioritization mechanism
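The flooding search above can be sketched as a breadth-first walk over the overlay with a hop limit. This is a simulation under stated assumptions, not the Gnutella wire protocol: `flood_query`, `worst_case_messages`, and the dictionary-based graph are hypothetical names, and forwarding stops at a peer that has the file because that is how the slide words it (deployed clients may forward regardless). Each peer remembers that it has seen the query, which is what prevents loops while still allowing duplicate deliveries.

```python
from collections import deque


def flood_query(neighbors, files, origin, wanted, ttl=10):
    """Flood a query from origin; return (peers holding the file, messages sent)."""
    seen = {origin}      # peers that have already handled this query
    hits = set()
    messages = 0
    queue = deque([(origin, ttl, None)])
    while queue:
        node, hops_left, came_from = queue.popleft()
        if node != origin and wanted in files.get(node, ()):
            hits.add(node)
            continue     # per the slide: only peers without the file pass it on
        if hops_left == 0:
            continue     # hop budget used up, the query dies here
        for peer in neighbors.get(node, ()):
            if peer == came_from:
                continue         # do not echo the query back to its sender
            messages += 1        # a copy goes out on the wire either way
            if peer in seen:
                continue         # duplicate delivery: dropped, not re-flooded
            seen.add(peer)
            queue.append((peer, hops_left - 1, node))
    return hits, messages


def worst_case_messages(fanout=7, max_hops=10):
    # Upper bound if every peer forwards to `fanout` fresh partners at each hop.
    return sum(fanout ** hop for hop in range(1, max_hops + 1))


overlay = {
    "A": ["B", "C"],
    "B": ["A", "C", "D"],
    "C": ["A", "B", "D"],
    "D": ["B", "C", "E"],
    "E": ["D"],
}
shared = {"E": {"song.mp3"}}
found, sent = flood_query(overlay, shared, "A", "song.mp3")
```

In this five-peer overlay several copies of the query reach peers that have already seen it, which shows why flooding costs far more messages than the number of peers reached.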
Gnutella
Content Delivery
Direct download from peer
If peer is behind a firewall, ask it to connect to you
If you are both behind a firewall: too bad
Problems
No explicit rate limiting on ping frequency or query frequency, which can overload the network
Slow peers can hinder faster peers
Free Riding
We want to move from the client/server architecture towards a robust, decentralized P2P architecture.
But due to free riding, we end up with something in between: a few peers still serve most of the content.
Free Riding Characteristics
- Exhibits a Pareto distribution of sharers (many people have small hard disks, small bandwidth and small hearts; few have large)
- Hurts overall resiliency and network throughput
- The move away from the traditional star(s) topology is less than one would wish.
- Equilibrium far away from global optimum
Free riding statistics on Gnutella
66% of hosts share no files
73% of hosts share ten or fewer files
Top 1% shares 40% of the files in the network and answers 50% of the queries
Top 20% share 98% of the files
61% never answered a query (no one wants their files)
Gnutella
- Group Leaders
  - Ultrapeers
  - Low bandwidth peers connect to group leader
  - Queries through group leaders
  - Cached hash tables
  - Hits include estimate of upload speed
- Protocol extensions
  - Parallel download
  - Persistent, location independent filenames (URNs)
  - LAN multicast
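The group-leader idea can be sketched as follows: each low-bandwidth peer hands its leader a keyword table, and the leader forwards a query only to the leaves whose cached table could match. This is a simplified illustration; the `Ultrapeer` class, its method names, and the plain keyword sets (instead of hashed tables) are assumptions for the sketch, not the actual Gnutella extension.

```python
class Ultrapeer:
    """Toy group leader that answers for its low-bandwidth leaf peers."""

    def __init__(self):
        self._keywords_by_leaf = {}

    def attach_leaf(self, leaf_addr, filenames):
        # Cache a keyword table for the leaf so queries are not sent to it blindly.
        words = set()
        for name in filenames:
            words.update(name.lower().replace(".", " ").replace("_", " ").split())
        self._keywords_by_leaf[leaf_addr] = words

    def route_query(self, query):
        # Forward only to leaves whose cached table contains every query word.
        wanted = set(query.lower().split())
        return sorted(
            leaf
            for leaf, words in self._keywords_by_leaf.items()
            if wanted <= words
        )


leader = Ultrapeer()
leader.attach_leaf("leaf1", ["Free_Jazz.mp3"])
leader.attach_leaf("leaf2", ["jazz_live.ogg"])
targets = leader.route_query("free jazz")
```

Leaves that cannot match never see the query, which is the point of routing searches through the better-connected peers.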
Along came BitTorrent
Author: Bram Cohen
Based on Tit-for-tat
Incentive - Uploading while downloading
Pieces of files
BitTorrent
Bootstrapping
Download a .torrent file from a web server
Contact listed tracker for list of peers
Peer Discovery
Periodically contact tracker
Content Location
Check with each peer to determine which blocks they have
Download rarest blocks first
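Rarest-first selection can be sketched by counting how many peers advertise each block and requesting the least-replicated missing ones first. The function name `rarest_first` and the set-based representation of what each peer has are assumptions for this sketch; ties are broken by index here only to keep the result deterministic, whereas a real client would more likely break them at random.

```python
from collections import Counter


def rarest_first(have, peer_blocks):
    """Return missing block indices ordered from rarest to most common.

    have: set of block indices we already hold.
    peer_blocks: {peer: set of block indices that peer says it has}.
    """
    availability = Counter()
    for blocks in peer_blocks.values():
        availability.update(blocks)
    candidates = [b for b in availability if b not in have]
    # Fewest holders first; ties broken by index (an assumption of this sketch).
    return sorted(candidates, key=lambda b: (availability[b], b))


order = rarest_first(
    have={0},
    peer_blocks={"A": {0, 1, 2}, "B": {0, 1}, "C": {0, 1, 3}},
)
```

Blocks 2 and 3 are each held by one peer, so they are requested before block 1, which all three peers have.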
BitTorrent - Content Delivery
- Seed
  A server which has the entire file
  Other peers may also act as a seed if they linger after downloading the file
- Parallel Download
- Incentives
  Serve content to k connections at a time
  Serve to connections that give you the most
  Periodically serve to a random connection to see if it can do better than current connections
Overall Architecture
[Figure: a web server hosts the .torrent file, which holds the URL of the tracker, the piece hashes <hash1, hash2, ..., hashn>, the piece length, and the name, length, and files. The tracker keeps a peer cache and state information. The downloader ("US") exchanges pieces with peers A, B, and C, two of which are leeches and one a seed.]
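The per-piece hashes in the .torrent metadata let a downloader check every piece it receives. A minimal sketch, assuming SHA-1 digests over fixed-size pieces as in the standard BitTorrent metainfo format; the helper names `piece_hashes` and `verify_piece` and the tiny piece length are hypothetical and chosen only for illustration.

```python
import hashlib


def piece_hashes(data, piece_length):
    """Split data into fixed-size pieces and return one SHA-1 digest per piece."""
    return [
        hashlib.sha1(data[i:i + piece_length]).digest()
        for i in range(0, len(data), piece_length)
    ]


def verify_piece(index, piece, expected_hashes):
    # Accept a piece only if its digest matches the entry from the metadata.
    return hashlib.sha1(piece).digest() == expected_hashes[index]


content = b"abcdefghij"
hashes = piece_hashes(content, piece_length=4)   # pieces: abcd, efgh, ij
ok = verify_piece(2, b"ij", hashes)
corrupt = verify_piece(0, b"abcx", hashes)
```

A piece that fails the check is discarded and requested again, so a corrupted or forged piece is not passed on to other peers.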
Peer Selection (tit for tat)
Incentive Mechanism
Choking Algorithm
Temporary refusal to upload, re-evaluated every 10s
Based solely on download rate (tit for tat)
Optimistic Unchoking
Rotating peer to optimistically unchoke
Discovers whether currently unused connections are better than the ones in use
Anti-snubbing
When a peer receives no data from another peer in 60s, it assumes it has been snubbed by that peer and refuses to upload to it except through optimistic unchoking
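One round of the choking decision can be sketched as: keep the peers that have recently sent us the most data, plus one optimistic slot. This is a simplified illustration, not the reference client's code: `choose_unchoked`, the default of 4 regular slots, and the way the optimistic peer is carried between rounds are assumptions, and anti-snubbing is left out.

```python
import random


def choose_unchoked(download_rates, slots=4, optimistic=None, rng=random):
    """Pick the peers to upload to for the next choking interval.

    download_rates: {peer: bytes/s recently received from that peer}.
    slots: number of regular unchoke slots (assumed value, not from the slides).
    optimistic: peer currently holding the optimistic-unchoke slot, if any.
    """
    # Tit for tat: reward the peers that have been sending us the most data.
    ranked = sorted(download_rates, key=download_rates.get, reverse=True)
    unchoked = set(ranked[:slots])
    # Optimistic unchoke: one extra peer gets a chance regardless of its rate.
    if optimistic is None or optimistic in unchoked:
        others = [p for p in ranked if p not in unchoked]
        optimistic = rng.choice(others) if others else None
    if optimistic is not None:
        unchoked.add(optimistic)
    return unchoked, optimistic


rates = {"a": 50, "b": 40, "c": 30, "d": 20, "e": 10, "f": 0}
unchoked, lucky = choose_unchoked(rates, slots=3, rng=random.Random(0))
```

The caller would run this on each re-evaluation (every 10s in the slides) and pass `optimistic=None` whenever it wants to rotate the optimistic slot to a different peer.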
Strengths
Better bandwidth utilization
Up to 7 MB/s from the Internet.
Limit free riding – tit-for-tat
Coupled upload and download
Spurious files not propagated
Ability to resume a download
Weaknesses and Open Issues
In practice, the seed does a disproportionate amount of work
Peer selection strategy
Can we do better than random?
Block selection strategy
Rarest first?
How well do incentives work?