CS 414 – Multimedia Systems Design
Lecture 36 – Introduction to P2P and P2P Streaming
Klara Nahrstedt
CS 414 - Spring 2011

Administrative
• MP3 preview is on April 27
• A sign-up sheet for the April 27 demonstration will be provided in class on April 27
• Preview demonstrations on April 27, 7-9pm in 216 SC

Administration
• Final MP3 delivery on April 29
• Demonstration time slots on April 29 will be allocated on Thursday, April 28 (the TA will email each group)
• Two demonstration intervals: students not in the Google competition demo 2-4pm in 216 SC; students in the Google competition demo 5-7pm in 216 SC
• Pizza and Google prizes; announcement at 7:15pm, April 29 (room 3403 SC)
• Homework 2 will be out on Wednesday, April 27; deadline May 4, 11:59pm, submission via Compass

Why P2P?
• People like to get together, share things, and share information
• How can I find others to share with? How can I find what they are sharing?

Why P2P? (continued)
• Sharing over the WWW? Not everyone can set up and maintain a website, and a site might not have a good enough search-engine rank
• Sharing on a social network? Only possible if you created the content, and people have to find you instead of the content

Napster
• Created in 1999 to share music
• Napster's servers store the file directory (filename/location)
• Clients/peers store the files

Napster, operation
1. Query (keyword)
2. Servers search using a ternary tree
3. Response
4. Ping candidates
5. Download

Napster's drawbacks
• Asymmetric operation: servers vs. client peers
• Scalability and congestion issues
• Single point of failure
• Napster was held responsible for abetting its users' copyright violations

Contemporary P2P systems
• Symmetric operation: all nodes (peers) have the same functionality
• The system naturally scales with new peers
• Basic operations: insert file, search file, delete file, and maintain an overlay network (on top of the physical network)

Contemporary P2P systems (classification)
• Usually classified by how peers connect to each other, i.e., how the overlay network is created and maintained
• Unstructured: no particular connection pattern (e.g., randomly connected). Examples: Gnutella, FastTrack, Skype, BitTorrent, etc.
• Structured: defines a distributed data structure (e.g., a distributed hash table). Examples: Chord, Pastry, CAN, etc.

Gnutella
• Servents (peers) store their own files and peer pointers (links to other peers)

Gnutella, searching (example: a query for "jazz")
1. Flood the query
2. Ignore repeated messages
3. Answer if there is a local match
4. Send the query hit back along the reverse path
5. Establish a connection and fetch the file
• Query message: <id, QUERY, ttl, hops, payload length, min speed, keywords>
• Query hit message: <id, QUERY HIT, ttl, hops, payload length, num hits, port, ip, speed, (file index, file name, file size), servent id>
• (A code sketch of this flooding search appears below, after the next slide.)

Gnutella, maintaining the overlay
1. Periodically flood a ping
2. Pongs are sent back along the reverse path
3. Update the neighbor list with received pongs (e.g., a node with neighbor list {A, V} that hears a pong from X adds X, giving {A, V, X})
• Why periodically? Because peers can leave or fail at any time
• Ping: <id, PING, ttl, hops, payload length (zero)>
• Pong: <id, PONG, ttl, hops, payload length, port, ip, num files, num KBs>
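A minimal sketch of the flooding search just described, in Python. The message fields mirror the slide, but handle_query, its arguments, and the send helper are hypothetical stand-ins for illustration, not the real Gnutella API or wire format:

    seen_ids = set()  # remember message ids so repeated messages are ignored (step 2)

    def handle_query(msg, from_peer, neighbors, local_files, send):
        """Handle one Gnutella-style QUERY message at a servent."""
        if msg["id"] in seen_ids:
            return                      # duplicate query: drop it silently
        seen_ids.add(msg["id"])

        # Step 3: answer if there is a local match; the QUERY HIT travels
        # back along the reverse path (step 4), i.e., to the peer that
        # forwarded the query to us.
        hits = [f for f in local_files if msg["keywords"] in f]
        if hits:
            send(from_peer, {"id": msg["id"], "type": "QUERY_HIT", "hits": hits})

        # Step 1, continued: flood to all other neighbors until the TTL expires.
        if msg["ttl"] > 1:
            forwarded = dict(msg, ttl=msg["ttl"] - 1, hops=msg["hops"] + 1)
            for peer in neighbors:
                if peer != from_peer:
                    send(peer, forwarded)

In a real servent, send would write the binary Gnutella message over a TCP connection to the neighbor; here it is left abstract.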
Gnutella: some issues
• Ping/pong messages constitute 50% of the traffic
• Flooding causes excessive traffic
• Repeated searches with the same keywords are flooded again and again
• Large number of freeloaders (70% of users in 2000)

DHTs (Distributed Hash Tables)
• A hash table allows these operations on an object identified by a key: insert, lookup, delete
• A Distributed Hash Table offers the same operations in a distributed setting (the objects could be files)

DHT performance comparison

    System        Memory                            Lookup latency   Lookup overhead
    Napster       O(1) at client; O(N) at server    O(1)             O(1)
    Gnutella      O(N)                              O(N)             O(N)
    Chord (DHT)   O(log N)                          O(log N)         O(log N)

Streaming from servers
• Bandwidth at the video service and the number of servers have to grow with demand
• Flash crowds have to be taken into account
• (Figure: many clients pulling streams from a bank of servers at the video service)

P2P streaming
• Use the participating nodes' bandwidth: more nodes watching the stream means more shared bandwidth
• Application-level multicast
• The live stream source could itself be a member of the P2P network

P2P streaming (continued)
• Common arrangements to multicast the stream: single tree, multiple trees, mesh-based
• All nodes are usually interested in the stream
• All of them have to deal with node dynamism (join/leave/fail/capacity changes)

Streaming in a single tree
• The coder turns frames into numbered packets (1, 2, 3, ...), which the source pushes down the tree using RTP/UDP

Single tree
• Peers interested in the stream organize themselves into a tree rooted at the source
• Nodes send as many copies of a data packet as they have children

Joining the tree (see the code sketch after the single-tree slides below)
• Find a node with spare capacity, then make it your parent
• If the contacted node lacks capacity, it answers "Try one of my children"; the child can be picked at random, round robin, or as the child closest to the joining node in the physical network

Leaving the tree or failing
• Orphan nodes (the children left behind when their parent leaves) need a new parent
• Policies for picking the new parent: children pick the source; all subtree nodes pick the source; children pick the grandfather; all subtree nodes pick the grandfather
• ...then repeat the join procedure

Single tree issues
• Leaves do not use their outgoing bandwidth
• Packets are lost while recovering after a parent leaves/fails
• Finding an unsaturated peer could take a while
• Tree connections could be rearranged for better transfer
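A minimal sketch of the join procedure above, assuming a hypothetical Node class (not from any real system); the loop walks down from the source until it finds a node with spare out-degree, using the random-child redirect policy:

    import random

    class Node:
        """Hypothetical peer in the streaming tree (illustrative only)."""
        def __init__(self, capacity):
            self.capacity = capacity   # max children this node's uplink can feed
            self.children = []
            self.parent = None

    def join(new_node, source):
        # Assumes every saturated interior node has at least one child,
        # which holds whenever capacity >= 1.
        candidate = source
        while len(candidate.children) >= candidate.capacity:
            # Saturated: "try one of my children" (random policy; the slides
            # also mention round robin and physical closeness).
            candidate = random.choice(candidate.children)
        candidate.children.append(new_node)
        new_node.parent = candidate

Under the "children pick the source" repair policy, each orphan would simply rerun join(orphan, source) after its parent leaves or fails.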
Multiple trees
• Peers organize into several trees rooted at the source
• Challenge: a peer must be an internal node in only one tree and a leaf in all the rest
• Are nodes 1, 2, 3 receiving the same data multiple times? No: each tree carries a different description of the stream, as explained next

Multiple Description Coding (MDC)
• The coder splits the frames into descriptions (description 0 through description n), each with its own packet stream
• Each description can be independently decoded (only one is needed to reproduce the audio/video)
• More descriptions received result in higher quality

Streaming in multiple trees using MDC
• Assume odd-bit/even-bit encoding: description 0 is derived from each frame's odd bits, and description 1 from its even bits
• Each description is streamed down its own tree (using RTP/UDP)
• (A code sketch of this odd/even split appears at the end, after the references.)

Multiple-tree issues
• Complex procedure to locate a potential parent peer with spare out-degree
• Degraded quality until a parent is found in every tree
• Static mapping in trees, instead of choosing parents based on their (and my) bandwidth
• An internal node can be a bottleneck

Mesh-based streaming
• Basic idea: report to peers the packets that you have; ask peers for the packets that you are missing; adjust connections depending on in/out bandwidth
• Nodes are randomly connected to their peers, instead of statically, and the mesh uses MDC (different neighbors can supply different descriptions)

Content delivery
• Nodes are arranged in levels determined by their hop count to the source
• Delivery proceeds in two phases: (1) a diffusion phase and (2) a swarming phase

Diffusion phase
• A new segment (a set of packets) of length L becomes available at the source every L seconds
• Level 1 nodes pull data units from the source, then level 2 nodes pull from level 1, etc.
• Recall that reporting ("Have segment 0") and pulling ("Send me segment 0") are performed periodically

Swarming phase
• At the end of the diffusion phase, all nodes have at least one data unit of the segment
• Nodes pull the missing data units from (swarm-parent) peers located at the same or a lower level
• Can node 9 pull new data units from node 16? Not in a single swarm interval

Buffering
• Due to diffusion, nodes need a buffer of size c*L, where c >= diffusion tree depth + minimum number of swarming intervals

Conclusion
• P2P technology is a very viable technology for multimedia distribution
• P2P streaming is a strong alternative to CDNs (Content Distribution Networks)
• Examples of P2P streaming technology: PPLive, Skype

Many more details in references and source code:
• M. Castro, P. Druschel, A.-M. Kermarrec, A. Nandi, A. Rowstron, and A. Singh, "SplitStream: High-Bandwidth Multicast in a Cooperative Environment," SOSP 2003.
• H. Deshpande, M. Bawa, and H. Garcia-Molina, "Streaming Live Media over Peers," Technical Report, Stanford InfoLab, 2002.
• N. Magharei and R. Rejaie, "PRIME: Peer-to-Peer Receiver-drIven MEsh-Based Streaming," INFOCOM 2007.
• N. Magharei, R. Rejaie, and Y. Guo, "Mesh or Multiple-Tree: A Comparative Study of Live P2P Streaming Approaches," INFOCOM 2007.
• http://freepastry.org
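As a closing illustration of the odd-bit/even-bit MDC example from the slides above, a minimal sketch; frames are toy bit lists and mdc_encode/mdc_decode are illustrative names, not a real codec:

    def mdc_encode(frame_bits):
        """Split a frame into two descriptions by bit position."""
        d0 = frame_bits[0::2]   # description 0: 1st, 3rd, 5th, ... (odd) bits
        d1 = frame_bits[1::2]   # description 1: 2nd, 4th, 6th, ... (even) bits
        return d0, d1

    def mdc_decode(d0=None, d1=None):
        """Decode from whichever descriptions arrived (at least one required)."""
        if d0 is not None and d1 is not None:
            out = []                      # both received: interleave exactly
            for a, b in zip(d0, d1):
                out.extend([a, b])
            return out
        d = d0 if d0 is not None else d1  # one received: approximate the
        out = []                          # missing bits by duplication
        for a in d:
            out.extend([a, a])
        return out

    frame = [1, 0, 1, 1, 0, 0, 1, 0]
    d0, d1 = mdc_encode(frame)
    assert mdc_decode(d0, d1) == frame    # full quality with both descriptions
    degraded = mdc_decode(d0)             # still decodable from one description

Each description would travel down its own tree, so losing one tree degrades quality instead of interrupting playback.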