Analyzing the Internet
Seminar 37-310
Polly Huang
11 April, 2000

Not quite yet
• We don't quite know how to analyze the Internet yet.
• What do I mean by analyzing the Internet?
  – determine how much buffer certain queues need
  – determine what form of flow control is more appropriate
  – decide where to place web caches

Series of two talks
• current status: what do we know
  – overview
  – review basic statistics
• the future: what can we do next
  – identify the missing pieces
  – put all the pieces together

Outline
• Internet
• different from the telephone network
• the telephone network experience doesn't help much
• from exponential to heavy-tailed

Internet basic components
• like the postal system
• nodes
  – end hosts and a smaller number of routers
  – homes and local/remote post offices
• links
  – connecting nodes (Ethernet, T1, T3, OC3, OC12, etc.)
  – roads/streets between homes and post offices

Internet basic constructions
• packets
  – with IP addresses (129.132.66.28)
  – with postal addresses (Gloriastrasse 35)
• protocols
  – packets sent with TCP (reliable)
  – packets sent with registered mail with confirmation
  – but no congestion control

Outline
• Internet
• different from the telephone network
• the telephone network experience doesn't help much
• from exponential to heavy-tailed

Telephone network
• nodes
  – telephones and switches
• links
  – connecting nodes

But connection-oriented
• reserved fully from one end to another
• no need for congestion control
• doesn't need to be 100% reliable
• calls blocked from time to time

Different in many ways
• network topology expansion
  – centralized vs. highly distributed
• traffic
  – although humans still initiate calls/web sessions
  – computers are doing most of the talking
  – (a sign of Poisson not working anymore)
• probably can't use the nice queuing theory they have!!!

Outline
• Internet
• different from the telephone network
• the telephone network experience doesn't help much
• from exponential to heavy-tailed

Theory for data network
• Maybe the telephone network experience will help

Planning telephone network
• Topology
  – big telephone companies know their own telephone network
• Traffic
  – voice phone connections were quickly identified as Poisson

Queuing theory
• the era of queuing theory emerged quickly
  – Poisson call arrivals
  – exponential call duration
  – Poisson mixed with Poisson is still Poisson
• pen and paper only
• (though problems start to appear with fax and Internet access)

Then, we ask:
• Is it as easy for the data network (a.k.a. the Internet)?

NO!
• Difficulties
  – topology
  – traffic

For topology
• changes are highly decentralized and highly dynamic
• no one knows what the entire network looks like at the moment

For data traffic
• computers are now doing most of the talking
• several studies have shown that data connections are not Poisson (or exponential)
• bye-bye queuing theory

Outline
• Internet
• different from the telephone network
• the telephone network experience doesn't help much
• from exponential to heavy-tailed

Curve fitting?
• too many parameters for a perfect fit
• nothing seems to be typical
  – MCI backbone, Sept. 1997, 70% HTTP
  – UCB Internet link, Dec. 1997, 37% HTTP
  – Mar 1998, LBNL, median transfer 10,900 bytes
  – Dec 1998, LBNL, median transfer 5,600 bytes

Traces?
• the Internet changes as we speak
• a small difference in the network conditions may lead to very different results (a non-linear system)

Invariants
• must search for 'invariants' that don't change with time or location?

Heavy-tailed
• it turned out computer processes tend to be heavy-tailed or power-law distributed!
  – CPU time consumed by Unix processes
  – size of Unix files
  – size of compressed video frames
  – size of FTP bursts
  – Telnet packet interarrivals
  – size of Web items
  – Ethernet bursts

How to tell?

Review some Statistics
• density vs. distribution
• Poisson
• exponential
• Pareto
• self-similarity

Density vs. Distribution
• Density is the probability of a certain event happening
  – $f(x)$
• Distribution usually refers to the cumulative density
  – $f(0) + f(dz) + f(2\,dz) + \dots + f(x)$
  – $F(x) = \int_0^x f(z)\,dz$

Exponential
• # of time units between events
• $f(x) = c\,e^{-cx}$

Example exponential process
[Figure: example exponential process]

Poisson
• # of events per time unit
• $f(x) = c^{x} e^{-c} / x!$

Example Poisson process
[Figure: example Poisson process]

Pareto
• one of the heavy-tailed distributions
• $f(x) = c\,k^{c} / x^{c+1}$

Example Pareto process
[Figure: example Pareto process]

Distinguishing them
• density
• log density
• log-log density

[Figure panels: Density, Log Density, Log-Log Density]

Teletraffic vs. Data traffic
• Teletraffic: exponential interarrivals, exponential durations
• Data traffic: exponential interarrivals, heavy-tailed durations

Animation
• Show the telephone vs. Internet traffic demo

Self-similarity
• Distributions of #packets per time unit look alike at different time scales
[Figure: Sierpinski triangles]

Wavelet Analysis
• FFT: frequency decomposition, $d_j$
• WT: frequency and time decomposition, $d_{j,k}$
• $E_j = \sum_k d_{j,k}^2 / N_j$
• $E_j = 2^{j(2H-1)} C$ (The magic!!)
• Self-similar: $\log_2 E_j = (2H-1)\,j + \log_2 C$
[Figure: $\log_2 E_j$ plotted against scale $j$]

'Shape' of self-similarity
[Figure panels: Self-similar, Periodic, Multifractal?]

Revisit the original goal
• Can we analyze data networks?
  – Topology???
  – Traffic?
    • Poisson arrival
    • heavy-tailed duration
    • self-similar aggregated traffic
• Pure analytical modeling for data networks?
  – a.k.a. only pen and paper?

NO!
• probably not within the next few years
• confirmed by the experts

A few Reasons
• can't use well-known self-similar (or fractal) processes
• not exactly self-similar
• the 'shape' of self-similarity changes with the network conditions
• don't know what 'self-similar' processes add up to (mathematically difficult)

No more math!
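The "Distinguishing them" slide above suggests comparing density, log density, and log-log density. A minimal sketch of why that works, using the closed-form exponential and Pareto densities from the slides: the exponential density is a straight line on a semi-log plot (slope -c), while the Pareto density is a straight line on a log-log plot (slope -(c+1)). The parameter values `C`, `K`, the sample points `XS`, and all function names are assumptions chosen for illustration, not values from the talk.

```python
import math

# Illustrative parameters (assumptions, not from the talk).
C = 1.5   # rate of the exponential, shape of the Pareto
K = 1.0   # Pareto scale: the density is defined for x >= K
XS = [1.0, 2.0, 4.0, 8.0, 16.0]


def exponential_density(x: float, c: float) -> float:
    """f(x) = c * exp(-c * x)."""
    return c * math.exp(-c * x)


def pareto_density(x: float, c: float, k: float) -> float:
    """f(x) = c * k**c / x**(c + 1), valid for x >= k."""
    return c * k**c / x ** (c + 1)


def slopes(xs, ys):
    """Slope of each straight segment between consecutive (x, y) points."""
    points = list(zip(xs, ys))
    return [(y1 - y0) / (x1 - x0)
            for (x0, y0), (x1, y1) in zip(points, points[1:])]


log_xs = [math.log(x) for x in XS]
log_exp = [math.log(exponential_density(x, C)) for x in XS]
log_par = [math.log(pareto_density(x, C, K)) for x in XS]

# Semi-log view (log density vs. x): the exponential is a straight line.
exp_semilog_slopes = slopes(XS, log_exp)

# Log-log view (log density vs. log x): the Pareto is a straight line.
pareto_loglog_slopes = slopes(log_xs, log_par)

# Semi-log view of the Pareto: the curve keeps flattening (the heavy tail).
pareto_semilog_slopes = slopes(XS, log_par)

print("exponential, semi-log slopes:", exp_semilog_slopes)
print("Pareto, log-log slopes:     ", pareto_loglog_slopes)
print("Pareto, semi-log slopes:    ", pareto_semilog_slopes)
```

With measured data one would plot an empirical histogram instead of the closed forms, but the same reading applies: a line on the semi-log plot points to an exponential, a line on the log-log plot points to a power law.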
• 20 min break

Series of two talks
• current status: what do we know
  – overview
  – basic statistics review
• the future: what can we do (a new research project)
  – identify the missing pieces
  – put all the pieces together

The project
• Modeling and Simulation for Large-scale Data Networks

Project goals
• identify (at least) high-level user and system characteristics
• run simulations at the scale of the Internet with significant packet-level detail on a PC
• or a cluster of low-cost PCs

A beautiful picture
• graduate students happily analyzing protocols in realistic Internet topology and traffic setups on their home PC (or a cluster of low-cost PCs)
• (might not come true exactly)

Two parts
• modeling
• simulations

Review the models we know
• characteristics of the connectivity and routing
• characteristics of the bandwidth
• characteristics of user behavior
• characteristics of the object size distribution
• characteristics of client/server location

Take the simulation approach
• set up the network topology
• select clients and servers
• generate traffic

Validation
• verify simulation results against the self-similarity (or multifractal nature) of the aggregated traffic

Two parts
• modeling
• simulations

Scalable Simulation
“[To simulate] five minutes of activity on a network the size of today’s Internet would require gigabytes of real memory and months of computation on today’s 100 MIPS uniprocessors.”
--- Ahn and Danzig, 1996

Scaling Solutions
• Parallel and distributed simulation
• Implementation tuning
• Abstraction

Abstraction Techniques
• Large-scale network topology
  – Algorithmic routing
• Large-scale network traffic
  – Finite state automata modeling

Bottlenecks
• Topology: routing information
• Traffic: TCP

Routing Table Cost
• All-pairs shortest-path routing
  – $O(N^2)$
• Hierarchical routing
  – $O(N \lg N)$
• Algorithmic routing
  – $O(N)$

Algorithmic Routing
• Next hop lookup
• Topology mapping: arbitrary -> tree

Route Lookup
[Figure: binary tree with numbered nodes; node a has parent (a-1)/2 and children 2a+1 and 2a+2]
• next_hop(a,b)
  – walk up from b to the root by b = (b-1)/2 (integer division)
  – if reaching a, return the last node visited
  – else, return (a-1)/2
• Examples: next_hop(10,44) = 21, next_hop(1,45) = 4, next_hop(5,43) = 2

Topology Mapping
[Figure: an arbitrary topology is traversed by BFS and its node numbers are re-assigned; labeled O(N)]

Evaluation
[Figure, two panels: Memory (MB) vs. # nodes for flat, hierarchical, and algorithmic routing, annotated "Ns-2 routing allocation artifact"; Route length difference (# hops or %) vs. # nodes]
• Transit-stub topologies
• Short cycles

TCP
[Figure: nodes n0 and n1]
• Slow start -> congestion avoidance
• Retransmission and timeouts
• A lot of variables per TCP connection!!
• A batch of packets per RTT or timeout

FSA TCP
• Coarse-grain TCP behavior
• FSA for short TCP connections
  – Number of packets sent per round-trip time or timeout
  – Combinations of packet drops
• Preservation of the closed-loop feedback control (the KEY property)

Reno TCP (Partial)
[Figure: partial state diagram for Reno TCP; states are labeled (wnd, ssh), starting from (1,2)]

Evaluation
[Figure, two panels: Memory (MB) vs. # of web sessions for detailed TCP and FSA TCP; % difference in throughput vs. # of web sessions]
• ISP-like topology
• Each web session generates ~200 TCP connections

Case Study: Self-similar Traffic
• SIGCOMM 99, Anja Feldmann et al.
• Internet traffic characteristics
  – Large scale: self-similarity (user-related factors)
  – RTT scale: periodicity (TCP closed-loop control)
  – Small scale: multifractal? (TCP ack clocking?)
• Wavelet-based analysis: global scaling plot to detect self-similarity and periodicity

Evaluation
[Figure panels: Self-similar, Periodic, Multifractal?]
• FSA TCP's delay difference is ~10 msec!!
• Time series are taken every 10 msec!!
• Not appropriate for multifractal analysis!!!
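The Route Lookup slide above describes next-hop lookup on a tree whose nodes are numbered so that node a has parent (a-1)/2 and children 2a+1 and 2a+2. A minimal Python sketch of that rule follows. The function name mirrors the slide; the use of integer division, the handling of `a == b`, and the rejection of negative IDs are assumptions added for illustration and are not stated in the talk.

```python
def next_hop(a: int, b: int) -> int:
    """Next hop from node a toward node b in a heap-numbered binary tree.

    Node n has parent (n - 1) // 2 and children 2n + 1 and 2n + 2.
    No routing table is stored; the route is computed from the IDs alone.
    """
    if a < 0 or b < 0:
        raise ValueError("node IDs must be non-negative")  # assumption
    if a == b:
        return a  # assumption: already at the destination

    # Walk up from b toward the root (node 0). If we pass through a,
    # then b lies in a's subtree and the next hop is the child of a
    # we just came from, i.e. the last node visited.
    node = b
    while node > 0:
        parent = (node - 1) // 2
        if parent == a:
            return node
        node = parent

    # a is not an ancestor of b, so the route must go up through a's parent.
    return (a - 1) // 2


# The three examples given on the slide.
print(next_hop(10, 44))  # 21
print(next_hop(1, 45))   # 4
print(next_hop(5, 43))   # 2
```

Each lookup walks at most the depth of the tree, so the per-node state is constant and the total memory grows linearly with the number of nodes, which matches the O(N) entry on the Routing Table Cost slide.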

Future in modeling
• client and server location
• aggregated traffic
  – explaining the shapes of self-similarity
  – temporal and spatial correlation
• impact of web caching, IP telephony, pricing, diffserv (QoS) on existing models
• classifying the Internet

Future in simulation
• Algorithmic routing
  – tree search algorithm
  – optimizations
  – Internet topology
• FSA TCP
  – automatic FSA generation (Markov chain model for short TCP connections)
  – packet batch representatives

You can be in that future!

A Few Tips to Prepare Slides
• must have outlines/roadmaps/overview
• a picture is worth a thousand words
• keep it to fewer than 3 bullets per slide
• http://www.diz.ethz.ch/dienstleistungen/unterlagen/ssp_unterlagen.html