Network Services in the Next-Generation Internet
Tilman Wolf
Department of Electrical and Computer Engineering
University of Massachusetts Amherst

Need for Clean-Slate Network Architecture
Limitations of current architecture
• Fixed TCP/IP stack
• Hardware implementation of forwarding
• Extensions are “hacks”
  − Firewalls, intrusion detection systems, network address translation
Need for new network architecture
• Support for more heterogeneity
  − End systems: cell phones, PDAs, RFID tags, sensors
  − Routers: wireless infrastructure, ad-hoc networks
• Support for new networking paradigms
  − Data access: content distribution, content addressable networks
  − Protocols: multipath routing, network coding
[Figure labels, functions per node type]
• End system: IP security, TCP termination
• Access router: access concentration (cable, DSL, wireless), network address translation, policy-based QoS, monitoring and billing, firewall
• Edge router: packet classification, QoS (DiffServ), monitoring and billing
• Core router: multiprotocol label switching, QoS-aware routing, monitoring
• Server: content-based switching, firewall, SSL termination, IP security

Network Virtualization
Virtualization of router system
• Common hardware (“substrate”)
• Coexistence of multiple virtual networks
• Specialized networks deployed as separate protocol stacks (“slices”)
Programmability in data plane
• Deployment of new protocols through software
Questions
• How to deploy new functionality in empty slice?
  − Per-connection functionality and network-wide functionality
• How to manage processing resources in substrate?
[Figure: parallel “protocol stacks” on a substrate router]

Outline
Introduction
Network Services
• Architecture
• Routing with network services
Packet processing systems
• Runtime management
Conclusions

Flexibility vs. Manageability
Customization in network architecture
• What is the right level of flexibility?
Two extremes
• ASIC implementation of IP router
  − All packets are handled the same way
  − No flexibility
• Active networks
  − Packet processing can be programmed
  − Too much flexibility, very difficult to manage
Our approach: balanced combination
• Set of well-defined protocol processing features (“network services”)
  − E.g., reliability, security, scheduling, …
• Custom combination of services provides flexibility
[Figure: network services spread across the application, transport, and network layers: transcoding, caching, flow control, SSL, privacy, QoS scheduling, multicast, reliability, anycast, IDS]

Network Service Architecture
New communication abstraction
• Custom composition of functions along end-to-end path
• Expressed as sequence of “network services”
Benefits
• End-system application can choose most suitable features
• Network can control placement of services
• Programmable routers implement network services
[Figure: end systems attached to a service-enabled network; a connection request passes through Service 1 and Service 2]

Related Work
Protocol stack composition on end-system (“vertical”)
• Configurable protocol stacks [Bhatti & Schlichting, SIGCOMM 1995]
• Configurable protocol heaps [Braden, Faber & Handley, CCR 2003]
• NCSU SILO project [Dutta et al., ICC 2007]
Custom network processing (“horizontal”)
• Active networks [Tennenhouse & Wetherall, SIGCOMM 1996]
• Modular routers: Click [Kohler et al., TOCS 2000]
• Programmable routers and network processors
Substrate systems
• Router virtualization: VINI (Princeton), SPP (Washington University)
• Forwarding substrate: OpenFlow (Stanford), PoMO (Univ. of Kentucky)
Our focus: abstractions for horizontal composition

Network Service Architecture
Hierarchical inter-network and intra-network design
• Autonomous System abstraction
• Match with administrative boundaries of Internet
Control plane
• Connection setup
• Routing algorithm
Data plane
• Forwarding
• Packet processing
[Figure: service-enabled networks, each with a service controller in the control plane and service nodes in the data plane, connecting end systems]

Connection Setup
Interface to applications
• API similar to Berkeley sockets
• Service specification determines sequence of requested services
Example:
*:*>>compression(LZ)>>decompression(LZ)>>192.168.1.1:80
• Connection to 192.168.1.1 port 80
• Compression (Lempel-Ziv) on path
Options
• Parameters necessary for service (e.g., LZ)
• Constraints on service placement (e.g., sending LAN, receiving LAN)
[Figure: message sequence over time among end system, service controller, service node 1, and service node 2: connection request, mapping, service setup, resource allocation, setup ack, connection ack, then data transmission with service processing]

Multiparty Interests
Connection setup can be influenced by multiple parties:
• Sender (connection to destination)
• Receiver (e.g., use of proxy)
• Network service provider (e.g., monitoring)
Explicit addition of services
• Sender: *:*>>128.119.85.114:80
• Receiver: *:*>>proxy>>128.119.85.114:80
• Network service provider: *:*>>monitoring>>*.*
• Combined: *:*>>monitoring>>proxy>>128.119.85.114:80

Service Routing Problem
Interesting problem at connection setup
• Determine path and select nodes to perform service
• How can a node decide best path?
  − Better to perform service locally?
  − Better to defer to downstream node?
  − Which direction to route connection?
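Before any route can be chosen, the service specification from the connection setup and multiparty examples has to be split into its endpoints and its ordered services. The following is a minimal sketch of that step, assuming only the `>>` separator and the `name(param, ...)` form visible on the slides. The names `Service`, `ConnectionSpec`, and `parse_spec` are hypothetical and not part of the presented system, and nested specifications (such as a `multicast(...)` that itself contains `>>`) are deliberately not handled.

```python
import re
from dataclasses import dataclass, field
from typing import List


@dataclass
class Service:
    """One requested network service, e.g. compression(LZ)."""
    name: str
    params: List[str] = field(default_factory=list)


@dataclass
class ConnectionSpec:
    """Source endpoint, ordered services, destination endpoint."""
    source: str
    services: List[Service]
    destination: str


# A service is a word optionally followed by a parenthesised parameter list.
_SERVICE_RE = re.compile(r"^(\w+)(?:\((.*)\))?$")


def parse_spec(spec: str) -> ConnectionSpec:
    """Parse a flat specification such as '*:*>>compression(LZ)>>host:80'."""
    parts = [p.strip() for p in spec.split(">>")]
    if len(parts) < 2 or not all(parts):
        raise ValueError(f"need at least source>>destination: {spec!r}")
    source, *middle, destination = parts
    services = []
    for item in middle:
        match = _SERVICE_RE.match(item)
        if match is None:
            raise ValueError(f"cannot parse service {item!r}")
        name, raw = match.group(1), match.group(2)
        params = [a.strip() for a in raw.split(",")] if raw else []
        services.append(Service(name, params))
    return ConnectionSpec(source, services, destination)


if __name__ == "__main__":
    print(parse_spec("*:*>>compression(LZ)>>decompression(LZ)>>192.168.1.1:80"))
    print(parse_spec("*:*>>monitoring>>proxy>>128.119.85.114:80"))
```

The ordered `services` list is what a routing step would then have to place on nodes along the path.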
Assumption: single cost metric
• Otherwise NP complete
Centralized solution
• Global view necessary
• Limited scalability
Distributed solution
• Dynamic programming
[Figure: path from s to t through nodes whose role is still undecided (“?”)]

Distributed Service Matrix Routing
Similar to Distance Vector routing
• Each neighbor announces cost of best path to each destination
• Each node adds cost to neighbor and picks best router (Bellman-Ford)
Distributed Service Matrix Routing (DSMR)
• Expand vector to include service: “service matrix”
  − Periodic service matrix exchange
  − Service matrices stabilize eventually
• Each node can determine best path
  − Handle service locally OR
  − Send to neighbor with lowest cost
• Challenge: each service combination requires columns in matrix
  − Exponential growth of matrix with number of services
[Figure: service matrix with one row per destination (v1, v2, ...) and one column per service sequence (no service, S1, S2, S1S2, S2S1, ...); each entry is a path cost]

Approximate DSMR
Use information from single service only
• Matrix lists node where service is performed
Upper bound on path
• Allocate best node for last service
• Find best path for second-to-last service to that node
• Repeat for all services
[Figure: service matrix with one row per destination (v1, v2, ...) and one column per single service (no service, S1, S2, S3, ...); service entries give the cost and the node where the service is performed, e.g., 6,v3]
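The service matrix update described above (handle the next service locally, or forward to the neighbor with the lowest announced cost) can be sketched as a Bellman-Ford style relaxation. This is a single-process illustration under stated assumptions, not the protocol implementation from the talk: the function name `dsmr`, the `links` and `service_cost` dictionaries, and the example topology are all hypothetical, costs are assumed positive, and the periodic matrix exchange between nodes is replaced by repeated in-place relaxation until nothing changes.

```python
import math


def dsmr(links, service_cost, destinations, sequence):
    """Return matrix[node][(dest, remaining_services)] = least cost.

    links:        {(u, v): cost} for undirected links (hypothetical input format)
    service_cost: {node: {service: processing cost}} for nodes offering a service
    sequence:     ordered list of requested services
    """
    nodes = sorted({n for edge in links for n in edge})
    neighbors = {n: {} for n in nodes}
    for (u, v), cost in links.items():
        neighbors[u][v] = cost
        neighbors[v][u] = cost

    # One matrix column per suffix of the requested sequence (shortest first).
    suffixes = [tuple(sequence[i:]) for i in range(len(sequence), -1, -1)]
    matrix = {
        n: {(t, s): math.inf for t in destinations for s in suffixes}
        for n in nodes
    }
    for t in destinations:
        matrix[t][(t, ())] = 0.0  # at the destination with nothing left to do

    changed = True
    while changed:  # stands in for periodic exchange until matrices stabilize
        changed = False
        for n in nodes:
            for t in destinations:
                for s in suffixes:
                    best = matrix[n][(t, s)]
                    # Option 1: send to the neighbor with the lowest cost.
                    for u, cost in neighbors[n].items():
                        best = min(best, cost + matrix[u][(t, s)])
                    # Option 2: handle the next service locally.
                    if s and s[0] in service_cost.get(n, {}):
                        local = service_cost[n][s[0]] + matrix[n][(t, s[1:])]
                        best = min(best, local)
                    if best < matrix[n][(t, s)]:
                        matrix[n][(t, s)] = best
                        changed = True
    return matrix


if __name__ == "__main__":
    # Hypothetical chain s - a - b - t with unit link costs.
    links = {("s", "a"): 1, ("a", "b"): 1, ("b", "t"): 1}
    service_cost = {"a": {"S1": 2}, "b": {"S1": 5, "S2": 3}}
    m = dsmr(links, service_cost, ["t"], ["S1", "S2"])
    print(m["s"][("t", ("S1", "S2"))])
```

The sketch keeps one column per suffix of a single requested sequence. Keeping a column for every possible service combination is what leads to the exponential matrix growth noted above, and it is the motivation for the single-service approximation.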
[Figure: Approximate DSMR, routing of multiple services from s to t through S1, S2, and S3. Legend: least-cost path for services sequence; least-cost path for one service (given by service matrix); approximate least-cost path (and upper bound)]

Prototype Implementation
Emulab prototype
• 12 Autonomous Systems
• 60 nodes
Service routing
• Centralized within AS
• Approximate DSMR between ASs
149,760 connections
• All possible source-destination pairs
• All possible service combinations

Evaluation of DSMR
Correctness
• 6 of 149,760 connections failed
Convergence time
• Service matrices converge
• Time increases with network size
Approximate DSMR
• Works well for small number of services
• Inefficiency grows with number of services
[Figure: path length of approximation over optimal route]

Evaluation of DSMR
Connection setup time with Distributed Service Routing Protocol (DSRP)
• Compared to TCP
• Setup time less than 2× longer
Evaluation summary
• Routing with service constraints can be solved efficiently
• Distributed algorithm is scalable when using approximation

Example Scenario: IPTV Distribution
Heterogeneous receivers present challenge with live IPTV
• Current solution: overlay with transcoding on end-systems
[Figure: an HDTV source (1080p) sends across the network to an HDTV display (1080p); video transcoding from 1080p to H.263 and from 1080p to H.261 feeds low quality displays (H.263 and H.261)]

Example Scenario: IPTV Distribution
Transcoding in network when using network service
[Figure: same source and displays, with the 1080p to H.263 and 1080p to H.261 transcoding performed inside the network]

Example Scenario: IPTV Distribution
Prototype implementation
• Emulab simulation
Service request
• *:*>>monitor(bandwidth)
  >>multicast(192.168.1.1, videotranscode(1080p,H.264)
  >>monitor(bandwidth)
  >>192.168.2.17)
  >>*:5000
Also prototyped on real router system
• Cisco ISR with AXP
• Single core processor insufficient
How to design a good packet processing system?

Programmable Router
Flexibility through programmability
• General-purpose processing capability in data path
• Packet processing in software
High-performance processing hardware
• E.g., network processors
Scalability through high level of parallelism
[Figure: router with ports connected by a switching fabric; each port has a network interface and a network processor in which processing engines and I/O share an interconnect; network services run on the packet processor]

Programming of Packet Processors
Programming is challenging
• Distribution of processing onto multiple processors
  − Run-to-completion model often not feasible
• Limited instruction store on embedded packet processors
• Contention for shared data structures
• System components on MPSoC are tightly coupled
• Simple code, but repetitiveness amplifies small problems
Typical solution: offline optimization
• Simulation to identify performance bottlenecks
• Manual adjustment
  − Code, thread and processor allocation, memory management
• Repeat
Offline optimization cannot handle dynamic environment
• Change in network traffic, network services, slice allocation, etc.
Runtime System for Packet Processors
Adaptation of task allocation to processing resources
• Runtime profiling to obtain usage statistics
• Task mapping to adapt to current requirements
Current focus: processing (not memory)
[Figure: offline programming and configuration (possible data-path network services, Click configuration chosen by the user, implementation of Click elements) feeds runtime management (graph of schedulable Click elements, task mapping, update of profiling information, installation of Click configuration on packet processing hardware); packets flow through Click instances on processor cores and a hardware accelerator in a heterogeneous multi-core packet processing system]

Workload Representation
Granularity of representation is important
• Too coarse: not easily distributed
• Too fine-grained: scalability problem
Good balance: Click modular router
• Directly translatable into implementation
[Figure: Click configuration graph built from elements such as PollDevice(eth0), StrideSwitch(1, 1), Classifier(...), ARPResponder(...), Tee(5), Strip(14), CheckIPHeader, EtherEncap(...), Queue, Unqueue, and output to eth1]

Task Mapping Problem
Which Click element (“task”) should run on which processor?
Challenges
• Different task “sizes” (in terms of processing requirements)
• Different task utilization
• Communication cost of inter-processor transfers
Leads to packing problem
• Computationally hard to solve
Our approach
• Simplify problem by creating tasks of equal size
[Figure: task mapping assigns tasks t1 ... tT to M threads on N processors connected by an interconnect in the packet processing system]

Task Replication
Profiling provides runtime information
• Task utilization
• Task processing time
Compute: “work” per task
• work = (utilization) × (processing time)
Replicate tasks with highest work
• Replication reduces utilization
• Reduced utilization reduces work
Benefits of replication
• Task work more balanced
  − Simplifies mapping problem
• Larger number of tasks
  − Allows scaling to large number of processors
[Figure: task replication turns the task chain t(i-1), t(i), t(i+1) into several replicas of each task]

Task Replication
Example: Click configuration with 23 tasks
• IP forwarding and IPSec as network services
[Figure annotations: 155× difference; 13.5× difference]

Task Mapping
Simple greedy algorithm
• Co-locate tasks with maximum utilization edges
• When processor “full” then switch to next
Runtime adaptation
• Update profiling information
• Update replication
• Update mapping
• Update NP configuration

System Evaluation
Our runtime system vs. SMP Click on 4-core Xeon
• For various scenarios we observe up to 1.32× higher throughput

What’s Next?
Trends continue
• Trend towards more programmability in networks
• Trend towards more embedded cores per chip
• Trend towards system usability
  − Cisco QuantumFlow (40 cores, 4 threads) programmable in ANSI-C
Question: homogeneous or heterogeneous MPSoC?
• Homogeneity simplifies programmability
• Hardware accelerators perform better
• How to find balance in next-generation Internet?
[Figure: packet processing system implementation (specialized hardware for fast path processing, general-purpose processor for slow path processing) plotted over time t, from today’s Internet architecture to the next-generation Internet architecture, with the future balance marked “?”]

What’s Next?
Question: is there a better packet processor design?
[Figure: service processor block diagram with instruction memory (service 1, service 2, service 5), packet memory (packets of flow 9 and flow 17), local data memory, and address shifters; signals include PKT_IN[31..0], PKT_OUT[31..0], PKT_DONE, STATE_EN, FLOW_EN, FlowTag[23..0], and ServiceTag[7..0]]
• High overhead for managing packet processing context
• Hardware support for context management
• Processor core sees simple instruction and memory address space
Question: correct service composition semantics?
• How can service specifications be verified or composed automatically?
• Can enumeration of all service properties be avoided?

Conclusions
Next-generation Internet needs to meet many demands
• Flexibility is key to avoid ossification
Network services implement new features
• Routing with services is important control-plane problem
• Distributed Service Matrix Routing provides effective solution
Programmable routers provide packet processing platform
• Runtime system for network processors necessary for adaptation
• Mapping of processing tasks to hardware resources
Exciting time for networking research
• New network architecture and applications
• “Clean slate” designs allow for creative contributions

Acknowledgements
Graduate students
• Xin Huang
• Sivakumar Ganapathy
• Shashank Shanbhag
• Qiang Wu
Sponsors
• National Science Foundation
• Intel Research Council

[email protected]
http://www.ecs.umass.edu/ece/wolf/

Publications
Network service architecture
• Tilman Wolf, “Service-centric end-to-end abstractions in next-generation networks,” in Proc. of Fifteenth IEEE International Conference on Computer Communications and Networks (ICCCN), Arlington, VA, Oct. 2006, pp. 79-86.
• Sivakumar Ganapathy and Tilman Wolf, “Design of a network service architecture,” in Proc. of Sixteenth IEEE International Conference on Computer Communications and Networks (ICCCN), Honolulu, HI, Aug. 2007.
• Xin Huang, Sivakumar Ganapathy, and Tilman Wolf, “A scalable distributed routing protocol for networks with data-path services,” in Proc. of 16th IEEE International Conference on Network Protocols (ICNP), Orlando, FL, Oct. 2008.
• Shashank Shanbhag and Tilman Wolf, “Implementation of end-to-end abstractions in a network service architecture,” in Proc. of Fourth Conference on emerging Networking EXperiments and Technologies (CoNEXT), Madrid, Spain, Dec. 2008.
Runtime management of packet processors
• Tilman Wolf, Ning Weng, and Chia-Hui Tai, “Run-time support for multi-core packet processing systems,” IEEE Network, vol. 21, no. 4, pp. 29-37, July 2007.
• Qiang Wu and Tilman Wolf, “Dynamic workload profiling and task allocation in packet processing systems,” in Proc. of IEEE Workshop on High Performance Switching and Routing (HPSR), Shanghai, China, May 2008.
• Xin Huang and Tilman Wolf, “Evaluating dynamic task mapping in network processor runtime systems,” IEEE Transactions on Parallel and Distributed Systems, vol. 19, no. 8, pp. 1086–1098, Aug. 2008.
• Qiang Wu and Tilman Wolf, “On runtime management in multi-core packet processing systems,” in Proc. of ACM/IEEE Symposium on Architectures for Networking and Communication Systems (ANCS), San Jose, CA, Nov. 2008.
• Qiang Wu and Tilman Wolf, “Runtime resource allocation in multi-core packet processing systems,” in Proc. of IEEE Workshop on High Performance Switching and Routing (HPSR), Paris, France, June 2009.
Service processor design
• Qiang Wu and Tilman Wolf, “Design of a network service processing platform for data path customization,” in Proc. of The Second ACM SIGCOMM Workshop on Programmable Routers for Extensible Services of TOmorrow (PRESTO), Barcelona, Spain, August 2009.
Verification of service composition
• Shashank Shanbhag, Xin Huang, Santosh Proddatoori, and Tilman Wolf, “Automated service composition in next-generation networks,” in Proc. of The International Workshop on Next Generation Network Architecture (NGNA), Montreal, Canada, June 2009.