More on ONOS: Open Network OS by ON.LAB
(CSci8211: SDN Controller Design)

Outline:
• Current ONOS architecture
  o tiers
  o subsystem components
• Consistency issues in ONOS
  o switch mastership
  o network graph

Key Performance Requirements
• High throughput:
  o ~500K-1M path setups / second
  o ~1-2M network state ops / second
• High volume: the global network view/state amounts to ~10GB of network state data
• A difficult challenge: high throughput | low latency | consistency | high availability

ONOS Prototype II
• Prototype 2 focused on improving performance, especially event latency
• RAMCloud as the data store, with Blueprints APIs directly on top
• Optimized data model:
  o one table for each type of network object (switches, links, flows, …)
  o minimize the number of references between elements: most updates require a single read/write
• (In-memory) topology cache at each ONOS instance
• RAMCloud: a low-latency, distributed key-value store with 15-30 µs reads/writes
  o eventually consistent: updates may arrive at different times and in different orders
  o atomicity under schema integrity: no reads/writes by apps in the midst of an update
• Event notifications via Hazelcast, a pub-sub system

Current ONOS Architecture Tiers
• Apps
• Northbound abstraction: network graph and application intents
  o Application Intent Framework (policy enforcement, conflict resolution)
• Distributed core: distributed, protocol-independent (scalability, availability, performance, persistence)
• Southbound abstraction: generalized OpenFlow, pluggable & extensible (discover, observe, program, configure)
  o protocols: OpenFlow, NetConf, ...

Distributed Architecture for Fault Tolerance
[Layer diagram: Apps → NB Core API → Distributed Core (state management, notifications, high availability & scale-out) → SB Core API → Adapters → Protocols, with adapters and protocols replicated across instances.]

Modularity Objectives
● Increase architectural coherence, testability and maintainability
  o establish tiers with crisply defined boundaries and responsibilities
  o set up the code-base to follow and enforce the tiers
● Facilitate extensibility and customization by partners and users
  o the unit of replacement is a module
● Avoid speciation of the ONOS code-base
  o APIs set up to encourage extensibility and community code contributions
● Preempt code entanglement, i.e. cyclic dependencies
  o reasonably small modules serve as firewalls against cycles
● Facilitate a pluggable southbound

Performance Objectives
● Throughput of proactive provisioning actions
  o path flow provisioning
  o global optimization or rebalancing of existing path flows
● Latency of responses to topology changes
  o path repair in the wake of link or device failures
● Throughput of distributing and aggregating state
  o batching, caching, parallelism
  o dependency reduction
● Controller vs. device responsibilities
  o defer to devices for what they do best, e.g. low-latency reactivity, backup paths

ONOS Services & Subsystems (a usage sketch follows this list)
• Device Subsystem - manages the inventory of infrastructure devices.
• Link Subsystem - manages the inventory of infrastructure links.
• Host Subsystem - manages the inventory of end-station hosts and their locations on the network.
• Topology Subsystem - manages time-ordered snapshots of network graph views.
• Path Service - computes/finds paths between infrastructure devices or between end-station hosts using the most recent topology graph snapshot.
• FlowRule Subsystem - manages the inventory of match/action flow rules installed on infrastructure devices and provides flow metrics.
• Packet Subsystem - allows applications to listen for data packets received from network devices and to emit data packets onto the network via one or more network devices.
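To make the division of labor among these subsystems concrete, here is a minimal, hypothetical sketch of an application that connects two hosts by asking a path service for a path and installing flow rules along it. The interfaces (PathService, FlowRuleService) and their signatures are simplified stand-ins for illustration, not the actual ONOS APIs.

```java
import java.util.List;

// Minimal, hypothetical stand-ins for the subsystem APIs described above;
// the real ONOS interfaces are richer and live in org.onosproject.net.*.
interface Device { String id(); }
interface Host { String id(); Device location(); }
interface Path { List<Device> devices(); }

interface PathService {
    // Compute paths over the most recent topology snapshot.
    List<Path> getPaths(Device src, Device dst);
}

interface FlowRuleService {
    // Install a (simplified) match/action rule on a device.
    void applyRule(Device device, String match, String action);
}

/** A toy application that wires two hosts together along one computed path. */
class HostToHostConnectivityApp {
    private final PathService pathService;
    private final FlowRuleService flowRuleService;

    HostToHostConnectivityApp(PathService ps, FlowRuleService frs) {
        this.pathService = ps;
        this.flowRuleService = frs;
    }

    /** Called once the host subsystem knows the locations of both hosts. */
    void connect(Host a, Host b) {
        List<Path> paths = pathService.getPaths(a.location(), b.location());
        if (paths.isEmpty()) {
            return; // no path in the current topology snapshot; retry on the next topology event
        }
        Path chosen = paths.get(0);
        // Program every device on the path; the FlowRule subsystem tracks the installed rules.
        for (Device d : chosen.devices()) {
            flowRuleService.applyRule(d, "match: " + a.id() + "->" + b.id(), "action: forward");
        }
    }
}
```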
ONOS Core Subsystem Structure
[Diagram: an App component issues queries and commands to the Service/AdminService exposed by a Manager component and registers Listeners to receive notifications; the Manager syncs & persists state through a Store; Adapter components register/unregister with the AdapterRegistry, report sensed events to the core through an AdapterService, receive commands from the Manager, and talk to devices via southbound Protocols.]

ONOS Modules
● Well-defined relationships
● Basis for customization
● Avoid cyclic dependencies
● Example modules: onlab-util-misc, onlab-util-osgi, onlab-util-rest, onos-api, onos-of-api, onos-core-store, onos-of-ctl, onos-core-net, onos-of-adapter-*, ...

ONOS Southbound
● Attempt to be as generic as possible
● Enable partners/contributors to submit their own device/protocol-specific providers
● Providers should be stateless; state may be maintained for optimization but should not be relied upon

ONOS Southbound (cont.)
● ONOS supports multiple southbound protocols, enabling a transition to true SDN
● Adapters provide descriptions of dataplane elements to the core; the core utilizes this information
● Adapters hide protocol complexity from ONOS

Descriptions
● Serve as scratch pads to pass information to the core
● Descriptions are immutable and extremely short-lived
● Descriptions contain a URI for the object they are describing
● The URI also encodes the Adapter the device is linked to

Adapter Patterns (a sketch of this interaction follows this list)
1. The Adapter registers with the core
   a. the core returns an AdapterService bound to the Adapter
2. The Adapter uses the AdapterService to notify the core of new events (device connected, packet-in) via Descriptions - this is where the magic happens
3. The core can use the Adapter to issue commands to elements under the Adapter's control
4. Eventually, the Adapter unregisters itself; the core will invalidate the AdapterService
[Diagram: Manager component with AdapterRegistry and AdapterService in the core, an Adapter component below speaking the device Protocols; arrows 1-4 correspond to the steps above.]
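Below is a minimal, hypothetical sketch of this register → notify → command → unregister life cycle. The names (AdapterRegistry, AdapterService, DeviceDescription) mirror the vocabulary above but are simplified for illustration; they are not the exact ONOS provider/adapter APIs.

```java
// Simplified, hypothetical types mirroring the adapter vocabulary above;
// the real ONOS interfaces differ in detail.
interface Description { String uri(); }            // immutable, short-lived scratch pad

record DeviceDescription(String uri) implements Description {}

interface AdapterService {
    // Step 2: adapter -> core notifications, carried as Descriptions.
    void deviceConnected(DeviceDescription description);
    void packetIn(DeviceDescription source, byte[] payload);
}

interface Adapter {
    // Step 3: core -> adapter commands for elements under this adapter's control.
    void triggerProbe(String deviceUri);
}

interface AdapterRegistry {
    AdapterService register(Adapter adapter);      // Step 1: returns a bound AdapterService
    void unregister(Adapter adapter);              // Step 4: core invalidates the AdapterService
}

/** A toy OpenFlow-like adapter walking through the four steps. */
class ToyOpenFlowAdapter implements Adapter {
    private AdapterService service;

    void start(AdapterRegistry registry) {
        service = registry.register(this);                                      // (1)
        // The URI identifies the object and encodes which adapter it is linked to.
        service.deviceConnected(new DeviceDescription("of:0000000000000001"));  // (2)
    }

    @Override
    public void triggerProbe(String deviceUri) {                                // (3)
        // translate the core's command into protocol-specific messages here
    }

    void stop(AdapterRegistry registry) {
        registry.unregister(this);                                              // (4)
        service = null;
    }
}
```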
Distributed Core
● Distributed state management framework
  o built for high availability and scale-out
● Different types of state require different types of synchronization
  o fully replicated
  o master/slave replicated
  o partitioned/distributed
● Novel topology replication technique
  o a logical clock in each instance timestamps the events observed in the underlying network
  o logical timestamps ensure state evolves in a consistent and ordered fashion
  o allows rapid convergence without complex coordination
  o applications receive notifications about topology changes

Consistency Issues in ONOS: Recall the Definitions of Consistency Models
• Strong consistency: upon an update to the network state by one instance, all subsequent reads by any instance return the last updated value
  o strong consistency adds complexity and latency to distributed data management
• Eventual consistency is a relaxed model
  o readers are allowed to be behind for a short period of time
  o a liveness property
• There are other models, e.g., serial consistency, causal consistency
• Q: what consistency model or models should ONOS adopt?

ONOS Distributed Core
● Responsible for all state management concerns
● Organized as a collection of "stores"
  o e.g., Topology, Link Resources, Intents, etc.
● The properties of each kind of state guide the ACID vs. BASE choice
● Q: what are the possible categories of the various states in a (logically centralized) SDN control plane, e.g., ONOS?

ONOS Global Network View
• Applications observe and program the network state
• Network state includes:
  o topology (Switch, Port, Link, …)
  o network events (link down, packet-in, …)
  o flow state (flow tables, connectivity paths, ...)
• Object model: Switch, Port, Link, Host, Intent, FlowPath, FlowEntry

ONOS State and Properties
• Network topology - eventually consistent, low-latency access
• Flow rules, flow stats - eventually consistent, shardable, soft state
• Switch-to-controller mapping, distributed locks - strongly consistent, slow changing
• Application intents, resource allocations - strongly consistent, durable, immutable

Switch Mastership: Strong Consistency
[Timeline diagram: initially every instance's view of the registry shows Switch A Master = NONE; after a master is elected for switch A, and after the delay of locking & consensus, instances 1, 2 and 3 all show Switch A Master = ONOS 1.]

ONOS Switch Mastership & Topology Cache
[Diagram: each instance maintains an in-memory topology cache for its own slice of the network ("tell me about your slice?").]

ONOS Switch Mastership: ONOS Instance Crash
[Diagrams: when an ONOS instance crashes, mastership of its switches is handed over to a surviving instance.]
• ONOS uses Raft to achieve consensus and strong consistency

Switch Mastership Terms
• The switch mastership term is incremented every time mastership changes
• The switch mastership term is tracked in a strongly consistent store

Why Strong Consistency for Master Election?
• Weaker consistency might mean that a master election on instance 1 is not visible on the other instances
• That can lead to multiple masters for a switch
• Multiple masters would break the semantics of control isolation
• Strong locking semantics are needed for master election

Eventual Consistency in the Network Graph
[Timeline diagram: switch A connects to ONOS instance 1, which sets Switch A = ACTIVE while instances 2 and 3 still show INACTIVE; after the delay of eventual consistency, all instances show Switch A = ACTIVE.]
• What about a link failure? Both ends may detect it!

Topology Events and Eventual Consistency
[Diagrams: an instance detects a topology event, updates its local cache and the distributed topology store, and the update propagates to the other instances' caches; a follow-up scenario asks what happens if instance 1 crashes before the update propagates.]

Switch Event Numbers: Topology as a State Machine (a sketch of the ordering rule follows this list)
• Events: switch/port/link up or down
• Each event has a unique logical timestamp: [switch id, term number, event number]
• Events are timestamped on arrival and broadcast to the other instances
• Partial ordering of topology events: "stale" events are dropped on receipt
• Anti-entropy: a peer-to-peer, lightweight gossip protocol quickly bootstraps a newly joined ONOS instance
• Key challenge: the view should be consistent with the network, not (simply) with the other views!
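The ordering rule above can be made concrete with a small sketch. Assuming the logical timestamp is the triple [switch id, mastership term, per-term event counter] described on the slide, a receiving instance applies an event only if its timestamp is newer than the latest one it has seen for that switch; this is a simplified illustration, not the actual ONOS implementation.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

/**
 * Logical timestamp for a topology event: [switch id, mastership term, event number].
 * Ordering is per switch: compare the term first, then the event counter within the term.
 */
record EventTimestamp(String switchId, long term, long eventNumber)
        implements Comparable<EventTimestamp> {
    @Override
    public int compareTo(EventTimestamp other) {
        int byTerm = Long.compare(term, other.term);
        return byTerm != 0 ? byTerm : Long.compare(eventNumber, other.eventNumber);
    }
}

/** Keeps only the freshest event per switch; stale events are dropped on receipt. */
class TopologyEventFilter {
    private final Map<String, EventTimestamp> latest = new ConcurrentHashMap<>();

    /** Returns true if the event should be applied, false if it is stale. */
    boolean accept(EventTimestamp ts) {
        EventTimestamp prev = latest.get(ts.switchId());
        if (prev != null && ts.compareTo(prev) <= 0) {
            return false;                      // older (or duplicate) event: drop it
        }
        latest.put(ts.switchId(), ts);
        return true;
    }
}

class TimestampDemo {
    public static void main(String[] args) {
        TopologyEventFilter filter = new TopologyEventFilter();
        // Event 3 of term 1 arrives first ...
        System.out.println(filter.accept(new EventTimestamp("of:01", 1, 3)));  // true
        // ... a delayed event 2 of the same term is stale and is dropped ...
        System.out.println(filter.accept(new EventTimestamp("of:01", 1, 2)));  // false
        // ... and any event of a newer mastership term wins.
        System.out.println(filter.accept(new EventTimestamp("of:01", 2, 1)));  // true
    }
}
```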
Cost of Eventual Consistency
• A short delay means that switch A's state is not yet ACTIVE on some ONOS instances in the previous example
• Applications on one instance will compute flows through switch A, while other instances will not use switch A for path computation
• Eventual consistency becomes more visible during control-plane network congestion
• Q: what are the possible consequences of eventual consistency of the network graph?
  o eventual consistency of the network graph → eventual consistency of flow entries?
  o re-do path computation / flow setup?

Why Is Eventual Consistency Good Enough for Network State?
• Physical network state changes asynchronously
• Strong consistency across the data and control planes is too hard
• Control apps know how to deal with eventual consistency
• In a distributed control plane, each router makes its own decisions based on old information from other parts of the network, and it works fine
  o but the current distributed control plane uses destination-based, shortest-path routing, which guarantees eventual consistency of the routing tables computed by each individual router!
• Strong consistency is more likely to lead to inaccuracy of the network state, since network congestion is real
• How to reconcile and address this challenge?

Consistency: Lessons
• One consistency model does not fit all
• The consequences of delays need to be well understood
• More research needs to be done on the various kinds of state under different consistency models
• A network OS (logically centralized control plane) with distributed stores and consistent network state (network graph, flow entries & various other network states) also poses unique challenges!

ONOS Service APIs: Application Intent Framework (a sketch of intent compilation appears at the end of these notes)
● Applications specify high-level intents, not low-level rules
  o focus on what should be done, rather than how it should be done
● Intents are compiled into actionable objectives which are installed into the environment
  o e.g. a HostToHostIntent compiles into two PathIntents
● Resources required by the objectives are then monitored
  o e.g. a link vanishes, or capacity or a lambda becomes available
● The intent subsystem reacts by recompiling the intent and re-installing the revised objectives
● We will come back and discuss this later in the semester if we have time!

Reflections / Lessons Learned: Things We Got Right
• Control isolation (sharding)
  o divide the network into parts and control each part exclusively
  o load balancing → we can do more
• Distributed data store
  o scales with the controller nodes, with HA → though we need a low-latency distributed data store
• Dynamic controller assignment to parts of the network
  o dynamically assign which part of the network is controlled by which controller instance → we can do better with more sophisticated algorithms
• Graph abstraction of network state
  o easy to visualize and correlate with the topology
  o enables several standard graph algorithms

What's Available in ONOS Today?
● ONOS with all its key features
  o high availability, scalability*, performance*
  o northbound abstractions (application intents)
  o southbound abstractions (OpenFlow adapters)
  o modular code-base
  o GUI
● Open source
  o ONOS code-base on GitHub
  o documentation & infrastructure processes to engage the community
● Use-case demonstrations
  o SDN-IP, Packet-Optical
● Sample applications
  o reactive forwarding, mobility, proxy ARP
Please do check out the current ONOS code distribution on the ONOS project site: https://wiki.onosproject.org
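As a closing illustration of the Application Intent Framework described above, here is a minimal, hypothetical sketch of how a host-to-host intent could be compiled into two unidirectional path intents. The class names echo the ONOS vocabulary (HostToHostIntent, PathIntent) but are simplified stand-ins, not the real intent subsystem APIs.

```java
import java.util.List;

// Simplified, hypothetical intent types echoing the vocabulary above.
interface Intent {}

/** "Connect host A and host B" - the application states what, not how. */
record HostToHostIntent(String hostA, String hostB) implements Intent {}

/** An actionable objective: push traffic along a concrete device-level path. */
record PathIntent(List<String> deviceIds) implements Intent {}

/** Computes a path between the devices hosting two end stations. */
interface PathComputer {
    List<String> shortestPath(String srcHost, String dstHost);
}

/** Compiles a high-level intent into installable objectives (cf. the slide above). */
class HostToHostIntentCompiler {
    private final PathComputer pathComputer;

    HostToHostIntentCompiler(PathComputer pathComputer) {
        this.pathComputer = pathComputer;
    }

    /** A HostToHostIntent compiles into two PathIntents, one per direction. */
    List<PathIntent> compile(HostToHostIntent intent) {
        List<String> forward = pathComputer.shortestPath(intent.hostA(), intent.hostB());
        List<String> reverse = pathComputer.shortestPath(intent.hostB(), intent.hostA());
        // If a link on either path later vanishes, the intent subsystem would
        // recompile this intent and re-install revised PathIntents.
        return List.of(new PathIntent(forward), new PathIntent(reverse));
    }
}
```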