Coping with Link Failures in
Centralized Control Plane
Architecture
Maulik Desai, Thyagarajan Nandagopal
Introduction
• Traditional network
- Routers identify link failures and establish alternate routes
• Centralized control plane architectures – SoftRouter, 4D
- Controller sends updates to switches
- Switches have the least amount of intelligence
• Link failure between switches and controller
- Switches attempt to find a controller
- Routing loops may occur
• Naïve solution – flooding the news of the link failure
- Creates a lot of unnecessary traffic
• Better solution
- Only the necessary switches are informed of the link failure
- Switches still maintain the minimum amount of intelligence
- Implemented in a network formed by OpenFlow switches
- The news of a link failure can reach the switches sooner than the controller can identify the failure and send out updates
Related Work
• SoftRouter
- Control and packet forwarding functions are separated
- Increasing reliability in the network
- Elements of a SoftRouter network include
Forwarding Element (FE): switches performing packet forwarding
Control Element (CE): controllers running control plane functions
Network Element (NE): logical grouping of some CEs and a few FEs
• 4D
- Four logical planes
Decision Plane: makes all the decisions about network control
Dissemination Plane: ensuring communication between decision plane and switches
Discovery Plane: identifying physical components of a network
Data Plane: handling individual packets, controlled by decision plane
• Both maintain a control plane that is separate from the data plane
Related Work - Cont.
• OpenFlow switches
- Controlled by a remotely located controller
- Maintaining multiple data paths in the same network
- Flow table (see the sketch below):
Header Fields: packet header values used to match incoming packets
Counters: maintaining statistics for the switch
Actions: if the header of a received packet matches the header fields, the action defined in the entry is applied
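
A minimal Python sketch of this flow-table behaviour, assuming illustrative class and field names rather than the OpenFlow wire format:

from dataclasses import dataclass

@dataclass
class FlowEntry:
    header_fields: dict        # e.g. {"ingress_port": 2, "ip_dst": "10.0.0.5"}
    action: str                # e.g. "forward:port3" or "drop"
    packet_count: int = 0      # Counters: per-entry statistics
    byte_count: int = 0

    def matches(self, pkt: dict) -> bool:
        # A packet matches when every specified header field agrees.
        return all(pkt.get(k) == v for k, v in self.header_fields.items())

def lookup(flow_table, pkt):
    # Apply the action of the first matching entry and update its counters.
    for entry in flow_table:
        if entry.matches(pkt):
            entry.packet_count += 1
            entry.byte_count += pkt.get("length", 0)
            return entry.action
    return "send_to_controller"    # table miss: defer to the controller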
Link Failure (1)
• A simple link failure scenario
Link between A and B fails – informing all relevant switches about the failed link
and asking them to refrain from sending B messages that travel towards A, until
the controller sends them an update
Link Failure (2)
• Island
Link between A and B fails – forming an island
- B could inform C and D of the failed link, avoiding unnecessary traffic
- A could inform the controller of the failed link, preventing it from attempting to reach the island
Link Failure (3)
• Routing Loop
Link between B and C fails – forming a routing loop
- B could inform A about the failed link
- This process can be completed a lot sooner than the controller could identify the link failure
and update the switches – preventing routing loops
Coping with Link Failures
• A scheme to reduce the damage caused by link failure
- In case of a link failure, all the switches that could send flows in the direction of
the failed link should be informed of this event
- Link failure messages should not propagate in the network indefinitely, and
unless required, these messages should not be flooded in the network
- The scheme should provide enough information to the network switches
regarding the flows that are affected by the failed link. At the same time, it
should make sure that the flows that are not affected by this event do not get
modified
- The proposed scheme should not violate the basic premise of keeping the
minimum amount of intelligence available at the switches
Solutions to Link Failures
• Goal
- Informing all the switches that could send flows in the direction of the failed link about the link failure event
- Making sure Link Failure Messages (LFMs) do not get flooded in the entire network
• Outline
- A proper way to define a flow
- Computations for the switch experiencing link failure
- Computations for the switch receiving a LFM
- Importance of specifying ingress port in the flow definition
- Flow tables without ingress ports
Solutions to Link Failures – A proper way to define a flow
• One way to define the flow:
• Better way to define the flow: (see the sketch below)
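
A minimal sketch of the two flow definitions, assuming (based on the later slide about the importance of the ingress port) that the better definition adds the ingress port to the header fields; the field names are illustrative:

# Hypothetical flow defined by packet header fields alone.
flow_without_ingress = {
    "ip_src": "10.0.0.1",
    "ip_dst": "10.0.0.5",
    "tcp_dst": 80,
}

# Presumed better definition: the same header fields plus the ingress port,
# which tells a switch which upstream neighbours can actually feed this flow.
flow_with_ingress = {
    "ingress_port": 2,
    "ip_src": "10.0.0.1",
    "ip_dst": "10.0.0.5",
    "tcp_dst": 80,
}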
Solution to Link Failures – Computations for the switch experiencing link failure (1)
Solution to Link Failures – Computations for the switch experiencing link failure (2)
• LFM Structure (see the sketch below)
- Source Address: IP address of the switch that initiates the LFM
- Message ID: ensures that the same LFM does not get forwarded multiple times by the same switch
- Flow Definition: a subset of header fields that make up the definition of the flow
- Flow Count: indicates the total number of flow specifications that are attached to the LFM
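
A minimal Python sketch of the LFM structure listed above, assuming illustrative names rather than the paper's exact wire format:

from dataclasses import dataclass
from typing import Dict, List

@dataclass
class LinkFailureMessage:
    source_address: str           # Source Address: IP of the switch that initiates the LFM
    message_id: int               # Message ID: stops the same LFM being forwarded twice
    flow_definitions: List[Dict]  # Flow Definition(s): header-field subsets of affected flows
    flow_count: int               # Flow Count: number of flow specifications attached

    def __post_init__(self):
        # Flow Count must agree with the attached flow specifications.
        assert self.flow_count == len(self.flow_definitions)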
Solution to Link Failures – Computations for the switch receiving a LFM (1)
• Upon receiving a LFM (see the sketch below)
- Making a note of the interface (rxInterface) from where the message came in
- Detaching the list (flowList) of flow definitions attached to the LFM
- Looking up the flow table and locating ingress ports
- Sending out a new LFM (why?)
- Modifying the Action field of the affected flow table entries (the tricky part)
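
A minimal Python sketch of these steps, assuming simplified rules for deciding which entries are affected and how the new LFM is built; helper names such as overlaps and handle_lfm are illustrative:

from dataclasses import dataclass
from typing import Dict, List, Set, Tuple

@dataclass
class FlowEntry:
    header_fields: Dict          # may include "ingress_port"
    action: str

@dataclass
class LFM:
    source_address: str
    message_id: int
    flow_definitions: List[Dict]

def overlaps(entry: FlowEntry, flow_def: Dict) -> bool:
    # Assumed rule: an entry is affected if it agrees with the flow definition
    # on every field that both of them specify.
    common = set(entry.header_fields) & set(flow_def)
    return bool(common) and all(entry.header_fields[k] == flow_def[k] for k in common)

def handle_lfm(flow_table: List[FlowEntry], lfm: LFM, rx_interface: int,
               my_ip: str, next_msg_id: int, seen_ids: Set[int]) -> List[Tuple[int, LFM]]:
    # Message ID check: never process (or forward) the same LFM twice.
    if lfm.message_id in seen_ids:
        return []
    seen_ids.add(lfm.message_id)

    # Detach the list (flowList) of flow definitions attached to the LFM.
    flow_list = lfm.flow_definitions

    # Look up the flow table and locate the affected entries and their ingress ports.
    affected = [e for e in flow_table if any(overlaps(e, f) for f in flow_list)]
    upstream = {e.header_fields.get("ingress_port") for e in affected}
    upstream.discard(None)
    upstream.discard(rx_interface)   # do not send the news back where it came from

    # Modify the Action field of the affected entries so traffic stops flowing
    # towards the failed link (the later slide discusses splitting an entry so
    # unaffected traffic keeps its original action).
    for e in affected:
        e.action = "drop"

    # Send out a new LFM (rather than forwarding the received one) towards the
    # ingress ports that could still feed the affected flows.
    out = LFM(source_address=my_ip, message_id=next_msg_id, flow_definitions=flow_list)
    return [(port, out) for port in upstream]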
Solution to Link Failures – Computations for the switch receiving a LFM (2)
• Why send out a new LFM instead of forwarding the same LFM?
Solution to Link Failures – Computations for the switch receiving a LFM (3)
• Modifying the Action field of the affected flow table entries
- Splitting a flow table entry into two
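
A minimal Python sketch of the splitting step, assuming the affected sub-flow becomes a more specific, higher-priority entry with a modified action while the original entry keeps handling the unaffected traffic; the names and the priority convention are assumptions:

from dataclasses import dataclass
from typing import Dict

@dataclass
class FlowEntry:
    header_fields: Dict
    action: str
    priority: int = 0            # higher priority wins when both entries match

def split_entry(entry: FlowEntry, affected_flow: Dict, new_action: str):
    """Return (affected_entry, remaining_entry) replacing the original entry."""
    # The affected entry matches the original fields plus the LFM's flow
    # definition, so it is strictly more specific than the original.
    affected = FlowEntry(header_fields={**entry.header_fields, **affected_flow},
                         action=new_action,
                         priority=entry.priority + 1)
    # Traffic that does not match the more specific entry falls through to the
    # original entry and keeps its original action.
    return affected, entry

# Example: only flows arriving on ingress port 2 towards 10.0.0.5 are affected.
original = FlowEntry({"ip_dst": "10.0.0.5"}, action="forward:port3")
affected, remaining = split_entry(original, {"ingress_port": 2}, new_action="drop")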
Solution to Link Failures – Importance of specifying ingress port in the flow definition
• Specifying the ingress port is the most helpful in a topology that is similar to a perfect graph
• Specifying the ingress port is the least helpful in a chain topology
Solution to Link Failures – Flow tables without ingress port
• Sometimes it may not be possible to specify the ingress port for the flow table entries in all the switches
- The LFM has to be flooded to all the switches in the network
- The LFM may float around in the network indefinitely
• Solution – including a “Hop count” or “Time to live” field (see the sketch below)
- “Hop count” decreases by one at every hop as the LFM gets forwarded – stop forwarding a LFM if “Hop count” reaches 0
- “Time to live” is a timestamp – stop forwarding a LFM once the “Time to live” expires
- These values have to be chosen carefully
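
A minimal Python sketch of the hop-count / time-to-live check, assuming illustrative field names and initial values:

import time

HOP_COUNT_LIMIT = 8          # assumed initial value; must be chosen carefully
TTL_SECONDS = 2.0            # assumed lifetime; must be chosen carefully

def should_forward(lfm: dict) -> bool:
    """Decide whether a flooded LFM may be forwarded any further."""
    if lfm["hop_count"] <= 0:                 # hop budget exhausted
        return False
    if time.time() >= lfm["expires_at"]:      # “Time to live” timestamp expired
        return False
    return True

def forward(lfm: dict) -> dict:
    """Return the copy that is actually forwarded, with one hop consumed."""
    return {**lfm, "hop_count": lfm["hop_count"] - 1}

# Example: a freshly created LFM that is flooded because ingress ports are unknown.
lfm = {"hop_count": HOP_COUNT_LIMIT, "expires_at": time.time() + TTL_SECONDS}
if should_forward(lfm):
    lfm = forward(lfm)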
Performance Analysis (1)
• Environment
- A small network of kernel-based OpenFlow switches
- Switches are installed on VMs that run Debian Lenny Linux
- A chain topology is used
- VMs share a 2.6 GHz processor, with 64 MB of RAM assigned to each of them
Performance Analysis (2)
• Results
- Since the VMs are not very well time-synchronized, it is difficult to calculate the total amount of time taken for the LFMs to reach all the switches
- Instead, the time difference between receiving a LFM and sending out a new LFM is calculated at each switch
- The sum is 394 ms plus the time taken between transmitting and receiving LFMs
- The total time taken to send a LFM to every switch is negligible compared to the interval between the controller’s connectivity probes, which may vary from tens of seconds to a few hundred seconds
Performance Analysis (3)
• Processing time vs. number of flow table entries
Conclusion
• In a centralized control plane architecture, link failures can create many problems
• To address the problems, a solution is proposed
- Informing relevant switches to refrain from sending traffic towards the failed
link without flooding
- Simplicity – maintaining the basic premise of keeping the minimum amount of
intelligence available at all switches
- All the relevant switches are informed of the failed link significantly sooner
than a controller learns about the link failure and sends out an update