Congestion Control
Chapter 6
Outline
• Resource Allocation Issues
• Queuing Disciplines
– FCFS (FIFO queues)
– Priority Queuing
– Fair Queuing (for flows)
• TCP Congestion Control
– Detection/Resolution approach (AIMD and Slow Start)
– Alternatives: Fast Retransmit / Fast Recovery
• Congestion Avoidance
– router-centric: DECbit and RED Gateways
– host-centric: TCP Vegas
• QoS
Congestion Control
ISSUES:
• How to fairly allocate resources (link bandwidths and switch buffers) among users.
• Two sides of the same coin:
– Resource allocation so as to avoid congestion (difficult with any precision)
– Congestion control if (and when) it occurs
• Resource allocation and congestion control involve both:
– hosts at the edges of the network (transport protocols)
– routers inside the network (queuing disciplines)
• Underlying service model can be
– best-effort (assumed here: end hosts are given no opportunity to make QoS demands)
– multiple qualities of service QoS (later)
[Figure: Congestion in a packet-switched network. Sources feed a Router whose outgoing 1.5-Mbps T1 link leads to the Destination.]
Framework
• Connectionless flows assumed. What are they? Even though datagrams from a source to a destination are switched independently, they typically flow through the same path.
– Routers maintain soft state info
• Somewhere between the hard state info of a VC switch (bandwidth, cell-loss ratio, etc.) and the no state info of pure connectionless.
• Correct operation does not depend on soft state info but is improved by it.
– Implicitly defined: router watches for what appears to be a flow (used in TCP Congestion Control).
– Explicitly defined: source sends flow-setup (flow about to start) across network.
(a step down from a VC since explicit flow has no reliable, ordered delivery)
[Figure: Multiple flows passing through a set of routers (Sources 1 to 3, three Routers, Destinations 1 and 2).]
• Taxonomy of Resource Allocation/Congestion Control mechanisms
– Router-centric: address the problem inside the network (decide forwards/drops, inform hosts) versus
Host-centric: address the problem from outside the network
– Reservation-based: hosts request capacity when a flow is established; versus
Feedback-based: Explicit (e.g., congested router sends a “slow-down” message)
Implicit (e.g., host adjusts its rate based on, e.g., cell-loss rate)
– Window-based (telling sender remaining buffer space, as in flow control) versus
Rate-based (telling sender the rate at which data can be absorbed)
Evaluation Criteria (of resource allocation effectiveness & fairness)
• Effective Resource Allocation
(utilization issue – network-wide point of view)
measured by Power = ratio of throughput to delay.
[Figure: Power (Throughput/delay) versus Load. Power peaks at the optimal load; beyond it lies thrashing or congestion collapse.]
• Fair Resource Allocation (to individual senders)
– Can assume Fair means Equal shares
– E.g., Raj Jain proposed a metric for when Fair means Equal and all paths are of equal length:
Jain’s Fairness Index: Given flow throughputs (units/sec) x_1, x_2, …, x_n
$$ f(x_1, x_2, \ldots, x_n) = \frac{\left( \sum_{i=1}^{n} x_i \right)^2}{n \sum_{i=1}^{n} x_i^2} $$
If all n flows have throughput of 1 unit/sec, f = n^2 / (n * n) = 1.
However, if k flows have throughput 1 and n-k have throughput 0, f = k^2 / (n * k) = k/n (less fair)
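The fairness index above is easy to check numerically. The following is a minimal sketch; the function name `jain_fairness` is made up for illustration, and the two calls reproduce the two cases worked out above.

```python
def jain_fairness(throughputs):
    """Jain's fairness index: (sum of x_i)^2 / (n * sum of x_i^2)."""
    n = len(throughputs)
    total = sum(throughputs)
    sum_sq = sum(x * x for x in throughputs)
    if n == 0 or sum_sq == 0:
        raise ValueError("need at least one flow with nonzero throughput")
    return total * total / (n * sum_sq)


print(jain_fairness([1, 1, 1, 1]))  # all n = 4 flows get equal shares
print(jain_fairness([1, 1, 0, 0]))  # k = 2 of n = 4 flows get everything
```

The first call gives 1 (perfectly fair) and the second gives k/n = 0.5.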
Queuing Discipline
Each router specifies a queuing discipline regardless of resource allocation mechanism. The algorithm can be thought of as allocating bandwidth (which packets get transmitted) and buffer space (which packets get dropped).
• First-In-First-Out or FIFO (AKA: FCFS)
– Packets transmitted in arrival order.
– No discrimination between traffic sources.
– Usually used with “tail drop” policy.
– FIFO + tail-drop = bundle.
– Widely used in Internet.
– Variations include priority queuing.
• Fair Queuing (FQ) for Flows
– explicitly segregates traffic based on flows
(separate queue per flow)
• Weighted Fair Queuing allows a
weight to be assigned to each flow.
[Figure: Flows 1 to 4 each feed a separate queue; the queues receive round-robin service.]
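As an illustration of the FIFO with tail drop combination listed above, here is a minimal sketch. The class name `FifoTailDrop` and the `capacity` parameter (buffer size in packets) are assumptions made for this example, not part of the slides.

```python
from collections import deque


class FifoTailDrop:
    """FIFO queue with tail drop: arrivals are discarded when the buffer is full."""

    def __init__(self, capacity):
        self.capacity = capacity  # hypothetical buffer size, in packets
        self.queue = deque()
        self.dropped = 0

    def enqueue(self, packet):
        if len(self.queue) >= self.capacity:
            self.dropped += 1  # tail drop: the arriving packet is discarded
            return False
        self.queue.append(packet)
        return True

    def dequeue(self):
        # Packets leave in arrival order, with no regard to their source.
        return self.queue.popleft() if self.queue else None
```

Note that FIFO decides the transmission order while tail drop decides which packet is discarded; the two policies are independent even though they are usually bundled.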
Fair Queuing - FQ Algorithm
For simplicity, suppose the clock ticks each time a bit is transmitted (bit = tick)
Let P_i = length of packet i
S_i = time when transmission of packet i starts
F_i = time when transmission of packet i finishes
F_i = S_i + P_i
For a single flow, when does a router start transmitting packet i?
– if packet i arrives before the router has finished this flow’s packet i-1: right after the last bit of i-1 (at F_{i-1})
– if no packets are currently queued for this flow: when packet i arrives (at time A_i)
Thus: S_i = MAX(F_{i-1}, A_i)
and
F_i = MAX(F_{i-1}, A_i) + P_i
For multiple flows (not perfect: can’t preempt the current packet)
calculate F_i for each packet that arrives on each flow (treat as timestamps)
packet with lowest timestamp is next.
[Figure: Fair queuing in action, with each queued packet labeled by its finish time F. (a) Queue discipline: shortest packet first. (b) A longer packet already in progress is completed first, even though a newly arriving packet has a lower F.]
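The timestamp rule F_i = MAX(F_{i-1}, A_i) + P_i can be sketched as follows. This is a simplified illustration, not a full fair-queuing implementation: it assumes arrival times are given directly in the same bit-time clock as the finish times, and the class and method names are hypothetical.

```python
import heapq


class FairQueue:
    """Stamps each packet with a finish time; the lowest timestamp is sent next."""

    def __init__(self):
        self.last_finish = {}  # flow id -> F of that flow's previous packet
        self.heap = []         # entries: (finish, seq, flow)
        self.seq = 0           # tie-breaker for equal finish times

    def arrive(self, flow, arrival, length):
        start = max(self.last_finish.get(flow, 0), arrival)  # S_i = MAX(F_{i-1}, A_i)
        finish = start + length                              # F_i = S_i + P_i
        self.last_finish[flow] = finish
        heapq.heappush(self.heap, (finish, self.seq, flow))
        self.seq += 1
        return finish

    def next_packet(self):
        # Called only when the link is free, so a packet in progress is never preempted.
        finish, _, flow = heapq.heappop(self.heap)
        return flow, finish


fq = FairQueue()
fq.arrive("flow1", 0, 5)   # F = 5
fq.arrive("flow1", 0, 3)   # F = 8, queued behind the first packet of flow1
fq.arrive("flow2", 0, 10)  # F = 10
print([fq.next_packet() for _ in range(3)])
```

With these arrivals the two flow1 packets (F = 5 and F = 8) are sent before the flow2 packet (F = 10).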
TCP Congestion Control
• Idea
– assumes best-effort network (FIFO or FQ routers)
– each source determines network capacity for itself
– uses implicit feedback (host adjusts rate based on what it observes)
– ACKs pace transmission (self-clocking), i.e., only n outstanding un-ACKed packets are allowed
• Challenge
– determining the available capacity in the first place
– adjusting to changes in the available capacity
• AIMD and Slow Start were the original solutions for TCP
Additive Increase/Multiplicative Decrease (AIMD)
Objective: adjust to changes in the available capacity
• New state variable per connection: CongestionWindow
– set by source to limit number of packets in transit
• Recall from flow control: AdvertisedWindow = amount of data the destination can still buffer
MaxWin = MIN( CongestionWindow, AdvertisedWindow )
EffWin = MaxWin - ( LastByteSent - LastByteAcked )
where ( LastByteSent - LastByteAcked ) is the amount of outstanding (sent but un-ACKed) data
• Idea:
– increase CongestionWindow when congestion goes down
– decrease CongestionWindow when congestion goes up
• Question: how does the source determine whether or not the network is congested?
• Answer: a packet timeout occurs (i.e., an ACK is late)
– Assumes a timeout signals that a packet was dropped due to congestion
(packet loss is seldom due to transmission error)
– so a lost packet implies congestion
AIMD (cont)
Algorithm: Each time the source successfully sends a CongestionWindow’s worth
of packets, increase CongestionWindow by 1 packet (additive increase).
Divide CongestionWindow by 2 on each timeout (multiplicative decrease),
but never below one Maximum Segment Size (MSS is in bytes; usually 1 packet).
[Figure: Source/Destination timeline of packets in transit during additive increase.]
• In practice, however, TCP increments a little for each ACK, using:
Increment = MSS * (MSS/CongestionWindow)
CongestionWindow += Increment
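The window arithmetic on these two slides can be sketched in a few lines. This is an illustration under stated assumptions: the value of `MSS` (1460 bytes) and the function names are chosen for the example, and all windows are in bytes.

```python
MSS = 1460  # bytes; a hypothetical segment size chosen for illustration


def effective_window(cwnd, advertised, last_byte_sent, last_byte_acked):
    """Bytes the source may still send: EffWin = MaxWin - outstanding data."""
    max_win = min(cwnd, advertised)
    return max(0, max_win - (last_byte_sent - last_byte_acked))


def on_ack(cwnd):
    """Additive increase: Increment = MSS * (MSS / CongestionWindow) per ACK."""
    return cwnd + MSS * (MSS / cwnd)


def on_timeout(cwnd):
    """Multiplicative decrease: halve the window, but never below one MSS."""
    return max(MSS, cwnd / 2)


cwnd = 4 * MSS
for _ in range(4):           # roughly one window's worth of ACKs
    cwnd = on_ack(cwnd)
print(round(cwnd / MSS, 2))  # a little under 5 segments
print(on_timeout(cwnd) / MSS)
```

Because the increment shrinks as the window grows during the round, a full window of ACKs adds slightly less than one MSS, which approximates the one-packet-per-RTT rule.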
Trace: CongestionWindow sawtooth behavior with AIMD
[Figure: CongestionWindow (KB, 10 to 70) versus Time (seconds, 1.0 to 10.0).]
AIMD works well when the source is operating close to the available capacity of the network, but it takes too long to ramp up from scratch.
SLOW START (ironically named) is intended to solve that using multiplicative increase.
Slow Start (2nd mechanism provided by TCP)
• Start with CongestionWindow (CW) = 1 packet
a slow start compared to a CongestionWindow=AdvertisedWindow start
• Double CongestionWindow each RTT (multiplicative increase)
until it reaches CongestionThreshold (CT), then increment by 1 per RTT.
Used when first starting a connection and when a connection goes dead
waiting for a timeout (another “start over” situation).
[Figure: Source/Destination timeline of packets in transit during slow start.]
Slow Start Trace:
[Figure: CongestionWindow (KB, 10 to 70) versus Time in sec (1.0 to 9.0), annotated as follows.]
– Multiplicative increase until CT, then additive increase
– No increase: no ACKs arriving, due to lost packets
– Timeouts: CT ← CW/2 (17 at one timeout, 11 at another); CW ← 0
– Hash marks = times when each packet is transmitted
– Also marked: times when retransmitted packets were first transmitted
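The growth pattern described for slow start can be sketched as below. This is a loss-free illustration in units of packets; the function names are hypothetical, capping the doubling at CT is an assumption, and the restart value after a timeout is taken to be 1 packet (the starting value given in the first bullet) rather than the 0 shown on the KB-scale trace.

```python
def window_after_rtts(rtts, threshold, start=1):
    """CongestionWindow (in packets) at the start of each RTT, with no losses."""
    cw, trace = start, []
    for _ in range(rtts):
        trace.append(cw)
        if cw < threshold:
            cw = min(2 * cw, threshold)  # slow start: double per RTT (capped at CT)
        else:
            cw += 1                      # past CT: additive increase
    return trace


def on_timeout(cw):
    """Timeout: CT <- CW/2 (kept at 1 packet or more), and CW restarts at 1 packet."""
    return max(cw // 2, 1), 1  # (new threshold, new window)


print(window_after_rtts(8, threshold=16))
print(on_timeout(34))
```

The first call shows the window doubling up to the threshold of 16 and then growing by one packet per RTT; the second shows a timeout at CW = 34 setting CT to 17.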
Fast Retransmit
Problem: Coarse-grain TCP timeouts lead to idle periods
Fast retransmit: use duplicate ACKs to trigger retransmission.
Idea: every time a packet arrives, the receiver sends an ACK.
Thus, when a packet arrives out of order (and TCP can’t
ACK it because earlier packets have not yet arrived),
TCP resends the last legitimate cumulative ACK (called a duplicate ACK).
When the sender sees 3 duplicate ACKs, it retransmits the packet that follows the last one acknowledged.
[Figure: Sender transmits Packets 1 to 6 and Packet 3 is lost. Receiver returns ACK 1, ACK 2, and then a duplicate ACK 2 for each of Packets 4, 5, and 6. Sender retransmits Packet 3; Receiver returns ACK 6.]
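The sender-side rule can be sketched as a duplicate-ACK counter. This is a minimal illustration using the packet-numbered ACKs of the figure (ACK n acknowledges everything through packet n); the function name and the `dup_threshold` parameter are assumptions for the example.

```python
def fast_retransmits(acks, dup_threshold=3):
    """Return the packet numbers retransmitted while processing cumulative ACKs."""
    retransmitted = []
    last_ack, dups = None, 0
    for ack in acks:
        if ack == last_ack:
            dups += 1
            if dups == dup_threshold:
                retransmitted.append(ack + 1)  # first packet not yet ACKed
        else:
            last_ack, dups = ack, 0            # new cumulative ACK resets the count
    return retransmitted


# ACK stream from the figure: packet 3 is lost, packets 4 to 6 each trigger ACK 2.
print(fast_retransmits([1, 2, 2, 2, 2, 6]))  # [3]
```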
Trace of CongestionWindow with fast retransmit
[Figure: CongestionWindow (KB, 10 to 70) versus Time in sec (1.0 to 7.0). Hash marks = times when each packet is transmitted; also marked are a timeout and the times when retransmitted packets were first transmitted.]
Eliminates many of the flat areas where no packets were transmitted
Fast Recovery:
When fast retransmit signals congestion, rather than dropping the window back to 0 and using Slow Start, just
cut the window in half and resume additive increase.
Congestion Avoidance
• TCP’s strategy is to control congestion once it happens
(repeatedly increase load to find the point at which congestion occurs, and then back off)
• Alternative strategy
– predict when congestion is about to happen
– reduce rate before packets start being discarded
– call this congestion avoidance, instead of congestion control
• Two possibilities
– router-centric: DECbit and RED Gateways
– host-centric: TCP Vegas
DECbit
• Add congestion bit to packet header.
• Router
– monitors average queue length over
last busy-idle cycle + current busy cycle,
set congestion bit if average queue length > 1
• End Host
– Destination echoes bit back to source
– Source records how many packets resulted in set bit
– If less than 50% of last CongestionWindow’s worth had bit set
• increase CongestionWindow by 1 packet
– If 50% or more of last window’s worth had bit set
• decrease CongestionWindow to 7/8th of its value.
[Figure: Queue length versus Time. The averaging interval spans the previous cycle and the current cycle, up to the current time.]
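The end-host side of DECbit can be sketched as follows. This is an illustration only: the function name is hypothetical, and `bits` is assumed to hold the echoed congestion-bit values (0 or 1) for the last window's worth of packets.

```python
def decbit_adjust(cwnd, bits):
    """Adjust CongestionWindow (in packets) from the echoed congestion bits."""
    if not bits:
        return cwnd                   # nothing observed: leave the window alone
    fraction_set = sum(bits) / len(bits)
    if fraction_set < 0.5:
        return cwnd + 1               # additive increase by 1 packet
    return cwnd * 0.875               # multiplicative decrease to 7/8


print(decbit_adjust(8, [0, 0, 0, 1]))  # under 50% set: window grows to 9
print(decbit_adjust(8, [1, 1, 0, 0]))  # 50% set: window shrinks to 7.0
```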
Random Early Detection (RED)
• Notification is implicit
– Router just drops the packet when congested (TCP will time out)
• Early random drop
– rather than wait for queue to become completely full, drop each arriving packet with some
drop probability whenever the queue length exceeds some drop level
RED Details
Compute average queue length (weighted running average)
AvgLen = (1 - Weight) * AvgLen + Weight * SampleLen
0 < Weight < 1 (usually 0.002)
SampleLen = queue length each time a packet arrives
Two queue length thresholds: MinThreshold and MaxThreshold
if AvgLen ≤ MinThreshold then enqueue packet
if MinThreshold < AvgLen < MaxThreshold
then calculate probability P
drop arriving packet with probability P
if MaxThreshold ≤ AvgLen, then drop arriving packet
[Figure: Router queue with MinThreshold, MaxThreshold, and AvgLen marked.]
Computing probability P
TempP = MaxP * (AvgLen - MinThreshold) / (MaxThreshold - MinThreshold)
Count = # of arriving packets queued (not dropped) since the last drop, while AvgLen has been between the two thresholds
P = TempP / (1 - Count * TempP)
[Figure: Drop probability curve, P(drop) versus AvgLen, with MinThresh and MaxThresh marked on the AvgLen axis and MaxP and 1.0 marked on the P(drop) axis.]
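The RED rules above can be collected into a short sketch. This is a simplified illustration, not a reference implementation: the class name, the injectable `rng` parameter, and the exact points at which `count` is reset are assumptions made for the example.

```python
import random


class Red:
    """Simplified RED drop decision based on a weighted average queue length."""

    def __init__(self, min_th, max_th, max_p, weight=0.002, rng=random.random):
        self.min_th, self.max_th, self.max_p = min_th, max_th, max_p
        self.weight, self.rng = weight, rng
        self.avg_len = 0.0
        self.count = 0  # packets queued since the last drop

    def drop_probability(self):
        temp_p = self.max_p * (self.avg_len - self.min_th) / (self.max_th - self.min_th)
        denom = 1 - self.count * temp_p
        return 1.0 if denom <= 0 else min(1.0, temp_p / denom)

    def on_arrival(self, sample_len):
        """Update AvgLen and return True if the arriving packet should be dropped."""
        self.avg_len = (1 - self.weight) * self.avg_len + self.weight * sample_len
        if self.avg_len <= self.min_th:
            self.count = 0
            return False                      # short queue: always enqueue
        if self.avg_len >= self.max_th:
            self.count = 0
            return True                       # long queue: always drop
        if self.rng() < self.drop_probability():
            self.count = 0
            return True                       # early random drop
        self.count += 1
        return False
```

Dividing by (1 - Count * TempP) makes a drop more likely the longer it has been since the last one, which spreads drops out in time instead of clustering them.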
TCP Vegas (host-centric congestion avoidance)
Idea: source watches for some sign that a router’s queue is building
(e.g., RTT grows; sending rate flattens)
BaseRTT = min of all measured RTTs (typically the RTT of the 1st packet)
ExpectedRate = CW / BaseRTT
ActualRate = # bytes sent during the RTT of a distinguished packet, divided by that RTT
Diff = ExpectedRate - ActualRate
if Diff < α increase CW linearly
roughly corresponds to too little data in the network
else if Diff > β decrease CW linearly
roughly corresponds to too much data in the network
else leave CW unchanged
( when α ≤ Diff ≤ β )
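The Vegas decision rule can be sketched as a single function. This is an illustration under stated assumptions: the function name and the `step` parameter are hypothetical, and α and β are assumed to be expressed in the same rate units as Diff.

```python
def vegas_adjust(cw, base_rtt, actual_rate, alpha, beta, step=1):
    """Return the new CW given the measured ActualRate (same units as CW / RTT)."""
    expected_rate = cw / base_rtt
    diff = expected_rate - actual_rate
    if diff < alpha:
        return cw + step  # too little extra data in the network
    if diff > beta:
        return cw - step  # too much extra data in the network
    return cw             # alpha <= diff <= beta: leave CW unchanged


# CW = 20 packets and BaseRTT = 0.1 s give ExpectedRate = 200 packets/s.
print(vegas_adjust(20, 0.1, actual_rate=195, alpha=10, beta=30))  # increase
print(vegas_adjust(20, 0.1, actual_rate=160, alpha=10, beta=30))  # decrease
print(vegas_adjust(20, 0.1, actual_rate=180, alpha=10, beta=30))  # unchanged
```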
Congestion Window Trace for TCP Vegas
[Figure, top panel: CongestionWindow (KB, 10 to 70) versus Time (seconds, 0.5 to 8.0).]
TCP Vegas (trace of congestion avoidance mechanism)
[Figure, bottom panel: CAM KBps (40 to 240) versus Time (seconds), showing actual throughput and expected throughput.]
Parameters: α = 1 packet, β = 3 packets
The shaded area is the region between α and β units away from the expected throughput; the goal is to keep the actual throughput in this region. Note that the actual throughput gets dragged along by the shaded region.
QoS
Real-time App
[Figure: Audio application. Microphone, then Sampler with A/D converter, across the network to a Buffer with D/A converter, then Speaker.]
• Require “deliver on time” assurances
– must come from inside the network (hosts cannot make such guarantees alone)
• Example application (audio)
– sample voice once every 125 µs
– each sample has a playback time
– packets experience variable delay in network
– add constant factor to playback time: playback point
• Playback Buffer
[Figure: Sequence number versus Time, with curves for packet generation, packet arrival, and playback; the gaps between them are the network delay and the time spent in the buffer.]
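The playback-point idea can be sketched as follows. This is an illustration only: the function name, the use of milliseconds, and the choice to mark late packets with `None` are assumptions for the example.

```python
def playback_schedule(packets, playback_delay):
    """packets: list of (generation_time, arrival_time) pairs, in ms.

    Returns one (playback_time, time_in_buffer) pair per packet, with
    time_in_buffer set to None when the packet arrives after its playback point.
    """
    schedule = []
    for generated, arrived in packets:
        play_at = generated + playback_delay  # constant offset: the playback point
        if arrived <= play_at:
            schedule.append((play_at, play_at - arrived))
        else:
            schedule.append((play_at, None))  # too late to be played
    return schedule


# Two packets with a 100 ms playback delay; the second one arrives too late.
print(playback_schedule([(0, 40), (20, 130)], playback_delay=100))
```

A larger playback delay tolerates more delay variation in the network at the cost of added latency for every sample.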
Integrated Services
• Refers to the body of work by IETF 1995-97 working group on Integrated Services.
• Integrated Services allocates resources to individual flows
– whereas Differentiated Services allocates resources by “classes of traffic”
• Integrated Services service classes
– E.g., Guaranteed service (packets are never late: guaranteed max delay time)
• Flowspecs (the set of info we provide to the network to specify needs)
– Tspec
• describes the flow’s Traffic characteristics (e.g., average bandwidth, token-bucket parameters)
– Rspec
• describes the services Requested from the network
– E.g., guarantees, such as a delay target
RSVP: Resource reSerVation Protocol
• While connection-oriented networks have setup protocols,
best-effort connectionless networks don’t; they need some sort of reservation
protocol in order to offer QoS.
– Internet resource reservation corresponds to signaling in ATM
– Proposed Internet standard is called RSVP
• Receiver-oriented
• 2 messages: PATH and RESV
• Source transmits PATH messages every 30 seconds to announce the flow along its path.
• Destination responds with a RESV message, which requests the reservation back along that path.
[Figure: Sender 1 and Sender 2 send PATH messages through routers (R) toward Receiver A and Receiver B; the receivers’ RESV messages travel back through the routers and are merged where their paths join.]
RSVP versus ATM (Q.2931)
• RSVP
– receiver generates reservation
– soft state info used in routers (it is refreshed/timed out)
– separate from route establishment
– QoS can change dynamically
• ATM
– sender generates connection request
– hard state info (requires explicit delete at teardown)
– concurrent with route establishment
– QoS is static for life of connection
Differentiated Services (also IETF)
• Problem with Integrated Services: scalability
• Idea of Differentiated Services: support 2 classes of packets
– DS adds a new Premium Service class to the best-effort traffic class
• Which packets are premium?
• Use a premium bit in the header.