Lecture 04: Transport Layer
Transport layer protocols in the Internet:
UDP: connectionless transport
TCP: connection-oriented transport
TCP congestion control
Provides end-to-end connectivity, but not necessarily good performance
Internet transport-layer protocols
reliable, in-order delivery (TCP): congestion control, flow control, connection setup
unreliable, unordered delivery (UDP): no-frills extension of “best-effort” IP
services not available: delay guarantees, bandwidth guarantees
(Figure: protocol stack. The application and transport layers run only on the end hosts; each router along the path implements just the network, data link, and physical layers.)
Two Basic Transport Features
Demultiplexing: port numbers
Example: a client host sends a service request to 128.2.194.242:80 (i.e., the Web server); the server’s OS uses the destination port to deliver the data to the Web server process (port 80) rather than, say, the echo server (port 7).
Error detection: checksums
A checksum computed over the payload lets the receiver detect corruption.
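The error-detection idea can be sketched as the 16-bit one’s-complement checksum that UDP and TCP use. This is a minimal Python version for illustration only; the real protocols also cover a pseudo-header, which is omitted here:

```python
def internet_checksum(data: bytes) -> int:
    """One's-complement sum of 16-bit words, then complemented."""
    if len(data) % 2:                # pad odd-length input with a zero byte
        data += b"\x00"
    total = 0
    for i in range(0, len(data), 2):
        total += (data[i] << 8) | data[i + 1]
        total = (total & 0xFFFF) + (total >> 16)   # fold carry back in
    return ~total & 0xFFFF

payload = b"\x12\x34\x56\x78"
print(hex(internet_checksum(payload)))   # 0x9753
# Flipping even one bit changes the checksum, exposing the corruption:
print(internet_checksum(payload) == internet_checksum(b"\x12\x35\x56\x78"))  # False
```

The sender stores this value in the header; the receiver recomputes it and drops the packet on a mismatch.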
User Datagram Protocol (UDP)
Datagram messaging service
Demultiplexing: port numbers
Detecting corruption: checksum
Lightweight communication between processes
Send and receive messages
Avoid overhead of ordered, reliable delivery
Header fields: SRC port, DST port, length, checksum, followed by the DATA
Advantages of UDP
Fine-grain control
UDP sends as soon as the application writes
No connection set-up delay
UDP sends without establishing a connection
No connection state
No buffers, parameters, sequence #s, etc.
Small header overhead
UDP header is only eight bytes long
Popular Applications That Use UDP
Multimedia streaming
Retransmitting packets is not always worthwhile
E.g., phone calls, video conferencing, gaming, IPTV
Simple query-response protocols
Overhead of connection establishment is overkill
E.g., Domain Name System (DNS), DHCP, etc.
“Address for www.cnn.com?”
“12.3.4.15”
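A query-response exchange like the DNS example can be sketched with datagram sockets. This illustrative loopback demo (the query text and answer mimic the slide; the port is whatever the OS assigns) shows that UDP needs no connection setup, just a destination port for demultiplexing:

```python
import socket

# "Server" process: bind a datagram socket (the OS picks a free port here)
server = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
server.bind(("127.0.0.1", 0))
port = server.getsockname()[1]        # the demultiplexing key

# "Client" process: send a query with no handshake and no connection state
client = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
client.sendto(b"Address for www.cnn.com?", ("127.0.0.1", port))

query, client_addr = server.recvfrom(2048)   # one self-contained message
server.sendto(b"12.3.4.15", client_addr)     # reply to the sender's (IP, port)

answer, _ = client.recvfrom(2048)
print(answer.decode())
server.close(); client.close()
```

Each message stands alone: if either datagram were lost, neither side would be told, which is exactly the trade-off the slide describes.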
Transmission Control Protocol (TCP)
Stream-of-bytes service: sends and receives a stream of bytes
Reliable, in-order delivery: checksums detect corruption; sequence numbers detect loss/reordering; acknowledgments and retransmissions provide reliable delivery
Connection oriented: explicit set-up and tear-down of the TCP connection
Flow control: prevent overflow of the receiver’s buffer space
Congestion control: adapt to network congestion for the greater good
Breaking a Stream of Bytes into TCP Segments
TCP “Stream of Bytes” Service: Host A writes a stream of bytes that Host B reads.
…Emulated Using TCP “Segments”: the stream is carried from Host A to Host B in TCP data segments.
A segment is sent when:
1. the segment is full (Max Segment Size),
2. it is not full, but times out, or
3. it is “pushed” by the application.
TCP Segment
IP packet: carries an IP header, a TCP header, and the TCP data (segment); no bigger than the Maximum Transmission Unit (MTU), e.g., up to 1500 bytes on an Ethernet link
TCP packet: an IP packet with a TCP header and data inside; the TCP header is typically 20 bytes long
TCP segment: no more than Maximum Segment Size (MSS) bytes
E.g., up to 1460 consecutive bytes from the stream
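The relationship between MTU and MSS is simple arithmetic; a quick sketch, assuming the typical 20-byte IP and TCP headers with no options:

```python
MTU = 1500        # Ethernet's Maximum Transmission Unit, in bytes
IP_HEADER = 20    # typical IP header, no options
TCP_HEADER = 20   # typical TCP header, no options

MSS = MTU - IP_HEADER - TCP_HEADER
print(MSS)        # 1460 bytes of stream data per segment
```

With IP or TCP options present, the headers grow and the usable MSS shrinks accordingly.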
Sequence Number
Host A and Host B each pick an ISN (initial sequence number); the sequence number of each TCP data segment is the number of the first byte of data it carries.
Reliable Delivery on a Lossy Channel With Bit Errors
Challenges of Reliable Data Transfer
Over a perfectly reliable channel: easy; the sender sends, and the receiver receives
Over a channel with bit errors: the receiver detects errors and requests retransmission
Over a lossy channel with bit errors: some data are missing, and others corrupted; the receiver cannot always detect loss
Over a channel that may reorder packets: the receiver cannot distinguish loss from out-of-order delivery
An Analogy
Alice and Bob are talking
What if Bob couldn’t understand Alice? Bob asks Alice to repeat what she said
What if Bob hasn’t heard Alice for a while? Is Alice just being quiet? Has she lost reception? How long should Bob just keep on talking?
Maybe Alice should periodically say “uh huh”
… or Bob should ask “Can you hear me now?”
Take-Aways from the Example
Acknowledgments from receiver
Positive: “okay” or “uh huh” or “ACK”
Negative: “please repeat that” or “NACK”
Retransmission by the sender
After not receiving an “ACK”
After receiving a “NACK”
Timeout by the sender (“stop and wait”)
Don’t wait forever without some acknowledgment
TCP Support for Reliable Delivery
Detect bit errors: checksum
Used to detect corrupted data at the receiver
…leading the receiver to drop the packet
Detect missing data: sequence number
Used to detect a gap in the stream of bytes
... and for putting the data back in order
Recover from lost data: retransmission
Sender retransmits lost or corrupted data
Two main ways to detect lost packets: timeouts and duplicate ACKs
TCP Acknowledgments
Host A and Host B each pick an ISN (initial sequence number). The sequence number of a data segment is the number of its first byte; the ACK sequence number is the next byte the receiver expects.
Automatic Repeat reQuest (ARQ)
ACK and timeouts
Receiver sends ACK when it receives packet
Sender waits for ACK and times out
Simplest ARQ protocol: stop and wait
Send a packet, stop and wait until ACK arrives
(Figure: sender/receiver timeline showing a packet, its ACK, and the timeout interval.)
Flow Control:
TCP Sliding Window
Motivation for Sliding Window
Stop-and-wait is inefficient
Only one TCP segment is “in flight” at a time
Especially bad for paths with a high “delay-bandwidth product” (the product of the link bandwidth and the round-trip delay)
Numerical Example
1.5 Mbps link with 45 msec round-trip time (RTT)
Delay-bandwidth product is 67.5 Kbits (or 8 KBytes)
Sender can send at most one packet per RTT
Assuming a segment size of 1 KB (8 Kbits):
8 Kbits/segment at 45 msec/segment yields roughly 182 Kbps
That’s just one-eighth of the 1.5 Mbps link capacity
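The arithmetic above can be checked directly (treating 1 KB as 1024 bytes, which is how the 182 Kbps figure comes out):

```python
link_bps = 1.5e6                     # 1.5 Mbps link
rtt = 0.045                          # 45 msec round-trip time

dbp_bits = link_bps * rtt            # delay-bandwidth product
print(dbp_bits / 1e3)                # 67.5 Kbits, i.e., about 8 KBytes

segment_bits = 1024 * 8              # 1 KB segment = 8 Kbits
throughput = segment_bits / rtt      # stop-and-wait: one segment per RTT
print(round(throughput / 1e3))       # ~182 Kbps
print(round(link_bps / throughput))  # ~8, i.e., one-eighth of capacity
```

To fill the pipe, the sender would need about eight segments in flight at once, which is exactly what a sliding window provides.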
Pipelined protocols
Pipelining: sender allows multiple, “in-flight”, yet-to-be-acknowledged packets
range of sequence numbers must be increased
buffering at sender and/or receiver
Pipelined protocols: concurrent logical channels, sliding window protocol
Sliding Window Protocol
Consider an infinite array, Source, at the sender (P1), and an infinite array, Sink, at the receiver (P2).
(Figure: at the sender, positions 0 through a–1 are acknowledged and a through s–1 are unacknowledged; the send window covers them, and s is the next sequence number to use. At the receiver, positions 0 through r–1 have been delivered; the receive window runs from r, the next expected sequence number, through r + RW – 1.)
RW: receive window size
SW: send window size (s – a ≤ SW)
Sliding Windows in Action
Data unit r has just been received by P2
Receive window slides forward
P2 sends cumulative ack with the sequence number it expects to receive next (r+3)
Sliding Windows in Action
P1 has just received the cumulative ack with r+3 as next expected sequence number
Send window slides forward
Sliding Window protocol
Functions provided:
error control (reliable delivery)
in-order delivery
flow and congestion control (by varying send window size)
TCP uses only cumulative acks
Other kinds of acks:
selective nack
selective ack (TCP SACK): a bit-vector representing the entire state of the receive window (in addition to the first sequence number of the window)
Sliding Window Protocol
At the sender, a will be pointed to by SendBase, and s by NextSeqNum
(Figure: the same sender/receiver windows as before: acknowledged bytes up to a–1, unacknowledged from a to s–1, receive window from r to r + RW – 1; RW is the receive window size, SW the send window size, with s – a ≤ SW.)
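The sender-side bookkeeping can be sketched with the slide’s variable names. This is a toy model with byte counters only (no timers, retransmission, or real segments):

```python
class SlidingWindowSender:
    """Sender-side pointers from the slide: SendBase = a, NextSeqNum = s."""

    def __init__(self, window_size: int):
        self.SW = window_size
        self.SendBase = 0        # oldest unacknowledged byte (a)
        self.NextSeqNum = 0      # next byte to send (s)

    def can_send(self, nbytes: int) -> bool:
        # enforce the invariant s - a <= SW
        return self.NextSeqNum + nbytes - self.SendBase <= self.SW

    def send(self, nbytes: int):
        assert self.can_send(nbytes)
        self.NextSeqNum += nbytes

    def on_cumulative_ack(self, acknum: int):
        # cumulative ACK carries the receiver's next expected byte
        if acknum > self.SendBase:
            self.SendBase = acknum   # window slides forward

s = SlidingWindowSender(window_size=4)
s.send(4)                 # window is now full
print(s.can_send(1))      # False: 4 + 1 - 0 > 4
s.on_cumulative_ack(3)    # first 3 bytes acknowledged; window slides
print(s.can_send(3))      # True: 4 + 3 - 3 <= 4
```

Acknowledgments advance SendBase, which is what lets the window “slide” and admit new data.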
TCP Flow Control
Flow control: the sender won’t overrun the receiver’s buffers by transmitting too much, too fast
The receive side of a TCP connection buffers incoming data
receiver: explicitly informs the sender of the (dynamically changing) amount of free buffer space, via the RcvWindow field in the TCP segment
sender: keeps the amount of transmitted, unACKed data less than the most recently received RcvWindow value
Optimizing Retransmissions
Reasons for Retransmission
(Figure: three timelines. A lost packet triggers a timeout and a retransmission. A lost ACK also triggers a timeout, so the receiver gets a DUPLICATE PACKET. An early timeout retransmits before the ACK arrives, producing DUPLICATE PACKETS.)
How Long Should Sender Wait?
Sender sets a timeout to wait for an ACK
Too short: wasted retransmissions
Too long: excessive delays when a packet is lost
TCP sets the timeout as a function of the RTT
Expect the ACK to arrive after a “round-trip time”
… plus a fudge factor to account for queuing
But, how does the sender know the RTT?
Running average of the delay to receive an ACK
TCP Round Trip Time and Timeout
Q: how to estimate RTT?
SampleRTT: measured time from segment transmission until ACK receipt (ignore retransmissions)
SampleRTT will vary; we want the estimated RTT to be “smoother”
average several recent measurements, not just the current SampleRTT
TCP Round Trip Time and Timeout
EstimatedRTT = (1 – α) * EstimatedRTT + α * SampleRTT
Exponential weighted moving average
influence of a past sample decreases exponentially fast
typical value: α = 0.125
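The moving average is easy to sketch. Starting from an estimate of 100 msec, a run of 140 msec samples pulls the estimate up only gradually:

```python
ALPHA = 0.125   # typical weight for new samples

def update_rtt(estimated_rtt: float, sample_rtt: float) -> float:
    """EstimatedRTT = (1 - alpha) * EstimatedRTT + alpha * SampleRTT"""
    return (1 - ALPHA) * estimated_rtt + ALPHA * sample_rtt

est = 100.0
for sample in (140.0, 140.0, 140.0):
    est = update_rtt(est, sample)
    print(round(est, 1))   # 105.0, 109.4, 113.2: creeps toward 140
```

Because each old sample’s weight is multiplied by (1 – α) every step, its influence decays exponentially, which is where the name comes from.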
Example RTT estimation:
(Figure: RTT in milliseconds from gaia.cs.umass.edu to fantasia.eurecom.fr over time; SampleRTT fluctuates between roughly 100 and 350 msec, while EstimatedRTT tracks it much more smoothly.)
TCP: retransmission scenarios
(Figure: two Host A/Host B timelines. Lost ACK scenario: the ACK for the Seq=92 segment is lost, the Seq=92 timer expires, and Host A retransmits; SendBase moves from 100 to 120 once an ACK gets through. Premature timeout scenario: the Seq=92 timer expires before its ACK arrives, so Host A retransmits unnecessarily.)
TCP retransmission scenarios (more)
(Figure: cumulative ACK scenario: the first ACK is lost, but a later cumulative ACK arrives before the timeout, advancing SendBase to 120 with no retransmission.)
Fast Retransmit
Time-out period is often relatively long: long delay before resending a lost packet
Detect lost segments via duplicate ACKs
Sender often sends many segments back-to-back
If a segment is lost, there will likely be many duplicate ACKs
If the sender receives 3 duplicate ACKs for the same data, it supposes that the segment after the ACKed data was lost:
fast retransmit: resend the segment before the timer expires
Figure 3.37 Resending a segment after triple duplicate ACK
Fast retransmit algorithm:
event: ACK received, with ACK field value of y
if (y > SendBase) {
    SendBase = y
    if (there remains a not-yet-acknowledged segment)
        start timer
} else {
    // a duplicate ACK for already-ACKed data
    increment count of dup ACKs received for y
    if (count of dup ACKs received for y == 3) {
        // fast retransmit
        resend segment with sequence number y
        reset timer for y
    }
}
Effectiveness of Fast Retransmit
When does Fast Retransmit work best?
High likelihood of many packets in flight
Long data transfers, large window size, …
Implications for Web traffic
Most Web transfers are short (e.g., 10 packets)
• So, often there aren’t many packets in flight
• Making fast retransmit less likely to “kick in”
• Forcing users to click “reload” more often…
Starting and Ending a Connection: TCP Handshakes
Establishing a TCP Connection
Each host tells its ISN to the other host.
Three-way handshake to establish connection:
Host A sends a SYN (open) to host B
Host B returns a SYN acknowledgment (SYN ACK)
Host A sends an ACK to acknowledge the SYN ACK
What if the SYN Packet Gets Lost?
Suppose the SYN packet gets lost
Packet is lost inside the network, or
Server rejects the packet (e.g., listen queue is full)
Eventually, no SYN-ACK arrives
Sender sets a timer and waits for the SYN-ACK
… and retransmits the SYN if needed
How should the TCP sender set the timer?
Sender has no idea how far away the receiver is
Some TCPs use a default of 3 or 6 seconds
SYN Loss and Web Downloads
User clicks on a hypertext link
Browser creates a socket and does a “connect”
The “connect” triggers the OS to transmit a SYN
If the SYN is lost…
The 3-6 seconds of delay is very long
The impatient user may click “reload”
User triggers an “abort” of the “connect”
Browser “connects” on a new socket
Essentially, forces a fast send of a new SYN!
Lecture 04: Transport Layer
Transport layer protocols in the Internet:
UDP: connectionless transport
TCP: connection-oriented transport
TCP congestion control
Principles of Congestion Control
Congestion:
informally: “too many sources sending too much
data too fast for network to handle”
different from flow control!
manifestations:
lost packets (buffer overflow at routers)
long delays (queueing in router buffers)
a top-10 problem!
Receiver Window vs. Congestion Window
Flow control: keep a fast sender from overwhelming a slow receiver
Congestion control: keep a set of senders from overloading the network
Different concepts, but similar mechanisms
TCP flow control: receiver window
TCP congestion control: congestion window
Sender TCP window = min { congestion window, receiver window }
How it Looks to the End Host
Delay: packet experiences high delay
Loss: packet gets dropped along path
How does the TCP sender learn this?
Delay: round-trip time estimate
Loss: timeout and/or duplicate acknowledgments
Congestion Collapse
Easily leads to congestion collapse
Senders retransmit the lost packets
Leading to even greater load
… and even more packet loss
(Figure: goodput vs. load; “congestion collapse” is an increase in load that results in a decrease in useful work done.)
Approaches towards congestion control
End-to-end congestion control:
no explicit feedback from network
congestion inferred from end-system’s observed loss and/or delay
approach taken by TCP
Network-assisted congestion control:
routers provide feedback to end systems
single bit indicating congestion (SNA, DECbit, TCP/IP ECN, ATM)
explicit sending rate for sender
TCP Congestion control
end-to-end control (no network assistance)
Tradeoff
Pro: avoids needing explicit network feedback
Con: continually under- and over-shoots the “right” rate
TCP Congestion control
Each TCP sender maintains a congestion window
Max number of bytes to have in transit (not yet ACK’d)
Adapting the congestion window
Decrease upon losing a packet: backing off
Increase upon success: optimistically exploring
Always struggling to find the right transfer rate
TCP Congestion Control
How does sender determine CongWin?
loss event = timeout or 3 duplicate acks
TCP sender reduces CongWin after loss event
three mechanisms:
slow start
AIMD
reduce to 1 segment after timeout event
TCP Slow Start
Probing for usable bandwidth
When connection begins, CongWin = 1 MSS
Example: MSS = 500 bytes & RTT = 200 msec
initial rate = 20 kbps
available bandwidth may be >> MSS/RTT
desirable to quickly ramp up to a higher rate
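The 20 kbps figure and the exponential ramp-up can be verified in a few lines (the six-RTT horizon is just an illustrative choice):

```python
MSS_BITS = 500 * 8     # 500-byte segment, in bits
RTT = 0.2              # 200 msec

initial_rate = MSS_BITS / RTT   # 1 MSS per RTT at connection start
print(initial_rate)             # 20000.0 bits/sec = 20 kbps

cwnd = 1                        # congestion window, in units of MSS
for _ in range(6):              # doubling every RTT...
    cwnd *= 2
print(cwnd)                     # ...reaches 64 MSS after 6 RTTs
```

Despite the name, “slow start” is fast: the window doubles each RTT until a loss or the threshold intervenes.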
TCP Slow Start (more)
When the connection begins, increase the rate exponentially until the first loss event or “threshold”
double CongWin every RTT
done by incrementing CongWin by 1 MSS for every ACK received
Summary: initial rate is slow but ramps up exponentially fast
(Figure: Host A/Host B timeline showing one segment in the first RTT, then two, then four.)
Congestion avoidance state & responses to loss events
Q: If no loss, when should the exponential increase switch to linear?
A: When CongWin gets to the current value of threshold
Implementation:
For initial slow start, threshold is set to a very large value (e.g., 65 Kbytes)
At a loss event, threshold is set to 1/2 of CongWin just before the loss event
(Figure: congestion window size (segments) vs. transmission round for TCP Tahoe and TCP Reno, with the threshold level marked.)
Rationale for Reno’s Fast Recovery
After 3 dup ACKs:
CongWin is cut in half
window then grows linearly
(3 dup ACKs indicates the network is capable of delivering some segments)
But after a timeout event:
CongWin is set to 1 MSS instead;
window then grows exponentially to a threshold, then grows linearly
(a timeout occurring before 3 dup ACKs is “more alarming”)
Summary: TCP Congestion Control
When CongWin is below Threshold, sender is in slow-start phase, window grows exponentially.
When CongWin is above Threshold, sender is in congestion-avoidance phase, window grows linearly.
When a triple duplicate ACK occurs, Threshold is set to CongWin/2 and CongWin is set to Threshold.
When a timeout occurs, Threshold is set to CongWin/2 and CongWin is set to 1 MSS.
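The four summary rules can be sketched as a toy state machine (units of MSS, one step per RTT; real TCP counts bytes and individual ACK arrivals, so this is only an illustration):

```python
class RenoSender:
    """Toy model of the summary rules above."""

    def __init__(self):
        self.cwnd = 1           # congestion window, in MSS
        self.threshold = 64     # initially large

    def on_rtt_no_loss(self):
        if self.cwnd < self.threshold:
            self.cwnd *= 2      # slow start: exponential growth
        else:
            self.cwnd += 1      # congestion avoidance: linear growth

    def on_triple_dup_ack(self):
        self.threshold = max(self.cwnd // 2, 1)
        self.cwnd = self.threshold   # Reno: resume from half

    def on_timeout(self):
        self.threshold = max(self.cwnd // 2, 1)
        self.cwnd = 1                # restart from 1 MSS

tcp = RenoSender()
trace = []
for _ in range(7):
    trace.append(tcp.cwnd)
    tcp.on_rtt_no_loss()
print(trace)                     # [1, 2, 4, 8, 16, 32, 64]: slow start
tcp.on_triple_dup_ack()
print(tcp.cwnd, tcp.threshold)   # 32 32: halved, linear growth resumes
```

A timeout instead would leave the threshold halved but reset cwnd to 1, restarting slow start, which is the Tahoe-style response the figure contrasts with Reno.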
AIMD in steady state
additive increase: increase CongWin by 1 MSS every RTT in the absence of any loss event: probing
multiplicative decrease: cut CongWin in half after a loss event (3 dup acks)
(Figure: sawtooth evolution of the congestion window (8-24 Kbytes shown) over time for a long-lived TCP connection.)
Why is TCP fair?
Two competing sessions:
loss: decrease window by factor of 2
congestion avoidance: additive increase
(Figure: connection 1 window size vs. connection 2 window size, both bounded by R; alternating multiplicative decrease and additive increase drive the two sessions toward the equal-window-size line.)
TCP Fairness
Fairness goal: if K TCP sessions share the same bottleneck link of bandwidth R, each should have an average rate of R/K (AIMD only provides convergence to the same window size, not necessarily the same throughput rate)
(Figure: TCP connection 1 and TCP connection 2 sharing a bottleneck router of capacity R.)
Fairness (more)
Fairness and UDP
Multimedia apps often do not use TCP
do not want rate throttled by congestion control
Instead use UDP: pump audio/video at a constant rate, tolerate packet loss
TCP-friendly congestion control for apps that prefer UDP, e.g., Datagram Congestion Control Protocol (DCCP)
End of Lecture 04