Download Lecture 14 - Lyle School of Engineering

Spring 2006
EE 5304/EETS 7304 Internet Protocols
Lecture 14
TCP-Part 1
Tom Oh
Dept of Electrical Engineering
[email protected]
TO 4-25-06 p. 1
Administrative Issues
• (For distance learning students) If you are graduating this semester, you need to take the Final on May 9, 2006.
• For in-class students, we will have the Final (Test #3) on May 9, 2006, 6:30 PM.
• The Final will cover lectures 11-15.
• The Final will consist of multiple choice, T/F, and short answers.
• You are allowed to bring one 3½ x 5 card.

Outline (Comer, Ch. 25)
• TCP
• TCP header
• TCP retransmissions
• TCP duplicate detection
• TCP connection set-up and close

TCP (Transmission Control Protocol)
• TCP is the predominant transport-layer protocol, adding end-to-end reliability above IP
• Designed for reliable, sequential byte-stream delivery with no duplicates and no loss
• Views application data as a continuous byte stream and breaks it into segments of 64-Kbyte max. length
  - Keeps track of each byte with a sequence number
  - Segments are prefixed with a TCP header and encapsulated into IP packets

TCP (cont)
[Figure: the sending application's data stream is split into TCP segments (TCP header + data); each segment is carried as the data of an IP packet (IP header, then TCP header and TCP data) and delivered to the receiving application.]

TCP (cont)
• Provides connection-oriented service between applications on different hosts
• An application is identified to TCP by its port address
  - An application is completely identified by a 16-bit port address and a 32-bit IP address
  - A TCP connection is between two endpoints, source <host address, port> and destination <host address, port>

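To illustrate how a pair of endpoints identifies a connection, here is a minimal Python sketch of a lookup keyed on the source and destination <host address, port> pairs. The `ConnectionTable` class and the example addresses are hypothetical and are not taken from any real TCP implementation.

```python
from typing import Dict, Optional, Tuple

Endpoint = Tuple[str, int]            # (IP address, 16-bit port)
ConnKey = Tuple[Endpoint, Endpoint]   # (source endpoint, destination endpoint)


class ConnectionTable:
    """Hypothetical table mapping endpoint pairs to connection labels."""

    def __init__(self) -> None:
        self._table: Dict[ConnKey, str] = {}

    def add(self, src: Endpoint, dst: Endpoint, label: str) -> None:
        self._table[(src, dst)] = label

    def lookup(self, src: Endpoint, dst: Endpoint) -> Optional[str]:
        return self._table.get((src, dst))


table = ConnectionTable()
server = ("192.0.2.10", 80)
# Two clients using the same source port are still distinct connections,
# because their IP addresses differ.
table.add(("198.51.100.1", 40000), server, "client 1")
table.add(("198.51.100.2", 40000), server, "client 2")
```

Both connections share the server endpoint (port 80), yet each segment can be matched to exactly one connection because the full endpoint pair is the key.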
TCP (cont)
[Figure: two hosts, each with applications attached to TCP ports (ports 80, 25, 26, and 18 are shown) above the transport layer; between the applications, TCP provides a reliable connection-oriented service with no duplicate, lost, misordered, or errored bytes.]

TCP (cont)
• TCP assumes IP is a type C (unreliable) network, so it has all of the most complicated functions of a transport protocol
• Error control detects missing, errored, nonsequential, and duplicate packets
  - Uses sequence numbers, piggybacked ACKs, and adaptive retransmissions
• Flow control using credits
• Connection control: 3-way handshake
• Also, TCP assumes responsibility for congestion avoidance because IP has no congestion control

TCP Header
Source port (16 bits): identifies the application at the source host; allows replies to the sender
Destination port (16 bits): identifies the application at the destination host

TCP Header
Checksum (16 bits): error detection over pseudoheader + TCP segment

TCP Header (cont)
• Pseudoheader is constructed from the IP packet header, including the IP source/destination addresses, the protocol field (= 6 for TCP), and the length of the TCP segment
• Ensures that IP addresses are correct
• Like UDP, this violates the layering principle of the OSI model

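The checksum over pseudoheader + TCP segment can be sketched in Python as follows. This is a minimal illustration for IPv4 using the standard 16-bit one's-complement Internet checksum; the function names are made up for this example, and the caller is assumed to have zeroed the checksum field in the segment before computing.

```python
import socket
import struct


def ones_complement_sum16(data: bytes) -> int:
    """Sum 16-bit big-endian words with end-around carry."""
    if len(data) % 2:
        data += b"\x00"  # pad odd-length data with one zero byte
    total = 0
    for (word,) in struct.iter_unpack("!H", data):
        total += word
        total = (total & 0xFFFF) + (total >> 16)  # fold the carry back in
    return total


def tcp_checksum(src_ip: str, dst_ip: str, segment: bytes) -> int:
    """Checksum over the IPv4 pseudoheader followed by the TCP segment."""
    pseudoheader = struct.pack(
        "!4s4sBBH",
        socket.inet_aton(src_ip),   # IP source address
        socket.inet_aton(dst_ip),   # IP destination address
        0,                          # zero padding byte
        6,                          # protocol field = 6 for TCP
        len(segment),               # length of TCP segment (header + data)
    )
    return ~ones_complement_sum16(pseudoheader + segment) & 0xFFFF
```

A receiver can run the same computation over the segment as received (checksum field included); a result of 0 means no error was detected. Because the IP addresses are part of the sum, a segment delivered to the wrong address fails the check.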
TCP Header
Sequence number (32 bits): number of the first data byte in the segment, except if SYN=1 (then it is the initial sequence number); data bytes are numbered sequentially so the receiver can reconstruct the sender's byte stream

TCP Header (cont)
[Figure: the sending application's bytes n, n+1, n+2, ... are carried in separate segments that can arrive out of order at the receiving application (byte n, then byte n+2, then byte n+1). The sequence number in each TCP header (= number of the first byte in the segment) tells where the segment belongs in the reconstructed byte stream.]

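As a rough illustration of how sequence numbers let the receiver rebuild the byte stream, the Python sketch below splits a stream into numbered segments and reassembles them after reordering and duplication. The helper names and the tiny 8-byte segment size are assumptions made for the example, and 32-bit wraparound is ignored for brevity.

```python
from typing import Dict, List, Tuple


def segment_stream(data: bytes, first_seq: int, max_len: int) -> List[Tuple[int, bytes]]:
    """Split a byte stream into (sequence number, payload) pairs."""
    return [
        (first_seq + offset, data[offset:offset + max_len])
        for offset in range(0, len(data), max_len)
    ]


def reassemble(segments: List[Tuple[int, bytes]], first_seq: int) -> bytes:
    """Rebuild the stream from segments that may arrive in any order."""
    by_seq: Dict[int, bytes] = dict(segments)  # duplicates collapse onto one key
    out = bytearray()
    next_seq = first_seq
    while next_seq in by_seq:                  # stop at the first gap
        out += by_seq[next_seq]
        next_seq += len(by_seq[next_seq])
    return bytes(out)


segments = segment_stream(b"reliable byte stream", first_seq=400, max_len=8)
# Deliver out of order, with segment 0 arriving twice.
shuffled = [segments[2], segments[0], segments[1], segments[0]]
```

Calling `reassemble(shuffled, 400)` returns the original bytes, since each payload is placed by its sequence number rather than by its arrival order.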
TCP Header
Acknowledgement (32 bits): piggybacked ACK tells the sender the next byte that is expected; ACKs are cumulative and refer to the end of the contiguous received data; additional received data, if not contiguous, triggers a duplicate ACK

TCP Header (cont)
[Figure, three steps: the receiver's buffer holds contiguous data through byte 399; the sending application's bytes are sent as segment A (SEQ = 400), segment B (SEQ = 600), and segment C (SEQ = 800).
  1. Segment B is received first: the receiver returns ACK 400.
  2. Segment C is received second: the receiver returns ACK 400 again (duplicate).
  3. Segment A is received third: the buffer is now contiguous through byte 999, and the receiver returns ACK 1000.]

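The three-step example above can be reproduced with a small Python sketch of a receiver that returns cumulative ACKs. The `Receiver` class is hypothetical; the 200-byte segment lengths are inferred from the spacing of the SEQ values, and the sketch assumes segments never partially overlap.

```python
from typing import Dict


class Receiver:
    """Tracks contiguous data and returns cumulative ACK numbers."""

    def __init__(self, next_expected: int) -> None:
        self.next_expected = next_expected
        self.out_of_order: Dict[int, int] = {}  # SEQ -> segment length

    def receive(self, seq: int, length: int) -> int:
        """Accept a segment and return the ACK (next byte expected)."""
        if seq >= self.next_expected:
            self.out_of_order[seq] = length
        # Advance over every buffered segment that is now contiguous.
        while self.next_expected in self.out_of_order:
            self.next_expected += self.out_of_order.pop(self.next_expected)
        return self.next_expected


rx = Receiver(next_expected=400)
# Arrival order from the example: B (SEQ 600), C (SEQ 800), then A (SEQ 400).
acks = [rx.receive(600, 200), rx.receive(800, 200), rx.receive(400, 200)]
# acks == [400, 400, 1000]
```

The second ACK 400 is the duplicate ACK: segment C added data to the buffer, but not contiguously, so the next expected byte did not move.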
TCP Header (cont)
Header length (4 bits): in units of 4 bytes; header is 20 bytes (value = 5) + options (if any)
Reserved (6 bits): all zeros

TCP Header (cont)
Flags (6 bits):
URG: tells if Urgent pointer is used
ACK: tells if Acknowledgement field is used
PUSH: forces immediate transmission at sender
RST: tells receiver to abort and reset connection
SYN: segments for 3-way handshake to set up connection
FIN: segments for 3-way handshake to terminate connection
TCP Header (cont)
Urgent pointer (16 bits): used if URG=1
URG flag: tells if Urgent pointer is used
TCP Header (cont)
• Urgent pointer (2 bytes): points to the number of the first byte after the urgent data in the segment
  - If URG flag = 1, data up to the urgent pointer is urgent data to be processed immediately; the rest of the data is regular (not urgent)
  - Allows "out of band" data (to be processed immediately, out of sequence)
[Figure: a segment made up of the TCP header followed by urgent data and then regular data; the urgent pointer marks the boundary between the two.]

TCP Header (cont)
• Push function:
  - Normally, TCP accumulates data from the sender before transmitting a segment
  - If the sender issues a “push”, TCP will send the ready data, even if the segment will be short (e.g., 1 byte of data)

TCP Header (cont)
Window (16 bits): piggybacked credit advertised by receiver; for flow control of sender

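The header fields described on the preceding slides can be decoded with Python's `struct` module, as in the sketch below. It covers only the fixed 20-byte part of the header (no options); `parse_tcp_header` and the example segment are illustrative assumptions, and the PUSH flag appears under its usual abbreviation PSH.

```python
import struct
from typing import Dict

# Bit masks for the 6 flag bits, from least to most significant.
FLAG_BITS = {"FIN": 0x01, "SYN": 0x02, "RST": 0x04, "PSH": 0x08, "ACK": 0x10, "URG": 0x20}


def parse_tcp_header(segment: bytes) -> Dict[str, object]:
    """Decode the fixed 20-byte part of a TCP header."""
    (src_port, dst_port, seq, ack, offset_reserved, flags,
     window, checksum, urgent) = struct.unpack("!HHIIBBHHH", segment[:20])
    return {
        "src_port": src_port,
        "dst_port": dst_port,
        "seq": seq,
        "ack": ack,
        "header_bytes": (offset_reserved >> 4) * 4,  # header length is in 4-byte units
        "flags": {name for name, bit in FLAG_BITS.items() if flags & bit},
        "window": window,
        "checksum": checksum,
        "urgent_pointer": urgent,
    }


# Hypothetical SYN segment: port 40000 -> port 80, SEQ = 1000, window = 65535.
example = struct.pack("!HHIIBBHHH", 40000, 80, 1000, 0, 5 << 4, 0x02, 65535, 0, 0)
fields = parse_tcp_header(example)
```

Here `fields["header_bytes"]` is 20 (header length value 5) and `fields["flags"]` contains only SYN.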
TCP Retransmissions
• Sender waits for piggybacked acknowledgements
  - ACK is the next expected byte (cumulative: acknowledges all previous bytes)
  - ACK does not acknowledge any additional non-contiguous data received
• Sender will resend if the retransmission timer expires
  - TCP tries to adjust the time-out to just a little longer than the estimated roundtrip time (RTT)
  - But the right time-out is very difficult to determine when RTT varies widely, as it does in the Internet

TCP Adaptive Retransmission Algorithm
• Sender keeps track of returned ACKs as samples of RTT
• Can continually update the estimate of the average roundtrip delay as a weighted average of the new measurement and the old estimate; the retransmission time-out (RTO) is then set to a multiple β of this estimate

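One common way to write this weighted average is the form used in the original TCP specification (RFC 793). The symbols SRTT (smoothed roundtrip time), α, and β follow that document and are assumed here rather than taken from the slide:

```latex
\begin{align*}
\mathrm{SRTT} &\leftarrow \alpha \cdot \mathrm{SRTT} + (1 - \alpha) \cdot \mathrm{RTT}_{\text{sample}}, \qquad 0 < \alpha < 1 \\
\mathrm{RTO}  &= \beta \cdot \mathrm{SRTT}, \qquad \beta > 1
\end{align*}
```

RFC 793 suggests α between 0.8 and 0.9 and β between 1.3 and 2.0, and also clamps RTO between a lower and an upper bound (omitted above). A larger α makes the estimate change more slowly in response to each new sample.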
TCP Adaptive Retransmission Algorithm (cont)
• It was noticed that β should depend on the variance of the roundtrip samples
  - With a fixed β, the estimate can’t keep up with widely varying samples, resulting in unnecessary retransmissions
• Current algorithm adapts RTO based on the mean and variance of RTT

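A sketch of an estimator that adapts RTO to both the mean and the variation of RTT is given below. The update rules and the constants (gains of 1/8 and 1/4, multiplier of 4) are those of RFC 6298; the class name is made up, the "variance" term is really a smoothed mean deviation, and the RFC's minimum RTO and clock-granularity terms are left out for brevity.

```python
class RttEstimator:
    """RTO from smoothed RTT mean and deviation (constants from RFC 6298)."""

    ALPHA = 1 / 8   # gain for the smoothed RTT
    BETA = 1 / 4    # gain for the RTT variation
    K = 4           # RTO = SRTT + K * RTTVAR

    def __init__(self) -> None:
        self.srtt = None
        self.rttvar = None

    def update(self, sample: float) -> float:
        """Fold in one RTT sample (seconds) and return the new RTO."""
        if self.srtt is None:            # first measurement
            self.srtt = sample
            self.rttvar = sample / 2
        else:
            self.rttvar = (1 - self.BETA) * self.rttvar + self.BETA * abs(self.srtt - sample)
            self.srtt = (1 - self.ALPHA) * self.srtt + self.ALPHA * sample
        return self.srtt + self.K * self.rttvar


steady, bursty = RttEstimator(), RttEstimator()
for s in [0.10, 0.10, 0.10, 0.10, 0.10, 0.10]:   # small variance
    rto_steady = steady.update(s)
for s in [0.10, 0.30, 0.05, 0.40, 0.10, 0.35]:   # large variance
    rto_bursty = bursty.update(s)
```

With the steady samples the RTO settles close to the mean RTT, while the widely varying samples push the RTO well above the mean, which is the behavior the slide describes.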
TCP Adaptive Retransmission Algorithm (cont)
[Figure: two panels, "RTT with small variance" and "RTT with large variance"; each shows packets and their ACKs, with the mean RTT, the standard deviation, and the resulting RTO marked.]

TCP Adaptive Retransmission Algorithm (cont)
• Problem: acknowledgement ambiguity
  - Suppose a segment is transmitted twice, and then ACKed
  - Does the ACK refer to the first transmission or to the duplicate?
[Figure: two timelines, each showing a packet, its duplicate, and a single ACK; the sender cannot know which case is true.]

TCP Adaptive Retransmission Algorithm (cont)
• If the ACK is assumed to answer the first transmission but really answers the duplicate, the measured RTT is too large → RTO becomes too long
• If the ACK is assumed to answer the duplicate but really answers the first transmission, the measured RTT is too small → RTO becomes too short, causing unnecessary retransmissions

TCP Adaptive Retransmission Algorithm (cont)
• Karn's algorithm: timer backoff strategy
  - RTT estimate is adjusted only for unambiguous ACKs
  - If a segment is sent twice due to time-out, ignore the measured delay to get its ACK and instead increase the next RTO
  - Rate of increase is implementation-dependent; RTO usually increases by a factor of 2
  - On the next unambiguous ACK, recompute the RTT estimate and reset RTO

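The steps of Karn's algorithm can be sketched in Python as below. The `KarnTimer` class is hypothetical, and the smoothing weights and the "RTO = 2 x estimate" rule are placeholder choices made only so the example runs; the points being illustrated are the backoff on time-out and the rule of ignoring ambiguous samples.

```python
class KarnTimer:
    """Timer backoff plus 'ignore ambiguous samples' (Karn's algorithm)."""

    def __init__(self, initial_rto: float, backoff: float = 2.0) -> None:
        self.rtt_estimate = None
        self.rto = initial_rto
        self.backoff = backoff

    def on_timeout(self) -> None:
        """Segment is retransmitted: back off the timer, take no RTT sample."""
        self.rto *= self.backoff

    def on_ack(self, measured_rtt: float, was_retransmitted: bool) -> None:
        """Use the sample only if the ACK is unambiguous."""
        if was_retransmitted:
            return                        # ambiguous ACK: keep the backed-off RTO
        if self.rtt_estimate is None:
            self.rtt_estimate = measured_rtt
        else:
            # Placeholder smoothing weights, chosen only for illustration.
            self.rtt_estimate = 0.875 * self.rtt_estimate + 0.125 * measured_rtt
        self.rto = 2.0 * self.rtt_estimate   # reset RTO from the fresh estimate


timer = KarnTimer(initial_rto=1.0)
timer.on_timeout()                                        # RTO backs off to 2.0
timer.on_ack(measured_rtt=2.5, was_retransmitted=True)    # ambiguous: ignored
rto_after_ambiguous = timer.rto
timer.on_ack(measured_rtt=0.2, was_retransmitted=False)   # unambiguous: used
```

After the ambiguous ACK the RTO stays at its backed-off value; only the later unambiguous ACK updates the estimate and resets the RTO.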
TCP Duplicate Detection
• Receiver can get duplicate segments caused by early time-outs, lost ACKs, or late ACKs
  - Should be no confusion because duplicates of a TCP segment are identified by the same sequence number
• Large range of sequence numbers needed to avoid ambiguity
  - TCP uses 32 bits (about 4 billion values) so sequence numbers will not wrap around in a short time
  - Receiver will not be confused by duplicate segments with the same number

TCP Duplicate Detection (cont)
• For duplicate segments, receiver assumes the first ACK was lost and will ACK the duplicate
• Sender will not be confused by duplicate ACKs
• Possible confusion arises if a duplicate TCP segment arrives after the connection is closed and a new connection is opened
[Figure: timeline in which CLS (FIN=1) segments are exchanged and the connection closes, RFC (SYN=1) segments are exchanged and a new connection opens, and then an old duplicate TCP segment arrives.]

TCP Duplicate Detection (cont)
• TCP segment from an old connection could arrive during a new connection and be mistaken for a valid TCP segment
• TCP avoids this confusion by:
  - New connection starts with a random initial sequence number
  - Duplicate segments arriving during the new connection will probably have a sequence number outside of the new range
  - Any such out-of-range duplicate segments are discarded

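The idea can be sketched in Python as a random starting point plus a range check done modulo 2^32. The function names are invented for the example, and `random.Random` is used only so the sketch is repeatable; it is an assumption of this illustration and is not how a production stack would choose initial sequence numbers.

```python
import random

SEQ_SPACE = 2 ** 32  # 32-bit sequence numbers


def choose_isn(rng: random.Random) -> int:
    """Pick a random initial sequence number (illustrative only)."""
    return rng.randrange(SEQ_SPACE)


def in_window(seq: int, next_expected: int, window: int) -> bool:
    """True if seq lies in [next_expected, next_expected + window), modulo 2**32."""
    return (seq - next_expected) % SEQ_SPACE < window


rng = random.Random(14)                  # fixed seed so the example is repeatable
isn = choose_isn(rng)
next_expected = (isn + 1) % SEQ_SPACE    # first byte expected after the SYN
```

A segment left over from an old connection carries a sequence number unrelated to `isn`, so with a 64-Kbyte window it passes `in_window` only with probability of roughly 65,535 / 2^32.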
TCP Duplicate Detection (cont)
[Figure: the byte-number space from 0 to 2^32 bytes. A new TCP connection chooses its initial byte number at random and uses only a small range of byte numbers; an old segment from another connection will more likely fall outside of the expected range when the range is very big (as in TCP).]

TCP Duplicate Detection (cont)
• Also, TCP keeps a record of the old connection for a Timed Wait state after the connection is closed
  - Time = 2 x Maximum Segment Lifetime (MSL = longest time a TCP segment might take to arrive)
  - Any duplicate segments received during this time are discarded

TCP Connection Set-up
• TCP 3-way handshake:
  1. A → B: SYN=1, SEQ=x (connection request; A's initial sequence number is x)
  2. B → A: SYN=1, SEQ=y, ACK=x+1 (connection acknowledgement; B's initial sequence number is y, so its first data byte will be y+1)
  3. A → B: SEQ=x+1, ACK=y+1 (connection confirm; A sends data starting at byte x+1)

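The sequence and acknowledgement numbers in the handshake can be traced with a short Python sketch. It relies on the standard TCP rule that a SYN consumes one sequence number, so the third segment carries SEQ = x+1 and no SYN flag; the function name and the dictionary layout are assumptions made for the example.

```python
from typing import Dict, List

SEQ_SPACE = 2 ** 32


def three_way_handshake(x: int, y: int) -> List[Dict[str, object]]:
    """Return the three segments exchanged when A (ISN x) connects to B (ISN y)."""
    return [
        # 1. Connection request from A.
        {"from": "A", "SYN": 1, "SEQ": x, "ACK": None},
        # 2. B acknowledges A's SYN and sends its own SYN.
        {"from": "B", "SYN": 1, "SEQ": y, "ACK": (x + 1) % SEQ_SPACE},
        # 3. A acknowledges B's SYN; the SYN flag is no longer set.
        {"from": "A", "SYN": 0, "SEQ": (x + 1) % SEQ_SPACE, "ACK": (y + 1) % SEQ_SPACE},
    ]


segments = three_way_handshake(x=1000, y=5000)
```

For x = 1000 and y = 5000, B replies with ACK 1001 and A completes the handshake with SEQ 1001, ACK 5001.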
TCP Connection Set-up (cont)
• As seen before, 3-way handshake works even if both initiate connection at the same time
• Use of retransmission timer may cause duplicate SYN segments, but there is no confusion
[Figure: three message exchanges between host A and host B.
  Normal: A sends SYN i; B replies SYN j, ACK i; A replies SEQ i, ACK j.
  Old SYN: an old SYN i reaches B; B replies SYN j, ACK i; A replies RST, ACK j (connection is rejected by A).
  Delayed SYN/ACK: A sends SYN i; an old SYN k, ACK m reaches A; A replies RST, ACK k (connection is rejected by A); then SYN j, ACK i arrives and A replies SEQ i, ACK j (new connection is accepted).]

TCP Connection Close
• Handshake procedure similar to the 3-way handshake for connection set-up
• Connection can be closed in one direction by a segment with FIN=1
  - No more data is accepted in this direction
• Other end will immediately ACK to prevent getting duplicate FIN segments
  - Delays its own FIN until the application is ready to close the connection in the reverse direction

Lecture 14, TCP-Part 2

Outline
• TCP flow control
• TCP congestion avoidance
• Slow start
• Fast retransmit and recovery

Flow Control vs Congestion Control
• Flow control: destination can slow down source through feedback control
  - Destination may not be ready to receive data
  - Host-to-host control (network not involved)
• Congestion control: network should not get overloaded with traffic
  - May be handled by hosts (e.g., TCP), the network (e.g., resource reservations), or both hosts and network cooperating (e.g., congestion notification)

Flow Control
• 2 approaches to flow control:
  - Window-based control (typically sliding window): destination constrains how many packets (volume) can be in transit by slowing down ACKs or withholding credits
    - Destination simply advertises the amount of its unused buffer space
    - Inefficient for high-speed networks
  - Rate-based control: destination constrains the sender’s transmission rate (not volume)
    - Suited for streaming-type applications that need a minimum bandwidth

TCP Flow Control
• TCP flow control operates in units of bytes (not segments)
• Destination piggybacks ACK (4 bytes) and window advertisement (2 bytes) in data segments going to source
  - Advertised window = number of bytes it is ready to receive beyond last ACK’ed byte (i.e., a credit)
  - Example: <ACK n+1, window advertisement = m> gives the sender permission to send up to byte n+m
  - Window advertisement = 0 means stop sending

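The credit arithmetic in the example can be checked with two small Python helpers. The function names are made up for this sketch, and sequence-number wraparound is ignored to keep the arithmetic plain.

```python
def last_byte_allowed(ack: int, window: int) -> int:
    """Highest byte number the sender may transmit under <ACK, window>."""
    return ack + window - 1


def usable_window(ack: int, window: int, next_to_send: int) -> int:
    """Bytes the sender may still send, given what it has already sent."""
    return max(0, ack + window - next_to_send)
```

With ACK = n+1 and window = m, `last_byte_allowed` gives (n+1) + m - 1 = n+m, matching the example above. For instance, `last_byte_allowed(400, 200)` is 599, and a window advertisement of 0 makes `usable_window` return 0.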
TCP Flow Control (cont)
• Possible deadlock if destination closes window, then opens window but this credit is lost
  - Destination is expecting data while sender thinks window is closed
• Sender starts a persist timer when window is closed
  - If timer expires, sender will send a window probe (TCP segment with 1 byte of data) to see if window has been increased

TCP Flow Control (cont)
[Figure: timeline between sender and destination. The destination sends ACK=x, credit=0, and later ACK=x, credit=m, which is lost, so both hosts are waiting. When the persist timer expires, the sender sends a probe with one byte of data; the probe should trigger a duplicate of the last credit or a new credit (ACK=x, credit=m). The process continues until credit is received or the connection is closed; the persist timer doubles each time, up to 60 sec.]

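The persist-timer schedule (doubling each time, up to 60 sec) can be sketched as a Python generator. The 5-second starting value is a hypothetical choice for the example, since the slides give only the doubling rule and the cap.

```python
from itertools import islice
from typing import Iterator


def persist_intervals(initial: float = 5.0, cap: float = 60.0) -> Iterator[float]:
    """Yield successive persist-timer intervals: doubling, capped at `cap` seconds."""
    interval = initial
    while True:
        yield min(interval, cap)
        interval = min(interval * 2, cap)


first_six = list(islice(persist_intervals(), 6))
# first_six == [5.0, 10.0, 20.0, 40.0, 60.0, 60.0]
```

A sender would send one window probe each time the current interval expires and stop iterating as soon as a nonzero credit arrives or the connection closes.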
Congestion Control
• Without congestion control, Internet would reach congestion collapse
• Since IP is best effort, sender’s best strategy is to send as much data as possible to hog the network and increase its chances of successful delivery
• Everyone following this strategy will increase load on network, pushing it into congestion
• Increasing congestion will cause more retransmissions → higher load will increase congestion even more → congestion collapse: very long delays; network full of duplicate packets; few packets delivered

Congestion Control (cont)
[Figure: throughput versus offered load, with three curves: ideal, controlled, and uncontrolled (congestion collapse).]

Congestion Control (cont)
• Congestion control can be:
  - Window-based
    - Traditional sliding window is naturally responsive to congestion
    - Congestion increases → RTT increases → ACKs slow down → sender slows down
  - Rate-based
    - Better suited for streaming-type applications
    - Easier to think in terms of fair shares of bandwidth
• TCP congestion control is window-based

Congestion Control (cont)
• Congestion control can be:
  - Preventive: traffic is blocked from entering network to prevent congestion from occurring
    - Need some type of admission control procedure or explicit congestion notification
  - Reactive: traffic is restricted after congestion occurs
    - Can be implemented in hosts without complexity of admission control or congestion notification
    - Congestion prevention is preferred when possible
• TCP uses reactive congestion control because IP layer does nothing

Congestion Control (cont)
• Closely related, congestion control can be:
  - Closed loop
    - Continuous feedback during transmission allows sender to adapt its rate to current congestion state
  - Open loop
    - Traffic is either admitted or blocked; once admitted, transmission is not controlled by feedback but source must conform to its specified rate
    - Good for streaming-type applications, if admission control is possible
• TCP uses closed loop control (keeps routers simple)

Congestion Control (cont)
• Closed loop control uses feedback that is either:
  - Explicit
    - Congested routers send explicit congestion notification
    - Sender can adapt its rate to current congestion state
  - Implicit
    - Sender must adapt its rate by inferring the congestion state, typically from packet losses and RTT
    - No information from routers
    - Performance will not be as good as explicit feedback
• TCP uses implicit feedback (keeps routers simple)

TCP Congestion Avoidance
• TCP sender reacts to congestion in network by keeping an adaptive “congestion window”
  - Congestion window (cwnd) = amount of data that is appropriate for level of network congestion
• Current sending window = min(window advertisement, congestion window)
  - Sender is constrained by either network congestion or the destination
• Congestion avoidance algorithm: adapts congestion window by AIMD (additive increase, multiplicative decrease)

TCP Congestion Avoidance (cont)
• Multiplicative decrease: idea is to back off senders quickly (exponentially) when congestion is detected
  - TCP assumes a lost segment (detected by retransmission timeout) is caused by congestion, and not by an error in RTO
  - If segment is lost (and retransmitted), decrease congestion window by half
  - If loss continues, congestion window keeps decreasing by half (down to one segment)

TCP Congestion Avoidance (cont)
[Figure: idealized cwnd versus time; cwnd increases linearly, and each retransmission timeout drops cwnd to half.]

TCP Congestion Avoidance (cont)
• Why back off window exponentially?
  - Some believe queues build exponentially during congestion → sources should back off just as quickly
• Additive increase: when congestion abates (an ACK for new data), increase congestion window linearly (one more segment per RTT)
  - Why not increase multiplicatively?
  - Leads to instability and oscillations (easy to cause congestion, harder to recover)

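A minimal Python sketch of AIMD follows, tracing the congestion window in whole segments with one event per RTT. The `aimd` function, the starting window of 8 segments, and the event sequence are all assumptions chosen for illustration.

```python
from typing import List


def aimd(events: List[str], cwnd: int = 8, min_cwnd: int = 1) -> List[int]:
    """Trace cwnd (in segments), one event per RTT: 'ack' or 'loss'."""
    trace = []
    for event in events:
        if event == "loss":
            cwnd = max(min_cwnd, cwnd // 2)   # multiplicative decrease: halve
        else:
            cwnd += 1                          # additive increase: +1 segment per RTT
        trace.append(cwnd)
    return trace


trace = aimd(["ack", "ack", "loss", "ack", "loss", "loss", "loss", "loss"])
# trace == [9, 10, 5, 6, 3, 1, 1, 1]
```

The trace shows the asymmetry: repeated losses drive the window down to one segment within a few RTTs, while recovery from there proceeds only one segment per RTT.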
TCP Slow Start
• Idea: if network is in equilibrium (running stably with a full window in transit on each connection) when a new connection starts, or when a connection is recovering from a long period of congestion, sending a large initial window of segments might upset equilibrium and cause oscillations or congestion
• Slow start: idea is to start congestion window at one segment and gradually increase rate
  - Increase congestion window by one segment for each ACK that is returned
  - Attempts to probe network for acceptable sending rate

TCP Slow Start (cont)
• Slow in the sense of starting with a small window, but rate of increase may not be slow
  - Window could increase exponentially: send 1 → get 1 ACK, increase window to 2 → get 2 ACKs, increase window to 4, ...
  - This is actually a fast rate of increase to allow sender to reach equilibrium point quickly (although gently)
  - Eventually, a segment will be lost
    - Set “slow start threshold” SST = 1/2 current congestion window (the equilibrium point); then go into congestion avoidance

Slow Start and Congestion Avoidance
• These are separate algorithms but are implemented together because both are triggered by time-out and both change the congestion window
• New connection begins with congestion window = 1 segment, SST = 65,535 bytes
• Go into slow start to search for acceptable window
• Congestion is indicated by packet loss evidenced by timeout
  - Set SST = 1/2 current congestion window

Slow Start and Congestion Avoidance (cont)
• If time-out occurred (this assumes that adaptive timer is accurate, so time-out means a lost segment), set congestion window = 1 segment and go into slow start
• Slow start can continue until window reaches SST (half of window when congestion occurred)
  - Then go into congestion avoidance phase: congestion window can increase beyond SST but at a more cautious rate (as it approaches the equilibrium point when congestion occurred)

Slow Start and Congestion Avoidance (cont)
• In congestion avoidance phase, congestion window increases linearly as long as ACKs are returned
• Whenever congestion window ≤ SST, it’s in slow start; if congestion window > SST, then it’s in congestion avoidance
[Figure: congestion window over time, alternating between slow start and congestion avoidance phases.]

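The interplay of the two algorithms can be traced with a coarse per-RTT simulation in Python. This is only a sketch under stated assumptions: windows are counted in whole segments, the initial SST of 64 segments stands in for the 65,535-byte value, the floor of 2 segments on SST is an added safeguard, and growth in slow start is capped at SST + 1 to approximate the per-ACK switch into congestion avoidance.

```python
from typing import List, Set, Tuple


def simulate(rounds: int, timeout_rounds: Set[int], sst: int = 64) -> List[Tuple[int, str]]:
    """Per-RTT trace of (cwnd in segments, phase); time-outs occur in the given rounds."""
    cwnd = 1
    trace = []
    for rtt in range(rounds):
        if rtt in timeout_rounds:
            sst = max(cwnd // 2, 2)   # SST = 1/2 current congestion window
            cwnd = 1                  # restart from one segment
        phase = "slow start" if cwnd <= sst else "congestion avoidance"
        trace.append((cwnd, phase))
        if phase == "slow start":
            # +1 segment per ACK doubles cwnd each RTT, until it passes SST.
            cwnd = min(cwnd * 2, sst + 1)
        else:
            cwnd += 1                 # +1 segment per RTT
    return trace


trace = simulate(rounds=12, timeout_rounds={5}, sst=64)
windows = [cwnd for cwnd, _ in trace]
# windows == [1, 2, 4, 8, 16, 1, 2, 4, 8, 16, 17, 18]
```

The time-out in round 5 finds cwnd at 32 segments, so SST becomes 16; the window then doubles back up to 16 and grows by one segment per RTT after that.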
Fast Retransmit and Recovery Algorithm
• Destination will send a duplicate ACK whenever it gets an out-of-order segment
  - Sender does not know if duplicate ACKs mean a segment was lost or segments were received out of order
• Fast retransmit algorithm:
  - Assumes that out-of-order segments will result in only 1 or 2 duplicate ACKs, and that 3 or more duplicate ACKs mean a segment was lost

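The duplicate-ACK rule can be sketched as a small counter in Python. The `DupAckCounter` class is hypothetical; it signals a retransmission exactly once, on the third duplicate of the same ACK number.

```python
from typing import List, Optional

DUP_ACK_THRESHOLD = 3


class DupAckCounter:
    """Counts duplicate ACKs and signals when to fast-retransmit."""

    def __init__(self) -> None:
        self.last_ack: Optional[int] = None
        self.dup_count = 0

    def on_ack(self, ack: int) -> Optional[int]:
        """Return the sequence number to retransmit, or None."""
        if ack == self.last_ack:
            self.dup_count += 1
            if self.dup_count == DUP_ACK_THRESHOLD:
                return ack            # the receiver is still waiting for this byte
        else:
            self.last_ack = ack       # new data was acknowledged
            self.dup_count = 0
        return None


counter = DupAckCounter()
results: List[Optional[int]] = [counter.on_ack(a) for a in [400, 400, 400, 400, 400, 1000]]
# results == [None, None, None, 400, None, None]
```

The first ACK 400 is the original; the next three are duplicates, and the third duplicate triggers retransmission of the segment starting at byte 400.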
Fast Retransmit and Recovery Algorithm (cont)
[Figure: receiver's buffer with a missing segment followed by three out-of-order data segments. After the first ACK, these out-of-order segments will cause 3 duplicate ACKs → TCP assumes that the missing segment is lost.]

Fast Retransmit and Recovery Algorithm (cont)
• That lost segment is retransmitted immediately (even if retransmit timer hasn’t expired)
• Fast recovery: do congestion avoidance but not slow start, because duplicate ACKs indicate that some segments (after lost segment) were delivered, so congestion is not too bad
  - Set SST = 1/2 congestion window
  - Reduce congestion window to half + 3 segments (to allow for 3 segments already at dest.)
  - Expand congestion window linearly until next lost segment

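The window adjustments for fast recovery can be contrasted with the time-out case in a short Python sketch. Windows are counted in whole segments, the function names are invented for the example, and the floor of 2 segments on SST is an added assumption (it follows the usual rule that the threshold should not fall below two segments).

```python
from typing import Tuple


def fast_recovery(cwnd: int) -> Tuple[int, int]:
    """On the 3rd duplicate ACK: return (new SST, new cwnd), in segments."""
    sst = max(cwnd // 2, 2)
    return sst, sst + 3       # half the window + 3 segments already at dest.


def on_timeout(cwnd: int) -> Tuple[int, int]:
    """On a retransmission time-out: halve SST and fall back to slow start."""
    return max(cwnd // 2, 2), 1


after_dup_acks = fast_recovery(cwnd=20)   # (10, 13)
after_timeout = on_timeout(cwnd=20)       # (10, 1)
```

Both cases set SST to half the window, but after duplicate ACKs the sender keeps a window of 13 segments and continues in congestion avoidance, whereas after a time-out it restarts from 1 segment in slow start.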
Fast Retransmit and Recovery Algorithm (cont)
• Retransmissions around time = 10, 14, and 21 sec
• SST is set to 1/2 congestion window, but window is allowed to increase with each duplicate ACK
• When missing segment is ACKed, congestion window closes down to SST