TCP: Reliable Stream Transport
Chapter 13
Introduction
• Application programs often need to send large
volumes of data from one computer to another
• An unreliable connectionless delivery system for
large volume transfers is tedious and annoying
– Why?
• There is a need for reliable stream delivery which
– isolates the application programs from the
details of networking and
– defines a uniform interface for stream transfer
Properties of a Reliable Delivery
Service
• Stream Orientation - data is a stream of bytes
• Virtual Circuit Connection - a connection is
agreed upon, data is transferred, disconnection
• Buffered Transfer - the sending application chooses how much data to write at a time; the protocol software may divide or combine that data into different-sized units before delivering it to the destination application
• Unstructured Stream - structure boundaries are
not preserved
• Full Duplex Connection - concurrent transfer in
both directions
Providing Reliability
• Reliable protocols usually use positive acknowledgement with retransmission (PAR), which requires the receiver to send an acknowledgement (ACK) as data arrives
• The sender keeps a copy of the packet that was
sent and waits for an ACK
– if an ACK is received, next packet is sent
– if an ACK is not received, after a timer expires, the
packet is retransmitted
• Figure 13.1 is a timeline for normal case
• Figure 13.2 is a timeline showing packet loss
• What to do about duplicate packets or ACKs?
– Remember the sequence # field in IP header?
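The following sketch (not from the text) illustrates the send side of positive acknowledgement with retransmission just described; send_packet() and wait_for_ack() are hypothetical helpers supplied by the caller, and wait_for_ack() is assumed to return None on timeout.

```python
# Sketch of the sender side of PAR: transmit, keep a copy, wait for an
# ACK, and retransmit if the timer expires before the ACK arrives.

TIMEOUT = 2.0  # seconds to wait before retransmitting (illustrative value)

def send_reliably(packets, send_packet, wait_for_ack):
    seq = 0
    for data in packets:
        while True:
            send_packet(seq, data)          # transmit and keep a copy
            ack = wait_for_ack(TIMEOUT)     # block until ACK or timeout (None)
            if ack == seq:                  # expected ACK: move to next packet
                break
            # timeout, or a duplicate/old ACK: retransmit the same packet
        seq += 1                            # sequence numbers let the receiver detect duplicates
```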
Sliding Windows
• In the case of Figure 13.1, the network is idle
during machine delays
• We can allow the sender to transmit multiple
packets before waiting for an ACK
• A protocol can place a small window over the
sequence of packets to send and transmit all
packets in that window as in Figure 13.3
• The window slides forward as ACKs are sent for
packets in the window
Sliding Windows
• The performance of sliding window protocols
depends on:
– the window size
– the speed at which the network accepts packets
• See Figure 13.4 which timelines 3 packets
• A window of size one is just like a positive
acknowledgement with retransmission, or PAR
• A steady state is reached when the sender can
transmit packets as fast as the network can - the
network is kept busy
Sliding Windows
• A sliding window protocol must:
– remember which packets have been acknowledged
– keep a separate timer for each unacknowledged packet
• if a packet is lost, timer expires and sender retransmits
• When the sender slides its window, it moves past
all acknowledged packets
• Similar software exists in the receiver
• The window partitions the packets into:
– those sent and ACKed, those being transmitted and
those not yet transmitted
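A minimal sketch of the send-side bookkeeping for a sliding window protocol, under the assumptions above (ACKs arrive per packet, each unacknowledged packet has its own timer, handled elsewhere). The class and variable names are illustrative, not from the text.

```python
# Sliding-window sender: at most window_size packets may be outstanding;
# the window slides forward as packets are acknowledged.

class SlidingWindowSender:
    def __init__(self, window_size, transmit):
        self.window_size = window_size
        self.transmit = transmit   # callback that puts a packet on the network
        self.base = 0              # oldest unacknowledged sequence number
        self.next_seq = 0          # next sequence number to use
        self.unacked = {}          # seq -> packet, awaiting ACK

    def can_send(self):
        # Send only while fewer than window_size packets are outstanding
        return self.next_seq < self.base + self.window_size

    def send(self, packet):
        assert self.can_send()
        self.unacked[self.next_seq] = packet
        self.transmit(self.next_seq, packet)
        self.next_seq += 1

    def on_ack(self, seq):
        # Slide the window past every packet acknowledged so far
        self.unacked.pop(seq, None)
        while self.base not in self.unacked and self.base < self.next_seq:
            self.base += 1
```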
The Transmission Control
Protocol
• TCP is a protocol, not a piece of software
• TCP defines the concepts; it does not specify the implementation
• What does TCP specify?
– the format of data and acknowledgements
– procedures to ensure that data arrives correctly
– how software distinguishes among multiple destinations
on a given machine
– how to recover from errors
– how to set up a connection to transfer data
The Transmission Control
Protocol
• What does the protocol not provide?
– does not dictate the details of the interface between
TCP and the application layer
• TCP can be used with a variety of packet delivery
systems, not just IP
• TCP can use dial-up lines, local area networks,
high speed fiber, long haul, etc.
Ports, Connections and Endpoints
• TCP resides above the IP layer, along with UDP as
shown in Figure 13.5
• TCP
– allows multiple application programs to communicate
concurrently
– demultiplexes incoming TCP traffic among applications
– uses protocol port numbers to identify the ultimate
destination within a machine
Ports, Connections and Endpoints
• A TCP port identifies virtual circuit connections, not a single object the way a UDP port does
• Connections are identified by a pair of endpoints
– An endpoint is a pair of integers (host, port) where host
is the IP address for a host and port is a TCP port on
that host
• Example:
– A connection between a machine at MIT (18.26.0.36)
and a machine at Purdue (128.10.2.3) is defined by
(18.26.0.36, 1069) and (128.10.2.3, 25)
Ports, Connections and Endpoints
• Multiple connections may be made to the same protocol port on a given machine
• Multiple connections may share a complete
endpoint
• Because TCP identifies a connection by a pair of
endpoints, a given TCP port number can be shared
by multiple connections on the same machine
– this means that a program can provide concurrent
service to multiple connections simultaneously without
giving unique port numbers for each one (email)
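As a small illustration (not from the text), the snippet below shows two connections that share the server endpoint from the earlier example; the second client port (1184) is made up.

```python
# Two TCP connections sharing the server endpoint (128.10.2.3, 25).
# The endpoint *pairs* differ, so TCP can demultiplex the traffic.

conn1 = (("18.26.0.36", 1069), ("128.10.2.3", 25))
conn2 = (("18.26.0.36", 1184), ("128.10.2.3", 25))   # different client port

assert conn1 != conn2          # distinct connections
assert conn1[1] == conn2[1]    # yet the server endpoint is shared
```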
Passive and Active Opens
• TCP is connection-oriented and requires both
endpoints to agree to participate
– Thus, the application program on one end performs a
passive open by telling its operating system that it is
willing to accept an incoming connection
– At that time, the O.S. assigns a TCP port number for
this end
– The application program at the other end issues an
active open to its O.S. to request a connection
– The two establish and verify a connection
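A short sketch of the two kinds of open using Python's standard socket API; port 5000 and the loopback address are arbitrary example values.

```python
# Passive open (server) and active open (client) with Python sockets.
import socket

def passive_open():
    srv = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    srv.bind(("", 5000))        # the OS associates a local TCP port
    srv.listen(1)               # tell the OS we are willing to accept a connection
    conn, peer = srv.accept()   # block until the handshake completes
    return conn

def active_open():
    cli = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    cli.connect(("127.0.0.1", 5000))   # request a connection (sends SYN)
    return cli
```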
Segments, Streams and Sequence
Numbers
• The data stream is viewed as a sequence of octets
that are divided into segments for transmission
• Each segment transmitted in a single IP datagram
• Octets in the TCP data stream are numbered
sequentially, and a sender keeps 3 pointers for
each connection (see Figure 13.6):
– the left edge of the sliding window, separating acknowledged octets from unacknowledged ones
– the right edge of the sliding window, marking the highest sequence number that can be sent before more ACKs arrive
– the boundary between octets already sent and octets not yet sent
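The following tiny sketch (with made-up values) restates the three pointers of Figure 13.6 as arithmetic.

```python
# Illustrative values for the three send-side pointers of Figure 13.6.
ack_boundary = 2_000                 # left edge: octets below this are sent and ACKed
send_limit   = ack_boundary + 8_192  # right edge: left edge plus the window size
next_octet   = 6_500                 # boundary between octets sent and not yet sent

usable = send_limit - next_octet     # octets that may still be sent right now
print(usable)                        # -> 3692
```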
Segments, Streams and Sequence
Numbers
• The receiver keeps similar window information
• Because TCP connections are full-duplex, the TCP
software maintains window information at each
end for both sending and receiving
– thus, four windows per connection
Variable Window Size and
Flow Control
• TCP allows the window size to vary over time
• Each ACK indicates how many octets have been
received
– It also indicates a window advertisement which states
how many additional octets the receiver is willing to
receive (specifies the receiver’s current buffer size)
• In response to an increased window advertisement,
the sender increases the size of its sliding window
• In response to a decreased window advertisement,
the sender decreases the size of its sliding window
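A sketch (not from the text) of how a sender can turn the advertised window into the amount it may still transmit; the names are illustrative.

```python
# The usable window is whatever the receiver advertised minus the data
# already in flight (sent but not yet acknowledged).

def usable_window(advertised_window, bytes_unacked):
    """Octets the sender may still transmit without overrunning the receiver."""
    return max(0, advertised_window - bytes_unacked)

# Example: receiver advertises 4096 octets and 1500 are already in flight,
# so 2596 more may be sent before waiting for further ACKs.
print(usable_window(4096, 1500))   # -> 2596
```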
Variable Window Size and
Flow Control
• A sliding window
– provides flow control
– provides reliable transfer
• To stop all transfer, the receiver can send a
window advertisement of zero
– even then, the sender is still allowed to transmit a segment with the urgent bit set
– To avoid deadlock, the sender periodically probes a zero window to ask whether buffer space has become available
Variable Window Size and
Flow Control
• Two flow problems
– internet flow control between source and ultimate
destination (called end-to-end flow control)
– flow control for intermediate systems (routers)
• When intermediate systems become overloaded,
we have a condition called congestion
– We will look at ways to handle congestion later
TCP Segment Format
• The unit of transfer between the TCP software on
two machines is called a segment
• Segments are exchanged to:
– establish connections
– transfer data
– send ACKs
– advertise window sizes
– close connections
• Using piggybacking, an ACK from one machine to another
may travel in the same segment with data travelling in the
opposite direction (in reality, this does not happen often)
TCP Segment Format
• A segment is made up of a header and data
• The TCP header consists of:
– source and destination ports which identify applications
– a sequence number identifying this data’s position in
the sender’s byte stream (stream flowing this direction)
– an ACK number identifying which octet the source is
expecting next (stream flowing opposite direction of this stream)
– header length (in 32-bit multiples) to cover options
– 6 code bits to indicate purpose and contents of segment
(URG, ACK, PSH, RST, SYN, FIN)
TCP Segment Format
– a window advertisement stating how much data the receiver is willing to accept (its current buffer size)
• a window advertisement accompanies every segment, piggybacked on the ACK
– checksum
– urgent pointer which specifies the end of the urgent
data
• what marks the beginning of the urgent data?
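The sketch below unpacks the fixed 20-byte TCP header described above, following the standard field layout; `segment` stands for the raw bytes handed up by IP.

```python
# Parse the fixed part of a TCP header: ports, sequence and ACK numbers,
# header length, the six code bits, window, checksum, and urgent pointer.
import struct

def parse_tcp_header(segment: bytes):
    (src_port, dst_port, seq, ack,
     off_flags, window, checksum, urgent) = struct.unpack("!HHIIHHHH", segment[:20])
    header_len = (off_flags >> 12) * 4   # header length field counts 32-bit words
    flags = off_flags & 0x3F             # URG, ACK, PSH, RST, SYN, FIN bits
    return {
        "src_port": src_port, "dst_port": dst_port,
        "seq": seq, "ack": ack,
        "header_len": header_len, "flags": flags,
        "window": window, "checksum": checksum, "urgent_ptr": urgent,
    }
```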
Out of Band Data
• It may be necessary to send messages that are not
a part of the regular data in the stream, out of band
data
– Example: sending a keyboard sequence that interrupts
or aborts the program at the other end
• This data is specified as urgent and is indicated by
the URG bit and is in the data portion of this
segment
– its end is known by the number in the Urgent Pointer
field
Maximum Segment Size Option
• Both ends need to agree on a maximum segment
size they will transfer
– If the two endpoints are on the same network, this will
be the network’s MTU
– If not on the same network, they will go through a process of MTU discovery, or choose 536 (the default IP datagram size of 576 minus the TCP and IP headers)
• What happens when:
– segment size is too small? network utilization is low
– segment size is too big? fragmentation
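The arithmetic behind those numbers, assuming the standard 20-byte IP and 20-byte TCP headers:

```python
# Maximum segment size = MTU (or default datagram size) minus the headers.
ethernet_mtu = 1500
default_datagram = 576

print(ethernet_mtu - 20 - 20)      # 1460: typical MSS on an Ethernet
print(default_datagram - 20 - 20)  # 536: the default MSS mentioned above
```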
TCP Checksum Computation
• A pseudo header is prepended
• Zeroes are added to make the segment a multiple
of 16 bits
• The checksum is the one's complement of the one's-complement sum of all 16-bit words, with the checksum field treated as zero during the computation
• On the receiving side, IP passes source and
destination IP addresses when it passes the
segment
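A sketch of the computation just described; it assumes the segment already has its checksum field zeroed, and uses socket.inet_aton only to turn dotted-decimal addresses into bytes.

```python
# Internet checksum over the TCP pseudo header plus the segment.
import socket, struct

def ones_complement_sum(data: bytes) -> int:
    if len(data) % 2:                  # pad to a multiple of 16 bits with zeroes
        data += b"\x00"
    total = 0
    for i in range(0, len(data), 2):
        total += (data[i] << 8) | data[i + 1]
        total = (total & 0xFFFF) + (total >> 16)   # fold carries back in
    return total

def tcp_checksum(src_ip: str, dst_ip: str, segment: bytes) -> int:
    # Pseudo header: source IP, destination IP, zero byte, protocol 6, TCP length
    pseudo = (socket.inet_aton(src_ip) + socket.inet_aton(dst_ip) +
              struct.pack("!BBH", 0, 6, len(segment)))
    return ~ones_complement_sum(pseudo + segment) & 0xFFFF
```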
Acknowledgements and
Retransmission
• The receiver reconstructs the original stream sent
by the sender by piecing together segments
• Some segments may be lost, delayed, or arrive out
of order
• The receiver uses the sequence numbers to
reconstruct the stream
• The receiver acknowledges the longest contiguous
prefix of the stream that has arrived correctly
– It sends the sequence number of the segment it is
expecting next
Acknowledgements and
Retransmission
• When an acknowledgement is not received within
a given timeout period, the segments may be
retransmitted
– only the first unacknowledged segment
• worst case is to have to retransmit one at a time
– all segments in the window
• worst case is only the first one was needed
Timeout and Retransmission
• Every time a segment is sent, TCP starts a timer
and waits for an ACK
• If the timer expires before an ACK is received, it
assumes the segment was lost or corrupted and
retransmits it
• How long should the timeout be?
• See Figure 13.10 for roundtrip times of 100
successive IP datagrams on the Internet
Timeout and Retransmission
• An adaptive retransmission algorithm for TCP
monitors a connection and determines a
reasonable timeout period (RTT - Round Trip
Time) for that connection
– TCP records the time that a segment was sent, and the
time that an ACK was received for it
– the elapsed time is a round trip sample
– a new sample is obtained and the RTT is modified by
RTT = (alpha * oldRTT) + ((1-alpha) * newSample)
• When performance changes, the value of RTT changes gradually to track the new conditions
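The weighted average from the formula above, written out as a tiny function; alpha = 0.9 is a commonly cited value, not a requirement.

```python
# Exponentially weighted average of round trip samples.
ALPHA = 0.9

def update_rtt(old_rtt: float, new_sample: float) -> float:
    return ALPHA * old_rtt + (1 - ALPHA) * new_sample

rtt = 100.0                    # ms, current estimate
rtt = update_rtt(rtt, 140.0)   # one slow sample nudges the estimate to 104 ms
```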
Timeout and Retransmission
• When the value for alpha is close to:
– zero: the weighted average responds to changes in
delay very quickly
– one: the weighted average does not change significantly
when a temporary change is noticed
Accurate Measurement of Round
Trip Samples
• In measuring the round trip samples, we could
have acknowledgement ambiguity if we had to
retransmit
– When we receive an ACK is it for the original datagram
or for the retransmitted datagram?
– Problems with assuming the ACK for the original, and
with assuming the ACK is for the most recent
transmission
Karn’s Algorithm
• Karn’s algorithm for handling ambiguous
acknowledgements is to not update the round trip
estimate for retransmitted segments
• If retransmission times are completely ignored, a sudden large increase in delay would go unnoticed and spiraling retransmissions could occur
• Karn’s algorithm suggests a timer backoff
– if the timer expires and retransmission is done, TCP
increases the timeout up to a point that is larger than the
delay along any path in the internet
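A sketch of the timer backoff just described; the factor of 2 and the 60-second ceiling are typical choices, not requirements.

```python
# Timer backoff used with Karn's algorithm: each retransmission
# multiplies the timeout, up to a ceiling.

def backoff(timeout: float, factor: float = 2.0, ceiling: float = 60.0) -> float:
    return min(timeout * factor, ceiling)

t = 1.5          # seconds
t = backoff(t)   # 3.0 after the first retransmission
t = backoff(t)   # 6.0 after the second, and so on
```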
Responding to High Variance in
Delay
• Research shows that previous algorithms do not
respond well to a wide range of variation in delay
• Queuing theory suggests that the variation in round trip time is proportional to 1/(1-L), where L is the current network load
• A 1989 specification of TCP requires estimating
the average round trip time and the variance
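A sketch of a mean-plus-deviation estimator of the kind the 1989 specification calls for (Jacobson's algorithm); the gains 1/8 and 1/4 and the factor of 4 are the commonly used values, shown here only for illustration.

```python
# Track both the smoothed RTT and its mean deviation, and derive a
# retransmission timeout that grows with the variance.

def update_rto(srtt, rttvar, sample, g=1/8, h=1/4):
    err = sample - srtt
    srtt += g * err                     # smoothed round trip time
    rttvar += h * (abs(err) - rttvar)   # smoothed mean deviation
    rto = srtt + 4 * rttvar             # timeout responds to variance, not just the mean
    return srtt, rttvar, rto
```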
Responding to Congestion
• When congestion occurs, delays increase and
routers queue datagrams
• Routers have finite buffer space and when that
limit is reached, datagrams will be discarded
• Delay causes retransmissions and retransmissions
cause more delay ... until congestion collapse
• To avoid congestion, transmission rates must be
reduced
– Routers watch queue length and use ICMP source quench
Responding to Congestion
• TCP maintains a congestion window and limits transmission to the smaller of the receiver's advertised window and the current congestion window
– multiplicative decrease - reduces the congestion
window by half each time a segment is lost
• if loss continues, TCP limits transmission to one
datagram and doubles timeout values before
retransmitting
• provides significant and fast response to congestion
• lets routers clear datagrams in their queues
Responding to Congestion
• How does TCP recover when congestion ends?
– slow start
• start with setting the congestion window to the size
of one segment
• increase the congestion window by one segment
each time an acknowledgement arrives
– thus, one segment, two segments, four segments, eight …
• once the congestion window reaches half of its size before congestion occurred, TCP slows down and increases the window by at most one segment per round trip
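A sketch combining multiplicative decrease, slow start, and the slower linear increase described above, in units of segments; the names cwnd and ssthresh follow common usage and the code is an illustration, not a full TCP implementation.

```python
# Congestion window adjustments: slow start below the threshold,
# roughly one segment per round trip above it, halve on loss.

class CongestionControl:
    def __init__(self):
        self.cwnd = 1          # slow start begins at one segment
        self.ssthresh = 64     # arbitrary initial threshold for the sketch

    def on_ack(self):
        if self.cwnd < self.ssthresh:
            self.cwnd += 1               # slow start: window doubles each round trip
        else:
            self.cwnd += 1 / self.cwnd   # congestion avoidance: ~1 segment per RTT

    def on_loss(self):
        self.ssthresh = max(self.cwnd / 2, 1)   # remember half the window (multiplicative decrease)
        self.cwnd = 1                           # fall back to slow start

    def allowed_window(self, advertised):
        return min(self.cwnd, advertised)       # never exceed the receiver's advertisement
```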
Congestion, Tail Drop and TCP
• An early policy, called tail drop, discarded an arriving datagram whenever the input queue at a router was full
• Because traffic is typically multiplexed, with successive datagrams coming from different sources, the router tends to discard one segment from each of N connections rather than N segments from one connection
• So, what’s wrong with that?
– All N instances of TCP will enter slow start at the same
time
Random Early Discard (RED)
• A router uses Tmin and Tmax to mark positions in
the queue
• If the queue currently contains fewer than Tmin
datagrams, add the new datagram to the queue
• If the queue contains more than Tmax datagrams,
discard the new datagram
• If the queue contains between Tmin and Tmax
datagrams, randomly discard the datagram with a
probability p
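A sketch of the drop decision just listed; the linear probability between the thresholds and the max_p value are simplifications (real RED also uses a weighted average queue length).

```python
# RED drop decision: always accept below Tmin, always drop above Tmax,
# and drop with a rising probability in between.
import random

def red_should_drop(queue_len, t_min, t_max, max_p=0.1):
    if queue_len < t_min:
        return False                    # queue is short: always enqueue
    if queue_len >= t_max:
        return True                     # queue is long: always drop
    p = max_p * (queue_len - t_min) / (t_max - t_min)
    return random.random() < p
```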
Random Early Discard (RED)
• A router slowly and randomly drops datagrams as
congestion increases
• How are Tmin and Tmax determined?
• How is the value of p determined?
Establishing a TCP Connection
• To establish a connection, TCP uses a three-way
handshake as shown in Figure 13.13 (consider right
side the server and left side the client - see handout)
– the first segment has the SYN bit set, port number of
the server that the client wants to connect to, and an
initial sequence number (ISN)
– the second segment has both the SYN and ACK bits
set, server sends its own ISN, the ACK contains the
client’s ISN plus one
– the third segment is an ACK with the server’s ISN plus
one
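The three handshake segments of Figure 13.13, written out as data; the ISNs (1000 and 5000) are made-up example values.

```python
# Three-way handshake: SYN, SYN+ACK, ACK. Each ACK carries the peer's
# initial sequence number plus one.

handshake = [
    {"from": "client", "flags": {"SYN"},        "seq": 1000, "ack": None},
    {"from": "server", "flags": {"SYN", "ACK"}, "seq": 5000, "ack": 1001},
    {"from": "client", "flags": {"ACK"},        "seq": 1001, "ack": 5001},
]
```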
Establishing a TCP Connection
• Usually the TCP software on one machine waits passively for a handshake while the other initiates it
• However, if both request a connection at the same
time, this can be done, and is called a
simultaneous open
Initial Sequence Numbers
• Each machine in a connection chooses an ISN at
random
• These numbers are shared with the other machine
in the connection establishment handshake
• Octets sent during the data transfer period are numbered sequentially starting from the initial sequence numbers
Closing a TCP Connection
• When one side of the connection has no more data
to send, it will close the connection in one
direction
• At the end of its data transfer, this side will send a
segment with the FIN bit set
• The other side will acknowledge the FIN segment
and may continue sending until it has no more
data, when it sends its own FIN segment
• See Figure 13.14 - third step is ACK
TCP Connection Reset
• Normally, connections are released as in the
previous section
• But, sometimes a connection is broken by accident
• To completely close the connection, it can be reset
• This is done to release resources like buffers
• One side sends a segment with the RST bit set
TCP State Machine
• See Figure 13.15
• Circles represent states
• Arrows represent transitions between states
• Labels show what causes the transition and what it sends in response
• Notice the syn, ack, rst and fin indicators that are in the segment header
Forcing Data Delivery
• We don’t always want to wait until a buffer is full
in order to send it
• If an application is gathering keystrokes, we want
them to appear on the screen as they are entered,
not as a block of keystrokes
• TCP provides a push operation that forces delivery
of octets without waiting for the buffer to fill - it
uses the psh code
TCP Reserved Port Numbers
• Well-known ports were originally those under 256
• But some ports above 1024 have since been assigned for well-known services as well
• TCP and UDP have some commonly numbered
ports
• See some of the currently assigned TCP port
numbers in Figure 13.16
Silly Window Syndrome
• The problem that is presented for this discussion is
one that arises when an application reads
incoming data one octet at a time
– When a connection is established, the receiving TCP allocates a
buffer of K octets
– If the sender generates data quickly, the sending TCP transmits
segments to fill the entire window
– Then the receiver indicates that it has no buffer space available
– When the receiver reads one octet, it can advertise that it has one
octet of space available
– Thus, very small segments are generated with large overhead (41
bytes/segment) and much processing for little data
Avoiding Silly Window
Syndrome
• Receive Side
– The receiver maintains the actual available buffer
space, but will not advertise an increase until the
window can be advanced by:
• half of the receiver’s buffer
• or the maximum size of a segment
• Delayed Acknowledgements
– The receiver delays sending an ACK when the window is not large enough to be worth advertising, but by no more than 500 ms
• advantage: delayed ACKs decrease traffic, increase throughput
• disadvantage: the sender may retransmit
Avoiding Silly Window
Syndrome
• Send Side
– New data is placed in the buffer, but the sender does not transmit until it can fill a maximum-size segment
– If still waiting to send when an ACK arrives, send all
data that has accumulated in the buffer
– Apply the rule even if the user has requested a push
– This is called the Nagle algorithm and requires little
computation
• In general, the receiver avoids advertising a small
window and the sender delays transmission
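A sketch of the send-side rule (the Nagle algorithm) as described above; window checks are omitted for brevity and the class and method names are illustrative.

```python
# Nagle-style sender: small writes are buffered until either a full-size
# segment accumulates or all earlier data has been acknowledged.

class NagleSender:
    def __init__(self, mss, transmit):
        self.mss = mss
        self.transmit = transmit   # callback that actually sends a segment
        self.buffer = b""
        self.unacked = 0           # octets sent but not yet acknowledged

    def write(self, data: bytes):
        self.buffer += data
        # Send now only if a full segment is ready or nothing is in flight
        while len(self.buffer) >= self.mss or (self.buffer and self.unacked == 0):
            self._send(self.buffer[:self.mss])

    def on_ack(self, octets):
        self.unacked -= octets
        if self.unacked == 0 and self.buffer:
            self._send(self.buffer[:self.mss])   # flush accumulated data

    def _send(self, chunk):
        self.buffer = self.buffer[len(chunk):]
        self.unacked += len(chunk)
        self.transmit(chunk)
```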
Summary
• TCP provides
– reliable stream delivery
– full-duplex connections between two machines
allowing for exchange of large volumes of data
• TCP uses sliding windows for efficient use of the
network
• TCP provides flow control and allows systems of
varying speeds to communicate
• Segments are used to transfer data or control
information
Summary
• TCP implements flow control by letting the
receiver advertise the amount of data it is willing
to accept, yet supports out of band messages
• Current TCP uses:
– exponential backoff for retransmission timers
– congestion avoidance algorithms
– heuristics to avoid transferring small packets
For Next Time
• Several chapters on routing
• Read Chapter 14