* Your assessment is very important for improving the workof artificial intelligence, which forms the content of this project
Download Internet Protocols - NYU Computer Science Department
Asynchronous Transfer Mode wikipedia , lookup
Network tap wikipedia , lookup
Wake-on-LAN wikipedia , lookup
Computer network wikipedia , lookup
Airborne Networking wikipedia , lookup
Piggybacking (Internet access) wikipedia , lookup
Deep packet inspection wikipedia , lookup
List of wireless community networks by region wikipedia , lookup
Zero-configuration networking wikipedia , lookup
Real-Time Messaging Protocol wikipedia , lookup
Cracking of wireless networks wikipedia , lookup
TCP congestion control wikipedia , lookup
Recursive InterNetwork Architecture (RINA) wikipedia , lookup
Data Communication and Networks Lecture 9/10 Internet Protocols November 6, 2003 Joseph Conron Computer Science Department New York University [email protected] What’s the Internet: Components view millions of connected computing devices: hosts, end- systems pc’s workstations, servers PDA’s phones, toasters running network apps communication links fiber, copper, radio, satellite routers: forward packets (chunks) of data thru network What’s the Internet: Components view protocols: control sending, receiving of msgs e.g., TCP, IP, HTTP, FTP, PPP Internet: “network of networks” loosely hierarchical public Internet versus private intranet Internet standards RFC: Request for comments IETF: Internet Engineering Task Force What’s the Internet: a service view communication infrastructure enables distributed applications: WWW, email, games, e-commerce, database., voting, more? communication services provided: connectionless connection-oriented Internet structure: network of networks roughly hierarchical national/international backbone providers (NBPs) e.g. BBN/GTE, Sprint, AT&T, IBM, UUNet interconnect (peer) with each other privately, or at public Network Access Point (NAPs) regional ISPs connect into NBPs local ISP, company connect into regional ISPs local ISP regional ISP NBP B NAP NAP NBP A regional ISP local ISP Connectionless Operation Corresponds to datagram mechanism in packet switched network Each NPDU treated separately Network layer protocol common to all DTEs and routers Known generically as the internet protocol Internet Protocol One such internet protocol developed for ARPANET RFC 791 (Get it and study it) Lower layer protocol needed to access particular network Connectionless Internetworking Advantages Flexibility Robust No unnecessary overhead Unreliable Not guaranteed delivery Not guaranteed order of delivery Packets can take different routes Reliability is responsibility of next layer up (e.g. TCP) Internet protocol stack application: supporting network applications ftp, smtp, http transport: host-host data transfer tcp, udp network: routing of datagrams from source to destination ip, routing protocols link: data transfer between neighboring network elements ppp, ethernet physical: bits “on the wire” application transport network link physical Protocol layering and data Each layer takes data from above adds header information to create new data unit passes new data unit to layer below source M Ht M Hn Ht M Hl Hn Ht M application transport network link physical destination application Ht transport Hn Ht network Hl Hn Ht link physical M message M segment M M datagram frame Internet Protocol (IP) Only protocol at Layer 3 Defines Internet addressing Internet packet format Internet routing RFC 791 (1981) IP Address Details 32 Bits - divided into two parts Prefix identifies network Suffix identifies host Global authority assigns unique prefix to network (IANA) Local administrator assigns unique suffix to host IP Addresses given notion of “network”, let’s examine IP addresses: “class-full” addressing: class A 0 network B 10 C 110 D 1110 1.0.0.0 to 127.255.255.255 host network 128.0.0.0 to 191.255.255.255 host network multicast address 32 bits host 192.0.0.0 to 223.255.255.255 224.0.0.0 to 239.255.255.255 Classes and Network Sizes Maximum network size determined by class of address Class A large Class B medium Class C small IP Addressing Example Subnets and Subnet Masks Allow arbitrary complexity of internetworked LANs within organization Insulate overall internet from growth of network numbers and routing complexity Site looks to rest of internet like single network Each LAN assigned subnet number Host portion of address partitioned into subnet number and host number Local routers route within subnetted network Subnet mask indicates which bits are subnet number and which are host number Routing Using Subnets IP addressing: CIDR classful addressing: inefficient use of address space, address space exhaustion e.g., class B net allocated enough addresses for 65K hosts, even if only 2K hosts in that network CIDR: Classless InterDomain Routing network portion of address of arbitrary length address format: a.b.c.d/x, where x is # bits in network portion of address network part host part 11001000 00010111 00010000 00000000 200.23.16.0/23 Internet Packets Contains sender and destination addresses Size depends on data being carried Called IP datagram Two Parts Of An IP Datagram Header Contains source and destination address Fixed-size fields Data Area (Payload) Variable size up to 64K No minimum size IP datagram format IP protocol version number header length (bytes) “type” of data max number remaining hops (decremented at each router) upper layer protocol to deliver payload to 32 bits type of ver head. len service length fragment 16-bit identifier flgs offset time to upper Internet layer live checksum total datagram length (bytes) for fragmentation/ reassembly 32 bit source IP address 32 bit destination IP address Options (if any) data (variable length, typically a TCP or UDP segment) E.g. timestamp, record route taken, specify list of routers to visit. IP Fragmentation & Reassembly network links have MTU (max.transfer size) - largest possible link-level frame. fragmentation: in: one large datagram out: 3 smaller datagrams different link types, different MTUs large IP datagram divided (“fragmented”) within net one datagram becomes several datagrams “reassembled” only at final destination IP header bits used to identify, order related fragments reassembly IP Fragmentation and Reassembly length ID fragflag offset =4000 =x =0 =0 One large datagram becomes several smaller datagrams length ID fragflag offset =1500 =x =1 =0 length ID fragflag offset =1500 =x =1 =1480 length ID fragflag offset =1040 =x =0 =2960 IP Semantics IP is connectionless Datagram contains identity of destination Each datagram sent/ handled independently Routes can change at any time IP Semantics (continued) IP allows datagrams to be Delayed Duplicated Delivered out-of-order Lost Called best effort delivery Motivation: accommodate all possible networks Datagram Lifetime Datagrams could loop indefinitely Consumes resources Transport protocol may need upper bound on datagram life Datagram marked with lifetime Time To Live field in IP Once lifetime expires, datagram discarded (not forwarded) Hop count Decrement time to live on passing through a each router Time count Need to know how long since last router ICMP Internet Control Message Protocol RFC 792 Transfer of (control) messages from routers and hosts to hosts Feedback about problems e.g. time to live expired Encapsulated in IP datagram Not reliable ICMP Error Messages When an ICMP error message is sent, the message always contains the IP header and the first 8 bytes of the IP datagram that caused the problem ICMP has rules regarding error message generation to prevent broadcast storms ICMP Echo Command Used by “ping” and “tracert” When a destination IP host receives an ICMP echo command, it returns and ICMP “echo reply” Ping uses this to determine if a path to a destination (and its return path) are “up” Tracert uses echo in a clever way to determine the identities of the routers along the path (by “scoping” TTL). Address Resolution Problem Suppose we know the IP Address of a local system (one to which we are connected) We would like to send an IP packet to that system. The link layer (ethernet, for instance) only knows about MAC addresses! How do we determine the MAC address associated with the IP address? ARP Address resolution provides a mapping between two different forms of addresses 32-bit IP addresses and whatever the data link uses ARP (address resolution protocol) is a protocol used to do address resolution in the TCP/IP protocol suite (RFC826) ARP provides a dynamic mapping from an IP address to the corresponding hardware address ARP Protocol A knows B's IP address, wants to learn physical address of B A broadcasts ARP query pkt, containing B's IP address all machines on LAN receive ARP query B receives ARP packet, replies to A with its (B's) physical layer address A caches (saves) IP-to-physical address pairs until information becomes old (times out) soft state: information that times out (goes away) unless refreshed ARP Cache The cache maintains the recent IP to physical address mappings Each entry is aged (usually the lifetime is 20 minutes) forcing periodic updates of the cache ARP replies are often broadcast so that all hosts can update their caches ARP Packet Format 8 16 31 Hardware Type Hardware Size Protocol Type Protocol Size Operation Sender’s Hardware Address (for Ethernet 6 bytes) Sender’s Protocol Address (for IP 4 bytes) Target Hardware Address Target Protocol Address Destination IP Address Internet Transport Protocols Two Transport Protocols Available Transmission Control Protocol (TCP) connection oriented most applications use TCP RFC 793 User Datagram Protocol (UDP) Connectionless RFC 768 Transport layer addressing Communications endpoint addressed by: IP address (32 bit) in IP Header Port number (16 bit) in TP Header1 Transport protocol (TCP or UDP) in IP Header 1 TP => Transport Protocol (UDP or TCP) Standard services and port numbers service echo daytime netstat ftp-data ftp telnet smtp time domain finger http pop-2 pop sunrpc uucp-path nntp talk tcp udp 7 7 13 13 15 20 21 23 25 37 37 53 53 79 80 109 110 111 111 117 119 517 TCP: Overview RFCs: 793, 1122, 1323, 2018, 2581 point-to-point: full duplex data: one sender, one receiver bi-directional data flow in same connection MSS: maximum segment size reliable, in-order byte steam: no “message boundaries” pipelined: connection-oriented: handshaking (exchange of control msgs) init’s sender, receiver state before data exchange TCP congestion and flow control set window size send & receive buffers flow controlled: socket door application writes data application reads data TCP send buffer TCP receive buffer segment socket door sender will not overwhelm receiver TCP Header TCP segment structure 32 bits URG: urgent data (generally not used) ACK: ACK # valid PSH: push data now (generally not used) RST, SYN, FIN: connection estab (setup, teardown commands) Internet checksum (as in UDP) source port # dest port # sequence number acknowledgement number head not UA P R S F len used checksum rcvr window size ptr urgent data Options (variable length) application data (variable length) counting by bytes of data (not segments!) # bytes rcvr willing to accept Reliability in an Unreliable World IP offers best-effort (unreliable) delivery TCP uses IP TCP provides completely reliable transfer How is this possible? How can TCP realize: Reliable connection startup? Reliable data transmission? Graceful connection shutdown? Reliable Data Transmission Positive acknowledgment Receiver returns short message when data arrives Called acknowledgment Retransmission Sender starts timer whenever message is transmitted If timer expires before acknowledgment arrives, sender retransmits message THIS IS NOT A TRIVIAL PROBLEM! – more on this later. TCP Flow Control Receiver Advertises available buffer space Called window This is a known as a CREDIT policy Sender Can send up to entire window before ACK arrives Each acknowledgment carries new window information Called window advertisement Can be zero (called closed window) Interpretation: I have received up through X, and can take Y more octets Credit Scheme Decouples flow control from ACK May ACK without granting credit and vice versa Each octet has sequence number Each transport segment has seq number, ack number and window size in header Use of Header Fields When sending, seq number is that of first octet in segment ACK includes AN=i, W=j All octets through SN=i-1 acknowledged Next expected octet is i Permission to send additional window of W=j octets i.e. octets through i+j-1 Credit Allocation TCP Flow Control flow control sender won’t overrun receiver’s buffers by transmitting too much, too fast RcvBuffer = size of TCP Receive Buffer RcvWindow = amount of spare room in Buffer receiver buffering receiver: explicitly informs sender of (dynamically changing) amount of free buffer space RcvWindow field in TCP segment sender: keeps the amount of transmitted, unACKed data less than most recently received RcvWindow TCP seq. #’s and ACKs Seq. #’s: byte stream “number” of first byte in segment’s data ACKs: seq # of next byte expected from other side cumulative ACK Q: how receiver handles out-of-order segments A: TCP spec doesn’t say, - up to implementor Host A User types ‘C’ Host B host ACKs receipt of ‘C’, echoes back ‘C’ host ACKs receipt of echoed ‘C’ simple telnet scenario time TCP ACK generation [RFC 1122, RFC 2581] Event TCP Receiver action in-order segment arrival, no gaps, everything else already ACKed delayed ACK. Wait up to 500ms for next segment. If no next segment, send ACK in-order segment arrival, no gaps, one delayed ACK pending immediately send single cumulative ACK out-of-order segment arrival higher-than-expect seq. # gap detected send duplicate ACK, indicating seq. # of next expected byte arrival of segment that partially or completely fills gap immediate ACK if segment starts at lower end of gap TCP: retransmission scenarios time Host A Host B X loss lost ACK scenario Host B Seq=100 timeout Seq=92 timeout timeout Host A time premature timeout, cumulative ACKs Why Startup/ Shutdown Difficult? Segments can be Lost Duplicated Delayed Delivered out of order Either side can crash Either side can reboot Need to avoid duplicate ‘‘shutdown’’ message from affecting later connection TCP Connection Management Recall: TCP sender, receiver establish “connection” before exchanging data segments initialize TCP variables: seq. #s buffers, flow control info (e.g. RcvWindow) client: connection initiator Socket clientSocket = new Socket("hostname","port number"); server: contacted by client Socket connectionSocket = welcomeSocket.accept(); Three way handshake: Step 1: client end system sends TCP SYN control segment to server specifies initial seq # Step 2: server end system receives SYN, replies with SYNACK control segment ACKs received SYN allocates buffers specifies server-> receiver initial seq. # TCP Connection Management (OPEN) client server opening opening established closed TCP Connection Management (cont.) Closing a connection: client closes socket: clientSocket.close(); client server close Step 1: client end system sends TCP FIN control segment to server close replies with ACK. Closes connection, sends FIN. timed wait Step 2: server receives FIN, closed TCP Connection Management (cont.) Step 3: client receives FIN, client replies with ACK. Enters “timed wait” - will respond with ACK to received FINs closing Step 4: server, receives ACK. closing can handle simultaneous FINs. timed wait Connection closed. Note: with small modification, server closed closed TCP Connection Management (cont) TCP server lifecycle TCP client lifecycle Timing Problem! The delay required for data to reach a destination and an acknowledgment to return depends on traffic in the internet as well as the distance to the destination. Because it allows multiple application programs to communicate with multiple destinations concurrently, TCP must handle a variety of delays that can change rapidly. How does TCP handle this ..... Solving Timing Problem Keep estimate of round trip time on each connection Use current estimate to set retransmission timer Known as adaptive retransmission Key to TCP’s success TCP Round Trip Time and Timeout Q: how to set TCP timeout value? longer than RTT note: RTT will vary too short: premature timeout unnecessary retransmissions too long: slow reaction to segment loss Q: how to estimate RTT? SampleRTT: measured time from segment transmission until ACK receipt ignore retransmissions, cumulatively ACKed segments SampleRTT will vary, want estimated RTT “smoother” use several recent measurements, not just current SampleRTT TCP Round Trip Time and Timeout EstimatedRTT = (1-x)*EstimatedRTT + x*SampleRTT Exponential weighted moving average influence of given sample decreases exponentially fast typical value of x: 0.1 Setting the timeout EstimtedRTT plus “safety margin” large variation in EstimatedRTT -> larger safety margin Timeout = EstimatedRTT + 4*Deviation Deviation = (1-x)*Deviation + x*|SampleRTT-EstimatedRTT| Implementation Policy Options Send Deliver Accept Retransmit Acknowledge Send If no push or close TCP entity transmits at its own convenience (IFF send window allows!) Data buffered at transmit buffer May construct segment per data batch May wait for certain amount of data Deliver (to application) In absence of push, deliver data at own convenience May deliver as each in-order segment received May buffer data from more than one segment Accept Segments may arrive out of order In order Only accept segments in order Discard out of order segments In windows Accept all segments within receive window Retransmit TCP maintains queue of segments transmitted but not acknowledged TCP will retransmit if not ACKed in given time First only Batch Individual Acknowledgement Immediate as soon as segment arrives. will introduce extra network traffic Keeps sender’s pipe open Cumulative Wait a bit before sending ACK (called “delayed ACK”) Must use timer to insure ACK is sent Less network traffic May let sender’s pipe fill if not timely! UDP: User Datagram Protocol “no frills,” “bare bones” Internet transport protocol “best effort” service, UDP segments may be: lost delivered out of order to app connectionless: no handshaking between UDP sender, receiver each UDP segment handled independently of others [RFC 768] Why is there a UDP? no connection establishment (which can add delay) simple: no connection state at sender, receiver small segment header no congestion control: UDP can blast away as fast as desired UDP: more often used for streaming multimedia apps loss tolerant Length, in bytes of UDP rate sensitive other UDP uses DNS SNMP reliable transfer over UDP: add reliability at application layer application-specific error recover! segment, including header 32 bits source port # dest port # length checksum Application data (message) UDP segment format UDP Uses Inward data collection Outward data dissemination Request-Response Real time application