Download End-to-end argument paper

End-to-end argument paper
• One of the most widely cited/read papers in systems
  - Odd because no numbers, graphs, etc.
  - No specific software or system, either
• But this paper is why we have TCP/IP!
  - E.g., Clark has been working on Internet since 1970s
  - Reed also participated in early design work
  - The reasoning in this paper is behind many of the design decisions in today’s network

File transfer program
• Say you want to copy a file reliably across network
• Requirement: Read of file on machine A should return same data as on machine B
• One implementation
  - Copy file across network
  - Can verify by copying file, then re-reading and computing checksum (E.g., CRC or better yet SHA-1)
• Another implementation
  - Build a perfectly reliable network
  - Build perfectly reliable routers
  - Build perfectly reliable disks, network cards, etc.
  - Build perfectly reliable Operating System

Multics source code disaster
• MIT had “reliable” network in Multics era
  - Gateways used a checksum on each hop
  - Would catch any transmission errors
• Programmers assumed network reliable
  - Since no transmission errors, file copy was simple. . .
• Turned out one of the gateways was flaky
  - Exchanged a byte pair one in 10^6 times it copied bytes
  - Corrupted vast quantities of Multics source code
  - Required checking against source code printouts to fix!
• So should have checked end-to-end anyway
  - Link-by-link checksums weren’t really useful

ARPANET delivery guarantees
• Original ARPANET sent one message at a time
  - Receiver responded with RFNM (request for next message)
  - Network won’t let you send next message until RFNM
• Turns out not to be useful
  - Applications don’t care if message was received
  - Care if message was acted upon
  - E.g., Want to know mail message has been spooled, not just read from network
• Had to send app.-level acknowledgments anyway
  - App.-level ack makes RFNM totally redundant

Encrypting VPNs
• Transparently encrypt all network traffic
  - Goal: Transparently improve application security
• But real applications need authentication
  - E.g., Users must type passwords
  - So could use protocols like SRP to negotiate encryption keys
• In fact, VPN + insecure protocol = insecure system
  - E.g., NFS allows any user on net to pretend to be any other
  - So still can’t use NFS securely over VPN
• But app. with end-to-end security doesn’t need VPN

Banks and record-keeping
• Banks are legally required to audit their books
  - Also send statements to customers on monthly basis
• Audits will catch human and computer errors
  - E.g., Deposit/withdrawal to wrong account
  - But also would catch race condition in database system!
• So system can’t assume perfect electronic infrastructure
  - Also places a limit on how much $$$ to spend on reliability

Voting
• Requirement: Vote counted should be vote intended by user
  - Okay in old-fashioned voting systems
  - User filled out paper ballot, checked contents
  - In worst case could look at ballots by hand
• Failure 1 (Florida): Paper ballot not human readable
  - So users don’t notice hanging chads, etc.
• Failure 2 (e-voting): Users have no idea what computer is doing, no paper record at all
  - Already evidence of serious flaws in voting machines
  - Voters would have no way of detecting this!

What’s the lesson here?
• Instinctively we like modularity & clean interfaces
  - Which means putting functionality in low-level abstractions
• Examples:
  - Reliable communication
  - In-order communication
  - Secure communication
• But correct applications can’t really exploit this
  - So low-level functionality might be redundant
  - Or might be insufficient
  - Or might be harmful
    – E.g., sending real-time audio over a reliable, in-order delivery channel

The End-to-end argument
“The function in question can completely and correctly be implemented only with the knowledge and help of the application standing at the end points of the communication system. Therefore, providing that questioned function as a feature of the communication system itself is not possible. (Sometimes an incomplete version of the function provided by the communication system may be useful as a performance enhancement.)”

The end-to-end principle
[Figure: two end hosts, each a stack of Application, Library, kernel, and hardware layers (user/kernel and kernel/hardware boundaries marked), connected through a router]
• Place functionality closer to the endpoints

Examples
• IP datagram protocol
• The fact that TCP is in your computer, not your router
• TCP checksum (though it should be stronger)
• AAL-5 vs. AAL-4 (checksum over entire CS-PDU)
• Overlay networks & Source routing
• End-system multicast
• Insecurity of VPNs

Performance issues
• MAC-layer CRCs and retransmission
  - TCP can recover from errors
  - But performance very bad when window < 4 pkts
  - No better if most packet loss is only from congestion
• IP fragmentation seriously hurts reliability
  - Because individual fragments are not retransmitted

Current research at NYU
• SUNDR secure file system
  - End-to-end security requirement: Users should read data written by other legitimate users
  - File system guarantees this without trusting server
• Coral content-distribution network
  - Most P2P data storage systems dictate data placement (E.g., store on closest node to ID in Chord or Pastry.)
  - Also attempt to provide reliability and consistency
  - Coral is optimized for placement of pointers
  - End nodes determine placement of data
  - Gains efficiency by sacrificing consistency (perfect when want some copy of data, not all)

Some concluding remarks
• Why are computer networks so interesting?
  - Because so much functionality is at the end-points
  - Can program your computers
• Why has the Internet been so conducive to innovation?
  - Because datagrams are in some sense a lowest-common-denominator abstraction
  - Can implement many protocols over datagrams, including reliable, in-order delivery (e.g., TCP)
• End-to-end argument isn’t just about correctness/performance
  - The closer to the end points you place functionality, the more control users have, which allows more innovation

Quiz Review
• Open book
  - Bring text and papers, you will need them!
  - All class notes on line, feel free to print and bring
  - Books & papers only; no laptops, cell phones, . . .
• Topics: Will cover full semester
  - Grade based on max((mid + final)/2, (mid + 2 × final)/3)
  - More emphasis on material since midterm
• No make-up finals, so please show up!

RPC
• XDR language for specifying protocol
• At-most once semantics
  - How to implement (replay cache)
  - How to implement when nodes might crash (cookies)

Multicast
• Three possible layers – know the trade-offs
  - Data link layer (Ethernet)
  - Network layer (IP multicast)
  - Application layer (End-system multicast)
• Optimality: Stretch and stress
• Routing protocols
  - Link state – relatively straightforward, but expensive
  - Distance Vector – Reverse Path Broadcast (RPB)
  - Rev. Path Multicast (RPM) is RPB where you prune if no receivers
  - Sparse-mode PIM – send joins to Rendezvous Point to build tree; for active sender, send source-specific join

Caching
• Big issue in caching is consistency.
  Approaches:
  - Use TTLs to limit stale data (DNS, HTTP)
  - Use polling to see if cached copy is up-to-date (HTTP)
  - Use callbacks, where server notifies you if object changes
  - Use leases, in case node to which callback is promised dies
• Write caching – write through vs. write behind
  - When do writes become visible? Stable?

Caching tricks
• Cache hierarchies
• Bloom filters for advertising cache contents
• Consistent hashing
• CARP protocol
• Spring & Wetherall trick for redundant data

Replication
• Single-server consistency model
• Maintaining order of events
  - Lamport clocks
• Dealing with failure
  - Majority of nodes need to be okay and see updates
• View changes
  - Need majority to survive between old/new view

802.11 wireless networks
• 802.11 addresses several issues:
  - Hidden nodes (undetected collision)
  - Exposed nodes (could falsely make nodes wait to transmit)
  - Solution: RTS/CTS, don’t send if you see CTS
  - Also ACK received packet; everyone waits for ACK before transmitting
  - Backoff if two RTS packets collide
• Infrastructure mode
  - Distribution (e.g., wired Ethernet) connects Access Points
  - Nodes select APs with scanning
  - 802.11 packets contain 4 address fields for when going over distribution network

Ad hoc mode
• Wireless network w/o wired infrastructure
  - Nodes forward data to each other
  - But don’t know anything about node locations a priori
  - Examples: Emergency workers, mining equipment, etc.
• How to route on ad hoc networks
  - Standard DV/LS routing not so good (many redundant links, too much power consumption, link asymmetry)
  - DSR – Use source routes, determine routes on-the-fly, heavily cache routes
  - Basically flood route request messages
  - Eavesdrop on network + forwarded packets to learn other people’s routes
• GPSR – route using geography

Cryptography
• Symmetric cryptography:
  - Encryption – keeps data secret
  - MAC (message authentication code) – detects tampering
• Public key setting:
  - Encryption – as in symmetric case, but “public” key encrypts, while “private” key decrypts
  - Digital signatures – PK equivalent of MAC, anyone can verify signed message but only private key holder can sign
• Key management
  - Can use certification authorities
  - Or secure password protocols possibly better – special crypto avoids off-line password-guessing attacks

Security
• Use valid crypto primitives
  - Wall Street Journal cooked up their own broken MAC
  - Allowed Fu et al. to break MAC key in linear time
• Never assume anything from context of a message
  - SSH signed login request
  - But request didn’t say where user wanted to log in
  - Allows one server to log into another as the user, by relaying request for signature

Unstructured P2P systems
• Napster: Centralized DB
  - Centralized lookup, P2P transfers
• Gnutella: Decentralized P2P queries
  - Form random overlay network
  - Use flood queries
  - Use TTL + Cache queries (to avoid re-forwarding)
  - Route replies back same way (bread-crumb trail)

Structured P2P systems
• Key-based routing
  - Assign each node and each key an ID
  - Each node knows about some number of other nodes
  - Can efficiently “route” to nodes closer to any ID
  - Use to implement things like distributed hash tables
• Chord: 160-bit IDs are points on circle
  - Route clockwise around circle
• Pastry: Can route in either direction
  - Prefix-based routing table, plus leaf set

Structured P2P applications
• Multicast
• File systems
• Content distribution
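
The key-based routing idea from the structured P2P slides can be illustrated with a small Chord-style sketch. This is a simplification under stated assumptions, not Chord's actual protocol: every function name here is hypothetical, the code has a global view of the ring (a real node only knows its own table), and fingers are recomputed on each hop instead of being maintained by a join/stabilize protocol. IDs are points on a circle of size 2^bits, a key belongs to the first node clockwise from it, and each hop follows the finger that gets closest to that node without passing it.

```python
import hashlib
from bisect import bisect_left

def ring_id(name: str) -> int:
    """Map a name to a 160-bit ID (SHA-1 output), as in the Chord slide."""
    return int.from_bytes(hashlib.sha1(name.encode("utf-8")).digest(), "big")

def clockwise_distance(a: int, b: int, bits: int) -> int:
    """Distance travelled going clockwise from a to b on a 2**bits circle."""
    return (b - a) % (1 << bits)

def successor(node_ids, key_id):
    """First node at or clockwise after key_id; node_ids must be sorted."""
    i = bisect_left(node_ids, key_id)
    return node_ids[i % len(node_ids)]  # wrap around past the largest ID

def fingers(node_ids, n, bits):
    """Finger k of node n points at successor(n + 2**k)."""
    size = 1 << bits
    return [successor(node_ids, (n + (1 << k)) % size) for k in range(bits)]

def route(node_ids, start, key_id, bits):
    """Return the list of nodes visited while routing key_id from start.

    Assumption: fingers are derived from the global node list for
    illustration only.
    """
    target = successor(node_ids, key_id)
    path, cur = [start], start
    while cur != target:
        remaining = clockwise_distance(cur, target, bits)
        # Keep only fingers that move clockwise without overshooting.
        candidates = [f for f in fingers(node_ids, cur, bits)
                      if f != cur
                      and clockwise_distance(cur, f, bits) <= remaining]
        # Take the biggest allowed jump toward the target.
        cur = max(candidates, key=lambda f: clockwise_distance(cur, f, bits))
        path.append(cur)
    return path

# Toy 6-bit ring so the hops are easy to follow by hand.
NODES = [1, 8, 14, 21, 32, 38, 42, 48, 51, 56]
```

On this toy ring, `route(NODES, 8, 54, 6)` visits nodes 8, 42, 51, and 56. Each hop roughly halves the remaining clockwise distance, which is why lookups need only a logarithmic number of hops in the number of nodes.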

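
The file transfer example at the start of these notes (copy the file, then re-read it and compare checksums) can be sketched as follows. This is a minimal illustration under assumptions rather than a definitive implementation: `copy_and_verify`, `file_digest`, and `make_flaky_transfer` are hypothetical names, SHA-256 is used in place of the CRC or SHA-1 mentioned in the slides, and a local file copy stands in for the network. The flaky transfer imitates the Multics gateway by swapping a byte pair, which a hop-by-hop checksum computed before the swap would never notice.

```python
import hashlib
import shutil

def file_digest(path, chunk_size=1 << 16):
    """Hash a file's contents in chunks (SHA-256 chosen for this sketch)."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()

def copy_and_verify(src, dst, transfer=shutil.copyfile, max_attempts=3):
    """Copy src to dst, re-read dst, and retry until the digests match.

    `transfer` stands in for whatever moves the bytes (network, gateways,
    disks); the end-to-end check does not care where corruption happened.
    Returns the number of attempts that were needed.
    """
    expected = file_digest(src)
    for attempt in range(1, max_attempts + 1):
        transfer(src, dst)
        if file_digest(dst) == expected:
            return attempt
    raise IOError(f"{dst} still corrupt after {max_attempts} attempts")

def make_flaky_transfer():
    """Build a transfer that swaps the first byte pair on its first call only."""
    calls = {"n": 0}

    def transfer(src, dst):
        calls["n"] += 1
        with open(src, "rb") as f:
            data = bytearray(f.read())
        if calls["n"] == 1 and len(data) >= 2:
            data[0], data[1] = data[1], data[0]  # simulated gateway bug
        with open(dst, "wb") as f:
            f.write(data)

    return transfer
```

With `make_flaky_transfer()` as the transfer, the first attempt is rejected and the second succeeds. Because the check sits at the endpoints, the intermediate hops can be as unreliable as they like without affecting correctness, only the number of retries.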