End-to-end argument paper
• One of the most widely cited/read papers in
systems
- Odd because no numbers, graphs, etc.
- No specific software or system, either
• But this paper is why we have TCP/IP!
- E.g., Clark has been working on Internet since 1970s
- Reed also participated in early design work
- The reasoning in this paper is behind many of the design
decisions in today’s network
File transfer program
• Say you want to copy a file reliably across network
• Requirement: Read of file on machine A should
return same data as on machine B
• One implementation
- Copy file across network
- Can verify by copying file, then re-reading and computing
checksum (E.g., CRC or better yet SHA-1)
• Another implementation
- Build a perfectly reliable network
- Build perfectly reliable routers
- Build perfectly reliable disks, network cards, etc.
- Build perfectly reliable Operating System
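The first implementation above can be sketched in a few lines. This is a minimal illustration, not the paper's code: `copy_and_verify` and `file_digest` are hypothetical names, `shutil.copyfile` stands in for the network transfer, and SHA-256 is swapped in for the SHA-1 mentioned on the slide.

```python
import hashlib
import shutil


def file_digest(path, chunk_size=65536):
    """Return the SHA-256 hex digest of the file at `path`, read in chunks."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()


def copy_and_verify(src, dst):
    """Copy src to dst, then re-read both files and compare digests."""
    shutil.copyfile(src, dst)  # assumption: stand-in for the network copy
    return file_digest(src) == file_digest(dst)
```

The check runs at the end points, after the data has been written and re-read, so it covers every component in between without requiring any of them to be perfect.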
Multics source code disaster
• MIT had “reliable” network in Multics era
- Gateways used a checksum on each hop
- Would catch any transmission errors
• Programmers assumed network reliable
- Since no transmission errors, file copy was simple. . .
• Turned out one of the gateways was flaky
- Exchanged a byte pair one in 10^6 times it copied bytes
- Corrupted vast quantities of Multics source code
- Required checking against source code printouts to fix!
• So should have checked end-to-end anyway
- Link-by-link checksums weren’t really useful
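A toy simulation shows why the per-hop checksums missed the problem. Everything here is an illustrative assumption: `send_over_link`, `flaky_gateway`, and `transfer` are made-up names, CRC-32 stands in for whatever checksum the gateways used, and the gateway always swaps the first byte pair rather than doing so rarely.

```python
import hashlib
import zlib


def send_over_link(payload):
    """One hop: sender attaches a CRC-32, receiver verifies it on arrival."""
    data, crc = payload, zlib.crc32(payload)  # the link itself is error-free here
    if zlib.crc32(data) != crc:
        raise IOError("link checksum mismatch")
    return data


def flaky_gateway(payload):
    """Hypothetical gateway that swaps a byte pair while copying between
    buffers: after the inbound check, before the outbound checksum."""
    buf = bytearray(payload)
    if len(buf) >= 2:
        buf[0], buf[1] = buf[1], buf[0]
    return bytes(buf)


def transfer(payload):
    """A -> gateway -> B, with a checksum verified on each hop."""
    at_gateway = send_over_link(payload)
    corrupted = flaky_gateway(at_gateway)
    return send_over_link(corrupted)


original = b"multics source"
received = transfer(original)  # no link-level error is raised
end_to_end_ok = (
    hashlib.sha256(original).digest() == hashlib.sha256(received).digest()
)
```

Both hops pass their checks, yet `end_to_end_ok` is false: only the comparison made at the end points sees the corruption introduced inside the gateway.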
ARPANET delivery guarantees
• Original ARPANET sent one message at a time
- Receiver responded with RFNM (request for next message)
- Network won’t let you send next message until RFNM
• Turns out not to be useful
- Applications don’t care if message was received
- Care if message was acted upon
- E.g., Want to know mail message has been spooled, not just
read from network
• Had to send app.-level acknowledgments anyway
- App.-level ack makes RFNM totally redundant
Encrypting VPNs
• Transparently encrypt all network traffic
- Goal: Transparently improve application security
• But real applications need authentication
- E.g., Users must type passwords
- So could use protocols like SRP to negotiate encryption keys
• In fact, VPN + insecure protocol = insecure system
- E.g., NFS allows any user on net to pretend to be any other
- So still can’t use NFS securely over VPN
• But app. with end-to-end security doesn’t need
VPN
Banks and record-keeping
• Banks are legally required to audit their books
- Also send statements to customers on monthly basis
• Audits will catch human and computer errors
- E.g., Deposit/withdrawal to wrong account
- But also would catch race condition in database system!
• So system can’t assume perfect electronic
infrastructure
- Also places limit to how much $$$ to spend on reliability
Voting
• Requirement: Vote counted should be vote
intended by user
- Okay in old-fashioned voting systems
- User filled out paper ballot, checked contents
- In worst case could look at ballots by hand
• Failure 1 (Florida): Paper ballot not human
readable
- So users don’t notice hanging chads, etc.
• Failure 2 (e-voting): Users have no idea what
computer is doing, no paper record at all
- Already evidence of serious flaws in voting machines
- Voters would have no way of detecting this!
What’s the lesson here?
• Instinctively we like modularity & clean interfaces
- Which means putting functionality in low-level abstractions
• Examples:
- Reliable communication
- In-order communication
- Secure communication
• But correct applications can’t really exploit this
- So low-level functionality might be redundant
- Or might be insufficient
- Or might be harmful – E.g., sending real-time audio over a
reliable, in-order delivery channel
The End-to-end argument
“The function in question can completely and correctly
be implemented only with the knowledge and help of
the application standing at the end points of the
communication system. Therefore, providing that
questioned function as a feature of the communication
system itself is not possible. (Sometimes an incomplete
version of the function provided by the communication
system may be useful as a performance enhancement.)”
The end-to-end principle
[Figure: two end hosts, each layered as Application, Library, user/kernel boundary, kernel/hardware boundary, joined through a router]
• Place functionality closer to the endpoints
Examples
• IP datagram protocol
• The fact that TCP is in your computer, not your
router
• TCP checksum (though it should be stronger)
• AAL-5 vs. AAL-4 (checksum over entire CS-PDU)
• Overlay networks & Source routing
• End-system multicast
• Insecurity of VPNs
Performance issues
• MAC-layer CRCs and retransmission
- TCP can recover from errors
- But performance very bad when window < 4 pkts
- No better if most packet loss is only from congestion
• IP fragmentation seriously hurts reliability
- Because individual fragments are not retransmitted
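The fragmentation point can be made concrete with a little arithmetic. The sketch below assumes each fragment is lost independently with the same probability; that model and the function names are assumptions for illustration, not something the slides state.

```python
def datagram_delivery_probability(fragment_loss_rate, num_fragments):
    """Probability that every fragment of one datagram arrives.

    Assumes independent, identically distributed fragment loss.
    """
    if not 0.0 <= fragment_loss_rate <= 1.0:
        raise ValueError("loss rate must be between 0 and 1")
    if num_fragments < 1:
        raise ValueError("a datagram has at least one fragment")
    return (1.0 - fragment_loss_rate) ** num_fragments


def effective_loss_rate(fragment_loss_rate, num_fragments):
    """Loss rate seen by the sender when one lost fragment loses the datagram."""
    return 1.0 - datagram_delivery_probability(fragment_loss_rate, num_fragments)
```

Under this model, a 1% fragment loss rate turns into about 5.9% datagram loss when each datagram is split into 6 fragments, because losing any one fragment forces the whole datagram to be resent.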
Current research at NYU
• SUNDR secure file system
- End-to-end security requirement:
Users should read data written by other legitimate users
- File system guarantees this without trusting server
• Coral content-distribution network
- Most P2P data storage systems dictate data placement
(E.g., store on closest node to ID in Chord or Pastry.)
- Also attempt to provide reliability and consistency
- Coral is optimized for placement of pointers
End nodes determine placement of data
- Gains efficiency by sacrificing consistency
(perfect when want some copy of data, not all)
Some concluding remarks
• Why are computer networks so interesting?
- Because so much functionality is at the end-points
- Can program your computers
• Why has the Internet been so conducive to
innovation?
- Because datagrams are in some sense a
lowest-common-denominator abstraction
- Can implement many protocols over datagrams,
including reliable, in-order delivery (e.g., TCP)
• End-to-end argument isn’t just about
correctness/performance
- The closer to the end points you place functionality, the
more control users have, which allows more innovation
Quiz Review
• Open book
- Bring text and papers, you will need them!
- All class notes on line, feel free to print and bring
- Books & papers only; no laptops, cell phones, . . .
• Topics: Will cover full semester
- Grade based on max((mid + final)/2, (mid + 2 × final)/3)
- More emphasis on material since midterm
• No make-up finals, so please show up!
RPC
• XDR language for specifying protocol
• At-most once semantics
- How to implement (replay cache)
- How to implement when nodes might crash (cookies)
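A replay cache for at-most-once semantics might look like the following. This is a minimal sketch: the class name, the `(client_id, xid)` key, and the in-memory dictionary are assumptions. It does not cover the crash case, where the cache is lost and the cookie technique is needed.

```python
class AtMostOnceServer:
    """Executes each (client_id, xid) request at most once by caching replies."""

    def __init__(self, handler):
        self.handler = handler      # the procedure being exported
        self.replay_cache = {}      # (client_id, xid) -> cached reply

    def handle(self, client_id, xid, args):
        key = (client_id, xid)
        if key in self.replay_cache:
            # Retransmitted request: replay the old reply, do not re-execute.
            return self.replay_cache[key]
        reply = self.handler(args)
        self.replay_cache[key] = reply
        return reply
```

A retransmitted request with the same transaction ID gets the saved reply, so a non-idempotent procedure such as a deposit runs only once.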
Multicast
• Three possible layers – know the trade-offs
- Data link layer (Ethernet)
- Network layer (IP multicast)
- Application layer (End-system multicast)
• Optimality: Stretch and stress
• Routing protocols
- Link state – relatively straightforward, but expensive
- Distance Vector – Reverse Path Broadcast (RPB)
- Rev. Path Multicast (RPM) is RPB where you prune if no
receivers
- Sparse-mode PIM – send joins to Rendezvous Point to
build tree; for active sender, send source-specific join
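The reverse-path check behind RPB can be simulated on a small graph. This is a rough sketch under stated assumptions: links are symmetric with unit cost, `graph` is an adjacency dictionary, and the function names are invented for illustration.

```python
from collections import deque


def next_hops_toward(graph, source):
    """BFS from source: for each node, the neighbor on a shortest path back."""
    parent = {source: None}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        for nbr in graph[node]:
            if nbr not in parent:
                parent[nbr] = node
                queue.append(nbr)
    return parent


def reverse_path_broadcast(graph, source):
    """Return (nodes that accepted the packet, duplicate copies dropped)."""
    parent = next_hops_toward(graph, source)
    accepted = {source}
    dropped = 0
    queue = deque((source, nbr) for nbr in graph[source])
    while queue:
        sender, node = queue.popleft()
        if parent.get(node) == sender:
            # Arrived on the link this node uses to reach the source: forward.
            accepted.add(node)
            queue.extend((node, nbr) for nbr in graph[node] if nbr != sender)
        else:
            dropped += 1  # arrived off the reverse path: discard
    return accepted, dropped
```

Each node accepts exactly one copy, the one arriving from its next hop toward the source, so the flood terminates without per-packet state. RPM would additionally prune branches that lead to no receivers.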
Caching
• Big issue in caching is consistency. Approaches:
- Use TTLs to limit stale data (DNS, HTTP)
- Use polling to see if cached copy is up-to-date (HTTP)
- Use callbacks, where server notifies you if object changes
- Use leases, in case node to which callback is promised dies
• Write caching – write through vs. write behind
- When do writes become visible? Stable?
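The TTL approach is the simplest of these to sketch. The class below is illustrative only: `TTLCache` and its injectable `clock` parameter are assumptions made so the behavior can be tested without waiting.

```python
import time


class TTLCache:
    """Cache whose entries expire `ttl` seconds after they were stored."""

    def __init__(self, ttl, clock=time.monotonic):
        self.ttl = ttl
        self.clock = clock      # injectable so tests can control time
        self.entries = {}       # key -> (value, expiry time)

    def put(self, key, value):
        self.entries[key] = (value, self.clock() + self.ttl)

    def get(self, key):
        """Return the cached value, or None if absent or expired."""
        item = self.entries.get(key)
        if item is None:
            return None
        value, expires = item
        if self.clock() >= expires:
            del self.entries[key]   # stale: caller must refetch from origin
            return None
        return value
```

The TTL bounds how stale a read can be but does not prevent staleness within that window. Polling, callbacks, and leases trade extra messages or server state for tighter consistency.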
Caching tricks
• Cache hierarchies
• Bloom filters for advertising cache contents
• Consistent hashing
• CARP protocol
• Spring & Wetherall trick for redundant data
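As one example from this list, a Bloom filter lets a cache advertise its contents compactly. The sketch below is an assumption-laden illustration: the sizes, the SHA-256-based hashing, and the method names are arbitrary choices, not taken from any particular cache protocol.

```python
import hashlib


class BloomFilter:
    """Compact set summary: no false negatives, occasional false positives."""

    def __init__(self, num_bits=1024, num_hashes=3):
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = bytearray((num_bits + 7) // 8)

    def _positions(self, item):
        # Derive num_hashes bit positions by salting one hash function.
        for i in range(self.num_hashes):
            digest = hashlib.sha256(bytes([i]) + item.encode("utf-8")).digest()
            yield int.from_bytes(digest[:8], "big") % self.num_bits

    def add(self, item):
        for pos in self._positions(item):
            self.bits[pos // 8] |= 1 << (pos % 8)

    def might_contain(self, item):
        return all(
            self.bits[pos // 8] & (1 << (pos % 8))
            for pos in self._positions(item)
        )
```

A peer that receives the bit array can rule out most misses locally. A positive answer only means "possibly cached", so the peer still has to ask.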
Replication
• Single-server consistency model
• Maintaining order of events
- Lamport clocks
• Dealing with failure
- Majority of nodes need to be okay and see updates
• View changes
- Need majority to survive between old/new view
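The Lamport clock rule mentioned above fits in a few lines. This is a minimal sketch with hypothetical method names (`tick`, `receive`), covering only the clock itself and not the replication protocol around it.

```python
class LamportClock:
    """Logical clock: timestamps respect the happened-before order."""

    def __init__(self):
        self.time = 0

    def tick(self):
        """Local event or message send: advance and return the timestamp."""
        self.time += 1
        return self.time

    def receive(self, msg_time):
        """Message receipt: move past both local time and the sender's stamp."""
        self.time = max(self.time, msg_time) + 1
        return self.time
```

If node A sends with timestamp `t = a.tick()` and node B calls `b.receive(t)`, B's clock ends up greater than `t`, so the receive is always ordered after the send.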
802.11 wireless networks
• 802.11 addresses several issues:
- Hidden nodes (undetected collision)
- Exposed nodes (could falsely make nodes wait to transmit)
- Solution: RTS/CTS, don’t send if you see CTS
- Also ACK received packet—everyone waits for ACK before
transmitting
- Backoff if two RTS packets collide
• Infrastructure mode
- Distribution (e.g., wired Ethernet) connects Access Points
- Nodes select APs with scanning
- 802.11 packets contain 4 address fields for when going over
distribution network
Ad hoc mode
• Wireless network w/o wired infrastructure
- Nodes forward data to each other
- But don’t know anything about node locations a priori
- Examples: Emergency workers, mining equipment, etc.
• How to route on ad hoc networks
- Standard DV/LS routing not so good (many redundant
links, too much power consumption, link asymmetry)
- DSR – Use source routes, determine routes on-the-fly,
heavily cache routes
- Basically flood route request messages
- Eavesdrop on network + forwarded packets to learn other
people’s routes
• GPSR – route using Geography
Cryptography
• Symmetric cryptography:
- Encryption – keeps data secret
- MAC (message authentication code) – detects tampering
• Public key setting:
- Encryption – as in symmetric case, but “public” key
encrypts, while “private” key decrypts
- Digital signatures – PK equivalent of MAC, anyone can
verify signed message but only private key holder can sign
• Key management
- Can use certification authorities
- Or secure password protocols possibly better – special
crypto avoids off-line password-guessing attacks
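For the symmetric MAC case, Python's standard `hmac` module is enough for a small demonstration. The helper names `make_tag` and `verify` are assumptions, and HMAC-SHA256 is one reasonable choice rather than anything the slides prescribe.

```python
import hashlib
import hmac


def make_tag(key, message):
    """Compute an HMAC-SHA256 tag over `message` with the shared `key`."""
    return hmac.new(key, message, hashlib.sha256).digest()


def verify(key, message, tag):
    """Check the tag in constant time; False means tampering or a wrong key."""
    return hmac.compare_digest(make_tag(key, message), tag)
```

Anyone holding the shared key can both create and check tags, which is the difference from digital signatures: a signature can be verified by anyone but produced only by the private key holder.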
Security
• Use valid crypto primitives
- Wall Street Journal cooked up their own broken MAC
- Allowed Fu et al. to break MAC key in linear time
• Never assume anything from context of a message
- SSH signed login request
- But request didn’t say where user wanted to log in
- Allows one server to log into another as the user, by
relaying request for signature
Unstructured P2P systems
• Napster: Centralized DB
- Centralized lookup, P2P transfers
• Gnutella: Decentralized P2P queries
- Form random overlay network
- Use flood queries
- Use TTL + Cache queries (to avoid re-forwarding)
- Route replies back same way (bread-crumb trail)
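Gnutella-style flooding can be sketched as a breadth-first search with a hop limit. This is a simplified model under stated assumptions: `overlay` and `files` are plain dictionaries, and a single shared `seen` set stands in for the per-node caches of query IDs that real nodes keep.

```python
from collections import deque


def flood_query(overlay, files, start, wanted, ttl):
    """Flood a query from `start`; return {node_with_file: reply path to start}."""
    seen = {start}                          # nodes that already got this query
    queue = deque([(start, ttl, [start])])
    hits = {}
    while queue:
        node, hops_left, path = queue.popleft()
        if wanted in files.get(node, ()):
            hits[node] = list(reversed(path))   # reply retraces the query path
        if hops_left == 0:
            continue                        # TTL exhausted: stop forwarding
        for neighbor in overlay.get(node, ()):
            if neighbor not in seen:        # avoid re-forwarding duplicates
                seen.add(neighbor)
                queue.append((neighbor, hops_left - 1, path + [neighbor]))
    return hits
```

The TTL limits how far the query spreads, so a file held only by a distant node may not be found even though it exists in the overlay.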
Structured P2P systems
• Key-based routing
- Assign each node and each key an ID
- Each node knows about some number of other nodes
- Can efficiently “route” to nodes closer to any ID
- Use to implement things like distributed hash tables
• Chord: 160-bit IDs are points on circle
- Route clockwise around circle
• Pastry: Can route in either direction
- Prefix-based routing table, plus leaf set
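The Chord ID space can be illustrated by computing which node is responsible for a key. This sketch assumes global knowledge of all node IDs, which real Chord nodes do not have. It shows only the clockwise-successor rule, not finger-table routing, and the function names are hypothetical.

```python
import bisect
import hashlib


def chord_id(name):
    """160-bit ID from SHA-1, matching the ID size on the slide."""
    return int.from_bytes(hashlib.sha1(name.encode("utf-8")).digest(), "big")


def successor(node_ids, key_id):
    """Return the first node ID at or clockwise after key_id, wrapping around."""
    ring = sorted(node_ids)
    i = bisect.bisect_left(ring, key_id)
    return ring[i % len(ring)]      # past the largest ID, wrap to the smallest
```

For example, with nodes at IDs 10, 50, and 200, key 60 maps to node 200 and key 201 wraps around to node 10. A distributed hash table stores each key at its successor node.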
Structured P2P applications
• Multicast
• File systems
• Content distribution