Berkeley-Helsinki Summer Course
Lecture #12: Introspection and Adaptation
Randy H. Katz
Computer Science Division
Electrical Engineering and Computer Science Department
University of California
Berkeley, CA 94720-1776

Outline
• Introspection Concept and Methods
• SPAND Content-Level Adaptation
• MIT Congestion Manager/TCP-Layer Adaptation
• ICAP Cache-Layer Adaptation

Introspection
• From the Latin introspicere, "to look within"
– The process of observing the operations of one's own mind with a view to discovering the laws that govern it
• Within the context of computer systems:
– Observe how a system is used: usage patterns, network activity, resource availability, denial-of-service attacks, etc.
– Discover a behavioral model from such use
– Use this model to improve the behavior of the system, making it proactive rather than merely reactive to how it is used
– Improve performance and fault tolerance, e.g., by deciding when to make replicas of objects and where to place them

Introspection in Computer Systems
• Locality of reference
– Temporal: objects that are used are likely to be used again in the near future
– Geographic: objects near each other are likely to be used together
• Exploited in many places
– Hardware caches, virtual memory mechanisms, file caches
– Object interrelationships
– Adaptive name resolution
– Mobility patterns
• Implications
– Prefetching/prestaging
– Clustering/grouping
– Continuous refinement of the behavioral model

Example: Wide-Area Routing and Data Location in OceanStore
• Requirements
– Find data quickly, wherever it might reside
» Locate nearby data without global communication
» Permit rapid data migration
– Insensitivity to faults and denial-of-service attacks
» Provide multiple routes to each piece of data
» Route around bad servers and ignore bad data
– Repairable infrastructure
» Easy to reconstruct routing and location information
• Technique: combined routing and data location
– Packets are addressed to GUIDs, not locations
– The infrastructure gets the packets to their destinations and verifies that servers are behaving
(John Kubiatowicz)

Two Levels of Routing
• Fast, probabilistic search via a "routing cache"
– Built from attenuated Bloom filters
– An approximation to gradient search
– Not going to say more about this today
• Redundant Plaxton mesh used for the underlying routing infrastructure:
– Randomized data structure with locality properties
– Redundant, insensitive to faults, and repairable
– Amenable to continuous adaptation to adjust for:
» Changing network behavior
» Faulty servers
» Denial-of-service attacks

Basic Plaxton Mesh
[Diagram: incremental suffix-based routing among 16-bit NodeIDs (e.g., 0x43FE, 0x73FE, 0x23FE, 0x13FE), resolving one more suffix digit at each hop]

Use of Plaxton Mesh
[Figure: randomization and locality]

Use of the Plaxton Mesh (Tapestry Infrastructure)
• As in the original Plaxton scheme:
– A scheme directly maps GUIDs to root node IDs
– Replicas publish toward a document root
– A search walks toward the root until a pointer is located: locality!
• OceanStore enhancements for reliability:
– Documents have multiple roots (salted hash of the GUID)
– Each node has multiple neighbor links
– Searches proceed along multiple paths
» Tradeoff between reliability and bandwidth?
– Routing-level validation of query results
• Dynamic node insertion and deletion algorithms
– Continuous repair and incremental optimization of links

OceanStore Domains for Introspection
• Network connectivity, latency
– Location tree optimization, link failure recovery
• Neighbor nodes
– Clock synchronization, node failure recovery
• File usage
– File migration
– Clustering related files
– Prefetching, hoarding
• Storage peers
– Accounting, archive durability, blacklisting
• Meta-introspection
– Confidence estimation, stability
(Dennis Geels, [email protected])

Common Functionality
• These targets share some requirements:
– High input rates
» Watch every file access, heartbeat, packet transmission
– Both short- and long-term decisions
» Respond to changes immediately
» Extract patterns from historical information
– Hierarchical, distributed analysis
» Low levels make decisions based on local information
» Higher levels possess broader, approximate knowledge
» Nodes must cooperate to solve the problem
• We can build shared infrastructure

Architecture for Wide-Area Introspection
• Fast event-driven handlers
– Filter and aggregate incoming events
– Respond immediately if necessary
• Local database, periodic analysis
– Store historical information for trend-watching
– Allow more complicated, off-line algorithms
• Location-independent routing
– Flexible coordination, communication

Event-Driven Handlers
• Treat all incoming data as events: messages, timeouts, etc.
– Leads to a natural state-machine design
– Events cause state transitions, finite processing time
– A few common primitives could be powerful: average, count, filter by predicate, etc.
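Such event primitives can be sketched as composable functions over a stream of events. This is an illustrative Python sketch under an assumed event schema, not any actual OceanStore implementation:

```python
from collections import Counter

# Hypothetical sketch of event-handler primitives (filter, count, average)
# applied to a stream of introspection events. The event schema is assumed
# purely for illustration.

def filter_events(events, predicate):
    """Keep only events matching a predicate."""
    return [e for e in events if predicate(e)]

def count_by(events, key):
    """Count events grouped by a key function, e.g., (src, dst) edges."""
    return Counter(key(e) for e in events)

def average(events, field):
    """Average a numeric field over the events."""
    values = [e[field] for e in events]
    return sum(values) / len(values) if values else 0.0

# Example event stream: file accesses and a heartbeat.
events = [
    {"type": "file access", "src": "A", "dst": "B", "latency_ms": 12},
    {"type": "heartbeat",   "src": "C", "dst": "C", "latency_ms": 1},
    {"type": "file access", "src": "A", "dst": "B", "latency_ms": 18},
]

accesses = filter_events(events, lambda e: e["type"] == "file access")
edges = count_by(accesses, lambda e: (e["src"], e["dst"]))
print(edges[("A", "B")])                # 2 accesses along edge A->B
print(average(accesses, "latency_ms"))  # 15.0
```

Chaining a filter into a grouped count is exactly the shape of the edge-counting handler discussed on the next slide.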
• Implemented in a "small language"
– Contains important primitives for aggregation and database access
– Facilitates implementation of introspective algorithms
» Allows greater exploration, adaptability
– Security and termination guarantees can be verified
• E.g., EVENT.TYPE=="file access": increment COUNT in EDGES where SRC==EVENT.SRC and DST==EVENT.DST

Local Database, Periodic Analysis
• The database provides powerful, flexible storage
– Persistent data allows long-term analysis
– Standard interface for the event-handler scripting language
– Leverage existing aggregation functionality
» Considerable work from the Telegraph Project
– Can be lightweight
• Sophisticated algorithms run on the database
– Too resource-intensive to operate directly on events
– Allow use of a full programming language
– Security, termination still checkable; should use common mechanisms
• E.g., an expensive clustering algorithm operating over the edge graph, using sparse-matrix operations to extract eigenvectors representing related files

Location-Independent Routing
• Not a very good name for a rather simple idea. Interesting introspective problems are inherently distributed, and coordination among nodes is difficult. Needed:
– Automatically create/locate parents in the aggregation hierarchy
– Path redundancy for stability, availability
– Scalability
– Fault tolerance, responsiveness to fluctuations in workload
• The OceanStore data location system shares these requirements. This coincidence is not surprising, as each is an instance of wide-area distributed problem solving.
• Leverage the OceanStore location/routing system

Summary: Introspection in OceanStore
• Recognize and share a few common mechanisms
– Efficient event-driven handlers
– More powerful, database-driven algorithms
– Distributed, location-independent routing
• Leverage the common architecture to let system designers concentrate on developing and optimizing domain-specific algorithms

Outline
• Introspection Concept and Methods
• SPAND Content-Level Adaptation
• MIT Congestion Manager/TCP-Layer Adaptation
• ICAP Cache-Layer Adaptation

SPAND Architecture
[Figure: SPAND architecture]
(Mark Stemm)

What Is Needed
• An efficient, accurate, extensible, and time-aware system that makes shared, passive measurements of network performance
• Applications that use this performance measurement system to enable or improve their functionality

Issues to Address
• Efficiency: what are the bandwidth and response-time overheads of the system?
• Accuracy: how closely does the predicted value match actual client performance?
• Extensibility: how difficult is it to add new types of applications to the measurement system?
• Time-awareness: how well does the system adapt to and take advantage of temporal changes in network characteristics?
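The core of SPAND's shared passive measurement idea can be made concrete with a small sketch: clients file performance reports for the transfers they actually made, keyed by destination and application class, and a shared server aggregates them to answer later requests. The class and method names, and the simple-mean aggregation, are assumptions for illustration only, not the real SPAND protocol:

```python
from collections import defaultdict

# Illustrative sketch of a SPAND-style performance server. Names and the
# mean-based aggregation are assumptions, not the actual SPAND design.
class PerformanceServer:
    def __init__(self):
        # Reports are keyed by (destination address, application class).
        self.reports = defaultdict(list)

    def report(self, addr, app_class, response_time_s):
        """A client passively observed a real transfer and reports it."""
        self.reports[(addr, app_class)].append(response_time_s)

    def request(self, addr, app_class):
        """Answer a Performance Request with an aggregated prediction."""
        samples = self.reports.get((addr, app_class))
        if not samples:
            return None  # no shared history for this target yet
        return sum(samples) / len(samples)

server = PerformanceServer()
# Two similarly connected clients share what they observed.
server.report("server.example.com", "bulk-transfer", 2.0)
server.report("server.example.com", "bulk-transfer", 4.0)
print(server.request("server.example.com", "bulk-transfer"))  # 3.0
print(server.request("server.example.com", "telnet"))         # None
```

Because reports come from real application transfers and are pooled across clients in the same domain, no probe traffic is generated and a client can get a prediction before contacting a server it has never used.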
SPAND Approach: Shared Passive Measurements
[Figure]

Related Work
• Previous work to solve this problem:
– Uses active probing of the network
– Depends on results from a single host (no sharing)
– Measures the wrong metrics (latency, hop count)
• NetDyn, NetNow, Imeter
– Measure latency and packet-loss probability
• Packet Pair, bprobes
– With Fair Queuing, measures the "fair share" of bottleneck link bandwidth
– Without Fair Queuing, unknown (the minimum is close to link bandwidth)
• Pathchar
– Combines traceroute and packet pair to find hop-by-hop latency and link bandwidth
• Packet Bunch Mode
– Extends the back-to-back technique to multiple packets for greater accuracy

Related Work
• Probing algorithms
– Cprobes: sends a small group of echo packets as a simulated connection (without flow or congestion control)
– Treno: like the above, but with TCP flow/congestion-control algorithms
– Network Probe Daemon: traces a route or makes a short connection to other network probe daemons
– Network Weather Service: makes periodic transfers to distributed servers to determine the bandwidth and CPU load of each

Related Work
• Server selection systems
– DNS to map a name to many servers
» Either round-robin or load balancing
– Boston University: uses cprobes, bprobes
– Harvest: uses round-trip time
– Harvard: uses geographic location
– Using routing metrics:
» IPv6 Anycast
» HOPS
» Cisco Distributed Director
» University of Colorado
– IBM WOM: uses ping times
– Georgia Tech: uses per-application, per-domain probe clients

Comparison with Shared Passive Measurement
• What is measured?
– Others: latency, link bandwidth, network bandwidth
– SPAND: actual response time, application-specific
• Where is it implemented?
– Others: internal network, at the server
– SPAND: only in the client domain
• How much additional traffic is introduced?
– Others: tens of kilobytes per probe
– SPAND: small performance reports and responses
• How realistic are the probes?
– Others: artificially generated probes that don't necessarily match realistic application workloads
– SPAND: actual observed performance from applications

Comparison with Shared Passive Measurement
• Does the probing use flow/congestion control?
– Others: no
– SPAND: whatever the application uses (usually yes)
• Do clients share performance information?
– Others: no; sometimes probes are made on behalf of clients
– SPAND: yes

Benefits of Sharing and Passive Measurements
• Two similarly connected hosts are likely to observe the same performance to distant hosts
• Sharing measurements means redundant probes can be eliminated

Benefits of Passive Measurements
[Figure]

Design of SPAND
[Figure: SPAND design]

Design of SPAND
• Modified clients
– Make Performance Reports to Performance Servers
– Send Performance Requests to Performance Servers
• Performance Servers
– Receive reports from clients
– Aggregate/post-process reports
– Respond to requests with Performance Responses
• Packet capture host
– Snoops on local traffic
– Makes Performance Reports on behalf of unmodified clients

Design of SPAND
• Application classes
– The way in which an application uses the network
– Examples:
» Bulk transfer: uses flow control, congestion control, reliable delivery
» Telnet: uses reliability
» Real-time: uses flow control and reliability
– (Addr, Application Class) is the target of a Performance Request/Report

Issues
• Accuracy
– Is network performance stable enough to make meaningful Performance Reports?
– How long does it take before the system can service the bulk of the Performance Requests?
– In steady state, what percentage of Performance Requests does the system service?
– How accurate are Performance Responses?
• Stability
– Performance results must not vary much with time
• Implications of connection lengths
– Short TCP connections are dominated by round-trip time; long connections by available bandwidth

Application of SPAND: Content Negotiation
• Web pages look good on the server LAN
[Figures: implications for distant access and overwhelmed servers; content negotiation; client-side negotiation results; server-side dynamics; server-side negotiation results]

Content Negotiation Results
• The network is the bottleneck for clients and servers
• Content negotiation can reduce the download times of web clients
• Content negotiation can increase the throughput of web servers
• The actual benefit depends on the fraction of negotiable documents

Outline
• Introspection Concept and Methods
• SPAND Content-Level Adaptation
• MIT Congestion Manager/TCP-Layer Adaptation
• ICAP Cache-Layer Adaptation

Congestion Manager (Hari@MIT, Srini@CMU)
• The problem:
– Communication flows within an end node compete for the same limited bandwidth (especially during slow start!); each implements its own congestion response, with no shared learning, which is inefficient
• The power of shared learning and information sharing
[Diagram: flows f1, f2, ..., f(n) from a server cross the Internet to a client]

Adapting to the Network
[Diagram: a new flow f1 from server to client across the Internet, behavior unknown]
• New applications may not use TCP
– They implement new protocols
– They often do not adapt to congestion: not "TCP-friendly"
• We need a system that helps applications learn about and adapt to congestion

State of Congestion Control
• Increasing number of concurrent flows
• Increasing number of non-TCP apps
• Congestion Manager (CM): an end-system architecture for congestion management

The Big Picture
[Diagram: HTTP/TCP, audio, video, and UDP flows share per-macroflow statistics (cwnd, rtt, etc.) through the Congestion Manager API above IP]
• All congestion-management tasks are performed in CM
• Applications learn and adapt using the API

Problems
• How does CM control when and whose transmissions occur?
– Keep the application in control of what to send
• How does CM discover network state?
– What information is shared?
– What is the granularity of sharing?
• Key issues: the API and information sharing

The CM Architecture
[Diagram: applications (TCP, conferencing apps, etc.) sit above the API; the sender side holds a congestion controller, prober, and scheduler, connected via the CM protocol to a congestion detector and responder at the receiver]

Feedback about Network State
• Monitoring successes and losses
– Application hints
– Probing system
• Notification API (application hints)

Probing System
• Receiver modifications necessary
– Support for a separate CM header, inserted between the IP header and the IP payload
– Uses sequence numbers to detect losses
– The sender can request a count of packets received
• Receiver modifications are detected/negotiated via a handshake
– Enables incremental deployment

Congestion Controller
• Responsible for deciding when to send a packet
• Window-based AIMD with traffic shaping
• Exponential aging when feedback is low
– Halve the window every RTT (minimum)
• Other algorithms can be plugged in
– Selected on a "macroflow" granularity

Scheduler
• Responsible for deciding who should send a packet
• Hierarchical round robin
• Hints from the application or receiver
– Used to prioritize flows
• Other algorithms can be plugged in
– Selected on a "macroflow" granularity
– The prioritization interface may be different

CM Web Performance
[Chart: sequence number vs. time for TCP NewReno with and without CM]
• CM greatly improves predictability and consistency

Layered Streaming Audio
[Chart: sequence number vs. time (0-25 s) for a competing TCP flow, TCP/CM, and Audio/CM]
• Audio adapts to available bandwidth
• The combination of TCP and audio competes equally with normal TCP

Congestion Manager Summary
• CM enables proper and stable congestion behavior
• A simple API enables apps to learn about and adapt to network state
• Improves the consistency/predictability of network transfers
• CM provides benefit even when deployed at senders alone

Outline
• Introspection Concept and Methods
• SPAND Content Level
Adaptation
• MIT Congestion Manager/TCP-Layer Adaptation
• ICAP Cache-Layer Adaptation

How Internet Content Is Delivered Today
[Diagram: centralized server farms (databases, mainframes) in Boston and New York hold multiple versions of content (English, Spanish); Internet caching and content delivery localize content near last-mile broadband access (cable modems, DSL, dial-up, wireless)]

What Is iCAP?
• iCAP lets clients send HTTP messages to servers for "adaptation"
– In essence, an RPC (Remote Procedure Call) mechanism for HTTP messages
• An adapted message might be a request:
– Modify the request method, the URL being requested, etc.
• ...or it might be a reply:
– Change any aspect of the delivered content
• iCAP enables edge services

What iCAP Is Not (for now)
• A way to specify adaptation policy
• A configuration protocol
• A protocol that establishes trust between previously unrelated parties
• In other words: iCAP defines the how, not the who, when, or why

iCAP Makes Content Smarter!
• iCAP enables local services
[Diagram: clients reach local sources of content (a content distribution network or cache) in the ISP network instead of crossing a congested, slow, distant, and/or expensive link through a large backbone ISP to distant server farms]
• Local sources of content: better for everyone (client, network, server)

Why iCAP?
[Figure legend: ad insertion]
[Diagram: iCAP servers for compute-intensive operations (language translator, virus checker, transcoder, content filter) attached to a web server or proxy]
• Fast, simple, scalable
• Allows services to be customized

iCAP Benefits
• Very simple operation
– iCAP builds on HTTP GET and POST
• No proprietary APIs required
• Standards-based
• Leverages the latest Internet infrastructure developments
• Fast, simple, scalable, and reliable
• Allows you to customize services

iCAP General Design
• Simple, simple, simple: a CGI script should be able to turn a web server into an iCAP server
• Based on HTTP (plus special headers)
• Three modes:
– Modify a request
– Satisfy a request (like any other proxy)
– Modify a response

Request Modification
• The request is passed to the iCAP server (almost) unmodified, just as a proxy would pass it
• The iCAP server sends back a modified request, encapsulated in response headers
– The body, if any (e.g., for a POST), may also be modified
[Diagram: client → proxy cache (iCAP client) → iCAP server; the iCAP server modifies the request, and the modified request continues on to the origin server]

Response Modification
• The iCAP client always uses POST to send the body
• Also encapsulated in the POST headers may be:
– The headers the user used to request the object
– The headers the origin server used in its reply
• The iCAP server replies with the modified content
[Diagram: client → proxy cache (iCAP client) → origin server; the response is passed to the iCAP server for modification, perhaps once as the object is cached, or once per client served]

Request Satisfaction
[Diagram: client → proxy cache (iCAP client) → iCAP server; the iCAP server satisfies the request just like a proxy, and MAY (or may not) contact the origin server itself]

Infinite Variations
• Allows innovation: you choose third-party applications
• iCAP enables many different kinds of apps!
– Edge content sources can pass pages to ad servers
– Expensive operations can be offloaded
– Content filters can respond either with an unmodified request or with HTML ("Get back to work!")

Next Steps
• iCAP supporters continue to enhance the protocol
– Learn from deployed solutions and fix "bugs"
– Build future functionality later
• IETF
– The iCAP Forum will submit the specification to the IETF for draft RFC status in mid-2000
• Additional partners
– Software developers, infrastructure companies, and Internet content delivery service providers will be solicited for participation
– Need to get everyone on the same page

More Information
• Important iCAP information is at http://www.i-cap.org
• Become an iCAP participant by sending an e-mail to mailto:[email protected]. A reply will be sent outlining requirements.
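To make the request-modification flow concrete, here is a toy sketch of an iCAP-style adaptation step: the proxy cache (the iCAP client) hands each HTTP request to an adaptation function, which may rewrite it before it continues toward the origin server. This is plain Python standing in for the iCAP wire protocol, and every name here (hosts, functions, fields) is illustrative:

```python
# Toy sketch of iCAP-style request modification. This is NOT the actual
# iCAP wire protocol; all hostnames and field names are hypothetical.

def adapt_request(request):
    """Example adaptation: a content filter rewrites requests for a
    blocked site into a request for a local "blocked" page, as in the
    "Get back to work!" example."""
    blocked = {"games.example.com"}
    if request["host"] in blocked:
        return {"method": "GET", "host": "filter.example.com",
                "path": "/blocked.html"}
    return request  # pass through unmodified

def proxy_handle(request):
    """The proxy (iCAP client) adapts the request, then forwards it to
    whatever host the adapted request names."""
    modified = adapt_request(request)
    return "http://%s%s" % (modified["host"], modified["path"])

print(proxy_handle({"method": "GET", "host": "games.example.com",
                    "path": "/arcade"}))
# -> http://filter.example.com/blocked.html
print(proxy_handle({"method": "GET", "host": "news.example.com",
                    "path": "/today"}))
# -> http://news.example.com/today
```

Response modification and request satisfaction follow the same shape: the proxy encapsulates the message, the adaptation server returns a (possibly) modified one, and the proxy carries on with the result.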