Adding a Flow-Oriented Paradigm to Commodity Operating Systems

Cristian Soviani, Stephen A. Edwards, and Angelos Keromytis
Department of Computer Science, Columbia University in the City of New York
{soviani,sedwards,angelos}@cs.columbia.edu

The Status Quo: Memory as Buffer

[Diagram: today's data path in a web server. Data from the hard disk, MPEG encoder, crypto accelerator, and network interface crosses the memory-I/O bus (e.g., PCI) and is buffered in main memory under control of the OS kernel.]

Network I/O Becoming Faster than CPUs

[Plot: CPU speed vs. memory bus speed in MHz, 1986-2004. Source: Intel.]

Our Idea: Flows Controlled by OS

[Diagram: the proposed data path. The web server and OS kernel perform only flow initialization; data then moves over the memory-I/O bus (e.g., PCI) directly among the hard disk, an FPGA board performing JPEG encoding and IPsec processing, the crypto accelerator, and the network interface, without being staged in main memory.]

Like a modern router's control & data plane.

Layers

[Diagram: the user/kernel/hardware layers crossed by a typical data path, by sendfile(), and by our approach.]

Abstract Server Model

[Diagram: a server abstracted as sources (network, video in, hard drive), transformers (crypto, MPEG encode, MPEG decode), and sinks (hard drive, network, audio out).]

Signaling API

Some mechanism is needed for creating, controlling, and tearing down flows, e.g.,

    flow = flow_open();
    flow_source(flow, "/usr/http/secretfile.html");
    flow_xformer(flow, "/dev/crypto");
    flow_xformer(flow, "/dev/http");
    flow_sink(flow, "/dev/inet/192.168.1.3");
    flow_start(flow);
    flow_stop(flow);

(A fuller sketch of how an application might use such an API appears at the end of this document.)

Exception Handling

What happens if something goes wrong? The OS may:
• try to restart the device
• redirect the flow to a different device
• pass the error to the application
• switch to an all-software flow
• terminate the flow

Resource Scheduler

What if two processes want access to /dev/crypto? The scheduler must weigh:
• performance requirements
• available resources
• priorities

Flow-level scheduling is costly but infrequent; detailed (e.g., bus-access) scheduling happens far more often. (An illustrative admission sketch appears at the end of this document.)

Programmable Peripherals

• RadiSys ENP-2611: PCI network card with an IXP2400
• Altera Stratix FPGAs: flow components or wrappers around legacy peripherals (DMA absorbers)

Proof-of-Concept System

[Block diagram: a main system CPU with 16 kB SRAM, general-purpose peripherals (interrupt, USER, JTAG, VGA, and UART controllers), and 2 MB of off-chip ZBT SRAM behind an LRU arbiter, connected over OPB bridges to NIC, CRYPTO, HDC, and SNIC device subsystems, each with its own CPU, 8 kB SRAM, 8 kB of dual-ported SRAM, and private circuitry.]

Experimental Results

Time to send 1 million packets through the pipeline:

    Packet size    Memory-centric    Flow-centric*    Speed-up
    64 bytes       15.5 s            13.6 s           13%
    1024 bytes     114 s             59.2 s           49%

    * In addition, the main processor is mostly idle for these packets.

For 1024-byte packets, the 49% reduction in time corresponds to roughly a 1.9× improvement in throughput.

Conclusions

• A new flow-centric architecture for operating systems
• Have the OS manage inter-peripheral flows under application control
• Requires programmable peripherals; many already exist
• The proof of concept showed nearly a 2× speedup
• Minimal performance impact from additional computations (e.g., security)
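
Sketch: Using the Signaling API

To make the signaling API above concrete, here is a minimal C sketch of how an application might build the hard-disk-to-network flow from the Signaling API section and fall back when something goes wrong, as listed under Exception Handling. Only the flow_* calls shown above come from this document; the flow_t handle type, the return-code conventions, and flow_close() are assumptions made for this illustration, and nothing here will link without an OS that actually provides the API.

    #include <stddef.h>

    /* Hypothetical declarations modeled on the calls in the Signaling API
       section; handle type, return conventions, and flow_close() are assumed. */
    typedef struct flow *flow_t;
    flow_t flow_open(void);
    int    flow_source(flow_t f, const char *path);
    int    flow_xformer(flow_t f, const char *device);
    int    flow_sink(flow_t f, const char *device);
    int    flow_start(flow_t f);
    int    flow_stop(flow_t f);
    void   flow_close(flow_t f);

    /* Serve a file through the hard disk -> crypto -> HTTP -> network flow.
       Returns 0 on success and -1 if the flow cannot be built or started,
       in which case the caller would fall back to an all-software path. */
    static int serve_file_via_flow(void)
    {
        flow_t f = flow_open();
        if (f == NULL)
            return -1;                        /* no flow support at all */

        if (flow_source(f, "/usr/http/secretfile.html") != 0 ||
            flow_xformer(f, "/dev/crypto") != 0 ||
            flow_xformer(f, "/dev/http") != 0 ||
            flow_sink(f, "/dev/inet/192.168.1.3") != 0) {
            flow_close(f);                    /* some stage was unavailable */
            return -1;
        }

        if (flow_start(f) != 0) {             /* e.g., the scheduler refused */
            flow_close(f);
            return -1;
        }

        /* The CPU is now free for other work while peripherals move the data. */

        flow_stop(f);
        flow_close(f);
        return 0;
    }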
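
Sketch: A Flow-Level Admission Decision

The Resource Scheduler section lists what a flow-level decision has to weigh: performance requirements, available resources, and priorities. The following self-contained C sketch shows one possible shape of such a decision for a single shared transformer such as /dev/crypto. The capacity model, the struct fields, and the policy are illustrative assumptions, not part of the design described above.

    #include <stdbool.h>

    /* One shared transformer, e.g. /dev/crypto, modeled by the throughput it
       can sustain and the throughput already promised to admitted flows. */
    struct xformer {
        unsigned capacity_mbps;
        unsigned committed_mbps;
    };

    /* What a new flow asks for. */
    struct flow_request {
        unsigned needed_mbps;   /* performance requirement */
        int      priority;      /* would drive preemption in a fuller scheduler */
    };

    /* Flow-level admission: relatively costly to evaluate, but run only when a
       flow is created, not per packet.  Returns true and reserves capacity if
       the device can honor the request; returns false otherwise, in which case
       the OS could pick another device, queue the request, or fall back to an
       all-software flow (see Exception Handling). */
    static bool admit_flow(struct xformer *dev, const struct flow_request *req)
    {
        unsigned free_mbps = dev->capacity_mbps - dev->committed_mbps;

        if (req->needed_mbps > free_mbps)
            return false;       /* a fuller scheduler might instead preempt a
                                   lower-priority flow here */

        dev->committed_mbps += req->needed_mbps;
        return true;
    }

The split matches the point above: this per-flow check runs rarely, at flow setup, while detailed bus-access scheduling happens far more often and would have to be handled closer to the hardware.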