Design & Co-design of Embedded Systems
The ODYSSEY Methodology: ASIP-Based Design of Embedded Systems from Object-Oriented System-Level Models
Maziar Goudarzi

Outline
• Motivation
• Related Work
• ODYSSEY: Theory
• ODYSSEY: Implementation
• ODYSSEY: Design Automation
• Summary and Conclusion

Embedded Systems Market
• Rapidly growing market
  – Compound Annual Growth Rate (CAGR) of 17.3%
• The future of computing resides in embedded computing

Market Life Cycle
• A delay to the market window causes a huge revenue impact
Source: Agilent Technologies

Motivation: Design Automation
• Conclusion:
  – Design Automation Tools & Methodologies are needed for Embedded System Design
• Question:
  – At what level of abstraction?

The Design Productivity Gap
[Figure: the design productivity gap]

Motivation: Electronic System-Level (ESL) Design
• Solution:
  – Raise the level of abstraction
• Historical examples:
  – Place & route tools
  – Hardware description languages
  – Hardware synthesis
• Latest suggestion:
  – ESL
    • Spans SW+HW
Source: Monterey Design Systems

Motivation
• Conclusion:
  – The embedded system industry is in need of ESL Design Methodologies and supporting Design Automation Tools
• Question:
  – How to specify, implement, and validate the embedded system?

The First Challenge in ESL: Specification
• Alternatives:
  – Extend HW modeling (e.g. VHDL) to SW
  – Extend SW modeling (e.g. Java) to HW
  – Use HW/SW-neutral or mathematical models (e.g. Codesign FSM)
• Observations:
  – Software accounts for 80% of embedded system development cost [ITRS-2003]
  – Technology trend toward SW:
    • Catapult (Mentor Co.)
    • Agility Compiler & DK Design Suite (Celoxica Co.)
    • Cascade (CriticalBlue Co.)

ESL Challenges (cont'd)
• Conclusion:
  – Object-oriented design methodology is a reasonable answer
• Questions:
  – What about other ESL challenges?
    • Implementation, verification, automation of the design
    • To be discussed later in the talk…

Thesis of this work
There is scope to raise the abstraction level of processors when designing embedded systems; furthermore, such a raise helps to address modelling, implementation, and reuse challenges in the design and design automation of modern embedded systems.

Related Work
• OO used for hardware modeling
  – Extensions of VHDL
    • Myriads of different proposals
      – Objective-VHDL, several flavours of OO-VHDL, SUAVE
    • Just a few consider synthesis
  – Java
    • HW components viewed as objects
    • Signals travelling among components viewed as objects
  – C++
    • SystemC
    • CynLib from CynApps

Related Work (cont'd)
• OO used for hardware modeling (cont'd)
  – Modeling is good, but synthesis is the major concern
• Major approaches to OO synthesis
  – ODETTE
  – OASE
  – Enodia® Architecture
  – Not in our area of work:
    • Wolf's OO Co-synthesis
    • Matisse
    • jHISC

The ODETTE Approach
• ODETTE proposal:
  – View objects as Finite-State Machines (FSMs)
  – Object attributes: FSM state variables
  – Method calls: FSM state transitions

The ODETTE Object FSM
[Figure: the ODETTE object FSM]

Polymorphism in ODETTE
[Figure: polymorphism in ODETTE]

Analysis of ODETTE
• Nice, but very high overhead
  – One FSM per object => high area and power overhead: O(n_o)
  – Polymorphism: replication => high area and power overhead
  + Maximum potential concurrency
  – (Apparently) FSM => sequential method calls inside objects
    • Q: What if a method calls another one?
    • Q: How to extend to HW/SW systems?
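
The ODETTE view summarized above (objects as FSMs, attributes as FSM state variables, method calls as state transitions) can be pictured with a small C++ sketch. It is only an illustration under stated assumptions: the counter object, its two methods, and the names `CounterObjectFsm`, `State`, `Call`, and `tick` are hypothetical and are not taken from ODETTE itself.

```cpp
#include <cassert>

// Hypothetical control states and method-call requests (illustrative only).
enum class State { Idle, Incrementing, Resetting };
enum class Call { None, Increment, Reset };

// One object modelled as one FSM, in the spirit of the ODETTE proposal.
struct CounterObjectFsm {
    State state = State::Idle;  // control state of the object FSM
    int value = 0;              // object attribute kept as an FSM state variable

    // One clock tick. A method call arriving in Idle triggers a transition;
    // the method body runs in the following tick. Returns true when the
    // object is idle again and can accept the next call.
    bool tick(Call call) {
        switch (state) {
        case State::Idle:
            if (call == Call::Increment) state = State::Incrementing;
            else if (call == Call::Reset) state = State::Resetting;
            break;
        case State::Incrementing:
            ++value;
            state = State::Idle;
            break;
        case State::Resetting:
            value = 0;
            state = State::Idle;
            break;
        }
        return state == State::Idle;
    }
};
```

Because each object carries its own state register and transition logic, instantiating one such FSM per object is what makes the area and power cost grow with the number of objects. The sketch also shows why calls inside one object are serialized: a new call is only accepted in the `Idle` state.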
The OASE Approach
• OASE proposal:
  – Reuse and customize behavioural synthesis techniques
  – Static analysis & transformation of the OO code
  – Converts OO constructs to non-OO ones
    • Access to object attributes
    • Non-virtual method calls
    • Virtual method calls (polymorphism)

OASE Transformation Process
[Figure: stages shown are e source, scanner/parser, syntax tree, semantic analysis, control-flow analysis, data-flow analysis, concurrency analysis, symbol tables, control-flow graph, output of intermediate format, Verilog]
The transformation process from 'e' to Verilog [Kuhn et al., DAC'01]

Polymorphism in OASE
Results of static analysis:
  Reference variable | Object set
  x                  | S1, S2
  y                  | S2
  z                  | S1, S2, S3
Resulting dispatch code for z:
  switch (z) {
    case S1: S1_foo();
    case S2: S2_foo();
    case S3: S3_foo();
  }
An example in e language [Kuhn et al., DAC'01]

Analysis of OASE
• Nice extension of behavioural synthesis to OO, but still high overhead for polymorphism
  – Area/power overhead: O(n_o n_mc)

The Enodia® Architecture
• Silicon Infusion Co. (UK startup)
• Enodia proposal:
  – Bottom-up composition of a variety of their IP cores
  – An Object-Orientated SoC architecture
    • Patented in UK and US

Enodia® E9610 product
[Figure: internal architecture of the Enodia E9610 chip] [Silicon Infusion Co., 2004]

Analysis of Enodia®
• Patent on high-performance caching
• Chip architecture very similar to ours, but
  – Uses firmware for polymorphism => performance overhead
  – Bottom-up approach => one manual chip design per application domain

Summary & Comparison
  Aspect                 | ODETTE                           | OASE                           | Enodia
  Impl. style            | ASIC                             | ASIC                           | Heterogeneous multiprocessor
  Synthesis approach     | Per-object method replication    | Static analysis + inlining     | Multiple objects per method impl.
  Language               | Objective-VHDL, SystemC-Plus     | Java, SystemC, e               | N.A.
  Optimization           | Dead-code removal                | Object reachability            | N.A.
  Polymorphism           | Method replication & multiplexing | Method inlining               | Firmware
  HW-SW?                 | Not provided                     | Stub generation                | SW on multiprocessor
  Model of concurrency   | Objects invoked from processes   | Multiple processes in modules  | N.A.
  Dynamic (de)allocation | Not supported                    | Not supported                  | Supported

Summary & Comparison (cont'd)
• Major shortcomings
  1. Viewing objects as structural components
  2. Overly verbose languages
  3. Unacceptable area/power overhead
  4. No or unclear path toward HW-SW systems
  5. HW designers' reluctance to OO
• We propose ODYSSEY
  – Object-oriented Design and sYntheSiS of Embedded sYstems

ASIP vs. ASIC
• Application-specific instruction-set processors (ASIPs) are replacing ASICs
Source: K. Keutzer, S. Malik, R. Newton, "From ASIC to ASIP: The Next Design Discontinuity", ICCD, 2002

OO-ASIP: Object-Oriented ASIP
• Our proposal:
  – Let methods of a class library be the instruction-set of a processor
[Figure: a class library with classes A (i: int; f(), g()), B (c: char; f(), h()), and C (f: float; g(), k()) mapped onto the OO-ASIP; the data memory holds objects a1, b1, a2; the instruction memory holds the calls b1.h(), a2.g(), ap->f()]

OO-ASIP vs. Traditional Processors
• OO-ASIP for int/float = a traditional processor
• Differentiating features
  – OO-ASIP instructions can call one another
  – OO-ASIP instructions can be implemented in software as well as in hardware
  – Big instructions
    • Independent execution units for each HW instruction
    • Dynamic power management by de-activating instructions that are not running, and dynamic area management by caching most-recently-run instructions
  – OO-ASIP implements polymorphism in hardware

OO-ASIP vs. Other ASIPs
• Typical ASIP-design flow
[Figure: applications and design constraints, application analysis, architectural design-space exploration, instruction-set generation, code synthesis, hardware synthesis, object code]
Source: M.K. Jain, M. Balakrishnan, A. Kumar, "ASIP Design Methodologies: Survey and Issues", VLSI-Design Conf., 2001.
• Disadvantage
  – No guarantee to suit future different (but related) applications
• OO-ASIP: future related apps. shall use today's class lib.

Design-Space Represented by OO-ASIP
[Figure: given an OO application with N_o objects, one axis is the number of objects per OO-ASIP (1, 2, ..., N_o) and the other is the style of methods (all SW to all HW); marked points include implementation by a traditional processor and the ODETTE implementation]

Design Flow using OO-ASIPs
[Figure: OO-ASIP design flow (disciplined benchmarking, OO-ASIP synthesis, database of OO-ASIPs and hardware class libraries) and OO-ASIP reuse flow (choose a suitable class lib., model + verify the app., compile toward the ASIP), ending in the OO-ASIP with its data memory and instruction memory]

Design Flow using OO-ASIP: Another View
[Figure: the application SW model (C++) follows the ASIP programming path to software; the system class library (SystemC, C++), split into hardware and software class libraries, follows the ASIP synthesis path to the ASIP hardware; ASIP ISA: f, g, k]

Programming the OO-ASIP
• Requirements on the OO-ASIP compiler
  – Retargetable to various OO-ASIPs
  – Retargetable to various processor cores
  – Capable of early hardware-software co-validation
• Our solution:
  – Source-to-source transformation

The ODYSSEY Ultimate Goal
• The ODYSSEY target chip: FPGA-like array of OO-ASIPs
• Interconnection:
  – Packet-routing network
  – Motivation:
    • Network-on-Chip viewed as future paradigm in DSM technologies
[Figure: the ODYSSEY system-synthesizer targets an on-chip network of OO-ASIPs (OO-ASIP1 to OO-ASIP4) and processors connected through routers]

A Simple OO-ASIP Architecture
[Figure: the OO-ASIP consists of functional units (FUs) holding the implementations of A::f(), A::g(), and B::h(), a traditional processor running the B::f() routine, an Object Management Unit (OMU) connected to the data memory, and a Method Invocation Unit (MIU) with VMT and OTT fed from the instruction memory]

Case Study 1: Traffic-Light Controller
[Figure: classes traffic_light (status: int, elapsed_time: int; open(), close(), timekeeper()), farmroad_light, and highway_light (fixed_green: int, min_green: int; open(), close())]
• All methods implemented in hardware

Case Study 1: Traffic-Light Controller
[Chart: share in total area (%) of the MIU, the OMU, and the individual method implementations]
Values reported by the LeonardoSpectrum tool over a sample 0.5 um process

Case Study 1: Traffic-Light Controller
[Chart: power consumption (nW) without and with power-down for the evaluated configurations; annotated reductions of 15% and 20%]
Values estimated by the Synopsys PowerCompiler tool over a 1 um process with 5 V operating voltage

Analysis of the Architecture
• Area/power management
  – Static (application-specific) policy
  – Dynamic (application-independent) policy
• Polymorphism overhead
  – Performance improved by HW MIU
  – Area/power overhead still present

Our Solution: Network-on-Chip Architecture
• Dispatch virtual methods at the same time that packets are routed on an on-chip network
[Figure: the OO-ASIP with a processor, the Object Management Unit (OMU) connected to the data memory, and functional units (FUs) for A::f(), A::g(), B::h(), all attached to an on-chip network]

NoC: Network-on-Chip
• NoC emergence:
  – Fully synchronous designs not feasible anymore
  – Unreliable communication in very deep submicron technologies (90 nm and beyond)
  – Solution: leverage computer networks and protocols for communication inside chips
  – NoC seems unavoidable
Reference: L. Benini, G. De Micheli, "Networks on Chips: a New SoC Paradigm," IEEE Computer, 35(1):70-78, 2002.
Ordinary-Method Dispatch by Network Routing
• FU-identifier: FU = <method.class>
• Object-identifier: object = <class.num>
• Method call = invoke a method on an object:
    <method.object> = <method.<class.num>> = <<method.class>.num> = <FU.num>
  = packet destined to the node addressed FU

Virtual-Method Dispatch by Network Routing
• To dynamically bind a method call (e.g. objp->method(params) in C++):
  1. Assemble a packet as <method, objp, params>
  2. Send it over the on-chip network
  3. The return value (if any) is sent back as another packet

Case Study 2: A Codec Engine
[Figure: classes data_block (data[20]: byte; print(), encode(), decode()), xor_encoded_data (cypher: byte; convert_char(byte), encode(), decode()), and swap_encoded_data (swap(byte, byte), encode(), decode()); methods are marked as hardware methods or software methods]

Case Study 2: Implementation in SystemC
[Figure: implementation of the codec engine in SystemC]

Input-Output Correspondence
[Figure: the system model (C++), made of class definitions (attributes, HW-methods, SW-methods) and the main() function, corresponds to the system implementation (SystemC): the OO-ASIP with its Object-Management Unit (OMU), HW-method implementations, and on-chip network, plus a processor module running thread__main() and the SW-method implementations]

Big Picture of Tool Flow
[Figure: system-level synthesis takes the system model (C++) through parsing + analysis and partitioning; OO-ASIP synthesis applies HW-method transformations and the HW-structure generator to produce hardware (SystemC); OO-ASIP compilation applies SW-method transformations and the SW-structure generator to produce software (C++) with instruction-set extensions; downstream synthesis uses SystemC synthesis and a traditional processor C++ compiler to produce gate-level HW and binary SW for the final system]

HW-SW Co-simulation Model
[Figure: the same tool flow, in which the generated hardware (SystemC) and software (C++) together form the co-simulation model before downstream synthesis]

Experiments on Co-simulation Performance*
[Chart: attribute-access frequency (10K acc/s), method-call frequency (100 call/s), and imposed overhead (%) for each experiment]
* All experiments done on a Celeron 2.0 GHz processor with 256 MB of RAM
** Worst case assumed: all methods are implemented in hardware

Analysis of Experimental Results
• High rate of method calls per second
  = High communication/computation ratio
  = Most of the time spent in communication instead of computation
  = Potentially low performance in final implementation
• Conclusion:
  – Low co-simulation performance ~ potentially low final performance
    => Hint to the designer: decrease the comm./comp. time ratio (e.g. by combining methods)

Summary
• An ESL design methodology for embedded systems was
  – developed
  – implemented
  – automated
• The main thrusts:
  – The design methodology
  – The raise in abstraction level of the processor ISA
  – The OO-ASIP processor

Further Research
• Currently ongoing:
  – Case studies on real-life industrial apps.
    • JPEG codec (Morteza NajafVand)
    • MPEG decoder (Naser MohammadZadeh)
  – Object-aware cache
    • Application-specific data prefetching in hardware (Mehdi Modarressi)
  – Synthesis of a Multiprocessor OO-ASIP (Hani JavanHemmat)
  – RT-Level co-simulation (Ms. Zeinolabedini)
  – Using IP-Cores in OO-ASIPs (Ms. Hashemi)
  – Fault-tolerance by software standby sparing
  – Assertion-based verification
• A few others
  – Application-specific memory synthesis for OO-ASIP
  – Fault-tolerance by dynamic reconfiguration using polymorphism
  – Multithreaded OO-ASIP

Conclusion
There is scope to raise the abstraction level of processors when designing embedded systems; furthermore, such a raise helps to address modelling, implementation, and reuse challenges in the design and design automation of modern embedded systems.

Supplementary Material

Supplements
• FDL'03 Poster
• Presentation at Oldenburg
• Progress Report 1 at Department of High-Tech. Industries, Ministry of Industries and Mines
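
As a supplementary illustration of the "Virtual-Method Dispatch by Network Routing" slide, the following C++ sketch models how routing a packet <method, objp, params> to the node addressed by <method.class> performs the dynamic binding. It is a software approximation under stated assumptions, not the ODYSSEY implementation: the types `ObjectId`, `FuAddress`, `Packet`, and `OnChipNetwork`, the field widths, and the single integer parameter are all hypothetical, and inherited (non-overridden) methods are not handled.

```cpp
#include <cassert>
#include <cstdint>
#include <functional>
#include <map>
#include <tuple>
#include <utility>

using ClassId = std::uint8_t;   // assumed width, for illustration only
using MethodId = std::uint8_t;  // assumed width, for illustration only

// object = <class.num>: the class is part of the object identifier.
struct ObjectId {
    ClassId cls;
    std::uint16_t num;
};

// FU = <method.class>: the network address of one method implementation.
struct FuAddress {
    MethodId method;
    ClassId cls;
    bool operator<(const FuAddress& other) const {
        return std::tie(method, cls) < std::tie(other.method, other.cls);
    }
};

// Packet assembled for a call such as objp->method(param).
struct Packet {
    MethodId method;
    ObjectId objp;
    int param;
};

// Toy stand-in for the on-chip network: each node is one functional unit.
class OnChipNetwork {
public:
    using Fu = std::function<int(std::uint16_t num, int param)>;

    void attach(FuAddress addr, Fu fu) { nodes_[addr] = std::move(fu); }

    // Routing does the binding: <method.<class.num>> is regrouped as
    // <<method.class>.num> = <FU.num>, so the destination node is the
    // implementation that belongs to the object's class. The return value
    // models the reply packet. Throws std::out_of_range if no FU matches.
    int send(const Packet& p) const {
        const FuAddress dest{p.method, p.objp.cls};
        return nodes_.at(dest)(p.objp.num, p.param);
    }

private:
    std::map<FuAddress, Fu> nodes_;
};
```

In this model the caller never inspects the object's class; it only builds the packet, and the address lookup selects the implementation. In hardware that lookup would be the router's job, which is the point of combining dispatch with packet routing.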