DSP Architectures
Additional Slides

Professor S. Srinivasan
Electrical Engineering Department
I.I.T.-Madras, Chennai 600 036

Figure 4.3(a) Block diagram of a barrel shifter
Figure 4.3(b) Implementation of a 4-bit, shift-right barrel shifter
Figure 4.5 A MAC unit with accumulator guard bits
Figure 4.6 A schematic diagram of the saturation logic
Figure 4.7 Block diagram of an arithmetic logic unit
Figure 4.9 Register pointer updating algorithm for circular buffer addressing mode: SAR = start address register contents, EAR = end address register contents, PNTR = pointer
Figure 4.10 Different cases that arise in updating the pointer in circular buffer addressing mode
Figure 4.10 (continued)
Figure 4.11 Block diagram of an address generation unit
Bit-reversal Hardware
Figure 4.12 A conceptual diagram of a program sequencer

Instruction Level Parallelism

VLIW architecture
• Each instruction specifies several operations to be done in parallel
• Advantages: simple hardware; compilers can spot ILP easily
• Disadvantages: little compatibility between generations; explicit NOPs bloat code size

Superscalar architecture
• Hardware is responsible for finding ILP in a sequential program
• Advantage: compatibility between generations
• Disadvantage: very complex hardware

Explicitly Parallel Instruction Computing (EPIC)
• Combines VLIW and superscalar architectures
• Instructions are grouped into 3 operating blocks and a template block
• Template block tells hardware if instructions can be executed in parallel
• Also gives information whether the block can be executed in parallel

ILP versus Power
• Increasing instructions / cycle:
  - Requires fewer cycles to execute a task
  - Allows a longer clock period (lower clock rate) for the same performance
  - Allows a lower supply voltage
  - And hence uses less power
• However, too many functional units and too many transitions per clock cycle increase power consumption

Low Power architecture
• Power consumed by additional circuits vs. ability to lower clock rate while maintaining performance
• Circuits must be highly utilized
• Move complexity into software
• Voltage scaling: reduce Vdd
• Clock gating: turn off the clock when the chip is not in use (applies to sub-modules of the chip also)
• VLIW is more suitable than superscalar for low power
  - VLIW is smaller for the same number of functional units
  - Compiler is better at finding parallelism than hardware
• Put multiple processors on chip rather than lots of functional units in one processor
  - Helps in running independent tasks

General Purpose Microprocessor in 2000
• GHz clock speed
• 32-bit address or more
• 32-bit bus, 128-bit instructions
• Complex MMU
• Superscalar CPU
• MMX instructions
• On-chip cache
• Single cycle execution
• 32-bit floating point ALU on board
• Very expensive
• 10s of watts of power

DSP in 2000
• Clock 100 to 200 MHz
• 16-bit fixed point or 32-bit floating point
• 16-24 bits address space
• Large on-chip and off-chip memories
• Single cycle execution of most instructions
• Harvard architecture
• Lots of special DSP instructions
• 50 mW to 2 W power
• Cheap

Future of DSP Microprocessor
• Sufficiently unique for an independent class of applications (HDD, cell phone)
• Low power consumption, low cost
• High performance within power, cost constraints (MIPS/mW, MIPS/$)
• Fixed point & floating point
• Better compilers, but users must be informed
• Hybrid DSP/GP systems
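Supplementary sketch for Figure 4.3(b), the 4-bit shift-right barrel shifter. The figure is not reproduced in this text, so the structure below is an assumption: a logarithmic arrangement in which stage k either passes the word through or shifts it right by 2^k, selected by bit k of the shift amount. The name `barrel_shift_right` and the zero-fill (logical) behaviour are illustrative choices, not taken from the slides.

```python
def barrel_shift_right(value, shift, width=4):
    """Logical right shift built from pass/shift stages, one per shift-amount bit."""
    if not 0 <= shift < width:
        raise ValueError("shift must be in the range 0..width-1")
    value &= (1 << width) - 1          # keep only `width` bits of the input
    stage = 0
    while (1 << stage) < width:
        if (shift >> stage) & 1:       # control bit for this stage is set
            value >>= (1 << stage)     # shift by 1, 2, 4, ... positions
        stage += 1
    return value

print(format(barrel_shift_right(0b1011, 1), "04b"))  # 0101
print(format(barrel_shift_right(0b1011, 3), "04b"))  # 0001
```

All stages are combinational in hardware, so the shift completes in a single cycle whatever the shift amount.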
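Supplementary sketch for Figures 4.5 and 4.6, the MAC unit with accumulator guard bits and the saturation logic. The word sizes are assumptions for illustration (16-bit operands, 32-bit products, 8 guard bits), as are the function names `saturate` and `mac`. The accumulator is modelled as clamping at its own width on every step, and the final result is saturated to the product width.

```python
def saturate(value, bits):
    """Clamp a signed integer to the range of a `bits`-wide two's-complement word."""
    hi = (1 << (bits - 1)) - 1
    lo = -(1 << (bits - 1))
    return max(lo, min(hi, value))

def mac(samples, coeffs, guard_bits=8, product_bits=32):
    """Sum of products held in an accumulator that is `guard_bits` wider than a product."""
    acc_bits = product_bits + guard_bits
    acc = 0
    for x, h in zip(samples, coeffs):
        acc = saturate(acc + x * h, acc_bits)   # accumulator clamps at its own width
    return saturate(acc, product_bits)          # final result clamped to the output word

# Four large positive products followed by four large negative ones.
samples = [32767] * 4 + [-32768] * 4
coeffs = [32767] * 8
print(mac(samples, coeffs, guard_bits=8))  # -131068 (exact sum)
print(mac(samples, coeffs, guard_bits=0))  # -2147352577 (intermediate overflow was clipped)
```

With guard bits the intermediate sum is allowed to exceed the 32-bit range and come back, so the final answer is exact. Without them the accumulator clips part-way through and the error persists.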
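Supplementary sketch for Figures 4.9 and 4.10, pointer updating in circular buffer addressing mode. The figures are not reproduced in this text, so the wrap rule below is an assumption: the pointer is advanced by a signed step, and if it leaves the range SAR..EAR the buffer length (EAR - SAR + 1) is subtracted or added once. The function name `update_pointer` is hypothetical.

```python
def update_pointer(pntr, step, sar, ear):
    """Advance PNTR by `step` (positive or negative) inside the buffer SAR..EAR."""
    size = ear - sar + 1                  # buffer length in words
    if not sar <= pntr <= ear:
        raise ValueError("pointer must start inside the buffer")
    if abs(step) > size:
        raise ValueError("step larger than the buffer is not handled by a single wrap")
    new = pntr + step
    if new > ear:                         # ran past the end: wrap to the start
        new -= size
    elif new < sar:                       # ran before the start: wrap to the end
        new += size
    return new

print(hex(update_pointer(0x102, 2, 0x100, 0x107)))   # 0x104 (no wrap)
print(hex(update_pointer(0x105, 3, 0x100, 0x107)))   # 0x100 (wrap past EAR)
print(hex(update_pointer(0x101, -3, 0x100, 0x107)))  # 0x106 (wrap below SAR)
```

The three calls correspond to the three cases a pointer update can meet: staying inside the buffer, overshooting EAR, and undershooting SAR.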
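Supplementary sketch for the bit-reversal hardware slide. Bit-reversed addressing is commonly used to reorder FFT data. The software model below reverses the low `bits` bits of an index; hardware would typically do this by wiring or reverse-carry addition rather than a loop. The function name `bit_reverse` is illustrative.

```python
def bit_reverse(index, bits):
    """Return `index` with its lowest `bits` bits in reversed order."""
    result = 0
    for _ in range(bits):
        result = (result << 1) | (index & 1)  # move the current LSB into the result
        index >>= 1
    return result

# Access order for an 8-point transform (3 address bits).
print([bit_reverse(i, 3) for i in range(8)])  # [0, 4, 2, 6, 1, 5, 3, 7]
```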
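Supplementary sketch for the "ILP versus Power" slide. It uses the standard first-order model of dynamic CMOS power, P = a · C · Vdd² · f, which the slides do not state explicitly. All numbers are hypothetical: a baseline design at 200 MHz and 1.8 V is compared with a design that issues twice as many operations per cycle, has twice the switched capacitance, and is assumed to run at 100 MHz and 1.2 V for the same throughput.

```python
def dynamic_power(capacitance, vdd, freq_hz, activity=1.0):
    """First-order dynamic power: activity * C * Vdd^2 * f (arbitrary units)."""
    return activity * capacitance * vdd ** 2 * freq_hz

# Hypothetical operating points; capacitance is in relative units.
base = dynamic_power(capacitance=1.0, vdd=1.8, freq_hz=200e6)  # 1 operation per cycle
wide = dynamic_power(capacitance=2.0, vdd=1.2, freq_hz=100e6)  # 2 operations per cycle

ratio = wide / base
print(round(ratio, 3))  # 0.444
```

In this example the doubled capacitance cancels the halved clock, and the saving comes entirely from the Vdd² term. If the extra functional units could not lower the supply voltage, or added more capacitance than they saved in cycles, the ratio would rise to 1 or above, which is the caveat on the slide.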