Using System-on-a-Chip as a
Vehicle for VLSI Design Education
Andrew Laffely and Wayne Burleson
Electrical and Computer Engineering
University of Massachusetts Amherst
{alaffely,burleson}@ecs.umass.edu
This material is based upon work supported by the National Science Foundation under Grant No. 9988238
and SRC Tasks 766 and 1075
Burleson, UMASS
Challenges in VLSI Education
• Advancing Processing Technology
• Higher level design tools
• Realistic yet tractable design projects
• Preparation for jobs in semiconductor and other sectors
• Making best use of faculty/student time and university resources
ECE 559/659: VLSI Design Project
(10 grads, 20 seniors)
Course Objectives:
• Learn design process for a complex VLSI in
deep sub-micron CMOS
• Learn VLSI design skills and tools, including
working in teams
• Learn about a particular application
component and its VLSI implementation
• Learn to present formal design reviews using
oral, written, graphical and web-based
techniques
Key Aspects of the Course
• aSoC (home-grown SoC platform)
  • Provides a unifying framework to class
  • Allows for subdivision but inter-relation of projects
  • Interesting cutting edge architecture based on NSF- and SRC-funded research at UMASS and elsewhere
  • Covers many aspects of VLSI Design
  • Realistic constraints on area, timing, power and I/O
• Graduate and undergraduate teamwork
  • Graduate students provide leadership, motivation and experience
• Commercial tools and design flow
• Review-based evaluation
  • Oral and web-based reports for 4 different reviews: proposal, feasibility, implementation, integration
Adaptive System-on-a-Chip (aSoC)
• Tiled architecture with mesh interconnect
• Allows for heterogeneous cores
  • Differing sizes, clock rates, voltages
• Low-overhead core interface for
  • Point to point communication pipeline
  • On-chip bus substitute for streaming applications
• Based on static scheduling
  • Fast and predictable
[Figure: aSoC tile array with heterogeneous cores (mProc, FPGA, Multiplier); each tile pairs a Core with a Communication Interface (ctrl) linked to its North, South, East and West neighbors]
Communication Interface
• Custom design to maximize speed and reduce power
• Core-ports
• Crossbar
• Controller
• Instruction memory
• Local frequency and voltage supply
[Figure: communication interface block diagram. North, South, East, West and Core inputs feed a Crossbar that drives the matching outputs; a Controller with PC, Instruction Memory and Decoder sets the Local Config. (example: North to South & East); Local Frequency & Voltage block]
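As a rough illustration of how a statically scheduled interface can behave, the sketch below models the instruction memory as a short list of crossbar configurations that a program counter steps through, one per cycle. The port names and the "North to South & East" entry come from the slide; the schedule format, the `step` function and the sample data are hypothetical and are not taken from the aSoC design.

```python
# Hypothetical model of a statically scheduled crossbar (not the aSoC RTL).
PORTS = ("north", "south", "east", "west", "core")

# Instruction memory: each entry maps an input port to the outputs it drives.
# Entry 0 mirrors the "North to South & East" example from the slide.
instruction_memory = [
    {"north": ("south", "east")},
    {"core": ("west",), "east": ("north",)},
]


def step(pc, inputs):
    """Apply the crossbar configuration at address pc to one cycle of inputs."""
    config = instruction_memory[pc]
    outputs = {port: None for port in PORTS}
    driven = set()
    for src, dests in config.items():
        for dest in dests:
            if dest in driven:
                raise ValueError(f"output {dest} driven twice")
            driven.add(dest)
            outputs[dest] = inputs.get(src)
    next_pc = (pc + 1) % len(instruction_memory)  # the schedule repeats
    return next_pc, outputs


pc = 0
pc, out0 = step(pc, {"north": "a0"})               # cycle 0
pc, out1 = step(pc, {"core": "b0", "east": "c0"})  # cycle 1
```

Because the routing for every cycle is fixed ahead of time, no arbitration is needed at run time, which is one way to read the "fast and predictable" claim.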
Class Projects
SoC Infrastructure [1,3]
• Communication Interface
• Interconnect [3]
• Power Distribution
• Clock System
• Power Management
Cores
• Motion estimation for video encoding [2,3]
• AES Cryptography [3]
• Cache [2,3]
• Huffman Coding
• 3D Graphics [1,2,3]
• Discrete Cosine Transform [2,3]
• Smart Card [2,3]
[1] Used in PhD Dissertation
[2] Used in Masters Thesis
[3] Used in Publications
Design Flow
http://vsp2.ecs.umass.edu/vspg/658/TA_Tools/design_flow.html
• Architecture to Layout
  • Architecture: Block diagram of system and behavioral description
  • Logic: Gate level or schematic description
  • Circuit: Transistor sizing
  • Layout: Floorplanning, clock and power distribution
• Tools
  • VerilogXL: behavioral representation
  • VTVT: standard cell library
  • Synopsys: standard cell gate level netlist generation
  • Silicon Ensemble: standard cell netlist to layout
  • Cadence LayoutPlus: schematic and layout design
  • NCSU CDK: design and extraction rules
  • Cadence Layout vs. Schematic: layout verification
  • HSPICE: circuit simulator
aSoC Implementation and Integration
• 0.18 µm TSMC technology
• Full custom
[Figure: layout with labeled dimensions of 2500 λ and 3000 λ]
Advanced Signaling Techniques
(building on SRC-funded work)
Differential current sensing
Booster Insertion
Multi-level current signaling
Phase coding
Circuit Level Simulation (HSPICE)
Evaluating Subsystems with realistic models
• Capacitance, resistance and inductance
• Process variations
• Process generations
Interconnect Characterization:
Comparing delay and power of signaling techniques for different tile sizes at 250 nm, 180 nm, 130 nm and 100 nm
Voltage Scaling Approach
• Core-ports
  • Single buffer for each stream to cross clock/voltage barrier between core and interface
  • Reading/Writing success rates indicate core utilization
    • Input blocked: Core too slow
    • Output blocked: Core too fast
• Controller
  • Interprets core-port success rates to adjust local clock and voltage
[Figure: a Core (Buffer, Processing Pipeline, Local Vdd, Local Clock) sits between an Input Core-port and an Output Core-port on the Interconnect; each core-port sends a Blocked signal to the Clock and Supply Controller]
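The control idea on this slide can be sketched as a simple policy: speed the core up when its input core-port is often blocked, and slow it down when its output core-port is often blocked. This is only a minimal sketch. The threshold, the function name `next_level` and the list of operating points are assumptions, and the pairing of relative speeds with voltages is inferred from the Vdd and power distribution slides rather than stated here.

```python
# Hypothetical controller policy; threshold and names are illustrative only.
# (relative clock speed, Vdd in volts), slowest first. The pairing of speeds
# with voltages is an assumption based on the later Vdd slides.
OPERATING_POINTS = [(0.125, 0.6), (0.25, 0.73), (0.5, 1.16), (1.0, 1.8)]


def next_level(level, input_blocked_rate, output_blocked_rate, threshold=0.2):
    """Return the new operating-point index for a core.

    input_blocked_rate:  fraction of cycles the input core-port was blocked
                         (core too slow, so step the clock and Vdd up).
    output_blocked_rate: fraction of cycles the output core-port was blocked
                         (core too fast, so step the clock and Vdd down).
    """
    if input_blocked_rate > threshold:
        return min(level + 1, len(OPERATING_POINTS) - 1)
    if output_blocked_rate > threshold:
        return max(level - 1, 0)
    return level


level = next_level(1, input_blocked_rate=0.5, output_blocked_rate=0.0)
speed, vdd = OPERATING_POINTS[level]
```

A real controller would also need hysteresis and a settling time for the supply; those details are outside what the slide describes.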
Vdd Selection Criteria
• As Vdd decreases delay increases exponentially
• Use curve to match available clock frequencies to voltages
• The voltage and frequency change reduces power by 79%, 96%, and 98.7%
  • P = αC(Vdd)²f
[Figure: Normalized Core Critical Path Delay vs. Vdd. Normalized delay axis from 0 to 12 with Max Speed, 1/2 Speed, 1/4 Speed and 1/8 Speed levels marked; voltage axis from 0.4 to 2 with 0.73 and 1.16 marked]
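The quoted savings can be checked against the dynamic power equation, since the activity factor and capacitance cancel when taking a ratio. The sketch below assumes that full speed runs at 1.8 V and that the 1/2, 1/4 and 1/8 speed points use 1.16 V, 0.73 V and 0.6 V; that pairing is inferred from the plot markings and the power distribution slide, not stated outright.

```python
# Dynamic power P = alpha * C * Vdd^2 * f, taken relative to the max-speed
# point, so alpha and C cancel. The speed/voltage pairing is an assumption.
V_MAX = 1.8  # volts at max speed (assumed)
points = {
    "1/2 speed": (0.5, 1.16),
    "1/4 speed": (0.25, 0.73),
    "1/8 speed": (0.125, 0.6),
}


def power_reduction(freq_ratio, vdd, v_max=V_MAX):
    """Fractional power saving relative to running at v_max and full speed."""
    return 1.0 - (vdd / v_max) ** 2 * freq_ratio


reductions = {name: power_reduction(f, v) for name, (f, v) in points.items()}
for name, r in reductions.items():
    print(f"{name}: {100 * r:.1f}% lower power")
```

Under these assumptions the results land within a fraction of a percentage point of the 79%, 96% and 98.7% quoted on the slide.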
Clock Distribution
• Tiled architecture extends life of globally synchronous systems
• Precise H-tree implementation
• Load is small and equal at each branch
• Skew can be reduced by 70% with advanced deskew circuits [1]
[Figure: clock distribution diagram with a labeled Tile]
64 tile aSoC | 70nm | 100nm | 130nm | 180nm
Chip Area | (9.24mm)² | (13.3mm)² | (17.2mm)² | (23.8mm)²
Frequency | 5 GHz | 2 GHz | 1 GHz | 0.5 GHz
Power | 126 mW | 240 mW | 445 mW | 784 mW
Mean Skew | 41 ps | 50 ps | 92 ps | 70.6 ps
Percent Skew | 21% | 10% | 9% | 4%
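The percent skew figures appear to be the mean skew expressed as a fraction of the clock period. That reading is an assumption, and the short check below tests it against the frequency and mean skew values listed for each technology node.

```python
# Percent skew = mean skew / clock period = mean skew * frequency.
# Values are copied from the slide; the formula is an assumed interpretation.
nodes = {
    "70nm": (5e9, 41e-12),     # (frequency in Hz, mean skew in seconds)
    "100nm": (2e9, 50e-12),
    "130nm": (1e9, 92e-12),
    "180nm": (0.5e9, 70.6e-12),
}


def percent_skew(freq_hz, skew_s):
    """Mean skew as a percentage of one clock period."""
    return 100.0 * freq_hz * skew_s


skew_pct = {node: percent_skew(f, s) for node, (f, s) in nodes.items()}
for node, pct in skew_pct.items():
    print(f"{node}: {pct:.1f}% of the clock period")
```

The computed values agree with the listed percentages to within rounding, which shows that skew consumes a growing share of the cycle as frequency scales up.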
[1] S. Tam et al., “Clock Generation and Distribution for the First IA-64 Microprocessor,” IEEE JSSC, Nov. 2000
Power Distribution
• Heterogeneous cores may require multiple power supply voltages
• Tile structure enables uniform interwoven grid
• Larger grid for higher current demands
  • Reduced resistance
  • Higher capacitance
[Figure: interwoven power grid with Gnd, Vl, Vml, Vmh and Vh rails]
64 tile aSoC | Vh | Vmh | Vml | Vl
Voltage | 1.8V | 1.16V | 0.73V | 0.6V
Current per Core | 110mA | 25mA | 13mA | 7mA
Total Power | 12.1 W | 1.86 W | 607 mW | 269 mW
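The total power figures look like voltage times per-core current times 64 tiles. The sketch below checks that reading, which is an assumption about how the numbers were derived. It reproduces the Vmh, Vml and Vl totals closely; the Vh value comes out somewhat above the listed 12.1 W, possibly because the per-core current was rounded.

```python
# Total power = Vdd * current per core * number of tiles (assumed derivation).
TILES = 64
supplies = {
    "Vh": (1.8, 0.110),    # (volts, amps per core), values from the slide
    "Vmh": (1.16, 0.025),
    "Vml": (0.73, 0.013),
    "Vl": (0.6, 0.007),
}


def total_power(volts, amps_per_core, tiles=TILES):
    """Power in watts if every tile draws amps_per_core at the given voltage."""
    return volts * amps_per_core * tiles


totals = {name: total_power(v, i) for name, (v, i) in supplies.items()}
for name, watts in totals.items():
    print(f"{name}: {watts:.3f} W")
```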
Architecture Evaluation
(Motion Estimation)
• Array-based architecture
  • Pipelined ME
  • Full search
• Parameterized search window size
  • Choose 16x16 or 8x8 windows
  • Reduce power
[Figure: motion estimation block diagram with Memory, FIFOs, Address Generation Unit and Processing Element Array]
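For readers unfamiliar with full-search motion estimation, the sketch below shows the generic textbook algorithm in software: every displacement in a search range is scored with a sum of absolute differences (SAD), and the lowest score wins. It is not the pipelined hardware from the slide. The function names, the block size `n`, the range `search` and the test frames are all illustrative assumptions.

```python
# Generic full-search block matching with a sum-of-absolute-differences (SAD)
# cost. A textbook software sketch, not the pipelined hardware design.

def sad(cur, ref, bx, by, dx, dy, n):
    """SAD between the n x n block at (bx, by) in cur and its shift in ref."""
    return sum(
        abs(cur[by + y][bx + x] - ref[by + dy + y][bx + dx + x])
        for y in range(n)
        for x in range(n)
    )


def full_search(cur, ref, bx, by, n=8, search=2):
    """Try every displacement within +/-search pixels; return (dx, dy, cost)."""
    height, width = len(ref), len(ref[0])
    best = None
    for dy in range(-search, search + 1):
        for dx in range(-search, search + 1):
            # Skip candidates that fall outside the reference frame.
            if not (0 <= bx + dx <= width - n and 0 <= by + dy <= height - n):
                continue
            cost = sad(cur, ref, bx, by, dx, dy, n)
            if best is None or cost < best[2]:
                best = (dx, dy, cost)
    return best


# Tiny example: the current frame is the reference shifted right by one pixel.
ref = [[(7 * x + 13 * y) % 31 for x in range(12)] for y in range(12)]
cur = [[ref[y][max(x - 1, 0)] for x in range(12)] for y in range(12)]
motion = full_search(cur, ref, bx=2, by=2, n=8, search=2)
```

The number of SAD evaluations grows with the square of the search range, so a smaller window means less work per block, which is consistent with the slide's link between window size and power.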
Modify Existing Designs
• Take existing Verilog code or hardware and improve or change functionality (e.g. add motion estimation algorithms, provide AES key-length flexibility)
• Evaluate changes in performance and overhead
[Figure: Old PE Layout and New PE Layout]
Conclusions
• Advancing Process Technology
  • Target 0.18 µm for affordable fab but also do scaling studies
• Higher level design tools
  • Combine synthesis and custom techniques
• Realistic yet tractable design projects
  • Re-use existing projects and provide unifying themes
• Preparation for jobs in semiconductor and other sectors
  • Focus on system design and appropriate levels of abstraction
  • Teach how to learn new tools
• Making best use of faculty/student time and university resources
  • Leverage research
  • Combine grad and undergrad
  • Re-use materials, tools