Download Memory Design

Survey
yes no Was this document useful for you?
   Thank you for your participation!

* Your assessment is very important for improving the workof artificial intelligence, which forms the content of this project

Document related concepts

Random-access memory wikipedia , lookup

Transcript
Memory Design
•
•
•
•
•
Memory Types
g
Memoryy Organization
ROM design
g
RAM design
PLA design
Adapted from J. M. Rabaey, A. Chandrakasan and B. Nikolic, Digital
Integrated Circuits, 2nd ed. Copyright 2003 Prentice Hall/Pearson.
ECE 261
James Morizio
1
Semiconductor Memory
Classification
Read-Write Memory
Non-Volatile
Read-Write
Read-Only Memory
Memory
Random
Access
Non-Random
Access
EPROM
2
E PROM
SRAM
FIFO
DRAM
LIFO
Mask-Programmed
P
Programmable
bl (PROM)
FLASH
Shift Register
CAM
ECE 261
James Morizio
2
M
Memory
Timing:
Ti i
Definitions
D fi i i
Read cycle
y
READ
Read access
Read access
Write cycle
WRITE
Write access
Data valid
DATA
Data written
ECE 261
James Morizio
3
Memory
y Architecture:
Decoders
M bits
S0
M bits
S0
Word 0
S1
Word 1
S2
Word 2
Storage
cell
Word 0
A0
Word 1
A1
Word 2
A K2
words
N SN 2
2
SN 2
Word N 2 2
Decoder Word N 2 2
1
Word N 2 1
Word N 2 1
1
Storage
cell
K 5 log2N
Input-Output
Input-Output
(M bits)
(M bits)
Intuitive architecture for N x M memory Decoder reduces the number of select signals
Too many select signals:
K = log2N
N words
d == N select
l t signals
i
l
ECE 261
James Morizio
4
Array-Structured Memory Architecture
Problem: ASPECT RATIO or HEIGHT >> WIDTH
2L 2 K
Storage cell
Row Decoder
AK
Bit line
A K1 1
AL2 1
Word line
M.2K
Sense amplifiers / Drivers
A0
A K2 1
Column decoder
Amplify swing to
rail-to-rail amplitude
Selects appropriate
word
p
p
Input-Output
(M bits)
ECE 261
James Morizio
5
Hierarchical Memory Architecture
Block 0
Block i
Block P 2 1
Row
address
dd
Column
address
Block
address
Global data bus
Control
circuitry
Block selector
Global
amplifier/driver
I/O
Advantages:
1. Shorter wires within blocks
2. Block address activates only 1 block => power savings
ECE 261
James Morizio
6
R d O l Memory
Read-Only
M
Cells
C ll
BL
BL
BL
VDD
WL
WL
WL
1
BL
WL
BL
BL
WL
WL
0
GND
Diode ROM
ECE 261
MOS ROM 1
James Morizio
MOS ROM 2
7
MOS OR ROM
BL[0]
BL[1]
BL[2]
BL[3]
WL[0]
V DD
WL[1]
WL[2]
V DD
WL[3]
V bias
Pull-down loads
ECE 261
James Morizio
8
ROM Example
• 4-word x 6-bit ROM
Word 0: 010101
– Represented with dot diagram
– Dots indicate 1’s in ROM
weak
pseudo-nMOS
pullups
A1 A0
Word 1: 011001
Word 2: 100101
Word 3: 101010
2:4
DEC
ROM Arrayy
Y5
Y4
Y3
Y2
Y1
Y0
Looks like 6 4
4-input
input pseudo
pseudo-nMOS
nMOS NORs
ECE 261
James Morizio
9
MOS NOR ROM
V DD
Pull up devices
Pull-up
WL[0]
GND
WL [1]
WL [2]
GND
WL [3]
BL [0]
ECE 261
BL [1]
BL [2]
James Morizio
BL [3]
10
MOS NOR ROM Layout
Cell (9.5λ x 7λ)
Programmming using the
Active Layer
y Only
y
Polysilicon
Metal1
Diffusion
Metal1 on Diffusion
ECE 261
James Morizio
11
MOS NOR ROM Layout
Cell (11λ x 7λ)
Programmming using
the Contact Layer Only
Polysilicon
Metal1
Diffusion
Metal1 on Diffusion
ECE 261
James Morizio
12
MOS NAND ROM
V DD
Pull-up devices
BL [0]
BL[1]
BL[2]
BL[3]
WL [0]
WL [1]
WL [2]
WL [3]
All word lines high by default with exception of selected row
ECE 261
James Morizio
13
MOS NAND ROM Layout
Cell (8λ x 7λ)
Programmming
P
i using
i
the Metal-1 Layer Only
No contact to VDD or GND necessary;
drastically reduced cell size
Loss in performance compared to NOR ROM
Polysilicon
Diffusion
Metal1 on Diffusion
ECE 261
James Morizio
14
NAND ROM Layout
Cell (5λ x 6λ)
Programmming
P
i using
i
Implants Only
Polysilicon
Threshold-altering
implant
Metal1 on Diffusion
ECE 261
James Morizio
15
Decreasing
g Word Line Delayy
Driver
WL
Polysilicon word line
Metal word line
(a) Driving the word line from both sides
Metal bypass
WL
K cells
Polysilicon word line
(b) Using a metal bypass
ECE 261
James Morizio
16
Precharged
g MOS NOR ROM
f
V DD
pre
Precharge devices
WL [0]
GND
WL [1]
WL [2]
GND
WL [3]
BL [0]
BL [1]
BL [2]
BL [3]
PMOS precharge device can be made as large as necessary,
but clock driver becomes harder to design.
design
ECE 261
James Morizio
17
Read-Write Memories (RAM)
‰ STATIC (SRAM)
Data
D
t stored
t
d as llong as supply
l iis applied
li d
Large (6 transistors/cell)
Fast
Differential
‰ DYNAMIC (DRAM)
Periodic refresh required
Small (1-3 transistors/cell)
Slower
Single Ended
ECE 261
James Morizio
18
6 transistor CMOS SRAM Cell
6-transistor
WL
V DD
M2
M5
M4
Q
Q
M1
M3
BL
ECE 261
M6
BL
James Morizio
19
6T-SRAM — Layout
M2
VDD
M4
Q
Q
M1
M3
GND
M5
BL
ECE 261
M6
WL
BL
James Morizio
20
Statue of Goethe and Schiller: the German National Theater, Weimar
ECE 261
James Morizio
21
3-Transistor DRAM Cell
BL 1
BL 2
WWL
WWL
RWL
M3
X
M1
CS
M2
RWL
X
BL 1
BL 2
No constraints on device ratios
Reads are non-destructive
Value stored at node X when writing a “1” = V WWL-VTn
ECE 261
James Morizio
22
3T-DRAM — Layout
BL2
BL1
GND
RWL
M3
M2
WWL
M1
ECE 261
James Morizio
23
1-Transistor DRAM Cell
BL
Write 1
WL
Read 1
WL
M1
V DD 2 V T
X GND
CS
V DD
BL
V DD /2
V /2
sensing DD
CBL
Write: C S is charged or discharged by asserting WL and BL.
Read: Charge redistribution takes places between bit line and storage capacitance
CS
ΔV = VBL – V PRE = V BIT – V PRE -----------C S + CBL
Voltage swing is small; typically around 250 mV.
ECE 261
James Morizio
24
DRAM Cell Observations
‰ 1T
DRAM requires a sense amplifier for each bit line, due to
charge redistribution read-out.
‰ DRAM memory cells
ll are single-ended
i l
d d in
i contrast to SRAM
cells.
‰The read-out of the 1T DRAM cell is destructive; read and
refresh
f
h operations
ti
are necessary for
f correctt operation.
ti
‰ Unlike 3T cell, 1T cell requires presence of an extra
capacitance that must be explicitly included in the design.
‰ When writing a “1” into a DRAM cell, a threshold voltage is
lost. This charge loss can be circumvented by bootstrapping
the word lines to a higher value than VDD
ECE 261
James Morizio
25
1-T DRAM Cell
Capacitor
Metal word line
M 1 word
line
SiO2
Poly
n+
Field Oxide
n+
Poly
Inversion layer
induced by
plate bias
Diffused
bit line
Polysilicon
gate
Cross-section
Cross
section
Polysilicon
plate
Layout
Uses Polysilicon-Diffusion Capacitance
Expensive in Area (trend now is to use trench capacitors
ECE 261
James Morizio
26
Periphery
p y
‰ Decoders
‰ Sense Amplifiers
p
‰ Input/Output Buffers
‰ Control / Timing Circuitry
ECE 261
James Morizio
27
R D
Row
Decoders
d
Collection of 2M complex logic gates
Organized in regular and dense fashion
(N)AND Decoder
NOR Decoder
ECE 261
James Morizio
28
Hierarchical Decoders
Multi-stage implementation improves performance
•••
WL 1
WL 0
A 0A 1 A 0A 1 A 0A 1 A 0A 1
A 2A 3 A 2A 3 A 2A 3 A 2A 3
•••
NAND decoder using
2-input pre
pre--decoders
A1 A0
ECE 261
A0
A1
A3 A2
A2
A3
James Morizio
29
D namic Decoders
Dynamic
Precharge devices
GND
VDD
GND
WL3
VDD
WL 3
WL 2
WL 2
VDD
WL 1
WL 1
V DD
WL 0
WL 0
VDD φ
A0
A0
A1
A1
2-input NOR decoder
ECE 261
A0
A0
A1
A1
φ
2-input NAND decoder
James Morizio
30
4-to-1 tree based column decoder
BL 0 BL 1 BL 2 BL 3
A0
A0
A1
A1
D
Number of devices drastically reduced
Delay increases quadratically with # of sections; prohibitive for large decoders
Solutions: buffers
progressive sizing
combination
bi ti off tree
t
and
d pass transistor
t
i t approaches
h
ECE 261
James Morizio
31
PLA versus ROM
‰ Programmable Logic Array
structured approach to random logic
“two level logic implementation”
NOR-NOR (product of sums)
NAND-NAND (sum of products)
SIMILAR TO ROM
‰ Main difference
ROM: fully populated
PLA: one element per minterm
Note: Importance
p
of PLA’s has drastically
y reduced
1. slow
2. better software techniques (mutli-level logic
synthesis)
B t…
But
ECE 261
James Morizio
32
Programmable Logic Array
P
Pseudo-NMOS
d NMOS PLA
GND
GND
GND
V DD
GND
GND
GND
GND
V DD
X0
X0
X1
X1
X2
X2
AND-plane
ECE 261
f0
f1
OR-plane
James Morizio
33
Dynamic PLA
f AND
V DD
GND
f
OR
f
OR
f AND
V DD
X0
X0
X1
X1
X2
X2
AND-plane
AND
plane
ECE 261
f0
f 1 GND
OR-plane
OR
plane
James Morizio
34
PLA Layout
VDD
And-Plane
x0 x0 x1 x1 x2 x2
Pull-up devices
ECE 261
James Morizio
Or-Plane
φ
GND
f0 f1
Pull-up devices
35
CAMs
• Extension
E t i off ordinary
di
memory ((e.g. SRAM)
– Read and write memory as usual
– Also match to see which words contain a key
adr
data/key
read
CAM
match
write
ECE 261
James Morizio
36
10T CAM Cell
• Add four match transistors to 6T SRAM
– 56 x 43 λ unit cell
bit
bit b
bit_b
word
cell_b
cell
match
ECE 261
James Morizio
37
CAM Cell Operation
• Read and write like ordinary SRAM
• For matching:
Leave wordline low
Precharge matchlines
Place key on bitlines
Matchlines evaluate
address
clk
weak
miss
match0
row decoder
–
–
–
–
CAM cell
match1
match2
match3
read/write
• Miss line
column circuitry
data
– Pseudo-nMOS NOR of match lines
– Goes high if no words match
ECE 261
James Morizio
38