Digital Integrated Circuits
A Design Perspective
Jan M. Rabaey
Anantha Chandrakasan
Borivoje Nikolic
Semiconductor Memories
December 20, 2002
© Digital Integrated Circuits2nd
Memories
Chapter Overview
• Memory Classification
• Memory Architectures
• The Memory Core
• Periphery
• Reliability
• Case Studies
Semiconductor Memory Classification
Read-Write Memory
    Random access: SRAM, DRAM
    Non-random access: FIFO, LIFO, Shift Register, CAM
Non-Volatile Read-Write Memory: EPROM, E2PROM, FLASH
Read-Only Memory: Mask-Programmed, Programmable (PROM)
Memory Timing: Definitions
Read access time: the delay from the read request until the data is available at the output
Write access time: the delay from the write request until the input data is finally written into the memory
Memory Architecture: Decoders
[Figure: two organizations of N words of M bits each, with M-bit input-output. Left: one select signal per word (S0, S1, S2, ..., S_{N-2}, S_{N-1}). Right: a decoder driven by address bits A0, A1, ..., A_{K-1} generates the select signals.]
Intuitive architecture for an N x M memory
Too many select signals: N words == N select signals
A decoder reduces the number of select signals to K = log2 N
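The relation K = log2 N can be illustrated with a small behavioral sketch. It is not a circuit description; the function names `address_bits` and `decode` are made up for this example, and N is assumed to be at least 1.

```python
def address_bits(n_words: int) -> int:
    """Number of address bits K needed to select one of n_words words."""
    # (n - 1).bit_length() equals ceil(log2(n)) for n >= 1
    return (n_words - 1).bit_length()


def decode(address: int, n_words: int) -> list[int]:
    """Behavioral one-hot decoder: N select signals, exactly one of them is 1."""
    if not 0 <= address < n_words:
        raise ValueError("address out of range")
    return [1 if i == address else 0 for i in range(n_words)]


# A 1024-word memory needs 10 address lines instead of 1024 select lines.
k = address_bits(1024)
selects = decode(2, 4)
```

Here `decode(2, 4)` raises only select line S2, which is the behavior the decoder block in the figure provides in hardware.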
Array-Structured Memory Architecture
Problem: ASPECT RATIO, or HEIGHT >> WIDTH
[Figure: array-structured organization. The sense amplifiers amplify the bit-line swing to rail-to-rail amplitude; the column decoder selects the appropriate word.]
Hierarchical Memory Architecture
Advantages:
1. Shorter wires within blocks
2. Block address activates only 1 block => power savings
Block Diagram of 4 Mbit SRAM
[Figure: block diagram of a 4 Mbit SRAM, from [Hirose90]. Blocks shown: clock generator; X-, Y-, and Z-address buffers; predecoder and block selector; global, subglobal, and local row decoders; bit-line load; 128 K array blocks (Block 0 to Block 31); transfer gate; column decoder; sense amplifier and write driver; CS, WE buffer; I/O buffer; x1/x4 controller.]
X: row address
Y: column address
Z: block address
Contents-Addressable Memory
[Figure: CAM block diagram with I/O buffers (data, 64 bits), comparand, mask, commands, control logic, R/W address (9 bits), address decoder, CAM array (2^9 words x 64 bits), validity bits, and priority encoder.]
Comparand: the data pattern to be matched
Every row that satisfies the validity check (matches the comparand under the mask bits) activates its validity bit
The priority encoder passes only the valid rows and, in case of multiple matches, selects the one with the highest address
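The masked match and the priority encoding can be sketched behaviorally as follows. This is a minimal model under stated assumptions: a mask bit of 1 is taken to mean "compare this bit position" (the slide does not fix the polarity), and the names `match_flags` and `priority_encode` are hypothetical.

```python
def match_flags(rows, comparand, mask):
    """One flag per row: True when the row equals the comparand in every
    bit position where mask has a 1 (assumed polarity)."""
    return [((word ^ comparand) & mask) == 0 for word in rows]


def priority_encode(flags):
    """Return the highest address whose flag is set, or None if no row matched."""
    matches = [addr for addr, hit in enumerate(flags) if hit]
    return max(matches) if matches else None


rows = [0x12, 0x34, 0x32, 0x99]
flags = match_flags(rows, comparand=0x32, mask=0x0F)  # compare low nibble only
winner = priority_encode(flags)
```

With the low-nibble mask, rows 0 and 2 both match, and the encoder reports address 2, the highest one.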
Memory Timing: Approaches
DRAM timing: multiplexed addressing
Fewer pins are needed, since the row and column addresses are sent sequentially
The strobe signals RAS and CAS are used to read them in
SRAM timing: self-timed, complete addressing
The complete address is presented at once, with sensing circuits to detect transitions on the address bus
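The pin saving from multiplexed addressing can be made concrete with a short sketch. The choice of the upper bits as the row address is an assumption for illustration, the strobe pins themselves are not counted, and `split_address` and `address_pins` are made-up names.

```python
def split_address(address: int, row_bits: int, col_bits: int) -> tuple[int, int]:
    """Split a full address into the row part (sent first, with RAS)
    and the column part (sent second, with CAS)."""
    if not 0 <= address < (1 << (row_bits + col_bits)):
        raise ValueError("address out of range")
    row = address >> col_bits               # assumed: upper bits select the row
    col = address & ((1 << col_bits) - 1)   # lower bits select the column
    return row, col


def address_pins(row_bits: int, col_bits: int, multiplexed: bool) -> int:
    """Address pins required; multiplexing reuses the same pins for both halves."""
    return max(row_bits, col_bits) if multiplexed else row_bits + col_bits


row, col = split_address(0b10110110, row_bits=4, col_bits=4)
```

For a 20-bit address split evenly, `address_pins(10, 10, True)` gives 10 pins against 20 for complete addressing.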
Read-Only Memory Cells
[Figure: ROM cells storing 1 and 0 for three implementations: Diode ROM, MOS ROM 1, MOS ROM 2.]
The bit line (BL) is resistively clamped to ground, so its default value is 0
Diode disadvantage: no electrical isolation between bit and word lines
Alternatively, BL is resistively clamped to VDD, so its default value is 1
MOS OR ROM
[Figure: 4 x 4 MOS OR ROM with word lines WL[0] to WL[3], bit lines BL[0] to BL[3], cell transistors tied to VDD, and pull-down loads biased by V_bias.]
MOS NOR ROM
[Figure: 4 x 4 MOS NOR ROM with pull-up devices to VDD, word lines WL[0] to WL[3], bit lines BL[0] to BL[3], and GND lines.]
MOS NOR ROM Layout
Cell (9.5λ x 7λ)
Programming using the active layer only: diffusion is added to create transistors
[Figure: layout with polysilicon, metal1, diffusion, and metal1-on-diffusion layers and GND rails; the lines labeled VDD are connected to VDD through pMOS devices.]
MOS NOR ROM Layout
Cell (11λ x 7λ)
Programming using the contact layer only: contacts are added to create transistors
[Figure: layout with polysilicon, metal1, diffusion, and metal1-on-diffusion layers and GND rails; the lines labeled VDD are connected to VDD through pMOS devices.]
MOS NAND ROM
[Figure: 4 x 4 MOS NAND ROM with pull-up devices to VDD, bit lines BL[0] to BL[3], and word lines WL[0] to WL[3].]
All word lines are high by default, with the exception of the selected row
MOS NAND ROM Layout
Cell (8λ x 7λ)
Programming using the metal-1 layer only: metal is added to eliminate a transistor (short circuit)
No horizontal contact to GND necessary; drastically reduced cell size
Loss in performance compared to NOR ROM
[Figure: layout with polysilicon, diffusion, and metal1-on-diffusion layers.]
NAND ROM Layout
Cell (5λ x 6λ)
Programming using implants only: a threshold-altering implant switches the transistor on permanently, thus eliminating it from the ROM
This results in a very small layout area (2 times smaller than the NOR cell)
[Figure: layout with polysilicon, threshold-altering implant, and metal1-on-diffusion layers.]
Equivalent Transient Model for MOS NOR ROM
[Figure: model for NOR ROM with VDD, BL, WL, r_word, c_word, and C_bit.]
Word line parasitics
• Wire capacitance and gate capacitance
• Wire resistance (polysilicon)
Bit line parasitics
• Resistance not dominant (metal)
• Drain and gate-drain capacitance
Equivalent Transient Model for MOS NAND ROM
[Figure: model for NAND ROM with VDD, BL, WL, C_L, r_bit, c_bit, r_word, and c_word.]
Word line parasitics
• Similar to NOR ROM
Bit line parasitics
• Resistance of cascaded transistors dominates
• Drain/source and complete gate capacitance
Decreasing Word Line Delay
(a) Driving the word line from both sides [Figure: drivers at both ends of the polysilicon word line, fed by a metal word line.]
(b) Using a metal bypass [Figure: metal bypass connected to the polysilicon word line every K cells.]
(c) Use silicides
Precharged MOS NOR ROM
[Figure: 4 x 4 NOR ROM with precharge devices to VDD clocked by φ_pre, word lines WL[0] to WL[3], bit lines BL[0] to BL[3], and GND lines.]
The PMOS precharge device can be made as large as necessary, but the clock driver becomes harder to design.
Non-Volatile Memories
The Floating-gate Avalanche-injection MOS (FAMOS) transistor
[Figure: device cross-section (source, drain, gate, and floating gate over a p substrate with n+ regions, separated by oxide of thickness t_ox) and schematic symbol (D, G, S).]
Floating-Gate Transistor Programming
[Figure: device cross-sections for the three steps below, annotated with the applied terminal voltages and the resulting floating-gate potential.]
Avalanche injection: with the 20 V programming voltages applied, hot electrons go through the oxide and are trapped, reducing the floating-gate voltage
Removing the programming voltage leaves the charge trapped, with the floating gate at a negative potential (-5 V)
Programming results in a higher V_T. Typically the new threshold is around 7 V, so a 5 V supply is not sufficient to turn the transistor on, and the device is disabled
A “Programmable-Threshold” Transistor
Shifted threshold effectively
switches transistor off permanently
The shift in the threshold depends on the charge injected onto the floating gate:
$\Delta V_T = -\Delta Q_{FG} / C_{FG}$
Injected charges are well insulated by silicon dioxide and can
stay there with power off for many years.
Floating gate used in almost all nonvolatile memories (EPROM, EEPROM, Flash)
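As a rough numerical illustration of the relation $\Delta V_T = -\Delta Q_{FG} / C_{FG}$, the sketch below estimates how many electrons must be trapped for a given threshold shift. The 1 fF floating-gate capacitance and the 4 V shift are assumed example values, not figures from the slides, and the function names are hypothetical.

```python
ELECTRON_CHARGE = 1.602e-19  # coulombs


def threshold_shift(delta_q_fg: float, c_fg: float) -> float:
    """Threshold shift in volts for a change in floating-gate charge (coulombs).
    Injected electrons make delta_q_fg negative, so the shift is positive."""
    return -delta_q_fg / c_fg


def electrons_for_shift(delta_vt: float, c_fg: float) -> float:
    """Number of trapped electrons needed to raise the threshold by delta_vt."""
    return delta_vt * c_fg / ELECTRON_CHARGE


C_FG = 1e-15      # assumed example value: 1 fF
n_electrons = electrons_for_shift(4.0, C_FG)   # assumed 4 V shift
shift = threshold_shift(-n_electrons * ELECTRON_CHARGE, C_FG)
```

Under these assumptions the count comes out in the tens of thousands of electrons, which gives a feel for how little charge defines the stored state.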
Erasable-Programmable
Read-Only Memory (EPROM)
• EPROM is erased by shining ultraviolet light through a transparent package window
• UV makes the oxide slightly conductive by generating electron-hole pairs
• The erasure process is slow (seconds to minutes)
• Programming takes 5-10 msec
• Erasing can be repeated up to a thousand times
• The threshold is difficult to control after many erasures, so special on-chip circuitry is required to control it
• High current is required during programming
Floating Gate Tunneling Oxide
FLOTOX Transistor EEPROM
[Figure: FLOTOX transistor cross-section (gate, floating gate, source, drain, n+ regions in a p substrate; 20–30 nm gate oxide with a 10 nm tunneling oxide) and the Fowler-Nordheim tunneling I-V characteristic (current I versus V_GD, marked at -10 V and 10 V).]
A smaller voltage is required to program, and programming is reversible by changing the sign of the voltage
EEPROM Cell
[Figure: two-transistor EEPROM cell with BL, WL, and VDD.]
Absolute threshold control is hard
A programmed transistor might be in depletion mode, making it hard to turn off with the word-line signal
2-transistor cell
The design is larger than the EPROM device
Thin oxide is hard to make
Erase-program can be repeated 10^5 times
Flash EEPROM
• Flash EPROM combines the density of EPROM with the versatility of EEPROM
• Programming is performed by avalanche hot-electron injection (fast, 1-10 msec)
• Erasure of the complete chip is done using tunneling (careful control, >100 ms)
• No extra access transistor needed
[Figure: cell cross-section with control gate, floating gate, thin tunneling oxide (~10 nm), n+ source, n+ drain, and p- substrate. Erasure: 12 V on source. Programming: 12 V on gate.]
Cross-sections of NVM cells
[Figure: cross-sections of Flash and EPROM cells. Courtesy Intel]
Basic Operations in a NOR Flash Memory: Erase
All transistors are erased
Read back and repeat if erasure is still needed
Basic Operations in a NOR Flash Memory: Write
Programming requires 12 V at the gate and 6 V at the drain, with the source at 0 V
Basic Operations in a NOR Flash Memory: Read
In a read operation, the programmed transistor stores 1, as it is switched off permanently
NOR Flash memories have
• Fast random read time
• Slow erasure and programming time
• Need for precise control of thresholds
NAND Flash Memory
[Figure: NAND flash string and unit-cell cross-section: select transistors, word line (poly), gate, ONO, floating gate (FG), gate oxide, source line (diffusion layer). Courtesy Toshiba]
High-dielectric material (oxide-nitride-oxide, ONO) for large C_FG
Large storage density, lower cost
Fast programming, 200-400 nsec
Fast serial access
NAND Flash Memory
All contacts between word lines are eliminated, resulting in a cell 40% smaller than the NOR structure
During erasure, all transistors are put in depletion mode by setting 20 V at the source
During programming, the selected word line is set to 20 V to store “1”, increasing its threshold
During read, both select transistors are enabled and the read proceeds as in the NAND ROM
[Figure: layout with select transistor, active area, STI, bit line contact, word lines, and source line contact. Courtesy Toshiba]
New Nonvolatile Memories
FRAM: Ferroelectric RAM
• Uses programmable capacitors
• Dielectric crystals polarize under an electric field
• Very high density, many read/write cycles
• Low power
MRAM: Magnetoresistive RAM
• Similar to magnetic core memories
• Spin electronics or tunneling magnetic resistance
Characteristics of State-of-the-art NVM
Read-Write Memories (RAM)
• STATIC (SRAM)
    Data stored as long as supply is applied
    Large (6 transistors/cell)
    Fast
    Differential
• DYNAMIC (DRAM)
    Periodic refresh required
    Small (1-3 transistors/cell)
    Slower
    Single-ended
6-transistor CMOS SRAM Cell
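[Figure: 6-transistor CMOS SRAM cell: cross-coupled inverters M1/M2 and M3/M4 between VDD and ground hold Q and Q_bar; access transistors M5 and M6, gated by WL, connect the cell to BL and BL_bar.]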
CMOS SRAM Analysis (Read)
[Figure: simplified cell during a read, with Q = 1 and Q_bar = 0, both bit lines (capacitance C_bit) precharged to VDD, and transistors M1, M4, M5, M6.]
Initially BL and BL_bar are precharged. Assume that the cell stores a 1
When WL goes high, BL_bar is discharged
The voltage rise at Q_bar must be kept below the nMOS threshold (0.4 V) to avoid flipping the cell
CMOS SRAM Analysis (Read)
[Figure: voltage rise (V) versus cell ratio (CR). Design for a cell ratio in the green zone.]
CMOS SRAM Analysis (Write)
[Figure: simplified cell during a write, with WL high, one bit line held at 1 and the other driven to 0; the cell initially holds Q = 1; transistors M1, M4, M5, M6.]
CMOS SRAM Analysis (Write)
Must pull down cell voltage VQ below the nMOS threshold (0.4V)
6T-SRAM — Layout
[Figure: 6T SRAM layout showing VDD, GND, M1 to M6, Q, Q_bar, BL, BL_bar, and WL.]
Resistance-load SRAM Cell
[Figure: resistance-load SRAM cell: load resistors R_L to VDD, transistors M1 to M4, word line WL, storage nodes Q and Q_bar, bit lines BL and BL_bar.]
For large resistor values, use undoped polysilicon with sheet resistance in the TΩ/sq range
An alternative solution uses low-quality parasitic thin-film PMOS (TFT) devices with an OFF current of 10^-13 A
Static power dissipation: want R_L large
Bit lines are precharged to VDD to address the t_p problem
Static CAM Memory Cell
Incoming data on Bit and Bit_bar are compared with the stored data S and S_bar
The match line is precharged high (wired-NOR match line)
If there is a match, the internal signal int is grounded and Match stays high
Otherwise int is pulled high and Match goes low
[Figure: array of CAM cells sharing Bit, Bit_bar, Word, and Match lines, and the static CAM cell with transistors M1 to M9.]
SRAM Characteristics
3-Transistor DRAM Cell
[Figure: 3T DRAM cell (M1, M2, M3, storage capacitance C_S at node X, write word line WWL, read word line RWL, bit lines BL1 and BL2) and waveforms: X reaches V_DD - V_T, BL1 swings to V_DD, and BL2 drops by ΔV from V_DD - V_T.]
No constraints on device ratios
Reads are non-destructive
Value stored at node X when writing a “1” = V_WWL - V_Tn
3T-DRAM — Layout
[Figure: 3T DRAM layout with BL1, BL2, GND, RWL, WWL, M1, M2, and M3.]
1-Transistor DRAM Cell
[Figure: 1T cell with storage capacitance C_S.]
The bit line is precharged
Write: C_S is charged or discharged by asserting WL and BL
Read: charge redistribution takes place between the bit line and the storage capacitance:
$\Delta V = V_{BL} - V_{PRE} = (V_{BIT} - V_{PRE}) \frac{C_S}{C_S + C_{BL}}$
The voltage swing is small, typically around 250 mV
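The charge-redistribution read-out can be evaluated numerically with a short sketch. The component values below (50 fF cell, 1 pF bit line, 1.25 V precharge, 1.9 V stored for a 1) are assumed for illustration and are not taken from the slides; `bitline_swing` is a made-up name.

```python
def bitline_swing(v_bit: float, v_pre: float, c_s: float, c_bl: float) -> float:
    """Bit-line voltage change after the cell shares charge with the bit line:
    dV = (V_BIT - V_PRE) * C_S / (C_S + C_BL)."""
    return (v_bit - v_pre) * c_s / (c_s + c_bl)


# Assumed example values
C_S = 50e-15    # storage capacitance, farads
C_BL = 1e-12    # bit-line capacitance, farads
V_PRE = 1.25    # precharge voltage, volts

dv_zero = bitline_swing(0.0, V_PRE, C_S, C_BL)  # cell stores a 0
dv_one = bitline_swing(1.9, V_PRE, C_S, C_BL)   # cell stores a 1
```

With a bit line twenty times larger than the cell, the swing is only tens of millivolts in either direction, which is why every bit line needs a sense amplifier.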
DRAM Cell Observations
• 1T DRAM requires a sense amplifier for each bit line, due to the charge-redistribution read-out.
• DRAM memory cells are single-ended, in contrast to SRAM cells.
• The read-out of the 1T DRAM cell is destructive; read and refresh operations are necessary for correct operation.
• Unlike the 3T cell, the 1T cell requires an extra capacitance that must be explicitly included in the design.
• When writing a “1” into a DRAM cell, a threshold voltage is lost. This charge loss can be circumvented by bootstrapping the word lines to a value higher than VDD.
Sense Amp Operation
[Figure: bit-line voltage V_BL versus time t during a read. Starting from V_PRE, the bit line moves by ΔV(1) once the word line is activated, then swings to V(1) or V(0) once the sense amp is activated.]
1-T DRAM Cell
[Figure: cross-section (metal word line, SiO2, field oxide, polysilicon gate and plate, n+ regions, inversion layer induced by plate bias forming the capacitor) and layout (M1 word line, diffused bit line, polysilicon gate, polysilicon plate).]
Uses polysilicon-diffusion capacitance
Expensive in area
Poly-diffusion capacitor 1T-DRAM
Advanced 1T DRAM Cells
[Figure: trench cell (cell plate Si, capacitor insulator, refilling poly, storage node poly, Si substrate, 2nd field oxide) and stacked-capacitor cell (word line, insulating layer, cell plate, capacitor dielectric layer, transfer gate, isolation, storage electrode).]
CAM in Cache Memory
[Figure: cache organization with a CAM array and an SRAM array, address decoder, input drivers, sense amps / input drivers, and hit logic; signals Address, Tag, Hit, R/W, and Data.]
Cache memory is used to store frequently accessed data to lower the memory access time and power. In a cache, the CAM stores addresses and the SRAM stores data. Once the address of the requested data matches one in the CAM, the Hit signal goes high and the data is read from the SRAM; otherwise the slow external memory must be read.
Periphery
• Decoders
• Sense Amplifiers
• Input/Output Buffers
• Control / Timing Circuitry
Row Decoders
Collection of 2^M complex logic gates
Organized in a regular and dense fashion
(N)AND decoder (NAND followed by an inverter):
$WL_0 = \bar{A}_0 \bar{A}_1 \bar{A}_2 \bar{A}_3 \bar{A}_4 \bar{A}_5 \bar{A}_6 \bar{A}_7$ (set by bits 00000000)
$WL_{255} = A_0 A_1 A_2 A_3 A_4 A_5 A_6 A_7$ (set by bits 11111111)
NOR decoder:
$WL_0 = \overline{A_0 + A_1 + A_2 + A_3 + A_4 + A_5 + A_6 + A_7}$
$WL_{255} = \overline{\bar{A}_0 + \bar{A}_1 + \bar{A}_2 + \bar{A}_3 + \bar{A}_4 + \bar{A}_5 + \bar{A}_6 + \bar{A}_7}$
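The AND form and the NOR form of a row decoder are two ways of writing the same function (De Morgan). The sketch below checks this exhaustively for small address widths. It is a behavioral model only; treating A0 as the least significant address bit is an assumption, and the function names are hypothetical.

```python
from itertools import product


def and_decoder(bits, row):
    """Word line `row` as the AND of true or complemented address bits."""
    return all(bits[i] == ((row >> i) & 1) for i in range(len(bits)))


def nor_decoder(bits, row):
    """Word line `row` as the NOR of the literals that must all be 0:
    A_i where the row needs a 0, NOT A_i where the row needs a 1."""
    terms = [
        bits[i] if ((row >> i) & 1) == 0 else 1 - bits[i]
        for i in range(len(bits))
    ]
    return not any(terms)


def forms_agree(k):
    """True when both forms select the same word line for every k-bit address."""
    return all(
        and_decoder(bits, row) == nor_decoder(bits, row)
        for bits in product((0, 1), repeat=k)
        for row in range(2 ** k)
    )


ok = forms_agree(3)
```

Either form raises exactly one word line per address; the choice between them is a circuit-level question of speed and layout rather than of logic.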
Hierarchical Decoders
Multi-stage implementation improves performance
[Figure: NAND decoder using 2-input pre-decoders. The pairs (A0, A1) and (A2, A3) are each predecoded into their four combinations, which drive the gates for WL0, WL1, ...]
$WL_0 = \bar{A}_0 \bar{A}_1 \bar{A}_2 \bar{A}_3 \bar{A}_4 \bar{A}_5 \bar{A}_6 \bar{A}_7 = \overline{(A_0 + A_1)}\,\overline{(A_2 + A_3)}\,\overline{(A_4 + A_5)}\,\overline{(A_6 + A_7)}$
Dynamic Decoders
[Figure: dynamic decoders with precharge devices clocked by φ, each producing WL0 to WL3 from A0, A1 and their complements. Left: 2-input NOR decoder, identical to the NOR ROM. Right: 2-input NAND decoder, identical to the NAND ROM.]
4-input pass-transistor based Column Decoder
[Figure: a 2-input NOR decoder driven by A0 and A1 generates select signals S0 to S3, which gate pass transistors connecting BL0 to BL3 to the data line D.]
Advantages: speed (t_pd does not add to the overall memory access time); only one extra transistor in the signal path
Disadvantage: large transistor count
4-to-1 tree based Column Decoder
[Figure: BL0 to BL3 routed to D through a binary tree of pass transistors controlled by A0, A0_bar, A1, and A1_bar.]
Number of devices drastically reduced
Delay increases quadratically with the number of sections; prohibitive for large decoders
Solutions:
• buffers
• progressive sizing
• combination of tree and pass-transistor approaches
Decoder for circular shift-register
[Figure: chain of clocked stages (clocks φ and φ_bar, reset R) generating WL0, WL1, WL2, ...]
In serial-access memories, the read or write address changes sequentially
Only one WL_i is active (a pointer), and it shifts by one position with clock φ
R is used for reset
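The pointer behavior of such a decoder can be modeled in a few lines. This is a behavioral sketch, not the circuit: the class name is made up, and the assumption that reset places the pointer on WL0 is ours.

```python
class ShiftRegisterDecoder:
    """Circular one-hot pointer: exactly one word line is active at a time."""

    def __init__(self, n_words: int):
        self.n_words = n_words
        self.reset()

    def reset(self) -> None:
        """Assumed reset state: the pointer sits on WL0."""
        self.word_lines = [1] + [0] * (self.n_words - 1)

    def clock(self) -> None:
        """One clock tick rotates the pointer to the next word line."""
        self.word_lines = [self.word_lines[-1]] + self.word_lines[:-1]

    def active(self) -> int:
        """Index of the currently selected word line."""
        return self.word_lines.index(1)


dec = ShiftRegisterDecoder(4)
visited = []
for _ in range(5):
    visited.append(dec.active())
    dec.clock()
```

After four ticks on a 4-word memory the pointer wraps back to WL0, so no address decoding logic is needed for sequential access.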
Sense Amplifiers
$t_p = \frac{C \cdot \Delta V}{I_{av}}$
C is large and I_av is small, so make ΔV as small as possible
Idea: use a sense amplifier, which turns a small input transition into a full output swing
Differential Sense Amplifier
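[Figure: differential sense amplifier with input transistors M1 and M2 (inputs bit and bit_bar), load transistors M3 and M4, current source M5 enabled by SE, and output Out.]
Directly applicable to SRAMs
Gain: $A_{sense} = -g_m (r_{o2} \| r_{o4})$, where g_m is the transconductance of the input transistors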
Differential Sensing ― SRAM
Precharge the bit lines by pulling PC_bar low
Disable precharge to read
Set SE to activate the sense amplifier
[Figure: (a) SRAM sensing scheme: bit lines BL and BL_bar with precharge (PC) and equalization (EQ) devices, SRAM cell i on WL_i, and a differential sense amp enabled by SE with inputs x, x_bar and output y. (b) Two-stage differential amplifier built from transistors M1 to M5, with inputs x, x_bar from the memory, enable SE, and Output.]
Latch-Based Sense Amplifier (DRAM)
[Figure: cross-coupled latch between BL and BL_bar with equalization device EQ, supply VDD, and enable signals SE and SE_bar.]
Initialized at its metastable point with EQ
Once an adequate voltage gap is created, the sense amp is enabled with SE
Positive feedback quickly forces the output to a stable operating point
Charge-Redistribution Amplifier
[Figure: concept (C_large at node V_L and C_small at node V_S, separated by M1 with gate at V_ref; devices M2 and M3; input V_in) and transient response.]
V_S is precharged to VDD and V_L to V_ref - V_th, so M1 is cut off
When M2 pulls down, M1 conducts and V_S is quickly lowered to equalize with V_L
Charge-Redistribution Amplifier: EPROM
[Figure: EPROM array cell M1 on WL and BL (capacitance C_BL), column decoder device M2 gated by WLC (C_col), cascode device M3 biased by V_casc, load M4 enabled by SE and tied to VDD, and output node Out (C_out).]
Single-to-Differential Conversion
[Figure: a cell on WL and BL drives input x of a differential sense amplifier; V_ref drives x_bar; the amplifier produces Output.]
How to make a good V_ref for the differential amplifier?
It must deliver x_bar smaller than x if a “1” is stored, and larger than x if a “0” is stored
Open bitline architecture with
dummy cells
[Figure: sense amplifier (enabled by SE) between the left and right bit-line halves BLL and BLR, with equalization EQ; word lines L0, L1 on the left and R0, R1 on the right access cells with capacitance C_S; dummy word lines L and L_bar access one dummy cell on each half.]
The bit line is divided into left and right halves to reduce capacitance
Dummy cells are added for reference
When the EQ signal is raised, BLL and BLR are precharged to VDD/2, and L and L_bar are enabled, charging the dummy cells to VDD/2
When reading word L_i, activate both L_i and L
When reading word R_i, activate both R_i and L_bar
DRAM Read Process with Dummy Cell
[Figure: simulated waveforms, V (0 to 3) versus t (0 to 3 ns): BL and BL_bar when reading 0, BL and BL_bar when reading 1, and the control signals EQ, WL, and SE.]
Voltage Down Converter
Memories require several different voltage levels:
• Boosted word-line voltage, VDD + V_tn
• Precharge voltage, VDD/2
• Reduced internal VDD power supply
• Negative substrate bias voltage
[Figure: voltage down converter (amplifier biased by V_bias comparing V_REF with V_DL and driving the PMOS M_drive from VDD; currents I_R, I_L, I_L - I_R) and its equivalent model.]
This circuit delivers the reference voltage to V_DL
When V_DL < V_REF, the gate of the PMOS drive transistor is discharged, increasing V_DL
When V_DL > V_REF, the gate of the PMOS drive transistor is charged, decreasing V_DL
Charge Pump
Transistors M1 and M2 are connected as diodes
The charge stored on the capacitor is Q = C_pump (VDD - V_T)
When node B rises above V_load by more than a threshold, V_load increases
The maximum load voltage V_load rises to 2(VDD - V_T), which is higher than VDD
DRAM Timing
Total of 24 timing constraints must be observed
RDRAM Architecture
A very high-speed packet transfer protocol is used to transfer large amounts of data
A narrow bus uses several clock cycles to transfer the data
[Figure: RDRAM architecture: bus, clocks, data bus, row and column demux / packet decoders, network mux/demux, and a k x l memory array.]
Address Transition Detection
SRAM memories are triggered by events detected by ATD circuits
[Figure: each address input A0, A1, ..., A_{N-1} is combined with a copy of itself delayed by t_d; the resulting pulses are merged onto a common ATD line pulled up to VDD.]
Reliability and Yield
Trends in DRAM Parameters
[Figure: trends in DRAM parameters versus memory capacity (4K to 64G bits/chip), log scale, from [Itoh01]. Bit-line capacitance C_D (fF), voltage swing V_smax (mV), storage charge Q_S (fC), storage capacitance C_S (fF), and power supply voltage V_DD (V) are all reduced in new technologies.]
$Q_S = C_S V_{DD} / 2$
$V_{smax} = Q_S / (C_S + C_D)$
Open Bit-line Architecture —Cross Coupling
[Figure: open bit-line architecture with a sense amplifier between BL and BL_bar (each with capacitance C_BL), word lines WL0 and WL1, dummy word lines WL_D, equalization EQ, cell capacitors C, and word-line-to-bit-line coupling capacitances C_WBL.]
Selecting a word line creates charge redistribution on the bit line; the coupled noise is $V_{WL} \frac{C_{WBL}}{C_{WBL} + C_{BL}}$
The dummy line is selected to compensate for the charge redistribution
If both sides of the memory array were symmetrical, the injected bit-line noise would be compensated as a common-mode signal for the sense amplifiers
Folded-Bitline Architecture
[Figure: folded-bitline architecture: BL and BL_bar run side by side into the sense amplifier (nodes x, y, equalization EQ); word lines WL0, WL1 and dummy word lines WL_D cross both; capacitances C_BL and C_WBL.]
Better matching of the BL and BL_bar capacitances, so the cross-coupling noise can be suppressed
Transposed-Bitline Architecture
[Figure: sense amplifier SA on BL and BL_bar, with cross-coupling capacitances C_cross to the neighboring bit lines BL' and BL''.]
(a) Straightforward bit-line routing
(b) Transposed bit-line architecture: equalizes the cross-coupling noise on BL and BL_bar
Sources of Power Dissipation in Memories
$I_{DD} = \sum C_i \Delta V_i f + \sum I_{DCP}$
[Figure: chip between V_DD and V_SS with an array of n rows and m columns, row decoder, column decoder, and periphery. From [Itoh00]]
Current components:
• selected cells: m i_act
• non-selected cells: m(n - 1) i_hld
• row decoder: n C_DE V_INT f
• column decoder: m C_DE V_INT f
• periphery: C_PT V_INT f and I_DCP
i_act: active selector current
i_hld: data retention current
C_DE: decoder capacitance
C_PT: peripheral capacitance
I_DCP: static peripheral current
V_INT: internal supply voltage
Noise Sources in 1T DRAM
[Figure: 1T cell with its noise sources: word-line coupling C_WBL, coupling C_cross to the adjacent BL, substrate noise, leakage from the storage electrode (C_S), and α-particles.]
Capacitance C_S must be above 30 fF; otherwise a single α-particle can destroy its charge
Free neutrons from cosmic rays carry 10 times more charge than alpha particles
Memories are covered with polyimide to protect against alpha radiation
Error correction codes are used to correct most failures
Alpha-particles
[Figure: α-particle track through a cell (WL, BL, V_DD, n+, SiO2), generating electron-hole pairs in the substrate.]
Alpha particles have an energy of 8-9 MeV and penetrate silicon to a depth of up to 10 µm
One particle generates about 1 million electron-hole pairs
This is comparable with the charge stored on a 50 fF capacitance at 3.5 V
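The comparison between the generated charge and the stored charge can be checked with a few lines of arithmetic. The sketch only uses the figures quoted on the slide plus the electron charge; the function names are made up.

```python
ELECTRON_CHARGE = 1.602e-19  # coulombs


def stored_charge(capacitance: float, voltage: float) -> float:
    """Charge on the storage capacitor, Q = C * V (coulombs)."""
    return capacitance * voltage


def generated_charge(pairs: float) -> float:
    """Charge carried by the electrons of `pairs` electron-hole pairs (coulombs)."""
    return pairs * ELECTRON_CHARGE


q_cell = stored_charge(50e-15, 3.5)    # 50 fF at 3.5 V
q_alpha = generated_charge(1e6)        # ~1 million pairs per particle
ratio = q_alpha / q_cell
```

The stored charge is 175 fC and a single hit can generate about 160 fC, so the two are indeed of the same order.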
Memory Yield
Yield curves at different stages of process maturity
(from [Veendrick92])
Memory yield problem is fought using redundancy and error correction
Redundancy
[Figure: memory array with redundant rows and redundant columns; the row address and column address reach the row decoder and column decoder through a fuse bank.]
Error-Correcting Codes
Example: Hamming Codes
[Figure: Hamming code example with parity-check equations. E.g., if B3 is wrong, the checks evaluate to 1, 1, 0, which gives 3.]
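A single-error-correcting Hamming code can be sketched as follows. The 7-bit layout P1 P2 B3 P4 B5 B6 B7 (parity bits at positions 1, 2, and 4) is an assumption chosen so that an error in B3 yields position 3, as in the example; the function names are hypothetical.

```python
def encode(b3: int, b5: int, b6: int, b7: int) -> list[int]:
    """Build the 7-bit word [P1, P2, B3, P4, B5, B6, B7] with even parity."""
    p1 = b3 ^ b5 ^ b7      # covers positions 1, 3, 5, 7
    p2 = b3 ^ b6 ^ b7      # covers positions 2, 3, 6, 7
    p4 = b5 ^ b6 ^ b7      # covers positions 4, 5, 6, 7
    return [p1, p2, b3, p4, b5, b6, b7]


def syndrome(word: list[int]) -> tuple[int, int, int]:
    """Evaluate the three parity checks; all zero means no single-bit error."""
    w = [0] + list(word)   # shift to 1-based positions
    s1 = w[1] ^ w[3] ^ w[5] ^ w[7]
    s2 = w[2] ^ w[3] ^ w[6] ^ w[7]
    s4 = w[4] ^ w[5] ^ w[6] ^ w[7]
    return s1, s2, s4


def correct(word: list[int]) -> tuple[list[int], int]:
    """Flip the bit the syndrome points at; returns (fixed word, position or 0)."""
    s1, s2, s4 = syndrome(word)
    position = s1 + 2 * s2 + 4 * s4
    fixed = list(word)
    if position:
        fixed[position - 1] ^= 1
    return fixed, position


good = encode(1, 0, 1, 1)
bad = list(good)
bad[2] ^= 1                # corrupt B3 (position 3)
fixed, position = correct(bad)
```

Corrupting B3 makes the first two checks fail and the third pass, so the syndrome reads as binary 3 and the decoder flips that bit back.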
Redundancy and Error Correction
Data Retention in SRAM
SRAM leakage increases with technology scaling, yet for a 64-Gb memory the leakage current should be less than 3.5 aA at 25 °C
• Reduce leakage by turning off unused memory blocks (caches)
• Increase thresholds by using body biasing
• Increase resistance in the leakage path
• Lower the supply voltage
• Lower the junction temperature
• Scale down the refresh period
[Figure: I_leakage (A, 100n to 1.30u) versus VDD (0 to 1.8 V) for 0.18 µm CMOS and 0.13 µm CMOS, with a factor of 7 between the two curves.]
Suppressing Leakage in SRAM
[Figure: two techniques. Inserting extra resistance: SRAM cells sit between V_DD,int and V_SS,int, which connect to the rails through transistors controlled by sleep (labeled low-threshold transistor). Reducing the supply voltage: the sleep signal switches V_DD,int between V_DD and a lower V_DDL.]
Data Retention in DRAM
[Figure: estimated current distribution for DRAM generations, showing active current and data retention (standby) current. From [Itoh00]]
Case Studies
• Programmable Logic Array
• SRAM
• Flash Memory
PLA versus ROM
• Programmable Logic Array
    structured approach to random logic
    “two-level logic implementation”
    NOR-NOR (product of sums)
    NAND-NAND (sum of products)
    IDENTICAL TO ROM!
• Main difference
    ROM: fully populated
    PLA: one element per minterm
Note: the importance of PLAs has drastically reduced
1. slow
2. better software techniques (multi-level logic synthesis) for random logic design
Programmable Logic Array
Pseudo-NMOS PLA
[Figure: AND-plane with inputs X0, X0_bar, X1, X1_bar, X2, X2_bar and OR-plane with outputs f0, f1; pull-ups to VDD and GND lines in both planes.]
Dynamic PLA
[Figure: dynamic PLA with the AND-plane clocked by φ_AND and the OR-plane clocked by φ_OR; inputs X0, X0_bar, X1, X1_bar, X2, X2_bar; outputs f0, f1; VDD and GND rails.]
Clock Signal Generation for self-timed dynamic PLA
Self-timing is recommended in PLAs for maximum performance
A dummy AND row in the PLA estimates the maximum loading condition for the required precharge time
The worst-case discharge time is estimated in a similar way
[Figure: (a) clock signals φ, φ_AND, and φ_OR, with intervals t_pre and t_eval; (b) timing generation circuitry using dummy AND rows.]
PLA Layout
[Figure: PLA layout with the AND-plane (inputs x0, x0_bar, x1, x1_bar, x2, x2_bar) and the OR-plane (outputs f0, f1), pull-up devices for each plane, VDD, GND, and clock φ.]
4 Mbit SRAM
Hierarchical Word-line Architecture
To improve performance and reduce power consumption in large memories, the word line is hierarchically divided and sections are activated as needed
Bit-line Circuitry
[Figure: bit-line circuitry with block select, bit-line load, ATD, BEQ, local WL, memory cell, B/T, column decoder (CD) devices, I/O and I/O_bar lines, and sense amplifier.]
Sense Amplifier (and Waveforms)
[Figure: sense amplifier schematic (I/O and I/O_bar lines, block select BS, SEQ, SA and SA_bar, DATA and DATA_bar, Dei, Data-cut) and waveforms for Address, ATD, BEQ, SEQ, the I/O lines, SA/SA_bar, and DATA, swinging between Vdd and GND.]
1 Gbit Flash Memory
From [Nakamura02]
Writing Flash Memory
[Figure: number of cells (10^0 to 10^8, log scale) versus V_t of memory cells (0 V to 4 V), showing the evolution of thresholds and the final distribution; read level (4.5 V). From [Nakamura02]]
During erasure, all bits are programmed to become depletion devices
It takes four write/erase cycles to bring all device thresholds above 0.8 V
1Gbit NAND Flash Memory
[Figure: die photo, 10.7 mm x 11.7 mm (125 mm²), showing the charge pump, the 2 kB page buffer and cache, 32 word lines x 1024 blocks, and 16896 bit lines. From [Nakamura02]]
125 mm² 1 Gbit NAND Flash Memory
Technology: 0.13 µm p-sub CMOS, triple-well; 1 poly, 1 polycide, 1 W, 2 Al
Cell size: 0.077 µm²
Chip size: 125.2 mm²
Organization: 2112 x 8b x 64 pages x 1k blocks
Power supply: 2.7 V to 3.6 V
Cycle time: 50 ns
Read time: 25 µs
Program time: 200 µs / page
Erase time: 2 ms / block
From [Nakamura02]
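The organization figures can be cross-checked with simple arithmetic. The sketch below multiplies out the quoted organization; the split of each 2112-byte page into a 2048-byte main area plus spare bytes is an assumption suggested by the 2 kB page buffer, not something the slides state.

```python
BYTES_PER_PAGE = 2112     # from the quoted organization
BITS_PER_BYTE = 8
PAGES_PER_BLOCK = 64
BLOCKS = 1024             # "1k block"
MAIN_BYTES_PER_PAGE = 2048  # assumed main area; the rest would be spare bytes


def bit_lines() -> int:
    """One bit line per bit of a page."""
    return BYTES_PER_PAGE * BITS_PER_BYTE


def total_bits() -> int:
    """Total cell count implied by the organization."""
    return bit_lines() * PAGES_PER_BLOCK * BLOCKS


def main_area_bits() -> int:
    """Capacity of the assumed 2048-byte main area of every page."""
    return MAIN_BYTES_PER_PAGE * BITS_PER_BYTE * PAGES_PER_BLOCK * BLOCKS


lines = bit_lines()
cells = total_bits()
main = main_area_bits()
```

The page width works out to 16896 bits, matching the bit-line count on the die photo, and the assumed main area comes to exactly 2^30 bits, that is, 1 Gbit.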
Semiconductor Memory Trends
(up to the 90’s)
Memory size as a function of time: x4 every three years
Semiconductor Memory Trends
(updated)
From [Itoh01]
Semiconductor Memory Trends
• There was an apparent shift in the memory market due to the increased use of flash memory for video and other personal storage devices besides personal computers
• SOC technology implements systems that integrate memories and processors on a single die
• The challenges of making even bigger and denser memories, together with lower market incentives, are responsible for the slowdown in the growth of memory chip capacities
Trends in Memory Cell Area
From [Itoh01]
Semiconductor Memory Trends
Technology feature size for different SRAM generations