Download A sensing circuit for single-ended read

Survey
yes no Was this document useful for you?
   Thank you for your participation!

* Your assessment is very important for improving the workof artificial intelligence, which forms the content of this project

Document related concepts

Ground loop (electricity) wikipedia , lookup

Islanding wikipedia , lookup

Power inverter wikipedia , lookup

Rectifier wikipedia , lookup

Alternating current wikipedia , lookup

Current source wikipedia , lookup

Stray voltage wikipedia , lookup

Voltage optimisation wikipedia , lookup

Switched-mode power supply wikipedia , lookup

Fault tolerance wikipedia , lookup

Electrical substation wikipedia , lookup

Immunity-aware programming wikipedia , lookup

Buck converter wikipedia , lookup

Bio-MEMS wikipedia , lookup

Ground (electricity) wikipedia , lookup

Earthing system wikipedia , lookup

Two-port network wikipedia , lookup

Transistor wikipedia , lookup

Resistive opto-isolator wikipedia , lookup

Circuit breaker wikipedia , lookup

Schmitt trigger wikipedia , lookup

Power MOSFET wikipedia , lookup

Rectiverter wikipedia , lookup

Regenerative circuit wikipedia , lookup

Mains electricity wikipedia , lookup

Surge protector wikipedia , lookup

Opto-isolator wikipedia , lookup

Flexible electronics wikipedia , lookup

Electrical wiring in the United Kingdom wikipedia , lookup

RLC circuit wikipedia , lookup

CMOS wikipedia , lookup

Network analysis (electrical circuits) wikipedia , lookup

Transcript
Adv. Radio Sci., 9, 247–253, 2011
www.adv-radio-sci.net/9/247/2011/
doi:10.5194/ars-9-247-2011
© Author(s) 2011. CC Attribution 3.0 License.
Advances in
Radio Science
A sensing circuit for single-ended read-ports of SRAM cells with
bit-line power reduction and access-time enhancement
T. Heselhaus and T. G. Noll
Chair of Electrical Engineering and Computer Systems, RWTH Aachen University, 52062 Aachen, Germany
Abstract. The conventional sensing scheme of single-ended
read-only-ports as integrated in 8T-SRAM cells suffers from
low performance compared to double-ended complementary
sensing schemes. In the proposed sensing scheme the precharge voltage of the single-ended read-bit-line is set to a
level above the threshold voltage of the sensing device with
an adjustable margin. This margin is minimized to speed up
the read access on the one hand and kept large enough to
provide a sufficient bit-line noise margin on the other hand.
The pre-charge voltage level of the proposed sensing circuit
tracks the threshold voltage of the sensing device under process variations in order to maintain a minimum required bitline noise margin. To avoid unnecessary bit-line discharging, the proposed sensing scheme employs a modified 8TSRAM cell. Compared to the conventional 8T-SRAM cell,
the read port of the proposed cell provides a virtual ground
line running in parallel to the bit-lines. An internal driver of
the sensing circuit releases the virtual ground line during the
evaluation period to prevent the charge dissipation resulting
in a raised voltage level. The reduced pre-charge level and
the increased virtual ground lead to a reduced bit-line voltage swing and thus a bit-line power reduction. Access time,
energy dissipation, and noise margin of the proposed sensing circuit are compared with conventional sensing circuits
from the literature for different numbers of memory cells
connected to the bit-line. It is shown, that for a specific number of memory cells per bit-line the proposed circuit achieves
fastest access time at low power operation.
Correspondence to: T. Heselhaus
([email protected])
1
Introduction
Single-ended read sensing schemes will become more important in memory architectures of SRAM cells with additional
read-only-ports to overcome the reliability problems related
to standard 6T-SRAM cells. Though such single-ended ports
require only one bit-line, the main power dissipation originates from the full-swing bit-line sensing schemes.
The intention of this work is to set the initial bit-line voltage close to the threshold level of the individual local sensing device and stop the discharging, once the local sensing
has detected the data bit, such that no more than necessary of
the bit-line charge is dissipated. The expected side effect is a
performance upgrade, since the initial bit-line voltage resides
close to the sensing threshold level, such that this level can
be reached in a shorter evaluation time.
Section 2 presents a modified memory cell, which is required to enable the bit-line swing reduction. In Sect. 3 the
proposed sensing circuit is described. In Sect. 4 the proposed
sensing scheme is verified with simulations and compared
with other sensing circuits from the literature.
2
The proposed 8T-SRAM cell
Figure 1 shows the conventional (Chang et al., 2005) and the
proposed 8T-SRAM cell. Both cell variants consist of a conventional cross-coupled inverter-pair connected via two access transistors to a complementary bit-line pair p and p,
which represents the write-port for storing data into the cell
by activating the storage-word-line s. The read-only-port
consists of two series connected transistors for single-ended
read-out with a separate read-word-line r.
Area and most of the wiring of the proposed cell is identical to the conventional cell, however the proposed cell connects the source of the read-only-port to an additional virtual
ground line v instead of VSS . This additional virtual ground
Published by Copernicus Publications on behalf of the URSI Landesausschuss in der Bundesrepublik Deutschland e.V.
2248
M4 are both zero and the bit-line is charged by the series connected transistors M3 and M4. As the gates of M5 and M6
are
at zero, these transistors establish a voltage divider of the
T. Heselhaus and T. G. Noll: A sensing circuit for single-ended read-ports of SRAM cells
T. Heselhaus and T. G. Noll: A Sensing Circuit for Single-Ended Read-Ports . . .
actual bit-line
voltage. The
to the diϕ gate of M3 is connected
r
e
vided voltage u, which is a fraction of the bit-line voltage m.
While the bit-lineM5’
voltage rises, the tapped gate potential u
↑
↑
↑ ↑
↑
↑ ↑
↑
rises as well. The pre-charge
stops, if
M5
process of the bit-lineM11
p V SS m
p
p
m 
p
u rises above VDD −|Vth |, such thatM3
M3 enters the off-region.

m

m
M6
Hence the final pre-charge voltage
Vm,max of the bit-line has
r
r
PSfrag replacements m
M4 level VDD − |Vth | of
risen in excess of the sensing threshold
s
s
M3. This overshoot
can be adjustedM7
by choosing the approM2
priate dimensioning of M5 and M6. y
M9
M1
g
ag replacements
M8achieve the required
The intention of the overshoot is to
noise margin on the bit-line, which must be maintained even
M10 the feed-back loop for preunder process variations. With
V SS
V SS
charging the bit-line via the transistors M3, M4, M5 and M6,
p
p
p
p
the voltage overshoot of Vm,max tracks the sensing threshold
conventional
proposed
level of M3. This tracking effect is verified by simulations in
2. Schematic
theproposed
proposed single-ended
sensing
and and
Fig. Fig.
2. Schematic
ofofthe
single-endedread
read
sensing
Section
4.
pre-charge circuit for one memory column and domino connected
Fig.
Conventionaland
andproposed
proposed
8T-SRAM-Cell
as schematic
Fig. 1. Conventional
8T-SRAM
cell as schematic
and pre-charge
circuit
for
one
memory
column
and
domino
connected
Inline
fact,
in the proposed sensing scheme the signal is demultiplex.
and
layout
view.
additional
virtual
ground
is labeled
layout
view.
TheThe
additional
virtual
ground
line line
is labeled
as v.as v. digitdigit
line multiplex.
tected
with two successive sensing stages, realized by M3
and M9. It behaves similar to an inverter sense circuit followed
a domino
NMOS
pull-down
the
The by
intention
of the
overshoot
is to device.
achieve However,
the required
runs in parallel
to the read-bit-line
m and can be easily
3linePre-charge
and sensing
circuit
nodes
x,
y
and
v
in
the
sensing
circuit
have
to
be
pre-charged
noise margin on the bit-line, which must be maintained even
placed in the thin layout cell without any area penalty. The
or clamped
before
enteringWith
the evaluation
phase.
During
the
under
process
variations.
the feed-back
loop
for preintention
of this
additional virtual
groundcircuit
line isoftoa singlereduce
Fig.
2 shows
the pre-charge
and sensing
pre-charge-phase
the
bit-line
m
rises
to
V
and
shortly
m,max
charging the bit-line via the transistors M3, M4, M5 and M6,
the bit-line
swing
functionality
ended
bit-line
for and
one power
columndissipation.
of memoryThe
cells.
Only the
M8 voltage
begins to
dischargeofthe
internal
node
y such
thatthreshold
the main
the
overshoot
Vm,max
tracks
the
sensing
of this virtual ground
line will cell
be described
the nextthere
secread-only-port
of one memory
is shown,inhowever,
output
stage
M9
turns
off.
The
drain
of
M9
is
connected
to
level of M3. This tracking effect is verified by simulations in
tion.N memory cells connected to the bit-lines m and v. The
are
a
common
node
x,
which
could
be
a
global
bit-line
in
a
subSect. 4.
basic idea of the pre-charge circuit is to pre-charge the bitdivided bit-line memory architecture or a common digit line
line m to a defined voltage level above the threshold-level of
In fact, in the proposed sensing scheme the signal is de3
Pre-charge
and
sensing
circuit
in
some other hierarchical memory architecture (e.g. a colthe sensing transistor M3. Because the drain current of M3
tected with two successive sensing stages, realized by M3
umn
or data word multiplex output). Here, this output x is
isFigure
used 2toshows
pre-charge
the bit-line,
should
itselfofturn
off as
and M9. It behaves similar to an inverter sense circuit folthe pre-charge
andM3
sensing
circuit
a singleassumed
to be domino-connected, thus x requires an addithe
pre-charge
reached.
pre-charge
is
lowed by a domino NMOS pull-down device. However, the
ended
bit-line voltage
for one iscolumn
of The
memory
cells. process
Only the
tional
pre-charge
M11, circuit
which have
is nottopart
of the sensdescribed
as follows.
nodes x, y and v indevice
the sensing
be pre-charged
read-only-port
of one memory cell is shown, however, there
ing
circuit.
However,
M11
pre-charges
x
to
V
such that
DD
or clamped before entering the evaluation phase. During
the
the pre-charge-phase
(φthe
= 0)
the word-line
deareDuring
N memory
cells connected to
bit-lines
m and v.is The
M10
turns
on
and
clamps
the
virtual
ground
line
v
to VSS .
pre-charge-phase
the
bit-line
m
rises
to
V
and
shortly
activated
(r
=
0)
and
the
sensing
is
disabled
(e
=
1).
For
an
m,max
basic idea of the pre-charge circuit is to pre-charge the bitThebegins
circuittois discharge
now prepared
for sensing.
M8
the internal
node y such that the main
initially
bit-line level
m theabove
gate the
potentials
of M3 and
line m todischarged
a defined voltage
threshold-level
of
φ to VDD
The stage
pre-charge
phase
is completed
byM9
reverting
output
M9 turns
off.
The drain of
is connected
to
M4
both transistor
zero and the
bit-line
is charged
by the
series
the are
sensing
M3.
Because
the drain
current
ofconM3
which
turns
off
M4.
Since
M6
is
connected
as
a
diode,
M6
a
common
node
x,
which
could
be
a
global
bit-line
in
a
subnected
transistors
M3
and
M4.
As
the
gates
of
M5
and
M6
is used to pre-charge the bit-line, M3 should itself turn off as
turns offbit-line
as well,
the voltage
divider or
built
by M5 and
divided
memory
architecture
a common
digitM6
lineis
are
zero, these
transistors
establish
a voltage
divider
of the
the at
pre-charge
voltage
is reached.
The
pre-charge
process
is
now
no
longer
active
and
u
follows
the
bit-line
voltage
via
in some other hierarchical memory architecture (e.g. a coldescribed as follows.
M5.
The
transistor
M5’
may
be
added
to
afford
a
close
couumn or data word multiplex output). Here, this output x is
During the
(φ = 0) ethe word-line is depling between
and u over a wide range
bit-line voltages.
ϕ
r pre-charge-phase
assumed
to bem
domino-connected,
thus xofrequires
an addiactivated (r = 0) and the sensing is disabled (e = 1). For an
Actually
both
M5’
and
M5
together
build
up
a transmission
tional
pre-charge
device
M11,
which
is
not
part
of
the
sensM5’
initially discharged bit-line m the gate potentials of M3 and
gatecircuit.
to connect
m to theM11
gatepre-charges
of M3. In some
cases
where
M5
ing
However,
x
to
V
such
that
DD
M4 are both zero and the
series conM5 bit-line
 is charged by theM11
is
strong
enough
for
a
close
coupling
between
m
and
u,
M5’
M10 turns on and clamps the virtual ground line v to VSS .
nected transistors M3 and M4. AsM3
the gates of M5
 and M6
might
be omitted.
However,
this work M5’ is included
M6
The
circuit
is now prepared
forinsensing.
are atmzero, these transistors establish a voltage divider of the
Sfrag replacements
in
each
power
and
performance
simulation
and comparison.
M4
The pre-charge phase is completed by reverting
φ to VDD
actual bit-line voltage. The gate of M3 is connected to the diThe
transistor
M5’
is
minimally
dimensioned.
M2
M7
which
turns
off
M4.
Since
M6
is
connected
as
a
diode,
M6
vided voltage u, which is a fraction of
voltage m.
y the bit-line
Additionally,
ethe
is now
tieddivider
to zero,built
andby
M7
connects
the
M9
turns
off
as
well,
voltage
M5
and
M6
is
M1 voltage rises, the tapped gate potential u
While theg bit-line
sensing
transistor
M3
to
the
internal
node
y.
Any
leakage
M8
now
no
longer
active
and
u
follows
the
bit-line
voltage
via
rises as well. The pre-charge process of the bit-line stops, if
of M3
duetransistor
to the near
bit-line
m
M5.
The
M5’threshold-level
may be addedpre-charged
to afford a close
couM10that M3 enters the off-region.
u rises above VDD −|Vth |, such
and
node
u
is
compensated
now
by
the
weakly
dimensioned
pling between m and u over a wide range of bit-line voltages.
Hence the final pre-charge voltage Vm,max of the bit-line has
transistorboth
M8,M5’
suchand
thatM5
y is
kept tobuild
VSS .upThe
evaluationActually
together
a transmission
risen in excess of the sensing threshold level VDD − |Vth | of
phase
starts
by
activating
the
word-line
r.
Depending
on
gate to connect m to the gate of M3. In some cases where M5
M3.2.This
overshoot
canproposed
be adjusted
by choosing
the approthe
stored
data
g,
the
bit-line
remains
pre-charged
or
is
disFig.
Schematic
of the
single-ended
read sensing
and
is strong enough for a close coupling between m and u, M5’
priate dimensioning
of M5
and M6.
pre-charge
circuit for one
memory
column and domino connected
charged
g = VDD However,
via M2 and
the memory
cell and
might
beifomitted.
in M1
this of
work
M5’ is included
digit line multiplex.
M10 of the sensing circuit. The gate potential u of M3 falls
Adv. Radio Sci., 9, 247–253, 2011
www.adv-radio-sci.net/9/247/2011/
w
tu
no
M
pl
A
ga
is
m
in
Th
se
of
an
tra
ph
th
ch
M
mV
mV
µA
ρ = 0.65 (for lM5 = lM6 = 40nm). Reducing the variability
M3, M4, M9, and M10 have a gate width of w = 480nm.
of the voltage divider reduces the standard deviation of the
Simulation using slow corner device models at 125°C and
access
time from 10.1% to 9%, while the mean access time
a minimal supply voltage of 900mV are used to verify the
remains
nearly constant.
function of the proposed sensing circuit under worst case
-Ended Read-Ports
. . . and T. G. Noll: A sensing circuit for single-ended3read-ports of SRAM cells
T. Heselhaus
249
timing conditions. Fig. 3 shows the simulated pre-charge and
evaluation-phase of a read cycle with a stored ’1’PSfrag
on the
sereplacements
600mV
0
 M5 = 420nm
lected memory
cell (g = VDD ). The achieved bit-line swing
ng
 M5 = 280nm

−30
DD
∆V is approximately 435mV, a little bit less than VDD
/2
tes
M5 = 140nm
V m,mx
−60
such that bit-line energy savings of 50% compared to a full
x
x.
V
corr.
swing bit-line
sensing scheme can be expected.
900

y
ed
frag replacementsTracking the pre-charge voltage according to the sensing
V DD − |Vth |
m
dithreshold level of M3 under process variations was functionΔV ≈ 435mV
100
100

ell
ally verified
ps
ps
450 by Monte Carlo simulations. For this case pro300mV
oncess variations apply
to−M3
while all other transistors
−→
−→
V DD
|Vthonly,
|
0
t ACC
0
t ACC
t
t
operate in the typical corner. Fig. 4 shows the probability
ed
std(t ACC ) = 8.1%
std(t ACC ) = 1.5%

density functions
(PDF) of sensing threshold level variations
in0
and
the
resulting
access
time profile
t ACC without tracking on the
Fig. 4.
4. Verification
Verification of
of the
t is
Fig.
the sensing
sensing threshold
threshold level
level tracking.
tracking. The
The
left side900
and with ϕtracking on the rightr side. On the ordinate
left diagram
diagram shows
shows the
the resulting
access
time
profile
with
aafixed
bitleft
resulting
access
time
profile
with
fixed
bitlar
450sensing threshold level V , the mean pre-charge
the mean
line pre-charge
pre-charge voltage,
voltage, while
the
right
diagram
shows
the
results
th
line
while
the
right
diagram
shows
the
results
0
nes
when the
the sensing
sensing circuit
circuit includes
level Vm,max
when
includes the
the voltage
voltage divider
divider M5/M6
M5/M6 for
for
0 , and their PDFs
500 are plotted.
1000The abscissas
1500show
ge
the tracking
tracking (Monte
(Monte Carlo
Carlo simulations,
25°C,
VVDD =
900mV).
the mean access times tACC and
their
PDFs
for
both
cases.
the
simulations,
25°C,
=
900mV).
DD
t / ps
he
rly
ed.
ca-
3. Simulatedwaveform
waveform for
for aa read-cycle
proposed
sensFig.Fig.
3. Simulated
read-cycleofofthethe
proposed
sens- 4 Simulation results
ing circuit (slow corner, 125°C, VDD = 900mV).
ing circuit (slow corner, 125°C, VDD = 900mV).
The sensing circuit was simulated for a 40nm general purpose bulk CMOS technology with a nominal supply voltin each power and performance simulation and comparison.
transistor
M5’ is minimally
dimensioned.
TheThe
standard
deviation
of the access
time PDF was reduced age of 900mV. All circuit simulations are based on
transistor schematics. Extracted parasitics from a layout
e is now tied to zero, and M7 connects the
from Additionally,
8.1% to 1.5%.
view are added to the bit-lines and to the virtual ground
sensing transistor M3 to the internal node y. Any leakage of
When
considering process variations of all transistors of line corresponding to the connected number of cells N ∈
M3 due to the near threshold-level pre-charged bit-line m and
thenode
sensing
circuit this also affects the fraction of the voltage {8,16,32,64,128}. Internal nodes of the sensing circuit are
uru is compensated now by the weakly dimensioned trandivider
M5/M6.
Fig.y5isshows
correlation of loaded with estimated parasitic capacitances rated between
oltsistor M8, such that
kept tothe
VSSsimulated
. The evaluation-phase
thestarts
sensing
thresholdthelevel
variation
to the pre-charge
level 0.4fF and 0.7fF depending on the assumed wiring complexon
by activating
word-line
r. Depending
on the stored
variation.
The
left side
of pre-charged
Fig. 5 shows
voltage iflevel ity. The transistor dimensioning of the proposed sensing cirdata g, the
bit-line
remains
or isthe
discharged
out
g = VDD viawhere
M2 andall
M1transistors
of the memory
cell sensing
and M10circuit
of the in cuit was basically designed with the minimal allowed gate
distributions,
of the
nd
width of w = 120nm. Only the driving/sensing transistors
circuit. The
gateapotential
u of M3
and
M3=now
Fig.sensing
2 are designed
with
gate lengths
of lfalls
lM6
40nm. M3, M4, M9, and M10 have a gate width of w = 480nm.
M5 =
∈
directly
turnsmismatches
on, which leads
to avoltage
steep rising
edgetransistors
on the
Since
device
of the
divider
are
Simulation using slow corner device models at 125°C and
internal node y. The rising edge on y activates M9, which
M5 and M6 disturb the correlation of the sensing threshold a minimal supply voltage of 900mV are used to verify the
en
now discharges the common data output node x.
level The
to the
pre-charge level, a reduction of the variability of function of the proposed sensing circuit under worst case
expower reduction feature is induced by the reduced preM5
and
M6
the correlation.
a gate timing conditions. Figure 3 shows the simulated pre-charge
ircharge levelcan
andimprove
the discharge
inhibition withChoosing
the additional
length
of ground
lM5 = line
lM6v=of70nm
for the8T-SRAM
voltage divider
M5/M6 and evaluation-phase of a read cycle with a stored “1” on the
ate
virtual
the proposed
cell and tranimproves
the ofcorrelation
toM10
ρ = is0.8
compared
sistor M10
the sensingcoefficient
circuit. Since
connected
to to selected memory cell (g = VDD ). The achieved bit-line swing
ors
1V is approximately 435mV, a little bit less than VDD /2
data(for
output
which
zero if the
selected memory
cell
ρ =the
0.65
lM5x,=
lM6 gets
= 40nm).
Reducing
the variability
such that bit-line energy savings of 50% compared to a full
discharges
bit-linereduces
m, M10 the
turnsstandard
off and inhibits
further
of the
voltagethedivider
deviation
of the swing bit-line sensing scheme can be expected.
nd
charge
dissipation
to VSS to
as 9%,
soon while
the datathe
bit mean
is detected.
Be-time
access
time
from 10.1%
access
he
Tracking the pre-charge voltage according to the sensing
cause the bit-lines m and v are designed similar in the layout
remains nearly constant.
ase
threshold
level of M3 under process variations was functionview, the parasitic capacitances of both bit-lines are similar.
ally
verified
by Monte Carlo simulations. For this case prond
Thus, once M10 turns off, the remaining charge on the bitcess
variations
apply to M3 only, while all other transistors
seline m will be balanced between m and v. The final bit-line
acements
operate
in
the
typical
corner. Figure 4 shows the probability
600mV
=
ng420nm potential of m would therefore settle to nearly half of the
density
functions
(PDF)
of sensing threshold level variations
= 280nm
/2140nm bit-line potential, when the data bit was detected. The final
=
V m,mx
and
the
resulting
access
time profile without tracking on the
potential depends on the ratio of the parasitic capacitances of
ull
x
left
side
and
with
tracking
on the right side. On the ordinate
 v.
m Vand
corr.
the mean sensing threshold level Vth , the mean pre-charge
ng
level Vm,max , and their PDFs are plotted. The abscissas show
V DD − |Vth |
the mean access times tACC and their PDFs for both cases.
on100 The standard deviation of the access time PDF was reduced
100
rops
ps
from 8.1% to 1.5%.
300mV
ors
−→
−→
ity
ons
he
ate
rge
0
t ACC
t
std(t ACC ) = 8.1%
0
t
std(t ACC ) = 1.5%
t ACC
Fig.www.adv-radio-sci.net/9/247/2011/
4. Verification of the sensing threshold level tracking. The
left diagram shows the resulting access time profile with a fixed bitline pre-charge voltage, while the right diagram shows the results
when the sensing circuit includes the voltage divider M5/M6 for
Adv. Radio Sci., 9, 247–253, 2011
The proposed sensing circuit was compared to the following sensing circuits (refer to Table 1). As explained for the
proposed sensing circuit, the sensing circuits for compariT. Heselhaus
and T.
Noll:circuit
A Sensing
Circuit for read-ports
Single-Ended
Read-Ports
T. Heselhaus
and T. G. Noll:
A G.
sensing
for single-ended
of SRAM
cells . . .
4250
V m,mx / mV
 = 40nm
erenc
the re
gate w
transi
signed
Designation
shows
800 The proposed sense circuit as shown in Fig. 2
the re
A
B A domino style sensing of the local bit-line
m with a time t
 M5 = 420nm
PMOS transistor, which implicitly implements the digit and E
750
 = 70nm
650
Increasing
the bit-line
noisevariations
margin was
simulated
by of
enWhen considering
process
of all
transistors
larging
the gate
length
M5
up tothe
lM5fraction
= 420nm
forvoltage
a series
the sensing
circuit
this of
also
affects
of the
Figure 5 shows
thekeeping
simulated
ofdivider
MonteM5/M6.
Carlo simulations,
while
thecorrelation
gate lengthofof
thefixed
sensing
level
variation
to
the
pre-charge
level
M6
to lthreshold
=
70nm.
The
results
shown
in
Fig. 6 exhibit
M6
The left side
ofpre-charge
Fig. 5 shows
theinvoltage
levelto
a variation.
similar distribution
of the
level
correlation
distributions,
where
all
transistors
of
the
sensing
circuit
in
the sensing threshold voltage like depicted on the right diaFig.
2
are
designed
with
a
gate
lengths
of
l
=
l
=
40nm.
M5
M6
gram in Fig. 5. However, the dot clouds are shifted to higher
Since device
mismatches
of the voltage
dividerl transistors
pre-charge
levels
with increasing
gate length
M5 . The elM5
and
M6
disturb
the
correlation
of
the
sensingerror
threshold
lipses inside the dot clouds denote the standard
ellipse
level
to
the
pre-charge
level,
a
reduction
of
the
variability
of the two dimensional density function. The resulting of
imM5 and M6 can improve the correlation. Choosing a gate
provement of the mean bit-line noise margin with increasing
length of lM5 = lM6 = 70nm for the voltage divider M5/M6
the gate length
lM5 = 70nm → {140nm,280nm,420nm} are
improves the correlation
coefficient to ρ = 0.8 compared to
respectively ∆VBL,margin = {35mV,100mV,150mV}.
ρ = 0.65 (for lM5 = lM6 = 40nm). Reducing the variability
sensing
circuitthe
wasstandard
compared
to the followofThe
the proposed
voltage divider
reduces
deviation
of the
ing
sensing
circuits
(refer
to
Table
1).
As
explained
the
access time from 10.1% to 9%, while the mean accessfor
time
proposed
sensing
circuit,
the
sensing
circuits
for
compariremains nearly constant.
V m,mx / mV
line multiplex (see Fig. 7) (Takeda et al., 2004)
 M5 = 280nm
Sensing with an inverter followed by an NMOS
transistor
for
digit
line
multiplex
(Cosemans
et
al.,
2007).
A
similar
700
550
sensing circuit implies a static CMOS NAND gate instead
replacements
of an inverter for early digit line multiplex (Chang et al.,
 M5 = 140nm
650 2007).
300
400
500 300
400
500
m
D An AC coupled sense amplifier as PMOS replacement
in a
PSfrag replacements
V DD − |Vth | / mV
V DD − |Vth | / mV
PSfrag replacements
domino read path (Qazi et al., 2010). For the reduced bitPearson ρ = 0.65
Pearson ρ = 0.8
600
line swing due to word-line pulsing 400mV is assumed. To
mean(t ACC ) = 150ps
mean(t ACC ) = 152ps
compare this circuit, the driving part for the global bit-line
std(t ACC ) = 10.1%
std(t ACC ) = 9.0%
g
300
350
400
450
500
550
is included.
V DD − |Vth | / mV
Fig.
level correlation
correlationtotoM3-threshold-level
M3-threshold-level
E An AC coupled
sense amplifier (Verma and Chandrakasan,
Fig.5.5.Simulated
Simulatedpre-charge
pre-charge level
variations
runs for
for two
twodifferent
differentgate
gatelengths
lengths
variationsof
of1000
1000 Monte
Monte Carlo
Carlo runs
2009). For the reduced bit-line swing due to word-line
6. Monte
Carlo
simulated
pre-charge level for different lM5 in
ofofthe
(25°C, VVDD
=900mV).
900mV).
thevoltage
voltagedivider
divider M5/M6
M5/M6 (25°C,
pulsing
200mV
ispre-charge
assumed.
DD=
Fig. 6.Fig.
Monte
Carlo
simulated
level for different lM5 in
correlation to the M3-threshold-level (lM6 = 70nm, 1000 iterations,
Fig. 7
correlation to the M3-threshold-level (lM6 = 70nm, 1000 iterations,
25°C, VDD = 900mV).
1.
Proposed
and
reference
sensing
circuits
for
comparison.
sensin
25°C, VTable
=
900mV).
DD
600
C
Table 1. Proposed and reference sensing circuits for comparison.
V m,mx / mV
son are simulated using transistor schematics with estimated
parasitics
for their internal nodes (0.4...0.7fF). The transisDesignation
tors are minimally dimensioned (l = 40nm, w = 120nm), exA
senseand
circuit
as shown
in Fig.
2
ceptThe
forproposed
pre-charge
other
driving
devices
(w = 480nm).
B A domino style sensing of the local bit-line m with a PMOS
For any number of cells N ∈ {8,16,32,64,128} the extracted
transistor, which implicitly implements the digit line mulparasitics
from
a 7)
column
8T-SRAM cells as
tiplex (see
Fig.
(Takedaofetconventional
al., 2004)
shown
in
Fig.
1
are
attached
to
the
bit-line
of each refC Sensing with an inverter followed by an NMOSm
transistor
erence
circuit
schematic.
The
circuit
simulations
of the reffor digit line multiplex (Cosemans et al., 2007). A similar
sensing
circuitare
implies
a static
NAND
gate instead
erence
circuits
applied
for CMOS
the same
simulation
corners,
of
an
inverter
for
early
digit
line
multiplex
(Chang
et
al.,
temperature and supply voltage as for the proposed sensing
2007).
circuit.
The cycle period was 2500ns. Fig. 7 shows the refD An AC coupled sense amplifier as PMOS replacement in a
erence circuits B and C as implemented for comparison. For
domino read path (Qazi et al., 2010). For the reduced bitthe reference
circuits
domino-read
and standard
inverter
line swing due
to word-line
pulsing 400mV
is assumed.
To the
gatecompare
widths this
of the
sensing
transistor
M3
and
the
pre-charge
circuit, the driving part for the global bit-line
transistor
M4 were set to w = 480nm, all others were deis included.
Increasing the bit-line noise margin was simulated by enEsigned
An AC
coupled
sense gate
amplifier
(Verma
and=Chandrakasan,
with
minimal
width
of w
120nm. Table 2
larging the gate length of M5 up to lM5 = 420nm for a series
2009).
For the reduced
bit-line
swing
due tocircuit
word-line
shows
the
simulated
results
of
the
proposed
A versus
of Monte Carlo simulations, while keeping the gate length of
pulsing
200mV
is
assumed.
800
the reference circuits B. . . E for N = 128 cells. The access
M6 fixed to lM6 = 70nm. The results shown in Fig. 6 exhibit
time tACC of the proposed circuit A is faster than B, C, D
 M5 = 420nm
a similar distribution of the pre-charge level in correlation
to
and E for N = 128 cells. The access time tACC consists of a
the sensing
750 threshold voltage like depicted on the right diagram in Fig. 5. However, the dot clouds are shifted to higher
tors
are minimally dimensioned (l = 40nm, w = 120nm), ex M5 = 280nm
pre-charge levels with increasing gate length lM5 . The elcept for rpre-chargeϕ and other drivingrdevices (w
= 480nm).
ϕ
700 the dot clouds denote the standard error ellipse
lipses inside
For any number of cells N ∈ {8,16,32,64,128} the extracted
of the two dimensional density function. The resulting imM4
parasitics fromM4
a column of conventional 8T-SRAM
cells as
 M5 = 140nm
provement
shown in Fig. 1 are attached to the bit-line m ofM3each ref650of the mean bit-line noise margin with increasing
the gate length lM5 = 70nm → {140nm,280nm,420nm} are
m simulations of the referencemcircuit schematic. The circuit
PSfrag replacements

respectively 1VBL,margin = {35mV,100mV,150mV}.
Sfrag replacements
erence circuits are applied for the same simulation corners,
600
M3 voltage

temperature and supply
as for the proposed sensThe proposed sensing circuit was compared to the followg
g
ing sensing300
circuits
(refer
to
Table
1).
As
explained
for
the
ing
circuit.
The
cycle
period
was
2500ps. Figure 7 shows
350
400
450
500
550
proposed sensing circuit,
the
sensing
circuits
for
comparithe
reference
circuits
B
and
C
as
implemented
for comparV DD − |Vth | / mV
son are simulated using transistor schematics with estimated
ison. For the reference circuits domino-read and standard
Inverter
parasitics for their internal nodes (0.4...0.7fF). The transisinverter Domino-Read
the gate widths of the sensingStandard
transistor
M3 and the
Fig. 6. Monte Carlo simulated pre-charge level for different lM5 in
correlation to the M3-threshold-level (lM6 = 70nm, 1000 iterations,
25°C,
Adv.VRadio
Sci., 9, 247–253, 2011
DD = 900mV).
Fig. 7. Schematics of the domino-read (B) and inverter-read (C)
sensing circuits for comparison.
www.adv-radio-sci.net/9/247/2011/
0nm
shows the simulated results of the proposed circuit A versus
proposed sensing circuit dissipates the least amount energy
the reference circuits B. . . E for N = 128 cells. The access
and shows the fastest access time compared to all reference
time tACC of the proposed circuit A is faster than B, C, D
circuits for N ∈ [64...128].
and
E for N =and
128
The
tACC
consists of aread-ports of SRAM cells
T. Heselhaus
T. cells.
G. Noll:
A access
sensingtime
circuit
for single-ended
251
circu
gray
To
signa
0nm
ϕ
r
M4
in
tions,
2
A (prop.)
M3
replacements
(b) t ACC
2
M4
0nm
M5
(a) E tot
ϕ
r
m
1.5
B
1.5
m
m


M3
C
1
g
Domino-Read
D PSfrag replacements 
1
g
PSfrag replacements0.5
V m,mx / mV
V DD − |Vth | / mV
0
Standard Inverter
E
8
16 32 64 128
0.5
N
8
16 32 64 128
Fig.7.7. Schematics
Schematics of
(B)(B)
andand
inverter-read
(C) (C)
Fig.
of the
thedomino-read
domino-read
inverter-read
sensingcircuits
circuits for
Fig.8.
8. Simulated
Simulated ttACC and
Etot for different number of connected
sensing
for comparison.
comparison.
Fig.
ACC and Etot for different number of connected
cells
N
(normalized
to
the
proposed
125°C,
cells N (normalized to the proposedcircuit
circuitA,A,slow
slowcorner,
corner,
125°C,
VDD = 900mV).
VDD = 900mV).
Table 2. Simulated results for N = 128 cells (slow corner, 125°C,
VDD = 900mV).
tACC / ps
tACC,SA / ps
ESense / fJ
EBL / fJ
Etot / fJ
VBL,margin / mV
Transistor counta
A
B
C
D
E
300
82
5.08
5.85
10.93
120
10
408
45
1.12
13.40
14.53
181
3
512
53
2.21
13.42
15.62
379
5
401
94
5.58
6.19
11.77
96
13
325
126
9.95
2.94
12.89
53
14
a without transistors of memory cells.
Table 3. Comparison of the average access time and relative standard deviation of the proposed sensing circuit (with lM5 = lM6 =
70nm) for N = 64 using 1000 Monte Carlo simulations.
)a
mean (tACC
std (tACC )a
mean (tACC )b
std (tACC )b
A
B
C
D
E
152 ps
9.0%
153 ps
9.5%
167 ps
11.1%
164 ps
14.7%
205 ps
9.5%
204 ps
10.6%
169 ps
10.1%
170 ps
10.6%
157 ps
10.6%
158 ps
11.3%
a constant temperature 25°C and supply voltage V
DD = 900mV.
b normally distributed temperature and supply voltage variations,
temperature ∈ [−55°C...125°C], VDD ∈ [810mV...990mV].
pre-charge transistor M4 were set to w = 480nm, all others
were designed with minimal gate width of w = 120nm. Table 2 shows the simulated results of the proposed circuit A
versus the reference circuits B. . . E for N = 128 cells. The
access time tACC of the proposed circuit A is faster than B, C,
D and E for N = 128 cells. The access time tACC consists of
a bit-line dependent contribution tACC,BL and an access time
offset tACC,SA , which depends on the internal complexity of
the sensing circuit. Apart from circuit B and C, the proposed
circuit shows a smaller intrinsic access time tACC,SA compared to circuit D and E.
The sense error immunity to bit-line noise was simulated
with current pulses to inject charge to the bit-line during an
evaluation period. Read evaluation faults are detected within
an evaluation period of 2500ps. The tolerable bit-line noise
VBL,margin is shown in Table 2 for each circuit.
Table 2 shows that the bit-line energy per cycle of circuit A is similar to circuit D. This results from nearly the
same bit-line swing achieved in circuit A and D. The energy dissipation ESense of circuit A is less than that of circuit D and E. Since circuit B and C operate with full swing
bit-lines, but dissipate the least energy ESense , there must be
a crossover for a certain number of cells. Figure 8a shows
that this crossover point can be identified at N ≥ 64 cells.
www.adv-radio-sci.net/9/247/2011/
A similar crossover point for the access time is located at
N ≈ 40 cells (see Fig. 8b). Simulations indicate smaller access times of circuit E for N > 256 cells. However, the proposed sensing circuit dissipates the least amount energy and
shows the fastest access time compared to all reference circuits for N ∈ [64...128].
Table 3 shows the mean access time over 1000 Monte
Carlo iterations for each circuit and its standard deviation relative to the mean access time. The proposed sensing circuit
provides the fastest mean access time and lowest access time
variation compared to the reference circuits.
To implement the proposed sensing circuit in a hierarchical memory architecture, the sensing circuit on all but the last
hierarchy level needs a small modification to provide the signaling scheme consisting of a digit line and a virtual ground
line. The modification affects the output stage of the proposed sensing circuit and requires an additional internal inverter for driving the virtual ground switch. In Fig. 9 such a
modified sensing circuit and its application in a hierarchical
memory architecture is shown. The source terminal of the
output transistor is now connected to a virtual ground line
xv which enables the same signaling scheme as used for the
Adv. Radio Sci., 9, 247–253, 2011
Fig.
outpu
tectu
lated for the proposed signaling scheme with included virtual
gray shaded box).
To show the performance improvement of the proposed ground concept. This waveform shows the same properties
as the waveform in Fig. 3: the virtual grounds are successignaling
scheme, two test circuits representing
252
T. Heselhaus the
and data
T. G.sigNoll: A sensing circuit for single-ended read-ports of SRAM cells
ϕ
e
 m
 m
µA
.)
SA*
g replacements

m
M10
0
−300
−600
 DD
900

PSfrag replacements
ϕ
e
SA*
SA*
 DD

SA*
mV
m
SA*
450
PSfrag replacements
SA*

0
m

SA
mV
ence
900
450
0
531ps
ϕ
r
673ps
ϕ
r
Fig.
Themodified
modified sensing
sensing circuit
onon
thethe
0
400 800 1200
0
400 800 1200
Fig.
9. 9.The
circuit with
withaavirtual
virtualground
ground
output
side
and
its
implementation
in
a
hierarchical
memory
archit
/
ps
t / ps
output side and its implementation in a hierarchical memory architecture.
SA*
denotes
the
modified
sensing
circuit.
tecture. SA* denotes the modified sensing circuit.
6
T. Heselhaus and T. G. Noll: A Sensing Circuit for Single-Ended Read-Ports . . .
Fig. 11.
11. Comparison
Comparison of
of of
thethe
Fig.
ofsimulated
simulatedwaveforms
waveformsforfora read-cycle
a read-cycle
proposed
sensing
circuit
on
the
left
and
a
domino-style
full-swing
proposed sensing circuit on the left and a domino-style full-swing
Test circuit 1:
sively released
soon
thecorner,
data
bit
is detected.
The resignaling
circuit on
right
(slow
125°C,
VDD
==
900mV).
signaling
circuit
onasthe
the
rightas
(slow
corner,
125°C,
VDD
900mV).
duced
voltage
swing
of
approximately
475mV
leads
to
digit
SA*
SA*
SA
r
C1
C2
C3
line energy savings of approximately 50%. The right waveC1
C2
C3
digit-line,
andthe
virtual
groundsignals
of theof
test
are equally
g
form shows
full-swing
thecircuits
domino-style
sensloaded
with
parasitic
capacitances
on
each
hierarchy
level.
ing circuits. The access time of the proposed sensing
cirFigure
11 shows
the simulated
waveform
for
read-cycles
of
cuit
(t
=
531ps)
is
smaller
than
the
access
time
of
the
ACC
ag replacements Test circuit 2:
both
sensing circuits.
left read-cycle
was simulated
forto
domino-style
sensing The
scheme
(tACC = 673ps).
This leads
the
proposed
signaling
schemeofwith
included virtual
an access
time
improvement
approximately
21%.ground
r

concept. This waveform shows the same properties as the
C1
C3
g
waveform in Fig. 3: the virtual grounds are successively reC2
5 Conclusions
leased
as soon as the data bit is detected. The reduced voltage swing of approximately 475mV leads to digit-line enFig.
for sensing
sensing scheme
scheme comparison.
comparison. Test
Test circuit
circuit11
A new
sensing
circuit for a single-ended
Fig. 10. Test
Test circuits
circuits for
ergy
savings
of approximately
50%. The read-only-port
right waveformof
shows
sensing scheme
scheme with
with virtual
virtual grounds.
grounds. Test
Testcircirshows the
the proposed
proposed sensing
SRAM
cells
is
introduced.
Instead
of
pre-charging
the bitshows the full-swing signals of the domino-style sensing
cuit
sensing scheme.
scheme. The
The gray
gray shaded
shadedtrantrancuit 22 shows
shows aa domino-style
domino-style sensing
line
to
V
,
the
proposed
circuit
sets
the
pre-charge
level
circuits. DD
The access time of the proposed sensing circuit
sistors
the digit-lines
digit-lines and
and are
are not
not relevant
relevantfor
for
sistors are
are used
used to
to pre-charge
pre-charge the
close
to
the
threshold
level
V
−
|V
|
of
the
individual
(tACC = 531ps) is smaller than theDD
accessth
time of the dominothe
The dotted
dotted box
box encloses
encloses the
the read-only
read-only
the access
access time
time comparison.
comparison. The
sensing
device,
while
ensuring
a process
variation
tolerant
style
sensing
scheme
(t
=
673ps).
This
leads
to an access
ACC
port
8T-SRAM cell.
cell.
port of
of the
the (modified)
(modified) 8T-SRAM
bit-line
noise
margin.
With
a
small
modification
of
the 8Ttime improvement of approximately 21%.
SRAM cell to provide an additional bit-line v, the charge
dissipation of the bit-line m during evaluation can automatmodified
cell.memory
Since now
the output
signal
xm
nal
path of8T-SRAM
a hierarchical
architecture
were
designed
5ically
Conclusions
be stopped by the proposed circuit, once the data bit
will compared
not fall below
the thresholdThe
voltage
thecircuit
virtualemploys
ground
and
by simulations.
firstoftest
is detected by the sensing circuit. Both effects together reswitch
M10, the
output
signaland
xm pairs
cannot
used
to and
control
the
proposed
sensing
circuits
ofbe
digit
line
virAduce
newthe
sensing
circuit
forand
a single-ended
of
bit-line
swing
the bit-line read-only-port
power dissipation.
the gate
of M10
required to generate
the
tual
ground
lineanymore.
on each Thus,
level itofis hierarchy.
The second
SRAM
cells
is
introduced.
Instead
of
pre-charging
the
bitFor N = 64...128 cells, the proposed circuit achieves fewest
control
signal
for theavirtual
ground switch
thesimilar
sensing
test
circuit
employs
domino-style
sensinginside
circuit
to
line
to Vcompared
circuit sets
the pre-charge
DD , the proposed
energy
to the reference
circuits
along withlevel
good
circuit
an additional
(highlighted
in Fig. 9 sensby a
the
leftwith
circuit
of Fig. 7.inverter
However,
in domino-style
close to the threshold level VDD − |Vth | of the individual
performance improvements. The proposed sensing circuit is
graycircuits
shaded successive
box).
sensing device, while ensuring a process variation tolerant
ing
stages alternatively employ NMOS
also suited for a hierarchical memory architecture utilizing
show sensing
the performance
of the
bit-line noise margin. With a small modification of the 8TandToPMOS
devices. improvement
The test circuits
areproposed
depicted
the optimized pre-charge level and virtual ground concepts.
SRAM cell to provide an additional bit-line v, the charge
signaling
scheme,
two comparison,
test circuits representing
the data sigin
Fig. 10.
For fair
each corresponding
bitdissipation of the bit-line m during evaluation can automatnal
path
of
a
hierarchical
memory
architecture
were
designed
line, digit-line, and virtual ground of the test circuits are
ically
be stopped by the proposed circuit, once the data bit
and
compared
by
simulations.
The
first
test
circuit
employs
equally loaded with parasitic capacitances on each hierarReferences
is
detected
by the sensing circuit. Both effects together rethe
proposed
sensing
circuits
and
pairs
of
digit
line
and
virchy level. Fig. 11 shows the simulated waveform for readChang,
L.,
Fried,
D.
M., Hergenrother,
J., Sleight,
W., Dennard,
duce
the
bit-line
swing
and the bit-line
powerJ.dissipation.
tual
ground
line
on
each
level
of
hierarchy.
The
second
cycles of both sensing circuits. The left read-cycle was simuR.
H.,
Montoye,
R.
K.,
Sekaric,
L.,
McNab,
S.
J.,
Topol,
A. W.,
For
N
=
64...128
cells,
the
proposed
circuit
achieves
fewest
test
circuit
employs
a
domino-style
sensing
circuit
similar
to
lated for the proposed signaling scheme with included virtual
Adams,
C.
D.,
Guarini,
K.
W.,
and
Haensch,
W.:
Stable
SRAM
energy
compared
to
the
reference
circuits
along
with
good
7.
However,
in
domino-style
sensthe
left
circuit
of
Fig.
ground concept. This waveform shows the same properties
Cell Designimprovements.
for the 32nm Node
Beyond,sensing
in: Symposium
performance
Theand
proposed
circuit ison
ingthe
circuits
successive
employ
as
waveform
in Fig.stages
3: thealternatively
virtual grounds
are NMOS
succesTechnology,
pp. 128–129,
2005. architecture utilizing
alsoVLSI
suited
for a hierarchical
memory
and PMOS sensing devices. The test circuits are depicted
Chang, L., Nakamura, Y., Montoye, R. K., Sawada, J., Martin,
the optimized pre-charge level and virtual ground concepts.
in Fig. 10. For fair comparison, each corresponding bit-line,
cted
5°C,
µA
0
−300
Adv. Radio
−600
900

Sci.,DD9, 247–253, 2011

 DD
A. K., Kinoshita, K., Gebara, F. H., Agarwal, K. B., Acharyya,
D. J., Haensch, W., Hosokawa, K., and Jamsek, D.: A 5.3GHz
8T-SRAM with Operation
Down to 0.41V in 65nm CMOS, in:
www.adv-radio-sci.net/9/247/2011/
Symposium on VLSI Circuits, pp. 252–253, 2007.
Cosemans, S., Dehaene, W., and Catthoor, F.: A Low-Power Embedded SRAM for Wireless Applications, IEEE Journal of Solid-
R
A
C
V
Chan
A
D
8T
Sy
Cose
be
St
Qazi
51
A
Vo
So
Take
N
Se
C
C
Verm
U
Jo
T. Heselhaus and T. G. Noll: A sensing circuit for single-ended read-ports of SRAM cells
References
Chang, L., Fried, D. M., Hergenrother, J., Sleight, J. W., Dennard,
R. H., Montoye, R. K., Sekaric, L., McNab, S. J., Topol, A. W.,
Adams, C. D., Guarini, K. W., and Haensch, W.: Stable SRAM
Cell Design for the 32 nm Node and Beyond, in: Symposium on
VLSI Technology, 128–129, 2005.
Chang, L., Nakamura, Y., Montoye, R. K., Sawada, J., Martin,
A. K., Kinoshita, K., Gebara, F. H., Agarwal, K. B., Acharyya,
D. J., Haensch, W., Hosokawa, K., and Jamsek, D.: A 5.3 GHz
8T-SRAM with Operation Down to 0.41 V in 65 nm CMOS, in:
Symposium on VLSI Circuits, 252–253, 2007.
Cosemans, S., Dehaene, W., and Catthoor, F.: A Low-Power Embedded SRAM for Wireless Applications, IEEE Journal of SolidState Circuits, 42, 1607–1617, 2007.
www.adv-radio-sci.net/9/247/2011/
253
Qazi, M., Stawiasz, K., Chang, L., and Chandrakasan, A.: A
512 kb 8T SRAM Macro Operating Down to 0.57V with An
AC-Coupled Sense Amplifier and Embedded Data-RetentionVoltage Sensor in 45 nm SOI CMOS, in: IEEE International
Solid-State Circuits Conference, 350–351, 2010.
Takeda, K., Hagihara, Y., Aimoto, Y., Nomura, M., Uchida, R.,
Nakazawa, Y., Hirota, Y., Yoshida, S., and Saito, T.: Per-Bit
Sense Amplifier Scheme for 1GHz SRAM Macro in Sub-100 nm
CMOS Technology, in: IEEE International Solid-State Circuits
Conference, 1, 502–542, 2004.
Verma, N. and Chandrakasan, A. P.: A High-Density 45 nm SRAM
Using Small-Signal Non-Strobed Regenerative Sensing, IEEE
Journal of Solid-State Circuits, 44, 163–173, 2009.
Adv. Radio Sci., 9, 247–253, 2011