CURRENT MODE DIFFERENTIAL ANALOG VITERBI DECODER
Cyril Prasanna Raj P.
Professor, MSEC, Bangalore
cyrilyahoo@gmail.com
S. L. Pinjare
Professor, ECE Dept., RITM, Bangalore
Sl_pinjare@gmail.com
ABSTRACT: Differential analog Viterbi decoding is a powerful Forward-Error-Correction (FEC)
technique for digital communications. The Viterbi decoder is widely used in space, satellite,
CDMA, digital, PCS and DVB systems, and hence there is a need for an efficient, low-power
implementation of the Viterbi decoding algorithm. In this paper, a modified differential decoder
architecture is proposed and implemented using 130nm CMOS technology. The output samples from
the sample and hold circuit are converted to current and then processed. The advantage of working
in current mode is a reduction in the number of transistors. The add-compare-select logic is
optimized by replacing the adder logic with an optimized adder that uses 24 transistors. The
encoded data is QPSK modulated and the decoder logic operates on the I and Q channels. The major
computation blocks, BMC and ACS, are also implemented with current mirrors, which helps in
reducing power and area as well as computational complexity. The proposed design is modeled in
Matlab and verified for its functionality; the design is then captured using Cadence Virtuoso and
functionally verified for logic correctness. The differential logic operates at a maximum data
rate of 200 Mbps and consumes 10 times less power than the analog logic design. The designed
decoder is suitable for low power applications.
Key Words: Analog VLSI, Differential Viterbi Decoder, Current Mode, High Speed, Error Coding
1. INTRODUCTION
A channel encoding scheme such as convolutional coding is adopted to encode the message, and at
the receiver the Viterbi decoding algorithm is used to extract the message from the received,
corrupted signal. The computational complexity of the Viterbi decoder (VD) restricts its use in
real-time applications: decoding a message requires a large number of iterations, and hence the
algorithm is time consuming. Several architectures, and improvements to existing architectures,
have been proposed and adopted for the VD. VLSI platforms such as ASICs and FPGAs have been used
for high-speed implementation of the VD algorithm. The VD algorithm can be implemented using
either analog or digital circuits. Digital implementations of the VD on FPGAs and DSP processors
consume considerable power, and hence dedicated ASICs have been shown to provide better
performance in terms of area, power and speed. However, an all-digital implementation of the
Viterbi decoder (DVD) on an ASIC platform can result in large-area, power-hungry designs (Jia et
al 1999, Sridharan & Carley 2000), as it requires a large number of intermediate memories. As an
alternative to digital Viterbi decoding, analog decoding is gaining importance. In the analog
domain, computation of the VD requires fewer transistors and logic circuits, which reduces size
and power consumption. The analog approach was demonstrated in the 1990s, when analog VDs (AVD)
began to appear in the literature (Mathews & Spenser 1993, Shakiba et al 1998, He & Cauwenberghs
2000, Demosthenous & Taylor 2002, Zand & Johns 2002, Kim et al 2005). In parallel, the advantages
of the analog approach have been exploited in the design of maximum a posteriori (MAP) decoders
for tail-biting trellis and
MSJETR, MSEC, Bangalore
Hamming codes (Lustenberger et al 1999, Moerz et al 2000, Winstead et al 2006), and in other
iterative decoders for various block codes (Winstead et al 2004, Hemati et al 2006). It should be
noted that the analog decoders in (Lustenberger et al 1999, Moerz et al 2000, Winstead et al
2006, Winstead et al 2004, Hemati et al 2006) are soft-input, soft-output decoders (Howard et
al 2006). In an analog VD the digital path memory accounts for more than 50% of the die area
(Demosthenous & Taylor 2002). Several analog circuits have been developed to realize certain
sections of the decoder, such as the branch metric calculation (BMC) and add-compare-select (ACS)
units (Shakiba et al 1998, He & Cauwenberghs 2000). Acampora & Gilmore (1978) suggested an
analog Viterbi decoder using sample-and-hold circuits and voltage adders to store and update the
path metrics. Their work addressed several nonlinear effects in voltage-mode analog circuits,
such as amplifier voltage offset, loop gains differing from unity, and nonlinear compression
within analog devices. Demosthenous & Taylor (2001) realized the minimum Euclidean distance
decoder in a current-mode analog circuit, using a switched-capacitor circuit as a front-end
sample-and-hold block to store the current value that represents the previous path metric. He &
Cauwenberghs (2000) implemented the minimum Hamming distance decoder with a current-mode analog
circuit based on a switched-capacitor and a winner-take-all circuit. Since it uses the Hamming
distance, it is a hard-decision Viterbi decoder and thus does not take advantage of having
continuous signals. Wen-Ta Lee et al (2006) report an analog decision device chip in UMC 0.18-µm
1P6M CMOS technology. This chip contains 494 transistors, operates at up to 100 Mb/s and consumes
17.46 mW. The chip area of the analog decision device is about 0.544 mm2. It has the advantages
of low power and small area, and is easy to combine with the RF front-end receiver. Demosthenous
& Taylor (2002) report a 4-state rate-1/2 analog convolutional decoder fabricated in 0.8-µm CMOS
technology that operates at data rates up to 115 Mb/s and consumes 39 mW at that rate from a
single 2.8-V power supply. The die has a core area of 1 mm2, of which about 1/3 contains the
analog section. This work focuses on the design and implementation of a current mode analog
Viterbi decoder for a mixed signal OFDM demodulator. The building blocks of the decoder are
optimized for area by reducing the number of transistors, and the operating speed is improved
through the design of optimum transistor geometries and current buffers. The design is carried
out using 130nm CMOS technology.
Differential analog Viterbi decoding, presented by Maunu et al (2008), is a powerful
Forward-Error-Correction (FEC) technique for digital communications. A differential analog
Viterbi decoder allows the high-speed, power-consuming A/D converter to be excluded, because the
input data stream of the Viterbi decoder is inherently an analog quantity. Viterbi decoders are
widely used in space, satellite, CDMA, digital, PCS and DVB systems. The differential
architecture enables the trace back memory to be excluded and makes online decoding possible
after the initial transitional stages. It uses twice the number of parallel states compared with
the analog Viterbi decoder; hence the architecture needs to be optimized. The need for efficient,
low-power implementation of the Viterbi decoding algorithm prompts alternative VLSI solutions.
Section 2 discusses the proposed differential analog Viterbi decoder, Section 3 discusses the
results, and the conclusion is presented in Section 4.
2. DESIGN OF DIFFERENTIAL ANALOG VITERBI DECODER
Differential analog Viterbi decoding is a powerful Forward-Error-Correction (FEC) technique for
digital communications. A Viterbi decoder outputs a 0 or a 1 based on an estimate of the input
signal. With increasing data rates, eliminating the high-speed, power-consuming A/D converter
has made the analog Viterbi decoder a promising alternative to its digital counterparts, since
analog processing enables the analog-to-digital converter to be excluded from the decoder
realization. Moreover, high-speed operation can be achieved via differential processing. Figure 1
shows the top level architecture of the modified differential analog Viterbi decoder.
[Block diagram: Analog Input, Sample and Hold, Voltage to Current converter, BMC, two ACS units, WTA, Decoded output, Clk Gen Circuit]
Figure 1 Modified block diagram of Differential Viterbi Decoder
In the proposed design, decoding is performed in current mode by converting the voltage to
current, which results in high speed and savings in power consumption and silicon area. The
add-compare-select (ACS) unit architecture proposed by Tomatsopoulos & Demosthenous (2008) is
based on the modified feedback decoding algorithm; in this work the ACS unit architecture is
based on differential signaling. The sub circuits of the Viterbi decoder proposed by
Demosthenous & Taylor (2002), namely the Sample and Hold (S/H) circuit, Branch Metric Unit (BMU),
ACS unit and Winner Take All (WTA) circuit, are modified to operate at high frequency while
optimizing the number of transistors.
The BMC, ACS and WTA are designed to operate on current samples. In the branch metric
calculation, the channel symbols are compared with all possible code words by calculating the
Euclidean distance. The add-compare-select unit updates the probabilities of state transitions
according to the new branch metrics, compares the two competing probabilities (i.e. paths)
entering the block and selects the most probable path, the one with the smallest distance. A
detailed discussion of the design of the sub circuits for the proposed current mode differential
analog Viterbi decoder (DAVD) is presented in the subsequent sections.
2.1 Sample and Hold
The sample and hold (S/H) circuit forms the front end of the Viterbi decoder. The incoming data
from the channel is sampled, held on the internal capacitor and transferred to the branch
metric unit. Shakiba et al (1998) have proposed the ping-pong sample and hold circuit shown in
Figure 2.
Figure 2 S/H circuit with intermediate source follower
The circuit consists of two S/H sub circuits that operate in parallel, controlled by phase-shifted
nonoverlapping clock signals (Φ1 and Φ2). Each S/H consists of a capture module and a transfer
module realized using NMOS transmission gates, with a PMOS source follower sandwiched between
them. The input data m1(k) is captured during clock phase Φ1 and is transferred to the next stage
during clock phase Φ2. In every clock cycle TACS two symbols are captured, as the phase-shifted
clock signals have clock periods of TACS/2. The S/H circuit is a BiCMOS implementation in which
the output stage consists of BJTs to boost the driving current. The source followers perform
level shifting and prevent discharge of the captured sample due to MOS parasitics. Demosthenous
and Taylor (2002) have proposed the fully differential front-end sample and hold (FE-S/H)
circuit shown in Figure 3.
Figure 3 One half of the FE-S/H circuit
The circuit has two parallel S/H circuits that process the differential input signals Vin+ and
Vin-. Each S/H consists of four stages of transmission gates that perform the capture and
transfer process. The first stage is controlled by nonoverlapping phase-shifted clock signals
(Φ1a, Φ1b, Φ2a, Φ2b); the second stage is controlled by inverted clock signals (Φ1, Φ2). The
output stage consists of source follower circuits that perform level shifting and provide lower
output impedance. The limitations of the S/H circuit proposed by Demosthenous and Taylor (2002)
are that the intermediate capacitor discharges into the parasitic capacitance of the transmission
gates, so fast switching is required, and that, being fully differential, it requires 28
transistors. The S/H circuit proposed by Shakiba et al (1998) is single ended, requires 10
transistors and avoids leakage of the captured data by means of the intermediate source follower
circuit. The proposed fully differential current mode Viterbi decoder requires a differential
S/H circuit that limits charge leakage with a reduced number of transistors; a novel S/H circuit
is therefore proposed that combines the advantages of both S/H circuits. In the proposed circuit,
the S/H circuit of Shakiba et al (1998) is duplicated to operate on differential inputs (Vin+ and
Vin-), and the output stage consists of a source follower stage to perform level shifting and
reduce output impedance. The proposed S/H circuit is shown in Figure 4.
[Schematic: transistors M1-M32 with inputs VControl, Vbias1, Vbias2 and Vbias3]
Figure 4 Proposed S/H circuit with differential signaling
The transistor widths (Wp/Wn) are set to 1 µm/0.3 µm, the hold capacitor is set to 1 pF and the
sampling clock is set to operate at 200 MHz. The schematic is captured and simulated in
Virtuoso. The simulation result is shown in Figure 5: an input signal of 100 mV at a frequency of
100 kHz is sampled with the 200 MHz clock, and the output samples are captured on every positive
edge of the clock.
Figure 5 Sample and hold circuit
2.2 Voltage to Current Converter
The outputs of the sample and hold circuit (Vr1+, Vr2+, Vr1-, Vr2-) are converted to current with
the voltage to current converter shown in Figure 6 (a). The circuit has differential inputs that
are connected to the differential outputs of the S/H, Vrx+ and Vrx-. The differential input
controls the output current in the load circuit. Since the output current is generated at the
output terminal of the differential amplifier by the current mirror circuit, the maximum current
is governed by Io = gm*Vin/2 (2.4 µA). The transistor geometries of the voltage to current
converter are set to 1 µm/0.3 µm, and a test input signal of 100 kHz is applied at the
differential input with a 200 mV bias. The simulation results are shown in Figure 6 (b); the
differential amplifier successfully converts the voltage to a corresponding current in the range
1.2 µA to 2.4 µA.
Figure 6 Voltage to current converter (a) Schematic (b) Simulation results
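The relation Io = gm*Vin/2 quoted for the converter can be checked numerically. The sketch below is only an illustration: the function name, the transconductance value and the input amplitudes are assumptions chosen so that the result lands in the reported 1.2 µA to 2.4 µA range; they are not taken from the design.

```python
def v2i_output_current(gm, vin):
    """Output current of the converter, Io = gm * Vin / 2, in amperes."""
    return gm * vin / 2.0

# Hypothetical operating point: gm and the input amplitudes are assumed values,
# picked only so that the result falls in the reported current range.
gm = 48e-6                       # transconductance in siemens (assumed)
io_max = v2i_output_current(gm, 0.10)   # 100 mV differential input (assumed)
io_min = v2i_output_current(gm, 0.05)   # 50 mV differential input (assumed)

print(f"{io_max * 1e6:.2f} uA")  # 2.40 uA
print(f"{io_min * 1e6:.2f} uA")  # 1.20 uA
```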
2.3 Branch Metric Unit
The branch metric unit computes the Euclidean distance between the received symbol and the
possible trellis outputs. As there are four states and each state has two incoming paths,
computing the Euclidean distance between the incoming symbol and the possible outputs produces
four possible outputs. For example, if the received sequence is 01 and the possible outputs are
10 and 01, computing the Euclidean distance gives 2 (the distance between 01 and 10) and 0 (the
distance between 01 and 01). At every state the possible branch metric output is {0, 1, 2}. The
incoming signal is BPSK modulated, with two symbols per clock. The Euclidean distance is computed
by reversing the direction of the output currents in the PMOS current mirrors according to the
incoming signals. Demosthenous and Taylor (2002) have presented the transconductor-based branch
metric computation block shown in Figure 7.
Figure 7 Branch metric unit with cascode current mirror
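The worked example in the branch metric discussion (received 01 against the candidates 10 and 01) can be reproduced with a few lines of Python. This is a behavioural sketch only, not a model of the current mode circuit; the function name branch_metric is hypothetical, and the squared Euclidean distance is assumed, which for symbols restricted to 0 and 1 coincides with the Hamming distance.

```python
def branch_metric(received, expected):
    """Squared Euclidean distance between a received symbol pair and a candidate output.

    For values restricted to 0 and 1 this equals the Hamming distance.
    """
    return sum((r - e) ** 2 for r, e in zip(received, expected))

received = (0, 1)
candidates = [(1, 0), (0, 1)]
metrics = [branch_metric(received, c) for c in candidates]
print(metrics)  # [2, 0]
```

The same function accepts soft (non-integer) received values, which is where an analog decoder differs from a hard-decision one.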
The BMC unit drives PMOS-based high swing cascode mirrors; four cascode mirrors produce four
output currents, one for each add-compare-select unit. The current flow in the cascode mirror
circuits needs to be stable for fast switching. To improve the current driving capability of the
transconductors and to maintain a constant current across all four cascode current mirror
circuits, an additional current bias circuit (MB1, MB2, MB3, MB4) is integrated so that there is
always a constant bias current. This improves the current driving capacity of the BMC and also
increases the speed of operation. The constant current IBP that flows in the current mirror is
controlled by the bias voltage VB. Figure 8 shows the proposed modified BMC circuit with improved
current driving capacity.
Figure 8 Modified BMC with constant bias current
Figure 9 Branch metric unit schematic
Figure 9 shows the simulation results of the modified BMU architecture, which is designed to
stabilize the bias current and thereby improve the frequency of operation. A constant bias
current of 200 nA drives the ACS unit, and the modified BMC unit modulates this current based on
the input symbols received. The simulation results show the input signals, the constant bias
current and the modulated bias current. The current samples are sent to the add-compare-select
unit.
2.4 Add-Compare-Select Unit
The ACS unit is one of the primary building blocks of the analog Viterbi decoder. As there are
four states, every state has one ACS unit that adds the branch metric to the corresponding path
metric, compares the resulting path metrics and selects the path metric with the minimum
accumulated error. Figure 10 (a) shows the trellis path for every ACSU and Figure 10 (b) is the
functional diagram of the ACSU.
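[Diagram: (a) trellis transitions from time t-1 to time t between path metrics Px and Py, with branch metrics BMxx, BMxy, BMyx and BMyy; (b) two adders followed by Compare, Select and Store blocks, with the stored metric passed to the next ACSU]
Figure 10 ACS unit (a) Trellis path for every ACSU (b) Functional diagram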
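The add, compare and select steps of the functional diagram can be summarized in a short behavioural sketch. The function name acs and the numeric values are hypothetical; in the circuit the addition is a summation of currents at a node and the comparison is performed by the current comparator, whereas here plain numbers stand in for the currents.

```python
def acs(pm_x, bm_x, pm_y, bm_y):
    """Add-compare-select for one state with two incoming paths.

    Returns the surviving path metric and the index of the selected path
    (0 for the x path, 1 for the y path).
    """
    cand_x = pm_x + bm_x      # add: previous path metric plus branch metric
    cand_y = pm_y + bm_y
    if cand_x <= cand_y:      # compare: smaller accumulated error wins
        return cand_x, 0      # select the x path
    return cand_y, 1          # select the y path

new_metric, decision = acs(pm_x=3, bm_x=2, pm_y=4, bm_y=0)
print(new_metric, decision)  # 4 1
```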
Figure 11 ACS unit
Each ACS unit consists of a replicating current comparator (RCC) and a switched current (SI) path
metric memory. The branch metric current and the path metric current (discussed below) are
accumulated before entering the ACSU; as the design is current mode, the addition of two currents
is obtained by summing the currents of two branches at a node. Currents I1 and I2 enter high
swing sub circuits realized using NMOS transistors M1-M4 and M5-M8, as shown in Figure 11. A
current sensing latch is triggered by the output of the high swing current mirror circuits. The
latch is formed by the cross-connected transistors M9 and M10, and M15 resets the latch every
decoding cycle. The circuit operation is triggered by the switching of M15; the currents entering
the high swing circuits trigger the latch through M9 and M10. If I1 is greater than I2, the
voltage at node A is VSS; otherwise it is VDD. The node voltages at A and A’ drive the current
steering switches M16, M17 and M18, M19. The current steering switches mirror the input currents
I1 and I2 into the PMOS mirror circuits M20-M23 driving the SI path metric memory. The SI path
metric memory consists of a pair of memory cells represented by M28-M31 and M32-M35. The memory
cells are controlled to operate in ping-pong fashion with the nonoverlapping clock pulses Φ1 and
Φ2. The output of the RCC is connected to the memory cells through M24 and M25. The outputs of
the memory cells are connected to the high swing PMOS mirror M36-M43 through transistors M26 and
M27. The nonoverlapping clock pulse during phase 1 triggers the left memory cell; the output from
the memory cell is connected to the high swing PMOS mirror during phase 2. The selection or
direction of the current flowing in the PMOS mirror depends upon the input currents I1 and I2.
The memory output also drives the RCC, forming a feedback loop. A detailed discussion of the ACS
unit is presented in Tomatsopoulos & Demosthenous (2008). One of the limitations of this circuit
is the delay in the differential current steering switches M16, M17 and M18, M19 due to the
presence of the latch. When the latch is RESET by switching on M15, both nodes A and A’ are at a
common potential and hence M16 and M18 are switched off. The current flowing in the high swing
PMOS mirror is then constant or zero. When M15 is switched off during the next clock interval,
the transistors M16 and M18 are biased and drive the high swing PMOS mirror circuits; this causes
a delay, as the current in the circuit has to rise to its maximum. To overcome this delay,
transistors M16a and M18a are connected across M16 and M18 as shown in Figure 12.
Figure 12 Modified ACSU to reduce delay during RESET
The inclusion of M16a and M18a ensures that the current in the current steering switches remains
constant and mirrors I1 and I2 even during the RESET operation. When M15 is switched off, the
circuit performs its normal operation and drives the high swing PMOS mirror circuits. The
modified ACS unit schematic is captured using Virtuoso; the sub circuits are integrated and the
top-level schematic is shown in Figure 13. Figure 14 shows the simulation results of one of the
four ACS units.
Figure 13 Top level schematic of ACS unit
Currents I1 and I2, generated from current sources, are applied as inputs to the ACSU. The
voltage signals at the latch outputs A and A’ are verified: depending upon the currents I1 and
I2, A and A’ produce complementary outputs as shown in Figure 14. The latch output drives the SI
unit as well as the differential current steering switches. The final output from the ACSU, which
is fed back into the path metric unit, is shown in the simulation results, confirming the logic
correctness of the proposed ACSU.
Figure 14 Simulation results of ACS unit
2.5 Winner-Take-All (WTA)
The WTA architecture consists of replicating current comparator circuits and a CMOS latch.
Tomatsopoulos and Demosthenous (2008) have proposed a WTA circuit with six 2-input RCC circuits
and a CMOS latch arranged in a tree structure; the circuit is designed for Viterbi decoding based
on the modified feedback decoding algorithm (MFDA). The WTA circuit is shown in Figure 15.
Figure 15 WTA circuit with tree structure
This work presents a current mode differential analog decoder based on the direct decoding
method, and the WTA circuit proposed by Tomatsopoulos and Demosthenous (2008) is customized to
the differential-signaling ACS unit proposed here. In their work the decoding algorithm is split
into a double trellis and is based on the modified feedback decoding algorithm. In this work two
ACS units are designed that operate on differential signal inputs; Figure 16 shows the circuit
schematic of the proposed WTA circuit.
Figure 16 Modified WTA circuit with differential signaling
The modified WTA circuit consists of an NMOS RCC circuit that processes the four ACSU currents
and selects the competing current, which is passed to the PMOS RCC circuit; the latter selects
the final output current, which is stored in the CMOS latch. The control signals FWTA are
appropriately sequenced to control the data flow into the tree structure. Figure 17 shows the
NMOS RCC, which is an inverted version of the RCC used in the ACS circuit shown in Figure 12.
Idc1 and Idc2 are the two input currents that drive the high swing current mirrors (M1-M4 and
M7-M10), and the outputs of the current mirrors drive the differential pair M5 and M6. If Idc1 >
Idc2, M5 is switched ON and current flows from VDD through M12 and M5, switching off transistor
M14. With M14 switched off, the current Iomin flows through M13, thus selecting the minimum of
Idc1 and Idc2.
[Schematic labels: M1-M4 and M7-M10 are high swing current mirrors; differential pair M5 and M6; transistors M11-M14; control signal FWTA]
Figure 17 NMOS RCC circuit schematic captured in Virtuoso
The PMOS replicating current comparator circuit for the load current decision is shown in Figure
18. The outputs of the NMOS RCC are potential winners and are processed by the PMOS RCC for the
WTA comparison. The circuit operates similarly to the NMOS RCC, and its schematic is
complementary to that of the NMOS RCC: current sinks are replaced with current sources, and the
minimum of the two input currents drives the next stage in the WTA tree. The PMOS RCC circuit is
used to reduce the winners from the previous level to two potential candidates for the winning
current path metric. Fwta1 is set high to compute the first decision from the NMOS RCC; when
Fwta2 is high the PMOS RCC computes the WTA current. At the last stage of the WTA tree a CMOS
latch stores the path with the smallest metric.
Figure 18 PMOS RCC schematic in Virtuoso
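The tree-structured selection performed by the WTA stage can be illustrated with a behavioural sketch in which each 2-input comparator forwards the smaller of its two input currents. The names rcc_min and wta_tree are hypothetical, the input values are the Idc1 to Idc8 settings listed in Table 5, and the code models only the selection logic, not the NMOS/PMOS circuits or the FWTA sequencing.

```python
def rcc_min(a, b):
    """Two-input comparator stage: forward the (index, current) pair with the smaller current."""
    return a if a[1] <= b[1] else b

def wta_tree(currents):
    """Reduce a non-empty list of currents pairwise, stage by stage, to the minimum.

    Returns (index, current) of the winning path.
    """
    stage = list(enumerate(currents))
    while len(stage) > 1:
        nxt = [rcc_min(stage[i], stage[i + 1]) for i in range(0, len(stage) - 1, 2)]
        if len(stage) % 2:            # an unpaired input passes straight to the next stage
            nxt.append(stage[-1])
        stage = nxt
    return stage[0]

# Currents in microamperes, as listed for Idc1..Idc8 in Table 5.
winner = wta_tree([1, 2, 3, 4, 5, 6, 7, 8])
print(winner)  # (0, 1)
```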
2.6 Clock Generator Circuit
The clock generator circuit generates non-overlapping clock signals. The circuit schematic of the
clock generator is shown in Figure 19 (a); the incoming master clock is processed to generate
multiple clock signals. Mod counters built from D flip-flops are used for clock division, and
buffers (realized using two inverters) are connected in sequence to delay the clock signals
according to the sequence required to control the Viterbi decoder sub circuits. Figure 19 (b)
presents the simulation results of the generated clock sequences.
Figure 19 Clock generation (a) Circuit schematic (b) Simulation results
2.7 Biasing Circuit
The biasing circuit generates the necessary bias voltages. It is included to provide on-chip
biasing derived from the power supply, as shown in Figure 20.
Figure 20 Biasing circuit (a) Circuit schematic (b) Simulation results
The four bias voltages are set by the external reference current source. The biasing circuit
takes an input current of 100 µA with a voltage reference Vref of 0.65 mV and generates the bias
voltages Vbias3, Vbias2 and Vbias1 of 1.65 V, 1.7 V and 1.8 V respectively. These bias voltages
are required by the add-compare-select circuit, the branch metric unit and the clock generator
circuit.
The DAVD logic is implemented in 130nm CMOS technology using the TSMC standard cell library. To
optimize the layout for area, layout techniques such as common centroid and inter-digitization
are adopted. Optimum choice and sizing of drivers improves the driving capability of the sub
systems, and transistor sizing and ordering minimizes the delay in the sub systems. To drive
large output capacitances, the transistor geometries are set between 1 µm and 2 µm. As the
transistor width increases, the layout of large-geometry transistors is carried out using
multiple smaller transistors connected in parallel instead of a single transistor. The minimum
transistor geometry is chosen to be 40 nanometer. The sub blocks designed for the DAVD are
integrated, and the top level block diagram of the DAVD architecture is shown in Figure 21. The
schematics are integrated using appropriate driver circuits for minimum delay. The integrated
design is simulated and the functionality of the DAVD is analyzed.
Figure 21 Top level architecture of proposed Viterbi Decoder
3. RESULTS AND DISCUSSION
The DAVD sub blocks that have been modified to operate on current input samples are integrated
to form the top level architecture. The DAVD accepts convolutionally encoded binary symbols and
finds the maximum likelihood sequence. A convolutional encoder with coding rate R = 1/2 and
constraint length K = 3, chosen to match the desired data rate, is used with the generator
polynomials G0 = 101 and G1 = 111. To test the functionality of the integrated DAVD, the message
data [100111011] is encoded using the convolutional encoder and the encoded data is modulated
using QPSK. The modulated data along with noise (AWGN at 5 dB SNR) is provided as one of the test
vectors to the DAVD. Sinusoidal signals of varied frequencies are also used as test vectors for
the integrated DAVD; those simulation results were discussed in the previous section. Performance
factors such as maximum operating frequency, delay and power consumption are reported in this
section. The DAVD model is first verified against a software reference model developed in Matlab.
Figure 22 plots the input message signal and the encoded message data. The encoded symbol rate is
twice the information symbol rate.
Figure 22 Input message and encoded data
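The rate-1/2, K = 3 encoding of the test message can be sketched as follows. This is an illustrative Python sketch rather than the Matlab reference model used in the paper; the function name conv_encode is hypothetical, and it assumes that the leftmost bit of each generator polynomial taps the current input bit and that the encoder starts in the all-zero state.

```python
def conv_encode(bits, generators=(0b101, 0b111), constraint_length=3):
    """Rate-1/2 convolutional encoder built from modulo-2 (XOR) adders.

    Assumed convention: the most significant generator bit taps the current input,
    and the shift register starts in the all-zero state.
    """
    state = 0                      # previous K-1 input bits, newest in the upper bit
    encoded = []
    for u in bits:
        reg = (u << (constraint_length - 1)) | state   # [current, previous, one before]
        for g in generators:
            encoded.append(bin(reg & g).count("1") % 2)  # parity of the tapped bits
        state = reg >> 1           # shift: the current bit becomes the newest stored bit
    return encoded

message = [1, 0, 0, 1, 1, 1, 0, 1, 1]
encoded = conv_encode(message)
print(encoded)       # [1, 1, 0, 1, 1, 1, 1, 1, 1, 0, 0, 1, 1, 0, 0, 0, 1, 0]
print(len(encoded))  # 18
```

The 9 message bits produce 18 coded bits, in line with the statement that the encoded symbol rate is twice the information symbol rate.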
The binary symbols are first convolutionally encoded using modulo-2 (XOR) adders, with rate 1/2
and constraint length K = 3, and the encoded symbols are then modulated using the QPSK modulation
scheme. AWGN is added to the modulated data, which is then transmitted. The term
-10*log10(Nsamp) is used to scale the noise power with the oversampling. The term
-10*log10(1/code rate) is used to scale the noise power to match the coded symbol rate. The
in-phase and quadrature components of the noiseless QPSK signal are plotted. The QPSK-modulated
convolutional code, consisting of I and Q signals, is transmitted; at the receiver the signal is
demodulated and the resulting symbols, along with noise, are fed to the Viterbi decoder. The
decoder logic decodes the message; in this work a trace back length of 32 is chosen to decode
the 32 symbols received. Figure 23 shows the Viterbi decoder results. The decoded symbols are
plotted as blue stems with circles, while the original (unencoded) symbols are plotted as red
stems. The bit error is computed from the input message and the decoded message data. The bit
error rate is found to be 10^-5 at 5 dB SNR.
Figure 23 Decoded symbols of convolutional codes
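For reference, the decoding step can be sketched in software as a conventional Viterbi decoder that keeps a survivor path per state. This is not the differential analog architecture of this work and not the Matlab model; the function name viterbi_decode is hypothetical, and the sketch assumes the G0 = 101, G1 = 111 code with the leftmost generator bit tapping the current input, an all-zero starting state and no trellis termination. The coded sequence below is the encoding of the test message [100111011] under those assumptions.

```python
def viterbi_decode(received, generators=(0b101, 0b111), constraint_length=3):
    """Viterbi decoding with squared Euclidean branch metrics and per-state survivors."""
    n_out = len(generators)
    n_states = 1 << (constraint_length - 1)
    inf = float("inf")
    path_metric = [0.0] + [inf] * (n_states - 1)    # encoder assumed to start in state 0
    survivors = [[] for _ in range(n_states)]
    for t in range(0, len(received), n_out):
        symbol = received[t:t + n_out]
        new_metric = [inf] * n_states
        new_survivors = [None] * n_states
        for state in range(n_states):
            if path_metric[state] == inf:
                continue                            # state not reachable yet
            for u in (0, 1):
                reg = (u << (constraint_length - 1)) | state
                expected = [bin(reg & g).count("1") % 2 for g in generators]
                bm = sum((r - e) ** 2 for r, e in zip(symbol, expected))  # branch metric
                nxt = reg >> 1
                cand = path_metric[state] + bm      # add
                if cand < new_metric[nxt]:          # compare
                    new_metric[nxt] = cand          # select
                    new_survivors[nxt] = survivors[state] + [u]
        path_metric, survivors = new_metric, new_survivors
    best = min(range(n_states), key=lambda s: path_metric[s])
    return survivors[best]

message = [1, 0, 0, 1, 1, 1, 0, 1, 1]
coded = [1, 1, 0, 1, 1, 1, 1, 1, 1, 0, 0, 1, 1, 0, 0, 0, 1, 0]
corrupted = list(coded)
corrupted[3] ^= 1                                   # inject a single bit error

print(viterbi_decode(coded) == message)      # True
print(viterbi_decode(corrupted) == message)  # True
```

Because the branch metric is a squared distance, the same function also accepts soft received values instead of hard 0/1 decisions.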
The functionally correct decoder is designed for optimum area, power and speed performance. The
design schematic, captured using Cadence Virtuoso targeting 130nm CMOS technology, is simulated,
and the performance parameters of the sub systems are discussed in detail. Table 1 shows the
performance parameters of the S/H circuit. The transistor geometries are designed for optimum
drive strength and maximum frequency of operation. The S/H operates at a clock period of 5 ns and
holds the data for a duration of 50 ns. The power consumption is about 14 mW and the maximum
frequency of operation is 270 MHz with a load capacitance of 100 pF.
Table 1 Sample and hold circuit parameters
Tcount   Wp/Wn (m)   Cap-Value   Vin (sin) V/Freq   Clk V/Time   Power
10       1µ/300n     1pF         100mV/100K         1.2/5ns      140µW
Table 2 shows the performance factors of the V2I converter. The transistor geometries are the
same as those of the S/H circuit. The V2I is simulated for a duration of 6 clock cycles
(T-count). The input voltage applied to the V2I from the S/H circuit is converted to a current
with a maximum of 2.4 µA. The transistor geometries are designed to operate at maximum frequency.
Table 2 Voltage to current converter circuit parameters
T-count   Wp/Wn (m)   Vin1/Freq   Vin2/Freq   Vbias         Iout
6         1µ/300n     2V/10K      2V/10K      200mV/100ns   2.4µA
Table 3 shows the BMU parameters. The ACSU receives two outputs generated by the BMU, which are
given to the differential ACSU logic. Input signals Vrm and Vrp are applied at two different
carrier frequencies, 1 kHz and 10 kHz; the input signals are processed into two sets of current
output samples that are fed into the ACSU. The bias voltage is set to 200 mV. The delay of the
BMU is 41.2 ns, its maximum operating frequency is 232 MHz, and its power dissipation is found to
be 340 µW.
Table 3 Branch metric unit circuit parameters
Tcount   Wp/Wn     Ibm (A)   Vrp (V/Hz)   Vrm (V/Hz)   Vbias3   To ACSU1 (A)   To ACSU2 (A)   Power (µW)
15       1µ/300n   1.42nA    1.2V/1K      1.2V/10K     200mV    9.2 pA         151 pA         340
In the differential decoder logic, two ACSUs operate in parallel, and the transistor geometries
are chosen for optimum performance. The maximum operating frequency of the ACSU is 221 MHz and
its power consumption is less than 800 µW. Table 4 shows the performance parameters of the ACSU.
Table 4 ACS unit parameters
T-count | Wp/Wn   | Idc1 | Idc2 | Idc3  | Idc4  | Add1 | Add2  | Compare Min (Add1+Add2) | Select
15      | 1µ/300n | 1 µA | 2 µA | 10 µA | 20 µA | 3 µA | 30 µA | 3.95 nA, 4.8 nA         | 4.5 nA
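The add stage of the ACSU can be illustrated with a purely behavioural sketch. It assumes, as the Add1 and Add2 columns of Table 4 suggest, that Add1 = Idc1 + Idc2 and Add2 = Idc3 + Idc4, and that the compare and select stages pick the smaller sum. The function name and the index convention are hypothetical, and the sketch does not attempt to reproduce the nA-level compare and select currents reported for the circuit.

```python
def add_compare_select(idc1, idc2, idc3, idc4):
    """Behavioural add-compare-select on four currents (all in microamps).

    Returns (add1, add2, survivor, index): the two sums, the smaller sum,
    and 0 or 1 indicating which sum was selected.
    """
    add1 = idc1 + idc2          # first candidate path metric
    add2 = idc3 + idc4          # second candidate path metric
    index = 0 if add1 <= add2 else 1
    survivor = add1 if index == 0 else add2
    return add1, add2, survivor, index


# Input currents from Table 4, expressed in microamps.
add1, add2, survivor, index = add_compare_select(1, 2, 10, 20)
print(add1, add2, survivor, index)
```

With the Table 4 inputs, the two sums are 3 µA and 30 µA, matching the Add1 and Add2 columns, and the smaller sum is the one selected.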
Table 5 presents the specifications of the winner-take-all (WTA) circuit. The circuit needs to
operate for 92 clock cycles, and the minimum and maximum currents are set to 1 µA and 5 µA,
respectively.
Table 5 WTA unit circuit parameters
T-count | Wp/Wn   | Idc (1-4)              | Idc (5-8)              | Min (I1-I4) | Min (I5-I8) | Compare output Min (Idc1-4, Idc5-8)
92      | 1µ/300n | 1 µA, 2 µA, 3 µA, 4 µA | 5 µA, 6 µA, 7 µA, 8 µA | 1 µA        | 5 µA        | 1 µA
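The two-stage selection tabulated for the WTA unit can be sketched behaviourally as follows. The sketch assumes that the eight input currents are split into two groups of four, that each group is reduced to its minimum, and that the two group minima are then compared; the function name is illustrative only.

```python
def wta_two_stage_min(currents_ua):
    """Two-stage minimum over eight currents given in microamps.

    Stage 1 finds the minimum of each group of four; stage 2 compares the
    two group minima. Returns (min_group1, min_group2, overall_min).
    """
    if len(currents_ua) != 8:
        raise ValueError("expected exactly eight input currents")
    min_group1 = min(currents_ua[:4])   # Min (I1-I4)
    min_group2 = min(currents_ua[4:])   # Min (I5-I8)
    return min_group1, min_group2, min(min_group1, min_group2)


# Idc1..Idc8 from Table 5, in microamps.
print(wta_two_stage_min([1, 2, 3, 4, 5, 6, 7, 8]))
```

For the Table 5 inputs this yields group minima of 1 µA and 5 µA and a final output of 1 µA, in agreement with the tabulated values.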
The clock generator, RCC and biasing circuit are designed and simulated to verify their
functionality. The transistor geometries are designed for optimum performance. The nonlinearities
of the analog circuits are compensated using matching circuits, and the widths of the NMOS and
PMOS transistors are chosen for symmetric layout design. Table 6 compares the performance of the
proposed DAVD with earlier analog Viterbi decoder (AVD) designs.
Table 6 Comparison of DAVD with AVD
Parameters             | Proposed DAVD | Demosthenous & Taylor (2002) | Tomatsopoulos and Demosthenous (2008)
Technology             | 0.13 µm CMOS  | 0.8 µm CMOS                  | 0.6 µm CMOS
Supply voltage         | 1.2 V         | 2.8 V                        | 3 V
Core area              | 0.2 mm²       | 1 mm²                        | 0.5 mm²
Maximum decoding speed | 200 Mbps      | 115 Mbps                     | >1 Mbps
Power dissipation      | < 1.2 mW      | 14.9 mW                      | -
BMC                    | 340 µW        | 1.9 mW                       | -
ACSU                   | 800 µW        | 9.4 mW                       | -
PMR                    | 110 µW        | 1.5 mW                       | -
Bias circuit           | 207 µW        | 2.5 mW                       | 2.45 mW
From the tabulated results, it is found that the operating speed of the proposed DAVD is improved
by 42% and the power dissipation is reduced by 92%.
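These percentages can be reproduced from the Table 6 entries for the proposed DAVD and the Demosthenous & Taylor (2002) design. The sketch below is only an arithmetic check: it assumes the 92% figure is the reduction relative to the 14.9 mW baseline, taking the 1.2 mW upper bound as the DAVD power, and that the 42% figure is the speed difference expressed relative to the proposed 200 Mbps. The helper name is illustrative.

```python
def percent_change(delta: float, reference: float) -> float:
    """Express delta as a percentage of the chosen reference value."""
    return 100.0 * delta / reference


# Table 6 values: proposed DAVD vs. Demosthenous & Taylor (2002).
speed_davd_mbps, speed_ref_mbps = 200.0, 115.0
power_davd_mw, power_ref_mw = 1.2, 14.9   # 1.2 mW is an upper bound for the DAVD

power_reduction = percent_change(power_ref_mw - power_davd_mw, power_ref_mw)
speed_gain_vs_davd = percent_change(speed_davd_mbps - speed_ref_mbps, speed_davd_mbps)
speed_gain_vs_ref = percent_change(speed_davd_mbps - speed_ref_mbps, speed_ref_mbps)

print(f"power reduction               : {power_reduction:.1f} %")
print(f"speed gain (relative to DAVD) : {speed_gain_vs_davd:.1f} %")
print(f"speed gain (relative to ref.) : {speed_gain_vs_ref:.1f} %")
```

Under these assumptions the power reduction is about 92% and the speed difference is 42.5% of the proposed 200 Mbps. Measured against the 115 Mbps baseline instead, the same difference corresponds to an increase of roughly 74%.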
4. Conclusion
Analog Viterbi decoders are realized using differential decoder logic. In this work, two
different ACSUs operating on differential signals are used to decode the encoded message
signals. The voltage output of the S/H circuit is converted to current samples, which are used
in decoding. The proposed design thus reduces the number of transistors and improves the
operating frequency. Its functionality is verified against Matlab models for various test
vectors. The transistor geometries are designed for optimum performance and the design is
captured using Cadence Virtuoso. The differential analog Viterbi decoder is implemented in
0.13 µm CMOS technology. The sample and hold unit consumes 140 µW of power and occupies an
area of 5×9 µm². The voltage to current converter draws 2.4 µA and occupies an area of
6×6 µm². The branch metric computation unit, implemented as a sub-block of the Viterbi decoder,
uses 151 pA of current and 15×14 µm² of area. The add-compare-select unit uses 4.8 nA of
current and 15×12 µm² of area. The designed DAVD is suitable for low-power, high-speed
applications. The circuit performance can be further enhanced with impedance matching circuits
at the outputs of the BMU and ACSU.
Acknowledgement: The authors would like to acknowledge MSEC, Bangalore for the support
extended in providing the resources and financial assistance for carrying out this work. The
authors would also like to acknowledge MATLAB and Cadence for the online support in modeling
and verifying the design.
REFERENCES
1. Acampora, A & Gilmore, R 1978, ‘Analog Viterbi Decoding for High Speed Digital Satellite
Channels’, IEEE Transactions on Communications, vol. 26, no. 10, pp. 1463-1470.
2. Demosthenous, A & Taylor, J 1996, Proceedings of the Third IEEE International Conference
on Electronics, Circuits and Systems (ICECS), vol. 1, pp. 33-36.
3. Demosthenous, A, Verdier, C & Taylor, J 1997, ‘A new architecture for low
power analogue convolutional decoders’, IEEE International Symposium on Circuits and
Systems, vol. 1, pp. 37-40.
4. Demosthenous, A & Taylor, J 2002, ‘A 100-Mb/s 2.8-V CMOS current mode analog Viterbi
decoder’, IEEE Journal Solid-State Circuits, vol. 37, no. 7, pp. 904–910.
5. Demosthenous, A & Taylor, J 2001, ‘Effects of signal-dependant errors on the performance
of switched-current Viterbi decoders’, IEEE Trans. Circuits Syst. I, vol. 48, pp. 1225–1228.
6. Hemati, S, Banihashemi, AH & Plett, C 2006, ‘A 0.18-µm CMOS analog min-sum iterative
decoder for a (32,8) low-density parity-check (LDPC) code’, IEEE J. Solid-State Circuits,
vol. 41, no. 11, pp. 2531–2540.
7. Janne Maunu, Mika Laiho & Ari Paasio 2008, ‘A Differential Architecture for an Online
Analog Viterbi Decoder’, IEEE Transactions on Circuits and Systems I: Regular Papers, vol.
55, no. 4, pp. 1133-1140.
8. Jia, L, Gao, Y, Isoaho, J, & Tenhunen, H 1999, ‘Design of a super-pipelined Viterbi
decoder’, Proc. International Symposium on Circuits and Systems (ISCAS), vol. 1, pp. 133–
136.
9. Matthews, TW & Spencer, RR 1993, ‘An integrated analog CMOS Viterbi detector for
digital magnetic recording’, IEEE J. Solid- State Circuits, vol. 28, pp. 1294-1302.
10. Mark Anders, Sanu Mathew, Steven Hsu, Ram Krishnamurthy & Shekhar Borkar 2008, ‘A
1.9 Gb/s 358 mW 16-256 State Reconfigurable Viterbi Accelerator in 90 nm CMOS’, IEEE
Journal of Solid-State Circuits, vol. 43, no. 1, pp. 214-222.
11. Sridharan, S & Carley, LR 2000, ‘A 100-MHz 350-mW 0.6-µm CMOS 16-state
generalized-target Viterbi detector for disk drive read channels’, IEEE J. Solid-State Circuits,
vol. 35, no. 3, pp. 362–370.
12. Viterbi, AJ 1967, ‘Error Bounds for Convolutional Codes and an Asymptotically Optimum
Decoding Algorithm’, IEEE Transactions on Information Theory, vol. IT-13, pp. 260–269.
13. Wang, B, Poa, P, Wei, L, Li, L, Yang, Y & Chen, Y 2007, ‘(n,m) Selectivity of single-walled
carbon nanotubes by different carbon precursors on Co–Mo catalysts’, J. Amer. Chem. Soc.,
vol. 129, no. 9, pp. 9014–9019.
14. Wang Xiumin, Zhang Yang & Chen Haowei 2012, ‘Design of Viterbi Decoder Based on
FPGA’, International Conference on Applied Physics and Industrial Engineering, Physics
Procedia, pp. 1243-1247.
15. Wen-Ta Lee, Ming-Jlun, Yuh-Shyan Hwang & Jiann-Jong Chen 2005, ‘IC Design of a New
Decision Device for Analog Viterbi Decoder’. Available from
http://140.117.166.1/eehome/ISCOM2005/SubmitPaper/UploadPapers/ISCON05_00133.pdf.
[6 January 2010]
16. Winstead, C, Nguyen, N, Gaudet, VC, & Schlegel, C 2006, ‘Low-voltage CMOS circuits for
analog iterative decoders’, IEEE Trans. Circuits Syst. I: Regular Papers, vol. 53, no. 4, pp.
829–841.
17. Winstead, C, Jie, D, Shuhuan, Y, Myers, C, Harrison, RR, & Schlegel, C 2004, ‘CMOS
analog MAP decoder for (8,4) Hamming code’, IEEE J. Solid-State Circuits, vol. 39, no. 1,
pp. 122–131.
18. Wonsun Yoo, Yunho Jung, Moo Young Kim & Seongjoo Lee 2012, ‘A Pipelined 8-bit Soft
Decision Viterbi Decoder for IEEE802.11ac WLAN Systems’, IEEE Transactions on
Consumer Electronics, vol. 58, no. 4, pp. 1162-1168.
19. Yan Sun & Zhizhong Ding 2012, ‘FPGA Design and Implementation of a Convolutional
Encoder and a Viterbi Decoder Based on 802.11a for OFDM’, Wireless Engineering and
Technology, vol. 3, pp. 125-131.
20. Yao Gang & Tughrul Arslan 2004, ‘A Novel VLSI Architecture For ACS Operation In
Adaptive Viterbi Decoding’, International Conference on Communications, Circuits and
Systems (ICCCAS 2004), vol. 2, pp. 1434–1437.
21. Yao Gang, Tughrul Arslan & Ahmet Erdogan 2004, ‘An efficient reformulation based VLSI
architecture for adaptive Viterbi decoding in wireless applications’, SIPS.
22. Zand, B & Johns, DA 2002, ‘High-speed CMOS analog Viterbi detector for 4-PAM
partial-response signaling’, IEEE J. Solid-State Circuits, vol. 37, no. 7, pp. 895–903.
23. Zhiwei Xu, Qun Jane Gu, Yi-Cheng Wu & Mau-Chung Frank Chang 2015, ‘Integrated
D-band transmitter and receiver for wireless data communication in 65 nm CMOS’, Analog
Integrated Circuits and Signal Processing, vol. 82, pp. 171–179.