
CURRENT MODE DIFFERENTIAL ANALOG VITERBI DECODER

Cyril Prasanna Raj P., Professor, MSEC, Bangalore
S. L. Pinjare, Professor, ECE Dept., RITM, Bangalore

ABSTRACT: Differential analog Viterbi decoding is a powerful Forward-Error-Correction (FEC) channel coding technique for digital communications. The Viterbi decoder is widely used in space, satellite, CDMA, digital, PCS and DVB systems, and hence there is a need for an efficient, low-power implementation of the Viterbi decoding algorithm. In this paper, a modified differential decoder architecture is proposed and implemented using 130nm CMOS technology. The output samples from the sample and hold circuit are converted to current and then processed. The advantage of working in the current mode is the reduction in the number of transistors. The add-compare-select logic is optimized by replacing the adder logic with an optimized adder that uses 24 transistors. The encoded data is QPSK modulated and the decoder logic operates on the I and Q channels. The major computation blocks such as the BMC and ACS are also implemented with current mirrors, which helps in reducing power and area as well as computing complexity. The proposed design is modeled in Matlab and verified for its functionality; the design is then captured using Cadence Virtuoso and functionally verified for its logic correctness. The differential logic operates at a maximum data rate of 200 Mbps and consumes 10 times less power than the analog logic design. The designed decoder is suitable for low power applications.

Key Words: Analog VLSI, Differential Viterbi Decoder, Current Mode, High Speed, error coding

1. INTRODUCTION
Channel encoding schemes such as convolutional coding are adopted for encoding the message, and at the receiver the Viterbi decoding algorithm is used to extract the message from the received corrupted signal.
The computational complexity of the Viterbi Decoder (VD) restricts its use in real time applications: decoding a message with the VD requires a large number of iterations, hence the algorithm is time consuming. Several architectures and improvements to existing architectures have been proposed and adopted for the VD. VLSI platforms such as ASICs and FPGAs have been used for high speed implementation of the VD algorithm. The VD algorithm can be implemented using either analog or digital circuits. Digital implementations of the VD on FPGAs and DSP processors consume considerable power, and hence dedicated ASICs have been shown to provide better performance in terms of area, power and speed. An all-digital implementation of the Viterbi decoder (DVD) on an ASIC platform can result in large-area, power hungry designs (Jia et al 1999, Sridharan & Carley 2000), as it requires a large number of intermediate memories. As an alternative to digital Viterbi decoding, analog decoding is gaining importance. In the analog domain the VD computation requires a minimum number of transistors and logic circuits, hence reducing the size and power consumption. The analog approach was demonstrated in the 1990s, when the Analog VD (AVD) began to appear in the literature (Matthews & Spencer 1993, Shakiba et al 1998, He & Cauwenberghs 2000, Demosthenous & Taylor 2002, Zand & Johns 2002, Kim et al 2005). In parallel, the advantages of the analog approach have been exploited in the design of maximum a posteriori (MAP) decoders for tail-biting trellis and Hamming codes (Lustenberger et al 1999, Moerz et al 2000, Winstead et al 2006), and in other iterative decoders for various block codes (Winstead et al 2004, Hemati et al 2006). It should be noted that the analog decoders in (Lustenberger et al 1999, Moerz et al 2000, Winstead et al 2006, Winstead et al 2004, Hemati et al 2006) are soft-input and soft-output decoders (Howard et al 2006).

MSJETR, MSEC, Bangalore Page 1
In an analog VD the digital path memory accounts for more than 50% of the die area (Demosthenous & Taylor 2002). Several analog circuits have been developed to realize certain sections of the decoder, such as the branch metric calculation (BMC) and add-compare-select (ACS) units (Shakiba et al 1998, He & Cauwenberghs 2000). Acampora & Gilmore (1978) suggested an analog Viterbi decoder using sample-and-hold circuits and voltage adders to store and update the path metrics. In their work, several nonlinear effects in the voltage-mode analog circuit were addressed, such as amplifier voltage offset, loop gains differing from unity, and nonlinear compression within analog devices. Demosthenous & Taylor (2001) realized the minimum Euclidean distance decoder in a current mode analog circuit, where they used a switched-capacitor circuit as a front-end sample-and-hold block to store the current value that represents the previous path metric. He & Cauwenberghs (2000) implemented the minimum Hamming distance decoder with a current-mode analog circuit based on a switched-capacitor and a winner-take-all circuit. Since they considered the Hamming distance, it is a hard-decision Viterbi decoder and thus does not take advantage of having continuous signals. Wen-Ta Lee et al (2006) report an analog decision device chip in UMC 0.18-µm 1P6M CMOS technology. The chip contains 494 transistors, operates up to 100 Mb/s and consumes 17.46 mW. The chip area of the analog decision device is about 0.544 mm2. It has the advantages of low power and small area, and is easy to combine with the RF front-end receiver. Demosthenous & Taylor (2002) report a 4-state rate-1/2 analog convolutional decoder fabricated in 0.8-µm CMOS technology that operates at data rates up to 115 Mb/s and consumes 39 mW at that rate from a single 2.8-V power supply. The die has a core area of 1 mm2, of which about 1/3 contains the analog section.
This work focuses on the design and implementation of a current mode analog Viterbi decoder for a mixed signal OFDM demodulator. The building blocks of the decoder are optimized for area by reducing the number of transistors, and the operating speed is improved through the design of optimum transistor geometries and current buffers. The design is carried out using 130nm CMOS technology. Differential analog Viterbi decoding, as presented by Maunu et al (2008), is a powerful Forward-Error-Correction (FEC) channel coding technique for digital communications. A differential analog Viterbi decoder allows the high-speed, power consuming A/D converter to be excluded, because the input data stream of the Viterbi decoder is inherently an analog quantity. Viterbi decoders are widely used in space, satellite, CDMA, digital, PCS and DVB systems. The differential architecture enables the trace back memory to be excluded and makes online decoding possible after the initial transitional stages. It uses twice the number of parallel states compared with the analog Viterbi decoder; hence the architecture needs to be optimized. The need for an efficient, low-power implementation of the Viterbi decoding algorithm prompts alternative VLSI solutions. Section 2 discusses the proposed differential analog Viterbi decoder, Section 3 discusses the results and the conclusion is presented in Section 4.

2. DESIGN OF DIFFERENTIAL ANALOG VITERBI DECODER
Differential analog Viterbi decoding is a powerful Forward-Error-Correction (FEC) channel coding technique for digital communications. Viterbi decoders output a 0 or a 1 based on an estimate of the input signal. With increased data rates, the elimination of the high-speed, power consuming A/D converter has made the analog Viterbi decoder a promising alternative to its digital counterparts. Analog processing enables the analog-to-digital converter to be excluded from the decoder realization. Moreover, high-speed operation can be achieved via differential processing.
Figure 1 shows the top level architecture of the modified differential analog Viterbi decoder.

Figure 1 Modified block diagram of Differential Viterbi Decoder (blocks: Analog Input, Sample Hold, Voltage to Current converter, BMC, two ACS units, WTA, Decoded output, Clk Gen Circuit)

In the proposed design, decoding is performed in current mode by converting the voltage to current, which results in high speed and savings in power consumption and silicon area. The add-compare-select (ACS) unit architecture proposed by Tomatsopoulos & Demosthenous (2008) is based on the modified feedback decoding algorithm; in this work the ACS unit architecture is based on differential signaling. The sub circuits of the Viterbi decoder proposed by Demosthenous & Taylor (2002), namely the Sample and Hold (S/H) circuit, Branch Metric Unit (BMU), ACS unit and Winner Take All (WTA) circuit, are modified to operate at high frequency with an optimized number of transistors. The BMC, ACS and WTA are designed to operate on current samples. In the branch metric calculation, the channel symbols are compared with all possible code words by calculating the Euclidean distance. The add-compare-select unit updates the probabilities of state transitions according to the new branch metrics, compares the two competing probabilities (i.e. paths) entering the block and selects the most probable path, which has the smallest distance. A detailed discussion on the design of the sub circuits for the proposed current mode differential analog Viterbi decoder (DAVD) is presented in the subsequent sections.

2.1 Sample and Hold
The sample and hold (S/H) circuit forms the front end of the Viterbi decoder: the incoming data from the channel is sampled, held on the internal capacitor and transferred to the branch metric unit. Shakiba et al (1998) have proposed the ping-pong sample and hold circuit shown in Figure 2.
Figure 2 S/H circuit with intermediate source follower

The circuit consists of two S/H sub circuits that operate in parallel, controlled by phase shifted non-overlapping clock signals (Φ1 and Φ2). Each S/H consists of capture and transfer modules realized using NMOS transmission gates with a PMOS source follower sandwiched between them. The input data m1(k) is captured during clock phase Φ1 and is transferred to the next stage during clock phase Φ2. In every clock cycle TACS two symbols are captured, as the phase shifted clock signals have clock periods of TACS/2. The S/H circuit is a BiCMOS implementation, in which the output stage consists of BJTs to boost the driving current. The source followers perform level shifting and prevent discharge of the captured sample due to MOS parasitics. Demosthenous and Taylor (2002) have proposed the fully differential front end sample and hold (FE-S/H) circuit shown in Figure 3.

Figure 3 One half of the FE-S/H circuit

The circuit has two parallel S/H circuits that process the differential input signals Vin+ and Vin-; each S/H consists of four stages of transmission gates that perform the capture and transfer process. The first stage is controlled by non-overlapping phase shifted clock signals (Φ1a, Φ1b, Φ2a, Φ2b), and the second stage is controlled by inverted clock signals (Φ1, Φ2). The output stage consists of source follower circuits that perform level shifting and provide lower output impedance. The limitation of the S/H circuit proposed by Demosthenous and Taylor (2002) is that the intermediate capacitor would discharge into the parasitic capacitance of the transmission gates, so fast switching is required; and since it is fully differential, 28 transistors are required. The S/H circuit proposed by Shakiba et al (1998) is single ended, requires 10 transistors and avoids leakage of the captured data by means of the intermediate source follower circuit.
The proposed fully differential current mode Viterbi decoder requires a differential S/H circuit that limits charge leakage while using fewer transistors. A novel S/H circuit is therefore proposed, combining the advantages of both S/H circuits. In the proposed S/H circuit, the S/H circuit proposed by Shakiba et al (1998) is adapted to operate on differential inputs (Vin+ and Vin-) by using two S/H circuits, and the output stage consists of a source follower stage to perform level shifting and reduce the output impedance. The proposed S/H circuit is shown in Figure 4.

Figure 4 Proposed S/H circuit with differential signaling (schematic with transistors M1-M32, control input VControl and bias inputs Vbias1, Vbias2 and Vbias3)

The transistor widths (Wp/Wn) are set to 1 µm/0.3 µm, the hold capacitor is set to 1 pF and the sampling clock is set to operate at 200 MHz. The schematic is captured and simulated in Virtuoso. The simulation result is shown in Figure 5: an input signal of 100 mV at a frequency of 100 kHz is sampled with the 200 MHz clock, and the output samples are captured at every positive edge of the clock.

Figure 5 Sample and hold circuit

2.2 Voltage to Current Converter
The outputs of the sample and hold circuit (Vr1+, Vr2+, Vr1-, Vr2-) are converted to current with the voltage to current converter shown in Figure 6 (a). The circuit has differential inputs that are connected to the differential outputs Vrx+ and Vrx- of the S/H. The differential input controls the output current in the load circuit; since the output current is generated at the output terminal of the differential amplifier by the current mirror circuit, the maximum current is governed by Io = gm*Vin/2 (2.4 µA).
The transistor widths (Wp/Wn) of the voltage to current converter are set to 1 µm/0.3 µm, and a test input signal of 100 kHz is applied at the differential input with a 200 mV bias. The simulation results are shown in Figure 6 (b); the differential amplifier successfully converts the voltage to the corresponding current in the range 1.2 µA to 2.4 µA.

Figure 6 Voltage to current converter (a) Schematic (b) Simulation results

2.3 Branch Metric Unit
The branch metric unit computes the Euclidean distance between the received symbol and the possible trellis outputs. As there are four states and each state has two incoming paths, computing the Euclidean distance between the incoming symbol and the possible outputs produces four possible outputs. For example, if the received sequence is 01 and the possible outputs are 10 and 01, computing the Euclidean distance leads to the outputs 2 (the distance between 01 and 10) and 0 (the distance between 01 and 01). At every state the possible branch metric output is {0, 1, 2}. The incoming signal is BPSK modulated, with two symbols per clock. The Euclidean distance is computed by reversal of the output currents (current directions can be reversed) in the PMOS current mirrors by the incoming signals. Demosthenous and Taylor (2002) have presented the transconductor based branch metric computation block shown in Figure 7.

Figure 7 Branch metric unit with cascode current mirror

The BMC unit drives PMOS based high swing cascode mirrors; four cascode mirrors produce the four output currents, one for each add-compare-select unit. The current flow in the cascode mirror circuits needs to be stable for fast switching. To improve the current driving capability of the transconductors and to stabilize a constant current across all four cascode current mirror circuits, an additional current bias circuit (MB1, MB2, MB3, MB4) is integrated so as to ensure that there is always a constant bias current.
Thus the current driving capacity of the BMC is improved, which also increases the speed of operation. The constant current IBP that flows in the current mirror is controlled by the bias voltage VB. Figure 8 shows the proposed modified BMC circuit with improved current driving capacity.

Figure 8 Modified BMC with constant bias current

Figure 9 Branch metric unit schematic

Figure 9 shows the simulation results of the modified BMU architecture, which is designed to stabilize the bias current and thereby improve the frequency of operation. A constant bias current of 200 nA drives the ACS unit, and the modified BMC unit modulates this current based on the input symbols received. The simulation results show the input signals, the constant bias current and the modulated bias current. The current samples are sent to the add-compare-select unit.

2.4 Add-Compare-Select Unit
The ACS unit is one of the primary building blocks of the analog Viterbi decoder. As there are four states, every state has one ACS unit that accumulates the branch metric with the corresponding path metric, compares the path metrics and selects the path metric with the minimum accumulated error. Figure 10 (a) shows the trellis path for every ACSU and Figure 10 (b) is the functional diagram of the ACSU.

Figure 10 ACS unit (a) Trellis path for every ACSU (b) Functional diagram (signals: path metrics Px,t-1 and Py,t-1; branch metrics BMxx,t-1, BMxy,t-1, BMyx,t-1 and BMyy,t-1; operations: add, compare, select, store; outputs Px,t and Py,t to the next ACSU)

Figure 11 ACS unit

Each ACS unit consists of a replicating current comparator (RCC) and a switched current (SI) path metric memory. The branch metric current and the path metric current (discussed in the following section) are accumulated before they enter the ACSU; as the design is current mode, the addition of two currents is simply the sum of the currents of two branches at a node.
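The add-compare-select operation described above can be illustrated behaviourally. The Python sketch below is not the current mode circuit: plain numbers stand in for the currents, and the function name `acs_step`, the predecessor table (which assumes a 4-state trellis whose state holds the two previous input bits, newest bit first) and the example metric values are assumptions made for illustration only.

```python
def acs_step(path_metrics, branch_metrics, predecessors):
    """One add-compare-select update over all trellis states.

    path_metrics   : list, accumulated metric of each state at time t-1
    branch_metrics : dict, (previous_state, state) -> branch metric at time t
    predecessors   : dict, state -> tuple of the two states that lead into it
    Returns the updated path metrics and the selected predecessor per state.
    """
    new_metrics, survivors = [], []
    for state in range(len(path_metrics)):
        # Add: accumulate each incoming branch metric onto its path metric.
        candidates = [
            (path_metrics[prev] + branch_metrics[(prev, state)], prev)
            for prev in predecessors[state]
        ]
        # Compare and select: keep the path with the smallest accumulated error.
        best_metric, best_prev = min(candidates)
        new_metrics.append(best_metric)
        survivors.append(best_prev)
    return new_metrics, survivors


# Assumed 4-state trellis: each state has exactly two incoming paths.
PREDECESSORS = {0: (0, 1), 1: (2, 3), 2: (0, 1), 3: (2, 3)}

# Hypothetical metric values, chosen only to exercise the function.
path_metrics = [0, 2, 1, 3]
branch_metrics = {
    (0, 0): 0, (1, 0): 2,
    (2, 1): 1, (3, 1): 1,
    (0, 2): 2, (1, 2): 0,
    (2, 3): 1, (3, 3): 1,
}

new_metrics, survivors = acs_step(path_metrics, branch_metrics, PREDECESSORS)
print(new_metrics)  # [0, 2, 2, 2]
print(survivors)    # [0, 2, 0, 2]
```

In the circuit the addition costs no adder hardware, since two currents sum at a node; the sketch only mirrors the arithmetic. Ties are resolved here in favour of the lower-numbered predecessor, which is a choice of this sketch and not a property of the proposed latch.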
Currents I1 and I2 enter high swing sub circuits that are realized using NMOS transistors M1-M4 and M5-M8, as shown in Figure 11. The current sensing latch is triggered by the output of the high swing current mirror circuit. The latch is controlled by the cross connected transistors M9 and M10; M15 resets the latch every decoding cycle. The circuit operation is triggered by the switching of M15: the currents entering the high swing circuits trigger the latch through M9 and M10. If I1 is greater than I2, the voltage at A is VSS; otherwise it is VDD. The node voltages at A and A’ drive the current steering switches M16, M17 and M18, M19. The current steering switches mirror the input currents I1 and I2 into the PMOS mirror circuits M20-M23 driving the SI path metric memory. The SI path metric memory consists of a pair of memory cells represented by M28-M31 and M32-M35. The memory cells are controlled to operate in ping-pong fashion with non-overlapping clock pulses Φ1 and Φ2. The output of the RCC is connected to the memory cells through M24 and M25. The outputs of the memory cells are connected to the high swing PMOS mirror M36-M43 through transistors M26 and M27. The non-overlapping clock pulse during phase 1 triggers the left memory cell; the output from the memory cell is connected to the PMOS swing mirror during phase 2. The selection or direction of the current flowing in the PMOS mirror depends upon the input currents I1 and I2. The memory output also drives the RCC, forming a feedback loop. A detailed discussion of the ACS unit is presented in Tomatsopoulos & Demosthenous (2008). One of the limitations of this circuit is the delay in the differential current steering switches M16, M17 and M18, M19 due to the presence of the latch. When the latch is RESET by switching on M15, both nodes A and A’ are at a common potential and hence M16 and M18 are switched off. The current flowing in the high swing PMOS mirror is then constant or zero.
When M15 is switched on during the next clock interval, the transistors M16 and M18 are biased and drive the high swing PMOS mirror circuits; this causes a delay, as the current in the circuit has to rise to its maximum. In order to overcome this delay, transistors M16a and M18a are connected across M16 and M18 as shown in Figure 12.

Figure 12 Modified ACSU to reduce delay during RESET

The inclusion of M16a and M18a ensures that the current in the current steering switches remains constant and mirrors I1 and I2 even during the RESET operation. When M15 is switched on, the circuit performs its normal operation and drives the high swing PMOS mirror circuits. The modified ACS unit circuit schematic is captured using Virtuoso; the sub circuits are integrated and the top module schematic is shown in Figure 13. Figure 14 shows the simulation results of one of the four ACS units.

Figure 13 Top level schematic of ACS unit

Currents I1 and I2, generated from current sources, are applied as inputs to the ACSU. The voltage signals at the latch outputs A and A’ are verified: depending upon the currents I1 and I2, A and A’ produce complementary outputs as shown in Figure 14. The latch output drives the SI unit as well as the differential current steering switches. The final output from the ACSU, which is fed back into the path metric unit, is shown in the simulation results, confirming the logic correctness of the proposed ACSU.

Figure 14 Simulation results of ACS unit

2.5 Winner-Take-All (WTA)
The WTA architecture consists of replicating current comparator circuits and a CMOS latch. Tomatsopoulos and Demosthenous (2008) have proposed a WTA circuit with six 2-input RCC circuits and a CMOS latch arranged in a tree structure; the circuit is designed for Viterbi decoding based on the modified feedback decoding algorithm (MFDA). The WTA circuit is shown in Figure 15.
Figure 15 WTA circuit with tree structure

In this work, the current mode differential analog decoder is based on the direct decoding method, and the WTA circuit proposed by Tomatsopoulos and Demosthenous (2008) is customized to the ACS unit proposed in this work, which is based on differential signaling. In their work the decoding algorithm is split into a double trellis and is based on the modified feedback decoding algorithm. In this work two ACS units are designed that operate on differential signal inputs; Figure 16 shows the circuit schematic of the proposed WTA circuit.

Figure 16 Modified WTA circuit with differential signaling

The modified WTA circuit consists of an NMOS RCC circuit that processes four ACSU currents and selects the competitive current, which is given to the PMOS RCC circuit; the PMOS RCC selects the final output current, which is stored in the CMOS latch. The control signals FWTA are appropriately sequenced to control the data flow into the tree structure. Figure 17 shows the NMOS RCC, which is an inverted version of the RCC used in the ACS circuit shown in Figure 12. Idc1 and Idc2 are the two input currents that drive the high swing current mirrors (M1-M4 and M7-M10), and the outputs of the current mirrors trigger the differential pair M5 and M6. If Idc1 > Idc2, M5 is switched ON and current flows from VDD through M12 and M5, switching off transistor M14. If M14 is switched off, the Iomin current flows through M13, thus selecting the minimum of Idc1 and Idc2.

Figure 17 NMOS RCC circuit schematic captured in Virtuoso (labels: M5, M6, M11-M14, FWTA; M1-M4 and M7-M10 are high swing current mirrors)

The PMOS replicating current comparator circuit for the load current decision is shown in Figure 18. The outputs of the NMOS RCC are potential winners, and the PMOS RCC processes them for the WTA comparison. The circuit operates similarly to the NMOS RCC, and its schematic is complementary to that of the NMOS RCC.
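The selection carried out by the comparator tree can be sketched in software as repeated pairwise minimum selection. The Python sketch below is behavioural only: the function name `wta_tree` is hypothetical, each comparison stands in for one 2-input RCC, and the input values are the Idc1-4 and Idc5-8 currents listed in Table 5.

```python
def wta_tree(currents):
    """Return (index, value) of the smallest current using pairwise comparisons.

    Each comparison models one 2-input replicating current comparator that
    passes the smaller of its two inputs on to the next level of the tree.
    """
    if not currents:
        raise ValueError("at least one current is required")
    level = list(enumerate(currents))
    while len(level) > 1:
        next_level = []
        for k in range(0, len(level) - 1, 2):
            a, b = level[k], level[k + 1]
            next_level.append(a if a[1] <= b[1] else b)
        if len(level) % 2:
            # An unpaired input is forwarded unchanged to the next level.
            next_level.append(level[-1])
        level = next_level
    return level[0]


# Input currents from Table 5, in amperes.
group_a = [1e-6, 2e-6, 3e-6, 4e-6]   # Idc1-4
group_b = [5e-6, 6e-6, 7e-6, 8e-6]   # Idc5-8

_, min_a = wta_tree(group_a)
_, min_b = wta_tree(group_b)
_, winner = wta_tree([min_a, min_b])
print(min_a, min_b, winner)  # 1e-06 5e-06 1e-06
```

The three printed values agree with the Min (I1-I4), Min (I5-I8) and compare output entries of Table 5. The returned index identifies which input won, which is the information a final latch would need to hold.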
Current sinks are replaced with current sources, and the minimum of the two input currents drives the next stage in the WTA tree. The PMOS RCC circuit is used to reduce the winners from the previous level to two potential candidates for the winning current path metric. Fwta1 is set high to compute the first decision from the NMOS RCC; when Fwta2 is high the PMOS RCC computes the WTA current. At the last stage of the WTA tree a CMOS latch stores the path with the smallest metric.

Figure 18 PMOS RCC schematic in Virtuoso

2.6 Clock Generator Circuit
The clock generator circuit generates non-overlapping clock signals. The circuit schematic of the clock generator is shown in Figure 19 (a); the incoming master clock is processed to generate multiple clock signals. Mod counters designed using D flip-flops are used for clock division, and buffers (each realized using two inverters) are connected in sequence to delay the clock signals according to the sequence required to control the Viterbi decoder sub circuits. Figure 19 (b) presents the simulation results of the generated clock sequences.

Figure 19 Clock generation (a) Circuit schematic (b) Simulation results

2.7 Biasing Circuit
The biasing circuit generates the necessary bias voltages; it is included to provide on-chip biasing generated from the power supply, as shown in Figure 20.

Figure 20 Biasing circuit (a) Circuit schematic (b) Simulation results

The four bias voltages are set by the external reference current source. The biasing circuit takes an input current of 100 µA with a voltage reference Vref of 0.65mV and generates the biases Vbias3, Vbias2 and Vbias1 of 1.65 V, 1.7 V and 1.8 V respectively. These bias voltages are required by the add-compare-select circuit and the branch metric unit, and are also used by the clock generator circuit. The DAVD logic is implemented in 130nm CMOS technology, using the TSMC standard cell library.
In order to optimize the layout for area, layout techniques such as common centroid and inter-digitization are adopted. An optimum choice and sizing of drivers improves the driving capability of the sub systems. Transistor sizing and ordering of transistors minimize the delay in the sub systems. To drive large output capacitances, the transistor geometries are set between 1 µm and 2 µm. As the transistor width increases, the layout of large geometry transistors is carried out using multiple transistors of smaller geometry connected in parallel instead of a single transistor. The minimum transistor geometry is chosen to be 40 nanometer. The sub blocks designed for the DAVD are integrated, and the top level block diagram of the DAVD architecture is shown in Figure 21. The schematics are integrated using appropriate driver circuits for minimum delay. The integrated design is simulated and the functionality of the DAVD is analyzed.

Figure 21 Top level architecture of proposed Viterbi Decoder

3. RESULTS AND DISCUSSION
The DAVD sub blocks that have been modified to operate on current input samples are integrated to form the top level architecture. The DAVD accepts convolutionally encoded binary symbols and finds the maximum likelihood sequence. A convolutional encoder with coding rate R = 1/2 and constraint length K = 3 is used, corresponding to the desired data rate. The convolutional encoder uses the generator polynomials G0 = 101 and G1 = 111. In order to test the functionality of the integrated DAVD, the message data [100111011] is encoded using the convolutional encoder and the encoded data is modulated using QPSK. The modulated data along with noise (AWGN with 5dB SNR) is provided as one of the test vectors to the DAVD. In addition, sinusoidal signals of varied frequencies are considered as test vectors for the integrated DAVD. The simulation results were discussed in the previous section.
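The encoder configuration stated above (R = 1/2, K = 3, G0 = 101, G1 = 111) can be sketched in a few lines of Python. This is an illustrative model, not the authors' Matlab code: the function name `conv_encode`, the all-zero initial register, the absence of tail bits and the output ordering (G0 bit first, then G1 bit) are assumptions.

```python
def conv_encode(bits, g0=0b101, g1=0b111, k=3):
    """Rate-1/2 convolutional encoder built from a K-bit shift register.

    The newest input bit occupies the most significant register position.
    Each output bit is the modulo-2 sum (XOR) of the register bits selected
    by one generator polynomial.
    """
    reg = 0          # assumed to start in the all-zero state
    encoded = []
    for b in bits:
        reg = (b << (k - 1)) | (reg >> 1)             # shift in the new bit
        encoded.append(bin(reg & g0).count("1") % 2)  # G0 output (assumed first)
        encoded.append(bin(reg & g1).count("1") % 2)  # G1 output
    return encoded


message = [1, 0, 0, 1, 1, 1, 0, 1, 1]   # test message from the text
coded = conv_encode(message)
print(coded)
# [1, 1, 0, 1, 1, 1, 1, 1, 1, 0, 0, 1, 1, 0, 0, 0, 1, 0]
```

The nine message bits produce eighteen coded bits, as expected for a rate-1/2 code. Because both generator polynomials read the same forwards and backwards, the result does not depend on which end of the register the polynomial bits are mapped to.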
The performance factors such as maximum operating frequency, delay and power consumption are reported in this section. The DAVD model is first verified against a software reference model developed in Matlab. Figure 22 plots the input message signal and the encoded message data. The encoded symbol rate is twice the information symbol rate.

Figure 22 Input message and encoded data

The binary symbols are first convolutionally encoded using modulo-2 (XOR) adders, with rate 1/2 and constraint length K = 3. The encoded symbols are modulated using the QPSK modulation scheme. AWGN is then added to the modulated data before transmission. The term 10*log10(Nsamp) is used to scale the noise power with the oversampling. The term -10*log10(1/code rate) is used to scale the noise power to match the coded symbol rate. The in-phase and quadrature components of the noiseless QPSK signal are plotted. Once the binary symbols are convolutionally encoded, the next step is to carry out modulation. The convolutional code, modulated by QPSK into I and Q signals, is transmitted; at the receiver the signal is demodulated and the message symbols are obtained, which are fed to the Viterbi decoder along with noise. The decoder logic decodes the message; in this work a trace back length of 32 is chosen to decode the 32 symbols received. Figure 23 shows the Viterbi decoder results. The decoded symbols are plotted as blue stems with circles, while the original (unencoded) symbols are plotted as red stems. The bit error is computed from the input message and the decoded message data. The bit error rate is found to be 10^-5 at 5dB SNR.

Figure 23 Decoded symbols of convolutional codes

The functionally correct decoder is designed for optimum area, power and speed performance. The design schematic, captured using Cadence Virtuoso targeting 130nm CMOS technology, is simulated and the performance parameters of the sub systems are discussed in detail.
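The decoding step of the reference model can be illustrated with a minimal hard-decision Viterbi decoder for the same K = 3, rate-1/2 code. This Python sketch makes several assumptions that differ from the work described here: it uses the Hamming distance on hard bits, not the Euclidean distance on analog samples; it traces back once over the whole block, not with a sliding window of length 32; and the helper names `parity`, `encode` and `viterbi_decode` are hypothetical.

```python
def parity(x):
    """Modulo-2 sum of the bits of x."""
    return bin(x).count("1") % 2


def encode(bits, g0=0b101, g1=0b111):
    """K = 3, rate-1/2 encoder; the state holds the two previous input bits."""
    state, out = 0, []
    for b in bits:
        reg = (b << 2) | state
        out += [parity(reg & g0), parity(reg & g1)]
        state = reg >> 1
    return out


def viterbi_decode(received, g0=0b101, g1=0b111):
    """Hard-decision Viterbi decoding with one traceback over the full block."""
    n_states = 4
    inf = float("inf")
    metrics = [0] + [inf] * (n_states - 1)   # encoder assumed to start in state 0
    history = []                             # per step: (previous state, input bit)
    for t in range(0, len(received), 2):
        r0, r1 = received[t], received[t + 1]
        new_metrics = [inf] * n_states
        back = [None] * n_states
        for s in range(n_states):
            if metrics[s] == inf:
                continue                     # state not reachable yet
            for b in (0, 1):
                reg = (b << 2) | s
                # Branch metric: Hamming distance to the expected output pair.
                bm = (parity(reg & g0) != r0) + (parity(reg & g1) != r1)
                ns = reg >> 1
                cand = metrics[s] + bm       # add
                if cand < new_metrics[ns]:   # compare and select
                    new_metrics[ns] = cand
                    back[ns] = (s, b)
        metrics = new_metrics
        history.append(back)
    # Winner-take-all over the final path metrics, then trace back.
    state = min(range(n_states), key=lambda s: metrics[s])
    decoded = []
    for back in reversed(history):
        state, b = back[state]
        decoded.append(b)
    return decoded[::-1]


message = [1, 0, 0, 1, 1, 1, 0, 1, 1]
coded = encode(message)
noisy = list(coded)
noisy[3] ^= 1                                # inject a single bit error
print(viterbi_decode(coded) == message)      # True
print(viterbi_decode(noisy) == message)      # True
```

The second check shows the single injected channel error being corrected. A soft-decision version would replace the Hamming branch metric with a squared Euclidean distance on the received samples, which is closer to what the analog BMC computes.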
Table 1 shows the performance parameters of the S/H circuit. The transistor geometries are designed for optimum drive strength and maximum frequency of operation. The S/H operates at a clock period of 5ns and holds the data for a duration of 50ns. The power consumption is about 140 µW and the maximum frequency of operation is 270 MHz with a load capacitance of 100 pF.

Table 1 Sample and hold circuit parameters
T-count | Wp/Wn (m) | Cap-Value | Vin (sin) V/Freq | Clk V/Time | Power
10 | 1µ/300n | 1pF | 100mV/100K | 1.2/5ns | 140µW

Table 2 shows the performance factors of the V2I converter. The transistor geometries are set to the same size as in the S/H circuit. The V2I is simulated for a duration of 6 clock cycles (T-count). The input voltage applied to the V2I from the S/H circuit is converted to a current with a maximum of 2.4µA. The transistor geometries are designed to operate at the maximum frequency.

Table 2 Voltage to current converter circuit parameters
T-count | Wp/Wn (m) | Vin1/Freq | Vin2/Freq | Vbias | Iout
6 | 1µ/300n | 2V/10K | 2V/10K | 200mV/100ns | 2.4µA

Table 3 shows the BMU parameters. The ACSU receives the two outputs generated by the BMU, which are given to the differential ACSU logic. Input signals Vrm and Vrp are applied at two different carrier frequencies of 1KHz and 10KHz; the input signals are processed into two sets of current output samples and are fed into the ACSU. The bias voltage is set to 200mV. The delay of the BMU is 41.2 ns. The maximum operating frequency of the BMU is 232 MHz. The power dissipation is found to be 340 µW.

Table 3 Branch metric unit circuit parameters
T-count | Wp/Wn | Ibm (A) | Vrp V/Hz | Vrm V/Hz | Vbias3 | To ACSU1 (A) | To ACSU2 (A) | Power (µW)
15 | 1µ/300n | 1.42nA | 1.2V/1K | 1.2V/10K | 200 mV | 9.2 pA | 151 pA | 340

In the differential decoder logic two ACSUs operate in parallel, and the transistor geometries are chosen for optimum performance. The maximum operating frequency of the ACSU is 221 MHz and the power consumption is less than 800µW. Table 4 shows the performance parameters of the ACSU.
Table 4 ACS unit parameters
T-count | Wp/Wn | Idc1 | Idc2 | Idc3 | Idc4 | Add1 | Add2 | Compare Min (Add1+Add2) Select
15 | 1µ/300n | 1µA | 2µA | 10µA | 20µA | 3µA | 30µA | 3.95nA, 4.8nA 4.5nA

Table 5 presents the specifications of the winner take all circuit. The circuit needs to operate for 92 clock cycles, and the minimum and maximum currents are set to 1µA and 5µA respectively.

Table 5 WTA Unit Circuit Parameters
T-count | Wp/Wn | Idc (1-4) | Idc (5-8) | Min (I1-I4) | Min (I5-I8) | Compare output Min (Idc1-4, Idc5-8)
92 | 1µ/300n | 1µA 2µA 3µA 4µA | 5µA 6µA 7µA 8µA | 1µA | 5µA | 1µA

The clock generator, RCC and biasing circuit are designed and simulated for their functionality. The transistor geometries are designed for optimum performance. The nonlinearities of the analog circuits are compensated using matching circuits, and the widths of the NMOS and PMOS transistors are chosen for symmetric layout design. Table 6 compares the performance of the proposed DAVD with analog Viterbi decoder designs.

Table 6 Comparison of DAVD with AVD
Parameters | Proposed DAVD | Demosthenous & Taylor (2002) | Tomatsopoulos and Demosthenous (2008)
Technology | 0.13 µm CMOS | 0.8 µm CMOS | 0.6 µm CMOS
Supply voltage | 1.2 V | 2.8 V | 3 V
Core area | 0.2 mm2 | 1 mm2 | 0.5 mm2
Maximum decoding speed | 200 Mbps | 115 Mbps | >1 Mbps
Power dissipation | < 1.2 mW | 14.9 mW | -
BMC | 340 µW | 1.9 mW | -
ACSU | 800 µW | 9.4 mW | -
PMR | 110 µW | 1.5 mW | -
Bias circuit | 207 µW | 2.5 mW | 2.45 mW

From the results tabulated, it is found that the operating speed of the proposed DAVD is improved by 42% and the power dissipation is reduced by 92%.

4. Conclusion
Analog Viterbi decoders are realized using differential decoder logic. In this work, two different ACSUs that operate on differential signals are used to decode the encoded message signals. The output of the S/H circuit is a voltage, which is converted to current samples that are used in decoding. Thus the proposed design reduces the number of transistors and also improves the operating frequency.
The proposed design is verified for its functionality using Matlab models for various test vectors. The transistor geometries are designed for optimum performance and the design is captured using Cadence Virtuoso. The differential analog Viterbi decoder is implemented in 0.13 µm CMOS technology. The sample and hold unit consumes 140 µW of power and occupies an area of 5x9 µm². The voltage to current converter draws 2.4 µA and occupies an area of 6x6 µm². The branch metric computation unit, implemented as a sub-block of the Viterbi decoder, uses 151 pA and occupies 15x14 µm². The add-compare-select unit uses 4.8 nA of current and occupies 15x12 µm². The designed DAVD is suitable for low power and high speed applications. The circuit performance can be further enhanced with impedance matching circuits at the output of the BMU and ACSU.

Acknowledgement: The authors would like to acknowledge MSEC, Bangalore for the support extended in providing the resources and financial assistance in carrying out this work. The authors also acknowledge MATLAB and Cadence for the online support in modeling and verifying the design.

REFERENCES

1. Acampora, A & Gilmore, R 1978, ‘Analog Viterbi Decoding for High Speed Digital Satellite Channels’, IEEE Transactions on Communications, vol. 26, no. 10, pp. 1463-1470.
2. Demosthenous, A & Taylor, J 1996, Proceedings of the Third IEEE International Conference on Electronics, Circuits and Systems (ICECS), vol. 1, pp. 33-36.
3. Demosthenous Andreas, Céline Verdier & John Taylor 1997, ‘A new architecture for low power analogue convolutional decoders’, IEEE International Symposium on Circuits and Systems, vol. 1, pp. 37-40.
4. Demosthenous, A & Taylor, J 2002, ‘A 100-Mb/s 2.8-V CMOS current mode analog Viterbi decoder’, IEEE Journal of Solid-State Circuits, vol. 37, no. 7, pp. 904–910.
5. Demosthenous, A & Taylor, J 2001, ‘Effects of signal-dependant errors on the performance of switched-current Viterbi decoders’, IEEE Trans. Circuits Syst. I, vol. 48, pp. 1225–1228.
6. Hemati, S, Banihashemi, AH, & Plett C 2006, ‘A 0.18-µm CMOS analog min-sum iterative decoder for a (32,8) low-density parity-check (LDPC) code’, IEEE J. Solid-State Circuits, vol. 41, no. 11, pp. 2531–2540.
7. Janne Maunu, Mika Laiho & Ari Paasio 2008, ‘A Differential Architecture for an Online Analog Viterbi Decoder’, IEEE Transactions on Circuits and Systems I: Regular Papers, vol. 55, no. 4, pp. 1133-1140.
8. Jia, L, Gao, Y, Isoaho, J, & Tenhunen, H 1999, ‘Design of a super-pipelined Viterbi decoder’, Proc. International Symposium on Circuits and Systems (ISCAS), vol. 1, pp. 133–136.
9. Matthews, TW & Spencer, RR 1993, ‘An integrated analog CMOS Viterbi detector for digital magnetic recording’, IEEE J. Solid-State Circuits, vol. 28, pp. 1294-1302.
10. Mark Anders, Sanu Mathew, Steven Hsu, Ram Krishnamurthy & Shekhar Borkar 2008, ‘A 1.9 Gb/s 358 mW 16-256 State Reconfigurable Viterbi Accelerator in 90 nm CMOS’, IEEE Journal of Solid-State Circuits, vol. 43, no. 1, pp. 214-222.
11. Sridharan, S & Carley, LR 2000, ‘A 100-MHz 350-mW 0.6-µm CMOS 16-state generalized-target Viterbi detector for disk drive read channels’, IEEE J. Solid-State Circuits, vol. 35, no. 3, pp. 362–370.
12. Viterbi, AJ 1967, ‘Error Bounds for Convolutional Codes and an Asymptotically Optimum Decoding Algorithm’, IEEE Transactions on Information Theory, vol. IT-13, pp. 260–269.
13. Wang, B, Poa, P, Wei, L, Li, L, Yang, Y & Chen, Y 2007, ‘(n,m) Selectivity of single-walled carbon nanotubes by different carbon precursors on Co–Mo catalysts’, J. Amer. Chem. Soc., vol. 129, no. 9, pp. 9014–9019.
14. Wang Xiumin, Zhang Yang & Chen Haowei 2012, ‘Design of Viterbi Decoder Based on FPGA’, International Conference on Applied Physics and Industrial Engineering, Physics Procedia, pp. 1243-1247.
15. Wen-Ta Lee, Ming-Jlun, Yuh-Shyan Hwang & Jiann-Jong Chen 2005, ‘IC Design of a New Decision Device for Analog Viterbi Decoder’. Available from http://140.117.166.1/eehome/ISCOM2005/SubmitPaper/UploadPapers/ISCON05_00133.pdf. [6 January 2010]
16. Winstead, C, Nguyen, N, Gaudet, VC, & Schlegel, C 2006, ‘Low-voltage CMOS circuits for analog iterative decoders’, IEEE Trans. Circuits Syst. I: Regular Papers, vol. 53, no. 4, pp. 829–841.
17. Winstead, C, Jie, D, Shuhuan, Y, Myers, C, Harrison, RR, & Schlegel, C 2004, ‘CMOS analog MAP decoder for (8,4) Hamming code’, IEEE J. Solid-State Circuits, vol. 39, no. 1, pp. 122–131.
18. Wonsun Yoo, Yunho Jung, Moo Young Kim & Seongjoo Lee 2012, ‘A Pipelined 8-bit Soft Decision Viterbi Decoder for IEEE802.11ac WLAN Systems’, IEEE Transactions on Consumer Electronics, vol. 58, no. 4, pp. 1162-1168.
19. Yan Sun & Zhizhong Ding 2012, ‘FPGA Design and Implementation of a Convolutional Encoder and a Viterbi Decoder Based on 802.11a for OFDM’, Wireless Engineering and Technology, vol. 3, pp. 125-131.
20. Yao Gang & Tughrul Arslan 2004, ‘A Novel VLSI Architecture For ACS Operation In Adaptive Viterbi Decoding’, International Conference on Communications, Circuits and Systems (ICCCAS 2004), vol. 2, pp. 1434–1437.
21. Yao Gang, Tughrul Arslan & Ahmet Erdogan 2004, ‘An efficient reformulation based VLSI architecture for adaptive Viterbi decoding in wireless applications’, SIPS.
22. Zand, B & Johns, DA 2002, ‘High-speed CMOS analog Viterbi detector for 4-PAM partial-response signaling’, IEEE J. Solid-State Circuits, vol. 37, no. 7, pp. 895–903.
23. Zhiwei Xu, Qun Jane Gu, Yi-Cheng Wu & Mau-Chung Frank Chang 2015, ‘Integrated D-band transmitter and receiver for wireless data communication in 65 nm CMOS’, Analog Integrated Circuits and Signal Processing, vol. 82, pp. 171–179.
