Download A High-Speed Analog Turbo Decoder Fotios Gioulekas Michael Birbas Alex Birbas

Survey
yes no Was this document useful for you?
   Thank you for your participation!

* Your assessment is very important for improving the workof artificial intelligence, which forms the content of this project

Document related concepts

Electrical engineering wikipedia , lookup

Mains electricity wikipedia , lookup

Resistive opto-isolator wikipedia , lookup

Alternating current wikipedia , lookup

Rotary encoder wikipedia , lookup

Analog-to-digital converter wikipedia , lookup

Multimeter wikipedia , lookup

Integrated circuit wikipedia , lookup

Electronic engineering wikipedia , lookup

Opto-isolator wikipedia , lookup

Transcript
A High-Speed Analog Turbo Decoder
77
A High-Speed Analog Turbo Decoder
Fotios Gioulekas , Michael Birbas , Alex Birbas, and George Bilionis, Non-members
ABSTRACT
A new type of iterative decoders based on analog computing networks, which are used to decode
powerful error-correcting schemes, such as Turbo and
Low-density parity-check (LDPC) codes, outperform
their digital counterparts in terms of power consumption and speed. Only few analog Turbo decoders,
all of them based on CMOS subthreshold technology have been implemented till now. This paper
aims to present the design and enhanced functionality of an all-analog Turbo decoder taking advantage of high-speed features of SiGe HBTs achieving
throughput up to Gbits/s. Simulation results based
on AMS 0.35µm SiGe BiCMOS technology demonstrate promising performance compared to existing
designs.
Keywords: Analog Turbo Decoder, High speed,
SiGe benefits.
1. INTRODUCTION
The superior performance of Turbo [1] and LDPC
[2] codes, which almost “touch” the theoretical Shannon capacity limit offer great channel coding gains
in telecommunication systems providing low bit and
frame error rates at significant low signal to noise
ratios. Researchers [3, 4, 5] have observed that
this important class of algorithms can be efficiently
represented using graphical models (factor graphs).
These error-correcting schemes are based on iteratively passing messages (soft information), which correspond to probability mass functions, and their execution can be interpreted as an instance of a general “sum-product algorithm (SPA) [5]”. It has been
shown [3] that this algorithm is identical to the a posteriori probability (APP)-BCJR algorithm [6] used in
Turbo decoding. The main advantages of this graphical representation are the easy visualization of the
algorithm and the simplification of the corresponding
equations leading to low-power execution graphs.
Digital implementations of Turbo codes and
their representative SPA algorithm require complex
floating- point computations, large look up tables and
memory storage while the iterative nature of the decoding process causes long latencies, thus limiting
04PSI04: Manuscript received on December 20, 2004 ; revised
on August 8, 2005.
The authors are with the Department of Electrical and
Computer Engineering University of Patras Campus of
Rion, 26500, Greece. E-mail:{fgiou, mbirbas, birbas, bilionis}@ee.upatras.gr
significantly the decoding speed and increasing the
power consumption. However, the direct mapping
of the SPA algorithm into analog translinear circuits
that has been proposed lately outperforms comparable digital decoders improving the ratio of speed to
power consumption by two orders of magnitude [7].
Furthermore, a digital error-correcting decoder has
to estimate the original message by processing the received analog signal, a task that requires A/D converters of significant resolution. This denotes that
the availability of an analog decoder would give the
ability to directly process the incoming signal in its
natural form avoiding the A/D converter overhead.
The aim of this work is to utilize directly the analog received waveform through the design of a 16bit analog Turbo decoder in 0.35µm SiGe BiCMOS
technology and to investigate/compare its capabilities against other existing approaches (both digital
and analog). This is one of the first implementations
in literature since almost all other known ones, which
are few anyway, are in CMOS. The rest of the paper
is organized as follows: Section 2 reports the related
and previous work in the field of analog decoding.
In Section 3 an overview of the Turbo code scheme is
given and the employed design technique is described.
The decoder architecture and its performance based
on simulation results are given in section 4. Finally,
section 5 compares our decoder to other analog and
digital implementation.
2. RELATED WORK AND MOTIVATION
The concept of an all-analog decoding scheme was
introduced in 1998 [8] while prior work in analog
implementations of Viterbi decoders [9, 10] demonstrated better performance in power consumption and
speed than their corresponding digital realizations.
Implementations of small analog decoders in CMOS
and conventional BiCMOS technologies of Hamming
[11], tailbiting convolutional [12, 13], Turbo [14, 15,
16] and LDPC [17, 18] codes have been reported as
a proof-of-concept. Those analog decoders are constructed by interconnecting highly parallel translinear [7] circuits. These translinear circuits work with
soft (continuous) values in continuous-time and are
based on the Gillbert multiplier cell [19].
Some research groups [7] follow the current mode
approach, where the soft-inputs and soft-outputs
(SISO) [20] of the APP algorithm have the form of
currents corresponding to discrete probability distributions while other research group [8, 12] follow the
voltage mode approach, where log-likelihood ratios
78
ECTI TRANSACTIONS ON ELECTRICAL ENG., ELECTRONICS, AND COMMUNICATIONS VOL.3, NO.2 AUGUST 2005
(LLRs) are represented as differential voltages. Both
approaches exploit the exponential voltage-current
characteristic of bipolar transistors as well as of
CMOS ones when these operate in the subthreshold
region. Thus, the non-linearity of the transistor is
exploited rather than fought.
Our work adopts the current mode approach based
on the fact that the voltage mode approach is more
sensitive in temperature [21]. So far, really few analog implementations of Turbo decoders can be found
in the literature [14, 15, 16] and almost all of them
in CMOS technology. This is mainly due to the fact
that CMOS transistors consume less power, but on
the other hand are slower and sensitive to noise and
process variations when working in the subthreshold
region [22], an operation mode that is anyway difficult
to maintain/control but it is necessary to keep so as to
avoid losing accuracy. These are some of the reasons
that led us to choose SiGe BiCMOS technology and
in the same time we were able to directly benefit from
the superior speed of SiGe heterojunction transistors.
Regarding existing implementation of Turbo decoders
in SiGe, to our knowledge only one has been reported
[23] indicating speed gains but they followed a different architectural approach from our case. Specifically,
they used as many component APP decoders as the
estimated number of the iterations to build the analog decoder (no sample-and-hold-circuits where used
to feed the decoder with the voltage waveform). However, their design did not work properly and so it was
not possible to compare our design to theirs. In this
work the objective is to prove the concept of realizing
efficient analog implementations of Turbo decoders in
SiGe technology, and to investigate/evaluate the pros
and cons of such realizations while keeping the design
procedure and the Turbo decoder in quite realistic
standards.
accepts a bit information sequence u and provides a
set of outputs c = [u, p, q], where p and q are the
parity bits produced by RSC1 and RSC2, respectively. After the serialization of the encoder outputs,
the modulator forms and transmits them through the
channel. In our case the BPSK modulation scheme
and the AWGN memoryless channel model have been
employed. Both trellises are considered truncated at
the end.
The Turbo decoder incorporates a parallel-to-serial
(P2S) interface, which accepts the corrupted (by
noise) signals and feeds them to the two APP decoders. The two decoders exchange iteratively extrinsic information. The extrinsic information, which is
a reliability measure of each component decoder’s estimate concerning the transmitted information symbols, is based on the corresponding parity sequence
only. This negative feedback scheme is responsible
for the Turbo decoder’s stability and superior performance. After a specified number of iterations the
decoder makes final hard-decisions about the transmitted information sequence.
Figure 1 illustrates the Turbo encoder with 1/3
rate, the trellis diagram of the constituent codes and
the Turbo decoder architecture.
3. DESIGN METHODOLOGY
3. 2 Factor Graph Representation
3. 1 Turbo Code Structure
The Turbo encoding and decoding procedures can
be efficiently represented through a bipartite graphical model, the factor graph, which expresses how a
global function of several variables is factored into a
product of local functions [3]. A factor graph has a
variable node x for each variable, a factor node f for
each local function and an edge connecting a variable node x to a factor node f if and only if x is
an argument of f. The final cost function-inference
(a-posteriori probabilities) is computed by performing the sum-product algorithm and the node-update
rules [5]. Figure 2 delineates the factor graph for our
Turbo coding scheme. The variable nodes Yu , Yp , Yq
are represented by circles and describe the channel
observations provided e.g. by the matched filter of a
typical BPSK demodulator. The variable nodes U, Q,
P represent the transmitted information (u) and parity bits (p and q), respectively. The auxiliary nodes
S are used to define the Trellis states. The function
The Turbo encoder is constructed by the parallel concatenation of two or more recursive systematic convolutional codes (RSC), termed constituent
codes. The input of the first encoder is not permuted.
The input to the other encoders is permuted so that
the other constituent encoders are fed with the interleaved and uncorrelated form of the information sequence. Our design refers to the turbo encoder that
comprises two identical RSC codes and one cyclicshift interleaver [24]. The RSC codes used are described by the following generator polynomial:
g (D) 1 + D2 1
G(D) = 1,
= 1,
g0 (D)
1 + D + D2
(1)
The encoding process of an RSC code can be represented by a trellis diagram, which is defined by the
relevant generator polynomial. The Turbo encoder
Fig.1: (I) Turbo encoder structure, (II) One trellis
section, (III) the turbo decoder scheme.
A High-Speed Analog Turbo Decoder
79
Fig.3: MATLAB simulation results of BER and
FER vs. SNR.
Fig.2: The factor graph representation of the 16-bit
Turbo code.
nodes T, Ch are represented by squares and correspond to the trellis sections (this is a binary function,
which its value is true when a codeword belongs to
the trellis) and to the transition probabilities of the
Gaussian channel, respectively.
The methodological steps used for this implementation are described as follows. Firstly, the factor
graph model is employed as the input specification.
A high-level language like MATLAB simulates the
functionality of the Turbo code. The compound encoders are chosen appropriately in order to construct
a Turbo code of large “minimum Hamming distance”
avoiding “error-floor” problems, if it is possible. The
aforementioned encoders (eq.1) of constraint length
equal to three have the properties to constitute a
Turbo code with low complexity that fulfills the above
criteria. In Figure 3 the BER and FER curves are depicted and show the error correcting capability of the
employed decoder scheme.
3. 3 The Architecture of the Basic Analog Circuit
The next step during the design of the analog
Turbo decoder includes the mapping of the Trellis
function nodes into the transistor-level analog circuits. The Trellis function node (sum product module) is described and evaluated according to the following equation:
pX (x)pY (y)f (x, y, z)
(2)
pZ (z) = S
x∈X y∈Y
pX , pY and p − Z are probability distributions
(real-valued non-negative functions) defined on some
set X, Y, and Z of discrete random variables x, y and
z respectively. f is a binary function ranged in the
set {0, 1}, which denotes that the two nodes x, y
are connected by a trellis branch and S is a normalization factor that does not depend on z. Eq.2 is
the elementary computation used in the APP algorithm. A current vector Iv with non-negative components represents the relative probability distributions ([Iv,1 , . . . , Iv,n ] = [pv (v1 ), . . . , pv (vn )]). Figure
4 shows the circuit implementation of the equation
(2). The circuit consists of three cores. Two cores
are responsible for the interconnection to other computation modules and are based on current mirrors [6,
14]. The scaling up of the currents can be employed
either in the input or the output interfacing core. For
BiCMOS implementations the scaling up at the input
core provides more speed to the circuit [14]. The central core that performs the appropriate computation
as defined by (eq.2) is based on the Gilbert multiplier
cell. In our case the computation core and the input
core employ HBTs transistors while the output core
uses PMOS ones.
The next steps of the design methodology involve schematic design and Spice-level circuit simulations. Transistor low-level effects are taken into
account in order to determine their impact to the
error-correcting performance of the decoder. The design is verified comparing the circuit simulations with
the factor-graph specification (MATLAB simulation
results of the factor graph model). Finally, the layout and the chip is constructed and it is ready for
fabrication.
4. THE ANALOG TURBO DECODER ARCHITECTURE
4. 1 Analog Components of the Turbo Decoder
The analog Turbo decoder architecture includes
a serial-to-parallel (S2P) interface, two APP modules, an interleaver and the output stages, which in-
80
ECTI TRANSACTIONS ON ELECTRICAL ENG., ELECTRONICS, AND COMMUNICATIONS VOL.3, NO.2 AUGUST 2005
Fig.4: Generic circuit implementation of the equation (2).
Fig.5: The 16-bit analog Turbo decoder architecture.
volve current comparators and shift registers in order
to serialize the output (parallel to serial interfaceP2S). Due to the fact that this analog network works
continuously in time, the inputs should be fed simultaneously. The demodulator provides soft-values
(LLRs), which are an estimation of the corresponding codeword. This output has the form of a voltage waveform representing the LLR values (LLR=
ln[P(y/u=1)/P(y/u=0)]). In order to avoid the use
of a large number of the decoder pin-outs, a S2P interface has been employed [25], which comprises two
pipeline stages embedding two sample-and-hold circuits. This allows the decoder to process the analog signal frame-by-frame. The role of the two stage
buffer is firstly, to keep track of the corresponding
soft-value and secondly, to feed in parallel the decoder inputs through a voltage to current converter
(V2I). Figure 5 depicts the analog Turbo decoder’s
block diagram architecture.
The probability propagation (SPA) is applied at
the factor graph’s level. Figure 6 presents the basic components of a single APP decoder used in the
analog Turbo decoder. Each circuit line represents
Fig.6: The block diagram of the APP decoder with
16-bit length.
a probability distribution function and consequently
the relevant current vector. The boxes represent
the computations based on the elementary module
given in eq.2. Specifically, the A-type modules perform the branch metric computation γ. (It associates
the channel transition probabilities of the information
and parity bits with the relevant edge of the trellis.)
The B-type modules multiply the branch metric by
the extrinsic information δ(u) provided by the other
APP decoder. The C-type and D-type modules are
responsible for the computation of the forward α and
backward metrics b, respectively. The E-type modules multiply the forward metric by the backward one
and the F-type modules compute the extrinsic information E. Finally, the G-type modules provide the
a-posteriori probabilities U of the information bits.
The above computations are based on the SumProduct algorithm and are given analytically in Table
1.
S is a scaling-up coefficient (it differs from module to module), P (y/u) and P (y1 /p) are the channel
transition probabilities for the information and the
parity bits, respectively. The execution time t corresponds to the relevant trellis section.
4. 2 Transistor Level Design
The analog decoder has been designed using a 0.35
µm SiGe BiCMOS process from AMS. The analog
circuit operates with a 5V supply. The S2P interface feeds the voltage-to-current converters, which
are actually differential amplifiers. To efficiently
map the LLR values to voltages, we need to evaluate the dynamic range of the transistors. Simulations/optimizations have shown that for our design
the common mode voltage equals to VCM =2.2 V. The
A High-Speed Analog Turbo Decoder
81
Table 1: APP Computations.
Fig.7: V2I component.
range of the single-ended voltage Vinput is [2.0, 2.4]
volts. Therefore, the LLR values are converted to differential voltages (∆Vdif f erential = 0.2Vp−p ) by scaling and shifting them with constant factors. In our
case, the following equation holds.
Vinput = A ∗ Vth ∗ LLR + VCM = 0.0282 ∗ LLR + 2.2
(3)
The differential currents (I0 and I1 ) are provided
to the analog decoder APPs through the V2Is components (log-likelihood ratios to probabilities conversion). The differential currents represent channel
transition probabilities (see Figure 7). Thus the exponential characteristic of the HBTs regarding the
base-emitter voltage versus the collector current is
expressed as follows:
I0
I
= I
1
∆V
−V
th
1+e
= I0 + I1
, I1 = I
∆V
−V
e
th
∆V
−V
1+e
th
This circuit is a translinear circuit based on a Gilbert
cell that calculates the branch metric and it is depicted in Figure 8 (I). The relevant circuit of the module C that performs the forward recursion of the APP
algorithm is also shown in Figure 8 (II).
The bottom HBT transistor (in both Figures 8 (I)
and 8 (II)) is responsible for the bias current and
the scaling-up of the input currents. This scaling-up
in the input speeds-up the decoder. If chosen properly this bias current corresponds to the probability of
one. The 4 diode connected HBTs transistors (Figure
8 (I)) are used to mirror the input currents into the
basic multiplication circuitry. The rest of the HBTs
transistors perform the assigned calculation. The
pMOS transistors mirror the output current and allow the direct cascading of the modules avoiding extra
interface circuitry. It is obvious that by increasing the
bias current and the emitter length of the HBTs, the
power consumption is increased too. In order to keep
the power consumption and the die area to an acceptable magnitude, the bias current has been chosen to
be equal to Ibias = 688µA and the emitter length of
the HBTs equals to Lemitter = 8×0.35µm. The channel width of each PMOS transistor is W = 10µm.
The values of the reference voltages are VRef a = 2.3
Volts, VRef b = 1.3 Volts and Vb = 1.2 Volts. The
value of the resistor equals to R = 505.65Ω. Same
comments hold for the similar circuits of the C, D, E,
F and G modules.
(4)
4. 3 Simulation Results
As mentioned in section 4.1 the module A calculates the branch metric of the relevant trellis section.
SPICE simulation of the analog Turbo decoder is a
very time-consuming procedure. In order to simulate
82
ECTI TRANSACTIONS ON ELECTRICAL ENG., ELECTRONICS, AND COMMUNICATIONS VOL.3, NO.2 AUGUST 2005
settled within 8ns while existing BiCMOS and CMOS
designs [14] and [10] require 50ns and 1µs, respectively. Similar behavior has also been reported [17]
but only for a single APP decoder. (Similar graph
with Figure 9 have been obtained for the 16-bit case
but only the 10-bit case is shown here for reasons of
clarity.)
The following figure shows this subset of the output differential currents, which correspond to the aposteriori probabilities, provided the transmitted bit
is zero P(ui = 0/yi ). The error-received bits 3 and 4
were finally corrected at the end of 8ns.
Fig.9: Analog Turbo decoder’s error correction capability vs. settling time for 30 bits codeword (10 information bits) at SNR=0.8dB.
Fig.8: (I) Branch metric calculation, (II) schematic
of the forward recursion.
its performance, we have chosen to simulate a Turbo
decoder of 10 and 16 information bits. The analog Turbo decoder is a continuous-time asynchronous
network including a continuous-time feedback loop.
After feeding it with the voltage waveform, which corresponds to a received frame, the decoder settles to a
steady stage and its output provides an estimation of
the transmitted information bits. Hard decisions are
made on soft outputs using a bank of current comparators. Therefore, it avoids the discrete iterations
that its digital counterpart performs before making
the hard decision.
Figure 9 illustrates the transient response of the
proposed analog decoder and characterizes its performance. The use of SiGe HBTs has as a result a reduction to the decoder’s settling time. The used HBTs
have a transient frequency of approximately fT =60
GHz. This gives a significant advantage comparing
to the CMOS and conventional BiCMOS processes
used in other analog decoding designs. The simulated
transient response shows that the current values are
The decoding process of the analog network is divided into two phases. The first phase is related to
the decoder reset by feeding it with an analog signal
that produces equal currents. The time delay for this
procedure has been estimated by simulations to be
equal to Treset = 5ns. We reset the decoder for every
received frame. The other phase involves the supply
of the received frame to the decoder and its decoding,
and it is characterized by the settling time Tsettling .
Thus the total decoding delay equals approximately
to Td = Treset + Tsettling = 13ns. This gives a total
throughput of 16×75M b/s = 1.23Gb/s approximately
for a 16-bit decoder. Figure 10 shows an example of
the decoding progress of the 16-bit analog decoder.
A decoder’s performance and its capability are at
a great extent also determined and evaluated by its
bit error (BER) rate at various SNR values. The performance of the specific 16-bit analog decoder is given
in Figure 11, where the simulated BER is drawn as a
function of the time Tsettling for SNR=0.4 and 0.8 dB.
From the graph, we observed that as the decoder size
increases the BER values are decreased, and its errorcorrecting capability improves as the settling time increases, thus exhibiting the desired performance.
A High-Speed Analog Turbo Decoder
83
Table 2:
coders.
Fig.10: Transient response of the decoding process
for 5 consecutive received frames (SNR=0.8dB).
Throughput of Digital and Analog De-
Decoder
Information bits
Throughput
Analog SiGe Turbo
decoder (This work)
16
1.2Gb/s
Analog CMOS Turbo
decoders [8, 10]
16-40
2-13Mb/s
FPGA based Turbo
decoder [20, 21 ,22]
4k-64k
(6.1-6.5) Kb/s-
Digital ASIC Turbo
decoder [23]
16k
356Kb/s
311Mb/s
reliable metric for comparing various decoders implemented in different technologies (in general BiCMOS
designs are faster but consume more power than the
relevant CMOS ones). Table 3 depicts the performance of analog and digital decoders based on this
metric.
Table 3: Energy consumption per decoded bit for
digital and analog Turbo Decoders.
Energy consumption
per decoded bit
(nJ/decoded bit)
Fig.11: BER performance of the analog Turbo decoder as a function of the settling time.
5. COMPARISON OF THE PROPOSED
DECODER WITH EXISTING ANALOG
AND DIGITAL IMPLEMENTATIONS
The simulation results show that our 16-bit analog decoder under implementation, which uses SiGe
BiCMOS technology, can achieve a high-throughput
of 1.2Gb/s approximately. On the other hand, the
throughput of existing analog CMOS implementations [9, 10] varies from 2-13 Mb/s. This indicates the applicability of this analog decoder to
high-performance systems, like storage applications,
VDSL systems as well as optical systems. The
throughput of digital implementations based on Field
Programmable Gate Arrays (FPGAs) ranges from
356Kbit/s (64k info bits) [20] to 6.1-6.5Mb/s (4k-info
bits) [21, 22] while the fastest ASIC implementation
[23] of Turbo codes achieves throughput of 311Mbit/s
(16k-info bits) at the expense of high power consumption and silicon area. Table 2 shows the throughput
of Digital and Analog Turbo Decoders.
The energy consumption per decoded bit at a specific operating supply voltage can be considered as a
Decoders
7.088@5V
(Estimated) Analog Turbo decoder in
0.35µm SiGe BiCMOS (This work)
[email protected]
(Measured) 0.35µmCMOS Analog
Turbo decoder [8]
[email protected]
(Measured) 0.35µm CMOS Analog
Turbo decoder [10]
40@5V
(Measured) 0.8µm CMOS Digital
Turbo decoder [24]
960@5V
(Estimated) TMS320C55XTM DSP
[25]
Table 3 shows that the analog decoder presented
here outperform digital implementations. Indeed,
digital decoders demand ADC converters of high resolution to increase the BER performance. Additionally, digital implementations are dominated by excessive memory requirements, which are due to the excessive block length, the need for floating-point storage at the interleavers and for exponential computations. Thus, they require much greater chip area than
an analog decoder. For instance, the type-D module
used for the backward recursion requires 8 multiplications and 4 additions for the digital case, while the
analog implementation requires no more than 49 transistors (including the current mirrors and bias) for
the same task. Regarding our decoder based on SiGe
HBTs, it greatly outperforms existing analog CMOS
implementations in terms of speed but as expected
consumes more power. This is due to the inherent
nature of CMOS transistors, at the expense however
of the great difficulty to have them operating at the
subthreshold region. The estimated power consump-
84
ECTI TRANSACTIONS ON ELECTRICAL ENG., ELECTRONICS, AND COMMUNICATIONS VOL.3, NO.2 AUGUST 2005
tion of our 16-bit analog decoder is approximately
P=886mW at 5V supply. Furthermore, it should be
noted that, as shown in Table 3, the energy consumption per decoded bit of our decoder is at the same
order of magnitude with the corresponding CMOS
ones, which demonstrates that the capability of high
throughput is not translated to excessive power consumption.
6. CONCLUSION
In this paper the design of a high-speed analog
Turbo Decoder with low energy consumption per decoded bit was presented. Its implementation is based
on the 0.35 µm SiGe BiCMOS AMS process and, to
our knowledge, it is one of the first analog Turbo Decoder’s realizations in SiGe. The derived results so
far demonstrate the efficiency and the applicability of
this design for high throughput systems, being able to
reach speeds of the order of Gb/s. This exhibits superior performance than the few existing analog CMOS
ones and outperforms similar digital designs in terms
of speed and power consumption, while also its size
is certainly smaller. Currently, a 16-bit version of
this decoder is being implemented with the layout of
the design and the prototype board being under construction and about to be sent for fabrication.
ACKNOWLEDGEMENT
This work has been partially supported by the “K.
Karatheodoris” basic research initiative of the University of Patras.
References
[1] C. Berrou, A. Glavieux, and P. Thitimajshima,
“Near Shannon limit error-correcting coding and
decoding: Turbo-codes (1),” IEEE International
Conference on Communications (ICC93), pp.
1064-1071, vol. 2/3, May 1993.
[2] R. Gallager, “Low density parity check codes,”
IRE Transactions on Information Theory, vol. IT8, pp. 21-28, Jan. 1962.
[3] N. Wiberg, Codes and decoding on general graphs,
Ph.D. dissertation 440, Linkoping Studies in Science and Technology, University of Linkoping,
Linkoping, Sweden, 1996.
[4] F. R. Kschischang, B. J. Frey, and H. A. Loeliger,
“Factor graphs and the sum-product algorithm,”
IEEE Transactions on Information Theory, vol.
47, pp. 498-519, Feb. 2001.
[5] L. R. Bahl, J. Cocke, F. Jelinek, and J. Raviv,
“Optimal decoding of linear codes for minimizing symbol error rate,” IEEE Transactions on Information Theory, vol. IT-20, pp. 284-287, Mar.
1974.
[6] H. A. Loeliger, F. Lustenberger, M. Helfenstein,
and F.Tarkoy, “Probability propagation and decoding in analog VLSI,” IEEE Transactions on
Information Theory, vol. 47, no. 2, pp. 837-843,
Feb. 2001.
[7] J. Hagenauer and M. Winklhofer, “The analog decoder,” Proceedings of IEEE International Symposium on Information Theory (ISIT98), 16-21
Aug.1998, Boston, USA, p.145.
[8] V. C. Gaudet, and P. G. Gulak, “A 13.3Mb/s
0.35 µm CMOS Analog Turbo Decoder IC with a
Configurable Interleaver,” IEEE Journal of SolidState Circuits, vol. 38, no. 11, pp.2010-2015, Nov.
2003.
[9] A. Xotta, D. Vogrig, A. Gerosa, A. Neviani, A.
Graell i Amat, G. Montorsi, M. Bruccoleri, and G.
Betti “An All-Analog CMOS Implementation of a
Turbo Decoder for Hard-Disk Drive Read Channels”, IEEE International Symposium on Circuits
and Systems (ISCAS02), 26-29 May 2002, Scottsdale, USA, pp.69-72,.
[10] D.Vogrig, A. Gerosa, A. Neviani, A. Graell i
Amat, G. Montorsi, and S. Benedetto, “A full
CMOS Analog Turbo Decoder for UMTS Coding Schemes,” 2nd Analog Decoding Workshop
(ADW03), Sep. 2003, Zürich, Switzerland.
[11] C. Winstead, J. Dai, S. Yu, C. Myers, R.R. Harrison, and C. Schlegel, “CMOS Analog MAP Decoder for (8, 4) Hamming Code,” IEEE Journal
of Solid State Circuits, vol. 39, no. 1, pp. 122-131,
Jan. 2004.
[12] M. Moerz, T. Gabara, R. Yan, and J. Hagenauer,
“An analog 0.25 µm BiCMOS tailbiting MAP decoder,” IEEE International Solid-State Circuits
Conference Digest Technical Papers (ISSCC00),
Feb. 2000, San Francisco, USA, pp. 356-357,.
[13] F. Lustenberger, M. Helfenstein, H. A. Loeliger,
F. Tarkoy, and G. S. Moschytz, “All-analog decoder for a binary (18, 9, 5) tail-biting trellis
code,” Proceedings of 25th European Solid-State
Circuits Conference (ESSCIRC99), Sept. 1999,
Duisburg, Germany, pp. 362-365.
[14] F. Lustenberger, Ph.D. Dissertation, On the Design of Analog Iterative VLSI Decoders, ETH
Zürich, No 13879, Hartung-Gorre, Konstanz, Series in Signal and Information Processing, Vol. 2,
Nov. 2000.
[15] B. Gilbert, “A precise four-quadrant multiplier
with subnanosecond response,” IEEE Journal of
Solid-State Circuits, vol. 3, pp. 365-373, 1968.
[16] Y. Tsividis, Operation and Modeling of the MOS
Transistor, 2nd edition, McGraw-Hill, Apr. 1999.
[17] W.Huang, V. Igure, G. Rose, Y. Zhang, and
M. Stan, “Analog Turbo Decoder Implemented
in SiGe BiCMOS Technology,” 40th Design Automation Conference Student Design Contest
(DAC03), Jun. 2003, Anaheim, California,.
[18] B. Vucetic, and J. Yuan, Turbo Codes: Principles and Applications, Kluwer Academics Publishers, 2000.
[19] S. Yu, C. Winstead, C. Myers, R. Harrison, and
A High-Speed Analog Turbo Decoder
C. Schlegel, “An analog decoder for (8, 4) Hamming code with Serial Input Interface,” Proceedings of IEEE International Symposium on Circuits and Systems (ISCAS02), May 2002, Arizona, USA.
[20] S. S. Pietrobon, “Implementation and performance of a turbo/MAP decoder,” International
Journal of Satellite Communications, vol. 16, pp.
23-46, Jan.-Feb. 1998.
[21] Small
World
Communications,
http://www.sworld.com.au.
[22] J. Steensma, C. Dick, “FGPA Implementations
of a 3GPP Turbo codec,” Proceedings of 35th
IEEE Asilomar conference on Signal, Systems
and Computers (Asilomar01), Nov. 2001, Pacific
Grove, CA, USA, pp. 61-65.
[23] Comtech
AHA
Corporation,
http://www.aha.com.
[24] C. Berrou, P. Combelles, P. Pénard, and B. Talibart, “An IC for turbocodes encoding and decoding,” IEEE International Solid-State Circuits
Conference Digest Technical Papers (ISSCC95),
Feb. 1995, San Francisco, USA, pp. 90-91.
[25] Texas Instruments, http://www.ti.com.
Fotios Gioulekas received his Diploma
in Electrical and Computer Engineering
from the University of Patras, Greece, in
2000. He is currently working towards
the Ph.D. degree in Electrical and Computer Engineering at the University of
Patras. His research interests include
analog VLSI implementation of iterative
error-decoding alogrithms, and information theory.
Michael Birbas received the Diploma
and Ph.D. degrees in Electrical Engineering from the University of Patras,
Greece in 1985 and 1991. From 19861991 and 1992-1995 he worked with the
teams of the VLSI Design Lab. and the
Applied Electronics Lab. of the University of Patras respectively, in VLSI implementation of DSP algorithms and in
embedded systems design of telecommunication applications. From 1993-1999
he worked at Synergy Systems S.A., a high-tech company in
the area of microelectronics, as R&D projects manager. From
1999 -2002 he was with GIGA Hellas S.A-an Intel company,
(part of the Optical Products Group of Intel), working as R&D
manager in the development of 10-40 Gbit/s system solution
demonstrators for SDH/SONET and OTN applications and of
40 Gbits/sec Ser/Des components. Since 2002 he is with the
team of Applied Electronics Lab. of the University of Patras
working in the fields of embedded systems design and efficient
analog/digital VLSI implementations of soft iterative decoding
algorithms.
85
Alexios Birbas is an Associate Professor at the Dept. of Electrical and Computer Engineering of the University of
Patras, Greece. He holds a Ph.D. degree from the University of Minnesota,
Minneapolis USA. His research interests
focus in device electronics, modeling of
electron devices, analog circuit design,
soc design and co-design. He has more
than 50 publications in related matters
and has taken various industry consultant positions.
George Bilionis received his Diploma
in Electrical and Computer Engineering
from the University of Patras, Greece,
in 2000. He is currently working towards the Ph.D. degree in Electrical and
Computer Engineering at the University
of Patras. In 2001, he was with Giga
Aps (an Intel company) where he was involved with the design and development
of circuits for 40 Gb/s optical communications. His research work is in RF
analog and mixed-signal circuits for wireless and optical communications.