Survey
* Your assessment is very important for improving the workof artificial intelligence, which forms the content of this project
* Your assessment is very important for improving the workof artificial intelligence, which forms the content of this project
An Analog Turbo Decoder for an (8,4) Product Code Neiyer Correal and Joe Heck Florida Communications Research Labs Motorola Plantation, FL Matthew C. Valenti Assistant Professor Lane Dept. of Comp. Sci. & Elect. Eng. West Virginia University Morgantown, WV [email protected] copyright 2002 Motivation & Goals Motivation Iteratively decodable code are capable of nearShannon limit performance. • e.g. turbo codes, LDPC codes, turbo product codes. Digital hardware has its limitations • Clock functions demand power and real estate. • Turbo decoding uses nonlinear functions that are awkward in digital but natural in analog. Goal of this study © 2002 To design and implement a simple iterative decoder in analog. • Simple: (8,4) product code created using a 2 by 2 array of (3,2) single parity check codes. Target throughput is 1 Gbps Overview of Talk Related work Single parity check (SPC) codes Turbo product codes © 2002 Encoder Soft-input/Soft-output (SISO) decoder Analog design of the SISO decoder Encoder Turbo Decoder Comments & Conclusions Related Work Inspiration: Carver Mead (Cal Tech) 1980’s Observation: nervous cells and CMOS transistors operating in subthreshold mode have same physics. Adaptive analog systems with the robustness of digital yet 2-orders of magnitude reduction in power. Primarily image processing applications. Recent work: Turbo Decoding J. Hagenauer (T.U. Munich) © 2002 • Log-likelihood domain circuits. Loeliger + Lustenberger (E.T.H. Zurich) • Circuits operate on raw probabilities rather than LLRs. Single Parity Check Codes Generic (n,k) SPC code: The (3,2) SPC code: © 2002 The input to the SPC encoder is a stream of k data bits: {u1, u2, …, uk} A single parity bit is created by xor-ing the data: up = u1u2uk The n=k+1 bit output is the k input bits and the parity bit: u = {u1, u2, …, uk ,up} Let the two input bits be denoted uj and uk The parity bit is uj,k = ujuk The output is u = {uj, uk, uj,k } Modulation and Channel Model BPSK modulation x = +1 if u = 0 and x = -1 if u = 1 Modulated code word: x = {xj, xk, xj,k } Received code word: y = {yj, yk, yj,k } Channel characterized by conditional pdf: f(y|x) • AWGN: N f ( y | x) ~ x Ε s , 0 2 © 2002 • Rayleigh flat-fading N0 f ( y | x) ~ ax Es , 2 Scaling the Input Input to the SISO decoder must be scaled: In log-likelihood form: f ( y | x 1) E 4ay s L( y | x) ln N0 f ( y | x 1) SNR © 2002 Decoder input is: r={rj,rk,rj,k}={L(yj|xj),L(yk|xk),L(yj,k|xj,k)} fade coefficient (=1 for AWGN) APP SISO Decoder a priori information from other decoders L(xj) scaled rj rj,k channel rk observations L(xk) SISO dec. #1 L(uj|y) Le(xj) Le(xk) L(uk|y) a posteriori estimates: used to make final bit decision extrinsic information to other decoders P[ x 1] L( x) ln P [ x 1 ] Input-output relationships (BCRJ algorithm): rj ,k Le ( x j ) 2 tanh tanh 2 © 2002 1 rk L( xk ) tanh 2 extrinsic rj ,k r L( x j ) tanh j Le ( xk ) 2 tanh 1 tanh 2 2 L( x j | y ) rj L( x j ) Le ( x j ) a posteriori L( xk | y ) rk L( xk ) Le ( xk ) information: exploits structure of code LLR estimates Box-Plus Operator The heart of the decoder is a soft-xor operation, also called “box-plus” L(u j u j ,k ) u j © 2002 u j ,k L(u j ) L(u j ,k ) tanh 2 tanh tanh 2 2 1 This can be implemented in analog by a modified Gilbert multiplier Modified Gilbert Multiplier Current Mirror Current Mirror I I Vcc I1 I2 Vcc V2 Vo V1 © 2002 I EE L(u j u j ,k ) u j u j ,k L(u j ) L(u j ,k ) tanh 2 tanh tanh 2 2 1 Computing Extrinsic Information Vdda Vgp + rk/VT 50 A Vdda + Le(xj)/VT + L(xk)/VT Vdda + R12 Q1 Q0 rj,k/VT L(xk) Q13 rk rj,k Le(xj) Vbn Q10 50 A Computing a Posteriori LLR Vdda R8 R6 R7 + L(xj|y)/VT + rj/VT + L(xj)/VT Vbn rj L(xj) Le(xj) L(xj|y) + Le(xj)/VT SISO Decoder for (3,2) SPC Code L(xj) rj rj,k rk L(xk) L(xj|y) Le(xj) Le(xk) L(xk|y) Components: 2 units for computing extrinsic information. • 6 PMOS + 10 BJT transistors + 7 resistors each. 2 units for computing a priori LLR. © 2002 • 9 BJT transistors + 9 resistors each. Product Codes u1 u3 u1,2 u4 = u1,3 u2,4 u3,4 A more powerful code can be created by using multiple SPC codes. © 2002 u2 = 2 by 2 array of SPC codes can correct a single error when using hard decision decoding. However, we want a soft-decision decoder. © 2002 Turbo Decoding of Product Codes r1 r2 r1,2 Le(x1) Le(x2) r3 r4 r3,4 Le(x3) Le(x4) r1,3 r2,4 Le(x1) Le(x3) Le(x2) Le(x4) Turbo Decoder r1 r1,2 r2 r3 © 2002 r4 r3,4 r1,3 r2,4 vertical dec. #1 vertical dec. #2 horiz. dec. #1 horiz. dec. #2 L(u1|y) L(u2|y) L(u3|y) L(u4|y) Vertical Decoder The a priori LLR is computed using the output of the horizontal decoding. Thus, the vertical decoder does not need the three-input adders L(xj) rj rj,k rk © 2002 L(xk) Le(xj) Le(xk) Circuit Complexity Two horizontal decoders, each with: Two vertical decoders, each with: 2 units for computing extrinsic information. Total complexity: © 2002 2 units for computing extrinsic information. 2 units for computing a priori LLR. 8 units for computing extrinsic information. 4 units for computing a priori LLR. 48 PMOS transistors 116 BJT transistors 92 Resistors Comments on Analog Decoding Complicated codes can be decoded using the same approach. Cascade of analog cells. Problems Transistor mismatch. • Equivalent to having additional noise at input. • Current transistor tolerances have same impact as using 5-10 bit quantization in a digital implementation. Thermal gradients • Proper scaling will mitigate intercell gradients. • However, intracell gradients will hurt performance. © 2002 Decoding serial data in parallel • Need to store analog values (e.g. sample/hold). • Could work well with multicarrier modulation. Design and test • Spice-level simulations needed, but take too long. Conclusion Advantages of analog decoders: Implementation issues may hinder performance However, suboptimality in local cells is okay as long as overall system performance is acceptable. Extensions of this work © 2002 No costly clock functions smaller footprint/expense. No concept of iteration faster convergence. easily implemented nonlinear functions. (n,k) SPC code for k>2. Product code with 3 dimensions or more. Other types of codes: Hamming, BCH. Turbo and LDPC codes. Other iterative processing schemes, e.g. turbo-EQ