CALIFORNIA STATE UNIVERSITY, NORTHRIDGE

REED-SOLOMON CODES ENCODER/DECODER MICROPROCESSOR BASED SYSTEM

A thesis submitted in partial satisfaction of the requirements for the degree of Master of Science in Engineering

by

Halima Makady El Naga

May, 1987

Copyright 1987 by Halima Makady El Naga

The Thesis of Halima Makady El Naga is approved:

Dr. Robert Henderson
Dr. Jagdish Prabhakar, Committee Chair

California State University, Northridge

To My Parents

Acknowledgements

I wish to express my sincere thanks to my thesis advisor, Dr. Jagdish Prabhakar, for his advisement and the much appreciated time he spent towards improving the final form of this thesis. Special thanks to my husband Nagi for his continuous support, patience, encouragement and advisement.

TABLE OF CONTENTS

List of Tables
List of Figures
Abstract

CHAPTER I: INTRODUCTION
  1.1 Introduction
  1.2 Objective
  1.3 Project Outline

CHAPTER II: REED-SOLOMON CODES
  2.1 Hamming Codes
  2.2 General BCH Codes
  2.3 Reed-Solomon Codes
  2.4 Encoding of Reed-Solomon Codes
  2.5 Decoding of Reed-Solomon Codes

CHAPTER III: SYSTEM SPECIFICATION AND GENERAL DESCRIPTION
  3.1 System Specification
  3.2 System Description
    3.2.1 The Encoder
    3.2.2 The Decoder

CHAPTER IV: ENCODER HARDWARE DESIGN
  4.1 Introduction
  4.2 Field Element Multiplier Hardware Implementation
    4.2.1 Two-Level ROM Implementation
    4.2.2 One-Level ROM Implementation
    4.2.3 Combinational Circuit Implementation
  4.3 Encoder Hardware Implementation

CHAPTER V: THE SYNDROME PROCESSOR
  5.1 The Syndrome Processor Description
  5.2 The Syndrome Processor Hardware Design
    5.2.1 Using Two-Level ROM Multiplier Implementation
    5.2.2 Using One-Level ROM Multiplier Implementation
    5.2.3 Using Combinational Circuit Multiplier Implementation

CHAPTER VI: THE SIGMA PROCESSOR
  6.1 Introduction
  6.2 The Iterative Algorithm For Finding The Error-Location Polynomial
  6.3 The Sigma Processor Design
    6.3.1 Microprocessor Based System Structure
    6.3.2 The Sigma Processor Hardware System Design
      6.3.2.1 The Intel 8085 Microprocessor
      6.3.2.2 Memory Unit
      6.3.2.3 Input/Output Ports
    6.3.3 The Sigma Processor Software System Design
      6.3.3.1 Data Structure Description
      6.3.3.2 P( ) and V( ) Transforms
      6.3.3.3 Main Program Description
      6.3.3.4 The Sigma Routine
      6.3.3.5 The Discrepancy Routine

CHAPTER VII: THE ERROR-LOCATION PROCESSOR
  7.1 Introduction
  7.2 Error-Location Processor Design
    7.2.1 The Root Locator Design
    7.2.2 The Counters Design
    7.2.3 The Stack Register Design
    7.2.4 System Control Design
  7.3 System Operation

CHAPTER VIII: THE ERROR-MAGNITUDE PROCESSOR
  8.1 Introduction
  8.2 Error-Magnitude Processor Hardware Design
    8.2.1 Input/Output Ports
  8.3 Error-Magnitude Processor Software Design
    8.3.1 Data Structure Description
    8.3.2 Main Program Description
    8.3.3 The Error-Evaluation Polynomial Routine
    8.3.4 The Z(Xc) Routine
    8.3.5 The Product Routine

CHAPTER IX: THE ERROR-CORRECTION PROCESSOR AND THE OVERALL CONTROLLER
  9.1 The Error-Correction Processor Design
  9.2 The Queue Buffer
  9.3 The System Overall Controller

CHAPTER X: CONCLUSION

REFERENCES

LIST OF TABLES

3.1 $GF(2^8)$ Elements Generated By $p(X) = 1 + X^2 + X^3 + X^4 + X^8$
4.1 Time Delay and IC Chip Count for the Encoder Circuit
6.1 P-Transform of $GF(2^8)$ Elements
6.2 V-Transform of $GF(2^8)$ Elements
6.3 The Sigma Processor IC Parts List
8.1 The Error-Magnitude Processor IC Parts List

LIST OF FIGURES

3.1 Encoder Block Diagram
3.2 Reed-Solomon Decoder Block Diagram
4.1 Reed-Solomon Encoder Block Diagram
4.2 Reed-Solomon Encoder Using Two-Level ROM Multiplier Implementation
4.3 Combinational Circuit Multiplier Implementation
4.4 XOR Tree Implementation
5.1 Reed-Solomon Codes Syndrome Computation Circuit
5.2 Syndrome Processor Block Diagram
5.3 Syndrome Unit Multiplier Implementation Using Two-Level ROM
5.4 Syndrome Unit Implementation Using Combinational Logic Circuit Multiplier
6.1 Sigma Processor Inputs and Outputs
6.2 Microprocessor Based System Block Diagram
6.3 Sigma Processor Circuit Diagram
6.4 Intel 8085A CPU Functional Block Diagram
6.5 Intel 8085A Microprocessor and Address Latch
6.6 Buffer Pin Connections
6.7 Intel 8085A Pin Out Diagram
6.8 Program Memory Unit
6.9 The RAM Unit
6.10 Input Ports Configuration
6.11 Input Port Connections
6.12 Output Ports Configuration
6.13 Sigma Processor Data Structure
6.14 Main Program Flow Chart
6.15 The Sigma Routine Flow Chart
6.16 The Discrepancy Routine Flow Chart
7.1 Error-Location Processor Inputs and Outputs
7.2 Error-Location Processor Block Diagram
7.3 Root-Locator Circuit Diagram
8.1 Error-Magnitude Processor Inputs and Outputs
8.2 Input Ports Configuration
8.3 Output Ports Configuration
8.4 Error-Magnitude Processor Data Structure
8.5 Main Program Flow Chart
8.6 Error Evaluation Polynomial Routine Flow Chart
8.7 The Z(Xc) Routine Flow Chart
8.8 The Product Routine Flow Chart
9.1 Error-Correction Processor Circuit Diagram
9.2 The Queue Buffer Circuit Diagram
9.3 Overall Controller Control Signals
9.4 Control Signals Timing Diagram

ABSTRACT

REED-SOLOMON CODES ENCODER/DECODER MICROPROCESSOR BASED SYSTEM

By Halima Makady El Naga
Master of Science in Engineering

In this project, Reed-Solomon codes encoding and decoding algorithms are first discussed. A complete logic circuit design of the (255, 245) Reed-Solomon code encoder/decoder microprocessor based system is then presented. This code is defined over Galois Field $GF(2^8)$ and has the capability of correcting up to five burst errors of 8 bits each, or any burst combination of up to a total length of 40 bits, provided they only affect a maximum of five individual symbols (bytes). For better efficiency, this design features a five-stage pipelined structured decoder which utilizes the parallelism in the decoding algorithm. Berlekamp's iterative algorithm is used to determine the coefficients of the error location polynomial, and Chien's searching algorithm is used to find its roots. In this design, only off the shelf integrated circuits are used. The Intel 8085 microprocessor has been utilized as a data processor in two of the decoder pipelined stages. Alternative design methods of various system parts have been investigated, and speed and time delay measurements of these parts are included.

CHAPTER I
INTRODUCTION

1.1 Introduction:

The error detection and correction system, which is responsible for the reliable recovery of digital data, has become one of the important parts in the design of modern digital data communication and storage systems. The reason is partly the intolerance of either system to error and, in some cases, partly the critical nature of the data.
Although several powerful error detecting and correcting codes have been known for some time, they have not been extensively used in these systems. On one hand, because of the complexity of their encoding and decoding algorithms, the amount of hardware required to implement their encoders and decoders was too large and too expensive to build. On the other hand, since relatively primitive single-short-burst error correcting codes (e.g. Fire codes) were sufficient to achieve adequate system-level performance at that time, the use of more powerful codes was not needed.

However, over the past two decades, the cost of solid state electronic devices, particularly digital devices, has decreased dramatically. This has stimulated the development of automatic data processors, digital computers, long range communications (such as with satellites) and peripheral devices. This, in turn, has caused a dramatic increase in the volume of data communicated between such machines. As an example, the development of optical disks, with data densities of 25,000 bits per inch and 10,000 tracks per inch compared to 4,000 bits per inch and 200 tracks per inch for magnetic disks, means that data densities have increased by more than 250 times. As a result, and in spite of the improvement in the storage media characteristics, the raw error rates have very much increased [5].

Under these conditions of much higher raw error rates and cheaper hardware, it has become necessary to consider more powerful error detecting and correcting codes to maintain and possibly improve reliability and performance. These codes should be capable of correcting multiple burst errors. Reed-Solomon codes have shown a large cost/performance advantage over all other competitors, and they have the capability of correcting multiple random as well as long burst errors.
The design of a Reed-Solomon code encoder/decoder system requires a very good knowledge of both digital hardware design principles and the theory of error control coding in general and decoding algorithms for algebraic codes in particular. Although Reed-Solomon decoders have already been built, only a small amount of literature is available. The reason is that most digital design engineers may not have the knowledge of the theory of error control coding, and the few companies that have the capability of designing Reed-Solomon decoders reveal nothing of the design in order to retain their hold on the growing market.

1.2 Objectives:

In this project, the general procedures of encoding and decoding Reed-Solomon codes are discussed first and the hardware required for implementing the encoder and decoder is presented. As a typical design example, a complete detailed design (using off the shelf integrated circuits) of an encoder/decoder system for a selected Reed-Solomon code is presented. The code selected is the (255,245) Reed-Solomon code defined over Galois Field $GF(2^8)$. This code has a data block length of 255 symbols. Each symbol is represented by an 8-bit byte, making the total length 2040 (255 x 8) binary bits. It has the capability of correcting up to five burst errors of 8 bits each or any burst error combination of up to a total length of 40 bits provided they only affect a maximum of five individual symbols. For better efficiency, a pipelined structured decoder is considered. In the decoder design, the Intel 8085 microprocessor has been utilized as a data processor and a system controller. Speed and time delay measurements of various system parts are also included.

1.3 Project Outline:

Chapter 2 introduces Reed-Solomon codes and their encoding and decoding algorithms. Chapter 3 provides a Reed-Solomon encoder/decoder system specification and general hardware description.
In Chapter 4, a complete hardware encoder design is presented and various implementation methods are reviewed. The design of the various stages of the pipelined decoder is discussed in Chapters 5 through 9. Finally, Chapter 10 presents the results and conclusions of the project.

CHAPTER II
REED-SOLOMON CODES

2.1 Hamming Codes:

Hamming codes are the first class of linear codes devised for error correction [5]. These codes and their variations have been widely used for error control in digital communication and data storage systems. For any positive integer $m \ge 3$, there exists a Hamming code with the following parameters:

Code length: $n = 2^m - 1$
Number of information symbols: $k = 2^m - m - 1$
Number of parity check symbols: $n - k = m$
Minimum distance: $d_{min} = 3$
Error correcting capability: $t = 1$

Hamming codes are single-bit error correcting codes, and can be extended to correct single-bit and detect double-bit errors. A cyclic Hamming code of length $2^m - 1$ is generated by a primitive polynomial p(X) of degree m.

2.2 General BCH Codes:

The Bose, Chaudhuri and Hocquenghem (BCH) codes form a class of powerful random error correcting cyclic codes. These codes are a generalization of Hamming codes for correcting multiple errors. In general, BCH codes are defined as follows:

If p is a prime number and q is any power of p, there are codes with symbols from the elements of Galois Field GF(q). These codes are called q-ary codes. An (n,k) linear code with symbols from GF(q) is a k-dimensional subspace of the vector space of all n-tuples over GF(q). A q-ary (n,k) cyclic code is generated by a polynomial of degree (n-k) with coefficients from GF(q), which is a factor of $X^n - 1$. For any positive integers s and t, there exists a q-ary BCH code with the following parameters:

Block length: $n = q^s - 1$
Number of parity check digits: $n - k \le 2st$
Minimum distance: $d_{min} \ge 2t + 1$

This code is capable of correcting any combination of t or fewer errors in a code block of $n = q^s - 1$ digits.
Let @ be a primitive element in $GF(q^s)$. The generator polynomial g(X) of the t-error-correcting BCH code is the lowest-degree polynomial with coefficients from GF(q) which has $@, @^2, @^3, \ldots, @^{2t}$ as its roots. Let $\phi_i(X)$ be the minimal polynomial of $@^i$; then

\[ g(X) = \mathrm{LCM}\{\phi_1(X), \phi_2(X), \ldots, \phi_{2t}(X)\} \]

Since the degree of each minimal polynomial is s or less, the degree of g(X) is at most 2st.

A special subclass of the BCH codes is given by q = 2. These codes are called binary BCH codes. For a primitive BCH code, n is restricted to be $2^m - 1$; for a nonprimitive BCH code, n may be any other odd number. In addition to the binary BCH codes, there are also nonbinary codes. Among the nonbinary BCH codes, the most important subclass is the class of Reed-Solomon codes, which are defined in the following section.

2.3 Reed-Solomon Codes:

Reed-Solomon codes are a special subclass of q-ary BCH codes for which s = 1. A t-error-correcting Reed-Solomon code with symbols from GF(q) has the following parameters:

Block length: $n = q - 1$
Number of parity check digits: $n - k = 2t$
Minimum distance: $d_{min} = 2t + 1$

In this project, only Reed-Solomon codes with symbols from the Galois Field $GF(2^m)$ will be considered. Since the main goal of this project is to design complete hardware and software systems to encode and decode Reed-Solomon codes, the encoding and decoding algorithms are discussed only very briefly and without any formal proof. The reader interested in the details of these algorithms is referred to references [3] and [13].

Let @ be a primitive element in $GF(2^m)$. The generator polynomial of a primitive t-error-correcting Reed-Solomon code of length $2^m - 1$ is:

\[ g(X) = (X + @)(X + @^2) \cdots (X + @^{2t}) = g_0 + g_1 X + \cdots + g_{2t-1} X^{2t-1} + X^{2t} \]

This is an (n, n-2t) code that consists of n symbols and has $d_{min} - 1$ parity check symbols. Since $q = 2^m$, each q-ary symbol can be expressed as an m-tuple over GF(2).
Consequently, a t-error-correcting $(2^m - 1,\ 2^m - 1 - 2t)$ Reed-Solomon code over $GF(2^m)$ can be regarded as an $(m(2^m - 1),\ m(2^m - 1 - 2t))$ code over GF(2) which is capable of correcting any error pattern whose nonzero digits are confined to t blocks of m binary digits each (t symbols). Thus Reed-Solomon codes are very effective in correcting multiple burst errors.

2.4 Encoding of Reed-Solomon Codes:

Given the generator polynomial g(X) of an (n, n-2t) Reed-Solomon code, the code can be encoded into a systematic form as follows. Let

\[ U(X) = u_0 + u_1 X + \cdots + u_{k-1} X^{k-1} \]

be the message to be encoded, where k = n - 2t. Multiplying U(X) by $X^{2t}$ we obtain a polynomial of degree n-1 or less:

\[ X^{2t} U(X) = u_0 X^{2t} + u_1 X^{2t+1} + \cdots + u_{k-1} X^{n-1} \]

Dividing $X^{2t} U(X)$ by the generator polynomial g(X), we have:

\[ X^{2t} U(X) = a(X)\, g(X) + b(X) \qquad (2.1) \]

where a(X) and b(X) are the quotient and remainder respectively. Since the degree of g(X) is 2t, the degree of b(X) must be 2t-1 or less, that is

\[ b(X) = b_0 + b_1 X + \cdots + b_{2t-1} X^{2t-1} \]

Rearranging Equation (2.1) (addition and subtraction coincide in $GF(2^m)$) we obtain a polynomial V(X) which is a multiple of g(X); therefore it is a code polynomial:

\[ V(X) = b(X) + X^{2t} U(X) = a(X)\, g(X) = b_0 + b_1 X + \cdots + b_{2t-1} X^{2t-1} + u_0 X^{2t} + \cdots + u_{k-1} X^{n-1} \]

This polynomial corresponds to the code vector:

\[ (b_0, b_1, \ldots, b_{2t-1}, u_0, u_1, \ldots, u_{k-1}) \]

The first 2t elements are the parity check symbols and the remaining k symbols are the information symbols.

2.5 Decoding of Reed-Solomon Codes:

Let

\[ V(X) = v_0 + v_1 X + \cdots + v_{n-1} X^{n-1} \]

be the transmitted code vector and

\[ r(X) = r_0 + r_1 X + \cdots + r_{n-1} X^{n-1} \]

be the received vector. Then the error pattern is

\[ e(X) = r(X) - V(X) = e_0 + e_1 X + \cdots + e_{n-1} X^{n-1} \]

where $e_i = r_i - v_i$ is a symbol from $GF(2^m)$. Suppose that the error pattern e(X) has f errors at locations $X^{j_1}, X^{j_2}, \ldots, X^{j_f}$, where $0 \le j_1 < j_2 < \cdots < j_f \le n-1$, and has error magnitudes $e_{j_1}, e_{j_2}, \ldots, e_{j_f}$; then

\[ e(X) = e_{j_1} X^{j_1} + e_{j_2} X^{j_2} + \cdots + e_{j_f} X^{j_f} \qquad (2.2) \]

Since $@, @^2, \ldots, @^{2t}$ are roots of each code polynomial, $V(@^i) = 0$ for $1 \le i \le 2t$. The ith component of the syndrome is given by:

\[ S_i = r(@^i) = V(@^i) + e(@^i) = e(@^i) \qquad (2.3) \]

From (2.2) and (2.3) we obtain the following equations:

\[ S_i = e_{j_1} @^{i j_1} + e_{j_2} @^{i j_2} + \cdots + e_{j_f} @^{i j_f}, \qquad 1 \le i \le 2t \qquad (2.4) \]

where the $e_{j_l}$ and $@^{j_l}$ are unknown. Any method of solving these equations is the basis for an error correction procedure. These equations are nonlinear. There are many possible (but finitely many) solutions, and the correct solution is the one that yields an error pattern with the smallest number of errors. This error pattern is the most probable error pattern caused by the channel noise. In the following, a method of solving these equations is discussed.

Since the location of an error is given in terms of $@^{j_l}$, this quantity is called the error location number. Let $B_l = @^{j_l}$; then

\[ \begin{aligned}
S_1 &= r(@) = e_{j_1} B_1 + e_{j_2} B_2 + \cdots + e_{j_f} B_f \\
S_2 &= r(@^2) = e_{j_1} B_1^2 + e_{j_2} B_2^2 + \cdots + e_{j_f} B_f^2 \\
&\ \ \vdots \\
S_{2t} &= r(@^{2t}) = e_{j_1} B_1^{2t} + e_{j_2} B_2^{2t} + \cdots + e_{j_f} B_f^{2t}
\end{aligned} \qquad (2.5) \]

These equations are called power-sum symmetric functions. Now, we define a polynomial $\sigma(X)$ as

\[ \sigma(X) = (1 + B_1 X)(1 + B_2 X) \cdots (1 + B_f X) = \sigma_0 + \sigma_1 X + \sigma_2 X^2 + \cdots + \sigma_f X^f \]

The roots of $\sigma(X)$ are $B_1^{-1}, B_2^{-1}, \ldots, B_f^{-1}$, which are the inverses of the error location numbers. $\sigma(X)$ is called the error location polynomial. The $\sigma_i$'s are related to the $B_j$'s by the following equations:

\[ \begin{aligned}
\sigma_0 &= 1 \\
\sigma_1 &= B_1 + B_2 + \cdots + B_f \\
\sigma_2 &= B_1 B_2 + B_2 B_3 + \cdots + B_{f-1} B_f \\
&\ \ \vdots \\
\sigma_f &= B_1 B_2 B_3 \cdots B_f
\end{aligned} \qquad (2.6) \]

The $\sigma_i$'s are known as the elementary symmetric functions of the $B_j$'s.
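As a small illustration of equations (2.5) and (2.6), the case of exactly two errors (f = 2) can be written out in full. This worked special case is added here for clarity and is not part of the original derivation; it uses only the error location numbers $B_1$, $B_2$ and the error magnitudes $e_{j_1}$, $e_{j_2}$ defined above.

```latex
\begin{aligned}
\sigma(X) &= (1 + B_1 X)(1 + B_2 X) = 1 + (B_1 + B_2)\,X + B_1 B_2\,X^2,\\
\sigma_0 &= 1, \qquad \sigma_1 = B_1 + B_2, \qquad \sigma_2 = B_1 B_2,\\
S_1 &= e_{j_1} B_1 + e_{j_2} B_2, \qquad S_2 = e_{j_1} B_1^2 + e_{j_2} B_2^2,\\
\sigma(B_1^{-1}) &= (1 + 1)\,(1 + B_2 B_1^{-1}) = 0 \quad \text{since } 1 + 1 = 0 \text{ in } GF(2^m).
\end{aligned}
```

The last line shows why the roots of the error location polynomial are the inverses of the error location numbers.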
From equations (2.5) and (2.6), the $\sigma_i$'s are related to the syndrome components $S_i$'s by the following Newton's identities:

\[ \begin{aligned}
S_1 + \sigma_1 &= 0 \\
S_2 + \sigma_1 S_1 + 2\sigma_2 &= 0 \\
S_3 + \sigma_1 S_2 + \sigma_2 S_1 + 3\sigma_3 &= 0 \\
&\ \ \vdots \\
S_f + \sigma_1 S_{f-1} + \cdots + \sigma_{f-1} S_1 + f\sigma_f &= 0 \\
S_{f+1} + \sigma_1 S_f + \cdots + \sigma_{f-1} S_2 + \sigma_f S_1 &= 0
\end{aligned} \qquad (2.7) \]

The error correcting procedure for Reed-Solomon codes consists of the following four major steps:

1- Compute the syndrome $S = (S_1, S_2, \ldots, S_{2t})$ from the received polynomial r(X) (Equation 2.5),
2- Determine the error-location polynomial $\sigma(X)$ from the syndrome components $S_1, S_2, \ldots, S_{2t}$, i.e. calculate the $\sigma_i$'s from the $S_i$'s (Equation 2.7),
3- Determine the error-location numbers $B_1, B_2, \ldots, B_f$ by finding the roots of $\sigma(X)$ (the error-location numbers are the inverses of the roots of $\sigma(X)$), and
4- Substitute the error location numbers into the error polynomials and solve for the corresponding error values $e_{j_l}$.

Knowledge of the values of $B_l$ and $e_{j_l}$ is sufficient for error correction.

CHAPTER III
SYSTEM SPECIFICATION AND GENERAL DESCRIPTION

3.1 System Specification:

In this chapter, the overall design of the Reed-Solomon code system will be discussed. Although the design procedures can be applied to a Reed-Solomon code of any dimension, the detailed design will be given for a code that has the following parameters:

m = 8
$n = 2^m - 1 = 255$
n - k = 2t = 10

This code is the (255, 245) Reed-Solomon code. It is defined over $GF(2^8)$. Thus, the code block length is 255 symbols, where each symbol is represented by eight binary digits (a byte). Each data block contains 245 information symbols (245 x 8 = 1960 binary bits) and 10 parity check symbols (10 x 8 = 80 binary bits). This code has the capability of correcting up to five burst errors of 8 bits each or any burst error combination of up to a total length of 40 bits, provided they only affect a maximum of five individual symbols.
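The arithmetic behind these figures can be checked with a short sketch. The function name `rs_parameters` and the dictionary keys are illustrative choices, not names used in the thesis; the formulas are the Reed-Solomon parameter relations n = 2^m - 1 and n - k = 2t quoted above.

```python
def rs_parameters(m: int, t: int) -> dict:
    """Derive the block sizes of a t-error-correcting Reed-Solomon code over GF(2^m)."""
    n = 2**m - 1        # block length in symbols
    parity = 2 * t      # number of parity check symbols, n - k = 2t
    k = n - parity      # number of information symbols
    return {
        "n": n,
        "k": k,
        "parity": parity,
        "block_bits": n * m,         # total bits per code block
        "info_bits": k * m,          # information bits per block
        "parity_bits": parity * m,   # parity check bits per block
        "max_burst_bits": t * m,     # t corrupted symbols of m bits each
    }


params = rs_parameters(m=8, t=5)
print(params)
```

For m = 8 and t = 5 this reproduces the (255, 245) code with 2040, 1960, 80 and 40 bits respectively.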
The $GF(2^8)$ elements are generated by the following generator polynomial P(X), which is a primitive polynomial of degree 8:

\[ P(X) = X^8 + X^4 + X^3 + X^2 + 1 \]

Let the primitive element @ be a root of P(X). Then

\[ P(@) = @^8 + @^4 + @^3 + @^2 + 1 = 0 \]

Since @ is a primitive element, it generates all the non-zero elements $@^0, @^1, @^2, \ldots, @^{254}$ of $GF(2^8)$. Any element of the field $GF(2^8)$ can be represented as a polynomial in @; for example, the field element B can, in general, be represented as:

\[ B = @^i = a_0 + a_1 @ + a_2 @^2 + a_3 @^3 + a_4 @^4 + a_5 @^5 + a_6 @^6 + a_7 @^7 \]

where each coefficient $a_i$ is either 0 or 1.

The field element B can also be represented by an ordered sequence of the 8 coefficients of the polynomial representation as follows:

\[ (a_0, a_1, a_2, a_3, a_4, a_5, a_6, a_7) \]

This representation is called the vector representation. The zero element of $GF(2^8)$ is represented by the all-zero 8-tuple. A computer program has been written to generate the field elements of $GF(2^8)$, and the output of this program, which shows the power representation and the corresponding vector representation of each field element, is in Table 3.1.

To add any two field elements, simply add the corresponding components of their vector representations (modulo-2 addition). As an example,

\[ @^{10} + @^{90} = (0,0,1,0,1,1,1,0) + (1,1,1,1,1,0,1,1) = (1,1,0,1,0,1,0,1) \]

To multiply any two field elements, simply add their exponents and use the fact that $@^{255} = 1$. As an example,

\[ @^{10} \cdot @^{90} = @^{100} \quad \text{and} \quad @^{90} \cdot @^{190} = @^{280} = @^{25} \]

Table 3.1 $GF(2^8)$ Elements Generated By $p(X) = 1 + X^2 + X^3 + X^4 + X^8$
(each entry: power i of @, followed by the vector representation $a_0 a_1 \ldots a_7$ of $@^i$)

zero 00000000    63 10000101   127 00110011   191 10000010
   0 10000000    64 11111010   128 10100001   192 01000001
   1 01000000    65 01111101   129 11101000   193 10011000
   2 00100000    66 10000110   130 01110100   194 01001100
   3 00010000    67 01000011   131 00111010   195 00100110
   4 00001000    68 10011001   132 00011101   196 00010011
   5 00000100    69 11110100   133 10110110   197 10110001
   6 00000010    70 01111010   134 01011011   198 11100000
   7 00000001    71 00111101   135 10010101   199 01110000
   8 10111000    72 10100110   136 11110010   200 00111000
   9 01011100    73 01010011   137 01111001   201 00011100
  10 00101110    74 10010001   138 10000100   202 00001110
  11 00010111    75 11110000   139 01000010   203 00000111
  12 10110011    76 01111000   140 00100001   204 10111011
  13 11100001    77 00111100   141 10101000   205 11100101
  14 11001000    78 00011110   142 01010100   206 11001010
  15 01100100    79 00001111   143 00101010   207 01100101
  16 00110010    80 10111111   144 00010101   208 10001010
  17 00011001    81 11100111   145 10110010   209 01000101
  18 10110100    82 11001011   146 01011001   210 10011010
  19 01011010    83 11011101   147 10010100   211 01001101
  20 00101101    84 11010110   148 01001010   212 10011110
  21 10101110    85 01101011   149 00100101   213 01001111
  22 01010111    86 10001101   150 10101010   214 10011111
  23 10010011    87 11111110   151 01010101   215 11110111
  24 11110001    88 01111111   152 10010010   216 11000011
  25 11000000    89 10000111   153 01001001   217 11011001
  26 01100000    90 11111011   154 10011100   218 11010100
  27 00110000    91 11000101   155 01001110   219 01101010
  28 00011000    92 11011010   156 00100111   220 00110101
  29 00001100    93 01101101   157 10101011   221 10100010
  30 00000110    94 10001110   158 11101101   222 01010001
  31 00000011    95 01000111   159 11001110   223 10010000
  32 10111001    96 10011011   160 01100111   224 01001000
  33 11100100    97 11110101   161 10001011   225 00100100
  34 01110010    98 11000010   162 11111101   226 00010010
  35 00111001    99 01100001   163 11000110   227 00001001
  36 10100100   100 10001000   164 01100011   228 10111100
  37 01010010   101 01000100   165 10001001   229 01011110
  38 00101001   102 00100010   166 11111100   230 00101111
  39 10101100   103 00010001   167 01111110   231 10101111
  40 01010110   104 10110000   168 00111111   232 11101111
  41 00101011   105 01011000   169 10100111   233 11001111
  42 10101101   106 00101100   170 11101011   234 11011111
  43 11101110   107 00010110   171 11001101   235 11010111
  44 01110111   108 00001011   172 11011110   236 11010011
  45 10000011   109 10111101   173 01101111   237 11010001
  46 11111001   110 11100110   174 10001111   238 11010000
  47 11000100   111 01110011   175 11111111   239 01101000
  48 01100010   112 10000001   176 11000111   240 00110100
  49 00110001   113 11111000   177 11011011   241 00011010
  50 10100000   114 01111100   178 11010101   242 00001101
  51 01010000   115 00111110   179 11010010   243 10111110
  52 00101000   116 00011111   180 01101001   244 01011111
  53 00010100   117 10110111   181 10001100   245 10010111
  54 00001010   118 11100011   182 01000110   246 11110011
  55 00000101   119 11001001   183 00100011   247 11000001
  56 10111010   120 11011100   184 10101001   248 11011000
  57 01011101   121 01101110   185 11101100   249 01101100
  58 10010110   122 00110111   186 01110110   250 00110110
  59 01001011   123 10100011   187 00111011   251 00011011
  60 10011101   124 11101001   188 10100101   252 10110101
  61 11110110   125 11001100   189 11101010   253 11100010
  62 01111011   126 01100110   190 01110101   254 01110001

The generator polynomial for the (255,245) code is given by:

\[ g(X) = (X + @)(X + @^2)(X + @^3)(X + @^4)(X + @^5)(X + @^6)(X + @^7)(X + @^8)(X + @^9)(X + @^{10}) \]

Using Table 3.1, g(X) can be expanded to:

\[ g(X) = X^{10} + @^{252} X^9 + @^{69} X^8 + @^{49} X^7 + @^{65} X^6 + @^{123} X^5 + @^{76} X^4 + @^{71} X^3 + @^{102} X^2 + @^{41} X + @^{55} \]

Although a Reed-Solomon code with smaller dimensions could have been selected, this code in particular has been chosen because its symbol size matches the very commonly used 8-bit data byte. Although the natural length of this code is 255 symbols (2040 binary bits), it can easily be shortened to any length to match any system specification without any major change in the encoder/decoder hardware circuitry.

3.2 System Description:

In the following two sections the encoder and decoder of the selected Reed-Solomon code are described.

3.2.1 The Encoder:

The encoder accepts data message blocks of 245 symbols (1960 bits) as an input and generates a code word of 255 symbols (2040 bits) as an output. While the input message symbols are transmitted to the encoder output they are also shifted into a linear feedback shift register that deals with elements from $GF(2^8)$. As soon as all the 245 message symbols are shifted out, the contents of the shift register will represent the ten parity check symbols. These ten parity check symbols will then be shifted out, following the 245 information symbols, to form the 255 code word symbols. A block diagram of this encoder is shown in Figure 3.1.

3.2.2 The Decoder:

As discussed in section 2.5, the decoding procedure of the Reed-Solomon codes consists of the following five major steps:
Step 1. Compute the syndrome,
Step 2. Determine the error-location polynomial,
Step 3. Determine the error-location numbers,
Step 4. Compute the magnitudes of the errors, and
Step 5. Using the error-location numbers and the magnitudes of the errors, correct the received vector.

Since each step has a specific function, and the output of each step is the input of the following step, a pipelined structured decoder would be the most efficient decoder in this case. A block diagram of this decoder is shown in Figure 3.2. This decoder consists of five main stages. At any time, each stage will be processing information that belongs to a different data block, i.e. five data blocks will be decoded concurrently, and each one will be at a different decoding step.

[Figure 3.1 Encoder Block Diagram: message U(X) enters the encoder, code word V(X) leaves it]

[Figure 3.2 Reed-Solomon Decoder Block Diagram: r(X) feeds the Queue Buffer and the Syndrome Processor, followed by the Sigma Processor, the Error-Location Processor, the Error-Magnitude Processor and the Error-Correction Processor]

The jth received data block $r_j(X)$ is first shifted into the Syndrome processor as well as the Queue buffer. The Syndrome processor then computes the 10 syndrome components $S_1, S_2, S_3, \ldots, S_{10}$. These components are then loaded into the Sigma processor. While the Sigma processor is determining the error-location polynomial $\sigma(X)$ of this jth received data block, the Syndrome processor will be computing the ten syndrome components of the (j+1)th data block. The Error-Location processor will then find the error location numbers ($X_i$), which are the reciprocals of the roots of $\sigma(X)$. The error magnitudes ($e_i$) are computed by the Error-Magnitude processor. The error location numbers and error magnitudes of the jth data block are stored in the Error-Correction processor while the (j-1)th data block is corrected and shifted out from the decoder as well as the Queue buffer.
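The work of the first pipeline stage, computing the ten syndrome components $S_i = r(@^i)$, can be sketched in software as follows. This is an illustrative model only, not the hardware or 8085 code of the thesis: the names `build_tables`, `gf_mul` and `syndromes` are hypothetical, and each field element is stored as an integer whose bit i holds the coefficient $a_i$ (so the bit order is the reverse of the left-to-right vectors printed in Table 3.1).

```python
PRIM_POLY = 0x11D  # X^8 + X^4 + X^3 + X^2 + 1; bit i is the coefficient of X^i


def build_tables():
    """Return (EXP, LOG): EXP[i] is @^i as an integer, LOG is its inverse map."""
    exp = [0] * 510
    log = [0] * 256
    x = 1
    for i in range(255):
        exp[i] = x
        log[x] = i
        x <<= 1                 # multiply by @
        if x & 0x100:           # reduce modulo P(@) when degree 8 appears
            x ^= PRIM_POLY
    for i in range(255, 510):   # duplicate so LOG[a] + LOG[b] needs no modulo
        exp[i] = exp[i - 255]
    return exp, log


EXP, LOG = build_tables()


def gf_mul(a: int, b: int) -> int:
    """Multiply two field elements by adding their exponents."""
    if a == 0 or b == 0:
        return 0
    return EXP[LOG[a] + LOG[b]]


def syndromes(received, two_t: int = 10):
    """received[i] is the coefficient r_i of X^i; returns [S_1, ..., S_2t]."""
    result = []
    for i in range(1, two_t + 1):
        root = EXP[i]
        acc = 0
        for coeff in reversed(received):      # Horner's rule, r_{n-1} first
            acc = gf_mul(acc, root) ^ coeff   # addition in GF(2^8) is XOR
        result.append(acc)
    return result


block = [0] * 255
block[37] = 0x5A            # a single symbol error in an all-zero code word
print(syndromes(block))
```

An error-free block gives ten zero syndromes, while a single error of magnitude e at position j gives $S_i = e \cdot @^{ij}$, in agreement with equation (2.4).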
The transfer of information between the different processors is supervised by an overall controller. The Queue buffer must be able to store five data blocks. The size of this buffer is ((255 x 8) x 5) = 10200 binary bits.

The rate of data processing in this pipelined decoder is determined by the time delay of the slowest processor, T. Although the total decoding time of each data block is 5T, this is done concurrently with four other data blocks, so the average decoding time of every data block will be T.

CHAPTER IV
ENCODER HARDWARE DESIGN

4.1 Introduction:

Let

\[ U(X) = u_0 + u_1 X + \cdots + u_{k-1} X^{k-1} \]

be the message polynomial. It has been shown in section 2.4 that the code polynomial of a Reed-Solomon code in systematic form is given by:

\[ V(X) = b(X) + X^{2t} U(X) = a(X)\, g(X) \]

where g(X) is the generator polynomial, a(X) is the quotient resulting from dividing $X^{2t} U(X)$ by g(X), and b(X) is the remainder. The code vector is then given by:

\[ (b_0, b_1, \ldots, b_{2t-1}, u_0, u_1, \ldots, u_{k-1}) \]

in which the first 2t elements are the parity check symbols and the remaining k elements are the information symbols.

For the selected Reed-Solomon code (255,245), the generator polynomial g(X) is given by:

\[ g(X) = X^{10} + @^{252} X^9 + @^{69} X^8 + @^{49} X^7 + @^{65} X^6 + @^{123} X^5 + @^{76} X^4 + @^{71} X^3 + @^{102} X^2 + @^{41} X + @^{55} \]

and the code polynomial V(X) of U(X) is:

\[ V(X) = b(X) + X^{10} U(X) = a(X)\, g(X) = b_0 + b_1 X + \cdots + b_9 X^9 + u_0 X^{10} + u_1 X^{11} + \cdots + u_{244} X^{254} \]

Encoding this Reed-Solomon code can be done by using a linear ten-stage shift register as shown in Figure 4.1. The feedback connections of this shift register are based on the coefficients of the generator polynomial g(X). Three symbols are used in Figure 4.1:

- the multiplier symbol denotes a multiplier that multiplies any field element of $GF(2^8)$ by a fixed element (B) from the same field,
- the register symbol denotes a storage device that can store a field element from $GF(2^8)$ (an 8-bit register), and
- the adder symbol denotes an adder that adds two field elements from $GF(2^8)$.

The encoding procedure is as follows:

1- Clear the storage devices $b_i$'s.
2- Enable the feedback connection by enabling the control AND gate, and feed U(X) to the output of the encoder by enabling the second input of the multiplexer.
3- Shift the 245 information symbols of U(X) into the shift register and to the encoder output. At the completion of shifting, the register contains the ten parity check symbols.
4- Disable the feedback connection and enable the first input of the multiplexer.
5- Shift the ten parity check symbols to the encoder output. During this time the encoder input would be disabled and no information can be accepted from the source.

[Figure 4.1 Reed-Solomon Encoder Block Diagram: input message $X^{2t} U(X)$, Clock and Control signals, 8-bit data paths]

This circuit needs 245 clock pulses to shift the information message into the shift register as well as to the encoder output, and 10 clock pulses to shift the ten parity check digits to the output, a total of 255 clock pulses.

4.2 Field Element Multiplier Hardware Implementation:

As shown in Figure 4.1, one of the main elements in the encoder circuit is the field element multiplier. This multiplier can be implemented using one of three different implementations, which are discussed in the following sections.

4.2.1 Two-Level ROM Implementation:

Since each field element can be represented as a power of @, where @ is the primitive element, the product of any two elements, represented in the power representation form, is given by the sum of their powers of @. In a hardware implementation, this can be done by converting the vector form of the multiplicand element into its power representation, then adding the power of the multiplier, and finally converting the result from the power representation into the vector representation.

In this circuit, two ROM levels are used.
The first ROM is used to convert the vector representation of the symbol entering the feedback loop, $B_{fb}$ (the multiplicand), into its power representation. The second level is a set of ROMs which work as look up tables. The output of each of these ROMs represents the result of multiplying the feedback symbol ($B_{fb}$), in the power representation form, by one of the coefficients of g(X) ($g_j = @^{i_j}$). The output is in the vector representation form. In this case, the contents of any second level ROM which multiplies by the coefficient $g_j$ is simply the vector representation of the field elements given in Table 3.1 rotated $i_j$ times (where $i_j$ is the power of the field element $g_j$, $g_j = @^{i_j}$). For example, to multiply by $@^{55}$, the contents of the second level ROM are the vector representations of $0, @^{55}, @^{56}, \ldots, @^{254}, 1, @, @^2, \ldots, @^{54}$. The first element in the look up table is always 0.

The block diagram of this implementation is shown in Figure 4.2. The first ROM is enabled by the feedback control input. When this feedback is disabled, an all-zero vector should be present on the lines of the feedback loop. Since most ROMs have tri-state outputs, the feedback lines should be grounded when the feedback loop is disabled. This is achieved by using tri-state logic buffers which have zero volt inputs, are connected to the feedback loop lines and are enabled when the feedback is disabled.

This circuit can be implemented using eleven (256x8) ROMs. These are available as a single Schottky TTL IC with 50 ns access time (SN74S471). Programming these ROMs for this circuit is simple, but the circuit needs two ROM time delays, i.e. 100 ns, to perform the multiplication. Therefore, this circuit implementation is considered relatively slow.
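The two-level look up described in this section can be sketched in software as follows. This is only an illustration under stated assumptions: the field is taken to be generated by the primitive polynomial $X^8 + X^4 + X^3 + X^2 + 1$ (0x11D), which agrees with the $@^{55}$ product terms of this chapter but is not stated here explicitly, and the address convention (address 0 reserved for the zero element, $@^p$ stored as p + 1) is a hypothetical choice made so that the first entry of the second ROM is 0. The function names are illustrative.

```python
# Sketch of the two-level ROM multiplier (illustrative names only).
PRIM_POLY = 0x11D  # assumption: X^8 + X^4 + X^3 + X^2 + 1


def build_tables():
    """Return (EXP, LOG): EXP[p] is the vector form of @^p, LOG its inverse."""
    exp, log = [0] * 255, [0] * 256
    x = 1
    for p in range(255):
        exp[p] = x
        log[x] = p
        x <<= 1
        if x & 0x100:
            x ^= PRIM_POLY
    return exp, log


EXP, LOG = build_tables()


def first_level_rom():
    """Vector form -> address for the second level ROM.

    Hypothetical convention: the zero element maps to address 0 and
    @^p maps to address p + 1.
    """
    return [0] + [LOG[v] + 1 for v in range(1, 256)]


def second_level_rom(j):
    """Table 0, @^j, @^(j+1), ...: the field table rotated j times."""
    return [0] + [EXP[(a - 1 + j) % 255] for a in range(1, 256)]


def rom_multiply(b, rom1, rom2):
    """Multiply b by the fixed element of rom2 using two look ups."""
    return rom2[rom1[b]]


if __name__ == "__main__":
    rom1, rom2 = first_level_rom(), second_level_rom(55)
    print(hex(rom_multiply(0x01, rom1, rom2)))  # vector form of @^55
```

The same first-level table can be shared by all ten second-level tables, which is why eleven ROMs suffice for ten multipliers.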
[Figure 4.2 Reed-Solomon Encoder Using Two-Level ROM Multiplier Implementation]

4.2.2 One-Level ROM Implementation:

The multiplication of any field element by a fixed field element can be done by using a table look up ROM. In the encoder hardware implementation the multipliers in Figure 4.1 are replaced by ROMs. The contents of each of those ROMs is the vector representation of the result of multiplying the feedback element by one of the coefficients of g(X) ($g_j = @^{i_j}$). This circuit can be implemented using the same type of ROM (SN74S471). The multiplication delay time is one ROM time delay of 50 ns only. Although this circuit is faster than the previous one, the ROM contents are not as easy to generate.

4.2.3 Combinational Circuit Implementation:

Multiplication of a field element by a fixed element from the same field is best explained by an example. Consider multiplying an arbitrary field element B of $GF(2^8)$ given by:

$$B = B_0 + B_1 @ + B_2 @^2 + \cdots + B_6 @^6 + B_7 @^7$$

by the field element $@^{55}$. Using Table 3.1, the product is given by the following logical function:

$$\begin{aligned}
@^{55} B &= B_0 (@^5 + @^7) \\
&+ B_1 (1 + @^2 + @^3 + @^4 + @^6) \\
&+ B_2 (@ + @^3 + @^4 + @^5 + @^7) \\
&+ B_3 (1 + @^3 + @^5 + @^6) \\
&+ B_4 (@ + @^4 + @^6 + @^7) \\
&+ B_5 (1 + @^3 + @^4 + @^5 + @^7) \\
&+ B_6 (1 + @ + @^2 + @^3 + @^5 + @^6) \\
&+ B_7 (@ + @^2 + @^3 + @^4 + @^6 + @^7)
\end{aligned}$$

Collecting the terms that multiply each power of @ gives:

$$\begin{aligned}
@^{55} B &= (B_1 + B_3 + B_5 + B_6) \\
&+ (B_2 + B_4 + B_6 + B_7)\, @ \\
&+ (B_1 + B_6 + B_7)\, @^2 \\
&+ (B_1 + B_2 + B_3 + B_5 + B_6 + B_7)\, @^3 \\
&+ (B_1 + B_2 + B_4 + B_5 + B_7)\, @^4 \\
&+ (B_0 + B_2 + B_3 + B_5 + B_6)\, @^5 \\
&+ (B_1 + B_3 + B_4 + B_6 + B_7)\, @^6 \\
&+ (B_0 + B_2 + B_4 + B_5 + B_7)\, @^7
\end{aligned} \qquad (4.1)$$

[Figure 4.3 Combinational Circuit Multiplier Implementation]

The hardware implementation of this multiplier is shown in Figure 4.3. This multiplier consists of a set of adders. Each of these adders is simply an XOR tree. An example of the XOR tree implementation is shown in Figure 4.4. Since each field symbol is 8 bits long, the maximum number of inputs to an XOR tree is 8, and the maximum number of gate levels in the XOR tree is 4. However, for the selected code, the maximum number of gate levels is 3. These XOR trees can be implemented using Schottky TTL (SN74S86). This gate has a 7 ns time delay. Therefore, the multiplication time delay is 21 ns. This circuit implementation is much faster than the previous two circuits, but the chip count is larger. For example, the $@^{55}$ multiplier requires 19 XORs to implement, i.e. 5 Quad 2-input XOR IC chips. However, the chip count can be reduced if the circuit is implemented using programmable logic arrays (PLAs).

4.3 Encoder Hardware Implementation:

The encoder circuit consists of two main parts, the linear feedback shift register and the field element multipliers. The linear feedback shift register consists of ten 8-bit latch registers.

[Figure 4.4 XOR Tree Implementation]

These latch registers are implemented using the Schottky TTL IC SN74S374, which has a 10 ns time delay. The outputs of each latch register are connected to the inputs of the following latch register through a field element adder, which adds the outputs of the latch register and the field element multiplier as shown in Figure 4.1. These field element adders simply consist of 8 XORs each, which can be implemented using the Schottky TTL Quad 2-input XOR SN74S86, which has a 7 ns time delay.
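The shift register encoder of Figure 4.1 can be modelled in a few lines. The sketch below is an illustration, not the hardware design: it assumes the primitive polynomial $X^8 + X^4 + X^3 + X^2 + 1$ (0x11D) and a generator polynomial with roots $@, @^2, \ldots, @^{10}$, and the names `generator_poly`, `encode` and `poly_eval` are hypothetical. Polynomials are stored low-order coefficient first.

```python
# Sketch of the ten-stage feedback shift register encoder (illustrative).
PRIM_POLY = 0x11D  # assumption: X^8 + X^4 + X^3 + X^2 + 1


def build_tables():
    exp, log = [0] * 255, [0] * 256
    x = 1
    for p in range(255):
        exp[p] = x
        log[x] = p
        x <<= 1
        if x & 0x100:
            x ^= PRIM_POLY
    return exp, log


EXP, LOG = build_tables()


def gf_mul(a, b):
    """Multiply two field elements given in vector form."""
    if a == 0 or b == 0:
        return 0
    return EXP[(LOG[a] + LOG[b]) % 255]


def generator_poly(n_parity=10):
    """g(X) = (X + @)(X + @^2)...(X + @^n_parity), low order first."""
    g = [1]
    for i in range(1, n_parity + 1):
        nxt = [0] * (len(g) + 1)
        for k, c in enumerate(g):
            nxt[k] ^= gf_mul(c, EXP[i])  # c * @^i * X^k
            nxt[k + 1] ^= c              # c * X^(k+1)
        g = nxt
    return g


def encode(message, g):
    """Return (b_0..b_(2t-1), u_0..u_(k-1)) for message (u_0..u_(k-1))."""
    n_parity = len(g) - 1
    b = [0] * n_parity                   # step 1: clear the registers
    for sym in reversed(message):        # high-order symbol enters first
        fb = sym ^ b[-1]                 # symbol entering the feedback loop
        for i in range(n_parity - 1, 0, -1):
            b[i] = b[i - 1] ^ gf_mul(g[i], fb)
        b[0] = gf_mul(g[0], fb)
    return b + list(message)


def poly_eval(poly, x):
    """Evaluate a polynomial (low order first) at x by Horner's rule."""
    acc = 0
    for c in reversed(poly):
        acc = gf_mul(acc, x) ^ c
    return acc


if __name__ == "__main__":
    g = generator_poly()
    print([LOG[c] for c in g])           # powers of @ of g_0 .. g_10
    v = encode(list(range(245)), g)
    print(all(poly_eval(v, EXP[i]) == 0 for i in range(1, 11)))
```

A code vector produced this way is a multiple of g(X), so it evaluates to zero at each of the ten roots, which is the property the syndrome computation in the decoder relies on.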
The output control C1 is connected to eight 2 to 1 multiplexers, to select one of the two inputs. These multiplexers are implemented using the Quad 2 to 1 multiplexer SN74S158. The feedback control C2 is connected to the control AND gates, and is used to enable and disable the feedback loop. The control AND gates are implemented using the Quad 2-input AND SN74S08, which has a 4.75 ns time delay.

The circuit discussed in Section 4.2.3 is selected to implement the 10 multipliers needed for the encoder circuit. To design these multipliers, a product logical function (PLF), similar to the one given by Eq. 4.1, should be generated for each multiplier. These product logical functions have been generated. The following is a list of the coefficients of the generator polynomial g(X) and the final form of the corresponding multiplier product logical function:

1- $g_0 = @^{55}$. The product logical function of the multiplier of this coefficient is given by equation 4.1.

2- $g_1 = @^{41}$, the corresponding PLF is:

$$\begin{aligned}
@^{41} B &= (B_1 + B_2 + B_4 + B_5 + B_6) \\
&+ (B_2 + B_3 + B_5 + B_6 + B_7)\, @ \\
&+ (B_0 + B_1 + B_2 + B_3 + B_5 + B_7)\, @^2 \\
&+ (B_3 + B_5)\, @^3 \\
&+ (B_0 + B_1 + B_2 + B_5)\, @^4 \\
&+ (B_1 + B_2 + B_3 + B_6)\, @^5 \\
&+ (B_0 + B_2 + B_3 + B_4 + B_7)\, @^6 \\
&+ (B_0 + B_1 + B_3 + B_4 + B_5)\, @^7
\end{aligned}$$

3- $g_2 = @^{102}$, the corresponding PLF is:

$$\begin{aligned}
@^{102} B &= (B_2 + B_7) \\
&+ B_3\, @ \\
&+ (B_0 + B_2 + B_4 + B_7)\, @^2 \\
&+ (B_1 + B_2 + B_3 + B_5 + B_7)\, @^3 \\
&+ (B_3 + B_4 + B_6 + B_7)\, @^4 \\
&+ (B_4 + B_5 + B_7)\, @^5 \\
&+ (B_0 + B_5 + B_6)\, @^6 \\
&+ (B_1 + B_6 + B_7)\, @^7
\end{aligned}$$

4- $g_3 = @^{71}$, the corresponding PLF is:

$$\begin{aligned}
@^{71} B &= (B_1 + B_3 + B_4) \\
&+ (B_2 + B_4 + B_5)\, @ \\
&+ (B_0 + B_1 + B_4 + B_5 + B_6)\, @^2 \\
&+ (B_0 + B_2 + B_3 + B_4 + B_5 + B_6 + B_7)\, @^3 \\
&+ (B_0 + B_5 + B_6 + B_7)\, @^4 \\
&+ (B_0 + B_1 + B_6 + B_7)\, @^5 \\
&+ (B_1 + B_2 + B_7)\, @^6 \\
&+ (B_0 + B_2 + B_3)\, @^7
\end{aligned}$$

5- $g_4 = @^{76}$, the corresponding PLF is:

$$\begin{aligned}
@^{76} B &= (B_4 + B_5 + B_6 + B_7) \\
&+ (B_0 + B_5 + B_6 + B_7)\, @ \\
&+ (B_0 + B_1 + B_4 + B_5)\, @^2 \\
&+ (B_0 + B_1 + B_2 + B_4 + B_7)\, @^3 \\
&+ (B_0 + B_1 + B_2 + B_3 + B_4 + B_6 + B_7)\, @^4 \\
&+ (B_1 + B_2 + B_3 + B_4 + B_5 + B_7)\, @^5 \\
&+ (B_2 + B_3 + B_4 + B_5 + B_6)\, @^6 \\
&+ (B_3 + B_4 + B_5 + B_6 + B_7)\, @^7
\end{aligned}$$

6- $g_5 = @^{123}$, the corresponding PLF is:

$$\begin{aligned}
@^{123} B &= (B_0 + B_1 + B_2 + B_5 + B_6) \\
&+ (B_1 + B_2 + B_3 + B_6 + B_7)\, @ \\
&+ (B_0 + B_1 + B_3 + B_4 + B_5 + B_6 + B_7)\, @^2 \\
&+ (B_4 + B_7)\, @^3 \\
&+ (B_1 + B_2 + B_6)\, @^4 \\
&+ (B_2 + B_3 + B_7)\, @^5 \\
&+ (B_0 + B_3 + B_4)\, @^6 \\
&+ (B_0 + B_1 + B_4 + B_5)\, @^7
\end{aligned}$$

7- $g_6 = @^{65}$, the corresponding PLF is:

$$\begin{aligned}
@^{65} B &= (B_1 + B_3 + B_4 + B_7) \\
&+ (B_0 + B_2 + B_4 + B_5)\, @ \\
&+ (B_0 + B_4 + B_5 + B_6 + B_7)\, @^2 \\
&+ (B_0 + B_3 + B_4 + B_5 + B_6)\, @^3 \\
&+ (B_0 + B_3 + B_5 + B_6)\, @^4 \\
&+ (B_0 + B_1 + B_4 + B_6 + B_7)\, @^5 \\
&+ (B_1 + B_2 + B_5 + B_7)\, @^6 \\
&+ (B_0 + B_2 + B_3 + B_6)\, @^7
\end{aligned}$$

8- $g_7 = @^{49}$, the corresponding PLF is:

$$\begin{aligned}
@^{49} B &= (B_1 + B_7) \\
&+ B_2\, @ \\
&+ (B_0 + B_1 + B_3 + B_7)\, @^2 \\
&+ (B_0 + B_2 + B_4 + B_7)\, @^3 \\
&+ (B_3 + B_5 + B_7)\, @^4 \\
&+ (B_4 + B_6)\, @^5 \\
&+ (B_5 + B_7)\, @^6 \\
&+ (B_0 + B_6)\, @^7
\end{aligned}$$

9- $g_8 = @^{69}$, the corresponding PLF is:

$$\begin{aligned}
@^{69} B &= (B_0 + B_3 + B_5 + B_6) \\
&+ (B_0 + B_1 + B_4 + B_6 + B_7)\, @ \\
&+ (B_0 + B_1 + B_2 + B_3 + B_6 + B_7)\, @^2 \\
&+ (B_0 + B_1 + B_2 + B_4 + B_5 + B_6 + B_7)\, @^3 \\
&+ (B_1 + B_2 + B_7)\, @^4 \\
&+ (B_0 + B_2 + B_3)\, @^5 \\
&+ (B_1 + B_3 + B_4)\, @^6 \\
&+ (B_2 + B_4 + B_5)\, @^7
\end{aligned}$$

10- $g_9 = @^{252}$, the corresponding PLF is:

$$\begin{aligned}
@^{252} B &= (B_0 + B_1 + B_3) \\
&+ (B_1 + B_2 + B_4)\, @ \\
&+ (B_0 + B_1 + B_2 + B_5)\, @^2 \\
&+ (B_0 + B_2 + B_6)\, @^3 \\
&+ B_7\, @^4 \\
&+ B_0\, @^5 \\
&+ B_1\, @^6 \\
&+ (B_0 + B_2)\, @^7
\end{aligned}$$

Each of these multipliers is implemented using a circuit similar to the one shown in Figure 4.3. These multipliers need 168 XORs, which are implemented using 42 Quad 2-input XOR SN74S86s.

Table 4.1 shows all the encoder elements, the IC type numbers used to implement each element, the time delay of each element, and the number of IC chips used to implement the circuit. As shown, the total number of IC chips required for the encoder is 76.

Element Type      IC Type Number   Time Delay (ns)   Number of Elements   Number of IC Chips
AND               74S08            4.75              8                    2
Multiplexer       74S158           5                 8                    2
XOR for Mul.      74S86            7x3 = 21          168                  42
XOR for Adder     74S86            7                 80                   20
Latch Register    74S374           10                10                   10

Table 4.1 Time Delay and IC Chip Count for the Encoder Circuit

The maximum frequency that can be used depends on the total time delay of the longest signal path, $T_{total}$, which is given by:

$$T_{total} = T(\text{latch}) + T(\text{adder}) + T(\text{AND}) + T(\text{Mul.}) + T(\text{adder}) = 10 + 7 + 4.75 + 21 + 7 = 49.75 \text{ ns}$$

This is the time required to process one byte of the information message. The maximum circuit frequency is therefore 20 MHz. If the input message bits are fed in serially, a serial to parallel shift register is needed. In this case, the maximum input frequency is 20x8 = 160 MHz.

CHAPTER V: SYNDROME PROCESSOR

5.1 Syndrome Processor Description:

The syndrome processor is the first stage in the decoder pipelined structure. It is responsible for computing the syndrome from the received vector r(X). In general, for a t-error-correcting Reed-Solomon code, the syndrome has 2t components. These components are obtained by substituting $@^i$ into the received vector polynomial r(X) for $1 \le i \le 2t$. For the (255,245) Reed-Solomon code, the syndrome S has ten components, i.e.

$$S = (S_1, S_2, \ldots, S_{10})$$

where

$$S_i = r(@^i), \qquad 1 \le i \le 10$$

Let

$$r(X) = r_0 + r_1 X + \cdots + r_{254} X^{254}$$

be the received vector, then

$$S_i = r_0 + r_1 @^{i} + \cdots + r_{254} @^{254 i}$$

which can be rearranged to take the following form:

$$S_i = ((\cdots (r_{254} @^i + r_{253})\, @^i + \cdots + r_1)\, @^i + r_0$$

The computation of a syndrome component can be done using the circuit shown in Figure 5.1. In the procedure of calculating a syndrome component, the register $b_i$ is initially cleared. The received vector $(r_0, r_1, \ldots, r_{254})$ is then shifted into the circuit one symbol at a time.
After the first shift, the register $b_i$ contains the vector representation of $r_{254}$ and the multiplier output represents the vector representation of $r_{254} @^i$. After the second shift, the register $b_i$ contains the vector representation of $r_{254} @^i + r_{253}$ and the multiplier output represents $(r_{254} @^i + r_{253})\, @^i$. After 255 shifts, the register contains $r(@^i)$ in vector representation form, which is $S_i$, the ith component of the syndrome.

5.2 Syndrome Processor Hardware Design:

Since the syndrome consists of ten components, this processor consists of ten similar circuits, which are called Syndrome Units, as shown in Figure 5.2. Each of these Syndrome Units is responsible for computing one of the syndrome components $S_i$. As shown in Figure 5.1, each of these Syndrome Units consists of a $b_i$ register, a field element adder and a multiplier. The $b_i$ register is simply implemented using an 8-bit latch register. The field element adder is implemented by a set of XOR gates. In this processor, the field element multipliers can also be implemented using one of the three methods of implementation which are discussed in detail in Chapter 4. Depending on how the multipliers are implemented, the Syndrome Units can be implemented using one of the following three methods.

[Figure 5.1 Reed-Solomon Codes Syndrome Computation Circuit: (a) Over GF(2^8), (b) In Binary Form]

[Figure 5.2 Syndrome Processor Block Diagram]

[Figure 5.3 Syndrome Unit Implementation Using Two-Level ROM Multiplier]

5.2.1 Using Two-Level ROM Multiplier Implementation:

The circuit diagram of a Syndrome Unit implemented using this method is given in Figure 5.3. The first ROM takes the field element stored in register $b_i$ in its vector representation form and generates its power representation form. The second ROM is the multiplier, which multiplies a field element in its power form by $@^i$ (the multiplier element). Each of these ROMs is 256x8. This unit can be implemented using the Schottky TTL ICs SN74S471, a 256x8 PROM with an access time of 50 ns, SN74S374, an octal D-type latch with a 10 ns time delay, and SN74S86, a Quad 2-input XOR with a time delay of 7 ns. The total time delay, $T_{total1}$, of a Syndrome Unit implemented this way is:

$$T_{total1} = 100 + 7 + 10 = 117 \text{ ns}$$

Each Syndrome Unit is built of two ROMs, eight XORs and one register, which makes a total chip count of 5.

5.2.2 Using One-Level ROM Multiplier Implementation:

The circuit diagram of a Syndrome Unit implemented using this method is shown in Figure 5.1b, in which the multiplier is implemented using a ROM. The ROM is used as a look up table to perform the multiplication. This ROM is also 256x8. In this method, the rest of the unit is implemented using exactly the same parts used in the first method. The total time delay, $T_{total2}$, is:

$$T_{total2} = 50 + 7 + 10 = 67 \text{ ns}$$

For this method of implementation, the total chip count is 4. This method is faster than the first, but the contents of the multiplier ROMs are not as easy to generate.

5.2.3 Using Combinational Circuit Multiplier Implementation:

This method of implementation has the same block diagram as shown in Figure 5.1b, but in this case the multiplier is implemented using a combinational logic circuit.
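A product logical function for multiplication by a fixed element $@^k$ can be derived mechanically, since bit $B_i$ of the input contributes the vector form of $@^{k+i}$ to the product. The sketch below illustrates this under the assumption that the field is generated by $X^8 + X^4 + X^3 + X^2 + 1$ (0x11D); the names `plf_rows` and `xor_multiply` are hypothetical.

```python
# Sketch: derive the XOR equations of a fixed-element multiplier.
PRIM_POLY = 0x11D  # assumption: X^8 + X^4 + X^3 + X^2 + 1


def build_tables():
    exp, log = [0] * 255, [0] * 256
    x = 1
    for p in range(255):
        exp[p] = x
        log[x] = p
        x <<= 1
        if x & 0x100:
            x ^= PRIM_POLY
    return exp, log


EXP, LOG = build_tables()


def gf_mul(a, b):
    """Reference multiplication through the power representation."""
    if a == 0 or b == 0:
        return 0
    return EXP[(LOG[a] + LOG[b]) % 255]


def plf_rows(k):
    """rows[j] lists the indices i of the bits B_i XORed into @^j."""
    cols = [EXP[(k + i) % 255] for i in range(8)]  # B_i contributes @^(k+i)
    return [[i for i in range(8) if (cols[i] >> j) & 1] for j in range(8)]


def xor_multiply(b, rows):
    """Evaluate the product using only the XOR equations in rows."""
    out = 0
    for j, terms in enumerate(rows):
        bit = 0
        for i in terms:
            bit ^= (b >> i) & 1
        out |= bit << j
    return out


if __name__ == "__main__":
    for j, terms in enumerate(plf_rows(1)):
        print("@^%d:" % j, " + ".join("B%d" % i for i in terms))
```

The number of inputs of the widest row fixes the depth of the XOR tree, so the same routine can be used to check the number of gate levels a multiplier needs.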
Assuming that B is a field element that has the following general form:

$$B = B_0 + B_1 X + \cdots + B_6 X^6 + B_7 X^7$$

the product logical functions, PLFs, of the ten multipliers which are used to multiply by $@^i$, where $1 \le i \le 10$, have been generated and their final forms are listed below:

1- The 1st Syndrome Unit PLF

$$\begin{aligned}
@ B &= B_7 \\
&+ B_0\, @ \\
&+ (B_1 + B_7)\, @^2 \\
&+ (B_2 + B_7)\, @^3 \\
&+ (B_3 + B_7)\, @^4 \\
&+ B_4\, @^5 \\
&+ B_5\, @^6 \\
&+ B_6\, @^7
\end{aligned}$$

2- The 2nd Syndrome Unit PLF

$$\begin{aligned}
@^2 B &= B_6 \\
&+ B_7\, @ \\
&+ (B_0 + B_6)\, @^2 \\
&+ (B_1 + B_6 + B_7)\, @^3 \\
&+ (B_2 + B_6 + B_7)\, @^4 \\
&+ (B_3 + B_7)\, @^5 \\
&+ B_4\, @^6 \\
&+ B_5\, @^7
\end{aligned}$$

3- The 3rd Syndrome Unit PLF

$$\begin{aligned}
@^3 B &= B_5 \\
&+ B_6\, @ \\
&+ (B_5 + B_7)\, @^2 \\
&+ (B_0 + B_5 + B_6)\, @^3 \\
&+ (B_1 + B_5 + B_6 + B_7)\, @^4 \\
&+ (B_2 + B_6 + B_7)\, @^5 \\
&+ (B_3 + B_7)\, @^6 \\
&+ B_4\, @^7
\end{aligned}$$

4- The 4th Syndrome Unit PLF

$$\begin{aligned}
@^4 B &= B_4 \\
&+ B_5\, @ \\
&+ (B_4 + B_6)\, @^2 \\
&+ (B_4 + B_5 + B_7)\, @^3 \\
&+ (B_0 + B_4 + B_5 + B_6)\, @^4 \\
&+ (B_1 + B_5 + B_6 + B_7)\, @^5 \\
&+ (B_2 + B_6 + B_7)\, @^6 \\
&+ (B_3 + B_7)\, @^7
\end{aligned}$$

5- The 5th Syndrome Unit PLF

$$\begin{aligned}
@^5 B &= (B_3 + B_7) \\
&+ B_4\, @ \\
&+ (B_3 + B_5 + B_7)\, @^2 \\
&+ (B_3 + B_4 + B_6 + B_7)\, @^3 \\
&+ (B_3 + B_4 + B_5)\, @^4 \\
&+ (B_0 + B_4 + B_5 + B_6)\, @^5 \\
&+ (B_1 + B_5 + B_6 + B_7)\, @^6 \\
&+ (B_2 + B_6 + B_7)\, @^7
\end{aligned}$$

6- The 6th Syndrome Unit PLF

$$\begin{aligned}
@^6 B &= (B_2 + B_6 + B_7) \\
&+ (B_3 + B_7)\, @ \\
&+ (B_2 + B_4 + B_6 + B_7)\, @^2 \\
&+ (B_2 + B_3 + B_5 + B_6)\, @^3 \\
&+ (B_2 + B_3 + B_4)\, @^4 \\
&+ (B_3 + B_4 + B_5)\, @^5 \\
&+ (B_0 + B_4 + B_5 + B_6)\, @^6 \\
&+ (B_1 + B_5 + B_6 + B_7)\, @^7
\end{aligned}$$

7- The 7th Syndrome Unit PLF

$$\begin{aligned}
@^7 B &= (B_1 + B_5 + B_6 + B_7) \\
&+ (B_2 + B_6 + B_7)\, @ \\
&+ (B_1 + B_3 + B_5 + B_6)\, @^2 \\
&+ (B_1 + B_2 + B_4 + B_5)\, @^3 \\
&+ (B_1 + B_2 + B_3 + B_7)\, @^4 \\
&+ (B_2 + B_3 + B_4)\, @^5 \\
&+ (B_3 + B_4 + B_5)\, @^6 \\
&+ (B_0 + B_4 + B_5 + B_6)\, @^7
\end{aligned}$$

8- The 8th Syndrome Unit PLF

$$\begin{aligned}
@^8 B &= (B_0 + B_4 + B_5 + B_6) \\
&+ (B_1 + B_5 + B_6 + B_7)\, @ \\
&+ (B_0 + B_2 + B_4 + B_5 + B_7)\, @^2 \\
&+ (B_0 + B_1 + B_3 + B_4)\, @^3 \\
&+ (B_0 + B_1 + B_2 + B_6)\, @^4 \\
&+ (B_1 + B_2 + B_3 + B_7)\, @^5 \\
&+ (B_2 + B_3 + B_4)\, @^6 \\
&+ (B_3 + B_4 + B_5)\, @^7
\end{aligned}$$

9- The 9th Syndrome Unit PLF

$$\begin{aligned}
@^9 B &= (B_3 + B_4 + B_5) \\
&+ (B_0 + B_4 + B_5 + B_6)\, @ \\
&+ (B_1 + B_3 + B_4 + B_6 + B_7)\, @^2 \\
&+ (B_0 + B_2 + B_3 + B_7)\, @^3 \\
&+ (B_0 + B_1 + B_5)\, @^4 \\
&+ (B_0 + B_1 + B_2 + B_6)\, @^5 \\
&+ (B_1 + B_2 + B_3 + B_7)\, @^6 \\
&+ (B_2 + B_3 + B_4)\, @^7
\end{aligned}$$

10- The 10th Syndrome Unit PLF

$$\begin{aligned}
@^{10} B &= (B_2 + B_3 + B_4) \\
&+ (B_3 + B_4 + B_5)\, @ \\
&+ (B_0 + B_2 + B_3 + B_5 + B_6)\, @^2 \\
&+ (B_1 + B_2 + B_6 + B_7)\, @^3 \\
&+ (B_0 + B_4 + B_7)\, @^4 \\
&+ (B_0 + B_1 + B_5)\, @^5 \\
&+ (B_0 + B_1 + B_2 + B_6)\, @^6 \\
&+ (B_1 + B_2 + B_3 + B_7)\, @^7
\end{aligned}$$

The first Syndrome Unit multiplier is shown in Figure 5.4. Each of the other units' multipliers is implemented using a similar circuit. These ten multipliers are implemented using 31 Quad 2-input XORs. From the above product logical functions, it is obvious that the multipliers have a maximum of three levels of XORs. Therefore, the total time delay, $T_{total3}$, of this implementation method is:

$$T_{total3} = 21 + 7 + 10 = 38 \text{ ns}$$

The total chip count for this method of implementation is 61. If programmable logic arrays are used to implement the XORs, the total chip count can be dramatically decreased.

Comparing the above three methods of implementation, we notice that the third method, using combinational circuit multipliers, is the fastest. Therefore, it has been chosen to implement the Syndrome Units of the Syndrome Processor.

[Figure 5.4 Syndrome Unit Implementation Using Combinational Logic Circuit Multiplier]

CHAPTER VI: SIGMA PROCESSOR

6.1 Introduction:

The Sigma Processor is the second processor in the decoder pipelined structure. Its function is to determine the error-location polynomial $\sigma(X)$ from the syndrome components $S_1, S_2, \ldots, S_{10}$. The inputs to this processor are the ten syndrome components calculated by the preceding pipeline stage, the Syndrome Processor. It generates, as an input to the following pipeline stage, the Error Location Processor, the number of errors, $L_u$, and the coefficients of the error-location polynomial. The first five syndrome components are also transmitted to the Error Location processor unchanged, as shown in Figure 6.1.

There are several algorithms to determine the error location polynomial.
A highly efficient algorithm is Berlekamp's iterative algorithm [13].

6.2 The Iterative Algorithm for Finding the Error Location Polynomial:

This algorithm is used to determine the error-location polynomial $\sigma(X)$ from the syndrome components $S_1, S_2, \ldots, S_{2t}$. Let

$$\sigma(X) = \sigma_0 + \sigma_1 X + \cdots + \sigma_f X^f$$

then the coefficients of $\sigma(X)$ are related to the syndrome components $S_i$'s by the Newton's identities given by equations (2.7) in Chapter 2. The iterative algorithm solves these sets of equations to determine the polynomial $\sigma(X)$ of minimum degree. This $\sigma(X)$ produces an error pattern which has the minimum number of errors.

[Figure 6.1 Sigma Processor Inputs and Outputs]

The first step in the iteration is to find a minimum degree polynomial $\sigma^{(1)}(X)$ whose coefficients satisfy the first Newton's identity. The second step is to check if the coefficients of $\sigma^{(1)}(X)$ satisfy the second Newton's identity. If they do, then

$$\sigma^{(2)}(X) = \sigma^{(1)}(X)$$

If they do not satisfy the second Newton's identity, then a correction term is added to $\sigma^{(1)}(X)$ such that $\sigma^{(2)}(X)$ has the minimum degree and its coefficients satisfy the first two Newton's identities. This iteration continues until $\sigma^{(2t)}(X)$ is obtained. Then $\sigma^{(2t)}(X)$ is the error-location polynomial $\sigma(X)$, i.e. $\sigma(X) = \sigma^{(2t)}(X)$. Let

$$\sigma^{(u)}(X) = 1 + \sigma_1^{(u)} X + \sigma_2^{(u)} X^2 + \cdots + \sigma_{L_u}^{(u)} X^{L_u}$$

be the minimum degree polynomial determined at the end of the uth step of iteration and $L_u$ be the degree of $\sigma^{(u)}(X)$.
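The iteration outlined in this section can be sketched in software as follows. This is a minimal illustration and not the 8085 program of this project: it assumes the primitive polynomial $X^8 + X^4 + X^3 + X^2 + 1$ (0x11D), uses the usual discrepancy and correction rules of the iterative algorithm, and the names `discrepancy` and `error_locator` are hypothetical. Polynomials are stored low-order coefficient first and `S[m]` holds $S_{m+1}$.

```python
# Sketch of the iterative algorithm for the error-location polynomial.
PRIM_POLY = 0x11D  # assumption: X^8 + X^4 + X^3 + X^2 + 1


def build_tables():
    exp, log = [0] * 255, [0] * 256
    x = 1
    for p in range(255):
        exp[p] = x
        log[x] = p
        x <<= 1
        if x & 0x100:
            x ^= PRIM_POLY
    return exp, log


EXP, LOG = build_tables()


def gf_mul(a, b):
    if a == 0 or b == 0:
        return 0
    return EXP[(LOG[a] + LOG[b]) % 255]


def gf_inv(a):
    return EXP[(255 - LOG[a]) % 255]


def poly_eval(poly, x):
    acc = 0
    for c in reversed(poly):
        acc = gf_mul(acc, x) ^ c
    return acc


def discrepancy(sigma, S, u):
    """d_u = S_(u+1) + sigma_1 S_u + sigma_2 S_(u-1) + ..."""
    d = S[u]
    for i in range(1, len(sigma)):
        if u - i >= 0:
            d ^= gf_mul(sigma[i], S[u - i])
    return d


def error_locator(S):
    """Return (sigma, L) after 2t = len(S) iteration steps."""
    sigma, L = [1], 0
    # Reference row p: the earlier step with d_p != 0 and largest p - L_p.
    ref_sigma, ref_d, ref_L, ref_u = [1], 1, 0, -1
    for u in range(len(S)):
        d = discrepancy(sigma, S, u)
        if d == 0:
            continue                      # sigma^(u+1) = sigma^(u)
        scale = gf_mul(d, gf_inv(ref_d))  # d_u * d_p^-1
        shift = u - ref_u                 # exponent of X^(u-p)
        corr = [0] * shift + [gf_mul(scale, c) for c in ref_sigma]
        n = max(len(sigma), len(corr))
        new_sigma = [
            (sigma[i] if i < len(sigma) else 0) ^ (corr[i] if i < len(corr) else 0)
            for i in range(n)
        ]
        new_L = max(L, ref_L + shift)
        if u - L > ref_u - ref_L:         # current row becomes the reference
            ref_sigma, ref_d, ref_L, ref_u = sigma, d, L, u
        sigma, L = new_sigma, new_L
    while len(sigma) > 1 and sigma[-1] == 0:
        sigma.pop()
    return sigma, L


if __name__ == "__main__":
    # Two assumed errors: magnitude 0x21 at position 7, 0x5A at position 42.
    errors = {7: 0x21, 42: 0x5A}
    S = [0] * 10
    for i in range(1, 11):
        for pos, mag in errors.items():
            S[i - 1] ^= gf_mul(mag, EXP[(i * pos) % 255])
    sigma, L = error_locator(S)
    print(L, [poly_eval(sigma, EXP[(255 - p) % 255]) for p in errors])
```

In this sketch the reference row is replaced only when the current row has a strictly larger value of the step number minus the degree, which is one way of keeping the row with the largest such value.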
The steps to find $\sigma^{(u+1)}(X)$ are as follows:

1- Compute the uth discrepancy as:

$$d_u = S_{u+1} + \sigma_1^{(u)} S_u + \sigma_2^{(u)} S_{u-1} + \cdots + \sigma_{L_u}^{(u)} S_{u+1-L_u}$$

2- If $d_u = 0$ then set

$$\sigma^{(u+1)}(X) = \sigma^{(u)}(X)$$

3- If $d_u \neq 0$ then find another iteration p prior to the uth step that has $d_p \neq 0$ and for which $p - L_p$ has the largest value, then:

$$\sigma^{(u+1)}(X) = \sigma^{(u)}(X) + d_u d_p^{-1} X^{(u-p)} \sigma^{(p)}(X)$$

and

$$L_{u+1} = \max(L_u, L_p + u - p)$$

4- Repeat steps 1-3 until u = 2t, then

$$\sigma(X) = \sigma^{(2t)}(X)$$

If $L_{2t} > t$ then there are more than t errors and generally it is not possible to locate them [10].

6.3 The Sigma Processor Design:

From the iteration algorithm discussed in the previous section, it is clear that the process of determining $\sigma(X)$ from the syndrome components is a set of arithmetic and logical operations. Therefore, the Sigma Processor is best implemented by either a microcomputer or a special purpose computer designed to handle Galois Field element computations. In this project, an Intel 8085 microprocessor based system is used to implement this processor.

6.3.1 Microprocessor Based System Structure:

The basic components of a microprocessor based system are:

1. The microprocessor,
2. Read-Only Memory for storage of system programs,
3. Random Access Memory for storage of data and,
4. Input-Output Interface.

A typical microprocessor-based system block diagram is shown in Figure 6.2.

The microprocessor executes all the instructions and performs arithmetic and logical operations on data. It also controls the communications between all system blocks.

The Read Only Memory (ROM) is used to store the operating program. It does not have a write capability. This implies that the binary information stored in the ROM is made permanent during the hardware production of the unit and cannot be altered.

The Random Access Memory (RAM) is used to store programs and data which are temporary and might change during the execution of a program. It allows reading and writing of data.
The RAM has two control signals which specify a read or write operation.

The peripheral interface devices transfer data between the microprocessor and the external devices. This transfer involves data, status and control signals.

All these units communicate through a bus structured organization. There are three buses: an address bus, a data bus and a control bus. The address bus is usually a 16-bit unidirectional bus used to address a particular memory word stored in the ROM or RAM. This 16-bit address bus can address up to 65k words. The data bus is usually an 8-bit, bidirectional bus used to transfer data between the microprocessor and all the other devices. The control bus provides control and timing signals to all the devices of the system.

[Figure 6.2 Microprocessor Based System Block Diagram]

6.3.2 The Sigma Processor Hardware Design:

The Sigma processor has been designed around the Intel 8085 8-bit microprocessor. The basic block diagram of this system is similar to Figure 6.2. A detailed circuit diagram of the designed Sigma Processor is shown in Figure 6.3. This system consists of the following three main parts:

1- The Intel 8085 microprocessor,
2- EPROM and RAM memory units and
3- Input/Output ports.

6.3.2.1 The Intel 8085 microprocessor:

The Intel 8085A is a complete 8-bit parallel CPU. The 8085A microprocessor contains the functions of clock generation, system bus control and interrupt priority selection, in addition to execution of the instruction set. It transfers data on an 8-bit, bidirectional 3-state bus which is time-multiplexed so as to also transmit the 8 lower order address bits. An additional 8 lines expand the MCS-85 system memory addressing capability to 16 bits, thereby allowing 64k bytes of memory to be accessed directly by the CPU.
The 8085 has six 8-bit general purpose registers, B, C; D, E; and H, L, which can be used either as single registers (8-bit) or as register pairs (16-bit). There are also four registers which can function only as two 16-bit registers, the program counter PC and the stack pointer SP.

[Figure 6.3 Sigma Processor Circuit Diagram]

The 8085 CPU generates control signals that can be used to select appropriate external devices and functions, to perform Read and Write operations and to select memory or I/O locations.

The 8085 operates with a single 5 volt power supply and a maximum single-phase clock frequency of 3 MHz. It provides the RD, WE, S0, S1 and IO/M signals for bus control. It also provides five interrupt inputs: INTR, RST 5.5, RST 6.5, RST 7.5 and TRAP. It also provides Serial Input Data (SID) and Serial Output Data (SOD) lines for serial interface. A block diagram of the 8085 CPU is shown in Figure 6.4.

In general, the Intel 8085 microprocessor can have its own clock pulse, which is generated internally with the clock pulse frequency determined by an external crystal connected to the input pins X1 and X2.
In this system, the Sigma Processor is part of the decoder system; therefore, in order to be synchronized with the rest of the system, the microprocessor clock pulse is generated by the overall controller and fed externally to the microprocessor through input X1.

The 8085 microprocessor uses a multiplexed address/data bus that contains the lower 8-bit address information during the first part of the machine cycle. The same bus contains data at a later time in the cycle. An address latch enable (ALE) signal is provided by the 8085 to be used by a latch to hold the address so that it is available through the whole machine cycle. The 74LS373 chip is used as a latch for the 8085 lower 8 address bits, as shown in Figure 6.5.

[Figure 6.4 Intel 8085A CPU Functional Block Diagram]

[Figure 6.5 Intel 8085A Microprocessor and Address Latch]

Due to the current driving limitations of the 8085 data outputs, the AM2947 non-inverting buffer chip is used to buffer these outputs. It is a tri-state bidirectional buffer. The high order address bits A8, A9, A10, A11, A15 and the control lines WE, RD and IO/M are also buffered using a 74LS244.
The pin connections of the 74LS244 and AM2947 buffers are shown in Figure 6.6.

The READY input is tied high since all I/O and memory are considered sufficiently fast. The RESET input is connected to the overall controller. All interrupt inputs are grounded since they are not needed. The 8085 is a 40 pin single chip. The pin out diagram is shown in Figure 6.7.

6.3.2.2 Memory Unit:

The memory unit consists of a program memory chip and data memory chips. The Intel 2716 (2Kx8) EPROM chip is used as the program memory and two of the Intel 2114 (1Kx4) RAM chips are used as the data memory.

1. Program Memory (Intel 2716 EPROM):

The Intel 2716 is a 2K byte ultraviolet erasable and electrically programmable read-only memory (EPROM). It operates from a single 5-volt power supply. Since it is a 2K memory, an 11-bit address is needed to address the memory. The eleven address lines A0-A10 from the 8085 microprocessor are connected to the inputs of the EPROM. The 8 data outputs of the EPROM are connected directly to the data bus of the microprocessor. It has a chip enable/program (CE/PGM) input and an output enable (OE) input. The chip enable/program input is used to enable the EPROM whenever it is selected by the microprocessor, as shown in Figure 6.8. The output enable (OE) input is connected to the read control signal RD of the microprocessor.

2. Data Memory (Intel 2114 RAM):
Figure 6.6 Buffers Pin Connections (the AM2947 and the 74LS244 pin connections)

Figure 6.7 Intel 8085A Pin Out Connections

Figure 6.8 Program Memory Unit

This RAM is a 1K x 4-bit memory. Two of them are connected together to give a 1K x 8-bit memory. Ten address lines are needed to address this memory. Address lines A0-A9 are connected to the address inputs of the RAM. It has a write enable (WE) input and a chip select (CS) input. The RAM pin connections are shown in Figure 6.9.

3. Memory Address Space Mapping:
Address lines A0-A10 are used to address the memory locations in either memory. Address lines A11 and A15 are used to select between the RAM, EPROM, and I/O ports as shown in the following table:

IO/M    A15    A11    Selected Device
 0       0      0     EPROM
 0       0      1     RAM
 0       1      X     I/O ports

The memory is mapped as follows:

Address        Device          Memory Space
0000-07FF      EPROM           2K bytes
0800-0BFF      RAM             1K bytes
8000-8200      Input Ports
8000-8020      Output Ports

Figure 6.9 The RAM Unit

6.3.2.3 Input/Output Ports:
Ten input ports are used to interface the Syndrome processor to the Sigma processor and six output ports are used to interface the microprocessor to the Error-Location processor. The input/output ports are implemented using 16 SN74S373s. Any of the 10 input ports is selected by RD, A15 and one of the address lines A0 to A9, as shown in Figure 6.10. The output of each of these input ports is connected to the data bus through a set of tri-state buffers which are enabled when the port is selected. Figure 6.11 shows an input port connection. Each of the ten syndrome components $S_i$ is received from the Syndrome processor through one of these input ports.

Each of the six output ports is selected by WE, A15 and one of the address lines A0-A5, as shown in Figure 6.12. When any of these output ports is selected, the data byte which is present on the data bus is latched into it. Each of the sigma polynomial coefficients $\sigma_i$, or $L_u$, is sent to the Error-Location processor through one of the output ports, where $L_u$ represents the number of detected errors.

Figure 6.10 Input Ports Configuration

Figure 6.12 Output Ports Configuration

6.3.3 The Sigma Processor Software System Design:
The sigma polynomial coefficients are calculated from the syndrome components by the microprocessor based system under the control of a software control program. This control program consists of three routines: a main program, the Sigma routine and the Discrepancy ($d_u$) routine. The data structure and these three routines are discussed in the following sections.

6.3.3.1 Data Structure Description:
The data structure consists mainly of three arrays, 5-location each. During the program execution the first array contains the coefficients of the sigma polynomial determined in the $p$th step. This array occupies the memory locations 0800-0806H. Associated with this array, there are three other memory locations which represent the following parameters:

$p$: the iteration step number for the $\sigma^{(p)}(X)$ polynomial, which occupies the location 0807H.
$d_p$: the discrepancy in the $p$th iteration step, which occupies the location 0808H.
$h_p$: the difference between the iteration step number $p$ and the power of $\sigma^{(p)}(X)$, which occupies the location 0809H.

The second array contains the coefficients of the sigma polynomial determined in the $u$th step. This array occupies the memory locations 080A-080FH. Associated with this array, there are four memory locations which represent the following parameters:

$L_u$: the degree of $\sigma^{(u)}(X)$, which occupies the memory location 0810H.
$u$: the iteration step number of the current polynomial $\sigma^{(u)}(X)$, which occupies the memory location 0811H.
$d_u$: the $u$th discrepancy, which occupies the memory location 0812H.
$h_u$: the difference between the iteration step number $u$ and the power of $\sigma^{(u)}(X)$, which occupies the memory location 0813H.

The third array is a temporary storage for the coefficients of the $\sigma^{(u+1)}(X)$ polynomial while they are being calculated. The block diagram of the data structure is given in Figure 6.13.

6.3.3.2 P(.) and V(.) Transforms:
To simplify the software description, the following two transforms P(.) and V(.) are introduced.

1. P(.) Transform Definition:
For a field element B given by its vector representation $v_i$, the transform $P(v_i)$ gives the power representation $p_i$ of B, i.e. P(.) transforms a vector representation of any field element into its power representation. Therefore, if $v_i$ and $p_i$ are the vector and power representations in binary form of a field element, then:

$$P(v_i) = p_i$$

The P(.) transforms of all the field elements of $GF(2^8)$ are given in Table 6.1.

2. V(.) Transform Definition:
For a field element B given by its power representation $p_i$, the transform $V(p_i)$ gives the vector representation $v_i$ of B, i.e.

$$V(p_i) = v_i$$

The V(.) transforms of all the field elements of $GF(2^8)$ are given in Table 6.2.

Table 6.1 P-Transform of GF(2^8) Elements

v_i         P(v_i)          v_i         P(v_i)
00000001    00000000        10000000    00000111
00000010    00000001        10000001    01110000
00000011    00011001        10000010    11000000
  ...         ...             ...         ...
01111101    11110011        11111101    01010000
01111110    10100111        11111110    01011000
01111111    01010111        11111111    10101111

6.3.3.3 Main Program Description:
The flow chart of this program is shown in Figure 6.14. This program is stored starting at location 0000H. When the system is reset, the program counter is forced to 0000H and the Intel 8085 starts executing the main program. At the beginning of the program execution, the program checks whether there is an error by checking all the syndrome components. If these components are all zeros, then no error is detected and $L_u$ is set to zero. If any of the syndrome components is not zero, then the program starts the system initialization. The system initialization consists of the following:

1. Reset the locations $\sigma_1^{(u)}$-$\sigma_5^{(u)}$ and $\sigma_1^{(p)}$-$\sigma_5^{(p)}$ to zeros,
2. Set the locations $\sigma_0^{(p)} = 1$, $\sigma_0^{(u)} = 1$ and $d_p = 1$,
3. Reset the locations $u$, $h_u$ and $L_u$ to zeros,
4. Set the locations $h_p$ and $p$ to $-1$, and
5. Set the location $d_u$ to $S_1$.
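The P(.) and V(.) transforms of Section 6.3.3.2, on which the routines of the control program rely, amount to a pair of lookup tables. The following is a minimal Python sketch, assuming the field is generated by the primitive polynomial $x^8 + x^4 + x^3 + x^2 + 1$ (an assumption, chosen because it reproduces Table 6.1 entries such as P(00000011) = 00011001). The names `P`, `V`, `gf_mul` and `gf_add` are illustrative and are not taken from the control program.

```python
# Sketch of the P(.) and V(.) transforms as lookup tables for GF(2^8).
# Assumption: primitive polynomial x^8 + x^4 + x^3 + x^2 + 1 (0x11D).
PRIM_POLY = 0x11D

V = [0] * 255   # V[p] = vector representation of alpha**p
P = {}          # P[v] = power representation of the non-zero vector v

element = 1
for power in range(255):
    V[power] = element
    P[element] = power
    element <<= 1              # multiply by alpha
    if element & 0x100:        # reduce modulo the primitive polynomial
        element ^= PRIM_POLY


def gf_mul(a, b):
    """Multiply two field elements by adding their power representations."""
    if a == 0 or b == 0:
        return 0               # zero has no power representation
    power = P[a] + P[b]
    if power > 254:            # alpha**255 = 1, so adjust by subtracting 255
        power -= 255
    return V[power]


def gf_add(a, b):
    """Add two field elements: modulo-2 addition of the vector forms."""
    return a ^ b
```

For example, `gf_mul(V[200], V[100])` returns `V[45]`, since 200 + 100 exceeds 254 and is adjusted to 45.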
After the initialization, the system calls the Sigma routine to calculate $\sigma^{(u+1)}(X)$ and stores the resulting coefficients in the $\sigma^{(u+1)}(X)$ array. The value of $h_u$ is then calculated and checked to see if it is greater than $h_p$. If so, the content of the $\sigma^{(u)}(X)$ array is transferred to the $\sigma^{(p)}(X)$ array and the parameters $u$, $d_u$ and $h_u$ are transferred to $p$, $d_p$ and $h_p$ respectively. If $h_u$ is less than $h_p$, then $\sigma^{(p)}(X)$ does not change. $\sigma^{(u+1)}(X)$ is then transferred to $\sigma^{(u)}(X)$.

The degree of $\sigma^{(u)}(X)$ is then checked. If it is larger than the number of correctable errors (5), an uncorrectable error flag is set. If not, the next iteration step is performed by incrementing the iteration step number and calculating the $u$th discrepancy $d_u$ by calling the Discrepancy routine. If $d_u$ is zero, then $\sigma^{(u+1)}(X)$ is the same as $\sigma^{(u)}(X)$. If $d_u$ is not zero, then $\sigma^{(u+1)}(X)$ is calculated by the Sigma routine. This continues until the number of iterations $u$ is equal to 10. At the end of executing this program, the microprocessor enters a continuous loop waiting to be reset by the overall controller.

Table 6.2 V-Transform of GF(2^8) Elements

p_i         V(p_i)          p_i         V(p_i)
00000000    00000001        10000000    10000101
00000001    00000010        10000001    00010111
00000011    00001000        10000010    00101110
  ...         ...             ...         ...
01111101    00110011        11111101    01000111
01111110    01100110        11111110    10001110
01111111    11001100        11111111    00000001

Figure 6.14 Main Program Flow Chart

6.3.3.4 The Sigma Routine:
This routine calculates the coefficients of $\sigma^{(u+1)}(X)$ using the current sigma polynomial $\sigma^{(u)}(X)$, the discrepancy $d_u$, $\sigma^{(p)}(X)$ and $d_p$, where $p < u$ and $\sigma^{(p)}(X)$ and $d_p$ are the polynomial and its discrepancy of the $p$th step. This step $p$ is selected such that $d_p \neq 0$ and $h_p$ has the largest value in the steps prior to the $u$th step.
$\sigma^{(u+1)}(X)$ is given by:

$$\sigma^{(u+1)}(X) = \sigma^{(u)}(X) + d_u d_p^{-1} X^{(u-p)} \sigma^{(p)}(X)$$

The flow chart for this subroutine is shown in Figure 6.15. Three temporary locations (TEMP1 - TEMP3) are used to store the intermediate results. The multiplication of $d_u$ by $d_p^{-1}$ is done by using the P-transform of both elements. The result is stored in power representation form in TEMP2. To calculate the term $d_u d_p^{-1} \sigma^{(p)}(X)$, each of the coefficients $\sigma_i^{(p)}$ of $\sigma^{(p)}(X)$ is multiplied by $d_u d_p^{-1}$, which is done by adding the power representations of the two terms. Multiplying this term by $X^{(u-p)}$ is then equivalent to shifting these new coefficients $(u-p)$ times to the right. Adding the result of this shifting operation to the coefficients of $\sigma^{(u)}(X)$ yields the coefficients of $\sigma^{(u+1)}(X)$. This result is stored in the $\sigma^{(u+1)}(X)$ array.

Figure 6.15 The Sigma Routine Flow Chart

6.3.3.5 The Discrepancy Routine:
This routine calculates the value of the $u$th discrepancy $d_u$ using the coefficients of $\sigma^{(u)}(X)$ and the syndrome components $S_i$. The $u$th discrepancy is given by the following equation:

$$d_u = S_{u+1} + \sigma_1^{(u)} S_u + \sigma_2^{(u)} S_{u-1} + \cdots + \sigma_{L_u}^{(u)} S_{u+1-L_u}$$

Two pointers, N and N1, are used for this routine. N points at the $\sigma^{(u)}(X)$ coefficients and N1 points at the syndrome components. The value of $d_u$ can be calculated by $L_u$ multiplications and $L_u$ additions. Each of these multiplications is a multiplication of two field elements, which is done by adding the P-transforms of both elements. The result is then checked to see if it is greater than 254. If it is, the power of the multiplication result is adjusted by subtracting 255 (since $\alpha^{255} = 1$, $\alpha^{i+255} = \alpha^{i} \cdot \alpha^{255} = \alpha^{i}$). The additions are done by modulo-2 addition of the vector representations of the two elements.
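The interplay of the main program, the Sigma routine and the Discrepancy routine can be sketched in Python as follows. This is a minimal sketch, not the 8085 control program: it assumes the primitive polynomial $x^8 + x^4 + x^3 + x^2 + 1$ for $GF(2^8)$, and all names (`berlekamp`, `discrepancy`, `sigma_u`, `sigma_p` and so on) are hypothetical. The step $p$ is replaced only when $d_u \neq 0$ and $h_u > h_p$, which is one way of keeping the prior step with the largest $h_p$.

```python
PRIM_POLY = 0x11D  # assumed primitive polynomial x^8 + x^4 + x^3 + x^2 + 1

V = [0] * 255      # V[p]: vector form of alpha**p
P = {}             # P[v]: power form of the non-zero vector v
x = 1
for i in range(255):
    V[i] = x
    P[x] = i
    x <<= 1
    if x & 0x100:
        x ^= PRIM_POLY


def gf_mul(a, b):
    if a == 0 or b == 0:
        return 0
    return V[(P[a] + P[b]) % 255]


def gf_div(a, b):
    if a == 0:
        return 0
    return V[(P[a] - P[b]) % 255]


def discrepancy(sigma, syndromes, u, degree):
    """d_u = S_(u+1) + sum over j = 1..L_u of sigma_j * S_(u+1-j).

    syndromes[k - 1] holds S_k.
    """
    d = syndromes[u]
    for j in range(1, degree + 1):
        d ^= gf_mul(sigma[j], syndromes[u - j])
    return d


def berlekamp(syndromes, t=5):
    """Return the coefficients [1, sigma_1, ...] or None if degree > t."""
    n = len(syndromes)                       # 2t syndrome components
    sigma_u, deg_u = [1] + [0] * n, 0        # sigma^(0)(X) = 1, L_0 = 0
    sigma_p, d_p, h_p, p = [1] + [0] * n, 1, -1, -1
    for u in range(n):
        d_u = discrepancy(sigma_u, syndromes, u, deg_u)
        if d_u == 0:
            continue                         # sigma^(u+1)(X) = sigma^(u)(X)
        # Sigma routine: sigma^(u) + d_u * d_p^-1 * X^(u-p) * sigma^(p)
        scale = gf_div(d_u, d_p)
        shift = u - p
        sigma_next = list(sigma_u)
        for i in range(n + 1 - shift):
            sigma_next[i + shift] ^= gf_mul(scale, sigma_p[i])
        deg_next = max(deg_u, (p - h_p) + shift)
        if u - deg_u > h_p:                  # h_u > h_p: step u becomes step p
            sigma_p, d_p, h_p, p = sigma_u, d_u, u - deg_u, u
        sigma_u, deg_u = sigma_next, deg_next
    if deg_u > t:
        return None                          # uncorrectable error
    return sigma_u[:deg_u + 1]
```

With two errors whose location numbers are $\alpha^3$ and $\alpha^{10}$, the ten syndromes lead to a result of degree 2 whose coefficients are those of $(1 + \alpha^3 X)(1 + \alpha^{10} X)$.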
The result of the Discrepancy routine is stored in $d_u$. The flow chart of this routine is shown in Figure 6.16.

Table 6.3 The Sigma Processor IC Parts List

Element Type         IC Type Number    Number of Elements    Number of IC Chips
Microprocessor       Intel 8085A              1                      1
EPROM                Intel 2716               1                      1
RAM                  Intel 2114               2                      2
Latch register       SN74S373                17                     17
Transceiver          AM 2947                  1                      1
Octal buffer         SN74LS244                1                      1
Tri-state buffer     SN74367                 80                     10
3-input NAND         SN74S10                 18                      6

All of the elements required to construct this processor are listed in Table 6.3. From this table, the total number of IC chips required is 39. The exact time delay of this processor is a function of the number of executed software instructions and the clock pulse frequency provided by the overall controller.

Figure 6.16 The Discrepancy Routine Flow Chart

CHAPTER VII: THE ERROR-LOCATION PROCESSOR

7.1 Introduction:
This is the third processor in the decoder pipelined structure. Its function is to determine the roots of the error-location polynomial $\sigma(X)$. The reciprocals of these roots are the error-location numbers. The inputs to this processor are the coefficients of the error-location polynomial and the number of errors $L_u$. It generates the roots of the error-location polynomial as an input to the following pipeline stage, the Error-Magnitude processor. Since the first five syndrome components are also needed as an input to the Error-Magnitude processor, they are stored in the Error-Location processor and transmitted to it, as shown in Figure 7.1.

The error-location polynomial $\sigma(X)$ has the following form:

$$\sigma(X) = 1 + \sigma_1 X + \sigma_2 X^2 + \cdots + \sigma_t X^t$$

The roots of $\sigma(X)$ can be found by substituting all the non-zero field elements of $GF(2^m)$ into $\sigma(X)$. If $\sigma(\alpha^i) = 0$, then $\alpha^i$ is a root of $\sigma(X)$.
Since there are $(2^m - 1)$ non-zero field elements and the degree of $\sigma(X)$ is $t$, a maximum of $(2^m - 1)t$ additions and $(2^m - 1)t$ multiplications are required to find the roots of $\sigma(X)$. This procedure would be very slow to implement. However, Chien's search algorithm requires only $2^m - 1$ clock pulses to find these roots, which is much faster than the direct method. Therefore, Chien's search algorithm is used in this processor.

Figure 7.1 Error-Location Processor Inputs and Outputs

7.2 Error-Location Processor Design:
The block diagram of this processor is illustrated in Figure 7.2. It consists of four parts: the Root Locator, the Stack register, two counters and the control circuit.

7.2.1 The Root Locator Design:
The Root Locator circuit diagram is shown in Figure 7.3. This Root Locator uses Chien's search algorithm to find the roots of the error-location polynomial $\sigma(X)$. In general, it requires $t$ multipliers to multiply by $\alpha, \alpha^2, \ldots, \alpha^t$. For the (255,245) code, only five multipliers are required, to multiply by $\alpha, \alpha^2, \ldots, \alpha^5$. Initially, the Sigma registers are loaded with the coefficients of the error-location polynomial generated by the Sigma processor. Then, the registers are clocked 255 times. At the end of the $i$th clock pulse, the registers contain $\sigma_1\alpha^{i}, \sigma_2\alpha^{2i}, \ldots, \sigma_5\alpha^{5i}$, and the output of the XOR tree is the sum of these values together with the constant term 1, which is:

$$\sigma(\alpha^{i}) = 1 + \sigma_1\alpha^{i} + \sigma_2\alpha^{2i} + \cdots + \sigma_5\alpha^{5i}$$

If this sum is zero, then $\alpha^i$ is a root of $\sigma(X)$, and $\alpha^{255-i}$ is an error-location number. If $\sigma(\alpha^i) \neq 0$, then $\alpha^i$ is not a root of $\sigma(X)$. To check if the sum is zero, an 8-input NOR gate is used to NOR all eight outputs of the XOR tree. If the output of the NOR gate is 1, then $\sigma(\alpha^i) = 0$. This circuit can be implemented using five multipliers similar to the first five multipliers used in the Syndrome processor designed in Chapter V. The Sigma registers are a set of 8-bit shift registers.
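The behaviour of the Root Locator, together with the two counters and the stack register, can be modelled in a few lines of Python. This is a behavioural sketch only, under two assumptions that are not stated in the text: the field is generated by $x^8 + x^4 + x^3 + x^2 + 1$, and a root found on the 255th clock pulse is recorded as power 0. The function name `chien_search` and its return convention are hypothetical.

```python
PRIM_POLY = 0x11D  # assumed primitive polynomial x^8 + x^4 + x^3 + x^2 + 1

V = [0] * 255      # V[p]: vector form of alpha**p
P = {}             # P[v]: power form of the non-zero vector v
x = 1
for i in range(255):
    V[i] = x
    P[x] = i
    x <<= 1
    if x & 0x100:
        x ^= PRIM_POLY


def gf_mul(a, b):
    if a == 0 or b == 0:
        return 0
    return V[(P[a] + P[b]) % 255]


def chien_search(sigma, num_errors):
    """Return (root_powers, found_all) for sigma = [1, sigma_1, ..., sigma_t].

    root_powers lists the powers i with sigma(alpha**i) = 0, in the order
    in which they are detected.
    """
    regs = list(sigma[1:])         # Sigma registers, loaded on reset
    roots = []                     # plays the role of the stack register
    remaining = num_errors         # counter C2, loaded with L_u
    for i in range(1, 256):        # counter C1 counts the clock pulses
        # The j-th register is multiplied by alpha**j on every clock pulse.
        regs = [gf_mul(r, V[j]) for j, r in enumerate(regs, start=1)]
        total = 1                  # constant term of sigma(X)
        for r in regs:
            total ^= r             # XOR tree
        if total == 0:             # NOR gate output is 1
            roots.append(i % 255)
            remaining -= 1
            if remaining == 0:     # all roots located, stop the clock
                return roots, True
    return roots, False            # 255 pulses elapsed: uncorrectable


# sigma(X) = (1 + alpha^3 X)(1 + alpha^10 X)
sigma = [1, V[3] ^ V[10], gf_mul(V[3], V[10])]
roots, ok = chien_search(sigma, 2)
locations = [(255 - r) % 255 for r in roots]   # powers of the location numbers
```

Here the roots are detected at $i = 245$ and $i = 252$, the reciprocals of $\alpha^{10}$ and $\alpha^{3}$.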
Figure 7.2 Error-Locator Processor Block Diagram

Figure 7.3 Root-Locator Circuit Diagram

7.2.2 The Counters Design:
In this processor there are two counters. The first counter, C1, is an up binary counter which is used to count the circuit clock pulses. At any time, the content of this counter is also the power representation of the field element that has been tested to check whether it is a root of $\sigma(X)$. The second counter, C2, is a down binary counter. Initially, this counter is loaded with the number of errors $L_u$. When a root of $\sigma(X)$ is detected, this counter is decremented by enabling its clock input using the NOR gate output. When this counter reaches zero, it indicates that all the roots of $\sigma(X)$ have been detected, and the circuit clock is disabled. The content of this counter is tested by a 3-input NOR gate.

7.2.3 The Stack Register Design:
There are five registers arranged in the form of a stack (the outputs of one register are the inputs of the following one). These registers are used to store the roots of the $\sigma(X)$ polynomial in their power representation forms. Each time a root is detected, the content of C1 is loaded into the top location of the stack and the rest of the stack content is pushed one place down. This is done by enabling the stack clock pulse input using the output of the NOR gate which monitors the output of the XOR tree. At the end of the process, this stack holds all the roots of $\sigma(X)$.

7.2.4 System Control Design:
There are a few signals used to control this processor. The reset signal RS is set by the overall controller. This signal is used to initialize the processor. Signal M2, when it is zero, indicates that all the roots of $\sigma(X)$ have been located. When it is not zero, it indicates that not all the roots of $\sigma(X)$ have been located yet.
Signal M1, when set, indicates that 255 clocks have been received, i.e. all the field elements of $GF(2^8)$ have already been tested to see if they are roots of $\sigma(X)$. If this occurs before all the roots of $\sigma(X)$ are found, an uncorrectable error is assumed. This signal is necessary because occasionally $\sigma(X)$ cannot be factorized into the form $(1 + B_1 X)(1 + B_2 X) \cdots (1 + B_5 X)$. This occurs when some of its roots are not in the field $GF(2^8)$. If this happens, an error of weight more than 5 is assumed. The M2 signal is then used to set an alarm flag to indicate such a condition.

7.3 System Operation:
At the beginning of each processing cycle, the processor receives a reset pulse from the overall controller on the RS line. This pulse is used to initialize the processor. During the initialization:

- the coefficients of the error-location polynomial are loaded into the Sigma registers. This is done by selecting the first input of each of the 2-to-1 multiplexers connected to the $\sigma_i$'s coming out of the Sigma processor,
- C1 is reset to zero,
- C2 is loaded with the number of errors $L_u$,
- Flip-Flop Q is reset to zero to enable the system clock, and
- Flip-Flop F is set to one to indicate that the process of determining the roots of $\sigma(X)$ is in progress.

The processor clock is controlled by the contents of C1 and C2. If C2 reaches 0 first, this indicates that the roots of $\sigma(X)$ have been found; the processor clock pulse is then disabled and F is set to 1, indicating a correctable error. If C1 reaches 255 first, this indicates that an uncorrectable error is detected; in this case F is reset to 0 and the processor clock pulse is disabled. Whenever a root is detected, its value is loaded into the stack register and C2 is decremented.

This processor can be implemented using Schottky TTL IC circuits. The five multipliers are implemented using 41 XOR gates which need 6 Quad 2-input XOR SN74S86s. This gate has a 7 ns time delay.
The Sigma registers are implemented using 5 octal D-type edge-triggered flip-flops, SN74S373s, with a 10 ns time delay. The XOR tree circuit has 5x8 inputs and 8 outputs and can be constructed using 32 XOR gates. The least significant bit of the XOR tree output is inverted to account for the addition of 1. The XOR tree output is then connected to an 8-input NOR gate. This NOR gate can be constructed using four 2-input OR gates followed by a 4-input NOR gate, which are implemented using SN74S32 and SN74S40, and have a total time delay of 8 ns. The two counters C1 and C2 are implemented using three 4-bit counters, SN74S161s. The stack register consists of five octal D-type edge-triggered flip-flops, SN74S374s. The control circuit requires four 2-input AND gates, a 2-input OR gate, an 8-input AND gate, and a 3-input NOR gate. These gates can be implemented using SN74S08, SN7427 and SN7421. Eleven 8-bit registers are required for the storage of the Sigma coefficients, the syndrome components and the number of detected errors. This makes the total number of IC chips required for the implementation of this processor equal to 37.

This processor needs a maximum of 256 clock pulses to determine the roots of the error-location polynomial. The clock pulse duration should be 57 ns minimum, which is the total time delay of the root locator circuit.

CHAPTER VIII: THE ERROR-MAGNITUDE PROCESSOR

8.1 Introduction:
This is the fourth processor in the decoder pipelined structure. It computes the magnitudes of the errors at all given error locations. The inputs to this processor are the coefficients of the error-location polynomial, the roots of this polynomial, the number of errors and the first five syndrome components. The outputs of this processor are the error-location numbers and the magnitudes of the errors, as shown in Figure 8.1. These outputs are transmitted to the Error-Correction processor.

The procedure to calculate the magnitudes of the errors was devised by Forney and Berlekamp [3].
In this procedure, the error evaluation polynomial $Z(X)$ is defined as:

$$Z(X) = 1 + (S_1 + \sigma_1) X + (S_2 + \sigma_1 S_1 + \sigma_2) X^2 + \cdots + (S_{L_u} + \sigma_1 S_{L_u - 1} + \cdots + \sigma_{L_u}) X^{L_u}$$

Then, the magnitude of the error in location $B_L$ is given by:

$$e_L = \frac{Z(B_L^{-1})}{\prod_{i=1,\, i \neq L}^{L_u} (1 + B_i B_L^{-1})}$$

8.2 Error-Magnitude Processor Hardware Design:
To compute the magnitude of the error, a microprocessor based system similar to the one used in Chapter VI has been designed. The detailed design of this system is described in Chapter VI, and the detailed circuit diagram is shown in Figure 6.3. The only difference in the hardware design between the two processors is the Input/Output ports, which are discussed in the following section.

Figure 8.1 Error-Magnitude Processor Inputs and Outputs

8.2.1 Input/Output Ports:
There are 16 input ports and 10 output ports. The input ports are used to interface the Error-Location processor to the Error-Magnitude processor. The output ports are used to interface the Error-Magnitude processor to the Error-Correction processor. All the input and output ports are implemented using 26 SN74S373s.

To address one of the 16 input ports, four address lines $A_0$, $A_1$, $A_2$, $A_3$ are fed to a 4/16 binary decoder (SN74154) to produce 16 different select lines. Each of these lines is used with $A_{15}$ and RD to completely select an input port. The input port selector implementation is shown in Figure 8.2. The outputs of each input port are connected to the data bus through a set of tri-state buffers which are enabled when the port is selected. The first five syndrome components ($S_1$-$S_5$) are received through the first five input ports. The coefficients of the error-location polynomial are received through the next five input ports. The following five ports are used for the roots of the error-location polynomial, while the last input port is used for the number of detected errors, $L_u$.

Each of the ten output ports is selected by WE, $A_{15}$ and one of the address lines $A_0$-$A_9$, as shown in Figure 8.3.
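The port selection described above can be summarised as a small address-decoding sketch in Python. It is illustrative only: RD and WE are modelled as booleans meaning "asserted", the function names are hypothetical, and returning `None` when more than one of $A_0$-$A_9$ is high is a modelling choice that the text does not specify.

```python
def select_input_port(address, rd_asserted):
    """Return the input port number (0-15) selected by a read, or None.

    A15 together with RD qualifies the access; A3..A0 feed the 4/16 decoder.
    """
    a15 = (address >> 15) & 1
    if not (rd_asserted and a15):
        return None
    return address & 0x000F


def select_output_port(address, we_asserted):
    """Return the output port number (0-9) selected by a write, or None.

    Each output port is tied to one of the address lines A0..A9.
    """
    a15 = (address >> 15) & 1
    low = address & 0x03FF                  # address lines A9..A0
    if not (we_asserted and a15):
        return None
    if low == 0 or low & (low - 1):         # require exactly one line high
        return None
    return low.bit_length() - 1
```

Under these assumptions a read from address 8003H selects input port 3, and a write to 8200H selects the output port tied to $A_9$.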
When any of these output ports is selected, the data byte present on the data bus is latched into it. These output ports are used to send the error-location numbers and the magnitudes of the errors to the Error-Correction processor.

Figure 8.2 Input Ports Configuration

Figure 8.3 Output Ports Configuration

8.3 Error-Magnitude Processor Software Description:
The algorithm to compute the magnitudes of the errors is executed by the microprocessor based system under the control of the software control program. This program consists of four routines: a main program and three routines. These routines are: a routine to calculate $Z(X)$, a routine to evaluate $Z(X_c)$ and a routine to calculate the product term $\prod (1 + B_i B_c^{-1})$. The data structure and each of these routines are described in the following sections.

8.3.1 Data Structure Description:
The data structure consists of 6 arrays; the size of each array is five locations. The first array is the syndrome array, which contains the first five syndrome components needed to calculate $Z(X)$. The second is the Sigma array, which contains the coefficients $\sigma_i$ of the error-location polynomial. The third is the X-array, which contains the roots of the error-location polynomial in power representation form. The fourth array is the Z-array, which contains the values of the coefficients of the error evaluation polynomial $Z(X)$. The fifth and the sixth are the B-array and E-array, which contain the error-location numbers ($B_i$'s) and the corresponding magnitudes of the errors ($e_i$'s), respectively. The block diagram of the data structure is shown in Figure 8.4.
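The six arrays and the computation they support can be sketched in Python as follows. This is a sketch of the procedure of Section 8.1 under stated assumptions, not the 8085 program: the field is assumed to be generated by $x^8 + x^4 + x^3 + x^2 + 1$, the B-array is kept in power representation form, unused locations are filled with FFH and 00H, and every name (`error_magnitudes`, `z`, `b_array`, `e_array`) is hypothetical.

```python
PRIM_POLY = 0x11D  # assumed primitive polynomial x^8 + x^4 + x^3 + x^2 + 1

V = [0] * 255      # V[p]: vector form of alpha**p
P = {}             # P[v]: power form of the non-zero vector v
x = 1
for i in range(255):
    V[i] = x
    P[x] = i
    x <<= 1
    if x & 0x100:
        x ^= PRIM_POLY


def gf_mul(a, b):
    if a == 0 or b == 0:
        return 0
    return V[(P[a] + P[b]) % 255]


def gf_div(a, b):
    if a == 0:
        return 0
    return V[(P[a] - P[b]) % 255]


def error_magnitudes(syndromes, sigma, root_powers, t=5):
    """syndromes[k-1] = S_k, sigma = [1, sigma_1, ...], root_powers = X-array."""
    n_err = len(root_powers)                        # L_u
    # Z-array: Z_i = S_i + sigma_1 S_(i-1) + ... + sigma_(i-1) S_1 + sigma_i
    z = [1] + [0] * n_err
    for i in range(1, n_err + 1):
        z[i] = syndromes[i - 1] ^ sigma[i]
        for j in range(1, i):
            z[i] ^= gf_mul(sigma[j], syndromes[i - j - 1])
    # B-array: each location number is the reciprocal of a root
    b_array = [(255 - xp) % 255 for xp in root_powers]
    e_array = []
    for c, xp in enumerate(root_powers):
        x_c = V[xp]
        temp1 = z[n_err]                            # evaluate Z(X_c) by
        for i in range(n_err - 1, 0, -1):           # repeated multiply-and-add
            temp1 = gf_mul(temp1, x_c) ^ z[i]
        temp1 = gf_mul(temp1, x_c) ^ 1
        temp2 = 1                                   # product term
        for i, bp in enumerate(b_array):
            if i != c:                              # B_i * B_c^-1 = B_i * X_c
                temp2 = gf_mul(temp2, 1 ^ V[(bp + xp) % 255])
        e_array.append(gf_div(temp1, temp2))
    pad = t - n_err                                 # no-error locations
    return b_array + [0xFF] * pad, e_array + [0x00] * pad
```

For two errors of magnitudes $\alpha^{7}$ and $\alpha^{100}$ at location numbers $\alpha^{3}$ and $\alpha^{10}$, with the roots supplied as powers 245 and 252, the sketch returns the location powers 10 and 3 followed by three FFH entries, and the magnitudes $\alpha^{100}$ and $\alpha^{7}$ followed by three zeros.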
8.3.2 Main Program Description:
The flow chart of this program is shown in Figure 8.5. This program is stored in the ROM starting at location 0000H. When the processor is reset, the program counter is forced to 0000H and the processor starts executing this program. The first step in this program is to determine the coefficients of the error evaluation polynomial $Z(X)$ and store them in the Z-array. Next, the program checks the number of errors. If it is less than five, the processor sets the locations that correspond to no-error in the error-location array to FFH and the corresponding locations in the error-magnitude array to 00H. This facilitates the function of the Error-Correction processor and makes it independent of the number of detected errors. The program then determines the error-location numbers $B_i$ using the roots $X_i$ of the error-location polynomial and stores them in the corresponding locations of the B-array. For each error-location number, the program generates an error magnitude $e_i$ and stores it in the corresponding location of the E-array. This is done by executing the $Z(X_c)$ routine followed by the Product routine. Dividing the result of the first routine by the result of the second yields the error magnitude. This process is repeated for each of the error locations. At the end of executing this program, the microprocessor enters a continuous loop waiting to be reset by the overall controller.

Figure 8.4 Error-Magnitude Processor Data Structure

Figure 8.5 Main Program Flow Chart

8.3.3 The Error Evaluation Polynomial Routine:
The flow chart of this routine is shown in Figure 8.6.
This routine has the first five syndrome components and the coefficients of the error-location polynomial as inputs and generates the coefficients of $Z(X)$ as an output. These $Z(X)$ coefficients are stored in the Z-array. Any coefficient $Z_i$ of the $Z(X)$ polynomial is given by:

$$Z_i = S_i + \sigma_1 S_{i-1} + \sigma_2 S_{i-2} + \cdots + \sigma_{i-1} S_1 + \sigma_i$$

Each coefficient of $Z(X)$ is therefore calculated using two pointers, C and C1, which are set to point to the syndrome component number and to the Sigma coefficient number, respectively. The $i$th syndrome component is moved to the $Z_i$ location, then the pointer C is decremented and the pointer C1 is incremented. The value of $\sigma_{C1} S_C$ is then computed and added to $Z_i$. This process continues until the pointer C reaches zero. Then $\sigma_i$ is added to $Z_i$. This procedure is repeated for each value of $i$, where $1 \le i \le L_u$.

Figure 8.6 Error Evaluation Polynomial Routine Flow Chart

8.3.4 The $Z(X_c)$ Routine:
This routine calculates the value of the error evaluation polynomial $Z(X)$ for each of the error-location polynomial roots $X_c$. The value of $Z(X_c)$ is stored in a temporary location TEMP1. Since $Z(X)$ is given by:

$$Z(X) = 1 + Z_1 X + Z_2 X^2 + \cdots + Z_{L_u} X^{L_u}$$

with coefficients $Z_i$, where $1 \le i \le L_u$, it can be rearranged as:

$$Z(X) = 1 + X\,(Z_1 + X\,(Z_2 + \cdots + X\,(Z_{L_u - 1} + X Z_{L_u}) \cdots ))$$

To calculate $Z(X_c)$, the value of $Z_i$ with $i = L_u$ is first transferred to TEMP1. The content of TEMP1 is multiplied by $X_c$. The result is added to $Z_{i-1}$ and stored in TEMP1. This process of multiplication followed by addition is repeated until $i = 1$. The value of TEMP1 is then multiplied by $X_c$ and the result is added to 1 using modulo-2 addition. The flow chart of this routine is shown in Figure 8.7.

Figure 8.7 The Z(Xc) Routine Flow Chart

8.3.5 The Product Routine:
This routine computes the value of the product term PT given by:

$$PT = \prod_{i=1,\, i \neq c}^{L_u} (1 + B_i B_c^{-1})$$

and stores the result in a temporary location TEMP2.
For a given error-location number $B_c$, the term $B_i B_c^{-1}$ is calculated, added to 1 and the result is stored in location TEMP1. This process is repeated for each value of $i$, where $1 \le i \le L_u$ and $i \neq c$. The flow chart of this routine is shown in Figure 8.8.

Table 8.1 The Error-Magnitude Processor IC Part List

Element Type         IC Type Number    Number of Elements    Number of IC Chips
Microprocessor       Intel 8085A              1                      1
EPROM                Intel 2716               1                      1
RAM                  Intel 2114               2                      2
Transceiver          AM 2947                  1                      1
Octal buffer         SN74LS244                1                      1
Latch register       SN74373                 27                     27
Tri-state buffer     SN74367                128                     16
4/16 decoder         SN74154                  1                      1
3-input NAND         SN74S10                 28                     10

All the IC parts required to implement the Error-Magnitude processor are listed in Table 8.1. From this table, the total number of IC chips required is 60.

Figure 8.8 The Product Routine Flow Chart

The exact time delay of this processor is a function of the number of executed software instructions as well as the clock pulse frequency provided by the overall controller.

CHAPTER IX: THE ERROR-CORRECTION PROCESSOR AND THE OVERALL CONTROLLER

9.1 The Error-Correction Processor Design:
The Error-Correction processor is the final stage of the decoder pipelined structure. The inputs to this processor are the five error-location numbers, the five magnitudes of errors and the received vector $r(X)$. The output of this processor is the corrected code word $V(X)$. The error-location numbers and the magnitudes of errors are transferred from the Error-Magnitude processor, while $r(X)$ is read out of the Queue buffer.

This processor consists of two 5-location stack registers B and E, a counter C, a comparator COMP, an adder and control gates, as shown in Figure 9.1.
When a reset pulse, RS, is received from the overall controller, the outputs of the Error-Magnitude processor, which represent the five error-location numbers and the corresponding five magnitudes of errors, are latched into the B and E stack registers. At the same time the counter C is reset to zero. It may be recalled that the Error-Location processor began its root search from $\alpha^0$ to $\alpha^{254}$; therefore, the error-location numbers are in the proper sequence for correction. The corresponding magnitudes of errors are stored in the E stack in the same sequence.

The error correction procedure is as follows. With each clock pulse received, a symbol is read from the Queue buffer unit (in order) and the counter is incremented. The output of this counter is compared with the output of the top location of the B stack. If they are equal, then the current symbol $r_i$ read out of the buffer is corrupted. To correct it, the output of the top location of the E stack is enabled and added to it to form the correct symbol $v_i$. The contents of the B and E stacks are then popped one location up, so that the error-location number and the corresponding magnitude that have just been used are shifted out. The bottom location of the B stack is loaded with binary 255 (FFH) and that of the E stack is loaded with 0. If the content of the counter and the output of the top location of the B stack do not match, then the symbol $r_i$ is correct and is shifted out unchanged. This process continues for 255 pulses. Therefore, the maximum value the counter C reaches is 254. This is the reason why the no-error locations of the B stack are loaded with binary 255 (FFH), which never matches the counter output.

Figure 9.1 Error-Correction Processor Circuit Diagram
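The correction procedure above can be modelled as a short Python loop. This is a behavioural sketch under two assumptions that go beyond the text: the B stack holds the error positions as counter values, and they are already ordered so that the top location is the first one the counter reaches. The function name `correct_block` is hypothetical.

```python
def correct_block(received, b_stack, e_stack):
    """Return the corrected symbols for one 255-symbol block.

    b_stack: five location values (FFH marks a no-error location).
    e_stack: the five corresponding error magnitudes.
    """
    b_stack = list(b_stack)
    e_stack = list(e_stack)
    corrected = []
    for counter, symbol in enumerate(received):   # counter C runs 0..254
        if counter == b_stack[0]:                 # comparator COMP matches
            symbol ^= e_stack[0]                  # adder: modulo-2 addition
            b_stack = b_stack[1:] + [0xFF]        # pop both stacks and refill
            e_stack = e_stack[1:] + [0x00]        # the bottom locations
        corrected.append(symbol)
    return corrected


# Example: an all-zero code word hit by two errors.
received = [0] * 255
received[3] = 0x55
received[10] = 0xAA
fixed = correct_block(received, [3, 10, 0xFF, 0xFF, 0xFF],
                      [0x55, 0xAA, 0x00, 0x00, 0x00])
```

Because the counter never exceeds 254, the FFH entries can never match, so a block with fewer than five errors passes through the same loop without any special case.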
This processor is implemented using Schottky TTL ICs. The comparator is constructed using two 4-bit magnitude comparators, SN74S85s, which have an 18 ns time delay. The counter is constructed using two 4-bit binary counters, SN74S161s. The registers are implemented using 10 octal D-type flip-flops, SN74S374s. The adder is implemented using two Quad 2-input XOR, SN74S86s. Three SN74S08s are used for the control AND gates. The total chip count required to implement this processor is 19. This processor requires 255 clock pulses to correct and read out one data block.

9.2 The Queue Buffer:
This is the buffer where the five received vectors which are currently being decoded are stored. It is organized in the form of a queue, i.e. the first vector to get in is the first vector to get out (FIFO). As discussed in Chapter III, the size of this buffer is (5x255)x8, and it is divided into five units of 255x8 each, as shown in Figure 9.2. There are two pointers, RDP and WRP. The pointer RDP always points at the top unit of the queue buffer, $B_{rd}$, from which the vector being corrected and transmitted out is read. The pointer WRP always points at the bottom unit of the queue, $B_{wr}$, into which the newly received vector is written. The read and write operations from and into these two units are synchronized (they have the same clock pulse signal); therefore only one address counter is required.

To select the $B_{rd}$ buffer unit, the output of the RDP pointer is decoded using a 3/8 binary decoder. The outputs of this decoder control a set of tri-state buffers such that only the output of the $B_{rd}$ unit is enabled to the buffer output data bus. To select the $B_{wr}$ buffer unit, the output of the WRP pointer is also decoded using another 3/8 binary decoder. The outputs of this decoder control a set of tri-state buffers such that the input data bus is enabled to the $B_{wr}$ data inputs only. For every clock pulse received, one data byte is read out of the $B_{rd}$ unit and one data byte is written into the $B_{wr}$ unit.
At the same time the address counter is incremented. After completing the read and write operations of whole data blocks, the buffer is reset by the RS control signal provided by the overall controller. This signal clears the address counter and increments the two pointers RDP and WRP to be ready for a new read or write operation, respectively. Each of these two pointers is a 3-bit up counter designed to count up from zero to four, and then back to zero. The WRP pointer is always one unit ahead of the RDP, i.e. if WRP contains 4 then RDP will contain 3. The buffer requires: five (256x8) RAMs, 8x10 tri-state buffers, two (3/8) binary decoders, one 8-bit up binary counter and two 3-bit counters.

Figure 9.2 The Queue Buffer Circuit Diagram

System Overall Controller:

The system overall controller controls all five of the decoder's processors. This decoder system is a part of either a digital communication or data storage system, the Host system. Therefore, the decoder system could be controlled directly by the Host system overall controller, or it could have its own overall controller which would be controlled by the Host system overall controller. Each of the decoder's pipelined processors is designed such that minimum external control is required. The only control signals required for each processor are the clock pulse and reset signals, as shown in Figure 9.3. The timing diagram of these control signals is shown in Figure 9.4. Since the received symbols are fed to the Syndrome processor and the Queue buffer at the same time, they are clocked using the same clock pulse signal, CP1. The Error-Correction processor corrects and reads data at the same rate; therefore it is also clocked using the clock pulse signal CP1. The Error-Location processor is clocked by the main clock pulse signal, CP.
The Sigma and Error-Magnitude processors run at a much faster rate than any of the other processors; therefore, they are clocked by a clock pulse signal whose frequency is a multiple of the frequency of the CP signal.

Figure 9.3 Overall Controller Control Signals

Figure 9.4 Control Signals Timing Diagram

The RS signal that is provided by the overall controller has the following functions: (1) it latches the data coming out of each intermediate processor into the input ports of the following processor, (2) it resets all counters and control flip-flops of all the decoder processors, and (3) it resets the two 8085 microprocessors of the Sigma and Error-Magnitude processors. When the Sigma or the Error-Location processor detects an uncorrectable error, it sets the F1 or F2 flag, respectively. When either the F1 or the F2 flag is set, this indicates to the overall controller that the data block stored in the buffer unit (WRP-1) or (WRP-2), respectively, is an uncorrectable corrupted data block. The action to be taken next is up to the Host system.

CHAPTER X
CONCLUSION

One of the important parts in the design of modern digital communication and data storage systems is the error detecting and correcting system, which is responsible for the reliable recovery of data. In this project, Reed-Solomon codes encoding and decoding algorithms are first discussed. A complete logic circuit design of the (255, 245) Reed-Solomon code encoder/decoder microprocessor based system is then presented.
This code is defined over the Galois Field GF(2^8) and has the capability of correcting up to five burst errors of 8 bits each, or any burst combination of up to a total length of 40 bits, provided they affect a maximum of five individual symbols (bytes). Although a Reed-Solomon code with smaller dimensions could have been selected, this code in particular has been chosen because its symbol size matches the very commonly used 8-bit data byte.

Alternative approaches to design Reed-Solomon encoders based on linear feedback shift register implementation have been investigated. Two-level ROM, one-level ROM, and combinational logic circuit implementations of the field elements multiplier have been discussed. For better efficiency and higher speed, a five-stage pipelined structured decoder has been designed. This architecture utilizes the parallelism in the decoding algorithms, which consist of several distinct steps. The decoder consists of five independently designed processors. These processors are: the Syndrome, Sigma, Error-Location, Error-Magnitude and Error-Correction processors. The detailed design of each of these processors is presented. Berlekamp's iterative algorithm is used to determine the coefficients of the error location polynomial, and Chien's search algorithm is used to find its roots. The Sigma and Error-Magnitude processors are designed using Intel 8085A microprocessors, while the other processors are designed using off-the-shelf integrated circuits. The cost of implementing each processor has been expressed in terms of the number of IC chips required to build it, while the speed of each processor has been expressed in terms of the total time delay.

Although this system has been designed for a code that has a natural length of 255 symbols (2040 binary bits), this code can easily be shortened to any length to match any system specifications without any major change in the encoder/decoder hardware circuits.
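As an illustration of why shortening needs no major hardware change, the sketch below models a systematic shift-register encoder in Python. Several details are assumptions that are not taken from the text: the field polynomial x^8 + x^4 + x^3 + x^2 + 1 (0x11D), the generator roots α^1 through α^10, and all function names. Leading zero message symbols leave the parity register at zero, so a short message yields the same parity symbols as the same message padded with zeros to 245 symbols, and the padding never has to be transmitted.

```python
PRIM_POLY = 0x11D   # assumed field polynomial: x^8 + x^4 + x^3 + x^2 + 1
N, K = 255, 245     # natural code length and message length
PARITY = N - K      # ten parity symbols


def gf_mul(a, b):
    """Multiply two GF(2^8) elements (shift-and-add, reduced by PRIM_POLY)."""
    result = 0
    while b:
        if b & 1:
            result ^= a
        a <<= 1
        if a & 0x100:
            a ^= PRIM_POLY
        b >>= 1
    return result


def generator_poly():
    """g(x) = (x + a^1)...(x + a^10), highest-degree coefficient first.

    The choice of roots a^1..a^10 (with a = 2) is an assumption.
    """
    g = [1]
    root = 1
    for _ in range(PARITY):
        root = gf_mul(root, 2)          # next power of alpha
        nxt = g + [0]                   # g(x) * x
        for i, coeff in enumerate(g):   # ... plus g(x) * root
            nxt[i + 1] ^= gf_mul(coeff, root)
        g = nxt
    return g


def parity_symbols(message):
    """Remainder of message(x) * x^10 divided by g(x), as a shift register."""
    g = generator_poly()
    reg = [0] * PARITY
    for sym in message:
        feedback = sym ^ reg[0]
        reg = reg[1:] + [0]
        for i in range(PARITY):
            reg[i] ^= gf_mul(feedback, g[i + 1])
    return reg


def encode_shortened(message):
    """Systematic codeword for a message of at most 245 symbols."""
    if len(message) > K:
        raise ValueError("message longer than 245 symbols")
    return list(message) + parity_symbols(message)
```

With this model, a 100-symbol message gives a 110-symbol codeword whose parity equals that of the zero-padded 245-symbol message, which is the property that lets the same encoder circuit serve any shortened length.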
Although Reed-Solomon codes have been known for a long time, very little literature is available. Perhaps this paper could be the only available document that presents a complete detailed design of a Reed-Solomon code encoder/decoder system in one packet. Besides the enormous experience the author gained while reviewing the literature and completing the design of the system presented in this project, this design has shown that, with today's technology, it is feasible to design and build Reed-Solomon codes encoder/decoder systems using off-the-shelf integrated circuits.

Alternative approaches to design Reed-Solomon codes encoder/decoder systems (for example, using special purpose processors designed to handle Galois Field element computations) [2], comparison of cost and performance of these designs with the one presented, actual design implementation, and putting this design in a form that suits VLSI implementation remain as work for the future.

REFERENCES

1- Berlekamp, E. R., "On Decoding Binary Bose-Chaudhuri-Hocquenghem Codes", IEEE Trans. Inf. Theory, IT-11, pp. 577-580, October 1965.

2- Berlekamp, E. R., "Bit Serial Reed-Solomon Encoders", IEEE Trans. Inf. Theory, Vol. IT-28, pp. 869-874, November 1982.

3- Berlekamp, E. R., Algebraic Coding Theory, McGraw-Hill Book Company, New York, 1968.

4- Blahut, R. E., Theory and Practice of Error Control Codes, Addison-Wesley Publishing Company, Reading, Mass., 1983.

5- Chien, R. T., "Cyclic Decoding Procedure for Bose-Chaudhuri-Hocquenghem Codes", IEEE Trans. Inf. Theory, IT-10, pp. 357-363, October 1964.

6- El Naga, N. M., "An Error Detecting and Correcting System For Optical Memory", Proceedings of The International Conference of SPIE, Los Angeles, Calif., January 1982.

7- Forney, G. D., "On Decoding BCH Codes", IEEE Trans. Inf. Theory, IT-11, pp. 577-580, October 1965.

8- Hamming, R. W., "Error Detecting and Error Correcting Codes", Bell System Tech. J., 29, pp. 147-160, April 1950.

9- Intel, "MCS-8085 Family User's Manual", Intel Corporation, Santa Clara, CA, October 1979.

10- Lin, S., Costello, D. J., Jr., Error Control Coding: Fundamentals and Applications, Prentice-Hall, Englewood Cliffs, N.J., 1983.

11- Massey, J. L., "Step-by-Step Decoding of the Bose-Chaudhuri-Hocquenghem Codes", IEEE Trans. Inf. Theory, IT-11, pp. 580-585, October 1965.

12- Peterson, W. W., "Encoding and Error-Correcting Procedures for The Bose-Chaudhuri Codes", IRE Trans. Inf. Theory, IT-6, pp. 459-470, September 1960.

13- Peterson, W. W., and Weldon, E. J., Jr., Error-Correcting Codes, 2nd Ed., MIT Press, Cambridge, Mass., 1970.

14- Rooney, V. M., Ismail, A. R., Microprocessors and Microcomputers, Macmillan Publishing Company, New York, 1984.

15- Texas Instruments Inc., The TTL Data Book For Design Engineers, 2nd Ed., Texas Instruments, Inc., Dallas, Texas, 1981.