CALIFORNIA STATE UNIVERSITY, NORTHRIDGE
REED-SOLOMON CODES ENCODER/DECODER
MICROPROCESSOR BASED SYSTEM
A thesis submitted in partial satisfaction of
the requirements for the degree of Master of Science in
Engineering
by
Halima Makady El Naga
May, 1987
Copyright 1987
by
Halima Makady El Naga
The Thesis of Halima Makady El Naga is approved:
DR. Robert Henderson
DR. Jagdish Prabhakar
Committee Chair
California State University, Northridge
To My Parents
Acknowledgements
I wish to express my sincere thanks to my thesis advisor,
Dr. Jagdish Prabhakar, for his advisement and the much
appreciated time he spent towards improving the final form
of this thesis.
Special thanks to my husband Nagi for his continuous
support, patience, encouragement and advisement.
TABLE OF CONTENTS
List of Tables
List of Figures
Abstract
CHAPTER I: INTRODUCTION
1.1 Introduction
1.2 Objective
1.3 Project Outline
CHAPTER II: REED-SOLOMON CODES
2.1 Hamming Codes
2.2 General BCH Codes
2.3 Reed-Solomon Codes
2.4 Encoding of Reed-Solomon Codes
2.5 Decoding of Reed-Solomon Codes
CHAPTER III: SYSTEM SPECIFICATION AND GENERAL DESCRIPTION
3.1 System Specification
3.2 System Description
3.2.1 The Encoder
3.2.2 The Decoder
CHAPTER IV: ENCODER HARDWARE DESIGN
4.1 Introduction
4.2 Field Element Multiplier Hardware Implementation
4.2.1 Two-Level ROM Implementation
4.2.2 One-Level ROM Implementation
4.2.3 Combinational Circuit Implementation
4.3 Encoder Hardware Implementation
CHAPTER V: THE SYNDROME PROCESSOR
5.1 The Syndrome Processor Description
5.2 The Syndrome Processor Hardware Design
5.2.1 Using Two-Level ROM Multiplier Implementation
5.2.2 Using One-Level ROM Multiplier Implementation
5.2.3 Using Combinational Circuit Multiplier Implementation
CHAPTER VI: THE SIGMA PROCESSOR
6.1 Introduction
6.2 The Iterative Algorithm For Finding The Error-Location Polynomial
6.3 The Sigma Processor Design
6.3.1 Microprocessor Based System Structure
6.3.2 The Sigma Processor Hardware System Design
6.3.2.1 The INTEL 8085 Microprocessor
6.3.2.2 Memory Unit
6.3.2.3 Input/Output Ports
6.3.3 The Sigma Processor Software System Design
6.3.3.1 Data Structure Description
6.3.3.2 P( ) and V( ) Transforms
6.3.3.3 Main Program Description
6.3.3.4 The Sigma Routine
6.3.3.5 The Discrepancy Routine
CHAPTER VII: THE ERROR-LOCATION PROCESSOR
7.1 Introduction
7.2 Error-Location Processor Design
7.2.1 The Root Locator Design
7.2.2 The Counters Design
7.2.3 The Stack Register Design
7.2.4 System Control Design
7.3 System Operation
CHAPTER VIII: THE ERROR-MAGNITUDE PROCESSOR
8.1 Introduction
8.2 Error-Magnitude Processor Hardware Design
8.2.1 Input/Output Ports
8.3 Error-Magnitude Processor Software Design
8.3.1 Data Structure Description
8.3.2 Main Program Description
8.3.3 The Error-Evaluation Polynomial Routine
8.3.4 The Z(Xc) Routine
8.3.5 The Product Routine
CHAPTER IX: THE ERROR-CORRECTION PROCESSOR AND THE OVERALL CONTROLLER
9.1 The Error-Correction Processor Design
9.2 The Queue Buffer
9.3 The System Overall Controller
CHAPTER X: CONCLUSION
REFERENCES
LIST OF TABLES
3.1 GF(2^8) Elements Generated By p(X) = 1 + X^2 + X^3 + X^4 + X^8
4.1 Time Delay and IC Chip Count for the Encoder Circuit
6.1 P-Transform of GF(2^8) Elements
6.2 V-Transform of GF(2^8) Elements
6.3 The Sigma Processor IC Parts List
8.1 The Error-Magnitude Processor IC Parts List
LIST OF FIGURES
3.1 Encoder Block Diagram
3.2 Reed-Solomon Decoder Block Diagram
4.1 Reed-Solomon Encoder Block Diagram
4.2 Reed-Solomon Encoder Using Two-Level ROM Multiplier Implementation
4.3 Combinational Circuit Multiplier Implementation
4.4 XOR Tree Implementation
5.1 Reed-Solomon Codes Syndrome Computation Circuit
5.2 Syndrome Processor Block Diagram
5.3 Syndrome Unit Multiplier Implementation Using Two-Level ROM
5.4 Syndrome Unit Implementation Using Combinational Logic Circuit Multiplier
6.1 Sigma Processor Inputs and Outputs
6.2 Microprocessor Based System Block Diagram
6.3 Sigma Processor Circuit Diagram
6.4 Intel 8085A CPU Functional Block Diagram
6.5 Intel 8085A Microprocessor and Address Latch
6.6 Buffer Pin Connections
6.7 Intel 8085A Pin Out Diagram
6.8 Program Memory Unit
6.9 The RAM Unit
6.10 Input Ports Configuration
6.11 Input Port Connections
6.12 Output Ports Configuration
6.13 Sigma Processor Data Structure
6.14 Main Program Flow Chart
6.15 The Sigma Routine Flow Chart
6.16 The Discrepancy Routine Flow Chart
7.1 Error-Location Processor Inputs and Outputs
7.2 Error-Location Processor Block Diagram
7.3 Root-Locator Circuit Diagram
8.1 Error-Magnitude Processor Inputs and Outputs
8.2 Input Ports Configuration
8.3 Output Ports Configuration
8.4 Error-Magnitude Processor Data Structure
8.5 Main Program Flow Chart
8.6 Error Evaluation Polynomial Routine Flow Chart
8.7 The Z(Xc) Routine Flow Chart
8.8 The Product Routine Flow Chart
9.1 Error-Correction Processor Circuit Diagram
9.2 The Queue Buffer Circuit Diagram
9.3 Overall Controller Control Signals
9.4 Control Signals Timing Diagram
ABSTRACT
REED-SOLOMON CODES ENCODER/DECODER
MICROPROCESSOR BASED SYSTEM
By
Halima Makady El Naga
Master of Science in Engineering
In this project, Reed-Solomon codes encoding and decoding
algorithms are first discussed. A complete logic circuit
design of the (255, 245) Reed-Solomon code encoder/decoder
microprocessor based system is then presented. This code is
defined over Galois Field $GF(2^8)$ and has the capability of
correcting up to five burst errors of 8 bits each or any
burst combination of up to a total length of 40 bits
provided they only affect a maximum of five individual
symbols (bytes). For better efficiency, this design features
a five-stage pipelined structured decoder which utilizes the
parallelism in the decoding algorithm. Berlekamp's iterative
algorithm is used to determine the coefficients of the error
location polynomial, and Chien's searching algorithm is used
to find its roots.
In this design, only off the shelf integrated circuits are
used. The Intel 8085 microprocessor has been utilized as a
data processor in two of the decoder pipelined stages.
Alternative design methods of various system parts have
been investigated and speed and time delay measurements of
these parts are included.
CHAPTER I
INTRODUCTION
1.1 Introduction:
The error detection and correction system, which is
responsible for the reliable recovery of digital data, has
become one of the important parts in the design of modern
digital data communication and storage systems. The reason
is partly the intolerance of either system to error and, in
some cases, the critical nature of the data.
Although several powerful error detecting and correcting
codes have been known for some time, they have not been
extensively used in these systems. On one hand, because of
the complexity of their encoder and decoder algorithms, the
amount of hardware required to implement their encoders and
decoders was too large and too expensive to build. On the
other hand, since relatively primitive single-short-burst
error correcting codes (e.g. Fire Codes) were sufficient to
achieve adequate system-level performance at that time, the
use of more powerful codes was not needed.
However, over the past two decades, the cost of solid state
electronic devices, particularly digital devices, has
decreased dramatically. This has stimulated the development
of automatic data processors, digital computers, long range
communications such as with satellites and peripheral
devices. This, in turn, has caused a dramatic increase in
the volume of data communicated between such machines. As
an example, the development of optical disks, with data
densities of 25,000 bits per inch and 10,000 tracks per inch
compared to 4,000 bits per inch and 200 tracks per inch for
magnetic disks, means that data densities have increased by
more than 250 times. As a result, and in spite of the
improvement in the storage media characteristics, the raw
error rates have very much increased [5].
Under these conditions of much higher raw error rates and
cheaper hardware, it has become necessary to consider more
powerful error detecting and correcting codes to maintain
and possibly improve reliability and performance. These
codes should be capable of correcting long and multiple
burst errors. Reed-Solomon codes have shown a large
cost/performance advantage over all other competitors, and
they have the capability of correcting multiple random as
well as long burst errors.
The design of a Reed-Solomon code encoder/decoder system
requires a very good knowledge of both digital hardware
design principles and the theory of error control coding in
general and decoding algorithms for algebraic codes in
particular. Although Reed-Solomon decoders have already
been built, only a small amount of literature is available.
The reason is that most digital design engineers may not
have the knowledge of the theory of error control coding,
and the few companies that have the capabilities of
designing Reed-Solomon decoders obviously reveal nothing
of the design in order to retain their hold on the growing
market.
1.2 Objectives:
In this project, the general procedure of encoding and
decoding of Reed-Solomon codes is discussed first and the
hardware required for implementing the encoder and decoder
is presented. As a typical design example, a complete
detailed design (using off the shelf integrated circuits)
of an encoder/decoder system of a selected Reed-Solomon
code is presented. The code selected is the (255,245)
Reed-Solomon code defined over Galois Field $GF(2^8)$. This
code has a data block length of 255 symbols. Each symbol is
represented by an 8-bit byte making the total length 2040
(255 x 8) binary bits. It has the capability of correcting
up to five burst errors of 8 bits each or any burst error
combination of up to a total length of 40 bits provided
they only affect a maximum of five individual symbols.
For better efficiency, a pipelined structured decoder is
considered. In the decoder design, the Intel 8085
microprocessor has been utilized as a data processor and a
system controller.
Speed and time delay measurements of various system parts
are also included.
1.3 Project Outline:
Chapter 2 introduces Reed-Solomon codes and their encoding
and decoding algorithms. Chapter 3 provides a Reed-Solomon
encoder/decoder system specification and general hardware
description. In Chapter 4, a complete hardware encoder
design is presented and various implementation methods are
reviewed. The design of various stages of the pipelined
decoder is discussed in chapters 5 through 9. Finally,
Chapter 10 presents the results and conclusions of the
project.
CHAPTER II
REED-SOLOMON CODES
2.1 Hamming Codes:
Hamming codes are the first class of cyclic codes devised
for error correction [5]. These codes and their variations
have been widely used for error control in digital
communication and data storage systems.
For any positive integer $m \ge 3$, there exists a Hamming code
with the following parameters:
Code length: $n = 2^m - 1$
Number of information symbols: $k = 2^m - m - 1$
Number of parity check symbols: $n - k = m$
Minimum distance: $d_{min} = 3$
Error correcting capability: $t = 1$
Hamming codes are single-bit error correcting codes, and
can be extended to correct single-bit and detect double-bit
errors. A cyclic Hamming code of length $2^m - 1$ is generated
by a primitive polynomial $p(X)$ of degree $m$.
2.2 General BCH Codes:
The Bose, Chaudhuri and Hocquenghem (BCH) codes form a
class of powerful random error correcting cyclic codes.
These codes are a generalization of Hamming codes for
correcting multiple errors. In general, BCH codes are
defined as follows:
If $p$ is a prime number and $q$ is any power of $p$,
there are codes with symbols from the elements of
Galois Field $GF(q)$. These codes are called q-ary
codes. An (n,k) linear code with symbols from $GF(q)$
is a k-dimensional subspace of the vector space of
all n-tuples over $GF(q)$. A q-ary (n,k) code is
generated by a polynomial of degree (n-k) with
coefficients from $GF(q)$, which is a factor of $X^n - 1$.
For any positive integers $s$ and $t$, there exists a
q-ary BCH code with the following parameters:
Block length: $n = q^s - 1$
Number of parity check digits: $n - k \le 2st$
Minimum distance: $d_{min} \ge 2t + 1$
This code is capable of correcting any combination of $t$ or
fewer errors in a code block of $n = q^s - 1$ digits.
Let $\alpha$ be a primitive element in $GF(q^s)$. The generator
polynomial $g(x)$ of the t-error correcting BCH code is the
lowest-degree polynomial with coefficients from $GF(q)$ which
has
$$\alpha, \alpha^2, \alpha^3, \ldots, \alpha^{2t}$$
as its roots. Let $\phi_i(x)$ be the minimal polynomial of $\alpha^i$,
then,
$$g(x) = \mathrm{LCM}\{\phi_1(x), \phi_2(x), \ldots, \phi_{2t}(x)\}$$
Since the degree of each minimal polynomial is $s$ or less,
the degree of $g(x)$ is at most $2st$.
A special subclass of the BCH codes is given by $q = 2$. These
codes are called binary BCH codes. For a primitive BCH
code, $n$ is restricted to be $2^m - 1$; for a nonprimitive BCH
code, $n$ may be any other odd number.
In addition to the binary BCH codes, there are also
nonbinary codes. Among the nonbinary BCH codes, the most
important subclass is the class of Reed-Solomon codes which
are defined in the following section.
2.3 Reed-Solomon Codes:
Reed-Solomon codes are a special subclass of q-ary BCH
codes for which $s = 1$. A t-error-correcting Reed-Solomon code
with symbols from $GF(q)$ has the following parameters:
Block length: $n = q - 1$
Number of parity check digits: $n - k = 2t$
Minimum distance: $d_{min} = 2t + 1$
In this project, only Reed-Solomon codes with symbols from
the Galois Field $GF(2^m)$ will be considered. Since the main
goal of this project is to design complete hardware and
software systems to encode and decode Reed-Solomon codes,
Reed-Solomon codes encoding and decoding algorithms are
only very briefly discussed without any formal proof. The
reader interested in details of these algorithms is
referred to references [3] and [13].
Let $\alpha$ be a primitive element in $GF(2^m)$. The generator
polynomial of a primitive t-error-correcting Reed-Solomon
code of length $2^m - 1$ is:
$$g(X) = (X + \alpha)(X + \alpha^2) \cdots (X + \alpha^{2t})$$
$$g(X) = g_0 + g_1 X + \cdots + g_{2t-1} X^{2t-1} + X^{2t}$$
This is an (n, n-2t) code that consists of $n$ symbols and has
$d_{min} - 1$ parity check symbols. Since $q = 2^m$, each q-ary
symbol can be expressed as an m-tuple over $GF(2)$.
Consequently a t-error correcting $(2^m - 1,\ 2^m - 1 - 2t)$
Reed-Solomon code over $GF(2^m)$ can be regarded as an
$[m(2^m - 1),\ m(2^m - 1 - 2t)]$ code over $GF(2)$ which is capable
of correcting any error pattern whose nonzero digits are
confined to $t$ m-symbol blocks. Thus Reed-Solomon codes are
very effective in correcting multiple burst errors.
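The parameter relations above can be checked with a few lines of arithmetic. The following is a minimal sketch; the function name `rs_parameters` is introduced here for illustration only and is not part of the thesis.

```python
def rs_parameters(m, t):
    """Return (n, k, n_bits, k_bits) for a t-error-correcting
    Reed-Solomon code with symbols from GF(2^m)."""
    n = 2**m - 1          # block length in symbols
    k = n - 2 * t         # information symbols (n - k = 2t parity symbols)
    return n, k, m * n, m * k   # last two: lengths of the binary image

# The code designed in this project: m = 8, t = 5
print(rs_parameters(8, 5))
```

For m = 8 and t = 5 this gives the (255, 245) code, viewed over GF(2) as a (2040, 1960) code.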
2.4 Encoding of Reed-Solomon Codes:
Given the generator polynomial $g(X)$ of an (n, n-2t)
Reed-Solomon code, the code can be encoded into a
systematic form as follows. Let
$$U(X) = u_0 + u_1 X + \cdots + u_{k-1} X^{k-1}$$
be the message to be encoded, where $k = n - 2t$. Multiplying
$U(X)$ by $X^{2t}$ we obtain a polynomial of degree $n-1$ or less:
$$X^{2t} U(X) = u_0 X^{2t} + u_1 X^{2t+1} + \cdots + u_{k-1} X^{n-1}$$
Dividing $X^{2t} U(X)$ by the generator polynomial $g(X)$, we
have:
$$X^{2t} U(X) = a(X)\, g(X) + b(X) \qquad (2.1)$$
where $a(X)$ and $b(X)$ are the quotient and remainder
respectively. Since the degree of $g(X)$ is $2t$, the degree of
$b(X)$ must be $2t-1$ or less, that is
$$b(X) = b_0 + b_1 X + \cdots + b_{2t-1} X^{2t-1}$$
Rearranging Equation (2.1) we obtain a polynomial $V(X)$
which is a multiple of $g(X)$, therefore it is a code
polynomial:
$$V(X) = b(X) + X^{2t} U(X) = a(X)\, g(X)$$
$$V(X) = b_0 + b_1 X + \cdots + b_{2t-1} X^{2t-1} + u_0 X^{2t} + \cdots + u_{k-1} X^{n-1}$$
This polynomial corresponds to the code vector:
$$(b_0, b_1, \ldots, b_{2t-1}, u_0, u_1, \ldots, u_{k-1})$$
The first $2t$ elements are the parity check symbols and the
remaining $k$ symbols are the information symbols.
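The division in Equation (2.1) can be sketched in software. The example below is an illustration under stated assumptions, not the thesis hardware: it assumes the field $GF(2^8)$ generated by $X^8 + X^4 + X^3 + X^2 + 1$ (the polynomial chosen in Chapter 3), stores each symbol as an integer whose bit i is the coefficient of $\alpha^i$, and keeps polynomials as lists with the low-order coefficient first. All function names are hypothetical.

```python
PRIM = 0x11D  # assumed field polynomial X^8 + X^4 + X^3 + X^2 + 1

# Power (EXP) and logarithm (LOG) tables of GF(2^8)
EXP, LOG = [0] * 510, [0] * 256
x = 1
for i in range(255):
    EXP[i], LOG[x] = x, i
    x <<= 1                 # multiply by alpha
    if x & 0x100:
        x ^= PRIM           # reduce modulo the field polynomial
for i in range(255, 510):
    EXP[i] = EXP[i - 255]   # doubled table avoids a modulo in gf_mul

def gf_mul(a, b):
    return 0 if a == 0 or b == 0 else EXP[LOG[a] + LOG[b]]

def generator_poly(t):
    """g(X) = (X + alpha)(X + alpha^2)...(X + alpha^2t), low order first."""
    g = [1]
    for i in range(1, 2 * t + 1):
        nxt = [0] * (len(g) + 1)
        for d, c in enumerate(g):
            nxt[d] ^= gf_mul(c, EXP[i])   # c * alpha^i
            nxt[d + 1] ^= c               # c * X
        g = nxt
    return g

def encode_systematic(u, t):
    """Return (b_0..b_{2t-1}, u_0..u_{k-1}) with b(X) = X^2t U(X) mod g(X)."""
    g = generator_poly(t)
    work = [0] * (2 * t) + list(u)        # coefficients of X^2t U(X)
    for d in range(len(work) - 1, 2 * t - 1, -1):
        q = work[d]                       # next quotient coefficient
        if q:
            for j, gc in enumerate(g):
                work[d - 2 * t + j] ^= gf_mul(q, gc)
    return work[:2 * t] + list(u)

def poly_eval(p, x):
    """Evaluate p(X) at x by Horner's rule."""
    acc = 0
    for c in reversed(p):
        acc = gf_mul(acc, x) ^ c
    return acc

message = list(range(1, 246))             # arbitrary 245 information symbols
codeword = encode_systematic(message, 5)
print(len(codeword), [poly_eval(codeword, EXP[i]) for i in range(1, 11)])
```

Because $V(X)$ is a multiple of $g(X)$, evaluating the code word at $\alpha, \ldots, \alpha^{10}$ gives zero in every position, which is a convenient self-check.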
2.5 Decoding of Reed-Solomon Codes:
Let
$$V(X) = v_0 + v_1 X + \cdots + v_{n-1} X^{n-1}$$
be the transmitted code vector and
$$r(X) = r_0 + r_1 X + \cdots + r_{n-1} X^{n-1}$$
be the received vector. Then, the error pattern is
$$e(X) = r(X) - V(X) = e_0 + e_1 X + \cdots + e_{n-1} X^{n-1}$$
where $e_i = r_i - v_i$ is a symbol from $GF(2^m)$. Suppose that
the error pattern $e(X)$ has $f$ errors at locations
$X^{j_1}, X^{j_2}, \ldots, X^{j_f}$, where $0 \le j_1 < j_2 < \cdots < j_f \le n-1$,
and has error magnitudes $e_{j_1}, e_{j_2}, \ldots, e_{j_f}$; then
$$e(X) = e_{j_1} X^{j_1} + e_{j_2} X^{j_2} + \cdots + e_{j_f} X^{j_f} \qquad (2.2)$$
Since $\alpha, \alpha^2, \ldots, \alpha^{2t}$ are roots of each code polynomial,
$V(\alpha^i) = 0$ for $1 \le i \le 2t$. The ith component of the
syndrome is given by:
$$S_i = r(\alpha^i) = V(\alpha^i) + e(\alpha^i) = e(\alpha^i) \qquad (2.3)$$
From (2.2) and (2.3) we obtain the following equations:
$$S_i = e_{j_1} \alpha^{i j_1} + e_{j_2} \alpha^{i j_2} + \cdots + e_{j_f} \alpha^{i j_f}, \quad 1 \le i \le 2t \qquad (2.4)$$
where $e_{j_l}$ and $\alpha^{j_l}$ are unknown. Any method of solving these
equations is the basis for an error correction procedure.
These equations are nonlinear. There are many possible, but
finitely many, solutions and the correct solution is the one
that yields an error pattern with the smallest number of
errors. This error pattern is the most probable error pattern
caused by the channel noise. In the following, a method of
solving these equations is discussed. Since the location of
an error is given in terms of $\alpha^{j_l}$, this is called the error
location number.
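Equation (2.3) says that the syndrome depends only on the error pattern. A minimal sketch of the syndrome computation follows, assuming the $GF(2^8)$ field of Chapter 3 with symbols stored as integers (bit i is the coefficient of $\alpha^i$); the helper names are illustrative, not taken from the thesis.

```python
PRIM = 0x11D  # assumed field polynomial X^8 + X^4 + X^3 + X^2 + 1

EXP, LOG = [0] * 510, [0] * 256
x = 1
for i in range(255):
    EXP[i], LOG[x] = x, i
    x <<= 1
    if x & 0x100:
        x ^= PRIM
for i in range(255, 510):
    EXP[i] = EXP[i - 255]

def gf_mul(a, b):
    return 0 if a == 0 or b == 0 else EXP[LOG[a] + LOG[b]]

def poly_eval(p, x):
    """Evaluate p(X) (low-order coefficient first) at x by Horner's rule."""
    acc = 0
    for c in reversed(p):
        acc = gf_mul(acc, x) ^ c
    return acc

def syndromes(r, t):
    """S_i = r(alpha^i) for i = 1..2t."""
    return [poly_eval(r, EXP[i]) for i in range(1, 2 * t + 1)]

# The all-zero word is a code word, so r(X) below equals e(X):
r = [0] * 255
r[3], r[200] = 0x05, 0xA0     # two symbol errors (arbitrary magnitudes)
S = syndromes(r, 5)

# Each S_i equals e_3 * alpha^(3i) + e_200 * alpha^(200i)
expected = [gf_mul(0x05, EXP[(3 * i) % 255]) ^ gf_mul(0xA0, EXP[(200 * i) % 255])
            for i in range(1, 11)]
print(S == expected)
```

The comparison at the end illustrates that each syndrome component is the weighted power sum of the error location numbers.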
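Let
$$\beta_l = \alpha^{j_l}, \quad 1 \le l \le f$$
then,
$$S_1 = r(\alpha) = e_{j_1} \beta_1 + e_{j_2} \beta_2 + \cdots + e_{j_f} \beta_f$$
$$S_2 = r(\alpha^2) = e_{j_1} \beta_1^2 + e_{j_2} \beta_2^2 + \cdots + e_{j_f} \beta_f^2 \qquad (2.5)$$
$$\vdots$$
$$S_{2t} = r(\alpha^{2t}) = e_{j_1} \beta_1^{2t} + e_{j_2} \beta_2^{2t} + \cdots + e_{j_f} \beta_f^{2t}$$
These equations are called power-sum symmetric functions.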
Now, we define a polynomial $\sigma(X)$ as
$$\sigma(X) = (1 + \beta_1 X)(1 + \beta_2 X) \cdots (1 + \beta_f X)$$
$$\sigma(X) = \sigma_0 + \sigma_1 X + \sigma_2 X^2 + \cdots + \sigma_f X^f$$
The roots of $\sigma(X)$ are $\beta_1^{-1}, \beta_2^{-1}, \ldots, \beta_f^{-1}$, which are
the inverses of the error location numbers. $\sigma(X)$ is called
the error location polynomial. The $\sigma_i$'s are related to the
$\beta_l$'s by the following equations:
$$\sigma_0 = 1$$
$$\sigma_1 = \beta_1 + \beta_2 + \cdots + \beta_f$$
$$\sigma_2 = \beta_1 \beta_2 + \beta_2 \beta_3 + \cdots + \beta_{f-1} \beta_f \qquad (2.6)$$
$$\vdots$$
$$\sigma_f = \beta_1 \beta_2 \beta_3 \cdots \beta_f$$
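To make the relation between $\sigma(X)$ and the error location numbers concrete, the sketch below builds $\sigma(X)$ from a set of known error positions and then recovers the positions by trying every field element as a root. This exhaustive root search is only an illustration in the spirit of Chien's search, not the decoder's Sigma processor, and it assumes the $GF(2^8)$ field of Chapter 3; all names are hypothetical.

```python
PRIM = 0x11D  # assumed field polynomial X^8 + X^4 + X^3 + X^2 + 1

EXP, LOG = [0] * 510, [0] * 256
x = 1
for i in range(255):
    EXP[i], LOG[x] = x, i
    x <<= 1
    if x & 0x100:
        x ^= PRIM
for i in range(255, 510):
    EXP[i] = EXP[i - 255]

def gf_mul(a, b):
    return 0 if a == 0 or b == 0 else EXP[LOG[a] + LOG[b]]

def poly_eval(p, x):
    acc = 0
    for c in reversed(p):
        acc = gf_mul(acc, x) ^ c
    return acc

def sigma_from_positions(positions):
    """sigma(X) = product of (1 + beta_l X) with beta_l = alpha^(j_l)."""
    sigma = [1]
    for j in positions:
        beta = EXP[j % 255]
        nxt = sigma + [0]                     # sigma(X) * 1
        for d, c in enumerate(sigma):
            nxt[d + 1] ^= gf_mul(c, beta)     # + sigma(X) * beta * X
        sigma = nxt
    return sigma

def find_positions(sigma):
    """Position j is in error when sigma(alpha^-j) = 0."""
    return [j for j in range(255)
            if poly_eval(sigma, EXP[(255 - j) % 255]) == 0]

sigma = sigma_from_positions([3, 17, 200])
print(sigma[0], LOG[sigma[-1]], find_positions(sigma))
```

The constant term is 1 and the highest coefficient is the product of the $\beta_l$'s, as stated in Equation (2.6).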
The $\sigma_i$'s are known as elementary symmetric functions of the
$\beta_l$'s. From equations (2.5) and (2.6), the $\sigma_i$'s are related to
the syndrome components $S_i$ by the following Newton's
identities:
$$S_1 + \sigma_1 = 0$$
$$S_2 + \sigma_1 S_1 + 2\sigma_2 = 0$$
$$S_3 + \sigma_1 S_2 + \sigma_2 S_1 + 3\sigma_3 = 0 \qquad (2.7)$$
$$\vdots$$
$$S_f + \sigma_1 S_{f-1} + \cdots + \sigma_{f-1} S_1 + f\sigma_f = 0$$
$$S_{f+1} + \sigma_1 S_f + \cdots + \sigma_{f-1} S_2 + \sigma_f S_1 = 0$$
The error correcting procedure for Reed-Solomon codes
consists of the following four major steps:
1- Compute the syndrome $S = (S_1, S_2, \ldots, S_{2t})$ from the
received polynomial $r(X)$ (Equation 2.5),
2- Determine the error-location polynomial $\sigma(X)$ from the
syndrome components $(S_1, S_2, \ldots, S_{2t})$, i.e. calculate the
$\sigma_i$'s from the $S_i$'s (Equation 2.6),
3- Determine the error-location numbers $\beta_1, \beta_2, \ldots, \beta_f$
by finding the roots of $\sigma(X)$ and taking their inverses, and
4- Substitute the error location numbers into the error
polynomials and solve for the corresponding error values
$e_{j_l}$. Knowledge of the values of $\beta_l$ and $e_{j_l}$ is sufficient
for error correction.
CHAPTER III
SYSTEM SPECIFICATION AND GENERAL DESCRIPTION
3.1 System Specification:
In this chapter, the overall design of the Reed-Solomon
code system will be discussed. Although the design
procedures can be applied to a Reed-Solomon code of any
dimension, the detailed design will be given for a code
that has the following parameters:
$$m = 8$$
$$n = 2^m - 1 = 255$$
$$n - k = 2t = 10$$
This code is the (255, 245) Reed-Solomon code. It is
defined over $GF(2^8)$. Thus, the code block length is 255
symbols where each symbol is represented by eight binary
digits (byte). Each data block contains 245 information
symbols (245x8=1960 binary bits) and 10 parity check
symbols (10x8=80 binary bits). This code has the
capability of correcting up to five burst errors of 8 bits
each or any burst error combination of up to a total length
of 40 bits providing they only affect a maximum of five
individual symbols.
The $GF(2^8)$ field elements are generated by the following
generator polynomial $P(X)$, which is a primitive polynomial
of degree 8:
$$P(X) = X^8 + X^4 + X^3 + X^2 + 1$$
Let the primitive element $\alpha$ be a root of $P(X)$. Then
$$P(\alpha) = \alpha^8 + \alpha^4 + \alpha^3 + \alpha^2 + 1 = 0$$
Since $\alpha$ is a primitive element, it generates all the
non-zero elements $\alpha^0, \alpha^1, \alpha^2, \ldots, \alpha^{254}$ of $GF(2^8)$. Any
element of the field $GF(2^8)$ can be represented as a
polynomial of $\alpha$; for example, the field element $\beta$ can, in
general, be represented as:
$$\beta = \alpha^i = a_0 + a_1 \alpha + a_2 \alpha^2 + a_3 \alpha^3 + a_4 \alpha^4 + a_5 \alpha^5 + a_6 \alpha^6 + a_7 \alpha^7$$
where each coefficient $a_0, \ldots, a_7$ is either 0 or 1.
The field element $\beta$ can also be represented by an ordered
sequence of the 8 coefficients of the polynomial
representation as follows:
$$(a_0, a_1, a_2, a_3, a_4, a_5, a_6, a_7)$$
This representation is called the vector representation.
The zero element of $GF(2^8)$ is represented by the all zero
8-tuple. A computer program has been written to generate
the field elements of $GF(2^8)$, and the output of this
program, which shows the power representation and the
corresponding vector representation of each field element,
is in Table 3.1.
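The thesis does not list the program that produced Table 3.1. The sketch below is one possible reconstruction under stated assumptions: the element $\alpha^i$ is obtained by repeated multiplication by $\alpha$ with reduction by $P(X)$, each element is held as an integer whose bit i is $a_i$, and the vector is printed in the order $(a_0, \ldots, a_7)$ used in the text. The function names are illustrative.

```python
PRIM = 0x11D  # P(X) = X^8 + X^4 + X^3 + X^2 + 1, bit i = coefficient of X^i

def powers_of_alpha():
    """Return [alpha^0, alpha^1, ..., alpha^254] as integers."""
    elements, x = [], 1
    for _ in range(255):
        elements.append(x)
        x <<= 1                # multiply by alpha
        if x & 0x100:
            x ^= PRIM          # replace alpha^8 by alpha^4 + alpha^3 + alpha^2 + 1
    return elements

def vector(x):
    """Vector representation (a0 a1 ... a7) as a string."""
    return format(x, "08b")[::-1]

def field_table():
    """Rows of (label, vector): the zero element, then alpha^0..alpha^254."""
    rows = [("0", vector(0))]
    rows += [(f"alpha^{i}", vector(e)) for i, e in enumerate(powers_of_alpha())]
    return rows

table = field_table()
for label, vec in table[:12]:
    print(f"{label:>10}  {vec}")
```

With this convention $\alpha^8$ prints as 10111000, that is $1 + \alpha^2 + \alpha^3 + \alpha^4$, in agreement with $P(\alpha) = 0$.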
To add any two field elements, simply add the corresponding
components of their vector representation (modulo-2
addition). As an example,
$$\alpha^{10} + \alpha^{90} = (0,0,1,0,1,1,1,0) + (1,1,1,1,1,0,1,1) = (1,1,0,1,0,1,0,1)$$
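Component-wise modulo-2 addition is a bitwise exclusive-OR when each element is stored as a byte. The sketch below reproduces the example above; it assumes bit i of the byte holds $a_i$, and the helper names are illustrative.

```python
PRIM = 0x11D  # X^8 + X^4 + X^3 + X^2 + 1

def alpha_pow(i):
    """alpha^i as an integer, by repeated multiplication by alpha."""
    x = 1
    for _ in range(i % 255):
        x <<= 1
        if x & 0x100:
            x ^= PRIM
    return x

def vector(x):
    """Vector representation (a0, ..., a7) as a tuple of bits."""
    return tuple((x >> i) & 1 for i in range(8))

def gf_add(a, b):
    return a ^ b               # modulo-2 addition of all eight components

a, b = alpha_pow(10), alpha_pow(90)
print(vector(a), vector(b), vector(gf_add(a, b)))
```

Adding an element to itself gives the zero element, so addition and subtraction coincide in this field.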
To multiply any two field elements, simply add their
exponents and use the fact that $\alpha^{255} = 1$. As an example,
0
I
2
3
4
5
6
7
IJ
9
10
l I
l2
I J
••
15
16
17
18
19
20
21
22
23
24
25
26
27
28
2'1
30
31
.12
33
34
35
36
37
)6
39
40
41
42
43
44
45
46
47
48
49
50
51
52
5.l
54
55
56
.51'·
58
59
60
61
62
00000000
I 0000000
0 I 000000
00100000
OOOIOOQO
00001000
00000 11)0
00000010
00000001
10111000
01011100
00101110
00010111
10110011
11&00001
11001000
01100 I 00
00110010
00011001
10110100
01011010
OOlOllOI
10101110
010101ll
10010011
lll10001
11000000
OltOOOOO
00110000
OOOllOuO
00001100
00000110
OOOOOOll
101llOOt
11100100
01110010
00111001
10100100
01010010
00101001
10101100
01010110
00101011
10101101
11101110
01110111
l 00000 I 1
I II II 00 l
11000100
01100010
00110001
I 01 00000
01010000
00101000
00010100
00001010
00000101
IOIIlOIO
01011101
I 0010 I I 0
01001011
10011101
11110110
01111011
63
64
65
66
67
68
6Q
10
71
12
73
74
75
76
77
78
7Q
80
I' I
ez
e"
e5
8.1
M
87
es
f'9
90
.. I
Q2
QJ
...
Q5
96
<;7
qs
c;q
100
I0l
102
I O..J
l 04
I 05
106
107
lOf:t
I OQ
I 10
l I I
I 12
113
114
l 15
1 16
t 11
1 18
I 19
1 20
121
I 22
IZJ
124
125
126
10001)101
11111\JIO
0 II I l I 01
10000110
01000011
10011001
llll'liOO
011 Ill) I 0
00111101
101 00 I I 0
01010::111
10010001
11110000
011 II 000
00 I II I 00
0 00 I I I I 0
00001111
10111111
I 1 1 001 11
11001011
I I0 t I I0I
II') I 0 I 10
01101011
10001101
11111110 1•
01111111
10000111
11111011
11000101
11011010
01101101
IOOOillO
01000111
100 II 0 I I
Ill I 0 I 0 I
11000010
01100001
I 0001000
0 I 000 I 00
00100010
00010001
10110000
01011000
00101100
00010110
00001011
10111101
11100110
011 I 00 II
I 0000001
lllltOOO
01111100
00 I 11 I I 0
00011111
10110111
11100011
11001001
I I 0 I I I 00
01101110
00110111
10100011
11101001
11001100
0 l I 00 1 I 0
00110011
10100001
11101000
01110100
0011101()
0001110 I
10110110
01011011
10010101
lliiOOIO
OlliiOOI
10000100
01000010
00100001
1tJIOIOOO
01010100
00101010
00010101
10110010
01011001
10010100
01001010
00100101
10101010
01010101
10010010
01001001
10011100
01001110
00100111
10101011
11101101
11001110
01100111
10001011
llllll01
11000110
01100011
10001001
llllllOO
Oll1ll10
00111111
10 I 001 I l
11101011
11001101
11011110
01101111
10001111
lllllllt
11000111
11011011
11010101
l1010010
01101001
10001100
01000110
0010001 1
10101001
11101100
OlllOllO
OOiliOll
10100101
11101010
01110101
12 7
126
12<1
L'IO
I3 l
132
133
lJ4
135
136
137
136
139
140
14 I
142
143
144
145
146
147
146
149
150
151
152
15)
154
155
156
157
156
15Q
160
161
162
163
164
165
166
167
II'> I\
169
1 70
I 71
I 72
l'Tl
I 74
175
176
177
178
179
180
181
182
18.31
184
185
• 86
187
188
189
190
Table 3.1 GF(2^8) Elements Generated By p(X) = 1 + X^2 + X^3 + X^4 + X^8
191
192
193
1 'l4
1<>5
1 <;6
197
1<>8
199
200
201
202
203
204
205
206
207
208
209
210
2 II
212
213
214
215
216
217
218
219
220
221
222
223
224
225
226
227
226
229
230
231
2J2
233
234
235
236
237
238
239
240
241
242
243
244
245
246
247
248
249
2'50
251
252
2 '53
254
10000010
01000001
IOOliOOQ
0 I 00 I I 00
00100110
00010011
10110001
Ill 00000
01110000
00111000
00011100
00001110
00000111
10lll011
11100101
11001010
01100101
10001010
01000101
10011010
01001101
100 I I 1 I 0
0 IOOilll
1 00 I I 1 II
11110111
11000011
11011001
11010100
01101010
0011 01 01
10100010
01010001
10010000
01001000
00100100
00010010
00001001
eI 0 l I l I 00
01011110
00101111
10101111
11101.111
11001111
11011111
11010111
11010011
11010001
11010000
01101000
00110100
00011010
00001101
10111110
010111&1
10010111
lll100ll
11000001
11011000
Oil Oil 00
001 I 0110
000110 l1
10110101
Ill 000 tO
01110001
$$\alpha^{10} \cdot \alpha^{90} = \alpha^{100}$$
and
$$\alpha^{190} \cdot \alpha^{90} = \alpha^{280} = \alpha^{25}$$
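The exponent rule can be sketched with a pair of lookup tables, one from exponent to element and one back. This is only an illustration of the arithmetic, with table and function names chosen here; the hardware multipliers are the subject of Chapter 4.

```python
PRIM = 0x11D  # X^8 + X^4 + X^3 + X^2 + 1

EXP, LOG = [0] * 255, [0] * 256   # EXP[i] = alpha^i, LOG[alpha^i] = i
x = 1
for i in range(255):
    EXP[i], LOG[x] = x, i
    x <<= 1
    if x & 0x100:
        x ^= PRIM

def gf_mul(a, b):
    """Multiply by adding exponents modulo 255 (alpha^255 = 1)."""
    if a == 0 or b == 0:
        return 0                  # zero has no power representation
    return EXP[(LOG[a] + LOG[b]) % 255]

print(LOG[gf_mul(EXP[10], EXP[90])])    # exponent of alpha^10 * alpha^90
print(LOG[gf_mul(EXP[190], EXP[90])])   # exponent of alpha^190 * alpha^90
```

The zero element needs the explicit test because it is not a power of $\alpha$.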
The generator polynomial for the (255,245) code is given
by:
$$g(X) = (X+\alpha)(X+\alpha^2)(X+\alpha^3)(X+\alpha^4)(X+\alpha^5)(X+\alpha^6)(X+\alpha^7)(X+\alpha^8)(X+\alpha^9)(X+\alpha^{10})$$
Using Table 3.1, $g(X)$ can be expanded to:
$$g(X) = X^{10} + \alpha^{252} X^9 + \alpha^{69} X^8 + \alpha^{49} X^7 + \alpha^{65} X^6 + \alpha^{123} X^5 + \alpha^{76} X^4 + \alpha^{71} X^3 + \alpha^{102} X^2 + \alpha^{41} X + \alpha^{55}$$
Although a Reed-Solomon code with smaller dimensions
could have been selected, this code in particular has been
chosen because of its suitable symbol size that matches the
very commonly used 8-bit data byte.
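The expansion of $g(X)$ can be checked by multiplying out the ten factors in software. The following sketch assumes the field polynomial $X^8 + X^4 + X^3 + X^2 + 1$ and reports each coefficient as an exponent of $\alpha$; the function names are illustrative.

```python
PRIM = 0x11D  # X^8 + X^4 + X^3 + X^2 + 1

EXP, LOG = [0] * 510, [0] * 256
x = 1
for i in range(255):
    EXP[i], LOG[x] = x, i
    x <<= 1
    if x & 0x100:
        x ^= PRIM
for i in range(255, 510):
    EXP[i] = EXP[i - 255]

def gf_mul(a, b):
    return 0 if a == 0 or b == 0 else EXP[LOG[a] + LOG[b]]

def generator_poly(num_roots=10):
    """Coefficients g_0..g_10 of (X + alpha)(X + alpha^2)...(X + alpha^10)."""
    g = [1]
    for i in range(1, num_roots + 1):
        nxt = [0] * (len(g) + 1)
        for d, c in enumerate(g):
            nxt[d] ^= gf_mul(c, EXP[i])   # c * alpha^i
            nxt[d + 1] ^= c               # c * X
        g = nxt
    return g

def poly_eval(p, x):
    acc = 0
    for c in reversed(p):
        acc = gf_mul(acc, x) ^ c
    return acc

g = generator_poly()
# Exponent of alpha for the coefficients of X^10 down to X^0
print([LOG[c] for c in reversed(g)])
```

The leading coefficient is 1 (exponent 0), the constant term is $\alpha^{1+2+\cdots+10} = \alpha^{55}$, and the $X^9$ coefficient is the sum $\alpha + \alpha^2 + \cdots + \alpha^{10} = \alpha^{252}$.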
Although the natural length of this code is 255 symbols
(2040 binary bits), it can easily be shortened to any length
to match any system specifications without any major change
in the encoder/decoder hardware circuitry.
3.2 System Description:
In the following two sections the encoder and decoder of
the selected Reed-Solomon code are described.
3.2.1 The Encoder:
The encoder accepts a data message block of 245 symbols
(1960 bits) as an input and generates a code word of 255
symbols (2040 bits) as an output. While input message
symbols are transmitted to the encoder output they are also
shifted into a linear feedback shift register that deals
with elements from $GF(2^8)$. As soon as all the 245 message
symbols are shifted out, the contents of the shift register
will represent the ten parity check symbols. These ten
parity check symbols will then be shifted out, following
the 245 information symbols, to form the 255 code word
symbols. A block diagram of this encoder is shown in
Figure 3.1.
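The shift-register behaviour described above can be modelled in software. The sketch below is a simulation under stated assumptions rather than the hardware design: it assumes the highest-order message symbol $u_{244}$ enters first (the text does not state the order), uses the field polynomial $X^8 + X^4 + X^3 + X^2 + 1$, and returns the code vector in the order $(b_0, \ldots, b_9, u_0, \ldots, u_{244})$. All names are illustrative.

```python
PRIM = 0x11D  # X^8 + X^4 + X^3 + X^2 + 1

EXP, LOG = [0] * 510, [0] * 256
x = 1
for i in range(255):
    EXP[i], LOG[x] = x, i
    x <<= 1
    if x & 0x100:
        x ^= PRIM
for i in range(255, 510):
    EXP[i] = EXP[i - 255]

def gf_mul(a, b):
    return 0 if a == 0 or b == 0 else EXP[LOG[a] + LOG[b]]

def generator_poly(num_roots=10):
    g = [1]
    for i in range(1, num_roots + 1):
        nxt = [0] * (len(g) + 1)
        for d, c in enumerate(g):
            nxt[d] ^= gf_mul(c, EXP[i])
            nxt[d + 1] ^= c
        g = nxt
    return g

def lfsr_encode(message):
    """Ten-stage feedback shift register computing X^10 U(X) mod g(X)."""
    g = generator_poly()
    reg = [0] * 10                       # stages b_0..b_9, cleared at start
    for u in reversed(message):          # assumption: u_244 is shifted in first
        feedback = u ^ reg[9]
        for i in range(9, 0, -1):
            reg[i] = reg[i - 1] ^ gf_mul(feedback, g[i])
        reg[0] = gf_mul(feedback, g[0])
    return reg + list(message)           # parity symbols, then information

def poly_eval(p, x):
    acc = 0
    for c in reversed(p):
        acc = gf_mul(acc, x) ^ c
    return acc

message = [(7 * j + 1) % 256 for j in range(245)]   # arbitrary test data
codeword = lfsr_encode(message)
print(all(poly_eval(codeword, EXP[i]) == 0 for i in range(1, 11)))
```

After the 245 shifts the register holds the remainder $b(X)$, so the resulting word has $\alpha, \ldots, \alpha^{10}$ as roots.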
3.2.2 The Decoder:
As discussed in section 2.5, the decoding procedure of the
Reed-Solomon codes consists of the following five major
steps:
Step 1. Compute the syndrome,
Step 2. Determine the error-location polynomial,
Step 3. Determine the error-location numbers,
Step 4. Compute the magnitudes of the errors and
Step 5. Using the error-location numbers and the
magnitudes of the errors, correct the received
vector.
Since each step has a specific function, and the output of
each step is the input of the following step, a pipelined
structured decoder would be the most efficient decoder in
this case.
A block diagram of this decoder is shown in
Figure 3.2.
This decoder consists of five main stages.
At any time,
each stage will be processing information that belongs to a
different data block, i.e. five data blocks will be decoded
concurrently, and each one will be at a different decoding
step. The jth received data block r (X) is first shifted
j
17
U(X)
ENCODER
Figure 3.1 Encoder Block Diagram
V(X)
Queue Buffer
cr.
1
r(X)
Syndrome
Processor
s.1
Sigma
Processor
s.l
-
()'":
1
ErrorLocation
Processor
X.1
s.1
- Error-
--..... Magnitude
Processor
X.1
s.l
e(X).I:~
ErrorCorrection
rr rocessor
Figure 3.2 Reed-Solomon Decoder Block Diagram
co
19
into the syndrome processor as well as the Queue buffer.
The syndrome processor then computes the 10 syndrome
components s 1 s 1 s , ••• ,s • These components are then
1
2
3
10
loaded into the Sigma processor. While the Sigma processor
is determining the error-location polynomial o-(X) of this
jth received data block, the Syndrome processor will be
computing the ten syndrome components of the (j+l)th data
block.
The Error Location processor will then find the
error location numbers (Xi) which are the reciprocals of
the roots of o-(X).
(e )
The
error
magnitudes
the
computed
by
are
i
Error-Magnitudes processor. The error location numbers and
error magnitudes of the jth data block are stored in the
Error Correction processor while the (j-l)th data block is
corrected and shifted out from the decoder as well as the
Queue buffer.
The transfer of information between the
different processors is supervised by an overall controller.
The Queue buffer must be able to store five data blocks. The
size of this buffer is ((255x8)x5) = 10200 binary bits.
The rate of data processing in this pipelined decoder
is determined by the time delay of the slowest processor,
T. Although the total decoding time of each data block is
5T, each block is decoded concurrently with four
other data blocks, so the average decoding time of every data
block will be T.
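The buffer size and timing figures above can be checked with a few lines of arithmetic. The sketch below is illustrative only: the function names are hypothetical, and it assumes every stage takes the same time T as the slowest one.

```python
SYMBOLS_PER_BLOCK = 255   # code length n
BITS_PER_SYMBOL = 8       # symbols are elements of GF(2^8)
STAGES = 5                # pipeline stages of the decoder

def queue_buffer_bits():
    """Bits needed to hold one data block per pipeline stage."""
    return SYMBOLS_PER_BLOCK * BITS_PER_SYMBOL * STAGES

def total_time(num_blocks, stage_time):
    """Time until num_blocks have left the pipeline (all stages take stage_time)."""
    return (STAGES + num_blocks - 1) * stage_time

def average_time_per_block(num_blocks, stage_time):
    """Average decoding time per block; approaches stage_time for long runs."""
    return total_time(num_blocks, stage_time) / num_blocks
```

For a single block the total time is 5T, while for a long run of blocks the average per block tends towards T, as stated in the text.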
CHAPTER IV
ENCODER HARDWARE DESIGN
4.1 Introduction:
Let
$$U(X) = u_0 + u_1 X + \cdots + u_{k-1} X^{k-1}$$
be the message polynomial. It has been shown in Section
2.3 that the code polynomial of a Reed-Solomon code in
systematic form is given by:
$$V(X) = b(X) + X^{2t} U(X) = a(X)\, g(X)$$
where
g(X) is the generator polynomial,
a(X) is the quotient resulting from dividing $X^{2t} U(X)$
by g(X) and
b(X) is the remainder.
The code vector is then given by:
$$(b_0, b_1, \ldots, b_{2t-1}, u_0, u_1, \ldots, u_{k-1})$$
where $b_0, \ldots, b_{2t-1}$ are the parity check symbols and
$u_0, \ldots, u_{k-1}$ are the information symbols.
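For the selected Reed-Solomon code (255,245), the generator
polynomial g(X) is given by:
$$g(X) = @^{55} + @^{41} X + @^{102} X^2 + @^{71} X^3 + @^{76} X^4 + @^{123} X^5 + @^{65} X^6 + @^{49} X^7 + @^{69} X^8 + @^{252} X^9 + X^{10}$$
and the code polynomial V(X) of U(X) is:
$$V(X) = b(X) + X^{10} U(X) = a(X)\, g(X)$$
$$V(X) = b_0 + b_1 X + \cdots + b_9 X^9 + u_0 X^{10} + u_1 X^{11} + \cdots + u_{244} X^{254}$$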
Encoding this Reed-Solomon code can be done by using a
linear ten-stage shift register as shown in Figure 4.1. The
feedback connections of this shift register are based on
the coefficients of the generator polynomial g(X). The
following notations are used in Figure 4.1:
the multiplier symbol denotes a multiplier that multiplies any
field element of GF($2^8$) by a fixed element from the same field,
the register symbol denotes a storage device that can store a
field element from GF($2^8$) (an 8-bit register) and
the adder symbol denotes an adder that adds two field
elements from GF($2^8$).
The encoding procedure is as follows:
1- Clear the storage devices ($b_i$'s).
2- Enable the feedback connection by enabling the control
AND gate and feed U(X) to the output of the encoder
by enabling the second input of the multiplexer.
3- Shift the 245 information symbols of U(X) into the
shift register and to the encoder output. At the
completion of shifting, the register contains the ten
parity check symbols.
4- Disable the feedback connection and enable the first
input of the multiplexer.
5- Shift the ten parity check symbols to the encoder
output. During this time the encoder input is
disabled and no information can be accepted from the
source.
Figure 4.1 Reed-Solomon Encoder Block Diagram
This circuit needs 245 clock pulses to shift the
information message into the shift register as well as the
encoder output and 10 clock pulses to shift the ten parity
check digits to the output, a total of 255 clock pulses.
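The encoding procedure above can be modelled in software as a sketch of the ten-stage shift register. Several things here are assumptions rather than facts taken from the thesis: the field is built with the primitive polynomial x^8 + x^4 + x^3 + x^2 + 1 (0x11D), the generator roots are taken as @^1 to @^10 (consistent with the syndrome definition used by the decoder), the highest-order information symbol is shifted in first, and all function names are hypothetical.

```python
PRIM_POLY = 0x11D  # assumed primitive polynomial: x^8 + x^4 + x^3 + x^2 + 1

def build_tables():
    """Return (exp, log) tables for GF(2^8); exp is doubled to avoid a modulo."""
    exp, log = [0] * 510, [0] * 256
    x = 1
    for i in range(255):
        exp[i], log[x] = x, i
        x <<= 1
        if x & 0x100:
            x ^= PRIM_POLY
    for i in range(255, 510):
        exp[i] = exp[i - 255]
    return exp, log

EXP, LOG = build_tables()

def gf_mul(a, b):
    """Multiply two field elements given in vector (byte) form."""
    return 0 if a == 0 or b == 0 else EXP[LOG[a] + LOG[b]]

def generator_poly(nroots=10):
    """Coefficients g[0..nroots] of (X + @)(X + @^2)...(X + @^nroots)."""
    g = [1]
    for i in range(1, nroots + 1):
        root = EXP[i]
        nxt = [0] * (len(g) + 1)
        for k, c in enumerate(g):
            nxt[k] ^= gf_mul(c, root)   # constant part of (X + root)
            nxt[k + 1] ^= c             # X part of (X + root)
        g = nxt
    return g

def encode(msg, g):
    """Systematic encoding; msg[k] = u_k. Returns v_0..v_254."""
    b = [0] * 10                         # step 1: clear the registers
    for u in reversed(msg):              # step 3: shift u_244 first
        fb = u ^ b[9]                    # feedback symbol
        for j in range(9, 0, -1):
            b[j] = b[j - 1] ^ gf_mul(fb, g[j])
        b[0] = gf_mul(fb, g[0])
    return b + list(msg)                 # parity symbols, then information

def poly_eval(coeffs, x):
    """Evaluate a polynomial (lowest-order coefficient first) at x."""
    acc = 0
    for c in reversed(coeffs):
        acc = gf_mul(acc, x) ^ c
    return acc
```

A quick sanity check is that every codeword produced this way evaluates to zero at @^1 through @^10, which is the property the syndrome processor relies on.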
4.2 Field Element Multiplier Hardware Implementation:
As shown in Figure 4.1, one of the main elements in the
encoder circuit is the field element multiplier. This
multiplier can be implemented in one of three different
ways, which are discussed in the following
sections.
4.2.1 Two-Level ROM Implementation:
Since each field element can be represented as a power of
@, where @ is the primitive element, then the product of
any two elements, represented in the power representation
form, is given by the sum of their powers of @.
In hardware implementation, this can be done by converting
the vector form of the multiplicand element into its power
representation, then adding the power of the multiplier,
and finally converting the result from the power
representation into the vector representation.
In this circuit, two ROM levels are used. The first ROM is
used to convert the vector representation form of the
symbol entering the feedback loop, $B_{fb}$ (the multiplicand),
into its power representation. The second level is a set of
ROM's which work as look up tables. The output of each
of these ROM's represents the result of multiplying the
feedback symbol ($B_{fb}$), in the power representation form, by
one of the coefficients of g(X) ($g_j = @^{i_j}$). The output will
be in the vector representation form. In this case, the
contents of any of the second level ROM's which multiplies
by the coefficient $g_j$ is simply the vector representation
of the field elements given in Table 3.1 rotated $i_j$
times (where $i_j$ is the power of the field element $g_j$, i.e.
$g_j = @^{i_j}$). For example, to multiply by $@^{55}$, the contents of the
second level ROM is the vector representation of
$0, @^{55}, @^{56}, \ldots, @^{254}, 1, @, @^2, \ldots, @^{54}$. The first element in
the look up table is always 0.
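The two ROM levels can be sketched as two look-up tables. This is only an illustration under stated assumptions: the field uses the primitive polynomial 0x11D, the power representation is coded as 0 for the zero element and p+1 for @^p (so that address 0 of the second ROM holds 0, as the text requires), and the function names are hypothetical.

```python
PRIM_POLY = 0x11D  # assumed primitive polynomial: x^8 + x^4 + x^3 + x^2 + 1

def build_tables():
    """exp[p] is the vector form of @^p; log[v] is the power of element v."""
    exp, log = [0] * 255, [0] * 256
    x = 1
    for p in range(255):
        exp[p], log[x] = x, p
        x <<= 1
        if x & 0x100:
            x ^= PRIM_POLY
    return exp, log

def first_level_rom(log):
    """Address: vector form. Data: power code (0 for zero, p + 1 for @^p)."""
    return [0] + [log[v] + 1 for v in range(1, 256)]

def second_level_rom(power, exp):
    """Address: power code. Data: vector form of the product with @^power.

    This is the list 0, 1, @, ..., @^254 with the non-zero part rotated
    by `power` positions.
    """
    return [0] + [exp[(p + power) % 255] for p in range(255)]
```

With these tables, multiplying a feedback symbol v by @^55 is the double look-up `second_level_rom(55, exp)[first_level_rom(log)[v]]`, which mirrors the two ROM access delays of the hardware.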
The block diagram of this implementation is shown in Figure
4.2. The first ROM is enabled by the feedback control
input. When this feedback is disabled, an all-zero vector
should be present on the lines of the feedback loop. Since
most ROM's have tri-state output, the feedback lines should
be grounded when the feedback loop is disabled. This is
achieved by using tri-state logic buffers which have zero
volt inputs and are connected to the feedback loop lines
and enabled when the feedback is disabled.
This circuit can be implemented using eleven (256x8) ROM's.
These are available as a single Schottky TTL IC with 50 ns
access time (SN74S471). Programming these ROM's for this
circuit is simple, but the circuit needs two ROM time delays,
100 ns in total, to perform the multiplication. Therefore, this
circuit implementation is considered relatively slow.
Figure 4.2 Reed-Solomon Encoder Using Two-Level ROM
Multiplier Implementation
4.2.2 One-Level ROM Implementation:
The multiplication of any field element by a fixed field
element can be done by using a table look up ROM. In the
encoder hardware implementation the multipliers in Figure
4.1 are replaced by ROM's. The contents of each of those
ROM's is the vector representation of the result of
multiplying the feedback element and one of the
coefficients of g(X) ($g_j = @^{i_j}$).
This circuit can be implemented using the same type of ROM
(SN74S471). The multiplication delay time is one ROM time
delay of 50 ns only. Although this circuit is faster than
the previous one, the ROM contents are not as easy to
generate.
4.2.3 Combinational Circuit Implementation:
Multiplication of a field element by a fixed element from
the same field is best explained by an example.
Consider multiplying an arbitrary field element B of GF($2^8$)
given by:
$$B = B_0 + B_1 @ + B_2 @^2 + \cdots + B_6 @^6 + B_7 @^7$$
by the field element $@^{55}$. Using Table 3.1, the product is
given by the following logical function:
$$
\begin{aligned}
@^{55} B = {} & B_0 (@^5 + @^7) \\
& + B_1 (1 + @^2 + @^3 + @^4 + @^6) \\
& + B_2 (@ + @^3 + @^4 + @^5 + @^7) \\
& + B_3 (1 + @^3 + @^5 + @^6) \\
& + B_4 (@ + @^4 + @^6 + @^7) \\
& + B_5 (1 + @^3 + @^4 + @^5 + @^7) \\
& + B_6 (1 + @ + @^2 + @^3 + @^5 + @^6) \\
& + B_7 (@ + @^2 + @^3 + @^4 + @^6 + @^7)
\end{aligned}
$$
Collecting the terms that multiply each power of @ gives:
$$
\begin{aligned}
@^{55} B = {} & (B_1 + B_3 + B_5 + B_6) \\
& + (B_2 + B_4 + B_6 + B_7)\, @ \\
& + (B_1 + B_6 + B_7)\, @^2 \\
& + (B_1 + B_2 + B_3 + B_5 + B_6 + B_7)\, @^3 \\
& + (B_1 + B_2 + B_4 + B_5 + B_7)\, @^4 \\
& + (B_0 + B_2 + B_3 + B_5 + B_6)\, @^5 \\
& + (B_1 + B_3 + B_4 + B_6 + B_7)\, @^6 \\
& + (B_0 + B_2 + B_4 + B_5 + B_7)\, @^7
\end{aligned}
$$
(4.1)
Figure 4.3 Combinational Circuit Multiplier Implementation
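A product logical function of this kind can be generated mechanically: output bit k of the product is the XOR of every input bit $B_j$ for which the element (constant times $@^j$) has bit k set. The sketch below is a hedged illustration of that idea; it assumes the primitive polynomial 0x11D, takes the constant in vector form (0xA0 for $@^5 + @^7$), and uses hypothetical function names.

```python
PRIM_POLY = 0x11D  # assumed primitive polynomial: x^8 + x^4 + x^3 + x^2 + 1

def plf_rows(constant):
    """Return rows[k]: the input bit indices j XORed to form output bit k.

    The multiplier computes constant * B, where B has bits B_0..B_7.
    """
    columns = []
    x = constant
    for _ in range(8):
        columns.append(x)        # vector form of constant * @^j
        x <<= 1                  # multiply by @
        if x & 0x100:
            x ^= PRIM_POLY       # reduce modulo the primitive polynomial
    return [[j for j in range(8) if (columns[j] >> k) & 1] for k in range(8)]

def xor_levels(num_inputs):
    """Depth of a balanced tree of 2-input XOR gates with num_inputs inputs."""
    levels = 0
    while (1 << levels) < num_inputs:
        levels += 1
    return levels
```

For example, `plf_rows(0xA0)[0]` lists the inputs of the constant-term adder, and the widest row gives the number of XOR levels on the critical path of that multiplier.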
The hardware implementation of this multiplier is shown in
Figure 4.3. This multiplier consists of a set of adders.
Each of these adders is simply an XOR tree. An example of
the XOR tree implementation is shown in Figure 4.4. Since
each field symbol is 8 bits long, the maximum number of
inputs to an XOR tree is 8, and the maximum number of gate
levels in the XOR tree is 4. However, for the selected
code, the maximum number of gate levels is 3. These XOR trees
can be implemented using Schottky TTL (SN74S86). This gate
has a 7 ns time delay. Therefore, the multiplication time
delay is 21 ns. This circuit implementation is much faster
than the previous two circuits, but the chip count is
larger. For example, the $@^{55}$ multiplier requires 19 XOR's to
implement, i.e. 5 Quad 2-input XOR IC chips. However, the
chip count can be reduced if the circuit is implemented
using programmable logic arrays (PLA's).
Figure 4.4 XOR Tree Implementation
4.3 Encoder Hardware Implementation:
The encoder circuit consists of two main parts, the linear
feedback shift register and the field element multipliers.
The linear feedback shift register consists of ten 8-bit
latch registers. These latch registers are implemented
using the Schottky TTL IC SN74S374, which has a 10 ns time
delay. The outputs of each latch register are connected to
the inputs of the following latch register through a field
element adder, which adds the outputs of the latch register
and the field element multiplier as shown in Figure 4.1.
These field element adders simply consist of 8 XOR's each,
which can be implemented using the Schottky TTL Quad 2-input
XOR SN74S86, which has a 7 ns time delay.
The output control C1 is connected to eight 2 to 1
multiplexers, to select one of the two inputs. These
multiplexers are implemented using the Quad 2 to 1 multiplexer
SN74S158. The feedback control C2 is connected to the
control AND gates, and is used to enable and disable the
feedback loop. The control AND gates are implemented using
the Quad 2-input AND SN74S08, which has a 4.75 ns time delay.
The circuit discussed in Section 4.2.3 is selected to be
used in implementing the 10 multipliers needed for the
encoder circuit. To design these multipliers, a product
logical function (PLF), similar to the one given by Eq.
4.1, should be generated for each multiplier. These product
logical functions have been generated. The following is a
list of each of the coefficients of the generator
polynomial g(X) and the final form of the corresponding
multiplier product logical function:
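1- $g_0 = @^{55}$
The product logical function of the multiplier of this
coefficient is given by equation 4.1.
2- $g_1 = @^{41}$, the corresponding PLF is:
$$
\begin{aligned}
@^{41} B = {} & (B_1 + B_2 + B_4 + B_5 + B_6) \\
& + (B_2 + B_3 + B_5 + B_6 + B_7)\, @ \\
& + (B_0 + B_1 + B_2 + B_3 + B_5 + B_7)\, @^2 \\
& + (B_3 + B_5)\, @^3 \\
& + (B_0 + B_1 + B_2 + B_5)\, @^4 \\
& + (B_1 + B_2 + B_3 + B_6)\, @^5 \\
& + (B_0 + B_2 + B_3 + B_4 + B_7)\, @^6 \\
& + (B_0 + B_1 + B_3 + B_4 + B_5)\, @^7
\end{aligned}
$$
3- $g_2 = @^{102}$, the corresponding PLF is:
$$
\begin{aligned}
@^{102} B = {} & (B_2 + B_7) \\
& + B_3\, @ \\
& + (B_0 + B_2 + B_4 + B_7)\, @^2 \\
& + (B_1 + B_2 + B_3 + B_5 + B_7)\, @^3 \\
& + (B_3 + B_4 + B_6 + B_7)\, @^4 \\
& + (B_4 + B_5 + B_7)\, @^5 \\
& + (B_0 + B_5 + B_6)\, @^6 \\
& + (B_1 + B_6 + B_7)\, @^7
\end{aligned}
$$
4- $g_3 = @^{71}$, the corresponding PLF is:
$$
\begin{aligned}
@^{71} B = {} & (B_1 + B_3 + B_4) \\
& + (B_2 + B_4 + B_5)\, @ \\
& + (B_0 + B_1 + B_4 + B_5 + B_6)\, @^2 \\
& + (B_0 + B_2 + B_3 + B_4 + B_5 + B_6 + B_7)\, @^3 \\
& + (B_0 + B_5 + B_6 + B_7)\, @^4 \\
& + (B_0 + B_1 + B_6 + B_7)\, @^5 \\
& + (B_1 + B_2 + B_7)\, @^6 \\
& + (B_0 + B_2 + B_3)\, @^7
\end{aligned}
$$
5- $g_4 = @^{76}$, the corresponding PLF is:
$$
\begin{aligned}
@^{76} B = {} & (B_4 + B_5 + B_6 + B_7) \\
& + (B_0 + B_5 + B_6 + B_7)\, @ \\
& + (B_0 + B_1 + B_4 + B_5)\, @^2 \\
& + (B_0 + B_1 + B_2 + B_4 + B_7)\, @^3 \\
& + (B_0 + B_1 + B_2 + B_3 + B_4 + B_6 + B_7)\, @^4 \\
& + (B_1 + B_2 + B_3 + B_4 + B_5 + B_7)\, @^5 \\
& + (B_2 + B_3 + B_4 + B_5 + B_6)\, @^6 \\
& + (B_3 + B_4 + B_5 + B_6 + B_7)\, @^7
\end{aligned}
$$
6- $g_5 = @^{123}$, the corresponding PLF is:
$$
\begin{aligned}
@^{123} B = {} & (B_0 + B_1 + B_2 + B_5 + B_6) \\
& + (B_1 + B_2 + B_3 + B_6 + B_7)\, @ \\
& + (B_0 + B_1 + B_3 + B_4 + B_5 + B_6 + B_7)\, @^2 \\
& + (B_4 + B_7)\, @^3 \\
& + (B_1 + B_2 + B_6)\, @^4 \\
& + (B_2 + B_3 + B_7)\, @^5 \\
& + (B_0 + B_3 + B_4)\, @^6 \\
& + (B_0 + B_1 + B_4 + B_5)\, @^7
\end{aligned}
$$
7- $g_6 = @^{65}$, the corresponding PLF is:
$$
\begin{aligned}
@^{65} B = {} & (B_1 + B_3 + B_4 + B_7) \\
& + (B_0 + B_2 + B_4 + B_5)\, @ \\
& + (B_0 + B_4 + B_5 + B_6 + B_7)\, @^2 \\
& + (B_0 + B_3 + B_4 + B_5 + B_6)\, @^3 \\
& + (B_0 + B_3 + B_5 + B_6)\, @^4 \\
& + (B_0 + B_1 + B_4 + B_6 + B_7)\, @^5 \\
& + (B_1 + B_2 + B_5 + B_7)\, @^6 \\
& + (B_0 + B_2 + B_3 + B_6)\, @^7
\end{aligned}
$$
8- $g_7 = @^{49}$, the corresponding PLF is:
$$
\begin{aligned}
@^{49} B = {} & (B_1 + B_7) \\
& + B_2\, @ \\
& + (B_0 + B_1 + B_3 + B_7)\, @^2 \\
& + (B_0 + B_2 + B_4 + B_7)\, @^3 \\
& + (B_3 + B_5 + B_7)\, @^4 \\
& + (B_4 + B_6)\, @^5 \\
& + (B_5 + B_7)\, @^6 \\
& + (B_0 + B_6)\, @^7
\end{aligned}
$$
9- $g_8 = @^{69}$, the corresponding PLF is:
$$
\begin{aligned}
@^{69} B = {} & (B_0 + B_3 + B_5 + B_6) \\
& + (B_0 + B_1 + B_4 + B_6 + B_7)\, @ \\
& + (B_0 + B_1 + B_2 + B_3 + B_6 + B_7)\, @^2 \\
& + (B_0 + B_1 + B_2 + B_4 + B_5 + B_6 + B_7)\, @^3 \\
& + (B_1 + B_2 + B_7)\, @^4 \\
& + (B_0 + B_2 + B_3)\, @^5 \\
& + (B_1 + B_3 + B_4)\, @^6 \\
& + (B_2 + B_4 + B_5)\, @^7
\end{aligned}
$$
10- $g_9 = @^{252}$, the corresponding PLF is:
$$
\begin{aligned}
@^{252} B = {} & (B_0 + B_1 + B_3) \\
& + (B_1 + B_2 + B_4)\, @ \\
& + (B_0 + B_1 + B_2 + B_5)\, @^2 \\
& + (B_0 + B_2 + B_6)\, @^3 \\
& + B_7\, @^4 \\
& + B_0\, @^5 \\
& + B_1\, @^6 \\
& + (B_0 + B_2)\, @^7
\end{aligned}
$$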
Each of these multipliers is implemented using a circuit
similar to the one shown in Figure 4.3. These multipliers
need 168 XOR's, which are implemented using 42 Quad
2-inputs XOR SN74S86s.
Table 4.1 shows all the encoder elements, IC type numbers
used to implement each element, the time delay of each
element, and the number of IC chips used to implement the
circuit. As shown, the total number of IC chips required
for the encoder is 76.

Element Type         IC Type   Time Delay   Number of   Number of
                     Number    (ns)         Elements    IC Chips
AND                  74S08     4.75         8           2
Multiplexer          74S158    5            8           2
XOR for Multiplier   74S86     7x3=21       168         42
XOR for Adder        74S86     7            80          20
Latch Register       74S374    10           10          10

Table 4.1 Time Delay and IC chip Count for the
Encoder Circuit
The maximum frequency that can be used will depend on the
total time delay of the longest path of the signal, $T_{total}$,
which is given by:
$$T_{total} = T(latch) + T(adder) + T(AND) + T(Mul.) + T(adder)$$
$$T_{total} = 10 + 7 + 4.75 + 21 + 7 = 49.75 \text{ ns}$$
This is the time required to process one byte of the
information message. The circuit maximum frequency is
20 MHz. If the input message bits are fed in serially, then a
serial to parallel shift register will be needed. In this
case, the maximum input frequency is 20x8 = 160 MHz.
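The timing budget above is simple arithmetic, and a short sketch makes it easy to repeat for other gate families. The delay values are the ones quoted in the text; the dictionary keys and function names are hypothetical.

```python
# Delays (ns) along the longest signal path, as listed in the text.
DELAYS_NS = {
    "latch": 10.0,        # SN74S374 register
    "adder_in": 7.0,      # XOR adder feeding the feedback line
    "and_gate": 4.75,     # SN74S08 feedback control gate
    "multiplier": 21.0,   # three XOR levels at 7 ns each
    "adder_out": 7.0,     # XOR adder in front of the next register
}

def critical_path_ns(delays):
    """Total delay of the longest path, in nanoseconds."""
    return sum(delays.values())

def max_clock_mhz(path_ns):
    """Highest symbol clock (MHz) that still covers the path delay."""
    return 1000.0 / path_ns

def max_serial_rate_mhz(path_ns, bits_per_symbol=8):
    """Highest serial bit rate (MHz) when symbols arrive bit by bit."""
    return max_clock_mhz(path_ns) * bits_per_symbol
```

Evaluating `max_clock_mhz(critical_path_ns(DELAYS_NS))` gives slightly more than 20 MHz, which the text rounds down to 20 MHz per symbol and 160 MHz for a serial input.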
CHAPTER V
SYNDROME PROCESSOR
5.1 Syndrome Processor Description:
The syndrome processor is the first stage in the decoder
pipelined structure. It is responsible for computing the
syndrome from the received vector r(X). In general, for a
t-error Reed-Solomon code, the syndrome has 2t components.
These components are obtained by substituting $@^i$ into the
received vector polynomial r(X) for $1 \le i \le 2t$.
For the (255,245) Reed-Solomon code, the syndrome S has ten
components, i.e.
$$S = (S_1, S_2, \ldots, S_{10})$$
where,
$$S_i = r(@^i), \qquad 1 \le i \le 10$$
Let,
$$r(X) = r_0 + r_1 X + \cdots + r_{254} X^{254}$$
be the received vector, then
$$S_i = r_0 + r_1 @^i + \cdots + r_{254} @^{254 i}$$
which can be rearranged to take the following form:
$$S_i = (\cdots((r_{254} @^i + r_{253})\, @^i + r_{252})\, @^i + \cdots + r_1)\, @^i + r_0$$
The computation of a syndrome component can be done using
the circuit shown in Figure 5.1. In the procedure of
calculating a syndrome component, the register $b_i$ is
initially cleared. The received vector ($r_0, r_1, \ldots, r_{254}$)
is then shifted into the circuit one symbol at a
time, starting with $r_{254}$. After the first shift, the register $b_i$ will contain
the vector representation of $r_{254}$ and the multiplier output
will represent the vector representation of $r_{254} @^i$. After
the second shift, the register $b_i$ will contain the vector
representation of $r_{254} @^i + r_{253}$ and the multiplier output
will represent $((r_{254} @^i + r_{253})\, @^i)$. After 255 shifts, the
register will contain $r(@^i)$ in vector representation form,
which is $S_i$, the ith component of the syndrome.
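The register-and-multiplier loop described above is Horner's rule, and it can be sketched in a few lines. The sketch assumes the primitive polynomial 0x11D for GF(2^8) and that the received symbols are shifted in from $r_{254}$ down to $r_0$; the function names are hypothetical.

```python
PRIM_POLY = 0x11D  # assumed primitive polynomial: x^8 + x^4 + x^3 + x^2 + 1

def build_tables():
    """Return (exp, log) tables for GF(2^8); exp is doubled to avoid a modulo."""
    exp, log = [0] * 510, [0] * 256
    x = 1
    for i in range(255):
        exp[i], log[x] = x, i
        x <<= 1
        if x & 0x100:
            x ^= PRIM_POLY
    for i in range(255, 510):
        exp[i] = exp[i - 255]
    return exp, log

EXP, LOG = build_tables()

def gf_mul(a, b):
    """Multiply two field elements given in vector (byte) form."""
    return 0 if a == 0 or b == 0 else EXP[LOG[a] + LOG[b]]

def syndrome(received, i):
    """Compute S_i = r(@^i); received[k] = r_k, shifted in from r_254 to r_0."""
    alpha_i = EXP[i]
    b = 0                              # register b_i, initially cleared
    for r in reversed(received):
        b = gf_mul(b, alpha_i) ^ r     # multiplier output plus the new symbol
    return b

def syndromes(received, two_t=10):
    """All syndrome components S_1..S_2t, one per Syndrome Unit."""
    return [syndrome(received, i) for i in range(1, two_t + 1)]
```

A received vector with a single error of value e at position k gives $S_i = e\,@^{ik}$, which is a convenient spot check for the loop.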
5.2 Syndrome Processor Hardware Design:
Since the syndrome consists of ten components, this
processor consists of ten similar circuits, which are
called Syndrome Units, as shown in Figure 5.2. Each of
these Syndrome Units is responsible for computing one of
the syndrome components $S_i$. As shown in Figure 5.1, each of
these Syndrome Units consists of a $b_i$ register, a field element
adder and a multiplier. The $b_i$ register is simply implemented
using an 8-bit latch register. The field element adder is
implemented by a set of XOR gates. In this processor, the
field element multipliers can also be implemented using one
of the three methods of implementation which are discussed
in detail in Chapter 4. Now, depending on how the
multipliers are implemented, the Syndrome Units can be
implemented using one of the following three methods of
implementation.
5.2.1 Using Two-Level ROM Multiplier Implementation:
The circuit diagram of a Syndrome Unit implemented using
this method is given in Figure 5.3. The first ROM takes the
field element stored in register $b_i$ in its vector
representation form and generates its power representation
form. The second ROM is the multiplier, which multiplies a
field element in its power form by $@^i$ (the multiplier
element). Each of these ROM's is 256x8.
Figure 5.1 Reed-Solomon Codes Syndrome Computation Circuit: (a) Over GF($2^8$), (b) In Binary Form
Figure 5.2 Syndrome Processor Block Diagram
Figure 5.3 Syndrome Unit Multiplier Implementation
Using Two-Level ROM
This unit can be implemented using the Schottky TTL IC's,
SN74S471, a 256x8 PROM with access time of 50 ns, SN74S374,
an octal D-type latch with a 10 ns time delay and SN74S86,
a Quad 2-input XOR with time delay of 7 ns. The total time
delay, $T_{total}$, of a Syndrome Unit implemented this way is:
$$T_{total} = 100 + 7 + 10 = 117 \text{ ns}$$
Each Syndrome Unit is built of two ROMs, eight XORs and one
register, which makes a total chip count of 5.
5.2.2 Using One-Level ROM Multiplier Implementation:
The circuit diagram of a Syndrome Unit implemented
using this method is shown in Figure 5.1b, in which the
multiplier is implemented using a ROM. The ROM is used as a
look up table to perform the multiplication. This ROM is
also 256x8. In this method, the rest of the unit is
implemented using exactly the same parts used in the first
method. The total time delay, $T_{total2}$, is:
$$T_{total2} = 50 + 7 + 10 = 67 \text{ ns}$$
For this method of implementation, the total chip count is
4. This method is faster than the first, but the contents
of the multiplier ROMs are not as easy
to generate.
5.2.3 Using Combinational Circuit Multiplier
Implementation:
This method of implementation has the same block diagram as
shown in Figure 5.1b, but in this case, the multiplier is
implemented using a combinational logic circuit. Assuming
that B is a field element that has the following general
form:
$$B = B_0 + B_1 X + \cdots + B_6 X^6 + B_7 X^7$$
the product logical functions, PLFs, of the ten multipliers
which are used to multiply by $@^i$, where $1 \le i \le 10$, have
been generated and their final forms are listed below:
1- The 1st Syndrome Unit PLF:
$$
\begin{aligned}
@ B = {} & B_7 + B_0\, @ + (B_1 + B_7)\, @^2 + (B_2 + B_7)\, @^3 \\
& + (B_3 + B_7)\, @^4 + B_4\, @^5 + B_5\, @^6 + B_6\, @^7
\end{aligned}
$$
2- The 2nd Syndrome Unit PLF:
$$
\begin{aligned}
@^2 B = {} & B_6 + B_7\, @ + (B_0 + B_6)\, @^2 + (B_1 + B_6 + B_7)\, @^3 \\
& + (B_2 + B_6 + B_7)\, @^4 + (B_3 + B_7)\, @^5 + B_4\, @^6 + B_5\, @^7
\end{aligned}
$$
3- The 3rd Syndrome Unit PLF:
$$
\begin{aligned}
@^3 B = {} & B_5 + B_6\, @ + (B_5 + B_7)\, @^2 + (B_0 + B_5 + B_6)\, @^3 \\
& + (B_1 + B_5 + B_6 + B_7)\, @^4 + (B_2 + B_6 + B_7)\, @^5 \\
& + (B_3 + B_7)\, @^6 + B_4\, @^7
\end{aligned}
$$
4- The 4th Syndrome Unit PLF:
$$
\begin{aligned}
@^4 B = {} & B_4 + B_5\, @ + (B_4 + B_6)\, @^2 + (B_4 + B_5 + B_7)\, @^3 \\
& + (B_0 + B_4 + B_5 + B_6)\, @^4 + (B_1 + B_5 + B_6 + B_7)\, @^5 \\
& + (B_2 + B_6 + B_7)\, @^6 + (B_3 + B_7)\, @^7
\end{aligned}
$$
5- The 5th Syndrome Unit PLF:
$$
\begin{aligned}
@^5 B = {} & (B_3 + B_7) + B_4\, @ + (B_3 + B_5 + B_7)\, @^2 \\
& + (B_3 + B_4 + B_6 + B_7)\, @^3 + (B_3 + B_4 + B_5)\, @^4 \\
& + (B_0 + B_4 + B_5 + B_6)\, @^5 + (B_1 + B_5 + B_6 + B_7)\, @^6 \\
& + (B_2 + B_6 + B_7)\, @^7
\end{aligned}
$$
6- The 6th Syndrome Unit PLF:
$$
\begin{aligned}
@^6 B = {} & (B_2 + B_6 + B_7) + (B_3 + B_7)\, @ + (B_2 + B_4 + B_6 + B_7)\, @^2 \\
& + (B_2 + B_3 + B_5 + B_6)\, @^3 + (B_2 + B_3 + B_4)\, @^4 \\
& + (B_3 + B_4 + B_5)\, @^5 + (B_0 + B_4 + B_5 + B_6)\, @^6 \\
& + (B_1 + B_5 + B_6 + B_7)\, @^7
\end{aligned}
$$
7- The 7th Syndrome Unit PLF:
$$
\begin{aligned}
@^7 B = {} & (B_1 + B_5 + B_6 + B_7) + (B_2 + B_6 + B_7)\, @ \\
& + (B_1 + B_3 + B_5 + B_6)\, @^2 + (B_1 + B_2 + B_4 + B_5)\, @^3 \\
& + (B_1 + B_2 + B_3 + B_7)\, @^4 + (B_2 + B_3 + B_4)\, @^5 \\
& + (B_3 + B_4 + B_5)\, @^6 + (B_0 + B_4 + B_5 + B_6)\, @^7
\end{aligned}
$$
8- The 8th Syndrome Unit PLF:
$$
\begin{aligned}
@^8 B = {} & (B_0 + B_4 + B_5 + B_6) + (B_1 + B_5 + B_6 + B_7)\, @ \\
& + (B_0 + B_2 + B_4 + B_5 + B_7)\, @^2 + (B_0 + B_1 + B_3 + B_4)\, @^3 \\
& + (B_0 + B_1 + B_2 + B_6)\, @^4 + (B_1 + B_2 + B_3 + B_7)\, @^5 \\
& + (B_2 + B_3 + B_4)\, @^6 + (B_3 + B_4 + B_5)\, @^7
\end{aligned}
$$
9- The 9th Syndrome Unit PLF:
$$
\begin{aligned}
@^9 B = {} & (B_3 + B_4 + B_5) + (B_0 + B_4 + B_5 + B_6)\, @ \\
& + (B_1 + B_3 + B_4 + B_6 + B_7)\, @^2 + (B_0 + B_2 + B_3 + B_7)\, @^3 \\
& + (B_0 + B_1 + B_5)\, @^4 + (B_0 + B_1 + B_2 + B_6)\, @^5 \\
& + (B_1 + B_2 + B_3 + B_7)\, @^6 + (B_2 + B_3 + B_4)\, @^7
\end{aligned}
$$
10- The 10th Syndrome Unit PLF:
$$
\begin{aligned}
@^{10} B = {} & (B_2 + B_3 + B_4) + (B_3 + B_4 + B_5)\, @ \\
& + (B_0 + B_2 + B_3 + B_5 + B_6)\, @^2 + (B_1 + B_2 + B_6 + B_7)\, @^3 \\
& + (B_0 + B_4 + B_7)\, @^4 + (B_0 + B_1 + B_5)\, @^5 \\
& + (B_0 + B_1 + B_2 + B_6)\, @^6 + (B_1 + B_2 + B_3 + B_7)\, @^7
\end{aligned}
$$
The first Syndrome Unit multiplier is shown in Figure 5.4.
Each of the other units' multipliers is implemented using a
similar circuit. These ten multipliers are implemented
using 31 Quad 2-input XORs. From the above product logical
functions, it is obvious that the multipliers have a maximum
of three levels of XORs. Therefore, the total time delay,
$T_{total3}$, of this implementation method is:
$$T_{total3} = 21 + 7 + 10 = 38 \text{ ns}$$
The total chip count for this method of implementation is
61. If programmable logic arrays are used to implement the
XORs, the total chip count can be dramatically decreased.
Comparing the above three methods of implementation, we
notice that the third method, using combinational circuit
multipliers, is the fastest method. Therefore, it has been
chosen to implement the Syndrome Units of the Syndrome
Processor.
Figure 5.4 Syndrome Unit Implementation Using
Combinational Logic Circuit Multiplier
CHAPTER VI
SIGMA PROCESSOR
6.1 Introduction:
The Sigma Processor is the second processor in the decoder
pipelined structure. Its function is to determine the
error-location polynomial $\sigma(X)$ from the syndrome
components $S_1, S_2, \ldots, S_{10}$. The inputs to this
processor are the ten syndrome components calculated by
the preceding pipeline stage, the Syndrome Processor.
It generates, as inputs to the following pipeline stage, the
Error Location Processor, the number of errors, $L_u$, and the
coefficients of the error-location polynomial.
The first five syndrome
components are also transmitted to the Error Location processor
unchanged, as shown in Figure 6.1. There are several
algorithms to determine the error location polynomial.
A highly efficient algorithm is Berlekamp's iterative
algorithm [3].
6.2 The Iterative Algorithm for Finding the Error Location
Polynomial:
This algorithm is used to determine the error-location
polynomial $\sigma(X)$ from the syndrome components $S_1, S_2, \ldots, S_{2t}$.
Let
$$\sigma(X) = \sigma_0 + \sigma_1 X + \cdots + \sigma_f X^f$$
then, the coefficients of $\sigma(X)$ are related to the syndrome
components $S_i$'s by the Newton's identities given by
equations (2.7) in Chapter 2. The iterative algorithm
solves these sets of equations to determine the polynomial
$\sigma(X)$ of minimum degree. This $\sigma(X)$ would produce an error
pattern which has the minimum number of errors.
Figure 6.1 Sigma Processor Inputs and Outputs
The first step in the iteration is to find a minimum degree
polynomial $\sigma^{(1)}(X)$ whose coefficients satisfy the first
Newton's identity.
The second step is to check if the coefficients of $\sigma^{(1)}(X)$
satisfy the second Newton's identity. If they do, then
$$\sigma^{(2)}(X) = \sigma^{(1)}(X)$$
If they do not satisfy the second Newton's identity, then a
correction term is added to $\sigma^{(1)}(X)$ such that $\sigma^{(2)}(X)$ has
the minimum degree and its coefficients satisfy the first
two Newton's identities.
This iteration continues until $\sigma^{(2t)}(X)$ is obtained.
Then, $\sigma^{(2t)}(X)$ is the error-location polynomial $\sigma(X)$,
$$\sigma(X) = \sigma^{(2t)}(X)$$
Let,
$$\sigma^{(u)}(X) = 1 + \sigma_1^{(u)} X + \sigma_2^{(u)} X^2 + \cdots + \sigma_{L_u}^{(u)} X^{L_u}$$
be the minimum degree polynomial determined at the end of
the uth step of iteration and $L_u$ be the degree of $\sigma^{(u)}(X)$.
The steps to find $\sigma^{(u+1)}(X)$ are as follows:
1- Compute the uth discrepancy as:
$$d_u = S_{u+1} + \sigma_1^{(u)} S_u + \sigma_2^{(u)} S_{u-1} + \cdots + \sigma_{L_u}^{(u)} S_{u+1-L_u}$$
2- If $d_u = 0$ then set
$$\sigma^{(u+1)}(X) = \sigma^{(u)}(X)$$
3- If $d_u \ne 0$ then find another iteration p prior to the
uth step that has $d_p \ne 0$ and for which $p - L_p$ has the largest
value, then:
$$\sigma^{(u+1)}(X) = \sigma^{(u)}(X) + d_u d_p^{-1} X^{(u-p)} \sigma^{(p)}(X)$$
and,
$$L_{u+1} = \max(L_u, L_p + u - p)$$
4- Repeat steps 1-3 until u = 2t, then
$$\sigma(X) = \sigma^{(2t)}(X)$$
If $L_{2t} > t$ then there are more than t errors and
generally it is not possible to locate them [10].
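The four steps above translate almost line for line into software. The following is a minimal sketch, not the 8085 program developed in this thesis: it assumes the primitive polynomial 0x11D for GF(2^8), uses the usual starting row u = -1 with $\sigma^{(-1)}(X) = 1$, $d_{-1} = 1$ and $L_{-1} = 0$, stores polynomials with the lowest-order coefficient first, and uses hypothetical function names.

```python
PRIM_POLY = 0x11D  # assumed primitive polynomial: x^8 + x^4 + x^3 + x^2 + 1

def build_tables():
    """Return (exp, log) tables for GF(2^8); exp is doubled to avoid a modulo."""
    exp, log = [0] * 510, [0] * 256
    x = 1
    for i in range(255):
        exp[i], log[x] = x, i
        x <<= 1
        if x & 0x100:
            x ^= PRIM_POLY
    for i in range(255, 510):
        exp[i] = exp[i - 255]
    return exp, log

EXP, LOG = build_tables()

def gf_mul(a, b):
    return 0 if a == 0 or b == 0 else EXP[LOG[a] + LOG[b]]

def gf_div(a, b):
    """a / b for b != 0."""
    return 0 if a == 0 else EXP[(LOG[a] - LOG[b]) % 255]

def poly_eval(coeffs, x):
    acc = 0
    for c in reversed(coeffs):
        acc = gf_mul(acc, x) ^ c
    return acc

def berlekamp(S, t):
    """S[1..2t] are the syndromes (S[0] unused). Returns (sigma, L)."""
    rows = {-1: ([1], 1, 0)}            # u -> (sigma^(u), d_u, L_u)
    sigma, L = [1], 0
    for u in range(2 * t):
        d = S[u + 1]                    # step 1: discrepancy d_u
        for j in range(1, L + 1):
            d ^= gf_mul(sigma[j], S[u + 1 - j])
        rows[u] = (sigma, d, L)
        if d == 0:                      # step 2: keep sigma unchanged
            continue
        # step 3: earlier row p with d_p != 0 and the largest p - L_p
        p = max((q for q in rows if q < u and rows[q][1] != 0),
                key=lambda q: q - rows[q][2])
        sig_p, d_p, L_p = rows[p]
        scale, shift = gf_div(d, d_p), u - p
        new = sigma + [0] * max(0, len(sig_p) + shift - len(sigma))
        for k, c in enumerate(sig_p):
            new[k + shift] ^= gf_mul(scale, c)
        sigma, L = new, max(L, L_p + u - p)
    return sigma, L                     # step 4: sigma(X) = sigma^(2t)(X)
```

For an error pattern with at most t errors, the returned L equals the number of errors and the roots of sigma are the reciprocals of the error location numbers, which is what the Error Location processor searches for.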
6.3 The Sigma Processor Design:
From the iteration algorithm discussed in the previous
section, it is clear that the process of determining $\sigma(X)$
from the syndrome components is a set of arithmetic and
logical operations. Therefore, the Sigma Processor is best
implemented by either a microcomputer or a special purpose
computer designed to handle Galois Field element
computations.
In this project, an Intel 8085 microprocessor based system is
used to implement this processor.
6.3.1 Microprocessor Based System Structure:
The basic components of a microprocessor based system are:
1. The microprocessor,
2. Read-Only Memory for storage of system programs,
3. Random Access Memory for storage of data and,
4. Input-Output Interface.
A typical microprocessor-based system block diagram is
shown in Figure 6.2.
The microprocessor executes all the instructions and
performs arithmetic and logical operations on data.
It also controls the communications between all system blocks.
The Read Only Memory (ROM) is used to store the operating
program. It does not have a write capability. This
implies that the binary information stored in the ROM is
made permanent during the hardware production of the unit
and cannot be altered.
The Random Access Memory (RAM) is used to store programs
and data which are temporary and might change during the
execution of a program. It allows reading and writing of
data. It has two control signals which specify a read or
write operation.
The peripheral interface devices transfer data between the
microprocessor and the external devices. This transfer
involves data, status and control signals.
All these units communicate through a bus structured
organization. There are three buses, an address bus, a
data bus and a control bus. The address bus is usually a
16-bit unidirectional bus used to address a particular
memory word stored in the ROM or RAM. This 16 bit address
can address up to 64K (65,536) words. The data bus is usually an 8-bit,
bidirectional bus used to transfer data between the
microprocessor and all the other devices. The control bus
provides control and timing signals to all the devices of
the system.
Figure 6.2 Microprocessor Based System Block Diagram
6.3.2 The Sigma Processor Hardware Design:
The Sigma processor has been designed around the Intel 8085
8-bit microprocessor. The basic block diagram of this
system is similar to Figure 6.2. A detailed circuit
diagram of the designed Sigma Processor is shown in Figure
6.3. This system consists of the following three main
parts:
1- The Intel 8085 microprocessor,
2- EPROM and RAM memory units and
3- Input/Output ports.
6.3.2.1 The Intel 8085 Microprocessor:
The Intel 8085A is a complete 8-bit parallel CPU. The 8085A
microprocessor contains the functions of clock generation,
system bus control and interrupt priority selection, in
addition to execution of the instruction set. It transfers
data on an 8-bit, bidirectional 3-state bus which is
time-multiplexed so as to also transmit the 8 lower order
address bits. An additional 8 lines expand the MCS-85 system
memory addressing capability to 16 bits, thereby allowing
64K bytes of memory to be accessed directly by the CPU.
The 8085 has six 8-bit general purpose registers, B, C; D,
E; and H, L, which can be used either as single registers
(8-bit) or as register pairs (16-bit).
There are also
four registers which can function only as two 16-bit
registers, the program counter PC and stack pointer SP.
Figure 6.3 Sigma Processor Circuit Diagram
The 8085 CPU generates control signals that can be used to
select appropriate external devices and functions to
perform Read and Write operations and to select memory or
I/O locations. The 8085 operates with a single 5 volt
power supply and a maximum single-phase clock frequency of 3 MHz.
It provides RD, WE, S0, S1 and IO/M
signals for bus control.
It also provides five interrupt
inputs: INTR, RST 5.5, RST 6.5, RST 7.5 and TRAP. It also
provides Serial Input Data (SID) and Serial Output Data
(SOD) lines for serial interface. A block diagram of the 8085
CPU is shown in Figure 6.4.
In general, the Intel 8085 microprocessor can have its own
clock pulse which is generated internally, and the clock
pulse frequency is determined by an external crystal
connected to the input pins X1 and X2. In this system, the
Sigma Processor is part of the decoder system; therefore,
in order to be synchronized with the rest of the system,
the microprocessor clock pulse is generated by the overall
controller and fed externally to the microprocessor through
input X1.
The 8085 microprocessor uses a multiplexed address/data bus
that contains the lower 8-bit address information during
the first part of the machine cycle.
The same bus contains
data at a later time in the cycle. An address latch enable
(ALE) signal is provided by the 8085 to be used by a latch to
hold the address so that it is available through the
whole machine cycle. The 74LS373 chip is used as a latch
for the 8085 lower 8 address bits as shown in Figure 6.5.
Figure 6.4 Intel 8085A CPU Functional Block Diagram
Figure 6.5 Intel 8085A Microprocessor and Address Latch
Due to current driving limitations of the 8085 data
outputs, the AM2947 non-inverting buffer chip is used to
buffer these outputs. It is a tri-state bidirectional
buffer. The high order address bits A8, A9, A10, A11, A15
and control lines WE, RD, and IO/M are also buffered using
the 74LS244. The pin connections of the 74LS244 and AM2947
buffers are shown in Figure 6.6. The READY input is tied
high since all I/O and Memory are considered sufficiently
fast. The RESET input is connected to the overall
controller. All interrupt inputs are grounded since they
are not needed. The 8085 is a 40 pin single chip. The pin
out diagram is shown in Figure 6.7.
6.3.2.2 Memory Unit:
The memory unit consists of a program memory chip and
data memory chips. The Intel 2716 (2Kx8) EPROM chip is used
as the program memory and two of the Intel 2114 (1Kx4) RAM
chips are used as the data memory.
1. Program Memory (Intel 2716 EPROM):
The Intel 2716 is a 2K byte ultraviolet erasable and
electrically programmable read-only memory (EPROM). It
operates from a single 5-volt power supply. Since it is a
2K memory, an 11-bit address is needed to address the memory.
The eleven address lines A0-A10 from the 8085
microprocessor are connected to the inputs of the EPROM.
The 8 data outputs of the EPROM are connected directly to
the data bus of the microprocessor.
It has a chip enable/program (CE/PGM) input and an output
enable (OE) input.
The chip enable/program input is used to enable the EPROM
whenever it is selected by the microprocessor as shown in
Figure 6.8.
The output enable (OE) input is connected to
the read control signal RD of the microprocessor.
2.
Data Memory (Intel 2114 RAM) :
Figure 6.6 Buffers Pin Connections (the AM2947 and the 74LS244)

Figure 6.7 Intel 8085A Pin Out Connections

Figure 6.8 Program Memory Unit
This RAM is a 1K x 4-bit memory. Two of them are
connected together to give a 1K x 8-bit memory. Ten
address lines are needed to address this memory. Address
lines A0 - A9 are connected to the address inputs of the
RAM. Each chip has a write enable (WE) input and a chip select
(CS) input. The RAM pin connections are shown in Figure
6.9.

Figure 6.9 The RAM Unit

3. Memory Address Space Mapping:
Address lines A0 - A10 are used to address the memory
locations in either memory. Address lines A11 and A15 are
used to select between the RAM, EPROM, and I/O ports as
shown in the following table:

IO/M   A15   A11   Selected Device
 0      0     0    EPROM
 0      0     1    RAM
 0      1     X    I/O ports

The memory is mapped as follows:

Address      Device         Memory Space
0000-07FF    EPROM          2K bytes
0800-0BFF    RAM            1K bytes
8000-8200    Input Ports
8000-8020    Output Ports

6.3.2.3 Input/Output Ports:
Ten input ports are used to interface the Syndrome
processor to the Sigma processor and six output ports are
used to interface the microprocessor to the Error-Location
processor. The input/output ports are implemented using 16
SN74S373s. Any of the 10 input ports is selected by RD,
A15 and one of the address lines A0 to A9 as shown in
Figure 6.10. The output of each of these input ports is
connected to the data bus through a set of tri-state
buffers which are enabled when the port is selected.
Figure 6.11 shows an input port connection. Each of the ten
syndrome components S_i is received from the Syndrome
processor through one of these input ports.
Each of the six output ports is selected by WE, A15 and one
of the address lines A0-A5, as shown in Figure 6.12. When
any of these output ports is selected, the data byte which
is present on the data bus is latched into it. Each
of the sigma polynomial coefficients σ_i, and L_u, is sent
to the Error-Location processor through one of the output
ports, where L_u represents the number of detected
errors.
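The address decoding described above can be sketched in a few lines of Python. This is only an illustration of the mapping table, not the thesis hardware: the function names `select_device` and `port_address` are hypothetical, and the assignment of port k to address line A(k-1) is an assumption, since the text states only that each port is selected by A15 and one of the address lines.

```python
def select_device(address, io_m=0):
    """Return the device selected by a 16-bit address, following the mapping table.

    The table lists IO/M = 0 for every device (memory-mapped I/O), so any
    cycle with IO/M = 1 selects nothing in this sketch.
    """
    if io_m != 0:
        return None
    a15 = (address >> 15) & 1
    a11 = (address >> 11) & 1
    if a15:
        return "I/O ports"          # A11 is a don't-care when A15 = 1
    return "RAM" if a11 else "EPROM"


def port_address(k):
    """Address of port k (1-based), ASSUMING port k is tied to address line A(k-1)."""
    return 0x8000 | (1 << (k - 1))
```

Under that assumption the tenth input port sits at 8200H and the sixth output port at 8020H, which agrees with the upper ends of the port ranges in the memory map.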
Figure 6.10 Input Ports Configuration

Figure 6.11 Input Port Connection

Figure 6.12 Output Ports Configuration

6.3.3 The Sigma Processor Software System Design:
The algorithm for calculating the sigma polynomial
coefficients from the syndrome components is executed by the
microprocessor-based hardware system under the control of a
software control program. This control program consists of
three routines: the main program and two subroutines, the Sigma
routine and the Discrepancy (d_u) routine.
The data structure and the control program of these three
routines are discussed in the following sections.
6.3.3.1 Data Structure Description:
The data structure consists mainly of three arrays,
5-location each. During the program execution the first
array contains the coefficients of the sigma polynomial
determined in the pth step. This array occupies the memory
locations 0800-0806H. Associated with this array, there
are 3 other memory locations which represent the following
parameters:

p    : the iteration step number for the σ^(p)(X) polynomial,
       which occupies the location (0807H).
d_p  : the discrepancy in the pth iteration step,
       which occupies the location (0808H).
h_p  : the difference between the iteration step number and
       the power of σ^(p)(X), which occupies the location (0809H).

The second array contains the coefficients of the sigma
polynomial determined in the uth step. This array occupies
the memory locations 080A-080FH.
Associated with this array, there are four memory locations
which represent the following parameters:

L_u  : the degree of σ^(u)(X), which occupies the
       memory location (0810H).
u    : the iteration step number of the current
       polynomial σ^(u)(X), which occupies the memory
       location (0811H).
d_u  : the uth discrepancy, which occupies the memory
       location (0812H).
h_u  : the difference between the iteration step number
       and the power of σ^(u)(X), which occupies the
       memory location (0813H).

The third array is a temporary storage for the coefficients
of the σ^(u+1)(X) polynomial during the process of calculating
them. The block diagram of the data structure is given in
Figure 6.13.
6.3.3.2 P(.) and V(.) Transforms:
To simplify the software description, the following two
transforms, P(.) and V(.), are introduced.
1. P(.) Transform Definition:
For a field element B given by its vector representation
v_i, the transform P(v_i) gives the power representation p_i
of B, i.e. P(.) transforms the vector representation of any
field element into its power representation. Therefore, if
v_i and p_i are the vector and power representations in binary
form of a field element, then:

$$P(v_i) = p_i$$

The P(.) transforms of all the field elements of GF(2^8) are
given in Table 6.1.

Figure 6.13 Sigma Processor Data Structure

Table 6.1 P-Transform of GF(2^8) Elements

v_i        P(v_i)       v_i        P(v_i)
00000001   00000000     10000000   00000111
00000010   00000001     10000001   01110000
00000011   00011001     10000010   11000000
...        ...          ...        ...
01111101   11110011     11111101   01010000
01111110   10100111     11111110   01011000
01111111   01010111     11111111   10011111

2. V(.) Transform Definition:
For a field element B given by its power representation p_i,
the transform V(p_i) gives the vector representation v_i of
B, i.e.

$$V(p_i) = v_i$$

The V(.) transforms of all the field elements of GF(2^8) are
given in Table 6.2.
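As an illustration of how the two transforms can be tabulated and used, the following Python sketch builds both tables and multiplies two field elements by adding their power representations. The primitive polynomial x^8 + x^4 + x^3 + x^2 + 1 (11DH) is an assumption; it is not stated in this section, but it reproduces legible entries of Table 6.1 such as P(00000011) = 00011001. The names `build_transforms` and `multiply` are hypothetical.

```python
PRIMITIVE_POLY = 0x11D  # ASSUMED primitive polynomial: x^8 + x^4 + x^3 + x^2 + 1


def build_transforms(poly=PRIMITIVE_POLY):
    """Return (P, V): P maps a nonzero vector form to its power, V maps a power to its vector form."""
    P = {}
    V = [0] * 255
    v = 1                          # alpha^0
    for p in range(255):
        V[p] = v
        P[v] = p
        v <<= 1                    # multiply by alpha
        if v & 0x100:              # reduce modulo the primitive polynomial
            v ^= poly
    return P, V


P, V = build_transforms()


def multiply(a, b):
    """Multiply two field elements by adding their power representations."""
    if a == 0 or b == 0:
        return 0                   # zero has no power representation
    s = P[a] + P[b]
    if s > 254:                    # alpha^255 = 1, so subtract 255
        s -= 255
    return V[s]
```

A lookup table of this kind trades 2 x 255 bytes of EPROM for the cost of a bit-serial multiplier, which suits a software implementation on an 8-bit processor.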
6.3.3.3 Main Program Description:
The flow chart of this program is shown in Figure 6.14.
This program is stored starting at location 0000H. When the system
is reset, the program counter is forced to 0000H and the
Intel 8085 starts executing the main program. At the
beginning of the program execution, the program checks whether
there is an error by checking all the syndrome components.
If these components are all zeros then no error is
detected and L_u is set to zero. If any of the syndrome
components is nonzero, the program starts the
system initialization. The system initialization consists
of the following:
1. Reset the locations σ_1^(u) - σ_5^(u) and σ_1^(p) - σ_5^(p) to zeros,
2. Set the locations σ_0^(p) = 1, σ_0^(u) = 1 and d_p = 1,
3. Reset the locations u, h_u and L_u to zeros,
4. Set the locations h_p and p to -1, and
5. Set the location d_u to S_1.
The system then calls the Sigma routine to calculate
σ^(u+1)(X) and stores the resulting coefficients
in the σ^(u+1)(X) array. The value of h_u is
then calculated and checked to see if it is greater than h_p. If
so, the content of the σ^(u)(X) array is transferred to the
σ^(p)(X) array and the parameters u, d_u and h_u are
transferred to p, d_p and h_p respectively. If h_u is less
than h_p then σ^(p)(X) does not change. σ^(u+1)(X) is
then transferred to σ^(u)(X).
The degree of σ^(u)(X) is checked. If it is larger than
the number of correctable errors (5), an uncorrectable error
flag is set. If not, the next step of the iteration is done by
incrementing the iteration step number and calculating the
uth discrepancy (d_u) by calling the Discrepancy routine.
If d_u is zero then σ^(u+1)(X) is the same as σ^(u)(X). If
d_u is not equal to zero, then σ^(u+1)(X) is
calculated by the Sigma routine. This continues until
the number of iterations u is equal to 10.
At the end of executing this program, the microprocessor
enters a continuous loop, waiting to be reset by the
overall controller.

Table 6.2 V-Transform of GF(2^8) Elements

p_i        V(p_i)       p_i        V(p_i)
00000000   00000001     10000000   10000101
00000001   00000010     10000001   00010111
00000011   00001000     10000010   00101110
...        ...          ...        ...
01111101   10010011     11111101   10101101
01111110   01100110     11111110   10001110
01111111   11001100     11111111   00000001

Figure 6.14 Main Program Flow Chart
6.3.3.4 The Sigma Routine:
This routine calculates the coefficients of σ^(u+1)(X)
using the current value of the sigma polynomial σ^(u)(X),
the discrepancy d_u, σ^(p)(X) and d_p, where p < u and
σ^(p)(X) and d_p are the polynomial and its discrepancy of
the pth step. This pth step is selected such that d_p ≠ 0 and
h_p has the largest value among the steps prior to the uth
step. σ^(u+1)(X) is given by:

$$\sigma^{(u+1)}(X) = \sigma^{(u)}(X) + d_u d_p^{-1} X^{(u-p)} \sigma^{(p)}(X)$$

The flow chart for this subroutine is shown in Figure 6.15.
Three temporary locations (TEMP1 - TEMP3) are used to store the
intermediate results. The multiplication
of d_u by d_p^{-1} is done by using the P-transform for both
elements. The result is stored in the power representation
form in TEMP2. To calculate the term d_u d_p^{-1} σ^(p)(X), each
of the coefficients σ_i^(p) of σ^(p)(X) is multiplied by d_u
d_p^{-1}, which is done by the addition of the power
representations of the two terms. Then multiplying this
term by X^(u-p) is equivalent to shifting these new
coefficients (u-p) times to the right. The addition of the
result of this shifting operation to the coefficients of
σ^(u)(X) yields the coefficients of σ^(u+1)(X). This result
is stored in the σ^(u+1)(X) array.

Figure 6.15 The Sigma Routine Flow Chart

6.3.3.5 The Discrepancy Routine:
This routine calculates the value of the uth discrepancy d_u
using the coefficients of σ^(u)(X) and the syndrome
components S_i. The uth discrepancy is given by the
following equation:

$$d_u = S_{u+1} + \sigma_1^{(u)} S_u + \sigma_2^{(u)} S_{u-1} + \cdots + \sigma_{L_u}^{(u)} S_{u+1-L_u}$$

Two pointers, N and N1, are used in this routine. N is
used to point at the σ^(u)(X) coefficients and N1 is the
pointer for the syndrome components. The value of d_u can
be easily calculated by L_u multiplications and L_u
additions. Each of these multiplications is a multiplication of
two field elements, which is done by adding the
P-transforms of both elements. The result is then checked
to see if it is greater than 254. If it is, the power
of the multiplication result is adjusted by
subtracting 255 (since α^255 = 1, α^(i+255) = α^i · α^255 =
α^i). The additions are done by adding the vector
representation forms of the two elements using modulo-2 addition.
The result of this routine is stored in d_u. The flow chart
of this routine is shown in Figure 6.16.
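The interplay of the main program, the Sigma routine and the Discrepancy routine can be sketched in Python as follows. This is a minimal sketch, not the 8085 program: it assumes the primitive polynomial 11DH for the field tables, keeps the syndromes in a list indexed from 1, uses arrays of 2t+1 locations rather than the memory layout of Figure 6.13, and reports the uncorrectable condition once at the end instead of after every step. All function and variable names are hypothetical.

```python
def make_field(poly=0x11D):        # ASSUMED primitive polynomial
    exp, log = [0] * 255, [0] * 256
    v = 1
    for p in range(255):
        exp[p], log[v] = v, p
        v <<= 1
        if v & 0x100:
            v ^= poly
    return exp, log


EXP, LOG = make_field()


def gf_mul(a, b):
    """Multiply by adding powers modulo 255."""
    return 0 if a == 0 or b == 0 else EXP[(LOG[a] + LOG[b]) % 255]


def gf_div(a, b):
    """Divide by subtracting powers modulo 255 (b must be nonzero)."""
    return 0 if a == 0 else EXP[(LOG[a] - LOG[b]) % 255]


def discrepancy(sigma_u, L_u, S, u):
    """d_u = S[u+1] + sum over j = 1..L_u of sigma_u[j] * S[u+1-j]."""
    d = S[u + 1]
    for j in range(1, L_u + 1):
        d ^= gf_mul(sigma_u[j], S[u + 1 - j])
    return d


def sigma_polynomial(S, t=5):
    """Return (sigma, L_u, uncorrectable) from syndromes S[1..2t] (S[0] unused)."""
    size = 2 * t + 1
    sigma_u = [1] + [0] * (size - 1)
    sigma_p = list(sigma_u)
    L_u, L_p, p, d_p = 0, 0, -1, 1          # h_p = p - L_p = -1 initially
    for u in range(2 * t):
        d_u = discrepancy(sigma_u, L_u, S, u)
        if d_u == 0:
            continue                         # sigma^(u+1) equals sigma^(u)
        scale = gf_div(d_u, d_p)             # d_u * d_p^-1
        shift = u - p
        sigma_next = list(sigma_u)
        for j in range(size - shift):        # add scale * X^shift * sigma^(p)
            sigma_next[j + shift] ^= gf_mul(scale, sigma_p[j])
        L_next = max(L_u, L_p + shift)
        if u - L_u > p - L_p:                # h_u > h_p: step u becomes the new step p
            sigma_p, L_p, p, d_p = sigma_u, L_u, u, d_u
        sigma_u, L_u = sigma_next, L_next
    return sigma_u, L_u, L_u > t
```

For two errors at locations α^3 and α^10, the sketch returns σ(X) = (1 + α^3 X)(1 + α^10 X) with L_u = 2, whatever the (nonzero) error magnitudes are.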
Table 6.3 The Sigma Processor IC Parts List

Element Type       IC Type Number   Number of Elements   Number of IC Chips
Microprocessor     Intel 8085A              1                    1
EPROM              Intel 2716               1                    1
RAM                Intel 2114               2                    2
Latch register     SN74S373                17                   17
Transceiver        AM 2947                  1                    1
Octal buffer       SN74LS244                1                    1
Tri-state buffer   SN74367                 80                   10
3-input NAND       SN74S10                 18                    6

All of the elements required to construct this processor
are listed in Table 6.3. From this table, the total number
of IC chips required is 39.
The exact time delay of this processor is a function of the
number of executed software instructions and the clock
pulse frequency provided by the overall controller.

Figure 6.16 The Discrepancy Routine Flow Chart
CHAPTER VII
ERROR-LOCATION PROCESSOR
7.1 Introduction:
This is the third processor in the decoder pipelined
structure. Its function is to determine the roots of the
error-location polynomial σ(X). The reciprocals of these
roots are the error-location numbers. The inputs to this
processor are the coefficients of the error-location
polynomial and the number of errors L_u. It generates the
roots of the error-location polynomial as an input to the
following pipeline stage, the Error-Magnitude processor.
Since the first five syndrome components are also needed as
an input to the Error-Magnitude processor, they are stored
in the Error-Location processor and transmitted to it, as
shown in Figure 7.1.
The error location polynomial σ(X) has the following form:

$$\sigma(X) = 1 + \sigma_1 X + \sigma_2 X^2 + \cdots + \sigma_t X^t$$

The roots of σ(X) can be found by substituting all the
non-zero field elements of GF(2^m) into σ(X). If σ(α^i) = 0
then α^i is a root of σ(X).
Since there are (2^m-1) non-zero field elements and the
degree of σ(X) is t, a maximum of (2^m-1)t additions
and (2^m-1)t multiplications are required to find the roots
of σ(X). This procedure would be very slow to implement.
However, Chien's search algorithm finds these roots in
only 2^m-1 clock pulses. This is much faster than
the direct method. Therefore, Chien's search algorithm is
used in this processor.

Figure 7.1 Error-Location Processor Inputs and Outputs
7.2 Error-Location Processor Design:
The block diagram of this processor is illustrated in
Figure 7.2. It consists of four parts: the Root Locator,
the Stack register, two counters and a control circuit.
7.2.1 The Root Locator Design:
The Root Locator circuit diagram is shown in Figure 7.3.
This Root Locator uses Chien's search algorithm to find the
roots of the error location polynomial σ(X). In general,
it requires t multipliers to multiply by α, α^2, ..., α^t.
For the (255,245) code, only five multipliers are required
to multiply by α, α^2, ..., α^5.
Initially, the sigma registers are loaded with the
coefficients of the error-location polynomial generated by
the Sigma processor. Then, the registers are clocked 255
times. At the end of the ith clock pulse, the registers
contain σ_1 α^i, σ_2 α^(2i), ..., σ_5 α^(5i), and the output of the
XOR tree is the sum of these values, which, with the constant
term 1 included (this term is accounted for by inverting the
least significant bit of the XOR tree output), is:

$$\sigma(\alpha^i) = 1 + \sigma_1 \alpha^{i} + \sigma_2 \alpha^{2i} + \cdots + \sigma_5 \alpha^{5i}$$

If this sum is zero, then α^i is a root of σ(X), and α^(255-i)
is an error location number. If σ(α^i) ≠ 0, then α^i is not
a root of σ(X). To check if the sum is zero, an 8-input
NOR gate is used to NOR all 8 outputs of the XOR tree.
If the output is 1 then σ(α^i) = 0.
This circuit can be implemented using five multipliers
similar to the first five multipliers used in the Syndrome
processor designed in Chapter V. The Sigma registers are a
set of 8-bit shift registers.

Figure 7.2 Error-Locator Processor Block Diagram

Figure 7.3 Root-Locator Circuit Diagram
7.2.2 The Counters Design:
In this processor there are two counters. The first counter,
C1, is an up binary counter which is used to count the
circuit clock pulses. At any time, the content of this
counter is also the power representation form of the field
element that has been tested to check if it is a root of
σ(X) or not. The second counter, C2, is a down binary
counter. Initially, this counter is loaded with the number
of errors L_u. When a root of σ(X) is detected, this
counter is decremented by enabling its clock input using
the NOR gate output. When this counter reaches zero, it
indicates that all the roots of σ(X) have been detected
and the circuit clock is disabled. The content of this
counter is tested by a 3-input NOR gate.
7.2.3 The Stack Register Design:
There are five registers arranged in the form of a stack (the
outputs of one register are the inputs of the following
one). These registers are used to store the roots of the σ(X)
polynomial in their power representation forms. Each time a
root is detected the content of C1 is loaded into the top
location of the stack and the rest of the stack content is
pushed one place down. This is done by enabling the stack clock
pulse input using the output of the NOR gate which
monitors the output of the XOR tree. At the end of the
process, this stack holds all the roots of σ(X).
7.2.4 System Control Design:
There are a few signals used to control this processor. The
reset signal RS is set by the overall controller. This
signal is used to initialize the processor. Signal M2, when
it is zero, indicates that all the roots of σ(X) have been
located. When it is not zero, it indicates that not all the
roots of σ(X) have been located yet. Signal M1, when
set, indicates that 255 clocks have been received, i.e. all
the field elements of GF(2^8) have already been tested
to see if they are roots of σ(X). If this occurs before
all the roots of σ(X) are found, an uncorrectable error is
assumed. This signal is necessary because occasionally
σ(X) cannot be factorized into the form (1+B_1 X)(1+B_2 X)
... (1+B_5 X). This occurs when some of its roots are not
in the field GF(2^8). If this happens, an error of
weight more than 5 is assumed. The M2 signal is then used
to set an alarm flag to indicate such a condition.
7.3 System Operation:
At the beginning of each processing cycle, the processor
receives a reset pulse from the overall controller on the
RS line. This pulse is used to initialize the processor.
During the initialization:
- the coefficients of the error-location polynomial are
loaded into the Sigma registers. This is done by selecting
the first input of each of the 2-to-1 multiplexers
connected to the σ_i's coming out of the Sigma processor,
- C1 is reset to zero,
- C2 is loaded with the number of errors L_u,
- Flip-Flop Q is reset to zero to enable the system clock,
and
- Flip-Flop F is set to one to indicate that the process of
determining the roots of σ(X) is in progress.
The processor clock is controlled by the contents of C1 and
C2. If C2 reaches 0 first, this indicates that the roots of
σ(X) have been found; the processor clock pulse is then
disabled and F is set to 1, indicating a correctable error.
If C1 reaches 255 first, this indicates that an
uncorrectable error is detected; in this case F is reset to
0 and the processor clock pulse is disabled. Whenever a
root is detected its value is loaded into the stack
register and C2 is decremented.
This processor can be implemented using Schottky TTL
ICs. The five multipliers are implemented using 41 XOR
gates which need 6 Quad 2-input XOR SN74S86s. This gate has
a 7 ns time delay. The Sigma registers are implemented using
5 octal D-type Edge-Triggered Flip Flops, SN74S373s, with a
10 ns time delay. The XOR tree circuit has 5x8 inputs and 8
outputs and can be constructed using 32 XOR gates. The
least significant bit of the XOR tree output is inverted,
to account for the addition of 1. The XOR tree output is
then connected to an 8-input NOR gate. This NOR gate can be
constructed using four 2-input OR gates followed by a 4-input
NOR gate, which are implemented using the SN74S32 and SN74S40
and have a total time delay of 8 ns. The two counters C1
and C2 are implemented using three 4-bit counters, SN74S161.
The stack register consists of five octal D-type
Edge-Triggered Flip Flops, SN74S374s. The control circuit
requires: four 2-input AND gates, a 2-input OR gate, an
8-input AND gate, and a 3-input NOR gate. These gates can
be implemented using: SN74S08, SN7427 and SN7421. Eleven
8-bit registers are required for the storage of the Sigma
coefficients, the syndrome components and the number of
detected errors. This makes the total number of IC chips
required for the implementation of this processor equal to 37.
This processor needs a maximum of 256 clock pulses to
determine the roots of the error location polynomial. The
clock pulse duration should be 57 ns minimum, the total
time delay of the root locator circuit.
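The operation of the Root Locator together with the two counters and the stack can be modelled in software. The Python sketch below is a behavioural illustration only, assuming the primitive polynomial 11DH; `chien_search` and its return values are hypothetical names, and the returned flag plays the role of F (True for a correctable pattern).

```python
def make_field(poly=0x11D):        # ASSUMED primitive polynomial
    exp, log = [0] * 255, [0] * 256
    v = 1
    for p in range(255):
        exp[p], log[v] = v, p
        v <<= 1
        if v & 0x100:
            v ^= poly
    return exp, log


EXP, LOG = make_field()


def gf_mul(a, b):
    return 0 if a == 0 or b == 0 else EXP[(LOG[a] + LOG[b]) % 255]


def chien_search(sigma, L_u):
    """sigma = [sigma_1, ..., sigma_5]; return (stack, correctable).

    stack holds the powers i of the roots alpha^i, most recent root on top.
    """
    regs = list(sigma)             # Sigma registers loaded at reset
    c1, c2 = 0, L_u                # C1 counts clocks, C2 counts roots still missing
    stack = []
    while c2 != 0 and c1 < 255:
        c1 += 1
        # register j (1-based) is multiplied by alpha^j on every clock
        regs = [gf_mul(r, EXP[j + 1]) for j, r in enumerate(regs)]
        total = 1                  # constant term of sigma(X)
        for r in regs:
            total ^= r             # XOR tree
        if total == 0:             # NOR gate output is 1: alpha^c1 is a root
            stack.insert(0, c1)    # push C1 onto the stack
            c2 -= 1
    return stack, c2 == 0
```

For σ(X) = (1 + α^3 X)(1 + α^10 X) the roots are found at clocks 245 and 252, so the error location numbers α^(255-i) are α^10 and α^3.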
CHAPTER VIII
ERROR-MAGNITUDE PROCESSOR
8.1 Introduction:
This is the fourth processor in the decoder pipelined
structure. It computes the magnitudes of the errors at all
given error locations. The inputs to this processor are:
the coefficients of the error-location polynomial, the
roots of this polynomial, the number of errors and the
first five syndrome components. The outputs of this
processor are: the error location numbers and the
magnitudes of the errors, as shown in Figure 8.1. These
outputs are transmitted to the Error-Correction processor.
The procedure to calculate the magnitudes of the errors was
devised by Forney and Berlekamp [3]. In this procedure, the
error evaluation polynomial Z(X) is defined as:

$$Z(X) = 1 + (S_1 + \sigma_1) X + (S_2 + \sigma_1 S_1 + \sigma_2) X^2 + \cdots + (S_{L_u} + \sigma_1 S_{L_u-1} + \cdots + \sigma_{L_u}) X^{L_u}$$

then, the magnitude of the error in location B_L is given
by:

$$e_L = \frac{Z(B_L^{-1})}{\prod_{i=1,\, i \neq L}^{L_u} (1 + B_i B_L^{-1})}$$

Figure 8.1 Error-Magnitude Processor Inputs and Outputs

8.2 Error-Magnitude Processor Hardware Design:
To compute the magnitude of the error, a microprocessor-based
system similar to the one used in Chapter VI has been
designed. The detailed design of this system is described
in Chapter VI, and the detailed circuit diagram is shown in
Figure 6.3. The only difference in the hardware design
between the two processors is the Input/Output ports, which
are discussed in the following section.
8.2.1 Input/Output Ports:
There are 16 input ports and 10 output ports. The input
ports are used to interface the Error-Location processor to
the Error-Magnitude processor. The output ports are used to
interface the Error-Magnitude processor to the
Error-Correction processor. All the input and output ports
are implemented using 26 SN74S373s. To address one of the 16
input ports, four address lines A0, A1, A2, A3 are fed to a
4/16 binary decoder (74S138) to output 16 different lines.
Each of these lines is used with A15 and RD to completely
select an input port. The input port selector
implementation is shown in Figure 8.2. The outputs of each
input port are connected to the data bus through a set of
tri-state buffers which are enabled when the port is
selected. The first five syndrome components (S_1 - S_5) are
received through the first five input ports. The
coefficients of the error-location polynomial are received
through the next five input ports. The following five ports
are used for the roots of the error-location polynomial,
while the last input port is used for the number of
detected errors, L_u.
Each of the ten output ports is selected by WE, A15 and
one of the address lines A0-A9 as shown in Figure 8.3. When
any of these output ports is selected, the data byte
present on the data bus is latched into it. These output
ports are used to send the error location numbers and the
magnitudes of the errors to the Error-Correction processor.

Figure 8.2 Input Ports Configuration

Figure 8.3 Output Ports Configuration
8.3 Error-Magnitude Processor Software Description:
The algorithm to compute the magnitudes of the errors is
executed by the microprocessor-based system under the
control of the software control program. This program
consists of four main routines: a main program and three
routines. These routines are: a routine to calculate Z(X),
a routine to evaluate Z(X_c) and a routine to calculate the
product term Π(1 + B_i B_c^{-1}). The data structure and each of
these routines are described in the following sections.
8.3.1 Data Structure Description:
The data structure consists of 6 arrays; the size of each
array is five locations. The first array is the syndrome
array which contains the first five syndrome components
needed to calculate Z(X). The second is the Sigma array
which contains the coefficients of the error-location
polynomial, the σ_i's. The third is the X-array which contains
the roots of the error-location polynomial in the power
representation form. The fourth array is the Z-array which
contains the values of the coefficients of the error
evaluation polynomial Z(X). The fifth and the sixth are the
B-array and E-array which contain the error location
numbers (B_i's) and the corresponding magnitudes of the
errors (e_i's), respectively. The block diagram of the data
structure is shown in Figure 8.4.

Figure 8.4 Error-Magnitude Processor Data Structure

8.3.2 Main Program Description:
The flow chart of this program is shown in Figure 8.5. This
program is stored in the ROM starting at location 0000H.
When the processor is reset, the program counter is forced
to 0000H and the processor starts executing this program.
The first step in this program is to determine the
coefficients of the error evaluation polynomial Z(X) and
store them in the Z-array. Next, the program checks the
number of errors. If it is less than five, the processor
sets the locations that correspond to no-error in the error
location array to FFH and the corresponding locations in
the error magnitude array to 00H. This is to facilitate the
function of the Error-Correction processor and to make it
independent of the number of detected errors. The program
then determines the error location numbers B_i using the
roots X_i of the error location polynomial and stores them
in the corresponding locations of the B-array. For each
error location number, the program generates an error
magnitude e_i and stores it in the corresponding location of
the E-array. This is done by executing the Z(X_c) routine
followed by the Product routine. Dividing the results of
both routines yields the error magnitude. This process is
repeated for each of the error locations.
At the end of executing this program the microprocessor
enters a continuous loop, waiting to be reset by the
overall controller.

Figure 8.5 Main Program Flow Chart
8.3.3 The Error Evaluation Polynomial Routine:
The flow chart of this routine is shown in Figure 8.6. This
routine has the first five syndrome components and the
coefficients of the error location polynomial as inputs and
generates the coefficients of Z(X) as an output. These
Z(X) coefficients are stored in the Z-array.
Since any coefficient z_i of the Z(X) polynomial is given
by:

$$z_i = S_i + \sigma_1 S_{i-1} + \sigma_2 S_{i-2} + \cdots + \sigma_{i-1} S_1 + \sigma_i$$

each coefficient of Z(X) is calculated using two
pointers C and C1 which are set to point to the syndrome
component number and to the Sigma coefficient number,
respectively. The ith syndrome component is moved to the Z_i
location, then the pointer C is decremented and pointer C1
is incremented. The product of the Sigma coefficient addressed
by C1 and the syndrome component addressed by C is then computed and
added to Z_i. This process continues until the pointer C
reaches a zero value. Then σ_i is added to Z_i. This
procedure is repeated for each value of i where 1 ≤ i ≤ L_u.

Figure 8.6 Error Evaluation Polynomial Routine Flow Chart
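The two-pointer procedure can be illustrated with a short Python sketch. It assumes the primitive polynomial 11DH, keeps S and σ in lists indexed from 1 (with σ_0 = 1 in position 0), and uses the hypothetical name `z_coefficients`; it is meant as a reading aid rather than a transcription of the 8085 routine.

```python
def make_field(poly=0x11D):        # ASSUMED primitive polynomial
    exp, log = [0] * 255, [0] * 256
    v = 1
    for p in range(255):
        exp[p], log[v] = v, p
        v <<= 1
        if v & 0x100:
            v ^= poly
    return exp, log


EXP, LOG = make_field()


def gf_mul(a, b):
    return 0 if a == 0 or b == 0 else EXP[(LOG[a] + LOG[b]) % 255]


def z_coefficients(S, sigma, L_u):
    """Return [1, z_1, ..., z_Lu] with z_i = S_i + sigma_1 S_(i-1) + ... + sigma_i."""
    z = [1] + [0] * L_u
    for i in range(1, L_u + 1):
        zi = S[i]                  # move the ith syndrome component to Z_i
        c, c1 = i - 1, 1           # C decremented, C1 incremented
        while c >= 1:              # until pointer C reaches zero
            zi ^= gf_mul(sigma[c1], S[c])
            c -= 1
            c1 += 1
        zi ^= sigma[i]             # finally add sigma_i
        z[i] = zi
    return z
```

For two errors with magnitudes e_1, e_2 at locations B_1, B_2, expanding the definition gives z_2 = (e_1 + e_2 + 1) B_1 B_2, which is a convenient hand check of the routine.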
8.3.4 The Z(X_c) Routine:
This routine calculates the value of the error evaluation
polynomial Z(X) for each root X_c of the error location
polynomial. The value of Z(X_c) is stored in a temporary
location TEMP1.
Since Z(X) is given by:

$$Z(X) = 1 + z_1 X + z_2 X^2 + \cdots + z_i X^i$$

where 1 ≤ i ≤ L_u, it can be rearranged as:

$$Z(X) = 1 + X\left(z_1 + X\left(z_2 + \cdots + X\left(z_{i-1} + X z_i\right)\cdots\right)\right)$$

To calculate Z(X_c), the value of z_i is first transferred to
TEMP1. The content of TEMP1 is multiplied by X_c. The result
is added to z_(i-1) and stored in TEMP1. This process of
multiplication followed by addition is repeated until i=1.
The value of TEMP1 is then multiplied by X_c and the result
is added to 1 using modulo-2 addition. The flow chart of this
routine is shown in Figure 8.7.
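The nested evaluation is Horner's rule carried out in GF(2^8). A minimal Python sketch follows, assuming the primitive polynomial 11DH and a coefficient list of the form [1, z_1, ..., z_Lu]; the name `evaluate_z` and the argument `xc_power` (the power representation of X_c) are hypothetical.

```python
def make_field(poly=0x11D):        # ASSUMED primitive polynomial
    exp, log = [0] * 255, [0] * 256
    v = 1
    for p in range(255):
        exp[p], log[v] = v, p
        v <<= 1
        if v & 0x100:
            v ^= poly
    return exp, log


EXP, LOG = make_field()


def gf_mul(a, b):
    return 0 if a == 0 or b == 0 else EXP[(LOG[a] + LOG[b]) % 255]


def evaluate_z(z, xc_power):
    """Evaluate Z(X) = 1 + z[1] X + ... + z[L] X^L at X_c = alpha^xc_power."""
    if len(z) == 1:
        return 1                       # no error terms: Z(X) = 1
    xc = EXP[xc_power % 255]
    temp1 = z[-1]                      # start with the highest coefficient
    for i in range(len(z) - 2, 0, -1):
        temp1 = gf_mul(temp1, xc) ^ z[i]   # multiply by X_c, add z_(i-1)
    return gf_mul(temp1, xc) ^ 1       # final multiply, then add 1 (modulo 2)
```

The loop performs L_u multiplications and L_u additions, against roughly twice as many multiplications if each power of X_c were formed separately.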
Figure 8.7 The Z(X_c) Routine Flow Chart

8.3.5 The Product Routine:
This routine computes the value of the product term PT
given by:

$$PT = \prod_{i=1,\, i \neq c}^{L_u} (1 + B_i B_c^{-1})$$

and stores the result in a temporary location TEMP2. For a
given error location number B_c, the term (B_i B_c^{-1}) is
calculated, added to 1 and the result is stored in location
TEMP1. This process is repeated for each value of i where
1 ≤ i ≤ L_u and i ≠ c. The flow chart of this routine is
shown in Figure 8.8.
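The product term, and the final division that yields the error magnitude, can be sketched in Python as below. The sketch assumes the primitive polynomial 11DH and takes the error location numbers in power representation; `product_term` and `error_magnitude` are hypothetical names, and list index c is 0-based here.

```python
def make_field(poly=0x11D):        # ASSUMED primitive polynomial
    exp, log = [0] * 255, [0] * 256
    v = 1
    for p in range(255):
        exp[p], log[v] = v, p
        v <<= 1
        if v & 0x100:
            v ^= poly
    return exp, log


EXP, LOG = make_field()


def gf_mul(a, b):
    return 0 if a == 0 or b == 0 else EXP[(LOG[a] + LOG[b]) % 255]


def gf_div(a, b):
    """Divide by subtracting powers modulo 255 (b must be nonzero)."""
    return 0 if a == 0 else EXP[(LOG[a] - LOG[b]) % 255]


def product_term(b_powers, c):
    """PT = product over i != c of (1 + B_i * B_c^-1), with B_i = alpha^b_powers[i]."""
    pt = 1
    for i, b in enumerate(b_powers):
        if i == c:
            continue
        m = EXP[(b - b_powers[c]) % 255] ^ 1   # B_i * B_c^-1, then add 1
        pt = gf_mul(pt, m)
    return pt


def error_magnitude(z_at_xc, pt):
    """e_c = Z(X_c) / PT."""
    return gf_div(z_at_xc, pt)
```

Because the error location numbers are distinct, every factor (1 + B_i B_c^{-1}) is nonzero, so the division is always defined.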
Table 8.1 The Error-Magnitude Processor IC Part List

Element Type       IC Type Number   Number of Elements   Number of IC Chips
Microprocessor     Intel 8085A              1                    1
EPROM              Intel 2716               1                    1
RAM                Intel 2114               2                    2
Transceiver        AM 2947                  1                    1
Octal buffer       SN74LS244                1                    1
Latch register     SN74373                 27                   27
Tri-state buffer   SN74367                128                   16
4/16 decoder       SN74154                  1                    1
3-input NAND       SN74S10                 28                   10

All the IC parts required to implement the Error-Magnitude
processor are listed in Table 8.1. From this table, the
total number of IC chips required is 60.

Figure 8.8 The Product Routine Flow Chart

The exact time delay of this processor is a function of the
number of executed software instructions as well as the
clock pulse frequency provided by the overall controller.
CBAPTER l.X.
~
ERROR-CQRRECTION PROCESSOR ANn
xa& OVERALL QQNTRQLLER
~
Xhe
~rror-Correction
Processor Design:
is the final stage o:t the
decoder pipe lined structure. The ~nputs to this processor
are: the five error location numbers, the five magnitudes
of errors and the received vector r(X). The output ot this
processor is the corrected code word V(X). The error
numbers
and
the
magnitudes
of
errors
are
location
trasferred from the Error-Magnltude processor while r(X) is
read out ot the Queue buffer.
The
Error-Correction processor
This processor consists of: two 5-location stack registers
B and E, a counter C, a comparator COMP, an adder and
control gates as shown in Figure 9.1. When a reset pulse,
RS, is received from the overall controller, the output of
the Error-Magnitude processor, which represents the five
error location numbers and the corresponding five
magnitudes of errors, is latched into the B and E stack
registers. At the same time the counter C is reset to zero.
It may be recalled that the Error-Location processor began
its root search from $\alpha^{0}$ to $\alpha^{254}$; therefore, the error
location numbers are in the proper sequence for correction.
The corresponding magnitudes of errors are stored in the E
stack in the same sequence.
The error correction procedure is described as follows:
with each clock pulse received, a symbol is read from the
Queue buffer unit (in order) and the counter is
decremented. The output of this counter is compared with
the output of the top location of the B stack. If they
match (they are equal), then the current symbol, ri, read
out of the buffer is corrupted. To correct it, the output
of the top location of the E stack is enabled and added to
it to form the correct symbol vi. The contents of the B and
E stacks are then popped one location up such that the
error location number and the corresponding magnitude that
have just been used are shifted out. The bottom location of
the B stack will be loaded with binary 255 (FFH) and that of
the E stack will be loaded with 0. If the contents of the
counter and the output of the top location of the B stack do
not match, then the symbol ri is correct and is shifted out
unchanged.
Figure 9.1 Error-Correction Processor Circuit Diagram
This process will continue for 255 pulses. Therefore the
maximum value the counter C reaches is 254. This is the
reason why the no-error locations of the B stack are loaded
with binary 255 (FFH), which causes no match between the
counter output and this value at any time.
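The correction procedure above can be modelled in a few lines. This is only a behavioural sketch under stated assumptions, not the hardware itself: the error location numbers are assumed to be given as the integer counter values at which the corrupted symbols are read, in ascending order, and the counter is assumed to take the values 0 through 254, one per symbol. The function name correct_block and its parameters are hypothetical. Addition of GF(2^8) symbols is a bitwise XOR, matching the XOR-gate adder.

```python
def correct_block(received, locations, magnitudes, depth=5, filler=0xFF):
    """Return the corrected block for one received vector.

    received   -- list of 8-bit symbols in the order they leave the buffer
    locations  -- counter values of the corrupted symbols (ASSUMED ascending)
    magnitudes -- error magnitudes, in the same order as `locations`
    """
    # Unused stack locations hold FFH (B stack) and 0 (E stack).
    b_stack = list(locations) + [filler] * (depth - len(locations))
    e_stack = list(magnitudes) + [0] * (depth - len(magnitudes))

    corrected = []
    for counter, symbol in enumerate(received):
        if counter == b_stack[0]:
            # Match: add the magnitude on top of the E stack to the symbol.
            symbol ^= e_stack[0]
            # Pop both stacks and refill the bottom locations.
            b_stack = b_stack[1:] + [filler]
            e_stack = e_stack[1:] + [0]
        corrected.append(symbol)
    return corrected
```

Because the counter never reaches 255 within a 255-symbol block, the FFH filler entries can never produce a match, which is the property the text relies on.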
This processor is implemented using Schottky TTL ICs. The
comparator is constructed using two 4-bit magnitude
comparators, SN74S85s, which have an 18 ns time delay. The
counter is constructed using two 4-bit binary counters,
SN74S161s. The registers are implemented using 10 octal
D-type Flip Flops, SN74S374s. The adder is implemented
using two Quad 2-input XOR, SN74S86s. Three SN74S08s are
used for the control AND gates.
The total chip count required to implement this processor
is 19. This processor requires 255 clock pulses to correct
and read one data block out.
9.2 The Queue Buffer:
This is the buffer where the five received vectors which
are currently being decoded are stored. It is organized in
the form of a queue, i.e. the first vector to get in is the
first vector to get out (FIFO). As discussed in Chapter 3,
the size of this buffer is (5x255)x8, and it is divided into
five units of 255x8 each, as shown in Figure 9.2.
There are two pointers, RDP and WRP. The pointer RDP always
points at the top unit of the queue buffer, Brd, from which
the vector being corrected and transmitted out is read,
while the pointer WRP always points at the bottom unit of
the queue, Bwr, into which the newly received vector is
written. The read and write operations from and into these
two units are synchronized (have the same clock pulse
signal); therefore only one address counter is required.
To select the Brd buffer unit, the output of the RDP
pointer is decoded using a 3/8 binary decoder. The outputs
of this decoder control a set of tri-state buffers such that
only the output of the Brd unit is enabled to the buffer
output data bus. To select the Bwr buffer unit, the output
of the WRP pointer is also decoded using another 3/8 binary
decoder. The outputs of this decoder control a set of
tri-state buffers such that the input data bus is enabled
to the Bwr data inputs only.
For every clock pulse received, one data byte is read out
of the Brd unit and one data byte is written into the Bwr
unit. At the same time the address counter is incremented.
After completing the read and write operations of whole
data blocks, the buffer is reset by the RS control signal
provided by the overall controller. This signal will clear
the address counter, and increment the two pointers RDP and
WRP to be ready for a new read or write operation
respectively. Each of these two pointers is a 3-bit up
counter designed to count up from zero to four, and then
back to zero. The WRP pointer is always one unit ahead of
the RDP, i.e. if WRP contains 4 then RDP will contain 3.
Figure 9.2 The Queue Buffer Circuit Diagram
The buffer requires: five (256x8) RAMs, 8x10 tri-state
buffers, two (3/8) binary decoders, one 8-bit up binary
counter and two 3-bit counters.
9.3 System Overall Controller:
The system overall controller controls all five of the
decoder's processors. This decoder system is a part of
either a digital communication or a data storage system,
the Host system. Therefore, the decoder system could be
controlled directly by the Host system overall controller,
or it could have its own overall controller which would be
controlled by the Host system overall controller.
Each of the decoder pipelined processors is designed such
that a minimum of external control is required. The only
control signals required for each processor are the clock
pulse and reset signals, as shown in Figure 9.3. The timing
diagram of these control signals is shown in Figure 9.4.
Since the received symbols are fed to the Syndrome
processor and the Queue buffer at the same time, they are
clocked using the same clock pulse signal, CP1. The
Error-Correction processor is correcting and reading data
at the same rate; therefore it is clocked using the same
clock pulse signal, CP1.
The Error-Location processor is clocked by the main clock
pulse signal, CP.
The Sigma and Error-Magnitude processors are running at a
much faster rate than any of the other processors;
therefore, they are clocked by a clock pulse signal that
has a frequency which is a multiple of the frequency of the
CP signal.
Figure 9.3 Overall Controller Control Signals
Figure 9.4 Control Signals Timing Diagram
The RS signal that is provided by the overall controller
has the following functions:
1- latches the data coming out of each intermediate
processor into the input ports of the following
processor,
2- resets all counters and control Flip Flops of all
the decoder processors and
3- resets the two 8085 microprocessors of the Sigma
and Error-Magnitude processors.
When the Sigma or the Error-Location processor detects an
uncorrectable error, it sets the F1 or F2 flag,
respectively. When either the F1 or the F2 flag is set, this
indicates to the overall controller that the data block
stored in the buffer unit (WRP-1) or (WRP-2), respectively,
is an uncorrectable corrupted data block. The action to be
taken next is up to the Host system.
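The mapping from the two flags to the affected buffer units can be sketched as follows. The text only gives the units as (WRP-1) and (WRP-2); the wrap-around modulo five units is an assumption made here because WRP counts from zero to four, and the name flagged_units is hypothetical.

```python
NUM_UNITS = 5  # the Queue buffer is divided into five units


def flagged_units(wrp, f1, f2, num_units=NUM_UNITS):
    """Return the buffer units holding uncorrectable blocks.

    wrp -- current value of the write pointer WRP (0..4)
    f1  -- flag set by the Sigma processor
    f2  -- flag set by the Error-Location processor
    """
    units = []
    if f1:
        units.append((wrp - 1) % num_units)  # ASSUMED modulo wrap-around
    if f2:
        units.append((wrp - 2) % num_units)  # ASSUMED modulo wrap-around
    return units
```

For example, with WRP at 0 and both flags set, the sketch reports units 4 and 3.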
CHAPTER X
CONCLUSION
One of the important parts in the design of modern digital
communication and data storage systems is the error
detecting and correcting system which is responsible for
the reliable recovery of data.
In this project, Reed-Solomon codes encoding and decoding
algorithms are first discussed. A complete logic circuit
design of the (255, 245) Reed-Solomon code encoder/decoder
microprocessor based system is then presented. This code is
defined over Galois Field $GF(2^8)$ and has the capability of
correcting up to five burst errors of 8 bits each or any
burst combination of up to a total length of 40 bits
provided they only affect a maximum of five individual
symbols (bytes).
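As a short worked check of these figures, the standard relation for an (n, k) Reed-Solomon code with minimum distance n - k + 1 can be applied. The symbols t and m are introduced here only for illustration: t is the number of correctable symbol errors and m the number of bits per symbol.

$$t = \left\lfloor \frac{n-k}{2} \right\rfloor = \left\lfloor \frac{255-245}{2} \right\rfloor = 5, \qquad t \cdot m = 5 \times 8 = 40 \text{ bits}.$$

A burst therefore remains correctable as long as the corrupted bits fall within at most five 8-bit symbols.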
Although a Reed-Solomon code with smaller dimensions
could have been selected, this code in particular has been
chosen because its symbol size matches the very commonly
used 8-bit data byte.
Alternative approaches to design Reed-Solomon encoders
based on linear feedback shift register implementation
have been investigated. Two-level ROM, one-level ROM, and
combinational logic circuit field elements multiplier
implementation have been discussed.
For better efficiency and higher speed, a five-stage
pipelined structured decoder has been designed. This
pipelined architecture utilizes the parallelism in the
decoding algorithms which consist of several distinct
steps. The decoder consists of five independently designed
processors. These processors are: the Syndrome, Sigma,
Error-Location, Error-Magnitude and Error-Correction
processors. The detailed design of each of these processors
is presented. Berlekamp's iterative algorithm is used to
determine the coefficients of the error location
polynomial, and Chien's search algorithm is used to find
its roots. The Sigma and Error-Magnitude processors are
designed using Intel 8085A microprocessors, while the other
processors are designed using off-the-shelf integrated
circuits.
The cost of implementing each processor has been expressed
in terms of the number of IC chips required to build it,
while the speed of each processor has been expressed in
terms of the total time delay.
Although this system has been designed for a code that has
a natural length of 255 symbols (2040 binary bits), this
code can easily be shortened to any length to match any
system specifications without any major change in the
encoder/decoder hardware circuits.
Although Reed-Solomon codes have been known for a long
time, very little literature is available. Perhaps this
paper could be the only available document that presents a
complete detailed design of a Reed-Solomon code
encoder/decoder system in one packet.
Besides the enormous experience the author gained while
reviewing literature and completing the design of the
system presented in this project, this design has shown
that with today's technology, it is feasible to design and
build Reed-Solomon codes encoder/decoder systems using
off-the-shelf integrated circuits.
Alternative approaches to design Reed-Solomon codes
encoder/decoder systems (for example, using special purpose
processors designed to handle Galois Field element
computations) [2], comparison of cost and performance of
these designs with the one presented, actual design
implementation, and putting this design in a form that suits
VLSI implementation remain as work for the future.
REFERENCES
1- Berlekamp, E. R., "On Decoding Binary Bose-Chaudhuri-Hocquenghem Codes", IEEE Trans. Inf. Theory, IT-11, pp. 577-580, October 1965.
2- Berlekamp, E. R., "Bit Serial Reed-Solomon Encoders", IEEE Trans. Inf. Theory, Vol. IT-28, pp. 869-874, November 1982.
3- Berlekamp, E. R., Algebraic Coding Theory, McGraw-Hill Book Company, New York, 1968.
4- Blahut, R. E., Theory and Practice of Error Control Codes, Addison-Wesley Publishing Company, Reading, Mass., 1983.
5- Chien, R. T., "Cyclic Decoding Procedure for Bose-Chaudhuri-Hocquenghem Codes", IEEE Trans. Inf. Theory, IT-10, pp. 357-363, October 1964.
6- El Naga, N. M., "An Error Detecting and Correcting System For Optical Memory", Proceedings of The International Conference of SPIE, Los Angeles, Calif., January 1982.
7- Forney, G. D., "On Decoding BCH Codes", IEEE Trans. Inf. Theory, IT-11, pp. 577-580, October 1965.
8- Hamming, R. W., "Error Detecting and Error Correcting Codes", Bell System Tech. J., 29, pp. 147-160, April 1950.
9- Intel, "MCS-8085 Family User's Manual", Intel Corporation, Santa Clara, CA, October 1979.
10- Lin, S., Costello, D. J., Jr., Error Control Coding: Fundamentals and Applications, Prentice-Hall, Englewood Cliffs, N.J., 1983.
11- Massey, J. L., "Step-by-Step Decoding of the Bose-Chaudhuri-Hocquenghem Codes", IEEE Trans. Inf. Theory, IT-11, pp. 580-585, October 1965.
12- Peterson, W. W., "Encoding and Error-Correcting Procedures for The Bose-Chaudhuri Codes", IRE Trans. Inf. Theory, IT-6, pp. 459-470, September 1960.
13- Peterson, W. W., and Weldon, E. J., Jr., Error-Correcting Codes, 2nd Ed., MIT Press, Cambridge, Mass., 1970.
14- Rooney, V. M., Ismail, A. R., Microprocessors and Microcomputers, Macmillan Publishing Company, New York, 1984.
15- Texas Instruments Inc., The TTL Data Book For Design Engineers, 2nd Ed., Texas Instruments, Inc., Dallas, Texas, 1981.