Self-Checking Circuits: Delay-Insensitive Codes and Self-Checking Checkers
Self-Checking Circuits
• Most important factors in designing a digital system:
Speed, Cost and Correctness.
• Some digital systems are used in:
1. medical equipment used in ICUs,
2. aircraft control systems,
3. nuclear reactor control systems,
4. military systems and
5. computing systems used in space missions.
• In such systems, high reliability is of the utmost importance.
Self-Checking Circuits
• Def: Self-Checking Circuit
A circuit that detects faults during normal operation.
• Faults: stuck-at-0 and stuck-at-1
Stuck-at-1 on one input of a 2-input OR gate?
Stuck-at-0 on one input of a 2-input OR gate?
Stuck-at-1 on one input of a 2-input AND gate?
Stuck-at-0 on one input of a 2-input AND gate?
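The four questions above can be explored with a small truth-table sketch. The function names `gate_output` and `detecting_inputs` are illustrative and not from the slides; the fault is assumed to sit on input `a`.

```python
from itertools import product

def gate_output(gate, a, b, stuck=None):
    """Evaluate a 2-input gate; `stuck` forces input a to 0 or 1 (stuck-at fault)."""
    if stuck is not None:
        a = stuck
    return (a | b) if gate == "OR" else (a & b)

def detecting_inputs(gate, stuck):
    """Input pairs (a, b) whose faulty output differs from the fault-free output."""
    return [(a, b) for a, b in product((0, 1), repeat=2)
            if gate_output(gate, a, b) != gate_output(gate, a, b, stuck)]

for gate in ("OR", "AND"):
    for stuck in (1, 0):
        print(f"{gate}, input a stuck-at-{stuck}: exposed by {detecting_inputs(gate, stuck)}")
```

In this sketch each fault is exposed by exactly one input pair, so the fault stays hidden until that pair is applied.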
Self-Checking Circuits:
• Def: Error
An incorrect output caused by a stuck-at fault.
• Def: Single Error
An error that affects only a single component value
• Def: Multiple Error
An error that affects multiple component values.
• The component value affected by an error may
change from 0 to 1, or vice versa.
• Def: Unidirectional Error
A multiple error in which all affected components
change in the same direction (all 0 to 1, or all 1 to 0).
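A minimal sketch of these error classes, assuming words are given as equal-length bit strings. The function name `classify_error` and the label "bidirectional" are illustrative, not terms from the slides.

```python
def classify_error(sent, received):
    """Classify the error that turned `sent` into `received` (equal-length bit strings)."""
    ups = sum(s == "0" and r == "1" for s, r in zip(sent, received))    # 0 -> 1 changes
    downs = sum(s == "1" and r == "0" for s, r in zip(sent, received))  # 1 -> 0 changes
    if ups + downs == 0:
        return "no error"
    if ups + downs == 1:
        return "single error"
    if ups == 0 or downs == 0:
        return "multiple unidirectional error"
    return "multiple bidirectional error"

print(classify_error("0101", "1111"))  # two components changed, both 0 -> 1
```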
Self-Checking Circuits:
• Def: Error Detecting Code
1. $Y \subseteq U$ ($U$ is the universe of vectors)
2. $Y$ is the set of code words and
$U - Y$ is the set of noncode words.
3. $y \to y'$ due to an error, where $y \in Y$ and $y' \in U - Y$
• Def: Hamming distance of two vectors x and y
the number of components in which they differ.
• Def: Hamming distance of a code X
the minimum of the Hamming distances between all
possible pairs of code words in X.
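Both Hamming distance definitions can be sketched directly, assuming words are equal-length bit strings. The names `hamming_distance` and `code_distance` are illustrative.

```python
from itertools import combinations

def hamming_distance(x, y):
    """Number of components in which two equal-length vectors differ."""
    return sum(a != b for a, b in zip(x, y))

def code_distance(code):
    """Minimum Hamming distance over all pairs of distinct code words."""
    return min(hamming_distance(x, y) for x, y in combinations(code, 2))

one_hot_3 = ["001", "010", "100"]
print(hamming_distance("0011", "1100"))  # all four components differ
print(code_distance(one_hot_3))
```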
Self-Checking Circuits:
• Lemma: A code with Hamming distance d+1
can detect all errors with weight d or less.
• Lemma: A code with Hamming distance 2c+1
can correct all errors with weight c or less.
• One-Hot Code:
1. It is a delay-insensitive code.
2. It detects any single error (Hamming distance = 2).
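The two lemmas translate into simple arithmetic on the code distance, and the one-hot claim can be checked by brute force. This is a sketch; the helper names are illustrative.

```python
def detectable_weight(distance):
    """Largest error weight always detected: distance d+1 detects weight <= d."""
    return distance - 1

def correctable_weight(distance):
    """Largest error weight always corrected: distance 2c+1 corrects weight <= c."""
    return (distance - 1) // 2

def single_bit_flips(word):
    """All words that differ from `word` in exactly one component."""
    return [word[:i] + ("1" if word[i] == "0" else "0") + word[i + 1:]
            for i in range(len(word))]

one_hot_4 = {"0001", "0010", "0100", "1000"}
# Every single error turns a one-hot code word into a noncode word.
all_detected = all(f not in one_hot_4 for w in one_hot_4 for f in single_bit_flips(w))
print(detectable_weight(2), correctable_weight(2), all_detected)
```

With distance 2, the one-hot code detects single errors but cannot correct any.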
Fault-tolerant Systems
• masking scheme:
1. All of the redundant modules are active at all times.
2. When a fault occurs, the faulty module is masked.
3. The most common masking scheme is
triple modular redundancy in which the outputs
of three copies of function units are fed to a
majority gate.
4. If one of the three modules becomes faulty,
the two remaining fault-free modules mask the
results of the faulty one when the majority vote
is performed.
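A minimal sketch of the majority vote in triple modular redundancy, assuming module outputs are integers voted bit by bit. The names `majority` and `tmr_output` are illustrative.

```python
def majority(a, b, c):
    """Bitwise majority gate: each output bit follows at least two of the three inputs."""
    return (a & b) | (b & c) | (a & c)

def tmr_output(modules, x):
    """Feed the same input to three module copies and vote on their outputs."""
    a, b, c = (m(x) for m in modules)
    return majority(a, b, c)

good = lambda x: x ^ 0b1111   # fault-free copy (a hypothetical 4-bit inverter)
faulty = lambda x: 0b0000     # a copy whose outputs are all stuck at 0

print(bin(tmr_output([good, faulty, good], 0b0101)))  # the faulty copy is outvoted
```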
Fault-tolerant Systems
• Standby scheme:
1. only one copy of the system is active.
2. When the active module detects the occurrence
of faults, the standby module is activated and
takes over the control.
3. Thus, to use self-checking circuits in a
fault-tolerant system, double module redundancy
is sufficient.
4. This scheme may be superior to the former
in terms of power consumption and hardware cost.
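The standby scheme can be sketched as follows, assuming each module is a callable that returns its output together with an error signal. The class `StandbyPair` and both module functions are hypothetical.

```python
class StandbyPair:
    """Two module copies: one active, one standby that takes over on an error signal."""

    def __init__(self, active, standby):
        self.active, self.standby = active, standby
        self.switched = False

    def run(self, x):
        out, error = self.active(x)
        if error and not self.switched:
            # The active self-checking module flagged a fault: activate the standby copy.
            self.active, self.switched = self.standby, True
            out, error = self.active(x)
        return out, error

def healthy_module(x):
    return x + 1, False   # (output, error signal)

def faulty_module(x):
    return 0, True        # its checker raises the error signal

pair = StandbyPair(faulty_module, healthy_module)
print(pair.run(3), pair.switched)
```

Only two copies are needed here because the active copy reports its own faults.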
Self-checking scheme
• Self-Checking scheme:
1. a self-checking functional unit.
2. a self-checking checker.
[Figure: the inputs X enter the self-checking functional unit; its outputs Y are also fed to the self-checking checker, which produces the error signal.]
X: input code space
Y: output code space
Self-Checking Circuits
• During the fault-free operation:
a normal input will produce a normal output.
• If an incorrect output is produced due to a fault,
the error should be detected by the self-checking
checker.
Self-checking scheme
1. The set of physical faults: Φ
2. A fault in Φ: $\phi \in \Phi$
3. The function computed with fault $\phi$ on input $x \in X$: $F(x, \phi)$
4. The fault-free function: $F(x, \emptyset)$
5. Output code space (all the output code words):
$Y = \{F(x, \emptyset) \mid x \in X\}$
Self-checking scheme
• Fault Secure(FS):
code word input to a faulty circuit must not produce
an incorrect code word output.
• Self-testing:
a fault in a circuit must be detected by some input.
Self-checking scheme
• Fault Secure(FS):
A circuit is called fault-secure with respect to Φ
if and only if
$\forall \phi \in \Phi,\ \forall x \in X:\ F(x, \phi) \notin Y$ or $F(x, \phi) = F(x, \emptyset)$.
• Self-testing:
A circuit is called self-testing with respect to Φ
if and only if
$\forall \phi \in \Phi,\ \exists x \in X:\ F(x, \phi) \notin Y$.
Self-checking scheme
• Totally Self-Checking:
A circuit is called totally self-checking (TSC) with respect to Φ
if and only if it is self-testing and fault-secure with respect to Φ.
• Partially Self-Checking:
A circuit is called partially self-checking with respect to Φ
if and only if
it is self-testing for X and fault-secure for a subset $I \subseteq X$.
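The fault-secure and self-testing definitions above can be checked exhaustively for a small circuit. The sketch below uses a hypothetical dual-rail inverter with stuck-at faults on its two output wires. It models the fault-free case as `None`, and all names are illustrative.

```python
NO_FAULT = None  # stands for the fault-free case

def is_fault_secure(F, X, Y, faults):
    """Every faulty output is either a noncode word or the correct code word."""
    return all(F(x, f) not in Y or F(x, f) == F(x, NO_FAULT)
               for f in faults for x in X)

def is_self_testing(F, X, Y, faults):
    """Every fault produces a noncode output for at least one code word input."""
    return all(any(F(x, f) not in Y for x in X) for f in faults)

def is_totally_self_checking(F, X, Y, faults):
    return is_fault_secure(F, X, Y, faults) and is_self_testing(F, X, Y, faults)

def dual_rail_inverter(x, fault):
    """Swap the two rails; `fault` is (output wire index, stuck value) or None."""
    out = [x[1], x[0]]
    if fault is not None:
        wire, value = fault
        out[wire] = value
    return tuple(out)

X = {(0, 1), (1, 0)}                                    # input code space
Y = {dual_rail_inverter(x, NO_FAULT) for x in X}        # output code space
faults = [(w, v) for w in (0, 1) for v in (0, 1)]       # single stuck-at faults
print(is_totally_self_checking(dual_rail_inverter, X, Y, faults))
```

For this fault set, each stuck-at fault either leaves the output correct or drives it to 00 or 11, both of which are noncode words.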
Self-checking scheme
• Fault-secure-only circuits:
1. No erroneous results go undetected.
2. However, it is possible that some fault can
never be detected.
• Self-testing-only circuit:
1. Any fault can produce undetected errors for a
short time.
2. However, there is a code word input that can
detect the fault.
Self-checking scheme
• Totally self-checking circuit:
1. no erroneous results go undetected and
2. any fault will be eventually detected.
• Partially self-checking circuits:
1. This approach is to restrict the set of faults for
which the circuit has to be checked.
2. They are introduced to provide low-cost error
detection.
3. They may be used in non-critical applications.
Delay-Insensitive Tree Adder
Self-Checking Checkers
• Code-disjoint:
A circuit is called code-disjoint if and only if
$\forall x \in X:\ F(x, \emptyset) \in Y$ and $\forall x \notin X:\ F(x, \emptyset) \notin Y$.
• TSC Checker:
A circuit is called a TSC checker with respect to Φ
if and only if it is self-testing, fault-secure and
code-disjoint with respect to Φ.
With the code-disjoint feature, one may be able to
test whether the TSC checker itself is malfunctioning.
DI Adder Checker
• Code word input: one-hot code of output signals (adder)
• Valid checker output (code word): Z0 Z1 = 10
• Invalid checker outputs (noncode words): Z0 Z1 ∈ {00, 01, 11}
Delay-Insensitive Codes
• M/N code (M<N): M-out-of-N code
all valid code words have exactly M 1’s and N-M 0’s.
Number of code words in an M/N code: C(N,M)
1. One-hot code (1/N code):
a. dual-rail encoding: (01, 10)
b. 1/3 code: (001, 010, 100)
c. number of code words in a one-hot code: C(N,1) = N
2. Optimal M/N code: M = N/2
a. 3/6 code: (000111, 001011, 001101, 001110, …)
b. number of code words in the optimal M/N code: C(N, N/2)
• Berger Code, Modified Berger code:
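The M-out-of-N codes listed above can be enumerated to confirm the code word counts. The function name `m_out_of_n_code` is illustrative.

```python
from itertools import combinations
from math import comb

def m_out_of_n_code(m, n):
    """All n-bit words with exactly m 1's, in lexicographic order."""
    words = []
    for ones in combinations(range(n), m):
        words.append("".join("1" if i in ones else "0" for i in range(n)))
    return sorted(words)

print(m_out_of_n_code(1, 3))                      # the 1/3 (one-hot) code
print(m_out_of_n_code(3, 6)[:4])                  # first few 3/6 code words
print(len(m_out_of_n_code(3, 6)), comb(6, 3))     # both give C(6,3)
```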
Delay-Insensitive Transmission
[Figure: the sender passes the information I through an encoder to form a DI code word; the receiver passes it through a decoder to recover I.]
Delay-Insensitive Transmission
Cost factors:
1. Number of Wires (cost)
2. Encoder (logic complexity/computation time)
3. Decoder (logic complexity/computation time)
Delay-Insensitive Codes: Berger
• Berger Code:
1. Systematic code: Information bits + Check bits
(note that M/N code is a nonsystematic code).
2. Number of check bits = K, number of information bits = I:
$K = \lceil \log_2 (I + 1) \rceil$
3. The check bits are the binary count of the 0’s in the I information bits.
4. See the table below.
Delay-Insensitive Codes
N    Info bits    2-rail    Berger    3/6 code
0    0000         1111      100       000111
1    0001         1110      011       001011
2    0010         1101      011       001101
3    0011         1100      010       001110
4    0100         1011      011       010011
…    …            …         …         …
Systematic: 2-rail, Berger
Non-systematic: 3/6 code
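The Berger column of the table can be reproduced with a short sketch. The function name `berger_check_bits` is illustrative. The check-bit width uses `len(info).bit_length()`, which equals ⌈log2(I+1)⌉ for I information bits.

```python
def berger_check_bits(info):
    """Binary count of the 0's in `info`, padded to ceil(log2(I + 1)) bits."""
    width = len(info).bit_length()   # equals ceil(log2(len(info) + 1))
    return format(info.count("0"), f"0{width}b")

for n in range(5):
    info = format(n, "04b")
    print(n, info, berger_check_bits(info))
```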
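Delay-Insensitive Transmission: Berger codes
[Figure: the sender transmits the information bits I together with the check bits C as a DI code word. The receiver obtains I’ and C’, recomputes the check bits C’’ from I’, and compares C’’ with C’ to decide whether the received word is valid.]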
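The receiver-side comparison can be sketched as below, together with a brute-force check that every unidirectional error is rejected for 4 information bits. All function names are illustrative, and the word layout (information bits followed by check bits) is an assumption.

```python
from itertools import combinations

def check_bits(info):
    """Berger check bits: binary count of the 0's in the information bits."""
    width = len(info).bit_length()
    return format(info.count("0"), f"0{width}b")

def send(info):
    """Sender: append the check bits C to the information bits I."""
    return info + check_bits(info)

def receive_is_valid(word, info_len):
    """Receiver: recompute the check bits from I' and compare with the received C'."""
    info_rx, check_rx = word[:info_len], word[info_len:]
    return check_bits(info_rx) == check_rx

def unidirectional_errors(word, src):
    """All words obtained by flipping one or more `src` bits to the other value."""
    dst = "1" if src == "0" else "0"
    positions = [i for i, b in enumerate(word) if b == src]
    for r in range(1, len(positions) + 1):
        for chosen in combinations(positions, r):
            yield "".join(dst if i in chosen else b for i, b in enumerate(word))

all_rejected = all(not receive_is_valid(e, 4)
                   for n in range(16) for src in "01"
                   for e in unidirectional_errors(send(format(n, "04b")), src))
print(send("0011"), all_rejected)
```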
Self-Checking Checkers:
• Self-checking Checkers of M/N code
1. One-hot code(1/N code):
a. dual-rail encoding: (01 10): shown in DI Adders
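b. 1/N code: checker built from C(N,2) AND gates and two OR gates
c. Z0: completion signal
Z1: error detection
[Figure: 1/N code checker; C(N,2) AND gates (&) and two OR gates (+) produce Z0 and Z1.]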
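A sketch of a 1/N checker follows, under one assumption about the wiring: Z0 is the OR of all inputs and Z1 is the OR of the C(N,2) pairwise ANDs. The function names are illustrative. The last function tests code-disjointness by exhaustion.

```python
from itertools import combinations, product

def one_hot_checker(bits):
    """Return (Z0, Z1) for an N-bit input; assumed wiring, see the lead-in."""
    z0 = int(any(bits))                                      # completion: some input is 1
    z1 = int(any(a & b for a, b in combinations(bits, 2)))   # error: two or more inputs are 1
    return z0, z1

def checker_is_code_disjoint(n):
    """Output is the code word 10 exactly when the input is a one-hot code word."""
    return all((one_hot_checker(bits) == (1, 0)) == (sum(bits) == 1)
               for bits in product((0, 1), repeat=n))

print(one_hot_checker((0, 1, 0, 0)), one_hot_checker((1, 1, 0, 0)),
      checker_is_code_disjoint(4))
```

Under this wiring the output 01 never occurs in fault-free operation, since any pair of 1's also sets Z0.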
Self-Checking Checkers:
• Self-checking Checkers of M/N code
2. Optimal M/N code: M=N/2
a. 3/6 code: (000111, 001011, 001101, 001110, …)
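b. number of code words in the optimal M/N code: C(N, N/2)
[Figure: M/N code checker; C(N, N/2) AND gates with N/2 inputs each and C(N, N/2+1) AND gates with N/2+1 inputs each feed two OR gates (+) that produce Z0 and Z1.]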
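The AND-OR structure can be sketched as two threshold functions. The assumption here is that Z0 collects the N/2-input ANDs and Z1 collects the (N/2+1)-input ANDs. The function names are illustrative.

```python
from itertools import combinations, product

def threshold(bits, t):
    """OR of all t-input ANDs: 1 iff at least t of the inputs are 1."""
    return int(any(all(bits[i] for i in group)
                   for group in combinations(range(len(bits)), t)))

def half_weight_checker(bits):
    """Return (Z0, Z1) for an (N/2)-out-of-N code word candidate (N even)."""
    m = len(bits) // 2
    return threshold(bits, m), threshold(bits, m + 1)

def checker_accepts_only_code_words(n):
    """Output is 10 exactly when the input has N/2 ones."""
    return all((half_weight_checker(bits) == (1, 0)) == (sum(bits) == n // 2)
               for bits in product((0, 1), repeat=n))

print(half_weight_checker((0, 0, 0, 1, 1, 1)), checker_accepts_only_code_words(6))
```

The gate count grows as C(N, N/2), so this direct form becomes expensive for larger N.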
Self-Checking Checkers:
• Use Sorting Networks for Self-checking Checkers:
• General sorting network:
[Figure: a sorting network takes n unsorted numbers A1, …, An and outputs n sorted numbers, from Max(A1, …, An) down to Min(A1, …, An). Its building block is a comparator that takes A1 and A2 and outputs Max(A1, A2) and Min(A1, A2).]
Self-Checking Checkers:
• Binary sorting network:
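[Figure: a binary sorting network takes n binary inputs x1, …, xn and outputs k 1’s followed by n-k 0’s, where k is the number of 1’s in the input.]
Comparator for binary inputs:
Max(x1, x2) = x1 + x2 (OR)
Min(x1, x2) = x1 x2 (AND)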
Self-Checking Checkers:
• Binary sorting network for 2/4 code
[Figure: binary sorting network for the 2/4 code, built from five comparators (CMP).]
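A sketch of a 2/4 checker built from five binary comparators follows. The comparator wiring in `NETWORK` is a standard 4-input sorting network and is an assumption, as is reading Z0 and Z1 from the two middle outputs; the slide's own wiring is not recoverable.

```python
from itertools import product

def comparator(a, b):
    """Binary comparator: (max, min) = (a OR b, a AND b)."""
    return a | b, a & b

# Assumed wiring: a standard five-comparator sorting network for four inputs.
NETWORK = [(0, 1), (2, 3), (0, 2), (1, 3), (1, 2)]

def sort_bits(bits):
    """Sort four bits so that all 1's come first."""
    w = list(bits)
    for i, j in NETWORK:
        w[i], w[j] = comparator(w[i], w[j])
    return tuple(w)

def two_of_four_checker(bits):
    """(Z0, Z1): Z0 = at least two 1's, Z1 = at least three 1's (assumed outputs)."""
    s = sort_bits(bits)
    return s[1], s[2]

def network_sorts_all_inputs():
    return all(sort_bits(b) == tuple(sorted(b, reverse=True))
               for b in product((0, 1), repeat=4))

print(sort_bits((0, 1, 0, 1)), two_of_four_checker((0, 1, 0, 1)), network_sorts_all_inputs())
```

A valid 2/4 code word sorts to 1100, so the two middle outputs read 10.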
Error Correction Codes:
• A code with Hamming distance >= 3 can correct errors.
Ex. HD = 4: {0011, 1100}
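Correction with the example code can be sketched as nearest-code-word decoding. This is one common approach, not necessarily the one intended on the slide, and the function names are illustrative.

```python
def hamming_distance(x, y):
    return sum(a != b for a, b in zip(x, y))

def correct(word, code):
    """Return the unique nearest code word, or None on a tie (detected, not correctable)."""
    ranked = sorted(code, key=lambda c: hamming_distance(word, c))
    best = ranked[0]
    if len(ranked) > 1 and hamming_distance(word, ranked[1]) == hamming_distance(word, best):
        return None
    return best

code = ["0011", "1100"]
print(correct("0111", code))  # one error away from 0011
print(correct("0101", code))  # two errors from both code words: tie
```

With Hamming distance 4, single errors are corrected and double errors are detected but left uncorrected.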