A General Method for Maximizing the Error-Detecting Ability of Distributed Algorithms
Martina Schollmeyer and Bruce McMillin
IEEE Transactions on Parallel and Distributed Systems, Vol. 8, No. 2, February 1997
Yuan Ze University, Graduate Institute of Computer Science and Engineering
陳桂慧
1999.04.21

Outline
• Terminology for MPS Topology
• Coloring algorithm
• Maximal fault index (MFI) for common topologies
• Conclusion

Terminology for MPS Topology
• The communication environment (CE) of a processor Pi is the set of processors from which Pi receives information during the execution of a program. CE(Pi) is a subset of {P1, P2, …, Pn}.
• A fault group of a processor, FG(Pi), of fault tolerance tl, is the collection of faulty processors in CE(Pi). To guarantee error detection for all errors caused by these faults, it is required that [formula missing]
• A collection of processors that must be nonfaulty to guarantee detection of all errors induced by a set of faulty processors P is called the nonfault group of P, NFG(P).

Coloring Algorithm
• Coloring algorithm: colors the graph to indicate the faultiness or nonfaultiness of components when determining the NFG of an individually faulty processor.
– Describes how the coloring is done for one fault in each CE (tl = 1).
– Extends the algorithm for tl > 1 to multicoloring, where each vertex has a chromaticity of tl, to obtain the NFGs.
– The algorithm is used to obtain a possible distribution of component failures for the whole MPS.

    for i := 1 to n                                  // n is the total # of processors
        color Pi faulty;
        color (Pj)(i ≠ j ∧ Pj ∈ CE(Pi)) non-faulty;  // NFG({Pi})
        save NFG({Pi});
        reset colors;
    end for

An algorithm to determine the NFG of a set P of faulty processors

To determine a permissible fault distribution for the entire network:
1. Select node 1 to be faulty => nodes 2, 3 nonfaulty
2a. Node 5 is arbitrarily chosen to be faulty => nodes 4, 6, 7 nonfaulty
2b. Node 3 is arbitrarily chosen to be faulty => all remaining nodes nonfaulty
• This provides a total of three faulty processors, 1, 3, and 5, with at most two faulty components in each CE.
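
The coloring loop for tl = 1 can be sketched in Python as follows. This is a minimal illustration, not the authors' implementation: the dictionary `ce`, the 6-processor ring topology, and the helper `greedy_fault_distribution` (which picks faulty processors in index order, where the slides choose them arbitrarily) are all assumptions introduced here.

```python
def nonfault_groups(ce):
    """Return NFG({Pi}) for every processor Pi, for fault tolerance tl = 1.

    `ce` maps each processor index to the set CE(Pi).
    """
    nfg = {}
    for i in ce:
        # Colour Pi faulty, then colour every other member of CE(Pi) non-faulty.
        nfg[i] = {j for j in ce[i] if j != i}
        # "reset colors" is implicit: each iteration builds a fresh set.
    return nfg


def greedy_fault_distribution(ce):
    """Hypothetical helper: choose faulty processors in index order."""
    nfg = nonfault_groups(ce)
    faulty, nonfaulty = set(), set()
    for i in sorted(ce):
        if i in nonfaulty:
            continue  # Pi was already forced to be non-faulty.
        if nfg[i] & faulty:
            continue  # Pi needs a processor non-faulty that is already faulty.
        faulty.add(i)
        nonfaulty |= nfg[i]
    return faulty, nonfaulty


# Assumed topology: a ring in which each processor receives from both neighbours.
ring = {i: {(i - 1) % 6, (i + 1) % 6} for i in range(6)}
faulty, nonfaulty = greedy_fault_distribution(ring)
print(sorted(faulty), sorted(nonfaulty))
```

On this assumed ring every second processor ends up faulty. A different selection order, or a topology with asymmetric communication environments, can give a different distribution.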
A fault matrix (FM) of an MPS gives, for every set of faulty processors P, all processors that must be nonfaulty if the elements of P are faulty. An FM corresponds to a collection of NFGs for a specific tl.

MFI for Meshes - Square Pattern
• For an arbitrary n*m torus-connected mesh with tl = 1: MFI = div(n,2) * div(m,2)

MFI for Meshes - Star Pattern
• The number of resource nodes in a k-ary n-cube for perfect 1-adjacency is X = k^n/(2n+1), k > 2.
• A torus-connected mesh is a k-ary 2-cube; if we can guarantee that we have only k*k meshes, then X = k^2/5.

MFI for Binary Hypercubes

Conclusion
• The maximal fault index was introduced to demonstrate how a maximal number of simultaneous component failures can be tolerated by an error-detecting algorithm, based on specific distributions of the faults within the interconnection network.
• The assessment of an error-detecting algorithm based on its minimal and maximal fault index can be used for safety-critical systems, especially with respect to the process-to-processor mapping that can be obtained from it.
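
The star-pattern count X = k^n/(2n+1) can be checked on a small torus with the Python sketch below. The placement rule `(i + 2*j) % 5 == 0` is an assumption chosen for illustration (it is not taken from the slides) and only applies when k is a multiple of 5. Returning `None` when the division is not exact is likewise a choice made for this sketch.

```python
def resource_node_count(k, n):
    """X = k^n / (2n + 1) for a k-ary n-cube, k > 2; None if not an integer."""
    if k <= 2:
        raise ValueError("the formula is stated for k > 2")
    total, sphere = k ** n, 2 * n + 1
    return total // sphere if total % sphere == 0 else None


def star_pattern(k):
    """Assumed placement on a k x k torus: mark (i, j) when (i + 2j) mod 5 == 0."""
    return {(i, j) for i in range(k) for j in range(k) if (i + 2 * j) % 5 == 0}


def is_perfect_1_adjacency(marked, k):
    """True if each node's closed neighbourhood holds exactly one marked node."""
    for i in range(k):
        for j in range(k):
            closed = [(i, j),
                      ((i + 1) % k, j), ((i - 1) % k, j),
                      (i, (j + 1) % k), (i, (j - 1) % k)]
            if sum(node in marked for node in closed) != 1:
                return False
    return True


marked = star_pattern(5)
print(len(marked), resource_node_count(5, 2), is_perfect_1_adjacency(marked, 5))
```

For a 5*5 torus the marked set has k^2/5 = 5 nodes, and every node is either marked or adjacent to exactly one marked node, which is the perfect 1-adjacency condition the formula counts.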