Handbook of Statistics, Vol. 7: Quality Control and Reliability, edited by P. R. Krishnaiah and C. R. Rao

This is the seventh volume in the series 'Handbook of Statistics' started by the
late Professor P. R. Krishnaiah to provide comprehensive reference books in
different areas of statistical theory and applications. Each volume is devoted to
a particular topic in statistics; the present one is on 'Quality Control and Reliability', a modern branch of statistics dealing with the complex problems in the
production of goods and services, maintenance and repair, and management and
operations. The accent is on quality and reliability in all these aspects.
The leading chapter in the volume is written by W. Edwards Deming, a pioneer
in statistical quality control, who spearheaded the quality control movement in
Japan and helped the country in its rapid industrial development during the postwar
period. He gives a 14-point program for the management to keep a country
in the ascending path of industrial development.
Two main areas of concern in practice are the reliability of the hardware and
of the process control software. The estimation of hardware reliability and its uses
is discussed under a variety of models for reliability by R.A. Johnson in
Chapter 3, M. Mazumdar in Chapter 4, L. F. Pau in Chapter 15, H. L. Harter
in Chapter 22, A. P. Basu in Chapter 23, and S. Iyengar and G. Patwardhan in
Chapter 24. The estimation of software reliability is considered by F. B. Bastani
and C. V. Ramamoorthy in Chapter 2 and T. A. Mazzuchi and N. D. Singpurwalla in Chapter 5.
The main concepts and theory of reliability are discussed in Chapters 10, 12,
13, 14 and 21 by F. Proschan in collaboration with P. J. Boland, F. Guess, R. E.
Barlow, G. Mimmack, E. El-Neweihi and J. Sethuraman.
Chapter 6 by N. R. Chaganty and K. Joag-dev, Chapter 7 by B. W. Woodruff
and A. H. Moore, Chapter 9 by S. S. Gupta and S. Panchapakesan, Chapter 11
by M. C. Bhattacharjee and Chapter 16 by W. J. Padgett deal with some
statistical inference problems arising in reliability theory.
Several aspects of quality control of manufactured goods are discussed in
Chapter 17 by F. B. Alt and N. D. Smith, in Chapter 18 by B. Hoadley, in
Chapter 20 by M. Csörgő and L. Horváth and in Chapter 19 by P. R. Krishnaiah
and B. Q. Miao.
All the chapters are written by outstanding scholars in their fields of expertise
and I wish to thank all of them for their excellent contributions. Special thanks
are due to Elsevier Science Publishers B.V. (North-Holland) for their patience and
cooperation in bringing out this volume.
C. R. Rao
F. B. Alt, Dept. of Management Science & Stat., University of Maryland, College
Park, MD 20742, USA (Ch. 17)
F. B. Bastani, Dept. of Computer Science, University of Houston, University Park,
Houston, TX 77004, USA (Ch. 2)
A. P. Basu, Dept. of Statistics, University of Missouri-Columbia, 328 Math. Science
Building, Columbia, MO 65201, USA (Ch. 23)
M. C. Bhattacharjee, Dept. of Mathematics, New Jersey Inst. of Technology, Newark,
NJ 07102, USA (Ch. 11)
H. W. Block, Dept. of Mathematics & Statistics, University of Pittsburgh, Pittsburgh,
PA 15260, USA (Ch. 8)
P. J. Boland, Dept. of Mathematics, University College, Belfield, Dublin 4, Ireland
(Ch. 10)
R. E. Barlow, Operations Research Center, University of California, Berkeley, CA
94720, USA (Ch. 13)
N. R. Chaganty, Math. Dept., Old Dominion University, Hampton Blvd., Norfolk, VA
23508, USA (Ch. 6)
M. Csörgő, Dept. of Mathematics & Statistics, Carleton University, Ottawa, Ontario,
Canada K1S 5B6 (Ch. 20)
W. Edwards Deming, Consultant in Statistical Studies, 4924 Butterworth Place,
Washington, DC 20016, USA (Ch. 1)
F. M. Guess, Department of Statistics, University of South Carolina, Columbia,
South Carolina 29208, USA (Ch. 12)
S. Gupta, Dept. of Statistics, Math./Science Building, Purdue University, Lafayette,
IN 47907, USA (Ch. 9)
H. L. Harter, 32 S. Wright Ave., Dayton, OH 45403, USA (Ch. 22)
B. Hoadley, Bell Laboratories, HP 1A-250, Holmdel, NJ 07733, USA (Ch. 18)
L. Horváth, Bolyai Institute, Szeged University, Aradi Vertanuk tere 1, H-6720
Szeged, Hungary (Ch. 20)
S. Iyengar, Dept. of Statistics, Carnegie Mellon University, Pittsburgh, PA 15213,
USA (Ch. 24)
K. Joag-dev, Dept. of Mathematics, University of Illinois at Urbana-Champaign,
Urbana, IL 61801, USA (Ch. 6)
R. A. Johnson, Dept. of Statistics, 1210 West Dayton Street, Madison, WI 53706,
USA (Ch. 3)
M. Mazumdar, Dept. of Industrial Engineering, University of Pittsburgh, Benedum
Hall 1048, Pittsburgh, PA 15260, USA (Ch. 4)
T. A. Mazzuchi, c/o N. D. Singpurwalla, Operations Research & Statistics, George
Washington University, Washington, DC 20052, USA (Ch. 5)
B. Miao, Dept. of Math. & Stat., University of Pittsburgh, Pittsburgh, PA 15260,
USA (Ch. 19)
G. M. Mimmack, c/o F. Proschan, Statistics Department, Florida State University,
Tallahassee, FL 32306, USA (Ch. 14)
A. H. Moore, AFIT/ENC, Wright-Patterson AFB, OH 45433, USA (Ch. 7)
E. El-Neweihi, Dept. of Math., Stat. & Comp. Sci., University of Illinois, Chicago,
IL 60680, USA (Ch. 21)
W. J. Padgett, Math. & Stat. Department, University of South Carolina, Columbia,
SC 29208, USA (Ch. 16)
G. Patwardhan, Dept. of Mathematics, Pennsylvania State University at Altoona,
Altoona, PA 16603, USA (Ch. 24)
S. Panchapakesan, Mathematics Department, Southern Illinois University, Carbondale, IL 62901, USA (Ch. 9)
L. F. Pau, 7 Route de Drize, CH 1227 Carouge, Switzerland (Ch. 15)
F. Proschan, Statistics Department, Florida State University, Tallahassee, FL 32306,
USA (Ch. 10, 12, 13, 14, 21)
C. V. Ramamoorthy, Dept. of Electrical Engineering & Comp. Sci., University of
California at Berkeley, Berkeley, CA 94720, USA (Ch. 2)
T. H. Savits, Dept. of Mathematics & Statistics, University of Pittsburgh, Pittsburgh,
PA 15260, USA (Ch. 8)
J. Sethuraman, Dept. of Statistics, Florida State University, Tallahassee, FL 32306,
USA (Ch. 21)
N. D. Singpurwalla, Operations Research & Statistics, George Washington University, Washington, DC 20052, USA (Ch. 5)
N. D. Smith, Dept. of Management Sci. & Stat., University of Maryland, College
Park, MD 20742, USA (Ch. 17)
B. Woodruff, Directorate of Mathematical & Inf. Service, AFOSR/NM, Bolling Air
Force Base, DC 20332, USA (Ch. 7)
P. R. Krishnaiah and C. R. Rao, eds., Handbook of Statistics, Vol. 7
© Elsevier Science Publishers B.V. (1988) 1-6
Transformation of Western Style of Management*
W. Edwards Deming
1. The crisis of Western industry
The decline of Western industry, which began in 1968 and 1969, a victim of
competition, has reached little by little a stage that can only be characterized as
a crisis. The decline is caused by Western style of management, and it will
continue until the cause is corrected. In fact, the decline may be ready for a nose
dive. Some companies will die a natural death, victims of Charles Darwin's
inexorable law of survival of the fittest. In others, there will be awakening and
conversion of management.
What happened? American industry knew nothing but expansion from 1950 till
around 1968. American goods had the market. Then, one by one, many American
companies awakened to the reality of competition from Japan.
Little by little, one by one, the manufacture of parts and materials moves out
of the Western world into Japan, Korea, Taiwan, and now Brazil, for reasons of
quality and price. More business is carried on now between the U. S. and the
Pacific Basin than across the Atlantic Ocean.
A sudden crisis like Pearl Harbor brings everybody out in full force, ready for
action, even if they have no idea what to do. But a crisis that creeps in catches
its victims asleep.
2. A declining market exposes weaknesses
Management in an expanding market is fairly easy. It is difficult to lose when
business simply drops into the basket. But when competition presses into the
market, knowledge and skill are required for survival. Excuses ran out. By 1969,
the comptroller and the legal department began to take charge for survival, fighting a defensive war, backs to the wall. The comptroller does his best, using only
visible figures, trying to hold the company in the black, unaware of the importance
for management of figures that are unknown and unknowable. The legal department fights off creditors and predators that are on the lookout for an attractive
takeover. Unfortunately, management by the comptroller and the legal department
only brings further decline.
* Parts of this Chapter are extracts from the author's book Out of the Crisis (Center for Advanced
Engineering Study, Massachusetts Institute of Technology, 1985).
3. Forces that feed the decline
The decline is accelerated by the aim of management to boost the quarterly
dividend, and to maximize the price of the company's stock. Quick returns,
whether by acquisition, or by divestiture, or by paper profits or by creative
accounting, are self-defeating. The effect in the long run erodes investment and
ends up as just the opposite to what is intended.
A far better plan is to protect investment by plans and methods by which to
improve product and service, accepting the inevitable decrease in costs that accompanies improvement of quality and service, thus reversing the decline, capturing the
market with better quality and lower price. As a result, the company stays in
business and provides jobs and more jobs.
For years, price tag and not total cost of use governed the purchase of materials
and equipment.
Numerical goals and M.B.O. have made their contribution to the decline. A
numerical goal outside the capability of a system can be achieved only by impairment or destruction of some other part of the company. Work standards more
than double costs of production. Worse than that, they rob people of their pride
of workmanship. Quotas of production are a guarantee of poor quality. Exhortations are directed at the wrong people. They should be directed at the management, not at the workers.
Other forces are still more destructive.
(1) Lack of constancy of purpose to plan product and service that will have
a market and keep the company in business, and provide jobs.
(2) Emphasis on short-term profits: short-term thinking (just the opposite from
constancy of purpose to stay in business), fed by fear of unfriendly takeover, and
by push from bankers and owners for dividends.
(3) Personal review system, or evaluation of performance, merit rating, annual
review, or annual appraisal, by whatever name, for people in management, the
effects of which are devastating.
(4) Mobility of management; job hopping from one company to another.
(5) Use of visible figures only for management, with little or no consideration
of figures that are unknown or unknowable.
Peculiar to industry in the United States:
(6) Excessive medical costs.
(7) Excessive costs of liability.*
* Eugene L. Grant, interview in the journal Quality, Chicago, March 1984.
Anyone could add more inhibitors. One, for example, is the choking of business
by laws and regulations; also by legislation brought on by groups of people with
special interests, the effect of which is too often to nullify the work of standardizing committees of industry, government, and consumers.
Still another force is the system of detailed budgets which leave a division
manager no leeway. In contrast, the manager in Japan is not bothered by detail.
He has complete freedom except for one item; he can not transfer to other uses
his expenditure for education and training.
4. Remarks on evaluation of performance, or the so-called merit rating
Many companies in America have systems by which everyone in management
or in research receives from his superiors a rating every year. Some government
agencies have a similar system. The merit system leads to management by fear.
The effect is devastating.
- It nourishes short-term performance, annihilates long-term planning, builds
fear, demolishes teamwork; nourishes rivalry and politics.
- It leaves people bitter, others despondent and dejected, some even depressed,
unfit for work for weeks after receipt of rating, unable to comprehend why they
are inferior. It is unfair, as it ascribes to the people in a group differences that
may be caused largely if not totally by the system that they work in.
The idea of a merit rating is alluring. The sound of the words captivates the
imagination: pay for what you get; get what you pay for; motivate people to do
their best, for their own good.
The effect of the merit rating is exactly the opposite of what the words promise.
Everyone propels himself forward, or tries to, for his own good, on his own life
preserver. The organization is the loser.
Moreover, a merit rating is meaningless as a predictor of performance, whether
in the same job or in one that he might be promoted into. One may predict
performance only for someone that falls outside the limits of differences attributable to the system that the people work in.
5. Modern principles of leadership
Modern principles of leadership will in time replace the annual performance
review. The first step in a company will be to provide education in leadership.
This education will include the theory of variation, also known as statistical
theory. The annual performance review may then be abolished. Leadership will
take its place. Suggestions follow.
(1) Institute education in leadership; obligations, principles, and methods.
(2) More careful selection of the people in the first place.
(3) Better training and education after selection.
(4) A leader, instead of being a judge, will be a colleague, counseling and
leading his people on a day-to-day basis, learning from them and with them.
(5) A leader will discover who if any of his people is (a) outside the system on
the good side, (b) outside on the poor side, (c) belonging to the system. The
calculations required are fairly simple if numbers are used for measures of performance. Ranking of people (outstanding down to unsatisfactory) that belong to
the system violates scientific logic and is ruinous as a policy.
In the absence of numerical data, a leader must make subjective judgment. A
leader will spend hours with every one of his people. They will know what kind
of help they need. There will sometimes be incontrovertible evidence of excellent
performance, such as patents, publication of papers, invitations to give lectures.
People that are on the poor side of the system will require individual help.
Monetary reward for outstanding performance outside the system, without
other, more satisfactory recognition, may be counterproductive.
(6) The people of a group that form a system will all be subject to the company's formula for privileges and for raises in pay. This formula may involve (e.g.)
seniority. It is important to note that privilege will not depend on rank within the
system. (In bad times, there may be no raise for anybody.)
(7) Figures on performance should be used not to rank the people in a group
that fall within the system, but to assist the leader to accomplish improvement of
the system. These figures may also point out to him some of his own weaknesses.
(8) Have a frank talk with every employee, up to three or four hours, at least
once a year, not for criticism, but to learn from each of them about the job and
how to work together.
The day is here when anyone deprived of a raise or of any privilege through
misuse of figures for performance (as by ranking the people in a group) may with
justice file a grievance.
Improvement of the system will help everybody, and will decrease the spread
between the figures for the performances of people.
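The "fairly simple" calculation mentioned in point (5) can be sketched as follows. This is only an illustration under stated assumptions, not a procedure given in the text: it assumes the performance figure is a count per person (for example, mistakes in a period) that behaves roughly like a Poisson variable, so that limits of the mean plus or minus three times the square root of the mean are a reasonable yardstick. The function name and the sample figures are hypothetical.

```python
import math

def classify_against_system(counts):
    """Label each person's count as within or outside the system.

    Assumption: counts are roughly Poisson, so the limits are
    mean +/- 3 * sqrt(mean), with the lower limit floored at zero.
    """
    if not counts:
        raise ValueError("need at least one count")
    centre = sum(counts) / len(counts)
    half_width = 3.0 * math.sqrt(centre)
    lower = max(0.0, centre - half_width)
    upper = centre + half_width
    labels = []
    for c in counts:
        if c > upper:
            labels.append("outside (high)")
        elif c < lower:
            labels.append("outside (low)")
        else:
            labels.append("within system")
    return lower, upper, labels

# Hypothetical mistake counts for ten people in one period.
lower, upper, labels = classify_against_system([8, 15, 10, 4, 6, 14, 7, 9, 11, 36])
```

Only people labelled as outside the limits call for individual attention; differences among the rest are attributable to the system, so ranking them would carry no information.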
6. Other obstacles
(1) Hope for quick results (instant pudding).
(2) The excuse that 'our problems are different'.
(3) Inept teaching in schools of business.
(4) Failure of schools of engineering to teach statistical theory.
(5) Statistical teaching centres fail to prepare students for the needs of industry.
Students learn statistical theory for enumerative studies, then see them applied in
class and in textbooks, without justification or explanation, to analytic problems.
They learn to calculate estimates of standard errors of the result of an experiment
and in other analytic problems where there is no such thing as a standard error.
They learn tests of hypothesis, null hypothesis, and probability levels of significance. Such calculations and the underlying theory are excellent mathematical
exercises, but they provide no basis for action, no basis for evaluation of the risk
of prediction of the results of the next experiment, nor of tomorrow's product,
which is the only question of interest in a study aimed at improvement of performance of a process or of a product.
(6) The supposition by management that the work-force could turn out quality
if they would apply full force their skill and effort. The fact is that nearly everyone
in Western industry, management and work-force, is impeded by barriers to pride
of workmanship.
(7) Reliance on QC-Circles, employee involvement, employee participation
groups, quality of work life, anything to get rid of the problems of people. These
shams, without management's participation, deteriorate and break up after a few
months. The big task ahead is to get the management involved in management
for quality and productivity. The work-force has always been involved. There will
then be quality of work life, pride of workmanship, and quality. Applications of
techniques within the system as it exists often accomplish great improvements in
quality, productivity and reduction of waste.
7. Remarks on use of visible figures
The comptroller runs the company on visible figures. This is a sure road to
decline. Why? Because the most important figures for management are not visible:
they are unknown and unknowable. Do courses in finance teach students the
importance of the unknown and unknowable loss
- from a dissatisfied customer?
- from a dissatisfied employee, one that, because of correctible faults of the
system, can not take pride in his work?
- from the annual rating on performance, the so-called merit rating?
- loss from absenteeism (purely a function of supervision)?
Do courses in finance teach their students about the increase in productivity
that comes from people that can take pride in their work?
Unfortunately, the answer is no.
8. Condensation of the 14 points for management
There is now a theory of management. No one can say now that there is
nothing about management to teach. If experience by itself would teach management how to improve, then why are we in this predicament? Everyone doing his
best is not the answer that will halt the decline. It is necessary that everyone know
what to do; then for everyone to do his best.
The 14 points apply anywhere, to small organizations as well as to large ones,
to the service industry as well as to manufacturing.
(1) Create constancy of purpose toward improvement of product and service,
with the aim to excel in quality of product and service, to stay in business, and
to provide jobs.
(2) Adopt the new philosophy. We are in a new economic age, created by
Japan. Transformation of Western style of management is necessary to halt the
continued decline of industry.
(3) Cease dependence on inspection to achieve quality. Eliminate the need for
inspection on a mass basis by building quality into the product in the first place.
(4) End the practice of awarding business on the basis of price tag. Purchasing
must be combined with design of product, manufacturing, and sales, to work with
the chosen supplier, the aim being to minimize total cost, not initial cost.
(5) Improve constantly and forever every activity in the company, to improve
quality and productivity, and thus constantly decrease costs. Improve design of
(6) Institute training on the job, including management.
(7) Institute supervision. The aim of supervision should be to help people and
machines and gadgets to do a better job.
(8) Drive out fear, so that everyone may work effectively for the company.
(9) Break down barriers between departments. People in research, design,
sales, and production must work as a team, to foresee problems of production
and in use that may be encountered with the product or service.
(10) Eliminate slogans, exhortations, and targets for the work force asking for
fewer defects and new levels of productivity. Such exhortations only create adversarial relationships, as the bulk of the causes of low quality and low productivity
belong to the system and thus lie beyond the power of the work force.
(11) Eliminate work standards that prescribe numerical quotas for the day.
Substitute aids and helpful supervision.
(12a) Remove the barriers that rob the hourly worker of his right to pride of
workmanship. The responsibility of supervisors must be changed from sheer
numbers to quality.
(b) Remove the barriers that rob people in management and in engineering of
their right to pride of workmanship. This means, inter alia, abolishment of the
annual or merit rating and of management by objective.
(13) Institute a vigorous program of self-improvement and education.
(14) Put everybody in the company to work in teams to accomplish the transformation. Teamwork is possible only where the merit rating is abolished, and
leadership put in its place.
9. What is required for change?
The first step is for Western management to awaken to the need for change.
It will be noted that the 14 points as a package, plus removal of the deadly
diseases and obstacles to quality, are the responsibility of management.
Management in authority will explain by seminars and other means to a critical
mass of people in the company why change is necessary, and that the change will
involve everybody. Everyone must understand the 14 points, the deadly diseases,
and the obstacles. Top management and everyone else must have the courage to
change. Top management must break out of line, even to the point of exile
amongst their peers.
P. R. Krishnaiah and C. R. Rao, eds., Handbook of Statistics, Vol. 7
© Elsevier Science Publishers B.V. (1988) 7-25
Software Reliability
F. B. Bastani and C. V. Ramamoorthy
1. Introduction
Process control systems, such as nuclear power plant safety control systems,
air-traffic control systems and ballistic missile defense systems, are embedded
computer systems. They are characterized by severe reliability, performance and
maintainability requirements. The reliability criterion is particularly crucial since
any failures can be catastrophic. Hence, the reliability of these systems must be
accurately measured prior to actual use.
The theoretical basis for methods of estimating the reliability of the hardware
is well developed (Barlow and Proschan, 1975). In this paper we discuss methods
of estimating the reliability of process control software.
Program proving techniques can, in principle, establish whether the program is
correct with respect to its specification or whether it contains some errors. This
is the ideal approach since there is no physical deterioration or random malfunctions in software. However, the functions expected of process control systems
are usually so complex that the specifications themselves can be incorrect and/or
incomplete, thus limiting the applicability of program proofs.
One approach is to use statistical methods in order to assess the reliability of
the program based on the set of test cases used. Since the early 1970's, several
models have been proposed for estimating software reliability and some related
parameters, such as the mean time to failure (MTTF), residual error content, and
other measures of confidence in the software. These models are based on three
basic approaches to estimating software reliability. Firstly, one can observe the
error history of a program and use this in order to predict its future behavior.
Models in this category are applicable during the testing and debugging phase. It
is often assumed that the correction of errors does not introduce any new errors.
Hence, the reliability of the program increases and, therefore, these models are
often called reliability growth models. A problem with these models is the difficulty in modelling realistic testing processes. Also, they cannot incorporate program proofs, cannot be applied prior to the debugging phase and have to be
modified significantly in order to be applicable to programs developed using
iterative enhancement.
The second approach attempts to predict the reliability of a program on the
basis of its behavior for a sample of points taken from its input domain. These
software reliability models are applicable during the validation phase (Ramamoorthy and Bastani, 1982; TRW, 1976). Errors found during this phase are not
corrected. In fact, if errors are discovered the software may be rejected. The size
of the sample required for a given confidence in the reliability estimate can be
reduced by using some knowledge about the relationship between different points
in the input domain. However, general modelling of the nature of the input
domain results in mathematically intractable derivations.
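To give a sense of the sample sizes involved when no knowledge of the input domain is used, here is a minimal sketch. It assumes independent runs drawn from the operating environment and no observed failures, in which case n failure-free runs support the claim R >= R0 with confidence 1 - R0^n. The function name and the figures in the usage line are illustrative assumptions, not values from the chapter.

```python
import math

def runs_needed(target_reliability, confidence):
    """Smallest n such that n failure-free independent runs give the
    stated confidence that per-run reliability is at least the target.

    Solves 1 - target_reliability**n >= confidence for n.
    """
    if not (0.0 < target_reliability < 1.0 and 0.0 < confidence < 1.0):
        raise ValueError("both arguments must lie strictly between 0 and 1")
    return math.ceil(math.log(1.0 - confidence) / math.log(target_reliability))

# Example: demonstrate per-run reliability 0.999 with 95% confidence.
n = runs_needed(0.999, 0.95)
```

The required n grows roughly in proportion to 1 / (1 - R0), which is why exploiting relationships between input points is attractive for highly reliable software.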
The third method which can be used to estimate software reliability is based
on error seeding (Mills, 1973; Schick and Wolverton, 1978). In this approach the
program is seeded with artificial errors without the knowledge of the team
responsible for testing and debugging the software. At the conclusion of the
testing and debugging phase, the correctness of the program is estimated by
comparing the number of artificial and actual errors found by the test team.
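The comparison of artificial and actual errors can be sketched with a simple ratio estimator. The sketch assumes that seeded and indigenous errors are equally likely to be found, which is the key (and debatable) premise of error seeding; the function and parameter names are hypothetical.

```python
def estimate_indigenous_errors(seeded_total, seeded_found, indigenous_found):
    """Ratio estimate of the total number of indigenous (real) errors.

    Assumption: the test team finds the same fraction of indigenous
    errors as of seeded errors, i.e.
    indigenous_found / total_indigenous ~= seeded_found / seeded_total.
    """
    if seeded_found <= 0:
        raise ValueError("no seeded errors found; the ratio is undefined")
    if seeded_found > seeded_total:
        raise ValueError("cannot find more seeded errors than were planted")
    total = indigenous_found * seeded_total / seeded_found
    remaining = total - indigenous_found
    return total, remaining

# Example: 20 errors seeded, 15 of them found, along with 30 real errors.
total, remaining = estimate_indigenous_errors(20, 15, 30)
```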
The rest of this paper is organized as follows: Section 2 defines software
reliability and classifies some of the models which have been proposed over the
past several years. Section 3 discusses the concept of error size and testing
process. It states the assumptions of software reliability growth models and
reviews error-counting and non-error-counting models. Section 4 discusses the
measurement of software reliability/correctness using Nelson's model (TRW,
1976) and an input domain based model (Ramamoorthy and Bastani, 1979).
Section 5 summarizes the paper and outlines some research issues in this area.
2. Definition and classification
In this section we first give a formal definition of software reliability and then
present a classification of the models proposed for estimating the reliability of a program.
2.1. Definition
Software reliability has been defined as the probability that a software fault
which causes deviation from the required output by more than the specified
tolerances, in a specified environment, does not occur during a specified exposure
period (TRW, 1976). Thus, the software needs to be correct only for inputs for
which it is designed (specified environment). Also, if the output is correct within
the specified tolerances in spite of an error, then the error is ignored. This may
happen in the evaluation of complicated floating point expressions where many
approximations are used (e.g., polynomial approximations for cosine, sine, etc.).
It is possible that a failure may be due to errors in the compiler, operating
system, microcode or even the hardware. These failures are ignored in estimating
the reliability of the application program. However, the estimation of the overall
system reliability will include the correctness of the supporting software and the
reliability of the hardware.
In some cases it may be desirable to classify software faults into several
categories, ranging from trivial errors (e.g., minor misspellings on a hardcopy
output) to catastrophic errors (e.g., resulting in total loss of control). Then, one
could specify different reliability requirements for the various types of faults. Most
software reliability models can be easily adapted for errors in a given class by
merely ignoring other types of errors when using the model. However, this
decreases the confidence in the reliability estimate since the sample size available
for estimating the parameters of the model is reduced.
The exposure period should be independent of extraneous factors like machine
execution time, programming environment, etc. For many applications the appropriate unit of exposure period is a run corresponding to the selection of a point
from the input domain (specified environment) of the program. However, for some
programs (e.g., an operating system), it is difficult to determine what constitutes
a 'run'. In such cases, the unit of exposure period is time. One has to be careful
in measuring time in these cases (Musa, 1975). For example, if a multiuser,
interactive data base system is being accessed by five users, should the exposure
period be five times the observed time? This may be reasonable if the system is
not saturated since then five users are likely to generate approximately five times
as much work in the observed time as would a single user. However, this is not
true if the system is saturated.
Thus, we have:
R(i) = reliability over i runs = P{no failure over i runs},    (1)
R(t) = reliability over t seconds = P{no failure in interval [0, t)}.    (2)
(P{E} denotes the probability of the event E.)
Definition (1) leads to an intuitive measure of software reliability. Assuming
that inputs are selected independently according to some probability distribution
function, we have:
$R(i) = [R(1)]^i = R^i,$
where R = R(1). We can define the reliability, R, as follows:
$R = 1 - \lim_{n \to \infty} \frac{n_f}{n},$
where n = number of runs and $n_f$ = number of failures in n runs.
This is the operational definition of software reliability. We can estimate the
reliability of a program by observing the outcomes (success/failure) of a number
of runs under its operating environment. If we observe $n_f$ failures out of n runs,
the estimate of R, denoted by $\hat{R}$, is:
$\hat{R} = 1 - \frac{n_f}{n}.$
This method of estimating R is the basis of the Nelson model (TRW, 1976).
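As a small illustration of the operational definition, the sketch below computes the estimate as one minus the observed failure fraction and then, assuming independently selected inputs, the reliability over i runs as the i-th power of the per-run reliability. The function names and sample numbers are illustrative assumptions.

```python
def estimate_reliability(n_runs, n_failures):
    """Per-run reliability estimate: 1 - (failures / runs)."""
    if n_runs <= 0 or not 0 <= n_failures <= n_runs:
        raise ValueError("need n_runs > 0 and 0 <= n_failures <= n_runs")
    return 1.0 - n_failures / n_runs

def reliability_over_runs(per_run_reliability, i):
    """R(i) = R**i, valid when inputs are selected independently."""
    if i < 0:
        raise ValueError("number of runs must be non-negative")
    return per_run_reliability ** i

# Example: 2 failures observed in 1000 runs, projected over 100 runs.
r_hat = estimate_reliability(1000, 2)
r_100 = reliability_over_runs(r_hat, 100)
```

Note how quickly R(i) decays with i even when the per-run estimate is close to one, which is why long missions demand very high per-run reliability.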
2.2. Classification
In this subsection we present a classification of some of the software reliability
models proposed over the past fifteen years. The classification scheme is based
on the three different methods of estimating software discussed in Section 1. The
main features of a model serve as a subclassification.
After a program has been coded, it enters a testing and debugging phase.
During this phase, the implemented software is tested till an error is detected.
Then the error is located and corrected. The error history of the program is defined
to be the realization of a sequence of random variables $T_1, T_2, \ldots, T_n$, where $T_i$
denotes the time spent in testing the program after the (i - 1)-th error was
corrected till the i-th error is detected. One class of software reliability models
attempts to predict the reliability of a program on the basis of its error history.
It is frequently assumed that the correction of errors does not introduce any new
errors. Hence, the reliability of the program increases, and therefore such models
are called software reliability growth models.
Software reliability growth models can be further classified according to whether
they express the reliability in terms of the number of errors remaining in the
program or not. These constitute error-counting and nonerror-counting models, respectively.
Error-counting models estimate both the number of errors remaining in the
program as well as its reliability. Both deterministic and stochastic models have
been proposed. Deterministic models assume that if the model parameters are
known then the correction of an error results in a known increase in the reliability.
This category includes the Jelinski-Moranda (1972), Shooman (1972), Musa
(1975), and Schick-Wolverton (1978) models. The general Poisson model (Angus
et al., 1980) is a generalization of these four models. Stochastic models include
Littlewood's Bayesian model (Littlewood, 1980a) which models the (usual) case
where larger errors are detected earlier than smaller errors, and the Goel-Okumoto Nonhomogeneous Poisson Process (NHPP) model (Goel and Okumoto,
1979a) which assumes that the number of faults to be detected is a random
variable whose observed value depends on the test and other environmental
factors. Extensions to the Goel-Okumoto NHPP model have been proposed by
Ohba (1984) and Yamada et al. (Yamada et al., 1983; Yamada and Osaki, 1985).
The number of errors remaining in the program is useful in estimating the
maintenance cost. However, with these models it is difficult to incorporate the
case where new errors may be introduced in the program as a result of imperfect
debugging. Further, for some of these models the reliability estimate is unstable
if the estimate of the number of remaining errors is low (Forman and Singpurwalla, 1977; Littlewood and Verrall, 1980b).
Nonerror-counting models only estimate the reliability of the software. The
Jelinski-Moranda geometric de-eutrophication model (Moranda, 1975) and a
simple model used in the Halden project (Dahl and Lahti, 1978) are deterministic
models in this category. Stochastic models consider the situation where different
errors have different effects on the failure rate of the program. The correction of
an error results in a stochastic increase in the reliability. Examples include a
stochastic input domain based model (Ramamoorthy and Bastani, 1980), Littlewood and Verrall's
Bayesian model (Littlewood and Verrall, 1973), and the Musa-Okumoto logarithmic model (Musa and Okumoto, 1984).
All the models described above treat the program as a black box. That is, the
reliability is estimated without regard to the structure of the program. The validity
of their assumptions usually increases as the size of the program increases. Since
programs for critical control systems may be of medium size only, these models
are mainly used to obtain a preliminary estimate of the software reliability.
Several variants of software reliability growth models can be obtained by considering various orthogonal factors such as (1) the development of calendar time
expressions for predictions of MTTF, stopping time, etc. (Musa, 1975; Musa and
Okumoto, 1984); (2) the consideration of the time spent in locating and correcting
errors; this aspect is modelled as a Markov process by Trivedi and Shooman
(1975); and, (3) the possibility of imperfect debugging, including the introduction
of new errors (Goel and Okumoto, 1979b).
The second class of software reliability models, called sampling models, estimate
the reliability of a program on the basis of its behavior for a set of points selected
from its input domain. These models are especially attractive for estimating the
reliability of programs developed for critical applications, such as air-traffic control programs, which must be shown to have a high reliability prior to actual use.
At the end of the testing and debugging phase, the software is subjected to a large
amount of testing in order to assess its reliability. Errors found during this phase
are not corrected. In fact, if errors are discovered then the software may be rejected.
One sampling model is the Nelson model developed at TRW (1976). It assumes
that the software is tested with test cases having the same distribution as the
actual operating environment. The operational definition discussed earlier is used
to obtain the reliability estimate.
The main disadvantage of the Nelson model is that a large number of test cases
is required in order to have a high confidence in the reliability estimate. The
approach developed in (Ramamoorthy and Bastani, 1979) reduces the number of
test cases by exploiting the nature of the input domain of the program. An
important feature of this model is that the testing need not be random--any type
of test-selection strategy can be used. However, the model is difficult mathematically and difficult to validate experimentally.
The third approach to assessing software reliability is to insert several known
errors into the program prior to the testing and debugging phase. At the end of
this phase the number of errors remaining in the program can be computed on
the basis of the number of known and unknown errors detected. Models based
on this approach have been proposed by Mills and Basin (Mills and Basin, 1973;
Schick and Wolverton, 1978) and, more recently, by Duran and Wiorkowski
(1981). The major problem is that it is difficult to select errors which have the
same distribution (such as ease of detectability) as the actual errors in the
program. An alternate approach is to let two different teams independently debug
a program and then estimate the number of errors remaining in the program on
the basis of the number of common and disjoint errors found by them. Besides
the extra cost, this method may underestimate the number of errors remaining in
the program since many errors are easy to detect and, hence, are more likely to
be detected by both the teams. DeMillo, Lipton and Sayward (1978) discuss a
related technique called 'program mutation' for systematically seeding errors into
a program.
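The two counting ideas above can be sketched as follows. The chapter does not state the estimators, so this sketch assumes the classical proportional (capture-recapture style) estimators; the function names and the numbers are hypothetical.

```python
def seeding_estimate(n_seeded: int, seeded_found: int, indigenous_found: int) -> float:
    """Estimated number of original (unseeded) errors.

    Assumption: seeded and original errors are equally easy to detect, so the
    fraction of seeded errors found also applies to the original errors.
    """
    if seeded_found <= 0:
        raise ValueError("at least one seeded error must be found")
    return indigenous_found * n_seeded / seeded_found


def two_team_estimate(found_a: int, found_b: int, found_both: int) -> float:
    """Estimated total number of errors from two independent debugging teams.

    Assumption: the teams detect errors independently of each other.
    """
    if found_both <= 0:
        raise ValueError("the teams must have at least one error in common")
    return found_a * found_b / found_both


# Hypothetical seeding experiment: 20 seeded, 15 of them found, 30 original found.
total_seeding = seeding_estimate(20, 15, 30)
remaining_seeding = total_seeding - 30

# Hypothetical two-team experiment: 25 and 20 errors found, 10 in common.
total_teams = two_team_estimate(25, 20, 10)
remaining_teams = total_teams - (25 + 20 - 10)
print(remaining_seeding, remaining_teams)
```

If easy errors are found by both teams more often than independence would predict, `found_both` is inflated and the total is underestimated, which is the bias noted in the text.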
In this section we have classified many software reliability models without
describing them in detail. References (Bologna and Ehrenberger, 1978; Dahl and
Lahti, 1978; Schick and Wolverton, 1978; Tal, 1976; Ramamoorthy and Bastani,
1982; Goel and Okumoto, 1985) contain a detailed survey of most of these
models. In the next two sections we discuss a few software reliability growth
models and sampling models, respectively.
3. Software reliability growth models
In this section we first discuss the concepts of error size and testing process.
We develop a general framework for software reliability growth models using these
concepts. Then we briefly discuss some error-counting and nonerror-counting
models. The section concludes with a discussion on the practical application of
such models.
3.1. Error sizes
A program $P$ maps its input domain, $I$, into its output space, $O$. Each element
in I is mapped to a unique element in O if we assume that the state variables (i.e.,
output variables whose values are used during the next run, as in process control
software) are considered a part of both I and O. Software reliability models used
during the development phase are intimately concerned with the size of an error.
This is defined as follows:
DEFINITION. The size of an error is the probability that an element selected
from I according to the test case selection criterion results in failure due to that error.
An error is easily detected if it has a large size since then it affects many input
elements. Similarly, if it has a small size, then it is relatively more difficult to
detect the error. The size of an error depends on the way the inputs are selected.
Good test case selection strategies, like boundary value testing, path testing and
range testing, magnify the size of an error since they exercise error-prone constructs. Likewise, the observed (effective) error size is lower if the test cases are
randomly chosen from the input domain.
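To illustrate how the size of one and the same error depends on the test case selection criterion, here is a small sketch. The input domain, the failing inputs, and the two selection distributions are hypothetical.

```python
from typing import Mapping, Set


def error_size(selection_prob: Mapping[int, float], failing_inputs: Set[int]) -> float:
    """Probability that an input drawn from selection_prob fails due to the error."""
    return sum(p for x, p in selection_prob.items() if x in failing_inputs)


# Hypothetical input domain 0..99 with an error that only affects the boundaries.
failing = {0, 99}

# Random testing: every input is equally likely.
uniform = {x: 1 / 100 for x in range(100)}

# Boundary value testing (assumed mix): half of the test cases hit a boundary,
# the other half are spread evenly over the interior points.
boundary = {x: 0.5 / 98 for x in range(1, 99)}
boundary[0] = 0.25
boundary[99] = 0.25

size_uniform = error_size(uniform, failing)
size_boundary = error_size(boundary, failing)
print(size_uniform, size_boundary)
```

Under these assumptions the same error has size 0.02 under random testing and 0.5 under the boundary-oriented strategy.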
We can generalize the notion of 'error size' by basing it on the different
methods of observing programs. For example, an error has a large size visually
if it can be easily detected by code reading. Similarly, an error is difficult to detect
by code review if it has a small size (e.g., when only one character is missing).
The development phase is assumed to consist of the following cycle:
(1) The program is tested till an error is found;
(2) The error is corrected and step (1) is repeated.
As we have noted above, the error history of a program depends on the testing
strategy employed, so that the reliability models must consider the testing process
used. This is discussed in the following subsection.
3.2. Testing process
As a simple example of a case where the error history is strongly dependent
on the testing process used, consider a program which has three paths, thus
partitioning the input domain into three disjoint subsets. If each input is considered as equally likely, then initially errors are frequently detected. As these are
corrected, the interval between error detection increases since fewer errors remain.
If a path is tested 'well' before testing another path, then whenever a switch is
made to a new path the error detection rate increases. Similarly, if we switch from
random testing to boundary value testing, the error detection rate can increase.
The major assumption of all software reliability growth models is:
ASSUMPTION. Inputs are selected randomly and independently from the input
domain according to the operational distribution.
This is a very strong assumption and will not hold in general, especially so in
the case of process control software where successive inputs are correlated in time
during system operation. For example, if an input corresponds to a temperature
reading then it cannot change very rapidly. To complicate the issue further, most
process control software systems maintain a history of the input variables. The
input to the program is not only the current sensor inputs, but also their history.
This further reduces the validity of the above assumption. The assumption is
necessary in order to keep the analysis and data requirements simple. However,
it is possible to relax it as follows:
ASSUMPTION. Inputs are selected randomly and independently from the input
domain according to some probability distribution (which can change with time).
This means that the effective error size varies with time even though the
program is not changed. This permits a straightforward modelling of the testing
process as discussed in the following subsection.
3.3. General growth model
Let
$j$ = number of failures experienced;
$k$ = number of runs since the $j$-th failure;
$T_j(k)$ = testing process for the $k$-th run after $j$ failures;
$V_j(k)$ = size of residual errors for the $k$-th run after $j$ failures; this can be random.
Then
$$P\{\text{success on the } k\text{-th run} \mid j \text{ failures}\} = 1 - V_j(k) = 1 - f(T_j(k))\,\lambda_j,$$
where $\lambda_j$ = error size under operational inputs; this can be a random variable;
$0 \le \lambda_j \le 1$; and $f(T_j(k))$ = severity of the testing process relative to the operational
inputs; $0 \le f(T_j(k)) \le 1/\lambda_j$.
Now
$$R_j(k \mid \lambda_j) = P\{\text{no failure over } k \text{ runs} \mid \lambda_j\} = \prod_{i=1}^{k} P\{\text{no failure on the } i\text{-th run} \mid \lambda_j\},$$
since successive test cases have independent failure probability. Hence
$$R_j(k \mid \lambda_j) = \prod_{i=1}^{k} \left[1 - f(T_j(i))\,\lambda_j\right],$$
$$R_j(k) = E_{\lambda_j}\!\left[\prod_{i=1}^{k} \left[1 - f(T_j(i))\,\lambda_j\right]\right],$$
where $E_{\lambda_j}[\cdot]$ is the expectation over $\lambda_j$.
For cases where it is difficult to identify 'runs', such as operating systems and
real-time process control systems, it is simpler to work in continuous time. The
above relation becomes:
$$R_j(t) = E_{\lambda_j}\!\left[\exp\!\left(-\lambda_j \int_0^t f(T_j(s))\,ds\right)\right],$$
where
$\lambda_j$ = failure rate after the $j$-th failure; $0 \le \lambda_j \le \infty$;
$T_j(s)$ = testing process at time $s$ after the $j$-th failure;
$f(T_j(s))$ = severity of testing process relative to operational distribution;
$0 \le f(T_j(s)) \le \infty$.
(1) As we have noted above, f(Tj(.)) is the severity of the testing
process relative to the operational distribution, where the testing severity is the
ratio of the probability that a run based on the test case selection strategy detects
an error to the probability that a failure occurs on a run selected according to the
operational distribution. Obviously, during the operational phase, f(Tj(.)) = 1. In
general it is difficult to determine the severity of the test cases, and most models
assume that f(Tj(.)) = 1. However, for some testing strategies we can quantify
f(Tj(.)). For example, in functional testing, the severity increases as we switch to
new functions since these are more likely to contain errors than functions which
have already been tested.
(2) Even the weaker assumption is difficult to justify for programs developed
using incremental top-down or bottom-up integration (Myers, 1978), since the
input domain keeps on changing. Further, the assumption ignores other methods
of debugging programs, such as code reviews, static analysis, program proofs, etc.
(3) In the continuous case, the time is the CPU time (Musa, 1975).
(4) Software reliability growth models can be applied (in principle) to any type
of software. However, their validity increases as the size of the software and the
number of programmers involved increases.
(5) This process is a type of doubly stochastic process; these processes were
originally studied by Cox in 1955 (Cox, 1966).
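The discrete-time relation of Section 3.3 (reliability over k runs as a product of per-run success probabilities, averaged over the error size) can be sketched numerically. This is only an illustration under stated assumptions: the severity values, the distribution chosen for the error size, and all function names are hypothetical.

```python
import random
from typing import Callable, Sequence


def reliability_given_lambda(lam: float, severities: Sequence[float]) -> float:
    """Product over runs i of [1 - f(T_j(i)) * lambda_j] for a fixed error size."""
    r = 1.0
    for f in severities:
        p_fail = f * lam
        if not 0.0 <= p_fail <= 1.0:
            raise ValueError("f * lambda must be a probability")
        r *= 1.0 - p_fail
    return r


def expected_reliability(sample_lambda: Callable[[], float],
                         severities: Sequence[float],
                         n_samples: int = 10000) -> float:
    """Monte Carlo estimate of the expectation of the product over lambda_j."""
    total = 0.0
    for _ in range(n_samples):
        total += reliability_given_lambda(sample_lambda(), severities)
    return total / n_samples


# Operational inputs: severity 1 on each of 10 runs, error size 0.01.
r_operational = reliability_given_lambda(0.01, [1.0] * 10)

# A test strategy assumed to be twice as severe as operational use.
r_severe = reliability_given_lambda(0.01, [2.0] * 10)

# Random error size, here assumed uniform on (0, 0.02) purely for illustration.
rng = random.Random(0)
r_random = expected_reliability(lambda: rng.uniform(0.0, 0.02), [1.0] * 10)
print(r_operational, r_severe, r_random)
```

A more severe testing process lowers the probability of surviving the same number of runs, which is why the observed error history depends on the testing strategy.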
3.4. Error-counting models
These models attempt to estimate the software reliability in terms of the estimated number of errors remaining in the program. The Jelinski-Moranda model
(1972) was the first error-counting model. The Shooman model (1972) underwent
some changes and is now similar to the Jelinski-Moranda model. The
Schick-Wolverton model (1978) extended the Jelinski-Moranda model by incorporating a factor representing the severity of the test cases. The Musa model
(1975) is equivalent to the Jelinski-Moranda model. However, it is better developed and is the first model to insist on execution time data rather than the
calendar time data used in the earlier models. These early models assumed that
all the errors had the same error rate. This is clearly unsatisfactory since one
would expect that errors which are detected later should have smaller (operational) error rates than those which are detected earlier. This is rectified by
Littlewood's model (1980a) which incorporates the case where the failure rate of
successive errors is stochastically decreasing. The Goel-Okumoto NHPP model
(1979a) makes another departure from the other models by treating the number
of faults to be detected as a random variable instead of a fixed unknown constant.
Two additional assumptions made by most error-counting models are:
(a) The failure rates of the errors remaining in the program are independently
identically distributed random variables.
(b) The program failure rate is the sum of the individual failure rates.
Taken together these assumptions are not true in general since the error distribution across modules is often skewed (Myers, 1978), so that a few complex,
error-prone modules contain a large proportion of the errors. Since there is likely
to be a considerable overlap in the elements (in the input domain) affected by
such closely related errors, the removal of each error, except the last error,
decreases the failure rate by less than its own failure rate. This can result in an
incorrect estimate of the reliability of the program since each detected error would
be perceived as having a failure rate smaller than its actual failure rate. Further,
a common testing strategy is to direct subsequent test cases at the module in
which an error was most recently detected till sufficient confidence is restored in
its correctness. However, this would mean that the failure rates are no longer
independently identically distributed.
In order to illustrate models in this category, we now present the details of the
general Poisson model (GPM) discussed in (Angus et al., 1980). It generalizes the
Jelinski-Moranda linear de-eutrophication model, the Shooman model, and the
Schick-Wolverton model. The key parts of the Musa model are also generalized
by this model.
The inputs to the model are (1) $t_1, t_2, \ldots, t_n$, where $t_j$ is the time required to
detect the $j$-th failure after the error(s) causing the $(j-1)$-th failure has (have)
been corrected, and (2) $m_1, m_2, \ldots, m_n$, where $m_j$ is the number of errors fixed
as a result of the $j$-th failure.
The GPM model assumes that
$$f(T_j(s)) = \alpha s^{\alpha - 1},$$
$$\lambda_j = (N - M_j)\,\phi,$$
where $N$ is the number of errors originally present, $M_j = \sum_{i=1}^{j} m_i$, and $\alpha$, $\phi$ are
constants. Thus,
$$R_j(t) = e^{-\phi (N - M_j)\, t^{\alpha}}.$$
The assumptions of the G P M model are as follows:
(1) consecutive inputs have independent failure probabilities,
(2) all errors have the same disjoint failure rate $\phi$,
(3) the severity of the testing process is proportional to a power of the elapsed
CPU time,
(4) no new errors are introduced.
Assumption (1) has already been discussed above. Assumption (2) is a major
drawback of these models (Littlewood, 1980a): earlier errors are likely to have a
larger failure rate since they are detected more easily. Assumption (3) depends to
a large extent on the testing strategy used. Intuitively, as time increases, the
severity of the testing increases (Schick and Wolverton, 1978). Assumption (4) is
not true in general and can lead to invalid estimates (Angus et al., 1980). Musa
(1975) partly overcomes this by estimating the total number of errors to be
eventually detected.
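The Maximum Likelihood Estimates (MLE) for the parameters of the model
can be derived as follows:
$$\text{failure PDF}_j(t) = \phi (N - M_j)\,\alpha t^{\alpha - 1} e^{-\phi (N - M_j)\, t^{\alpha}}.$$
The likelihood function is
$$L = \prod_{j=1}^{n} \text{PDF}_{j-1}(t_j).$$
Hence, the log likelihood function is
$$\log L = n \log \phi + n \log \alpha + \sum_{j=1}^{n} \log(N - M_{j-1}) + (\alpha - 1) \sum_{j=1}^{n} \log t_j - \phi \sum_{j=1}^{n} (N - M_{j-1})\, t_j^{\alpha}.$$
The MLE's can be computed by numerically solving the equations obtained by
equating the partial derivatives of $\log L$ with respect to $N$, $\alpha$, and $\phi$ to 0. The final
equations are as follows:
$$\sum_{j=1}^{n} \frac{1}{\hat{N} - M_{j-1}} - \hat{\phi} \sum_{j=1}^{n} t_j^{\hat{\alpha}} = 0,$$
$$\frac{n}{\hat{\alpha}} + \sum_{j=1}^{n} \log t_j - \sum_{j=1}^{n} \hat{\phi}\,(\hat{N} - M_{j-1})\, t_j^{\hat{\alpha}} \log t_j = 0,$$
$$\frac{n}{\hat{\phi}} - \sum_{j=1}^{n} (\hat{N} - M_{j-1})\, t_j^{\hat{\alpha}} = 0.$$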
These are discussed further in (Angus et al., 1980).
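One way to explore the GPM likelihood numerically is sketched below. It is not the procedure of Angus et al.: for fixed N and alpha the equation for phi has a closed-form solution (n divided by the weighted sum of the powered times), and the sketch simply scans a coarse grid over N and alpha. The failure data, the grids, and the function names are hypothetical.

```python
import math
from typing import List, Optional, Sequence, Tuple


def _remaining_errors(n_errors: float, fixes: Sequence[int]) -> List[float]:
    """Return N - M_{j-1} for j = 1..n, with M_0 = 0."""
    remaining, corrected = [], 0
    for m_j in fixes:
        remaining.append(n_errors - corrected)
        corrected += m_j
    return remaining


def phi_hat(n_errors: float, alpha: float,
            times: Sequence[float], fixes: Sequence[int]) -> float:
    """Closed-form root of d(log L)/d(phi) = 0 for fixed N and alpha."""
    rem = _remaining_errors(n_errors, fixes)
    return len(times) / sum(r * t ** alpha for r, t in zip(rem, times))


def gpm_log_likelihood(n_errors: float, alpha: float, phi: float,
                       times: Sequence[float], fixes: Sequence[int]) -> float:
    """Log likelihood of the general Poisson model for inter-failure times."""
    rem = _remaining_errors(n_errors, fixes)
    if min(rem) <= 0:
        return -math.inf  # N must exceed the errors corrected before each failure
    n = len(times)
    return (n * math.log(phi) + n * math.log(alpha)
            + sum(math.log(r) for r in rem)
            + (alpha - 1.0) * sum(math.log(t) for t in times)
            - phi * sum(r * t ** alpha for r, t in zip(rem, times)))


def fit_gpm_grid(times: Sequence[float], fixes: Sequence[int],
                 n_grid: Sequence[int],
                 alpha_grid: Sequence[float]) -> Optional[Tuple[float, int, float, float]]:
    """Coarse grid search over (N, alpha); phi is profiled out in closed form."""
    best = None
    for n_errors in n_grid:
        if n_errors <= sum(fixes[:-1]):
            continue
        for alpha in alpha_grid:
            phi = phi_hat(n_errors, alpha, times, fixes)
            ll = gpm_log_likelihood(n_errors, alpha, phi, times, fixes)
            if best is None or ll > best[0]:
                best = (ll, n_errors, alpha, phi)
    return best


# Hypothetical inter-failure times (CPU hours), one error fixed per failure.
times = [5.0, 8.0, 12.0, 20.0, 35.0, 60.0]
fixes = [1, 1, 1, 1, 1, 1]
alpha_grid = [0.5 + 0.1 * i for i in range(16)]
log_l, n_hat, alpha_hat, phi_est = fit_gpm_grid(times, fixes, range(6, 51), alpha_grid)
print(n_hat, alpha_hat, phi_est)
```

A grid search is crude; in practice the three equations would be handed to a proper root finder, and an estimate that lands on the edge of the grid suggests that the likelihood has no interior maximum for the data at hand.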
3.5. Nonerror-counting models
These models only estimate the reliability of the software. They consider the
effect of a debugging action on the error size or on the failure rate without concern
as to the number of errors detected at a time. For example, in the Jelinski-Moranda Geometric De-eutrophication model, we have
$$\lambda_j = D\,\lambda_{j-1},$$
where $\lambda_j$ is the error rate and $D$ is a constant to be estimated. An interesting
observation is that the estimate of the parameters of this model may exist even
in cases where those of the linear de-eutrophication model do not exist, i.e., fail
to converge (Dahl and Lahti, 1978; Tal, 1976). Similarly, for the Littlewood-Verrall Bayesian model (1973) we have
$$\lambda_j \stackrel{st}{\le} \lambda_{j-1}.$$
This models the case where there is a possibility that a debugging action may
introduce new errors into the program. For the stochastic input domain based
model (Ramamoorthy and Bastani, 1980) we have:
$$\lambda_{j-1} - \lambda_j \sim \lambda_{j-1} X,$$
where $\lambda_j$ is the error size and $X$ is a random variable having a piecewise continuous distribution. This models the case where errors detected later have
(stochastically) smaller sizes than those detected earlier.
In order to illustrate models in this category, we present details of the Musa-Okumoto
Logarithmic model (1984). The inputs to the model are $t_1, t_2, \ldots, t_n$,
where $t_j$ is the time (not interval) at which the $j$-th error was detected. In this
model,
$$f(T_j(s)) = 1,$$
$$\lambda(t) = \frac{\lambda_0}{\lambda_0 \theta t + 1}.$$
Thus, the model assumes that the failure rate decreases continuously over the
testing and debugging phase, rather than at discrete points corresponding to error
correction times. Further, the rate of decrease in $\lambda(t)$ itself decreases with time,
thus modelling the decrease in the size of errors detected as debugging proceeds.
$$R_j(t) = e^{-\int_{t_j}^{t_j + t} \lambda(s)\,ds} = \left\{\frac{\lambda_0 \theta t_j + 1}{\lambda_0 \theta (t_j + t) + 1}\right\}^{1/\theta}.$$
From this, the failure probability density function is
$$\text{failure PDF}_j(t) = \lambda(t_j + t)\, e^{-\int_{t_j}^{t_j + t} \lambda(s)\,ds}.$$
Hence, the likelihood function is
$$L = \left\{\prod_{j=1}^{n} \lambda(t_j)\right\} e^{-\int_0^{t_n} \lambda(s)\,ds}.$$
Taking the logarithm of the likelihood function, we get
$$\log L = n \log \lambda_0 - \sum_{j=1}^{n} \log(\lambda_0 \theta t_j + 1) - \frac{1}{\theta} \log(\lambda_0 \theta t_n + 1).$$
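Setting the derivative of $\log L$ with respect to $\lambda_0$ and $\theta$ to 0 yields two equations
which can be solved numerically for the maximum likelihood estimates of $\lambda_0$ and
$\theta$, i.e., $\hat{\lambda}_0$ and $\hat{\theta}$:
$$\frac{n}{\hat{\lambda}_0} - \hat{\theta} \sum_{j=1}^{n} \frac{t_j}{\hat{\lambda}_0 \hat{\theta} t_j + 1} - \frac{t_n}{\hat{\lambda}_0 \hat{\theta} t_n + 1} = 0,$$
$$-\hat{\lambda}_0 \sum_{j=1}^{n} \frac{t_j}{\hat{\lambda}_0 \hat{\theta} t_j + 1} + \frac{\log(\hat{\lambda}_0 \hat{\theta} t_n + 1)}{\hat{\theta}^2} - \frac{\hat{\lambda}_0 t_n}{\hat{\theta}\,(\hat{\lambda}_0 \hat{\theta} t_n + 1)} = 0.$$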
Experience has shown that this model is more accurate than the earlier model
proposed by Musa (1975). Further discussions concerning the application of the
new model appear in (Musa and Okumoto, 1984).
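The quantities of the logarithmic model are simple to evaluate. The sketch below codes the failure rate lambda_0 / (lambda_0 * theta * t + 1), the closed-form reliability, and the log likelihood, and checks the closed form against a numerical integral of the failure rate. The parameter values, failure times, and function names are hypothetical.

```python
import math
from typing import Sequence


def mo_failure_rate(t: float, lam0: float, theta: float) -> float:
    """Failure rate at cumulative execution time t."""
    return lam0 / (lam0 * theta * t + 1.0)


def mo_reliability(t_j: float, t: float, lam0: float, theta: float) -> float:
    """Probability of no failure in (t_j, t_j + t], in closed form."""
    ratio = (lam0 * theta * t_j + 1.0) / (lam0 * theta * (t_j + t) + 1.0)
    return ratio ** (1.0 / theta)


def mo_log_likelihood(times: Sequence[float], lam0: float, theta: float) -> float:
    """Log likelihood for cumulative failure times t_1 < ... < t_n."""
    n = len(times)
    t_n = times[-1]
    return (n * math.log(lam0)
            - sum(math.log(lam0 * theta * t + 1.0) for t in times)
            - math.log(lam0 * theta * t_n + 1.0) / theta)


def reliability_by_integration(t_j: float, t: float, lam0: float, theta: float,
                               steps: int = 20000) -> float:
    """exp(-integral of the failure rate), using the midpoint rule as a cross-check."""
    h = t / steps
    integral = sum(mo_failure_rate(t_j + (i + 0.5) * h, lam0, theta)
                   for i in range(steps)) * h
    return math.exp(-integral)


# Hypothetical parameters and cumulative failure times (CPU hours).
lam0, theta = 0.2, 0.05
failure_times = [4.0, 11.0, 23.0, 41.0, 70.0, 115.0]
closed = mo_reliability(100.0, 50.0, lam0, theta)
numeric = reliability_by_integration(100.0, 50.0, lam0, theta)
print(closed, numeric, mo_log_likelihood(failure_times, lam0, theta))
```

The two reliability values should agree to within the integration error, which is a useful sanity check before passing `mo_log_likelihood` to a numerical optimizer.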
3.6. Summary
We can view $\lambda$ as a random walk process in the interval $(0, e)$. Each time the
program is changed (due to error corrections or other modifications) $\lambda$ changes.
In the formulation of the general model, $\lambda_j$ denotes the state of $\lambda$ after the $j$-th
change to the program. Let $Z_j$ denote the time between failures after the $j$-th
change. $Z_j$ is a random variable whose distribution depends on $\lambda_j$. In all the above
continuous (discrete) time models, we have assumed that this distribution is the
exponential (geometric) distribution with parameter $\lambda_j$, provided that $f(T_j(\cdot)) = 1$.
We do not know anything about the random walk process of $\lambda$ other than a
sample of time between failures. Hence, one approach is to construct a model for
$\lambda$ and fit the parameters of the model to the sample data. Then we assume that
the future behavior of $\lambda$ can be predicted from the behavior of the model.
Some of the models for $\lambda$ which have been developed are as follows:
General Poisson Model (Angus et al., 1980): The set of possible states are $(0, e/N, 2e/N, \ldots, e)$;
$\lambda_j = (N - j)e/N$; the parameters are $e$ and $N$; there is a finite number of states.
Geometric De-Eutrophication Model (Moranda, 1975): The set of possible states are
$(e, ed, ed^2, ed^3, \ldots)$, where $d < 1$; $\lambda_j = e d^j$; the parameters are $e$ and $d$; there is
an infinite (although countable) number of states.
Stochastic (Input Domain) Model (Ramamoorthy and Bastani, 1980): The state is
continuous over the interval $(0, e)$; $\lambda_j = \lambda_{j-1} - \Delta_j$, where $\Delta_j \sim \lambda_{j-1} X$, $X \sim \beta(r, s)$;
the parameters are $r$ and $s$.
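The three state models listed above can be compared by generating the sequence of states each one produces. This is a sketch under assumptions: the stochastic model is read as removing a Beta(r, s) distributed fraction of the current value at each change, and the parameter values and function names are hypothetical.

```python
import random
from typing import List


def gpm_states(e: float, n_errors: int) -> List[float]:
    """General Poisson model with one fix per failure: (N - j) e / N, j = 0..N."""
    return [(n_errors - j) * e / n_errors for j in range(n_errors + 1)]


def geometric_states(e: float, d: float, n_changes: int) -> List[float]:
    """Geometric de-eutrophication model: e * d**j for j = 0..n_changes."""
    return [e * d ** j for j in range(n_changes + 1)]


def stochastic_states(lam0: float, r: float, s: float, n_changes: int,
                      rng: random.Random) -> List[float]:
    """Stochastic model: each change removes a Beta(r, s) fraction of the state."""
    states = [lam0]
    for _ in range(n_changes):
        x = rng.betavariate(r, s)  # fraction of the current value that is removed
        states.append(states[-1] * (1.0 - x))
    return states


rng = random.Random(1)
print(gpm_states(0.1, 4))
print(geometric_states(0.1, 0.5, 4))
print(stochastic_states(0.1, 2.0, 5.0, 4, rng))
```

The first model reaches zero after N changes, the second approaches zero geometrically without reaching it, and the third decreases by a random amount at each change.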
An alternative approach is the Bayesian approach advocated by Littlewood
(1979). In this method, we postulate a prior distribution for each of $\lambda_1, \lambda_2, \ldots, \lambda_j$.
Then based on the sample data, we compute the posterior distribution of $\lambda_{j+1}$.
Some additional discussions appear in (Ramamoorthy, 1980).
Over 50 different software reliability growth models have been proposed so far.
These models yield widely varying predictions for the same set of failure data
(Abdel-Ghaly et al., 1986). Further, any given model gives reasonable predictions
for one set of data and incorrect predictions for other sets of data. This has led
some researchers to propose that for each project several models should be used
and then goodness-of-fit tests should be performed prior to selecting a model that
is valid for the given set of failure data (Goel, 1985; Abdel-Ghaly et al., 1986).
A basic problem with all software reliability growth models is that their assumption that errors are detected as a result of random testing is not true for modern
software development methods. Models which have been validated using data
gathered over a decade ago are not necessarily valid for current projects that use
more systematic methods and tools. As an analogy, consider the task of reviewing
a technical paper. There are (at least) three major types of errors which can creep
into a manuscript. These are (1) spelling, typographical, and other context independent errors, (2) grammatical, organization, style, and other context dependent
errors, and (3) correctness of equations, significance of the contribution, and other
technical errors. Context dependent errors can be detected by random testing (i.e.,
by selecting anyone familiar with the language to review the paper) while three
carefully selected referees are vastly superior to a thousand randomly selected
referees in their ability to detect technical errors. Also, the failure process
observed when all the errors are detected by human beings (testing) is different
from that observed when automated tools such as spelling and grammar checkers
are used. Similarly, in software development we now have tools that can detect
most context independent errors (syntax errors, incorrect procedure calls, etc.)
and context dependent errors (undefined variables, invalid pointers, inaccessible
code segments, etc.). These tools include strongly typed languages and their
compilers, data flow analyzers, etc. The remaining errors are generally the result
of misunderstanding of specifications. These are best detected by formal code
review and walk-through, simulation, verification where possible, and systematic
testing which can be either incremental bottom-up or top-down and which
emphasizes error prone regions of the input domain, such as boundary and
special value points. Again, the failure process when these methods are used is
completely different from that obtained when only random testing is used.
In summary, software reliability growth models treat the program as a black
box. That is, the reliability is estimated without regard to the structure of the
program, number of procedures which have been formally proved/derived, etc.
The validity of their assumption regarding random testing is generally not true for
modern program development methods. Experience shows that with systematic
validation techniques, errors are initially detected in quick succession with an
abrupt transition to an (almost) error free state. Thus, these models can only be
used for obtaining an approximate estimate of the reliability of programs.
4. Sampling models
Software developed for critical applications, like air-traffic control, must be
shown to have a high reliability prior to actual use. Since the possibility of
specification errors exists, program testing must be used in addition to program
proofs. At the end of the development phase, the software is subjected to a large
amount of testing in order to estimate its reliability. Errors found during this
phase are not corrected. In fact, if errors are discovered the software may be
rejected (Ramamoorthy, 1979).
In this section we discuss methods of measuring the reliability of a program
based on the sample selected. We first discuss Nelson's method (MacWilliams,
1973; Nelson, 1978; TRW, 1976) and then a model for estimating the correctness
probability of a program based on its input domain.
4.1. The Nelson model
This model (TRW, 1976) is based on the operational definition of software
reliability given earlier. It is the only model whose theoretical foundations are
sound. However, it suffers from a number of practical drawbacks:
(1) In order to have a high confidence in the reliability estimate, a large number
of test cases must be used.
(2) It does not take into account 'continuity' in the input domain. For example,
if the program is correct for a given test case, then it is likely that it is correct
for all test cases executing the same sequence of statements.
(3) It assumes random sampling of the input domain. Thus, it cannot take
advantage of testing strategies which have a higher probability of detecting errors,
e.g., boundary value testing, etc. Further, for most real-time control systems, the
successive inputs are correlated if the inputs are sensor readings of physical
quantities, like temperature, which cannot change rapidly. In these cases we
cannot perform random testing.
(4) It does not consider any complexity measure of the program, e.g., number
of paths, statements, etc. Generally, a complex program should be tested more
than a simple program for the same confidence in the reliability estimate.
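Drawback (1) can be made concrete with a standard binomial argument that is not part of the chapter. If the true reliability were at most R0, then n independent runs would all succeed with probability at most R0^n, so n failure-free runs support the claim R > R0 with confidence 1 - R0^n. The function names and target values below are hypothetical.

```python
import math


def confidence_after_success(n_runs: int, r0: float) -> float:
    """Confidence that reliability exceeds r0 after n_runs failure-free runs."""
    return 1.0 - r0 ** n_runs


def runs_needed(r0: float, confidence: float) -> int:
    """Smallest n with 1 - r0**n >= confidence, assuming independent random runs."""
    if not (0.0 < r0 < 1.0 and 0.0 < confidence < 1.0):
        raise ValueError("r0 and confidence must lie strictly between 0 and 1")
    return math.ceil(math.log(1.0 - confidence) / math.log(r0))


# Hypothetical targets: reliability above 0.999 with 99% confidence.
n_required = runs_needed(0.999, 0.99)
print(n_required, confidence_after_success(n_required, 0.999))
```

The required number of runs grows roughly in proportion to 1 / (1 - R0), which is why random sampling alone becomes expensive for highly reliable software.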
In order to overcome these drawbacks, the model has been extended (Nelson,
1978) as follows: The input domain is divided into several equivalence classes.
The division can be based on paths or some other criteria when the number of
paths is too large (e.g., program sub-functions). It is assumed that there is some
continuity among the elements in an equivalence class, i.e., if the program
executes correctly for an input from the j-th equivalence class, then it will execute
correctly for any randomly selected input from the same equivalence class with
probability $1 - b_j$, where $b_j \ll 1$. Then:
$$\hat{R} = \sum_{j=1}^{m} P_j\,(1 - b_j),$$
where m = number of equivalence classes; and Pj = probability of selecting an
input from the j-th equivalence class during actual operation.
DISCUSSION. This model is a big improvement over the original model. Some
comments are:
(1) The assignment of values to bj is ad hoc; no theoretical justification is given
for the assignment (Nelson, 1978).
(2) The model uses only one type of complexity measure, namely, number of
paths, functions, etc. However, it does not consider the relative complexity of
each path, function, etc.
Many other interesting aspects of the Nelson model are discussed in (TRW, 1976).
4.2. Input domain based model
This model is discussed in detail in (Ramamoorthy and Bastani, 1979). It
removes most of the objections to the Nelson model. The price is the increased
complexity of the model. The model was developed for assessing the quality of
critical real-time process control programs. In such systems no failures should be
detected during the reliability estimation phase, so that the reliability estimate is
one. Hence, the important metric of concern is the confidence in the reliability
estimate. This model provides an estimate of the conditional probability that the
program is correct for all possible inputs given that it is correct for a given set
of inputs. The basic assumption is that the outcome of each test case provides
at least some stochastic information about the behavior of the program for points
which are close to the test point. The model uses the concept of probabilistic
equivalence classes which is defined as follows: E is a probabilistic equivalence
class if $E$ is a subset of $I$, where $I$ is the input domain of the program $P$, and
$P$ is correct for all elements in $E$, with probability $P\{E \mid X_1, \ldots, X_d\}$, if $P$ is correct
for each $X_i$ in $E$, $i = 1, \ldots, d$. Then, $P\{I \mid X\}$ is the correctness probability of $P$
based on the set of test cases X. (Obviously, the program must be correct for each
element in X.) Probabilistic equivalence classes are derived from the requirements
specification and the program source code in order to minimize control flow
errors. A suggested selection criterion (Ramamoorthy and Bastani, 1979) is:
Let E be a probabilistic equivalence class. X is in E if an error in the program
which affects any element in E can affect X, and vice versa. The results of this
classification scheme are:
(1) It includes all paths without loops since distinct paths differ in at least one
(2) Multiple conditions are treated separately since an error in one condition
need not affect the other conditions.
(3) Loops are restricted to a finite number of repetitions.
In order to further minimize control flow errors, these classes should be intersected with classes derived from the requirements specification (Weyuker and
Ostrand, 1980). Finally, we can estimate the correctness probability of the program using the continuity assumption, namely, closely related points in the input
domain are 'correlated' with respect to the implementation of the function. This
is true in general for algebraic programs where errors usually affect an interval of
nearby points. These regions correspond to high probability equivalence classes,
such as those formed on the basis of program paths. A specific model is developed in (Ramamoorthy and Bastani, 1979). The main result of this model is
$$P\{\text{program is correct for all points in } [a, a + V] \mid \text{it is correct for test cases having successive distances } x_j,\ j = 1, \ldots, n - 1\} = e^{-\lambda V} \prod_{j=1}^{n-1} \frac{2}{1 + e^{-\lambda x_j}},$$
where $\lambda$ is a parameter which is deduced from some measure of the complexity
of the source code.
DISCUSSION. The advantages of this model are:
(1) Any test case selection strategy can be used. This will minimize the testing
effort since we can choose test cases which exercise error-prone constructs.
(2) It does not assume random sampling.
(3) It takes into account the complexity of the program: A simple program is
tested less than a complicated program for the same correctness probability. The
model also yields the optimal testing strategy to be used. Specifically, for algebraic
programs the test cases should be spread out over the input domain for higher
correctness probability.
The disadvantages of the model are:
(1) It is relatively expensive to determine the equivalence classes and their
(2) Incorporation of more general continuity assumptions (e.g., boundary value
relationships) results in mathematically intractable derivations.
4.3. Summary
The models discussed in this section are especially attractive for medium size
programs whose reliability cannot be accurately estimated by using reliability
growth models. These models also have the advantage of considering the structure
of the program. This enables the joint use of program proving and testing in order
to validate the program and assess its reliability (Long et al., 1977).
5. Conclusion
We first defined software reliability and discussed three methods of measuring
it. Then we developed a general framework for software reliability growth models
using the concept of error size and testing process. We distinguished between
error counting and nonerror counting models. If only the reliability estimate is
required, then the nonerror counting models are preferable since they can model
the debugging process more realistically. Error counting models should be used
when an estimate of the number of remaining errors is needed. This may be
required if resources have to be allocated for the maintenance phase (assuming
that the average resource per error correction is known). It is also possible to
estimate the number of errors remaining in a program by using error seeding
techniques. Finally, we briefly discussed two sampling models, namely, the Nelson
model and its extension and an input domain based model.
At the present time no specific software reliability model has found wide acceptance.
This is partly due to the cost involved in gathering failure data and partly because
of the difficulty in modelling the testing process. In the following, we outline a
method combining well established proof procedures with software reliability estimation methods. It is particularly suitable for critical process control systems.
(1) During the testing and debugging phase at least two different software
reliability growth models should be used, primarily for helping the manager to
make decisions such as when to stop testing, etc. Goodness-of-fit tests should be
performed in order to select the model which is most appropriate for the failure
data obtained from the project.
(2) After the reliability growth models indicate that the reliability objective has
been achieved, a sampling model is used in order to get a more accurate estimate
of the reliability of the program.
(a) At first equivalence classes are determined based on the paths in the
program using the selection criterion discussed in Section 4.2. Boundary value
and range testing are performed in order to ensure that the classes are chosen
(b) If the path corresponding to each equivalence class can be verified (e.g., by
using symbolic execution) then the correctness probability of the class is 1.
(c) If the correctness of the path cannot be verified, then the degree of the
equivalence class is estimated. Next, as many test cases as necessary are used so
as to achieve a desired confidence in the correctness of the software.
During the first decade of software reliability research the major emphasis was
on developing models based on various assumptions. This resulted in the proliferation of models, most of which were neither used nor validated. Currently the
consensus appears to be that perhaps there is no single model which can be
applied to all types of projects. Hence, one area of active research is to investigate
whether a set of models can be combined so as to achieve more accurate
reliability estimates for various situations. Other research topics include (1) developing methods of analyzing the confidence in the predictions of a model, and
(2) using software reliability theory to assist with the management of a project
throughout its life cycle.
References
Abdel-Ghaly, A. A., Chan, P. Y. and Littlewood, B. (1986). Evaluation of competing software
reliability predictions. IEEE Trans. Softw. Eng. 12(9).
Angus, J. E., Schafer, R. E. and Sukert, A. (1980). Software reliability model validation. In Proc.
Annu. Rel. and Maintainability Symp., San Francisco, CA, Jan. 1980, 191-199.
Barlow, R. E. and Proschan, F. (1975). Statistical Theory of Reliability and Life Testing. Holt, Rinehart
and Winston, New York.
Bologna, S. and Ehrenberger, W. (1978). Applicability of statistical models for reactor safety software
verification. Unpublished report.
Cox, D. R. and Lewis, P. A. W. (1966). The Statistical Analysis of Series of Events. Methuen, London.
Dahl, G. and Lahti, J. (1978). Investigation of methods for production and verification of computer
programmes with high requirements for reliability. OECD Halden Reactor Project, Preliminary
DeMillo, R. A., Lipton, R. J. and Sayward, F. G. (1978). Hints on test data selection: Help for the
practicing programmer. Computer (IEEE), April, 34-41.
Duran, J. W. and Wiorkowski, J. J. (1981). Capture-recapture sampling for estimating software error content.
IEEE Trans. Softw. Eng. 7(1).
Forman, E. H. and Singpurwalla, N. D. (1977). An empirical stopping rule for debugging and testing
computer software. J. Amer. Stat. Ass. 72, 750-757.
Goel, A. L. and Okumoto, K. (1979a). A time-dependent error-detection rate model for software
reliability and other performance measures. IEEE Trans. Rel. 28(3), 206-211.
Goel, A. L. and Okumoto, K. (1979b). A Markovian model for reliability and other performance
measures for software systems. In Proc. Nat. Comput. Conf., New York 48, 767-774.
Goel, A. L. (1985). Software reliability models: Assumptions, limitations, and applicability. IEEE
Trans. Softw. Eng. 11(12), 1411-1423.
Jelinski, Z. and Moranda, P. (1972). Software reliability research. In: W. Freiberger, ed., Statistical
Computer Performance Evaluation. Academic Press, New York, 465-484.
Littlewood, B. and Verrall, J. L. (1973). A Bayesian reliability growth model for computer software.
J. Roy. Stat. Soc. 22(3), 332-346.
Littlewood, B. (1979). How to measure software reliability and how not to... IEEE Trans. Rel. 28,
Littlewood, B. (1980a). A Bayesian differential debugging model for software reliability. Proc.
COMPSAC '80. Chicago, IL, 511-519.
Littlewood, B. and Verrall, J. L. (1980b). On the likelihood function of a debugging model for
computer software reliability. Dep. Math., City Univ., London.
Long, A. B. et al. (1977). A methodology for the development and validation of critical software for
nuclear power plants. Proc. 1st Int. Conf. Comp. Softw. & Appl. (COMPSAC '77). Chicago, IL.
MacWilliams, W. H. (1973). Reliability of large real-time control software systems. In: Rec. 1973
IEEE Symp. Comput. Softw. Rel. New York, 1-6.
Mills, H. D. (1973). On the development of large reliable software. Rec. IEEE Symp. Comp. Softw.
Rel. New York, 155-159.
Moranda, P. B. (1975). Prediction of software reliability during debugging. In: Proc. 1975 Annu. Rel.
and Maintainability Symp. Washington, DC, 327-332.
Musa, J. D. (1975). A theory of software reliability and its applications. IEEE Trans. Softw. Eng.
1(3), 312-327.
Musa, J. D. and Okumoto, K. (1984). A logarithmic Poisson execution time model for software
reliability measurement. In: Proc. 7th Int. Conf. Softw. Eng., Orlando, FL, 230-237.
Myers, G. J. (1978). The Art of Software Testing. Wiley, New York.
Nelson, E. (1978). Estimating software reliability from test data. Microelectronics and Reliability 17,
Ohba, M. (1984). Software reliability analysis models. IBM J. Res. Develop. 28, 428-443.
Ramamoorthy, C. V. and Bastani, F. B. (1979). An input domain based approach to the quantitative
estimation of software reliability. Proc. Taipei Sem. on Softw. Eng. Taipei.
Ramamoorthy, C. V. and Bastani, F. B. (1980). Modelling of the software reliability growth process.
In: Proc. COMPSAC '80, Chicago, IL, 161-169.
Ramamoorthy, C. V. and Bastani, F. B. (1982). Software reliability: Status and perspectives. IEEE
Trans. Softw. Eng. 8(4), 354-371.
Schick, G. J. and Wolverton, R. W. (1978). An analysis of competing software reliability models.
IEEE Trans. Softw. Eng. 4(2), 104-120.
Shooman, M. L. (1972). Probability models for software reliability prediction. In: W. Freiberger, ed.,
Statistical Computer Performance Evaluation. Academic Press, New York, 485-502.
Tal, J. (1976). Development and evaluation of software reliability estimators. UTEC SR 77-013, Univ.
of Utah, Elect. Eng. Dep., Salt Lake City, UT.
Trivedi, A. K. and Shooman, M. L. (1975). A many-state Markov model for the estimation and
prediction of computer software performance parameters. In: Proc. 1975 Int. Conf. Rel. Softw., Los
Angeles, CA, 208-220.
TRW Defense and Space Systems Group (1976). Software Reliability Study. Rep. No. 76-2260.1-9-5,
TRW, Redondo Beach, CA.
Weyuker, E. J. and Ostrand, T. J. (1980). Theories of program testing and the application of revealing
subdomains. IEEE Trans. Softw. Eng. 6(3), 236-246.
Yamada, S., Ohba, M. and Osaki, S. (1983). S-shaped reliability growth modeling for software error
detection. IEEE Trans. Rel. 32, 475-478.
Yamada, S. and Osaki, S. (1985). Software reliability growth modeling: Models and applications.
IEEE Trans. Softw. Eng. 11(12), 1431-1437.
P. R. Krishnaiah and C. R. Rao, eds., Handbook of Statistics, Vol. 7
© Elsevier Science Publishers B.V. (1988) 27-54
Stress-Strength Models for Reliability
Richard A. Johnson
1. Introduction
It is a well accepted fact that the strength of a manufactured unit is a variable
quantity that should be modeled as a random variable. This fact forms the basis
for all of reliability modeling. A second source of variability may also have to be
taken into account. When ascertaining the reliability of equipment or the viability
of a material, it is also necessary to take into account the stress conditions of the
operating environment. That is, uncertainty about the actual environmental stress
to be encountered should be modeled as random. The terminology stress-strength
model makes explicit that both stress and strength are treated as random variables.
Let X be the stress placed on a unit by its operating environment. In many
applications, X is taken to represent the maximum value attained by a critical kind
of stress. Lloyd and Lipow (1962) describe an application where X is the maximum chamber pressure generated by the ignition of a solid propellant in a rocket
engine. Kececioglu (1972) discusses a case where a torsion stress is the most
critical type of stress for a rotating steel shaft on a computer. Typically, the stress
variable is the most difficult to model accurately because of the lack of sufficient data.
In the simplest stress-strength model, X is the stress placed on the unit by the
operating environment and Y is the strength of the unit. A unit is able to perform
its intended function if its strength is greater than the stress imposed upon it. In
this context, we define reliability (R) as
R = probability that the unit performs its task satisfactorily.
That is, reliability is the probability that the unit is strong enough to overcome
the stress.
Let the stress X have continuous distribution F(x) and strength Y have continuous distribution G(y). When X and Y can be treated as independent,
$$R = \int F(y)\,dG(y) = \int [1 - G(x)]\,dF(x) = P[Y > X].$$
This model, first considered by Birnbaum (1956), has found an increasing number
of applications in civil, mechanical and aerospace engineering.
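The integral form of R can be checked numerically. The following is a minimal sketch, assuming exponential stress and strength with hypothetical means `theta1` and `theta2` (illustrative values, not taken from the text); for this case the closed form R = 1/(1 + θ1/θ2) is available for comparison. The function name and the midpoint rule are choices made here for illustration only.

```python
import math

def stress_strength_reliability(stress_cdf, strength_pdf, upper, steps=200_000):
    """Midpoint-rule approximation of R = integral of F(y) g(y) dy over [0, upper]."""
    h = upper / steps
    total = 0.0
    for i in range(steps):
        y = (i + 0.5) * h
        total += stress_cdf(y) * strength_pdf(y)
    return total * h

# Hypothetical parameters: mean stress 1, mean strength 4.
theta1, theta2 = 1.0, 4.0

def stress_cdf(x):
    return 1.0 - math.exp(-x / theta1)

def strength_pdf(y):
    return math.exp(-y / theta2) / theta2

r_numeric = stress_strength_reliability(stress_cdf, strength_pdf, upper=200.0)
r_closed = 1.0 / (1.0 + theta1 / theta2)
```

The truncation point `upper` only needs to be large enough that the strength density is negligible beyond it.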
The following examples help to delineate the versatility of the model.
EXAMPLE 1.1 (Rocket engines). Let X represent the maximum chamber pressure generated by ignition of a solid propellant, and Y be the strength of the
rocket chamber. Then R is the probability of a successful firing of the engine.
EXAMPLE 1.2 (Comparing two treatments). A standard design for the comparison of two drugs is to assign Drug A to one group of subjects and Drug B
to another group. Denote by X and Y the remission times with Drug A and
Drug B, respectively. Inferences about $R = P[Y > X]$, based on the remission
time data $X_1, X_2, \ldots, X_m$ and $Y_1, Y_2, \ldots, Y_n$, are of primary interest to the
experimenter. Although the name 'stress-strength' is not appropriate in the
present context, our target of inference is the parameter R which has the same
structure as in Example 1.1.
EXAMPLE 1.3 (Threshold response model). A unit, say a receptor in the human
eye, operates only if it is stimulated by a source whose random magnitude, Y, is
greater than a (random) lower threshold for the unit. Here
P[ Y> X] = P[unit operates]
is again of the form described above in stress-strength context.
2. Nonparametric inference about stress-strength reliability
Let the data consist of a random sample of size m of stresses $X_1, X_2, \ldots, X_m$
and an independent random sample of size n of strengths $Y_1, Y_2, \ldots, Y_n$.
Birnbaum (1956) proposed the point estimate
$$\hat R = \frac{U}{mn}$$
where U = number of pairs $(X_i, Y_j)$ with $Y_j > X_i$. Alternatively, we can express $\hat R$ as
$$\hat R = \int F_m(y)\,dG_n(y)$$
where $F_m(\cdot)$ and $G_n(\cdot)$ are the empirical cdfs of the X's and Y's, respectively. Then
$$E(\hat R) = \frac{1}{mn}\sum_{i=1}^{m}\sum_{j=1}^{n} P[Y_j > X_i] = P[Y > X] = R$$
so $\hat R$ is an unbiased estimator of reliability. Under the assumption that the
underlying cdfs F(.) and G(.) are continuous, the order statistics
$X_{(1)} \le \cdots \le X_{(m)}$ and $Y_{(1)} \le \cdots \le Y_{(n)}$ are complete sufficient statistics so that $\hat R$
is the unique uniform minimum variance unbiased estimator of R.
Another equivalent expression for $\hat R$ takes advantage of the relation between the
Mann-Whitney and Wilcoxon forms of the two-sample rank statistics. That is,
$$mn\hat R = U = mn + \frac{m(m+1)}{2} - \sum_{i=1}^{m} \operatorname{rank}(X_i),$$
where rank($X_i$) is the rank of $X_i$ in the combined sample.
Owen, Craswell and Hansen (1964) point out that $\hat R$ remains unbiased even if
F(.) and G(.) are not continuous.
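The two forms of the estimate, the direct pair count and the rank-sum form, can be compared on a small data set. This is a sketch only: the function names and the sample values are hypothetical, and the rank version assumes there are no ties in the combined sample.

```python
def birnbaum_estimate(xs, ys):
    """U/(mn), where U counts the pairs (x, y) with y > x."""
    u = sum(1 for x in xs for y in ys if y > x)
    return u / (len(xs) * len(ys))

def birnbaum_estimate_from_ranks(xs, ys):
    """Same estimate from the Wilcoxon rank sum of the stresses (no ties assumed)."""
    m, n = len(xs), len(ys)
    combined = sorted(list(xs) + list(ys))
    rank = {value: position + 1 for position, value in enumerate(combined)}
    rank_sum_x = sum(rank[x] for x in xs)
    u = m * n + m * (m + 1) // 2 - rank_sum_x
    return u / (m * n)

# Hypothetical stress and strength observations.
stress = [3.1, 4.7, 2.2, 5.9]
strength = [6.4, 5.1, 7.3, 4.9, 8.0]

r_pairs = birnbaum_estimate(stress, strength)
r_ranks = birnbaum_estimate_from_ranks(stress, strength)
```

For these values 18 of the 20 pairs have strength above stress, so both routes give 0.9. The rank form needs only one sort, which matters when m and n are large.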
It is also possible to obtain a distribution-free lower confidence bound on
$R = P[Y > X]$
based on $\hat R$. First note that,
$$\hat R - R = \int F_m(y)\,dG_n(y) - \int F(y)\,dG(y) = \int_{-\infty}^{\infty} [F_m(y) - F(y)]\,dG_n(y) + \int_{-\infty}^{\infty} F(y)\,[dG_n(y) - dG(y)]. \qquad (2.1)$$
Birnbaum and McCarthy (1958) bound the right hand side of relation (2.1) by
$$\sup_x [F_m(x) - F(x)] + \sup_y [G(y) - G_n(y)] = D_m^- + D_n^+$$
where $D_m^+$ ($D_m^-$) is the Smirnov statistic based on a random sample of size m. Hence
$$P[\hat R - R \le c] \ge P[D_m^- + D_n^+ \le c] \qquad (2.2)$$
so a conservative lower confidence bound can be obtained for R from the distribution of $D_m^- + D_n^+$. If c is selected so that $1 - \alpha \le P[D_m^- + D_n^+ \le c]$, then
$$P[\hat R - c \le R] \ge 1 - \alpha.$$
Thus $\hat R - c$ is a conservative 100(1 - α)% confidence bound for R.
Under the transformation $Z = F(X)$, $F_m(x) - F(x) = H_m(z) - z$ where
$H_m(z)$ = (number of $F(X_i) \le z$)/m. Since the $Z_i = F(X_i)$ are independent uniform
random variables, the distribution of $D_m^+$, and that of $D_n^-$, are free of F(.) and
G(.). Furthermore, $P[D_m^- < d] = P[D_m^+ < d]$ and it is well known that
$$P[D_m^+ < d] - L(d\sqrt{m}) \to 0$$
uniformly in d, where $L(z) = 1 - \exp(-2z^2)$. This
suggests the approximation
$$P[D_m^- + D_n^+ \le c] \approx \int_0^c L((c - u)\sqrt{n})\,dL(u\sqrt{m}) = 1 - \frac{n e^{-2mc^2} + m e^{-2nc^2}}{m + n} - \frac{2\sqrt{2\pi}\,mnc\,e^{-2mnc^2/(m+n)}}{(m + n)^{3/2}} \int_{-2nc/\sqrt{m+n}}^{2mc/\sqrt{m+n}} \frac{1}{\sqrt{2\pi}}\,e^{-t^2/2}\,dt. \qquad (2.4)$$
Since $D_m^-$ and $D_n^+$ are independent, both $P[D_m^- + D_n^+ \le c] = P[D_m^+ + D_n^+ \le c]$
and the approximation are symmetric in the sample sizes m and n. Owen,
Craswell and Hansen (1964) present tables, based on approximation (2.4), which
extend those presented by Birnbaum and McCarthy (1958). The tables are entered,
using the confidence level and λ where $\lambda = m/(n + m)$, to obtain $\delta = c\sqrt{m + n}$ in
our notation. Note that their upper bound on $P[X > Y]$ yields our lower bound
on $R = P[Y > X]$.
EXAMPLE 2.1. Suppose we have m = 20 values of maximum rocket pressures
and n = 30 observations on the strength of the chambers. Counting cases of strict
inequality in our samples, we obtain U = 591 so $\hat R = 591/(30)(20) = 0.985$ and
$\lambda = 20/50 = 0.4$. For a 90% confidence level, $\delta = 2.69289 = c\sqrt{50}$, so
c = 0.381 and $\hat R - c = 0.985 - 0.381 = 0.604$ is the 90% lower confidence bound
on R.
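The tabled constant in Example 2.1 can be cross-checked by evaluating the approximating integral directly. The sketch below assumes only $L(z) = 1 - \exp(-2z^2)$ and the example's numbers; the function names are hypothetical. The closed form coded here is what one obtains by completing the square in the exponent of the integrand, and the second function checks it against a plain midpoint rule.

```python
import math

def normal_cdf(z):
    """Standard normal cdf via the error function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def smirnov_sum_closed_form(c, m, n):
    """Closed form of the integral of L((c-u)sqrt(n)) dL(u sqrt(m)) over [0, c]."""
    total = m + n
    root = math.sqrt(total)
    tails = (n * math.exp(-2.0 * m * c * c) + m * math.exp(-2.0 * n * c * c)) / total
    weight = (2.0 * math.sqrt(2.0 * math.pi) * m * n * c
              * math.exp(-2.0 * m * n * c * c / total) / total ** 1.5)
    mass = normal_cdf(2.0 * m * c / root) - normal_cdf(-2.0 * n * c / root)
    return 1.0 - tails - weight * mass

def smirnov_sum_numeric(c, m, n, steps=100_000):
    """Midpoint-rule evaluation of the same integral."""
    h = c / steps
    acc = 0.0
    for i in range(steps):
        u = (i + 0.5) * h
        density = 4.0 * m * u * math.exp(-2.0 * m * u * u)  # d/du of L(u sqrt(m))
        acc += (1.0 - math.exp(-2.0 * n * (c - u) ** 2)) * density
    return acc * h

# Numbers from Example 2.1.
m, n = 20, 30
c = 2.69289 / math.sqrt(m + n)
coverage = smirnov_sum_closed_form(c, m, n)
coverage_check = smirnov_sum_numeric(c, m, n)
lower_bound = 591 / (m * n) - c
```

With these inputs the approximate coverage comes out at about 0.90 and the lower bound at about 0.604.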
Govindarajulu (1968) reports investigating two-sided confidence bounds based
on the inequality
$$|\hat R - R| \le \sup|F_m - F| + \sup|G_n - G| = D_m + D_n$$
that follows directly from (2.1). However, the resulting intervals were very wide
so he suggests employing the large sample normal approximation for $\hat R$.
I f m, n--* ~ ,
x/min (m, n'~ - -
, N (0, 1)
G(x) [ 1 - G(y)] dF(x) d F ( y )
The asymptotic normality follows directly from the representation as a
U-statistic. Invoking Van Dantzig's bound $\operatorname{Var}(\hat R) \le R(1 - R)/\min(m, n) \le 1/(4\min(m, n))$, one can choose c to satisfy
$$c = \frac{\Phi^{-1}(1 - \gamma)}{2\sqrt{\min(m, n)}}$$
where $\Phi^{-1}(\cdot)$ is the inverse of the standard normal cdf. Then $\hat R - c$ is the
100(1 - γ)% lower confidence bound on R. For the equal sample size situation,
Govindarajulu (1968) shows the 95% confidence bound takes $\sqrt{m + n}\,c = 1.17$
whereas $\sqrt{m + n}\,c = 2.93$ for the Birnbaum and McCarthy approach.
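As a small illustration of the normal-approximation margin, the sketch below computes $c = \Phi^{-1}(1 - \gamma)/(2\sqrt{\min(m, n)})$ with the standard library. The function name and the sample sizes are hypothetical.

```python
import math
from statistics import NormalDist

def van_dantzig_margin(m, n, gamma):
    """Margin c such that R_hat - c is an approximate 100(1 - gamma)% lower bound."""
    z = NormalDist().inv_cdf(1.0 - gamma)
    return z / (2.0 * math.sqrt(min(m, n)))

# Hypothetical equal sample sizes.
m = n = 50
c = van_dantzig_margin(m, n, gamma=0.05)
scaled = math.sqrt(m + n) * c
```

For equal sample sizes the scaled margin $\sqrt{m + n}\,c$ does not depend on n; at γ = 0.05 it evaluates to about 1.16, close to the rounded figure quoted for the normal approach and well below the 2.93 of the distribution-free bound.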
Alternatively, $\sigma^2$ in Theorem 2.1 can be estimated by replacing F by $F_m$ and
G by $G_n$.
~r2=min(m,n) {~[ f FZ(x)dG,,(x)- ( f Fm(X)dG,,(x))2]
Sen (1967) gives essentially the same result as Govindarajulu although he derives
slightly different estimates of $\sigma^2$. One of his estimates of $\sigma^2$ can be described in
terms of ranks.
Let $R_1, \ldots, R_m$ be the ranks of the $X_i$ and $S_1, \ldots, S_n$ the ranks of the $Y_j$ in
the combined sample. Set
(R i _
i)2 _ m
12 .
( +)]
sgl = 1
(Si - j ) - n S n-1 j=l
The rank estimator of
0. 2
The normality of $\hat R$ should be a reasonable approximation if $m, n \ge 50$ unless
the reliability in question is extremely high. In this latter case, a conservative
bound can be obtained based on (2.2) but using the exact distributions of
$D_m^- + D_n^+$ rather than the approximation (2.4).
The nonparametric approach has one serious drawback. In return for its distribution-free property, it is not possible to establish high reliability with even
moderate sample sizes.
3. Parametric inference procedures
Given any parametric families $\{F(x|\theta_1),\ \theta_1 \in \Theta_1\}$ for stress and
$\{G(y|\theta_2),\ \theta_2 \in \Theta_2\}$ for strength, the reliability is
$$R_{\theta_1,\theta_2} = \int F(x|\theta_1)\,dG(x|\theta_2).$$
Among the numerous choices for stress and strength distributions, only a few
special cases are amenable to exact small sample inference procedures. We first
treat the normal and then the Weibull stress-strength models before discussing the
general case.
3.1. Normal theory stress-strength models
Suppose F(.) is $N(\mu_1, \sigma_1^2)$ and G(.) is $N(\mu_2, \sigma_2^2)$. Then
$$R = P[Y > X] = P[Y - X > 0] = \Phi\left(\frac{\mu_2 - \mu_1}{\sqrt{\sigma_1^2 + \sigma_2^2}}\right)$$
where $\Phi(\cdot)$ is the standard normal cdf with pdf $\varphi(\cdot)$. Without further assumptions,
it is not possible to obtain exact confidence procedures.
3.1.1. The general normal-normal model
Let $X_1, X_2, \ldots, X_m$ be a random sample of stress values and $Y_1, Y_2, \ldots, Y_n$
be an independent random sample of strengths. Downton (1973) obtained an
expression for the uniform minimum variance unbiased (UMVU) estimate of R
by conditioning the indicator $I[X < Y]$ on the complete sufficient statistic $\bar x$, $\bar y$,
$s_1^2 = \sum (x_i - \bar x)^2/(m - 1)$, $s_2^2 = \sum (y_j - \bar y)^2/(n - 1)$. In particular
f l f d(v) (1
-- /)2)(n - 4)/2(1 _ U2)(m- 4)/2 d u d v
-1 ~/-1
(y - X) x//m
s2(n - 1) fmm
d(v) =
s l ( m - 1)--+I) sl(m 1) ~]n "
We take
i f d ( v ) < ~ - 1 , all I r i s < l ,
ifd(v)>f 1, all [v[~<l.
When the sample sizes m and n are both large, confidence bounds for R can
be set using the approximate normality of $\bar x$, $\hat\sigma_1^2 = \sum (x_i - \bar x)^2/m$, $\bar y$ and
$\hat\sigma_2^2 = \sum (y_j - \bar y)^2/n$. The maximum likelihood estimator of reliability is
$$\hat R = \Phi\left(\frac{\bar y - \bar x}{\sqrt{\hat\sigma_1^2 + \hat\sigma_2^2}}\right).$$
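A minimal sketch of the plug-in computation follows, assuming independent normal samples. The function name and the data are hypothetical; the variance estimates use divisors m and n, as maximum likelihood requires.

```python
import math
from statistics import NormalDist, fmean

def normal_reliability_mle(xs, ys):
    """Plug-in estimate Phi((ybar - xbar) / sqrt(var_x + var_y)) with ML variances."""
    xbar, ybar = fmean(xs), fmean(ys)
    var_x = sum((x - xbar) ** 2 for x in xs) / len(xs)  # divisor m, not m - 1
    var_y = sum((y - ybar) ** 2 for y in ys) / len(ys)  # divisor n, not n - 1
    return NormalDist().cdf((ybar - xbar) / math.sqrt(var_x + var_y))

# Hypothetical stress and strength samples.
stress = [1.0, 2.0, 3.0]
strength = [4.0, 5.0, 6.0]
r_hat = normal_reliability_mle(stress, strength)

# When the two samples coincide the standardized difference is zero.
r_equal = normal_reliability_mle(stress, stress)
```

The second call illustrates the sanity check that identical stress and strength samples give an estimate of one half.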
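From the expansion
$$\frac{\bar y - \bar x}{\sqrt{\hat\sigma_1^2 + \hat\sigma_2^2}} - \frac{\mu_2 - \mu_1}{\sqrt{\sigma_1^2 + \sigma_2^2}} = \frac{(\bar y - \mu_2) - (\bar x - \mu_1)}{\sqrt{\sigma_1^2 + \sigma_2^2}} - \frac{1}{2}\,\frac{\mu_2 - \mu_1}{(\sigma_1^2 + \sigma_2^2)^{3/2}}\left[(\hat\sigma_1^2 - \sigma_1^2) + (\hat\sigma_2^2 - \sigma_2^2)\right] + o_p\left(\frac{1}{\sqrt{\min(m, n)}}\right)$$
we obtain the following asymptotic result.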
and m/(m + n)~)~ ( 0 < 2 <
I f m, n ~ o o
(~__+=X 2
/ , 2 _ f ~t
1), then
"~ &a N(0, a2)
where an can be estimated by
~.2 + 8.2
[~+__+&22 ( y _ y ) 2
2(8 2 + &2z)z
--n-1 ^4)1
As a consequence of Theorem 3.1, an approximate 100(1 - a)% lower confidence bound for R is given by
',x/ a~ + a 2
where $1 - \alpha = \Phi(z_\alpha)$.
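3.1.2. Equal variances: $\sigma_1^2 = \sigma_2^2$
Suppose it can be assumed that the stress and strength distributions have equal
variances. Estimating the common $\sigma^2$ by
$$s_p^2 = \frac{\sum_{i=1}^{m} (x_i - \bar x)^2 + \sum_{j=1}^{n} (y_j - \bar y)^2}{m + n - 2}$$
leads to the non-central t-variable
$$T = \frac{\bar y - \bar x}{s_p\,(1/m + 1/n)^{1/2}}$$
whose noncentrality parameter is $\delta = (\mu_2 - \mu_1)/[\sigma(1/m + 1/n)^{1/2}]$. Since $P_\delta[T \le t]$ is
monotonically decreasing in δ, a 100(1 - α)% lower confidence bound for δ is
given by $\underline{\delta}$ where
$$P_{\underline{\delta}}[T \le t_{\mathrm{obs}}] = 1 - \alpha \qquad (3.7)$$
(see Lehmann (1959), Corollary 3, p. 80 and also pp. 223-224). Now $R = \Phi\left(\delta\sqrt{1/m + 1/n}/\sqrt{2}\right)$,
so the 100(1 - α)% lower confidence bound on R is
$$\underline{R} = \Phi\left(\frac{\underline{\delta}}{\sqrt{2}}\sqrt{\frac{1}{m} + \frac{1}{n}}\right)$$
where $\underline{\delta}$ is a solution to (3.7). Govindarajulu (1967) gives two-sided limits.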
3.1.3. Known stress distribution
Mazumdar (1970) derived minimum variance unbiased estimates of R when the
stress distribution is known and when $\sigma_2^2$ is either known or unknown. Downton
(1973) gives an alternative integral expression for the estimator. Church and
Harris (1970) suggested a closely related estimator and derived the large sample
approximate confidence interval. In the notation of Section 3.1.1, their
100(1 - α)% lower confidence bound is asymptotically equivalent to
(y-/~l) 2
( n - 1)a 2
The fact that $\mu_1$ and $\sigma_1$ are known does not seem to lead to exact inference
procedures. Mazumdar (1970) does obtain an exact, but inefficient, confidence
bound by introducing m pseudo random numbers for the first sample.
3.1.4. Some sample size considerations
Owen, Craswell and Hansen (1964) also treat paired observations and cases
where the variances and covariances are known. They then obtain some upper
bounds on the sample size required to achieve specified confidence bounds. We
present an extension of their approach.
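For independent samples, when $\sigma_1^2 = \sigma_2^2 = \sigma^2$, $R = \Phi((\mu_2 - \mu_1)/(\sqrt{2}\,\sigma))$ and $\sigma^2$
is estimated by $s_p^2$. Given a fixed precision c and reliability R, required sample
sizes can be obtained by solving
$$1 - \alpha = P[\hat R - c < R] = P\left[\frac{\bar y - \bar x}{\sqrt{2}\,s_p} < z_{(1-R-c)}\right] = P\left[T(\delta_{m,n}) < \sqrt{2}\left(\frac{1}{m} + \frac{1}{n}\right)^{-1/2} z_{(1-R-c)}\right] \qquad (3.10)$$
where $T(\delta_{m,n})$ is a non-central t-variable with m + n - 2 degrees of freedom and
non-centrality parameter
$$\delta_{m,n} = \sqrt{2}\left(\frac{1}{m} + \frac{1}{n}\right)^{-1/2} z_{(1-R)}.$$
Here $z_{(p)}$ denotes the upper p-th point of the standard normal distribution, so that $\Phi(z_{(1-R-c)}) = R + c$.
Note that the sample sizes m and n enter the non-centrality parameter, degrees
of freedom and the percentile $\sqrt{2}\,(1/m + 1/n)^{-1/2} z_{(1-R-c)}$. The values of m and n do,
however, enter (3.10) symmetrically. In an application, the solutions m, n must be
maximized over the range of R of interest. Owen, Craswell and Hansen (1964) give
a table of values for the case of equal sample sizes.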
3.2. Exponential and Weibull distributions with equal shape
When the stress and strength distributions are both Weibull, and their shape
parameters are equal,
$$R_{\theta_1,\theta_2,p} = 1 - \int_0^\infty e^{-(x/\theta_1)^p}\,\frac{p}{\theta_2}\left(\frac{x}{\theta_2}\right)^{p-1} e^{-(x/\theta_2)^p}\,dx = \frac{1}{1 + (\theta_1/\theta_2)^p}.$$
This Weibull expression includes the negative exponential distribution when p = 1,
and the Rayleigh distribution, when p = 2. Unless the common shape parameter
is known, only large sample approaches to inference are available.
When both distributions are negative exponential, some exact procedures are
available. With independent random samples, the likelihood is
$$\theta_1^{-m}\,e^{-\sum_{i=1}^{m} x_i/\theta_1}\;\theta_2^{-n}\,e^{-\sum_{j=1}^{n} y_j/\theta_2}$$
and the maximum likelihood estimator of $R = 1/(1 + \theta_1/\theta_2)$ is $\hat R = 1/(1 + \bar x/\bar y)$.
The bias is relatively small and $(m + n)(\hat R - \tilde R) = O_p(1)$ where $\tilde R$ is the UMVU
estimator. Since $(\bar X/\theta_1)/(\bar Y/\theta_2)$ is distributed $F_{2m,2n}$, a 100(1 - α)% lower confidence bound on R is given by
$$1\Big/\left[1 + \frac{\bar x}{\bar y}\,F_{2n,2m}(\alpha)\right]$$
where $F_{2n,2m}(\alpha)$ is the upper α-th point of the $F_{2n,2m}$-distribution.
Alternatively, since $(n/m)F_{2n,2m}/(1 + (n/m)F_{2n,2m})$ has a beta distribution with
parameters n and m, the lower confidence bound can also be expressed as
$$1\Big/\left[1 + \frac{m\bar x}{n\bar y}\,\frac{\eta_{1-\alpha}}{1 - \eta_{1-\alpha}}\right]$$
where $\eta_{1-\alpha}$ is the 100(1 - α)-th percentile of the beta distribution. The case of
known stress parameter, $\theta_1$, can be treated by the same methods.
Basu (1980) considers the Marshall-Olkin bivariate exponential distribution.
3.3. General parametric families
Given point estimates $\hat\theta_1$ and $\hat\theta_2$, the point estimate of R,
$$\hat R = \int F_{\hat\theta_1}(x)\,dG_{\hat\theta_2}(x),$$
can usually be evaluated by numerical methods. Notice that $\hat R$ is the MLE if
$\hat\theta_1$ and $\hat\theta_2$ are MLE's. Except for the normal and exponential cases, confidence bounds must be based on large sample theory.
Suppose that
$$\sqrt{m}\,(\hat\theta_1 - \theta_1) \to N(0, I_1^{-1}(\theta_1))$$
in distribution, independent of $\hat\theta_2$, and
$$\sqrt{n}\,(\hat\theta_2 - \theta_2) \to N(0, I_2^{-1}(\theta_2))$$
where $I_1(\theta_1)$ and $I_2(\theta_2)$ are the Fisher information matrices for the stress and strength
distributions, respectively. Then, if the derivatives are smooth
~/m + n(R ~,. ~ - ROI"02)
? +n
~ f Fo,(x)dGo~(x)
Under suitable regularity conditions including the interchange of integration and
[1 - Go2(U)]
_ ~
f(ul O,)g(xl
02) d u d x
~ O , / f ( u [ 01 )
bo = f ~ F o , ( X ) ~02/g(xl
8g(x]02)02) g(xl 02) dx.
Notice that ao, and bo2 are expressions for the covariance of score functions.
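THEOREM 3.2. If $m, n \to \infty$ and $m/(m + n) \to \lambda$ ($0 < \lambda < 1$), then
$$\sqrt{m + n}\,(R_{\hat\theta_1,\hat\theta_2} - R_{\theta_1,\theta_2}) \to N(0, \sigma_{R,\lambda}^2)$$
in distribution, where $\sigma_{R,\lambda}^2$ may be estimated as
$$\hat\sigma_{R,\lambda}^2 = \frac{1}{\lambda}\,a_{\hat\theta_1}'\,I_1^{-1}(\hat\theta_1)\,a_{\hat\theta_1} + \frac{1}{1 - \lambda}\,b_{\hat\theta_2}'\,I_2^{-1}(\hat\theta_2)\,b_{\hat\theta_2}.$$
As a consequence of Theorem 3.2 an approximate large sample 100(1 - α)%
lower confidence bound on $R_{\theta_1,\theta_2}$ is given by
$$R_{\hat\theta_1,\hat\theta_2} - z_\alpha\,\hat\sigma_{R,\lambda}/\sqrt{m + n}.$$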
3.4. Drawback of the parametric model
Only moderate sample sizes are required for estimates of $\mu_1$, $\mu_2$ and
$\sigma$ ($= \sigma_1 = \sigma_2$) in the normal model. However, estimates of reliability and the lower
confidence bound make strong use of the assumptions that the upper tail of the
stress distribution and lower tail of the strength are normal. If the sample sizes
are not large enough to produce observations in these tails, we cannot even check
this assumption.
If a small fraction of the population of units contain major defects of material
or workmanship, even a moderate sample of strengths will not show these 'rare'
causes of failure. In this situation, use of an assumed parametric form for the
stress distribution will typically lead to estimates of $P[Y > X]$ which are, incorrectly, very high.
Even without such extreme departures from the postulated models, tail areas
remain very difficult to estimate. The choice between normal, Weibull or lognormal tails can change the estimated reliability by several orders of magnitude
when R is extremely large.
4. Stress-strength models for system reliability
System models have been discussed by Bhattacharyya and Johnson (1974,
1975, 1977), McCarthy and Orringer (1975), and Chandra and Owen (1975).
Bhattacharyya and Johnson (1975) study the situation where a system, consisting
of k components, functions when at least s (1 ~< s ~< k) of the components survive
a common shock of random magnitude. This formulation includes all series,
k-out-of-k, and parallel, 1-out-of-k, systems.
EXAMPLE 4.1. A panel consisting of k identical solar cells maintains an adequate power output if at least s of the cells are active during the duration of the
mission. The external force interfering with the operation of the cells may be
extreme temperatures and the strength of a cell, in this context, may be taken as
its capacity to withstand the external temperatures.
Under an 'identical component' model, the strengths of the components are
assumed to be independent and identically distributed random variables with cdf
G(y). The stress, common to all components, is a random variable having cdf
F(x). The system reliability is then a function of F(.) and G('). In particular, the
reliability of an s-out-of-k system, $R_{s,k}$, is given by
$$R_{s,k} = \sum_{j=s}^{k} \binom{k}{j} \int [1 - G(x)]^j\,G^{k-j}(x)\,dF(x) = 1 - \int \mathcal{B}[G(x)]\,dF(x) \qquad (4.1)$$
where $\mathcal{B}(\cdot)$ is the cdf of the beta distribution having density $\propto u^{k-s}(1 - u)^{s-1}$.
4.1. Nonparametric estimation of system reliability
Let $X_1, \ldots, X_m$ be a random sample from F(.) and $Y_1, \ldots, Y_n$ be a random
sample from G(.) where F(.) and G(.) are assumed to be continuous. Replacing
F(.) and G(.), in (4.1), by the empirical cdf's $F_m(\cdot)$ and $G_n(\cdot)$, gives rise to the
intuitive estimator
$$\hat R^*_{s,k} = \int F_m(x)\,d\mathcal{B}[G_n(x)] = \frac{1}{m}\sum_{i=1}^{n}\left[\mathcal{B}\left(\frac{i}{n}\right) - \mathcal{B}\left(\frac{i - 1}{n}\right)\right](s_{(i)} - i) \qquad (4.2)$$
where $s_{(1)} \le s_{(2)} \le \cdots \le s_{(n)}$ are the ordered ranks of the Y's in the combined
sample. Bhattacharyya and Johnson (1975) also derive the UMVU estimator as
a generalized U-statistic based on the kernel
$$h(x_1; y_1, \ldots, y_k) = \begin{cases} 1 & \text{if } s \text{ or more of } y_1, \ldots, y_k \text{ exceed } x_1, \\ 0 & \text{otherwise.} \end{cases} \qquad (4.3)$$
After some simplification, the UMVU estimator $\hat R_{s,k}$ can be expressed as
Note that $\hat R_{s,k}$ is similar in form to $\hat R^*_{s,k}$ but that it has the feature of a trimmed
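Bhattacharyya and Johnson (1977) establish the following large sample result.
THEOREM 4.1. Let $m, n \to \infty$ with $m/(m + n) \to \lambda$. Then
$$(m + n)(\hat R_{s,k} - \hat R^*_{s,k}) = O_p(1),$$
$$\sqrt{m + n}\,(\hat R_{s,k} - R_{s,k}) \to N\left(0, \frac{\sigma_1^2}{\lambda} + \frac{\sigma_2^2}{1 - \lambda}\right)$$
in distribution, where
$$\sigma_1^2 = \operatorname{Var}_F[\mathcal{B}(G(X))] = \int \mathcal{B}^2(G)\,dF - \left[\int \mathcal{B}(G)\,dF\right]^2,$$
$$\sigma_2^2 = \iint b[G(x)]\,b[G(y)]\,\{G(\min(x, y)) - G(x)G(y)\}\,dF(x)\,dF(y)$$
and b(u) is the pdf associated with $\mathcal{B}$.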
From Theorem 4.1 we conclude that a large sample 100(1 - α)% lower confidence bound for $R_{s,k}$ is given by
where $\hat\sigma_1^2$, $\hat\sigma_2^2$ are obtained by replacing F and G by $F_m$ and $G_n$ in the
expressions for $\sigma_1^2$, $\sigma_2^2$. Clearly $\hat R^*_{s,k}$ could replace $\hat R_{s,k}$ in the confidence
bound (4.6).
When the stress distribution F is known, the intuitive estimator has the form
$$\hat R^*_{s,k}(F) = \sum_{i=1}^{n}\left[\mathcal{B}\left(\frac{i}{n}\right) - \mathcal{B}\left(\frac{i - 1}{n}\right)\right]F(Y_{(i)})$$
and the UMVU estimator is
t~* k(F)
Bhattacharyya and Johnson (1977) also establish
x/n(/~,~,(F) - R) ~ ,
N(0, a~),
n ( R * k ( r ) -/~s,k(r)) = O(1),
so confidence bounds similar to (4.6) are immediate. When F(.) is known, the
100(1 - e)% confidence bound on R~I, is
4.2. Exponential distributions for stress and strength
When $F(x) = 1 - e^{-x/\theta_1}$ and $G(x) = 1 - e^{-x/\theta_2}$,
$$R_{s,k} = 1 - \frac{k!}{(s - 1)!}\prod_{j=s}^{k}\frac{1}{j + \theta_2/\theta_1} = 1 - \frac{1}{B(s, k - s + 1)}\sum_{j=0}^{k-s}(-1)^j\binom{k - s}{j}\frac{1}{s + j + \theta_2/\theta_1}$$
where the last expression is obtained by expanding the product into partial
fractions. Here $B(s, k - s + 1)$ is the beta function. We note that $\left(\sum_{i=1}^{m} X_i, \sum_{j=1}^{n} Y_j\right)$
is a complete sufficient statistic and $(s + j)^{-1}u[(s + j)X_1 - Y_1]$,
$u(x) = 1\,(0)$ if $x > (\le)\,0$, is an unbiased estimator of $(s + j + \theta_2/\theta_1)^{-1}$. The
Rao-Blackwell method leads to the UMVU estimator but its form is complicated
and depends on the hypergeometric function of the second kind. The maximum
likelihood estimator, $\hat R_{s,k}$, has the considerably simpler form
$$\hat R_{s,k} = 1 - \frac{k!}{(s - 1)!}\prod_{j=0}^{k-s}\frac{1}{j + s + \bar Y/\bar X}. \qquad (4.11)$$
Asymptotically, $\hat R_{s,k}$ is normally distributed.
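The exponential s-out-of-k reliability can be checked numerically. The sketch below is an illustration under stated assumptions, not the authors' computation: it takes the product form $1 - \frac{k!}{(s-1)!}\prod_{j=s}^{k}(j + \theta_2/\theta_1)^{-1}$, the partial-fraction form with the beta function $B(s, k - s + 1)$, and a direct midpoint-rule evaluation of the binomial-sum integral, and compares all three. Function names and parameter values are hypothetical.

```python
import math

def rsk_product(s, k, rho):
    """Product form, with rho = theta2 / theta1 (mean strength over mean stress)."""
    prod = 1.0
    for j in range(s, k + 1):
        prod /= (j + rho)
    return 1.0 - math.factorial(k) / math.factorial(s - 1) * prod

def rsk_partial_fractions(s, k, rho):
    """Partial-fraction form using the beta function B(s, k - s + 1)."""
    beta = math.gamma(s) * math.gamma(k - s + 1) / math.gamma(k + 1)
    total = sum((-1) ** j * math.comb(k - s, j) / (s + j + rho)
                for j in range(k - s + 1))
    return 1.0 - total / beta

def rsk_numeric(s, k, theta1, theta2, upper=60.0, steps=60_000):
    """Midpoint rule for sum_j C(k,j) * integral of (1-G)^j G^(k-j) dF."""
    h = upper / steps
    acc = 0.0
    for i in range(steps):
        x = (i + 0.5) * h
        p = math.exp(-x / theta2)  # P[a component's strength exceeds x]
        survive = sum(math.comb(k, j) * p ** j * (1.0 - p) ** (k - j)
                      for j in range(s, k + 1))
        acc += survive * math.exp(-x / theta1) / theta1
    return acc * h

# Hypothetical 2-out-of-3 system with mean stress 1 and mean strength 3.
r_prod = rsk_product(2, 3, 3.0)
r_frac = rsk_partial_fractions(2, 3, 3.0)
r_num = rsk_numeric(2, 3, theta1=1.0, theta2=3.0)
r_single = rsk_product(1, 1, 3.0)
```

For the 2-out-of-3 case all three routes agree at 0.8, and with s = k = 1 the product form reduces to $1/(1 + \theta_1/\theta_2)$, the single-component exponential reliability of Section 3.2.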
Let $m, n \to \infty$ and $m/(m + n) \to \lambda$, $0 < \lambda < 1$. Then
$$\sqrt{m + n}\,(\hat R_{s,k} - R_{s,k}) \to N(0, \sigma_R^2)$$
in distribution.
As a consequence of Theorem 4.2, lower confidence bounds are obtained using
$\hat R_{s,k}$ to estimate R and $\bar Y/\bar X$ to estimate $\theta_2/\theta_1$ in the expression for $\sigma_R^2$.
The asymptotic relative efficiency of the nonparametric estimator (4.2) or (4.4),
versus the exponential maximum likelihood estimator (4.11), is given by
2 o22
(1 - 2)a 2 + 2aft
Bhattacharyya and Johnson (1975) tabulate values of e.
4.3. Further generalizations of the s-out-of-k
The foregoing results are concerned with the reliability of an s out of k system
where the underlying assumptions are that the component strengths $Y_1, \ldots, Y_k$
are iid random variables and all the components are subjected to a common
random stress X which is independent of the Y's. We outline here some extensions of the model for representing the reliability structure of more complex
(a) Non-identical component strength distributions. When the components of a
system are of different structure, the assumption of identical strength distributions
may not be realistic. This is often the case with systems having standby components. Suppose that out of the k components, $k_1$ are of one category and their
strengths can be reasonably assumed to have a common distribution $G_1$. The
remaining $k_2 = k - k_1$ components are of a different category and their common
strength distribution is denoted by $G_2$. All the k components are exposed to a
common stress $X$ having the distribution $F$, and the system operates successfully
if at least s of the k components withstand the stress. This corresponds to the
same structure function (4.3). Here, however, $Y_1, \ldots, Y_{k_1}$ are iid $G_1$,
$Y_{k_1+1}, \ldots, Y_k$ are iid $G_2$ and $X$ is distributed as $F$. The system reliability is a
functional of the triplet $(F, G_1, G_2)$ and it can be formally expressed as
$$R = \sum \binom{k_1}{j_1}\binom{k_2}{j_2} \int (1-G_1)^{j_1}\, G_1^{k_1-j_1}\, (1-G_2)^{j_2}\, G_2^{k_2-j_2}\, dF \qquad (4.15)$$
where the sum extends over $0 \le j_1 \le k_1$, $0 \le j_2 \le k_2$ such that $s \le j_1 + j_2 \le k$.
When $F$, $G_1$ and $G_2$ are exponential with the scale parameters $\theta$, $\beta_1$ and $\beta_2$, the
integral in (4.15) can be simplified to a linear function of terms of the form
$[a_1\beta_1 + a_2\beta_2 + \theta]^{-1}$ where the known constants $a_1$ and $a_2$ vary from term to
term. With independent random samples $\{X_1, \ldots, X_m\}$, $\{Y_{11}, \ldots, Y_{1n_1}\}$ and
$\{Y_{21}, \ldots, Y_{2n_2}\}$ from $F$, $G_1$ and $G_2$ respectively, one can easily obtain the maximum likelihood estimator of $R$. The UMVU estimator can also be worked out
along the lines of Section 4.2.
Nonparametric estimators of $R$ can be constructed by either of two procedures. For instance, a nonparametric estimator $\hat R^{*}$ is obtained by replacing $F$, $G_1$
and $G_2$ in (4.15) by the empirical cdfs. Alternatively, defining the kernel function
$$h(X_1; Y_{11}, \ldots, Y_{1k_1}; Y_{21}, \ldots, Y_{2k_2}) = \begin{cases} 1 & \text{if at least } s \text{ of the } (k_1 + k_2)\ Y\text{'s exceed } X_1, \\ 0 & \text{otherwise,} \end{cases}$$
and averaging $h$ over all $m\binom{n_1}{k_1}\binom{n_2}{k_2}$ choices of the ordered subscripts, one obtains
the UMVU estimator of $R$.
EXAMPLE 4.2. Consider a system with $k = 2$ and $s = 1$ where the two components have strength distributions $G_1$ and $G_2$ and are subjected to common
stress with distribution $F$.
From (4.15) with $k_1 = k_2 = 1$, we obtain
$$R = \int (1-G_1)G_2\, dF + \int G_1(1-G_2)\, dF + \int (1-G_1)(1-G_2)\, dF = 1 - \int G_1 G_2\, dF.$$
The nonparametric UMVU estimator, based on random samples $\{X_1, \ldots, X_m\}$,
$\{Y_{11}, \ldots, Y_{1n_1}\}$ and $\{Y_{21}, \ldots, Y_{2n_2}\}$ from $F$, $G_1$ and $G_2$ respectively, is given by
$$R_{NP} = (T_1 + T_2 + T_3)/(m n_1 n_2)$$
where $T_1$, $T_2$ and $T_3$ are the numbers of triplets $\{X_i, Y_{1j_1}, Y_{2j_2}\}$ satisfying
$(Y_{1j_1} < X_i < Y_{2j_2})$, $(Y_{2j_2} < X_i < Y_{1j_1})$ and $(X_i < Y_{1j_1},\ X_i < Y_{2j_2})$, respectively. The
estimator based on the empirical cdf's is given by
$$\hat R^{*} = 1 - \int G_{1n_1} G_{2n_2}\, dF_m = 1 - [m n_1 n_2]^{-1} \sum_{i=1}^{m} (Q_i - i)(Q_i' - i)$$
where $Q_i$ is the rank of the i-th order statistic $X_{(i)}$ within the combined $X$ and $Y_1$
samples, and $Q_i'$ is the rank of $X_{(i)}$ within the combined $X$ and $Y_2$ samples.
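As a rough numerical illustration of Example 4.2, the following Python sketch evaluates both nonparametric estimators for k = 2, s = 1 by direct counting. The function names and the sample values are hypothetical. The empirical-cdf estimator is computed here from counts of strengths not exceeding each stress value, which is assumed to agree with the rank form when the data contain no ties.

```python
def r_np(x, y1, y2):
    """Triplet-counting estimator (T1 + T2 + T3) / (m * n1 * n2)."""
    t1 = t2 = t3 = 0
    for xi in x:
        for a in y1:
            for b in y2:
                if a < xi < b:
                    t1 += 1          # only the second component survives
                elif b < xi < a:
                    t2 += 1          # only the first component survives
                elif xi < a and xi < b:
                    t3 += 1          # both components survive
    return (t1 + t2 + t3) / (len(x) * len(y1) * len(y2))


def r_star(x, y1, y2):
    """Empirical-cdf estimator 1 - integral of G1n1 * G2n2 dFm."""
    total = 0
    for xi in x:
        below1 = sum(a <= xi for a in y1)   # n1 * G1n1(xi)
        below2 = sum(b <= xi for b in y2)   # n2 * G2n2(xi)
        total += below1 * below2
    return 1.0 - total / (len(x) * len(y1) * len(y2))


# Hypothetical tie-free samples of stress and of the two strengths.
stress = [1.0, 2.5, 4.0]
strength1 = [0.5, 3.0]
strength2 = [2.0, 5.0, 6.0]

est_np = r_np(stress, strength1, strength2)
est_star = r_star(stress, strength1, strength2)
print(est_np, est_star)
```

With tie-free data and $k_1 = k_2 = 1$ the two values coincide, because each equals one minus the fraction of triplets in which both strengths fall below the stress.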
(b) Subsystems with independent stresses. In a more complex situation a system
may consist of a number of independent subsystems performing different
functions. Within each subsystem, the components have independent and identically distributed strengths and are subjected to a common stress so that each
subsystem has the structure of an s out of k stress-strength model. The strength
and stress distributions as well as s and k may vary among the subsystems. The
following diagram illustrates such a system where the two subsystems A and B
are serially connected.
[Figure: subsystem A (2 out of 3) connected in series with subsystem B (1 out of 2).]
Fig. 1. Serially connected subsystems with independent stresses.
The subsystem A functions when at least two of the three components survive
the stress $X$. The component strengths are iid with distribution $G_1$ and the
common stress $X$ has distribution $F_1$. Similarly, the subsystem B has the structure
of a 1 out of 2 stress-strength model where the strength and stress distributions
are $G_2$ and $F_2$ respectively. The system reliability $R$ is given by
$$R = R^{A}_{2,3}\, R^{B}_{1,2}$$
where the factors on the right-hand side are the stress-strength reliability functions for the
subsystems and they have the same forms as given in (4.1). Using the methods
of Section 4.1, one can obtain the UMVU estimator for each of $R^{A}_{2,3}$ and $R^{B}_{1,2}$
and, due to independence, their product will give the nonparametric UMVU
estimator $\hat R$ of $R$. The limiting normal distribution of $\hat R$ and the form of the
asymptotic variance can then be obtained from the subsystem results.
(c) Binomial data on components. Often, components are tested under the random
stress conditions that prevail, and only the number of survivors is recorded
rather than the measurements of stresses and strengths. In the context of a single
component stress-strength model where our objective is to estimate the probability
$R_1 = P[Y > X] = \int (1 - G)\, dF$, the present sampling process yields the count $Z_n$
which is the number of pairs $(X_i, Y_i)$, $i = 1, \ldots, n$, such that $Y_i > X_i$. The numerical measurements of $Y_i$ and $X_i$ are not recorded. The problem then reduces to
estimating a binomial probability from the number of successes in n trials. More
generally, consider a system consisting of c subsystems where each subsystem has
the structure of a single component stress-strength model. The system reliability
is then a function
$$R = g(p_1, p_2, \ldots, p_c)$$
where $p_i = \int (1 - G_i)\, dF_i$, $G_i$ and $F_i$ are the strength and stress distributions for the
i-th subsystem, and the functional form of $g$ is determined by the manner in which
the system is structured. Methods of estimating the system reliability from
binomial count data have been developed by Myhre and Saunders (1968),
Madansky (1965), Easterling (1972), and many others. The stress-strength formulation of the model loses its distinctive features when only the count data are
recorded and the subsystems have single components.
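To make the role of $g$ concrete, here is a minimal Python sketch of a plug-in point estimate from survivor counts. It assumes two illustrative choices of $g$, a series arrangement and a parallel arrangement of independent single-component subsystems; the function names and the counts are hypothetical, and no confidence limits of the kind developed in the cited papers are attempted.

```python
from math import prod


def survival_estimate(survivors, trials):
    """Binomial proportion Z_n / n for one subsystem."""
    return survivors / trials


def series_reliability(p):
    """g for a series system: every subsystem must survive."""
    return prod(p)


def parallel_reliability(p):
    """g for a parallel system: at least one subsystem must survive."""
    return 1.0 - prod(1.0 - pi for pi in p)


# Hypothetical survivor counts for c = 2 subsystems.
p_hat = [survival_estimate(45, 50), survival_estimate(18, 20)]
series = series_reliability(p_hat)
parallel = parallel_reliability(p_hat)
print(series, parallel)
```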
For a k ($\ge 2$) component stress-strength system where all the components are
exposed to a common stress $X$ in their natural operating environment, some care
is needed in using binomial count data of the component failures for estimating
the system reliability. Intuitively, one might interpret the reliability of an s out of
k system as the probability of obtaining s or more successes in k Bernoulli trials
and proceed to estimate this binomial probability from the count data. In this
process, one would be estimating the functional
$$\theta(F, G) = \sum_{j=s}^{k} \binom{k}{j} R_1^{\,j} (1 - R_1)^{k-j}$$
where $R_1 = \int (1 - G)\, dF$. This is not the same as the system reliability for an s
out of k system which is given by
$$R_{s,k} = \sum_{j=s}^{k} \binom{k}{j} \int (1-G)^{j}\, G^{k-j}\, dF.$$
Notice, in particular, when $s = k$ and $k \ge 2$,
$$\theta(F, G) = \left[\int (1 - G)\, dF\right]^{k} \le \int (1 - G)^{k}\, dF = R_{k,k}.$$
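The gap between the binomial functional and the common-stress reliability can be checked numerically. The sketch below is only an illustration under assumed conditions: exponential stress with mean 1, exponential strengths with mean 2, and s = k = 2. The function names, the parameter values and the Monte Carlo settings are all hypothetical.

```python
import random
from math import comb


def binomial_functional(r1, s, k):
    """P[at least s successes in k independent Bernoulli(r1) trials]."""
    return sum(comb(k, j) * r1**j * (1 - r1) ** (k - j) for j in range(s, k + 1))


def simulate_common_stress(s, k, stress_mean, strength_mean,
                           trials=200_000, seed=1):
    """Monte Carlo estimate of R_{s,k}: one stress shared by k strengths."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(trials):
        x = rng.expovariate(1.0 / stress_mean)
        survivors = sum(rng.expovariate(1.0 / strength_mean) > x
                        for _ in range(k))
        hits += survivors >= s
    return hits / trials


# For these exponential means, P[Y > X] = 2 / (1 + 2).
r1 = 2.0 / 3.0
theta = binomial_functional(r1, s=2, k=2)             # equals r1 squared
r_22 = simulate_common_stress(2, 2, stress_mean=1.0, strength_mean=2.0)
print(theta, r_22)
```

In this setting the minimum of the two strengths is exponential with mean 1, so the simulated value should be near 0.5, which exceeds the binomial value $R_1^2 = 4/9$.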
Bhattacharyya (1977) explores estimation procedures in this context. He considers
data in the form of failure counts when m components are subjected to a common
stress, and this experiment is repeated n times. Efficiencies are also calculated
relative to the exponential model.
5. Extensions of the basic stress-strength model
Two recent developments merit further attention.
5.1. Stochastic process formulation
A more sophisticated stress-strength model allows the stress, $X(t)$, and strength
$Y(t)$ to vary over time. Specifically, let $\{X(t): t > 0\}$ and $\{Y(t): t > 0\}$ be independent stochastic processes. Consonant with our initial formulation of the
stress-strength model in Section 1, we would define reliability for the period
$(0, t_0]$ as
$$R_1(t_0) = P\Big[\inf_{t \le t_0} Y(t) > \sup_{t \le t_0} X(t)\Big]. \qquad (5.1)$$
Alternative definitions are also plausible. We could require only that current
strength exceed the maximum stress thus far encountered:
$$R_2(t_0) = P\Big[\,Y(t) > \sup_{s \le t} X(s), \text{ all } t \le t_0\Big]. \qquad (5.2)$$
Even less stringent, the requirement could be that current strength exceeds current
stress:
$$R_3(t_0) = P[\,Y(t) > X(t), \text{ all } t \le t_0\,]. \qquad (5.3)$$
Using definition (5.3), Basu and Ebrahimi (1983) consider the case where $X(t)$
and $Y(t)$ are Brownian motion processes with means $\mu_1$, $\mu_2$ and covariances
$\sigma_1^2 \min(s, t)$, $\sigma_2^2 \min(s, t)$. They show that
$$R_3(t_0) = \Phi\!\left(\frac{\mu_2 - \mu_1}{\sqrt{(\sigma_1^2 + \sigma_2^2)\, t_0}}\right) \qquad (5.4)$$
which is of the same form as the normal theory model in Section 3.1. Expression
(5.4) would not be expected to apply for large $t_0$ since $R_3(t_0) \ge 0.5$ for all $t_0$ when
$\mu_2 > \mu_1$.
5.2. Stress-strength models with covariates
Strength can usually only be determined by testing a unit to destruction.
However, it is often possible to measure covariates of strength without damaging
the unit.
EXAMPLE 5.1. A 2 × 4 to be used in the frame of a house has bending strength
$Y$ which can be observed only by destructive testing. Yet stiffness (the modulus
of elasticity), which can be used to predict strength, is easily measured by a
non-destructive test. Data for some species suggest that strength is related to
stiffness $z$ according to the linear relation
$$Y = \alpha_2 + \beta_2 z + e_2$$
where $e_2$ is distributed $N(0, \sigma_2^2)$. For a specimen whose stiffness is $z$, the conditional reliability becomes
R(X) = PIE> X[z] =
r'z- ~]A1]
EXAMPLE 5.2. Refer to Example 1.2 where the purpose is to compare remission
time $X$ using Drug A with remission time $Y$ using Drug B. Suppose that the age
$z$ of the subject influences the remission time. We postulate the linear regression
$$X = \alpha_1 + \beta_1 z + e_1, \qquad Y = \alpha_2 + \beta_2 z + e_2$$
where $e_1$ is distributed $N(0, \sigma_1^2)$ independent of $e_2$ which is distributed $N(0, \sigma_2^2)$.
For a new subject of given age $z$, we should provide information about
$$R(z) = P[Y > X \mid z] = \Phi\!\left(\frac{\alpha_2 - \alpha_1 + (\beta_2 - \beta_1) z}{\sqrt{\sigma_1^2 + \sigma_2^2}}\right).$$
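The conditional reliability in Example 5.2 follows because, given $z$, the difference $Y - X$ is normal with mean $\alpha_2 - \alpha_1 + (\beta_2 - \beta_1)z$ and variance $\sigma_1^2 + \sigma_2^2$. A minimal Python sketch of that calculation is given below; the function names and all parameter values are hypothetical and are treated as known rather than estimated.

```python
from math import erf, sqrt


def normal_cdf(u):
    """Standard normal distribution function."""
    return 0.5 * (1.0 + erf(u / sqrt(2.0)))


def conditional_reliability(z, alpha1, beta1, sigma1, alpha2, beta2, sigma2):
    """P[Y > X | z] for two normal linear regressions on a common covariate z."""
    mean_diff = (alpha2 - alpha1) + (beta2 - beta1) * z
    return normal_cdf(mean_diff / sqrt(sigma1**2 + sigma2**2))


# Hypothetical regression parameters for the two drugs, evaluated at age z = 40.
r_equal = conditional_reliability(40.0, 10.0, 0.1, 3.0, 10.0, 0.1, 4.0)
r_shift = conditional_reliability(40.0, 10.0, 0.1, 3.0, 15.0, 0.1, 4.0)
print(r_equal, r_shift)
```

When the two regression lines coincide the value is 0.5, and shifting the intercept for $Y$ upward by one standard deviation of $Y - X$ gives $\Phi(1)$.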
The models in Example 5.1 and Example 5.2 were introduced by Bhattacharyya
and Johnson (1981). Initially, we consider the more general model where X and
Y may depend on possibly different covariates. Set
$$z_1 = [z_{11}, z_{12}, \ldots, z_{1q_1}]', \qquad z_2 = [z_{21}, z_{22}, \ldots, z_{2q_2}]'$$
and assume
$$X \mid z_1 \sim N(\alpha_1 + \beta_1' z_1,\ \sigma_1^2)$$
independent of
$$Y \mid z_2 \sim N(\alpha_2 + \beta_2' z_2,\ \sigma_2^2).$$
We are then interested in making inferences concerning the reliability
$$R(z_1, z_2) = P[Y > X \mid z_1, z_2] = \Phi\!\left(\frac{\alpha_2 - \alpha_1 + \beta_2' z_2 - \beta_1' z_1}{\sqrt{\sigma_1^2 + \sigma_2^2}}\right).$$
Some exact inference procedures are available when the variances are equal.
Set $\sigma_1^2 = \sigma_2^2 = \sigma^2$ so
$$R(z_1, z_2) = \Phi\!\left(\frac{\alpha_2 - \alpha_1 + \beta_2' z_2 - \beta_1' z_1}{\sqrt{2}\,\sigma}\right).$$
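We have available data of the form
$$(X_1, z_{11}),\ (X_2, z_{12}),\ \ldots,\ (X_m, z_{1m}), \qquad (Y_1, z_{21}),\ (Y_2, z_{22}),\ \ldots,\ (Y_n, z_{2n}).$$
Given the covariate values $z_{10}$, $z_{20}$ we note that
$$\bar Y + \hat\beta_2'(z_{20} - \bar z_2) - \bar X - \hat\beta_1'(z_{10} - \bar z_1)$$
is normally distributed with mean $\alpha_2 + \beta_2' z_{20} - \alpha_1 - \beta_1' z_{10}$ and standard deviation
$c_0\sigma$ where
$$c_0^2 = \frac{1}{m} + \frac{1}{n} + (z_{10} - \bar z_1)' \Big[\sum_{j=1}^{m} (z_{1j} - \bar z_1)(z_{1j} - \bar z_1)'\Big]^{-1} (z_{10} - \bar z_1) + (z_{20} - \bar z_2)' \Big[\sum_{j=1}^{n} (z_{2j} - \bar z_2)(z_{2j} - \bar z_2)'\Big]^{-1} (z_{20} - \bar z_2)$$
and $\hat\beta_1$, $\hat\beta_2$ are the least squares estimators. Also
$$(m + n - q_1 - q_2 - 2)\, s^2 = \sum_{j=1}^{m} \big(x_j - \bar x - \hat\beta_1'(z_{1j} - \bar z_1)\big)^2 + \sum_{j=1}^{n} \big(y_j - \bar y - \hat\beta_2'(z_{2j} - \bar z_2)\big)^2$$
is independently distributed as $\sigma^2 \chi^2_{m+n-2-q_1-q_2}$. Consequently,
$$T = \frac{\bar Y + \hat\beta_2'(z_{20} - \bar z_2) - \bar X - \hat\beta_1'(z_{10} - \bar z_1)}{c_0\, s}$$
has a non-central t-distribution with $m + n - 2 - q_1 - q_2$ degrees of freedom and
noncentrality parameter
$$\eta = \frac{\alpha_2 + \beta_2' z_{20} - \alpha_1 - \beta_1' z_{10}}{c_0\, \sigma}.$$
A lower 95% confidence bound, $\underline{\eta}$, is obtained by solving $F_\eta(T_{obs}) = 0.95$ for
$\eta$. Consequently, a 95% lower confidence bound for $R(z_{10}, z_{20})$ is given by
$$\underline{R}(z_{10}, z_{20}) = \Phi(c_0\, \underline{\eta}/\sqrt{2}).$$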
Guttman, Johnson, Bhattacharyya and Reiser (1988) discuss the unequal variance case.
6. Bayesian inference procedures
Given the random sample $X_1, \ldots, X_m$ from $f(\cdot \mid \theta_1)$ and an independent random sample $Y_1, \ldots, Y_n$ from $g(\cdot \mid \theta_2)$, together with a prior density $p(\theta_1, \theta_2)$, in principle one
can obtain the posterior distribution
$$h(\theta_1, \theta_2 \mid x_1, \ldots, x_m, y_1, \ldots, y_n) \propto p(\theta_1, \theta_2) \prod_{i=1}^{m} f(x_i \mid \theta_1) \prod_{j=1}^{n} g(y_j \mid \theta_2)$$
for $(\theta_1, \theta_2)$. This distribution could then be transformed to the posterior distribution of $R_{\theta_1,\theta_2} = \int F(y \mid \theta_1)\, dG(y \mid \theta_2)$. Enis and Geisser (1971) obtained analytical
results for negative exponential distributions and for normal distributions with
equal variances.
6.1. Bayesian analysis with exponential distributions
Enis and Geisser (1971) assume that the negative exponential scale parameters
$\theta_1$ and $\theta_2$ are independent, a priori. In particular they make the choice of conjugate prior distributions
$$p_1(\theta_1) \propto \theta_1^{-s_1-1} e^{-c_1/\theta_1}, \qquad p_2(\theta_2) \propto \theta_2^{-s_2-1} e^{-c_2/\theta_2}.$$
Combining these with the likelihood (3.12) of the samples of sizes m and n, we obtain the
joint posterior density
$$h(\theta_1, \theta_2 \mid \bar x, \bar y) \propto \theta_1^{-(m+s_1+1)}\, e^{-(c_1 + m\bar x)/\theta_1}\ \theta_2^{-(n+s_2+1)}\, e^{-(c_2 + n\bar y)/\theta_2}.$$
Transforming to $r = \theta_2/(\theta_1 + \theta_2)$ and $v = \theta_1^{-1} + \theta_2^{-1}$ and then integrating out $v$
produces the marginal posterior distribution of $R$,
$$h(r) \propto r^{m+s_1-1} (1 - r)^{n+s_2-1} (1 - cr)^{-(m+n+s_1+s_2)}, \qquad c = \frac{c_2 - c_1 + n\bar y - m\bar x}{c_2 + n\bar y}.$$
The transformed variable $p = (1 - r)/(1 - cr)$ has a beta distribution with parameters $n + s_2$ and $m + s_1$ so
$$P[R \ge \underline{r} \mid \bar x, \bar y] = \frac{1}{B(n + s_2,\ m + s_1)} \int_0^{(1 - \underline{r})/(1 - c\,\underline{r})} u^{n+s_2-1} (1 - u)^{m+s_1-1}\, du.$$
A $100(1 - \alpha)\%$ Bayesian lower bound on $R$ is given by
$$\underline{r} = \frac{1 - q_{1-\alpha}}{1 - c\, q_{1-\alpha}} \qquad (6.7)$$
where $q_{1-\alpha}$ is the $100(1 - \alpha)$-th percentile of the beta distribution with parameters
$n + s_2$ and $m + s_1$. Comparing (6.7) with the alternative form for the bound below
(3.13), we see that the choice of 'informationless' priors, $s_1 = s_2 = 0$ and
$c_1 = c_2 = 0$, leads to the same bound as the classical procedure.
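The beta representation of the posterior can be checked by simulation. The sketch below assumes the conjugate setup just described, in which each reciprocal scale $1/\theta_i$ has a gamma posterior, and compares a lower bound read directly from simulated values of $r = \theta_2/(\theta_1 + \theta_2)$ with the bound obtained through $p = (1 - r)/(1 - cr)$. It is a Monte Carlo illustration only: the function names, sample summaries and number of draws are hypothetical, and an exact beta percentile routine would normally replace the simulated percentile.

```python
import random


def lower_bound_by_sampling(m, xbar, n, ybar, s1=0.0, c1=0.0, s2=0.0, c2=0.0,
                            alpha=0.05, draws=100_000, seed=0):
    """alpha-quantile of r = theta2 / (theta1 + theta2) from posterior draws."""
    rng = random.Random(seed)
    shape1, rate1 = m + s1, c1 + m * xbar
    shape2, rate2 = n + s2, c2 + n * ybar
    r = []
    for _ in range(draws):
        lam1 = rng.gammavariate(shape1, 1.0 / rate1)   # draw of 1 / theta1
        lam2 = rng.gammavariate(shape2, 1.0 / rate2)   # draw of 1 / theta2
        r.append(lam1 / (lam1 + lam2))
    r.sort()
    return r[int(alpha * draws)]


def lower_bound_by_beta(m, xbar, n, ybar, s1=0.0, c1=0.0, s2=0.0, c2=0.0,
                        alpha=0.05, draws=100_000, seed=1):
    """Bound (1 - q) / (1 - c q), with q a simulated beta percentile."""
    rng = random.Random(seed)
    c = (c2 - c1 + n * ybar - m * xbar) / (c2 + n * ybar)
    p = sorted(rng.betavariate(n + s2, m + s1) for _ in range(draws))
    q = p[int((1.0 - alpha) * draws)]
    return (1.0 - q) / (1.0 - c * q)


# Hypothetical data summaries with 'informationless' priors.
bound_direct = lower_bound_by_sampling(m=10, xbar=1.0, n=12, ybar=3.0)
bound_beta = lower_bound_by_beta(m=10, xbar=1.0, n=12, ybar=3.0)
print(bound_direct, bound_beta)
```

The two numbers should agree up to simulation error, since $r$ is a decreasing function of $p$ and the lower tail of $r$ therefore corresponds to the upper tail of the beta variable.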
6.2. Bayesian analysis with normal distributions
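For the case of independent samples, Enis and Geisser (1971) restrict their
treatment to normal populations with equal variances. They employ the conjugate prior
$$p(\mu_1, \mu_2, \sigma) \propto \sigma^{-(b+3)} \exp\left\{-\frac{1}{2\sigma^2}\left[b c_0 + c_1(\mu_1 - m_1)^2 + c_2(\mu_2 - m_2)^2\right]\right\}$$
where $b, c_0, c_1, c_2 > 0$. The likelihood is
$$\frac{1}{(2\pi)^{(m+n)/2}\, \sigma^{m+n}} \exp\left\{-\frac{1}{2\sigma^2}\left[(m + n - 2)s_p^2 + m(\mu_1 - \bar x)^2 + n(\mu_2 - \bar y)^2\right]\right\}.$$
Since the reliability $R = \Phi(\delta)$ where $\delta = (\mu_2 - \mu_1)/(\sqrt{2}\,\sigma)$, it is convenient to
determine the joint posterior distribution of $\delta$ and $(m + n - 2)s_p^2/\sigma^2$. The joint
posterior density for $(\mu_1, \mu_2, \sigma)$ can then be written as
$$h(\mu_1, \mu_2, \sigma) \propto \sigma^{-(b+1+m+n)} \exp\left\{-\frac{1}{2\sigma^2}\left[b c_0 + (m + n - 2)s_p^2 + \frac{m c_1(\bar x - m_1)^2}{m + c_1} + \frac{n c_2(\bar y - m_2)^2}{n + c_2}\right]\right\}$$
$$\times\ \sigma^{-1} \exp\left\{-\frac{c_1 + m}{2\sigma^2}\left(\mu_1 - \frac{m\bar x + c_1 m_1}{c_1 + m}\right)^2\right\}\ \sigma^{-1} \exp\left\{-\frac{c_2 + n}{2\sigma^2}\left(\mu_2 - \frac{n\bar y + c_2 m_2}{c_2 + n}\right)^2\right\}. \qquad (6.10)$$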
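From (6.10) it is readily seen that, a posteriori, given $\sigma$,
$$\delta \sim N\!\left(\frac{1}{\sqrt{2}\,\sigma}\left[\frac{n\bar y + c_2 m_2}{n + c_2} - \frac{m\bar x + c_1 m_1}{m + c_1}\right],\ \frac{1}{2}\left[\frac{1}{m + c_1} + \frac{1}{n + c_2}\right]\right)$$
and that
$$z = \frac{1}{\sigma^2}\left[(m + n - 2)s_p^2 + b c_0 + \frac{m c_1(\bar x - m_1)^2}{m + c_1} + \frac{n c_2(\bar y - m_2)^2}{n + c_2}\right]$$
is distributed as $\chi^2_{m+n+b}$. Setting
$$c = \frac{1}{2}\left[\frac{1}{m + c_1} + \frac{1}{n + c_2}\right], \qquad k = \left[\frac{n\bar y + c_2 m_2}{n + c_2} - \frac{m\bar x + c_1 m_1}{m + c_1}\right] \Big/ \sqrt{2\sigma^2 z},$$
where $\sigma^2 z$ is the bracketed quantity in the definition of $z$ and does not depend on $\sigma$,
the joint posterior distribution of $\delta$, $z$ is
$$h(\delta, z) = \frac{z^{(m+n+b)/2 - 1}\, e^{-z/2}}{2^{(m+n+b)/2}\, \Gamma\!\left(\frac{m+n+b}{2}\right)} \cdot \frac{1}{\sqrt{2\pi c}} \exp\left\{-\frac{(\delta - k\sqrt{z})^2}{2c}\right\}$$
and the marginal posterior distribution of $\delta$ is given by
$$h(\delta \mid x_1, \ldots, x_m, y_1, \ldots, y_n) = \frac{(c^{-1}k^2 + 1)^{-(m+n+b)/2}\, e^{-\delta^2/2c}}{\sqrt{2\pi c}\ \Gamma\!\left(\frac{m+n+b}{2}\right)} \sum_{j=0}^{\infty} \left[\sqrt{2}\, c^{-1} k \delta\, (1 + c^{-1}k^2)^{-1/2}\right]^{j} \frac{\Gamma\!\left(\tfrac{1}{2}(m + n + b + j)\right)}{\Gamma(j + 1)}. \qquad (6.11)$$
Although expression (6.11) is rather formidable, lower bounds on $\delta$ can be
obtained via one-dimensional numerical integration.
In addition to their usual interpretation as information from earlier samples,
some guidance in the choice of parameters is provided by the prior expected value
$$E_p[R] = P\left[t_b < \frac{m_2 - m_1}{\sqrt{c_0\,(2 + c_1^{-1} + c_2^{-1})}}\right]$$
where $t_b$ denotes a Student t variable with $b$ degrees of freedom.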
Enis and Geisser (1971) show that the choice of a vague prior $\propto \sigma^{-1}$ produces
a posterior distribution whose expectation $E(R)$ is closer to 0.5 than is the
maximum likelihood estimator. Finally, it should be remarked that they treat the
slightly more general case of estimating $P[a_1X_1 + a_2X_2 + \cdots + a_pX_p > 0]$ and
that one of their formulations includes paired stress-strength data.
6.3. The Bayesian stress-strength model in risk analysis
The stress-strength reliability model is also an integral part of many risk
analyses. At the component level, for instance, it may be necessary to make an
assessment of the reliability of a motor operated valve in a nuclear power plant.
This application of the stress-strength model has one dominant feature. Little or
no data are available on either the critical stress or even on the strength of the
component.
With regard to estimating the strength distribution, one method is to gather
expert opinions from several persons. The elicited information could be in the
form of percentiles such as the 10-th, 50-th and 90-th percentiles. A lognormal,
or other distribution, could be fit to each person's percentiles. These must then
be combined, possibly in a weighted fit, to provide an estimated strength distribution. Estimation of the stress distribution is usually approached via mathematical models which convert phenomena like earthquake magnitude to the stress
on a component located at a given site. Random quantities, like ground motion
from the earthquake and parameters of the structure housing the component, are
then introduced. The resulting process is studied by simulation to produce an
estimated stress distribution. The component reliability, $R = P[Y > X]$, given an
earthquake, can then be estimated using the estimated stress and strength distributions determined above. Mensing (1984) provided the following example.
EXAMPLE 6.1. One important component in the operation of a nuclear power
plant is the steam generator. In a study of the risk to a nuclear power plant from
earthquakes, it is necessary to assess the ability of the generator to withstand the
stresses imposed by the ground motion due to an earthquake. Almost no data
exist for estimating the strength of steam generators, with respect to ground
motion, so expert opinions were elicited. It was first determined that the most
likely cause of generator failure would be failure of its supports. Five experts were
asked to estimate the 10-th, 50-th and 90-th percentiles for the strength of the
steam generator supports. Their responses are summarized in Table 5.1 where the
strength variable is the peak acceleration in ft/sec².
Table 5.1
Expert opinions concerning percentiles of the strength distributions (ft/sec²)
Assuming the strength of the generator supports can be approximated by a
lognormal distribution, a weighted least squares procedure was used to estimate
the mean, $\theta$, and standard deviation, $\delta$, of the natural logarithm of the strength
distribution. The resulting estimates are $\hat\theta = 4.06$ and $\hat\delta = 0.29$.
Mathematical modeling can be used to estimate the distribution of stress at the
base of the steam generator. Suppose the stress distribution is modeled by a
lognormal distribution where the natural logarithm of stress has mean
$\theta_s = 2.32$ and standard deviation $\delta_s = 0.40$ (stress measured in ft/sec²). Then, it is clear that the
reliability of the steam generator is nearly 1.0. Specifically, $R = P[\ln Y > \ln X]
= \Phi\big((4.06 - 2.32)/\sqrt{0.29^2 + 0.40^2}\big) = \Phi(3.52) = 0.99978$.
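The arithmetic behind the reliability figure in Example 6.1 can be reproduced in a few lines. The following Python sketch treats the log-strength and log-stress as independent normal variables, so that their difference is normal with the difference of the means and the sum of the variances; the function names are hypothetical and the inputs are the values quoted in the example.

```python
from math import erf, sqrt


def normal_cdf(u):
    """Standard normal distribution function."""
    return 0.5 * (1.0 + erf(u / sqrt(2.0)))


def lognormal_reliability(mean_log_strength, sd_log_strength,
                          mean_log_stress, sd_log_stress):
    """Return (z, R) with R = P[ln Y > ln X] = Phi(z)."""
    z = (mean_log_strength - mean_log_stress) / sqrt(
        sd_log_strength**2 + sd_log_stress**2)
    return z, normal_cdf(z)


z_value, reliability = lognormal_reliability(4.06, 0.29, 2.32, 0.40)
print(z_value, reliability)
```

The standardized margin comes out close to 3.52, in line with the figure quoted in the example.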
Since the primary source of information about the random variation in stress
and strength is expert opinion and engineering judgement, it is a more difficult
problem to obtain lower bounds for R. In the context of the nuclear power plant,
the lower bound on R converts to an upper bound on the probability of failure
and subsequent radioactive release. Some attempts have been made to quantify
the uncertainty experts have in formulating their opinions and using this quantified
uncertainty to develop bounds for the probability of failure. See Bohn et al. (1983)
for more information.
A risk analysis of a system is considerably more complicated than for a single
component. With a nuclear power plant, failure can occur in numerous ways.
From a fault-tree analysis, each separate failure path is determined. Data are
typically available on some component strengths but it is mostly expert opinion
that must be combined in order to obtain an estimate of the failure path probabilities and, ultimately, the system reliability. The calculation of an estimate of
system reliability can involve as many as 300 to 400 components and the probability of an accident sequence is calculated from, say, a multivariate normal
distribution. In this setting it is possible to include a stress such as an earthquake,
as a common stress to numerous components.
References
Basu, A. (1980). The estimation of P[X < Y] for distributions useful in life testing. Naval Res. Log.
Quart. 3, 383-392.
Basu, A. and Ebrahimi, N. (1983). On computing the reliability of stochastic systems. Statistics and
Probability Letters 1, 265-268.
Bhattacharyya, G. K. (1977). Reliability estimation from survivor count data in a stress-strength
setting. IAPQR Transactions--Journal of the Indian Association for Productivity, Quality and Reliability
2, 1-15.
Bhattacharyya, G. K. and Johnson, R. A. (1974). Estimation of reliability in a multicomponent
stress-strength model. J. Amer. Statist. Assoc. 69, 966-70.
Bhattacharyya, G. K. and Johnson, R. A. (1975). Stress-strength models for system reliability. Proc.
Symp. on Reliability and Fault-tree Analysis, SIAM, 509-32.
Bhattacharyya, G. K. and Johnson, R. A. (1977). Estimation of system reliability by nonparametric
techniques. Bulletin of the Mathematical Society of Greece (Memorial Volume), 94-105.
Bhattacharyya, G. K. and Johnson, R. A. (1981). Stress-strength models for reliability: Overview
and recent advances. Proc. 26th Design of Experiments Conference, 531-546.
Bhattacharyya, G. K. and Johnson, R. A. (1983). Some reliability concepts useful in materials testing.
Reliability in the Acquisitions Process. Marcel Dekker, New York, 115-131.
Birnbaum, Z. W. (1956). On a use of the Mann-Whitney statistic. Proc. Third Berkeley Symp. Math.
Statist. Prob. 1, 13-17.
Birnbaum, Z. W. and McCarthy, R. C. (1958). A distribution free upper confidence bound for
P(Y < X) based on independent samples of X and Y. Ann. Math. Statist. 29, 558-62.
Bohn, M. P. et al. (1983). Application of the SSMRP methodology to the seismic risk at the Zion
Nuclear Power Plant, NUREG/CR-3428 Nuclear Regulatory Commission, Nov.
Chandra, S. and Owen, D. B. (1975). On estimating the reliability of a component subject to several
different stresses (strengths). Naval Res. Log. Quart. 22, 31-40.
Church, J. D. and Harris, B. (1970). The estimation of reliability from stress-strength relationship.
Technometrics 12, 49-54.
Downton, F. (1973). The estimation of P ( Y < X) in the normal case. Technometrics 15, 551-558.
Easterling, R. (1972). Approximate confidence limits for system reliability. J. Amer. Statist. Assoc. 67,
Enis, P. and Geisser, S. (1971). Estimation of the probability that Y < X. J. Amer. Statist. Assoc. 66,
Govindarajulu, Z. (1967). Two sided confidence limits for P(Y < X) for normal samples of X and
Y. Sankhyā B 29, 35-40.
Govindarajulu, Z. (1968). Distribution-free confidence bounds for P ( X < Y). Ann. Inst. Statist. Math.
20, 229-38.
Guttman, I., Johnson, R. A., Bhattacharyya, G. K. and Reiser, B. (1988). Confidence limits for
stress-strength models with explanatory variables. Technometrics (in press).
Lehmann, E. (1959). Testing Statistical Hypotheses. Wiley, New York.
Kececioglu, D. (1972). Reliability analysis of mechanical components and systems. Nuclear Eng. Des.
9, 257-290.
Lloyd, D. K. and Lipow, M. (1962). Reliability, Management, Methods and Mathematics. Prentice-Hall,
Englewood Cliffs, NJ.
Madansky, A. (1965). Approximate confidence limits for the reliability of series and parallel systems.
Technometrics 7, 495-503.
Mazumdar, M. (1970). Some estimates of reliability using interference theory, Naval Res. Log. Quart.
17, 159-65.
McCarthy, J. F. and Orringer, O. (1975). Some approaches to assessing failure probabilities of
redundant structures. Composite Reliability, ASTM STP 580, American Society for Testing and
Materials, 5-31.
Mensing, R. (1984). Personal communication.
Myhre, J. M. and Saunders, S. C. (1968). On confidence limits for the reliability of systems. Ann.
Math. Statist. 39, 1463-72.
Owen, D. B., Craswell, K. J. and Hanson, D. L. (1964). Nonparametric upper confidence bounds
for P(Y < X) and confidence limits for P(Y < X ) when X and Y are normal. J. Amer. Statist. Assoc.
59, 906-24.
Sen, P. K. (1967). A note on asymptotically distribution-free confidence bounds for P(X < Y) based
on two independent samples. Sankhyā A 29, 95-102.
P. R. Krishnaiah and C. R. Rao, eds., Handbook of Statistics, Vol. 7
© Elsevier Science Publishers B.V. (1988) 55-72
Approximate Computation of Power Generating
System Reliability Indexes
1. Introduction
An electric power system is a massive energy conversion and transmission
facility. Its function is to convert chemical, nuclear, or kinetic potential into a
more useful electrical potential and transmit electrical energy to its consumers.
Power systems tend to have generation concentrated in specific locations, whereas
demand is spread over a large geographic region. The problem of providing power
to widely scattered demands from remote generating stations is solved by the
electric utility companies through a three-tiered system. Elements of this system
are: power generation subsystem, transmission subsystem, and distribution subsystem. In the power generation system, electric power is produced from a
number of different types of generating plants (fossil, nuclear, hydroelectric, etc.).
Transmission systems carry large amounts of power over long distances at a high
voltage level. From the transmission sources, distribution systems carry the load
to a service area by forming a fine network.
The reliability of an electric power system has been defined as the probability
of providing the user with continuous service of satisfactory quality [8]. By
satisfactory quality, it is meant that the frequency and the voltage of the power
supply remain within prescribed bounds. There are several reasons why reliability
is very important to the electric power industry. The public has grown accustomed
to a very reliable supply of electricity, and it would not accept lower standards. The
occurrence of power failure is expensive to the customer as well as to the utilities.
The social costs of power failure have also been well-documented. There has been
increasing concern during the recent years on the risks to public health and safety
associated with different energy sources that are used to produce electricity. In
relation to nuclear power, these risks are largely contingent on the probability and
severity of infrequent system failures. Therefore, reliability considerations have
come to play a major role in the planning, design, operation and maintenance of
electric power plants. To achieve a high degree of reliability at the customer's
level, it is necessary that each of the three components of the power system (generation, transmission and distribution) provide an even higher degree of reliability.
M. Mazumdar
The performance of electric power systems is influenced by a large number of
random phenomena. First, the demand for electric power has a large stochastic
component, which is strongly influenced by weather. The outdoor equipment, such
as transmission lines, is subject to natural causes, e.g., storm, lightning, floods as
well as to inadvertent man and animal-caused damages. The equipment used to
generate and transmit electricity fails randomly. The time to restore the failed
equipment is also a random variable. It is thus necessary to construct probability
models which can be used to predict the performance of the power systems as
they are influenced by these random variables. These probability models are used
to compute standard reliability parameters such as mean time to failure, availability, etc., as well as reliability indexes which are special to the electric utility
industry. Concurrently, one needs to pay attention to proper collection and analysis of 'outage' data so that one has appropriate confidence in the output of these
reliability studies.
Early studies in power system reliability evaluation were confined to determination of generating system reserve capacity. Only comparatively recently have such
studies been extended to cover the transmission and distribution systems.
Consequently, the state-of-the-art with regard to generation reliability models is
much more advanced than that for the transmission and distribution systems.
Reliability models play an important role in the determination of required
installation generation reserves of a given electric utility company, and its long-term planning for generation capacity expansion. The quantity determined here is
the amount of installed reserve capacity required such that the probability of
load-loss does not exceed a prescribed small amount. These studies thus help the
planner in scheduling generating unit additions as the load grows over time. Such
models also play an important role in the evaluation of expected generation
system production costs. An electric utility system typically consists of many
generating units of different capacities, availabilities and operating costs. Not all
the units within a system experience equal utilization over time because of such
differences, and units with high running costs are pressed into operation only if
the load is high and/or the cheaper units have failed at the time. Therefore, the
computation of expected overall production costs of a utility system needs to
account for the stochastic characteristics of the generating unit failures and the
system load. Power generation reliability models include only the generating units
within a given system, and the rest of the system is assumed to be perfectly
reliable. Thus, according to these models, a system failure occurs when the total
power generated by the system falls short of the system load. In order to contrast
the available system capacity with the demand, two sets of models are required,
one for the states (e.g., failed and non-failed) of the generating units, and the other
for load variations in a given system. When one combines these two stochastic
models, one arrives at an overall system model whose solution provides the
required reliability indexes which can then be used as engineering tools for planning and operating decisions.
In this paper, we will confine ourselves to an examination of the computational
aspects of two important power generation system reliability indexes which are
used in power system planning and production costing evaluation. In particular,
we provide additional results on the use of an effective approximation scheme
which was proposed in a recent paper [14]. This scheme uses a transformation
proposed by Esscher [10] for computing actuarial risks. Section 2 describes the
reliability models used for the power generating system in connection with the
determination of the risk due to load loss and the expected production cost. We
define here the reliability indexes of interest, and point out the difficulties in their
computation. Section 3 gives a description of Esscher's approximation method
as well as that of a more common approximation known in the recent power
system literature as the method of cumulants [15]. We derive here the necessary
formulas in connection with the application of Esscher's method. Section 4
provides the numerical estimates of the accuracy of this method by applying it to
several prototype systems. Section 5 states the conclusions.
For a more detailed discussion of generation reliability models and their uses,
the reader is referred to the monographs by Billinton [3], Billinton, Ringlee and
Wood [4], Endrenyi [8], and Sullivan [16].
2. Generating system model and the reliability indexes
Generating unit representation
It is assumed that the generating system under consideration is composed of
n independent generating units. That is, they can fail, and be repaired, independent of failures and repairs of the other units. This is usually a realistic
assumption except when a single boiler supplies steam to several turbogenerators
through a common header. In the course of their operation, generating units may
suffer complete failures or partial failures, where they lose a part of their capacity.
If a simple two-state model is assumed for the generating unit, such that it
alternates between two operating states, 'up' (operating) or 'down' (under repair),
a measure of the unit performance is given by its unavailability, which is defined
as follows [2]:
$$A = \frac{\text{Mean downtime}}{\text{Mean downtime} + \text{Mean uptime}}. \qquad (1)$$
The index A measures the fraction of the time that a unit is unavailable for service
during periods when it is not on planned outage. Endrenyi [8] has shown that
this index is meaningful even when maintenance of short duration is
involved, provided that maintenance itself does not contribute to failure.
In the power system vocabulary, the term used for unavailability is the forced
outage rate (FOR), which unfortunately is a misnomer, since the index represents
a pure number and not a rate. The FOR is defined as
$$\text{FOR} = \frac{\text{Forced outage time}}{\text{Forced outage time} + \text{In-service time}} \qquad (2)$$
where the times appearing in the numerator and denominator refer to a reasonably
long period of observation. The index (2) is equivalent to (1) when the period
in question is long enough. The above definition of the forced outage rate or
the unavailability assumes that the generating unit has only two states: operating
at 100% capacity or completely failed. The intermediate capacity states are
usually accounted for by defining an index called the equivalent forced outage rate
(EFOR), which is given by the following equation:
where FOH, EFOH and SH denote respectively full forced outage hours, equivalent forced outage hours and service hours. The quantity, equivalent forced outage
hours, is obtained by multiplying the actual partial outage horus for each partial
outage event by the fractional capacity reduction and then summing the products.
The introduction of the index EFOR enables one to approximate a unit with
several capacity states by one having only two states. In this two-state equivalent
representation, the index EFOR estimates the long-run probability of being fully
out and the quantity (1-EFOR) estimates the long-run probability that it is available at full capacity. Data on EFOR are presented for a variety of sizes and types
of generating units in reports published by the Edison Electric Institute, see, for
example, [7].
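As an illustration of the indexes defined above, the following minimal Python sketch computes the unavailability (1) and the equivalent forced outage hours from a list of partial outage events. The function names and sample figures are hypothetical, and the EFOR expression (FOH + EFOH)/(FOH + SH) used in the last function is an assumption about the usual form of the index rather than a quotation from the reports cited.

```python
def unavailability(mean_downtime, mean_uptime):
    """Index A of (1): long-run fraction of time the unit is down."""
    return mean_downtime / (mean_downtime + mean_uptime)


def equivalent_forced_outage_hours(partial_outages):
    """Sum of (partial outage hours x fractional capacity reduction).

    partial_outages is a list of (hours, fraction) pairs, one per event.
    """
    return sum(hours * fraction for hours, fraction in partial_outages)


def efor(foh, efoh, sh):
    """Assumed form of the index: (FOH + EFOH) / (FOH + SH)."""
    return (foh + efoh) / (foh + sh)


# Hypothetical data: two partial outage events and 200 full outage hours.
events = [(100.0, 0.5), (40.0, 0.25)]
efoh = equivalent_forced_outage_hours(events)
print(unavailability(50.0, 950.0), efoh, efor(200.0, efoh, 7800.0))
```

In the two-state equivalent representation, the value returned by `efor` would play the role of the outage probability of the unit.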
Load models
An hourly load duration curve is obtained by first plotting on the vertical axis
the power demand forecasted for each hour in a planning period in chronological order along the horizontal axis. The load duration curve (LDC) is then
obtained by reordering the demands in descending order of magnitude. Here,
the number of days on which the load exceeds a given value is plotted as the
abscissa with the forecasted load value as the ordinate. Assume that the forecasted peak demand occurs for one hour during each of the days in a 20-day
planning period. Then, one can say that the peak load occurred in a fraction equal to
1/24 of the planning period. Figure 1 shows that the system load was expected
to be above 100 MW during 50% of the time. When the abscissa is normalized
to 1, the figure can be read to denote the fraction of the time the load is expected
to be above a given value. Thus it is possible to give a probabilistic interpretation
to the load duration curve. The horizontal axis of the curve yields the survivor
function of the load when it is treated as a random variable. It gives the probability that the observed load will exceed a specified value as denoted by the ordinate.
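The construction of the load duration curve and its reading as a survivor function can be sketched in a few lines of Python. This is only an illustrative sketch: the function names and the eight sample loads are hypothetical, and the curve is represented simply as a list of (normalized duration, load) pairs.

```python
def load_duration_curve(hourly_loads):
    """Reorder the loads in descending order and attach normalized durations.

    Returns (fraction of the period, load) pairs: the load is at least the
    given value during the given fraction of the period.
    """
    ordered = sorted(hourly_loads, reverse=True)
    n = len(ordered)
    return [((k + 1) / n, load) for k, load in enumerate(ordered)]


def exceedance_fraction(hourly_loads, level):
    """Empirical survivor function: fraction of hours with load > level."""
    return sum(1 for x in hourly_loads if x > level) / len(hourly_loads)


# Hypothetical hourly loads in MW.
loads = [80, 120, 150, 90, 110, 70, 100, 130]
print(load_duration_curve(loads)[0])
print(exceedance_fraction(loads, 100))
```

For these sample loads, four of the eight hours lie above 100 MW, so the exceedance fraction at 100 MW is one half.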
[Fig. 1. Hourly load duration curve: an example. Ordinate: load (MW); abscissa: duration in hours, normalized to 1 for LOLP.]

In some studies on generation reliability, notably when unit production costs
are evaluated, it is a practice to merge the individual generating unit failure models
and the load probability distribution by defining the so-called equivalent load
duration curve, abbreviated as ELDC [15]. This definition rests on the observation that the outages caused by plant unreliability can be thought of as additions
to the true load on the system. Suppose that all n units within a given system are
candidates for operation to meet a given load, L. Then the surplus of available capacity over the load is
$$(c_1 - X_1) + (c_2 - X_2) + \cdots + (c_n - X_n) - L = c_1 + c_2 + \cdots + c_n - (X_1 + X_2 + \cdots + X_n + L),$$
where $c_i$ is the installed capacity of unit i, and $X_i$ is the capacity on outage for
unit i. Notice that the quantity $(X_1 + X_2 + \cdots + X_n + L)$ plays the role of an
equivalent load that confronts the n units of the system. A curve which shows the
proportion of times that the observed equivalent load will exceed given specified
values is called the equivalent load duration curve (ELDC). It is clear from the
foregoing discussion that separate sets of such curves can be drawn for all the
n individual units of the system.
Loss of load probability index
Two different sets of generation reliability indexes are used by the electric utility
industry--one in the context of long-range planning and the other for short-term
operational planning. The former provides inputs to decisions in generation
expansion planning and the scheduling of new unit additions. The latter indexes
are of use to the operating engineer in the daily operation of a power system. The
loss of load probability (LOLP) index is used in the long-range planning context,
and it measures the probability that a given system's available capacity is insufficient to meet the system peak load on a given day. It estimates the fraction of
time the utility system will have a generation deficit, with no consideration given
to the magnitude of the deficit.
Consider a system consisting of n units such that the installed capacity of unit
i is $c_i$ and its (equivalent) forced outage rate is $p_i$, i = 1, 2, ..., n. Define $X_i$ as
the unavailable capacity or the capacity on outage for unit i on a given day. We
assume that $X_1, X_2, \ldots, X_n$ are independent random variables, where $X_i$ has the two-point distribution
$$X_i = \begin{cases} c_i & \text{with probability } p_i, \\ 0 & \text{with probability } 1 - p_i. \end{cases} \tag{3}$$
Let L denote the system peak load. Then the loss-of-load probability (LOLP)
index is measured by
$$\mathrm{LOLP} = \Pr\{X_1 + X_2 + \cdots + X_n > c_1 + c_2 + \cdots + c_n - L\}. \tag{4}$$
In the situation where the LOLP index is being estimated for future time periods,
as is typically done in power generation planning, the forecasted peak load will
be uncertain and regarded as a random variable. We usually regard L as normally
distributed with mean $\mu$ and variance $\sigma^2$, its distribution being independent of the
$X_i$ random variables. If the peak load is regarded as known, $\sigma^2 = 0$ and $L = \mu$,
but otherwise $\sigma^2 > 0$, and departures from normality may also be anticipated. Let
Y denote the deviation of the peak load from its mean $\mu$. Then we can also
express (4) as follows:
$$\mathrm{LOLP} = \Pr\{X_1 + X_2 + \cdots + X_n + Y > z\}, \tag{5}$$
where $z = c_1 + c_2 + \cdots + c_n - \mu$, and Y is normally distributed with mean 0 and
variance $\sigma^2$. The electric utilities in the United States plan their operation so as
to meet a targeted value of the LOLP index of the order of $10^{-4}$. Thus, the LOLP
measure represents the extreme tail probability in the distribution of
$X_1 + X_2 + \cdots + X_n$.
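For a very small system, the LOLP of (4) with a known peak load can be evaluated directly by enumerating all up/down states of the units. The following Python sketch is meant only to make the definition concrete; the function name and the three-unit system are hypothetical, and the enumeration visits all $2^n$ states, so it is practical only for small n.

```python
from itertools import product


def lolp_exact(capacities, fors, peak_load):
    """Pr{X_1 + ... + X_n > c_1 + ... + c_n - L} by full state enumeration."""
    z = sum(capacities) - peak_load
    total = 0.0
    # state[i] == 1 means unit i is on forced outage, 0 means it is available.
    for state in product((0, 1), repeat=len(capacities)):
        prob = 1.0
        outage = 0.0
        for down, c, p in zip(state, capacities, fors):
            prob *= p if down else 1.0 - p
            outage += c if down else 0.0
        if outage > z:
            total += prob
    return total


# Hypothetical system: two 100 MW units and one 50 MW unit, 150 MW peak load.
print(lolp_exact([100.0, 100.0, 50.0], [0.1, 0.1, 0.2], 150.0))
```

In this example a loss of load requires either both 100 MW units to be out, or one 100 MW unit and the 50 MW unit to be out together, which gives 0.01 + 0.036 = 0.046.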
Production costing index
For the evaluation of the expected operating costs of a given utility, we assume,
somewhat simplifying the real-life situation, that (a) there are n units in the system, (b) the units are brought into operation in accordance with some specified
measure of economy of operation (e.g., marginal cost), and (c) unit i, in
decreasing order of economy of operation, has capacity $c_i$ and EFOR $p_i$,
i = 1, 2, ..., n. Let U denote the system load, and let F(x) = Pr{U > x}. Thus
F(x) represents the load-duration curve or LDC.
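Consider now the i-th unit in the loading order and let $W_i$ denote the energy
unserved (i.e., the unmet demand) after it has been loaded. Let, as before, $X_i$
denote the unavailable capacity for unit i, whose probability distribution is given
by (3), and let U denote the system load. We define
$$C_i = c_1 + c_2 + \cdots + c_i, \qquad Z_i = U + X_1 + X_2 + \cdots + X_i, \qquad i = 1, 2, \ldots, n.$$
Thus, $Z_i$ represents the equivalent load on the first i units. Let $g_i(\cdot)$ and $G_i(\cdot)$
denote the probability density and distribution functions, respectively, of $Z_i$. Then
$$W_i = \begin{cases} 0 & \text{if } Z_i \le C_i, \\ Z_i - C_i & \text{if } Z_i > C_i. \end{cases}$$
Thus,
$$E(W_i) = \int_{C_i}^{\infty} (z - C_i)\, g_i(z)\, dz. \tag{9}$$
Now denote by $e_i$ the energy produced by unit i. Then it follows from (9) that
$$\begin{aligned} E(e_i) &= E(W_{i-1}) - E(W_i) \\ &= \int_{C_{i-1}}^{\infty} (z - C_{i-1})\, g_{i-1}(z)\, dz - \int_{C_i}^{\infty} (z - C_i)\, g_i(z)\, dz \\ &= \int_{C_{i-1}}^{\infty} \bar G_{i-1}(z)\, dz - \int_{C_i}^{\infty} \bar G_i(z)\, dz \\ &= (1 - p_i) \int_{C_{i-1}}^{C_i} \bar G_{i-1}(z)\, dz, \qquad i = 1, 2, \ldots, n, \end{aligned} \tag{10}$$
where
$$\bar G_i(z) = 1 - G_i(z), \qquad i = 1, 2, \ldots, n.$$
The last step uses $\bar G_i(z) = (1 - p_i)\,\bar G_{i-1}(z) + p_i\,\bar G_{i-1}(z - c_i)$, which follows from (3) and the independence of $X_i$ and $Z_{i-1}$. In the above, we interpret $C_0 = 0$ and $\bar G_0(x) = F(x)$. The development of (10) is
due to Baleriaux et al. [1]. We define the capacity factor for unit i to be
$$\mathrm{CF}_i = \frac{E(e_i)}{c_i}, \qquad i = 1, 2, \ldots, n.$$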
This index gives the ratio of the expected output to the maximum possible output
for each unit. An accurate estimate of this index is needed by the utilities for the
purposes of evaluating expected system operating costs and optimizing its generation planning.
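The relation between expected unserved energy and the capacity factor can be checked on a toy system by direct enumeration. The sketch below is an illustration under stated assumptions, not the production costing algorithm used in practice: it takes the load U to be a discrete random variable, takes the equivalent load on the first i units to be U plus the outages of those units, computes the expected unserved energy after each unit as the expected excess of that equivalent load over the cumulative capacity, and forms each capacity factor as the drop in expected unserved energy divided by the unit capacity. All names and numbers are hypothetical.

```python
from itertools import product


def expected_unserved(capacities, fors, loads, probs, i):
    """E(W_i): expected excess of the equivalent load over C_i (first i units)."""
    c_total = sum(capacities[:i])
    total = 0.0
    for state in product((0, 1), repeat=i):
        p_state = 1.0
        outage = 0.0
        for down, c, p in zip(state, capacities, fors):
            p_state *= p if down else 1.0 - p
            outage += c if down else 0.0
        for load, r in zip(loads, probs):
            total += p_state * r * max(load + outage - c_total, 0.0)
    return total


def capacity_factors(capacities, fors, loads, probs):
    """CF_i = [E(W_{i-1}) - E(W_i)] / c_i for units taken in loading order."""
    n = len(capacities)
    ew = [expected_unserved(capacities, fors, loads, probs, i) for i in range(n + 1)]
    return [(ew[i - 1] - ew[i]) / capacities[i - 1] for i in range(1, n + 1)]


# Hypothetical system: a 100 MW unit (FOR 0.1) loaded first, then a 50 MW
# unit (FOR 0.2), facing a constant 60 MW load.
print(capacity_factors([100.0, 50.0], [0.1, 0.2], [60.0], [1.0]))
```

In this example the first unit serves the full 60 MW whenever it is up, giving 0.9 x 60 / 100 = 0.54, while the second unit runs at its full 50 MW only when the first is down and it is itself up, giving 0.1 x 0.8 x 50 / 50 = 0.08.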
Computational difficulties
In its planning process, a given utility needs to compute the LOLP and CF
indexes for various combinations of the system load and mix of generating units.
Thus it is necessary that an inexpensive method of computation be used for the
purpose of computing these indexes. Examining (4), we observe that when the $c_i$'s
and the $p_i$'s are all different, at least $2^n$ arithmetic operations will be required to
evaluate one LOLP index. Thus, the total number of arithmetic operations in the
computation of one LOLP index varies exponentially with the number of generating units in the system, and it might become prohibitively large for large values
of n. From (10), we observe that the expected energy output of a given unit is
proportional to an average LOLP value over a range of z between $C_{i-1}$ and $C_i$.
Thus, it is not feasible for a power system planner to engage in a direct computation of (4) or (10), and the planner has to resort to approximations which require much
less computer time.
3. Approximate procedures
Method of cumulants
From an uncritical application of the central limit theorem, one could have
made the convenient assumption that the distribution of $X_1 + X_2 + \cdots + X_n$ in (5)
or the survivor function $\bar G_{i-1}(x)$ in (10) will be approximately normal. While this
assumption may not cause problems in computing probabilities near the central region of the probability distribution, the 'tail' probabilities may be inaccurately estimated. A typical generation mix within a given utility usually
contains several large units and otherwise mostly small units, thus violating the
spirit of the Lindeberg condition [11] of the central limit theorem. An approach
to the problem of near-normality is that of making small corrections to the normal
distribution approximation by using asymptotic expansions (Edgeworth or
Gram-Charlier) based on the central limit theorem. Use of these expansions in
evaluating power generating system reliability indexes has come to be known in
the recent power-system literature as the method of cumulants. For details on the
use of these expansions in computation of LOLP, see [13], and for its use in
computing the capacity factor index, see [5]. In the evaluation of LOLP, one first
obtains the cumulants of $X_1 + X_2 + \cdots + X_n + Y$ by summing the corresponding
cumulants of the $X_i$'s and of Y. These are then used in the appropriate Edgeworth
or Gram-Charlier expansion. Similarly, for the purpose of evaluating $E(e_i)$ in
(10), one first obtains the cumulants of $Z_i$ for each i = 1, 2, ..., n, by summing
up the cumulants of $X_1, X_2, \ldots, X_i$ and U. Next, one writes the formal expansion
for $\bar G_i(x)$ using these cumulants up to a given order. Finally, one integrates the series
term by term in (10) to obtain an approximation for $E(e_i)$. Caramanis et al. [5]
have made a detailed investigation of this approximation in the computation of
the capacity factor indexes. Their results have cast favorable light on the efficiency
of this method.
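The bookkeeping behind the method of cumulants can be sketched as follows, for the case of a known peak load. The sketch assumes the standard closed forms for the first four cumulants of a two-point variable taking the value c with probability p and 0 otherwise, and the standard Edgeworth correction of the normal tail written with Hermite polynomials. The function names are hypothetical, and the expansion shown is one common form, not necessarily the exact variant used in [13] or [5].

```python
import math


def unit_cumulants(c, p):
    """First four cumulants of X = c w.p. p, 0 w.p. 1 - p."""
    q = 1.0 - p
    return (c * p,
            c**2 * p * q,
            c**3 * p * q * (1.0 - 2.0 * p),
            c**4 * p * q * (1.0 - 6.0 * p * q))


def system_cumulants(capacities, fors):
    """Cumulants of a sum of independent variables add term by term."""
    k = [0.0, 0.0, 0.0, 0.0]
    for c, p in zip(capacities, fors):
        for j, value in enumerate(unit_cumulants(c, p)):
            k[j] += value
    return k


def edgeworth_tail(z, k):
    """Edgeworth approximation of Pr{X > z} from cumulants k = [k1, k2, k3, k4]."""
    k1, k2, k3, k4 = k
    t = (z - k1) / math.sqrt(k2)
    g1 = k3 / k2**1.5          # skewness coefficient
    g2 = k4 / k2**2            # excess kurtosis coefficient
    phi = math.exp(-0.5 * t * t) / math.sqrt(2.0 * math.pi)
    normal_tail = 0.5 * math.erfc(t / math.sqrt(2.0))
    he2 = t * t - 1.0
    he3 = t**3 - 3.0 * t
    he5 = t**5 - 10.0 * t**3 + 15.0 * t
    return normal_tail + phi * (g1 / 6.0 * he2 + g2 / 24.0 * he3 + g1 * g1 / 72.0 * he5)


# Hypothetical five-unit system, outage level z = 150 MW.
k = system_cumulants([100.0, 100.0, 50.0, 50.0, 20.0], [0.1, 0.1, 0.05, 0.05, 0.02])
print(k, edgeworth_tail(150.0, k))
```

With so few units the approximation should not be expected to be accurate in the tail, which is precisely the difficulty the text describes; the sketch only shows how the cumulants are accumulated and used.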
Esscher's approximation: Computation of LOLP
We illustrate this method first with respect to the computation of LOLP. We
assume that the peak load is non-random and known, i.e., $\sigma = 0$. As demonstrated in [14], this is the worst case for the peak load distribution insofar as the
relative accuracy of the different approximation methods is concerned. We use the
symbols $F_i$ and $F^*$ to denote the distribution functions of the random variables
$X_i$ and $X_1 + X_2 + \cdots + X_n$, respectively. The moment generating functions of $F_i$
and $F^*$ are respectively given by
$$\hat F_i(s) = e^{s c_i} p_i + (1 - p_i),$$
$$\hat F^*(s) = \prod_{i=1}^{n} \hat F_i(s) = e^{\psi(s)}.$$
In order to provide a notation which covers the continuous as well as discrete
variables, we use the symbol F(dx) to denote the 'density' of the distribution
function F(.) (see Feller [11, p. 139] for a mathematical explanation of the
symbol F(dx)). We now define, for some s,
$$V_i(dx) = \frac{e^{s x} F_i(dx)}{\hat F_i(s)}.$$
Further, let $V^*$ denote the convolution of $V_1, V_2, \ldots, V_n$. With these definitions,
it is seen that the LOLP index may be expressed as follows:
$$\mathrm{LOLP} = \int_{z}^{\infty} F^*(dx) = \hat F^*(s) \int_{z}^{\infty} e^{-s x}\, V^*(dx). \tag{15}$$
We now choose s such that z equals the mean of $V^*(\cdot)$. Thus, although in
practical application z will lie in the right-hand tail of $F^*(\cdot)$, it will now be at the
center of the d.f. $V^*(\cdot)$. We also expect the distribution $V^*(\cdot)$ to be much closer
to the normal distribution in the central portion of the distribution (much more
so as compared to the tails). Thus, in the second integral of (15), the integration
is being done in a region where $V^*(\cdot)$ could be accurately approximated by a
normal distribution or an expansion of the Edgeworth type. The effect of the
multiplier $e^{-s x}$ for s > 0 is to de-emphasize the contribution of $V^*(dx)$ for values
of x in the tail. Esscher's approximation technique consists in replacing $V^*(dx)$
by an appropriate normal distribution or an Edgeworth expansion, and evaluating
the resulting integral.
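It can be shown [9] that, corresponding to a given s, the first four cumulants
of $V^*(\cdot)$ are given by
$$\psi'(s) = \sum_{i=1}^{n} \frac{c_i p_i}{p_i + (1 - p_i) e^{-s c_i}}, \tag{16a}$$
$$\psi''(s) = \sum_{i=1}^{n} \frac{c_i^2 p_i (1 - p_i) e^{-s c_i}}{[p_i + (1 - p_i) e^{-s c_i}]^2}, \tag{16b}$$
$$\psi'''(s) = \sum_{i=1}^{n} \frac{c_i^3 p_i (1 - p_i) e^{-s c_i} [-p_i + (1 - p_i) e^{-s c_i}]}{[p_i + (1 - p_i) e^{-s c_i}]^3}, \tag{16c}$$
$$\psi''''(s) = \sum_{i=1}^{n} \frac{c_i^4 p_i (1 - p_i) e^{-s c_i} [p_i^2 - 4 p_i (1 - p_i) e^{-s c_i} + (1 - p_i)^2 e^{-2 s c_i}]}{[p_i + (1 - p_i) e^{-s c_i}]^4}. \tag{16d}$$
In applying Esscher's approximation, we first solve the equation (in s):
$$\psi'(s) = z. \tag{17}$$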
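Call this unique root $s_0$. We now replace $V^*(dx)$ in (15) by the normal density
or an appropriate Edgeworth expansion. For a random variable X whose first
four cumulants are $k_1, k_2, k_3$ and $k_4$, its density F(dx) is approximated by the
Edgeworth expansion [6] formula as follows:
$$F(dx) \simeq k_2^{-1/2} \left[ \varphi(t) - \frac{\gamma_1}{3!} \varphi^{(3)}(t) + \frac{\gamma_2}{4!} \varphi^{(4)}(t) + \frac{10 \gamma_1^2}{6!} \varphi^{(6)}(t) \right] dx, \tag{18}$$
where
$$t = \frac{x - k_1}{k_2^{1/2}}, \qquad \gamma_1 = \frac{k_3}{k_2^{3/2}}, \qquad \gamma_2 = \frac{k_4}{k_2^{2}},$$
and $\varphi(t) = (2\pi)^{-1/2} e^{-t^2/2}$ is the standard normal density, with $\varphi^{(j)}$ its j-th derivative.
Now, if we replace $V^*(dx)$ in (15) by the appropriate normal and Edgeworth
expansions (18) using first and second order terms, the following formulas result:
$$\mathrm{LOLP} \simeq \mathrm{LOLP}_1 = e^{\psi(s_0) - s_0 z} E_0(u), \tag{19a}$$
$$\mathrm{LOLP} \simeq \mathrm{LOLP}_2 = \mathrm{LOLP}_1 \left[ 1 - \frac{\gamma_1}{6} v \right], \tag{19b}$$
$$\mathrm{LOLP} \simeq \mathrm{LOLP}_3 = \mathrm{LOLP}_2 + \mathrm{LOLP}_1 \left[ \frac{\gamma_2}{24} u v + \frac{\gamma_1^2}{72} \left( u^3 v - \frac{3u}{w} \right) \right], \tag{19c}$$
where
$$u = s_0 \sqrt{\psi''(s_0)}, \qquad E_0(u) = e^{u^2/2} [1 - \Phi(u)], \qquad \Phi(u) = \int_{-\infty}^{u} \varphi(t)\, dt,$$
$$\gamma_1 = \frac{\psi'''(s_0)}{[\psi''(s_0)]^{3/2}}, \qquad \gamma_2 = \frac{\psi''''(s_0)}{[\psi''(s_0)]^{2}},$$
$$w = \sqrt{2\pi}\, E_0(u), \qquad v = u^3 - \frac{u^2 - 1}{w},$$
$$\psi(s_0) = \log_e \hat F^*(s_0).$$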
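The steps of Esscher's approximation for LOLP can be sketched in Python as follows. This is a minimal sketch under stated assumptions, not the implementation used for the numerical results of Section 4: it takes ψ(s) to be the logarithm of the moment generating function of the total outage, finds the root $s_0$ of ψ'(s) = z by bisection, and assumes the forms LOLP_1 = exp{ψ(s_0) − s_0 z} E_0(u), LOLP_2 = LOLP_1 [1 − (γ_1/6) v] and LOLP_3 = LOLP_2 + LOLP_1 [(γ_2/24) u v + (γ_1²/72)(u³ v − 3u/w)], with u = s_0 ψ''(s_0)^{1/2}, w = (2π)^{1/2} E_0(u) and v = u³ − (u² − 1)/w. It further assumes that z lies strictly between the mean outage and the total installed capacity, so that $s_0 > 0$. The function names and the five-unit system are hypothetical.

```python
import math


def psi_and_derivatives(s, capacities, fors):
    """psi(s) = log of the m.g.f. of the total outage, and its first four derivatives."""
    psi = d1 = d2 = d3 = d4 = 0.0
    for c, p in zip(capacities, fors):
        q = 1.0 - p
        e = math.exp(-s * c)
        big_d = p + q * e
        psi += s * c + math.log(big_d)      # equals log(p * exp(s * c) + q)
        d1 += c * p / big_d
        d2 += c**2 * p * q * e / big_d**2
        d3 += c**3 * p * q * e * (-p + q * e) / big_d**3
        d4 += c**4 * p * q * e * (p * p - 4.0 * p * q * e + (q * e) ** 2) / big_d**4
    return psi, d1, d2, d3, d4


def esscher_lolp(capacities, fors, z):
    """Return (s0, LOLP_1, LOLP_2, LOLP_3) for Pr{total outage > z}."""
    mean_outage = sum(c * p for c, p in zip(capacities, fors))
    if not mean_outage < z < sum(capacities):
        raise ValueError("this sketch assumes mean outage < z < total capacity")
    # Bracket the root of psi'(s) = z, then bisect (psi' is increasing in s).
    lo, hi = 0.0, 1e-6
    while psi_and_derivatives(hi, capacities, fors)[1] < z:
        hi *= 2.0
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if psi_and_derivatives(mid, capacities, fors)[1] < z:
            lo = mid
        else:
            hi = mid
    s0 = 0.5 * (lo + hi)
    psi, _, d2, d3, d4 = psi_and_derivatives(s0, capacities, fors)
    u = s0 * math.sqrt(d2)
    e0 = math.exp(0.5 * u * u) * 0.5 * math.erfc(u / math.sqrt(2.0))
    g1 = d3 / d2**1.5
    g2 = d4 / d2**2
    w = math.sqrt(2.0 * math.pi) * e0
    v = u**3 - (u * u - 1.0) / w
    lolp1 = math.exp(psi - s0 * z) * e0
    lolp2 = lolp1 * (1.0 - g1 / 6.0 * v)
    lolp3 = lolp2 + lolp1 * (g2 / 24.0 * u * v + g1 * g1 / 72.0 * (u**3 * v - 3.0 * u / w))
    return s0, lolp1, lolp2, lolp3


# Hypothetical five-unit system; z is the installed capacity minus the peak load.
print(esscher_lolp([100.0, 100.0, 50.0, 50.0, 20.0], [0.1, 0.1, 0.05, 0.05, 0.02], 150.0))
```

Bisection is used here only because it is simple and ψ' is monotone; any one-dimensional root finder would serve. For a system this small the outage distribution is coarsely discrete, so the three values should be read as an illustration of the mechanics rather than as accuracy claims.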
Esscher's approximation: Computation of unit capacity factors
A typical load distribution curve is multimodal, and it cannot be approximated
by a standard distribution. For the purpose of applying the present approximation, we discretize the load-duration curve into a distribution representation
having probability masses at a given number (say, m) of discrete points. That is,
we obtain a discrete approximation of the load duration curve where the load
points are $l_1, l_2, \ldots, l_m$ with the corresponding probabilities $r_1, r_2, \ldots, r_m$, where
$r_j = \Pr\{U = l_j\}$. With this approximation, one can evaluate $\bar G_{i-1}(x)$ in (10) as
$$\bar G_{i-1}(x) = \Pr\{U + X_1 + X_2 + \cdots + X_{i-1} > x\} = \sum_{j=1}^{m} r_j \Pr\{X_1 + X_2 + \cdots + X_{i-1} > z_j\}, \tag{20}$$
where $z_j = x - l_j$. The expression $\Pr\{X_1 + X_2 + \cdots + X_{i-1} > z_j\}$ can be evaluated using the formulas given in (19).
It can be seen from (16b) that $\psi'(s)$ is an increasing function of s, and we
have defined $s_0$ to be the root of the equation $\psi'(s) = z$. From (16a), we observe
that $\psi'(0) = \sum_{i=1}^{n} c_i p_i$. Thus, in (20), if $z_j < E[X_1 + X_2 + \cdots + X_{i-1}]$, $s_0(z_j)$ will
be negative. Now consider equation (15). If $s_0 < 0$, the effect of the multiplier $e^{-s x}$
is to amplify the error in the approximation of $V^*(dx)$ for large x, a clearly
undesirable situation. Thus, it appears appropriate in this situation to express
$$\Pr\{X_1 + X_2 + \cdots + X_{i-1} > z_j\} = 1 - \Pr\{X_1 + X_2 + \cdots + X_{i-1} \le z_j\}, \tag{21}$$
and use Esscher's method on the right hand side of (21).
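We define
$$\overline{\mathrm{LOLP}} = \Pr\{X_1 + X_2 + \cdots + X_n \le z\}.$$
Corresponding to (19), we obtain the following approximations for $\overline{\mathrm{LOLP}}$:
$$\overline{\mathrm{LOLP}} \simeq \overline{\mathrm{LOLP}}_1 = e^{\psi(s_0) - s_0 z}\, \bar E_0(u),$$
$$\overline{\mathrm{LOLP}} \simeq \overline{\mathrm{LOLP}}_2 = \overline{\mathrm{LOLP}}_1 \left[ 1 - \frac{\gamma_1}{6} v' \right],$$
$$\overline{\mathrm{LOLP}} \simeq \overline{\mathrm{LOLP}}_3 = \overline{\mathrm{LOLP}}_2 + \overline{\mathrm{LOLP}}_1 \left[ \frac{\gamma_2}{24} u v' + \frac{\gamma_1^2}{72} \left( u^3 v' - \frac{3u}{w'} \right) \right],$$
where
$$\bar E_0(u) = e^{u^2/2}\, \Phi(u), \qquad w' = -\sqrt{2\pi}\, \bar E_0(u), \qquad v' = u^3 - \frac{u^2 - 1}{w'}.$$
For the purpose of evaluating $E(e_i)$ in (10), the integration can be done using
an appropriate numerical integration routine after evaluating $\bar G_{i-1}(x)$ for as many
points as the quadrature formula requires. In the numerical work reported in
Section 4, we used the trapezoidal rule for numerical integration.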
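The combination of a discretized load distribution with trapezoidal integration can be sketched as below. To keep the sketch short and checkable, the outage tail probability is computed exactly by enumeration, as a stand-in for the Esscher formulas, which would be substituted in a realistic study. The sketch also assumes that the expected energy of unit i equals (1 − p_i) times the integral of the equivalent-load survivor function of the first i − 1 units over the capacity range of unit i. All function names and numbers are hypothetical.

```python
from itertools import product


def outage_tail(capacities, fors, z):
    """Exact Pr{sum of unit outages > z} by enumeration (stand-in for Esscher)."""
    total = 0.0
    for state in product((0, 1), repeat=len(capacities)):
        prob, outage = 1.0, 0.0
        for down, c, p in zip(state, capacities, fors):
            prob *= p if down else 1.0 - p
            outage += c if down else 0.0
        if outage > z:
            total += prob
    return total


def g_bar(x, capacities, fors, loads, probs):
    """Survivor function of (load + outages): sum_j r_j Pr{outages > x - l_j}."""
    return sum(r * outage_tail(capacities, fors, x - l) for l, r in zip(loads, probs))


def expected_energy(i, capacities, fors, loads, probs, steps=1000):
    """Trapezoidal estimate of E(e_i) for the i-th unit (1-based) in loading order."""
    lo = sum(capacities[:i - 1])
    hi = lo + capacities[i - 1]
    h = (hi - lo) / steps
    prev_c, prev_p = capacities[:i - 1], fors[:i - 1]
    vals = [g_bar(lo + k * h, prev_c, prev_p, loads, probs) for k in range(steps + 1)]
    integral = h * (0.5 * vals[0] + sum(vals[1:-1]) + 0.5 * vals[-1])
    return (1.0 - fors[i - 1]) * integral


# Hypothetical system: 100 MW unit (FOR 0.1), then 50 MW unit (FOR 0.2), 60 MW load.
caps, fors = [100.0, 50.0], [0.1, 0.2]
print([expected_energy(i, caps, fors, [60.0], [1.0]) / caps[i - 1] for i in (1, 2)])
```

Because the survivor function of a discrete load is a step function, the trapezoidal rule incurs an error of the order of the step size at each jump; the first capacity factor is therefore only close to 0.54, while the second, whose integrand is constant, is reproduced essentially exactly as 0.08.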
4. Numerical results
This section applies the formulas obtained in the preceding section to two
prototype systems. System A is the prototype generating system provided by the
Reliability Test System Task Force of the IEEE Power Engineering Application
of Probabilistic Methods Subcommittee [12]. Table 1 gives the assumed generation mix of the 32 units comprising the system: their installed capacities and
FOR's. Table 2 provides a comparison of the estimated LOLP corresponding to
different values of the system margin obtained with the use of Esscher's approximation formulas (19) and the method of cumulants. For the latter method, we use
the Edgeworth expansion formula keeping terms up to the first four cumulants
only. Usually, such expansions are sufficient to provide close enough approximations in cases where the use of such expansion is appropriate. We also display
in this table the exact LOLP values for benchmarking and comparison. Figure 2
shows the percentage relative errors resulting from using the two approximations
for a wide range of values of the system margin.

Table 1
Unit power ratings for a prototype generating
system, and their assumed FOR's (System A)
Table 2
Comparison of algorithms for LOLP estimation (System A)
Esscher's approximation
Exact a
1.23 ( - 1)
6.21 ( - 2)
2.47 ( - 2)
4.34 ( - 3)
7.91 ( - 4)
4.01 ( - 4 )
1.02 ( - 4)
8.06 ( - 6)
1.58 ( - 6)
4.69 ( - 8)
7.25 ( - 9)
8.43 ( - 10)
9.27 ( - 11)
7.97 ( - 12)
1.35 ( - 1)
7.75 ( - 2)
4.94 ( - 3)
8.51 ( - 4)
1.03 ( - 4)
3.01 ( - 5 )
7.69 ( - 6)
1.69 ( - 6)
5.16 ( - 8)
7.07 ( - 9)
8.29 ( - 10)
6.27 ( - 12)
1.26 ( - 1)
7.45 ( - 2)
5.07 ( - 3)
8.85 ( - 4)
1.07 ( - 4)
3.11 ( - 5 )
7.85 ( - 6)
1.72 ( - 6)
5.23 ( - 8)
7.08 ( - 9)
8.49 ( - 10)
5.54 ( - 12)
1.23 ( - 1)
7.22 ( - 2)
4.03 ( - 2)
2.13 ( - 2)
1.06 ( - 2)
4.95 ( - 3)
2.16 ( - 3)
8.75 ( - 4)
3.24 ( - 4)
1.08 ( - 4)
3.16 ( - 5)
7.99 ( - 6)
1.73 ( - 6)
3.23 ( - 7)
5.21 ( - 8)
6.90 ( - 9)
8.49 ( - 10)
1.45 ( - 10)
1.68 ( - 12)
1.19 ( - 1)
6.30 ( - 2)
3.48 ( - 2)
1.34 ( - 2)
6.82 ( - 3)
2.74 ( - 3)
8.57 ( - 4)
4.07 ( - 5)
6.29 ( - 6)
7.78 ( - 7)
7.77 ( - 8)
6.28 ( - 9)
4.12 ( - 10)
9.63 ( - 13)
3.44 ( - 14)
a Excerpted from [12].
Table 2 and Figure 2 impress one with the accuracy of Esscher's approximation
in the region of our interest, i.e., for values of LOLP in the range between $10^{-3}$
and $10^{-5}$ and beyond. There is very little difference between the three formulas,
and perhaps the formula (19b) represents the overall best choice. The method of
cumulants does not fare too badly in the probability range $10^{-1}$ to $10^{-3}$; but below
this range, the Esscher approximations appear to be decidedly superior to the
method of cumulants. Similar comparisons for several other systems are given in
a research report [9]. The results of this report as well as those given in [14] show
that Esscher's method, while very accurate, is also speedy enough to be adopted
in routine utility practice.
For the purpose of evaluating the accuracy of Esscher's approximation in
providing production costing expressions, we use the data provided by Caramanis
et al. [5] with respect to a second synthetic system, referred to as the EPRI
system D. Tables 3 and 4 give respectively the capacity mix of the system with
the associated FOR's and the load duration curve. Table 5 gives the derived
probability distribution $(l_j, r_j)$ obtained from Table 4. Here, $l_j$ is the interval
midpoint for the j-th load class interval in Table 4, and $r_j$ is the associated
probability mass obtained from differencing. Table 6 gives the estimated capacity
factors using the three versions of Esscher's approximations using the normal,
first and second order Edgeworth expansions. These estimates are compared with
a numerical analytic algorithm (denoted by SC-16), which is considered as an
industry benchmark, and P3, an algorithm based on the method of cumulants.

[Fig. 2. Graph of relative error for the Esscher and cumulants approximations for LOLP (system A). Vertical axis: percentage relative error; Esscher curves: Equations (19a), (19b) and (19c).]

Table 3
EPRI system D. Unit power ratings in loading order
Power rating (MW)
No. of units
Availability = 1 - FOR.
Table 4
EPRI system D. Description of the LDC
Columns: Load (MW); Load duration

Table 5
Discrete version of LDC (EPRI system D)
Load (MW): 12 544, 13 056, 13 568, 14 592, 15 616, 16 640, 17 664, 18 688, 19 200, 19 712, 20 224, 20 736, 22 272, 23 296, 23 808, 24 320, 24 832, 25 344
The latter two algorithms are considered to be the best in their respective
categories by Caramanis et al. [5].
When one regards the values provided by SC-16 as benchmark values, as
Caramanis et al. [5] do, one observes that Esscher's method provides excellent
approximations to the capacity factors for each unit in the loading order of EPRI
System D. In particular, the LD-2 and LD-3 approximations uniformly outperform
the method of cumulants. We conjecture that the performance of the Esscher
approximation will be more convincingly superior to the method of cumulants for
systems with lower unit FOR values.

Table 6
Comparison of algorithms for capacity factors (EPRI system D)
Columns: Esscher's approximation (LD-1, LD-2, LD-3 b); SC-16 a; P3 a
a Excerpted from [5].
b LD-1, LD-2, LD-3: Esscher's approximations using normal and first and second order Edgeworth expansions.
5. Summary and conclusions
Reliability of electrical power supply is of utmost importance to the public. To
ensure adequate and reliable power supply, the electric power industry spends
considerable effort in long-term generation planning. In this connection, several
reliability indexes are used by the power system planners. The loss-of-load probability (LOLP) index for a power generating system measures the probability that
system load exceeds its available capacity. Direct numerical computation of this
index proves infeasible, and one needs to resort to approximate methods. We
adapt an approximation scheme proposed by Esscher in an actuarial context for
evaluating the LOLP index. Numerical results given in this article demonstrate
that this approximation is very accurate.
A second problem considered is estimating the capacity factors of various units
which experience different rates of utilization within the system. These indexes are
used to determine the expected operating costs of an electric utility company. The
computation of these indexes involves difficulties similar to those of LOLP. Here,
also, for a typical system evaluated, Esscher's method provides very accurate
results.

References
[1] Baleriaux, E., Jamoulle, E. and Fr. Linard de Guertechin (1967). Simulation de l'exploitation
d'un parc de machines thermiques de production d'électricité couplé à des stations de
pompage. Revue E (SRBE ed.) 5, 3-24.
[2] Barlow, R. E. and Proschan, F. (1975). Statistical Theory of Reliability and Life Testing Probability
Models. Holt, Rinehart and Winston, New York.
[3] Billinton, R. (1970). Power System Reliability Evaluation. Gordon and Breach, New York.
[4] Billinton, R., Ringlee, R. J. and Wood, A. J. (1973). Power System Reliability Calculations. MIT
Press, Cambridge, MA.
[5] Caramanis, M., Stremmel, J. V., Fleck, W. and Daniel, S. (1983). Probabilistic production
costing. International Journal of Electrical Power and Energy Systems 5, 75-86.
[6] Cramer, H. (1946). Mathematical Methods of Statistics. Princeton University Press, Princeton, NJ.
[7] EEI Equipment Availability Task Force (1976). Report on equipment availability for the
ten-year period, 1966-1975. Edison Electric Institute, New York.
[8] Endrenyi, J. (1978). Reliability Modeling in Electric Power Systems. Wiley, New York.
[9] Electric Power Research Institute (1985). Large-deviation approximation to computation of
generating-system reliability and production costs. EPRI EL-4567, Palo Alto, CA.
[10] Esscher, F. (1932). On the probability function in the collective theory of risk. Skandinavisk
Aktuarietidskrift 15, 175-195.
[11] Feller, W. (1971). An Introduction to Probability Theory and its Applications, Vol. II, 2nd ed.
Wiley, New York.
[12] IEEE reliability test system (1979). A report prepared by the Reliability Test System Task
Force of the Application of Probability Methods Subcommittee. IEEE Transactions on Power
Apparatus and Systems 98, 2047-2064.
[13] Levy, D. J. and Kahn, E. P. (1982). Accuracy of the Edgeworth expansion of LOLP calculations
in small power systems. IEEE Transactions on Power Apparatus and Systems 101, 986-994.
[14] Mazumdar, M. and Gaver, D. P. (1984). On the computation of power-generating system
reliability indexes. Technometrics 26, 173-185.
[15] Stremmel, J. P., Jenkins, R. T., Babb, R. A. and Bayless, W. D. (1980). Production costing using
the cumulant method of representing the equivalent load curve. IEEE Transactions on Power
Apparatus and Systems 99, 1947-1953.
[16] Sullivan, R. L. (1976). Power Systems Planning, McGraw-Hill, New York.
P. R. Krishnaiah and C. R. Rao, eds., Handbook of Statistics, Vol. 7
© Elsevier Science Publishers B.V. (1988) 73-98
Software Reliability Models
Thomas A. Mazzuchi and Nozer D. Singpurwalla
1. Introduction
In the past ten years or so, there has been considerable effort in what has been
termed software reliability modeling. The generally accepted definition of software
reliability is 'the probability of failure-free operation of a computer program in a
specified environment for a specified period of time' (Musa and Okumoto, 1982).
This area has begun to receive much attention for several reasons. Today the
computer is used in many vital areas where a failure could mean costly, even
catastrophic consequences. Due to the recent advances in hardware modeling and
technology, the main cause for computer system failure would be in the software
sector. At the other end of the spectrum, software production is costly and time
consuming, with much of the time and cost being devoted to testing, correcting
and retesting the software. The software producer needs to know the benefits of
testing and must be able to present some tangible evidence of software quality.
The issues concerning the quality and performance of software which are of
interest to the statistician (see Barlow and Singpurwalla, 1985) are:
(1) The quantification and measurement of software reliability.
(2) The assessment of the changes in software reliability over time.
(3) The analysis of software failure data.
(4) The decision of whether to continue or stop testing the software.
The problem of software reliability is different from that of hardware reliability
for several reasons. The cause of software failures is human error, not mechanical
or electrical imperfection, or the wearing of components. Also, once all the errors
are removed, the software is 100% reliable and will continue to be so. Furthermore, unlike hardware errors, there is no process which generates failures. Rather,
software 'bugs' which are in the program due to human error are uncovered by
certain program inputs, and it is these inputs which are randomly generated as
part of some operational environment.
Research supported by Contract N00014-85-K-0202, Project NR 347-128-410, Office of Naval
Research and Grant DAAG 29-84-K-0160, U.S. Army Research Office.
A more formal discussion of the software failure process is given in Musa and
Okumoto (1982). A computer program is a 'set of complete instructions (operations with operands specified) that executes within a single computer some major
function', and undergoes several runs, where a run is associated with 'the accomplishment of a user function'. Each run is characterized by its input variables,
an input variable being 'any data element that exists external to the run and is used by the run
or any externally-initiated interrupt'. The environment of a computer program is
the complete set of input variables for each run and the probability of occurrence
of each input during operation. A failure is 'a departure of program operation
from program requirements' and is usually described in terms of the output
variables, which are 'any data element that exists external to the run and is set by
the run or any interrupt generated by the run and intended for use by another
program'. A fault or bug is the 'defect in implementation that is associated with
a failure'. The 'act or set of acts of omission or commission by an implementor
or implementors that results in a fault' is an error.
For a more in-depth treatment of software terminology the reader is referred to
Musa and Okumoto (1982). For further clarification of types of software errors
and their causes see Amster and Shooman (1975).
Software reliability models may be classified by their attributes (Musa and
Okumoto, 1982; Shanthikumar, 1983) or the phase of the software life cycle
where they may be used (Ramamoorthy and Bastani, 1982). The latter approach
will be used here. There are four main phases of the software life cycle: testing and
debugging phase, validation phase, operational phase, and maintenance phase.
Currently no models exist for use in the maintenance phase and thus this phase
will not be discussed.
2. Models of the testing and debugging phase
In the testing and debugging phase the software is tested for errors. In this
phase an attempt is made to correct any bugs which are discovered. The
discovery of a software bug is a function of its size (the probability that an input
will result in the bug's discovery) and the testing intensity, which reflects the way
in which inputs are selected. Another issue in this phase is the treatment of the
error correction process. The simpler models assume all errors are corrected
with certainty and without introducing new errors, while others account for
imperfect debugging. Models in this phase may be classified into two main
categories: error counting models and non-error counting models. Models may
be further classified by their approach (Bayesian or classical), their treatment of
the effect of error removal on reliability (deterministic or stochastic) and their
consideration of the time it takes to find and fix software bugs.
2.1. Error counting models
Error counting models are based on the assumption that
(A1) The failure rate of the software at any point in time is a function of the
residual number of errors in the program.
Thus effort centers around estimating the residual number of errors and using this
to obtain other reliability measures. Furthermore, deterministic models are based
on the assumption that
(A2) Conditional on the model parameters the correction of an error results in
a known improvement in the reliability of the software.
The simplest, most cited and most criticized model is that of Jelinski-Moranda
(henceforth JM). In addition to (A1) and (A2) this model is based on the
following assumptions:
(JM1) Each undetected error contributes an equal amount to the failure rate of
the software, which is proportional to the number of remaining errors,
(JM2) Conditional on the model parameters the times between successive software
failures are independent,
(JM3) Once discovered, errors are removed in a minimal amount of time without
introducing any new errors.
Given the above, the reliability function for $T_i$, the time between the $(i-1)$st and
$i$th failure, is given by
$$R(t_i \mid \phi, N) = \exp\{-\phi (N - i + 1)\, t_i\}, \qquad i = 1, \ldots, N, \tag{2.1.1}$$
where $N$ is the initial number of software bugs and $\phi$ is the failure rate
contribution of an individual error. This model may be used to make inferences
about the software once estimates of $N$ and $\phi$ are obtained. Given the software
is tested and $n \le N$ software failures have occurred, the parameters $N$ and $\phi$ can
be estimated by maximum likelihood techniques. The estimates are obtained as the
simultaneous solution to
$$\hat\phi = \frac{n}{\hat N T - \sum_{i=1}^{n} (i-1)\, t_i}, \tag{2.1.2a}$$
$$\sum_{i=1}^{n} \frac{1}{\hat N - i + 1} = \frac{n}{\hat N - \frac{1}{T}\sum_{i=1}^{n} (i-1)\, t_i}, \tag{2.1.2b}$$
where $T = \sum_{i=1}^{n} t_i$.
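The JM likelihood equations can be solved numerically. The following is a minimal sketch, not the authors' procedure: it assumes the profile equation in $N$ takes the form $\sum_{i=1}^{n} 1/(N-i+1) = n/(N - c)$ with $c = \sum (i-1)t_i / \sum t_i$, treats $N$ as continuous, and uses plain bisection. The function name `jm_mle` and the convention of returning an infinite estimate when no finite root exists are assumptions of this example.

```python
import math

def jm_mle(t):
    """Return (N_hat, phi_hat) for interfailure times t[0], ..., t[n-1]."""
    n = len(t)
    total = sum(t)                                    # T = sum of t_i
    weighted = sum(i * ti for i, ti in enumerate(t))  # sum of (i - 1) * t_i
    c = weighted / total
    # If the later intervals are not longer on average, the likelihood
    # keeps increasing in N and no finite maximiser exists.
    if c <= (n - 1) / 2:
        return math.inf, 0.0

    def score(N):
        # Profile likelihood equation in N, written as a root-finding problem.
        return sum(1.0 / (N - i) for i in range(n)) - n / (N - c)

    lo, hi = n - 1 + 1e-9, float(n)
    while score(hi) > 0:          # expand until the root is bracketed
        hi *= 2.0
        if hi > 1e12:
            return math.inf, 0.0
    for _ in range(200):          # plain bisection
        mid = 0.5 * (lo + hi)
        if score(mid) > 0:
            lo = mid
        else:
            hi = mid
    N_hat = 0.5 * (lo + hi)
    phi_hat = n / (N_hat * total - weighted)
    return N_hat, phi_hat
```

For example, `jm_mle([1, 2, 3, 4, 5, 6, 7, 8, 9, 10])` gives a finite estimate of $N$ slightly above the number of observed failures, while reversing the same data (shrinking interfailure times) yields an infinite estimate.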
The authors note that the above model assumes equal amounts of testing in all
periods. They suggest normalizing the time scale by using a time dependent
parameter $e(t)$, the exposure rate. This parameter would reflect the testing intensity
at any time. The model could thus be modified by normalizing the time between
failures as
$$t_i^{*} = \int_{t'_{i-1}}^{t'_i} e(u)\, du,$$
where $t'_i$ is the time of the $i$th failure.
Shooman (1972) develops a model similar to JM and further elaborates on the
notion of 'testing intensity'. Shooman suggests treating the number of corrected
software bugs as a continuous function of debugging time, say $\varepsilon(\tau)$. The function
$\varepsilon(\tau)$ would relate the cumulative number of corrected errors per program
instruction to the debugging time. Once $\varepsilon(\tau)$ was established, future software questions,
such as when to stop testing, could be answered. Analogous to (2.1.1) the reliability
function for software which has undergone $\tau$ months of debugging is
$$R(t \mid N, I, \varepsilon(\tau)) = \exp\{-C\,(N/I - \varepsilon(\tau))\, t\}$$
where $I$ is the total number of program instructions and $C$ is an unknown
constant. In Shooman (1973) the function $\varepsilon(\tau)$ is defined simply as 'the total
number of errors corrected by time $\tau$ normalized with respect to $I$'. This
assumption essentially makes the Shooman and JM models different in notation
only. Shooman (1973) and (1975), however, suggest a different technique for
estimating $C$ (and thus $\phi$) and $N$. The debugging process is divided into $k$ intervals
of lengths $H_1, \ldots, H_k$. The end of the $i$th debugging interval is denoted $\tau_i$. In the
$i$th interval $n_i$ failures are recorded but they are not fixed until the end of the
interval. A similar approach is undertaken using the JM model in Lipow (1974).
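The parameters $C$ and $N$ may be obtained by the method of moments by choosing
times $\tau_i < \tau_j$ such that $\varepsilon(\tau_i) < \varepsilon(\tau_j)$ and solving
$$\frac{H_i}{n_i} = \frac{1}{C\,[N/I - \varepsilon(\tau_i)]}, \qquad \frac{H_j}{n_j} = \frac{1}{C\,[N/I - \varepsilon(\tau_j)]},$$
or by using the method of maximum likelihood estimation and solving
$$\hat C = \frac{\sum_{i=1}^{k} n_i}{\sum_{j=1}^{k} [\hat N/I - \varepsilon(\tau_j)]\, H_j}, \qquad \hat C = \frac{\sum_{i=1}^{k} n_i / [\hat N/I - \varepsilon(\tau_i)]}{\sum_{j=1}^{k} H_j}.$$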
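If large sample theory is applicable, asymptotic variances of the MLE's are
obtained as
$$\operatorname{var}(\hat C) \approx \frac{\hat C^{2}}{\sum_{i=1}^{k} n_i}, \qquad \operatorname{var}(\hat N) \approx \frac{I^{2}}{\sum_{i=1}^{k} n_i / [\hat N/I - \varepsilon(\tau_i)]^{2}}.$$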
Shick and Wolverton (1973) actually specify and incorporate a testing intensity
in their modification of the JM model by assuming the failure rate of the software
is a linear function of testing time. The resulting distribution for the interfailure
times is the Rayleigh distribution
$$R(t_i \mid \phi, N) = \exp\{-\phi (N - i + 1)\, t_i^{2}/2\}$$
and the resulting MLE's for $N$ and $\phi$ are obtained by solving
$$\hat N = \frac{2n/\hat\phi + \sum_{i=1}^{n} (i-1)\, t_i^{2}}{\sum_{i=1}^{n} t_i^{2}}, \qquad \hat\phi = \frac{\sum_{i=1}^{n} 2/(\hat N - i + 1)}{\sum_{i=1}^{n} t_i^{2}}.$$
Several alterations of this have appeared. Wagoner (1973) fit a Weibull distribution to software failure data using least squares estimation for parameters. Lipow
(1974) suggested using a linear term which would be a function of the most recent
failure time. Shick and Wolverton (1978) discuss the use of a parabolic function
to model testing intensity. Sukert (1976) also adapted the model to include the
case of more than one failure occurring in a debugging interval.
Musa (1975) was the first to point out that software reliability models should
be based on execution time rather than calendar time. Musa's model is essentially
the same as the JM model but he attempts to model the debugging process in a
more realistic fashion. The model undergoes some alterations in Musa (1979).
Here, the expected net number of corrected software bugs is expressed as an
exponential function of execution time, and the fault correction occurrence rate is
assumed proportional to the failure occurrence rate.
The reliability of software tested for $\tau$ units of execution time is
$R(t) = \exp\{-t/T\}$ where $T$, the mean time to failure (in execution time), is given by
$$T = T_0 \exp\left\{\frac{C\tau}{M_0 T_0}\right\}.$$
In the above, $T_0$ is the mean time to failure before debugging, $M_0$ is the total
number of possible software failures in the maintained life of the software and $C$
is a testing compression factor. The value $T_0$ is further expressed by $T_0 = 1/(fKN_0)$
where $f$ is a ratio of average instruction execution rate to the number of
instructions, called the linear execution frequency, and $K$ is an error exposure ratio
relating error exposure frequency to linear execution frequency. The value $N_0$ is
the initial number of software errors in the program and is related to $M_0$ by
$M_0 = N_0/B$. The parameter $B$ is called the fault reduction factor. This gives the
model the additional characteristic of being able to handle the possibility of more
than one error being found at one time or the possibility of imperfect debugging.
The value $C$ is a ratio relating the rate of failures during testing to that during use.
From the parameter relationships, two central measures are obtained. The
additional number of software errors which needs to be corrected to increase the
mean time to failure for the program from $T_1$ to $T_2$ is given as
$$\Delta m = M_0 T_0 \left(\frac{1}{T_1} - \frac{1}{T_2}\right)$$
and the additional execution time required to increase the mean time to failure
from $T_1$ to $T_2$ is given as
$$\Delta\tau = \frac{M_0 T_0}{C} \log(T_2/T_1).$$
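The two measures can be evaluated directly once the parameters are known. The sketch below assumes the execution-time relation $T = T_0 \exp\{C\tau/(M_0 T_0)\}$ for the mean time to failure; the function names and the numerical values in the usage note are illustrative assumptions, not values from Musa's work.

```python
import math

def musa_mttf(tau, T0, M0, C):
    """Mean time to failure after tau units of execution time."""
    return T0 * math.exp(C * tau / (M0 * T0))

def additional_failures(T1, T2, T0, M0):
    """Errors still to be corrected to move the MTTF from T1 to T2."""
    return M0 * T0 * (1.0 / T1 - 1.0 / T2)

def additional_execution_time(T1, T2, T0, M0, C):
    """Execution time still needed to move the MTTF from T1 to T2."""
    return (M0 * T0 / C) * math.log(T2 / T1)
```

With the hypothetical values `T0 = 10`, `M0 = 100`, `C = 2`, the MTTF values reached after 50 and 150 units of execution time are 100 execution-time units apart according to `additional_execution_time`, which is a quick consistency check of the three relations.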
Musa derives an execution to calendar time conversion by pointing out that
testing time is a function of three limited resources: failure identification
personnel (I), failure correction personnel (F) and computer time (C). The resource
expenditure associated with a change in mean time to failure is approximated by
$$\Delta\chi_k = \theta_k \Delta\tau + \mu_k \Delta m$$
for $k = I, F, C$, where $\Delta\tau$ and $\Delta m$ are the additional execution time needed and
the additional errors corrected to bring about the change and $\theta_k$ and $\mu_k$ are the
average resource expenditure rate per execution time and failure respectively.
Assuming resources remain constant throughout testing, the testing phase may be
divided into three distinct phases. In each phase only one of the resources is
limiting and the other two are not fully utilized. Thus the additional calendar time
required to increase the mean time to failure from $T_1$ to $T_2$ is given as
$$\Delta t = \sum_{k} \frac{M_0 T_0}{P_k \rho_k}\left[\mu_k\left(\frac{1}{T_{k1}} - \frac{1}{T_{k2}}\right) + \frac{\theta_k}{C}\log\left(\frac{T_{k2}}{T_{k1}}\right)\right]$$
where $k = C, F, I$ corresponding to the appropriate resource limiting phase, $P_k$ is
the amount of resource available, $\rho_k$ is the resource utilization factor, and $\theta_k$ and
$\mu_k$ are as previously defined. The quantities $T_{k1}$ and $T_{k2}$ are the mean times to
failure at the boundaries of each resource limiting phase. These boundaries are
at the present and desired mean time to failure and the transition points which
lie in this range. The mean time to failure for a transition point is derived as
$$T_{kk'} = \frac{C\,[P_k \rho_k \mu_{k'} - P_{k'} \rho_{k'} \mu_k]}{P_{k'} \rho_{k'} \theta_k - P_k \rho_k \theta_{k'}}$$
for $(k, k') = (C, F), (F, I), (I, C)$. Musa notes that it is generally true that $\theta_F = 0$
and $\rho_I = 1$ and discusses a method for obtaining $\rho_F$ by treating the failure
correction process as a truncated $M/M/P_F$ queueing model.
Most of the parameters of Musa's model must be obtained from past data on
similar projects. The parameters $M_0$ and $T_0$ (and thus $K$ and $N_0$) may be obtained
by using maximum likelihood techniques. The MLE's are obtained by solving
$$\hat T_0 = \frac{C}{n}\sum_{i=1}^{n}\left(1 - \frac{i-1}{\hat M_0}\right)\tau_i, \qquad \sum_{i=1}^{n}\frac{1}{\hat M_0 - i + 1} = \frac{C}{\hat M_0 \hat T_0}\sum_{i=1}^{n}\tau_i,$$
where $\tau_i$, $i = 1, \ldots, n$, is the execution time between the $(i-1)$st and $i$th failure.
An exact expression for the variance of $\hat T_0$ is obtained as
$$\operatorname{Var}(\hat T_0) = T_0^{2}/n$$
yielding a coefficient of variation of $1/n^{1/2}$. Though an exact expression for the
variance of $\hat M_0$ is not available, confidence bounds for $M_0$ are obtained using
Chebychev's inequality. Based on the distribution of the failure moment statistic
$\gamma = M_0/n - 1/\Delta\psi$, where $\Delta\psi = \psi(M_0 + 1) - \psi(M_0 + 1 - n)$ and $\psi$ is the digamma
function, a $100(1 - \alpha)\%$ confidence interval for $M_0$ is obtained by determining the
values of $M_0$ which correspond to the values of $\gamma$ such that
7= ~ +
A¢ '
1 (A~O' +
(A ~k)2 \ ( A ~k)2
with $\Delta\psi' = \psi'(M_0 + 1) - \psi'(M_0 + 1 - n)$, where $\psi'$ is the trigamma function.
The Musa model was one of the first to suggest that the number of software
failures was governed by a Poisson distribution. Another model which adopted
this approach was the Generalized Poisson Model (GPM) of Angus, Schafer and
Sukert (1980). This model is also based on the JM assumptions but includes the
additional assumption that the severity of the testing process is proportional to
an unknown power of elapsed test time. In the $i$th debugging time interval of
length $H_i$, the number of errors observed, $N_i$, is given by a Poisson distribution with
mean value $E[N_i] = \phi (N - M_{i-1}) H_i^{\alpha}$ where $M_{i-1}$ is the number of errors removed
before the start of the $i$th debugging interval and $\alpha$ is an unknown constant. As
in the debugging scenario of Shooman it is assumed that if bugs are corrected they
are corrected at the conclusion of the debugging interval.
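Parameters $N$, $\phi$ and $\alpha$ may be obtained by solving the maximum likelihood
equations. In Ramamoorthy and Bastani (1982) these are given for the case
$H_i = t_i$, the time between the $(i-1)$st and $i$th failure, as
$$\sum_{i=1}^{n} \frac{1}{N - M_{i-1}} - \sum_{i=1}^{n} \phi\, t_i^{\alpha} = 0,$$
$$\frac{n}{\alpha} + \sum_{i=1}^{n} \log t_i - \sum_{i=1}^{n} \phi (N - M_{i-1})\, t_i^{\alpha} \log t_i = 0,$$
$$\frac{n}{\phi} - \sum_{i=1}^{n} (N - M_{i-1})\, t_i^{\alpha} = 0.$$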
The extra parameter gives the GPM flexibility but also difficulties in terms of
parameter estimation. Once the parameter estimates are obtained they may be
used with the model to draw conclusions regarding the software. One important
expression obtained in Angus, Schafer and Sukert (1980) is the expected time
until the removal of an additional $k \le N - M$ faults given $M$ faults have already
been removed. The expression is
$$\hat\tau_k = \hat\alpha^{-1}\,\Gamma(\hat\alpha^{-1}) \sum_{i=M+1}^{M+k} \{\hat\phi\,(\hat N - i + 1)\}^{-1/\hat\alpha}$$
where $\Gamma(\cdot)$ is the gamma function and $\hat\alpha$, $\hat\phi$, and $\hat N$ are the MLE's of $\alpha$, $\phi$
and $N$. The use of least squares estimates is also discussed by the aforementioned
authors.
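The expected time to remove $k$ further faults under the GPM can be sketched as a sum of Weibull means. The code assumes one fault is removed per failure (so that $M_{i-1} = i - 1$) and an interfailure survival function $\exp\{-\phi (N - i + 1) t^{\alpha}\}$, whose mean is $\Gamma(1 + 1/\alpha)\{\phi (N - i + 1)\}^{-1/\alpha}$; the function name and argument order are illustrative assumptions.

```python
import math

def gpm_expected_time(k, M, N, phi, alpha):
    """Expected time to remove k more faults after M have been removed.

    Assumes one fault is removed per failure, so the i-th interfailure
    time has survival function exp(-phi * (N - i + 1) * t**alpha).
    """
    # Gamma(1 + 1/alpha) equals (1/alpha) * Gamma(1/alpha).
    scale = math.gamma(1.0 + 1.0 / alpha)
    return sum(scale * (phi * (N - i + 1)) ** (-1.0 / alpha)
               for i in range(M + 1, M + k + 1))
```

With `alpha = 1` the expression reduces to the familiar sum of exponential means $1/\{\phi (N - i + 1)\}$, which gives a simple check of the implementation.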
There has been much comparison and criticism of the early models in terms
of their assumptions and their parameter estimation. (See for example Forman
and Singpurwalla (1977), Shick and Wolverton (1978), Forman and Singpurwalla
(1979), Sukert (1979), Musa (1979), Littlewood (1979), Littlewood (1980a), Littlewood (1980b), Angus, Schafer and Sukert (1980), Littlewood (1981a), Littlewood
(1981b), Keiller, Littlewood, Miller and Sofer (1982), Musa and Okumoto (1982),
Ramamoorthy and Bastani (1982), Stefanski (1982), Singpurwalla and Meinhold
(1983), Langberg and Singpurwalla (1985)). The parameter estimation of the JM
model has been most criticized. Forman and Singpurwalla (1977) and (1979),
Littlewood and Verrall (1981) and Joe and Reid (1983) have all illustrated that
the solution of the maximum likelihood equations for the JM model can produce
unreasonably large, even non-finite, estimates for $N$. In Forman and Singpurwalla
(1977) the authors found that when $n$ is small relative to $N$, the likelihood function
of $N$ is very unstable and may not have a finite optimum. Littlewood and Verrall
(1981) found that the estimate of $N$ is finite if and only if
$$\frac{\sum_{i=1}^{n} (i-1)\, t_i}{\sum_{i=1}^{n} (i-1)} > \frac{\sum_{i=1}^{n} t_i}{n}.$$
The authors note that violation of the above implies that no reliability growth is
taking place as a result of the debugging process. In Joe and Reid (1983) the
authors show that $\hat N$ is an unsatisfactory point estimate because its median is
negatively biased and it can be infinite with substantial probability. The authors
advocate the use of likelihood interval estimates.
Forman and Singpurwalla (1977) and (1979) develop an estimation procedure
to insure against unreasonably large estimates. They propose a stopping rule
based on the comparison of the relative likelihood function for $N$ to the
'approximate normal relative likelihood' for $N$:
$$R_{\text{normal}}(N) = \exp\{-\tfrac{1}{2}(N - \hat N)^{2}/\operatorname{Var}(\hat N)\}$$
where
$$\operatorname{Var}(\hat N) = \frac{n}{n \sum_{i=1}^{n} \dfrac{1}{(\hat N - i + 1)^{2}} - \left(\sum_{i=1}^{n} \dfrac{1}{\hat N - i + 1}\right)^{2}}.$$
The above function may be used to give an indication of the appropriateness of
the large sample theory for estimating $N$. When appropriate, plots of the relative
likelihood function and that of $R_{\text{normal}}(N)$ compare favorably. Thus to get a
meaningful estimate of $N$, the authors suggest the following stopping rule. After
testing the software to $n$ failures:
1. Compute $\hat N$, the MLE of $N$, using (2.1.2a) and (2.1.2b).
2. If $\hat N \approx n$ go to step 3; if not, continue testing until another failure occurs and
go to step 1.
3. Compute the relative likelihood function for $N$ and compare it with
$R_{\text{normal}}(N)$. If plots of the two functions display a large discrepancy, this estimate
is misleading. Continue testing until another failure occurs, then go to step 1. If
the plots are in good agreement, stop testing.
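The comparison in step 3 can be sketched numerically. The code below assumes that the "relative likelihood function for $N$" is the JM likelihood maximised over $\phi$ for each $N$ and divided by its value at $\hat N$ (a profile likelihood); that reading, and all function names, are assumptions of this example rather than details confirmed by the text.

```python
import math

def jm_profile_loglik(N, t):
    """JM log-likelihood at N with phi replaced by its maximiser for that N."""
    n = len(t)
    total = sum(t)
    weighted = sum(i * ti for i, ti in enumerate(t))  # sum of (i - 1) * t_i
    phi = n / (N * total - weighted)
    return sum(math.log(N - i) for i in range(n)) + n * math.log(phi) - n

def relative_likelihood(N, N_hat, t):
    return math.exp(jm_profile_loglik(N, t) - jm_profile_loglik(N_hat, t))

def var_N_hat(N_hat, n):
    """Large-sample variance of N_hat from the JM information matrix."""
    s1 = sum(1.0 / (N_hat - i) for i in range(n))
    s2 = sum(1.0 / (N_hat - i) ** 2 for i in range(n))
    return n / (n * s2 - s1 ** 2)

def normal_relative_likelihood(N, N_hat, n):
    return math.exp(-0.5 * (N - N_hat) ** 2 / var_N_hat(N_hat, n))
```

Evaluating both functions on a grid of $N$ values around $\hat N$ and plotting them side by side gives the visual comparison the stopping rule calls for.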
Furthermore, if the large sample theory appears appropriate, then inference
concerning $N$ (and in an analogous manner $\phi$) may be obtained using the normal
approximation.
Meinhold and Singpurwalla (1983) suggest the adoption of the Bayesian point
of view when considering the likelihood function of the JM model. In so doing,
the conclusion to be obtained from ridiculous parameter estimates is that it is the
method of inference, specifically maximum likelihood estimation, rather than the
model that needs to be questioned. A Bayesian approach to inference on $N$ and
$\phi$ is discussed.
Goel and Okumoto (1979) treat the cumulative number of software failures by
time $t$, $N(t)$, as a nonhomogeneous Poisson process with mean value function
$$m(t) = a(1 - e^{-bt})$$
where the unknown constants $a$ and $b$ represent the expected number of failures
eventually discovered and the occurrence rate of an individual error respectively.
Thus for any $t \ge 0$
$$\Pr\{N(t) = n \mid a, b\} = \frac{[a(1 - e^{-bt})]^{n}}{n!} \exp\{-a(1 - e^{-bt})\} = \text{poim}(n : a(1 - e^{-bt})), \qquad n = 0, 1, 2, \ldots.$$
From (2.1.23) the distribution for the total error content is $\text{poim}(n : a)$ and the
conditional distribution of the number of remaining errors at time $t'$,
$\bar N(t') = N(\infty) - N(t')$, is
$$\Pr\{\bar N(t') = n' \mid N(t') = n, a, b\} = \text{poim}(n' : a e^{-bt'}), \qquad n' = 0, 1, 2, \ldots.$$
The reliability function for the interfailure time $T_i$ is given by
$$R(t_i \mid t'_{i-1}, a, b) = \exp\{-a[e^{-b t'_{i-1}} - e^{-b(t'_{i-1} + t_i)}]\}$$
where $t'_i = \sum_{j=1}^{i} t_j$ is the time until the $i$th failure. Thus in contrast to (JM2),
software interfailure times are not independent. Also note that due to this
dependence, the Goel-Okumoto model is of the stochastic type.
Estimators of $a$ and $b$ are obtained via the solution of the maximum likelihood
equations
$$\frac{n}{a} = 1 - \exp\{-b\, t'_n\},$$
$$\frac{n}{b} = \sum_{k=1}^{n} t'_k + a\, t'_n \exp\{-b\, t'_n\}.$$
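A minimal numerical sketch of these two equations follows. It eliminates $a$ using $a = n/(1 - e^{-b t'_n})$ and solves the remaining equation in $b$ by bisection. The existence check $\sum t'_k < n t'_n / 2$ comes from the small-$b$ behaviour of that equation and is an assumption of this sketch, as is the function name `goel_okumoto_mle`.

```python
import math

def goel_okumoto_mle(s):
    """s: increasing cumulative failure times t'_1, ..., t'_n.

    Returns (a_hat, b_hat), or None when the data show no reliability
    growth and the likelihood equations have no positive root in b.
    """
    n, s_n, total = len(s), s[-1], sum(s)
    if total >= n * s_n / 2:
        return None

    def f(b):
        # n/b - sum(t'_k) - a * t'_n * exp(-b t'_n), with a eliminated.
        return n / b - total - n * s_n / math.expm1(b * s_n)

    lo, hi = 1e-8 / s_n, 1.0 / s_n
    while f(hi) > 0:              # expand until the root is bracketed
        hi *= 2.0
    for _ in range(200):          # plain bisection
        mid = 0.5 * (lo + hi)
        if f(mid) > 0:
            lo = mid
        else:
            hi = mid
    b = 0.5 * (lo + hi)
    a = n / -math.expm1(-b * s_n)
    return a, b
```

For cumulative failure times such as `[1, 3, 6, 10, 15, 21, 28, 36, 45, 55]` the routine returns an estimate of $a$ larger than the ten failures observed, as expected when reliability growth is present.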
A $100(1 - \alpha)\%$ confidence region for $a$ and $b$ may be established using the approximation
$$L(\hat a, \hat b \mid t'_1, \ldots, t'_n) - L(a, b \mid t'_1, \ldots, t'_n) \approx \tfrac{1}{2}\chi^{2}_{2;\alpha},$$
where $L$ denotes the log-likelihood function.
Goel and Okumoto (1980) also discuss the use of the asymptotic normality of
$\hat a$ and $\hat b$ for constructing confidence intervals. Here, model results are based
on execution rather than calendar time. This approach represents an extension of
the basic model derived in Schneidewind (1975) and is itself extended in Shanthikumar (1981) using a nonhomogeneous Markov process.
A combination of the Musa model and Goel and Okumoto model is given in
Musa and Okumoto (1984). This model incorporates use of execution time with
the analytical ease of the nonhomogeneous Poisson process. Furthermore, the
authors define the failure intensity in such a way as to reflect the fact that errors
with larger size are found earlier. If $\lambda_0$ and $\theta$ are the initial failure intensity and
the rate of reduction in the normalized failure intensity per failure, the failure
intensity is defined in terms of execution time as
$$\lambda(\tau) = \lambda_0 e^{-\theta m(\tau)}$$
where $m(\tau)$ is the mean value function for $N(\tau)$. Given the above, the mean value
function is given by
$$m(\tau) = \frac{1}{\theta}\log(\lambda_0 \theta \tau + 1)$$
and the distribution for $N(\tau)$ is given by $\text{poim}(n : (1/\theta)\log(\lambda_0 \theta \tau + 1))$. Expressions
analogous to (2.1.22)-(2.1.25) are obtained by substituting $(1/\theta)\log(\lambda_0 \theta \tau + 1)$ for
$a(1 - e^{-bt})$. Musa and Okumoto obtain further functions of interest by exploiting
the relationship between the time until the $i$th failure, $T'_i$, and the number of failures
in a given time. Using this notion
$$P\{T'_i \le \tau\} = \sum_{j=i}^{\infty} \frac{[m(\tau)]^{j}}{j!} e^{-m(\tau)}$$
and
$$P\{T'_i \le \tau \mid N(\tau_e) = n_e\} = \sum_{j=i}^{\infty} \frac{[m(\tau) - m(\tau_e)]^{j - n_e}}{(j - n_e)!} e^{-[m(\tau) - m(\tau_e)]}$$
where $T'_i = \sum_{j=1}^{i} T_j$ is the time of the $i$th software failure.
Maximum likelihood estimation is discussed for both cases where failure times
and number of failures are used. The complexity of the estimation procedure is
reduced by estimating the parameter $\phi = \lambda_0 \theta$ and solving for $\lambda_0$ and $\theta$ by choosing
the mean number of failures equal to the number of software failures encountered.
When the software is tested for a time $\tau_e$ and $n$ failures are recorded at times
$\tau'_1, \ldots, \tau'_n$, $\hat\phi$ may be obtained by solving
$$\frac{n}{\hat\phi} - \sum_{i=1}^{n} \frac{\tau'_i}{\hat\phi \tau'_i + 1} - \frac{n \tau_e}{(\hat\phi \tau_e + 1)\log(\hat\phi \tau_e + 1)} = 0.$$
Given $\hat\phi$, estimates $\hat\lambda_0$ and $\hat\theta$ may be obtained by setting
$m(\tau_e) = (1/\hat\theta)\ln[\hat\phi \tau_e + 1] = n$, thus $\hat\theta = (1/n)\ln[\hat\phi \tau_e + 1]$ and
$\hat\lambda_0 = \hat\phi/\hat\theta$. When the
software is tested over an interval $[0, x_p]$ and this is partitioned into intervals
$(0, x_1], (x_1, x_2], \ldots, (x_{p-1}, x_p]$ with $n'_i$ denoting the number of failures recorded
in $(0, x_i]$, $i = 1, \ldots, p$, then the maximum likelihood equation for $\phi$ is given as
$$\sum_{i=1}^{p} n_i\, \frac{\dfrac{x_i}{\hat\phi x_i + 1} - \dfrac{x_{i-1}}{\hat\phi x_{i-1} + 1}}{\log(\hat\phi x_i + 1) - \log(\hat\phi x_{i-1} + 1)} - \frac{n'_p\, x_p}{(\hat\phi x_p + 1)\log(\hat\phi x_p + 1)} = 0$$
where $n_i = n'_i - n'_{i-1}$ (with $x_0 = 0$ and $n'_0 = 0$). Again using the same approach as
before, $\hat\theta$ and $\hat\lambda_0$ may be obtained as $\hat\theta = (1/n'_p)\log[\hat\phi x_p + 1]$ and
$\hat\lambda_0 = \hat\phi/\hat\theta$.
In the above model, times are in terms of execution time rather than calendar
time. The conversion to calendar time follows the developments in Musa (1975).
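The failure-time version of the Musa-Okumoto estimation can be sketched as follows. The code assumes the likelihood equation $n/\phi - \sum_i \tau'_i/(\phi\tau'_i + 1) - n\tau_e/\{(\phi\tau_e + 1)\log(\phi\tau_e + 1)\} = 0$ for $\phi = \lambda_0\theta$ and solves it by bisection; the early exit when $\sum \tau'_i \ge n\tau_e/2$ (no apparent reliability growth) and the function name are assumptions of this sketch.

```python
import math

def musa_okumoto_mle(times, tau_e):
    """times: cumulative failure times in (0, tau_e]. Returns (lambda0, theta) or None."""
    n = len(times)
    if sum(times) >= n * tau_e / 2:
        return None               # sketch gives up: no sign change near phi = 0

    def f(phi):
        return (n / phi
                - sum(x / (phi * x + 1.0) for x in times)
                - n * tau_e / ((phi * tau_e + 1.0) * math.log1p(phi * tau_e)))

    lo, hi = 1e-8 / tau_e, 1.0 / tau_e
    while f(hi) > 0:              # expand until the root is bracketed
        hi *= 2.0
    for _ in range(200):          # plain bisection
        mid = 0.5 * (lo + hi)
        if f(mid) > 0:
            lo = mid
        else:
            hi = mid
    phi = 0.5 * (lo + hi)
    theta = math.log1p(phi * tau_e) / n   # forces m(tau_e) = n
    return phi / theta, theta
```

By construction the fitted mean value function $(1/\hat\theta)\log(\hat\lambda_0\hat\theta\tau_e + 1)$ equals the observed number of failures at $\tau_e$.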
The Musa (1975) model was also one of the first models to address the notion
of imperfect debugging. Goel and Okumoto (1979) suggested the use of a Markov
process to model imperfect debugging. Kremer (1982) uses a multidimensional
birth-death process to account for imperfect debugging and the introduction of
new errors as the result of debugging. Kremer (K) begins by assuming that the
failure rate of the software is a product of its fault content and an exposure rate,
h(t). To account for imperfect debugging he further assumes
(K) When a failure occurs, the repair effort is instantaneous and results in one
of three mutually exclusive outcomes
(i) the fault content is reduced by 1 with probability p;
(ii) the fault content remains unchanged with probability q;
(iii) the fault content is increased by 1 with probability r.
Thus the author defines a birth-death process with birth rate $rh(t)$ and death rate
$ph(t)$.
A multidimensional process is defined with $X(t)$ denoting the fault content of
the software at time $t$ and $N(t)$ the number of failures to time $t$. Though reliability
measures are obtained from $N(t)$, the failure rate of the software is a function of
$X(t)$, which is changing in a stochastic manner.
Given the initial fault content of $N$, the expected number of faults in the
program by time $t$ is
$$E[X(t) \mid N, p, r] = N e^{-\rho(t)}$$
where $\rho(t) = (p - r)\int_0^t h(u)\, du$ and the expected number of failures by time $t$ is
$$E[N(t) \mid N, p, r] = \begin{cases} \dfrac{N}{p - r}\,[1 - e^{-\rho(t)}], & p \ne r, \\[2ex] N \displaystyle\int_0^t h(u)\, du, & p = r. \end{cases}$$
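The two expectations are easy to evaluate when the exposure rate is constant. The sketch below assumes $h(t) = h$, so that $\rho(t) = (p - r)ht$; the constant exposure rate, the function names and the numbers in the usage note are illustrative assumptions.

```python
import math

def expected_faults(t, N, p, r, h):
    """E[X(t)] for a constant exposure rate h."""
    return N * math.exp(-(p - r) * h * t)

def expected_failures(t, N, p, r, h):
    """E[N(t)] for a constant exposure rate h."""
    if p == r:
        return N * h * t
    return N * (1.0 - math.exp(-(p - r) * h * t)) / (p - r)
```

For instance, with `N = 20`, `p = 0.9` and `r = 0.05` the expected number of failures approaches `20 / 0.85` as `t` grows, so imperfect repair inflates the lifetime failure count above the initial fault content.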
Thus in the life of the software (if $p > r$) the expected number of failures will be
$N/(p - r)$. Thus $p - r$ is similar to Musa's constant $B$. Given $n$ failures obtained
by time $t_0$, conditional expectations may be obtained as
$$E[X(t_0 + t) \mid N, p, r, N(t_0) = n] = [N - (p - r)n]\, e^{-\rho(t_0, t)}$$
where $\rho(t_0, t) = \rho(t_0 + t) - \rho(t_0)$. Using (2.1.36) the conditional expectation for the
number of failures in $(t_0, t_0 + t]$ is
$$E[N(t_0 + t) - N(t_0) \mid N, p, r, N(t_0) = n] = \begin{cases} \dfrac{N - (p - r)n}{p - r}\,[1 - e^{-\rho(t_0, t)}], & p \ne r, \\[2ex] N \displaystyle\int_{t_0}^{t_0 + t} h(u)\, du, & p = r. \end{cases}$$
The distribution of the fault content, $P_m(t) = P\{X(t) = m\}$, is obtained as
$$P_0(t) = [\alpha(t)]^{N},$$
$$P_m(t) = \sum_{j=0}^{\min(N, m)} \binom{N}{j}\binom{N + m - j - 1}{N - 1} (\alpha(t))^{N-j} (\beta(t))^{m-j} (1 - \alpha(t) - \beta(t))^{j},$$
where
$$\alpha(t) = 1 - \frac{1}{e^{\rho(t)} + A(t)}, \qquad \beta(t) = 1 - \frac{e^{\rho(t)}}{e^{\rho(t)} + A(t)}, \qquad A(t) = \int_0^t r\, h(u)\, e^{\rho(u)}\, du.$$
From these, the reliability of a program tested for $t_0$ units of time may be obtained as
$$R(t \mid N, p, r) = \sum_{m=0}^{\infty} P_m(t_0)\,[S_{t_0}(t)]^{m}$$
where
$$S_{t_0}(t) = \exp\left\{-\int_{t_0}^{t_0 + t} h(u)\, du\right\}$$
is the reliability attribute of each remaining fault. Given $n$ failures by time $t_0$ the
reliability may be expressed as
$$R(t \mid N, p, r, N(t_0) = n) = \sum_{m} P_m(t_0)\,[S_{t_0}(t)]^{N-m}$$
where $P_m(t_0) = P\{X(t_0) = N - m \mid N(t_0) = n\}$ and is given by
$$P_m(t_0) = \sum_{\substack{i + j + k = n \\ i - k = m}} \frac{n!}{i!\, j!\, k!}\, p^{i} q^{j} r^{k}.$$
This model is dependent on the parameters N, p, q, r and h(t). Maximum
likelihood estimates may be used for N, p, q, r and the parameters of h(t). The
amount of data required and the accuracy of the estimates have not been
investigated. Estimates of p, q and r could be obtained from experience or best
prior guesses. The author also suggests a Bayesian approach for estimating h(t),
which closely resembles that pursued in Littlewood (1981).
The models of Goel and Okumoto (1979) and Musa and Okumoto (1984)
represent a step towards a Bayesian analysis of the problem. In Singpurwalla and
Kyparisis (1984) a fully Bayesian approach is taken using the nonhomogeneous
Poisson process with failure intensity function $\lambda(t) = (\beta/\alpha)(t/\alpha)^{\beta - 1}$ for $t \ge 0$.
Due to the resemblance of $\lambda(t)$ to the failure rate function of the Weibull
distribution, the model is referred to as the Weibull process. Thus $N(t)$ again is
assumed to be a nonhomogeneous Poisson process with mean value function
$m(t) = (t/\alpha)^{\beta}$. In the true Bayesian context uncertainty concerning $\alpha$ and $\beta$ is
expressed by their respective prior densities
$$g_0(\alpha) = \frac{1}{\alpha_0}, \qquad 0 < \alpha \le \alpha_0,$$
$$g_0(\beta) = \frac{\Gamma(k_1 + k_2)}{\Gamma(k_1)\Gamma(k_2)}\, \frac{(\beta - \beta_1)^{k_1 - 1}(\beta_2 - \beta)^{k_2 - 1}}{(\beta_2 - \beta_1)^{k_1 + k_2 - 1}}, \qquad 0 \le \beta_1 < \beta < \beta_2;\ k_1, k_2 > 0.$$
For convenience it is assumed that the prior distributions for $\alpha$ and $\beta$ are
independent. Posterior inference concerning the number of future failures in an
interval or the time until the next failure may be obtained once the posterior
distributions of $\alpha$ and $\beta$ are computed. The posterior distribution of $\beta$ is of interest
in its own right as it may be used to assess the extent of reliability growth.
Reliability growth would be taking place if $\beta \in (0, 1)$; by observing the posterior
density one may examine the extent to which this is true.
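For fixed parameter values the Weibull process is straightforward to evaluate. The sketch below computes the intensity, the mean value function and the Poisson probability of a given number of failures in an interval, conditional on known $\alpha$ and $\beta$; it does not carry out the Bayesian averaging over the priors, and the function names are illustrative assumptions.

```python
import math

def intensity(t, alpha, beta):
    """Failure intensity (beta/alpha) * (t/alpha)**(beta - 1)."""
    return (beta / alpha) * (t / alpha) ** (beta - 1)

def mean_value(t, alpha, beta):
    """Mean value function m(t) = (t/alpha)**beta."""
    return (t / alpha) ** beta

def prob_failures(n, x1, x2, alpha, beta):
    """P{n failures in (x1, x2]} given alpha and beta."""
    mu = mean_value(x2, alpha, beta) - mean_value(x1, alpha, beta)
    return mu ** n * math.exp(-mu) / math.factorial(n)
```

With `beta = 0.5` the intensity at `t = 4` is lower than at `t = 1`, which is the reliability growth case; with `beta = 1` the process reduces to a homogeneous Poisson process with rate `1 / alpha`.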
Posterior analysis is conducted for both the case where only the number of
failures per interval is recorded and the case where the actual failure times are recorded. In
both cases the posterior distributions of $\alpha$ and $\beta$ are intractable. An approximation
is given for the posterior of $\beta$. Due to the intractability of the posterior
distributions of $\alpha$ and $\beta$, posterior inference concerning the number of failures in
future intervals and the time to the next failure is conducted numerically via a computer
code described in Kyparisis, Soyer and Daryanani (1984).
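When only the number failed in each interval is recorded over a period $[0, x_p]$,
the posterior distribution of $N_k$, the number of failures in $(x_{k-1}, x_k]$, $k = p + 1,
p + 2, p + 3, \ldots$, is given by
$$\Pr\{N_k = n_k \mid n_1, \ldots, n_p\} = \int_{\beta_1}^{\beta_2}\int_0^{\alpha_0} \frac{[m(x_k) - m(x_{k-1})]^{n_k}}{n_k!} \exp\{-[m(x_k) - m(x_{k-1})]\}\, g_1(\alpha, \beta \mid n_1, \ldots, n_p)\, d\alpha\, d\beta$$
where $g_1(\alpha, \beta \mid n_1, \ldots, n_p)$ is the joint posterior density of $\alpha$ and $\beta$. The approximate
marginal posterior density of $\beta$ is obtained as
$$g_1(\beta \mid n_1, \ldots, n_p) \propto (\beta - \beta_1)^{k_1 - 1}(\beta_2 - \beta)^{k_2 - 1}\, \frac{\Gamma(n - 1/\beta)}{\beta\, S(\beta)^{n - 1/\beta}} \prod_{i=1}^{p} [x_i^{\beta} - x_{i-1}^{\beta}]^{n_i}$$
where $S(\beta) = \sum_{i=1}^{p} (x_i^{\beta} - x_{i-1}^{\beta})$ and $n = \sum_{i=1}^{p} n_i$. The approximate posterior
distribution for $\beta$ is based on the approximation
$$\int_0^{\alpha_0} \alpha^{-n\beta} \exp\{-S(\beta)\, \alpha^{-\beta}\}\, d\alpha \approx \frac{\Gamma(n - 1/\beta)}{\beta\, S(\beta)^{n - 1/\beta}}$$
which works well if $\alpha_0 \ge S(\beta)^{1/\beta}$.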
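When the software is tested over a period $(0, T)$ and failure times
$t'_1 \le t'_2 \le \cdots \le t'_n$ are recorded, then the joint posterior distribution of $\alpha$ and $\beta$ is
given by
$$g_2(\alpha, \beta \mid t'_1, \ldots, t'_n) \propto (\beta - \beta_1)^{k_1 - 1}(\beta_2 - \beta)^{k_2 - 1} \left(\frac{\beta}{\alpha}\right)^{n} \prod_{i=1}^{n} \left(\frac{t'_i}{\alpha}\right)^{\beta - 1} \exp\{-(T/\alpha)^{\beta}\}$$
and the marginal posterior of $\beta$ is given by
$$g_2(\beta \mid t'_1, \ldots, t'_n) \propto (\beta - \beta_1)^{k_1 - 1}(\beta_2 - \beta)^{k_2 - 1}\, \beta^{n - 1}\, \Gamma(n - 1/\beta)\, T^{-n\beta} \prod_{i=1}^{n} (t'_i)^{\beta - 1}$$
using an approximation similar to (2.1.44) which works well provided $\alpha_0 \ge t'_n$.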
Posterior inference concerning the number of failures in future intervals may be
obtained using (2.1.42) in conjunction with (2.1.45). Posterior inference concerning
$Z_k$, the time to the $(n + k)$th failure from $t'_n$, is obtained by noting that,
given $\alpha$ and $\beta$, the transformed failure times $(t'_1/\alpha)^{\beta}, (t'_2/\alpha)^{\beta}, \ldots$ can be viewed as being generated
from a homogeneous Poisson process. The posterior conditional distribution of $Z_k$
given $t'_n$ is obtained from
$$\Pr\{Z_k \le z \mid t'_1, \ldots, t'_n\} = \int_{\beta_1}^{\beta_2}\int_0^{\alpha_0}\int_0^{[(t'_n + z)/\alpha]^{\beta} - (t'_n/\alpha)^{\beta}} \frac{v^{k-1} e^{-v}}{(k - 1)!}\, dv\; g_2(\alpha, \beta \mid t'_1, \ldots, t'_n)\, d\alpha\, d\beta.$$
Littlewood (1980) also initiates a Bayesian approach to error counting, but
expresses uncertainty of the software's performance through $\lambda_i$, the failure rate of
the software given $i - 1$ failures have occurred. This Littlewood model embraces
the assumptions of the JM model except for (JM1). Arguing that errors with the
largest size (and thus greater failure contribution) will be discovered first, Littlewood
instead views $\lambda_i = \phi_1 + \phi_2 + \cdots + \phi_{N-i+1}$ where $\phi_j$ is the failure contribution
of the $j$th remaining error. Uncertainty about the $\phi_j$ is expressed via the
prior distribution
$$g(\phi) = \frac{\beta^{\alpha} \phi^{\alpha - 1} e^{-\beta\phi}}{\Gamma(\alpha)}, \qquad \phi \ge 0,$$
which is denoted $\phi \sim G(\alpha, \beta)$.
Because the uncertainty is the same for all $\phi_j$, $j = 1, \ldots, N$, initially, the prior
distributions will all be identical. The failure contribution for an error which has
not been observed by the $(i - 1)$st failure is given by $\phi \sim G(\alpha, \beta + t'_{i-1})$ where, as
usual, $t'_{i-1}$ is the time of the $(i - 1)$st failure. Thus the uncertainty about the
failure rate of the software after the $(i - 1)$st failure is expressed via
$\lambda_i \sim G((N - i + 1)\alpha, \beta + t'_{i-1})$. The reliability function of $T_i$ may be expressed as
$$R(t_i \mid \alpha, \beta) = \left[\frac{\beta + t'_{i-1}}{\beta + t'_{i-1} + t_i}\right]^{(N - i + 1)\alpha},$$
a Pareto distribution. Unlike the exponential distribution the Pareto distribution
permits the possibility of very large error free intervals. Also it is interesting to
note that the failure rate function given by
$$\lambda(t_i) = \frac{(N - i + 1)\alpha}{\beta + t'_{i-1} + t_i}$$
displays a decreasing failure rate and this property can be shown to be independent of the prior distribution for the $\phi_j$.
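The Pareto reliability and its failure rate can be checked with a few lines of code. The sketch assumes the form $R(t_i) = [(\beta + t'_{i-1})/(\beta + t'_{i-1} + t_i)]^{(N-i+1)\alpha}$, with the failure rate obtained as $-\,d\log R/dt_i$, which carries the factor $\alpha$; argument names such as `s_prev` for $t'_{i-1}$ are illustrative assumptions.

```python
def reliability(t, i, s_prev, N, alpha, beta):
    """P{T_i > t} when the (i-1)st failure occurred at cumulative time s_prev."""
    return ((beta + s_prev) / (beta + s_prev + t)) ** ((N - i + 1) * alpha)

def hazard(t, i, s_prev, N, alpha, beta):
    """Failure rate of T_i at t, i.e. minus the derivative of log reliability."""
    return (N - i + 1) * alpha / (beta + s_prev + t)
```

The hazard decreases as `t` grows within an interval: the longer the software runs without failing, the stronger the evidence that the remaining errors have small failure contributions.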
Littlewood discusses the use of (2.1.48) and (2.1.49) in determining other
reliability measures. The author suggests the use of maximum likelihood estimation
(similar to that used in the JM model) in order to obtain estimates of $N$, $\alpha$, and
$\beta$. A purely Bayesian approach would determine the parameters from elicited prior
knowledge.
All models thus far have ignored the time required to find and correct software
errors. While this keeps the model derivation simple, it may not be adequate and
does not enable the measurement of an important reliability parameter, availability.
Shooman and Trivedi (1976) introduced the use of the Markov Process to
account for the time to find and correct software bugs in large software systems.
The thrust of this analysis is to estimate availability rather than reliability. In Kim,
Kim, and Park (1982) (KKP) this model is developed and extended. As with the
JM model it is assumed that the failure rate of the software is directly proportional
to the number of errors and that each error contributes an equal amount to the
failure rate. To account for the debugging process the following additional assumption is made
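(KKP) When a failure occurs, errors are corrected perfectly with rate $\mu_0$, or are
corrected but with the addition of a new error with rate $\mu_1$.
Given the above assumptions the solutions of the differential difference equations for
$p_n(t) = P\{N(t) = n\}$ when the computer is up, and $q_n(t) = P\{N(t) = n\}$ when the
computer is down, are given by
$$p_N(t) = \frac{A_N + \mu_0 + \mu_1}{A_N - B_N}\, e^{A_N t} + \frac{B_N + \mu_0 + \mu_1}{B_N - A_N}\, e^{B_N t},$$
$$p_{N-k}(t) = (\phi\mu_0)^{k}\, \frac{N!}{(N-k)!} \sum_{j=0}^{k} \left\{ \frac{(A_{N-j} + \mu_0 + \mu_1)\, e^{A_{N-j} t}}{\prod_{i=0,\, i \ne j}^{k} (A_{N-j} - A_{N-i}) \prod_{i=0}^{k} (A_{N-j} - B_{N-i})} + \frac{(B_{N-j} + \mu_0 + \mu_1)\, e^{B_{N-j} t}}{\prod_{i=0}^{k} (B_{N-j} - A_{N-i}) \prod_{i=0,\, i \ne j}^{k} (B_{N-j} - B_{N-i})} \right\}, \qquad k = 1, \ldots, N,$$
$$q_{N-k}(t) = (N-k)\,\phi\,(\phi\mu_0)^{k}\, \frac{N!}{(N-k)!} \sum_{j=0}^{k} \left\{ \frac{e^{A_{N-j} t} - e^{-(\mu_0 + \mu_1) t}}{\prod_{i=0,\, i \ne j}^{k} (A_{N-j} - A_{N-i}) \prod_{i=0}^{k} (A_{N-j} - B_{N-i})} + \frac{e^{B_{N-j} t} - e^{-(\mu_0 + \mu_1) t}}{\prod_{i=0}^{k} (B_{N-j} - A_{N-i}) \prod_{i=0,\, i \ne j}^{k} (B_{N-j} - B_{N-i})} \right\}, \qquad k = 0, 1, \ldots, N,$$
where
$$A_{N-k},\, B_{N-k} = \tfrac{1}{2}\left\{ -[\mu_0 + \mu_1 + (N-k)\phi] \pm \sqrt{[\mu_0 + \mu_1 + (N-k)\phi]^{2} - 4(N-k)\phi\mu_0} \right\}.$$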
Once estimates of $N$, $\phi$, $\mu_0$ and $\mu_1$ are obtained the availability of the system
is given by $\sum_{k=0}^{N} p_{N-k}(t)$. The authors specify no means for estimating the
parameters; however, $N$ and $\phi$ could be estimated using methods applied to the
JM model, while $\mu_0$ and $\mu_1$ could be estimated from past experience or from
correction times.
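The first term of the availability sum can be sketched in code. The example assumes that, with $n$ errors present, the up and down states form a two-state block whose characteristic roots solve $s^{2} + (\mu_0 + \mu_1 + n\phi)s + n\phi\mu_0 = 0$, that $A$ denotes the root with the plus sign, and that the system starts up with all $N$ errors present. The function names are illustrative assumptions.

```python
import math

def kkp_roots(n_err, phi, mu0, mu1):
    """Roots (A, B) of s^2 + (mu0 + mu1 + n*phi) s + n*phi*mu0 = 0."""
    s = mu0 + mu1 + n_err * phi
    disc = math.sqrt(s * s - 4.0 * n_err * phi * mu0)
    return 0.5 * (-s + disc), 0.5 * (-s - disc)

def p_all_errors_up(t, N, phi, mu0, mu1):
    """Probability the system is up at time t with all N errors still present."""
    A, B = kkp_roots(N, phi, mu0, mu1)
    mu = mu0 + mu1
    return ((A + mu) * math.exp(A * t) - (B + mu) * math.exp(B * t)) / (A - B)
```

The discriminant is never negative because $(\mu_0 + \mu_1 + n\phi)^{2} - 4n\phi\mu_0 \ge (\mu_0 - n\phi)^{2}$; the two roots coincide only in the special case $\mu_1 = 0$ and $\mu_0 = n\phi$, which this sketch does not handle.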
2.2. Non-error counting models
Non-error counting models are not designed to provide estimates of the number
of residual failures but only provide estimates of the effects of the residual errors
on software reliability. Deterministic models are represented by the Halden
Project model (Dahil and Lahti (1978)) and a modification of the JM model called
the Jelinski-Moranda Geometric De-Eutrophication model presented in Moranda
(1975) and (1979). This model was designed to handle the case where groups of
errors are removed at one time, but can also be used to account for the case
where larger size errors are removed first, as in Littlewood (1980) and Musa and
Okumoto (1984). The model assumes that $\lambda_i = D k^{i-1}$ where $D$ is the initial
detection rate, and $k$ is the ratio of the $i$th to the $(i - 1)$st failure rate. These
parameters may be estimated from the maximum likelihood equations
$$\hat D = \frac{n \hat k}{\sum_{i=1}^{n} \hat k^{i}\, t_i}, \qquad \frac{\sum_{i=1}^{n} i\, \hat k^{i}\, t_i}{\sum_{i=1}^{n} \hat k^{i}\, t_i} = \frac{n + 1}{2}.$$
Moranda also suggests using this formulation in conjunction with the nonhomogeneous Poisson process. Sukert (1977) generalizes the model to include more
than one failure per debugging interval.
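The geometric model's likelihood equations reduce to a one-dimensional root search in $k$. The sketch below assumes exponential interfailure times with rates $\lambda_i = D k^{i-1}$, solves $\sum i k^{i} t_i / \sum k^{i} t_i = (n + 1)/2$ by bisection (the left side is a weighted mean of $i$ that increases with $k$), and then recovers $D$; the function name and search bounds are assumptions of this example.

```python
def geometric_mle(t):
    """Return (D_hat, k_hat) for interfailure times t[0], ..., t[n-1]."""
    n = len(t)

    def h(k):
        w = [k ** i * ti for i, ti in enumerate(t, start=1)]
        mean_index = sum(i * wi for i, wi in enumerate(w, start=1)) / sum(w)
        return mean_index - (n + 1) / 2

    lo, hi = 1e-6, 1.0
    while h(hi) < 0:              # k > 1 would indicate reliability decay
        hi *= 2.0
    for _ in range(200):          # plain bisection
        mid = 0.5 * (lo + hi)
        if h(mid) < 0:
            lo = mid
        else:
            hi = mid
    k = 0.5 * (lo + hi)
    D = n * k / sum(k ** i * ti for i, ti in enumerate(t, start=1))
    return D, k
```

For steadily lengthening interfailure times such as `[1, 2, ..., 10]` the estimate of $k$ falls below one, so the fitted failure rates $D k^{i-1}$ decrease from one failure to the next.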
In Littlewood and Verrall (1973) a stochastic Bayesian model is presented. In
this approach the authors attempt to model the debugging behavior of the
programmer or programmers involved. As each error is encountered, it is the
intent of the programmer to correct the error and thus increase the reliability of the
software. Though this is always the intent it is not always achieved. Often new
errors are created which reduce the reliability of the software. To model this
situation in a Bayesian context, Littlewood suggests expressing the uncertainty
about $\lambda_i$ by assuming a priori that $\lambda_i \sim G(\alpha, \psi(i))$ where $\psi(i)$ is an increasing
function indicating the complexity of the program and quality of the programmer.
Defining $\psi(i)$ as an increasing function of $i$ incorporates the assumption that the
programmer's intent is always to improve the software's reliability since
$\psi(i) > \psi(i - 1)$ implies that
$$P\{\lambda_i < l\} \ge P\{\lambda_{i-1} < l\}$$
for $l > 0$. The above implies that the $\lambda_i$ are stochastically ordered.
Combining the usual assumption that given $\lambda_i$ the variables $T_i$, $i = 1, \ldots, n$, are
independent exponential random variables, with the prior distributions for $\lambda_i$ the
posterior reliability for $T_i$ can be obtained as
$$R(t_i \mid \alpha, \psi(i)) = \left[\frac{\psi(i)}{\psi(i) + t_i}\right]^{\alpha}$$
which is a Pareto distribution.
The authors suggest trying several parametric families for $\psi(i)$, notably
$\psi(i) = \beta_0 + \beta_1 i$ and $\psi(i) = \beta_0 + \beta_1 i^{2}$. The authors do discuss the possibility of
using a prior distribution for $\alpha$, but Littlewood (1980) suggests maximum likelihood
estimation for the model parameters, thus making this model a hybrid approach.
Further analysis along the lines of modeling the stochastic ordering of the $\lambda_i$ is
pursued in Ramamoorthy and Bastani (1980). Specifically these models are
referred to as the mixed gamma model and the stochastic input domain model.
Bayesian time series analysis is used to assess software reliability growth and
other reliability parameters in Horigome, Singpurwalla and Soyer (1984) (HSS)
and Singpurwalla and Soyer (1985). The authors assume a power law relationship
between $T_i$ and $T_{i-1}$, where $T_i$ is defined as the failure time at the $i$th testing stage
(note that if a testing stage consists of testing to the first system failure, then $T_i$ is as
previously defined). The relationship assumed is
$$T_i = T_{i-1}^{\theta_i}\,\delta_i \qquad (2.2.4)$$
where $\theta_i$ reflects the effects of the changes made as a result of the $(i-1)$st stage
of testing and $\delta_i$ is an error term to account for uncertainty. Note that reliability
growth will have taken place as a result of changes made in the $(i-1)$st stage
of testing if $\theta_i > 1$; $\theta_i = 1$ indicates no improvement and $\theta_i < 1$ indicates reliability
decay.
The model is developed based on the following assumptions:
(HSS1) The variables $T_i$, $i = 1, \ldots, n$, are lognormally distributed with $T_i \ge 1$
assumed for all $i$.
(HSS2) The values $\delta_i$, $i = 1, \ldots, n$, are lognormally distributed with known
parameters 0 and $\sigma_1^2$.
(HSS3) The quantities $\theta_i$, $i = 1, \ldots, n$, are exchangeable and are distributed
according to some distribution $G$ with density $g$.
Taking the logarithm of both sides of (2.2.4) yields
$$Y_i = \theta_i Y_{i-1} + \varepsilon_i \qquad (2.2.5)$$
where $Y_i = \log T_i$ and $\varepsilon_i = \log \delta_i$ are normally distributed, the latter with mean 0
and variance $\sigma_1^2$. The sequence $\{Y_i\}$ is thus given by a first order autoregressive
process with a random coefficient $\theta_i$.
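By assuming further that $\theta_i \sim N(\lambda, \sigma_2^2)$, where $\sigma_2^2$ is known, and $\lambda \sim N(\mu, \sigma_3^2)$
with $\mu$ and $\sigma_3^2$ known, the following posterior results are obtained:
$$(\lambda \mid y_1, \ldots, y_n) \sim N(\mu_n, \sigma_n^2)$$
with
$$\mu_n = \sigma_n^2\left[\frac{\mu}{\sigma_3^2} + \sum_{i=1}^{n}\frac{y_i y_{i-1}}{W_{i-1}}\right], \qquad \sigma_n^2 = \left[\frac{1}{\sigma_3^2} + \sum_{i=1}^{n}\frac{y_{i-1}^2}{W_{i-1}}\right]^{-1}, \qquad W_{i-1} = \sigma_2^2 y_{i-1}^2 + \sigma_1^2;$$
$$(\theta_n \mid y_1, \ldots, y_n) \sim N\left(\frac{\sigma_1^2\mu_n + \sigma_2^2 y_n y_{n-1}}{W_{n-1}},\; \frac{\sigma_1^2\,(W_{n-1}\sigma_2^2 + \sigma_n^2\sigma_1^2)}{W_{n-1}^2}\right);$$
$$(Y_{n+1} \mid y_1, \ldots, y_n) \sim N(\mu_n y_n,\; \sigma_n^2 y_n^2 + W_n);$$
$$(\theta_{n+1} \mid y_1, \ldots, y_n) \sim N(\mu_n,\; \sigma_n^2 + \sigma_2^2).$$
Note that $\sigma_2^2$ reflects the views about the consistency of policies regarding modifications and design changes made. Using the above, posterior inference can be
obtained for any relevant quantity. For example, Bayes probability intervals can
be constructed for the next failure time, or reliability growth at each stage can be
assessed by plotting $E[\theta_i \mid y_1, \ldots, y_i]$ vs. $i$. Overall, reliability growth can be
examined via $E[\lambda \mid y_1, \ldots, y_i]$, $i = 1, \ldots, n$. In Singpurwalla and Soyer (1985) this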
basic model is extended by assuming various dependence structures for the
sequence {0;}. Three additional models are developed using the structure of the
Kalman Filter Model.
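As an illustration of the basic (non-extended) model, the posterior of $\lambda$ and the predictive distribution of the next log failure time can be computed in a few lines. The sketch below assumes the normal hierarchy described above: $Y_i = \theta_i Y_{i-1} + \varepsilon_i$ with $\varepsilon_i \sim N(0, \sigma_1^2)$, $\theta_i \sim N(\lambda, \sigma_2^2)$ and $\lambda \sim N(\mu, \sigma_3^2)$, so that $Y_i$ given $\lambda$ and $y_{i-1}$ is normal with mean $\lambda y_{i-1}$ and variance $W_{i-1} = \sigma_2^2 y_{i-1}^2 + \sigma_1^2$. The function names are hypothetical, and treating the first element of the input list as the starting value $y_0$ is an assumption of this sketch.

```python
def hss_posterior(y, mu, var1, var2, var3):
    """Posterior mean and variance of lambda given y = [y_0, y_1, ..., y_n].

    var1, var2, var3 stand for sigma_1^2, sigma_2^2 and sigma_3^2.
    """
    precision = 1.0 / var3      # prior precision of lambda
    weighted_sum = mu / var3    # prior mean weighted by prior precision
    for prev, cur in zip(y, y[1:]):
        w = var2 * prev ** 2 + var1          # W_{i-1}
        precision += prev ** 2 / w
        weighted_sum += cur * prev / w
    var_n = 1.0 / precision
    mu_n = var_n * weighted_sum
    return mu_n, var_n


def predict_next_log_time(y, mu_n, var_n, var1, var2):
    """Mean and variance of Y_{n+1} given the data, with lambda integrated out."""
    y_n = y[-1]
    w_n = var2 * y_n ** 2 + var1             # W_n
    return mu_n * y_n, var_n * y_n ** 2 + w_n


if __name__ == "__main__":
    log_times = [1.0, 1.3, 1.2, 1.8, 2.1]    # hypothetical log failure times
    mu_n, var_n = hss_posterior(log_times, mu=1.0, var1=0.25, var2=0.04, var3=1.0)
    print(mu_n, var_n)
    print(predict_next_log_time(log_times, mu_n, var_n, var1=0.25, var2=0.04))
```

A posterior mean of $\lambda$ above 1 would be read as evidence of overall reliability growth, in line with the interpretation of $\theta_i > 1$.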
2.3. Model unification
Though highly criticized, the JM model remains central to the topic of software
reliability. Langberg and Singpurwalla (1985) provide an alternative motivation for
the JM model using shock models. Stefanski (1982) provides another motivation
for the JM model using renewal theoretic arguments. Both works allude to the
centrality of the model. Langberg and Singpurwalla further provide a unification
of software reliability models by illustrating that many other well known models
such as Littlewood-Verrall (1973) and Goel and Okumoto (1979) can be obtained
by specifying prior distributions for the parameters of the JM model. Extensions
to the basic Bayes model and a discussion of the use of posterior modes as
point estimates are given in Jewell (1985).
3. Models of the validation phase
When a decision is made to stop testing the software (see Forman and
Singpurwalla, 1977; Okumoto and Goel, 1979, 1980; Krten and Levy, 1980;
Shanthikumar and Tufekci, 1981, 1983; Koch and Kubat, 1983; Chow and
Schechner, 1985, for decision criteria), the software enters the validation phase.
In this phase the software undergoes intensive testing in its operational environment with a goal of obtaining some measurement of its reliability. Software errors
are not corrected in this phase and, in fact, a software failure could result in the
rejection of the software.
Nelson (1978) introduced a simple reliability estimate based on probabilistic
laws. Letting $e_r$ denote the size of the remaining errors in the program and noting
that errors are not removed, the number of runs until a software failure is a
geometric random variable with parameter $e_r$. Thus the maximum likelihood
estimate of $e_r$ can be used to determine an estimate of reliability. This is given by
$$\hat{R} = 1 - n_f/n$$
where n is the total number of sample runs and nf is the number of sample runs
which ended in failure.
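The estimate is simple enough to state as code. The following is a minimal sketch; the function name and the example counts are hypothetical.

```python
def nelson_reliability(n_runs, n_failures):
    """Nelson's estimate R = 1 - n_f / n from n sample runs with n_f failures."""
    if n_runs <= 0:
        raise ValueError("n_runs must be positive")
    if not 0 <= n_failures <= n_runs:
        raise ValueError("n_failures must lie between 0 and n_runs")
    return 1.0 - n_failures / n_runs


if __name__ == "__main__":
    # Hypothetical validation campaign: 2000 runs, 3 of which ended in failure.
    print(nelson_reliability(2000, 3))
```

With very few observed failures the estimate changes noticeably with each additional failure, which is one way to see why a large number of sample runs is needed.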
The above model suffers from several drawbacks (Ramamoorthy and Bastani,
1982) stemming from its simplicity.
(1) A large number of sample runs is required to obtain meaningful estimates.
(2) The model is based on the assumption that inputs are randomly selected
from the input domain and thus does not consider the correlation of runs from
adjacent segments of the input domain.
(3) The model does not consider any measure of complexity of the program.
Extensions to the basic model have attempted to reduce the number of sample
runs by specifying equivalence classes for the input domain (Nelson, 1978;
Ramamoorthy and Bastani, 1979). This goal is achieved at the cost of an increase
in model complexity.
Crow and Singpurwalla (1984) address the issue of correlation of inputs using
a Fourier series model. The authors observe that in many cases software failures
occur in clusters and thus the usual assumption that the times between failures
are independent may not be valid. Rather they assume that the time between
failures is given by
$$T_i = f(i) + \varepsilon_i$$
where $\varepsilon_i$ is a disturbance term with mean 0 and constant variance and $f(i)$ is some
cyclical trend. To identify the cyclical pattern (if any) with which failures occur,
the authors fit the Fourier series model
$$f(i) = \alpha_0 + \sum_{j=1}^{q}\left[\alpha(k_j)\cos\frac{2\pi}{n}k_j i + \beta(k_j)\sin\frac{2\pi}{n}k_j i\right]$$
where $n$ (the number of observed times between failures) is assumed odd,
$q = (n-1)/2$ and $k_j = j$, $j = 1, \ldots, q$. Using the method of least squares the model
parameters are obtained as
$$\hat{\alpha}_0 = \frac{1}{n}\sum_{i=1}^{n} t_i, \qquad \hat{\alpha}(k_j) = \frac{2}{n}\sum_{i=1}^{n} t_i\cos\frac{2\pi}{n}k_j i, \qquad \hat{\beta}(k_j) = \frac{2}{n}\sum_{i=1}^{n} t_i\sin\frac{2\pi}{n}k_j i, \qquad j = 1, \ldots, q.$$
The spectrogram is used to identify the period of the series, and thus the clustering
behavior. A parsimonious model may also be obtained by using only those
weights $\hat{\alpha}(k_j)$ and $\hat{\beta}(k_j)$ for which $\rho^2(k_j) = \hat{\alpha}^2(k_j) + \hat{\beta}^2(k_j)$ is large.
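The least squares fit and the selection of dominant frequencies can be sketched as follows. This is an illustration under the assumptions stated above ($n$ odd, $k_j = j$, frequencies $2\pi k_j/n$), not the authors' own program; the function names and the test series are hypothetical.

```python
import math


def fourier_fit(t):
    """Least squares Fourier coefficients for times between failures t_1..t_n."""
    n = len(t)
    if n % 2 == 0:
        raise ValueError("the model assumes an odd number of observations")
    q = (n - 1) // 2
    alpha0 = sum(t) / n
    alpha, beta = [], []
    for j in range(1, q + 1):
        w = 2.0 * math.pi * j / n
        alpha.append(2.0 / n * sum(ti * math.cos(w * i) for i, ti in enumerate(t, start=1)))
        beta.append(2.0 / n * sum(ti * math.sin(w * i) for i, ti in enumerate(t, start=1)))
    return alpha0, alpha, beta


def spectrum(alpha, beta):
    """rho^2(k_j) = alpha(k_j)^2 + beta(k_j)^2 for j = 1..q."""
    return [a * a + b * b for a, b in zip(alpha, beta)]


def fitted_value(i, n, alpha0, alpha, beta, keep=None):
    """f(i) using all frequencies, or only the 1-based indices j listed in keep."""
    total = alpha0
    for j, (a, b) in enumerate(zip(alpha, beta), start=1):
        if keep is None or j in keep:
            w = 2.0 * math.pi * j / n
            total += a * math.cos(w * i) + b * math.sin(w * i)
    return total


if __name__ == "__main__":
    times = [12.0, 3.0, 4.0, 15.0, 2.0, 5.0, 14.0, 3.0, 6.0]  # hypothetical data
    a0, a, b = fourier_fit(times)
    rho2 = spectrum(a, b)
    dominant = max(range(1, len(rho2) + 1), key=lambda j: rho2[j - 1])
    print(dominant, [fitted_value(i, len(times), a0, a, b, keep={dominant})
                     for i in range(1, len(times) + 1)])
```

Keeping all $q$ frequencies reproduces the data exactly, since the model then has $n$ parameters; the parsimonious version keeps only the frequencies with large $\rho^2(k_j)$.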
This model was applied to three sets of failure data from each of two software
systems. The model was found to adequately represent the failure behavior. One
potential problem of the model is that, because $\alpha(k_j)$ and $\beta(k_j)$ multiply
trigonometric functions, negative values of $f(i)$ may be produced. When such
is the case, the authors interpret this as an implication of a very small time
between failures.
Though the intent of the authors in this paper is data analysis, the model can
be used to predict future times between failures and future failure clusters. Also, by
specifying a distributional form for $\varepsilon_i$ (such as the usual normal assumption),
inference can be made.
4. Models of the operational phase
Models in this phase are used to illustrate the behavior of the software in its
operating environment. Both Littlewood (1979) and Cheung (1980) obtain the
software reliability by assuming the software program is divided into modules.
Cheung suggests a combination of deterministic properties of the structure of
the software with the stochastic properties of module failure behavior, via a
Markov process. He assumes
(C1) Reliabilities of the modules are independent.
(C2) Transfer of control among program modules is a Markov process.
(C3) The program begins and ends with a single module, denoted $N_1$ and $N_n$
respectively.
The state space is divided into $N_1, \ldots, N_n$, $C$, $F$ where the $N_i$ are the modules, $C$
indicates successful completion, and $F$ indicates an encountered failure. States $C$
and $F$ are absorbing. Transition probabilities from $N_i$ to $N_j$ ($i \ne j$) are given by
$R_i p_{ij}$ where $R_i$ is the reliability of module $i$ and $p_{ij}$ is the usual transition
probability from module $i$ to module $j$. The transition probability from $N_i$ to $F$ is
$1 - R_i$ and the transition probability from $N_n$ to $C$ is given by $R_n$. Thus the
reliability of the software is obtained as the probability of being absorbed into
state $C$ given that the initial state is $N_1$. This is obtained as
$$R = S(1, n)\,R_n$$
where $S(i, j)$ is the $(i, j)$th entry in the matrix $S = (I - Q)^{-1}$ and $Q$ is the
transition matrix of the process with the rows and columns of $C$ and $F$ deleted.
The module reliabilities $R_i$ may be determined before system integration by
techniques of Section 2 or 3. Transition probabilities may be estimated by running
test cases. Cheung further discusses the use of this model in determining testing
strategies and expected error cost of the software. The latter may be used in place
of system reliability in determining the acceptance of the software.
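The absorption probability in Cheung's model can be computed without forming the full inverse: the last column of $S = (I - Q)^{-1}$ solves $(I - Q)s = e_n$, and the system reliability is its first entry times $R_n$. The sketch below uses a small hand-written Gauss-Jordan solver to stay self-contained; the function names and the three-module example are hypothetical.

```python
def solve_linear(a, b):
    """Solve a x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("matrix is singular")
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                factor = m[r][col] / m[col][col]
                for c in range(col, n + 1):
                    m[r][c] -= factor * m[col][c]
    return [m[i][n] / m[i][i] for i in range(n)]


def cheung_reliability(module_reliability, p):
    """System reliability R = S(1, n) * R_n with Q[i][j] = R_i * p[i][j]."""
    n = len(module_reliability)
    i_minus_q = [[(1.0 if i == j else 0.0) - module_reliability[i] * p[i][j]
                  for j in range(n)] for i in range(n)]
    e_n = [0.0] * (n - 1) + [1.0]
    last_column_of_s = solve_linear(i_minus_q, e_n)   # column n of (I - Q)^{-1}
    return last_column_of_s[0] * module_reliability[-1]


if __name__ == "__main__":
    # Hypothetical program: module 1 calls module 2 or 3 with equal probability,
    # module 2 always passes control to module 3, and module 3 is the exit module.
    reliabilities = [0.99, 0.95, 0.90]
    transfer = [[0.0, 0.5, 0.5],
                [0.0, 0.0, 1.0],
                [0.0, 0.0, 0.0]]
    print(cheung_reliability(reliabilities, transfer))
```

For an acyclic control graph such as this one, the result equals the sum over execution paths of the product of the module reliabilities and transfer probabilities along each path; the matrix form also handles loops between modules.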
Littlewood (1979) assumes a semi-Markov process and takes into account the
time spent in each module. The model further incorporates two sources of failure:
within-module failure with rate $\nu_i$, $i = 1, \ldots, n$, and failure associated with the
transfer from module $i$ to module $j$, which is given with rate $\lambda_{ij}$ ($i \ne j$). Assuming
that these individual failure rates are small in comparison to the switching rates
between modules, Littlewood states that the failure point process of the integrated
program is asymptotically a Poisson process with rate parameter
$$\frac{\sum_{i,j}\Pi_i p_{ij}\,(\mu_{ij}\nu_i + \lambda_{ij})}{\sum_{i,j}\Pi_i p_{ij}\,\mu_{ij}}.$$
In the above $\Pi = (\Pi_1, \ldots, \Pi_n)$ is the equilibrium vector of the imbedded Markov
chain, and $\mu_{ij}$ is the expected sojourn time in module $i$ before transferring to
module $j$. An estimate of overall program availability is given as
$$\frac{\sum_{i,j}\Pi_i p_{ij}\,\mu_{ij}}{\sum_{i,j}\Pi_i p_{ij}\,[\mu_{ij} + \mu_{ij}\nu_i m_i + \lambda_{ij} m_{ij}]}$$
where $m_i$ and $m_{ij}$ are the expected downtime due to failure in module $i$ and due
to transfer from module $i$ to module $j$, respectively.
As with Cheung's model, individual module failure rates can be obtained before
interfacing takes place and all other parameter values may be estimated from test
cases or experience with similar programs. Estimation of expected costs of failures
is also discussed by Littlewood.
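Littlewood's asymptotic failure rate and availability are weighted averages over module-to-module transfers, with weights $\Pi_i p_{ij}$. A minimal sketch, assuming the equilibrium vector of the imbedded chain has already been computed and is passed in; the function names and all numerical values are hypothetical.

```python
def _transfers(pi, p):
    """Yield (i, j, weight) with weight = pi_i * p_ij for every transfer i != j."""
    n = len(pi)
    for i in range(n):
        for j in range(n):
            if i != j:
                yield i, j, pi[i] * p[i][j]


def littlewood_failure_rate(pi, p, mu, nu, lam):
    """Asymptotic Poisson failure rate of the integrated program."""
    numerator = sum(w * (mu[i][j] * nu[i] + lam[i][j]) for i, j, w in _transfers(pi, p))
    denominator = sum(w * mu[i][j] for i, j, w in _transfers(pi, p))
    return numerator / denominator


def littlewood_availability(pi, p, mu, nu, lam, m_module, m_transfer):
    """Availability estimate with expected downtimes m_i and m_ij."""
    up = sum(w * mu[i][j] for i, j, w in _transfers(pi, p))
    total = sum(w * (mu[i][j] + mu[i][j] * nu[i] * m_module[i] + lam[i][j] * m_transfer[i][j])
                for i, j, w in _transfers(pi, p))
    return up / total


if __name__ == "__main__":
    # Hypothetical two-module program that alternates between its modules.
    pi = [0.5, 0.5]
    p = [[0.0, 1.0], [1.0, 0.0]]
    mu = [[0.0, 2.0], [4.0, 0.0]]        # expected sojourn times
    nu = [0.01, 0.02]                    # within-module failure rates
    lam = [[0.0, 0.001], [0.002, 0.0]]   # transfer failure parameters
    print(littlewood_failure_rate(pi, p, mu, nu, lam))
    print(littlewood_availability(pi, p, mu, nu, lam, [1.0, 1.0], [[0.0, 1.0], [1.0, 0.0]]))
```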
5. Closing comments
Though there is a large body of literature on software reliability (see Schick and
Wolverton, 1978; Ramamoorthy and Bastani, 1982; Shanthikumar, 1983), several
issues remain. First, there is a lack of models for the validation, operational and
maintenance phases of the software. Additional models are needed to address such
issues as software design and testing criteria for release of software. Furthermore,
the vast number of models for the testing and development phase has left the user
somewhat confused. Criteria for comparison and selection of software models
need to be developed, as is done initially in Musa and Okumoto (1982), Keiller,
Littlewood, Miller and Sofer (1982), Iannino, Musa, Okumoto and Littlewood
(1984), and Soyer and Singpurwalla (1985).
References
Amster, S. J. and Shooman, M. L. (1975). Software reliability: An overview. In: R. E. Barlow, J. B.
Fussell and N. D. Singpurwalla, eds., Reliability and Fault Tree Analysis: Theoretical and Applied
Aspects of System Reliability and Safety Assessment. SIAM, Philadelphia, PA, 455-485.
Angus, J. E., Schafer, R. E. and Sukert, A. (1980). Software reliability model validation. Proceedings
of the 1980 Annual Reliability and Maintainability Symposium, 191-198.
Barlow, R. E. and Singpurwalla, N. D. (1985). Assessing the reliability of computer software and
computer networks: An opportunity for partnership with computer scientists. The American
Statistician 39, 88-94.
Cheung, R. C. (1980). A user-oriented software reliability model. IEEE Transactions on Software
Engineering 6, 118-125.
Chow, C. and Schechner, Z. (1985). On simple statistical stopping rules for software debugging
processes. Technical Report. Columbia University.
Crow, L. H. and Singpurwalla, N. D. (1984). An empirically developed Fourier series model for
describing software failures. IEEE Transactions on Reliability 33, 176-183.
Dahil, D. and Lahti, J. (1978). Investigation of methods for production and verification of computer
programs with high requirements for reliability. OECD Halden Reactor Project Preliminary
Forman, E. H. and Singpurwalla, N. D. (1977). An empirical stopping rule for debugging and testing
computer software. Journal of the American Statistical Association 72, 750-757.
Forman, E. H. and Singpurwalla, N. D. (1979). Optimal time intervals for testing hypotheses on
computer software errors. IEEE Transactions on Reliability 28, 250-253.
Goel, A. L. (1980). Software error detection model with application. The Journal of Systems and
Software 1, 243-249.
Goel, A. L. (1980). A summary of the discussion on 'An analysis of competing software reliability
models'. IEEE Transactions on Software Engineering 6, 501-502.
Goel, A. L. and Okumoto, K. (1979). Time-dependent error-detection rate model for software
reliability and other performance measures. IEEE Transactions on Reliability 28, 206-211.
Goel, A. L. and Okumoto, K. (1979). A Markovian model for reliability and other performance
measures. Proceedings of the National Computer Conference, 769-774.
Horigome, M., Singpurwalla, N. D. and Soyer, R. (1984). A Bayes empirical Bayes approach for
(software) reliability growth. In: L. Billard, ed., Computer Science and Statistics: Proceedings of the
16th Symposium on the Interface. North-Holland, Amsterdam, 45-56.
Iannino, A., Musa, J. D., Okumoto, K. and Littlewood, B. (1984). Criteria for Software Reliability
Model Comparisons. IEEE Transactions on Software Engineering 10, 687-691.
Jelinski, Z. and Moranda, P. (1972). Software reliability research. In W. Freiberger, ed., Statistical
Computer Performance Evaluation. New York, Academic Press, 465-484.
Jewell, W. S. (1985). Bayesian extensions to a basic model of software reliability. Technical Report,
Operations Research Center, University of California in Berkeley.
Joe and Reid (1983). Estimating the number of faults in a system. Submitted to JASA.
Keiller, P. A., Littlewood, B., Miller, D. R. and Sofer, A. (1982). On the quality of software reliability
prediction. In: J. K. Skwirzynski, ed., Electronic Systems Effectiveness and Life Cycle Costing.
Springer, New York, 441-460.
Kim, J. H., Kim, Y. H. and Park, C. J. (1982). A modified Markov model for the estimation of
computer software performance. Operations Research Letters 1, 253-257.
Koch, H. S. and Kubat, P. (1983). Optimal Release Time of Computer Software. IEEE Transactions
on Software Engineering 9, 323-327.
Kremer, W. (1983). Birth-death and bug counting. IEEE Transactions on Reliability 32, 37-46.
Krten, O. J. and Levy, J. (1980). Software modeling from optimal field energy. Proceedings of the
Annual Reliability and Maintainability Symposium, 410-414.
Kyparisis, J. and Singpurwalla, N. D. (1984). Bayesian inference for the Weibull process. In: L.
Billard, ed., Computer Science and Statistics: Proceedings of the 16th Symposium on the Interface.
North-Holland, Amsterdam, 57-64.
Kyparisis, J., Soyer, R. and Daryanani, S. (1984). Computer programs for inference from the Weibull
process. Institute for Reliability and Risk Analysis Technical Report, The George Washington
University, Washington, DC.
Langberg, N. and Singpurwalla, N. D. (1985). Unification of some software reliability models via the
Bayesian approach. SIAM Journal on Scientific and Statistical Computing 6, 781-790.
Lipow, M. (1974). Some variations of a model for software time-to-failure. Correspondence ML-742260.1, TRW Systems Group.
Littlewood, B. (1979). How to measure software reliability and how not to. IEEE Transactions on
Reliability 28, 103-110.
Littlewood, B. (1979). Software reliability model for modular program structure. IEEE Transactions
on Reliability 28, 241-246.
Littlewood, B. (1980). The Littlewood-Verrall model for software reliability compared with some
rivals. The Journal of Systems and Software 1, 251-258.
Littlewood, B. (1980). Theories of software reliability: How good are they and how can they be
improved. IEEE Transactions on Software Engineering 6, 489-500.
Littlewood, B. (1981). A critique of the Jelinski-Moranda model for software reliability. Proceedings
of the 1981 Annual Reliability and Maintainability Symposium, 357-362.
Littlewood, B. (1981). Stochastic reliability growth: a model for fault-removal in computer-programs
and hardware design. IEEE Transactions on Reliability 30, 313-320.
Littlewood, B. and Verrall, J. L. (1973). A Bayesian reliability growth model for computer software.
Applied Statistics 22, 332-346.
Littlewood, B. and Verrall, J. L. (1981). Likelihood function of a debugging model for computer
software reliability. IEEE Transactions on Reliability 30, 145-148.
Meinhold, R. J. and Singpurwalla, N. D. (1983). Bayesian analysis of a commonly used model for
describing software failures. The American Statistician 32, 168-173.
Moranda, P. B. (1975). Prediction of software reliability during debugging. Proceedings of the 1975
Annual Reliability and Maintainability Symposium, 327-332.
Moranda, P. B. (1979). Event-altered rate models for general reliability analysis. IEEE Transactions
on Reliability 28, 376-381.
Musa, J. D. (1975). A theory of software reliability and its application. IEEE Transactions on Software
Engineering 1, 312-327.
Musa, J. D. (1979). Validity of execution-time theory of software reliability. IEEE Transactions on
Reliability 28, 181-191.
Musa, J. D. and Okumoto, K. (1982). Software reliability models: Concepts, classification, comparisons, and practice. In: J. K. Skwirzynski, ed., Electronic Systems Effectiveness and Life Cycle Costing.
Springer, New York, 395-423.
Musa, J. D. and Okumoto, K. (1984). A logarithmic Poisson execution time model for software
reliability measurement. Proceedings of the 1984 Reliability and Maintainability Symposium.
Nelson, E. (1978). Estimating software reliability from test data. Microelectron. Reliab. 17, 67-74.
Okumoto, K. and Goel, A. L. (1979). Optimal release time for software systems. Proceedings of
COMPSAC, 500-503.
Okumoto, K. and Goel, A. L. (1980). Optimal release time for software systems based on reliability
and cost criteria. Journal of Systems and Software 1, 315-318.
Petroski, C. M. (1984). A survey of software reliability. Student Report, The George Washington
Ramamoorthy, C. V. and Bastani, F. B. (1979). An input domain based approach to the quantitative
estimation of software reliability. Proceedings of the Taipei Seminar on Software Engineering, Taipei,
Ramamoorthy, C. V. and Bastani, F. B. (1980). Modeling the software reliability growth process.
Proceedings of COMPSAC, Chicago, IL, 161-169.
Ramamoorthy, C. V. and Bastani, F. B. (1982). Software reliability-status and perspectives. IEEE
Transactions on Software Engineering 8, 354-371.
Schick, G. J. and Wolverton, R. W. (1978). An analysis of competing software reliability models.
IEEE Transactions on Software Engineering 4, 104-120.
Schick, G. J. and Wolverton, R. W. (1973). Assessment of software reliability. Proceedings Operations
Research, Physica-Verlag, Würzburg-Wien, 395-422.
Schneidewind, N. F. (1975). An analysis of computer processes in computer software. Proceedings
of the International Conference on Reliable Software, 337-346.
Shanthikumar, J. G. (1981). A general software reliability model for performance prediction.
Microelectron. Reliab. 27, 671-682.
Shanthikumar, J. G. (1983). Software reliability models: A review. Microelectron. Reliab. 23, 903-943.
Shanthikumar, J. G. and Tufekci, S. (1981). Optimal release time using generalized decision trees.
Proceedings of the Fourteenth Annual Hawaii International Conference on System Sciences, 58-65.
Shanthikumar, J. G. and Tufekci, S. (1983). Application of a software reliability model to describe
software release time. Microelectron. Reliab. 23, 41-59.
Shooman, M. L. (1972). Probabilistic models for software reliability prediction. In: W. Freiberger,
ed. Statistical Computer Performance Evaluation. Academic Press, New York, 485-502.
Shooman, M. L. (1973). Operational testing and software reliability estimation during program
development. Record of the 1973 IEEE Symposium on Computer Software Reliability, 51-57.
Shooman, M. L. (1975). Software reliability: Measurement and models. Proceedings of the 1975
Annual Reliability and Maintainability Symposium, 485-489.
Shooman, M. L. and Trivedi, A. K. (1976). A many state Markov model for computer software
performance parameters. IEEE Transactions on Reliability 25, 66-68.
Singpurwalla, N. D. and Soyer, R. (1985). Assessing (software) reliability growth using a random
coefficient autoregressive process and its ramifications. To appear in IEEE Transactions on Software
Sukert, A. N. (1977). An investigation of software reliability models. Proceedings of the Annual
Reliability and Maintainability Symposium, 78-84.
Sukert, A. N. (1979). Empirical validation of three software prediction models. IEEE Transactions on
Reliability 28, 199-205.
Stefanski, L. A. (1982). An application of renewal theory to software reliability. Proceedings of the
Twenty-Seventh Conference on the Design of Experiments in Army Research Development Testing. ARO
Report 82-2, 101-118.
Wagoner, W. L. (1973). The final report on a software reliability measurement study. Report TOR0074-(41221)-1, The Aerospace Corp., El Segundo, CA.
P. R. Krishnaiah and C. R. Rao, eds., Handbook of Statistics, Vol. 7
© Elsevier Science Publishers B.V. (1988) 99-111
Dependence Notions in Reliability Theory
Narasinga R. Chaganty and Kumar Joag-dev
1. Introduction
The concepts of stochastic dependence play an important role in many statistical applications. Although in reliability theory it is rare that new dependence
concepts are created, the well known concepts such as Markov dependence, total
positivity, stochastic monotonicity and some others related to positive dependence
are quite important. The study of their significance and relevance in reliability
theory is the main object of the present chapter. The definitions and some
immediate consequences of the concepts which we use in the following have
already appeared in two articles in the Handbook: Boland and Proschan (this
Volume, Chapter 10), and Joag-dev (see Vol. 4, Chapter 4). We briefly review
these for the sake of completeness.
Part I
The first part of our study will consist of the effects of dependence on the
classification of life distributions according to the properties of aging. Most of
these concepts originate in the bivariate case and due to its importance and
simplicity we will study this case in more detail. The major sources for the material
covered in this part are the articles by Freund (1961), Harris (1970),
Brindley and Thompson (1972), Shaked (1977) and the book by Barlow and
Proschan (1981).
1.1. Definitions
Let $(X, Y)$ be a pair of real valued random variables defined on a fixed probability space. The joint distribution function and the marginals of $(X, Y)$ will be
denoted by $F_{X,Y}$, $F_X$ and $F_Y$ and the corresponding density functions by $f_{X,Y}$, $f_X$,
$f_Y$ respectively. We write $I(A)$ for the indicator of an event $A$. Many of the
concepts of positive and negative dependence can be defined in terms of conditions on covariances of functions restricted to certain classes. Thus conditions
(a) $\mathrm{Cov}[X, Y] \ge 0$,
(b) $\mathrm{Cov}[g_1(X), h_1(Y)] \ge 0$, where $g_1$ and $h_1$ are nondecreasing,
(c) $\mathrm{Cov}[g_2(X, Y), h_2(X, Y)] \ge 0$, where $g_2$ and $h_2$ are co-ordinatewise nondecreasing,
define successively (strictly) stronger positive dependence conditions. Condition
(b) is known as positive quadrant dependence (PQD); it can be seen to be
equivalent to
(b') $\mathrm{Cov}[I(X > x), I(Y > y)] \ge 0$.
Condition (c) is known as association. A condition stronger than (c), known as
positive regression dependence, is obtained by requiring
(d) $E[f_1(X) \mid Y = y]$ to be nondecreasing in $y$, for every nondecreasing function
$f_1$.
Note that this condition is non-symmetric. A condition known as 'monotone
likelihood ratio' or 'totally positive of order 2 (TP$_2$)' is even stronger and is given
by
(e) $f_{X,Y}(x_2, y_2)\,f_{X,Y}(x_1, y_1) \ge f_{X,Y}(x_2, y_1)\,f_{X,Y}(x_1, y_2)$
for $x_2 > x_1$ and $y_2 > y_1$.
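For a distribution on a finite grid, conditions (b') and (e) can be checked directly. The sketch below works with a discrete joint probability table rather than a density, which is an assumption made only for illustration; rows are ordered by increasing $x$ and columns by increasing $y$, and the function names are hypothetical.

```python
from itertools import combinations


def is_tp2(pmf, tol=1e-12):
    """Check p(x2, y2) p(x1, y1) >= p(x2, y1) p(x1, y2) for all x1 < x2, y1 < y2."""
    rows, cols = range(len(pmf)), range(len(pmf[0]))
    return all(pmf[i2][j2] * pmf[i1][j1] >= pmf[i2][j1] * pmf[i1][j2] - tol
               for i1, i2 in combinations(rows, 2)
               for j1, j2 in combinations(cols, 2))


def is_pqd(pmf, tol=1e-12):
    """Check P[X > x, Y > y] >= P[X > x] P[Y > y] at every grid threshold."""
    n_rows, n_cols = len(pmf), len(pmf[0])
    for i in range(n_rows):
        for j in range(n_cols):
            joint = sum(pmf[a][b] for a in range(i + 1, n_rows) for b in range(j + 1, n_cols))
            p_x = sum(pmf[a][b] for a in range(i + 1, n_rows) for b in range(n_cols))
            p_y = sum(pmf[a][b] for a in range(n_rows) for b in range(j + 1, n_cols))
            if joint < p_x * p_y - tol:
                return False
    return True


if __name__ == "__main__":
    concordant = [[0.4, 0.1], [0.1, 0.4]]   # mass concentrated on the diagonal
    discordant = [[0.1, 0.4], [0.4, 0.1]]   # mass concentrated off the diagonal
    print(is_tp2(concordant), is_pqd(concordant))
    print(is_tp2(discordant), is_pqd(discordant))
```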
Some of the concepts above have multivariate analogs. We mention some of
these. Corresponding to PQD, two non-equivalent multivariate generalizations
can be described. The first one is called 'positive upper orthant dependence' (PUOD)
and the second one is labeled 'positive lower orthant dependence' (PLOD).
These are defined by the conditions:
$$P[X_i \ge x_i,\ i = 1, \ldots, k] \ge \prod_{i=1}^{k} P[X_i \ge x_i]$$
for every $x = (x_1, \ldots, x_k) \in \mathbb{R}^k$,
$$P[X_i \le x_i,\ i = 1, \ldots, k] \ge \prod_{i=1}^{k} P[X_i \le x_i]$$
for every $x \in \mathbb{R}^k$.
The condition of 'association' for $X = (X_1, \ldots, X_k)$, which is stronger than
PUOD and PLOD, is given by
$$\mathrm{Cov}[g_k(X), h_k(X)] \ge 0,$$
for every co-ordinatewise nondecreasing pair of functions $g_k, h_k : \mathbb{R}^k \to \mathbb{R}$. This condition
was first introduced and studied by Esary, Proschan and Walkup (1967) (see
Boland and Proschan's article in this volume).
A version of regression dependence similar to (d) above would be to require,
for every $i = 1, \ldots, k$,
$$E[f_1(X_i) \mid X_j = x_j,\ j = 1, \ldots, (i-1)]$$
to be nondecreasing in each $x_j$, for every nondecreasing $f_1$. This is sometimes known
as 'positive regression dependence in sequence'. It can be shown that this implies
association.
The property of association is important for obtaining bounds on the survival
probabilities of coherent systems. For example, if it is a series system and if
the component lives are denoted by $T_i$, then the system life is $\min_i T_i$ and association provides the bound
$$P[\min_i T_i > t] \ge \prod_i P[T_i > t].$$
A similar bound can be obtained for a parallel system. These two bounds can be
combined to obtain bounds for a general coherent system.
An analog of the TP$_2$ dependence given in (e) above is obtained by imposing this
condition on every pair of the arguments of the joint density in $\mathbb{R}^k$, while the other
arguments are kept fixed. This condition, known as MTP$_2$, implies association
(see for example Barlow and Proschan, 1981).
Finally, some of these conditions, with appropriate changes, may be used to
define negative dependence. For example, see Block, Savits and Shaked (1982)
and Joag-dev and Proschan (1983). The components of a vector
$X = (X_1, X_2, \ldots, X_k)$ are negatively associated if for every nonempty subset $A$ of
$\{1, 2, \ldots, k\}$ and every pair of co-ordinatewise nondecreasing functions $g$ and $h$,
$\mathrm{Cov}(g(X_A), h(X_{\bar{A}}))$ is nonpositive, where $\bar{A}$ denotes the complement of $A$.
Negative dependence is relevant in systems defined in closed environments. For
example, a given number of species competing in an ecosystem with a fixed
amount of resources may have their life lengths negatively associated.
1.2. Dependence and aging classification
We adopt the usual notation. A life distribution function $F$ is said to be
increasing failure rate (IFR) if the ratio $r(x) = f(x)/\bar{F}(x)$ is nondecreasing in $x$. We
say $F$ is decreasing failure rate (DFR) if $r(x)$ is nonincreasing in $x$. Here $f$ is the
density corresponding to $F$ and $\bar{F} = 1 - F$ is the survival function. The function
$r(x)$ is known as the failure rate. The distribution function $F$ is said to be
increasing failure rate on the average (IFRA) if $[\bar{F}(x)]^{1/x}$ is nonincreasing in $x > 0$
and $F$ is new better than used (NBU) if $\bar{F}(x + y) \le \bar{F}(x)\bar{F}(y)$ for all $x, y \ge 0$.
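These aging classes can be illustrated numerically with the Weibull distribution, which is not discussed in the text and is used here only as a convenient example: its survival function is $\exp(-(x/s)^k)$ and its failure rate is $(k/s)(x/s)^{k-1}$, so it is IFR for shape $k \ge 1$ and DFR for $k \le 1$. The function names and the grid are hypothetical, and a grid check is of course only a numerical sanity check, not a proof.

```python
import math


def weibull_survival(x, shape, scale=1.0):
    """Survival function exp(-(x / scale) ** shape)."""
    return math.exp(-((x / scale) ** shape))


def weibull_failure_rate(x, shape, scale=1.0):
    """Failure rate r(x) = f(x) / survival(x) for x > 0."""
    return (shape / scale) * (x / scale) ** (shape - 1.0)


def is_nondecreasing(values, tol=1e-12):
    return all(b >= a - tol for a, b in zip(values, values[1:]))


def is_nbu(survival, grid, tol=1e-12):
    """Check survival(x + y) <= survival(x) * survival(y) on a grid."""
    return all(survival(x + y) <= survival(x) * survival(y) + tol
               for x in grid for y in grid)


def is_ifra(survival, grid, tol=1e-12):
    """Check that survival(x) ** (1 / x) is nonincreasing along the grid."""
    averaged = [survival(x) ** (1.0 / x) for x in grid]
    return all(b <= a + tol for a, b in zip(averaged, averaged[1:]))


if __name__ == "__main__":
    grid = [0.1 * k for k in range(1, 51)]
    for shape in (0.5, 1.0, 2.0):
        rates = [weibull_failure_rate(x, shape) for x in grid]
        surv = lambda x, s=shape: weibull_survival(x, s)
        print(shape, is_nondecreasing(rates), is_ifra(surv, grid), is_nbu(surv, grid))
```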
Let $X$, $Y$ be the life-lengths of two components. We examine some dependence
relations which have interpretations in terms of failure rates. First note that the IFR
property is equivalent to having $\bar{F}$ log concave. Thus the conditional failure rate
$r(x \mid Y = y)$ being increasing in $x$ for every $y$ is equivalent to having the conditional
survival function $\bar{F}_c(x \mid y)$ log concave in $x$ for every $y$. Suppose now that $r(x \mid y)$
is decreasing in $y$ for every $x$, in addition to the conditional IFR property. This
would imply
$$-\frac{\partial}{\partial y}\,r(x \mid y) = \frac{\partial^2}{\partial y\,\partial x}\log \bar{F}_c(x \mid y) \ge 0,$$
or equivalently $\bar{F}(x \mid Y = y)$, considered as a function of $x$ and $y$, is TP$_2$. Note that
if the joint density $f_{X,Y}(x, y)$ is TP$_2$, so is the conditional density $f_c(x \mid y)$. This, in turn, implies that $\bar{F}_c(x \mid y)$ is TP$_2$. This is analogous to the univariate case,
where log-concavity of the density implies IFR.
Another quantity of interest is the 'mean residual life', $m(x)$, which is the
conditional expectation of remaining life at age $x$. This is given by
$$m(x) = \int_x^{\infty} (t - x) f(t)\,dt \Big/ \bar{F}(x) = \int_x^{\infty} \bar{F}(t)\,dt \Big/ \bar{F}(x).$$
The life distribution $F$ is said to be increasing mean residual life (IMRL) if $m(x)$
is increasing in $x \ge 0$. We say $F$ is decreasing mean residual life (DMRL) if $m(x)$
is decreasing in $x \ge 0$. To obtain the monotone behaviour of the conditional mean
residual life $m_c(x \mid y)$, it can be shown that it suffices to have
$$h(x, y) = \int_x^{\infty} (t - x) f_c(t \mid y)\,dt$$
be TP$_2$. Again it can be shown that this condition is weaker than that needed for
the monotonicity of $r(x \mid y)$. These results and some extensions were derived by
Shaked (1977). In the same article, Shaked (1977) also introduced the concept of
dependence by total positivity (DTP) for bivariate distributions. Recently Lee
(1985a) generalized the DTP concepts to the multivariate case and obtained a
number of inequalities and monotonicity properties of conditional hazard rate and
mean residual life functions of some multivariate distributions satisfying the DTP
property. In a subsequent paper Lee (1985b) introduced the concept of dependence by reverse regular (DRR) rule, which is the mirror image of DTP, and
studied the relationship of DRR with other concepts of negative dependence.
Harris (1970) defined the IHR (increasing hazard rate) property for a multivariate
distribution by requiring
(a) $\bar{F}(x + t\mathbf{1})/\bar{F}(x)$ nonincreasing in $x$, and
(b) $P[X > u \mid X > x]$ nondecreasing in $x$ for every fixed vector $u$.
The geometric interpretation of (b) has prompted its name 'right corner set increasing'
(RCSI). Condition (a) is clearly a 'wear out' condition, while, as we shall see, (b)
describes positive dependence.
Brindley and Thompson (1972) studied the class of distributions where only (a)
is satisfied. In order to distinguish between these two classes based on aging
property, the one satisfying (a) is called IFR, while the subclass with the additional
requirement of (b) is called IHR (H is for hazard or Harris!). In both cases the
classes can be seen to be closed under (a) taking subsets, (b) unions of independent sets of variables, and (c) taking minimums over subsets. Note that the importance of
the minimums stems from their role in series systems. Both definitions, when
restricted to the univariate case, yield the usual IFR distribution. For the univariate case
(b) is trivially satisfied.
To see that RCSI implies positive dependence, let $K$ and $M$ be arbitrary subsets
(not necessarily disjoint) of $\{1, 2, \ldots, n\}$. Denoting appropriate subvectors by $x_K$
and $x_M$ etc., it can be seen readily that (1.10b) implies that
$$P[X_M > u_M \mid X_K > x_K] \qquad (1.11)$$
is a co-ordinatewise nondecreasing function of $x_K$, for every $u_M$ fixed. Repeated
application of condition (1.11) with a singleton $K$ yields
$$\bar{F}(x) \ge \prod_{i=1}^{n} \bar{F}_i(x_i),$$
which is PUOD.
It would be worthwhile to mention examples of distributions where the above
dependency concepts are manifested in a natural way. If the components are
independent, then most of the conditions are trivially satisfied and hence we
consider those having dependent components.
Let $U$, $X_1$, $X_2$ be independent random variables. Consider $Y_1 = \min(U, X_1)$,
$Y_2 = \min(U, X_2)$; such functions determine the life of a system where the component corresponding to $U$ is connected in series. These functions are also important when $U$ represents the arrival time of a shock which disables components
corresponding to $X_1$, $X_2$. This model, when $U$, $X_1$, $X_2$ each has an exponential
distribution, was studied by Marshall and Olkin (1967). They also studied its
multivariate analog where different shocks disable $2, 3, \ldots, n$ components. It
should be noted that the property of association is preserved due to the fact that
the minimum of random variables is a co-ordinatewise increasing function.
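The positive dependence in the exponential shock model can be seen from its closed-form survival function. Assuming $U$, $X_1$, $X_2$ are independent exponentials with rates written here as `lam0`, `lam1`, `lam2` (hypothetical names), independence gives $P[Y_1 > y_1, Y_2 > y_2] = P[U > \max(y_1, y_2)]\,P[X_1 > y_1]\,P[X_2 > y_2]$. The sketch below compares this with the product of the marginal survival functions, which is the PUOD inequality for $k = 2$.

```python
import math


def mo_joint_survival(y1, y2, lam0, lam1, lam2):
    """P[Y1 > y1, Y2 > y2] for Y_i = min(U, X_i) with independent exponentials."""
    return math.exp(-lam1 * y1 - lam2 * y2 - lam0 * max(y1, y2))


def mo_marginal_survival(y, lam0, lam_i):
    """P[Y_i > y]; the minimum of two independent exponentials is exponential."""
    return math.exp(-(lam0 + lam_i) * y)


def satisfies_puod(lam0, lam1, lam2, grid, tol=1e-12):
    """Check joint survival >= product of marginal survivals on a grid."""
    return all(
        mo_joint_survival(y1, y2, lam0, lam1, lam2)
        >= mo_marginal_survival(y1, lam0, lam1) * mo_marginal_survival(y2, lam0, lam2) - tol
        for y1 in grid for y2 in grid)


if __name__ == "__main__":
    grid = [0.25 * k for k in range(0, 21)]
    print(satisfies_puod(lam0=0.5, lam1=1.0, lam2=2.0, grid=grid))
    # Series system bound at time t: P[min(Y1, Y2) > t] versus the product bound.
    t = 1.0
    print(mo_joint_survival(t, t, 0.5, 1.0, 2.0),
          mo_marginal_survival(t, 0.5, 1.0) * mo_marginal_survival(t, 0.5, 2.0))
```

The gap between the two printed values is the factor $\exp(\lambda_0 t)$ contributed by the common shock; with `lam0 = 0` the components are independent and the bound holds with equality.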
Gumbel (1960) discussed a simple model with a bivariate distribution whose
survival function is given by
$$G(x, y) = \exp(-x - y - bxy), \qquad x, y \ge 0,$$
where $0 \le b \le 1$. It is clear that the marginals are exponential and, since the model has
negative regression dependence, it is appropriate only when the two variables have
such dependence. Freund (1961) describes a bivariate model of a two-component
system where the joint survival function is the same as that of two independent
exponentially distributed random variables with parameters $\alpha$ and $\beta$, as long
as neither component has failed. Upon failure of one item the
parameter of the life distribution of the other component is changed to $\alpha'$ or
to $\beta'$, respectively. The joint survival function can be written as
$$\bar F(x, y) = \exp(-(\alpha + \beta)x)\left[\frac{\beta - \beta'}{\alpha + \beta - \beta'} \exp(-(\alpha + \beta)(y - x)) + \frac{\alpha}{\alpha + \beta - \beta'} \exp(-\beta'(y - x))\right], \qquad x \le y,$$
$$\bar F(x, y) = \exp(-(\alpha + \beta)y)\left[\frac{\alpha - \alpha'}{\alpha + \beta - \alpha'} \exp(-(\alpha + \beta)(x - y)) + \frac{\beta}{\alpha + \beta - \alpha'} \exp(-\alpha'(x - y))\right], \qquad y \le x.$$
The marginal distributions are not exponential but are certain mixtures of
exponentials and the nature of dependence is determined by the relative magnitudes of the parameters. In fact,
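$$\bar F_1(x) = \frac{\alpha - \alpha'}{\alpha + \beta - \alpha'} \exp(-(\alpha + \beta)x) + \frac{\beta}{\alpha + \beta - \alpha'} \exp(-\alpha' x),$$
$$\bar F_2(y) = \frac{\beta - \beta'}{\alpha + \beta - \beta'} \exp(-(\alpha + \beta)y) + \frac{\alpha}{\alpha + \beta - \beta'} \exp(-\beta' y).$$
It is easy to verify that $F_1(x)$ is IFR if and only if $\alpha < \alpha'$ and $F_2(y)$ is IFR if
and only if $\beta < \beta'$.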
Part II
The second part of our study deals with dependence concepts relevant to the
models which consider repair and replacement of the components of a system.
These dependence concepts arise from the study of the theory of stochastic processes. Some of the classical types of stochastic processes characterized by different dependence relationships are Markov processes, renewal processes and
Markov renewal processes. The latter includes the previous two as special cases.
The dependence relations such as total positivity, association and stochastic monotonicity, studied in Part I, occur naturally among these processes. It is
needless to say that the vast number of results in the study of the above processes
have wide applications in reliability theory. In the next few sections we shall
examine some of these processes and their applicability in characterizing the
failure rate of the life distributions of systems, as well as in obtaining bounds of
some other quantities of interest in reliability theory. The organization of this part
is as follows: In Section 2.1, we define totally positive Markov processes and
discuss some useful theorems related to these processes. A concept weaker than
total positivity is stochastic monotonicity, that is, all totally positive Markov
processes are stochastically monotone but not vice versa. This is discussed in
Section 2.2.
Many of the models in reliability theory which consider replacement of items
as they fail can be described by a renewal process. The renewal function is
defined as the expected number of items replaced by a given time. We
can obtain lower and upper bounds for the renewal function, when the life
distribution of the items is assumed to be in one of the reliability classes of life
distributions. These results are discussed in the last Section 2.3.
2.1. Totally positive Markov processes
DEFINITION 1. A stochastic process $\{X_t, t \in [0, \infty)\}$ is said to be a Markov
process with state space S if for any $t, s \ge 0$ and j in S,
$$P[X_{t+s} = j \mid X_u, u \le t] = P[X_{t+s} = j \mid X_t]. \qquad (2.1)$$
The Markov process is said to be a time-homogeneous Markov process when
the conditional probability,
$$P[X_{t+s} = j \mid X_t = i] = P_s(i, j) \qquad (2.2)$$
is independent of $t \ge 0$, for all i, j in S and $s \ge 0$. The collection of matrices
$P_t = (P_t(i, j))$, $t > 0$, is simply called the transition function of the Markov process.
DEFINITION 2. A Markov process with transition matrix $P_t$ is said to be totally
positive (TP) if, for $i_1 < i_2 < \cdots < i_n$ and $j_1 < j_2 < \cdots < j_n$, the determinant
$$P_t\begin{pmatrix} i_1, \ldots, i_n \\ j_1, \ldots, j_n \end{pmatrix} = \det\left[P_t(i_k, j_l)\right]_{k,l=1}^{n} \qquad (2.3)$$
is strictly positive when $t > 0$ for all $n \ge 1$. If (2.3) holds for $n \le r$, we say that the
Markov process is totally positive of order r (TP$_r$).
When the state space S is a countable set and the parameter set is the set of
integers, the Markov process is known as a Markov chain. The Markov chain is
said to be time-homogeneous if the transition function Pn is independent of n, in
which case we simply write P. The Markov chain is totally positive if P satisfies
condition (2.3). Karlin and McGregor (1959a, b) have shown that several
Markov chains and Markov processes are indeed totally positive, the most prominent one
being the birth and death process.
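For a finite transition matrix, the determinant condition can be checked by brute force. The following is a minimal sketch, not a method taken from the references: the function names are hypothetical, and the `strict` flag is an added assumption that lets one choose between the strict positivity required above and the weaker nonnegativity convention that is often used for matrices containing zeros.

```python
from itertools import combinations

def det(m):
    """Determinant by Laplace expansion along the first row (small minors only)."""
    n = len(m)
    if n == 1:
        return m[0][0]
    total = 0.0
    for c in range(n):
        minor = [row[:c] + row[c + 1:] for row in m[1:]]
        total += (-1) ** c * m[0][c] * det(minor)
    return total

def is_totally_positive(P, r, strict=False, tol=1e-12):
    """Check every minor of order 1..r of the matrix P (a list of lists).

    strict=True  requires each minor to be positive;
    strict=False only requires each minor to be nonnegative.
    """
    n_rows, n_cols = len(P), len(P[0])
    for k in range(1, r + 1):
        for rows in combinations(range(n_rows), k):
            for cols in combinations(range(n_cols), k):
                d = det([[P[i][j] for j in cols] for i in rows])
                if (strict and d <= tol) or (not strict and d < -tol):
                    return False
    return True

# Hypothetical tridiagonal (birth-and-death type) transition matrix.
P = [[0.6, 0.4, 0.0],
     [0.2, 0.6, 0.2],
     [0.0, 0.4, 0.6]]
tp3 = is_totally_positive(P, 3)
```

The number of minors grows combinatorially with the matrix size, so this check is only practical for small state spaces.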
An excellent treatise on totally positive Markov chains and totally positive
Markov processes together with applications in several domains of mathematics,
including reliability theory, is given in Karlin (1964). Typical of the results of
Karlin (1964) are the following theorems regarding inheritance of TP character.
THEOREM 3. Let the transition matrix P of a Markov chain $\{X_k, k \ge 1\}$ be TP$_r$.
Define for $i > j$,
$$Q(n, i) = P[j < X_k \le i,\ 1 \le k \le n - 1,\ X_n = j \mid X_0 = i].$$
Then $Q(n, i)$ is TP$_r$ for $n \ge 0$ and $i > j$.
The TP property is also prevalent when the initial state of the Markov chain
is fixed. We state this in the theorem below.
THEOREM 4. Assume the hypothesis of Theorem 3. Define for $j > i$,
$$Q_1(n, j) = P[i < X_k < j,\ 1 \le k \le n - 1,\ X_n = j \mid X_0 = i].$$
Then $Q_1$ is TP$_r$ in the variables $n \ge 0$ and $j > i$.
The above Theorem 4 was used by Brown and Chaganty (1983) to show that
the first passage time distribution from an initial state to a higher state in a birth
and death process is IFR. This result was also obtained by Keilson (1979) and by
Derman, Ross and Schechner (1979) using other methods. Another application of
Theorem 4 is given by Assaf, Shaked and Shanthikumar (1985). They have
shown that the time to failure of some systems which are subject to shocks and
damages, which are not necessarily nonnegative, is IFR.
2.2. Stochastic monotonicity in Markov processes
A useful notion weaker than total positivity is stochastic monotonicity. This
concept was introduced by Kalmykov (1962) and later was discussed in detail by
Veinott (1965), Daley (1968), O'Brien (1972) and Kirstein (1976). A detailed study
of stochastic monotonicity in Markov processes can be found in the book by
Keilson (1979). Stochastic monotonicity is a structural property of the Markov
process. The random variables in such processes are associated and this connection gives rise to many interesting inequalities in reliability theory. We define
below, stochastic monotonicity for Markov chains and then extend the definition
for Markov processes.
DEFINITION 5. A Markov chain $\{X_k, k \ge 0\}$ is said to be stochastically monotone if $X_{k+1}$ given $X_k = i$ is stochastically larger than $X_{k+1}$ given $X_k = j$, for
all $k \ge 0$ and $i > j$.
The extension of the stochastic monotonicity property to continuous-time Markov
processes is straightforward.
DEFINITION 6. A time-homogeneous Markov process $\{X_t, t \ge 0\}$ is said to be
stochastically monotone if $X_t$ given $X_0 = x_1$ is stochastically larger than $X_t$ given
$X_0 = x_2$ for all $t > 0$ and $x_1 > x_2$.
Numerous Markov processes are indeed stochastically monotone. These include Markov diffusion processes. More generally, the class of totally positive
Markov processes is a proper subset of the class of stochastically monotone
Markov processes.
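For a time-homogeneous chain on the totally ordered states $0, 1, \ldots, m$, Definition 5 amounts to requiring that every upper-tail probability of a row of the transition matrix be nondecreasing in the row index. A minimal sketch of that check follows; the function name and the example matrices are hypothetical.

```python
def is_stochastically_monotone(P, tol=1e-12):
    """True if, for every threshold c, the tail sum over states >= c
    of row i of P is nondecreasing in i."""
    n = len(P)
    for c in range(n):
        tails = [sum(row[c:]) for row in P]
        if any(tails[i + 1] < tails[i] - tol for i in range(n - 1)):
            return False
    return True

# Hypothetical example: monotone, although the 2x2 minor built from
# rows (0, 1) and columns (1, 2) equals 0*0.5 - 0.5*0.1 < 0.
P_monotone = [[0.5, 0.0, 0.5],
              [0.4, 0.1, 0.5],
              [0.0, 0.5, 0.5]]

# Hypothetical example that is not monotone: the higher state is more
# likely to move down than the lower state.
P_not_monotone = [[0.1, 0.9],
                  [0.9, 0.1]]
```

The first matrix illustrates why the inclusion stated above is proper: it passes the monotonicity check although one of its second-order minors is negative.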
Stochastically monotone Markov chains with partially ordered state spaces
were introduced by Kamae, Krengel and O'Brien (1977) and their applications to
problems in reliability theory were studied by Brown and Chaganty (1983). We
discuss these after introducing some notation. Let S be a countable set with a
partial ordering denoted by $\ge$. A subset C of S is said to be an increasing set if i
belongs to C and $j \ge i$ implies j is in C. A time-homogeneous Markov chain
$\{X_n, n \ge 0\}$ with state space S is said to be stochastically monotone if for $j \ge i$,
the transition probability from j to C is at least as large as that from i to C, for all increasing
sets C. The Markov chain is said to have monotone paths if $P(X_{n+1} \ge X_n) = 1$
for all $n \ge 0$. The following theorem characterizes the class of IFRA distributions
with stochastically monotone Markov chains.
THEOREM 7. Let S be a partially ordered countable set. Let $\{X_n, n \ge 0\}$ be a
stochastically monotone Markov chain with monotone paths and state space S. Let C
be an increasing subset of S, with finite complement. Then the first passage time from
state i to set C is IFRA.
Shaked and Shanthikumar (1984) generalized the above theorem by removing
the restriction that the complement of C is finite. As a converse to Theorem 7 we
have the following result.
THEOREM 8. Every IFRA distribution in discrete time is either the first passage
time distribution to an increasing set for a stochastically monotone Markov chain with
monotone paths on a partially ordered finite set, or the limit of a sequence of such
distributions.
Analogous theorems in the continuous time frame also hold. The above
theorems were used by Brown and Chaganty (1983) to show that the convolution
of two IFRA distributions is IFRA. Various other applications of the above
theorems to shock models in reliability theory and to sampling with and without replacement can also be found in Brown and Chaganty (1983).
Stochastically monotone Markov chains also play an important role in
obtaining optimum control limit rules. The following formulation is due to Derman
(1963). Suppose that a system is inspected at regular intervals of time and that
after each inspection it is classified into one of $(m + 1)$ states denoted by $0, 1, 2, \ldots, m$.
A control limit rule l simply says: replace the system if the observed
state is one of the states $k, k + 1, \ldots, m$ for some predetermined state k. The
state k is called the control limit of l. Let $X_n$ denote the observed state of the
system in use at time $n \ge 0$. We assume that $\{X_n, n \ge 0\}$ is a stationary Markov
chain. Let $c(j)$ denote the cost incurred when the system is in state j. Let L
denote the class of all possible control limit rules. For $l \in L$, the asymptotic
expected average cost is defined as $A(l) = \lim_{n \to \infty} \frac{1}{n} \sum_{i=1}^{n} c(X_i)$. The following
theorem was proved by Derman (1963).
THEOREM 9. Let the Markov chain $\{X_n, n \ge 0\}$ be stochastically monotone. Then
there exists a control limit rule $l^*$ such that
$$A(l^*) = \min_{l \in L} A(l).$$
2.3. Renewal theory in reliability
Let $\{X_i, i \ge 1\}$ be a sequence of nonnegative, independent and identically distributed random variables. Let $S_n = X_1 + \cdots + X_n$ be the nth partial sum and let
$N_t$ be the maximum value of n for which $S_n \le t$. In the context of reliability theory
we can think of the $X_i$'s as representing the lifetimes of items being replaced. The
partial sum $S_n$ represents the time at which the nth renewal takes place and $N_t$
is the number of renewals that will have occurred by time t. The dependent
process $\{N_t, t \ge 0\}$ is known as a renewal process. The aim of renewal theory
is to derive properties of certain random variables associated with $N_t$ from
knowledge of the distribution function F of $X_1$. In this section we shall discuss
the important results, when the underlying distribution F is assumed to belong to
one of the reliability classes of life distributions. For an extensive study of the
general theory of renewal processes we refer the reader to the expository article by
Smith (1958) and to the books by Cox (1962), Feller (1966) and Karlin and
Taylor (1975).
The renewal function M(t) = E[Nt] plays a central role in reliability, especially
in maintenance models. It is useful to get bounds on M(t) for finite t, since in
most cases computing M(t) may be difficult. One such bound is given by
$M(t) \ge t/\mu_1 - 1$, where $\mu_1$ is the mean of F. Under the additional assumption that
F is IFR, Obretenov (1974) obtained the following sharper bound:
M(t) >~-- + - - - 1,
where ~ = l i m n _ o ~ n + l / ( n + 1)/~, #n =E(X~). Barlow and Proschan (1964),
while studying replacement policies, when the life distribution of the unit is IFR,
obtained the following lower and upper bounds for the renewal random variable
$N_t$.
THEOREM 10. Let $R(t) = -\log \bar F(t)$. If F is IFR with mean $\mu_1$ then
$$\sum_{j=n}^{\infty} \frac{(nR(t/n))^j}{j!} \exp(-nR(t/n)) \le P(N_t \ge n) \le \sum_{j=n}^{\infty} \frac{(t/\mu_1)^j}{j!} \exp(-t/\mu_1)$$
for $t \ge 0$, $n \ge 1$.
Under weaker conditions on F we have the following theorem.
THEOREM 11. Let $R(t) = -\log \bar F(t)$. If F is NBU with finite mean then
$$P(N_t \ge n) \le \sum_{j=n}^{\infty} \frac{(R(t))^j}{j!} \exp(-R(t)),$$
$$M(h) \le M(t + h) - M(t),$$
$$\operatorname{Var}(N_t) \le M(t)$$
for $t \ge 0$, $h \ge 0$, $n \ge 1$.
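Since $M(t)$ is rarely available in closed form, a Monte Carlo estimate is a convenient way to see how tight such bounds are. The sketch below is an illustration only: the Weibull lifetime with shape 2 (an IFR, hence NBU, distribution), the time point `t = 5.0`, the seed and the function names are all assumptions chosen for the example. It compares the estimate with the elementary lower bound $t/\mu_1 - 1$.

```python
import math
import random

def simulate_renewal_count(t, draw_lifetime):
    """Number of renewals N_t in [0, t]: the largest n with S_n <= t."""
    total, count = 0.0, 0
    while True:
        total += draw_lifetime()
        if total > t:
            return count
        count += 1

def estimate_renewal_function(t, draw_lifetime, reps=20000):
    """Monte Carlo estimate of M(t) = E[N_t]."""
    return sum(simulate_renewal_count(t, draw_lifetime) for _ in range(reps)) / reps

rng = random.Random(1)
shape, scale = 2.0, 1.0                          # assumed Weibull lifetime, shape > 1 is IFR
mu1 = scale * math.gamma(1.0 + 1.0 / shape)      # mean lifetime
t = 5.0
m_hat = estimate_renewal_function(t, lambda: rng.weibullvariate(scale, shape))
lower = t / mu1 - 1.0                            # elementary lower bound on M(t)
```

With these settings `m_hat` should lie between `lower` and `t / mu1`; the gap to the lower bound gives a feel for how much a sharper bound could gain.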
The reverse inequalities in the above theorem are valid for F new worse than
used (NWU), that is, $\bar F(x + y) \ge \bar F(x) \bar F(y)$ for all $x, y \ge 0$. In a two-paper series
Brown (1980, 1981) obtained nice properties for the renewal function $M(t)$ when
the underlying distribution F is assumed to be DFR or IMRL. Let
$Z(t) = S_{N_t + 1} - t$ denote the forward recurrence time at time t and $A(t) = t - S_{N_t}$
the renewal age at t. The following theorem can be found in Brown (1980, 1981).
THEOREM 12. (a) If the underlying distribution F of the renewal process is DFR,
then the renewal density $M'(t)$ exists on $(0, \infty)$ and is decreasing, that is, $M(t)$ is
concave. Furthermore, $Z(t)$ and $A(t)$ are both stochastically increasing in $t \ge 0$.
(b) If F is IMRL then $M(t) - t/\mu_1$ is increasing in $t \ge 0$ and $E[\phi(Z(t))]$ is
increasing in $t \ge 0$ for increasing convex functions $\phi$.
In the case where F is IMRL, Brown (1981) provides counterexamples to show
that $Z(t)$ is not necessarily stochastically increasing, $E[A(t)]$ is not necessarily
increasing and $M(t)$ need not be concave. An example of Berman (1978) shows
that the analogous results do not hold for IFR and DMRL distributions. As an
application of Theorem 12, Brown (1980) obtained sharp bounds for the renewal
function $M(t)$ for F IMRL, with improved bounds for F DFR. These results are
given in the next theorem.
THEOREM 13. Let $\mu_n = E(X_1^n)$, $n \ge 1$. Let $U(t) = t/\mu_1 + \mu_2/(2\mu_1^2)$. Let $\mu_{k+2}$ be
finite for some $k \ge 0$. If F is IMRL then
$$U(t) \ge M(t) \ge U(t) - \min_{0 \le i \le k} d_i t^{-i},$$
where the constant $d_i$ is a simple function of $\mu_1, \ldots, \mu_{i+2}$. Furthermore, if F is DFR
$$U(t) \ge M(t) \ge U(t) - \min_{0 \le i \le k} \alpha_i d_i t^{-i},$$
where $\alpha_0 = 1$, $\alpha_i = (i/(i + 1))^i$ for $i \ge 1$.
Marshall and Proschan (1972) obtained the following characterization of the
NBU class of life distributions in terms of the renewal process $N_t$.
THEOREM 14. The distribution function F is NBU (NWU) if and only if
$N(s + t) \ge (\le)\ N(s) * N(t)$ for all $s, t \ge 0$, where $*$ denotes the convolution operation.
Esary, Marshall and Proschan (1973) established the following IFRA property
for the renewal process, while studying some shock models.
THEOREM 15. Let $\{N_t, t \ge 0\}$ be a renewal process. Then $P[N_t \ge k]^{1/k}$ is
decreasing in $k \ge 1$, that is, $N_t$ possesses the discrete IFRA property.
References
Assaf, D., Shaked, M. and Shanthikumar, J. G. (1985). First passage times with PF r densities.
Journal of Appl. Prob. 22, 185-196.
Barlow, R. E. and Proschan, F. (1964). Comparison of replacement policies, and renewal theory
implications. Ann. Math. Statist. 35, 577-589.
Barlow, R. E. and Proschan, F. (1981). Statistical Theory of Reliability and Life Testing. To Begin With,
Silver Spring, Maryland.
Berman, M. (1978). Regenerative multivariate point processes. Adv. Appl. Probability 10, 411-430.
Block, H. W., Savits, T. H. and Shaked, M. (1982). Some concepts of negative dependence. Ann.
of Probability 10, 765-772.
Brindley, E. C. Jr. and Thompson, W. A. Jr. (1972). Dependence and aging aspects of multivariate
survival. Journal of Amer. Stat. Assoc. 67, 822-830.
Brown, M. (1980). Bounds, inequalities, and monotonicity properties for some specialized renewal
processes. Ann. of Probability 8, 227-240.
Brown, M. (1981). Further monotonicity properties for specialized renewal processes. Ann. of Probability 9, 891-895.
Brown, M. and Chaganty, N. R. (1983). On the first passage time distribution for a class of Markov
Chains. Ann. of Probability 11, 1000-1008.
Cox, D. R. (1962). Renewal Theory. Methuen, London.
Daley, D. J. (1968). Stochastically monotone Markov chains. Z. Wahrsch. verw. Gebiete 10, 305-317.
Derman, C. (1963). On optimal replacement rules when changes of state are Markovian. In: Richard
Bellman, ed., Mathematical Optimization Techniques. Univ. of California Press, 201-210.
Derman, C., Ross, S. M. and Schechner, Z. (1979). A note on first passage times in birth and death
and negative diffusion processes. Unpublished manuscript.
Esary, J. D., Marshall, A. W. and Proschan, F. (1973). Shock models and wear processes. Ann. of
Probability 1, 627-649.
Esary, J. D., Proschan, F. and Walkup, D. W. (1967). Association of random variables, with
applications. Ann. Math. Stat. 38, 1466-1474.
Feller, W. (1966). An Introduction to Probability Theory and Its Applications, Vol. II. Wiley, New York.
Freund, J. E. (1961). A bivariate extension of the exponential distribution. Journal of Amer. Stat.
Assoc. 56, 971-977.
Gumbel, E. J. (1960). Bivariate exponential distributions. Journal of Amer. Stat. Assoc. 55, 698-707.
Harris, R. (1970). A multivariate definition for increasing hazard rate distribution functions. Ann.
Math. Statist. 41, 713-717.
Joag-dev, K. and Proschan, F. (1983). Negative association of random variables with applications.
Ann. Statist. 11, 286-295.
Karlin, S. (1964). Total positivity, absorption probabilities and applications. Trans. Amer. Math. Soc.
111, 33-107.
Karlin, S. and McGregor, J. (1959a). Coincidence properties of birth and death processes. Pacific
Journal of Math. 9, 1109-1140.
Karlin, S. and McGregor, J. (1959b). Coincidence probabilities. Pacific Journal of Math. 9, 1141-1164.
Karlin, S. and Taylor, H. M. (1975). A First Course in Stochastic Processes, 2nd edition. Academic
Press, New York.
Kalmykov, G. I. (1962). On the partial ordering of one-dimensional Markov processes, Theor. Prob.
Appl. 7, 456-459.
Kamae, T., Krengel, U. and O'Brien, G. C. (1977). Stochastic inequalities on partially ordered
spaces. Ann. of Probability 5, 899-912.
Keilson, J. (1979). Markov Chain Models--Rarity and Exponentiality. Springer, New York.
Kirstein, B. M. (1976). Monotonicity and comparability of time homogeneous Markov processes with
discrete state space. Math. Operations Forschung. Stat. 7, 151-168.
Lee, Mei-Ling Ting (1985a). Dependence by total positivity. Ann. of Probability 13, 572-582.
Lee, Mei-Ling Ting (1985b). Dependence by reverse regular rule. Ann. of Probability 13, 583-591.
Marshall, A. W. and Proschan, F. (1972). Classes of distributions applicable in replacement, with
renewal theory implications. Proceedings of the 6th Berkeley Symposium on Math. Stat. and Prob.
I. Univ. of California Press, Berkeley, CA, 395-415.
Marshall, A. W. and Olkin, I. (1967). A multivariate exponential distribution. Journal of Amer. Stat.
Assoc. 62, 30-44.
Obretenov, A. (1974). An estimation for the renewal function of an IFR distribution. In: Colloq.
Math. Soc. Janos Bolyai 9. North-Holland, Amsterdam, 587-591.
O'Brien, G. (1972). A note on comparisons of Markov processes. Ann. of Math. Stat. 43, 365-368.
Shaked, M. (1977). A family of concepts of dependence for bivariate distributions. Journal of Amer.
Stat. Assoc. 72, 642-650.
Shaked, M. and Shanthikumar, J. G. (1984). Multivariate IFRA properties of some Markov jump
processes with general state space. Preprint.
Smith, W. L. (1958). Renewal theory and its ramifications. J. Roy. Statist. Soc., Series B 20, 243-302.
Veinott, A. F. (1965). Optimal policy in a dynamic, single product, nonstationary inventory model
with several demand classes. Operations Research 13, 761-778.
P. R. Krishnaiah and C. R. Rao, eds., Handbook of Statistics, Vol. 7
© Elsevier Science Publishers B.V. (1988) 113-120
Application of Goodness-of-Fit Tests in Reliability
B. W. Woodruff and A. H. Moore
1. Introduction
Prior to using a probability model to represent the population underlying data,
it is important to test adequacy of the model. One way to do this is by a
goodness-of-fit test. However one must make an initial selection of models to be
tested. Several avenues are available for an initial screening of the data. One could
construct histograms, frequency polygons or more sophisticated non-parametric
density estimates [4, 23]. Another very useful initial screening device is the use
of a probability plot on special graph paper available for a variety of common
distributions used in life testing. Nelson [19] gives an extensive coverage to the
use of probability plots in his book on reliability theory. After one has selected
a model to be tested further, an initial screening of the model could be done by
a $\chi^2$ goodness-of-fit test discussed below. If the $\chi^2$ test rejects at a suitable
significance level, then one can proceed to test other reasonable models. However
if one fails to reject the model, then one should consider, if possible, other more
powerful goodness-of-fit tests.
2. $\chi^2$ goodness-of-fit tests
This classical test is an almost universal goodness-of-fit test since it can be
applied to discrete, continuous or mixed distributions, with grouped or ungrouped
data, model completely specified or with the parameters estimated. It can also be
adapted to be used with censored data or truncated distributions.
The test is an approximate test since the sample statistic is only asymptotically
$\chi^2$ distributed. Several authors have shown it to have lower power than other
applicable tests. In applying the test, the data must be grouped into intervals.
Since several statisticians may group the data differently, this may lead to a
change in the reject or accept decision and hence the test is not unique. It also
requires moderate to large sample sizes.
2.1. $\chi^2$ test procedure
$$H_0: F(x) = F_0(x), \qquad H_A: F(x) \ne F_0(x).$$
Take a random (or censored) sample from the unknown distribution and divide
the support set into a set of k subsets. Now under the null hypothesis, determine
the expected number of observations in each subset, denoted by $E_i$ ($i = 1, \ldots, k$).
The observed number of sample observations in each subset is denoted by $O_i$. A
usual rule is to choose the subsets so that the expected number of observations
in each subset is greater than or equal to 5. The test statistic is
$$\hat\chi^2 = \sum_{i=1}^{k} \frac{(O_i - E_i)^2}{E_i}.$$
We reject $H_0$ if $\hat\chi^2 > \chi^2_{\alpha,\,k-p-1}$, where p is the number of parameters estimated in the specification of the null hypothesis $F_0(x)$.
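The procedure can be sketched for an exponential null hypothesis with estimated mean. Everything specific here is an assumption made for illustration: the simulated sample, the choice of k = 5 equal-probability cells under the fitted model, and the function name. The critical value 7.815 is the tabulated upper 5% point of the $\chi^2$ distribution with $k - p - 1 = 3$ degrees of freedom.

```python
import math
import random

def chi_square_gof_exponential(data, k):
    """Chi-square statistic for an exponential fit using k equal-probability cells."""
    n = len(data)
    mean = sum(data) / n                       # estimated exponential mean (p = 1)
    # Interior cell boundaries at the j/k quantiles of the fitted exponential
    cuts = [-mean * math.log(1.0 - j / k) for j in range(1, k)]
    observed = [0] * k
    for x in data:
        cell = sum(1 for c in cuts if x >= c)  # index of the cell containing x
        observed[cell] += 1
    expected = [n / k] * k                     # equal expected counts E_i
    stat = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    return stat, observed, expected

rng = random.Random(42)
sample = [rng.expovariate(1 / 100.0) for _ in range(100)]   # hypothetical lifetimes
stat, observed, expected = chi_square_gof_exponential(sample, k=5)

CRITICAL_0_05_DF3 = 7.815      # upper 5% point of chi-square with 5 - 1 - 1 = 3 df
reject = stat > CRITICAL_0_05_DF3
```

Equal-probability cells are one common grouping choice; as noted earlier, a different grouping of the same data can change the decision.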
3. Graphical techniques
A probability plot is a very useful way to provide a preliminary examination of
how well a particular distribution fits the data. It is fast and easy to use and can
provide parameter and percentile estimates of the distribution. It can be applied
to complete and censored data and to grouped data. There are probability graph
papers for normal, lognormal, exponential, Weibull, extreme-value and chi-square
distributions. Weibull graph paper may be used for the Rayleigh distribution by
assuming the shape parameter is two.
3.1. Procedure for graphical techniques
(i) Order the observations from smallest to largest, $x_{(i)}$ ($1 \le i \le n$).
(ii) Assign the value of the cdf at each order statistic, $F(x_{(i)})$. A reasonable
value of the cdf at the ith order statistic is its median rank, approximately $(i - 0.3)/(n + 0.4)$.
Exact tables of median ranks are available for the smaller values of i and n
(where n is the sample size). Harter [8, 10] recently wrote several papers where
he studied various plotting positions.
(iii) Plot the values of $x_{(i)}$ vs. $F(x_{(i)})$ on the probability paper. The papers are
constructed so that if a particular distribution fits the data, then the graph will
be approximately a straight line. A curved line would indicate that the chosen
distribution is inadequate to model the sample. Probability plots could also
uncover mixtures of distributions in modeling the sample. Mardia [11] states:
'The importance of the graphical method should not be underestimated and it is
always worthwhile to supplement a test procedure with a plot.'
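The three steps can be mimicked numerically, without graph paper, by transforming both axes so that a Weibull sample falls on a straight line. The sketch below assumes a two-parameter Weibull model and uses the median-rank approximation from step (ii); the least-squares line, the function names and the failure times are illustrative assumptions, not part of the procedure described above.

```python
import math

def median_ranks(n):
    """Approximate median ranks (i - 0.3)/(n + 0.4) for i = 1..n."""
    return [(i - 0.3) / (n + 0.4) for i in range(1, n + 1)]

def weibull_plot_points(data):
    """Coordinates (ln x_(i), ln(-ln(1 - F_i))) used on Weibull probability paper."""
    xs = sorted(data)
    ranks = median_ranks(len(xs))
    return [(math.log(x), math.log(-math.log(1.0 - p))) for x, p in zip(xs, ranks)]

def fit_line(points):
    """Ordinary least-squares slope and intercept through the plotted points."""
    n = len(points)
    mx = sum(u for u, _ in points) / n
    my = sum(v for _, v in points) / n
    sxx = sum((u - mx) ** 2 for u, _ in points)
    sxy = sum((u - mx) * (v - my) for u, v in points)
    slope = sxy / sxx
    return slope, my - slope * mx

# Hypothetical failure times (hours), for illustration only.
times = [41.0, 78.0, 95.0, 120.0, 152.0, 180.0, 211.0, 260.0]
slope, intercept = fit_line(weibull_plot_points(times))
shape_estimate = slope                           # slope of the line estimates the shape
scale_estimate = math.exp(-intercept / slope)    # intercept gives the scale
```

A visibly curved pattern in the points returned by `weibull_plot_points` plays the same role as a curved line on the paper plot.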
4. Modified goodness-of-fit test
Goodness-of-fit tests based on the empirical distribution function (EDF) fall
into two categories: (a) tests where the probability model to be tested is completely specified, in which case a single table may be used for all continuous distributions
for each test statistic, and (b) tests where the parameters are estimated, called
modified goodness-of-fit tests, for which a different table must be used for each family of
distributions. Occasions where the null hypothesis may be completely specified are
rare and, except for one case, will not be pursued further in this paper. If
one foolishly used tables for the completely specified case when the parameters
are estimated, then the actual $\alpha$ error would be much smaller than the specified value, biasing the test so
strongly towards acceptance that it is almost equivalent to accepting $H_0$ without testing. See Lawless [12] for an extensive coverage of goodness-of-fit tests.
4.1. Modified test statistics based on EDF
To use a modified goodness-of-fit test based on the EDF, one has to choose
a family of cdfs of the form $F[(x - c)/\theta]$ where c is a location parameter and $\theta$
is a scale parameter. The estimators of the nuisance parameters must be scale and
location invariant. Usual estimators having this property are maximum likelihood
estimators. When the estimators are inserted in the cdf, we will denote the
cdf evaluated at each order statistic under the null hypothesis, $F_0[(x_{(i)} - \hat c)/\hat\theta]$,
by $\hat F_i$.
Consider the following test statistics:
(i) The Kolmogorov-Smirnov statistic $\hat K$:
$$\hat K = \max(D^+, D^-),$$
$$D^+ = \text{l.u.b.}\,(i/n - \hat F_i), \qquad D^- = \text{l.u.b.}\,(\hat F_i - (i - 1)/n), \qquad 1 \le i \le n.$$
(ii) The Anderson-Darling statistic $\hat A^2$:
$$\hat A^2 = -n - \frac{1}{n} \sum_{i=1}^{n} (2i - 1)\left[\ln \hat F_i + \ln(1 - \hat F_{n+1-i})\right].$$
(iii) The Cramer-von Mises statistic $\hat W^2$:
$$\hat W^2 = \sum_{i=1}^{n} [\hat F_i - (2i - 1)/2n]^2 + (1/12n).$$
(iv) The Kuiper statistic $\hat V$:
$$\hat V = D^+ + D^-.$$
(v) The Watson statistic $\hat U^2$:
$$\hat U^2 = \hat W^2 - n(\bar F - 1/2)^2, \qquad \text{where } \bar F = \sum_{i=1}^{n} \hat F_i / n.$$
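All five statistics depend on the data only through the fitted cdf values at the order statistics, so they can be computed together. The following sketch assumes those values are already available as an increasing list `f` with entries strictly between 0 and 1; the function name and dictionary keys are hypothetical, and the Anderson-Darling line uses the standard computing formula for that statistic.

```python
import math

def edf_statistics(f):
    """EDF statistics from fitted cdf values f[0] <= ... <= f[n-1], each in (0, 1)."""
    n = len(f)
    # One-sided Kolmogorov-Smirnov distances (index i is zero-based here)
    d_plus = max((i + 1) / n - f[i] for i in range(n))
    d_minus = max(f[i] - i / n for i in range(n))
    # Cramer-von Mises
    w2 = sum((f[i] - (2 * i + 1) / (2 * n)) ** 2 for i in range(n)) + 1.0 / (12 * n)
    # Anderson-Darling
    a2 = -n - sum(
        (2 * i + 1) * (math.log(f[i]) + math.log(1.0 - f[n - 1 - i]))
        for i in range(n)
    ) / n
    f_bar = sum(f) / n
    return {
        "KS": max(d_plus, d_minus),
        "Kuiper": d_plus + d_minus,
        "CvM": w2,
        "AD": a2,
        "Watson": w2 - n * (f_bar - 0.5) ** 2,
    }

# Fitted cdf values placed exactly at (2i - 1)/(2n), the closest possible fit for n = 4.
stats = edf_statistics([0.125, 0.375, 0.625, 0.875])
```

Whether a computed value is significant depends on the table that matches both the null family and the estimators used, as discussed next.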
When the parameters are estimated by location and scale estimators, then the
null distribution of the test statistic and hence its percentage points do not depend
on c and $\theta$. However, in using the tables, one must use the same estimators as
were used in the construction of the table. The critical values and the
power of the test are affected by the invariant estimators chosen.
4.2. Normal (and lognormal) distributions
Mardia [11] gave an extensive discussion on tests of univariate and multivariate
normality. Many of the techniques discussed are applicable to other distributions.
In Table 1, he summarized the main univariate test statistics. If the distribution
is a two-parameter lognormal, then transforming the data by taking the
logarithm of each observation gives a sample from the normal distribution with mean $\mu$ and variance $\sigma^2$. If in a test for normality with the transformed
data we accept $H_0$, then we are accepting that the original data were lognormal
with parameters $\mu$ and $\sigma^2$. Lilliefors [13] derived by Monte Carlo simulation
tables for a modified goodness-of-fit test for the normal using the
Kolmogorov-Smirnov (KS) statistic and pointed out the difference in the percentage points for the modified test and the standard test for a completely specified
$H_0$. He tabled critical values for n = 4(1)20(5)30 for significance levels $\alpha$ = 0.01,
0.05(0.05)0.20. He performed a power study for n = 10 and 20 with $\alpha$ = 0.05 and
$\alpha$ = 0.10 using four alternate distributions. In the power study he demonstrated
that the modified KS test had considerably higher power than the $\chi^2$ test. Green
and Hegazy [6] derived tables for the modified goodness-of-fit test for the normal
among other distributions using Cramer-von Mises (CvM) and Anderson-Darling
(AD) statistics for n = 5, 10, 20, 40, 80, 160. Their power study showed improved
power over other known tests.
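The Monte Carlo approach behind such tables can be sketched as follows. This is a rough illustration, not a reproduction of any published table: the number of replications, the seed, the use of the sample mean and the (n - 1)-divisor standard deviation as estimators, and the function names are all assumptions.

```python
import random
from statistics import NormalDist, mean, stdev

def modified_ks_normal(sample):
    """KS statistic with the normal mean and standard deviation estimated from the sample."""
    n = len(sample)
    fitted = NormalDist(mean(sample), stdev(sample))
    f = [fitted.cdf(x) for x in sorted(sample)]
    d_plus = max((i + 1) / n - f[i] for i in range(n))
    d_minus = max(f[i] - i / n for i in range(n))
    return max(d_plus, d_minus)

def monte_carlo_critical_value(n, alpha, reps=5000, seed=0):
    """Empirical (1 - alpha) quantile of the modified KS statistic under normality."""
    rng = random.Random(seed)
    values = sorted(
        modified_ks_normal([rng.gauss(0.0, 1.0) for _ in range(n)])
        for _ in range(reps)
    )
    return values[min(reps - 1, int((1.0 - alpha) * reps))]

# Simulated critical value for n = 20 at the 5% level.
cv_20 = monte_carlo_critical_value(20, 0.05, reps=2000)
```

Because the statistic is location and scale invariant, simulating from the standard normal suffices for every normal null hypothesis. For a lognormal null, the same function can be applied to the logarithms of the observations.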
4.3. Exponential and Rayleigh distributions
Lilliefors [14] derived tables for a modified KS goodness-of-fit test for the
exponential distribution with unknown mean. He tabled critical values for
n = 3(1)20(5)30 and for significance levels 0.01, 0.05(0.05)0.20. He conducted a power study
for n = 10, 20 and 50 with two alternative distributions.
Woodruff et al. [24] and Bush et al. [2] derived tables for modified KS, CvM
and AD tests for the two-parameter negative-exponential (Weibull with shape
parameter 1.0) for n = 5(1)15(5)30 and significance levels as above. A power
study was done for seven alternate distributions. It was shown that the CvM test
had the highest power for most of the alternative distributions studied when the
null hypothesis was the two-parameter negative exponential.
Woodruff et al. [24] and Bush et al. [2] also derived tables for the Rayleigh
distribution (Weibull shape parameter 2.0) for the same sample sizes and significance levels given above.
The papers by Woodruff and Bush also studied a range of other Weibull shape
parameters from 0.5(0.5)4.0. A second power study with seven alternate distributions showed that the AD statistic was the most powerful when the null distribution was a Weibull with shape parameter 3.5. A relationship between critical
values and the inverse of the shape parameter was presented for the range of
shape parameters studied.
4.4. Extreme-value and Weibull distributions
Nancy Mann [16] used the fact that two-parameter Weibull distributions (with
known location parameter) may be transformed, by taking the logarithm of the
observations, to the extreme-value distribution. After the transformation, one has
a family with unknown scale and location parameters. By deriving
the variance-covariance matrix of the standardized order statistics from the extreme-value distribution, she was able to obtain best linear unbiased (BLUE) and best linear invariant
(BLIE) estimators for the unknown parameters and hence estimates of the
parameters of the original Weibull distribution. It should be noted that the
estimators of the parameters of the extreme-value are invariant scale and location
parameter estimators. In a following paper [ 17], she derived a goodness-of-fit test
for the extreme-value distribution of smallest-values. Accepting the smallest
extreme-value distributions as the model for the transformed data is equivalent to
accepting the Weibull distribution as the model for the original data.
The test is not an EDF test, but several papers based on the EDF followed that
used the same principle of transforming the Weibull into the extreme-value distribution.
Littell et al. [15] derived, by Monte Carlo techniques, tables of critical values
for the modified KS, CvM and AD statistics for the extreme-value distribution for
n = 10(5)40 and $\alpha$ = 0.01, 0.05(0.05)0.20. They used maximum likelihood estimators
for the parameters. A power study compared the three new goodness-of-fit tests
with several earlier ones. In a later paper, Chandra et al. [3] derived tables of
critical values for modified goodness-of-fit statistics for the KS and for the Kuiper
tests for testing the fit to the extreme-value distribution with unknown parameters.
The unknown parameters were estimated by the method of maximum likelihood.
4.5. Gamma distribution
Woodruff et al. [25] derived tables for the percentage points for the modified
KS, AD and CvM statistics for goodness-of-fit tests for the gamma distribution
with unknown scale and location parameters and known shape parameter for
n = 5(5)30 and/~ = 0.1, 0.5(0.5)0.20.
B. W. Woodruff and A. H. Moore
A power study indicated that for larger sample sizes, the CvM was the most
powerful of the three tests. The equation C = a o + ~l(1/fl 2) describes the form of
the relationship between the critical values C and the shape parameter fl derived
for each of the statistics studied. Again ML estimators were used.
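A relationship of this form can be fitted by ordinary least squares on x = 1/beta^2. The sketch below is only an illustration. The function name is hypothetical, and the coefficients and shape values in the usage are made-up numbers used to generate noise-free synthetic points, not values from [25].

```python
def fit_inverse_square(shapes, critical_values):
    """Least-squares estimates (a0, a1) in C = a0 + a1 * (1 / shape**2)."""
    xs = [1.0 / (b * b) for b in shapes]
    n = len(xs)
    x_bar = sum(xs) / n
    c_bar = sum(critical_values) / n
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxc = sum((x - x_bar) * (c - c_bar) for x, c in zip(xs, critical_values))
    a1 = sxc / sxx
    a0 = c_bar - a1 * x_bar
    return a0, a1

if __name__ == "__main__":
    # Synthetic values generated from made-up coefficients (assumption, not data from [25]).
    made_up_a0, made_up_a1 = 0.8, 0.3
    shapes = [0.5, 1.0, 1.5, 2.0, 3.0, 4.0]
    crits = [made_up_a0 + made_up_a1 / (b * b) for b in shapes]
    print(fit_inverse_square(shapes, crits))
```

With tabulated critical values in place of the synthetic ones, the fitted line gives interpolated critical values for shape parameters that are not tabled.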
4.6. Logistic distribution
Woodruff et al. [26] derived tables of critical values for the modified KS, AD
and CvM goodness-of-fit statistics for the logistic distribution with unknown
scale and location parameters. ML estimators were used to obtain estimates of
the unknown parameters. The statistics were tabulated for sample sizes n = 5(5)30
and significance levels $\alpha$ = 0.01, 0.05(0.05)0.20. A power study indicated quite good
power against uniform and exponential alternatives. The modified KS test had
lower power than the other two tests studied.
4.7. Pareto distribution
Porter and Moore [20] derived tables of critical values for the modified KS,
AD, and CvM goodness-of-fit statistics for the Pareto distribution with unknown
location and scale parameters and known shape parameters. Best linear unbiased
estimators were used to obtain the parameter estimates. The critical values were
tabled for sample sizes n = 5(5)30, significance levels $\alpha$ = 0.01, 0.05(0.05)0.20 and
Pareto shape parameters 0.5(0.5)4.0. The powers were investigated for eight alternative distributions. A functional relation between the critical values of test
statistics and the Pareto shape parameters was derived.
4.8. Laplace distribution
Yen and Moore [28] derived tables of critical values for the modified AD and
CvM goodness-of-fit statistics for the Laplace distribution. The critical values
were tabulated for sample sizes n = 5(5)50 and significance levels $\alpha$ = 0.01,
0.05(0.05)0.20. The AD test generally yielded higher power than the CvM test.
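Critical values of this kind are obtained by simulation. The sketch below shows how a Monte Carlo critical value for a modified CvM statistic could be computed for the Laplace case. It is not the procedure of [28]: the use of ML estimates (sample median and mean absolute deviation about it), the replication count and all function names are assumptions made for illustration.

```python
import math
import random

def laplace_cdf(x, loc, scale):
    """CDF of the Laplace (double exponential) distribution."""
    z = (x - loc) / scale
    return 0.5 * math.exp(z) if z < 0 else 1.0 - 0.5 * math.exp(-z)

def laplace_mle(data):
    """ML estimates: sample median (location), mean absolute deviation about it (scale)."""
    xs = sorted(data)
    n = len(xs)
    mid = n // 2
    loc = xs[mid] if n % 2 else 0.5 * (xs[mid - 1] + xs[mid])
    scale = sum(abs(x - loc) for x in xs) / n
    return loc, scale

def modified_cvm(data):
    """Cramer-von Mises W^2 with the Laplace parameters replaced by their estimates."""
    loc, scale = laplace_mle(data)
    xs = sorted(data)
    n = len(xs)
    w2 = 1.0 / (12 * n)
    for i, x in enumerate(xs, start=1):
        w2 += (laplace_cdf(x, loc, scale) - (2 * i - 1) / (2 * n)) ** 2
    return w2

def mc_critical_value(n, alpha, reps=2000, seed=0):
    """Simulated upper-alpha critical value for samples of size n."""
    rng = random.Random(seed)
    stats = []
    for _ in range(reps):
        # A standard Laplace variate is the difference of two unit exponentials.
        sample = [rng.expovariate(1.0) - rng.expovariate(1.0) for _ in range(n)]
        stats.append(modified_cvm(sample))
    stats.sort()
    return stats[min(reps - 1, math.ceil((1 - alpha) * reps) - 1)]

if __name__ == "__main__":
    print(mc_critical_value(n=20, alpha=0.05))
```

Simulating from the standard Laplace distribution is sufficient because the statistic is unchanged by shifts and positive rescalings of the data when the estimates are location and scale equivariant.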
5. Modifications of the EDF
One way to improve the power of a goodness-of-fit test is to improve the
non-parametric estimate of the distribution function. Harter, Khamis and Lamb
[7] modified the definition of the cdf at the ith order statistic to obtain a
(modified) KS test statistic for the case where the probability model to be tested
is completely specified. They have shown that the test obtained in this fashion is
substantially more powerful than the usual KS tests for small to moderate sample
sizes. Harter [9] also developed asymptotic formulas for the critical values of the
above test statistic.
6. New modified goodness-of-fit tests
New goodness-of-fit tests for symmetric alternatives were obtained by Moore
et al. [18], Woodruff et al. [27] and Yen and Moore [29] for the normal, uniform, and Laplace distributions, respectively. A reflection technique in which the
data points are reflected about an invariant estimate of the mean is used to double
the sample size. The new sample is used to obtain a better estimate of the
distribution function to be used in the goodness-of-fit statistics. New tables were
derived for the KS, AD and CvM statistics. The new goodness-of-fit statistics are
still invariant with respect to a change in scale or location parameters. Extensive
power studies showed that the new test yielded considerably higher power for
sample sizes greater than or equal to 25 for all symmetric or nearly symmetric
alternative distributions. For non-symmetric alternative distributions, the new test
showed a decrease in power, which was expected since Schuster [21] showed that
the reflection technique gave a poorer estimate of the distribution function in this
case.
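The reflection step can be sketched as follows. This is an illustration, not the code of [18], [27] or [29]: the sample mean is assumed as the location estimate, and the function names are hypothetical.

```python
import bisect

def reflected_sample(data, center=None):
    """Return the sorted 2n-point sample of the data and their mirror images about center."""
    if center is None:
        center = sum(data) / len(data)  # sample mean as location estimate (assumption)
    points = sorted(list(data) + [2.0 * center - x for x in data])
    return points, center

def symmetrized_edf(data, center=None):
    """Empirical distribution function of the reflected (doubled) sample."""
    points, center = reflected_sample(data, center)
    m = len(points)

    def edf(x):
        # Fraction of reflected-sample points that are <= x.
        return bisect.bisect_right(points, x) / m

    return edf, center

if __name__ == "__main__":
    sample = [1.2, 0.4, 2.9, 1.7, 0.8]
    edf, c = symmetrized_edf(sample)
    print(c, edf(c), edf(c - 0.5) + edf(c + 0.5))
```

The doubled sample is symmetric about the chosen center by construction, so the resulting estimate of the distribution function is symmetric as well. That helps when the true distribution is symmetric and hurts when it is not.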
7. Likelihood ratio tests
When a goodness-of-fit test fails to reject two families of distributions, one can
use a likelihood ratio test to discriminate between them. Bain [1] gives
extensive coverage of likelihood ratio tests. He lists the test statistics to be used
to discriminate between normal vs. two-parameter exponential, normal vs. double
exponential, normal vs. Cauchy, Weibull vs. lognormal, and extreme-value vs.
normal. For large samples, the asymptotic likelihood ratio test could be used. For
small samples from other distributions Monte Carlo techniques can be used to
obtain the percentage points of the sample statistic for the likelihood ratio test.
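As an illustration of the Monte Carlo approach for small samples, the sketch below simulates percentage points of a log likelihood ratio for normal versus double exponential. It is a minimal sketch under stated assumptions, not the form of the statistic listed by Bain [1]: both likelihoods are maximized with their closed-form ML estimates, the normal model is taken as the null, and the function names are hypothetical.

```python
import math
import random

def log_lr_normal_vs_laplace(data):
    """Maximized log likelihood under the normal minus that under the double exponential."""
    n = len(data)
    mean = sum(data) / n
    sigma = math.sqrt(sum((x - mean) ** 2 for x in data) / n)  # normal ML scale
    xs = sorted(data)
    mid = n // 2
    median = xs[mid] if n % 2 else 0.5 * (xs[mid - 1] + xs[mid])
    b = sum(abs(x - median) for x in xs) / n  # double exponential ML scale
    loglik_normal = -n * (math.log(sigma) + 0.5 * math.log(2.0 * math.pi) + 0.5)
    loglik_laplace = -n * (math.log(2.0 * b) + 1.0)
    return loglik_normal - loglik_laplace

def mc_percentage_point(n, level, reps=2000, seed=0):
    """Simulated lower `level` point of the statistic when the data are normal."""
    rng = random.Random(seed)
    stats = sorted(
        log_lr_normal_vs_laplace([rng.gauss(0.0, 1.0) for _ in range(n)])
        for _ in range(reps)
    )
    return stats[max(0, int(level * reps) - 1)]

if __name__ == "__main__":
    # Small values of the statistic favor the double exponential model.
    print(mc_percentage_point(n=20, level=0.05))
```

The statistic depends on the data only through the ratio of the two scale estimates, so it is location and scale invariant and the standard normal can be used for the simulation.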
[1] Bain, L. J. (1978). Statistical Analysis of Reliability and Life-Testing Models (Theory and Methods).
Marcel Dekker, New York and Basel.
[2] Bush, J. G., Woodruff, B. W., Moore, A. H. and Dunne, E. J. (1983). Modified Cramer-von
Mises and Anderson-Darling tests for Weibull distribution with unknown location and scale
parameters. Commun. Statist.-- Theor. Meth. A 12, 2463-2476.
[3] Chandra, M., Singpurwalla, N. and Stephens, M. A. (1981). Kolmogorov statistics for tests of
fit for the extreme-value and Weibull distributions. J. Amer. Statist. Assoc. 76, 729-731.
[4] Devroye, L. and Györfi, L. (1985). Non-Parametric Density Estimation: The L1 View. Wiley, New York.
[5] Durbin, J. (1975). Kolmogorov-Smirnov tests when parameters are estimated with applications to
tests of exponentiality and tests on spacings. Biometrika 62, 5-22.
[6] Green, J. R. and Hegazy, Y. A. S. (1976). Powerful modified goodness-of-fit tests. J. Amer.
Statist. Assoc. 71, 204-209.
[7] Harter, H. L., Khamis, H. T. and Lamb, R. E. (1984). Modified Kolmogorov-Smirnov tests
of goodness-of-fit. Commun. Statist.--Simula. Computa. 13, 293-323.
[8] Harter, H. L. (1984). Another look at plotting positions. Commun. Statist.--Theor. Method. 13,
[9] Harter, H. L. (1984). Asymptotic formulas for critical values of a modified Kolmogorov test
statistic. Communications in Statistics B 13, 719-721.
[10] Harter, H. L. and Wiegand, R. P. (1985). A Monte Carlo study of plotting positions. Commun.
Statist.--Simula. Computa. 14, 317-343.
[11] Krishnaiah, P. R. (1980). Handbook of Statistics I, North-Holland, Amsterdam.
[12] Lawless, J. F. (1982). Statistical Models and Methods for Lifetime Data. Wiley, New York.
[13] Lilliefors, H. W. (1967). On the Kolmogorov test for normality with mean and variance
unknown. J. Am. Statist. Assoc. 62, 143-147.
[14] Lilliefors, H. W. (1969). On the Kolmogorov test for the exponential distribution with mean
unknown. J. Am. Statist. Assoc. 64, 387-389.
[15] Littell, R. D., McClave, J. T. and Offen, W. W. (1979). Goodness-of-fit tests for the two-parameter Weibull distribution. Commun. Statist.--Simula. Computa. B 8, 257-269.
[16] Mann, N. R. (1968). Point and interval estimation procedures for the two-parameter Weibull
and extreme-value distributions. Technometrics 10, 231-256.
[17] Mann, N. R., Scheuer, E. M. and Fertig, K. W. (1973). A new goodness-of-fit test for the two
parameter Weibull or extreme-value distribution with unknown parameters. Communications in
Statistics 2, 383-400.
[18] Moore, A. H., Ream, T. J. and Woodruff, B. W. A new goodness-of-fit test for normality with
mean and variance unknown. (Submitted for publication.)
[19] Nelson, W. (1982). Applied Life Data Analysis. Wiley, New York.
[20] Porter, J. E., Moore, A. H. and Coleman, J. W. Modified Kolmogorov, Anderson-Darling and
Cramer-von Mises tests for the Pareto distribution with unknown location and scale parameters. (Submitted for publication.)
[21] Schuster, E. F. (1975). Estimating the distribution function of a symmetric distribution. Biometrika 62, 631-635.
[22] Stephens, M. A. (1977). Goodness-of-fit for the extreme-value distribution. Biometrika 64,
[23] Tapia, R. A. and Thompson, J. R. (1978). Nonparametric Probability Density Estimation. The
Johns Hopkins University Press, Baltimore and London.
[24] Woodruff, B. W., Moore, A. H., Dunne, E. J. and Cortes, R. (1983). A modified
Kolmogorov-Smirnov test for Weibull distributions with unknown location and scale parameters. IEEE Transactions on Reliability 32, 209-213.
[25] Woodruff, B. W., Viviano, P. J., Moore, A. H. and Dunne, E. J. (1984). Modified goodness-of-fit
tests for gamma distributions with unknown location and scale parameters. IEEE Transactions
on Reliability 33, 241-245.
[26] Woodruff, B. W., Moore, A. H., Yoder, J. D. and Dunne, E. J. (1986). Modified goodness-of-fit
tests for logistic distribution with unknown location and scale parameters. Commun.
Statist.--Simula. Computa. 15(1), 77-83.
[27] Woodruff, B. W., Woodbury, L. B. and Moore, A. H. A new goodness-of-fit test for the uniform
with unspecified parameters. (Submitted for publication.)
[28] Yen, V. C. and Moore, A. H. Modified goodness-of-fit tests for the Laplace distribution.
(Submitted for publication.)
[29] Yen, V. C. and Moore, A. H. New modified goodness-of-fit tests for the Laplace distribution.
(Submitted for publication.)
P. R. Krishnaiah and C. R. Rao, eds., Handbook of Statistics, Vol. 7
© Elsevier Science Publishers B.V. (1988) 121-129
Multivariate Nonparametric Classes in Reliability
Henry W. Block* and Thomas H. Savits*
1. Introduction
This paper is a sequel to the survey paper of Hollander and Proschan (1984)
who examine univariate nonparametric classes and methods in reliability. In this
paper we will examine multivariate nonparametric classes and methods in reliability.
Hollander and Proschan (1984) describe the various univariate nonparametric
classes in reliability. The classes of adverse aging described include the IFR,
IFRA, NBU, NBUE and DMRL classes. The dual classes of beneficial aging are
also covered. Several new univariate classes have been introduced since that time.
One that we will briefly mention is the HNBUE class, since we are aware of
several multivariate generalizations of this class.
The univariate classes in reliability are important in applications concerning
systems where the components can be assumed to be independent. In this case
the components are often assumed to experience wearout or beneficial aging of
a similar type. For example, it is often reasonable to assume that components
have increasing failure rate (IFR). In making this IFR assumption it is implicit
that each component separately experiences wear and no interactions among
components can occur. However in many realistic situations, adverse wear on one
component will promulgate adverse wear on other components. From another
point of view a common environment will cause components to behave similarly.
In either situation, it is clear that an assumption of independence on the components
would not be valid. Consequently multivariate concepts of adverse or beneficial
aging are required.
Multivariate nonparametric classes have been proposed as early as 1970. For
background and references as well as some discussion of univariate classes with
multivariate generalizations in mind see Block and Savits (1981). In the present
paper we shall only describe a few fundamental developments prior to 1981 and
focus on developments since then. The coverage will not be exhaustive but will
emphasize the topics which we feel are most important.
* Supported by Grant No. AFOSR-84-0113 and ONR Contract N00014-84-K-0084.
Section 2 deals with multivariate nonparametric classes. In Section 2.1 multivariate IFRA is discussed with emphasis on the Block and Savits (1980) class.
Multivariate NBU is covered in Section 2.2 and multivariate NBUE classes are
mentioned in Section 2.3. New developments in multivariate IFR are considered
in Section 2.4 and in Section 2.5 the topics of multivariate DMRL and HNBUE
are touched on.
Familiarity with the univariate classes is assumed. The basic reference for the
IFR, IFRA, NBU and NBUE classes is Barlow and Proschan (1981). See also
Block and Savits (1981). For information on the DMRL class see Hollander and
Proschan (1984). The HNBUE class is relatively recent and the best references
are the original articles. See for example, Klefsjö (1982) and the references
contained there.
2. Multivariate nonparametric classes
Many multivariate versions of the univariate classes were proposed using
generalizations of various failure rate functions. These multivariate classes were
extensively discussed in Block and Savits (1981). Other classes were proposed by
attempting to imitate univariate definitions in a multivariate setting. (See also
Block and Savits, 1981.) One of the most important of these extensions was due
to Block and Savits (1980) who generalized the IFRA class. This multivariate
class was proposed to parallel the developments of the univariate case where the
IFRA class possessed many important closure properties. As in the univariate
case the following multivariate class of IFRA, designated the MIFRA class,
satisfies important closure properties. First, as in the univariate case, monotone
systems with MIFRA lifetimes have MIFRA lifetimes and independent sums of
MIFRA lifetimes are MIFRA. From the multivariate point of view, subfamilies
of MIFRA are MIFRA, conjunctions of independent MIFRA are MIFRA, scaled
MIFRA lifetimes are MIFRA, and various other properties are satisfied. We
discuss this extension first since several other classes have been defined using
similar techniques.
2.1. Multivariate IFRA
Using a characterization of the univariate IFRA class in Block and Savits
(1976) the following definition can be made.
DEFINITION 2.1.1. Let $T = (T_1, \ldots, T_n)$ be a nonnegative random lifetime. The
random vector $T$ is said to be MIFRA if
$$E^{\alpha}[h(T)] \le E[h^{\alpha}(T/\alpha)]$$
for all continuous nonnegative nondecreasing functions $h$ and all $0 < \alpha \le 1$.
This definition as mentioned above implies all of the properties one would
desire for a multivariate analog of the univariate IFRA class. Part of the reason
for this is that the definition is equivalent to many other properties which are both
theoretically and intuitively appealing. The statement and proofs of these results
are given below; the form in which these are presented is influenced by the paper
of Marshall and Shaked (1982) who defined a similar MNBU class.
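The inequality in Definition 2.1.1 can be illustrated numerically for one choice of h. The sketch below is an assumption-laden example, not part of the original development. It takes T to be a vector of independent exponential lifetimes (which are IFRA) and h(t) = min(t_1, ..., t_n), for which both sides have closed forms, and it also estimates both sides from a simulated sample. The function names are hypothetical.

```python
import math
import random

def mifra_sides(sample, h, alpha):
    """Sample estimates of (E^alpha[h(T)], E[h^alpha(T / alpha)]) from a list of vectors."""
    n = len(sample)
    lhs = (sum(h(t) for t in sample) / n) ** alpha
    rhs = sum(h([ti / alpha for ti in t]) ** alpha for t in sample) / n
    return lhs, rhs

def exact_sides_min_exponential(rates, alpha):
    """Both sides for h = min and independent exponentials with the given rates."""
    lam = sum(rates)  # min of independent exponentials is exponential with rate lam
    lhs = (1.0 / lam) ** alpha
    rhs = math.gamma(1.0 + alpha) / (alpha * lam) ** alpha
    return lhs, rhs

if __name__ == "__main__":
    rng = random.Random(0)
    sample = [(rng.expovariate(1.0), rng.expovariate(1.0)) for _ in range(20000)]
    print(mifra_sides(sample, min, 0.5))
    print(exact_sides_min_exponential((1.0, 1.0), 0.5))
```

For this choice the inequality reduces to $\alpha^{\alpha} \le \Gamma(1 + \alpha)$, with equality at $\alpha = 1$. A check of one function h is of course only an illustration, since the definition quantifies over all continuous nonnegative nondecreasing h.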
NOTES. (1) Obviously in Definition 2.1.1 we need only consider $h$ defined on
$\mathbb{R}^n_+ = \{x \mid x \ge 0\}$. Hence all of the functions and sets mentioned below are
assumed to be Borel measurable in $\mathbb{R}^n_+$.
(2) We say a function $g$ is homogeneous (subhomogeneous) on $\mathbb{R}^n_+$ if
$$\alpha g(t) = (\le)\; g(\alpha t)$$
for all $0 \le \alpha \le 1$, $0 \le t$.
(3) $A$ is an upper set if $x \in A$ and $x \le y$ implies $y \in A$.
THEOREM 2.1.2. The following conditions are all equivalent to T being MIFRA.
(i) $P^{\alpha}\{T \in A\} \le P\{T \in \alpha A\}$ for all open upper sets in $\mathbb{R}^n_+$, all $0 < \alpha \le 1$.
(ii) $P^{\alpha}\{T \in A\} \le P\{T \in \alpha A\}$ for all upper sets in $\mathbb{R}^n_+$, all $0 < \alpha \le 1$
(i.e., $E^{\alpha}(\varphi(T)) \le E(\varphi^{\alpha}(T/\alpha))$ for all nonnegative, binary, nondecreasing $\varphi$ on $\mathbb{R}^n_+$).
(iii) $E^{\alpha}(h(T)) \le E(h^{\alpha}(T/\alpha))$ for all nonnegative, nondecreasing $h$ on $\mathbb{R}^n_+$, all
$0 < \alpha \le 1$.
(iv) For all nonnegative, nondecreasing, subhomogeneous $h$ on $\mathbb{R}^n_+$, $h(T)$ is IFRA.
(v) For all nonnegative, nondecreasing, homogeneous $h$ on $\mathbb{R}^n_+$, $h(T)$ is IFRA.
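PROOF. (i) $\Rightarrow$ (ii). By Theorem 3.3 of Esary, Proschan and Walkup (1967), for
an upper set $A$ and any $\varepsilon > 0$ there is an open upper set $A_\varepsilon$ such that $A \subset A_\varepsilon$ and
$P\{T \in \alpha A_\varepsilon\} \le P\{T \in \alpha A\} + \varepsilon$. Thus
$$P^{\alpha}\{T \in A\} \le P^{\alpha}\{T \in A_\varepsilon\} \le P\{T \in \alpha A_\varepsilon\} \le P\{T \in \alpha A\} + \varepsilon.$$
(ii) $\Rightarrow$ (iii). Let $h_k$, $k = 1, 2, \ldots$, be an increasing sequence of increasing step
functions such that $\lim_{k \to \infty} h_k = h$. Specifically take
$$h_k(t) = \begin{cases} \dfrac{i-1}{2^k} & \text{if } \dfrac{i-1}{2^k} \le h(t) < \dfrac{i}{2^k},\quad i = 1, 2, \ldots, k2^k, \\ k & \text{if } h(t) \ge k, \end{cases}$$
that is,
$$h_k(t) = \sum_{i=1}^{k2^k} \frac{1}{2^k}\, I_{A_{i,k}}(t),$$
where $I_{A_{i,k}}$ is the indicator function of the upper set $A_{i,k} = \{t \mid h(t) \ge i/2^k\}$. Thus
we need only prove the result for functions of the form
$$h(t) = \sum_{i=1}^{m} a_i I_{A_i}(t), \quad a_i \ge 0,$$
where $A_1, \ldots, A_m$ are upper sets, since the remainder follows by the monotone
convergence theorem. We have
$$E^{\alpha}\Big(\sum_{i=1}^{m} a_i I_{A_i}(T)\Big) = \Big[\sum_{i=1}^{m} a_i P\{T \in A_i\}\Big]^{\alpha}$$
$$\le \Big[\sum_{i=1}^{m} a_i P^{1/\alpha}\{T \in \alpha A_i\}\Big]^{\alpha}$$
$$= \Big[\sum_{i=1}^{m} \Big\{\int a_i^{\alpha} I_{A_i}(t/\alpha)\, dF(t)\Big\}^{1/\alpha}\Big]^{\alpha}$$
$$\le \int \Big(\sum_{i=1}^{m} a_i I_{A_i}(t/\alpha)\Big)^{\alpha} dF(t)$$
$$= E\Big[\Big(\sum_{i=1}^{m} a_i I_{A_i}(T/\alpha)\Big)^{\alpha}\Big],$$
where the last inequality is due to Minkowski.
(iii) $\Rightarrow$ Def. Obvious.
Def. $\Rightarrow$ (i). From Esary, Proschan and Walkup (1967) for any open upper
set $A$ there exist nonnegative, nondecreasing, continuous functions $h_k$ such that
$h_k \uparrow I_A$. Then apply the monotone convergence theorem.
(iii) $\Rightarrow$ (iv). Let $h$ be nonnegative, nondecreasing and subhomogeneous. Then
$$P^{\alpha}\{h(T) > t\} = E^{\alpha}\big(I_{(t,\infty)}(h(T))\big) \le E\big(I^{\alpha}_{(t,\infty)}(h(T/\alpha))\big) \le E\big(I_{(t,\infty)}(\alpha^{-1} h(T))\big) = P\{h(T) > \alpha t\},$$
where the first inequality follows from (iii) and the second by the subhomogeneity.
(iv) $\Rightarrow$ (v). Obvious.
(v) $\Rightarrow$ (i). Let $A$ be an open upper set and define
$$h(t) = \sup\{\theta > 0 : \theta^{-1} t \in A\},$$
with $h(t) = 0$ when no such $\theta$ exists.
Then $h$ is nonnegative, nondecreasing and homogeneous. Thus
$$P^{\alpha}\{T \in A\} = P^{\alpha}\{h(T) > 1\} \le P\{h(T) > \alpha\} = P\{T \in \alpha A\}.$$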
NOTE 2.1.3. The following two alternate conditions could also have been added
to the above list of equivalent conditions (provided $\bar{F}(0) = 1$).
(vi) $P^{\alpha}\{T \in A\} \le P\{T \in \alpha A\}$ for each set of the form $A = \bigcup_{i=1}^{m} A_i$ where
$A_i = \{x \mid x > x_i\}$, $x_i \in \mathbb{R}^n_+$, and for all $0 < \alpha \le 1$.
(vii) For each $k = 1, 2, \ldots$, for each $a_{ij}$, $i = 1, \ldots, k$, $j = 1, \ldots, n$,
$0 \le a_{ij} \le \infty$, and for each coherent life function $\tau$ of order $kn$, $\tau(a_{11}T_1,
a_{12}T_2, \ldots, a_{1n}T_n, a_{21}T_1, \ldots, a_{kn}T_n)$ is IFRA. (See Block and Savits (1980) for
a definition of coherent life function and for some details of the proof.)
In conjunction with the preceding result the following lemma makes it easy to
demonstrate that a host of different lifetimes are MIFRA.
LEMMA 2.1.4. Let $T$ be MIFRA and $\psi_1, \ldots, \psi_m$ be any continuous, subhomogeneous functions of $n$ variables. Then if $S_i = \psi_i(T)$ for $i = 1, \ldots, m$,
$S = (S_1, \ldots, S_m)$ is MIFRA.
PROOF. This follows easily by considering a nonnegative, increasing, continuous
function $h$ of $m$ variables and applying the MIFRA property of $T$ and the
monotonicity of the $\psi_i$.
COROLLARY 2.1.5. Let $\tau_1, \ldots, \tau_m$ be coherent life functions and $T$ be MIFRA.
Then $(\tau_1(T), \ldots, \tau_m(T))$ is MIFRA.
Since coherent life functions are homogeneous this follows easily.
EXAMPLE 2.1.6. Let $X_1, \ldots, X_n$ be independent IFRA lifetimes and
$\emptyset \ne S_i \subset \{1, 2, \ldots, n\}$, $i = 1, \ldots, m$. Since it is not hard to show that independent IFRA lifetimes are MIFRA, it follows that $T_i = \min_{j \in S_i} X_j$, $i = 1, \ldots, m$,
are MIFRA. Since many different types of multivariate IFRA can be generated
in the above way, the example shows that any of these are MIFRA. See Esary
and Marshall (1979) where various types of multivariate IFRA of the type in this
example are defined. See Block and Savits (1982) for relationships among these
various definitions.
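A small instance of the construction in Example 2.1.6 can be checked against the upper-set form of the MIFRA condition. The sketch below is illustrative only. It assumes three independent exponential lifetimes $X_1, X_2, X_3$, index sets $S_1 = \{1, 3\}$ and $S_2 = \{2, 3\}$, and upper sets of the form $A = \{x : x_1 > a \text{ or } x_2 > b\}$, for which the probabilities are available in closed form. The rates and function names are hypothetical.

```python
import math

def joint_survival(a, b, rates):
    """P{T1 > a, T2 > b} for T1 = min(X1, X3), T2 = min(X2, X3), a, b >= 0."""
    l1, l2, l3 = rates
    return math.exp(-l1 * a - l2 * b - l3 * max(a, b))

def prob_union(a, b, rates):
    """P{T1 > a or T2 > b}, the probability of the upper set A, by inclusion-exclusion."""
    return (joint_survival(a, 0.0, rates)
            + joint_survival(0.0, b, rates)
            - joint_survival(a, b, rates))

def mifra_upper_set_gap(a, b, alpha, rates):
    """P{T in alpha*A} - P^alpha{T in A}; nonnegative when the MIFRA inequality holds."""
    return prob_union(alpha * a, alpha * b, rates) - prob_union(a, b, rates) ** alpha

if __name__ == "__main__":
    rates = (1.0, 1.0, 1.0)  # hypothetical failure rates
    for alpha in (0.25, 0.5, 0.75, 1.0):
        print(alpha, mifra_upper_set_gap(1.0, 2.0, alpha, rates))
```

The shared component $X_3$ makes $T_1$ and $T_2$ dependent, which is the situation the multivariate classes are meant to cover. The gap is zero at $\alpha = 1$, as it must be.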
Multivariate shock models with multivariate IFRA properties have been treated
in Marshall and Shaked (1979) and in Savits and Shaked (1981).
2.2. Multivariate NBU
As with all of the multivariate classes, the need for each of them is evident
because of the usefulness of the corresponding univariate class. The only difference is that in the multivariate case, the independence of the components is
lacking. In particular the concept of NBU is fundamental in discussing maintenance policies in a single component system. For a multicomponent system,
where components are dependent, marginally components satisfy the univariate
NBU property under various maintenance protocols. However, a joint concept
describing the interaction of all the components is necessary. Hence multivariate
NBU concepts are required.
Most of the earliest definitions of multivariate NBU (see for example Buchanan
and Singpurwalla, 1977) consisted of various generalizations of the defining
property of the univariate NBU class. For a survey of these see definitions (1)-(5)
of Section 5 of Block and Savits (1981). For shock models which satisfy these
definitions see Marshall and Shaked (1979), Griffith (1982), Ebrahimi and Ghosh
(1981) and Klefsjö (1982). Other definitions involving generalizations of properties
of univariate NBU distributions are given by (7)-(9) of the same reference. These
are similar to definitions used by Esary and Marshall (1979) to define multivariate
IFRA distributions. Definitions (7) and (8) of the Block and Savits (1981)
reference represent a certain type of definition and bear repeating here. The vector
$T$ is said to be multivariate NBU if:
(2.2.1) $\tau(T_1, \ldots, T_n)$ is NBU for all $\tau$ in a certain class of life functions;
(2.2.2) There exist independent NBU $X_1, \ldots, X_k$ and life functions $\tau_i$,
$i = 1, \ldots, n$, in a certain class such that $T_i = \tau_i(X)$, $i = 1, \ldots, n$.
El-Neweihi, Proschan and Sethuraman (1983) have considered a special case of
(2.2.2) where the $\tau_i$ are minimums and have related this case to some other
definitions including the special case of (2.2.1) where $\tau$ is any minimum.
As shown in Theorem 2.1.2, definitions involving increasing functions can be
given equivalently in terms of upper (or open upper) sets. Two multivariate NBU
definitions which were given in terms of upper sets were those of El-Neweihi
(1981) and Marshall and Shaked (1982). These are respectively:
(2.2.3) For every upper set $A \subset \mathbb{R}^n_+$ and for every $0 < \alpha < 1$,
$$P\{T \in A\} \le P\{\min(T'/\alpha,\, T''/(1 - \alpha)) \in A\},$$
where $T$, $T'$, $T''$ are independent and have the same distribution.
(2.2.4) For every upper set $A \subset \mathbb{R}^n_+$ and for every $\alpha > 0$, $\beta > 0$,
$$P\{T \in (\alpha + \beta)A\} \le P\{T \in \alpha A\}\, P\{T \in \beta A\}.$$
Relationships among these definitions are given in El-Neweihi (1981). A more
restrictive definition than either of the above has been given in Berg and Kesten
(1984):
(2.2.5) For every upper $A, B \subset \mathbb{R}^n$,
$$P(T \in A + B) \le P(T \in A)\, P(T \in B).$$
This definition was shown to be useful in percolation theory as well as reliability
theory.
A general framework involving generalizations of the concept (2.2.1), called
taking the C-closure of $\mathcal{F}$, and of the concept (2.2.2), called C-generating from
$\mathcal{F}$ (where $\mathcal{F}$ is the class of univariate NBU lifetimes in (2.2.1) and (2.2.2)), has been
given by Marshall and Shaked (1984). Many of the previous NBU definitions are
organized within this framework. A similar remark applies when the classes $\mathcal{F}$ are
exponential, IFR, IFRA and NBUE. See Marshall and Shaked (1984).
2.3. Multivariate NBUE
Along with the multivariate NBU versions of Buchanan and Singpurwalla
(1977) are integrated versions of these definitions. These authors give three
versions of multivariate NBUE. The relations among these and closure properties
are discussed in Ebrahimi and Ghosh (1981). Furthermore the latter authors
relate these multivariate NBUE definitions to four definitions of multivariate
NBU (i.e. definitions (1)-(4) of Section 5 of Block and Savits (1981)).
Some other multivariate NBUE classes are mentioned by Block and Savits
(1981) and Marshall and Shaked (1984). One extension of a univariate characterization of the NBUE class mentioned in Block and Savits (1978) has been
proposed by Savits (1983).
2.4. Multivariate IFR
Perhaps the most important univariate concept in reliability is that of increasing
failure rate. One reason for this is that in a very simple and compelling way this
idea describes the wearout of a component. Many engineers, biologists and
actuaries find this description fundamental. The monotonicity of the failure rate
function is simple and intuitive and occurs in many physical situations. This also
is crucial in the multicomponent case where the components are dependent.
Several authors have attempted to describe the action of the failure rates
increasing for n components simultaneously. These cases were discussed in Block
and Savits (1981) and in the references contained therein.
A recent definition of multivariate IFR was given by Savits (1985) and is in
the spirit of the classes defined by Block and Savits (1980) and Marshall and
Shaked (1982). For shock models involving multivariate IFR concepts see Ghosh
and Ebrahimi (1981).
It is shown in Savits (1985) that a univariate lifetime $T$ is IFR if and only if
$E[h(x, T)]$ is log concave in $x$ for all functions $h(x, t)$ which are log concave in
$(x, t)$ and are nondecreasing in $t$ for each fixed $x \ge 0$. This leads to the following
multivariate definition.
DEFINITION 2.4.1. Let $T$ be a nonnegative random vector. Then $T$ has an
MIFR distribution if $E[h(x, T)]$ is log concave in $x$ for all functions $h(x, t)$ which
are log concave in $(x, t)$ and nondecreasing in $t \ge 0$ for each fixed $x \ge 0$.
This class enjoys many closure properties. Among these are that all marginals
are MIFR, conjunctions of independent MIFR are MIFR, convolutions of MIFR
are MIFR, scaled MIFR are MIFR, nonnegative nondecreasing concave
functions of MIFR are MIFR, and weak convergence preserves MIFR. See
Savits (1985) for details. From these results it follows that the multivariate
exponential distribution of Marshall and Olkin (1967) is MIFR, as are all distributions with log concave densities. Since the multivariate folded normal has a log
concave density, it also is MIFR.
The technique used in Definition 2.4.1 for the MIFR class extends to other
multivariate classes. In particular, if we replace log concave with log subhomogeneous, we get the same multivariate IFRA class as in Definition 2.1.1; if we
replace log concave with log subadditive, we get a new multivariate NBU class
which is between that of (2.2.3) and (2.2.4). For more details see Savits (1983, 1985).
2.5. Multivariate DMRL and HNBUE
Few definitions of multivariate DMRL have been discussed in the literature,
although E. El-Neweihi has privately communicated one to us. Since developments are premature with respect to this class we will not go into details.
Multivariate extensions of the HNBUE class have been proposed by Basu and
Ebrahimi (1981) and Klefsjö (1980). The extensions of the former authors are
similar in spirit to the multivariate NBUE classes of Ghosh and Ebrahimi (1981).
The latter author's definition extends the univariate definition by replacing the
univariate exponential distribution with the bivariate Marshall and Olkin (1967)
distribution and considers various multivariate versions of the definition.
Basu and Ebrahimi (1981) show relationships among their definitions and
Klefsjö's, give some closure properties and also point out relations with multivariate NBUE classes.
Barlow, R. E. and Proschan, F. (1981). Statistical Theory of Reliability and Life Testing: Probability
Models. To Begin With, Silver Spring, MD.
Basu, A. P. and Ebrahimi, N. (1981). Multivariate HNBUE distributions. University of Missouri-Columbia, Technical Report # 110.
Berg, J. and Kesten, H. (1984). Inequalities with applications to percolation and reliability.
Unpublished report.
Block, H. W. and Savits, T. H. (1976). The IFRA closure problem. Ann. Prob. 4, 1030-1032.
Block, H. W. and Savits, T. H. (1978). Shock models with NBUE survival. J. Appl. Prob. 15,
Block, H. W. and Savits, T. H. (1980). Multivariate increasing failure rate average distributions. Ann.
Prob. 8, 793-801.
Block, H. W. and Savits, T. H. (1981). Multivariate classes in reliability theory. Math. of O.R. 6,
Block, H. W. and Savits, T. H. (1982). The class of MIFRA lifetimes and its relation to other classes.
NRLQ 29, 55-61.
Buchanan, B. and Singpurwalla, N. D. (1977). Some stochastic characterizations of multivariate
survival. In: C. P. Tsokos and I. Shimi, eds., The Theory and Appl. of Reliability, Vol. I, Academic
Press, New York, 329-348.
Ebrahimi, N. and Ghosh, M. (1981). Multivariate NBU and NBUE distributions. The Egyptian
Statistical Journal 25, 36-55.
El-Neweihi, E. (1981). Stochastic ordering and a class of multivariate new better than used distributions. Comm. Statist.-Theor. Meth. A 10(16), 1655-1672.
El-Neweihi, E., Proschan, F. and Sethuraman, J. (1983). A multivariate new better than used class
derived from a shock model. Operations Research 31, 177-183.
Esary, J. D. and Marshall, A. W. (1979). Multivariate distributions with increasing hazard rate
averages. Ann. Prob. 7, 359-370.
Esary, J. D., Proschan, F. and Walkup, D. W. (1967). Association of random variables, with
applications. Ann. Math. Stat. 38, 1466-1474.
Ghosh, M. and Ebrahimi, N. (1981). Shock models leading to increasing failure rate and decreasing
mean residual life survival. J. Appl. Prob. 19, 158-166.
Griffith, W. (1982). Remarks on a univariate shock model with some bivariate generalizations. NRLQ
29, 63-74.
Hollander, M. and Proschan, F. (1984). Nonparametric concepts and methods in reliability. In: P.
R. Krishnaiah and P. K. Sen, eds., Handbook of Statistics, Vol. 4, Elsevier, Amsterdam.
Klefsjö, B. (1980). Multivariate HNBUE. Unpublished report.
Klefsjö, B. (1982). NBU and NBUE survival under the Marshall-Olkin shock model. IAPQR Transactions 7, 87-96.
Klefsjö, B. (1982). The HNBUE and HNWUE classes of life distributions. NRLQ 29, 331-344.
Marshall, A. W. and Olkin, I. (1967). A generalized bivariate exponential distribution. J. Appl. Prob.
4, 291-302.
Marshall, A. W. and Shaked, M. (1979). Multivariate shock models for distributions with increasing
hazard rate average. Ann. Prob. 7, 343-358.
Marshall, A. W. and Shaked, M. (1982). A class of multivariate new better than used distributions.
Ann. Prob. 10, 259-264.
Marshall, A. W. and Shaked, M. (1984). Multivariate new better than used distributions. Unpublished report.
Savits, T. H. and Shaked, M. (1981). Shock models and the MIFRA property. Stoch. Proc. Appl.
11, 273-283.
Savits, T. H. (1983). Multivariate life classes and inequalities. In: Y. L. Tong, ed., Inequalities on
Probability and Statistics IMS, Hayward, CA.
Savits, T. H. (1985). A multivariate IFR class. J. Appl. Prob., 22, 197-204.
P. R. Krishnaiah and C. R. Rao, eds., Handbook of Statistics, Vol. 7
© Elsevier Science Publishers B.V. (1988) 131-156
Selection and Ranking Procedures in Reliability
Shanti S. Gupta and S. Panchapakesan
1. Introduction
Situations abound in practice where the aim of the statistical analyst is to
compare two or more populations in some fashion with a view to rank them or
select the best one(s) among them. For example, a purchasing firm may want to
determine which one of several competing suppliers of components for a certain
computer is producing the highest quality product. Typically, the populations that
are compared will be life length distributions of the components from the competing manufacturers. The best population could be defined as the one with the
largest mean life or with the largest quantile (percentile) of a given order. In such
situations, the classical tests of homogeneity are not designed to answer efficiently
several possible questions of interest to the experimenter. Selection and ranking
procedures were initially devised in the early 1950's to provide the analyst appropriate tools to answer these questions. Most of the investigations in the last thirty
odd years have adopted one or the other of two basic formulations. One of them
is the so-called indifference zone (IZ) formulation of Bechhofer (1954) and the
other is the subset selection (SS) approach of Gupta (1956).
Our main purpose in this paper is to describe some important selection procedures that are relevant to reliability models. Selection procedures are available in
the literature for various parametric families of distributions. Many of these distributions serve as appropriate models for the life length of a unit. However, we
will be concerned with only a few of these such as exponential, gamma, and
Weibull distributions. Besides some nonparametric and distribution-free procedures, we emphasize selection procedures for restricted families of distributions
such as the increasing failure rate (IFR) and increasing failure rate on the average
(IFRA) families which are of importance in reliability problems. In dealing with
these procedures, we mainly use the SS approach.
* This research was supported by the Office of Naval Research Contract N00014-84-C-0167 at
Purdue University. Reproduction in whole or in part is permitted for any purpose of the United
States Government.
In the last three decades and more, the literature on selection and ranking
procedures has grown enormously. Several books have appeared exclusively
dealing with selection and ranking procedures. Of these, the monograph of
Bechhofer, Kiefer and Sobel (1968) deals with sequential procedures with special
emphasis on Koopman-Darmois family. Gibbons, Olkin and Sobel (1977) deal
with methods and techniques mostly under the IZ formulation. Gupta and
Panchapakesan (1979) provide a comprehensive survey of the developments in the
field of ranking and selection, with a special chapter on Guide to Tables. They
deal with all aspects of the problem and provide an extensive bibliography.
Büringer, Martin and Schriever (1980) and Gupta and Huang (1981) have discussed some specific aspects of the problem. A fairly comprehensive categorized
bibliography is provided by Dudewicz and Koo (1982). For a critical review and
an assessment of developments in subset selection theory and techniques, reference may be made to Gupta and Panchapakesan (1985).
Section 2 discusses the formulation of the basic problem of selecting the best
population using the IZ and SS approaches. Section 3 deals with selection from
gamma, exponential and Weibull populations. Procedures for different generalized
goals are discussed using both IZ and SS approaches. Nonparametric procedures
are discussed in Section 4 for selecting in terms of $\alpha$-quantiles. This section also
discusses procedures for Bernoulli distributions. These serve as distribution-free
procedures for selecting from life distributions in terms of reliability at an arbitrarily chosen time. Procedures for selection from restricted families of distributions are described in Section 5. These include procedures for IFR and IFRA
families in particular. A brief discussion of selection in comparison with a
standard or control follows in Section 6.
2. Selection and ranking procedures
Let $\pi_1, \ldots, \pi_k$ be k given populations where $\pi_i$ has the associated distribution
function $F_{\theta_i}$, $i = 1, \ldots, k$. The $\theta_i$ are real-valued parameters taking values in the
set $\Theta$. It is assumed that the $\theta_i$ are unknown. The ordered $\theta_i$ are denoted by
$\theta_{[1]} \le \theta_{[2]} \le \cdots \le \theta_{[k]}$ and the (unknown) population $\pi_i$ associated with $\theta_{[i]}$ by $\pi_{(i)}$,
$i = 1, \ldots, k$. The populations are ranked according to their $\theta$-values. To be
specific, $\pi_{(j)}$ is defined to be better than $\pi_{(i)}$ if $i < j$. No prior information is
assumed regarding the true pairing between $(\theta_1, \ldots, \theta_k)$ and $(\theta_{[1]}, \ldots, \theta_{[k]})$.
2.1. Indifference zone (IZ) formulation
The goal in the basic problem in the IZ approach is to select the best population, namely, the one associated with 0[k]. A procedure is required to choose one
of the populations. A correct selection (CS) is a selection of population(s) satisfying the goal. Here CS corresponds to choosing the best population. Any selection procedure is required to guarantee a minimum probability of a correct selection
(PCS). In the IZ formulation, this requirement is that, for any rule R,
Selection and ranking procedures in reliability models
$$P(\mathrm{CS} \mid R) \ge P^* \quad \text{whenever } \delta(\theta_{[k]}, \theta_{[k-1]}) \ge \delta^*, \tag{2.1}$$
where $P(\mathrm{CS} \mid R)$ denotes the PCS using R, and $\delta(\theta_{[k]}, \theta_{[k-1]})$ is an appropriate
measure of separation of the best population $\pi_{(k)}$ from the next best $\pi_{(k-1)}$. The
constants $P^*$ and $\delta^*$ are specified by the experimenter in advance. The statistical
problem is to define a selection rule which really consists of a sampling rule, a
stopping rule for sampling, and a decision rule. If we consider taking a single
sample of fixed size n from each population, then the minimum value of n is
determined subject to (2.1). A crucial step involved in this is to evaluate the
infimum of the PCS over $\Omega_{\delta^*} = \{\theta = (\theta_1, \ldots, \theta_k): \delta(\theta_{[k]}, \theta_{[k-1]}) \ge \delta^*\}$. Any configuration of $\theta$ where this infimum is attained is called a least favorable configuration (LFC). Between two valid (i.e. satisfying (2.1)) single sample procedures, the sample size n is an obvious criterion for efficiency comparison. The
region $\Omega_{\delta^*}$ is called the preference zone. No requirement is made regarding the
PCS when $\theta$ belongs to the complement of $\Omega_{\delta^*}$ which, in fact, is the indifference zone.
2.2. Subset selection (SS) approach
In the SS approach for selecting the best population, the goal is to select a
nonempty subset of the k populations which includes the best population. The
size of the selected subset is not fixed in advance; it is rather determined by the
data themselves. Selection of any subset consistent with the goal (i.e. including
the best population) is a correct selection. It is required that, for any rule R,
$$P(\mathrm{CS} \mid R) \ge P^* \quad \text{for all } \theta \in \Omega, \tag{2.2}$$
where $\Omega = \{\theta\}$ is the whole parameter space. It should be noted that there is no
indifference zone specification in this formulation. As is to be expected, a crucial
step is the evaluation of the infimum of the PCS over $\Omega$. Any subset selection rule
that satisfies (2.2) meets the criterion of validity. Denoting the selected subset by
S and its size by $|S|$, the expected value of $|S|$ serves as a reasonable measure
for efficiency comparison between valid procedures. Besides $E(|S|)$, possible
performance characteristics include $E(|S|) - \mathrm{PCS}$ and $E(|S|)/\mathrm{PCS}$. The former
one represents the expected number of nonbest populations included in the
selected subset. As an overall measure, one can also consider the supremum of
$E(|S|)$ over $\Omega$.
2.3. Some general remarks
The probability requirement, (2.1) or (2.2) as the case may be, is usually
referred to as the basic probability requirement, or the P*-requirement, or the
P*-condition. There are several modifications and generalizations of the basic goal
and requirements on the procedures in both IZ and SS formulations. These will
be described as the necessity arises during our discussion of several procedures.
For details on these aspects of the problem, reference may be made to Gupta and
Panchapakesan (1979).
Suppose that the best population is the one associated with the largest $\theta_i$. A
procedure R is said to be monotone if the probability of selecting $\pi_i$ is at least as
large as that of selecting $\pi_j$ whenever $\theta_i \ge \theta_j$.
2.4. Two types of subset selection rules
Let $T_i$ be the statistic associated with the sample from $\pi_i$ ($i = 1, \ldots, k$) with
distribution function $F(x, \theta_i)$; the $\theta_i$ are the parameters to be ranked. Most of the
rules that have been studied in the literature are of one of the following types:
$R_1$: Select $\pi_i$ if and only if
$$T_i \ge \max_{1 \le j \le k} T_j - d$$
$R_2$: Select $\pi_i$ if and only if
$$T_i \ge c \max_{1 \le j \le k} T_j$$
where $d > 0$ and $c \in (0, 1)$ are to be determined so that the P*-requirement is satisfied.
These rules $R_1$ and $R_2$ have been typically proposed when $\theta_i$ is a location and
a scale parameter, respectively. When $\theta_i$ is neither a location nor a scale parameter
(e.g. a noncentrality parameter), usually one of these two rules has been proposed
depending on the nature of the support of $T_i$. Most of the rules that are discussed
in this paper come under one of these two types. Treatment of $R_1$ and $R_2$ in the
location and scale case, respectively, is given in Gupta (1965). The following
properties hold for $R_1$ in the location case and $R_2$ in the scale case.
(1) The procedure is monotone (Gupta, 1965).
(2) If the distribution $F(x, \theta)$ possesses a density $f(x, \theta)$ having a monotone
likelihood ratio (MLR) in x, then $E(|S|)$ is maximized when $\theta_1 = \cdots = \theta_k$ and
this maximum is $kP^*$ (Gupta, 1965).
(3) Under the MLR assumption, the rule is minimax when the loss is measured
by $|S|$ or the number of non-best populations selected (Berger, 1979).
(4) In a fairly large class of rules, the procedure is minimax when the loss is
measured by the maximum probability of including a non-best population (Berger
and Gupta, 1980).
A comprehensive unified theory is due to Gupta and Panchapakesan (1972),
who have considered a class of rules which includes $R_1$ and $R_2$ as special cases;
see Gupta and Panchapakesan (1979, Section 11.2). Gupta and Huang (1980)
have obtained an optimal rule in the class of rules for which the PCS is at least
$\gamma$ by minimizing the supremum of $E(|S|)$.
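The two rule types can be illustrated with a minimal Python sketch. It assumes the statistics $T_i$ have already been computed and that the constants d and c have been obtained elsewhere so that the P*-requirement holds; the function names and the numbers below are hypothetical and serve only as an illustration.

```python
from typing import List, Sequence


def select_r1(stats: Sequence[float], d: float) -> List[int]:
    """Rule R1 (location type): keep i iff T_i >= max_j T_j - d, with d > 0."""
    threshold = max(stats) - d
    return [i for i, t in enumerate(stats) if t >= threshold]


def select_r2(stats: Sequence[float], c: float) -> List[int]:
    """Rule R2 (scale type): keep i iff T_i >= c * max_j T_j, with 0 < c < 1."""
    threshold = c * max(stats)
    return [i for i, t in enumerate(stats) if t >= threshold]


# Hypothetical statistics for k = 4 populations; d and c are placeholders,
# not constants that guarantee any particular P*.
T = [10.2, 12.9, 13.4, 8.7]
subset_r1 = select_r1(T, d=1.0)   # populations with T_i >= 12.4
subset_r2 = select_r2(T, c=0.8)   # populations with T_i >= 0.8 * 13.4
```

In both rules the population with the largest statistic is always retained, so the selected subset is never empty, and its size is determined by the data.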
3. Selection from parametric families
Numerous parametric models are employed in the analysis of life length data
and in problems connected with the modeling of aging or failure processes.
Among univariate models, a few particular distributions, namely, the exponential,
Weibull, and gamma, stand out in view of their proven usefulness in a wide range
of situations. Of course, these distributions are related to each other. In this
section, we will discuss a few typical procedures for selection from these populations.
3.1. Selection from gamma populations
Let $\pi_1, \ldots, \pi_k$ denote k given gamma populations with density functions
$$f(x, \theta_i) = \frac{1}{\Gamma(\alpha)\,\theta_i^{\alpha}}\, x^{\alpha - 1} e^{-x/\theta_i}, \quad x > 0;\ \theta_i, \alpha > 0;\ i = 1, \ldots, k, \tag{3.1}$$
with a common known shape parameter $\alpha$. For the goal of selecting a subset
containing the best population, namely, the one associated with $\theta_{[k]}$, Gupta
(1963a) proposed a rule based on the sample means $\bar{X}_i$, $i = 1, \ldots, k$, arising from
n independent observations from each population. The rule of Gupta (1963a) is
$R_3$: Select $\pi_i$ if and only if
$$\bar{X}_i \ge c \max_{1 \le j \le k} \bar{X}_j$$
where c is the largest number with $0 < c < 1$ for which the P*-requirement is met.
The LFC is given by $\theta_1 = \cdots = \theta_k$ and the constant c is determined by
$$\int_0^{\infty} G_{\nu}^{k-1}(x/c)\, g_{\nu}(x)\, dx = P^*, \tag{3.3}$$
where $\nu = n\alpha$ and, $G_{\nu}$ and $g_{\nu}$ are the cdf and the density, respectively, of a
standardized gamma random variable (i.e. with $\theta = 1$) with shape parameter $\nu$.
Gupta (1963a) has tabulated the values of c for $\nu = 1(1)25$, $k = 2(1)11$, and
$P^* = 0.75, 0.90, 0.95, 0.99$. Tables 1a and 1b are excerpted from the tables of
Gupta (1963a) and they provide c-values for $k = 2(1)11$, $\nu = 1(1)20$, and
$P^* = 0.90$ and $0.95$, respectively.
Table 1a. Values of the constant c of Rule $R_3$ satisfying equation (3.3); $P^* = 0.90$
Table 1b. Values of the constant c of Rule $R_3$ satisfying equation (3.3); $P^* = 0.95$
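When a tabulated value is not at hand, the constant c can be approximated numerically. The sketch below is only an illustration under stated assumptions: it takes the determining equation to be $\int_0^\infty G_\nu^{k-1}(x/c)\, g_\nu(x)\, dx = P^*$, restricts $\nu$ to a moderate positive integer so that the gamma cdf has a closed (Erlang) form, and uses Simpson's rule with bisection. All function names are hypothetical, and the result is not meant to replace the tables of Gupta (1963a).

```python
import math


def gamma_cdf(x: float, nu: int) -> float:
    """CDF of a standard gamma variable with integer shape nu (Erlang form)."""
    if x <= 0.0:
        return 0.0
    if x > 700.0:            # exp(-x) underflows; the cdf is 1 for moderate nu
        return 1.0
    term, total = 1.0, 1.0
    for j in range(1, nu):
        term *= x / j
        total += term
    return 1.0 - math.exp(-x) * total


def gamma_pdf(x: float, nu: int) -> float:
    """Density of a standard gamma variable with integer shape nu."""
    if x < 0.0:
        return 0.0
    if x == 0.0:
        return 1.0 if nu == 1 else 0.0
    return math.exp((nu - 1) * math.log(x) - x - math.lgamma(nu))


def pcs_equal_scales(c: float, k: int, nu: int) -> float:
    """Simpson approximation of the integral of G_nu(x/c)^(k-1) g_nu(x) over x > 0."""
    upper = nu + 10.0 * math.sqrt(nu) + 30.0          # truncation; the tail is negligible
    h_target = min(0.02, c / 5.0)                     # finer grid when c is small
    steps = 2 * math.ceil(upper / (2.0 * h_target))   # even number of intervals
    h = upper / steps
    total = 0.0
    for i in range(steps + 1):
        x = i * h
        weight = 1 if i in (0, steps) else (4 if i % 2 else 2)
        total += weight * gamma_cdf(x / c, nu) ** (k - 1) * gamma_pdf(x, nu)
    return total * h / 3.0


def constant_c(p_star: float, k: int, nu: int) -> float:
    """Largest c in (0, 1) whose integral is still at least P* (bisection)."""
    lo, hi = 1e-6, 1.0       # the integral decreases in c, from about 1 down to 1/k
    for _ in range(50):
        mid = 0.5 * (lo + hi)
        if pcs_equal_scales(mid, k, nu) >= p_star:
            lo = mid
        else:
            hi = mid
    return lo


# For k = 2 and nu = 1 the integral equals 1 / (1 + c), so c = 1 / P* - 1.
c_example = constant_c(0.90, k=2, nu=1)
```

The case k = 2, $\nu$ = 1 is used as a check because the integral then has a closed form; for very small c the grid becomes fine and the computation slow.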
Depending on the physical nature of the problem, we may be interested in
selecting the population associated with $\theta_{[1]}$, which is the best population now.
In this case, the procedure analogous to $R_3$ is
$R_4$: Select $\pi_i$ if and only if
$$\bar{X}_i \le \frac{1}{c'} \min_{1 \le j \le k} \bar{X}_j$$
where $0 < c' < 1$ is the largest number for which the P*-condition is met. The
constant $c'$ is given by
$$\int_0^{\infty} [1 - G_{\nu}(c' x)]^{k-1} g_{\nu}(x)\, dx = P^*$$
where $\nu = n\alpha$. The values of the constant $c'$ have been tabulated for $\nu = 1(1)25$,
$k = 2(1)11$, and $P^* = 0.75, 0.90, 0.95, 0.99$ by Gupta and Sobel (1962b) who have
studied rule $R_4$ in the context of selecting from k normal populations the one with
the smallest variance in a companion paper (1962a).
It is known that the gamma family $\{F(x, \theta)\}$, with common parameter r, is
stochastically increasing in $\theta$, i.e., $F(x, \theta_i)$ and $F(x, \theta_j)$ are distinct for $\theta_i \ne \theta_j$, and
$F(x, \theta_i) \ge F(x, \theta_j)$ for all x when $\theta_i < \theta_j$. This implies that ranking them in terms
of $\theta$ is equivalent to ranking in terms of the $\alpha$-quantile for any $0 < \alpha < 1$.
3.2. Selection from exponential (one-parameter) populations
We first note that this is a special case of gamma populations with densities
$f(x, \theta_i)$ in (3.1) with $\alpha = 1$. Thus the rules $R_3$ and $R_4$ are applicable. Now
consider a life testing situation where a sample of n items from each population
is put on test and the sample is censored (type II) at the rth failure. Let
$$X_{i1} < X_{i2} < \cdots < X_{ir}$$
denote the r complete lives in the sample from $\pi_i$,
$i = 1, \ldots, k$. Define
$$T_i = \sum_{j=1}^{r} X_{ij} + (n - r) X_{ir}, \quad i = 1, \ldots, k.$$
The $T_i$ are the so-called total life statistics. It is well-known that $2T_i/\theta_i$ has a
chi-square distribution with $2r$ degrees of freedom. In other words, $T_i$ has a
gamma distribution with scale parameter $\theta_i$ and shape parameter r. Thus for
selecting the population with the largest mean life $\theta_i$, the procedure $R_3$ (stated in
terms of the $T_i$) will be
$R_3$: Select $\pi_i$ if and only if
$$T_i \ge c \max_{1 \le j \le k} T_j$$
where c is given by (3.3) with $\nu = r$.
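As an illustration of the censored-sample version of the rule, the following sketch computes the total life statistic from the first r failure times of a sample of n items and then applies the rule with a given constant c. The data and the value of c are hypothetical placeholders; in practice c would be taken from the appropriate table with $\nu = r$.

```python
from typing import List, Sequence


def total_life(first_r_failures: Sequence[float], n: int) -> float:
    """T = sum of the r observed lifetimes + (n - r) * (r-th failure time)."""
    times = sorted(first_r_failures)
    r = len(times)
    if not 1 <= r <= n:
        raise ValueError("need 1 <= r <= n")
    return sum(times) + (n - r) * times[-1]


def select_largest_mean_life(samples: Sequence[Sequence[float]], n: int, c: float) -> List[int]:
    """Keep population i iff T_i >= c * max_j T_j."""
    stats = [total_life(s, n) for s in samples]
    threshold = c * max(stats)
    return [i for i, t in enumerate(stats) if t >= threshold]


# Hypothetical type II censored data: n = 5 items on test, censoring at r = 3.
censored = [
    [1.0, 2.0, 3.0],   # T = 6 + 2 * 3 = 12
    [2.0, 4.0, 5.0],   # T = 11 + 2 * 5 = 21
    [0.5, 0.7, 0.9],   # T is about 3.9
]
chosen = select_largest_mean_life(censored, n=5, c=0.5)   # c is a placeholder
```

The n - r unfailed items each contribute the censoring time $X_{ir}$ to the total, which is why the last observed failure time is weighted by n - r.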
3.3. Selection from two-parameter exponential distributions
Let $\pi_i$ have density
$$f(x; \theta_i, \sigma) = \frac{1}{\sigma} \exp\{-(x - \theta_i)/\sigma\}, \quad x > \theta_i;\ \theta_i, \sigma > 0;\ i = 1, \ldots, k. \tag{3.7}$$
The density (3.7) provides a model for life length data when we assume a
minimum guaranteed life $\theta_i$, which is here a location parameter. It is assumed that
all the k populations have a common scale parameter $\sigma$. The $\theta_i$ are unknown and
our interest is in selecting the population associated with the largest $\theta_i$. We will
discuss some procedures under the IZ formulation. Consider the generalized goal
of selecting a subset of fixed size s so that the t best populations ($1 \le t \le s < k$)
are included in the selected subset. This generalized goal was introduced by Desu
and Sobel (1968). The special case of $t = s$, namely, that of choosing t populations
so that they are the t best, was considered originally by Bechhofer (1954). When
$s = t = 1$, we get the basic goal of selecting the best population. The probability
requirement is that
$$\mathrm{PCS} \ge P^* \quad \text{whenever } \theta_{[k-t+1]} - \theta_{[k-t]} \ge \theta^* > 0, \tag{3.8}$$
where $\theta^*$ and $P^*$ are specified in advance and a correct selection occurs when
a subset of s populations is selected consistent with the goal. Also, for a
meaningful problem, we should have $\binom{s}{t}/\binom{k}{t} < P^* < 1$. In describing several procedures, we will adopt either the
generalized goal or one of its special cases. We
will consider the two cases of known and unknown $\sigma$ separately.
Case A: Known $\sigma$. We can assume without loss of generality that $\sigma = 1$. Let
$X_{ij}$, $j = 1, \ldots, n$, denote a sample of n observations from $\pi_i$, $i = 1, \ldots, k$. Define
$Y_i = \min_{1 \le j \le n} X_{ij}$, $i = 1, \ldots, k$.
Raghavachari and Starr (1970) considered the goal of selecting the t best
populations (i.e. $1 \le s = t < k$) and they studied the 'natural' rule
$R_5$: Select the t populations associated with $Y_{[k-t+1]}, \ldots, Y_{[k]}$.
The LFC for this rule is given by
$$\theta_{[1]} = \cdots = \theta_{[k-t]};\quad \theta_{[k-t+1]} = \cdots = \theta_{[k]};\quad \theta_{[k-t+1]} - \theta_{[k-t]} = \theta^*. \tag{3.10}$$
The minimum sample size required to satisfy (3.8) is the smallest integer n for
which
$$(1 - e^{-n\theta^*})^{k-t} + (k - t)\, e^{n\theta^* t}\, I(e^{-n\theta^*}; t + 1, k - t) \ge P^*,$$
where
$$I(z; \alpha, \beta) = \int_0^z x^{\alpha - 1}(1 - x)^{\beta - 1}\, dx, \quad \alpha, \beta > 0;\ 0 \le z \le 1.$$
Equivalently, we need the smallest integer n such that
$$n\theta^* \ge -\log v,$$
where v ($0 < v < 1$) is the solution of the equation
$$(1 - v)^{k-t} + (k - t)\, v^{-t}\, I(v; t + 1, k - t) = P^*. \tag{3.14}$$
Raghavachari and Starr (1970) have tabulated the v-values for $k = 2(1)15$,
$t = 1(1)k - 1$, and $P^* = 0.90, 0.95, 0.975, 0.99$.
In particular, for selecting the best population, the equation (3.14) reduces to
$$(vk)^{-1}[1 - (1 - v)^k] = P^*.$$
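The sample size determination can be sketched numerically as follows. The sketch assumes that the equation for v has the form $(1 - v)^{k-t} + (k - t) v^{-t} I(v; t + 1, k - t) = P^*$ with I the unnormalized incomplete beta integral, solves it by bisection, and returns the smallest n with $n\theta^* \ge -\log v$ (with $\sigma = 1$). Function names are hypothetical; tabulated v-values should be preferred where available.

```python
import math


def incomplete_beta(z: float, a: int, b: int, steps: int = 400) -> float:
    """Unnormalized integral of x^(a-1) (1-x)^(b-1) over [0, z], by Simpson's rule."""
    h = z / steps
    total = 0.0
    for i in range(steps + 1):
        x = i * h
        weight = 1 if i in (0, steps) else (4 if i % 2 else 2)
        total += weight * x ** (a - 1) * (1.0 - x) ** (b - 1)
    return total * h / 3.0


def pcs_at_lfc(v: float, k: int, t: int) -> float:
    """PCS at the least favorable configuration, written in terms of v = exp(-n theta*)."""
    return (1.0 - v) ** (k - t) + (k - t) * v ** (-t) * incomplete_beta(v, t + 1, k - t)


def min_sample_size(theta_star: float, p_star: float, k: int, t: int = 1):
    """Return (v, n): the root v of PCS(v) = P* and the smallest n with n theta* >= -log v."""
    lo, hi = 1e-12, 1.0      # PCS is close to 1 near v = 0 and decreases as v grows
    for _ in range(100):
        mid = 0.5 * (lo + hi)
        if pcs_at_lfc(mid, k, t) >= p_star:
            lo = mid
        else:
            hi = mid
    v = lo                   # slightly conservative: PCS(v) >= P*
    return v, math.ceil(-math.log(v) / theta_star)


# For k = 2, t = 1 the equation reduces to 1 - v/2 = P*, so v = 2 (1 - P*).
v_example, n_example = min_sample_size(theta_star=0.5, p_star=0.95, k=2)
```

For k = 2 and $P^* = 0.95$ this gives v = 0.1, so with $\theta^* = 0.5$ the smallest n satisfying $n\theta^* \ge -\log 0.1$ is 5.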
For the generalized goal, Desu and Sobel (1968) studied the following rule $R_6$.
$R_6$: Select the s populations associated with $Y_{[k-s+1]}, \ldots, Y_{[k]}$.
Given n, k, t, $\theta^*$, and $P^*$, they have shown that the smallest s for which the
probability requirement (3.8) is satisfied is the smallest integer s such that
(~) >~p.(k)
e-n,o*, '
It should be pointed out that Desu and Sobel (1968) have obtained general results
for location parameter family. They have also considered the dual problem of
selecting a subset of size s (s ~< t) so that all the selected populations are among
the t best.
Case B: Unknown $\sigma$. In this case, we consider the basic goal of selecting the
best population. Since $\sigma$ is unknown, it is not possible to determine in advance
the sample size needed for a single sample procedure in order to guarantee the
P*-condition. This is similar to the situation that arises in selecting the population
with the largest mean from several normal populations with a common unknown
variance. For this latter problem, Bechhofer, Dunnett and Sobel (1954) proposed
a non-elimination type two-stage procedure in which the first stage samples are
utilized purely for estimating the variance without eliminating any population from
further consideration. A similar procedure was proposed by Desu, Narula and
Villarreal (1977) for selecting the best exponential population. Kim and Lee (1985)
have studied an elimination type two-stage procedure analogous to that of Gupta
and Kim (1984) for the normal means problem. In their procedure, the first stage
is used not only to estimate $\sigma$ but also to possibly eliminate non-contenders. Their
Monte-Carlo study shows that, when $\theta_{[k]} - \theta_{[k-1]}$ is sufficiently large, the elimination type procedure performs better than the other type procedure in terms of
the expected total sample size.
The procedure $R_7$ of Kim and Lee (1985) consists of two stages as follows.
Stage 1: Take $n_0$ independent observations from each $\pi_i$ ($1 \le i \le k$), and compute
$Y_i^{(1)} = \min_{1 \le j \le n_0} X_{ij}$, and a pooled estimate $\hat{\sigma}$ of $\sigma$, namely,
$$\hat{\sigma} = \sum_{i=1}^{k} \sum_{j=1}^{n_0} (X_{ij} - Y_i^{(1)}) \big/ k(n_0 - 1).$$
Determine a subset I of $\{1, \ldots, k\}$ defined by
$$I = \{\, i \mid Y_i^{(1)} \ge \max_{1 \le j \le k} Y_j^{(1)} - (2k(n_0 - 1)\hat{\sigma} h/n_0 - \theta^*)^+ \,\},$$
where the symbol $a^+$ denotes the positive part of a, and h ($> 0$) is a design
constant to be determined.
(a) If I has only one element, stop sampling and assert that the population
associated with $Y^{(1)}_{[k]}$ is the best.
(b) If I has more than one element, go to the second stage.
Stage 2: Take $N - n_0$ additional observations $X_{ij}$ from each $\pi_i$ for $i \in I$, where
$$N = \max\{ n_0, \langle 2k(n_0 - 1)\hat{\sigma} h/\theta^* \rangle \},$$
and the symbol $\langle y \rangle$ denotes the smallest integer equal to or greater than y. Then
compute, for the overall sample, $Y_i = \min_{1 \le j \le N} X_{ij}$ and choose the population
associated with $\max_{i \in I} Y_i$ as the best.
The constant h used in the procedure $R_7$ is given by
$$\int_0^{\infty} \{1 - (1 - \alpha(x))^k\}^2 / \{k^2 \alpha^2(x)\}\, f_{\nu}(x)\, dx = P^*$$
where $\alpha(x) = \exp(-hx)$ and $f_{\nu}(x)$ is the chi-square density with $\nu = 2k(n_0 - 1)$
degrees of freedom. The h-values have been tabulated by Kim and Lee (1985) for
$P^* = 0.95$, $k = 2(1)5(5)20$, and $n_0 = 2(1)30$.
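The first stage of the two-stage procedure can be sketched as below. This is a minimal illustration, assuming the screening set is $I = \{i : Y_i^{(1)} \ge \max_j Y_j^{(1)} - (2k(n_0 - 1)\hat{\sigma}h/n_0 - \theta^*)^+\}$ and the total sample size is $N = \max\{n_0, \langle 2k(n_0 - 1)\hat{\sigma}h/\theta^* \rangle\}$. The function name, the data, and the value of h are hypothetical; an actual application would take h from the tables of Kim and Lee (1985).

```python
import math
from typing import List, Sequence, Tuple


def stage_one(samples: Sequence[Sequence[float]], h: float,
              theta_star: float) -> Tuple[List[int], float, int]:
    """Return (retained indices I, pooled scale estimate, total sample size N)."""
    k = len(samples)
    n0 = len(samples[0])
    minima = [min(s) for s in samples]                        # first-stage minima
    excess = sum(x - y for s, y in zip(samples, minima) for x in s)
    sigma_hat = excess / (k * (n0 - 1))                       # pooled estimate of sigma
    scale = 2 * k * (n0 - 1) * sigma_hat * h
    margin = max(scale / n0 - theta_star, 0.0)                # positive part
    best = max(minima)
    retained = [i for i, y in enumerate(minima) if y >= best - margin]
    total_n = max(n0, math.ceil(scale / theta_star))
    return retained, sigma_hat, total_n


# Hypothetical first-stage data: k = 2 populations, n0 = 3 observations each.
first_stage = [[1.0, 2.0, 3.0], [5.0, 6.0, 7.0]]
I_set, sigma_hat, N = stage_one(first_stage, h=0.5, theta_star=1.0)  # h is a placeholder
```

Here the screening margin equals 1, so only the second population survives the first stage and sampling stops; had more than one index been retained, $N - n_0$ further observations would be taken from each retained population.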
3.4. Selection from Weibull distributions
Let $\pi_i$ have a two-parameter Weibull distribution given by the cdf
$$F_i(x) \equiv F(x; \theta_i, c_i) = 1 - \exp\{-(x/\theta_i)^{c_i}\}, \quad x > 0;\ \theta_i, c_i > 0;\ i = 1, \ldots, k.$$
The $c_i$ and $\theta_i$ are unknown. Kingston and Patel (1980a, b) have considered the
problem of selecting from Weibull distributions in terms of their reliabilities
(survival probabilities) at an arbitrary but specified time $L > 0$. The reliability at
L for $F_i$ ($i = 1, \ldots, k$) is given by
$$p_i = 1 - F_i(L) = \exp\{-(L/\theta_i)^{c_i}\}.$$
We can without loss of generality assume that $L = 1$ because the observed failure
times can be scaled so that $L = 1$ time unit. Further, letting $(\theta_i)^{c_i} = \lambda_i$, we get
$p_i = \exp\{-\lambda_i^{-1}\}$. Obviously, ranking the populations in terms of the $p_i$ is equivalent to ranking in terms of the $\lambda_i$, and the best population is the one associated
with $\lambda_{[k]}$, the largest $\lambda_i$. Kingston and Patel (1980a) considered the problem of
selecting the best one under the IZ formulation using the natural procedure based
on estimates of the $\lambda_i$ constructed from type II censored samples. They also
considered the problem of selecting the best in terms of the $\alpha$-quantiles for a given
$\alpha \in (0, 1)$, $\alpha \ne 1 - e^{-1}$, in the case where $\theta_1 = \cdots = \theta_k = \theta$ (unknown). The
$\alpha$-quantile of $F_i$ is given by $\xi_i = \theta[-\log(1 - \alpha)]^{1/c_i}$ so that ranking in terms of the
$\alpha$-quantiles is equivalent to ranking in terms of the shape parameter. It should be
noted that the ranking of the $c_i$ is in the same order as that of the associated $\xi_i$
if $\alpha < 1 - e^{-1}$, and is in the reverse order if $\alpha > 1 - e^{-1}$. The procedures discussed above are based on maximum likelihood estimators as well as simplified
linear estimators (SLE) considered by Bain (1978, p. 265). For further details on
these procedures, see Kingston and Patel (1980a).
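The dependence of the quantile ordering on $\alpha$ can be checked with a short sketch. It assumes the Weibull cdf $1 - \exp\{-(x/\theta)^{c}\}$, so that the reliability at L is $\exp\{-(L/\theta)^{c}\}$ and the $\alpha$-quantile is $\theta[-\log(1 - \alpha)]^{1/c}$; the shape values used are hypothetical.

```python
import math


def reliability(L: float, theta: float, c: float) -> float:
    """p = 1 - F(L) = exp{-(L / theta)^c} for the two-parameter Weibull cdf."""
    return math.exp(-((L / theta) ** c))


def quantile(alpha: float, theta: float, c: float) -> float:
    """alpha-quantile: theta * [-log(1 - alpha)]^(1 / c)."""
    return theta * (-math.log(1.0 - alpha)) ** (1.0 / c)


# Hypothetical shape parameters, in increasing order, with common scale theta = 1.
shapes = [0.8, 1.5, 3.0]
low = [quantile(0.30, 1.0, c) for c in shapes]    # alpha < 1 - 1/e
high = [quantile(0.90, 1.0, c) for c in shapes]   # alpha > 1 - 1/e

same_order = low == sorted(low)                       # quantiles increase with c
reverse_order = high == sorted(high, reverse=True)    # quantiles decrease with c
```

The switch occurs at $\alpha = 1 - e^{-1}$ because $-\log(1 - \alpha)$ is then equal to 1, which makes the quantile free of the shape parameter.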
In another paper, Kingston and Patel (1980b) considered the goal of selecting
a subset of restricted size. This formulation, usually referred to as the restricted subset
selection (RSS) approach, is due to Gupta and Santner (1973) and Santner
(1975). In the usual SS approach of Gupta (1956), it is possible that the procedure selects all the k populations. In the RSS approach, we restrict the size of
the selected subset by specifying an upper bound m ($1 \le m \le k - 1$); the size of
the selected subset is still a random variable taking on values $1, 2, \ldots, m$. Thus it
is a generalization of the usual approach ($m = k$). However, in doing so, an
indifference zone is introduced. The selection goal can be more general than
selecting the best. We now consider a generalized goal in the RSS approach for
selection from Weibull populations, namely, to select a subset of the k given
populations not exceeding m in size such that the selected subset contains at least
s of the t best populations. As before, the populations are ranked in terms of their
$\lambda$-values. Note that $1 \le s \le \min(t, m) \le k$. The probability requirement now is
$$\mathrm{PCS} \ge P^* \quad \text{whenever } \lambda = (\lambda_1, \ldots, \lambda_k) \in \Omega_{\lambda^*}, \tag{3.20}$$
where
$$\Omega_{\lambda^*} = \{\lambda: \lambda^* \lambda_{[k-t]} \le \lambda_{[k-t+1]},\ \lambda^* \ge 1\}.$$
When $t = s = m$ and $\lambda^* > 1$, the problem reduces to selecting the t best populations using the IZ formulation. When $s = t < m = k$ and $\lambda^* = 1$, the problem
reduces to selecting a subset of random size containing the t best populations (the
usual SS approach). Thus the RSS approach integrates the formulations of
Bechhofer (1954), Gupta (1956), and Desu and Sobel (1968). General theory
under the RSS approach is given by Santner (1975).
Returning to the Weibull selection problem with the generalized RSS goal,
Kingston and Patel (1980b) studied a procedure based on type II censored
samples from each population. It is defined in terms of the maximum likelihood
estimators (or the SLE estimators) $\hat{\lambda}_i$. This procedure is
$R_8$: Include $\pi_i$ in the selected subset if and only if
$$\hat{\lambda}_i \ge \max\{\hat{\lambda}_{[k-m+1]},\ c\,\hat{\lambda}_{[k-t+1]}\},$$
where $c \in [0, 1]$ is suitably chosen to satisfy (3.20).
Let n denote the common sample size and consider censoring each sample at
the rth failure. For given k, r, n, s, t, and m, we have three quantities associated
with the procedure $R_8$, namely, $P^*$, c, and $\lambda^* > 0$. Given two of these, one can
find the third; however, the solution may not be admissible. For example, for
some $P^*$ and $\lambda^*$, there may not be a constant $c \in [0, 1]$ so that (3.20) is satisfied
unless $m = k$. Kingston and Patel (1980b) have given a few tables of $\lambda^*$-values for
selected values of other constants. Their table values are based on Monte Carlo
techniques and the choice of SLE's.
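The restricted subset selection rule can be sketched as follows, assuming the inclusion criterion $\hat{\lambda}_i \ge \max\{\hat{\lambda}_{[k-m+1]}, c\,\hat{\lambda}_{[k-t+1]}\}$ and estimates without ties. The estimates and the constant c are hypothetical; the sketch does not compute c or $\lambda^*$, which in the source come from Monte Carlo tables.

```python
from typing import List, Sequence


def select_r8(lam_hat: Sequence[float], m: int, t: int, c: float) -> List[int]:
    """Keep i iff lam_hat[i] >= max(lam_hat_[k-m+1], c * lam_hat_[k-t+1])."""
    k = len(lam_hat)
    ordered = sorted(lam_hat)               # ordered[j - 1] is the j-th smallest estimate
    cutoff = max(ordered[k - m], c * ordered[k - t])
    return [i for i, lam in enumerate(lam_hat) if lam >= cutoff]


# Hypothetical estimates for k = 5 populations; at most m = 3 may be selected.
estimates = [2.0, 5.0, 3.5, 4.8, 1.0]
tight = select_r8(estimates, m=3, t=1, c=0.9)   # cutoff = max(3.5, 4.5) = 4.5
loose = select_r8(estimates, m=3, t=1, c=0.5)   # cutoff = max(3.5, 2.5) = 3.5
```

The first term in the maximum caps the subset size at m, while the second term can shrink the subset further when the leading estimates are well separated from the rest.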
4. Nonparametric and distribution-free procedures
Parametric families of distributions serve as life models in situations where
there are strong reasons to select a particular family. For example, the model may
fit data on hand well, or there may be a good knowledge of the underlying aging
or failure process that indicates the appropriateness of the model. But there are
many situations in which it becomes desirable to avoid strong assumptions about
the model. Nonparametric or distribution-free procedures are important in this context.
Gupta and McDonald (1982) have surveyed nonparametric selection and ranking procedures applicable to one-way classification, two-way classification, and
paired-comparison models. These procedures are based on rank scores and/or
robust estimators such as the Hodges-Lehmann estimator. For the usual types
of procedures based on ranks, the LFC is not always the one corresponding to
identical distributions. Since all these nonparametric procedures are relevant in
the context of selection from life length distributions, the reader is best referred
to the survey papers of Gupta and McDonald (1982), Gupta and Panchapakesan
(1985), and Chapters 8 and 15 of Gupta and Panchapakesan (1979).
There have been some investigations of subset selection rules based on ranks
while still assuming that the distributions associated with the populations are
known. This is appealing especially in situations in which the order of the observations is more readily available than the actual measurements themselves due,
perhaps, to excessive cost or other physical constraints. Under this setup, Nagel
(1970), Gupta, Huang and Nagel (1979), Huang and Panchapakesan (1982), and
Gupta and Liang (1987) have investigated locally optimal subset selection rules
which satisfy the validity criterion that the infimum of the PCS is P* when the
distributions are identical. They have used different optimality criteria in some
neighborhood of an equiparameter point in the parameter space. An account of
these rules is given in Gupta and Panchapakesan (1985).
Characterizations of life length distributions are provided in many situations by
so-called restricted families of distributions which are defined by partial order
relations with respect to known distributions. Well-known examples of such
families are those with increasing (decreasing) failure rate and increasing (decreasing) failure rate average. Selection procedures for such families will be discussed
in the next section.
In the remaining part of this section, we will be mainly concerned with nonparametric procedures for selection in terms of a quantile and selection from
several Bernoulli distributions. Though the Bernoulli selection problem could have
been discussed under parametric model, it is discussed here to emphasize the fact
that we can use the Bernoulli selection procedures as distribution-free procedures
for selecting from unknown continuous (life) distributions in terms of reliability at
any arbitrarily chosen time point L.
4.1. Selection in terms of quantiles
Let $\pi_1, \ldots, \pi_k$ be k populations with continuous distributions $F_i(x)$, $i = 1, \ldots, k$,
respectively. Given $0 < \alpha < 1$, let $x_{\alpha}(F)$ denote the $\alpha$th quantile of F. It is assumed
that the $\alpha$-quantiles of the k populations are unique. The populations are ranked
according to their $\alpha$-quantiles. The population associated with the largest $\alpha$-quantile is defined to be the best. Rizvi and Sobel (1967) proposed a procedure for
selecting a subset containing the best. Let n denote the common size of the
samples from the given populations and assume n to be sufficiently large so that
$1 \le (n + 1)\alpha \le n$. Let r be a positive integer such that $r \le (n + 1)\alpha < r + 1$. It
follows that $1 \le r \le n$. Let $Y_{j,i}$ denote the jth order statistic in the sample from
$\pi_i$, $i = 1, \ldots, k$. The procedure of Rizvi and Sobel (1967) is
$R_9$: Select $\pi_i$ if and only if
$$Y_{r,i} \ge \max_{1 \le j \le k} Y_{r-c,j}$$
where c is the smallest integer with $1 \le c \le r - 1$ for which the P*-condition is satisfied.
For the procedure $R_9$, the infimum of the PCS is attained when the distributions $F_1, \ldots, F_k$ are identical and it is shown by Rizvi and Sobel (1967) that c
is the smallest integer with $1 \le c \le r - 1$ satisfying
$$\int_0^1 G_{r-c}^{k-1}(u)\, dG_r(u) \ge P^*, \tag{4.2}$$
where
$$G_r(u) = \frac{n!}{(r - 1)!\,(n - r)!} \int_0^u x^{r-1}(1 - x)^{n-r}\, dx, \quad 0 \le u \le 1.$$
Rizvi and Sobel have shown that the maximum permissible value of $P^*$ such that
a c-value satisfying (4.2) exists is $P_1 = P_1(n, \alpha, k)$ given by
P1 =
r 1))
A short table of $P_1$-values is given by Rizvi and Sobel for $\alpha = 0.5$ and $k = 2(1)10$.
The n-values range from 1 in steps of 2 to a value (depending on k) for which
$P_1$ gets very close to 1. Also given by them is a table of the largest value of $r - c$
for $\alpha = 1/2$ (which means that $r = (n + 1)/2$), $k = 2(1)10$, $n = 5(10)95(50)495$, and
$P^* = 0.75, 0.90, 0.95, 0.975, 0.99$. For the IZ approach to this selection problem,
see Sobel (1967).
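The constant c of the quantile selection rule can be found numerically for moderate n. The sketch below assumes that the condition to be met is $\int_0^1 G_{r-c}^{k-1}(u)\, dG_r(u) \ge P^*$, where $G_r$ is the cdf of the rth order statistic in a sample of n uniform (0, 1) variables, and it searches c = 1, ..., r - 1 in increasing order. The function names are hypothetical and the integral is approximated by Simpson's rule.

```python
import math
from typing import Optional


def order_stat_cdf(u: float, r: int, n: int) -> float:
    """G_r(u): cdf of the r-th order statistic of n uniform(0, 1) variables."""
    return sum(math.comb(n, j) * u ** j * (1.0 - u) ** (n - j) for j in range(r, n + 1))


def order_stat_pdf(u: float, r: int, n: int) -> float:
    """Density of G_r: n! / ((r-1)! (n-r)!) * u^(r-1) * (1-u)^(n-r)."""
    coef = math.factorial(n) // (math.factorial(r - 1) * math.factorial(n - r))
    return coef * u ** (r - 1) * (1.0 - u) ** (n - r)


def pcs_identical(n: int, r: int, c: int, k: int, steps: int = 2000) -> float:
    """Simpson approximation of the integral of G_{r-c}(u)^(k-1) dG_r(u) over [0, 1]."""
    h = 1.0 / steps
    total = 0.0
    for i in range(steps + 1):
        u = i * h
        weight = 1 if i in (0, steps) else (4 if i % 2 else 2)
        total += weight * order_stat_cdf(u, r - c, n) ** (k - 1) * order_stat_pdf(u, r, n)
    return total * h / 3.0


def smallest_c(n: int, alpha: float, k: int, p_star: float) -> Optional[int]:
    """Smallest c in 1..r-1 meeting the condition, or None if P* is too large."""
    r = math.floor((n + 1) * alpha + 1e-12)   # r <= (n + 1) alpha < r + 1
    for c in range(1, r):
        if pcs_identical(n, r, c, k) >= p_star:
            return c
    return None


# With n = 3, alpha = 1/2, k = 2 we have r = 2 and the integral for c = 1 equals 0.8.
c_ok = smallest_c(3, 0.5, 2, 0.75)
c_none = smallest_c(3, 0.5, 2, 0.90)
```

The second call returns no value because, for this small n, the largest attainable probability is 0.8, which illustrates why an upper limit on the permissible $P^*$ arises.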
4.2. Distribution-free procedures using Bernoulli model
Let $\pi_1, \ldots, \pi_k$ be k populations with the associated continuous (life) distributions $F_1, \ldots, F_k$, respectively. The reliability of $\pi_i$ at L is $p_i = 1 - F_i(L)$. Let $X_{ij}$,
$j = 1, \ldots, n$, be sample observations from $\pi_i$, $i = 1, \ldots, k$. Define
$$Y_{ij} = \begin{cases} 1 & \text{if } X_{ij} > L, \\ 0 & \text{otherwise}, \end{cases} \quad i = 1, \ldots, k;\ j = 1, \ldots, n.$$
The $Y_{i1}, \ldots, Y_{in}$ are independent and identically distributed Bernoulli random
variables with success probability $p_i$, $i = 1, \ldots, k$. We are interested in selecting
the population associated with the largest $p_i$.
Gupta and Sobel (1960) proposed a subset selection rule based on
$Y_i = \sum_{j=1}^{n} Y_{ij}$, $i = 1, \ldots, k$. Their rule is
$R_{10}$: Select $\pi_i$ if and only if
$$Y_i \ge \max_{1 \le j \le k} Y_j - D$$
where D is the smallest nonnegative integer for which the P*-requirement is met.
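As an illustration of how life data are reduced to Bernoulli counts, the following sketch counts the observations exceeding L in each sample and applies the rule with a given D. The lifetimes and the value of D are hypothetical; D would in practice be chosen to meet the P*-requirement.

```python
from typing import List, Sequence


def survivors(lifetimes: Sequence[float], L: float) -> int:
    """Number of observed lifetimes exceeding the time point L."""
    return sum(1 for x in lifetimes if x > L)


def select_r10(samples: Sequence[Sequence[float]], L: float, D: int) -> List[int]:
    """Keep population i iff Y_i >= max_j Y_j - D, where Y_i counts survivors past L."""
    counts = [survivors(s, L) for s in samples]
    threshold = max(counts) - D
    return [i for i, y in enumerate(counts) if y >= threshold]


# Hypothetical lifetimes (n = 4 per population) and reliability time L = 100.
lifetimes = [
    [120.0, 80.0, 200.0, 150.0],   # 3 survivors
    [90.0, 60.0, 70.0, 130.0],     # 1 survivor
    [210.0, 190.0, 95.0, 160.0],   # 3 survivors
]
subset_d1 = select_r10(lifetimes, L=100.0, D=1)
subset_d2 = select_r10(lifetimes, L=100.0, D=2)
```

Only the indicator of survival past L enters the rule, which is why no assumption on the form of the life distributions is needed.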
An interesting feature of Procedure $R_{10}$ is that the infimum of the PCS occurs
when $p_1 = \cdots = p_k = p$ (say) but it is not independent of their common value p.
For $k = 2$, Gupta and Sobel (1960) showed that the infimum takes place when
$p = 1/2$. When $k > 2$, the common value $p_0$ for which the infimum takes place is
not known. However, it is known that this common value $p_0 \to 1/2$ as $n \to \infty$. An
improvement in the situation is provided by Gupta, Huang and Huang (1976)
who investigated conditional selection rules and, using the conditioning argument,
obtained a conservative value of d. Their conditional procedure is
$R_{11}$: Select $\pi_i$ if and only if
$$Y_i \ge \max_{1 \le j \le k} Y_j - D(t)$$
given $T = \sum_{i=1}^{k} Y_i = t$, where $D(t) > 0$ is chosen to satisfy the P*-condition. Exact
result for the infimum of the PCS is obtained only for $k = 2$; in this case, the
infimum is attained when $p_1 = p_2 = p$ and is independent of the common value p.
For $k > 2$, Gupta, Huang and Huang (1976) obtained a conservative value for
$D(t)$ and also for D of Rule $R_{10}$. They have shown that $\inf P(\mathrm{CS} \mid R_{11}) \ge P^*$ if $D(t)$
is chosen such that
$$D(t) = \begin{cases} d(t) & \text{for } k = 2, \\ \max\{d(r): r = 0, 1, \ldots, \min(t, 2n)\} & \text{for } k > 2, \end{cases}$$
where $d(r)$ is defined as the smallest value such that
$$N(2; d(r), r, n) \ge \begin{cases} P^* \binom{2n}{r} & \text{for } k = 2, \\ [1 - (1 - P^*)(k - 1)^{-1}] \binom{2n}{r} & \text{for } k > 2, \end{cases} \tag{4.5}$$
and $N(k; d(t), t, n) = \sum \binom{n}{s_1} \cdots \binom{n}{s_k}$, with the summation taken over the set of all
nonnegative integers $s_i$ such that $\sum_{i=1}^{k} s_i = t$ and $s_k \ge \max_{1 \le j \le k-1} s_j - d(t)$.
A conservative constant d for Procedure $R_{10}$ is given by $d = \max_{0 \le t \le kn} d(t)$.
Gupta, Huang and Huang (1976) have tabulated the smallest value $d(t)$ satisfying
(4.5) for $k = 2, 4(1)10$, $n = 1(1)10$, $t = 1(1)20$, and $P^* = 0.75, 0.90, 0.95, 0.99$.
They have also tabulated the d-values (conservative) for Procedure $R_{10}$ for
$P^* = 0.75, 0.90, 0.95, 0.99$, and $n = 1(1)4$ when $k = 3(1)15$, and $n = 5(1)10$ when
$k = 3(1)5$.
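For k = 2 the quantity $d(t)$ can be obtained by direct enumeration. The sketch below assumes that $d(t)$ is the smallest nonnegative integer with $N(2; d, t, n) \ge P^* \binom{2n}{t}$, where $N(2; d, t, n)$ sums $\binom{n}{s_1}\binom{n}{s_2}$ over $s_1 + s_2 = t$ with $s_2 \ge s_1 - d$; the restriction to integers and the function names are assumptions made for illustration.

```python
import math


def count_n2(d: int, t: int, n: int) -> int:
    """N(2; d, t, n): sum of C(n, s1) C(n, s2) over s1 + s2 = t with s2 >= s1 - d."""
    total = 0
    for s1 in range(max(0, t - n), min(n, t) + 1):
        s2 = t - s1
        if s2 >= s1 - d:
            total += math.comb(n, s1) * math.comb(n, s2)
    return total


def smallest_d(t: int, n: int, p_star: float) -> int:
    """Smallest nonnegative integer d with N(2; d, t, n) >= P* C(2n, t), for 0 <= t <= 2n."""
    target = p_star * math.comb(2 * n, t)
    d = 0
    while count_n2(d, t, n) < target:
        d += 1                       # stops by d = n, where N equals C(2n, t)
    return d


# Conservative constant for the unconditional rule with k = 2: the maximum over t.
n_obs, p_star = 1, 0.75
d_values = [smallest_d(t, n_obs, p_star) for t in range(0, 2 * n_obs + 1)]
d_conservative = max(d_values)
```

The loop always terminates because, once d reaches n, every pair $(s_1, s_2)$ is admitted and the sum equals $\binom{2n}{t}$ by the Vandermonde identity.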
Under the IZ formulation, one can use the procedure of Sobel and Huyett
(1957) for selecting the population associated with the largest $p_i$ which guarantees
a minimum PCS $P^*$ whenever $p_{[k]} - p_{[k-1]} \ge \Delta^* > 0$. Based on samples of size
n from each population, their procedure based on the $Y_i$ defined in (4.1) is
$R_{12}$: Select the population associated with the largest $Y_i$,
using randomization to break ties, if any.
The sample size required is the smallest n for which the PCS $\ge P^*$ when
$p_{[1]} = \cdots = p_{[k-1]} = p_{[k]} - \Delta^*$, the LFC in this case. Sobel and Huyett (1957)
have tabulated the sample sizes (exact and approximate) for $k = 2, 3, 4, 10$;
$\Delta^* = 0.05(0.05)0.50$, and $P^* = 0.50, 0.60, 0.75(0.05)0.95, 0.99$.
When n is large, the normal approximation to the PCS yields
$$n \approx c^2 (1 - \Delta^{*2}) / 4\Delta^{*2}$$
where $c = c(k, P^*)$ is the constant satisfying
$$\int_{-\infty}^{\infty} \Phi^{k-1}(x + c)\, \varphi(x)\, dx = P^*$$
and, $\Phi$ and $\varphi$ denote correspondingly the cdf and density of the standard normal
distribution. The c-value can be obtained from tables of Bechhofer (1954), Gupta
(1963b), Milton (1963) and Gupta, Nagel and Panchapakesan (1973) for several
selected values of k and $P^*$.
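The large-sample computation can be sketched numerically. The sketch assumes that the constant c solves $\int \Phi^{k-1}(x + c)\varphi(x)\,dx = P^*$ and that the approximation is taken in the form $n \approx c^2(1 - \Delta^{*2})/(4\Delta^{*2})$, rounded up; both the form of the approximation and the function names are assumptions for illustration, and tabulated c-values should be preferred.

```python
import math


def std_normal_pdf(x: float) -> float:
    return math.exp(-0.5 * x * x) / math.sqrt(2.0 * math.pi)


def std_normal_cdf(x: float) -> float:
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def pcs_normal(c: float, k: int, steps: int = 2000, bound: float = 8.0) -> float:
    """Simpson approximation of the integral of Phi(x + c)^(k-1) phi(x) over [-bound, bound]."""
    h = 2.0 * bound / steps
    total = 0.0
    for i in range(steps + 1):
        x = -bound + i * h
        weight = 1 if i in (0, steps) else (4 if i % 2 else 2)
        total += weight * std_normal_cdf(x + c) ** (k - 1) * std_normal_pdf(x)
    return total * h / 3.0


def normal_constant_c(k: int, p_star: float) -> float:
    """Solve pcs_normal(c, k) = P* by bisection; requires 1/k < P* < 1."""
    lo, hi = 0.0, 10.0       # the integral increases in c, from 1/k at c = 0
    for _ in range(60):
        mid = 0.5 * (lo + hi)
        if pcs_normal(mid, k) < p_star:
            lo = mid
        else:
            hi = mid
    return hi


def approx_sample_size(k: int, p_star: float, delta_star: float) -> int:
    c = normal_constant_c(k, p_star)
    return math.ceil(c * c * (1.0 - delta_star ** 2) / (4.0 * delta_star ** 2))


# For k = 2 the integral equals Phi(c / sqrt(2)), so c = sqrt(2) * 1.644854 at P* = 0.95.
c_two = normal_constant_c(2, 0.95)
n_two = approx_sample_size(2, 0.95, 0.10)
```

The k = 2 case serves as a check, since the constant then has a closed form in terms of the standard normal quantile.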
The Bernoulli selection problem has applications to the drug selection problem
and to clinical trials. This fact has spurred lots of research activity involving
investigations of selection procedures using sampling procedures such as the
play-the-winner (PW) sampling rule (introduced by Robbins, 1952 and 1956) and
vector-at-a-time (VT) rule with a variety of stopping rules. One of the main
considerations in many of these procedures is to design the sampling rule so as
to minimize the expected total number of observations and/or the expected number of observations from the worst population. Some of these procedures suffer
from one drawback or another. For excellent reviews and comprehensive assessments of these (and other) procedures, reference should be made to Bechhofer and
Kulkarni (1982), Büringer, Martin and Schriever (1980), Gupta and
Panchapakesan (1979, Sections 4.2 through 4.6), and Hoel, Sobel and Weiss
(1975). For corresponding developments in subset selection theory, see Gupta and
Panchapakesan (1979, Section 13.2).
5. Selection from restricted families of distributions
A restricted family of probability distributions is defined by a partial order
relation with respect to a known distribution. As we have pointed out earlier, such
families provide characterizations of life length distributions. Selection rules for
such restricted families were first considered by Barlow and Gupta (1969). We
define below the binary partial order relations ($<$) that have been used in studying
selection procedures. These are partial orderings in the sense that they enjoy only
the reflexivity and transitivity properties, that is, (1) $F < F$ for all distributions F, and
(2) $F < G$, $G < H$ implies $F < H$. Note that $F < G$ and $G < F$ do not necessarily
imply $F \equiv G$.
DEFINITION 5.1. (1) F is said to be convex with respect to G ($F <_c G$) if and
only if $G^{-1}F(x)$ is convex on the support of F.
(2) F is said to be star-shaped with respect to G ($F <_* G$) if and only if
$F(0) = G(0) = 0$, and $G^{-1}F(x)/x$ is increasing in $x \ge 0$ on the support of F.
(3) F is said to be r-ordered with respect to G ($F <_r G$) if and only if
$F(0) = G(0) = 1/2$ and $G^{-1}F(x)/x$ is increasing (decreasing) in x positive (negative).
(4) F is said to be tail-ordered with respect to G ($F <_t G$) if and only if
$F(0) = G(0) = 1/2$ and $G^{-1}F(x) - x$ is increasing on the support of F.
It is well-known that convex ordering implies star ordering. Further, when
$G(x) = 1 - e^{-x}$ ($x \ge 0$), $F <_c G$ is equivalent to saying that F has an increasing
failure rate (IFR) and $F <_* G$ is equivalent to saying that F has an increasing
failure rate on the average (IFRA). Of course, if F is IFR, then it is also IFRA. IFR
distributions were first studied in detail by Barlow, Marshall and Proschan (1963)
and IFRA distributions by Birnbaum, Esary and Marshall (1966). The r-ordering
was investigated by Lawrence (1975). Doksum (1969) used the tail-ordering. The
convex ordering and s-ordering (not defined here) have been studied by van Zwet
(1964). Without the assumption of the common median zero, Definition 5.1-(4)
has been used by Bickel and Lehmann (1979) to define an ordering by spread with
the germinal concept attributed to Brown and Tukey (1946). Saunders and Moran
(1978) have also perceived this kind of ordering (called ordering by dispersion by
them) in the context of a neurobiological problem.
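The star-ordering condition with exponential G can be checked numerically on a grid. The sketch below tests whether $G^{-1}F(x)/x$ is nondecreasing for a Weibull F, for which $G^{-1}F(x) = (x/\theta)^{\lambda}$, so the ratio is monotone exactly when the shape parameter is at least 1. The function names, the grid and the tolerance are illustrative assumptions; a grid check of this kind can suggest, but not prove, the ordering.

```python
from math import expm1, log1p


def weibull_cdf(x, shape, scale=1.0):
    """F(x) = 1 - exp(-(x / scale)^shape), computed stably."""
    return -expm1(-((x / scale) ** shape))


def exp_quantile(u):
    """G^{-1}(u) for the standard exponential G(x) = 1 - exp(-x)."""
    return -log1p(-u)


def is_star_shaped(cdf, ref_quantile, grid, tol=1e-9):
    """True if G^{-1}F(x) / x is nondecreasing along the (positive) grid."""
    ratios = [ref_quantile(cdf(x)) / x for x in grid]
    return all(b >= a - tol for a, b in zip(ratios, ratios[1:]))


grid = [0.05 * i for i in range(1, 61)]  # 0.05, 0.10, ..., 3.00

ifra_like = is_star_shaped(lambda x: weibull_cdf(x, 2.0), exp_quantile, grid)
not_ifra = is_star_shaped(lambda x: weibull_cdf(x, 0.5), exp_quantile, grid)
```

Here `ifra_like` refers to a Weibull distribution with shape 2 (increasing failure rate) and `not_ifra` to one with shape 0.5 (decreasing failure rate).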
Gupta and Panchapakesan (1974) have defined a general partial ordering
through a class of real-valued functions, which provides a unified way to handle
selection problems for star-ordered and tail-ordered families. Their ordering is
defined as follows.
DEFINITION 5.2. Let $\mathcal{H} = \{h(x)\}$ be a class of real-valued functions h(x). Let F
and G be distributions such that F(0) = G(0). F is said to be $\mathcal{H}$-ordered with
respect to G ($F <_{\mathcal{H}} G$) if $G^{-1}F(h(x)) \ge h(G^{-1}F(x))$ for all $h \in \mathcal{H}$ and all x on
the support of F.
It is easy to see that we get star-ordering and tail-ordering as special cases of
$\mathcal{H}$-ordering by taking $\mathcal{H} = \{ax,\ a \ge 1\}$, F(0) = G(0) = 0, and $\mathcal{H} = \{x + b,\ b \ge 0\}$, F(0) = G(0) = 1/2, respectively. Hooper and Santner (1979) have used a
modified definition of $\mathcal{H}$-ordering. For some useful probability inequalities involving $\mathcal{H}$-ordering, see Gupta, Huang and Panchapakesan (1984).
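To see why these two choices of the class of functions recover the earlier orderings, the defining inequality of Definition 5.2 can be rewritten as follows (a short worked step, stated for x > 0 in the first case):

```latex
% h(x) = ax with a >= 1: divide both sides by ax > 0
G^{-1}F(ax) \ge a\,G^{-1}F(x)
\iff \frac{G^{-1}F(ax)}{ax} \ge \frac{G^{-1}F(x)}{x}
\quad \text{for all } a \ge 1,
% h(x) = x + b with b >= 0: subtract x + b from both sides
G^{-1}F(x+b) \ge G^{-1}F(x) + b
\iff G^{-1}F(x+b) - (x+b) \ge G^{-1}F(x) - x
\quad \text{for all } b \ge 0.
```

The first equivalence says that $G^{-1}F(x)/x$ is increasing in x, and the second that $G^{-1}F(x) - x$ is increasing in x, which are the conditions in Definition 5.1 (2) and (4).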
5.1. Selection in terms of quantiles from star-ordered distributions
Let $\pi_1, \ldots, \pi_k$ have the associated absolutely continuous distributions
$F_1, \ldots, F_k$, respectively. All the $F_i$ are star-shaped with respect to a known
continuous distribution G. The population having the largest $\alpha$-quantile
($0 < \alpha < 1$) is defined as the best population. It is assumed that the best population is stochastically larger than any of the other populations. Under this setup,
Barlow and Gupta (1969) proposed a procedure for selecting a subset containing
the best. Let $T_{j,i}$ denote the jth order statistic in a sample of n independent
observations from $\pi_i$, i = 1, ..., k, where n is assumed to be large enough so that
$j \le (n + 1)\alpha < j + 1$ for some j. The Barlow-Gupta procedure is
R13: Select $\pi_i$ if and only if
$$T_{j,i} \ge c \max_{1 \le r \le k} T_{j,r},$$
where c = c(k, P*, n, j) is the largest number in (0, 1) for which the P*-condition
is satisfied. The constant c is given by
$$\int_0^{\infty} G_j^{k-1}(x/c)\, g_j(x)\, dx = P^*, \qquad (5.2)$$
where $G_j$ denotes the cdf of the jth order statistic in a sample of n observations
from G, and $g_j$ is the corresponding density function. The values of c satisfying
(5.2) are tabulated by Barlow, Gupta and Panchapakesan (1969) in the special
case of exponential G, i.e. for selecting from IFRA populations, for P* = 0.75,
0.90, 0.95, 0.99, and the following values of k, n, and j: (i) j = 1, k = 2(1)11 (in
this case, c is independent of n), (ii) k = 2(1)6, j = 2(1)n, and n = 5(1)10 or 12 or
15 depending on k. Table 2a is excerpted from the tables of Barlow, Gupta and
Panchapakesan (1969). It gives the values of c for P* = 0.90, 0.95, k = 2(1)5,
n = 5(1)12, and j such that $j \le (n + 1)/2 < j + 1$ (i.e. appropriate for selection in
terms of median).
Table 2a
Values of the constant c of Rule R13 satisfying equation (5.2) for selecting
the IFRA distribution with the largest median; $G(x) = 1 - e^{-x}$, $x \ge 0$,
$j \le (n + 1)/2 < j + 1$, P* = 0.90 (top entry), 0.95 (bottom entry)
Table 2b
Values of the constant d of Rule R14 satisfying equation (5.4) for selecting
the IFRA distribution with the smallest median; $G(x) = 1 - e^{-x}$, $x \ge 0$,
$j \le (n + 1)/2 < j + 1$, P* = 0.90 (top entry), 0.95 (bottom entry)
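For exponential G, the constant c of R13 can also be approximated numerically from $\int_0^\infty G_j^{k-1}(x/c)\,g_j(x)\,dx = P^*$. The sketch below substitutes u = G(x), so that $g_j(x)\,dx$ becomes a Beta(j, n - j + 1) density on (0, 1), applies the midpoint rule, and bisects on c. The function names, step count and tolerance are illustrative assumptions, and the result is a numerical approximation rather than a reproduction of the published tables.

```python
from math import comb, exp, log


def order_stat_cdf(y, n, j):
    """cdf of the j-th order statistic of n standard exponential lifetimes."""
    u = 1.0 - exp(-y)
    return sum(comb(n, i) * u**i * (1.0 - u) ** (n - i) for i in range(j, n + 1))


def pcs_r13(c, k, n, j, steps=4000):
    """Midpoint-rule value of the integral of G_j(x/c)^(k-1) g_j(x) over x > 0,
    after the substitution u = G(x) = 1 - exp(-x)."""
    total = 0.0
    for m in range(steps):
        u = (m + 0.5) / steps
        x = -log(1.0 - u)
        beta_pdf = j * comb(n, j) * u ** (j - 1) * (1.0 - u) ** (n - j)
        total += order_stat_cdf(x / c, n, j) ** (k - 1) * beta_pdf
    return total / steps


def solve_c_r13(k, p_star, n, j, tol=1e-6):
    """Largest c in (0, 1) with pcs_r13 >= p_star; the integral decreases in c."""
    lo, hi = 1e-9, 1.0
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if pcs_r13(mid, k, n, j) >= p_star:
            lo = mid
        else:
            hi = mid
    return lo
```

For k = 2 and j = 1 the integral equals 1/(1 + c) whatever the value of n, which is consistent with the remark that c is independent of n when j = 1 and gives a simple check on the routine.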
For the selection of the population with the smallest $\alpha$-quantile (assumed to be
stochastically smaller than any other $F_i$) the analogous procedure is
R14: Select $\pi_i$ if and only if
$$d\, T_{j,i} \le \min_{1 \le r \le k} T_{j,r},$$
where d = d(k, P*, n, j) is the largest number in (0, 1) satisfying the P*-condition
and is given by
$$\int_0^{\infty} [1 - G_j(xd)]^{k-1} g_j(x)\, dx = P^*, \qquad (5.4)$$
where $G_j$ and $g_j$ are defined as in (5.2). Barlow, Gupta and Panchapakesan (1969)
have tabulated the values of d in the case of exponential G for P* = 0.75, 0.90,
0.95, 0.99 and the following values of k, n, and j: (i) j = 1, k = 2(1)11 (d is
independent of n), (ii) k = 2(1)6, j = 2(1)n, n = 5(1)12 for k = 6, and n = 5(1)15
for other k values. Table 2b is excerpted from the tables of Barlow, Gupta and
Panchapakesan (1969). It gives the values of d for P* = 0.90, 0.95, k = 2(1)5,
n = 5(1)12, and j such that $j \le (n + 1)/2 < j + 1$ (i.e. appropriate for selection in
terms of median).
Suppose that G is the Weibull distribution with cdf $G(x) = 1 - \exp\{-(x/\theta)^{\lambda}\}$,
$x \ge 0$, and $\theta, \lambda > 0$. It is assumed that $\lambda$ is known. Then it is easy to see that
the new constant $c_1$ is given by $c_1 = c^{1/\lambda}$, where c is the constant in the exponential case ($\lambda = 1$). Another interesting special case of G is the half-normal distribution obtained by folding $N(0, \sigma^2)$ at the origin, where $\sigma$ is assumed to be known.
The class of distributions which are star-shaped with respect to this folded normal
is a subclass of IFRA distributions. Selection in terms of quantiles in this case
has been considered by Gupta and Panchapakesan (1975), who have tabulated
the constant c associated with R13 for k = 2(1)10, n = 5(1)10, j = 1(1)n, and
P* = 0.75, 0.90, 0.95, 0.99.
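The relation between the Weibull and exponential constants can be seen from a power transformation of the observations. The following worked step is a sketch of that argument, written for rule R13 with $c_1$ denoting the Weibull-case constant:

```latex
% If X has cdf G(x) = 1 - \exp\{-(x/\theta)^{\lambda}\}, then (X/\theta)^{\lambda}
% is standard exponential. The map x -> x^{\lambda} is increasing on x >= 0, so it
% carries the j-th order statistic T_{j,i} to the j-th order statistic of the
% transformed sample, and
T_{j,i} \ge c_1 \max_{1 \le r \le k} T_{j,r}
\iff T_{j,i}^{\lambda} \ge c_1^{\lambda} \max_{1 \le r \le k} T_{j,r}^{\lambda},
\qquad \text{so that } c_1^{\lambda} = c \ \text{ and } \ c_1 = c^{1/\lambda}.
```

The same transformation maps a distribution that is star-shaped with respect to the Weibull G into one that is star-shaped with respect to the exponential, so the rule applied to the transformed data is the exponential-case rule with constant c.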
5.2. Selection in terms of medians from tail-ordered distributions
Barlow and Gupta (1969) considered also the selection of the population with
the largest median (assumed to be stochastically larger than other populations)
from a set of distributions $F_i$, i = 1, ..., k, which have lighter tails than a specified
distribution G with G(0) = 1/2. This means that, for each i, $F_i$ centered at its
median $\Delta_i$ is r-ordered with respect to G, and
$(d/dx)F_i(x + \Delta_i)|_{x=0} \ge (d/dx)G(x)|_{x=0}$. This definition of $F_i$ having a lighter tail than G used by them
implies that $F_i$ centered at $\Delta_i$ is tail-ordered with respect to G. The procedure of
Barlow and Gupta (1969) has been shown by Gupta and Panchapakesan (1974)
to work for this wider class defined using tail-ordering. Actually, Gupta and
Panchapakesan have also shown a generalized version of this by considering
tail-ordering of $F_i$ and G when both are centered at their respective $\alpha$-quantiles.
For selection in terms of medians, the procedure of Barlow and Gupta is
R15: Select $\pi_i$ if and only if
$$T_{j,i} \ge \max_{1 \le r \le k} T_{j,r} - D,$$
where $j \le (n + 1)/2 < j + 1$, the $T_{j,r}$ are defined as in the case of the procedure R13, and the appropriate
constant D = D(k, P*, n) > 0 is given by
$$\int_{-\infty}^{\infty} G_j^{k-1}(t + D)\, g_j(t)\, dt = P^*.$$
Here, $G_j$ and $g_j$ are the cdf and the density of the jth order statistic in a sample
of n independent observations from G. The values of D are given by Gupta and
Panchapakesan (1974) in the special case where G is the logistic distribution,
$G(x) = [1 + e^{-x}]^{-1}$, for k = 2(1)10, n = 5(2)15, and P* = 0.75, 0.90, 0.95, 0.99.
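Once D has been obtained from the tables, applying the rule is a matter of comparing sample medians. The sketch below assumes the rule has the form "retain population i when $T_{j,i} \ge \max_r T_{j,r} - D$" with j the index satisfying $j \le (n+1)/2 < j+1$; the function names are hypothetical, equal sample sizes are assumed, and the value of D in the usage line is an arbitrary placeholder rather than a tabulated constant.

```python
def median_index(n):
    """The j with j <= (n + 1)/2 < j + 1, as a 1-based order-statistic index."""
    return (n + 1) // 2


def select_r15(samples, D):
    """Return the 0-based indices of the populations retained by the rule
    T_{j,i} >= max_r T_{j,r} - D (equal sample sizes assumed)."""
    n = len(samples[0])
    j = median_index(n)
    t = [sorted(s)[j - 1] for s in samples]  # j-th order statistic of each sample
    threshold = max(t) - D
    return [i for i, ti in enumerate(t) if ti >= threshold]


data = [[1, 2, 3, 4, 5], [2, 3, 4, 5, 6], [0, 0, 1, 1, 1]]
kept = select_r15(data, D=1.5)  # D = 1.5 is a placeholder, not a table value
```

A larger D retains more populations, which is how the rule trades subset size against the probability of including the best.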
Using the $\mathcal{H}$-ordering (Definition 5.2) with the functions h satisfying certain
properties, Gupta and Panchapakesan (1974) have discussed a class of procedures for selecting the best (i.e. the one which is stochastically larger than any
other, assumed to exist) of k distributions $F_i$, i = 1, ..., k, which are $\mathcal{H}$-ordered with
respect to G. The procedures R13 and R15 are special cases of their procedure.
Hooper and Santner (1979) considered selection of good populations in terms
of $\alpha$-quantiles for star- and tail-ordered distributions using the RSS approach. Let
$\pi_i$ have the distribution $F_i$ and let $F_{[i]}$ denote the distribution having the ith
smallest $\alpha$-quantile. Denoting the $\alpha$-quantile of any distribution F by $x_\alpha(F)$, $\pi_i$ is
called a good population if $x_\alpha(F_i) > c^* x_\alpha(F_{[k-t+1]})$, $0 < c^* < 1$, in the case of
star-ordered families, and if $x_\alpha(F_i) > x_\alpha(F_{[k-t+1]}) - d^*$, $d^* > 0$, in the case of
tail-ordered families. The goal of Hooper and Santner (1979) is to select a subset
of size not exceeding m ($1 \le m \le k - 1$) that contains at least one good population. They have also considered the problem of selecting a subset of fixed size s
so as to include at least r good populations ($r \le t$, $r \le s < k - t + r$) using the IZ
approach.
Selection of one or more good populations as a goal is a relaxation from that
of selecting the best population(s). A good population is defined suitably to reflect
the fact that it is 'nearly' as good as the best. In some form or other it has been
considered by several authors; mention should be made of Fabian (1962),
Lehmann (1963), Desu (1970), Carroll, Gupta and Huang (1975), and Panchapakesan and Santner (1977). A discussion of this can be found in Gupta and
Panchapakesan (1985, Section 4.2).
5.3. Selection from convex ordered distributions
Let $\pi_1, \ldots, \pi_k$ have absolutely continuous distributions $F_1, \ldots, F_k$, respectively,
of which one is assumed to be stochastically larger than the rest. This distribution,
denoted by $F_{[k]}$, is defined to be the best. It is assumed that $F_{[k]} <_c G$, where G
is a known continuous distribution. All distributions in this context are assumed
to have the positive real line as the support. Let $X^{(i)}_{j,n}$ ($Y_{j,n}$) denote the jth order
statistic in a random sample of size n from $F_i$ (G). Considering samples of size n
from $F_1, \ldots, F_k$, each censored at the rth failure, define
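$$T_i = \sum_{j=1}^{r} a_j X^{(i)}_{j,n}, \qquad i = 1, \ldots, k,$$
where
$$a_j = gG^{-1}\!\left(\frac{j-1}{n}\right) - gG^{-1}\!\left(\frac{j}{n}\right), \quad j = 1, \ldots, r-1, \qquad a_r = gG^{-1}\!\left(\frac{r-1}{n}\right),$$
and g is the density associated with G.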
If $G(y) = 1 - e^{-y}$, $y \ge 0$, then $a_1 = \cdots = a_{r-1} = 1/n$, and $a_r = (n - r + 1)/n$.
Consequently, $nT_i = \sum_{j=1}^{r-1} X^{(i)}_{j,n} + (n - r + 1) X^{(i)}_{r,n}$, the well-known total life statistic
until the rth failure from $F_i$.
Now, for selecting a subset containing $F_{[k]}$, Gupta and Lu (1979) proposed the rule
R16: Select $\pi_i$ if and only if
$$T_i \ge c \max_{1 \le j \le k} T_j,$$
where c is the largest number in (0, 1) satisfying the P*-condition. They have
shown that, if $a_j \ge 0$ for j = 1, ..., r, $a_r \ge c$, and $g(0) \le 1$, then
$$\inf_{\Omega} P(\mathrm{CS} \mid R16) = \int_0^{\infty} G_T^{k-1}(y/c)\, dG_T(y), \qquad (5.10)$$
where $G_T$ is the distribution of $T = \sum_{j=1}^{r} a_j Y_{j,n}$, and $\Omega$ is the space of all k-tuples
$(F_1, \ldots, F_k)$ such that there is one among them which is stochastically larger than
the others and is convex with respect to G. Thus, the constant $c = \min(a_r, c^*)$,
where c* is the solution for c obtained by equating the right-hand side of (5.10) to P*.
For the special case of $G(y) = 1 - e^{-y}$, $y \ge 0$, we get $c = \min(c^*, (n - r + 1)/n)$. This special case is a slight generalization of the results of Patel (1976).
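In the exponential special case the statistic and the rule are simple to compute. The sketch below forms $T_i$ from the r smallest observations with weights 1/n for the first r - 1 failures and (n - r + 1)/n for the rth, and then applies a rule of the form "retain population i when $T_i \ge c \max_j T_j$". The function names are hypothetical and the value of c in the usage line is a placeholder; in practice c would be min(c*, (n - r + 1)/n) with c* determined from the P*-condition.

```python
def total_life_statistic(sample, r):
    """T = (1/n) * [X_(1) + ... + X_(r-1) + (n - r + 1) * X_(r)] for a sample
    of n lifetimes censored at the r-th failure (exponential-G weights)."""
    n = len(sample)
    x = sorted(sample)[:r]  # only the first r failures are observed
    return (sum(x[: r - 1]) + (n - r + 1) * x[r - 1]) / n


def select_r16(samples, r, c):
    """Return the 0-based indices i with T_i >= c * max_j T_j."""
    t = [total_life_statistic(s, r) for s in samples]
    cutoff = c * max(t)
    return [i for i, ti in enumerate(t) if ti >= cutoff]


lifetimes = [
    [1, 2, 3, 4, 5],
    [2, 4, 6, 8, 10],
    [0.1, 0.2, 0.3, 0.4, 0.5],
]
kept = select_r16(lifetimes, r=3, c=0.4)  # c = 0.4 is a placeholder value
```

Only the r smallest observations in each sample enter the statistic, so the rule can be applied as soon as the rth failure has been observed in every sample.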
6. Comparison with a standard or control
Although the experimenter is generally interested in selecting the best of k ($\ge 2$)
competing categories, in some situations even the best one among them may not
be good enough to warrant its selection. Such a situation arises when the
goodness of a population is defined in comparison with a standard (known) or
a control population. For convenience, we may refer to either one as the control.
Let $\pi_1, \ldots, \pi_k$ be the k (experimental) populations with associated distribution
functions $F(x, \theta_i)$, i = 1, ..., k, respectively. The $\theta_i$ are unknown. Let $\theta_0$ be the
specified standard or the unknown parameter associated with the control population $\pi_0$ whose distribution function is $F(x, \theta_0)$. Several different goals have been
considered in the literature. For example, one may want to select the best experimental population (i.e. the one associated with $\theta_{[k]}$, the largest $\theta_i$) provided that
it is better than the control (i.e. $\theta_{[k]} > \theta_0$), and not to select any of them otherwise. An alternative goal is to select a subset (of random size) of the k populations which includes all those populations that are better than the control. Some
of the early papers dealing with these problems are Paulson (1952), Dunnett
(1955), and Gupta and Sobel (1958).
One can define a good population in different ways using comparison with a
control. For example, $\pi_i$ may be called good if $\theta_i > \theta_0 + \Delta$, or $|\theta_i - \theta_0| \le \Delta$ for
some $\Delta > 0$. Several procedures have been investigated with the goal of selecting
good populations or those better than the control and these will not be described
here. A good account of these is given in Gupta and Panchapakesan (1979,
Chapter 20). A review of subset selection procedures in this context, including
recent developments, is contained in Gupta and Panchapakesan (1985).
An important aspect of the recent developments is the so-called isotonic procedures, which become relevant in situations where it is known that
$\theta_1 \le \theta_2 \le \cdots \le \theta_k$ although the values of the $\theta_i$ are unknown. This is typical, for
example, of experiments involving different dose levels of a drug so that the
treatment effects will have a known ordering. Suppose that a population $\pi_i$ is
defined to be good if $\theta_i \ge \theta_0$ and bad otherwise. For the goal of selecting all the
good populations, any reasonable procedure R should have the property: If R
selects $\pi_i$ then it selects all populations $\pi_j$ for j > i. This is the isotonic behavior
of R. Naturally, one would consider procedures based on isotonic estimators of the
$\theta_i$. Such procedures have been recently studied by Gupta and Yang (1984) in the
case of normal means (common variance $\sigma^2$, known or unknown), by Gupta and
Huang (1984) in the case of binomial populations with success probabilities $\theta_i$,
and by Gupta and Leu (1986) in the case of two-parameter exponential populations with guarantee times (location parameters) $\theta_i$ and common (known or
unknown) scale parameter. All these papers deal with both cases of known and
unknown $\theta_0$.
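The isotonic estimators mentioned above can be computed with the pool-adjacent-violators algorithm. The sketch below fits a nondecreasing sequence to the raw estimates and then retains every population whose fitted value reaches a cutoff; because the fitted values are nondecreasing, the retained set automatically has the isotonic behavior described in the text. The function names and the cutoff are illustrative assumptions and are not the constants or exact rules of the papers cited.

```python
def pava(values, weights=None):
    """Pool-adjacent-violators: weighted least-squares nondecreasing fit."""
    if weights is None:
        weights = [1.0] * len(values)
    blocks = []  # each block holds [mean, total weight, number of points]
    for v, w in zip(values, weights):
        blocks.append([float(v), float(w), 1])
        # Merge backwards while the last two blocks violate monotonicity.
        while len(blocks) > 1 and blocks[-2][0] > blocks[-1][0]:
            m2, w2, c2 = blocks.pop()
            m1, w1, c1 = blocks.pop()
            wt = w1 + w2
            blocks.append([(m1 * w1 + m2 * w2) / wt, wt, c1 + c2])
    fitted = []
    for m, _, c in blocks:
        fitted.extend([m] * c)
    return fitted


def select_isotonic(estimates, cutoff):
    """Indices whose isotonic estimate is at least the (hypothetical) cutoff."""
    fit = pava(estimates)
    return [i for i, f in enumerate(fit) if f >= cutoff]


chosen = select_isotonic([1, 3, 2, 4], cutoff=2.2)
```

In this example the second and third raw estimates violate the assumed ordering and are pooled to a common value before the cutoff is applied.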
7. Concluding remarks
In the preceding sections, we have described several selection procedures that
have special significance in reliability studies. However, we have confined our
attention to the classical type procedures since they are of common interest to a
wide variety of users. We have also generally restricted ourselves to single-stage
procedures. There is ample literature on two-stage and sequential procedures.
Further, we have not discussed decision-theoretic formulations and Bayes and
empirical Bayes procedures. There have been substantial developments in these
regards, especially using subset selection approach, in the last ten years. For a
comprehensive survey of developments until the late 1970's, we refer to Gupta
and Panchapakesan (1979). A critical review of developments in the subset selection theory including very recent developments is given by Gupta and Panchapakesan (1985).
References
Bain, L. (1978). Statistical Analysis of Reliability and Life-Testing Models, Theory and Methods. Marcel
Dekker, New York.
Barlow, R. E. and Gupta, S. S. (1969). Selection procedures for restricted families of distributions.
Ann. Math. Statist. 40, 905-917.
Barlow, R. E., Gupta, S. S. and Panchapakesan, S. (1969). On the distribution of the maximum and
minimum of ratios of order statistics. Ann. Math. Statist. 40, 918-934.
Barlow, R. E., Marshall, A. W. and Proschan, F. (1963). Properties of probability distributions with
monotone hazard rate. Ann. Math. Statist. 34, 375-389.
Bechhofer, R. E. (1954). A single-sample multiple decision procedure for ranking means of normal
populations with known variances. Ann. Math. Statist. 25, 16-39.
Bechhofer, R. E., Dunnett, C. W. and Sobel, M. (1954). A two-sample multiple-decision procedure
for ranking means of normal populations with a common unknown variance. Biometrika 41,
Bechhofer, R. E., Kiefer, J. and Sobel, M. (1968). Sequential Identification and Ranking Procedures
(with special reference to Koopman-Darmois populations). The University of Chicago Press, Chicago.
Bechhofer, R. E. and Kulkarni, R. V. (1982). Closed adaptive sequential procedures for selecting the
best of k >/2 Bernoulli populations. In: S. S. Gupta and J. O. Berger, eds., Statistical Decision
Theory and Related Topics--III, Vol. 1, Academic Press, New York, 61-108.
Berger, R. L. (1979). Minimax subset selection for loss measured by subset size. Ann. Statist. 7,
Berger, R. L. and Gupta, S. S. (1980). Minimax subset selection rules with applications to unequal
variance (unequal sample size) problems. Scand. J. Statist. 7, 21-26.
Bickel, P. J. and Lehmann, E. L. (1979). Descriptive statistics for nonparametric models IV. Spread.
In: Jana Jureckova, ed., Contributions to Statistics: Jaroslav Hajek Memorial Volume, Reidel, Boston,
Birnbaum, Z. W., Esary, J. D. and Marshall, A. W. (1966). A stochastic characterization of wear-out
for components and systems. Ann. Math. Statist. 37, 816-825.
Brown, G. and Tukey, J. W. (1946). Some distributions of sample means. Ann. Math. Statist. 17, 1-12.
Büringer, H., Martin, H. and Schriever, K.-H. (1980). Nonparametric Sequential Selection Procedures.
Birkhäuser, Boston, MA.
Carroll, R. J., Gupta, S. S. and Huang, D.-Y. (1975). On selection procedures for the t best
populations and some related problems. Comm. Statist. 4, 987-1008.
Desu, M. M. (1970). A selection problem. Ann. Math. Statist. 41, 1596-1603.
Desu, M. M., Narula, S. C. and Villarreal, B. (1977). A two-stage procedure for selecting the best
of k exponential distributions. Comm. Statist. A--Theory Methods 6, 1223-1230.
Desu, M. M. and Sobel, M. (1968). A fixed-subset size approach to a selection problem. Biometrika
55, 401-410. Corrections and amendments: 63 (1976), 685.
Doksum, K. (1969). Starshaped transformations and the power of rank tests. Ann. Math. Statist. 40,
Dudewicz, E. J. and Koo, J. O. (1982). The Complete Categorized Guide to Statistical Selection and
Ranking Procedures. Series in Mathematical and Management Sciences, Vol. 6, American Sciences
Press, Columbus, OH.
Dunnett, C. W. (1955). A multiple comparison procedure for comparing several treatments with a
control. J. Amer. Statist. Assoc. 50, 1096-1121.
Fabian, V. (1962). On multiple decision methods for ranking population means. Ann. Math. Statist.
33, 248-254.
Gibbons, J. D., Olkin, I. and Sobel, M. (1977). Selecting and Ordering Populations: A New Statistical
Methodology. Wiley, New York.
Gupta, S. S. (1956). On a decision rule for a problem in ranking means. Mimeograph Series No.
150, Institute of Statistics, University of North Carolina, Chapel Hill, NC.
Gupta, S. S. (1963a). On a selection and ranking procedure for gamma populations. Ann. Inst. Statist.
Math. 14, 199-216.
Gupta, S. S. (1963b). Probability integrals of the multivariate normal and multivariate t. Ann. Math.
Statist. 34, 792-828.
Gupta, S. S. (1965). On some multiple decision (selection and ranking) rules. Technometrics 7,
Gupta, S. S. and Huang, D.-Y. (1980). A note on optimal subset selection procedures. Ann. Statist.
8, 1164-1167.
Gupta, S. S. and Huang, D.-Y. (1981). Multiple Decision Theory: Recent Developments. Lecture Notes
in Statistics, Vol. 6, Springer, New York.
Gupta, S. S., Huang, D.-Y. and Huang, W.-T. (1976). On ranking and selection procedures and tests
of homogeneity for binomial populations. In: S. Ikeda, T. Hayakawa, H. Hudimoto, M. Okamoto,
M. Siotani and S. Yamamoto, eds., Essays in Probability and Statistics, Shinko Tsusho Co. Ltd.,
Tokyo, Japan, Chapter 33, 501-533.
Gupta, S. S., Huang, D.-Y. and Nagel, K. (1979). Locally optimal subset selection procedures based
on ranks. In: J. S. Rustagi, ed., Optimizing Methods in Statistics, Academic Press, New York,
Gupta, S. S., Huang, D.-Y. and Panchapakesan, S. (1984). On some inequalities and monotonicity
results in selection and ranking theory. In: Y. L. Tong, ed., Inequalities in Statistics and Probability,
IMS Lecture Notes--Monograph Series, Vol. 5, 211-217.
Gupta, S. S. and Huang, W.-T. (1984). On isotonic selection rules for binomial populations better than
a standard. In: A. M. Abuammoh, E. A. Ali, E. A. El-Neweihi and M. Q. El-Osh, eds.,
Developments in Statistics and Its Applications, King Saud Univ. Library, Riyadh, 89-112.
Gupta, S. S. and Kim, W.-C. (1984). A two-stage elimination type procedure for selecting the largest
of several normal means with a common unknown variance. In: T. J. Santner and A. C. Tamhane,
eds., Design of Experiments: Ranking and Selection, Marcel Dekker, New York, 77-93.
Gupta, S. S. and Leu, L.-Y. (1986). Isotonic procedures for selecting populations better than a
standard: two-parameter exponential distributions. In: A. P. Basu, ed., Reliability and Quality
Control, Elsevier Science Publishers B.V., Amsterdam, 167-183.
Gupta, S. S. and Liang, T.-C. (1987). Locally optimal subset selection rules based on ranks under
joint type II censoring. Statistics and Decisions 5, 1-13.
Gupta, S. S. and Lu, M.-W. (1979). Subset selection procedures for restricted families of probability
distributions. Ann. Inst. Statist. Math. 31, 235-252.
Gupta, S. S. and McDonald, G. C. (1982). Nonparametric procedures in multiple decisions (ranking
and selection procedures). In: B. V. Gnedenko, M. L. Puri and I. Vincze, eds., Colloquia
Mathematica Societatis Janos Bolyai, 32: Nonparametric Statistical Inference, Vol. I, North-Holland,
Amsterdam, 361-389.
Gupta, S. S., Nagel, K. and Panchapakesan, S. (1973). On the order statistics from equally correlated normal random variables. Biometrika 60, 403-413.
Gupta, S. S. and Panchapakesan, S. (1972). On a class of subset selection procedures. Ann. Math.
Statist. 43, 814-822.
Gupta, S. S. and Panchapakesan, S. (1974). Inference for restricted families: (a) multiple decision
procedures; (b) order statistics inequalities. In: F. Proschan and R. J. Serfling, eds., Reliability and
Biometry: Statistical Analysis of Lifelength, SIAM, Philadelphia, 503-596.
Gupta, S. S. and Panchapakesan, S. (1975). On a quantile selection procedure and associated
distribution of ratios of order statistics from a restricted family of probability distributions. In: R.
E. Barlow, J. B. Fussell and N. D. Singpurwalla, eds., Reliability and Fault Tree Analysis: Theoretical
and Applied Aspects of System Reliability and Safety Assessment, SIAM, Philadelphia, 557-576.
Gupta, S. S. and Panchapakesan, S. (1979). Multiple Decision Procedures: Theory and Methodology of
Selecting and Ranking Populations. Wiley, New York.
Gupta, S. S. and Panchapakesan, S. (1985). Subset selection procedures: review and assessment.
Amer. J. Management Math. Sci. 5, 235-311.
Gupta, S. S. and Santner, T. J. (1973). On selection and ranking procedures--a restricted subset
selection rule. Proceedings of the 39th Session of the International Statistical Institute, Vol. 45, Book I,
Gupta, S. S. and Sobel, M. (1958). On selecting a subset which contains all populations better than
a standard. Ann. Math. Statist. 29, 235-244.
Gupta, S. S. and Sobel, M. (1960). Selecting a subset containing the best of several binomial
populations. In: I. Olkin, S. G. Ghurye, W. Hoeffding, W. G. Madow and H. B. Mann, eds.,
Contributions to Probability and Statistics, Stanford University Press, Stanford, Chapter 20, 224-248.
Gupta, S. S. and Sobel, M. (1962a). On selecting a subset containing the population with the smallest
variance. Biometrika 49, 495-507.
Gupta, S. S. and Sobel, M. (1962b). On the smallest of several correlated F-statistics. Biometrika 49,
Gupta, S. S. and Yang, H.-M. (1984). Isotonic procedures for selecting populations better than a
control under ordering prior. In: J. K. Ghosh and J. Roy, eds., Statistics: Applications and New
Directions: Proceedings of the Indian Statistical Institute Golden Jubilee International Conference,
Indian Statistical Institute, Calcutta, 279-312.
Hoel, D. G., Sobel, M. and Weiss, G. H. (1975). A survey of adaptive sampling for clinical trials.
In: R. M. Elashoff, ed., Perspectives in Biometry, Academic Press, New York, 29-61.
Hooper, J. H. and Santner, T. J. (1979). Design of experiments for selection from ordered families
of distributions. Ann. Statist. 7, 615-643.
Huang, D.-Y. and Panchapakesan, S. (1982). Some locally optimal subset selection rules based on
ranks. In: S. S. Gupta and J. O. Berger, eds., Statistical Decision Theory and Related Topics--III,
Vol. 2, Academic Press, New York, 1-14.
Kim, W.-C. and Lee, S.-H. (1985). An elimination type two-stage selection procedure for exponential
distributions. Comm. Statist.--Theor. Meth. 14, 2563-2571.
Kingston, J. V. and Patel, J. K. (1980a). Selecting the best one of several Weibull populations. Comm.
Statist. A--Theory Methods 9, 383-398.
Kingston, J. V. and Patel, J. K. (1980b). A restricted subset selection procedure for Weibull
distributions. Comm. Statist. A--Theory Methods 9, 1371-1383.
Lawrence, M. J. (1975). Inequalities for s-ordered distributions. Ann. Statist. 3, 413-428.
Lehmann, E. L. (1963). A class of selection procedures based on ranks. Math. Annalen 150, 268-275.
Milton, R. C. (1963). Tables of equally correlated multivariate normal probability integral. Technical
Report No. 27, Department of Statistics, University of Minnesota, Minneapolis, MN.
Nagel, K. (1970). On subset selection rules with certain optimality properties. Ph.D. Thesis (also
Mimeograph Series No. 222), Department of Statistics, Purdue University, West Lafayette, IN.
Panchapakesan, S. and Santner, T. J. (1977). Subset selection procedures for $\Delta_p$-superior populations. Comm. Statist. A--Theory Methods 6, 1081-1090.
Patel, J. K. (1976). Ranking and selection of IFR populations based on means. J. Amer. Statist. Assoc.
71, 143-146.
Paulson, E. (1952). On the comparison of several experimental categories with a control. Ann. Math.
Statist. 23, 239-246.
Raghavachari, M. and Starr, N. (1970). Selection problems for some terminal distributions. Metron
28, 185-197.
Rizvi, M. H. and Sobel, M. (1967). Nonparametric procedures for selecting a subset containing the
population with the largest $\alpha$-quantile. Ann. Math. Statist. 38, 1788-1803.
Robbins, H. (1952). Some aspects of the sequential design of experiments. Bull. Amer. Math. Soc.
58, 527-535.
Robbins, H. (1956). A sequential design problem with a finite memory. Proc. Nat. Acad. Sci. U.S.A.
42, 920-923.
Santner, T. J. (1975). A restricted subset selection approach to ranking and selection problems. Ann.
Statist. 3, 334-349.
Saunders, I. W. and Moran, P. A. P. (1978). On the quantiles of the gamma and F distributions.
J. Appl. Prob. 15, 426-432.
Sobel, M. (1967). Nonparametric procedures for selecting the t populations with the largest
$\alpha$-quantiles. Ann. Math. Statist. 38, 1804-1816.
Sobel, M. and Huyett, M. J. (1957). Selecting the best one of several binomial populations. Bell
System Tech. J. 36, 537-576.
Zwet, W. R. van (1964). Convex Transformations of Random Variables. Mathematical Center,
P. R. Krishnaiah and C. R. Rao, eds., Handbook of Statistics. Vol. 7
© Elsevier Science Publishers B.V. (1988) 157-174
The Impact of Reliability Theory on Some
Branches of Mathematics and Statistics
Philip J. Boland and Frank Proschan*
0. Introduction
It is obvious that reliability theory has used a great variety of mathematical and
statistical tools to help achieve needed results. These include: total positivity,
majorization and Schur functions, renewal theory, Bayesian statistics, isotonic
regression, Markov and semi-Markov processes, stochastic comparisons and
bounds, convexity theory, rearrangement inequalities, optimization theory--the
list is almost endless.
The question now arises: Has reliability theory reciprocated--that is, has
reliability theory made any contributions to the development of any of the
mathematical and statistical disciplines listed above? The answer is a definite Yes.
In this article we shall show that in the course of solving reliability problems,
theoreticians have developed new results in some of the disciplines above, of
direct value to the discipline and having application in other branches of statistics
and mathematics•
1. Total positivity and Pólya frequency functions
A function K(x, y) of two real variables ranging over linearly ordered sets X
and Y respectively is said to be totally positive of order r (TP$_r$) if for all $1 \le m \le r$,
$x_1 < x_2 < \cdots < x_m$, $y_1 < y_2 < \cdots < y_m$ ($x_i \in X$, $y_j \in Y$), we have the inequalities
$$K\begin{pmatrix} x_1, \ldots, x_m \\ y_1, \ldots, y_m \end{pmatrix} =
\begin{vmatrix}
K(x_1, y_1) & K(x_1, y_2) & \cdots & K(x_1, y_m) \\
K(x_2, y_1) & K(x_2, y_2) & \cdots & K(x_2, y_m) \\
\vdots & \vdots & & \vdots \\
K(x_m, y_1) & K(x_m, y_2) & \cdots & K(x_m, y_m)
\end{vmatrix} \ge 0.$$
* Research supported by the Air Force Office of Scientific Research Grant AFOSR 82-K-0007.
Typically, X is an interval of the real line, or a countable set of discrete values
on the real line such as the set of all integers or the set of nonnegative integers;
similarly for Y. When X or Y is a set of integers, we may use the term 'sequence'
rather than 'function'.
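For r = 2 the definition can be checked directly on a finite grid: the kernel must be nonnegative and every 2 x 2 determinant formed from increasing pairs of x and y values must be nonnegative. The sketch below is such a grid check; the function name, the grid and the tolerance are illustrative, and passing the check on a grid does not by itself establish the property on a continuum.

```python
from itertools import combinations
from math import exp


def is_tp2(kernel, xs, ys, tol=1e-12):
    """Check the TP_2 inequalities for kernel(x, y) on the grid xs x ys."""
    xs, ys = sorted(xs), sorted(ys)
    # Order 1: the kernel itself must be nonnegative.
    if any(kernel(x, y) < -tol for x in xs for y in ys):
        return False
    # Order 2: all 2 x 2 minors with x1 < x2 and y1 < y2 must be nonnegative.
    for x1, x2 in combinations(xs, 2):
        for y1, y2 in combinations(ys, 2):
            minor = kernel(x1, y1) * kernel(x2, y2) - kernel(x1, y2) * kernel(x2, y1)
            if minor < -tol:
                return False
    return True


grid = [-2.0, -1.0, 0.0, 1.0, 2.0]
exp_kernel = is_tp2(lambda x, y: exp(x * y), grid, grid)
reversed_kernel = is_tp2(lambda x, y: exp(-x * y), grid, grid)
normal_kernel = is_tp2(lambda x, y: exp(-0.5 * (x - y) ** 2), grid, grid)
```

The kernel exp(xy) and the normal translation kernel satisfy the inequalities, whereas exp(-xy) has negative 2 x 2 minors and fails the check.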
If a TP$_r$ function K(x, y) is a probability density in one of the variables, say
x, with respect to a $\sigma$-finite measure $\mu(x)$ for each fixed value of y, and is expressible
as a function K(x, y) = f(x - y) of the difference of x and y, then f is said to be
a Pólya frequency function (or density) of order r (PF$_r$). The argument of f traverses
the real line. If the argument is confined to the integers we shall speak of a Pólya
frequency sequence of order r (PF$_r$ sequence). Note that if f is a density function
on $\mathbb{R}$, then K(x, y) = f(x - y) is TP$_2$ if and only if the family of density functions
$\{f(x - y) : y \in Y\}$ has the monotone likelihood ratio property.
Many totally positive kernels (functions) may be generated by the judicious use
of the following convolution result:
THEOREM 1.1. If K is TP$_r$, L is TP$_s$ and $\mu$ is a $\sigma$-finite measure, then the kernel
$$M(x, y) = \int K(x, z) L(z, y)\, d\mu(z)$$
is TP$_{\min(r, s)}$.
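PROOF. The result follows from the 'Basic Composition Formula' (see Karlin
(1968) for a proof):
$$M\begin{pmatrix} x_1, \ldots, x_m \\ y_1, \ldots, y_m \end{pmatrix} =
\int \cdots \int_{z_1 < z_2 < \cdots < z_m}
K\begin{pmatrix} x_1, \ldots, x_m \\ z_1, \ldots, z_m \end{pmatrix}
L\begin{pmatrix} z_1, \ldots, z_m \\ y_1, \ldots, y_m \end{pmatrix}
d\mu(z_1) \cdots d\mu(z_m),$$
whenever $M(x, y) = \int K(x, z) L(z, y)\, d\mu(z)$ converges absolutely with respect to
the $\sigma$-finite measure $\mu$.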
An important feature of totally positive functions is their variation diminishing
property. Suppose that K(x, y) is $TP_r$ and that h(y) changes sign j times where
j ≤ r − 1. Let $g(x) = \int K(x, y) h(y)\, d\mu(y)$, an absolutely convergent integral with
μ a σ-finite measure. Then g(x) changes sign at most j times. Moreover, if g(x)
actually changes sign j times, then g(x) must have the same arrangement of signs
as h(y) does, as x and y traverse their respective domains from left to right. The
variation diminishing property is actually equivalent to the (inequalities) definition
we have given of $TP_r$ (see Karlin and Proschan, 1960; Karlin, 1968, Chapter 5).
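The variation diminishing property is easy to observe numerically. The sketch below is our own illustration under stated assumptions: the measure is counting measure on six points, the kernel is the Gaussian kernel exp(−(x − y)²) (which is totally positive of every order), and the function names are hypothetical.

```python
from math import exp

def sign_changes(values, tol=1e-12):
    """Count sign changes in a sequence, ignoring entries within tol of zero."""
    signs = [1 if v > 0 else -1 for v in values if abs(v) > tol]
    return sum(1 for a, b in zip(signs, signs[1:]) if a != b)

def transform(kernel, h, ys, xs):
    """g(x) = sum over y of kernel(x, y) h(y): the integral against counting measure on ys."""
    return [sum(kernel(x, y) * hy for y, hy in zip(ys, h)) for x in xs]

ys = [0, 1, 2, 3, 4, 5]
h = [1, 1, -1, -1, 1, 1]                      # two sign changes, arranged + - +
xs = [-2 + 0.5 * i for i in range(19)]        # -2.0, -1.5, ..., 7.0
gaussian = lambda x, y: exp(-(x - y) ** 2)    # a kernel that is TP of every order
g = transform(gaussian, h, ys, xs)
print(sign_changes(h), sign_changes(g))
```

Here g changes sign no more often than h, and since it attains the full two changes it also starts and ends positive, matching the sign arrangement of h.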
The impact of reliability theory on mathematics and statistics
The theory of totally positive kernels and Pólya frequency functions has been
extensively applied in several domains of mathematics, statistics, economics, and
mechanics. To give but a few examples from the theory of reliability: the
techniques of total positivity have been useful in developing properties of life
distributions with monotone failure rates and in the study of some notions of
component dependence; Pólya frequency functions of order 2 have been helpful
in determining optimal inspection policies; and the variation diminishing property
has been used in establishing characteristics of certain shock models. Reliability
theory has in turn, however, been the motivating force behind some important
developments in the theory of total positivity itself. A good example is the following result (see Karlin and Proschan, 1960):
THEOREM 1.2. Let $f_1, f_2, \ldots$ be any sequence of densities of nonnegative random
variables, where each $f_i$ is $PF_r$. Then the n-fold convolution $g(n, x) = f_1 * f_2 * \cdots * f_n(x)$
is $TP_r$ in the variables n and x, where n ranges over 1, 2, ... and x traverses
the positive real line.
A similar total positivity result for the first passage time probabilities of the
partial sum process can be proved in the more general case when the random
variables range over the whole real line.
THEOREM 1.3. Let $f_1, f_2, \ldots$ be any sequence of $PF_r$ densities of random variables
$X_1, X_2, \ldots$ respectively, which are not necessarily non-negative. Consider the first
passage probability for x positive:
$$h(n, x) = P\left[\sum_{i=1}^{n} X_i \ge x,\ \sum_{i=1}^{j} X_i < x,\ j = 1, \ldots, n - 1\right]$$
for n = 1, 2, .... Then h(n, x) is $TP_r$, where n ranges over 1, 2, ..., and x traverses
the positive real line.
Theorems 1.2 and 1.3 were initially inspired by certain models in reliability and
inventory theory, and these results in turn motivated Karlin (1964) to characterize
new classes of totally positive kernels and develop new applications in (for example) discrete Markov chains. Typical of the results of Karlin (1964) are the
following two propositions (see also Karlin, 1968, p. 43):
PROPOSITION 1.4. Let $\mathcal{P}$ be a temporally homogeneous $TP_r$ Markov chain (the
transition probability matrix P is $TP_r$) whose state space is the set of nonnegative
integers. Then the n-step transition function $P^n_{ij}$ is $TP_r$ in the variables $0 \le j < \infty$ and
PROPOSITION 1.5. Let $\mathcal{P}$ be a $TP_r$ Markov chain. Let $F^n_{i,j_0}$ denote the probability
that the first passage into the set of states $\le j_0$ occurs at the nth transition when the
initial state of the process is $i > j_0$. Then $F^n_{i,j_0}$ is $TP_r$ in the variables $n \ge 1$ and $i > j_0$.
We now briefly trace the development leading to Theorems 1.2 and 1.3,
beginning with a basic problem in reliability theory that Black and Proschan
(1959) consider. (See also Proschan (1960) and Barlow and Proschan (1965) for
related problems).
Suppose that a system is required to operate for the period $[0, t_0]$. When a
component fails, it is immediately replaced by a spare component of the same
type if one is available. The system fails if no spare is available. Only the originally
supplied spares may be used for replacement during the period $[0, t_0]$. Assume
that the system uses k different types of components. At time 0, for each
i = 1, ..., k there are $d_i$ 'positions' in the system which are filled with components
of type i. By 'position (i, j)' we mean the jth location in the system where a
component of type i is used. Components of the same type in different positions
may be subject to varying stresses, and so we assume that the life of a component
in position (i, j) has density function $f_{ij}$. Each replacement has the same life
distribution as its predecessor, and component lives are assumed to be mutually
independent. Let $P_i(n_i)$ be the reliability during $[0, t_0]$ of the ith part of the system
(that is the subsystem consisting of the $d_i$ components of type i), assuming that
$n_i$ spares of type i are available for replacement. The problem is to determine the
'spares kit' $n = (n_1, \ldots, n_k)$ which will maximize the reliability of the system
$P(n) = \prod_{i=1}^{k} P_i(n_i)$ during $[0, t_0]$ subject to a cost constraint of the form
$\sum_{i=1}^{k} c_i n_i \le C$ (where $c_i > 0$ for all i = 1, ..., k).
A vector $n^0 = (n_1^0, n_2^0, \ldots, n_k^0)$ is an undominated spares allocation if whenever
$P(n) > P(n^0)$, then $\sum_{i=1}^{k} c_i n_i > \sum_{i=1}^{k} c_i n_i^0$. Black and Proschan (1959) consider
methods for quickly generating families of undominated spares allocations, which
can then be used to solve (approximately) the above problem. One of their
procedures is to start with the cheapest cost allocation (0, 0, ..., 0), and successively generate more expensive allocations as follows: If the present allocation is
n, determine the index $i_0$ for which
$$\frac{\log P_i(n_i + 1) - \log P_i(n_i)}{c_i} \qquad (i = 1, \ldots, k)$$
is a maximum (in the case of ties the lowest such index is taken). The next
allocation is then $n' = (n_1, \ldots, n_{i_0 - 1}, n_{i_0} + 1, n_{i_0 + 1}, \ldots, n_k)$. Black and Proschan
observe that the procedures they describe generate undominated allocations if each
$P_i(n)$ is log concave in n. They are able to verify this directly in the case where
the component lives in the ith part of the system are exponentially distributed with
parameter $\lambda_i$.
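The allocation procedure just described can be sketched in a few lines for the exponential case. This is an illustration under our own assumptions rather than Black and Proschan's code: with exponential lives of rate $\lambda_i$ in each of the $d_i$ positions, the number of failures of type i in $[0, t_0]$ is Poisson with mean $d_i \lambda_i t_0$, so $P_i(n)$ is a Poisson distribution function. The function names, the numerical means and costs, and the rule of stopping at a budget are all hypothetical.

```python
from math import exp, log

def poisson_cdf(n, mean):
    """P(Poisson(mean) <= n)."""
    term = total = exp(-mean)
    for j in range(1, n + 1):
        term *= mean / j
        total += term
    return total

def greedy_spares(means, costs, budget):
    """Generate allocations by repeatedly adding the spare with the largest
    increase in log reliability per unit cost (lowest index wins ties)."""
    k = len(means)
    alloc, spent = [0] * k, 0.0
    history = [tuple(alloc)]
    while True:
        gains = [
            (log(poisson_cdf(alloc[i] + 1, means[i]))
             - log(poisson_cdf(alloc[i], means[i]))) / costs[i]
            for i in range(k)
        ]
        best = max(range(k), key=lambda i: gains[i])   # max() returns the first maximiser
        if spent + costs[best] > budget:               # assumed stopping rule
            break
        alloc[best] += 1
        spent += costs[best]
        history.append(tuple(alloc))
    return history

def system_reliability(alloc, means):
    """P(n) as the product of the part reliabilities P_i(n_i)."""
    rel = 1.0
    for n, m in zip(alloc, means):
        rel *= poisson_cdf(n, m)
    return rel

means = [2.0, 0.5, 1.0]     # hypothetical values of d_i * lambda_i * t_0
costs = [1.0, 2.0, 1.5]     # hypothetical unit costs c_i
for alloc in greedy_spares(means, costs, budget=10.0):
    print(alloc, round(system_reliability(alloc, means), 4))
```

Each printed allocation adds exactly one spare to its predecessor, so cost and system reliability both increase along the generated family.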
Note that $\log P_i(n)$ is concave in n if and only if $P_i(n + 1)/P_i(n)$ is a decreasing
function of n, or equivalently that $P_i(n)$ is a $PF_2$ sequence. Let $N_{ij}$ for j = 1, ..., $d_i$
be the random variable indicating the number of replacements of type i needed
at position (i, j) in the interval $[0, t_0]$. Proschan (1960) is able to show that if $f_{ij}(t)$
satisfies the monotone likelihood ratio property for translations (equivalently that
$f_{ij}(t)$ is a $PF_2$ function), then $f_{ij}^{(n)}(t)$ is a $TP_2$ function in the variables n and t
(where $f_{ij}^{(n)}$ is the n-fold convolution of $f_{ij}$ with itself). Judiciously using
Theorem 1.1 on convolutions of totally positive functions, one is then able to
show that $\mathrm{Prob}(N_{ij} = n)$ is a $PF_2$ sequence and finally that
$P_i(n) = \mathrm{Prob}(N_{i1} + \cdots + N_{id_i} \le n)$ is a $PF_2$ sequence. The key tool is of course to show that
when f(t) is a $PF_2$ function, then $f^{(n)}(t)$ is $TP_2$ in n and t. Theorems 1.2 and 1.3
are natural generalizations of this result.
One may further generalise the 'spares kit' procedure above to show that when
each life distribution function $F_{ij}$ (for position (i, j)) is IFR (equivalently that
$\bar F_{ij} = 1 - F_{ij}$ is $PF_2$), then the procedure generates undominated allocations (Barlow
and Proschan, 1965).
2. Association of random variables
The notion of associated random variables is one of the most valuable contributions to statistics that has been generated as a result of reliability theory considerations.
We consider two random variables S and T to be in some sense associated if they are
positively correlated, that is cov(S, T) ≥ 0. A stronger requirement is cov(f(S),
g(T)) ≥ 0 for all nondecreasing f and g. Finally if cov(f(S, T), g(S, T)) ≥ 0 for
all f and g nondecreasing in each argument, we have a still stronger version of
association. Esary, Proschan and Walkup (1967) generalize this strongest version
of association to the multivariate case in defining random variables $T_1, \ldots, T_n$ to
be associated if cov(f(T), g(T)) ≥ 0, where $T = (T_1, \ldots, T_n)$, for all nondecreasing functions f and g for
which the covariance in question exists. Equivalent definitions of associated
random variables result if the functions f and g are taken to be increasing and
either (i) binary or (ii) bounded and continuous.
Association of random variables satisfies the following desirable multivariate
properties:
(P1) Any subset of a set of associated random variables is a set of associated
random variables.
(P2) If two sets of associated random variables are independent of one another,
then the union of the two sets is a set of associated random variables.
(P3) Any set consisting of a single random variable is a set of associated random
variables.
(P4) Increasing functions of associated random variables are associated.
(P5) A limit in distribution of a sequence of sets of associated random variables
is a set of associated random variables.
Note that properties P3 and P2 imply that any set of independent random variables is associated. This fact, together with property P4, enables one to generate
many practical examples of associated random variables. In the special case when
dealing with binary random variables, one can readily show that the binary
random variables $X_1, \ldots, X_n$ are associated if and only if
$1 - X_1, 1 - X_2, \ldots, 1 - X_n$ are associated.
Many interesting applications may be obtained as a consequence of the following result about associated random variables:
THEOREM 2.1. Let $T_1, \ldots, T_n$ be associated, and let $S_i = f_i(T)$ be a nondecreasing
function for each i = 1, ..., k. Then
$$P[S_1 \le s_1, \ldots, S_k \le s_k] \ge \prod_{i=1}^{k} P[S_i \le s_i]$$
and
$$P[S_1 > s_1, \ldots, S_k > s_k] \ge \prod_{i=1}^{k} P[S_i > s_i]$$
for all $s = (s_1, \ldots, s_k) \in R^k$.
The following two corollaries are immediate consequences of this theorem.
COROLLARY 2.2. (Robbins, 1954). Let $T_1, \ldots, T_n$ be independent random variables,
and let $S_i = \sum_{j=1}^{i} T_j$ be the ith partial sum for i = 1, ..., n. Then
$$P[S_1 \le s_1, \ldots, S_n \le s_n] \ge \prod_{i=1}^{n} P(S_i \le s_i)$$
for all $s = (s_1, \ldots, s_n) \in R^n$.
COROLLARY 2.3. Let $T_{[1]}, \ldots, T_{[n]}$ be the order statistics in a random sample
$T_1, \ldots, T_n$. Then
$$P[T_{[i_1]} \le t_{i_1}, \ldots, T_{[i_k]} \le t_{i_k}] \ge \prod_{j=1}^{k} P[T_{[i_j]} \le t_{i_j}]$$
and
$$P[T_{[i_1]} > t_{i_1}, \ldots, T_{[i_k]} > t_{i_k}] \ge \prod_{j=1}^{k} P[T_{[i_j]} > t_{i_j}]$$
for every choice of $1 \le i_1 < \cdots < i_k \le n$ and $t_{i_1} < \cdots < t_{i_k}$.
Marshall and Olkin (1967) consider the multivariate exponential distribution
with joint survival function
$$
\bar F(s_1, \ldots, s_m) = P[S_1 > s_1, \ldots, S_m > s_m]
= \exp\Big[-\sum_{i=1}^{m} \lambda_i s_i - \sum_{i<j} \lambda_{ij} \max(s_i, s_j)
- \sum_{i<j<k} \lambda_{ijk} \max(s_i, s_j, s_k) - \cdots - \lambda_{12 \cdots m} \max(s_1, s_2, \ldots, s_m)\Big].
$$
They point out that if this is the distribution of the random variables
$S_1, \ldots, S_m$, then there exist independent exponential random variables
$T_1, \ldots, T_n$ such that $S_j = \min\{T_i : i \in A_j\}$ where $A_j \subset \{1, 2, \ldots, n\}$. The random
variables $S_1, \ldots, S_m$ are associated and therefore using Theorem 2.1, we can
show that
$$F(s_1, \ldots, s_m) \ge \prod_{i=1}^{m} F_i(s_i) \qquad \text{and} \qquad \bar F(s_1, \ldots, s_m) \ge \prod_{i=1}^{m} [1 - F_i(s_i)],$$
where F is the joint distribution function and $F_i$ is the marginal distribution of $S_i$. The multivariate exponential distribution is also useful in studying shock models.
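For the bivariate case of the Marshall and Olkin distribution both product bounds can be checked in closed form. The sketch below is ours: it takes $S_1 = \min(T_1, T_{12})$ and $S_2 = \min(T_2, T_{12})$ with independent exponential $T_1, T_2, T_{12}$, and the rates and grid are hypothetical values chosen only for illustration.

```python
from math import exp

def joint_survival(s1, s2, l1, l2, l12):
    """P[S1 > s1, S2 > s2] for S1 = min(T1, T12), S2 = min(T2, T12)."""
    return exp(-l1 * s1 - l2 * s2 - l12 * max(s1, s2))

def marginal_survival(s, l_own, l12):
    """P[S_i > s]: S_i is exponential with rate l_own + l12."""
    return exp(-(l_own + l12) * s)

def joint_cdf(s1, s2, l1, l2, l12):
    """P[S1 <= s1, S2 <= s2] by inclusion-exclusion."""
    return (1.0 - marginal_survival(s1, l1, l12) - marginal_survival(s2, l2, l12)
            + joint_survival(s1, s2, l1, l2, l12))

l1, l2, l12 = 1.0, 0.5, 0.8   # hypothetical rates
grid = [0.1 * i for i in range(1, 31)]
ok = True
for s1 in grid:
    for s2 in grid:
        surv1 = marginal_survival(s1, l1, l12)
        surv2 = marginal_survival(s2, l2, l12)
        # survival bound: joint survival dominates the product of marginal survivals
        ok = ok and joint_survival(s1, s2, l1, l2, l12) >= surv1 * surv2
        # distribution function bound, with a small tolerance for rounding
        ok = ok and joint_cdf(s1, s2, l1, l2, l12) >= (1 - surv1) * (1 - surv2) - 1e-12
print(ok)
```

The survival bound reduces here to max(s1, s2) ≤ s1 + s2, which shows how the shared component $T_{12}$ produces the positive dependence.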
Another application of Theorem 2.1 can be made in the case of analysis of
variance in which two hypotheses are tested using the same error variance for each
test. Consider the case in which the effects of both rows and columns are to be
tested. The standard procedure is to calculate three quadratic forms $q_1$, $q_2$, $q_3$
which are independently distributed as $\chi^2$ with $n_1$, $n_2$ and $n_3$ degrees of freedom
respectively, where $q_1$ represents the sum of squares between rows, $q_2$ the sum of
squares between columns, and $q_3$ the error sum of squares. The likelihood ratio
test statistics for testing the two hypotheses are
$$F_1 = \frac{q_1/n_1}{q_3/n_3}, \qquad F_2 = \frac{q_2/n_2}{q_3/n_3}.$$
The probability of making no errors of the first kind is $P[F_1 \le F_{1\alpha}, F_2 \le F_{2\alpha}]$, where
$F_{1\alpha}$ ($F_{2\alpha}$) is the 100α per cent point of the distribution of $F_1$ ($F_2$). Kimball (1951)
proves that
$$P[F_1 \le F_{1\alpha}, F_2 \le F_{2\alpha}] > P[F_1 \le F_{1\alpha}]\, P[F_2 \le F_{2\alpha}],$$
or in other words that the chance of no errors of the first kind is greater following
the standard experimental procedure than if separate experiments had been performed. This result is an immediate consequence of Theorem 2.1 once it is
observed that $F_1$ and $F_2$ are nondecreasing functions of the associated random
variables $q_1$, $q_2$, $q_3^{-1}$.
The concept of associated random variables has proved to be a useful tool in
various areas of operations research. Shogan (1977) uses properties of associated
random variables to construct bounds for the stochastic activity duration of a
PERT network. Heidelberger and Iglehart (1979) use association to construct
a set of sufficient conditions which guarantee that the dependent simulations of
a stochastic system produce a variance reduction over independent simulations.
Niu (1981) makes use of association in studying queues with dependent interarrival and service times.
The notion of association of random variables is just one of many notions of
multivariate dependence. Lehmann (1966) introduces several concepts of bivariate
dependence, the strongest of which is $TP_2$ dependence ((S, T) are $TP_2$ dependent
if the joint probability density (or in the discrete case joint frequency function)
f(s, t) is totally positive of order 2). For a discussion concerning the relationship
among several notions of multivariate dependence see Barlow and Proschan
(1981). Newman and Wright (1981) obtain limit theory results for sequences of
associated random variables.
In applications it is often easier to verify that one of the alternative notions
which imply association holds, instead of verifying association directly. For
example if $T = (T_1, \ldots, T_n)$ has density $f(t_1, \ldots, t_n)$ which is $TP_2$ in every pair
of variables when the remaining variables are kept fixed and which is everywhere
positive on a rectangular support, then $T_1, \ldots, T_n$ are associated (see Kemperman, 1977).
Pitt (1982) proves the following important characterization of association for the
multivariate normal case (for a simpler proof see also Joag-dev, Perlman
and Pitt (1983)):
THEOREM 2.4. Let $T = (T_1, \ldots, T_n)$ be multivariate normal. Then $T_1, \ldots, T_n$ are
associated if and only if $\mathrm{cov}(T_i, T_j) \ge 0$ for all i, j = 1, ..., n.
A related result of particular importance in statistical mechanics is the FKG
inequality. Let $T = (T_1, \ldots, T_n)$ be a random vector with density $f(t_1, \ldots, t_n)$.
For $s = (s_1, \ldots, s_n)$ and $t = (t_1, \ldots, t_n)$, let
$$s \vee t = (\max(s_1, t_1), \max(s_2, t_2), \ldots, \max(s_n, t_n)),$$
$$s \wedge t = (\min(s_1, t_1), \min(s_2, t_2), \ldots, \min(s_n, t_n)).$$
f is said to satisfy the FKG condition (or be multivariate totally positive of order 2
($MTP_2$)) if
$$f(s \vee t)\, f(s \wedge t) \ge f(s)\, f(t) \qquad \text{for all } s, t \in R^n.$$
The FKG inequality, obtained by Fortuin, Kasteleyn and Ginibre (1971), says
that if the density f of T satisfies the FKG condition, then $T_1, \ldots, T_n$ are
associated. For an excellent discussion of the application of the FKG inequality
in statistics see Kemperman (1977).
The notion of association of random variables which Esary, Proschan and
Walkup (1967) develop has its origins in a problem of Esary and Proschan (1963)
concerning coherent structures. Moore and Shannon (1956) investigate the reliability of relay circuits and show that arbitrarily reliable circuits can be constructed
from arbitrarily unreliable relays. They prove that if h(p) is the probability of
closure of a relay network plotted as a function of the common probability p of
the closure of a simple relay, then
$$p(1 - p)\, h'(p) \ge h(p)\,[1 - h(p)] \qquad \text{for } 0 < p < 1.$$
Therefore h(p) is S-shaped (crosses the diagonal at most once and always from
below), a property which is crucial in constructing relay circuits of arbitrarily high
reliability. Birnbaum, Esary and Saunders (1961) generalize this result of Moore
and Shannon to coherent structures of independent components with identical
reliability. Esary and Proschan (1963) in turn generalize to coherent structures
with independent components not necessarily of the same reliability. The main
tool in their paper is the following specialized version of an inequality of
Tchebichev (see Hardy, Littlewood and Pólya, 1952), which may be regarded as
a 'forerunner' to the definition of association of random variables:
THEOREM 2.5. Let $X_1, \ldots, X_n$ be independent binary random variables. Let $f_i(X)$,
i = 1, 2, be increasing functions. Then $\mathrm{cov}[f_1(X), f_2(X)] \ge 0$.
Esary and Proschan also use Theorem 2.5 to construct upper and lower bounds
for the reliability of a coherent structure in terms of the minimal paths and
minimal cut sets of the structure.
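Theorem 2.5 can be verified exactly for small n by enumerating all binary states. The following sketch is our own illustration; the two structure functions and the component reliabilities are hypothetical choices, not taken from Esary and Proschan.

```python
from itertools import product

def expectation(f, probs):
    """E f(X) for independent binary X_i with P[X_i = 1] = probs[i], by enumeration."""
    total = 0.0
    for x in product((0, 1), repeat=len(probs)):
        weight = 1.0
        for xi, p in zip(x, probs):
            weight *= p if xi == 1 else 1.0 - p
        total += weight * f(x)
    return total

def covariance(f, g, probs):
    """cov[f(X), g(X)] = E[f g] - E[f] E[g]."""
    return (expectation(lambda x: f(x) * g(x), probs)
            - expectation(f, probs) * expectation(g, probs))

# Two increasing structure functions on three components (illustrative choices).
series_12 = lambda x: x[0] * x[1]                 # works if components 1 and 2 both work
two_of_three = lambda x: int(sum(x) >= 2)         # works if at least two components work

probs = [0.9, 0.7, 0.4]
print(covariance(series_12, two_of_three, probs))
```

In this example the series structure on components 1 and 2 can only work when the 2 out of 3 structure works, so the covariance is strictly positive.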
3. Renewal theory
Renewal theory has its origins in the study of self-renewing aggregates and
especially in actuarial science. Today we view the subject more generally as the
study of functions of independent identically distributed nonnegative random
variables which represent the successive intervals between renewals of a process.
The theory is applied to a wide variety of fields such as risk analysis, counting
processes, fatigue analysis, inventory theory, queuing theory, traffic flow, and
reliability theory. We will summarize a few of the more important and basic ideas
in renewal theory (for a more complete treatment consult Smith (1958), Cox
(1962), Feller (1966), Ross (1970), or Karlin and Taylor (1975)) and then indicate
some of the contributions to this area arising from reliability theory.
By a renewal process we will mean a sequence of independent identically
distributed nonnegative random variables $X_1, X_2, \ldots$, which are not all zero with
probability one. We let F be the distribution function of $X_1$, and $F^{(k)}$ will denote
the k-fold convolution of F with itself. The kth partial sum $S_k = X_1 + \cdots + X_k$ is
the kth renewal point and has distribution function $F^{(k)}$. For convenience we
define $F^{(0)}$ by $F^{(0)}(t) = 1$ for $t \ge 0$ and zero otherwise. Renewal theory is primarily
concerned with the number N(t) of renewals in the interval [0, t]. N(t), the
renewal random variable, is the maximum value of k for which $S_k \le t$, with the
understanding that N(t) = 0 if $X_1 > t$. It is clear that
$P(N(t) = n) = F^{(n)}(t) - F^{(n+1)}(t)$ and $P(N(t) \ge n) = F^{(n)}(t)$. The process $\{N(t): t \ge 0\}$ is known
as a renewal counting process.
The renewal function M(t) is defined to be the expected number of renewals
in [0, t], that is M(t) = E(N(t)). Since
$M(t) = E(N(t)) = \sum_{k=1}^{\infty} k\, P[N(t) = k] = \sum_{k=1}^{\infty} P[N(t) \ge k]$,
it follows that $M(t) = \sum_{k=1}^{\infty} F^{(k)}(t)$ and moreover that
$M(t) = \int_0^t [1 + M(t - x)]\, dF(x)$ (this latter identity being known as the fundamental
renewal equation). In spite of the fact that a closed functional form for M(t) is
known for only a few special distributions F, the renewal function M(t) plays a
central role in renewal theory.
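Since closed forms are rare, the fundamental renewal equation is often solved numerically. The sketch below is one simple scheme of our own choosing (a trapezoidal discretisation of the Stieltjes integral, assuming F(0) = 0), not a method from the source. It is checked against the Erlang-2 distribution, one of the few cases with a known closed form, $M(t) = \lambda t/2 - 1/4 + e^{-2\lambda t}/4$.

```python
from math import exp

def renewal_function(cdf, t_max, steps):
    """Approximate M on a grid by discretising M(t) = F(t) + int_0^t M(t - x) dF(x)
    with the trapezoidal rule; returns (grid step, list of M values)."""
    h = t_max / steps
    F = [cdf(i * h) for i in range(steps + 1)]
    dF = [F[j] - F[j - 1] for j in range(1, steps + 1)]   # dF[j-1] = F(t_j) - F(t_{j-1})
    M = [0.0] * (steps + 1)
    for i in range(1, steps + 1):
        # the j = 1 term contains the unknown M[i]; it is moved to the left-hand side
        acc = F[i] + 0.5 * M[i - 1] * dF[0]
        for j in range(2, i + 1):
            acc += 0.5 * (M[i - j] + M[i - j + 1]) * dF[j - 1]
        M[i] = acc / (1.0 - 0.5 * dF[0])
    return h, M

rate = 1.0
erlang2_cdf = lambda t: 1.0 - exp(-rate * t) * (1.0 + rate * t)
h, M = renewal_function(erlang2_cdf, t_max=5.0, steps=500)
exact = rate * 5.0 / 2 - 0.25 + exp(-2 * rate * 5.0) / 4
print(M[-1], exact)
```

The same routine accepts any distribution function with F(0) = 0; the step count trades accuracy against the quadratic cost of the double loop.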
If F is the distribution function of $X_1$, F is nonlattice if there exists no h > 0
such that the range of $X_1 \subset \{h, 2h, 3h, \ldots\}$. The following basic results were
proved in the early stages of renewal theory development.
THEOREM 3.1. If F has mean $\mu_1$, then $N(t)/t \to 1/\mu_1$ almost surely as $t \to \infty$.
THEOREM 3.2. Let F have mean $\mu_1$. Then
(i) $M(t) \ge t/\mu_1 - 1$ for all $t \ge 0$;
(ii) (Blackwell) if F is non-lattice,
$$\lim_{t \to \infty} [M(t + h) - M(t)] = h/\mu_1$$
for any h > 0;
(iii) if F is non-lattice with 2nd moment $\mu_2 < +\infty$,
$$M(t) = t/\mu_1 + \mu_2/(2\mu_1^2) - 1 + o(1) \quad \text{as } t \to \infty.$$
Note that important as these results may be, they are, with the exception of
Theorem 3.2 (i), asymptotic in nature.
In their comparison of replacement policies for stochastically failing units,
Barlow and Proschan (1964) obtain several new renewal theory inequalities. An
age replacement policy is one whereby a unit is replaced upon failure or at age T,
a specified constant, whichever comes first. Under a block replacement policy a
unit is replaced upon failure and at times T, 2T, 3T, .... It is assumed that
failures occur independently and that the replacement time is negligible. There are
advantages for both types of policy, and hence it is of interest to compare the two
types stochastically with respect to numbers of failures, planned replacements and
removals (a removal is a failure or a planned replacement). In many situations it
will be assumed that the life distribution of a unit belongs to a monotone class
such as the IFR (DFR) class (F is IFR (DFR) if it has increasing (decreasing) failure
rate). It is clear that the evaluation of replacement policies depends heavily on the
theory of renewal processes.
Suppose we let N(t) indicate the number of renewals in [0, t] due to replacements at failure, $N_B^*(t)$ be the number of failures in [0, t] under a block policy,
and $N_A^*(t)$ the number of failures in [0, t] under an age policy. Barlow and
Proschan (1964) prove the following result stochastically comparing these random
variables:
THEOREM 3.3. If F is IFR (DFR), then
$$P(N(t) \ge n) \ge (\le)\, P(N_A^*(t) \ge n) \ge (\le)\, P(N_B^*(t) \ge n)$$
for $t \ge 0$ and n = 0, 1, 2, ....
The following bounds on the renewal function M(t) = E(N(t)) are an immediate
consequence:
COROLLARY 3.4. If F is IFR (DFR), then
(i) $M(t) \ge (\le)\, E(N_A^*(t)) \ge (\le)\, E(N_B^*(t))$.
(ii) $M(t) \ge (\le)\, k M(t/k)$, k = 1, 2, ....
(iii) $M(t) \le (\ge)\, t/\mu_1$.
(iv) $M(h) \le (\ge)\, M(t + h) - M(t)$ for all $h, t \ge 0$.
By considering the number of failures and the number of removals per unit of
time as the duration of the replacement operation becomes indefinitely large,
Barlow and Proschan (1964) obtain the following simple useful bounds on the
renewal function for any F, and an improvement on these bounds for the IFR
(DFR) case (these bounds were conjectured by Bazovsky (1962)):
THEOREM 3.5. (i) $M(t) \ge t/\int_0^t \bar F(x)\, dx - 1 \ge t/\mu_1 - 1$ for all $t \ge 0$.
(ii) If F is IFR (DFR), then $M(t) \le (\ge)\, t F(t)/\int_0^t \bar F(x)\, dx \le (\ge)\, t/\mu_1$ for all $t \ge 0$.
As a consequence of this result, it follows that when F is IFR the expected
numbers of failures per unit of time under block and age replacement policies do
not differ by more than 1/T in the limit as $t \to \infty$.
Feller (1948) shows that $\lim_{t \to \infty} \mathrm{Var}(N(t))/M(t) = \sigma^2/\mu_1^2$, where $\sigma^2$ is the
variance of F; this ratio is at most 1 when F is IFR. Barlow and
Proschan (1964) partially generalize this result in proving the following:
THEOREM 3.6. If F is IFR (DFR), then $\mathrm{Var}(N(t)) \le (\ge)\, M(t)$, and this inequality
is sharp.
The renewal theory implications of the work of Barlow and Proschan (1964)
provide the key tool in the probabilistic interpretation of Miner's rule given by
Birnbaum and Saunders (1968) and Saunders (1970). Miner's rule (Miner, 1945)
is a deterministic formula extensively used in engineering practice for the cumulative damage due to fatigue. Prior to the work of Birnbaum and Saunders, Miner's
rule was supported by empirical evidence but had very little theoretical justification. Birnbaum and Saunders investigate models for stochastic crack growth
with incremental extensions having an increasing failure rate distribution. The
result that for an IFR distribution function F the inequality $t/\mu_1 - 1 \le M(t) \le t/\mu_1$
holds is used to prove that $\tau/\mu_1 - 1 \le \gamma \le \tau/\mu_1$, where $\mu_1$ is the expected crack
increment per cycle, τ is the expected crack length at which failure occurs and γ
is the expected number of loading cycles to failure. This in turn is used to show
that under certain conditions of dependence on load, Miner's rule does yield the
mathematical expectation of fatigue life. Saunders (1970) extends some of these
results by weakening the model assumptions, in particular by relaxing the
IFR assumption for the crack growth to the assumption that F is new
better than used in expectation (NBUE), that is $\mu_1 \ge \int_0^\infty \bar F(t + x)/\bar F(t)\, dx$ for all
$t \ge 0$ such that $\bar F(t) > 0$.
Marshall and Proschan (1972) determine the largest classes of life distributions
for which age and block replacement policies diminish, either stochastically or in
expected value, the number of failures in service. In doing so, they give the first
systematic treatment of the NBU, NWU, NBUE and NWUE classes of life
distributions, which are now widely used in statistics. A life distribution function
F is new better than used (NBU) if $\bar F(x + y) \le \bar F(x) \bar F(y)$ for all $x, y \ge 0$, where $\bar F = 1 - F$. The new
worse than used (NWU) class of distributions is similarly defined by reversing the
order in this inequality. In their investigation they obtain important renewal
quantity inequalities, many of which generalize results from Barlow and Proschan
(1964). For example they show that if F is NBU (NWU) then $\mathrm{Var}\, N(t) \le (\ge)\, M(t)$
and $M(h) \le (\ge)\, M(t + h) - M(t)$ for all $h, t \ge 0$, while if F is NBUE (NWUE)
then $M(t) \le (\ge)\, t/\mu_1$. The following interesting characterization of the NBU class
in terms of the renewal random variable is obtained. Let * denote convolution.
THEOREM 3.7. $N(s) * N(t) \le (\ge)\, N(s + t)$ for all $s, t \ge 0$ ⇔ F is NBU (NWU).
Straub (1970) is interested in bounding the probability that the total amount of
insurance claims arising in a fixed period of time does not exceed the amount t
of premiums collected. Letting F(t) be the distribution function for the individual
claims amount, Straub desires bounds for $\bar F^{(n)}(t) = P(N(t) < n)$. Here we may
interpret N(t) as the maximum value of k such that the first k claims sum to a
total $\le t$. Motivated by the use of tools in reliability theory and in particular in
the work of Barlow and Marshall on bounds for classes of monotone distributions, Straub establishes the following important result (see Barlow and Proschan,
THEOREM 3.8. Let F be a continuous distribution function with hazard function
$R(t) = -\log \bar F(t)$.
(a) If F is NBU (NWU), then
$$P(N(t) < n) \ge (\le) \sum_{j=0}^{n-1} \frac{[R(t)]^j}{j!}\, e^{-R(t)}$$
for $t \ge 0$, n = 1, 2, ....
(b) If F is IFR (DFR), then
$$P(N(t) < n) \le (\ge) \sum_{j=0}^{n-1} \frac{[n R(t/n)]^j}{j!}\, e^{-n R(t/n)}$$
for $t \ge 0$, n = 1, 2, ....
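Both bounds of this type are Poisson partial sums, so they are cheap to evaluate. As a hedged illustration (our own example, with hypothetical function names), the sketch below takes the Erlang-2 distribution, which is IFR and therefore also NBU, with hazard function $R(t) = \lambda t - \log(1 + \lambda t)$. For it P(N(t) < n) is known exactly, because the nth renewal point is gamma distributed with shape 2n, and the exact value should lie between the Poisson sum with mean R(t) and the Poisson sum with mean nR(t/n).

```python
from math import exp, log

def poisson_cdf(n, mean):
    """P(Poisson(mean) <= n)."""
    term = total = exp(-mean)
    for j in range(1, n + 1):
        term *= mean / j
        total += term
    return total

def erlang2_hazard(t, rate):
    """R(t) = -log(1 - F(t)) for the Erlang-2 distribution, which is IFR and hence NBU."""
    return rate * t - log(1.0 + rate * t)

def prob_fewer_than_n_renewals(t, n, rate):
    """Exact P(N(t) < n) = P(S_n > t), where S_n is gamma with shape 2n."""
    return poisson_cdf(2 * n - 1, rate * t)

rate, t = 1.0, 2.0
for n in range(1, 6):
    lower = poisson_cdf(n - 1, erlang2_hazard(t, rate))            # Poisson sum with mean R(t)
    upper = poisson_cdf(n - 1, n * erlang2_hazard(t / n, rate))    # Poisson sum with mean n R(t/n)
    exact = prob_fewer_than_n_renewals(t, n, rate)
    print(n, round(lower, 4), round(exact, 4), round(upper, 4))
```

For n = 1 all three quantities coincide with the survival probability at t, which is a convenient sanity check on any implementation.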
The bounds for the renewal function established by Barlow and Proschan
(1964) motivate Marshall (1973) to investigate the existence of 'best' linear bounds
for M(t) ('best' is interpreted to mean the sharpest bounds which when iterated
in the fundamental renewal equation converge monotonically to M(t) for all t).
Esary, Marshall and Proschan (1973) establish properties of the survival
function of a device subject to shocks and wear. One of their principal tools is
the result that $[F^{(k)}(x)]^{1/k}$ is decreasing in k = 1, 2, ..., for any distribution
function F such that F(x) = 0 for x < 0. This result, which is equivalent to the
following property of the renewal random variable N(t), can be used to demonstrate monotonicity properties of first passage time distributions for certain
Markov processes.
THEOREM 3.9. Let N(t) denote the number of renewals in [0, t] for a renewal
process. Then $[P(N(t) \ge k)]^{1/k}$ is decreasing in k = 1, 2, ....
Another class of monotone distributions used for modeling in reliability theory
is the increasing mean residual life (IMRL) class. Let $X_1$ have life distribution F.
Then F is IMRL if $E(X_1 - t \mid X_1 > t)$ is nondecreasing in $t \ge 0$. A DFR distribution
function F with finite mean $\mu_1$ is IMRL. Mixtures of DFR distributions are DFR,
and DFR distributions are used to model the lifetimes of units which improve
with age, such as blast furnaces and work-hardening materials. Keilson (1975)
shows that a large class of first passage time distributions for Markov processes are
DFR. Brown (1980, 1981) proves some very nice renewal quantity results
for the DFR and IMRL classes, among which is the following:
THEOREM 3.10. (a) If F is DFR, then the renewal function M(t) is concave. (b) If
F is IMRL, then $M(t) - (t/\mu_1 - 1)$ is increasing in $t \ge 0$. (Note however that M(t)
is not necessarily concave.)
4. Majorization and Schur functions
The theory of inequalities has played a fundamental role in developing new
results in reliability theory. In attempting to compare and establish bounds for
probability distributions and systems, workers in reliability have been discovering
new inequalities. Many of these inequalities are of a general nature and can be
presented using the techniques of majorization and Schur functions.
Given a vector $x = (x_1, \ldots, x_n)$, let $x_{[1]} \le x_{[2]} \le \cdots \le x_{[n]}$ denote an increasing rearrangement of $x_1, \ldots, x_n$. The vector x is said to majorize the vector
y (we write $x \succ_m y$) if
$$\sum_{i=j}^{n} x_{[i]} \ge \sum_{i=j}^{n} y_{[i]} \quad \text{for } j = 2, \ldots, n \qquad \text{and} \qquad \sum_{i=1}^{n} x_{[i]} = \sum_{i=1}^{n} y_{[i]}.$$
Hardy, Littlewood, and Pólya (1952) show that $x \succ_m y$ if and only if there exists
a doubly stochastic matrix Π such that y = xΠ. Schur functions are real valued
functions which are monotone with respect to the partial ordering of majorization.
A function h with the property that $x \succ_m y \Rightarrow h(x) \ge (\le)\, h(y)$ is called Schur-convex (Schur-concave). A convenient characterization of Schur-convexity (-concavity) is provided by the Schur-Ostrowski condition, which states that a differentiable permutation invariant function h defined on $R^n$ is Schur-convex
(Schur-concave) if and only if
$$(x_i - x_j)\left(\frac{\partial h}{\partial x_i} - \frac{\partial h}{\partial x_j}\right) \ge (\le)\, 0 \qquad \text{for all } i, j \text{ and } x \in R^n.$$
For an excellent and extensive treatment of the theory of majorization, the reader
should consult Marshall and Olkin (1979).
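Both the majorization ordering and the Schur-Ostrowski condition are simple to test numerically. The sketch below is our own illustration: the function names are hypothetical, majorization is checked through the equivalent formulation in terms of sums of the largest entries, and the Schur-Ostrowski condition is checked only at a single point for a gradient supplied by the caller.

```python
def majorizes(x, y, tol=1e-12):
    """True if x majorizes y: equal totals, and for every j the sum of the
    j largest entries of x is at least the sum of the j largest entries of y."""
    if len(x) != len(y) or abs(sum(x) - sum(y)) > tol:
        return False
    xs, ys = sorted(x, reverse=True), sorted(y, reverse=True)
    px = py = 0.0
    for a, b in zip(xs, ys):
        px += a
        py += b
        if px < py - tol:
            return False
    return True

def schur_ostrowski_holds(grad, x, tol=1e-12):
    """Check (x_i - x_j)(dh/dx_i - dh/dx_j) >= 0 for all i, j at the point x."""
    g = grad(x)
    return all((x[i] - x[j]) * (g[i] - g[j]) >= -tol
               for i in range(len(x)) for j in range(len(x)))

print(majorizes([3, 0, 0], [1, 1, 1]), majorizes([1, 1, 1], [3, 0, 0]))
sum_of_squares_grad = lambda x: [2 * v for v in x]   # gradient of h(x) = sum of x_i^2
print(schur_ostrowski_holds(sum_of_squares_grad, [0.2, 1.5, -0.7]))
```

The sum of squares is the standard example of a Schur-convex function: it is larger at (3, 0, 0) than at the vector (1, 1, 1) that (3, 0, 0) majorizes.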
A k out of n system is a system with n components which functions if and only
if k or more of the components function. Systems of this type are frequently
encountered in practice. A one out of n system is a parallel system and an n out
of n system is a series system. We assume that the n components of the system
function independently. Let $h_k(p)$ denote the reliability of a k out of n system in
which the component reliabilities are given by $p = (p_1, \ldots, p_n)$. Computing the
reliability function $h_k(p)$ is often difficult, particularly when a large number of
unlike component probabilities are involved. Some interesting inequalities with
applications in other areas of statistics have resulted from efforts to obtain more
computable bounds for the system reliability $h_k(p)$.
For component reliability $p_i$ we define the corresponding component hazard $R_i$
by $R_i = -\log p_i$. Pledger and Proschan (1971) obtain the following comparisons
for $h_k(p)$:
THEOREM 4.1. Let $R = (R_1, \ldots, R_n)$ be a vector of component hazards which
majorizes $R' = (R_1', \ldots, R_n')$, a second vector of component hazards. Then for the
corresponding component reliability vectors p and p' (note that $\prod_1^n p_i = \prod_1^n p_i'$ since
$\sum_1^n R_i = \sum_1^n R_i'$) we have
$$h_k(p) \ge h_k(p') \quad \text{for } k = 1, \ldots, n - 1,$$
and $h_n(p) = h_n(p')$ (that is the two systems are equally good in series).
Considering the particular case where $R_1' = \cdots = R_n'$, one obtains the useful
bound $h_k(p_1, \ldots, p_n) \ge h_k(p_G, \ldots, p_G)$ for k = 1, ..., n, where $p_G$ is the geometric mean $(\prod_1^n p_i)^{1/n}$.
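The reliability of a k out of n system with unlike components can be computed by building up the distribution of the number of working components one component at a time, which also makes the geometric mean bound easy to check. The sketch below is our own illustration; the function name and the component reliabilities are hypothetical.

```python
def k_out_of_n_reliability(k, p):
    """P(at least k of the independent components work), via the
    distribution of the number of working components."""
    dist = [1.0]                       # dist[m] = P(m of the components seen so far work)
    for pi in p:
        new = [0.0] * (len(dist) + 1)
        for m, prob in enumerate(dist):
            new[m] += prob * (1.0 - pi)     # this component fails
            new[m + 1] += prob * pi         # this component works
        dist = new
    return sum(dist[k:])

p = [0.9, 0.6, 0.4]
geo_mean = (p[0] * p[1] * p[2]) ** (1.0 / 3.0)   # equal hazards with the same total hazard
for k in (1, 2, 3):
    print(k, k_out_of_n_reliability(k, p), k_out_of_n_reliability(k, [geo_mean] * 3))
```

With these values the unlike components do at least as well as three components at the geometric mean for every k, with equality (up to rounding) for the series system k = 3.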
Although a large collection of theory and methods exists for order statistics
from a single underlying distribution, a relatively small set of results is available
for the case of order statistics from underlying heterogeneous distributions. Inasmuch
as the time to failure of a k out of n system of independent components
with respective life distributions $F_1, \ldots, F_n$ corresponds to the (n − k + 1)th order
statistic from the set of underlying heterogeneous distributions $\{F_1, \ldots, F_n\}$,
results about k out of n systems may be interpreted in terms of order statistics
from heterogeneous distributions.
Let us assume that $Y_i$ ($Y_i'$) is an observation from distribution $F_i$ ($F_i'$) and that
$R_i(x) = -\log \bar F_i(x)$ ($R_i'(x) = -\log \bar F_i'(x)$) is the corresponding hazard function for
i = 1, ..., n. The ordered observations are denoted by $Y_{[1]} \le \cdots \le Y_{[n]}$
($Y'_{[1]} \le \cdots \le Y'_{[n]}$). A random variable Y is stochastically larger than Y'
($Y \ge_{st} Y'$) if $F_Y(x) \le F_{Y'}(x)$ for all x. In the realm of order statistics, Theorem 4.1
yields the following result:
THEOREM 4.2. Let $(R_1(x), \ldots, R_n(x)) \succ_m (R_1'(x), \ldots, R_n'(x))$ for all
$x \ge 0$. Then $Y_{[1]} =_{st} Y'_{[1]}$ and $Y_{[k]} \ge_{st} Y'_{[k]}$ for k = 2, ..., n.
Pledger and Proschan (1971) obtain further results of this type for the case of
proportional hazards. We say that the distributions $F_1, \ldots, F_n, F'_1, \ldots, F'_n$ have
proportional hazards with constants of proportionality $\lambda_1, \ldots, \lambda_n, \lambda'_1, \ldots, \lambda'_n$ if
$R_i(x) = \lambda_i R(x)$ and $R'_i(x) = \lambda'_i R(x)$ for some hazard function $R(x)$ and all
$i = 1, \ldots, n$. A consequence of Theorem 4.2 is the following:
COROLLARY 4.3. Let $F_1, \ldots, F_n, F'_1, \ldots, F'_n$ have proportional hazard functions with $\lambda_1, \ldots, \lambda_n, \lambda'_1, \ldots, \lambda'_n$ as constants of proportionality. If $(\lambda_1, \ldots, \lambda_n)$
$\ge_m (\lambda'_1, \ldots, \lambda'_n)$, then $Y_{[1]} =_{st} Y'_{[1]}$ and $Y_{[k]} \ge_{st} Y'_{[k]}$ for $k = 2, \ldots, n$.
Proschan and Sethuraman (1976) generalize Corollary 4.3 and show that under
the same stated conditions, $Y = (Y_1, \ldots, Y_n) \ge_{st} Y' = (Y'_1, \ldots, Y'_n)$ ($Y \ge_{st} Y'$ if
and only if $f(Y) \ge_{st} f(Y')$ for all real valued increasing functions $f$ of $n$ variables).
For more on stochastic ordering the interested reader should consult Kamae,
Krengel and O'Brien (1977). Proschan and Sethuraman apply their result to study
the robustness of standard estimates of the parameter $\lambda$ in an exponential distribution ($F(x) = 1 - e^{-\lambda x}$) when the observations actually come from a set of heterogeneous exponential distributions.
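For heterogeneous exponential lifetimes the conclusion of Corollary 4.3 can be checked directly, because the k-th order statistic exceeds x exactly when at least n - k + 1 components are still alive at x. The sketch below assumes independent exponential lifetimes; the rate vectors `lam` and `lam_prime` and the function names are hypothetical and chosen only so that `lam` majorizes `lam_prime`.

```python
from math import exp


def at_least_k_survive(p, k):
    """P(at least k of the independent events with probabilities p occur)."""
    dist = [1.0]
    for p_i in p:
        nxt = [0.0] * (len(dist) + 1)
        for j, mass in enumerate(dist):
            nxt[j] += mass * (1.0 - p_i)
            nxt[j + 1] += mass * p_i
        dist = nxt
    return sum(dist[k:])


def order_stat_survival(rates, k, x):
    """P(Y_[k] > x) for independent exponential lifetimes with these rates."""
    n = len(rates)
    alive = [exp(-lam_i * x) for lam_i in rates]
    # Y_[k] > x iff at most k-1 failures by time x, i.e. >= n-k+1 survivors.
    return at_least_k_survive(alive, n - k + 1)


lam = [3.0, 0.5, 0.5]         # assumed rates; majorizes lam_prime (same sum)
lam_prime = [2.0, 1.0, 1.0]
grid = [0.1 * j for j in range(1, 31)]
```

On this grid the survival function of the smallest order statistic agrees for the two rate vectors, while for k = 2, 3 the survival function under `lam` lies on or above the one under `lam_prime`.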
Other comparisons for k out of n systems are given by Gleser (1975), and
Boland and Proschan (1983). While investigating the distribution of the number
of successes in independent but not necessarily identical Bernoulli trials,
Hoeffding shows that
$$1 \ge h_k(1, \ldots, 1, \textstyle\sum_1^n p_i - [\sum_1^n p_i], 0, \ldots, 0) \ge h_k(p_1, \ldots, p_n) \ge h_k(\bar p, \ldots, \bar p)$$
whenever $\sum_1^n p_i \ge k$, and
$$0 = h_k(1, \ldots, 1, \textstyle\sum_1^n p_i - [\sum_1^n p_i], 0, \ldots, 0) \le h_k(p_1, \ldots, p_n) \le h_k(\bar p, \ldots, \bar p)$$
whenever $\sum_1^n p_i \le k - 1$. Here $\bar p = \sum_1^n p_i / n$ and $[\sum_1^n p_i]$ is the integer part of $\sum_1^n p_i$.
Gleser generalizes this in showing the following:
THEOREM 4.4. $h_k(p)$ is Schur convex in the region where $\sum_1^n p_i \ge k + 1$ and
Schur concave in the region where $\sum_1^n p_i \le k - 2$.
In further research on the reliability of k out of n systems, Boland and Proschan
(1983) show the following related result:
THEOREM 4.5. $h_k(p)$ is Schur convex in $[(k - 1)/(n - 1), 1]^n$ and Schur concave
in $[0, (k - 1)/(n - 1)]^n$.
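Theorem 4.5 can be illustrated by averaging two coordinates of p, which produces a vector majorized by p. A minimal sketch for a 2-out-of-3 system follows; the helper names and the vectors `p_high` and `p_low` are assumptions made for this example.

```python
def k_out_of_n_reliability(p, k):
    """Return h_k(p): the probability that at least k components function."""
    dist = [1.0]
    for p_i in p:
        nxt = [0.0] * (len(dist) + 1)
        for j, mass in enumerate(dist):
            nxt[j] += mass * (1.0 - p_i)
            nxt[j + 1] += mass * p_i
        dist = nxt
    return sum(dist[k:])


def average_pair(p, i, j):
    """Replace p[i] and p[j] by their mean; the result is majorized by p."""
    q = list(p)
    q[i] = q[j] = (p[i] + p[j]) / 2.0
    return q


n, k = 3, 2
threshold = (k - 1) / (n - 1)       # equals 0.5 for a 2-out-of-3 system

p_high = [0.9, 0.6, 0.7]            # every entry >= threshold
p_low = [0.1, 0.4, 0.3]             # every entry <= threshold

h_high = k_out_of_n_reliability(p_high, k)
h_high_avg = k_out_of_n_reliability(average_pair(p_high, 0, 1), k)
h_low = k_out_of_n_reliability(p_low, k)
h_low_avg = k_out_of_n_reliability(average_pair(p_low, 0, 1), k)
```

Averaging lowers the reliability in the upper cube (Schur convexity) and raises it in the lower cube (Schur concavity), in line with the theorem.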
Theorems 4.4 and 4.5 represent inequalities which have practical use in the
study of k out of n systems. However it should be clear that they are of more
general interest and have applications in particular in the areas of order statistics
and independent Bernoulli trials.
Barlow and Proschan (1965) show that the mean life of a series system with
IFR components exceeds (is greater than or equal to) the mean life of a similar
system with exponential components, assuming component mean lives match in
the two systems. The reverse ordering is shown to hold in the parallel case.
Solovyev and Ushakov (1967) extend these results to include comparisons with
systems of degenerate and truncated exponential distributions. Marshall and
Proschan (1970) more generally show that if the life distributions $F_i$ and $G_i$
of corresponding components of a pair of series systems satisfy
$\int_0^t \bar F_i(x)\,dx \ge \int_0^t \bar G_i(x)\,dx$ for all $t \ge 0$, then the same kind of inequality holds for
the system life distribution. Similarly they show that the domination
$\int_t^\infty \bar F(x)\,dx \ge \int_t^\infty \bar G(x)\,dx$ for all $t \ge 0$ is preserved under the formation of parallel
systems, and that both of these types of domination are preserved under convolutions. Marshall and Proschan (1970) are (implicitly) working with the concept
of continuous majorization (see Marshall and Olkin (1979)). We say the life
distribution function $F$ majorizes the life distribution function $G$ (written $F \ge_m G$)
if $\mu_F = \int_0^\infty \bar F(x)\,dx = \int_0^\infty \bar G(x)\,dx = \mu_G$
and $\int_t^\infty \bar F(x)\,dx \ge \int_t^\infty \bar G(x)\,dx$
for all
$t \ge 0$. As a by-product of their work on the mean life of series and parallel
systems, Marshall and Proschan establish the following result in the theory of
majorization:
THEOREM 4.6. Suppose that $F_i \ge_m G_i$ for each $i = 1, \ldots, n$, where $F_i$ and $G_i$ are life
distribution functions. Let $F(t) = F_1 * \cdots * F_n(t)$ and $G(t) = G_1 * \cdots * G_n(t)$ be n-fold
convolutions, with respective means $\mu_F$ and $\mu_G$. Then $F \ge_m G$.
Many elementary inequalities of general interest have been generated through
optimization problems in reliability theory. Derman, Lieberman and Ross (1972)
consider the problem of how to assemble J systems with n different components
in order to maximize the expected number of functioning systems. They extend
a basic inequality of Hardy, Littlewood, and Pólya and 'rediscover' (their extension is a special case of a result of Lorentz (1953)) the following inequality:
Let $F(x_1, \ldots, x_n)$ be a joint distribution function. If $x_i^1 \le \cdots \le x_i^J$ for $i = 1, \ldots, n$, then
$$\sum_{j=1}^{J} F(x_1^j, \ldots, x_n^j) \ge \sum_{j=1}^{J} F(x_1^j, x_2^{\pi_2(j)}, \ldots, x_n^{\pi_n(j)})$$
whenever $\pi_i$ ($i = 2, \ldots, n$) are permutations of $1, 2, \ldots, J$.
Barlow, R. E. and Proschan, F. (1964). Comparison of replacement policies, and renewal theory
implications. Ann. Math. Statist. 35, 577-589.
Barlow, R. E. and Proschan, F. (1965). Mathematical Theory of Reliability. Wiley, New York.
Barlow, R. E. and Proschan, F. (1981). Statistical Theory of Reliability and Life Testing. To Begin With,
Silver Spring, MD.
Bazovsky, I. (1962). Study of maintenance cost optimization and reliability of shipboard machinery.
ONR Contract No. Nonr-374000(00) (FBM), United Control Corp., Seattle, WA.
Birnbaum, Z. W., Esary, J. D. and Saunders, S. C. (1961). Multi-component systems and structures
and their reliability. Technometrics 3, 55-77.
Birnbaum, Z. W. and Saunders, S. C. (1968). A probabilistic interpretation of Miner's rule. SIAM
J. App. Math. 16, 637-652.
Black, G. and Proschan, F. (1959). On optimal redundancy. Oper. Res. 7, 581-588.
Boland, P. J. and Proschan, F. (1983). The reliability of k out of n systems. Ann. Prob. 11, 760-764.
Boland, P. J. and Proschan, F. (1984). An integral inequality with applications to order statistics.
To appear.
Brown, M. (1980). Bounds, inequalities, and monotonicity properties for some specialized renewal
processes. Ann. Probability 8, 227-240.
Brown, M. (1981). Further monotonicity properties for specialized renewal processes. Ann. Probability. 9, 891-895.
Cox, D. R. (1982). Renewal Theory. Wiley, New York.
Derman, C., Lieberman, G. J. and Ross, S. M. (1972). On optimal assembly of systems. Nav. Res.
Log. Quart. 19, 569-574.
Esary, J. D., Marshall, A. W. and Proschan, F. (1973). Shock models and wear processes. Ann. Prob.
1, 627-649.
Esary, J. D. and Proschan, F. (1963). Coherent structures of non-identical components. Technometrics
5, 191-209.
Esary, J. D., Proschan, F. and Walkup, D. W. (1967). Association of random variables, with
applications. Ann. Math. Stat. 38, 1466-1474.
Feller, W. (1948). On Probability problems in the theory of counters. Courant Anniversary Volume.
Interscience, New York.
Feller, W. (1966). An Introduction to Probability Theory and Its Applications, Vol. II. Wiley, New York.
Fortuin, C. M., Kasteleyn, P. W. and Ginibre, J. (1971). Correlation inequalities on some partially
ordered sets. Comm. Math. Phys. 22, 89-103.
Gleser, L. (1975). On the distribution of the number of successes in independent trials. Ann. Prob.
3, 182-188.
Hardy, G. H., Littlewood, J. E. and Pólya, G. (1952). Inequalities. Cambridge University Press, New
Heidelberger, P. and Iglehart, D. L. (1979). Comparing stochastic systems using regenerative
simulation with common random numbers. Adv. Appl. Prob. 11, 804-819.
Hoeffding, W. (1956). On the distribution of the number of successes in independent trials. Ann.
Math. Stat. 27, 713-721.
Joag-dev, K., Perlman, M. D. and Pitt, L. D. (1983). Association of normal random variables and
Slepian's inequality. Ann. Prob. 11, 451-455.
Kamae, T., Krengel, U. and O'Brien, G. L. (1977). Stochastic inequalities on partially ordered spaces.
Ann. Probab. 5, 899-912.
Karlin, S. (1964). Total positivity, absorption probabilities and applications. Trans. Amer. Math. Soc.
111, 33-107.
Karlin, S. (1968). Total Positivity. Stanford University Press, Stanford, CA.
Karlin, S. and Proschan, F. (1960). Pólya type distributions of convolutions. Ann. Math. Stat. 31,
Karlin, S. and Taylor, H. M. (1975). A First Course in Stochastic Processes, 2nd edition. Academic
Press, New York.
Keilson, J. (1975). Systems of independent Markov components and their transient behavior. In: R.
E. Barlow, J. B. Fussel and N. D. Singpurwalla, eds., Reliability and Fault Tree Analysis. SIAM,
Philadelphia, PA, 351-364.
Kemperman, J. H. B. (1977). On the FKG-inequality for measures on a partially ordered space.
Indag. Math. 39, 313-331.
Kimball, A. W. (1951). On dependent tests of significance in the analysis of variance. Ann. Math.
Stat. 22, 600-602.
Lehmann, E. L. (1966). Some concepts of dependence. Ann. Math. Stat. 37, 1137-1153.
Lorentz, G. G. (1953). An inequality for rearrangements. Amer. Math. Mon. 60, 176-179.
Marshall, A. W. and Olkin, I. (1967). A multivariate exponential distribution. J. Amer. Stat. Assoc.
62, 30-44.
Marshall, A. W. and Olkin, I. (1979). Inequalities: Theory of Majorization and Its Applications.
Academic Press, New York.
Marshall, A. W. and Proschan, F. (1970). Mean life of series and parallel systems. J. App. Prob. 7,
Marshall, A. W. and Proschan, F. (1972). Classes of distributions applicable in replacement, with
renewal theory implications. In: L. LeCam, J. Neyman and E. L. Scott, eds., Proceedings of the 6th
Berkeley Symposium on Mathematical Statistics and Probability, Vol. I, University of California Press,
Berkeley, CA, 395-415.
Marshall, K. T. (1973). Linear bounds on the renewal function. SIAM J. App. Math. 24, 245-250.
Miner, M. A. (1945). Cumulative damage in fatigue. J. Appl. Mech. 12, A159-A164.
Moore, E. F. and Shannon, C. E. (1956). Reliable circuits using less reliable relays. J. Franklin
Institute 262, part I 191-208 and part II 281-297.
Newman, C. M. and Wright, A. L. (1981). An invariance principle for certain dependent sequences.
Ann. Prob. 9, 671-675.
Niu, S. C. (1981). On queues with dependent interarrival and service times. Nav. Res. Log. Quart.
28, 497-501.
Pitt, L. D. (1982). Positively correlated normal random variables are associated. Ann. Prob. 10,
Pledger, G. and Proschan, F. (1971). Comparisons of order statistics and of spacings from heterogeneous distributions. In: J. S. Rustagi, ed., Optimizing Methods in Statistics. Academic Press, New
York, 89-113.
Proschan, F. (1960). Pólya Type Distributions in Renewal Theory, with an Application to an Inventory
Problem. Prentice-Hall, Englewood, NJ.
Proschan, F. and Sethuraman, J. (1976). Stochastic comparisons of order statistics from heterogeneous populations, with applications in reliability theory. J. Mult. Anal. 6, 608-616.
Robbins, H. (1954). A remark on the joint distribution of cumulative sums. Ann. Math. Stat. 25,
Ross, S. M. (1970). Applied Probability Models with Optimization Applications, Holden-Day, San
Saunders, S. C. (1970). A probabilistic interpretation of Miner's rule. II. SIAM J. App. Math. 19,
Shogan, A. W. (1977). Bounding distributions for a stochastic PERT network. Networks 7, 359-381.
Solovyev, A. D. and Ushakov, I. A. (1967). Some estimates for systems with components 'wearing
out'. (In Russian). Avtomat. i Vycisl. Tehn. 6, 38-44.
Smith, W. L. (1968). Renewal theory and its ramifications. J. Roy. Statist. Soc., Series B 20, 243-302.
P. R. Krishnaiah and C. R. Rao, eds., Handbook of Statistics, Vol. 7
© Elsevier Science Publishers B.V. (1988) 175-213
11
Reliability Ideas and Applications in Economics and
Social Sciences
M. C. Bhattacharjee*
0. Introduction and summary
0.1. In recent times, Reliability theoretic ideas and methods have been used
successfully in several other areas of investigation, with a view towards exploiting
concepts and tools which have their roots in Reliability Theory in other
settings to draw useful conclusions. As a purely illustrative list of some of these
areas and of the corresponding problems which have been so addressed, one may
mention: demography (bounds on the 'Malthusian parameter', reproductive value
and other related parameters in population growth models, useful when the
age-specific birth and death rates are unknown or subject to error: Barlow and
Saboia (1973)), queueing theory (probabilistic structure of, and bounds on, the
stationary waiting time and queue lengths in single server queues: Kleinrock
(1975), Bergmann and Stoyan (1976), Köllerström (1976), Daley (1983)) and
economics ('inequality of distribution' and associated problems: Chandra and
Singpurwalla (1981), Klefsjö (1984), Bhattacharjee and Krishnaji (1985)). In each
of these problems, the domain of primary concern and immediate reference is not
the lifelengths of physical devices, or systems of such components, or their failure-logic
structure per se, but some phenomenon, possibly random, evolving in time
and space. Nevertheless, the basic reason behind the success of this cross-fertilization
of ideas and methods in each of the examples listed above is that the concepts
and tools which owe their origin to traditional Reliability theory are in principle
applicable to non-negative (random) variables and to (stochastic) processes generated
by such variables.
0.2. Rather than attempt to provide a bibliography of all known applications
of Reliability in widely diverse areas, our purpose in this paper is more modest.
We review recent work on such applications to some problems in economics and
social sciences, work which illustrates the non-traditional applications of Reliability
ideas that are finding increasing use. In Section 1, 'social choice functions' and
the celebrated 'impossibility theorem' of Arrow (1951) are considered as an application
of 'monotone-structure' ideas. Section 2 considers 'voting games' and
'power indices', which are among the best known quantitative models of group
behavior in political science, to show that they can be modeled via the theory of
structure functions. Besides providing new viewpoints and alternative proofs of
well known classic results which these situations illustrate, reliability ideas can
also lead to new insights. Sections 3 and 4, which exploit appropriate parametric
and nonparametric 'life distribution' ideas, are in the latter category. Section 3
considers alternatives to the traditional Lorenz-coefficient and Gini-index for
measuring 'inequality of distribution' in economics by exploiting mean residual life
and TTT-transform concepts. Section 4 describes an approach to modeling some
aspects of the 'economics of innovation and R & D rivalry' by considering the
'reliability characteristics' of the time to innovation of a technologically feasible
product or process among a competing group of entrepreneurs or firms which are
in the race to be the first to innovate.
* Work done while the author was visiting the University of Arizona.
In each of the four themes, a summary of the problem formulation and basic
results of interest precedes the reliability analogies and arguments which can be
brought to bear on the problems. No detailed proofs are given except for Arrow's
theorem (Section 1.2), taken from an unpublished technical report whose succinct arguments are reviewed to illustrate how the reliability approach can be constructive
in clarifying the role of the underlying assumptions and in offering an alternative insight. The role
of interpreting the appropriate reliability theoretic concepts and results in such
an interplay should not be minimized, and such interpretations are interspersed throughout our presentation. The format is mainly expository in nature, although some results are new.
In each section, we also indicate some possible directions of further development
that would be interesting from the point of view of the themes addressed and that
of reliability theory and applications.
1. The 'Impossibility Theorem' of Arrow
1.1. Arrow (1951) considered the problem of aggregating 'individual preference
orderings' to form a 'social preference ordering'. In the conceptual framework of
social decision making and particularly in the context of voting theory, his celebrated
'impossibility theorem' is a landmark result which essentially states that there is
no social preference ordering which obeys two reasonable axioms and four conditions that one would expect all reasonable ways of aggregating individual preferences to a collective one to satisfy. Pechlivanides (1975), in a paper investigating
some aspects of social decision structures, has given an alternative proof of
Arrow's theorem using coherent-structure arguments of reliability theory. The proof
appears to have remained unpublished, and we believe it is a very apt illustration of the reliability arguments for many modeling problems in the social
sciences. His arguments are somewhat succinct; we will review and amplify them.
Before reviewing Pechlivanides' proof, we take up a brief description and
formal statement of Arrow's theorem which may not be entirely familiar to reliability
researchers. Central to this is the idea of a preference ordering $R$ among the
elements $x, y, \ldots$ of a finite set $F$. $R$ is a relation among the elements of $F$ such
that for any $x, y \in F$, we say: $x\,R\,y$ iff $x$ is at least as preferred as $y$. Such a relation
$R$ is required to satisfy the two axioms:
(A1) Transitivity: For all $x, y, z \in F$; $x\,R\,y$ and $y\,R\,z \Rightarrow x\,R\,z$.
(A2) Connectedness: For all $x, y \in F$; either $x\,R\,y$ or $y\,R\,x$ or both.
Technically $R$ is a complete pre-order on $F$; it is analogous to a relation such as
'at least as tall as' among a set of persons. Notice that we can have both $x\,R\,y$
and $y\,R\,x$ but $x \neq y$. For a given $F$, it is sometimes easier to understand the
relation $R$ through two other relations $P$, $I$ defined as $x\,P\,y \iff x$ is strictly
preferred to $y$; while $x\,I\,y \iff x$ and $y$ are equally preferred (indifference). Then
note, (i) $x\,R\,y \iff \lnot(y\,P\,x)$, i.e., $x\,R\,y$ is the negation of $y\,P\,x$ and (ii) the axiom (A2)
says: either $x\,P\,y$ or $y\,P\,x$ or $x\,I\,y$.
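The two axioms and the derived relations P and I can be made concrete with the 'at least as tall as' analogy. The snippet below is a minimal sketch; the names `heights`, `R`, `P`, `I` and `is_complete_preorder` are hypothetical and introduced only for illustration.

```python
from itertools import product

# Assumed example data: three persons, two of them equally tall.
heights = {"x": 180, "y": 180, "z": 165}


def R(u, v):
    """u R v: u is at least as preferred (here: at least as tall) as v."""
    return heights[u] >= heights[v]


def P(u, v):
    """Strict preference: u P v iff not (v R u)."""
    return not R(v, u)


def I(u, v):
    """Indifference: u I v iff both u R v and v R u."""
    return R(u, v) and R(v, u)


def is_complete_preorder(items, rel):
    """Check axioms (A1) transitivity and (A2) connectedness by enumeration."""
    connected = all(rel(u, v) or rel(v, u) for u, v in product(items, repeat=2))
    transitive = all(
        rel(u, w)
        for u, v, w in product(items, repeat=3)
        if rel(u, v) and rel(v, w)
    )
    return connected and transitive


ok = is_complete_preorder(list(heights), R)
```

Here `x` and `y` are distinct elements with both `x R y` and `y R x`, which is exactly the indifference case noted in the text.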
Now consider a society $S = \{1, 2, \ldots, n\}$ of $n$ individuals (voters), $n \ge 2$, and
a finite set $A$ of alternatives consisting of $k$ choices (candidates/policies/actions),
$k > 2$. Each individual $i \in S$ has a personal preference ordering $R_i$ on $A$ satisfying
the axioms (A1) and (A2). The problem is to aggregate all the individual
preferences into a choice for $S$ as a whole. To put it another way, since $R_i$
indicates how $i$ 'votes', an 'election' $\mathcal{E}$ is a complete set of 'votes' (formally,
$\mathcal{E} = \{R_i : i \in S\}$), and the result of any such election must amalgamate its
elements (i.e., the individual voter-preferences) in a reasonable manner into a
well-defined collective preference of the society $S$. Such a result can be thought
of as another relation $R^*$ on $A$ which, to be reasonable, must again satisfy the
same two axioms (A1) and (A2) with $F = A$.
Arrow conceptualizes the definition of a 'voting system' as the specification of
a social preference ordering $R^*$ given $S$, $A$. There are many possible $R^*$ that one
can define, including highly arbitrary ones such as $R^* = R_i$ for some $i \in S$ (such
an individual $i$, if it exists, is called a 'dictator'). To model real-world situations,
we need to exclude such unreasonable voting systems and confine ourselves to
those $R^*$ which satisfy some intuitive criteria of fairness and consistency. Arrow
visualized four such conditions, namely:
(C1) (Well-definedness). A voting system $R^*$ must be capable of a decision. For
any pair of alternatives $a, b$, there exists an 'election' for which the society prefers
$a$ to $b$. [$R^*$ must be defined on the set of all n-tuples $\mathcal{E} = (R_1, \ldots, R_n)$ of
individual preferences and is such that for all $a, b$ in $A$, either $a\,R^*\,b$ or $\lnot(a\,R^*\,b)$,
and there exists an $\mathcal{E}$ such that $\lnot(b\,R^*\,a)$.]
(C2) (Independence of Irrelevant Alternatives). $R^*$ must be invariant under
addition or deletion of alternatives. [If $A' \subset A$ and $\mathcal{E} = \{R_i : i \in S\}$ is any election,
then $R^*|_{A'}$ should depend only on $\{R_i|_{A'} : i \in S\}$, where $R_i|_{A'}$ ($R^*|_{A'}$, respectively) is
the restriction of $R_i$ ($R^*$, respectively) to $A'$.]
(C3) (Positive Responsiveness). An increasing (i.e., nondecreasing) preference for
an alternative between two elections does not decrease its social preference.
[Formally, given $S$ and $A$, let $\mathcal{E} = \{R_i : i \in S\}$ and $\mathcal{E}' = \{R'_i : i \in S\}$ be two
elections. If there exists an $a \in A$ such that
(i) $a\,R_i\,a' \Rightarrow a\,R'_i\,a'$ for all $i \in S$, and $a' \neq a$;
(ii) for all pairs $(a', b') \in A \times A$ with $a' \neq a$, $b' \neq a$, and all $i \in S$,
$\{(a', b') : a'\,R_i\,b'\} = \{(a', b') : a'\,R'_i\,b'\}$,
then $a\,R^*\,a' \Rightarrow a\,R^{*\prime}\,a'$, for all $a' \neq a$. In other words, if each voter looks on
$a \in A$ at least as favorably under $\mathcal{E}'$ as he does under $\mathcal{E}$ and if the individual
preferences between any other pair of alternatives remain the same under both
elections, then the society looks on $a$ at least as favorably under $\mathcal{E}'$ as it does
under $\mathcal{E}$.]
(C4) (No Dictator). There is no individual whose preference ('vote') always
coincides with the social preference regardless of the other individual preferences.
[There does not exist $i \in S$ with $R^* = R_i$, i.e., such that for all $(a, b)$,
$a\,R_i\,b \Rightarrow a\,R^*\,b$ and $\lnot(a\,R_i\,b) \Rightarrow \lnot(a\,R^*\,b)$.]
Call a voting system (social preference ordering) R * admissible iff it satisfies the
axioms (A1), (A2) and the conditions (C1)-(C4). Arrow's impossibility theorem
then claims that for a society of at least two individuals and more than two
alternatives, an admissible voting system does not exist.
1.2. The 'reliability' argument. The traditional proof of Arrow's theorem depends
heavily on the properties of complete pre-orders. To see the relevance of reliability
ideas for proving Arrow's theorem, Pechlivanides imagines the society $S$ as a
system and each voter $i \in S$ as one of its components. For every pair $(a, b)$ of
alternatives with $a \neq b$, associate a binary variable $x_i : A_2 \to \{0, 1\}$, where
$A_2 = \{(a, b) : a \in A, b \in A, a \neq b\}$ is the set $A \times A$ devoid of its diagonal, by
$$x_i(a, b) = \begin{cases} 1 & \text{if } a\,R_i\,b, \\ 0 & \text{if } \lnot(a\,R_i\,b). \end{cases}$$
Relative to $b$, every $x_i(a, b)$ is a vote for $a$ if $x_i(a, b) = 1$ and is a vote against
$a$ if it equals zero. Thus $x_i$ defines $i$'s vote and is an equivalent description of his
individual preference ordering $R_i$. The vote-vector $x = (x_1, \ldots, x_n) : A_2 \to \{0, 1\}^n$
is an equivalent description of an election $\mathcal{E} = (R_1, \ldots, R_n)$. A voting system
(social preference ordering) $R^*$ is similarly equivalent to specifying a social choice
function $F_A : A_2 \to \{0, 1\}$ such that
$$F_A(a, b) = \begin{cases} 1 & \text{if } a\,R^*\,b, \\ 0 & \text{if } \lnot(a\,R^*\,b). \end{cases}$$
Each $x_i(a, b) = 1$ or $0$ ($F_A(a, b) = 1$ or $0$, respectively) according as the individual
$i$ (society $S$, respectively) does not/does prefer $b$ to $a$. Formally, Arrow's result is
IMPOSSIBILITY THEOREM (Arrow). There does not exist a social choice function
$F_A$ satisfying (A1), (A2) and (C1)-(C4).
To argue that the two axioms and four conditions are collectively inconsistent,
the first step is to show:
LEMMA 1. (C1)-(C3) hold $\iff F_A = \phi(x)$ for some monotone structure function $\phi$.
PROOF. Recall that a monotone structure function in reliability theory is any
function $\phi : \{0, 1\}^n \to \{0, 1\}$ such that $\phi$ is non-decreasing in each argument and
$\phi(\mathbf{0}) = 0$, $\phi(\mathbf{1}) = 1$, where $\mathbf{0} = (0, \ldots, 0)$ and $\mathbf{1} = (1, \ldots, 1)$ (viz., Barlow and
Proschan, 1975).
First note (C2) $\Rightarrow F_A(a, b)$ depends only on $(a, b)$ and not on all of $A$. Hence
we will simply write $F$ for $F_A$. The condition (C1) $\Rightarrow F(a, b) = \phi(x(a, b))$ for all
$(a, b) \in A_2$, for some binary structure function $\phi$. Next, (C3) $\Rightarrow$ this $\phi(x)$ is
monotone non-decreasing in each coordinate $x_i$. Finally (C1) and (C3) together $\Rightarrow \phi(\mathbf{0}) = 0$, $\phi(\mathbf{1}) = 1$; viz., since by (C1), there exist vote-vectors $x_0$ and $x_1$
such that $\phi(x_0) = 0$, $\phi(x_1) = 1$; by the monotonicity hypothesis (C3) for $\phi$, we get
$$0 \le \phi(\mathbf{0}) \le \phi(x_0) = 0, \qquad 1 = \phi(x_1) \le \phi(\mathbf{1}) \le 1.$$
Thus the conditions (C1)-(C3) imply $F = \phi(x)$ for some monotone structure
function $\phi$. The converse is trivial. □
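The defining properties of a monotone structure function used in the proof are easy to verify by enumeration for small n. The following is a minimal sketch; `is_monotone_structure` and the three example rules are hypothetical names chosen for this illustration.

```python
from itertools import product


def is_monotone_structure(phi, n):
    """Check phi(0)=0, phi(1)=1 and coordinatewise monotonicity on {0,1}^n."""
    if phi((0,) * n) != 0 or phi((1,) * n) != 1:
        return False
    for x in product((0, 1), repeat=n):
        for i in range(n):
            if x[i] == 0:
                raised = x[:i] + (1,) + x[i + 1:]   # switch vote i from 0 to 1
                if phi(raised) < phi(x):
                    return False
    return True


def majority3(x):
    """Social choice is 1 when at least two of three votes are 1."""
    return int(sum(x) >= 2)


def first_voter(x):
    """Social choice copies the first vote (a dictatorial rule)."""
    return x[0]


def contrarian(x):
    """Social choice opposes the first vote; this rule is not monotone."""
    return 1 - x[0]


results = {
    "majority3": is_monotone_structure(majority3, 3),
    "first_voter": is_monotone_structure(first_voter, 3),
    "contrarian": is_monotone_structure(contrarian, 3),
}
```

Both the majority rule and the dictatorial rule pass the check, which is consistent with the lemma: conditions (C1)-(C3) alone do not exclude a dictator.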
The axioms (A1) and (A2) for voting systems, translated to requirements on the
social choice function $F(a, b) = \phi(x(a, b))$, become
(A1) Transitivity: $F(a, b) = 1 = F(b, c) \Rightarrow F(a, c) = 1$.
(A2) Connectedness: $F(a, b) = 1$ or $0$.
Consider a pair of alternatives $(a, b) \in A_2$ such that $F(a, b) = \phi(x(a, b)) = 1$.
Borrowing the terminology of reliability theory, we will say
$$P(a, b) := \{i \in S : x_i(a, b) = 1\} = \{i \in S : a\,R_i\,b\} \tag{1.3}$$
is an $(a, b)$-path. Similarly if $F(a, b) = 0$, call the set of individuals
$$C(a, b) := \{i \in S : x_i(a, b) = 0\} = \{i \in S : b\,P_i\,a\} \tag{1.4}$$
an $(a, b)$-cut. Thus an $(a, b)$-path ($(a, b)$-cut, respectively) is any coalition, i.e.,
subset of individuals whose common 'non-preference of b relative to a' ('preference of b over a', respectively) is inherited by the whole society $S$. Obviously
such paths (cuts) always exist since the whole society $S$ is always a path as well
as a cut for every pair of alternatives. When the relevant pair of alternatives $(a, b)$
is clear from the context, we drop the prefix $(a, b)$ for simplicity and just refer
to (1.3) and (1.4) as path and cut. A minimal path (cut) is a coalition of which
no proper subset is a path (cut).
To return to the main proof, notice that Lemma 1 limits the search for social
choice functions $F = \phi(x)$ to those monotone structure functions $\phi$ which satisfy
(A1), (A2) and (C4). A social choice function satisfies the connectedness axiom
(A2) iff for every pair of alternatives $(a, b)$, there exists either a path or a cut,
according as $F(a, b) = 1$ or $0$, whose members' common vote agrees with the
social choice $F(a, b)$. The transitivity axiom (A1) that $F(a, b) = 1 =
F(b, c) \Rightarrow F(a, c) = 1$ for each triple of alternatives $(a, b, c)$ can be similarly
translated as: for each of the pairs $(a, b)$, $(b, c)$, $(a, c)$, there exists a path, not
necessarily the same, which allows the cycle of alternatives $a, b, c$ to pass.
Let $\mathcal{M}$ be the class of monotone structure functions and set
$$\mathcal{F} := \{\phi \in \mathcal{M} : \text{no two paths are disjoint}\},$$
$$\mathcal{F}^* := \{\phi \in \mathcal{M} : \text{intersection of all paths is nonempty}\},$$
$$\mathcal{S} := \{\phi \in \mathcal{M} : \phi = \phi^d\},$$
where $\phi^d$ is the dual-structure function
$$\phi^d(x) := 1 - \phi(\mathbf{1} - x).$$
$\mathcal{F}$ ($\mathcal{F}^*$, respectively) are those monotone structures for which there is at least one
common component shared by any two paths (all paths, respectively). $\mathcal{S}$ is the
class of self-dual monotone structures, for which every path (cut) is also a cut
(path). Clearly $\mathcal{F}^* \subset \mathcal{F}$. Also $\mathcal{S} \subset \mathcal{F}$; for if not, then there exist two paths $P_1$, $P_2$
(which are also cuts by self-duality) which are disjoint, so that we then have a cut
$P_1$ disjoint from a path $P_2$. This contradicts the fact that any two coalitions of
which one is a path and the other a cut must have at least one common component, for otherwise it would be possible for a structure $\phi$ to fail ($\phi(x) = 0$) and
not fail ($\phi(x) \neq 0$) simultaneously, violating the well-definedness condition (C1). Thus
$$\mathcal{S} \cup \mathcal{F}^* \subset \mathcal{F}. \tag{1.5}$$
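The dual structure and the self-duality property can be checked by enumeration for small systems. This is a minimal sketch; the helper names `dual` and `is_self_dual` and the three example structures are assumptions made for illustration.

```python
from itertools import product


def dual(phi):
    """Return the dual structure phi_d(x) = 1 - phi(1 - x)."""
    def phi_d(x):
        return 1 - phi(tuple(1 - x_i for x_i in x))
    return phi_d


def is_self_dual(phi, n):
    """True when phi and its dual agree on every vote vector in {0,1}^n."""
    phi_d = dual(phi)
    return all(phi(x) == phi_d(x) for x in product((0, 1), repeat=n))


def series(x):
    return int(all(x))          # functions only if every component does


def parallel(x):
    return int(any(x))          # functions if at least one component does


def majority3(x):
    return int(sum(x) >= 2)     # 2-out-of-3 structure


series_dual_is_parallel = all(
    dual(series)(x) == parallel(x) for x in product((0, 1), repeat=3)
)
self_dual_flags = (is_self_dual(majority3, 3), is_self_dual(series, 3))
```

The 2-out-of-3 majority structure is self-dual, while the series structure is not: its dual is the parallel structure.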
To see if there is an admissible social choice function $F$, we are asking if there
exists a $\phi \in \mathcal{M}$ satisfying (A1), (A2) and (C4). To check that the answer is no,
the underlying argument is as follows. First check that (A2) forces $\phi$ to be self-dual, i.e.,
$\phi \in \mathcal{S}$, and hence $\phi \in \mathcal{F}$ by (1.5). Which are the structures in (A2) that satisfy (A1)? We
show this is precisely $\mathcal{F}^*$, i.e., claim
$$\mathcal{S} \cap (\mathrm{A1}) = \mathcal{F}^* \tag{1.6}$$
so that any admissible $F = \phi(x) \in \mathcal{F}^*$. The final step is to show the property
defining $\mathcal{F}^*$ and the no-dictator hypothesis (C4) are mutually inconsistent.
The following outlines the steps of the argument. For any pair $(a, b)$ of alternatives, the society $S$ obeying axiom (A2) must either decide 'b is not preferred
to a' ($F(a, b) = \phi(x(a, b)) = 1$) or its negation 'b is preferred to a'
($F(a, b) = \phi(x(a, b)) = 0$). If the individual votes $x(a, b)$ result in either of these
two social choices, as they must, the dual response $\mathbf{1} - x(a, b)$ (which changes every
individual vote in $x(a, b)$ to its negation) must induce the other; i.e., for each $x$,
$$\phi(x) = 0 \ (1, \text{resp.}) \iff \phi(\mathbf{1} - x) = 1 \ (0, \text{resp.}) \iff \phi^d(x) = 0 \ (1, \text{resp.}) = \phi(x).$$
Thus (A2) restricts us to the self-dual structures $\mathcal{S}$.
To argue (1.6), consider a $\phi \in \mathcal{F}^*$ (the structures whose paths all share a component). If $i_0$ is a component individual common to
all paths for all pairs of alternatives, then $\{i_0\}$ is necessarily a cut; i.e., systems
in $\mathcal{F}^*$ have a singleton cut $\{i_0\}$. Since this component $i_0$ obeys the transitivity
axiom, so does $\phi$. Thus systems in $\mathcal{F}^*$ satisfy (A1), so that together with $\mathcal{F}^* \subset \mathcal{S}$
we see that $\mathcal{F}^*$ is contained in $\mathcal{S} \cap (\mathrm{A1})$. One thus has only to argue the reverse
inclusion: systems in $\mathcal{S}$ (the self-dual structures) obeying transitivity must be in $\mathcal{F}^*$. Consider any such
system $\phi \in \mathcal{S}$ and the set of all of its paths for all alternative pairs $(a, b)$. Now
(i) if there is only a single path, then $\phi \in \mathcal{F}^*$ trivially and hence satisfies (A1)
since $\mathcal{F}^*$ does.
(ii) If there are exactly two paths in all, then $\mathcal{F} = \mathcal{F}^*$; so again $\phi \in \mathcal{F}^*$,
satisfying (A1).
(iii) If there are at least three paths, choose any three, say $P^1, P^2, P^3$. Let
$i^*(1, 2)$ be a component in $P^1 \cap P^2$. Suppose $i^*(1, 2) \notin P^3$ if possible. Then there
exist distinct components $i^*(2, 3)$, $i^*(1, 3)$ in $P^2 \cap P^3$ and $P^1 \cap P^3$ respectively.
Choose the component-votes (individual preference orderings) of these components, and the system-votes (social choices) by appropriate choices of the votes
for the remaining components in the three paths, for an arbitrary but fixed cycle
of alternatives $(a, b, c)$ as shown in Table 1 (for simplicity, the component preferences and votes are generically denoted by $P$ and $x(\cdot, \cdot)$, suppressing the
individual identity subscript. Thus for $i^*(1, 2)$, the preference $P = P_{i^*(1,2)}$,
$x(a, b) = x_{i^*(1,2)}(a, b)$, etc.).
Table 1

Component   Preference   Votes                    Choices of votes for     Corresponding
                                                  other components in      social choice
i*(1, 2)    a P b P c    x(c, b) = x(b, a) = 0    P^1, P^2                 F(c, b) = 0
i*(2, 3)    c P a P b    x(b, a) = x(a, c) = 0    P^2, P^3                 F(b, a) = 0
i*(1, 3)    b P c P a    x(a, c) = x(c, b) = 0    P^1, P^3                 F(a, c) = 0
Since $F = \phi(x)$ is self-dual, we have
$$F(a, b) = 1 - F(b, a), \qquad \text{all } (a, b) \in A_2;$$
viz., $x_i(a, b) = 1 - x_i(b, a)$, all $i \in S$, all $(a, b)$; hence $F(a, b) = \phi(x(a, b)) =
\phi^d(x(a, b)) = 1 - \phi(\mathbf{1} - x(a, b)) = 1 - \phi(x(b, a)) = 1 - F(b, a)$. Hence, for the
cycle of alternatives $(a, b, c)$, from the last column of the above table, we have:
$F(b, c) = 1 = F(c, a)$, but $F(b, a) = 0$; thus contradicting the transitivity axiom
(A1). Hence all three paths must share a common component.
In the spirit of the above construction, an inductive argument can now similarly
show that if there are $(j + 1)$ paths in all and if every set of $j$ paths has a
common component, then so does the set of all $(j + 1)$ paths, $j = 1, 2, \ldots$, if (A1)
is to hold. Thus there is a component common to all paths, i.e., $\phi \in \mathcal{F}^*$. Let $i^*$
be such a component. Since $i^*$ belongs to every path, $\{i^*\}$ is a one-component cut.
It is also a one-component path, by the self-duality of $\phi$. That $\{i^*\}$ is both a path
and a cut says
$$\phi(x) = x_{i^*}$$
irrespective of the votes $x_i$ of all other individuals $i \in S$, $i \neq i^*$. Hence $i^*$ is a
dictator. But this contradicts (C4). □
While the problem of aggregation is vacuous unless there are at least two individual
components ($n \ge 2$), notice the role of the assumption that there are at least
three choices ($k > 2$ alternatives), which places the transitivity axiom in
perspective. There are real-life voting systems (social choice functions) which do
not satisfy (A1). One such example is the majority system $R^*$ such that
$$a\,R^*\,b \iff N(a, b) \ge N(b, a),$$
where
$$N(a, b) = \#\{i \in S : a\,R_i\,b\} = \sum_{i=1}^{n} x_i(a, b).$$
Since each individual is a one-component self-dual system (viz.,
$x_i(a, b) = 1 - x_i(b, a)$, all $(a, b)$), the social choice function $F$ corresponding to
the majority voting system $R^*$ is
$$F(a, b) = \phi(x(a, b)) = 1 \ (0) \iff \sum_{i=1}^{n} x_i(a, b) \ge (<) \ \tfrac{1}{2} n.$$
Thus $F$ is the so-called $(m, n)$-structure $\phi$ in reliability theory, where
$$m = \begin{cases} [\tfrac{1}{2} n] + 1 & \text{if } n \text{ odd}, \\ \tfrac{1}{2} n & \text{if } n \text{ even}. \end{cases}$$
This $F = \varphi(x)$ is monotone, indeed a coherent structure; but $F$ and the corresponding voting system $R^*$ are not transitive, since with three choices $(a, b, c)$ we may have a majority ($\ge n/2$) of voters not preferring 'c to b' and 'b to a', but strictly less than a majority not preferring 'c to a'. Formally, $\sum_{i=1}^{n} x_i(a, b) \ge n/2$ and $\sum_{i=1}^{n} x_i(b, c) \ge n/2$ but $\sum_{i=1}^{n} x_i(a, c) < n/2$; correspondingly $F(a, b) = F(b, c) = 1$ but $F(a, c) = 0$.
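The failure of transitivity can be seen on a small case. The following is a minimal sketch, assuming a hypothetical three-voter profile of strict rankings (the profile and the helper name `majority_F` are illustrative and not taken from the text); it evaluates the majority rule $F$ on each ordered pair.

```python
from itertools import permutations

def majority_F(profile, a, b):
    """F(a, b) = 1 iff at least half of the voters do not prefer b to a."""
    n = len(profile)
    # x_i(a, b) = 1 when voter i ranks a above b (rankings are strict here)
    votes = sum(1 for ranking in profile if ranking.index(a) < ranking.index(b))
    return 1 if 2 * votes >= n else 0

# Hypothetical profile: each tuple lists one voter's alternatives, best first
profile = [("a", "b", "c"), ("b", "c", "a"), ("c", "a", "b")]

F = {(x, y): majority_F(profile, x, y) for x, y in permutations("abc", 2)}
print(F[("a", "b")], F[("b", "c")], F[("a", "c")])
```

For this profile the rule accepts (a, b) and (b, c) but rejects (a, c), which is exactly the cyclic pattern described above.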
The non-transitiveness of majority systems is a telling example of the impossibility of meeting conflicting requirements each of which is desirable by itself.
Pechlivanides (ibid.) also shows that if we replace axiom (A1) by symmetry of components (i.e., require $\varphi(x)$ to be permutation-invariant in the coordinates of $x$) but retain all other assumptions in Arrow's theorem, the only possible resulting structures are the odd-majority systems. In this sense, majority voting with an odd number ($n = 2m + 1$) of voters is a reasonable system. While transitiveness is essentially a consistency requirement, the symmetry hypothesis is an assumption of irrelevance of the identity of individuals, in that any mutual exchange of their identities does not affect the collective choice. One can ponder the implications of the trade-off between these assumptions for any theory of democratic behavior for social decision making.
1.3. The monotone structures $\varphi$ in Lemma 1 are referred to as coherent structures in Pechlivanides (1975). In accepted contemporary use (viz., Barlow and Proschan, 1975) however, coherence requires replacing the assumption $\varphi(\mathbf{0}) = 0$, $\varphi(\mathbf{1}) = 1$ made for monotone structures by the assumption that all components are 'relevant'. A component (voter) $i \in S$ is irrelevant if its (the person's) functioning or non-functioning (individual preference for or against an alternative) does not affect the system's performance (social choice), i.e., $\varphi(x)$ is constant in $x_i$:
$$\varphi(1_i, x) - \varphi(0_i, x) = 0, \quad \text{all } x,$$
where $(0_i, x) := (x_1, \ldots, x_{i-1}, 0, x_{i+1}, \ldots, x_n)$ and $(1_i, x)$ is defined similarly. Hence $\varphi(\cdot_i, x)$ is the social choice given $i$'s vote, $i \in S$. Thus,
$$i \in S \text{ is relevant} \iff \varphi(1_i, x) - \varphi(0_i, x) \ne 0, \ \text{some } x$$
$$\iff \varphi(1_i, x(a, b)) - \varphi(0_i, x(a, b)) \ne 0, \ \text{some } (a, b)$$
when relevance is translated in terms of social choice given $i$'s vote; while
$$i \in S \text{ is a dictator} \iff \varphi(1_i, x(a, b)) = 1, \ \varphi(0_i, x(a, b)) = 0, \ \text{all } (a, b).$$
Let
$$S_{a,b} = \{i \in S : \varphi(1_i, x(a, b)) - \varphi(0_i, x(a, b)) = 0\}.$$
M. C. Bhattacharjee
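Then the set of dictators, if any, is
$$D = \{i \in S : \varphi(1_i, x) - \varphi(0_i, x) \ne 0, \ \text{all } x\} = \bigcap_{(a, b) \in A^2} S_{a,b}^{c},$$
where $S_{a,b}^{c}$ is the complement of $S_{a,b}$ in $S$, while the set of irrelevant components is
$$D_0 = \{i \in S : \varphi(1_i, x) - \varphi(0_i, x) = 0, \ \text{all } x\} = \bigcap_{(a, b) \in A^2} S_{a,b}.$$
Note, $\varphi$ is coherent $\iff$ $\varphi$ is coordinatewise monotone nondecreasing and $D_0 = \emptyset$ (empty); while the 'no dictator hypothesis' holds $\iff$ $D = \emptyset$.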
In the context of the social choice problem, we may call $D_0$ the set of 'dummy' voters, i.e., those whose individual preferences are of no consequence for the social choice. An assumption of no dummies ($D_0$ empty), which together with (C1)-(C3) then leads to a coherent social choice function $F = \varphi(x)$, would require that for every individual there is some pair of alternatives $(a, b)$ for which the social preference agrees with his own. By contrast, Arrow's no-dictator hypothesis is the other side of the coin: for every individual there is some $(a, b)$ for which his preference is immaterial as a determinant of the society's choice. While the coherence assumption of reliability theory has yielded rich dividends for modeling aging/wear and tear of physical systems, it is also clear that the 'no dummy' interpretation of the 'all components are relevant' assumption is certainly not an unreasonable one to require of social choice functions. What are the implications, for traditional reliability theory, of replacing the condition of relevance of each component for coherent structures by the no-dictator hypothesis? Conversely, in the framework of social choice, it may be interesting to pursue the ramifications of substituting the no-dictator hypothesis (C4) by the condition of 'no dummy
voters'--themes which we will not pursue here, but which may lead to new
2. Voting games and political power
We turn to 'voting games' as another illustration of the application of reliability
ideas in other fields. Of interest to political scientists, these are among the better
known mathematical models of group behavior which attempt to explain the
processes of decision for or against an issue in the social setting of a committee
of n persons and formalize the notion of political power. For an excellent overview of literature and recent research in this area, see Lucas (1978), Deegan and
Packel (1978), and Straffin (1978)--all in Brams, Lucas and Straffin (1978a).
2.1. The model and basic results. Denote a committee of $n$ persons by $N$. Elements of $N$ are called players. We can take $N = \{1, 2, \ldots, n\}$ without loss of generality. A coalition is any subset $S$ of players, $S \subseteq N$. Each player votes yes or no, i.e., for or against the proposition. A winning (blocking) coalition is any coalition whose individual yes (no)-votes collectively ensure the committee passes (fails) the proposition. Let $W$ be the set of winning coalitions and $v: 2^N \to \{0, 1\}$ the binary coalition-value function
$$v(S) = \begin{cases} 1 & \text{if } S \in W \ (S \text{ winning}),\\ 0 & \text{if } S \notin W \ (S \text{ is not winning}). \end{cases} \tag{2.1}$$
Formally, a simple voting game $G$ (also referred to as a simple game) is an ordered pair $G = (N, W)$, such that
(i) $\emptyset \notin W$, $N \in W$,
(ii) $S \in W$, $S \subset T \Rightarrow T \in W$
(if everyone votes 'no' ('yes'), the proposition fails (wins); and any coalition containing a winning coalition is also a winning coalition) or, equivalently, by an ordered pair $(N, v)$ where
(i) $v(\emptyset) = 0$, $v(N) = 1$,
(ii) $v$ is nondecreasing.
The geometry and analysis of winning coalitions in voting games, as conceptual models of real-life committee situations, provide insights into the decision processes involved within a group behavior setting for accepting or rejecting a proposition. The theoretical framework invoked for such analysis is that of multiperson cooperative games, of which the games $G$ are a special class. To formulate notions of political power, we view a measure of an individual player's ability to influence the result of a voting game $G$ as a measure of such power. Two such power indices have been advanced. To describe these we need the notions of a pivot and a swing. For any permutation ordering $\pi = (\pi(1), \ldots, \pi(n))$ of the players $N = \{1, \ldots, n\}$, let $J_i(\pi) = \{j \in N : \pi(j) \text{ precedes } \pi(i)\}$ be the set of predecessors of $i$. The player $i$ is a pivot in $\pi$ if $J_i(\pi) \notin W$ but $J_i(\pi) \cup \{i\} \in W$; i.e., player $i$ is a pivot if $i$'s vote is decisive in the sense that, given the votes are cast sequentially in the order $\pi$, his vote turns a losing coalition into a winning one. A coalition $S$ is a swing for $i$ if $i \in S$, $S \in W$ but $S \setminus \{i\} \notin W$; i.e., if by changing his vote he turns a winning coalition into a losing one. Then we have the following two power indices for each player $i \in N$:
(Shapley-Shubik)
$$\Phi_i := P(i \text{ is pivotal when all permutations are equally likely}) = \sum \frac{(s - 1)!\,(n - s)!}{n!}, \tag{2.2}$$
where $s := |S|$ = the number of voters in $S$ and the sum is over all $S$ such that $S$ is a swing for $i$.
(Banzhaf)
$$\beta_i := \text{proportion of swings for } i \text{ among all coalitions in which } i \text{ votes 'yes'} = \frac{\eta_i}{2^{n-1}}, \tag{2.3}$$
where $\eta_i$ is the number of swings for $i$. The Banzhaf power index also has a probability interpretation that we shall see later (Section 2.4).
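Both indices can be computed directly from the list of swings. The sketch below is an illustration only: it assumes a hypothetical three-player weighted game (weights 2, 1, 1 with quota 3) that does not appear in the text, and the function names are ours. It enumerates swings by brute force, which is practical only for small committees.

```python
from fractions import Fraction
from itertools import combinations
from math import factorial

def swings(players, is_winning, i):
    """Coalitions S containing i such that S wins but S without i does not."""
    others = [j for j in players if j != i]
    found = []
    for r in range(len(others) + 1):
        for rest in combinations(others, r):
            S = frozenset(rest) | {i}
            if is_winning(S) and not is_winning(S - {i}):
                found.append(S)
    return found

def shapley_shubik(players, is_winning, i):
    """Sum of (s-1)!(n-s)!/n! over all swings S for i, as in (2.2)."""
    n = len(players)
    return sum((Fraction(factorial(len(S) - 1) * factorial(n - len(S)), factorial(n))
                for S in swings(players, is_winning, i)), Fraction(0))

def banzhaf(players, is_winning, i):
    """Number of swings for i divided by 2^(n-1), as in (2.3)."""
    return Fraction(len(swings(players, is_winning, i)), 2 ** (len(players) - 1))

# Hypothetical weighted game (an assumption for illustration): quota 3
weights = {1: 2, 2: 1, 3: 1}
players = sorted(weights)

def wins(S):
    return sum(weights[j] for j in S) >= 3

for i in players:
    print(i, shapley_shubik(players, wins, i), banzhaf(players, wins, i))
```

In this small game player 1 has three swings and the other two players one each, so the two indices rank the players identically but assign different magnitudes.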
If the indicator variable
$$x_i = \begin{cases} 1 & \text{if player } i \text{ votes 'yes'},\\ 0 & \text{if player } i \text{ votes 'no'}, \end{cases} \tag{2.4}$$
denotes $i$'s vote and $C_1(x) = \{i : x_i = 1\}$ is the coalition of assenting players for a realization $x = (x_1, \ldots, x_n)$ of the $2^n$ such voting configurations, then the outcome function $\psi: \{0, 1\}^n \to \{0, 1\}$ of the voting game is
$$\psi(x) = v(C_1(x)),$$
where $v$ is as defined in (2.1), and tells us whether the proposition passes or fails in the committee. Note $\psi$ models the decision structure in the committee given its rules, i.e., given the winning coalitions. In the stochastic version of a simple game, the voting configuration $X = (X_1, \ldots, X_n)$ is a random vector whose joint distribution determines the voting function
$$v := E\,\psi(X) = P\{\psi(X) = 1\},$$
the win probability of the proposition in the voting game. Sensitivity of $v$ to the parameters of the distribution of $X$ captures the effects of individual players' and their different possible coalitions' voting attitudes on the collective committee decision for a specified decision structure $\psi$.
When the players act independently with probabilities $p = (p_1, \ldots, p_n)$ of voting 'yes', the voting function is
$$v = h(p)$$
for some $h: [0, 1]^n \to [0, 1]$. The function $h$ is called Owen's multilinear extension and satisfies (Owen, 1981):
$$h(p) = p_i\,h(1_i, p) + (1 - p_i)\,h(0_i, p), \qquad h_i(p) := \frac{\partial h}{\partial p_i} = h(1_i, p) - h(0_i, p), \tag{2.6}$$
since the outcome function can be seen to obey the decomposition
$$\psi(x) = x_i\,\psi(1_i, x) + (1 - x_i)\,\psi(0_i, x), \tag{2.7}$$
where $h(\cdot_i, p) = h(p_1, \ldots, p_{i-1}, \cdot, p_{i+1}, \ldots, p_n)$. These identities are reminiscent of well-known results in reliability theory on the reliability function of coherent structures of independent components, a theme we return to in Section 2.2.
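The decomposition of $h$ about a single player can be checked numerically. The following is a minimal sketch, assuming a hypothetical three-player simple-majority game and arbitrary 'yes' probabilities (both are illustrative choices, not taken from the text); $h$ is evaluated by summing over all $2^n$ voting configurations.

```python
from itertools import product

def h(psi, p):
    """Multilinear extension: E psi(X) for independent X_i ~ Bernoulli(p_i)."""
    total = 0.0
    for x in product((0, 1), repeat=len(p)):
        prob = 1.0
        for xi, pi in zip(x, p):
            prob *= pi if xi else 1.0 - pi
        total += prob * psi(x)
    return total

def fix(p, i, value):
    """Copy of p with the i-th coordinate replaced, i.e. (value_i, p)."""
    q = list(p)
    q[i] = value
    return q

def psi(x):
    """Hypothetical outcome function: simple majority among three players."""
    return 1 if sum(x) >= 2 else 0

p = [0.3, 0.6, 0.8]   # assumed 'yes' probabilities
i = 0
lhs = h(psi, p)
rhs = p[i] * h(psi, fix(p, i, 1.0)) + (1 - p[i]) * h(psi, fix(p, i, 0.0))
h_i = h(psi, fix(p, i, 1.0)) - h(psi, fix(p, i, 0.0))
print(lhs, rhs, h_i)
```

Here `h_i` is the probability that exactly one of the other two players votes 'yes', which is when player 0's vote decides the outcome.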
If, as a more realistic description of voting behavior, one wants to drop the assumption of independent players, the modeling choices become too wide to draw meaningful conclusions. The problem of assigning suitable joint distributions to the voting configuration $X = (X_1, \ldots, X_n)$ which would capture and mimic some of the essence of real-life voting situations has been considered by Straffin (1978a) and others. Straffin assumes the players to be homogeneous in the sense that they have a common 'yes' voting probability $p$ chosen randomly in $[0, 1]$. Thus according to Straffin's homogeneity assumption, the players agree to select, collectively or through a third party, a random number $p$ in the unit interval and then, given the choice of $p$, vote independently. The fact that $p$ has a prior, in this case the uniform distribution, makes $(X_1, \ldots, X_n)$ mutually dependent with joint distribution
$$P(X_{\pi(1)} = \cdots = X_{\pi(k)} = 1,\ X_{\pi(k+1)} = \cdots = X_{\pi(n)} = 0) = \frac{k!\,(n - k)!}{(n + 1)!} \tag{2.8}$$
for any permutation $(\pi(1), \ldots, \pi(n))$ of the players. Equation (2.8) is a description of homogeneity of the players which Straffin uses to formulate (i) a power index and (ii) an agreement index which is a measure of the extent to which a player's vote and the outcome function coincide. He also considers the relationship between these indices corresponding to the uniform prior and the prior $f(p) = \text{const} \cdot p(1 - p)$; results we will find more convenient to describe in a more general format in the next section.
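Two consequences of (2.8) are easy to verify exactly: the number of 'yes' votes is uniformly distributed on $\{0, 1, \ldots, n\}$, and the votes are positively dependent. The sketch below is illustrative only; the committee size and the helper name `straffin_prob` are our assumptions.

```python
from fractions import Fraction
from math import comb, factorial

def straffin_prob(n, k):
    """P(a given set of k players votes 'yes' and the other n-k vote 'no') under (2.8)."""
    return Fraction(factorial(k) * factorial(n - k), factorial(n + 1))

n = 5  # assumed committee size
# There are C(n, k) configurations with exactly k 'yes' votes, all equally likely
yes_count_dist = [comb(n, k) * straffin_prob(n, k) for k in range(n + 1)]

# In a two-player committee, both voting 'yes' is more likely than under
# independent fair-coin voting (1/4), reflecting the dependence induced by p
both_yes = straffin_prob(2, 2)
print(yes_count_dist, both_yes)
```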
2.2. Implications of the reliability framework for voting games. From the above discussions, it is clear that voting games are conceptually equivalent to systems of components in reliability theory. Table 2 is a list of the dual interpretations of several theoretical concepts in the two contexts:

Table 2
Voting games                    Reliability structures
winning (losing) coalition      path (cut)
blocking coalition              complement of a cut
outcome function                structure function
voting function                 reliability function
multilinear extension           reliability function with independent components
Thus every voting game has an equivalent reliability network representation and
can consequently be analysed using methods of the latter. As an illustration
consider the following:
EXAMPLE. Consider the simple game $(N, W)$ with five players, $N = \{1, 2, 3, 4, 5\}$. This voting game is equivalent to a coherent structure of two parallel subsystems of two components each (players 1, 2 and players 3, 4) and a fifth component, all in series. We see that to win in the corresponding voting game, a proposition must pass through each of two subcommittees with a '50% majority wins' voting rule and then also be passed by the chairperson (component 5). The voting function of this game when committee members vote 'yes' independently with a probability $p$ (i.e., the version of Owen's multilinear extension in the i.i.d. case) is thus given by the reliability function
$$h(p) = p^3 (2 - p)^2$$
of the above coherent structure. The minimal path sets of this structure are the smallest possible winning coalitions, which are the four 3-player coalitions in $W$. Since the minimal cut sets are $\{1, 2\}$, $\{3, 4\}$ and $\{5\}$, their complements are the minimal blocking coalitions, which are the smallest possible coalitions $B$ with veto-power in the sense that their complements $N \setminus B$ are not winning coalitions.
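The reliability function and the minimal winning coalitions of this example can be recovered by enumeration. In the sketch below the outcome function is read off the verbal description (players 1, 2 and players 3, 4 forming the two subcommittees, player 5 the chairperson); that encoding and the function names are our assumptions.

```python
from itertools import combinations, product

def psi(x):
    """(1 or 2) and (3 or 4) and 5, with x = (x1, ..., x5) the 0/1 votes."""
    x1, x2, x3, x4, x5 = x
    return max(x1, x2) * max(x3, x4) * x5

def h(p):
    """Win probability when all five members vote 'yes' i.i.d. with probability p."""
    return sum(psi(x) * p ** sum(x) * (1 - p) ** (5 - sum(x))
               for x in product((0, 1), repeat=5))

def wins(S):
    return psi(tuple(1 if j in S else 0 for j in range(1, 6))) == 1

# A winning coalition is minimal if dropping any single member makes it lose
# (sufficient here because the game is monotone)
minimal_winning = [set(S)
                   for r in range(1, 6)
                   for S in combinations(range(1, 6), r)
                   if wins(S) and not any(wins(set(S) - {j}) for j in S)]

print(minimal_winning)
print(h(0.5), 0.5 ** 3 * (2 - 0.5) ** 2)
```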
To pursue the reliability analogy further, we proceed as follows. Although it is not the usual way, we may look at a voting game $(N, W)$ as the social choice problem of Section 1 when there are only two alternatives, $A = \{a, b\}$. Set $a$ = fail the proposition, and $b$ = pass the proposition. Player $i$'s personal preference ordering $R_i$ is then defined by
$$a\,R_i\,b \ \ (a \not R_i\, b) \iff i \text{ does not (does) prefer } b \text{ to } a \iff i \text{ votes no (yes)}.$$
If $x_i$ is $i$'s 'vote' as in (2.4) and $y_i = y_i(a, b) = 1$ or $0$ according as $a\,R_i\,b$ or $a \not R_i\, b$ (as in Section 1) is the indicator of preference, then $y_i = 1 - x_i$, $i \in N$, and clearly $\psi(x) = 0\ (1) \iff$ proposition fails (passes) $\iff \varphi(1 - x) = \varphi(y) = 1\ (0)$, where $\varphi$ is the social choice and $\psi$ the outcome function. Hence
$$\psi(x) = 1 - \varphi(1 - x) = \varphi^{d}(x) = \varphi(x)$$
since $\varphi$ is self-dual. Thus $\psi = \varphi$ and hence $\psi$ is also self-dual. The latter in particular implies the existence of a player who must be present in every winning coalition (viz. (1.7)).
With the choice set restricted to two alternatives, Arrow's condition (C1) is trivial, condition (C2) of irrelevant alternatives is vacuously true and so is the transitivity axiom (A1). Since $\psi = \varphi$, the condition (C1) says $\psi(x)$ must be defined for all $x$ while axiom (A2) says $\psi$ is binary. The condition of positive responsiveness (C3) holds $\iff$ all supersets of winning coalitions are winning, which is built into the definition of a voting game. Lemma 1 thus implies:
LEMMA 2. The outcome function $\psi$ of a voting game is a monotone structure function. $\psi$ is a coherent structure iff there are no 'dummies'.
The first part of the above result is due to Ramamurthy and Parthasarathy (1984). The social choice function analogy of the outcome function and its coherence in the absence of dummies is new.
A dummy player is one whose exclusion from a winning coalition does not destroy the winning property of the reduced coalition, i.e.,
$$i \in S, \ S \in W \ \Rightarrow \ S \setminus \{i\} \in W.$$
Equivalently, $i$ is not a dummy iff there is a swing $S$ for $i$. The coherence conclusion in Lemma 2 holds since in a voting game the 'no dummy hypothesis' says all components are relevant in the equivalent reliability network, viz. for any $i \in N$,
$$i \text{ is relevant} \iff \text{there exists } x^0 \text{ such that } \psi(1_i, x^0) - \psi(0_i, x^0) \ne 0$$
$$\iff \{j : x^0_j = 1\} \cup \{i\} \text{ is a swing for } i$$
$$\iff \text{player } i \text{ is not a dummy}.$$
An equivalent characterization of a dummy $i \in N$ is that $i$ belongs to no minimal winning coalition. On the other hand, in the social choice scenario of Section 1, a player $i \in N$ is a dictator if $\{i\}$ is a winning as well as a blocking coalition.
When the players act independently in a stochastic voting game, we recognize the identities (2.6), (2.7) on the outcome function and Owen's multilinear extension as reproducing standard decomposition results in coherent structure theory, as they must. The voting function $h(p)$, being a monotone (coherent) structure's reliability function, must be coordinatewise monotone: $p \le p' \Rightarrow h(p) \le h(p')$, which has been independently recognized in the voting game context (Owen, 1982). The Banzhaf power index (2.3) is none other than the structural importance of components in $\psi$. Since research in voting games and reliability structures has evolved largely independently, this general lack of recognition of their dualism has been the source of some unnecessary duplication of effort. Every result in either theory has a dual interpretation in the other, although they may not be equally meaningful in both contexts. The following are some further well-known reliability ideas in the context of independent or i.i.d. components which have appropriate and interesting implications for voting games. With the exception of 2.2.1 below, we believe the impact of these ideas has not yet been recognized in the literature on voting games with independent or i.i.d. players.
2.2.1. The reliability importance
$$v_i = E\{\psi(1_i, X) - \psi(0_i, X)\} \tag{2.9}$$
measures how crucial $i$'s vote is in a game with outcome function $\psi$ and random voting probabilities. As an index of $i$'s voting power, $v_i$ is defined for any stochastic voting configuration $X$ and has been used by Straffin within the homogeneity framework ($(X_1, \ldots, X_n)$ conditionally i.i.d. given $p$). We may call $v_i$ the voting importance of $i$. If the players are independent, then
$$v_i = h_i(p)$$
in the notation of Section 2.1 (viz. (2.6)). Thus, e.g., in the stochastic unanimity game where all players must vote yes to pass a proposition, the player least likely to vote in favor has the most voting importance. Similarly in other committee decision structures, one can use $v_i$ to rank the players in order of their voting importance. For a game with i.i.d. players, $i$'s voting importance becomes the function $v_i = h_i(p)$ where $h_i(p) = h(1_i, p) - h(0_i, p)$ and $h(\cdot_i, p)$, $h(p)$ denote the corresponding versions of $h(\cdot_i, \mathbf{p})$, $h(\mathbf{p})$ respectively when $\mathbf{p} = (p, \ldots, p)$. Since in this case $h'(p) = \sum_{i \in N} h_i(p)$, one can also use the proportional voting importance
$$v_i^* = \frac{v_i}{\sum_{j \in N} v_j} = \frac{h_i(p)}{h'(p)}$$
as a normalized power index in the i.i.d. case.
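The remark about the unanimity game can be illustrated in a few lines. For a series structure $h(p) = \prod_j p_j$, so $h(1_i, p) - h(0_i, p)$ is the product of the other players' probabilities. The probabilities below are hypothetical values chosen for illustration, and `unanimity_importance` is our name.

```python
from math import prod

def unanimity_importance(p):
    """h_i(p) = h(1_i, p) - h(0_i, p) when h(p) is the product of all p_j."""
    return [prod(pj for j, pj in enumerate(p) if j != i) for i in range(len(p))]

p = [0.9, 0.5, 0.7]            # assumed 'yes' probabilities
v = unanimity_importance(p)
most_important = v.index(max(v))
least_likely_yes = p.index(min(p))
print(v, most_important, least_likely_yes)
```

The player with the smallest 'yes' probability is multiplied out of every other player's importance but not out of his own, which is why he ranks first.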
2.2.2. The fault-tree-analysis algorithm of reliability theory will systematically enumerate the smallest cut sets and hence the minimal blocking coalitions of a voting game through its reliability network representation. The dual event-tree algorithm will similarly produce all minimal winning coalitions, the Banzhaf power indices and the voting importances.
2.2.3. S-shapedness of the voting function for i.i.d. players with no dummies. This follows from the Moore-Shannon inequality (Barlow and Proschan, 1965)
$$p(1 - p)\,\frac{dh}{dp} \ge h(p)(1 - h(p))$$
for the reliability function of a coherent structure with i.i.d. components. The implications of this fact in the voting game context are probably not well known. In particular, the S-shapedness of the voting function implies that among all committees of a given size $n$, the $k$-out-of-$n$ structures ($100k/n\,\%$ majority voting games) have the sharpest rate of increase of the probability of a committee of $n$ i.i.d. players passing a bill as the players' common yes-voting probability increases.
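As a numerical illustration of the Moore-Shannon inequality, the sketch below uses the voting function $h(p) = p^3(2 - p)^2$ of the five-player example from this section. The derivative is worked out by hand here (it is not given in the text), and the grid is an arbitrary choice.

```python
def h(p):
    """Voting function of the five-player example: p^3 (2 - p)^2."""
    return p ** 3 * (2 - p) ** 2

def dh(p):
    """Derivative of h, computed by hand: p^2 (2 - p)(6 - 5p)."""
    return p ** 2 * (2 - p) * (6 - 5 * p)

grid = [k / 100 for k in range(1, 100)]
# gap(p) = p(1 - p) h'(p) - h(p)(1 - h(p)); the inequality says gap >= 0
gaps = [p * (1 - p) * dh(p) - h(p) * (1 - h(p)) for p in grid]
print(min(gaps) >= 0)
```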
2.2.4. Component duplication is more effective than system duplication. This property of a structure function implies: replicating committees is less effective, in the sense of resulting in a smaller outcome/voting function, than replicating committee members by subcommittees (modules) which mimic the original committee structure $\psi$. This may be useful in the context of designing representative bodies when such choices are available.
2.2.5. Composition of coherent structures. Suppose a voting game $(N, W)$ has no dummies and is not a unanimity game (series structure) or its dual (any single yes vote is enough: parallel structure). Suppose each player in this committee $N$ with structure $\psi$ is replaced by a subcommittee whose structure replicates the original committee, and this process is repeated $k$ times; $k = 1, 2, \ldots$. With i.i.d. players, the voting function $h_k(p)$ of the resulting expanded committee is then the reliability function of the $k$-fold composition of the coherent structure $\psi$, which has the property
$$h_k(p) \downarrow 0, \ = p_0, \ \uparrow 1 \quad \text{according as} \quad p <, \ = \ \text{or} \ > p_0, \qquad \text{as } k \uparrow \infty$$
(Barlow and Proschan, 1965), where $p_0$ is the unique value satisfying $h(p_0) = p_0$, guaranteed by S-shapedness. When we interpret the above for voting games, the first conclusion is perhaps not surprising, although the role of the critical value $p_0$ is not fully intuitive. The other two run counter to crude intuition; particularly the last one, which says that by expanding the original committee through enough repeated compositions, one can almost ensure winning any proposition which is sufficiently attractive individually. The dictum 'too many cooks spoil the broth' does not apply here.
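The three regimes are easy to see numerically. The sketch below assumes, purely for illustration, a three-member simple-majority committee, whose voting function is $h(p) = 3p^2 - 2p^3$ with fixed point $p_0 = 1/2$; this game is not one discussed in the text.

```python
def h(p):
    """Voting function of a 2-out-of-3 majority committee with i.i.d. members."""
    return 3 * p ** 2 - 2 * p ** 3

def compose(p, k):
    """k-fold composition of h with itself, evaluated at p."""
    for _ in range(k):
        p = h(p)
    return p

for p in (0.4, 0.5, 0.6):   # below, at, and above the fixed point 1/2
    print(p, [round(compose(p, k), 6) for k in (1, 3, 10)])
```

Starting below 1/2 the win probability collapses toward 0, starting above it climbs toward 1, and at 1/2 it stays put.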
2.2.6. Compound voting games and modular decomposition. If $(N_j, W_j)$, $j = 1, 2, \ldots, k$, are simple games with pairwise disjoint player sets and $(M, V)$ is a simple game with $|M| = k$ players, the compound voting game $(N, W)$ is defined as the game with $N = \bigcup_{j=1}^{k} N_j$ and
$$W = \{S \subseteq N : \{j \in M : S \cap N_j \in W_j\} \in V\}.$$
$(M, V)$ is called the master game and $(N_j, W_j)$ the modules of the compound game $(N, W)$. The combinatorial aspects of compound voting games have been extensively studied. Considering the equivalent reliability networks it is clear, however, that if the component games $(N_j, W_j)$ have structures $\psi_j$, $j = 1, \ldots, k$, and the master game $(M, V)$ has structure $\varphi$, then the compound voting game $(N, W)$ has structure
$$\psi = \varphi(\psi_1, \ldots, \psi_k).$$
Conversely, the existence of some $\varphi, \psi_1, \ldots, \psi_k$ satisfying this representation for a given $\psi$ can be taken as an equivalent definition of the corresponding master game, component subgames and the accompanying player sets as the modular sets of the original voting game. E.g., in the 5-player example at the beginning of this section, clearly both subcommittees $J_1 = \{1, 2\}$, $J_2 = \{3, 4\}$ are modular sets and the corresponding parallel subsystems are the subgame modules. Ramamurthy and Parthasarathy (1983) have recently exploited the results on modular decomposition of coherent systems to investigate voting games in relation to their component subgames (modules) and to decompose a compound voting game into its modular factors (player sets obtained by intersecting maximal modular sets or their complements with each other). Modular factors decompose a voting game into its largest disjoint modules. The following is typical of the results which can be derived via coherent structure arguments (Ramamurthy and Parthasarathy):
THREE MODULES THEOREM. Let $J_i$, $i = 1, 2, 3$, be coalitions in a voting game $(N, W)$ with a structure $\psi$ such that $J_1 \cup J_2$, $J_2 \cup J_3$ are both modular. Then each $J_i$ is modular, $i = 1, 2, 3$, and $\bigcup_{i=1}^{3} J_i$ is either itself modular or the full committee $N$. The modules $(J_i, \psi_i)$, $i = 1, 2, 3$, which appear in $(N, \psi)$ are either in series or in parallel, i.e., the three-player master game is either a unanimity game, or a trivial game where the only blocking coalition is the full committee.
2.3. The usual approach in modeling coherent structures of dependent components is to assume the components are associated (Barlow and Proschan, 1975). By contrast, the prevalent theoretical approach in voting games when the players are not independent, as suggested by Straffin (1978), assumes a special form of dependence according to (2.8). One can show that (2.8) implies $X_1, \ldots, X_n$ are associated. Thus voting game results under Straffin's model and its generalized version suggest an approach for modeling dependent coherent structures. These results are necessarily stronger than those that can be derived under the associatedness hypothesis alone.
The remarkable insight behind Straffin's homogeneity assumption is that it amounts to the voting configuration $X$ being a finite segment of a special sequence of exchangeable variables. The effect of this assumption is that the probability of any voting pattern $x = (x_1, \ldots, x_n)$ depends only on the size of the assenting and dissenting coalitions and not on the identity of the players, as witness (2.8). One can reproduce this homogeneity of players through an assumption more general than Straffin's. Ramamurthy and Parthasarathy (1984) exploit appropriate reliability ideas to generalize many results of Straffin and others, by considering the following weakening of Straffin's assumption.
General homogeneity: the voting configuration $X = (X_1, \ldots, X_n)$ is a finite segment of an infinite exchangeable sequence.
Since $X_1, X_2, \ldots$ are binary, by de Finetti's well-known theorem the voting configuration's joint distribution has a representation
$$P(X_{\pi(1)} = \cdots = X_{\pi(k)} = 1, \ X_{\pi(k+1)} = \cdots = X_{\pi(n)} = 0) = \int_0^1 p^k (1 - p)^{n-k}\,dF(p) \tag{2.10}$$
for some prior distribution $F$ on $[0, 1]$; and the votes $X_1, \ldots, X_n$ are conditionally independent given the 'yes' voting probability $p$. Straffin's homogeneity assumption corresponds to a uniform prior for $p$, leading to (2.8). For a stochastic voting game defined by its outcome (structure) function $\psi$, consider the power index
$$v_i := E\{\psi(1_i, X) - \psi(0_i, X)\},$$
defined in (2.9), and the agreement indices
$$A_i := P\{X_i = \psi(X)\},$$
$$\rho_i := \operatorname{cov}(X_i, \psi(X)),$$
$$\sigma_i := \int_0^1 \operatorname{cov}(X_i, \psi(X) \mid p)\,dF(p).$$
Also, let
$$b := \operatorname{cov}(P, h(P)).$$
Here $P$ is the randomized probability of voting 'yes' with prior $F$ in (2.10). Note $b$, $\sigma_i$ are defined only under the general homogeneity assumption, while $v_i$, $A_i$ and $\rho_i$ are well defined for every joint distribution of the voting configuration $X$. Recall
that a power index measures the extent of change in the voting game's outcome as a consequence of a player's switching his vote, and an agreement index measures the extent of coincidence of a player's vote and the final outcome. Thus any measure of mutual dependence between two variables reflecting the voting attitudes of a player and the whole committee respectively qualifies as an agreement index. An analysis of the interrelationships of these indices provides an insight into the interactions between players' individual level of command over the game and the extent to which they are in tune with the committee decision and ride the decisive bandwagon.
The agreement index $A_i$ is due to Rae (1979). Under (2.8), $v_i$ becomes Straffin's power index and $\sigma_i$ is proportional to an agreement index also considered by Straffin. Note all the coefficients are non-negative. This is clear for $v_i$ and $A_i$, and follows for $\rho_i$, $\sigma_i$ and $b$ from standard facts for associated r.v.s (Barlow and Proschan, 1975), association being weaker than the general homogeneity (GH) hypothesis. The interesting results under the assumption of general homogeneity (Ramamurthy and Parthasarathy, 1984) are
$$\rho_i = \sigma_i + b,$$
$$2b = \sum_{i \in N} \sigma_i \ge \int_0^1 h(p)(1 - h(p))\,dF(p), \tag{2.11}$$
$$A_i = 2\sigma_i + 2b + \tfrac{1}{2}.$$
The equality in the second assertion holds only under Straffin's homogeneity (SH) assumption. This assertion follows by noting $\sigma_i = \int_0^1 p(1 - p)\,h_i(p)\,dF(p)$ under GH and $h'(p) = \sum_i h_i(p)$, then integrating by parts termwise in $\sum_i \sigma_i$ with a uniform prior to conclude the equality, and invoking the S-shapedness of $h(p)$ for the bound.
The above relations in particular imply
(i) Under GH, $i$ is a dummy $\iff \sigma_i = 0$. If the odds of each player voting yes and no are equal under GH, i.e., if the marginal probability $P(X_i = 1) = \tfrac{1}{2}$, then we also have: $i$ is a dummy $\iff \rho_i = b \iff A_i = 2b + \tfrac{1}{2}$. Thus, since $b$ is in a sense the minimal affinity between a player's vote and the committee's decision, Straffin suggests using $2\sigma_i \ (= A_i - 2b - \tfrac{1}{2})$ as an agreement index.
(ii) Let $w$ and $l = 2^n - w$ be the numbers of winning and losing coalitions. Since $h_i(\tfrac{1}{2}) = \beta_i$ (structural importance = Banzhaf power index) and $h(\tfrac{1}{2}) = w/2^n$, taking $F$ as a point mass at $\tfrac{1}{2}$, (2.11) gives
$$\sum_{i \in N} \beta_i \ge 2^{-2(n-1)}\,w\,l.$$
version that we may easily develop. Let n; = : .[ 1 p dF(p) = E X~ be the marginal
probability of i voting yes under general homogeneity. Then
Reliability applications in economics
A i = ~ P(X i = ~b(X) = j ) = E X~k(1., X ) + E((1 - X~)(1 - ~b(0e, X))
= E X 1 ~O(X) + E(1 - X 0 ( 1 - if(X))
= 2 cov(X 1, qJ(X)) + E ~O(X){2E X~ - 1} + 1 - E X~
= 2p, + v ( 2 n , - 1) + (1 - hi),
= 2 p , + ~ v + (1 - h i ) ( 1 - v)
which reduces to the stated relationship whenever n i = 1 for some i e N. Notice
that the convex combination term in braces, which measures the marginal contribution to A i of a player's voting probability n/, depends on the game's value v via
an interaction term unless n i - 2"
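The identity $A_i = 2\rho_i + v(2\pi_i - 1) + (1 - \pi_i)$ can be confirmed by exact enumeration. The sketch below assumes Straffin's distribution (2.8) for the votes and a hypothetical three-player weighted game (weights 2, 1, 1, quota 3); the game and the helper names are our illustrative choices.

```python
from fractions import Fraction
from itertools import product
from math import factorial

def prob(x):
    """Probability of configuration x under (2.8): depends only on the 'yes' count."""
    n, k = len(x), sum(x)
    return Fraction(factorial(k) * factorial(n - k), factorial(n + 1))

def expect(f, n):
    """Exact expectation of f(X) over all 2^n voting configurations."""
    return sum((prob(x) * f(x) for x in product((0, 1), repeat=n)), Fraction(0))

def psi(x):
    """Hypothetical weighted game: weights 2, 1, 1 and quota 3."""
    return 1 if 2 * x[0] + x[1] + x[2] >= 3 else 0

n, i = 3, 0
pi_i = expect(lambda x: x[i], n)                       # marginal 'yes' probability
v = expect(psi, n)                                     # win probability
rho_i = expect(lambda x: x[i] * psi(x), n) - pi_i * v  # cov(X_i, psi(X))
A_i = expect(lambda x: 1 if x[i] == psi(x) else 0, n)  # agreement probability
print(A_i, 2 * rho_i + v * (2 * pi_i - 1) + (1 - pi_i))
```

Since the uniform prior gives $\pi_i = 1/2$, the interaction term vanishes here and the agreement probability is $2\rho_i + 1/2$.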
2.4. Influence indices and stochastic compound voting games. There are some interesting relationships among members of a class of voting games via their power and agreement indices. In the spirit of (2.10), consider a compound voting game consisting of the two game modules
(i) a voting game $G = (N, W)$ with $N = \{1, \ldots, n\}$, and
(ii) a simple majority voting game $G_m = (N_m, W_m)$ of $(2m + 1)$ players with
$$N_m = \{n + 1, \ldots, n + 2m, n + 2m + 1\}, \qquad W_m = \{S \subseteq N_m : |S| \ge m + 1\},$$
i.e., any majority (at least $(m + 1)$ players) coalition wins. Replacing the player $(n + 2m + 1)$ in the majority game by the game $G = (N, W)$, define the compound game $G_m^* = (N^*, W^*)$, where
$$N^* = \{1, \ldots, n, n + 1, \ldots, n + 2m\},$$
$$W^* = \{S \subseteq N^* : \text{either } |S \setminus N| \ge m + 1 \ \text{or} \ |S \setminus N| \ge m, \ S \cap N \in W\}. \tag{2.13}$$
$G^*$ models the situation where the player $(n + 2m + 1)$ in the majority game $G_m$ is bound by the wishes of a constituency $N$, as determined by the outcome of the constituency voting game $G = (N, W)$, which he represents in the committee $N_m$. The winning coalitions in the composite game $G^*$ are those which either have enough members to win the majority game $G_m$, or are at most a single vote short of winning $G_m$ when the player representing the constituency $N$ is not counted but contain a winning coalition for the constituency game $G = (N, W)$. The winning coalitions in the latter category are precisely those $S$ such that (i) $|S \setminus N| = m$, i.e., for any $i \notin S \setminus N$, $\{i\} \cup (S \setminus N)$ is a swing for every such player $i$ in the majority game $G_m$, and (ii) the players of $S$ in $N$ also win the constituency voting game $G$. With an i.i.d. voting configuration, if $h_i(p)$ and $h_i^*(p)$ respectively denote the voting importance of $i \in N$ in $G$ and $G^*$, then clearly
$$h_i^*(p) = \binom{2m}{m} p^m (1 - p)^m\,h_i(p), \qquad i \in N. \tag{2.14}$$
Under general homogeneity, the class of priors
$$F_{a,b}(p) = \frac{(a + b - 1)!}{(a - 1)!\,(b - 1)!} \int_0^p u^{a-1} (1 - u)^{b-1}\,du, \qquad a > 0, \ b > 0,$$
which leads to the voting configuration distribution
$$P(X_1 = \cdots = X_k = 1, \ X_{k+1} = \cdots = X_n = 0) = \frac{a^{(k)}\,b^{(n-k)}}{(a + b)^{(n)}}, \tag{2.15}$$
where $a^{(k)} = a(a + 1) \cdots (a + k - 1)$ denotes the rising factorial, can reflect different degrees of mutual dependence (tendency of alignments and formation of voting blocks) of players for different choices of $a$, $b$. Player $i$'s vote $X_i$ in the model (2.15) is described by the result of the $i$-th drawing in the well-known Polya-urn model which starts with $a$ white and $b$ black balls and adds a ball of the same color as the one drawn in successive random drawings. For any voting game $G$ with a Polya-urn prior $F_{a,b}$, denote the associated influence indices of power/agreement by writing $v_i = v_i(G: a, b)$, etc. Notice that Straffin's original homogeneity assumption corresponds to the prior $F_{1,1}$. Using $v_i(G: a, b) = \int_0^1 h_i(p)\,dF_{a,b}(p)$ and (2.14), Ramamurthy and Parthasarathy (1984) have shown:
$$v_i(G: 1, 1) = \Phi_i,$$
$$\sigma_i(G: a, b) = \frac{ab}{(a + b)(a + b + 1)}\,v_i(G: a + 1, b + 1),$$
and, in the framework of the compound voting game $G^*$ in (2.13),
$$v_i(G: m + 1, m + 1) = (2m + 1)\,v_i(G_m^*: 1, 1), \tag{2.16}$$
extending the corresponding results of Straffin (1978) which can be recovered
from the above by setting a = b = m = 1. The second assertion above shows that
the apparently distinct influence notions of 'agreement' and 'power' are not
unrelated and one can capture either one from the other by modifying the degree
of dependence among the voters as modeled by (a, b) to (a + 1, b + 1) or
(a - 1, b - 1) as may be appropriate. The first assertion states the equivalence of
Shapley-Shubik index with voting importance under uniform prior (Straffin's
power index), while the third assertion shows a relationship between voting importances in the compound game in (2.13) and the corresponding constituency game
under appropriate choice of voter-dependence in the two games.
Notice that $v_i(G: m + 1, m + 1) \to \beta_i$, the Banzhaf power index in the constituency game, since the case of players voting yes or no independently with equal odds ($p = \tfrac12$) can be obtained by letting $m \to \infty$ in the prior $F_{m+1,m+1}$. Hence by (2.16), in the composite game $G^*$ with (2m + 1) players,
$$(2m + 1)\,v_i(G^*: 1, 1) \to \beta_i \quad \text{as } m \to \infty,$$
i.e., Straffin's power index in the compound game $G^*$ multiplied by the number of players approaches the Banzhaf power index (structural importance) in the constituency game G = (N, W).
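The first assertion in (2.16) and the Banzhaf limit can be checked numerically on a small example. The sketch below assumes that the Polya-urn prior $F_{a,b}$ is the Beta(a, b) mixing distribution (the de Finetti representation of the urn) and that $h_i(p)$ is the probability that player i is crucial when the others vote 'yes' independently with probability p. The weighted majority game, its weights and quota, and all function names are hypothetical choices made for illustration only.

```python
from fractions import Fraction
from itertools import combinations, permutations
from math import factorial


def beta_moment(a, b, k, l):
    """Exact E[p^k (1-p)^l] for p ~ Beta(a, b) with integer a, b >= 1."""
    def beta_fn(x, y):
        return Fraction(factorial(x - 1) * factorial(y - 1), factorial(x + y - 1))
    return beta_fn(a + k, b + l) / beta_fn(a, b)


def is_winning(coalition, weights, quota):
    """Weighted majority rule (an assumed example of a simple game)."""
    return sum(weights[j] for j in coalition) >= quota


def swings(i, weights, quota):
    """Yield the coalitions S (without i) that lose alone but win with i."""
    others = [j for j in range(len(weights)) if j != i]
    for k in range(len(others) + 1):
        for s in combinations(others, k):
            if not is_winning(s, weights, quota) and is_winning(s + (i,), weights, quota):
                yield s


def voting_importance(i, weights, quota, a, b):
    """v_i(G: a, b): integral of h_i(p) against the Beta(a, b) prior."""
    n = len(weights)
    return sum(beta_moment(a, b, len(s), n - 1 - len(s)) for s in swings(i, weights, quota))


def shapley_shubik(i, weights, quota):
    """Fraction of orderings in which player i is pivotal."""
    n = len(weights)
    pivots = 0
    for order in permutations(range(n)):
        before = order[:order.index(i)]
        if not is_winning(before, weights, quota) and is_winning(before + (i,), weights, quota):
            pivots += 1
    return Fraction(pivots, factorial(n))


def banzhaf(i, weights, quota):
    """Fraction of coalitions of the other players for which i is a swing."""
    return Fraction(sum(1 for _ in swings(i, weights, quota)), 2 ** (len(weights) - 1))


weights, quota = [3, 2, 1, 1], 4   # hypothetical game
v_uniform = voting_importance(0, weights, quota, 1, 1)
v_large = voting_importance(0, weights, quota, 200, 200)
```

With a = b = 1 the Beta moments reduce to the familiar Shapley weights, which is why `v_uniform` agrees with `shapley_shubik(0, weights, quota)`; with a = b large the prior concentrates near p = 1/2 and `v_large` is close to `banzhaf(0, weights, quota)`.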
The priors $F_{a,b}$, under the general homogeneity hypothesis, reflect progressively less voter interdependence with increasing (a, b). In this sense the case a = b = 1, the minimal values for a Polya urn, which is Straffin's homogeneity, models the maximum possible such dependence. To emphasize the conceptual difference as well as the similarity of the Shapley-Shubik and Banzhaf indices of power, we may note that they are the two extreme cases of the voting importance $v_i$ (viz. (2.9)) corresponding to a = b = 1 and the limiting case $a = b \to \infty$.
It is interesting to contrast the probability interpretations of the Shapley-Shubik and Banzhaf power indices. A player $i \in N$ is crucial if, given the others' votes, his vote makes the difference between winning and losing the proposition in the committee. While the Shapley-Shubik index $\phi_i$ in (2.2) is the probability that $i \in N$ is crucial under Straffin's homogeneity (players' votes are conditionally i.i.d. given p), the Banzhaf index $\beta_i$ in (2.3) is the probability that i is crucial when the players choose 'yes'-voting probabilities $p_i$, $i \in N$, independently and the $p_i$, $i \in N$, are uniformly distributed. The probability of individual-group agreement under this independence assumption is
$$\beta_i \cdot (1) + (1 - \beta_i) \cdot (\tfrac12) = \tfrac12 (1 + \beta_i).$$
The right hand side can be used as an agreement index. These results are due to Straffin (1978).
2.5. While we have argued that several voting game concepts and results are variants of system reliability ideas in a different guise, others, and in particular the general homogeneity assumption and its implications, may contain important lessons for reliability theory. For example, in systems in which the status of some or all components may not be directly observable except via perfect or highly reliable monitors, such as hazardous components in a nuclear installation, the agreement indices can serve as alternative or surrogate indices of reliability importance of inaccessible components. The general homogeneity assumption in system reliability would amount to considering coherent structures of exchangeable components, a strengthening of the concept of associatedness as a measure of component dependence; an approach which we believe has not been fully exploited and which should lead to more refined results than under associatedness of components alone.
3. 'Inequality' of distribution of wealth
3.1. One of the chief concerns of development economists is the measurement
of inequality of income or other economic variables distributed over a population
that reflects the degree of disparity in ownership of wealth among its members.
The usual tool kit used by economists to measure such inequality of distribution
is the well known Lorenz curve and the Gini index for the relevant distribution
of income or other similar variables, traditionally assumed to follow a log-normal
distribution for which there is substantial empirical evidence and some theoretical
arguments. Some studies however have questioned the universality of the lognormal assumption; see e.g., Salem and Mount (1974), MacDonald and Ransom
(1979). Mukherjee (1967) has considered some stochastic models leading to gamma distributions for the distribution of wealth variables such as landholding. Bhattacharjee and Krishnaji (1985) have considered a model for the landholding process across generations, allowing for acquisition and disposal of land in each generation and where ownership is inherited, to argue that the equilibrium distribution of landholding, when it exists, must be NWU ('new worse than used') in the sense of reliability theory, i.e., the excess residual holding $X - t \mid X > t$ over any threshold t stochastically dominates the original landholding variable X in the population. The NWU property is a fairly picturesque description of the relative abundance of 'rich' landowners (those holding X > t) compared to the total population of landowners across the entire size scale.
In practice, even stronger evidence of disparity has been found. In an attempt to empirically model the distribution of landholdings in India, it has been found (Bhattacharjee and Krishnaji, 1985) that the log-gamma and/or the DFR gamma laws provide a better approximation to the landholding data for each state in India based on National Sample Survey (NSS) figures. Table 3 is typical of the relatively better approximations provided by the gamma and the log-gamma on $(1, \infty)$ relative to the log-normal. While the log-gamma is known to have an eventually decreasing failure rate, the estimated shape parameters of the gammas were all less than one, typically around ½ for every state, and hence all had decreasing failure rates.

Table 3
Landholding in the State of W. Bengal, India (1961-1962) and model estimates
[Table body not recoverable; surviving labels: size (acres) classes 0-1, 1-5, ...; a model column fitted on (1, ∞).]
For landholdings, the NWU argument and the empirical DFR evidence above (everywhere with gammas, or in the long range as with the log-gamma) are suggestive of the possibility of exploiting reliability ideas. If $X \ge 0$ is the amount of wealth, such as land, owned with distribution F, it is then natural to invoke appropriate life-distribution concepts for the holding distribution F in an attempt to model the degree of inequality present in the pattern of ownership of wealth. The residual holding $X - t \mid X > t$ in excess of t, with distribution
$$F_t(x) = 1 - \{\bar F(t + x)/\bar F(t)\},$$
and the mean residual holding
$$g(t) := E(X - t \mid X > t)$$
correspond respectively to the notions of the residual life and the mean residual life in reliability theory. In particular, the extent of wealth which the 'rich' command is described by the behavior of g(t) for large values of t. More generally, the nature of $F_t$ and the excess average holding g(t) over an affluence threshold t, as a function of the threshold, provide a more detailed description of the pattern of ownership across different levels of affluence in the population.
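As a concrete illustration of the residual holding and the mean residual holding, the following sketch computes their empirical counterparts from a list of holdings. The data values and function names are made up for the example and do not come from the NSS figures.

```python
def mean_residual_holding(holdings, t):
    """Empirical g(t): average of (x - t) over the holdings x that exceed t."""
    excess = [x - t for x in holdings if x > t]
    if not excess:
        raise ValueError("no holdings exceed the threshold t")
    return sum(excess) / len(excess)


def residual_survival(holdings, t, x):
    """Empirical 1 - F_t(x) = P(X > t + x | X > t)."""
    above = [v for v in holdings if v > t]
    if not above:
        raise ValueError("no holdings exceed the threshold t")
    return sum(1 for v in above if v > t + x) / len(above)


# Hypothetical holdings (acres), chosen to have a heavy right tail.
holdings = [0.5, 1.0, 2.0, 3.0, 10.0, 50.0]
g_at_2 = mean_residual_holding(holdings, 2.0)   # average excess over 2 acres
g_at_3 = mean_residual_holding(holdings, 3.0)   # average excess over 3 acres
```

For these made-up values the empirical g(t) rises with the threshold, which is the qualitative pattern that the anti-aging classes are meant to capture.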
Using the above interpretations of $F_t$ and g(t), the notion of skew and heavy-tailed distributions of wealth as being symptomatic of the social disparity of ownership can be captured in fairly picturesque ways, with varying degrees of strength, by the different anti-aging classes (DFR, IMRL, NWU, NWUE) of 'life distributions' well known in reliability theory. For example, a holding distribution F is DFR (decreasing failure rate: $F_t$ stochastically increasing in t) if the proportion of the progressively 'rich' with residual holding in excess of any given amount increases with the level of affluence. The other weaker anti-aging hypotheses, IMRL (increasing mean residual life: $g(t) \uparrow$), NWU (new worse than used: $F_t \ge_{st} F$, all t) and NWUE (new worse than used in expectation: $g(t) \ge g(0+)$), can be similarly interpreted as weaker descriptions of disparity.
Motivated by these considerations, Bhattacharjee and Krishnaji (1985) have suggested using
$$I_1 = g^*/\mu, \quad \text{where } g^* = \lim_{t \to \infty} g(t), \ \mu = g(0+),$$
$$I_2 = \lim_{t \to \infty} E\left(\frac{X}{t} \,\Big|\, X > t\right) = 1 + \lim_{t \to \infty} \frac{g(t)}{t}, \qquad (3.1)$$
when they exist, as indices of inequality in the distribution of wealth. They also consider a related measure $I_0 = g^* - \mu = \mu(I_1 - 1)$, which is a variant of $I_1$ but is not dimension free as $I_1$, $I_2$ are. The assumption that the limits in (3.1) exist is usually not a real limitation in practice. In particular, the existence of $g^* \le \infty$ is free under IMRL and DFR assumptions, with $g^*$ finite for reasonably nice subfamilies such as the DFR gammas. More generally, the holding distributions for which $g^* \le \infty$ ($g^* < \infty$ respectively) exists form the family of 'age-smooth' life distributions, which are those F for which the residual-life hazard function $-\ln \bar F_t(x)$ converges on $[0, \infty]$ ($(0, \infty]$ respectively) for each x as $t \to \infty$ (Bhattacharjee, 1986).
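To see how the two indices in (3.1) behave, the sketch below evaluates their finite-threshold versions $g(t)/\mu$ and $1 + g(t)/t$ for two textbook cases whose mean residual life is known in closed form: the exponential (g(t) constant) and a Pareto law with $P(X > t) = t^{-\alpha}$ for $t \ge 1$ (g(t) = t/(α - 1)). The parameter values and names are assumptions chosen for illustration.

```python
def finite_threshold_indices(g, mu, t):
    """Return (g(t)/mu, 1 + g(t)/t), the finite-t analogues of I_1 and I_2."""
    return g(t) / mu, 1.0 + g(t) / t


# Exponential holding distribution with mean 2 (memoryless: g(t) = mean).
exp_mean = 2.0
exp_g = lambda t: exp_mean

# Pareto holding distribution, P(X > t) = t**(-alpha) for t >= 1, alpha = 3.
alpha = 3.0
pareto_mean = alpha / (alpha - 1.0)
pareto_g = lambda t: t / (alpha - 1.0)   # valid for t >= 1

big_t = 1.0e6
exp_i1, exp_i2 = finite_threshold_indices(exp_g, exp_mean, big_t)
par_i1, par_i2 = finite_threshold_indices(pareto_g, pareto_mean, big_t)
```

In the exponential case both quantities sit at (or tend to) 1, while in the Pareto case $g(t)/\mu$ grows without bound and $1 + g(t)/t$ equals α/(α - 1), so $I_1 = \infty$ and $1 < I_2 < \infty$.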
$I_1$ and $I_2$ are indicators of aggregate inequality of the distribution of wealth in two different senses. $I_1$ measures the relative preponderance of the wealth of the super-rich, while $I_2$ indicates in a sense how rich they are. The traditional index of aggregate inequality, on the other hand, as measured by the classical Gini index (Lorenz measure) G, can be expressed as
$$G = P(Y > X) - P(Y \le X) = 1 - 2\int_0^\infty F_1(x)\,dF(x),$$
where X is the amount of wealth with holding distribution F and Y, independent of X, has the so-called 'share distribution'
$$F_1(x) := \mu^{-1}\int_0^x t\,dF(t), \qquad (3.2)$$
the share of total wealth held by the part of the population with holdings below x. A somewhat pleasantly surprising but not fully understood feature of the three indices $I_1$, $I_2$ and G is that they turn out to be monotone increasing in the coefficient of variation for many holding distributions F. Such is the case with G under the log-normal, $I_1$ under the gamma and $I_2$ under the log-gamma (Bhattacharjee and Krishnaji, 1985). Note also that whenever the holding distribution is anti-aging in the DFR, IMRL, NWU or NWUE sense, the coefficient of variation (c.v.) is at least one (Barlow and Proschan, 1975), a skewness feature aptly descriptive of the disproportionate share of the rich.
Recently the author has considered other inequality indices which share this monotonicity in c.v. under weak anti-aging hypotheses and has re-examined the appropriateness of $I_1$, $I_2$ as measures of aggregate inequality to show (Bhattacharjee, 1986a):
(i) The non-trivial case $1 < I_2 < \infty$ implies $I_1 = \infty$ necessarily, and then
$$I_2 = (1 + \eta^2) \lim_{t \to \infty} \frac{I_{F_1}(t)}{I_1(t)}, \qquad (3.3)$$
where η is the coefficient of variation of the holding distribution F, $I_1(t) = g(t)/\mu = \int_t^\infty \bar F(u)\,du / \{\mu \bar F(t)\} \to I_1 = \infty$, and $I_{F_1}(t)$ is the inequality function $I_1(t)$ computed for the share distribution $F_1$ associated with F.
(ii) The ratio of the hazard functions of the holding and share distributions converges to $I_2$:
$$I_2 = \lim_{t \to \infty} \frac{\ln(1 - F(t))}{\ln(1 - F_1(t))}. \qquad (3.4)$$
Clearly $I_1 \ge 1$ if the holding distribution F is NWUE, with equality iff F is exponential. Similarly, by (3.1), $I_2 \ge 1$ with equality iff g(t) = o(t) or, equivalently, a condition on hazard functions via (3.4). The question of when $I_1$ and $I_2$ are finite, so as to be meaningful for purposes of comparison across populations, has the following answers (Bhattacharjee, 1986a):
(iii) $I_1 < \infty \iff 1 - F(\ln x)$ is $(-\rho)$-varying, for some $\rho \in (0, \infty]$. F is strictly NWUE $\Rightarrow I_1 > 1$.
(iv) For any holding distribution F, $1 \le I_2 \le \infty$. The different possibilities are characterized by:
(a) If F is DFR, then $I_2 = 1 \iff$ the residual holding scaled by its mean converges to exponential, i.e.,
$$P(X > t + x\,g(t) \mid X > t) \to e^{-x}.$$
This condition is necessary for $I_2 = 1$, without the DFR hypothesis.
(b) $1 < I_2 < \infty \iff$ the 'excess holding factor' over an affluence threshold t converges to the Pareto distribution:
$$P(X > tx \mid X > t) \to x^{-\alpha}, \quad x \ge 1,$$
with $\alpha = I_2/(I_2 - 1)$.
(c) $I_2 = \infty \iff P(X - t > x \mid X > t) \sim t/(t + x)$ as $t \to \infty$. Notice that the distribution on the right hand side is DFR with infinite mean.
The n.s.c. in (iii) is the condition of generalized regular variation (Feller, 1966; Seneta, 1976): a real valued function h(x) on the half-line is regularly varying if h(xy)/h(y) converges as $y \to \infty$, and then $h(xy)/h(y) \to x^{\rho}$ for some $\rho \in [-\infty, \infty]$. With an obvious interpretation of $x^{\rho}$ when $\rho = \pm\infty$, such an h(x) is called ρ-varying.
3.2. The Lorenz curve and TTT-transform. While $I_1$, $I_2$ and the classical Gini index are all aggregate measures of inequality, it is also useful to have a more dynamic measure of inequality which will describe the variation of the disparity of ownership with changing levels of affluence. This is classically modeled by the Lorenz curve
$$L(p) = \mu^{-1}\int_0^p F^{-1}(u)\,du, \quad 0 \le p \le 1,$$
where μ is the average holding and $F^{-1}(u) = \inf\{t: F(t) \ge u\}$. L(p) measures the proportion of total wealth owned by the poorest 100p% of the population, and is thus a variant of the share distribution $F_1$ in (3.2), namely $L(F(t)) = F_1(t)$. As remarked earlier, the ratio g(t)/μ of the mean residual holding to the average holding can also serve such a purpose. The Lorenz curve L and its inverse $L^{-1}$ are both distribution functions on the unit interval. The relevance of reliability ideas for modeling inequality, and relationships of the Lorenz curve to some well known functionals of life distributions, were first indicated by Chandra and Singpurwalla (1981) and further studied by Klefsjö (1984). If
$$W(p) := \mu^{-1}\int_0^{F^{-1}(p)} \bar F(t)\,dt$$
is the scaled total time on test (TTT) transform of the holding distribution F viewed as a life distribution with mean μ, and the cumulative TTT-transform is $V := \int_0^1 W(p)\,dp$, then
$$L(p) = W(p) - (1 - p)\,\mu^{-1} F^{-1}(p)$$
(Chandra and Singpurwalla, 1981), where the Gini index
$$G = 1 - 2\int_0^\infty F_1(t)\,dF(t) = 1 - 2\int_0^1 L(p)\,dp = 2\int_0^1 \{p - L(p)\}\,dp \qquad (3.5)$$
is scale-equivalent to the area bounded by the diagonal and the Lorenz curve, as is well known. Based on a random sample with order statistics $X_{(1)}, X_{(2)}, \ldots, X_{(n)}$ from F, the estimated sample Lorenz curve and the Gini statistic
$$L_n(j/n) = \frac{\sum_{i=1}^{j} X_{(i)}}{\sum_{i=1}^{n} X_{(i)}}, \qquad G_n = \frac{\sum_{j=1}^{n-1} j(n - j)\,(X_{(j+1)} - X_{(j)})}{(n - 1)\sum_{j=1}^{n} X_{(j)}}$$
are similarly related to the total time on test statistic $W_n$ and its cumulative version:
$$L_n(j/n) = W_n(j/n) - (n - j)\,\frac{X_{(j)}}{\sum_{i=1}^{n} X_{(i)}}.$$
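The sample quantities can be computed directly from the order statistics. The sketch below is one way to do so, assuming the scaled TTT statistic at j/n is the total time on test up to the j-th order statistic divided by the sample total, and taking the cumulative TTT statistic as the average of $W_n(j/n)$ over j = 1, ..., n - 1 (a normalisation assumed here). Under these assumptions the Gini statistic equals one minus the cumulative TTT statistic, which the code checks on a toy sample. Function names and data are illustrative.

```python
def sample_lorenz(xs):
    """Return [L_n(1/n), ..., L_n(n/n)]: cumulative shares of the sorted sample."""
    xs = sorted(xs)
    total = sum(xs)
    shares, running = [], 0.0
    for x in xs:
        running += x
        shares.append(running / total)
    return shares


def scaled_ttt(xs):
    """Return [W_n(1/n), ..., W_n(n/n)]: scaled total time on test statistics."""
    xs = sorted(xs)
    n, total = len(xs), sum(xs)
    values, running = [], 0.0
    for j, x in enumerate(xs, start=1):
        running += x
        # Items already "failed" contribute their value; the n - j others contribute x.
        values.append((running + (n - j) * x) / total)
    return values


def gini_statistic(xs):
    """G_n = sum_j j (n - j) (X_(j+1) - X_(j)) / ((n - 1) * sum_j X_(j))."""
    xs = sorted(xs)
    n = len(xs)
    numerator = sum(j * (n - j) * (xs[j] - xs[j - 1]) for j in range(1, n))
    return numerator / ((n - 1) * sum(xs))


sample = [4.0, 1.0, 3.0, 2.0]                      # toy data
ttt = scaled_ttt(sample)
cumulative_ttt = sum(ttt[:-1]) / (len(sample) - 1)  # assumed normalisation of V_n
gini = gini_statistic(sample)
```

The same sorted pass yields both curves, so the Lorenz and TTT views of a sample cost essentially the same to compute.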
Chandra and Singpurwalla (1981), Klefsjö (1984) and Taillie (1981) have used partial orderings of life distributions to compare the Lorenz curves of holding distributions which are so ordered. For the partial ordering notions
(i) $H <_c F$ if $F^{-1}H$ is convex,
(ii) $H <_* F$ if $F^{-1}H$ is star-shaped ($F^{-1}H(t)/t$ is increasing),
(iii) $H <_T F$ if $F^{-1}/H^{-1}$ is increasing,
(iv) $H <_m F$ if $\int_x^\infty \{\bar F(t) - \bar H(t)\}\,dt \ge 0$, all x > 0, with equality at x = 0;
they show
$$H <_c F \ \Rightarrow\ L_F(p) \le L_H(p), \qquad H <_m F \ \Rightarrow\ L_F(p) \le L_H(p). \qquad (3.6)$$
In particular, taking H to be exponential, the distribution F in (i) above corresponds to DFR, (ii) to DFRA and (iv) to HNWUE (Klefsjö, 1982). Reversing the roles of H and F leads to the dual aging classes. (3.6) implies that
$$L(p) \le p + (1 - p)\ln(1 - p),$$
the Lorenz curve of the exponential, whenever the holding distribution is HNWUE with a finite mean. This bound obviously remains valid for the smaller class of NWU and DFR distributions, for which we have earlier found some theoretical and empirical evidence respectively as plausible models of landholding distributions.
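The exponential bound on the Lorenz curve can be illustrated numerically. The following sketch integrates the quantile function by the midpoint rule to obtain L(p), checks the closed form for the exponential, and compares it with a Weibull law of shape 1/2, which has a decreasing failure rate. The choice of the Weibull example, the step count and the function names are assumptions made for illustration.

```python
import math


def lorenz_from_quantile(quantile, mean, p, steps=20000):
    """L(p) = (1/mean) * integral_0^p quantile(u) du, by the midpoint rule."""
    width = p / steps
    area = sum(quantile((k + 0.5) * width) for k in range(steps)) * width
    return area / mean


def exponential_lorenz(p):
    """Closed-form Lorenz curve of the exponential distribution."""
    return p + (1.0 - p) * math.log(1.0 - p)


def exponential_quantile(u):
    """Quantile function of the unit-mean exponential."""
    return -math.log(1.0 - u)


def weibull_quantile(shape):
    """Quantile function of the Weibull law with survival exp(-t**shape)."""
    return lambda u: (-math.log(1.0 - u)) ** (1.0 / shape)


shape = 0.5                                  # shape < 1 gives a DFR Weibull
weibull_mean = math.gamma(1.0 + 1.0 / shape)  # mean of that Weibull law
levels = [0.1, 0.5, 0.9]
weibull_curve = [lorenz_from_quantile(weibull_quantile(shape), weibull_mean, p) for p in levels]
exponential_curve = [exponential_lorenz(p) for p in levels]
```

At each level the DFR example lies below the exponential Lorenz curve, that is, the poorest 100p% hold a smaller share than they would under exponential holdings.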
In a more general vein, Klefsjö (1984) remarks that, in the spirit of (3.5), contrasting the Lorenz curve against the uniform distribution on (0, 1), the quantities
$$J_k := k(k + 1)\int_0^1 p^{k-1}\{p - L(p)\}\,dp, \quad k \ge 1,$$
$$L_k := k(k - 1)\int_0^1 (1 - p)^{k-2}\{p - L(p)\}\,dp, \quad k \ge 2,$$
can be used as generalized indices of inequality. The Gini index is the special case $G = J_1 = L_2$. Notice in view of (3.7), we have $J_k \ge 0$, $L_k \ge 0$ for all anti-aging holding distributions F or their 'aging' duals; and $J_k = L_k = 0$ only in the egalitarian case L(p) = p where everybody owns the same amount of wealth (F is degenerate). By expressing $J_k$ as
Klefsjö (1984) implicitly notes that $J_k$ can be interpreted as the excess over k - 1 of the ratio of the mean life of a parallel system of (k + 1) i.i.d. components with life distribution F to that of a similar system with exponential lives. Similarly, we note that
$$L_k = (k - 1)\int_0^1 (1 - u)^{k-2}\{1 - W(u)\}\,du = 1 - \mu^{-1}\int_0^\infty \bar F^{\,k}(t)\,dt$$
measures the relative advantage of a component with life F against a series system of k such i.i.d. components, as measured by the difference of the corresponding mean lives as a fraction of the component mean life. These interpretations bring into sharper focus the relationships of the notion of 'inequality of distribution' in economics to measures of system effectiveness in reliability.
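The series-system reading of $L_k$ can be checked in the exponential case, where the mean life of a series system of k components is μ/k and hence $1 - \mu^{-1}\int_0^\infty \bar F^{\,k}(t)\,dt = 1 - 1/k$. The sketch below assumes the Lorenz-curve form $L_k = k(k - 1)\int_0^1 (1 - p)^{k-2}\{p - L(p)\}\,dp$ and evaluates it by the midpoint rule; the function names and step count are illustrative.

```python
import math


def exponential_lorenz(p):
    """Lorenz curve of the exponential distribution: p + (1 - p) ln(1 - p)."""
    return p + (1.0 - p) * math.log(1.0 - p)


def generalized_gini(lorenz, k, steps=100000):
    """Assumed form L_k = k (k - 1) * integral_0^1 (1 - p)^(k - 2) {p - L(p)} dp."""
    width = 1.0 / steps
    total = 0.0
    for j in range(steps):
        p = (j + 0.5) * width          # midpoints never reach p = 1
        total += (1.0 - p) ** (k - 2) * (p - lorenz(p))
    return k * (k - 1) * total * width


gini_exponential = generalized_gini(exponential_lorenz, 2)   # the Gini index, L_2
l3_exponential = generalized_gini(exponential_lorenz, 3)
```

For k = 2 this returns the familiar exponential Gini index of one half, and for k = 3 the value two thirds, in line with 1 - 1/k.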
3.3. Applications to statistical analysis of lifelengths. The reliability approach to modeling 'inequality of distributions' suggests applications to reliability inference. Using weak convergence of the empirical Lorenz process $\{L_n(t): 0 \le t \le 1\}$,
$$L_n(t) = \begin{cases} \sum_{i=1}^{j} X_{(i)} \big/ \sum_{i=1}^{n} X_{(i)} & \text{if } (j - 1)/n < t \le j/n, \\ 0 & \text{if } t = 0, \end{cases}$$
to a process related to the Brownian bridge (Goldie, 1977), it is thus possible to construct a test of exponentiality, a theme of central interest in reliability and life testing. However, the difficulty of evaluating the exact distribution of $L_n(t)$ to determine the critical points of the goodness-of-fit test based on the sample Lorenz curve has in practice required simulation even in large samples (Gail and Gastwirth, 1978). In contrast, the critical cut-off values of the corresponding test based on the sampled TTT-process $\mathcal{W}_n(t) := \sqrt{n}\{W_n(j/n) - W(t)\}$, $0 \le t \le 1$ (Barlow and Campo, 1975), are the usual Kolmogorov-Smirnov statistics, since, under the null hypothesis of exponentiality (W(t) = t), $\mathcal{W}_n(t)$ converges exactly to the Brownian bridge.
If the alternatives belong to a more restricted family such as the well known non-parametric life distribution classes in reliability, then there are other possibilities. Klefsjö (1983) has used a variant of the aggregate inequality index $L_k$ in (3.8) to construct a test of exponentiality against HNBUE (HNWUE) alternatives. His test statistic is based on an estimate of $B_k := kL_k - (k - 1)$, noting $B_k \ge (\le)\ 0$ if F is HNBUE (HNWUE), with $B_k = 0$ only if F is exponential. Estimation and tests of monotonicity and a turning point of the mean residual life function g(t) have been considered by Hollander and Proschan (1975) and Guess and Proschan (1983). Our inequality indices $I_1$ and $I_2$ suggest a related open problem: estimation and tests for $I_1$, $I_2$, which are parameters descriptive of the tail behavior of the mean residual life. The question of estimating $I_1$ is well defined within the family of age-smooth life distributions (Bhattacharjee, 1986). On the other hand, the domains of attraction results (Bhattacharjee, 1986a) described earlier, which characterize possible values of $I_2$, imply that estimating $I_2$ and testing $I_2 = 1$ against $1 < I_2 < \infty$ are problems of independent interest for reliability theory.
4. R & D rivalry and the economics of innovation
4.1. Innovations and accompanying technological breakthroughs have changed
the lot of mankind throughout history and noticeably more so in the present
century at an accelerating pace. Since technological change affects market structure through altering the means of production, economists began to be interested
in the subject of technical advance around the fifties. Although there are some
earlier references to the economic aspects of technological advance (Taussig, 1915;
Hicks, 1932), the stage for serious inquiry on the economics of such advance was
set by Schumpeter (1961, 1964, 1975) who emphasized the role of innovation as
an economic activity. Since then, the recognition of technical advance as a major
source of economic growth has been the subject of many studies, mostly empirical. These studies deal with empirical relationships of industrial innovations to
firm size and concentration as indicators of market structure, the 'technology-push' and 'demand-pull' factors (Arrow, 1962) as incentives for innovation, and such other relevant variables. Collectively they point to the need for a conceptual framework, and recently an economic theory of technical advance has begun to emerge (Kamien and Schwartz, 1982). In this view, the economic agents are firms
or entrepreneurs and an act of product- or process-innovation straddles all
activities from basic research through invention to development, production, distribution and collection of consequent revenues against the backdrop of industrial
rivalry in the competition to gain market supremacy. Schumpeter recognized that
acts of invention and innovational entrepreneurship are distinct as are the corresponding risks; and it is only the latter which can lead to the diffusion of benefits
of invention to its ultimate consumers.
Innovation and entrepreneurship in this framework are viewed as a race to be the first, with the incentive of commanding extraordinary profits at least until imitators appear, when such monopoly profits will begin to be eroded. The 'Schumpeterian hypothesis' that the opportunity to realize monopoly profits spurs invention and that the presence of some monopoly power has a similar effect, the latter also stressed by Galbraith (1952), forms the basis of a modern economic theory of technical advance. The accent is on competition through innovation rather than through price alone, and is thus contrary to the traditional tenets of the western economic doctrine of 'perfect competition', which would eliminate any excess profit of an innovation by immediate imitation.
4.2. The presence of identified or potential rivals who are in the race to be the
first to innovate constitutes the major source of uncertainty for an entrepreneur.
It is this aspect of innovational (R & D) rivalry on which reliability ideas can be brought to bear that is of interest to us. Even within the context of such applications, there are a host of issues in modeling the economics of innovation which can be so addressed within the Schumpeterian framework. Kamien and Schwartz (1982) provide a definitive account of contemporary research on the economics of technical advance, where reliability researchers will recognize the potential to exploit reliability ideas through modeling the uncertainty associated with innovational rivalry and possible duration of monopoly between successful
innovation and rivals' imitation. These ideas do not appear to have been explicitly
recognized and are only implicit in Kamien and Schwartz (1982). We will
consider one such model to focus on the relevance of reliability concepts in
modeling the economics of technical advance which may lead to deeper insights
into the role of innovational rivalry as a determinant of technological progress.
In this simplified model of innovation as an economic activity under the Schumpeter scenario, our entrepreneur or firm has either only one product (economic 'good') or none at all (breaking in as a newcomer), and is competing against rivals to develop an innovation. We assume there is no essential resource constraint and no major uncertainty important enough to warrant stochastic modelling of the entrepreneur's time to complete development. Any desired completion time $\tau$ can be achieved by spending a required amount $C(\tau)$ representing the net present value of the cost stream incurred to complete development at time $\tau$. Although it is usual to assume that $0 < C(\tau)$ is convex decreasing, for our purposes the latter assumption is unnecessary, and only assuming C(0) sufficiently large to prevent instantaneous development will suffice. Assume a market growth rate $\gamma$; $\gamma >$, $=$ or $< 0$ according as the market is growing, stationary or declining.
The development process is assumed to be contractual in the sense that innovation will be seen through its completion by the entrepreneur as well as the rivals, either as a pioneer or as an imitator. The entrepreneur has only an incomplete knowledge about rivals' introduction time T, reflected by its d.f. $H(t) = P(T \le t)$, about which more will be said later.
The current rate of the entrepreneur's return $r(t; \tau, T)$ at time t depends not only on when the innovation is introduced in the market but also on whether our entrepreneur is a winner succeeding first or an imitator of the rivals. Let this be $r_0$ (receipt on current good) until introduction of the innovation changes it to $r_1$ or $p_0$ according as some rival or the entrepreneur succeeds first. These rates remain in effect until the moment both the innovating pioneer and the imitator appear. Once the entrepreneur and the rivals are both in the market, the former's rate of return changes again. The current value of its contribution to the total return is a function $P(\tau, T)$, the current capitalized value of the stream of future receipts, which depends on $\tau$ and T typically through $|\tau - T|$: the lag between innovation and imitation. The structure of P also depends on whether the rivals win ($\tau \ge T$; correspondingly $P =: P_1(\cdot)$, say) or imitate ($T > \tau$, when $P =: P_0(\cdot)$):
$$P(\tau, T) = \begin{cases} P_0(T - \tau) & \text{if } \tau < T, \\ P_1(\tau - T) & \text{if } \tau \ge T; \end{cases}$$
and the flow of receipts can be schematically described as below.
[Schematic of the flow of receipts: a time axis marked at min(τ, T) and max(τ, T), drawn separately for the cases τ < T (rival imitates) and τ ≥ T (rival precedes).]
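The expected net present value of the entrepreneur's returns, with a market interest rate i, as a consequence of the decision to choose an introduction time $\tau$ is
$$U(\tau) = \int_0^\infty E\{e^{-(i-\gamma)t}\, r(t; \tau, T)\}\,dt + E\{e^{-(i-\gamma)\max(\tau, T)}\, P(\tau, T)\}$$
$$= \int_0^\tau e^{-(i-\gamma)t}\{r_0 \bar H(t) + r_1 H(t)\}\,dt + p_0 \int_\tau^\infty e^{-(i-\gamma)t}\,\bar H(t)\,dt + e^{-(i-\gamma)\tau}\int_0^\tau P_1(\tau - t)\,dH(t) + \int_\tau^\infty e^{-(i-\gamma)t}\,P_0(t - \tau)\,dH(t), \qquad (4.1)$$
where the rate $r(t; \tau, T)$ applies only up to time $\max(\tau, T)$, later receipts being capitalized in P. The optimal introduction time $\tau^*$ is of course the solution which maximizes the expected value of profit
$$V(\tau) = U(\tau) - C(\tau). \qquad (4.2)$$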
While $\tau^* = 0$ can be ruled out by taking C(0) to be sufficiently large, it is possible to have $\tau^* = \infty$ (best not to undertake development at all) depending on the relative values of the economic parameters. In the remaining cases there is a finite economically best introduction time. It is usual, but not necessary, to have $p_0 \ge r_0 \ge r_1$ and $P_0' \ge 0$, $P_1' \le 0$, which are easily interpreted: (i) rival precedence, should it occur, does not increase the rate of return from the old good, which further increases if the entrepreneur succeeds first; (ii) in the post-innovation-cum-imitation period, the greater is the lag of rival entry, if we succeed first (the greater is the lag in our following, if the rivals succeed first), the greater (the smaller) is our return from the remaining market. Various special cases may occur within these constraints, e.g., rivals' early success may make our current good obsolete ($r_1 = 0$); or the entrepreneur may be a new entrant with no current good to be replaced ($r_0 = r_1 = 0$). The sensitivity of the optimal introduction time to these and other parameters in the model is of obvious economic interest and is easily derived (Kamien and Schwartz, 1982).
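The optimization of expected profit can be made concrete in a special case. The sketch below assumes exponentially distributed rival entry, $P(T > t) = e^{-ht}$, constant capitalized values $P_0 > P_1$, and an exponentially decreasing cost function; under these assumptions each term of the expected return has a closed form. All numerical parameter values, the cost function and the function names are hypothetical and serve only to show how a finite optimal introduction time can arise.

```python
import math

# Hypothetical parameter values, chosen only for illustration.
interest, growth = 0.10, 0.02      # market interest rate i and growth rate gamma
h = 0.2                            # rivalry intensity: P(T > t) = exp(-h t)
r0, r1, p0 = 1.0, 0.5, 3.0         # receipt rates before and after first entry
P0, P1 = 20.0, 8.0                 # constant capitalized values, P0 > P1


def cost(tau):
    """Assumed development cost: large at tau = 0, convex and decreasing."""
    return 60.0 * math.exp(-0.5 * tau)


def expected_return(tau):
    """Closed form of the expected return for exponential rival entry and constant P0, P1."""
    d = interest - growth
    a = d + h
    e_a, e_d, e_h = math.exp(-a * tau), math.exp(-d * tau), math.exp(-h * tau)
    before_entry = r0 * (1 - e_a) / a + r1 * ((1 - e_d) / d - (1 - e_a) / a)
    monopoly = p0 * e_a / a                 # we innovate first, rivals not yet in
    as_follower = e_d * P1 * (1 - e_h)      # rivals preceded us
    as_leader = P0 * h * e_a / a            # rivals imitate after us
    return before_entry + monopoly + as_follower + as_leader


def profit(tau):
    """Expected profit: expected return minus development cost."""
    return expected_return(tau) - cost(tau)


grid = [0.01 * k for k in range(5001)]      # candidate introduction times in [0, 50]
tau_star = max(grid, key=profit)            # crude grid search for the maximizer
```

A grid search is used instead of solving a first-order condition so that the sketch does not depend on differentiability of the assumed cost function.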
4.3. Intensity of rivalry as a reliability idea and its implications. What interests us more is how the speed of development, as reflected by the economic $\tau^*$, is affected by the extent of innovational rivalry which is built into the rivals' introduction time distribution H. Kamien and Schwartz (1982) postulate
$$\bar H(t) := P(T > t) = e^{-h\Lambda(t)} \qquad (4.3)$$
and propose h > 0 as a degree of innovational hazard. To avoid confusion with the notion of hazard in reliability theory, we call h the intensity of innovational rivalry. Setting $F(t) = 1 - e^{-\Lambda(t)}$, it is clear that
$$\bar H(t) = \bar F^{\,h}(t),$$
i.e., the rival introduction time d.f. H belongs to a family of distributions with proportional hazards, which are of considerable interest in reliability. We may think of F as the distribution of rivals' development time under unit rivalry (h = 1) for judging how fast the rivals may complete development as indicated by H. Since the hazard function $\Lambda_H(t) := -\ln \bar H(t)$ is a measure of time-varying innovational risk of rival pre-emption, the proportional hazards hypothesis $\Lambda_H(t) = h\Lambda(t)$ in (4.3) says the effects of time and rivalry on the entrepreneur's innovational hazards are separable and multiplicative. If F has a density and correspondingly a hazard rate (i.e., 'failure rate') $\lambda(t)$, then so does H, with failure rate $h\lambda(t)$. It is the innovational rate of hazard at time t from the viewpoint of our entrepreneur; and by the standard reliability theoretic interpretation of failure rates, the conditional probability of rivals' completion soon after t, given completion has not occurred within time t, is
$$P(T \le t + \delta \mid T > t) = h\delta\lambda(t) + o(\delta).$$
As the intensity of rivalry increases by a factor from h to ch, this probability, for each fixed t and small $\delta$, also increases essentially by the same factor c.
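The proportional effect of rivalry on the short-run pre-emption probability can be seen in a small numerical sketch. It assumes a Weibull-type baseline cumulative hazard $\Lambda(t) = t^{\alpha}$ purely as an example; the parameter values and function names are hypothetical.

```python
import math


def cumulative_hazard(t, alpha=2.0):
    """Assumed baseline cumulative hazard Lambda(t) = t**alpha."""
    return t ** alpha


def hazard_rate(t, alpha=2.0):
    """Baseline failure rate lambda(t), the derivative of Lambda(t)."""
    return alpha * t ** (alpha - 1.0)


def preemption_probability(t, delta, h, alpha=2.0):
    """P(T <= t + delta | T > t) when P(T > t) = exp(-h * Lambda(t))."""
    increment = cumulative_hazard(t + delta, alpha) - cumulative_hazard(t, alpha)
    return -math.expm1(-h * increment)      # 1 - exp(-h * increment), computed stably


t, delta, h, c = 1.0, 1e-4, 0.5, 3.0        # illustrative values
base = preemption_probability(t, delta, h)
scaled = preemption_probability(t, delta, c * h)
first_order = h * delta * hazard_rate(t)    # the leading term h * delta * lambda(t)
```

For small `delta` the ratio `scaled / base` is close to `c`, and `base` is close to the first-order term.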
To examine the effect of the intensity of rivalry on the speed of development, assume that having imitators is preferable to being one ($P_0 > P_1$) and that the corresponding rewards are independent of the 'innovation-imitation lag' ($P_1' = P_0' = 0$) as a simplifying assumption. By (4.1) and (4.2), the optimal introduction time $\tau^*$ is then the implicit solution of
$$e^{-(i-\gamma)\tau}\left[\{r_0 - p_0 + h(P_1 - P_0)\lambda(\tau)\}\,\bar H(\tau) + \{r_1 - (i - \gamma)P_1\}\,H(\tau)\right] - C'(\tau) = 0, \qquad (4.4)$$
satisfying the second derivative condition for a maximum at $\tau^*$. (4.4) defines $\tau^* = \tau^*(h)$ implicitly as a function of the rivalry intensity. Kamien and Schwartz (1982) show that if
$$\lambda(t) \uparrow \quad \text{and} \quad \lambda(t)/\Lambda(t) \downarrow \quad \text{in } t, \qquad (4.5)$$
then either (i) $\tau^*(h) \uparrow$ or (ii) $\tau^*(h)$ is initially $\downarrow$ and then $\uparrow$ in h. The crux of their argument is the following. If $\tau_0(h)$ is implicitly defined by the equation
$$\lambda(\tau)\{\Lambda^{-1}(\tau) - h\} = \{p_0 - r_0 + r_1 - (i - \gamma)P_1\}/(P_0 - P_1), \qquad (4.6)$$
i.e., the condition for the left hand side of (4.4) to have a local extremum as a function of h, then $\tau^*(h)$ is decreasing, stationary or increasing in h according as $\tau^*(h) <$, $=$ or $> \tau_0(h)$. Accordingly, since (4.5) implies that $\tau_0(h)$ is decreasing in h, either $\tau^*(h)$ behaves according to one of the two possibilities mentioned, or (iii) $\tau^*(h) < \tau_0(h)$ for all $h \ge 0$. The last possibility can be ruled out by the continuity of $V = V(\tau, h)$ in (4.2), $V(0, h) < 0$, $V(\tau^*, h) > 0$ and the condition $P_0 > P_1$. Which one of the two possibilities obtains of course depends on the model parameters. In case (i), the optimal introduction time $\tau^*(h)$ increases with increasing rivalry, and the absence of rivalry (h = 0) yields the smallest such optimal introduction time. The other case (ii), that depending on the rates of return and other relevant parameters there may be an intermediate degree of rivalry for which the optimal development is quickest possible, is certainly not
obvious a-priori and highlights the non-intuitive effects of rivalry on decisions to
4.4. Further reliability ramifications. From a reliability point of view, Kamien and Schwartz's assumption (4.5) says
$$F \in \{\mathrm{IFR}\} \cap \mathcal{L}, \qquad (4.7)$$
and hence so does H, where $\mathcal{L}$ is the set of life distributions with a log-concave hazard function. The IFR hypothesis is easy to interpret. It says that the composite rivals' residual time to development is stochastically decreasing, so that if they have not succeeded so far, then completion of their development within any additional deadline becomes more and more likely with elapsed time. This reflects the accumulation of efforts positively reinforcing the chances of success in future. The other condition, that F, and thus H, also has a log-concave hazard function, is less amenable to such interpretation; it essentially restricts the way in which the time-dependent component of the entrepreneur's innovational hazard from competing rivals grows with time t.
The proportional hazards model (4.3) can accommodate different configurations of market structure as special cases, an argument clearly in its favor. By (4.3), as $h \to 0$, $P(T > t) \to 1$ for all t > 0, and in the limiting case T is an improper r.v. with all its mass at infinity. Thus h = 0 corresponds to absence of rivalry. Similarly, as $h \to \infty$, $P(T > t) \to 0$ for all t > 0; in the limit the composite rivals' appearance is immediate, and this prevents the possibility of entrepreneurial precedence. If our entrepreneur had a head start with no rivals until a later time when rivals appear with a very large h, then even if our entrepreneur innovates first, his supernormal profits from innovation will very quickly be eliminated by rival imitation with high probability within a very short time as a consequence of high rivalry intensity h, which shrinks to instantaneous imitation as h approaches infinity. In this sense the case $h = \infty$ reflects the traditional economists' dream of 'perfect competition'. Among the remaining possibilities $0 < h < \infty$ that reflect more realism, Barzel (1968) distinguishes between moderate and intense rivalry, the latter corresponding to the situation when the intensity of rivalry exceeds the market growth rate ($h > \gamma$). If rivalry is sufficiently intense, no development becomes best ($h \gg \gamma \Rightarrow \tau^*(h) = \infty$). In other cases, the intense rivalry and non-rivalrous solutions provide vividly contrasting benchmarks to understand the innovation process under varying degrees of moderate to intense rivalry.
Our modeling to illustrate the use of reliability ideas has been limited to a
relatively simplified situation. It is possible to introduce other variations and
features of realism such as modification of rivals' effort as a result of entrepreneur's early success, budget constraints, non-contractual development which
allows the option of stopping development under rival precedence, and game
theoretic formulations which incorporate technical uncertainty. There is now substantial literature on these various aspects of innovation as an economic process
(DasGupta and Stiglitz, 1980, 1980a; Kamien and Schwartz, 1968, 1971, 1972,
1974, 1975, 1982; Lee and Wilde, 1980; Loury, 1979). It appears to us that there
are many questions, interesting from a reliability application viewpoint which can
be profitably asked and would lead to a deeper understanding of the economics
of innovation. Even in the context of the present model which captures the
essence of the innovating process under risk of rivalry, there are many such
questions. For example, what kind of framework for R & D rivalry and market
mechanisms leads to the rival entry model (4.3)? Stochastic modeling of such
mechanisms would be of obvious interest. Note the exponential: $H(t) = e^{-ht}$,
$\lambda(t) = 1$; Weibull: $H(t) = e^{-ht^{\alpha}}$, $\lambda(t) = \alpha t^{\alpha - 1}$; and the extreme-value distributions:
$H(t) = \exp\{-h(e^{\alpha t} - 1)\}$, $\lambda(t) = \alpha e^{\alpha t}$ all satisfy (4.3) and (4.7), the latter for
A related open question is the following. Suppose the rival introduction time
satisfies (4.3) but its distribution F under unit rivalry (h = 1) is unknown. Under
what conditions, interesting from a reliability point of view with an appropriate
interpretation in the context of rivalry, does there exist a finite maximin introduction time $\tau^*(h)$ and what, if any, is a least favorable distribution $F^*$ of time
to rival entry? Such a pair $(\tau^*(h), F^*)$, for which
$$\max_{\tau} \min_{F} V(\tau, h; F) = \min_{F} \max_{\tau} V(\tau, h; F) = V(\tau^*(h), h; F^*),$$
would indicate the entrepreneur's best economic introduction time within any
specified regime of rivalry when he has only an incomplete knowledge of the
benchmark distribution F. Here $V(\tau, h; F)$ is the total expected reward (4.2) and
(4.1) under (4.3).
The proportional hazards model (4.3) aggregates all sources of rivalry, from
existing firms or potential new entrants. This is actually less of a criticism than
it appears because, in the entrepreneur's perception, only the distribution of composite rival entry time matters. It is possible to introduce technical uncertainty in
the model by recognizing that the effort, usually parametrized through cost,
required to successfully complete development is also subject to uncertainties
(Kamien and Schwartz, 1971). Suppose there are n competitors including our
entrepreneur, the rivals are independent, and let G(z) be the probability that any
rival completes development with an effort no more than z. If z(t) is the cumulative rival effort up to time t, then the probability that none of the rivals will
succeed by time t is
$$P(t) = \{1 - G(z(t))\}^{n-1}.$$
This leads to (4.3) with $F = G(z)$, $H = P$ and intensity $h = n - 1$, the number of
rivals. We note this provides one possible answer to the question of modeling
rivalry described by (4.3). What other alternative mechanisms can also lead to
(4.3)? If the effort distribution G has a 'failure rate' (intensity of effort) r(z), then
the innovational hazard function and rate are
$$\Lambda_H(t) = (n - 1) \int_0^{z(t)} r(u)\,du, \qquad \lambda_H(t) = (n - 1)\,z'(t)\,r(z(t)), \tag{4.8}$$
which show how technical uncertainty can generate market uncertainty. If our
entrepreneur's effort distribution is also G(z) and independent of the rivals, then
note the role of each player in the innovation game is symmetric and each faces
the hazard rate (4.8) since, from the perspective of each competitor, the other
(n - 1) rivals are i.i.d. and in series. It would clearly be desirable to remove the
i.i.d. assumption to reflect more realism, insofar as a rival's effort and
spending decisions are often dictated by those of others.
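The construction above can be checked numerically. The following is a minimal sketch under assumptions that are not in the text: the effort distribution G is taken to be Weibull with shape 1.5, and cumulative rival effort is taken to be linear, z(t) = 0.8 t. The function names are illustrative only. It computes P(t) and confirms that the derivative of the cumulative hazard -log P(t) agrees with (n - 1) z'(t) r(z(t)).

```python
import math

SHAPE = 1.5   # assumed Weibull shape for the effort distribution G
C = 0.8       # assumed constant spending rate, so z(t) = C * t


def G(z):
    """Illustrative effort distribution: Weibull cdf (an assumption)."""
    return 1.0 - math.exp(-z ** SHAPE)


def r(z):
    """Failure rate ('intensity of effort') of the Weibull G above."""
    return SHAPE * z ** (SHAPE - 1.0)


def z(t):
    """Hypothetical cumulative rival effort up to time t."""
    return C * t


def survival_no_rival(t, n):
    """P(t): probability that none of the n - 1 independent rivals succeeds by t."""
    return (1.0 - G(z(t))) ** (n - 1)


def cumulative_hazard(t, n):
    """Innovational hazard function, -log P(t)."""
    return -math.log(survival_no_rival(t, n))


def hazard_rate(t, n):
    """Innovational hazard rate (n - 1) z'(t) r(z(t)); here z'(t) = C."""
    return (n - 1) * C * r(z(t))


def numerical_hazard_rate(t, n, h=1e-5):
    """Central-difference derivative of the cumulative hazard."""
    return (cumulative_hazard(t + h, n) - cumulative_hazard(t - h, n)) / (2.0 * h)
```

Under these assumptions the cumulative hazard is proportional to the number of rivals n - 1, which is the proportional hazards structure of (4.3).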
Some of the effects of an innovation may be irreversible. Computers and
information processing technology, which have now begun to affect every facet of
human life, are clearly a case in point. Are these impacts or their possible irreversibility best for the whole society? None of the above formulations can address this
issue, a question not in the purview of economists and quantitative modeling
alone; nor do they dispute their relevance. What they can and do provide is an
understanding of the structure and evolution of the innovating process as a risky
enterprise, and it is here that reliability ideas may be able to play a more significant
role than hitherto in explaining rivalry and its impacts on the economics of
innovation. In turn the measurable parameters of such models and their consequences can then serve as signposts for an informed debate on the wider
questions of social relevance of an innovation.
References
Arrow, K. J. (1951). Social Choice and Individual Values. Wiley, New York.
Arrow, K. J. (1962). Economic welfare and the allocation of resources for invention. In: R. R. Nelson,
ed., The Rate and Direction of Inventive Activity. Princeton University Press, Princeton, NJ.
Barlow, R. E. and Campo, R. (1975). Total time on test processes and applications to failure data
analysis. In: R. E. Barlow, J. Fussell and N. D. Singpurwalla, eds., Reliability and Fault Tree
Analysis, SIAM, Philadelphia, PA, 451-481.
Barlow, R. E. and Saboia, J. L. M. (1973). Bounds and inequalities in the rate of population growth.
In: F. Proschan and R. J. Serfling, eds., Reliability and Biometry, Statistical Analysis of Lifelengths,
SIAM, Philadelphia, PA, 129-162.
Barlow, R. E. and Proschan, F. (1965). Mathematical Theory of Reliability. Wiley, New York.
Barlow, R. E. and Proschan, F. (1975). Statistical Theory of Reliability and Life Testing: Probability
Models. Holt, Rinehart and Winston, New York.
Barzel, Y. (1968). Optimal timing of innovation. Review of Economics and Statistics 50, 348-355.
Bergmann, R. and Stoyan, D. (1976). On exponential bound for the waiting time distribution in
GI/G/1. J. Appl. Prob. 13(2), 411-417.
Bhattacharjee, M. C. and Krishnaji, N. (1985). DFR and other heavy tail properties in modeling the
distribution of land and some alternative measures of inequality. In: J. K. Ghosh, ed., Statistics:
Applications and New Directions, Indian Statistical Institute, Eka Press, Calcutta; 100-115.
Bhattacharjee, M. C. (1986). Tail behaviour of age-smooth failure distribution and applications. In:
A. P. Basu, ed., Reliability and Statistical Quality Control, North-Holland, Amsterdam, 69-86.
Bhattacharjee, M. C. (1986a). On using Reliability Concepts to Model Aggregate Inequality of Distributions. Technical Report, Dept. of Mathematics, University of Arizona, Tucson.
Brams, S. J., Lucas, W. F. and Straffin, P. D., Jr. (eds.) (1978). Political and Related Models. Modules
in Applied Mathematics: Vol. 2, Springer, New York.
Chandra, M. and Singpurwalla, N. D. (1981). Relationships between some notions which are
common to reliability and economics. Mathematics of Operations Research 6, 113-121.
Daley, D. (ed.) (1983). Stochastic Comparison Methods for Queues and Other Processes. Wiley, New
York.
Deegan, J., Jr. and Packel, E. W. (1978). To the (Minimal Winning) Victors go the (Equally Divided)
Spoils: A New Power Index for Simple n-Person Games. In: S. J. Brams, W. F. Lucas and P.
D. Straffin, Jr. (eds.): Political and Related Models. Springer-Verlag, New York, 239-255.
DasGupta, P. and Stiglitz, J. (1980). Industrial structure and the nature of innovative activity.
Economic Journal 90, 266-293.
DasGupta, P. and Stiglitz, J. (1980a). Uncertainty, industrial structure and the speed of R& D. Bell
Journal of Economics 11, 1-28.
Feller, W. (1966). Introduction to Probability Theory and Applications. 2nd ed. Wiley, New York.
Gail, M. H. and Gastwirth, J. L. (1978). A scale-free goodness-of-fit test for the exponential distribution based on the Lorenz curve. J. Amer. Statist. Assoc. 73, 787-793.
Galbraith, J. K. (1952). American Capitalism. Houghton and Mifflin, Boston.
Goldie, C. M. (1977). Convergence theorems for empirical Lorenz curves and their inverses. Advances
in Appl. Prob. 9, 765-791.
Guess, F., Hollander, M. and Proschan, F. (1983). Testing whether Mean Residual Life Changes Trend.
FSU Technical Report #M665, Dept. of Statistics, Florida State University, Tallahassee.
Hicks, J. R. (1932). The Theory of Wages. Macmillan, London.
Hollander, M. and Proschan, F. (1975). Tests for the mean residual life. Biometrika 62, 585-593.
Kamien, M. and Schwartz, N. (1968). Optimal induced technical change. Econometrica 36, 1-17.
Kamien, M. and Schwartz, N. (1971). Expenditure patterns for risky R & D projects. J. Appl. Prob.
8, 60-73.
Kamien, M. and Schwartz, N. (1972). Timing of innovations under rivalry. Econometrica 40, 43-60.
Kamien, M. and Schwartz, N. (1974). Risky R & D with rivalry. Annals of Economic and Social
Measurement 3, 276-277.
Kamien, M. and Schwartz, N. (1975). Market structure and innovative activity: A survey. J.
Economic Literature 13, 1-37.
Kamien, M. and Schwartz, N. (1982). Market Structure and Innovation. Cambridge University Press,
Klefsjö, B. (1982). The HNBUE and HNWUE class of life distributions. Naval Res. Logist. Qrtly. 29,
Klefsjö, B. (1983). Testing exponentiality against HNBUE. Scandinavian J. Statist. 10, 65-75.
Klefsjö, B. (1984). Reliability interpretations of some concepts from economics. Naval Res. Logist.
Qrtly. 31, 301-308.
Kleinrock, L. (1975). Queueing Systems, Vol. 1. Theory. Wiley, New York.
Köllerström, J. (1976). Stochastic bounds for the single server queue. Math. Proc. Cambridge Phil.
Soc. 80, 521-525.
Lucas, W. F. (1978). Measuring power in weighted voting systems. In: S. J. Brams, W. F. Lucas
and P. D. Straffin, Jr., eds., Political Science and Related Models. Springer, New York, 183-238.
Lee, T. and Wilde, L. (1980). Market structure and innovation: A reformulation. Qrtly. J. of
Economics 94, 429-436.
Loury, G. C. (1979). Market structure and innovation. Qrtly. J. of Economics XCIII, 395-410.
Macdonald, J. B. and Ransom, M. R. (1979). Functional forms, estimation techniques and the
distribution of income. Econometrica 47, 1513-1525.
Mukherjee, V. (1967). Type III distribution and its stochastic evolution in the context of distribution
of income, landholdings and other economic variables. Sankhyā A 29, 405-416.
Owen, G. (1982). Game Theory. 2nd edition. Academic Press, New York.
Pechlivanides, P. M. (1975). Social Choice and Coherent Structures. Unpublished Tech. Report # ORC
75-14, Operations Research Center, University of California, Berkeley,
Rae, D. (1979). Decision rules and individual values in constitutional choice. American Political
Science Review 63.
Ramamurthy, K. G. and Parthasarathy, T. (1983). A note on factorization of simple games. Opsearch
20(3), 170-174.
Ramamurthy, K. G. and Parthasarathy, T. (1984). Probabilistic implications of the assumption of
homogeneity in voting games. Opsearch 21(2), 81-91.
Salem, A. B. Z. and Mount, T. D. (1974). A convenient descriptive model of income distribution.
Econometrica 42, 1115-1127.
Schumpeter, J. A. (1961). Theory of Economic Development. Oxford University Press, New York.
Schumpeter, J. A. (1964). Business Cycles. McGraw-Hill, New York.
Schumpeter, J. A. (1975). Capitalism, Socialism and Democracy. Harper and Row, New York.
Seneta, E. (1976). Regularly Varying Functions. Lecture Notes in Math. 508, Springer, New York.
Straffin, P. D., Jr. (1978). Power indices in politics. In: S. J. Brams, W. F. Lucas and P. D. Straffin,
Jr., eds., Political Science and Related Models. Springer, New York, 256-321.
Straffin, P. D., Jr. (1978a). Probability models for power indices. In: P. C. Ordeshook, ed., Game
Theory and Political Science, University Press, New York.
Taillie, C. (1981). Lorenz ordering within the generalized gamma family of income distributions. In:
C. Taillie, P. P. Ganapati and B. A. Baldessari, eds., Statistical Distributions in Scientific Work. Vol.
6. Reidel, Dordrecht/Boston, 181-192.
Taussig, F. W. (1915). Innovation and Money Makers. McMillan, New York.
P. R. Krishnaiah and C. R. Rao, eds., Handbook of Statistics, Vol. 7
© Elsevier Science Publishers B.V. (1988) 215-224
Mean Residual Life: Theory and Applications*
Frank Guess and Frank Proschan
1. Introduction and summary
The mean residual life (MRL) has been used as far back as the third century
A.D. (cf. Deevey (1947) and Chiang (1968)). In the last two decades, however,
reliabilists, statisticians, and others have shown intensified interest in the MRL
and derived many useful results concerning it. Given that a unit is of age t, the
remaining life after time t is random. The expected value of this random residual
life is called the mean residual life at time t. Since the MRL is defined for each
time t, we also speak of the MRL function. (See Section 2 for a more formal
definition.)
The MRL function is like the density function, the moment generating function,
or the characteristic function: for a distribution with a finite mean, the MRL
completely determines the distribution via an inversion formula (e.g., see Cox
(1962), Kotz and Shanbhag (1980), and Hall and Wellner (1981)). Hall and
Wellner (1981) and Bhattacharjee (1982) derive necessary and sufficient conditions for an arbitrary function to be an MRL function. These authors recommend
the use of the MRL as a helpful tool in model building.
Not only is the MRL used for parametric modeling but also for nonparametric
modeling. Hall and Wellner (1981) discuss parametric uses of the MRL. Large
nonparametric classes of life distributions such as decreasing mean residual life
(DMRL) and new better than used in expectation (NBUE) have been defined
using MRL. Barlow, Marshall and Proschan (1963) note that the DMRL class
is a natural one in reliability. Brown (1983) studies the problem of approximating
increasing mean residual life (IMRL) distributions by exponential distributions.
He mentions that certain IMRL distributions, '... arise naturally in a class of first
passage time distributions for Markov processes, as first illuminated by Keilson'.
See Barlow and Proschan (1965) and Hollander and Proschan (1984) for further
comments on the nonparametric use of MRL.
A fascinating aspect about MRL is its tremendous range of applications. For
example, Watson and Wells (1961) use MRL in studying burn-in. Kuo (1984)
presents further references on MRL and burn-in in his Appendix 1, as well as a
brief history on research in burn-in.
* Research sponsored by the Air Force Office of Scientific Research, AFSC, USAF, under Grant
AFOSR 85-C-0007.
Actuaries apply MRL to setting rates and benefits for life insurance. In the
biomedical setting researchers analyze survivorship studies by MRL. See Elandt-Johnson and Johnson (1980) and Gross and Clark (1975).
Morrison (1978) mentions IMRL distributions have been found useful as
models in the social sciences for the lifelengths of wars and strikes. Bhattacharjee
(1982) observes MRL functions occur naturally in other areas such as optimal
disposal of an asset, renewal theory, dynamic programming, and branching processes.
In Section 2 we define more formally the MRL function and survey some of
the key theory. In Section 3 we discuss further its wide range of applications.
2. Theory of mean residual life
Let F be a life distribution (i.e., F(t) = 0 for t < 0) with a finite first moment.
Let $\bar F(t) = 1 - F(t)$. X is the random life with distribution F. The mean residual
life function is defined as
$$m(t) = \begin{cases} E[X - t \mid X > t] & \text{for } \bar F(t) > 0, \\ 0 & \text{for } \bar F(t) = 0, \end{cases}$$
for $t \ge 0$. Note that we can express
$$m(t) = \int_0^\infty \frac{\bar F(x + t)}{\bar F(t)}\,dx = \int_t^\infty \frac{\bar F(u)}{\bar F(t)}\,du$$
when $\bar F(t) > 0$. If F also has a density f we can write
$$m(t) = \int_t^\infty u f(u)\,du \Big/ \bar F(t) - t.$$
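The integral expression for m(t) lends itself to direct numerical evaluation. The sketch below is one hedged way to do this with the trapezoid rule; the function name `mrl`, the truncation point `upper`, and the two example distributions (an exponential with rate 0.5 and a Weibull with shape 2) are illustrative assumptions rather than anything prescribed in the text.

```python
import math


def mrl(survival, t, upper=50.0, steps=20000):
    """Approximate m(t) = integral_t^inf survival(u) du / survival(t).

    The infinite integral is truncated at `upper`, which must be chosen
    large enough that the neglected tail is negligible for the
    distribution at hand; the integral is computed by the trapezoid rule.
    """
    s_t = survival(t)
    if s_t <= 0.0:
        return 0.0  # convention m(t) = 0 when the survival function vanishes
    h = (upper - t) / steps
    total = 0.5 * (survival(t) + survival(upper))
    for i in range(1, steps):
        total += survival(t + i * h)
    return total * h / s_t


def exp_survival(u):
    """Survival function of an exponential with rate 0.5 (mean 2)."""
    return math.exp(-0.5 * u)


def weibull_survival(u):
    """Survival function of a Weibull with shape 2 and scale 1."""
    return math.exp(-u * u)
```

For the exponential example the computed MRL stays at the mean for every t, while for the Weibull example it decreases in t, in line with the wear-out interpretation discussed later in this section.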
Like the failure rate function (recall that it is defined as $r(t) = f(t)/\bar F(t)$ when
$\bar F(t) > 0$), the MRL function is a conditional concept. Both functions are conditioned on survival to time t.
While the failure rate function at t provides information about a small interval
after time t ('just after t', see p. 10 of Barlow and Proschan (1965)), the MRL
function at t considers information about the whole interval after t ('all after t').
This intuition explains the difference between the two.
Note that it is possible for the MRL function to exist but for the failure rate
function not to exist (e.g., consider the standard Cantor ternary function, see
Chung (1974), p. 12). On the other hand, it is possible for the failure rate function
to exist but the MRL function not to exist (e.g., consider modifying the Cauchy
density to yield $f(t) = 2/[\pi(1 + t^2)]$ for $t \ge 0$). Both the MRL and the failure rate
functions are needed in theory and in practice.
When m and r both exist the following relationship holds between the two:
$$m'(t) = m(t)\,r(t) - 1. \tag{2.2}$$
See Watson and Wells (1961) for further comments on (2.2) and its uses.
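Relation (2.2) can be verified numerically on a distribution whose MRL is available in closed form. As an illustrative choice (not one made in the text), take the survival function exp(-t^2): its failure rate is r(t) = 2t and its MRL is m(t) = exp(t^2) (sqrt(pi)/2) erfc(t). The sketch compares a finite-difference derivative of m with m(t) r(t) - 1.

```python
import math


def m(t):
    """Closed-form MRL for the survival function exp(-t^2)."""
    return math.exp(t * t) * 0.5 * math.sqrt(math.pi) * math.erfc(t)


def r(t):
    """Failure rate of the same distribution: f(t) / survival(t) = 2t."""
    return 2.0 * t


def mrl_derivative(t, h=1e-5):
    """Central-difference approximation to m'(t)."""
    return (m(t + h) - m(t - h)) / (2.0 * h)


def rhs_of_2_2(t):
    """Right-hand side of (2.2): m(t) r(t) - 1."""
    return m(t) * r(t) - 1.0
```

Since r is increasing here, m'(t) is negative throughout, so this example is also a DMRL distribution.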
If the failure rate is a constant (> 0) the distribution is an exponential. If the
MRL is a constant (> 0) the distribution is also an exponential.
Let $\mu = E(X)$. If $F(0) = 0$ then $m(0) = \mu$. If $F(0) > 0$ then $m(0) = \mu/\bar F(0) > \mu$.
For simplicity in discussions and definitions in this section, we assume F(0) = 0.
Let F be right continuous (not necessarily continuous). Knowledge of the MRL
function completely determines the reliability function as follows:
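$$\bar F(t) = \begin{cases} \dfrac{m(0)}{m(t)} \exp\left(-\displaystyle\int_0^t \frac{du}{m(u)}\right) & \text{for } 0 \le t < F^{-1}(1), \\ 0 & \text{for } t \ge F^{-1}(1), \end{cases} \tag{2.3}$$
where $F^{-1}(1) \stackrel{\text{def}}{=} \sup\{t \mid F(t) < 1\}$.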
Cox (1962) assigns as an exercise the demonstration that MRL determines the
reliability. Meilijson (1972) gives an elegant, simple proof of (2.3). Kotz and
Shanbhag (1980) derive a generalized inversion formula for distributions that are
not necessarily life distributions. Hall and Wellner (1981) have an excellent discussion of (2.3) along with further references.
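The inversion formula (2.3) can be sketched in a few lines. The code below integrates 1/m(u) by the trapezoid rule and assumes m is strictly positive on [0, t]; the function names and the two test MRL functions are illustrative assumptions. A constant MRL should return the exponential survival function, and the linear MRL m(t) = 1 + 0.5 t should return the Pareto-type survival function (1 + 0.5 t)^(-3), which follows from evaluating (2.3) in closed form.

```python
import math


def survival_from_mrl(m, t, steps=2000):
    """Survival probability at t from an MRL function m via (2.3).

    Assumes m(u) > 0 for all u in [0, t]; the integral of 1/m(u) over
    [0, t] is approximated by the trapezoid rule with `steps` panels.
    """
    if t == 0.0:
        return 1.0
    h = t / steps
    integral = 0.5 * (1.0 / m(0.0) + 1.0 / m(t))
    for i in range(1, steps):
        integral += 1.0 / m(i * h)
    integral *= h
    return m(0.0) / m(t) * math.exp(-integral)


def constant_mrl(u):
    """Constant MRL equal to 2: the exponential with mean 2."""
    return 2.0


def linear_mrl(u):
    """Linearly increasing MRL 1 + 0.5 u (an illustrative choice)."""
    return 1.0 + 0.5 * u
```

The linear case illustrates how a simple shape for the MRL pins down a parametric family, which is the model-building use of the MRL mentioned in Section 1.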
A natural question to ask is: what functions are MRL functions? A characterization is possible which answers this. By a function f being increasing (decreasing) we mean that $x \le y$ implies $f(x) \le (\ge)\ f(y)$.
THEOREM 2.1. Consider the following conditions:
(i) $m: [0, \infty) \to [0, \infty)$.
(ii) $m(0) > 0$.
(iii) m is right continuous (not necessarily continuous).
(iv) $d(t) \stackrel{\text{def}}{=} m(t) + t$ is increasing on $[0, \infty)$.
(v) When there exists $t_0$ such that $m(t_0^-) = \lim_{t \uparrow t_0} m(t) = 0$, then $m(t) = 0$
holds for $t \in [t_0, \infty)$. Otherwise, when there does not exist such a $t_0$ with
$m(t_0^-) = 0$, then $\int_0^\infty 1/m(u)\,du = \infty$ holds.
A function m satisfies (i)-(v) if and only if m is the MRL function of a life distribution that is not degenerate at 0.
See Hall and Wellner (1981) for a proof. See Bhattacharjee (1982) for another
characterization. Note that condition (ii) rules out the distribution degenerate at 0.
For (iv) note that d(t) is simply the expected time of death (failure) given that a
unit has survived to time t. Theorem 2.1 delineates which functions can serve as
MRL functions, and hence, provides models for lifelengths.
We restate several bounds involving MRL from Hall and Wellner (1981). Recall
$a^+ = a$ if $a \ge 0$, otherwise $a^+ = 0$.
THEOREM 2.2. Let F be nondegenerate. Let $\mu_r = E X^r \le \infty$ for $r > 1$.
(i) $m(t) \le (F^{-1}(1) - t)^+$ for all t. Equality holds if and only if $F(t) = F((F^{-1}(1))^-)$ or 1.
(ii) $m(t) \le (\mu/\bar F(t)) - t$ for all t. Equality holds if and only if $F(t) = 0$.
(iii) $m(t) < (\mu_r/\bar F(t))^{1/r} - t$ for all t.
(iv) $m(t) \ge (\mu - t)^+/\bar F(t)$ for $t < F^{-1}(1)$. Equality holds if and only if $F(t) = 0$.
(v) $m(t) > [\mu - F(t)(\mu_r/F(t))^{1/r}]/\bar F(t) - t$ for $t < F^{-1}(1)$.
(vi) $m(t) \ge (\mu - t)^+$ for all t. Equality holds if and only if $F(t) = 0$ or 1.
Various nonparametric classes of life distributions have been defined using
MRL. (Recall, for simplicity we assume F(0) = 0 and the mean is finite for these
definitions.)
DEFINITION 2.3. DMRL. A life distribution F has decreasing mean residual life
if its MRL m is a decreasing function.
DEFINITION 2.4. NBUE. A life distribution F is new better than used in expectation if $m(0) \ge m(t)$ for all $t \ge 0$.
DEFINITION 2.5. IDMRL. A life distribution F has increasing then decreasing
mean residual life if there exists $\tau \ge 0$ such that m is increasing on $[0, \tau)$ and
decreasing on $[\tau, \infty)$.
Each of these classes above has an obvious dual class associated with it, i.e.,
increasing mean residual life, new worse than used in expectation (NWUE), and
decreasing then increasing mean residual life (DIMRL), respectively.
The DMRL class models aging that is adverse (e.g., wearing occurs). Barlow,
Marshall and Proschan (1963) note that the DMRL class is a natural one in
reliability. See also Barlow and Proschan (1965). The older a DMRL unit is, the
shorter is the remaining life on the average. Chen, Hollander and Langberg (1983)
contains an excellent discussion of the uses of the DMRL class.
Burn-in procedures are needed for units with IMRL. E.g., integrated circuits
have been observed empirically to have decreasing failure rates; and thus they
satisfy the less restrictive condition of IMRL. Investigating job mobility, social
scientists refer to IMRL as inertia. See Morrison (1978) for example. Brown
(1983) studies approximating IMRL distributions by exponentials. He comments
that certain IMRL distributions, '... arise naturally in a class of first passage time
distributions for Markov processes, as first illuminated by Keilson'.
Note that DMRL implies NBUE. The NBUE class is a broader and less
restrictive class. Hall and Wellner (1981) show for NBUE distributions that the
coefficient of variation $\sigma/\mu \le 1$, where $\sigma^2 = \mathrm{Var}(X)$. They also comment on the
use of NBUE in renewal theory. Bhattacharjee (1984b) discusses a new notion,
age-smoothness, and its relation to NBUE for choosing life distribution models
for equipment subject to eventual wear. Note that burn-in is appropriate for
NWUE units.
For relationships of DMRL, IMRL, NBUE, and NWUE with other classes
used in reliability see the survey paper Hollander and Proschan (1984).
The IDMRL class models aging that is initially beneficial, then adverse. Situations where it is reasonable to postulate an IDMRL model include:
(i) Length of time employees stay with certain companies: An employee with a
company for four years has more time and career invested in the company than
an employee of only two months. The MRL of the four-year employee is likely
to be longer than the MRL of the two-month employee. After this initial IMRL
(this is called 'inertia' by social scientists), the processes of aging and retirement
yield a DMRL period.
(ii) Life lengths of humans: High infant mortality explains the initial IMRL.
Deterioration and aging explain the later DMRL stage.
See Guess (1984) and Guess, Hollander, and Proschan (1983) for further
examples and discussion. Bhattacharjee (1983) comments that Gertsbakh and
Kordonskiy (1969) graph the MRL function of a lognormal distribution that has
a 'bath-tub' shaped MRL (i.e., DIMRL).
Hall and Wellner (1981) characterize distributions with MRL's that have linear
segments. They use this characterization as a tool for choosing parametric
models. Morrison (1978) investigates linearly IMRL. He states and proves that
if F is a mixture of exponentials then F has linearly IMRL if and only if the mixing
distribution, say G, is a gamma. Howell (1984) studies and lists other references
on linearly DMRL.
In renewal theory MRL arises naturally also. For a renewal process with
underlying distribution F, let $G(t) = (\int_0^t \bar F(u)\,du)/\mu$. G is the limiting distribution
of both the forward and the backward recurrence times. See Cox (1962) for more
details. Also if the renewal process is in equilibrium then G is the exact distribution of the recurrence times. Moreover, $\bar G(t) = 1 - G(t) = (m(t)\bar F(t))/\mu$. The failure rate of G, $r_G$, is
inversely related to the MRL of F, $m_F$. I.e., $r_G(t) = 1/m_F(t)$. Note, however, that
$r_F(t) \ne 1/m_F(t)$ is usually the case. See Hall and Wellner (1981), Rolski (1975),
Meilijson (1972), and Watson and Wells (1961) for related discussions.
Kotz and Shanbhag (1980) establish a stability result concerning convergence
of an arbitrary sequence of MRL functions to a limiting MRL function. (See also
Bhattacharjee (1982).) They show an analogous stability result for hazard
measures. (When the failure rate for F exists and $\nu_F$ is F's hazard measure, then
$\nu_F(B) = \int_B r_F(t)\,dt$ for B a Borel set.) Their results imply that MRL functions can
provide more stable and reliable information than hazard measures when
assessing noncontinuous distributions from data.
In a multivariate setting, Lee (1985) shows the effect of dependence by total
positivity on MRL functions.
3. Applications of mean residual life
A mean is easy to calculate and explain to a person not necessarily skilled in
statistics. To calculate the empirical MRL function, one does not need calculus.
Details of computing the empirical MRL follow.
Let $X_1, X_2, \ldots, X_n$ be a random sample from F. For simpler initial notation,
we assume first no ties. Later we allow for ties. Order the observations as
$$X_{1n} < X_{2n} < \cdots < X_{nn}.$$
Let $X_{0n} = 0$. The empirical MRL function is defined as
$$m_n(t) = \frac{1}{n - k} \sum_{i=k+1}^{n} (X_{in} - t) \quad \text{for } t \in [X_{kn}, X_{(k+1)n}), \tag{3.2}$$
and $k = 0, 1, \ldots, n - 1$; $m_n(t) = 0$ for $t \ge X_{nn}$.
Note that (3.2) is simply
$$m_n(t) = \frac{\text{Total time on test observed after } t}{\text{Number of units observed after } t}. \tag{3.3}$$
The empirical MRL function at 0, $m_n(0) = \bar X_n = (\sum_{i=1}^{n} X_i)/n$, is just the usual
sample mean when no unit fails at time 0. If a unit fails at 0 then $m_n(0) > \bar X_n$.
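If ties exist let
$$0 = \tilde X_{0l} < \tilde X_{1l} < \tilde X_{2l} < \cdots < \tilde X_{ll}$$
be the distinct ordered times of failure,
$$n_i = \text{number of observed failures at time } \tilde X_{il}, \qquad s_i = \sum_{j=i+1}^{l} n_j,$$
for $i = 0, 1, \ldots, l \le n$. Note that $n_i \ne 0$, $i = 1, \ldots, l$, while $n_0 = 0$ is allowed. Then
$$m_n(t) = \begin{cases} \dfrac{1}{s_k} \displaystyle\sum_{i=k+1}^{l} n_i (\tilde X_{il} - t) & \text{for } t \in [\tilde X_{kl}, \tilde X_{(k+1)l}), \\ 0 & \text{for } t \ge \tilde X_{ll}, \end{cases} \tag{3.6}$$
for $k = 0, 1, \ldots, l - 1$. Note that (3.6) is simply notation for (3.3).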
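The empirical MRL described above is straightforward to compute. The following minimal sketch implements the total-time-on-test ratio (3.3) directly, so that ties need no separate treatment: tied observations simply appear several times in the sum. The function name `empirical_mrl` and the small data set are made up for illustration and are not data from the text.

```python
from bisect import bisect_right


def empirical_mrl(sample, t):
    """Empirical MRL at time t.

    Returns (total time on test observed after t) divided by
    (number of units observed after t), and 0 when no observation
    exceeds t. Observations equal to t count as already failed.
    """
    ordered = sorted(sample)
    k = bisect_right(ordered, t)   # number of observations <= t
    survivors = ordered[k:]
    if not survivors:
        return 0.0
    return sum(x - t for x in survivors) / len(survivors)


# Made-up lifetimes containing one tie, for illustration only.
lifetimes = [2.0, 5.0, 5.0, 9.0]
```

At t = 0 the function returns the sample mean of the illustrative data, and at or beyond the largest observation it returns 0, matching the conventions stated above.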
We illustrate in the following example.
EXAMPLE 3.1. Bjerkedal (1960) studies the lifelengths of guinea pigs injected
with different amounts of tubercle bacilli. Guinea pigs are known to have a high
susceptibility to human tuberculosis, which is one reason for choosing this
species. We describe the only study (M) in which animals in a single cage are
under the same regimen. The regimen number is the common log of the number
of bacillary units in 0.5 ml of the challenge solution, e.g., regimen 4.3 corresponds
to $2.2 \times 10^4$ bacillary units per 0.5 ml ($\log_{10}(2.2 \times 10^4) = 4.342$). Table 3.1
presents the data from regimen 5.5 and the empirical MRL.
Table 3.1
Empirical mean residual life in days at the unique times of death for the 72 guinea pigs under
regimen 5.5. We include the empirical MRL at time 0 also.
[Table body not recoverable; only the column-heading fragments "Number of" and "Time of" survive.]
Graphs of MRL provide useful information not only for data analysis but also
for presentations. Commenting on fatigue longevity and on preventive maintenance, Gertsbakh and Kordonskiy (1969) recommend the MRL function as
another helpful tool in such analyses. They graph the MRL for different distributions (e.g., Weibull, lognormal, and gamma). Hall and Wellner (1979) graph the
empirical MRL for Bjerkedal's (1960) regimen 4.3 and regimen 6.6 data. Bryson
and Siddiqui (1969) illustrate the graphical use of the empirical MRL on survival
data from chronic granulocytic leukemia patients. Using the standard Kaplan-Meier estimator (e.g., see Lawless (1982), Nelson (1982), or Miller (1980)), Chen,
Hollander, and Langberg (1983) graph the empirical MRL analogue for censored
lifetime data.
Gertsbakh and Kordonskiy (1969) note that estimation of MRL is more stable
than estimation of the failure rate. Statistical properties of estimated means are
better than those of estimated derivatives (which enter into failure rates).
Yang (1978) shows that the empirical MRL is uniformly strongly consistent.
She establishes that $m_n$, suitably standardized, converges weakly to a Gaussian
process. Hall and Wellner (1979) require less restrictive conditions to apply these
results. They derive and illustrate the use of simultaneous confidence bands for
m. Yang (1978) comments that for t > 0, $m_n(t)$ is a slightly biased estimator.
Specifically, $E(m_n(t)) = m(t)(1 - F^n(t))$. Note, however, that $\lim_{n \to \infty} E(m_n(t)) = m(t)$.
Thus, for larger samples $m_n(t)$ is practically unbiased. See also Gertsbakh
and Kordonskiy (1969).
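The bias expression can be illustrated by simulation. The sketch below is a seeded Monte Carlo check under an assumption chosen purely for convenience: the data are unit exponential, so m(t) = 1 for every t and the expected value of the empirical MRL should be close to 1 - F(t)^n. The function names, the seed, and the number of replications are arbitrary illustrative choices.

```python
import math
import random


def empirical_mrl(sample, t):
    """Average residual life of the observations exceeding t (0 if none do)."""
    survivors = [x for x in sample if x > t]
    if not survivors:
        return 0.0
    return sum(x - t for x in survivors) / len(survivors)


def simulated_mean_of_mrl(n, t, reps, seed=12345):
    """Monte Carlo estimate of E(m_n(t)) for unit exponential samples of size n."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(reps):
        sample = [rng.expovariate(1.0) for _ in range(n)]
        total += empirical_mrl(sample, t)
    return total / reps


def theoretical_mean_of_mrl(n, t):
    """m(t) * (1 - F(t)^n) for the unit exponential, where m(t) = 1."""
    f_t = 1.0 - math.exp(-t)
    return 1.0 - f_t ** n
```

The factor 1 - F(t)^n is the probability that at least one observation exceeds t, which is why the bias disappears quickly as n grows.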
Yang (1977) studies estimation of the MRL function when the data are randomly censored. For parametric modeling Hall and Wellner (1981) use the empirical MRL plot. They observe that the empirical MRL function is a helpful addition
to other life data techniques, such as total time on test plots, empirical (cumulative) failure rate functions, etc. The MRL plot detects certain aspects of the
distribution more readily than other techniques. See Hall and Wellner (1981), Hall
and Wellner (1979), and Gertsbakh and Kordonskiy (1969) for further comments.
When a parametric approach seems inadvisable, the MRL function can still be
used as a nonparametric tool. Broad classes defined in terms of MRL allow a
more flexible approach while still incorporating preliminary information. For example, to describe a wear process, a DMRL is appropriate. When newly
developed components are initially produced, many may fail early (such early
failure is called infant mortality and this early stage is called the debugging stage).
Another subgroup tends to last longer. Depending on information about this latter
subgroup, we suggest IMRL (e.g., lifelengths of integrated circuits) or IDMRL
(e.g., more complicated systems where there are infant mortality, useful life, and
wear out stages).
Objective tests exist for these and other classes defined in terms of MRL. E.g.,
see Hollander and Proschan (1984) and Guess, Hollander and Proschan (1983).
To describe 'burn-in' the MRL is a natural function to use. Kuo's (1984)
Appendix 1 presents an excellent brief introduction to burn-in problems and
applications of MRL.
Actuaries apply MRL to setting rates and benefits for life insurance. In the
biomedical setting researchers analyze survivorship studies by MRL. For example,
see Elandt-Johnson and Johnson (1980) and Gross and Clark (1975).
Social scientists use IMRL for studies on job mobility, length of wars, duration
of strikes, etc. See Morrison (1978).
In economics MRL arises also. Bhattacharjee and Krishnaji (1981) present
applications of MRL for investigating landholding. Bhattacharjee (1984a) uses
NBUE for developing optimal inventory policies for perishable items with random
shelf life and variable supply.
Bhattacharjee (1982) observes MRL functions occur naturally in other areas
such as optimal disposal of an asset, renewal theory, dynamic programming, and
branching processes.
We thank Dr. J. Travis, Department of Biological Sciences, and Dr. D. Meeter,
Department of Statistics, Florida State University, for the Deevey (1947)
reference. We are also grateful to Dr. M. Bhattacharjee, Indian Institute of
Management, Calcutta, and to Dr. M. Hollander, Department of Statistics,
Florida State University, for discussions on MRL.
References
Barlow, R. E., Marshall, A. W. and Proschan, F. (1963). Properties of probability distributions with
monotone hazard rate. Ann. Math. Statist. 34, 375-389.
Barlow, R. E. and Proschan, F. (1965). Mathematical Theory of Reliability. Wiley, New York.
Barlow, R. E. and Proschan, F. (1981). Statistical Theory of Reliability and Life Testing. To Begin With,
Silver Springs, MD.
Bhattacharjee, M. C. (1984a). Ordering policies for perishable items with unknown shelf life/variable
supply distribution. Indian Institute of Management, Calcutta, Technical Report.
Bhattacharjee, M. C. (1984b). Tail behavior of age-smooth failure distributions and applications.
Indian Institute of Management, Calcutta, Technical Report.
Bhattacharjee, M. C. (1983). Personal communication.
Bhattacharjee, M. C. (1982). The class of mean residual lives and some consequences. SIAM J.
Algebraic Discrete Methods 3, 56-65.
Bhattacharjee, M. C. and Krishnaji, N. (1981). DFR and other heavy tail properties in modelling the
distribution of land and some alternative measures of inequality. Proceedings of the Indian Statistical Institute Golden Jubilee International Conference.
Bjerkedal, T. (1960). Acquisition of resistance in guinea pigs infected with different doses of virulent
tubercle bacilli. Amer. J. Hygiene 72, 130-148.
Brown, M. (1983). Approximating IMRL distributions by exponential distributions, with applications
to first passage times. Ann. Probab. 11, 419-427.
Bryson, M. C. and Siddiqui, M. M. (1969). Some criteria for aging. J. Amer. Statist. Assoc. 64,
Chen, Y. Y., Hollander, M. and Langberg, N. A. (1983). Tests for monotone mean residual life,
using randomly censored data. Biometrics 39, 119-127.
Chiang, C. L. (1968). Introduction to Stochastic Processes in Biostatistics. Wiley, New York.
Chung, K. L. (1974). A Course in Probability Theory, 2nd ed. Academic Press, New York.
Cox, D. R. (1962). Renewal Theory. Methuen, London.
Deevey, E. S. (1947). Life tables for natural populations of animals. Quarterly Review of Biology 22,
Elandt-Johnson, R. C. and Johnson, N. L. (1980). Survival Models and Data Analysis. Wiley, New
Gertsbakh, I. B. and Kordonskiy, K. B. (1969). Models of Failure. Springer, New York.
Gross, A. J. and Clark, V. A. (1975). Survival Distributions: Reliability Applications in the Biomedical
Sciences. Wiley, New York.
Guess, F. (1984). Testing whether mean residual life changes trend. Ph.D. dissertation, Department
of Statistics, Florida State University.
Guess, F., Hollander, M. and Proschan, F. (1983). Testing whether mean residual life changes trend.
Florida State University Department of Statistics Report M665. (Air Force Office of Scientific
Research Report 83-160).
Hall, W. J. and Wellner, J. A. (1979). Estimation of mean residual life. University of Rochester
Department of Statistics Technical Report.
Hall, W. J. and Wellner, J. A. (1981). Mean residual life. In: M. Csörgő, D. A. Dawson, J. N. K.
Rao and A. K. Md. E. Saleh, eds., Statistics and Related Topics, North-Holland, Amsterdam,
Hollander, M. and Proschan, F. (1984). Nonparametric concepts and methods in reliability. In: P.
R. Krishnaiah and P. K. Sen, eds., Handbook of Statistics, Vol. 4, Nonparametric Methods, North-Holland, Amsterdam.
Howell, I. P. S. (1984). Small sample studies for linear decreasing mean residual life. In: M. S.
Abdel-Hameed, J. Quinn and E. Çinlar, eds., Reliability Theory and Models, Academic Press, New
Keilson, J. (1979). Markov Chain Models--Rarity and Exponentiality. Springer, New York.
Kotz, S. and Shanbhag, D. N. (1980). Some new approaches to probability distributions. Adv. in
Appl. Probab. 12, 903-921.
Kuo, W. (1984). Reliability enhancement through optimal burn-in. IEEE Trans. Reliability 33,
Lawless, J. F. (1982). Statistical Models and Methods for Lifetime Data. Wiley, New York.
Lee, M. T. (1985). Dependence by total positivity. Ann. Probab. 13, 572-582.
Meilijson, I. (1972). Limiting properties for the mean residual lifetime function. Ann. Statist. 1,
Miller, R. G. (1981). Survival Analysis. Wiley, New York.
Morrison, D. G. (1978). On linearly increasing mean residual lifetimes. J. Appl. Probab. 15, 617-620.
Nelson, W. (1982). Applied Life Data Analysis. Wiley, New York.
Rolski, T. (1975). Mean residual life. Bulletin of the International Statistical Institute, Book 4 (Proceedings of the 40th Session), 266-270.
Swartz, G. B. (1973). The mean residual lifetime function. IEEE Trans. Reliability 22, 108-109.
Watson, G. S. and Wells, W. T. (1961). On the possibility of improving the mean useful life of items
by eliminating those with short lives. Technometrics 3, 281-298.
Yang, G. L. (1978). Estimation of a biometric function. Ann. Statist. 6, 112-116.
Yang, G. (1977). Life expectancy under random censorship. Stochastic Process. Appl. 6, 33-39.
P. R. Krishnaiah and C. R. Rao, eds., Handbook of Statistics, Vol. 7
© Elsevier Science Publishers B.V. (1988) 225-249
Life Distribution Models and Incomplete Data*
Richard E. Barlow and Frank Proschan
0. Introduction
In this paper our objective is to introduce life distribution models and to
discuss methods useful for analyzing failure data, especially incomplete data. We
show how to express the likelihood functions for general distributions and incomplete data. The likelihood function tends to be fairly flat for incomplete data.
For this reason the maximum likelihood estimator may be of limited value. It is
therefore especially important in this situation to assess a prior distribution for
parameters and plot the posterior distribution or its contours.
Inference based on the exponential model is discussed for general sampling
plans. Parameter estimators and credibility intervals are derived for special cases.
The Weibull distribution is a very useful model for life distribution studies and
also for the analysis of strength data. For these reasons, we describe failure
mechanisms leading to a Weibull life distribution model. Contour plotting methods
for analyzing life data based on a Weibull distribution are also given.
1. Likelihood
In this section we present a unified way of analyzing incomplete data for a large
number of failure distribution models. We often assume that the failure distribution $F$ is absolutely continuous with density $f$ and failure rate
$$r(x) = \frac{f(x)}{\bar F(x)}, \tag{1.1}$$
where $\bar F(x) = 1 - F(x)$. We call
$$R(x) = \int_0^x r(u)\,du \tag{1.2}$$
the hazard function associated with $F$. For general $F$, define
$$R(x) = -\ln \bar F(x), \tag{1.3}$$
so that
$$\bar F(x) = \exp[-R(x)]. \tag{1.4}$$
Note that when $F$ has a density $f$,
$$\frac{d}{dx}\left[-\ln \bar F(x)\right] = \frac{f(x)}{\bar F(x)} = r(x),$$
so that (1.2) and (1.3) agree in this case.
From (1.1) and (1.3) we see that
$$f(x) = r(x)\, e^{-R(x)}.$$
For a discussion of these fundamental concepts, their inter-relationships and
illustrations in the case of well known distributions, see Barlow and Proschan
(1975), Chapter 3.
* This research was supported by the Air Force Office of Scientific Research (AFSC), USAF, under
Grant AFOSR-77-3179 with the University of California. Reproduction in whole or in part is
permitted for any purpose of the United States Government.
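Suppose now we observe $n$ independent lifetimes $x_1, x_2, \ldots, x_n$ corresponding
to a given failure rate function, $r$. The joint density is
$$\prod_{i=1}^{n} f(x_i) = \left[\prod_{i=1}^{n} r(x_i)\right] \exp\left[-\sum_{i=1}^{n} R(x_i)\right].$$
The likelihood, as a function of the failure rate $r(u)$, $u \ge 0$, given the data
$D = (x_1, x_2, \ldots, x_n)$, is then
$$L(r(u), u \ge 0 \mid D) = \left[\prod_{i=1}^{n} r(x_i)\right] \exp\left[-\sum_{i=1}^{n} R(x_i)\right].$$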
EXAMPLE 1.1. The time-transformed exponential model. Suppose the survival
function is of the form
$$\bar F(x \mid \lambda) = e^{-\lambda R_0(x)}, \tag{1.7}$$
where it is assumed that $R_0$ is known and differentiable but $\lambda$ is unknown. By
(1.2) we may write
$$\lambda R_0(x) = \int_0^x \lambda r_0(u)\,du.$$
It follows that the hazard function and the failure rate function are assumed
known up to the parameter $\lambda$. Another way to view the model is to consider time
$x$ to be transformed by the function $R_0(\cdot)$. For this reason (1.7) is called the
time-transformed exponential model.
Let $x_1, x_2, \ldots, x_n$ be $n$ independent observations given $\lambda$ from this model. The
likelihood is
$$L(\lambda \mid D) = \lambda^n \left[\prod_{i=1}^{n} r_0(x_i)\right] \exp\left[-\lambda \sum_{i=1}^{n} R_0(x_i)\right].$$
We conclude that $\sum_{i=1}^{n} R_0(x_i)$ and $n$ are jointly sufficient for $\lambda$. If we use the
gamma prior for $\lambda$,
$$\pi(\lambda) = \frac{b^a \lambda^{a-1} e^{-b\lambda}}{\Gamma(a)},$$
we obtain as the posterior density for $\lambda$:
$$\pi(\lambda \mid D) = \frac{\left[b + \sum_{i=1}^{n} R_0(x_i)\right]^{a+n} \lambda^{a+n-1} \exp\left\{-\lambda\left[b + \sum_{i=1}^{n} R_0(x_i)\right]\right\}}{\Gamma(a+n)}.$$
Inference proceeds exactly as for the exponential model, except that observation
$x_i$ of the exponential model is replaced by its time-transformed value $R_0(x_i)$. This
is valid assuming only that $R_0(\cdot)$ is continuous.
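The conjugate update of Example 1.1 can be sketched in a few lines of Python. This is only an illustration: the function name `tte_posterior`, the baseline hazard function $R_0(x) = x^2$, the prior parameters and the lifetimes are hypothetical choices, not values from the text.

```python
def tte_posterior(xs, R0, a, b):
    """Gamma posterior (shape, rate) for lambda in the time-transformed
    exponential model, starting from a Gamma(shape=a, rate=b) prior."""
    transformed = [R0(x) for x in xs]  # R0(x_i) plays the role of x_i
    return a + len(xs), b + sum(transformed)

# Hypothetical baseline hazard function and made-up lifetimes (assumptions).
R0 = lambda x: x ** 2
lifetimes = [0.5, 1.0, 1.5, 2.0]

shape, rate = tte_posterior(lifetimes, R0, a=2.0, b=1.0)
posterior_mean = shape / rate  # mean of a Gamma(shape, rate) density
print(shape, rate, posterior_mean)
```

Only $n$ and $\sum R_0(x_i)$ enter the update, which is the joint sufficiency noted in the example.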
1.1. The general sampling plan
In many practical life testing situations, the lifetime data collected are incomplete. This may be due to the sampling plan itself or due to the unplanned
withdrawal of test units during the test. (For example, in a medical experiment,
one or more of the subjects may leave town, or suffer an accident, etc.)
We now describe one type of sampling plan. Suppose unit i having lifetime
distribution F is observed over an interval of time starting at age 0 and ending
at a random or nonrandom age. Termination of observation occurs in either one
of the following two ways:
(1) The $i$th unit is withdrawn or lost from observation at age $l_i \ge 0$; $l_i$ may be
random or nonrandom.
(2) The $i$th unit fails at age $X_i$, where $X_i$ is a random variable.
In addition, we require a technical assumption regarding the 'stopping rule'; i.e.,
a prescription for determining when to stop observation:
(3) Suppose unit lifetime, $X$, depends on an unknown parameter (or parameters) $\theta$. Observation on a unit may stop before unit lifetime is observed. Let STOP
be a rule or set of instructions which determines when observation of a unit stops.
STOP is noninformative relative to $\theta$, that is, STOP provides no additional information about $\theta$ other than that contained in the data.
It is important to remark that the 'stopping rule' is not necessarily the same as
the 'stopping time'.
To understand assumption (3), consider the sampling plan: put n items on life
test and stop testing at the kth observed failure. In this case, the stopping rule
depends only on k and is clearly independent of life distribution parameters since
k is fixed in advance of testing.
Suppose we stop testing at time $t_0$. Since $t_0$ is fixed in advance of testing, the
stopping rule is again independent of life distribution parameters.
For these sampling plans, the likelihood, up to a constant of proportionality,
depends only on the life distribution model and the observed data. This proportionality constant depends on the stopping rule, but not on the unknown parameter.
1.2. Examples of informative stopping rules
Records are routinely kept on failures (partial or otherwise) and maintenance
actions on critical units such as airplane engines. Should a relatively new type of
unit start exhibiting problems earlier than anticipated, this may trigger early withdrawal of units. If this happens, the stopping rule, which is contingent on performance, may also be informative relative to life distribution parameters. This fact
needs to be considered when calculating the likelihood and analyzing the data.
The second example illustrates another case where assumption (3) is violated.
Suppose lifetime $X$ is exponential with failure rate $\lambda$ and the random withdrawal
time, $W$, is also exponential with parameter $\phi$. We observe the minimum of $X$ and
$W$. Furthermore, suppose that $X$ given $\lambda$ and $W$ given $\phi$ are judged independent.
Then the likelihood given an observed failure at $x$ is
$$L(\lambda, \phi \mid x) = \lambda e^{-\lambda x} e^{-\phi x}.$$
If $\lambda$ and $\phi$ are judged a priori independent then the posterior density of $\lambda$ is
$$\pi(\lambda \mid x) \propto \lambda e^{-\lambda x} \pi(\lambda),$$
where $\pi$ is the prior density for $\lambda$. However, if $\lambda$ and $\phi$ are judged dependent
with joint prior $\pi(\lambda, \phi)$, then the posterior density is
$$\pi(\lambda \mid x) \propto \lambda e^{-\lambda x} \int_0^\infty e^{-\phi x} \pi(\lambda, \phi)\,d\phi.$$
The factor $\int_0^\infty e^{-\phi x} \pi(\lambda, \phi)\,d\phi$, contributed by the stopping rule, depends on $\lambda$.
There is an important case not covered by the General Sampling Plan--namely
when it is known that a unit has failed within some time interval but the exact
time of failure is unknown.
The following simple example illustrates the way in which incomplete data can arise.
EXAMPLE 1.2. Operating data are collected on an airplane part for a fleet of
airplanes. A typical age history for several engines is shown in Figure 1.1. The
crosses indicate the observed ages at failure. Ordered withdrawal times (nonfailure
times) are indicated by short vertical lines. In our example, units 2 and 4 fail at
respective times $x_{(1)}$ and $x_{(2)}$, while observation on units 1 and 3 is terminated
without failure at times $l_{(2)}$ and $l_{(1)}$, respectively.
Fig. 1.1. Age of airplane part at failure or withdrawal. (Horizontal axis: age $u$, with marks at $x_{(1)}$, $l_{(1)}$, $x_{(2)}$ and $l_{(2)}$.)
It is important to note that all data are plotted against the age axis. Figure 1.2
illustrates how events may have occurred in calendar time. For example, units 1
and 3 had not failed at the end of the calendar record.
1.3. Total time on test
The total time on test is an important statistic for the exponential model.
Fig. 1.2. Calendar records for airplane parts. (Horizontal axis: calendar time, from the start of the calendar record to its end.)
DEFINITION 1.3. The total time on test T is the total of the periods of observation of all the units undergoing test. Excluded from this statistic are any periods
following death or withdrawal or preceding observation. Specifically, the periods
being totalled include only those in which a death or a withdrawal of a unit under
observation can be observed.
Fig. 1.3. Number of units in operation $n(u)$ as a function of age. (Horizontal axis: age $u$.)
Let $n(u)$ be the number of units observed to be operating at age $u$. The observed
function $n(u)$, $u \ge 0$, for Example 1.2 is displayed in Figure 1.3. From Figure 1.3
we may readily calculate the total time on test $T(t)$ corresponding to any
$t$, $0 \le t \le l_{(2)}$:
$$T(t) = \int_0^t n(u)\,du. \tag{1.10}$$
For example, for $t$ such that $x_{(2)} < t < l_{(2)}$, we obtain from Figure 1.3:
$$\int_0^t n(u)\,du = 4x_{(1)} + 3(l_{(1)} - x_{(1)}) + 2(x_{(2)} - l_{(1)}) + (t - x_{(2)}).$$
After simplifying algebraically, we obtain
$$T(t) = x_{(1)} + l_{(1)} + x_{(2)} + t. \tag{1.11}$$
Note that the resulting expression, given in (1.11), can be obtained directly, since
$x_{(1)}$ and $x_{(2)}$ represent the observed lifetimes of the 2 units that are observed to
fail, $l_{(1)}$ represents the observed age of withdrawal of the unit first withdrawn from
observation, and finally $t$ represents the age of the second unit at the instant $t$.
Although in this small example, the directly calculated expression (1.11) for
total time on test is simpler, Equation (1.10) is an important identity, since it
yields the total time on test accumulated by age t in terms of the (varying) number
of units on test at each instant during the interval [0, t] for any data set in which
the ages at death or withdrawal are observed. Thus it is a general formula applicable
in a great variety of problems in which data may be incomplete.
Although n(u) is a step function, the integral representation in (1.10) is advantageous, since it is compact, mathematically tractable, and applicable in a great
variety of incomplete data situations. Of course, $\int_0^\infty n(u)\,du < \infty$ in practical
problems since observation ultimately ceases in order to analyze the data in hand.
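The agreement between the integral form (1.10) and the direct sum (1.11) can be checked numerically. The Python sketch below is an illustration only: the four ages assigned to $x_{(1)}$, $l_{(1)}$, $x_{(2)}$ and $l_{(2)}$ and the helper names are hypothetical, chosen to mimic the layout of Example 1.2.

```python
def n_on_test(u, ages):
    """Number of units still under observation at age u (observation of a
    unit ends at its failure age or withdrawal age)."""
    return sum(1 for age in ages if age > u)

def total_time_on_test(t, ages):
    """T(t): each unit contributes the time it was observed up to age t."""
    return sum(min(age, t) for age in ages)

# Hypothetical ages: x(1)=1.0, l(1)=2.0, x(2)=3.5, l(2)=5.0 (assumptions).
ages = [1.0, 2.0, 3.5, 5.0]
t = 4.0  # chosen so that x(2) < t < l(2)

direct = total_time_on_test(t, ages)  # x(1) + l(1) + x(2) + t

# Midpoint Riemann sum of the step function n(u) over [0, t].
steps = 40000
h = t / steps
riemann = sum(n_on_test((i + 0.5) * h, ages) for i in range(steps)) * h
print(direct, riemann)
```

The direct sum needs only the observed ages, while the integral form extends unchanged to any pattern of failures and withdrawals.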
1.4. The likelihood function for incomplete data
All recorded data are necessarily discrete. Likewise real world life distribution
models should also be discrete. Continuous life distribution models are convenient
approximations to real world life distributions. However, it is most convenient to
define initially the likelihood concept in the context of discrete models.
For our purposes, we find it preferable to define the likelihood concept for the
General Sampling Plan in the context of a discrete model. Computation of the
likelihood function is an intermediate step between specification of the prior
distribution on the space $\Theta$ and computation of the posterior distribution on $\Theta$
given observed data D.
Suppose temporarily that the life distribution is discrete, i.e., failures can occur
only at times 1, 2, ... ; similarly, withdrawals can occur only at these time points.
Suppose that the probability of failure of a given unit at $x$ is $p(x \mid \theta)$. Suppose $k$
failures are observed at times $x_s$, $s = 1, \ldots, k$, and $m$ withdrawals are observed
at times $l_t$, $t = 1, \ldots, m$. Failure and withdrawal times need not be distinct. All
observations are assumed statistically independent, given parameters. Withdrawal
times are produced by a stopping rule which is noninformative concerning $\theta$.
For example, the stopping rule might specify that we observe a unit until failure
or until withdrawal, whichever comes first, where withdrawal time is specified in
advance. For this model, the probability of the observed outcome is
$$p(D \mid \theta) = \prod_{s=1}^{k} p(x_s \mid \theta) \prod_{t=1}^{m} \bar P(l_t \mid \theta), \tag{1.12}$$
where $\bar P(u_j \mid \theta) \stackrel{\text{def}}{=} \sum_{i=1}^{\infty} p(u_{j+i} \mid \theta)$ represents the probability that a specified unit
fails at age $u_{j+1}$ or later, given the parameter is $\theta$. Note that the first product
corresponds to the $k$ failures at respective ages $x_1, \ldots, x_k$, while the second
product corresponds to the $m$ withdrawals at respective ages $l_1, \ldots, l_m$.
Another way to model withdrawal is to suppose there exists a random withdrawal age $W$ such that $P[W = t] = q(t)$, $t = 1, 2, \ldots$, with $W$ independent of unit
lifetimes and of $\theta$. Under this model, we suppose that we observe
$$\min(X, W) = \begin{cases} X & \text{if } X \le W, \\ W & \text{if } X > W. \end{cases}$$
Now for observed data $D = \{x_1, \ldots, x_k, l_1, \ldots, l_m\}$, the probability of the observed
outcome given parameter $\theta$ is
$$p(D \mid \theta) = \prod_{t=1}^{m} q(l_t) \prod_{s=1}^{k} Q(x_s) \prod_{s=1}^{k} p(x_s \mid \theta) \prod_{t=1}^{m} \bar P(l_t \mid \theta), \tag{1.13}$$
where $Q(u_j) \stackrel{\text{def}}{=} \sum_{i=1}^{\infty} q(u_{j+i})$ represents the probability that $W > u_j$. Note that
(1.12) and (1.13) differ only by a factor that does not depend on $\theta$. Thus, relative
to calculating the MLE of $\theta$, the two models for withdrawals (withdrawal deterministic
or withdrawal random) do not differ essentially.
There are many practical testing situations in which withdrawals occur as a
result of chance mechanisms unrelated to the parameter $\theta$ of the lifetime distribution. For example, concluding the collection of data at a specified chronological
time has the effect of withdrawing from observation those units still alive at that
point in time. In Figure 1.2, this phenomenon is illustrated by units 1 and 3. Other
chance mechanisms causing withdrawal at a random age result from human errors
and accidents. The net effect of the various stopping rules that are unrelated to
the value of the parameter $\theta$ is summarized in the factor $g(\mathbf{x}, \mathbf{l})$ in the expression
for the probability of the observed outcome:
$$p(D \mid \theta) = g(\mathbf{x}, \mathbf{l}) \prod_{s=1}^{k} p(x_s \mid \theta) \prod_{t=1}^{m} \bar P(l_t \mid \theta). \tag{1.14}$$
DEFINITION 1.4. The likelihood, $L(\theta \mid D)$, is the probability of the observed
outcome, $p(D \mid \theta)$, considered as a function of the parameter $\theta$ given the data, $D$.
In the case of a continuous model, the corresponding likelihood will have this
interpretation relative to a discrete probability approximation.
It follows from (1.14) that
$$L(\theta \mid D) \propto \prod_{s=1}^{k} p(x_s \mid \theta) \prod_{t=1}^{m} \bar P(l_t \mid \theta).$$
From Bayes' Theorem, it is clear that we need not know $g(\mathbf{x}, \mathbf{l})$ in order to
compute the posterior density of $\theta$.
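As a concrete instance of the discrete likelihood, the following Python sketch assumes a geometric lifetime, $p(x \mid \theta) = \theta(1-\theta)^{x-1}$ for $x = 1, 2, \ldots$, so that the probability of failing later than age $l$ is $(1-\theta)^l$. The geometric model, the data and the function names are assumptions made for illustration; the text does not specify a particular discrete model.

```python
def p_fail(x, theta):
    """Assumed geometric model: probability that a unit fails at age x."""
    return theta * (1.0 - theta) ** (x - 1)

def p_survive_past(l, theta):
    """Probability that a unit fails at age l + 1 or later."""
    return (1.0 - theta) ** l

def likelihood(theta, failures, withdrawals):
    """Product over observed failures times product over withdrawals,
    omitting the factor contributed by the stopping rule."""
    value = 1.0
    for x in failures:
        value *= p_fail(x, theta)
    for l in withdrawals:
        value *= p_survive_past(l, theta)
    return value

# Hypothetical data: two failures and two withdrawals (assumptions).
failures, withdrawals = [2, 5], [3, 6]

# Grid search for the maximizing theta, compared with the closed form
# k / (sum of failure ages + sum of withdrawal ages) for this model.
grid = [i / 1000 for i in range(1, 1000)]
theta_grid = max(grid, key=lambda th: likelihood(th, failures, withdrawals))
theta_closed = len(failures) / (sum(failures) + sum(withdrawals))
print(theta_grid, theta_closed)
```

Multiplying the likelihood by any factor free of $\theta$, such as the stopping-rule factor, leaves the maximizing value unchanged.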
In this subsection, we have thus far confined our discussion to the case of
discrete time life distributions since the basic concepts are easier to grasp in this
case. However, in the case of continuous time life distributions, the likelihood
concept is equally relevant, and in fact the expression for the likelihood $L(\theta \mid D)$
assumes a rather elegant form if we use $n(u)$, the number on test function. In the
continuous case, $p(x \mid \theta)$ is replaced by the probability density element $f(x \mid \theta)$.
THEOREM 1.5. Given the failure rate, independent observations are made under the
General Sampling Plan. Let $x_1, x_2, \ldots, x_k$ denote the $k$ observed failure ages. Let
$n(u)$ denote the number of units under observation at age $u$, $u \ge 0$, and $r(u)$ denote
the failure rate function of the unit at age $u$. Then the likelihood of the failure rate
function $r(u)$, having observed data $D$ described above, is given by
$$L(r(u), u \ge 0 \mid D) \propto \left[\prod_{s=1}^{k} r(x_s)\right] \exp\left[-\int_0^\infty n(u) r(u)\,du\right]. \tag{1.16}$$
PROOF. To justify (1.16), we first note that the underlying random events are
the ages at failure or withdrawal. Thus the likelihood of the observed outcome is
specified by the likelihood of the failure ages and survivals until withdrawal. By
Assumption (3) of the General Sampling Model, we need not include any factor
contributed by the stopping rule, since the stopping rule does not depend on the
failure rate function $r(\cdot)$.
To calculate the likelihood, we use the fact that given $r(\cdot)$,
$$\bar F(x) = \exp\left[-\int_0^x r(u)\,du\right].$$
(See (1.4).) Specifically, if a unit is observed from age 0 until it is withdrawn at
age $l_t$, without having failed during the interval $[0, l_t]$, a factor $\exp\left[-\int_0^{l_t} r(u)\,du\right]$
is contributed to the likelihood. Thus, if no units fail during the test (i.e., $k = 0$),
the likelihood of the observed outcome is proportional to the expression given in
(1.16) for $k = 0$.
On the other hand, if a unit is observed from age 0 until it fails at age $x_s$, a factor
$$r(x_s) \exp\left[-\int_0^{x_s} r(u)\,du\right]$$
is contributed to the likelihood. The exponential factor corresponds to the survival
of the unit during $[0, x_s]$, while $r(x_s)$ represents the rate of failure at age $x_s$. (Note
that if we had retained the differential element '$dx$', the corresponding expression
$r(x_s)\,dx$ would approximate an actual probability: the conditional probability of
a failure during the interval $(x_s, x_s + dx)$ given survival to age $x_s$.)
The likelihood expression in (1.16) corresponding to the outcome $k \ge 1$ now is
clear. The exponential factor corresponds to the survival intervals of both units
that failed under observation and units that were withdrawn before failing:
$$\int_0^\infty n(u) r(u)\,du = \sum_{s=1}^{k} \int_0^{x_s} r(u)\,du + \sum_{t=1}^{m} \int_0^{l_t} r(u)\,du,$$
where the first sum is taken over units that failed while the second sum is taken
over units that were withdrawn. The upper limit '$\infty$' is for simplicity and introduces no technical difficulty, since $n(u) \equiv 0$ after observation ends. □
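The likelihood (1.16) applies for any absolutely continuous life distribution. In
the important special case of an exponential life distribution model,
$f(x \mid \lambda) = \lambda e^{-\lambda x}$, the likelihood of the observed outcome takes the simpler form
$$L(\lambda \mid D) \propto \begin{cases} \lambda^{k} \exp\left[-\lambda \int_0^\infty n(u)\,du\right], & k \ge 1, \\ \exp\left[-\lambda \int_0^\infty n(u)\,du\right], & k = 0. \end{cases} \tag{1.17}$$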
The following theorem is obvious from (1.17).
THEOREM 1.6. Assume that the test plan satisfies Assumptions (1), (2) and (3) of
the General Sampling Plan. Assume that k failures and the number of units operating
at age $u$, $n(u)$, $u \ge 0$, are observed and that the model is the exponential density
$f(x \mid \lambda) = \lambda e^{-\lambda x}$. Then
(a) $k$ and $T = \int_0^\infty n(u)\,du$ together constitute a sufficient statistic for $\lambda$;
(b) $k/T$ is the MLE for $\lambda$.
Note that the MLE, $k/T$, for $\lambda$ represents the number of observed failures divided
by the total time on test.
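A minimal Python sketch of Theorem 1.6 follows. The failure and withdrawal ages are hypothetical, and the helper names are introduced only for this illustration. The log-likelihood $k \ln\lambda - \lambda T$ is evaluated on either side of $k/T$ to confirm that the estimate is a maximum.

```python
import math

def exponential_mle(failure_ages, withdrawal_ages):
    """Return (k, T, k/T): number of failures, total time on test, MLE."""
    k = len(failure_ages)
    T = sum(failure_ages) + sum(withdrawal_ages)
    return k, T, k / T

def log_likelihood(lam, k, T):
    """Logarithm of lambda**k * exp(-lambda * T), up to an additive constant."""
    return k * math.log(lam) - lam * T

# Hypothetical ages at failure and at withdrawal (assumptions).
k, T, lam_hat = exponential_mle([1.0, 3.5], [2.0, 5.0])
print(k, T, lam_hat)
print(log_likelihood(lam_hat, k, T) >= log_likelihood(0.9 * lam_hat, k, T))
print(log_likelihood(lam_hat, k, T) >= log_likelihood(1.1 * lam_hat, k, T))
```

Withdrawn units contribute to $T$ but not to $k$, so ignoring them would bias the estimated failure rate upward.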
The maximum likelihood estimator is the mode of the posterior density corresponding to a uniform prior (over an interval containing the MLE). A uniform
prior is often a convenient reference prior. Under suitable circumstances, the
analyst's actual posterior distribution will be approximately what it would have
been had the analyst's prior been uniform. To ignore the departure from uniformity, it is sufficient that the analyst's actual prior density changes gently in the
region favored by the data and also that the prior density not favor some other
region too strongly. This result is rigorously expressed in the Principle of Stable
Estimation [see Edwards, Lindman and Savage (1963)]. DeGroot (1970), pages
198-201, refers to this result under the name of precise measurement.
EXAMPLE 1.7. The exact likelihood can be calculated explicitly for specified
stopping rules. Suppose that withdrawal times are determined in advance. Then
the likelihood is
$$L(r(u), u \ge 0 \mid D) = \left[\prod_{s=1}^{k} n(x_s^-)\, r(x_s)\right] \exp\left[-\int_0^\infty n(u) r(u)\,du\right], \tag{1.18}$$
where $n(x_s^-)$ is the number surviving just prior to the observed failure at age $x_s$.
To see this consider the airplane engine data in Example 1.2. Using Figure 1.3 as
a guide, the likelihood will have the following factors:
1. For the interval $[0, x_{(1)}]$ we have the contribution
$$4 r(x_{(1)}) \exp\left[-\int_0^{x_{(1)}} 4 r(u)\,du\right]$$
corresponding to the probability that all 4 units survive to $x_{(1)}$ and the first failure
occurs at $x_{(1)}$.
2. For the interval $(x_{(1)}, l_{(1)}]$ we have the contribution
$$\exp\left[-\int_{x_{(1)}}^{l_{(1)}} 3 r(u)\,du\right]$$
corresponding to the probability that the remaining 3 units survive this interval.
3. For the interval $(l_{(1)}, x_{(2)}]$ we have the contribution
$$2 r(x_{(2)}) \exp\left[-\int_{l_{(1)}}^{x_{(2)}} 2 r(u)\,du\right]$$
corresponding to the probability that the remaining 2 units survive to $x_{(2)}$ and the
failure occurs at $x_{(2)}$.
4. For the interval $(x_{(2)}, l_{(2)}]$ we have the contribution
$$\exp\left[-\int_{x_{(2)}}^{l_{(2)}} r(u)\,du\right]$$
corresponding to the conditional probability that the remaining unit survives to
age $l_{(2)}$. Multiplying together these conditional probabilities, we obtain a likelihood
having the form shown in (1.18).
2. Parameter estimators and credible intervals
In the previous section we saw how to calculate the likelihood function for
general life distributions. This is required in order to calculate the posterior
distribution. Calculation and possibly graphical display of the posterior density
would conceivably complete our data analysis.
If we assume a life density $p(x \mid \theta)$ and $\pi(\theta)$ is the prior, then
$p(x, \theta) = p(x \mid \theta)\pi(\theta)$ is the joint density and $p(x) = \int_\Theta p(x \mid \theta)\pi(\theta)\,d\theta$ is the marginal or predictive density. Given data $D$ and the posterior density $\pi(\theta \mid D)$, the
predictive density is
$$p(x \mid D) = \int_\Theta p(x \mid \theta)\,\pi(\theta \mid D)\,d\theta.$$
If asked to give the probability of survival until time $t$, we would calculate
$$P(X > t \mid D) = \int_t^\infty p(x \mid D)\,dx.$$
EXAMPLE 2.1. For the exponential density $\lambda e^{-\lambda x}$, $k$ observed failures, $T$ total
time on test, and the General Sampling Plan, the likelihood is proportional to
$\lambda^{k} e^{-\lambda T}$. For the natural conjugate prior,
$$\pi(\lambda) = \frac{b^a \lambda^{a-1} e^{-b\lambda}}{\Gamma(a)},$$
the posterior density is
$$\pi(\lambda \mid k, T) = \frac{(b + T)^{a+k} \lambda^{a+k-1} e^{-(b+T)\lambda}}{\Gamma(a + k)}.$$
In this case the probability of survival until time $t$ is
$$P(X > t \mid k, T) = \int_0^\infty e^{-\lambda t}\,\pi(\lambda \mid k, T)\,d\lambda = \left[\frac{b + T}{b + t + T}\right]^{a+k}.$$
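The closed form for the predictive survival probability in Example 2.1 can be checked against direct numerical integration over the gamma posterior. The Python sketch below is an illustration under assumed values of $a$, $b$, $k$, $T$ and $t$; the function names and the midpoint rule with a truncated upper limit are choices made here, not part of the text.

```python
import math

def gamma_pdf(lam, shape, rate):
    """Gamma density with the given shape and rate, evaluated at lam > 0."""
    log_pdf = (shape * math.log(rate) + (shape - 1) * math.log(lam)
               - rate * lam - math.lgamma(shape))
    return math.exp(log_pdf)

def predictive_survival(t, a, b, k, T):
    """Closed form ((b + T) / (b + t + T)) ** (a + k)."""
    return ((b + T) / (b + t + T)) ** (a + k)

def predictive_survival_numeric(t, a, b, k, T, upper=10.0, steps=100000):
    """Midpoint-rule integral of exp(-lam * t) against the posterior density;
    the upper limit truncates a negligible tail for the values used below."""
    h = upper / steps
    total = 0.0
    for i in range(steps):
        lam = (i + 0.5) * h
        total += math.exp(-lam * t) * gamma_pdf(lam, a + k, b + T)
    return total * h

# Hypothetical prior parameters and data summary (assumptions).
a, b, k, T, t = 2.0, 1.0, 2, 11.5, 3.0
closed = predictive_survival(t, a, b, k, T)
numeric = predictive_survival_numeric(t, a, b, k, T)
print(closed, numeric)
```

For a non-conjugate prior only the numerical route is available, with `gamma_pdf` replaced by the normalized posterior.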
2.1. Bayes estimators
We will need the following notation:
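$$E[\theta] = \int_\Theta \theta\,\pi(\theta)\,d\theta, \qquad E[\theta \mid D] = \int_\Theta \theta\,\pi(\theta \mid D)\,d\theta.$$
Of course, $E[\theta]$ is the mean of the prior distribution while $E[\theta \mid D]$ is the
mean of the posterior distribution.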
We wish to select a single value as representing our 'best' estimator of the
unknown parameter $\theta$. To define the best estimator we must specify a criterion
of goodness (or equivalently, of poorness). Statisticians measure the poorness of
an estimator $\hat\theta$ by the expected 'loss' resulting from their estimator $\hat\theta$. One
very popular loss function is squared error loss: specifically, having observed data
$D$ and determined the posterior density $\pi(\theta \mid D)$, the expected squared error loss
is given by
$$E[(\theta - \hat\theta)^2 \mid D]; \tag{2.2}$$
the expectation is calculated with respect to the posterior density $\pi(\theta \mid D)$. We
choose a point estimator $\hat\theta$ so as to minimize the expected squared error loss
in (2.2); i.e., we choose $\hat\theta$ to satisfy
$$\min_a E[(\theta - a)^2 \mid D] = E[(\theta - \hat\theta)^2 \mid D]. \tag{2.3}$$
To find the minimizing value $\hat\theta$, we add and subtract $E(\theta \mid D)$ in the loss
function to obtain
$$E[(\theta - a)^2 \mid D] = E[(\theta - E(\theta \mid D))^2 \mid D] + [E(\theta \mid D) - a]^2.$$
The cross term vanishes because $E[\theta - E(\theta \mid D) \mid D] = 0$.
Since we wish to minimize the right hand side, we set $a = E(\theta \mid D)$, which
then represents the solution to (2.3). The resulting estimator, $E(\theta \mid D)$, the
mean of the posterior, is called the Bayes estimator with respect to squared error
loss.
THEOREM 2.2. The Bayes estimator of a parameter $\theta$ with respect to squared error loss
is the mean $E(\theta \mid D)$ of the posterior density.
Another loss function in popular use is the absolute value loss function:
$$E[\,|\theta - \hat\theta|\, \mid D].$$
To find the minimizing estimator using this criterion, we choose $\hat\theta$ to satisfy:
$$\min_a E[\,|\theta - a|\, \mid D] = E[\,|\theta - \hat\theta|\, \mid D].$$
It is easy to show:
THEOREM 2.3. The Bayes estimator of a parameter $\theta$ with respect to the absolute
value loss function is the median of the posterior density. Specifically, the estimator
$\hat\theta$ satisfies
$$\int_{-\infty}^{\hat\theta} \pi(\theta \mid D)\,d\theta = \int_{\hat\theta}^{\infty} \pi(\theta \mid D)\,d\theta = \tfrac{1}{2}.$$
Of course, the prior density and the loss function enter crucially in determining
a 'best' estimator. However, no matter what criterion is used, all the information
concerning the unknown parameter $\theta$ is contained in the posterior density. Thus,
a graph of $\pi(\theta \mid D)$ is more informative than any single parameter of the posterior
density, whether it be the mean, the median, the mode, a quartile, etc.
EXAMPLE 2.4. Assume that lifetime is governed by the exponential model,
$\theta^{-1} e^{-x/\theta}$. Suppose we conjecture that $E[\theta \mid k, T]$, for a sampling plan with $k$, $T$
sufficient, is linear in $T$ for fixed $k$. It turns out that such a linear relationship
holds if and only if we use as our prior the natural conjugate prior:
$$\pi(\theta) = \frac{b^a \theta^{-(a+1)} e^{-b/\theta}}{\Gamma(a)}.$$
(See Diaconis and Ylvisaker (1979) for a proof of this result and for more general
results of this kind.) The corresponding Bayes estimator with respect to squared
error loss is
$$E[\theta \mid k, T] = \frac{b + T}{a + k - 1}. \tag{2.7}$$
However, the natural conjugate prior would not be appropriate if we believed,
for example, that $\theta$ could assume values only in two disjoint intervals. Under this
belief, a bimodal prior density would be more natural, and the corresponding
estimator $E[\theta \mid D]$ would very likely be difficult to obtain in closed form
such as in (2.7). However, $E[\theta \mid D]$ could be computed by numerical integration.
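The numerical integration mentioned at the end of Example 2.4 might look as follows in Python. This sketch assumes, purely for illustration, a prior that is uniform on the union of two disjoint intervals, together with hypothetical values of $k$ and $T$; the function names and the midpoint rule are likewise choices made here. The likelihood for the exponential model with mean $\theta$ is taken proportional to $\theta^{-k} e^{-T/\theta}$.

```python
import math

def likelihood(theta, k, T):
    """Exponential model with mean theta: theta**(-k) * exp(-T / theta)."""
    return theta ** (-k) * math.exp(-T / theta)

def posterior_mean(k, T, intervals, steps=20000):
    """Posterior mean of theta under a prior that is uniform on the union
    of the given intervals, by the midpoint rule. The constant prior
    density cancels in the ratio of the two integrals."""
    num = den = 0.0
    for lo, hi in intervals:
        h = (hi - lo) / steps
        for i in range(steps):
            theta = lo + (i + 0.5) * h
            weight = likelihood(theta, k, T) * h
            num += theta * weight
            den += weight
    return num / den

# Hypothetical prior support and data summary (assumptions).
intervals = [(1.0, 2.0), (5.0, 8.0)]
estimate = posterior_mean(k=3, T=12.0, intervals=intervals)
prior_mean = posterior_mean(k=0, T=0.0, intervals=intervals)  # no data
print(estimate, prior_mean)
```

With no data the routine returns the prior mean, which gives a simple check on the quadrature.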
There are many other functions of unknown parameters for which we may want
the Bayes estimator with respect to squared error loss. For example, we may wish
to estimate the probability of survival until age $t$ for the exponential model; i.e.,
$$g(\theta) = \exp\left[-\frac{t}{\theta}\right].$$
It is easy to show in this case that
$$E[g(\theta) \mid D] = \int_0^\infty \exp\left[-\frac{t}{\theta}\right] \pi(\theta \mid D)\,d\theta$$
is the Bayes estimator. If $\pi(\theta)$ is the natural conjugate prior, then it is easy to
verify that
$$E[g(\theta) \mid k, T] = \left[\frac{b + T}{b + t + T}\right]^{a+k};$$
i.e., this is the Bayes estimator of the probability of survival to age $t$ given total
time on test $T$ and $k$ observed failures. Note that this estimator is precisely the marginal
probability of survival until time $t$.
2.2. Credible intervals
As we have seen, Bayes estimators correspond to certain functions of the
posterior distribution such as the mean, the mode, etc. A credible set or interval
is another way of presenting a partial description of the posterior distribution.
Specifically, we choose a set $C$ on the positive axis (since we are dealing with
lifetime) such that
$$\int_C \pi(\theta \mid D)\,d\theta = 1 - \alpha.$$
Such a set $C$ is called a Bayesian $(1 - \alpha)100$ percent credible set (or credible
interval if $C$ is an interval) for $\theta$.
Obviously, the set $C$ is not uniquely determined. It would seem desirable to
choose the set $C$ to be as small (e.g., least length, area, volume) as possible. To
achieve this, we seek a constant $c_{1-\alpha}$ and a corresponding set $C$ such that
$$C = \{\theta \mid \pi(\theta \mid D) \ge c_{1-\alpha}\} \tag{2.11}$$
and
$$\int_C \pi(\theta \mid D)\,d\theta = 1 - \alpha. \tag{2.12}$$
A set $C$ satisfying (2.11) and (2.12) is called a highest posterior density credible set
(Box and Tiao, 1973). In general, $C$ would have to be determined numerically with
the aid of a computer.
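One simple way to carry out such a numerical determination is a grid search: evaluate the posterior density on a fine grid, then collect grid cells in order of decreasing density until they hold probability $1 - \alpha$. The Python sketch below applies this to a gamma posterior with assumed shape and rate; the function names, the grid size and the truncation point are illustrative assumptions, and the method relies on the density being unimodal so that the collected cells form an interval.

```python
import math

def gamma_pdf(lam, shape, rate):
    """Gamma density with the given shape and rate, evaluated at lam > 0."""
    log_pdf = (shape * math.log(rate) + (shape - 1) * math.log(lam)
               - rate * lam - math.lgamma(shape))
    return math.exp(log_pdf)

def hpd_interval(shape, rate, alpha=0.05, steps=20000):
    """Approximate highest posterior density interval on a midpoint grid."""
    upper = (shape + 10.0 * math.sqrt(shape)) / rate  # truncation point
    h = upper / steps
    grid = [(i + 0.5) * h for i in range(steps)]
    dens = [gamma_pdf(g, shape, rate) for g in grid]
    order = sorted(range(steps), key=lambda i: dens[i], reverse=True)
    mass, kept = 0.0, []
    for i in order:  # add the densest cells first
        kept.append(i)
        mass += dens[i] * h
        if mass >= 1.0 - alpha:
            break
    return grid[min(kept)] - h / 2, grid[max(kept)] + h / 2

# Hypothetical posterior: shape a + k = 4, rate b + T = 12.5 (assumptions).
lo, hi = hpd_interval(4.0, 12.5)
print(lo, hi, gamma_pdf(lo, 4.0, 12.5), gamma_pdf(hi, 4.0, 12.5))
```

The density takes nearly the same value at the two endpoints, which is the defining property in (2.11).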
For the exponential model $\lambda e^{-\lambda x}$, the natural conjugate prior is the gamma
density. Since the gamma density is a generalization of the chi-square density, we
recall the definition of the latter so that we can make use of it to determine
credible intervals for the failure rate of the exponential.
A random variable, $\chi^2(n)$, having density
$$f_{\chi^2(n)}(x) = \frac{x^{n/2 - 1} \exp\left[-\frac{x}{2}\right]}{2^{n/2}\,\Gamma(n/2)} \quad \text{for } x \ge 0,\ n = 1, 2, \ldots,$$
is called a chi-square random variable with $n$ degrees of freedom (d.f.).
A table of percentage points of the chi-square distribution may be found in
Pearson and Hartley (1958). In addition, chi-square programs are available for
more extensive calculations using electronic computers and programmable calculators.
It is easy to verify that the $\chi^2$ random variable with $2n$ d.f. is distributed as
$2(Y_1 + Y_2 + \cdots + Y_n)$, where $Y_1, Y_2, \ldots, Y_n$ are independent, exponentially distributed random variables with mean one. Thus, we obtain the following result
useful in computing credibility intervals for the failure rate of the exponential
model with corresponding natural conjugate prior.
THEOREM 2.6. Let $k$ failures and total time on test $T$ be observed under sampling
assumptions (1), (2) and (3) (Section 1) for the exponential model $\lambda e^{-\lambda x}$. Let $\lambda$
have the posterior density corresponding to the natural conjugate prior
$$\pi(\lambda) = \frac{b^a \lambda^{a-1} e^{-b\lambda}}{\Gamma(a)}$$
with $a$ an integer. Then
$$P\left[\frac{\chi^2_{\alpha/2}[2(a + k)]}{2(b + T)} \le \lambda \le \frac{\chi^2_{1-\alpha/2}[2(a + k)]}{2(b + T)} \,\Big|\, D\right] = 1 - \alpha, \tag{2.14}$$
where $\chi^2_\beta(n)$ is the $100\beta$ percentage point of a chi-square distribution with $n$ d.f.; i.e.,
$$\int_0^{\chi^2_\beta(n)} f_{\chi^2(n)}(x)\,dx = \beta.$$
REMARK, Because of the lack of symmetry of the Z2 density, the interval in
(2.14) is not the highest posterior density credible interval.
It is easy to verify that $u = (b + T)\lambda$ given the data has a gamma density,
$$\frac{u^{a+k-1} e^{-u}}{\Gamma(a+k)},$$
corresponding to the density of $Y_1 + \cdots + Y_{a+k}$, where the Y's are independent
unit exponential random variables. Hence
$$2\lambda(b + T) \stackrel{st}{=} 2(Y_1 + \cdots + Y_{a+k}),$$
where $\stackrel{st}{=}$ denotes stochastic equality; i.e., $2\lambda(b + T)$ has a chi-square density
with 2(a + k) d.f. []
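The interval of Theorem 2.6 can be computed without chi-square tables. The following is a minimal sketch, not the authors' procedure: it assumes integer a, uses the closed-form distribution function of a sum of unit exponentials, and finds the percentage points by bisection. The function names and the numerical inputs (a = 2, b = 100, k = 3, T = 900) are illustrative assumptions only.

```python
import math

def erlang_cdf(x, n):
    """P(Y_1 + ... + Y_n <= x) for independent unit exponentials, integer n >= 1."""
    if x <= 0:
        return 0.0
    term, total = 1.0, 1.0          # j = 0 term of sum_{j < n} x**j / j!
    for j in range(1, n):
        term *= x / j
        total += term
    return 1.0 - math.exp(-x) * total

def erlang_quantile(p, n, tol=1e-12):
    """Solve erlang_cdf(x, n) = p for x by bisection."""
    lo, hi = 0.0, 1.0
    while erlang_cdf(hi, n) < p:    # expand until the root is bracketed
        hi *= 2.0
    while hi - lo > tol * max(1.0, hi):
        mid = 0.5 * (lo + hi)
        if erlang_cdf(mid, n) < p:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

def equal_tail_interval(a, b, k, T, alpha):
    """Equal-tail (1 - alpha) credible interval for the failure rate.

    (b + T) * lambda has a gamma density with integer shape a + k, so a
    chi-square point with 2(a + k) d.f. divided by 2(b + T) equals the
    corresponding quantile of the sum of a + k unit exponentials divided
    by (b + T).
    """
    n = a + k
    lower = erlang_quantile(alpha / 2.0, n) / (b + T)
    upper = erlang_quantile(1.0 - alpha / 2.0, n) / (b + T)
    return lower, upper

# Hypothetical inputs, chosen only to exercise the function.
lower, upper = equal_tail_interval(a=2, b=100.0, k=3, T=900.0, alpha=0.10)
```

As the Remark notes, this equal-tail interval is not the highest posterior density interval; finding the latter would need an additional search over intervals of equal posterior density at both ends.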
COROLLARY 2.7. For 2(a + k) large (say 2(a + k) > 30), the normal approximation provides the approximate credibility statement
$$P\left[\frac{(a+k) + (a+k)^{1/2} z_{\alpha/2}}{b+T} \le \lambda \le \frac{(a+k) + (a+k)^{1/2} z_{1-\alpha/2}}{b+T} \,\Big|\, D\right] \cong 1 - \alpha, \qquad (2.15)$$
where $z_\alpha$ satisfies $\int_{-\infty}^{z_\alpha} \varphi(u)\, du = \alpha$ and $\varphi(u) = (1/\sqrt{2\pi})\, e^{-u^2/2}$ is the normal density
with mean 0 and variance 1.
Life distributionmodels and incomplete data
Since the $\chi^2(2n)$ random variable can be written as
$$\chi^2(2n) = 2(Y_1 + Y_2 + \cdots + Y_n),$$
where $Y_1, Y_2, \ldots, Y_n$ are independent unit exponentials, the Central Limit
Theorem (e.g., Hoel, Port and Stone, 1971) applies. Note that $E\chi^2(2n) = 2n$ and
$\mathrm{Var}[\chi^2(2n)] = 4n$. Thus,
$$\frac{\chi^2(2n) - 2n}{2\sqrt{n}}$$
is approximately normal with mean 0 and variance 1 by the Central Limit
Theorem. []
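The normal approximation of Corollary 2.7 is short enough to sketch directly. The snippet below is an illustration only: it takes n = a + k, uses the standard normal percentage points from Python's `statistics.NormalDist`, and returns (n + n^{1/2} z)/(b + T) at both tails. The inputs are hypothetical and chosen so that 2(a + k) = 40 > 30.

```python
import math
from statistics import NormalDist

def normal_approx_interval(a, b, k, T, alpha):
    """Approximate (1 - alpha) credible interval for the failure rate.

    Uses 2*lambda*(b + T) ~ chi-square with 2n d.f., n = a + k, and the
    normal approximation (chi2 - 2n) / (2*sqrt(n)) ~ N(0, 1).
    """
    n = a + k
    z_lo = NormalDist().inv_cdf(alpha / 2.0)        # z_{alpha/2}, negative
    z_hi = NormalDist().inv_cdf(1.0 - alpha / 2.0)  # z_{1 - alpha/2}
    root = math.sqrt(n)
    return (n + root * z_lo) / (b + T), (n + root * z_hi) / (b + T)

# Hypothetical inputs: a + k = 20, b + T = 100.
approx_lower, approx_upper = normal_approx_interval(a=5, b=10.0, k=15, T=90.0, alpha=0.10)
```

The approximate interval is symmetric about (a + k)/(b + T), whereas the exact chi-square interval is not, so the two should be expected to differ most for small a + k.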
COROLLARY 2.8. Let k failures and T total time on test be observed under the
General Sampling Plan assumptions (1), (2) and (3) (Section 1), for the exponential
model $\theta^{-1} e^{-x/\theta}$. Let $\theta$ have the natural conjugate prior with integer a, then
$$P\left[\frac{2(b+T)}{\chi^2_{1-\alpha/2}[2(a+k)]} \le \theta \le \frac{2(b+T)}{\chi^2_{\alpha/2}[2(a+k)]} \,\Big|\, D\right] = 1 - \alpha. \qquad (2.16)$$
PROOF. Since $\theta$ has the natural conjugate prior distribution for the model
$\theta^{-1} e^{-x/\theta}$, then $\lambda = 1/\theta$ has the natural conjugate prior for the model $\lambda e^{-\lambda x}$.
(2.16) follows from (2.14). []
3. The
Whenever possible, the choice of a life distribution model should be based on
the underlying failure mechanisms. Simple structures composed of statistically
independent components have been used to derive life distribution models valid
when the number of structural components is very large.
Suppose a structure of n components fails as soon as k components fail. If also
component lifetimes are judged identically distributed and independent, then there
are only two possible limiting structure life distributions in the sense that there
exist sequences of normalizing constants $\{a_n\}_{n=1}^{\infty}$, $\{\lambda_n\}_{n=1}^{\infty}$ such that for all
real x,
$$\lim_{n \to \infty} P\{\lambda_n(\xi_{k,n} - a_n) \le x\}$$
exists, where $\xi_{k,n}$ denotes the structure lifetime. The limit is either
$$\frac{1}{(k-1)!} \int_0^{[\lambda(x-a)]^{\alpha}} e^{-u} u^{k-1}\, du, \qquad \alpha, \lambda > 0,\ x > a \ge 0, \qquad (3.1)$$
or
$$\frac{1}{(k-1)!} \int_0^{\exp[\lambda(x-a)]} e^{-u} u^{k-1}\, du \qquad (3.2)$$
(Smirnov, 1952). In both cases a is a location parameter and $\lambda$ is a scale
parameter while $\alpha$ and k are shape parameters.
If k = 1, then (3.1) becomes
$$W(x \mid a, \lambda, \alpha) = 1 - \exp\{-[\lambda(x-a)]^{\alpha}\}, \qquad (3.1')$$
the Weibull distribution, and (3.2) becomes
$$\Lambda(x \mid a, \lambda) = 1 - \exp\{-e^{\lambda(x-a)}\}. \qquad (3.2')$$
Thus, if X is the structure lifetime, then either X or exp(X) has a Weibull distribution. The failure rate for the Weibull distribution of (3.1') is
$$r_W(x) = \alpha \lambda^{\alpha} (x-a)^{\alpha-1} \quad \text{for } x \ge a \qquad (3.3)$$
and 0 elsewhere. In the second case it is
$$r_\Lambda(x) = \lambda \exp[\lambda(x-a)]. \qquad (3.4)$$
For all parameter values, (3.4) is increasing in x. Hence, if we wish to allow the
possibility that the failure rate may be decreasing we must choose the Weibull
model, (3.1'), with $\alpha < 1$.
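The monotonicity claims about the two failure rates can be checked numerically. The sketch below is illustrative only; the function names and parameter values are assumptions. It evaluates the Weibull failure rate for a shape parameter below and above 1, and the failure rate of the second limiting form, on a few increasing ages.

```python
import math

def weibull_failure_rate(x, a, lam, alpha):
    """Weibull failure rate alpha * lam**alpha * (x - a)**(alpha - 1).

    Returns 0 for x <= a; the boundary x = a is excluded here because the
    rate is unbounded there when alpha < 1.
    """
    if x <= a:
        return 0.0
    return alpha * lam ** alpha * (x - a) ** (alpha - 1.0)

def extreme_value_failure_rate(x, a, lam):
    """Failure rate lam * exp(lam * (x - a)) of the second limiting form."""
    return lam * math.exp(lam * (x - a))

ages = [1.0, 2.0, 4.0, 8.0]
decreasing = [weibull_failure_rate(x, 0.0, 0.5, 0.7) for x in ages]   # shape < 1
increasing = [weibull_failure_rate(x, 0.0, 0.5, 2.0) for x in ages]   # shape > 1
always_up = [extreme_value_failure_rate(x, 0.0, 0.5) for x in ages]
```

Only the Weibull form with shape below 1 yields a decreasing sequence, which is the reason given in the text for preferring it when a decreasing failure rate must be allowed.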
The Weibull model appears to furnish an adequate fit for some strand lifetime
data with estimated values of $\alpha$ less than 4. On the other hand, it has been
empirically observed that for strength data, estimates for $\alpha$ using the Weibull
model are often large (> 27 in some cases). This suggests that (3.2') may provide
a better model for strand strength data.
3.1. Inference for the Weibull distribution
The Weibull life distribution model has three parameters: a, $\lambda$, and $\alpha$. The
parameter a > 0 is a threshold value for lifetime; before time a we expect to see
no failures. If there is no physical reason to justify a positive threshold value, the
analyst should use the two parameter Weibull model. The simplest model
compatible with prior knowledge concerning physical processes will often provide
the most insight. The Weibull density is
$$f(x \mid a, \alpha, \lambda) = \alpha \lambda^{\alpha} (x-a)^{\alpha-1} e^{-[\lambda(x-a)]^{\alpha}}$$
for $x \ge a$ and 0 elsewhere.
Usually we wish to quantify our uncertainty about a particular aspect of the life
distribution, such as the probability of surviving x hours. For the three parameter
Weibull model, this is given by
$$\bar F(x \mid a, \lambda, \alpha) = \exp\{-[\lambda(x-a)]^{\alpha}\}.$$
It is clearly sufficient to assess our uncertainty concerning a, $\lambda$, and $\alpha$.
Suppose data are obtained under the General Sampling Plan (Section 1). Let
$x_1, x_2, \ldots, x_k$ denote the unordered observed failure ages and n(u) the number
surviving until age u. Then by Theorem 1.6 in Section 1, the likelihood is given by
$$L(a, \alpha, \lambda \mid D) \propto \alpha^k \lambda^{k\alpha} \left[\prod_{i=1}^{k} (x_i - a)^{\alpha-1}\right] \exp\left\{-\alpha \lambda^{\alpha} \int_a^{\infty} n(u)\,(u-a)^{\alpha-1}\, du\right\} \qquad (3.7)$$
for $a \le x_i$ and $\alpha, \lambda > 0$. Suppose there are m withdrawals and we pool observed
failure and loss times and relabel them as
$$0 \equiv t_{(0)} \le t_{(1)} \le t_{(2)} \le \cdots \le t_{(k+m)} \le t.$$
Then, for $a \le x_i$, $i = 1, 2, \ldots, k$, we have
$$\int_a^{\infty} n(u)\,(u-a)^{\alpha-1}\, du = \sum_{i=1}^{k+m} (n-i+1) \int_{t_{(i-1)}}^{t_{(i)}} (u-a)^{\alpha-1}\, du + (n-k-m) \int_{t_{(k+m)}}^{t} (u-a)^{\alpha-1}\, du.$$
Observation is confined to the age interval [0, t].
Two important deductions can be made from (3.7):
1. The only sufficient statistic for all three parameters (or for $\alpha$ and $\lambda$ alone
when a = 0) is the entire data set.
2. No natural conjugate family of priors is available for all three parameters (or
for $\alpha$ and $\lambda$ alone when a = 0). Consequently, the posterior distribution must be
computed using numerical integration [see Diaconis and Ylvisaker (1979)].
For most statistical investigations, a and perhaps also $\alpha$ would be considered
nuisance parameters. By matching our joint prior density in a, $\lambda$ and $\alpha$ with the
likelihood (3.7), we can calculate the posterior density, $\pi(a, \lambda, \alpha \mid D)$. For example,
if a is considered a nuisance parameter, then we would calculate the marginal
density on $\lambda$ and $\alpha$ as
$$\pi(\alpha, \lambda \mid D) = \int \pi(a, \alpha, \lambda \mid D)\, da.$$
3.2. Credibility regions for two parameter models
Let $\pi(\alpha, \lambda \mid D)$ be the posterior density for a two parameter model such as the
Weibull model above with scale parameter $\lambda$ and shape parameter $\alpha$. To find the
so-called 'highest posterior density' credibility region for $\alpha$ and $\lambda$ simultaneously
(Section 2), we find a constant $c(\beta)$ by sequential search such that:
$$R = \{(\alpha, \lambda) \mid \pi(\alpha, \lambda \mid D) \ge c(\beta)\}$$
and
$$\iint_R \pi(\alpha, \lambda \mid D)\, d\alpha\, d\lambda = \beta.$$
The region R defined above is a $\beta(100)$ percent credibility region for $\alpha$ and $\lambda$. For
unimodal densities such regions are bounded by a single curve C which does not
intersect itself (i.e., a 'simply connected region').
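A grid version of this search can be sketched as follows. This is an illustration under stated assumptions, not the computation used for the pressure vessel data: the threshold a = 0, the prior is flat on a rectangular grid of equal-sized cells, the ages are hypothetical (in thousands of hours), and the sequential search for the constant is replaced by sorting the cells by posterior weight and accumulating mass until the target level is reached.

```python
import math

def weibull_log_likelihood(alpha, lam, failures, censored):
    """Two-parameter Weibull log-likelihood (a = 0), up to an additive constant.

    With a = 0 the exponent alpha * lam**alpha * integral n(u) u**(alpha-1) du
    reduces to the sum of (lam * exit_age)**alpha over all units, whether
    they failed or were withdrawn.
    """
    k = len(failures)
    log_l = k * math.log(alpha) + k * alpha * math.log(lam)
    log_l += (alpha - 1.0) * sum(math.log(x) for x in failures)
    log_l -= sum((lam * t) ** alpha for t in failures + censored)
    return log_l

def hpd_region(failures, censored, alphas, lams, beta):
    """Return (threshold weight, grid cells) of the level-beta HPD region."""
    cells = [(weibull_log_likelihood(al, lm, failures, censored), al, lm)
             for al in alphas for lm in lams]
    top = max(c[0] for c in cells)
    weights = [(math.exp(ll - top), al, lm) for ll, al, lm in cells]
    total = sum(w for w, _, _ in weights)
    weights.sort(reverse=True)                 # highest posterior weight first
    mass, region = 0.0, []
    for w, al, lm in weights:
        region.append((al, lm))
        mass += w / total
        if mass >= beta:
            return w / total, region           # last weight admitted acts as c(beta)
    return 0.0, region

# Hypothetical data: 5 failures and 16 units still on test at age 13.5.
failures = [0.9, 2.1, 4.5, 7.3, 12.0]
censored = [13.5] * 16
alphas = [0.2 + 0.05 * i for i in range(77)]   # shape grid, 0.2 to 4.0
lams = [0.002 * i for i in range(1, 101)]      # scale grid, 0.002 to 0.2
threshold, region = hpd_region(failures, censored, alphas, lams, beta=0.9)
```

Plotting the cells in `region` would give a contour picture of the same kind as a highest posterior density credibility region; a finer grid tightens the approximation at the cost of more likelihood evaluations.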
To illustrate the use of Weibull credibility regions we have computed credibility
regions corresponding to the data in Tables 3.1 and 3.2. Twenty-one pressure
vessels were put on life test at 68% of their ultimate mean burst stress. A pressure
vessel is filled with a gas or liquid and provides a source of mechanical energy.
They are used on space satellites and other space vehicles. After 13488 hours of
testing, 5 failures were recorded. After an additional 7080 hours of testing, an
additional 4 failures were recorded.
Table 3.1
Ordered failure ages of pressure vessels life
tested at 68% of mean rupture strength (n = 21,
observation to 13488 hours)
Number of failure
Age at failure (hours)
Table 3.2
Ordered failure ages of pressure vessels life
tested at 68% of mean rupture strength (failures
between 13488 hours and 20568 hours)
Number of failure
Age at failure (hours)
Figure 3.1 displays credibility contours for $\alpha$ and $\lambda$ after 13488 hours of testing
and again after 20568 hours of testing. The posterior densities were computed
relative to uniform priors. The posterior density computed after 20568 hours
could also be interpreted as the result of using the posterior (calculated on the
basis of Table 3.1 and a flat prior) as the new prior for the data in Table 3.2. A
qualitative measure of the information gained by an additional year of testing can
be deduced by comparing the initial (dark) contours and the tighter (light)
contours in Figure 3.1.
Fig. 3.1. Highest probability density contours for $\alpha$ and $\lambda$ for Kevlar/epoxy pressure vessel life test
data, after 13488 hours and after 20568 hours of testing. The pressure vessels were tested at 68% stress level.
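To predict pressure vessel life at the 68% stress level, we can numerically
integrate the survival probability against the posterior density,
$$\iint \exp\{-(\lambda x)^{\alpha}\}\, \pi(\alpha, \lambda \mid D)\, d\alpha\, d\lambda,$$
where $\pi(\alpha, \lambda \mid D)$ must be numerically computed using the given data, D.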
If the mean life or the standard deviation of life are of interest, their posterior
densities can be computed by making a change of variable and integrating out the
nuisance parameter. For example, if a = 0 in the Weibull model and we are interested in
the mean life, $\theta$, we can use the Weibull density in terms of $\alpha$ and $\theta$,
$$f(x \mid \alpha, \theta) = \alpha \left[\frac{\Gamma(1 + 1/\alpha)}{\theta}\right]^{\alpha} x^{\alpha-1} \exp\left\{-\left[\frac{\Gamma(1 + 1/\alpha)\, x}{\theta}\right]^{\alpha}\right\},$$
to compute the joint posterior density $\pi(\alpha, \theta \mid D)$. The prior for $\alpha$ and $\lambda$ must be
replaced by the induced prior for $\alpha$ and $\theta$. This may be accomplished by a change
of variable and by computing the appropriate Jacobian. The marginal posterior
density of $\theta$ is then
$$\pi(\theta \mid D) = \int \pi(\alpha, \theta \mid D)\, d\alpha.$$
This can then be used to obtain credibility intervals on $\theta$.
4. Notes and references
4.1. Section 1
In the General Sampling Plan we needed to assume that any stopping rules
used were noninformative concerning the failure distribution. The need for this
assumption was pointed out by Raiffa and Schlaifer (1961). Examples of informative stopping rules were given by Roberts (1967) in the context of two stage
sampling of biological populations to estimate population size (so-called capture-recapture sampling).
4.2. Section 2: Unbiasedness
The posterior mean is a Bayes estimator of a parameter, say $\theta$, with respect to
squared error loss. It is also a function of the data. An estimator, $\hat\theta(D)$, is
called unbiased in the sample theory sense if
$$E[\hat\theta(D) \mid \theta] = \theta$$
for each $\theta \in \Theta$. No Bayes estimator (based on a corresponding proper prior) can
be unbiased in the sample theory sense (Bickel and Blackwell, 1967).
Most unbiased estimators are in fact inadmissible in the sample theory sense
with respect to squared error loss. For example, $\hat\theta(D) = T/k$ is a
sample theory unbiased estimator for the mean of the density $\theta^{-1} e^{-x/\theta}$. However
it is inadmissible in the sense that there exists another estimator $c\hat\theta(D)$ with $c \ne 1$
such that, for all $\theta$,
$$E\{[c\hat\theta(D) - \theta]^2 \mid \theta\} < E\{[\hat\theta(D) - \theta]^2 \mid \theta\}.$$
To find this c, consider $Y = \hat\theta(D)/\theta$ and note EY = 1. Then we need only find
c such that
$$E[(cY - 1)^2 \mid \theta]$$
is minimum. This occurs for $c_0 = EY/EY^2$ which is clearly not 1. Hence $\hat\theta(D)$
is sample theory inadmissible. Sample theory unbiasedness is not a viable criterion.
For large k, $\hat\theta(D) = T/k$ will be approximately the same as our Bayes
estimator. However, T/k is not recommended for small k.
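The inadmissibility argument can be illustrated by simulation. The sketch below rests on an assumption not stated in the text: that T has a gamma distribution with shape k and scale equal to the mean life (as when testing stops at the k-th failure). Under that assumption Y has second moment 1 + 1/k, so the minimizing constant is k/(k + 1) and the competing estimator is T/(k + 1). The function name, seed and inputs are illustrative.

```python
import random

def compare_mse(k, theta, reps, seed=0):
    """Monte Carlo mean squared errors of T/k and T/(k + 1).

    Assumption: T ~ Gamma(shape=k, scale=theta), so T/k is unbiased for theta.
    """
    rng = random.Random(seed)
    se_unbiased = 0.0
    se_shrunk = 0.0
    for _ in range(reps):
        total_time = rng.gammavariate(k, theta)
        se_unbiased += (total_time / k - theta) ** 2
        se_shrunk += (total_time / (k + 1) - theta) ** 2
    return se_unbiased / reps, se_shrunk / reps

# Theory under the gamma assumption: theta**2 / k versus theta**2 / (k + 1).
mse_unbiased, mse_shrunk = compare_mse(k=5, theta=2.0, reps=100_000)
```

With k = 5 the gap between the two mean squared errors is about 17 percent of the larger one; it shrinks as k grows, which matches the remark that T/k is close to the Bayes estimator for large k.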
Since tables of the chi-square distribution have in the past been more accessible
than tables of the gamma distribution, we have given the chi-square special
treatment. However with modern computing facilities, we really only need to use
the more general gamma distribution.
4.3. Confidence intervals
A $(1 - \alpha)100\%$ confidence interval in the sample theory sense is one such that
if the experiment is repeated infinitely often (and the interval recomputed each
time) then $(1 - \alpha)100\%$ of the time the interval will cover the fixed unknown true
parameter $\theta$. Since confidence intervals do not produce a probability distribution
on the parameter space for $\theta$, they cannot provide the basis for action in the
decision theory sense; i.e., a decision maker cannot use a sample theory confidence interval to compute an expected utility function which can then be
maximized over his set of possible decisions.
If for $\lambda e^{-\lambda x}$ we choose the improper prior, $\pi(\lambda) = 1/\lambda$, then the chi-square
$(1 - \alpha)100\%$ credible intervals and the sample theory $(1 - \alpha)100\%$ confidence
intervals agree. Unfortunately, such improper credible intervals can be shown to
violate certain rules of logical behavior. Lindley (personal communication)
provides the following simple illustration of this fact for the exponential model
$\lambda e^{-\lambda x}$. Suppose n units are put on test and we stop at the first failure, so that
$T = nX_{(1)}$. Now T given $\lambda$ also has density $\lambda e^{-\lambda x}$ so that (ln 2)/T is a 50%
improper upper credible limit on $\lambda$; i.e.,
$$P\left[\lambda \le \frac{\ln 2}{T} \,\Big|\, T\right] = \frac{1}{2}. \qquad (4.1)$$
Suppose now that T is observed and we accept the probability statement (4.1).
Consider the following hypothetical bet.
(i) If $\lambda < (\ln 2)/T$ we lose the amount $e^{-T}$;
(ii) If $\lambda \ge (\ln 2)/T$ we win $e^{-T}$.
We can pretend that the true $\lambda$ is somehow revealed and bets are paid off. If
we believe statement (4.1), then given T such a bet is certainly fair.
Now let us compute our expected gain before T is observed (preposterior
analysis). This is easily seen to be (conditional on $\lambda$)
$$-\int_0^{(\ln 2)/\lambda} \lambda e^{-\lambda t} e^{-t}\, dt + \int_{(\ln 2)/\lambda}^{\infty} \lambda e^{-\lambda t} e^{-t}\, dt = -\frac{\lambda\,[1 - 2^{-1/\lambda}]}{1 + \lambda},$$
which is negative for all $\lambda > 0$. Note that this is what we subjectively expect, since
as (improper) Bayesians, every probability (and presumably even an improper
prior) is subjective.
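The sign of the preposterior expected gain can be confirmed numerically. The sketch below is a check under the setup of the bet just described: T has density equal to the failure rate times exp(-rate * t), the stake is exp(-T), and the bet is lost exactly when T falls below (ln 2) divided by the failure rate. The function names, the midpoint rule, and the truncation point of the integral are illustrative choices, not part of the original argument.

```python
import math

def expected_gain_closed_form(lam):
    """Closed form: -lam * (1 - 2**(-1/lam)) / (1 + lam)."""
    return -lam * (1.0 - 2.0 ** (-1.0 / lam)) / (1.0 + lam)

def expected_gain_numeric(lam, steps=200_000, upper=60.0):
    """Midpoint-rule integral of the signed stake against the density of T.

    The bet is lost when lam < (ln 2)/T, i.e. when T < (ln 2)/lam.
    The integral is truncated at `upper`, where the integrand is negligible.
    """
    cut = math.log(2.0) / lam
    h = upper / steps
    total = 0.0
    for i in range(steps):
        t = (i + 0.5) * h
        stake = math.exp(-t)
        payoff = -stake if t < cut else stake
        total += payoff * lam * math.exp(-lam * t) * h
    return total

gains = {lam: (expected_gain_closed_form(lam), expected_gain_numeric(lam))
         for lam in (0.2, 1.0, 5.0)}
```

For a failure rate of 1 the closed form gives -1/4, and both columns of `gains` are negative at every rate tried, in line with the claim that the bet loses on average whatever the true rate.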
The contradiction lies in the observation that
1. conditional on $\lambda$ and prior to observing T, our expected winnings are negative for all $\lambda$;
2. conditional on T, our expected loss is zero (using the improper prior
$\pi(\lambda) = 1/\lambda$).
The source of the contradiction is that we have not measured our uncertainty
for all events by probability. For example, we have assigned the value $\infty$ to the
event $\lambda < \lambda_0$ for all $\lambda_0 > 0$; i.e., $\int_0^{\lambda_0} \pi(\lambda)\, d\lambda = \int_0^{\lambda_0} (1/\lambda)\, d\lambda = \infty$.
We can
prove that for any set of uncertainty statements that are not probabilistically
based (relative to proper distributions), a system of bets can be constructed which
will result in the certain loss of money. A bet consists of paying pz < z dollars
to participate with the understanding that if an event E occurs you win z dollars
and otherwise you win nothing.
4.4. Section 3
The Weibull distribution is one of several extreme value distributions. See
Barlow and Proschan (1975), Chapter 8, for a more advanced discussion of
extreme value distributions.
We would like to acknowledge Dennis Lindley for his perceptive comments and
criticisms of an earlier draft. Thanks are also due to Colleen Postmus and Mariko
Kubik for typing many versions previous to this one.
Barlow, R. E. and Proschan, F. (1975). Statistical Theory of Reliability and Life Testing. Holt, Rinehart
and Winston, New York.
Bickel, P. J. and Blackwell, D. (1967). A note on Bayes estimates, Ann. Math. Statist. 38, 1907-1911.
Box, G. E. P. and Tiao, G. C. (1973). Bayesian Inference in Statistical Analysis. Addison-Wesley,
Reading, MA.
De Groot, M. H. (1970). Optimal Statistical Decisions. McGraw-Hill, New York.
Diaconis, P. and Ylvisaker, D. (1979). Conjugate priors for exponential families. Ann. Statist. 7,
Edwards, W., Lindman, H., and Savage, L. J. (1963). Bayesian statistical inference for psychological
research. Psychological Rev. 70, 193-242.
Hoel, P. G., Port, S. C., and Stone, C. J. (1971). Introduction to Probability Theory. Houghton Mifflin,
Boston, MA.
Lindley, D. V. (1978). The Bayesian approach. Scandinavian J. Statist. 5, 1-26.
Pearson, E. S. and Hartley, H. O. (1958). Biometrika Tables for Statisticians. Vol. 1. The University
Press, Cambridge, England.
Raiffa, H. and Schlaifer, R. (1961). Applied Statistical Decision Theory. Harvard Business School,
Boston, MA.
Roberts, H. V. (1967). Informative stopping rules and inferences about population size. J. Amer.
Statist. Assoc. 62, 763-775.
Smirnov, N. V. (1952). Limit distributions for the terms of a variational series. Trans. Math. Soc.
Ser. 1, 1-64.
P. R. Krishnaiah and C. R. Rao, eds., Handbook of Statistics, Vol. 7
© Elsevier Science Publishers B.V. (1988) 251-280
Piecewise Geometric Estimation of a Survival Function
Gillian M. Mimmack and Frank Proschan
1. Introduction and summary
The problem of estimating survival probabilities from incomplete data is well
known in the fields of reliability, medicine, biometry and actuarial science. The
general situation is described as follows. The variable of interest is the lifespan
of some unit: the investigator wishes to estimate the probability of survival beyond
any given time. To this end, n identical units are placed 'on test'. Each item is
either observed until failure, resulting in an uncensored observation, or is removed
from the test before failure, resulting in a censored observation. Thus the data
available consist of a number of lifelengths and a number of truncated lifelengths:
the statistical problem is to estimate the probability distribution of the lifelengths.
The various statistical approaches to the problem can generally be classified
according to the restrictiveness of the model assumed and the type of information
utilized. At one extreme are purely parametric procedures, which involve assuming
that the underlying life distribution belongs to a specific parametric family.
These procedures utilize interval information. The Bayesian estimator described
by Susarla and Van Ryzin (1976) makes allowance for both parametric and
nonparametric models: the type of information utilized depends on the assumptions about the prior distribution. As our approach to the problem is neither
parametric nor Bayesian, we do not consider these procedures further but concentrate on nonparametric procedures.
Nonparametric procedures range in sophistication from the well-known actuarial estimator, which is a step function constructed from ordinal information
alone, to the piecewise polynomial estimators of Whittemore and Keller (1983)
that utilize interval information. The most widely used nonparametric estimators
are those of Kaplan and Meier (1958) and Nelson (1969). These estimators are
also step functions constructed from ordinal information. Their properties are
described by Efron (1967), Breslow and Crowley (1974), Petersen (1977), Aalen
(1976, 1978), Kitchin, Langberg and Proschan (1983), Nelson (1972), Fleming
and Harrington (1979), and Chen, Hollander and Langberg (1982).
* Research supported by the Air Force Office of Scientific Research, AFSC, USAF, under Grant
AFOSR 82-K-0007.
One of the by-products of the estimation process is an estimate of the failure
rate function: here, another issue is raised. It is evident that survival function
estimators that are step functions do not provide useful failure rate function
estimators: Miller (1981) mentions smoothing the Kaplan-Meier estimator for
this reason and summarizes the development of other survival function estimators
that may be obtained by considering a special case of the regression model of Cox
(1972). These estimators generally correspond to failure rate function estimators
that are step functions and utilize at most part (but not all) of the interval
information contained in the data. Whittemore and Keller (1983) give several
more refined failure rate function estimators that are step functions and utilize full
interval information. They also describe even more complex estimators that utilize
full interval information: however, these are not computationally convenient compared with their simpler estimators. It seems, from their work, that a successful
rival of the Kaplan-Meier estimator should be only marginally more complex than
it (so as to be computationally convenient and yet yield a useful failure rate
function estimator) and also should utilize more than ordinal information.
In Section 2, we propose an estimator that not only provides a reasonable
failure rate function estimator but also utilizes interval information. Moreover, it
is computationally simple. Our estimator is a discrete counterpart of two versions
of a continuous estimator proposed independently by Kitchin, Langberg and
Proschan (1983) and Whittemore and Keller (1983). The motivation for the
construction of our estimator is the same as that of the former authors, and our
model is the discrete version of theirs: in contrast, the latter authors assume the
more restrictive model of random censorship and obtain their estimator by the
method of maximum likelihood. This provides an alternative method of deriving
our estimator.
The remaining sections are concerned with properties of our estimator. As this
presentation is expository, proofs are omitted: Mimmack (1985) provides proofs.
In Section 3, we explore the asymptotic properties of our estimator under
increasingly restrictive models. Our estimator is strongly consistent and asymptotically normal under conditions more general than those typically assumed.
Section 4 deals with the relationships among our estimator, the Kaplan-Meier
estimator, and the above-mentioned estimator of Kitchin et al. and Whittemore
and Keller. The section ends with an example using real data.
In Section 5, we continue the comparison of the new estimator and the
Kaplan-Meier estimator: since the properties of the new estimator are expected
to resemble those of its continuous counterparts, we discuss the implications of
simulation studies designed to investigate the small sample behaviour of these
estimators. We also present the results of a Monte Carlo pilot study designed to
investigate the small sample properties of our estimator.
2. Preliminaries
In this section we formulate the problem in statistical terms and define our
estimator.
Let X denote the lifelength of a randomly chosen unit, where X has distribution
function G. Suppose that n identical items are placed on test. The resultant
sample consists of the pairs $(Z_1, \delta_1), \ldots, (Z_n, \delta_n)$, where $Z_i$ represents the time
for which unit i is observed and $\delta_i$ indicates whether unit i fails while under
observation or is removed from the test before failure. Symbolically, for
i = 1, ..., n, we have
$X_i$ = lifelength of unit i, where $X_i$ has distribution G,
$Y_i$ = time to censorship of unit i,
$Z_i = \min(X_i, Y_i)$,
$\delta_i = I(X_i \le Y_i)$.
$(X_1, Y_1), \ldots, (X_n, Y_n)$ are assumed to be independent random pairs. Elements
of a pair $X_i$ and $Y_i$, where i = 1, ..., n, are not assumed to be independent.
We assume that the lifelength and censoring random variables are discrete. Let
$\mathcal{X} = \{x_1, x_2, \ldots\}$ denote the set of possible values of X and $\mathcal{Y} = \{y_1, y_2, \ldots\}$
denote the union of the sets of possible values of $Y_1, Y_2, \ldots$, where $\mathcal{Y} \subseteq \mathcal{X}$. The
survival probabilities of interest are denoted $P(X > x_k)$, k = 1, 2, ..., where
$$P(X > x_k) = \bar G(x_k) = 1 - G(x_k), \quad k = 1, 2, \ldots.$$
It is evident that this formulation differs from that of the model of random
censorship which is generally assumed in the literature, and in particular, by
Whittemore and Keller (1983). These authors assume that the lifelength and
censoring random variables are continuous, that the corresponding pairs $X_i$ and
$Y_i$, where i = 1, 2, ..., are independent, and that the censoring random variables
are identically distributed. Although Kaplan and Meier (1958) assume only independence between corresponding lifelength and censoring random variables,
Breslow and Crowley (1974), Petersen (1977), Aalen (1976, 1978), and others--all
of whom describe the properties of the Kaplan-Meier estimator--assume also
that the censoring random variables are identically distributed. Our formulation
is the discrete counterpart of that of Kitchin, Langberg and Proschan (1983):
likewise, our estimator is the discrete counterpart of theirs.
Before describing our estimator, we give the notation required.
Let $n_1$ be the random number of distinct uncensored observations in the sample
and let $t_1 < t_2 < \cdots < t_{n_1}$ denote these distinct observed failure times, with $t_0 = 0$.
Let $n_2$ be the random number of distinct censored observations in the sample and
let $s_1 < s_2 < \cdots < s_{n_2}$ denote these times, with $s_0 = 0$.
Let $D_i$ be the number of failures observed at time $t_i$:
$$D_i = \sum_{j=1}^{n} I(Z_j = t_i, \delta_j = 1) \quad \text{for } i = 1, \ldots, n_1.$$
Let $C_i$ be the number of censored observations equal to $s_i$:
$$C_i = \sum_{j=1}^{n} I(Z_j = s_i, \delta_j = 0) \quad \text{for } i = 1, \ldots, n_2.$$
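Let $\bar F_n(t) = 1 - F_n(t)$ denote the proportion of observations that exceed t:
$$\bar F_n(t) = \frac{1}{n} \sum_{j=1}^{n} I(Z_j > t) \quad \text{for } t \in [0, \infty).$$
Let $F_n^{1}(t)$ denote the proportion of failures observed at or before t:
$$F_n^{1}(t) = \frac{1}{n} \sum_{j=1}^{n} I(Z_j \le t, \delta_j = 1) \quad \text{for } t \in [0, \infty).$$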
Let $T_i$ be a measure of the total time on test in the interval $(t_{i-1}, t_i]$:
$$T_i = \#\{m: t_{i-1} < x_m \le t_i\}\,(n\bar F_n(t_i) + D_i) + \sum_{k:\, t_{i-1} < s_k \le t_i} C_k\, \#\{m: t_{i-1} < x_m \le s_k\} \quad \text{for } i = 1, \ldots, n_1,$$
where #A denotes the cardinality of the set A.
(If failure and censoring random variables are lattice random variables, then $T_i$
is the total time on test in $(t_{i-1}, t_i]$. In general, however, $T_i$ increases by one unit
whenever an item on test survives an interval of the form $(x_{j-1}, x_j]$, where
$t_{i-1} \le x_{j-1} < x_j \le t_i$, irrespective of the distance between $x_{j-1}$ and $x_j$.)
We now construct our estimator. Expressing the survival function $\bar G$ in terms
of the failure rates $P(X = x_k \mid X \ge x_k)$, k = 1, 2, ..., we have
$$\bar G(x_k) = \prod_{j=1}^{k} [1 - P(X = x_j \mid X \ge x_j)] \quad \text{for } k = 1, 2, \ldots. \qquad (2.1)$$
It is evident from (2.1) that we may estimate our survival function at $x_k$ from
estimates of the failure rates at $x_1, x_2, \ldots, x_k$. In the experimental situation,
failures are not observed at all the times $x_1, x_2, \ldots$ so specific information about
the failure rates at many of the possible failure times is not available. Having
observed failures at $t_1, t_2, \ldots, t_{n_1}$, we find it simple to estimate the failure rates
$P(X = t_i \mid X \ge t_i)$, $i = 1, \ldots, n_1$. However, the question of how to estimate the
failure rates at the intervening possible failure times requires special consideration.
One approach--that of Kaplan and Meier (1958), Nelson (1969) and others--is
to estimate the failure rates at these intervening times as zero since no failures are
observed then. However, not observing failures at some possible failure times may
be a result of being in an experimental situation rather than evidence of very small
failure rates at these times, so we discard this approach and consider nonzero
estimates.
It is reasonable to assume that the underlying process possesses an element of
continuity in that adjacent failure rates do not differ radically from one another.
Thus we consider using the estimate of the failure rate at $t_i$ to estimate the failure
rate at each of the possible failure times between $t_{i-1}$ and $t_i$, where $i = 1, \ldots, n_1$.
We are therefore assuming that our approximating distribution has a constant
failure rate between the times at which failures are observed--that is,
$$\hat P(X = x_k \mid X \ge x_k) = q_i \quad \text{for } t_{i-1} < x_k \le t_i,\ i = 1, \ldots, n_1, \qquad (2.2)$$
where $q_i = \hat P(X = t_i \mid X \ge t_i)$ for $i = 1, \ldots, n_1$.
Substituting (2.2) into (2.1), we obtain
$$\hat P(X > x_k) = (1 - q_i)^{\#\{m:\, t_{i-1} < x_m \le x_k\}} \prod_{j=1}^{i-1} (1 - q_j)^{\#\{m:\, t_{j-1} < x_m \le t_j\}} \qquad (2.3)$$
for $t_{i-1} < x_k \le t_i$, $i = 1, \ldots, n_1$.
We note that the property of having constant failure rate on $\mathcal{X}$ characterizes
a family of geometric distributions defined on $\mathcal{X}$. In particular, the failure rates
$q_1, \ldots, q_{n_1}$ identify $n_1$ geometric distributions $G_1, \ldots, G_{n_1}$ defined on $\mathcal{X}$. The
survival functions, $\bar G_1, \ldots, \bar G_{n_1}$, have the geometric form
$$\bar G_i(x_k) = (1 - q_i)^k \quad \text{for } k = 1, 2, \ldots \text{ and } i = 1, \ldots, n_1. \qquad (2.4)$$
Inspection of (2.3) and (2.4) reveals that our estimating function is constructed
from the geometric survival functions $\bar G_1, \ldots, \bar G_{n_1}$, where $\bar G_i$ is used in the interval $(t_{i-1}, t_i]$, $i = 1, \ldots, n_1$. Consequently, the estimator (2.3) is called the Piecewise Geometric Estimator (PEGE).
It remains to define estimators of the failure rates $q_1, \ldots, q_{n_1}$. This was originally done by separately obtaining the maximum likelihood estimators of the
parameters of $n_1$ truncated geometric distributions: the procedure is outlined at
a later stage because it utilizes the geometric structure of (2.3) and therefore
provides further motivation for the name 'PEGE'. A more straightforward but less
appealing approach is to obtain the maximum likelihood estimates of $q_1, \ldots, q_{n_1}$
directly: denoting by L the likelihood of the sample, we have
Substituting (2.3) into this expression and differentiating yields the unique maximum likelihood estimates
$$\hat q_i = D_i / T_i, \quad i = 1, \ldots, n_1.$$
Substituting $\hat q_1, \ldots, \hat q_{n_1}$ into (2.3), we finally obtain our estimator, formally defined
as follows.
DEFINITION 2.1. The Piecewise Geometric Estimator (PEGE) of the survival
function of the lifelength random variable X is defined as follows:
$$\hat P(X > x_k) = \begin{cases} 1 & \text{for } x_k \le t_0 \text{ or } n_1 = 0, \\ (1 - D_i/T_i)^{\#\{m:\, t_{i-1} < x_m \le x_k\}} \prod_{j=1}^{i-1} (1 - D_j/T_j)^{\#\{m:\, t_{j-1} < x_m \le t_j\}} & \text{for } t_{i-1} < x_k \le t_i,\ i = 1, \ldots, n_1,\ n_1 > 0, \\ \prod_{j=1}^{n_1} (1 - D_j/T_j)^{\#\{m:\, t_{j-1} < x_m \le t_j\}} & \text{for } x_k > t_{n_1},\ n_1 > 0. \end{cases}$$
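The definition translates into a short computation. The sketch below is an illustration, not the authors' code: it assumes the support of possible times is supplied as a sorted list of positive numbers, takes the total-time-on-test measure to count one unit per possible failure time survived within each interval (units failing at or surviving past the interval's endpoint contribute the full count, units censored inside it contribute the count up to their censoring time), and keeps the estimate constant beyond the last observed failure. The function name and the data are hypothetical.

```python
def pege(support, z, delta):
    """Piecewise geometric estimate of P(X > x) at each point of `support`.

    support  : sorted possible failure/censoring times (assumed positive)
    z, delta : observed times and indicators (1 = failure, 0 = censored)
    """
    fail_times = sorted({t for t, d in zip(z, delta) if d == 1})
    if not fail_times:
        return {x: 1.0 for x in support}

    def count(lo, hi):
        # number of possible times x_m with lo < x_m <= hi
        return sum(1 for x in support if lo < x <= hi)

    pieces, prev = [], 0.0                      # prev plays the role of t_{i-1}
    for t in fail_times:
        n_i = count(prev, t)
        d_i = sum(1 for zj, dj in zip(z, delta) if dj == 1 and zj == t)
        beyond = sum(1 for zj in z if zj > t)   # units observed past t_i
        total = n_i * (beyond + d_i)
        total += sum(count(prev, zj) for zj, dj in zip(z, delta)
                     if dj == 0 and prev < zj <= t)
        pieces.append((prev, t, n_i, d_i / total))
        prev = t

    estimate = {}
    for x in support:
        s = 1.0
        for lo, hi, n_i, q in pieces:
            if x > hi:
                s *= (1.0 - q) ** n_i           # whole interval survived
            elif x > lo:
                s *= (1.0 - q) ** count(lo, x)  # partial interval
        estimate[x] = s
    return estimate

# Hypothetical sample on the support 1, 2, ..., 10.
support = list(range(1, 11))
z = [2, 3, 3, 5, 6, 8, 8, 10]
delta = [1, 0, 1, 1, 0, 1, 0, 0]
estimate = pege(support, z, delta)
```

In this example the first interval contains two possible times, one failure and seven later observations, so the first failure rate estimate is 1/16; unlike a step-function estimator, the estimate already drops at the possible time 1, where no failure was observed.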
The alternative derivation of the PEGE emphasizes its geometric structure: it
turns out that $\hat q_1, \ldots, \hat q_{n_1}$ defined above are maximum likelihood estimators of the
parameters of the truncated geometric distributions $G_1^{*}, \ldots, G_{n_1}^{*}$ defined below.
For $i = 1, \ldots, n_1$ we formulate the following definitions:
Let $N_i = \#\{m: t_{i-1} < x_m \le t_i\}$ be the number of possible times of failure in the
interval $(t_{i-1}, t_i]$ and let $X_i^{*}$ be the number of possible times of failure that a unit
of age $t_{i-1}$ survives--that is,
$X_i^{*}$ = number of trials to failure of a unit of age $t_{i-1}$,
where the possible values of $X_i^{*}$ are assumed to be $1, 2, \ldots, N_i, N_i^{+}$. The distribution $G_i^{*}$ of $X_i^{*}$ is then given by
$$\bar G_i^{*}(k) = (1 - q_i)^k \quad \text{for } k = 1, 2, \ldots, N_i,$$
$$\bar G_i^{*}(N_i^{+}) = 0.$$
The information available for estimating $q_i$ consists of $n\bar F_n(t_{i-1})$ observations
on $X_i^{*}$: of these, $D_i$ are equal to $N_i$, $n\bar F_n(t_i)$ are equal to $N_i^{+}$, and for all $s_j$ in the
interval $(t_{i-1}, t_i]$, $C_j$ are equal to the number $\#\{m: t_{i-1} < x_m \le s_j\}$. The resultant
maximum likelihood estimator of $q_i$ is precisely $\hat q_i$ defined above.
It is evident that the estimators $\hat q_1, \ldots, \hat q_{n_1}$ have the form of the usual maximum
likelihood estimator of a geometric parameter--that is,
Estimated failure rate = (number of failures observed) / (total time on test).
Moreover, we note that this is the form of the failure rate estimators in the
intervals $(t_0, t_1], \ldots, (t_{n_1}, \infty)$ defined for the Piecewise Exponential Estimator
(PEXE) of Kitchin, Langberg and Proschan (1983). In terms of our notation
(modified for continuity), the PEXE is defined as follows:
$$P^{*}(X > t) = \begin{cases} 1 & \text{for } t < 0 \text{ or } n_1 = 0, \\ \exp[-(t - t_{i-1})\hat\lambda_i] \prod_{j=1}^{i-1} \exp[-(t_j - t_{j-1})\hat\lambda_j] & \text{for } t_{i-1} < t \le t_i,\ i = 1, \ldots, n_1,\ n_1 > 0, \\ \prod_{j=1}^{n_1} \exp[-(t_j - t_{j-1})\hat\lambda_j] & \text{for } t > t_{n_1}, \end{cases}$$
where
$$\hat\lambda_i = 1/\tau_i \quad \text{for } i = 1, \ldots, n_1,$$
$$\tau_i = \int_{t_{i-1}}^{t_i} n\bar F_n(u)\, du \quad \text{for } i = 1, \ldots, n_1.$$
For $i = 1, \ldots, n_1$, $\hat\lambda_i$ is the failure rate in the interval $(t_{i-1}, t_i]$ and $\tau_i$ is the total
time on test in this interval.
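For comparison, the continuous counterpart can be sketched in the same style. The code below is an illustration under assumptions: failure times are distinct (no ties), the total time on test in an interval is computed as the sum over units of the time each spends under observation within that interval, and the estimate stays constant after the last observed failure. The function name and data are hypothetical.

```python
import math

def pexe(z, delta, t):
    """Piecewise exponential estimate of P(X > t), assuming distinct failure times.

    z, delta : observed times and indicators (1 = failure, 0 = censored)
    """
    fail_times = sorted(zj for zj, dj in zip(z, delta) if dj == 1)
    log_s, prev = 0.0, 0.0                 # prev plays the role of t_{i-1}
    for ti in fail_times:
        if t <= prev:
            break
        # total time on test in (prev, ti]: exposure of every unit in that interval
        tau = sum(max(0.0, min(zj, ti) - prev) for zj in z)
        log_s -= (min(t, ti) - prev) / tau  # failure rate 1/tau on this interval
        prev = ti
    return math.exp(log_s)

# Hypothetical sample with failures at ages 1, 3 and 4.
z = [1.0, 2.5, 3.0, 4.0, 6.0]
delta = [1, 0, 1, 1, 0]
survival_at_3 = pexe(z, delta, 3.0)
```

Here the total times on test in the three intervals are 5, 7.5 and 2, so the estimated failure rates are 0.2, 1/7.5 and 0.5; the unit censored at age 2.5 contributes 1.5 to the second interval, which is the interval information that a step-function estimator does not use.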
The PEXE is a piecewise exponential function because its construction is based on the assumption of constant failure rate between observed failures: just as a constant discrete failure rate characterizes a geometric distribution so a constant continuous failure rate characterizes an exponential distribution. Thus the PEGE is the discrete counterpart of the PEXE.
Returning to our introductory discussion about the desirable features of survival function estimators, we now compare the PEGE with other estimators in terms of these and other features.
First, the PEGE is intuitively pleasing because it reflects the continuity inherent in any life process. The Kaplan-Meier and other estimators that are step functions do not have this property.
Second, we note that the PEGE utilizes interval information from both censored and uncensored observations. It is therefore more sophisticated than the Kaplan-Meier and Nelson estimators. Moreover, none of the estimators of Whittemore and Keller utilizes more information than does the PEGE.
Third, the PEGE provides a simple, useful estimator of the failure rate function. While this estimator is naive compared with the nonlinear estimators of Whittemore and Keller, the PEGE has the advantage of being simple enough to calculate by hand; moreover, it requires only marginally more computational effort than does the Kaplan-Meier estimator.
Regarding the applicability of the PEGE, we note that use of the PEGE is not restricted to discrete distributions because it can be easily modified by linear interpolation or by being defined as continuous wherever necessary. This is theoretically justified by the fact that the integer part of an exponential random variable has a geometric distribution: by defining the PEGE to be continuous, we
G. M. Mimmack and F. Proschan
are merely defining a variant of the PEXE. The properties of this estimator follow
immediately from those of the PEXE.
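The fact invoked here, that the integer part of an exponential random variable is geometric, can be verified directly. In the sketch below the rate $\lambda$ and the parameter $q$ are symbols introduced only for this illustration. If $X$ is exponential with rate $\lambda > 0$, then for $k = 0, 1, 2, \ldots$

$$P(\lfloor X \rfloor = k) = P(k \le X < k+1) = e^{-\lambda k} - e^{-\lambda (k+1)} = (1-q)^k q, \qquad q = 1 - e^{-\lambda},$$

so a constant continuous failure rate $\lambda$ corresponds to a constant discrete failure rate $q = 1 - e^{-\lambda}$.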
Finally, apart from being intuitively pleasing, the form of the PEGE allows reasonable estimates of both the survival function and its percentiles. The Kaplan-Meier estimator is known to overestimate because of its step function form. We show in a later section that the PEGE tends to be less than the Kaplan-Meier estimator, and therefore the PEGE may be more accurate than the Kaplan-Meier estimator. Whittemore and Keller give some favourable indications in this respect. They define three survival function estimators that have constant failure rate between observed failure times. One of these is the PEXE, modified for ties in the data: the form of the failure rate estimator is the same as the form of the PEGE failure rate estimator. Specifically, for $i = 1, \ldots, n_1$,
$$\text{Estimated failure rate in } (t_{i-1}, t_i] = \frac{\text{number of failures observed at } t_i}{\text{total time on test during } (t_{i-1}, t_i]}. \tag{2.6}$$
The second of these estimators is defined instead on intervals of the form $[t_{i-1}, t_i)$: for $i = 1, \ldots, n_1$, the failure rate estimator has the form
$$\text{Estimated failure rate in } [t_{i-1}, t_i) = \frac{\text{number of failures observed at } t_{i-1}}{\text{total time on test during } [t_{i-1}, t_i)}. \tag{2.7}$$
The third of these estimators is obtained from the average of the two failure rate estimators described by (2.6) and (2.7).
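The three failure rate estimators just described can be compared side by side with a short Python sketch. The function name `failure_rate_estimates` is hypothetical, time is assumed to be measured on a continuous scale (so the total time on test is the same for $(t_{i-1}, t_i]$ and $[t_{i-1}, t_i)$), and no failure is assumed to occur at $t_0 = 0$.

```python
def failure_rate_estimates(times, events):
    """Return (t_{i-1}, t_i, rate_left, rate_right, rate_avg) for each interval.

    rate_left  -- failures at t_i     / total time on test in the interval
    rate_right -- failures at t_{i-1} / total time on test in the interval
    rate_avg   -- average of the two
    """
    fail_times = sorted({z for z, d in zip(times, events) if d == 1})
    knots = [0.0] + fail_times
    # deaths[0] = 0: assumption that no failure is observed at t_0 = 0
    deaths = [0] + [sum(1 for z, d in zip(times, events) if d == 1 and z == t)
                    for t in fail_times]
    rows = []
    for i in range(1, len(knots)):
        a, b = knots[i - 1], knots[i]
        ttt = sum(max(0.0, min(z, b) - a) for z in times)   # total time on test
        left = deaths[i] / ttt
        right = deaths[i - 1] / ttt
        rows.append((a, b, left, right, 0.5 * (left + right)))
    return rows

# Hypothetical data: three uncensored failures at times 1, 2 and 3.
for row in failure_rate_estimates([1, 2, 3], [1, 1, 1]):
    print(row)
```

In the first interval the second estimator is zero because it attributes to each interval the failures at its left endpoint, which illustrates why it yields a larger survival estimate than the first.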
In a simulation study to investigate the small sample properties of these three estimators, Whittemore and Keller find that the first estimator tends to underestimate the survival function while the second tends to overestimate the survival function. From these results, we expect the PEGE to underestimate the survival function and its percentiles. Whittemore and Keller do not record further results for the first two estimators: however, they do indicate that, in terms of bias at extreme percentiles, variance and mean square error, the third estimator tends to be better than the Kaplan-Meier estimator.
The implications for the discrete version of the third estimator are that, in terms of bias, variance and mean square error, it will compare favourably with the Kaplan-Meier estimator. An unanswered question is whether the performance of this estimator is so superior to the performance of the PEGE as to warrant the additional computational effort required for the former.
3. Asymptotic properties of the PEGE
This section treats the asymptotic properties of the PEGE and of the corresponding failure rate function estimator. The properties of primary interest are those of consistency and asymptotic normality: secondary issues are asymptotic bias and asymptotic correlation.
Initially considering a very general model, we obtain the limiting function of the PEGE and show that the sequences $\{\hat P_n(X > x_k)\}_{k=1}^{\infty}$ and $\{\hat P_n(X = x_k \mid X \ge x_k)\}_{k=1}^{\infty}$ converge in distribution to Gaussian sequences. We then explore the effects of making various assumptions about the lifelength and censoring random
variables. Under the most general model, the PEGE is not consistent and the
failure rate estimators are not asymptotically uncorrelated: a sufficient condition
for consistency is independence between corresponding lifelength and censoring
random variables, and a sufficient condition for asymptotically independent failure
rate estimators is that the censoring random variables be identically distributed.
However, it is not necessary to impose both of these conditions in order to ensure
both consistency and asymptotic independence of the failure rate estimators:
relaxing the condition of independent lifelength and censoring random variables,
we give conditions under which both desirable properties are obtained.
Before investigating the asymptotic properties of the PEGE, we describe the theoretical framework of the problem, give some notation, and present a preliminary result that facilitates the exploration of the asymptotic properties of the PEGE.
The probability space $(\Omega, \mathcal{F}, P)$ on which all of the lifelength and censoring random variables are defined is envisaged as the infinite product probability space that may be constructed in the usual way from the sequence of probability spaces corresponding to the sequence of independent random pairs $(X_1, Y_1), (X_2, Y_2), \ldots$. Thus $\Omega$ consists of all possible sequences of pairs of outcomes corresponding to pairs of realizations in $\mathcal{X} \times \mathcal{Y}$: the first member of each pair corresponds to failure at a particular time and the second member of each pair corresponds to censorship at a particular time. That is, for each $\omega$ in $\Omega$, $k = 1, 2, \ldots$ and $j = 1, 2, \ldots$,
$$(X_i, Y_i)(\omega) = (X_i(\omega), Y_i(\omega)) = (x_k, y_j)$$
if the $i$th element of the infinite sequence $\omega$ is the pair of outcomes corresponding to failure at $x_k$ and censorship at $y_j$.
The argument $\omega$ is omitted wherever possible.
Two conditions are imposed on the random pairs $(X_1, Y_1), (X_2, Y_2), \ldots$:
(A1) There is a distribution function $F$ such that
$$\lim_{n\to\infty} \frac{1}{n} \sum_{i=1}^{n} P(Z_i \le x_k) = F(x_k) \quad \text{for } k = 1, 2, \ldots.$$
(A2) There is a subdistribution function $F^1$ such that
$$\lim_{n\to\infty} \frac{1}{n} \sum_{i=1}^{n} P(Z_i \le x_k,\ \delta_i = 1) = F^1(x_k) \quad \text{for } k = 1, 2, \ldots.$$
It is evident that a sufficient condition for (A1) and (A2) is that the censoring
random variables be identically distributed.
Definitions of symbols used in this section are given below. Assumptions (A1) and (A2) ensure the existence of the limits defined. Let
$$P_{ki} = P(Z_i = x_k,\ \delta_i = 1) \quad \text{for } k = 1, 2, \ldots \text{ and } i = 1, \ldots, n,$$
$$R_{ki} = P(Z_i = x_k,\ \delta_i = 0) \quad \text{for } k = 1, 2, \ldots \text{ and } i = 1, \ldots, n,$$
$$P_k = \lim_{n\to\infty} \frac{1}{n} \sum_{i=1}^{n} P_{ki} = F^1(x_k) - F^1(x_{k-1}) \quad \text{for } k = 1, 2, \ldots,$$
$$R_k = \lim_{n\to\infty} \frac{1}{n} \sum_{i=1}^{n} R_{ki} \quad \text{for } k = 1, 2, \ldots.$$
The proposition below is fundamental: it asserts that, with probability one, as
the sample size increases to infinity, at least one failure is observed at every
possible value of the lifelength random variable. First, we need a definition.
DEFINITION 3.1. Let $\Omega^* \subset \Omega$ be the set of infinite sequences which contain, for each possible failure time, at least one element corresponding to the outcome of observing failure at that time, that is,
$$\Omega^* = \{\omega : (\forall k)(\exists n)\ X_n(\omega) = x_k,\ Y_n(\omega) \ge x_k\}.$$
PROPOSITION 3.2. $P(\Omega^*) = 1$.
The proposition is proven by showing that the set of infinite sequences that do not contain at least one element corresponding to the outcome of observing failure at each possible failure time $x_k$ has probability zero, that is,
$$P\Big(\lim_{n\to\infty} \bigcap_{i=1}^{n} \{X_i = x_k,\ Y_i \ge x_k\}^c\Big) = 0 \quad \text{for } k = 1, 2, \ldots.$$
As the pairs $(X_1, Y_1), (X_2, Y_2), \ldots$ are independent, this is equivalent to proving the following equality:
$$\lim_{n\to\infty} \prod_{i=1}^{n} \big(1 - P(X_i = x_k,\ Y_i \ge x_k)\big) = 0 \quad \text{for } k = 1, 2, \ldots. \tag{3.1}$$
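Since $\prod_{i=1}^{\infty} (1 - p_i) = 0$ if and only if $\sum_{i=1}^{\infty} p_i = \infty$, where $\{p_i\}_{i=1}^{\infty}$ is any sequence of probabilities, and since (A2) implies that
$$\sum_{i=1}^{\infty} P(X_i = x_k,\ Y_i \ge x_k) = \infty \quad \text{for } k = 1, 2, \ldots,$$
we have (3.1).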
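Only one direction of the product criterion is needed in this proof. A short sketch of that direction, using the elementary inequality $1 - p \le e^{-p}$ for $0 \le p \le 1$ and writing $p_i = P(X_i = x_k,\ Y_i \ge x_k)$ (shorthand introduced here for illustration):

$$0 \le \prod_{i=1}^{n} (1 - p_i) \le \exp\Big(-\sum_{i=1}^{n} p_i\Big) \longrightarrow 0 \quad \text{as } n \to \infty \text{ whenever } \sum_{i=1}^{\infty} p_i = \infty.$$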
The importance of the preceding proposition lies in the simplifications it allows. It turns out that, on $\Omega^*$ and for $n$ large enough, the PEGE may be expressed in simple terms of functions that have well-known convergence properties. Since $P(\Omega^*) = 1$, we need consider the asymptotic properties of the PEGE on $\Omega^*$ alone: these properties are easily obtained from those of the well-known functions. In order to express the PEGE in this convenient way, we view the estimation procedure in an asymptotic context.
Suppose $\omega$ is chosen arbitrarily from $\Omega^*$. Then, for each $k$, there is an $N$ (depending on $k$ and $\omega$) such that $X_i(\omega) = x_j$ and $Y_i(\omega) \ge x_j$ for $j = 1, \ldots, k$ and some $i \le N$. Consequently, for $n \ge N$, the smallest $k$ distinct observed failure times $t_1, \ldots, t_k$ are merely $x_1, \ldots, x_k$, and, since the set of possible censoring times is contained in $\mathcal{X}$, the smallest $k$ distinct observed times are also $x_1, \ldots, x_k$. The first $k$ intervals between observed failure times are simply $(0, x_1], (x_1, x_2], \ldots, (x_{k-1}, x_k]$, and the function $T_{i,n}$ defined on the $i$th interval is given by the number of units on test just before the end of the $i$th interval, that is,
$$T_{i,n} = n\bar F_n(x_i^-) = n\bar F_n(x_{i-1}) \quad \text{for } i = 1, \ldots, k \text{ and } n \ge N.$$
Likewise, we express the function $D_{i,n}$ defined on the $i$th interval in terms of the empirical subdistribution function $F_n^1$ as follows:
$$D_{i,n} = n[F_n^1(x_i) - F_n^1(x_{i-1})] \quad \text{for } i = 1, \ldots, k \text{ and } n \ge N.$$
As the PEGE is a function of $D_{i,n}$ and $T_{i,n}$, it can be expressed in terms of the empirical functions $\bar F_n$ and $F_n^1$. Specifically, on $\Omega^*$, for any choice of $k$, there is an $N$ such that
$$\hat P_n(X > x_k) = \prod_{i=1}^{k} \Big(1 - \frac{F_n^1(x_i) - F_n^1(x_{i-1})}{\bar F_n(x_{i-1})}\Big) \quad \text{for } n \ge N.$$
Consequently, taking the limit of each side and using Proposition 3.2, we have
$$P\Big[\lim_{n\to\infty} \hat P_n(X > x_k) = \prod_{i=1}^{k} \Big(1 - \frac{F^1(x_i) - F^1(x_{i-1})}{\bar F(x_{i-1})}\Big) \text{ for } k = 1, 2, \ldots\Big] = 1.$$
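In exploring the asymptotic behaviour of the PEGE, therefore, we consider the behaviour of the limiting sequence of the sequence
$$\Big\{\prod_{i=1}^{k} \Big(1 - \frac{F_n^1(x_i) - F_n^1(x_{i-1})}{\bar F_n(x_{i-1})}\Big)\Big\}_{k=1}^{\infty}.$$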
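The proofs of the results that follow are omitted in the interest of brevity. The most general model we consider is that in which only conditions (A1) and (A2) are imposed. The following theorem identifies the limits of the sequences $\{\hat P_n(X = x_k \mid X \ge x_k)\}_{n=1}^{\infty}$ and $\{\hat P_n(X > x_k)\}_{n=1}^{\infty}$ for $k = 1, 2, \ldots$ and establishes that the sequences $\{\hat P_n(X = x_k \mid X \ge x_k)\}_{k=1}^{\infty}$ and $\{\hat P_n(X > x_k)\}_{k=1}^{\infty}$ converge to Gaussian sequences.
THEOREM 3.3. (i) With probability 1,
$$\lim_{n\to\infty} \hat P_n(X = x_k \mid X \ge x_k) = \frac{F^1(x_k) - F^1(x_{k-1})}{\bar F(x_{k-1})} \quad \text{for } k = 1, 2, \ldots.$$
(ii) With probability 1,
$$\lim_{n\to\infty} \hat P_n(X > x_k) = \prod_{i=1}^{k} \Big(1 - \frac{F^1(x_i) - F^1(x_{i-1})}{\bar F(x_{i-1})}\Big) \quad \text{for } k = 1, 2, \ldots.$$
(iii) Let $k_1, \ldots, k_M$ be $M$ integers such that $k_1 < k_2 < \cdots < k_M$. Then $(\hat P_n(X = x_{k_1} \mid X \ge x_{k_1}), \ldots, \hat P_n(X = x_{k_M} \mid X \ge x_{k_M}))$ is AN$(\mu^*, \tfrac{1}{n}\Sigma^*)$, where
$$\mu^* = \big(P_{k_1}/\bar F(x_{k_1 - 1}), \ldots, P_{k_M}/\bar F(x_{k_M - 1})\big),$$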
~q~ = PkqPkr 2
P,~,JF(xkM- ,)) ,
2 (~kinkj ~- ~kM+ki, kj ~[- aki, kM+kj -~ ~kM+ki, kM+k j)
/(~(x,~q_ ~ ) ~ ( ~ , _
"]- Pkr 2 (ffkM + kq,ki q- ~kM + kq, kM + ki)/((F(xkr-1 ))2F(xkq 1))
+ Pk~ ~ (ok,~ + a~.+,~.,D/(~(xk ,)(?(xk ,))2)
+ akM + k,. kr/(ff(Xkq - 1))F(Xkr-,)
for q < r.
lim 1 ~ P ~ , , ( 1 n~:x~ n
-lim 1 ~
for q = r, q = 1. . . . , M ,
1. . . . .
i= 1
r= 1,...,M,
- l i m 1 ~ Pkq_M. iRkr. i
forq =M+
1, . . . , 2M,
i= 1
r = 1, . . . , M ,
lim 1 ~ R~q_M,i(l_ Rk~_M,;) f o r q = r , q = M +
i= 1
.... 2M,
- lim 1 ~
n i= 1
2M, r = M +
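(iv) Let $k_1, \ldots, k_M$ be $M$ integers such that $k_1 < k_2 < \cdots < k_M$. Then $(\hat P_n(X > x_{k_1}), \ldots, \hat P_n(X > x_{k_M}))$ is AN$(\mu^{**}, \tfrac{1}{n}\Sigma^{**})$, where
$$\mu^{**} = \Big(\prod_{i=1}^{k_1} \big(1 - P_i/\bar F(x_{i-1})\big), \ldots, \prod_{i=1}^{k_M} \big(1 - P_i/\bar F(x_{i-1})\big)\Big),$$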
Cry** ---- 1--[ (1 -- P / i f ( x , _ 1)) ~ I
-- e j / ~ ' ( x j _
' Z
l-I (1 - P / i f ( x , _ 1)) ,
].tJqr f q = l
1. . . . .
p** =
1. . . . .
-- e l / f f ( X l _
1)(1 - Pro~if(x,,,_
for q <~ r.
It is evident from the theorem above that the PEGE is a strongly consistent estimator of the underlying survival function if and only if
$$\frac{F^1(x_k) - F^1(x_{k-1})}{\bar F(x_{k-1})} = \frac{P(X = x_k)}{P(X \ge x_k)} \quad \text{for } k = 1, 2, \ldots.$$
The theorems below give conditions under which this equality holds. As for correlation, it is evident from the structure of the PEGE that any two elements of the sequence $\{\hat P_n(X > x_k)\}_{k=1}^{\infty}$ are correlated. Consequently the matrix $\Sigma^{**}$
cannot be reduced to a diagonal matrix under even the most stringent conditions. However, it turns out that, under certain conditions, the asymptotic correlation between pairs of the sequence $\{\hat P_n(X = x_k \mid X \ge x_k)\}_{k=1}^{\infty}$ is zero, that is, $\Sigma^*$ is a diagonal matrix.
The following theorem shows that independence between lifelength and censoring random variables results in strongly consistent (and therefore asymptotically unbiased) estimators. However, the sequence $\{\hat P_n(X = x_k \mid X \ge x_k)\}_{k=1}^{\infty}$ is asymptotically correlated in this case. Since the matrices $\Sigma^*$ and $\Sigma^{**}$ have the same form as in the theorem above, they are not explicitly defined below.
THEOREM 3.4. Suppose (i) the random variables $X_i$ and $Y_i$ are independent for $i = 1, 2, \ldots$, and
(ii) there is a distribution function $H$ such that
$$\lim_{n\to\infty} \frac{1}{n} \sum_{i=1}^{n} P(Y_i \le x_k) = H(x_k) \quad \text{for } k = 1, 2, \ldots.$$
Then
(iii) $F^1(x_k) = \sum_{i=1}^{k} P(X = x_i)\bar H(x_{i-1})$ and $\bar F(x_k) = P(X > x_k)\bar H(x_k)$ for $k = 1, 2, \ldots$;
(iv) with probability 1,
$$\lim_{n\to\infty} \hat P_n(X > x_k) = P(X > x_k) \quad \text{for } k = 1, 2, \ldots;$$
(v) $(\hat P_n(X = x_{k_1} \mid X \ge x_{k_1}), \ldots, \hat P_n(X = x_{k_M} \mid X \ge x_{k_M}))$ is AN$(\mu^*, \tfrac{1}{n}\Sigma^*)$, where $k_1 < k_2 < \cdots < k_M$ are arbitrarily chosen integers and
$$\mu^* = \big(P(X = x_{k_1} \mid X \ge x_{k_1}), \ldots, P(X = x_{k_M} \mid X \ge x_{k_M})\big);$$
(vi) $(\hat P_n(X > x_{k_1}), \ldots, \hat P_n(X > x_{k_M}))$ is AN$(\mu^{**}, \tfrac{1}{n}\Sigma^{**})$, where $k_1 < k_2 < \cdots < k_M$ are arbitrarily chosen integers and
$$\mu^{**} = \big(P(X > x_{k_1}), \ldots, P(X > x_{k_M})\big).$$
A sufficient condition for (A1), (A2) and assumption (ii) of the preceding theorem is that the censoring random variables be identically distributed. In this case the failure rate estimators are asymptotically independent and the matrix $\Sigma^{**}$
is somewhat simplified: The conditions of the following corollary define the model
of random censorship widely assumed in the literature.
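COROLLARY 3.5. Suppose (i) the random variables $X_i$ and $Y_i$ are independent for $i = 1, 2, \ldots$, and
(ii) the random variables $Y_1, Y_2, \ldots$ are identically distributed. Then
(iii) with probability 1,
$$\lim_{n\to\infty} \hat P_n(X > x_k) = \bar G(x_k) \quad \text{for } k = 1, 2, \ldots;$$
(iv) $(\hat P_n(X = x_{k_1} \mid X \ge x_{k_1}), \ldots, \hat P_n(X = x_{k_M} \mid X \ge x_{k_M}))$ is AN$(\mu^*, \tfrac{1}{n}\Sigma^*)$, where
$$\mu^* = \big(P(X = x_{k_1} \mid X \ge x_{k_1}), \ldots, P(X = x_{k_M} \mid X \ge x_{k_M})\big), \qquad \Sigma^* = \{\sigma^*_{qr}\}_{q = 1, \ldots, M;\ r = 1, \ldots, M},$$
$$\sigma^*_{qr} = \begin{cases} P(X = x_{k_q} \mid X \ge x_{k_q})\, P(X > x_{k_q} \mid X \ge x_{k_q}) / \bar F(x_{k_q - 1}) & \text{for } q = r, \\ 0 & \text{for } q \ne r; \end{cases}$$
(v) $(\hat P_n(X > x_{k_1}), \ldots, \hat P_n(X > x_{k_M}))$ is AN$(\mu^{**}, \tfrac{1}{n}\Sigma^{**})$, where
$$\mu^{**} = \big(P(X > x_{k_1}), \ldots, P(X > x_{k_M})\big), \qquad \Sigma^{**} = \{\sigma^{**}_{qr}\}_{q = 1, \ldots, M;\ r = 1, \ldots, M},$$
$$\sigma^{**}_{qr} = P(X > x_{k_q})\, P(X > x_{k_r}) \sum_{i=1}^{k_q} \frac{P(X = x_i \mid X \ge x_i)}{\bar F(x_{i-1})\, P(X > x_i \mid X \ge x_i)} \quad \text{for } q \le r.$$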
Having dealt with the most restrictive case in which the lifelength and censoring
random variables are assumed to be independent, we now consider relaxing this
condition. It turns out that independence between corresponding lifelength and
censoring random variables is not necessary for asymptotic independence between
pairs of the sequence of failure rate estimators: if the censoring random variables
are assumed to be identically distributed but not necessarily independent of the
corresponding lifelength random variables, then the failure rate estimators are
asymptotically independent. However, both the survival function and failure rate estimators are asymptotically biased. The following corollary expresses these facts.
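COROLLARY 3.6. Suppose (i) the random variables $Y_1, Y_2, \ldots$ are identically distributed. Then
(ii) $P_k = P(Z = x_k,\ \delta = 1)$ and $\bar F(x_k) = P(Z > x_k)$ for $k = 1, 2, \ldots$;
(iii) $(\hat P_n(X = x_{k_1} \mid X \ge x_{k_1}), \ldots, \hat P_n(X = x_{k_M} \mid X \ge x_{k_M}))$ is AN$(\mu^*, \tfrac{1}{n}\Sigma^*)$, where
$$\mu^* = \big(P_{k_1}/\bar F(x_{k_1 - 1}), \ldots, P_{k_M}/\bar F(x_{k_M - 1})\big), \qquad \Sigma^* = \{\sigma^*_{ij}\}_{i = 1, \ldots, M;\ j = 1, \ldots, M},$$
$$\sigma^*_{ij} = \begin{cases} P_{k_i}\big(1 - P_{k_i}/\bar F(x_{k_i - 1})\big) / \big(\bar F(x_{k_i - 1})\big)^2 & \text{for } i = j, \\ 0 & \text{for } i \ne j; \end{cases}$$
(iv) $(\hat P_n(X > x_{k_1}), \ldots, \hat P_n(X > x_{k_M}))$ is AN$(\mu^{**}, \tfrac{1}{n}\Sigma^{**})$, where
$$\mu^{**} = \Big(\prod_{i=1}^{k_1} \big(1 - P_i/\bar F(x_{i-1})\big), \ldots, \prod_{i=1}^{k_M} \big(1 - P_i/\bar F(x_{i-1})\big)\Big), \qquad \Sigma^{**} = \{\sigma^{**}_{jl}\}_{j = 1, \ldots, M;\ l = 1, \ldots, M},$$
$$\sigma^{**}_{jl} = \prod_{i=1}^{k_j} \big(1 - P_i/\bar F(x_{i-1})\big) \prod_{m=1}^{k_l} \big(1 - P_m/\bar F(x_{m-1})\big) \sum_{r=1}^{k_j} \frac{P_r}{\big(\bar F(x_{r-1})\big)^2 \big(1 - P_r/\bar F(x_{r-1})\big)} \quad \text{for } j \le l.$$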
The corollaries above give sufficient (rather than necessary) conditions for the two desirable properties of (i) consistency and (ii) asymptotic independence between pairs of the sequence of failure rate estimators $\{\hat P_n(X = x_k \mid X \ge x_k)\}_{k=1}^{\infty}$.
The final corollaries show that both of the conditions of Corollary 3.5 are not
necessary for these two desirable properties: the conditions specified in these
corollaries are not so stringent as to require that corresponding censoring and
lifelength random variables be independent (as in Corollary 3.5), but rather that
they be related in a certain way.
COROLLARY 3.7. If the random variables $Y_1, Y_2, \ldots$ are identically distributed, then with probability 1,
$$\lim_{n\to\infty} \hat P_n(X > x_k) = \bar G(x_k) \quad \text{for } k = 1, 2, \ldots$$
if and only if
$$P(Y_i \ge x_k \mid X = x_k) = P(Y_i \ge x_k \mid X \ge x_k) \quad \text{for } k = 1, 2, \ldots \text{ and } i = 1, 2, \ldots.$$
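COROLLARY 3.8. Suppose (i) the random variables $Y_1, Y_2, \ldots$ are identically distributed, and
(ii) $P(Y_i \ge x_k \mid X = x_k) = P(Y_i \ge x_k \mid X \ge x_k)$ for $k = 1, 2, \ldots$ and $i = 1, 2, \ldots$. Then
(iii) $(\hat P_n(X = x_{k_1} \mid X \ge x_{k_1}), \ldots, \hat P_n(X = x_{k_M} \mid X \ge x_{k_M}))$ is AN$(\mu^*, \tfrac{1}{n}\Sigma^*)$, where
$$\mu^* = \big(P(X = x_{k_1} \mid X \ge x_{k_1}), \ldots, P(X = x_{k_M} \mid X \ge x_{k_M})\big), \qquad \Sigma^* = \{\sigma^*_{ij}\}_{i = 1, \ldots, M;\ j = 1, \ldots, M},$$
$$\sigma^*_{ij} = \begin{cases} P(X = x_{k_i} \mid X \ge x_{k_i})\, P(X > x_{k_i} \mid X \ge x_{k_i}) / \bar F(x_{k_i - 1}) & \text{for } i = j, \\ 0 & \text{for } i \ne j; \end{cases}$$
(iv) $(\hat P_n(X > x_{k_1}), \ldots, \hat P_n(X > x_{k_M}))$ is AN$(\mu^{**}, \tfrac{1}{n}\Sigma^{**})$, where
$$\mu^{**} = \big(P(X > x_{k_1}), \ldots, P(X > x_{k_M})\big), \qquad \Sigma^{**} = \{\sigma^{**}_{jl}\}_{j = 1, \ldots, M;\ l = 1, \ldots, M},$$
$$\sigma^{**}_{jl} = P(X > x_{k_j})\, P(X > x_{k_l}) \sum_{i=1}^{k_j} \frac{P(X = x_i \mid X \ge x_i)}{\bar F(x_{i-1})\, P(X > x_i \mid X \ge x_i)} \quad \text{for } j \le l.$$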
The last two corollaries are of special interest because they deal with consistency and asymptotic independence in the case of dependent lifelength and
censoring random variables--a situation that is not generally considered despite
its obvious practical significance. Desu and Narula (1977), Langberg, Proschan
and Quinzi (1981) and Kitchin, Langberg and Proschan (1983) consider the
continuous version of the model specified in the last two corollaries.
The condition specifying the relationship between lifelength and censoring random variables is in fact a mild one: re-expressing it, we have the following:
$$\frac{P(X = x_k \mid X \ge x_k,\ Y_i \ge x_k)}{P(X \ge x_k \mid X \ge x_k,\ Y_i \ge x_k)} = \frac{P(X = x_k)}{P(X \ge x_k)} \quad \text{for } k = 1, 2, \ldots \text{ and } i = 1, 2, \ldots.$$
This condition specifies that the failure rate among those under observation at any
particular age is the same as the failure rate of the whole population of that age.
It is evident both intuitively and mathematically that this is a fundamental assumption inherent in the process of estimating a life distribution from incomplete data:
if this assumption could not be made, the data available would be deemed
inadequate for estimating the life distribution. Formally, it is the fact that the
condition is both necessary and sufficient for consistency that indicates that it is
minimal for the estimation process. It is clear, therefore, that the last two corollaries play an important role in estimation in the context of a practical model more
general than the statistically convenient, but unnecessarily restrictive, model of
random censorship.
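The equivalence between the condition of Corollaries 3.7 and 3.8 and its re-expressed form can be checked in one line. The following sketch assumes only that the conditioning events have positive probability:

$$P(X = x_k \mid X \ge x_k,\ Y_i \ge x_k) = \frac{P(X = x_k,\ Y_i \ge x_k)}{P(X \ge x_k,\ Y_i \ge x_k)} = \frac{P(Y_i \ge x_k \mid X = x_k)}{P(Y_i \ge x_k \mid X \ge x_k)} \cdot \frac{P(X = x_k)}{P(X \ge x_k)}.$$

The first factor on the right equals 1 exactly when $P(Y_i \ge x_k \mid X = x_k) = P(Y_i \ge x_k \mid X \ge x_k)$, in which case the failure rate among units still under observation at $x_k$ equals the population failure rate $P(X = x_k)/P(X \ge x_k)$.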
4. The PEGE compared with rivals
In Section 1 we motivate the construction of the PEGE by describing some
desirable properties of nonparametric survival function estimators and then
mentioning that the commonly used estimator of Kaplan and Meier (1958) does
not fare well in terms of these properties. We now compare the PEGE with the
Kaplan-Meier estimator.
We begin with the most obvious desirable features of survival function estimators and then consider statistical and mathematical properties. In comparing
the two estimators, we find that the issue of continuity arises and that the PEXE
enters the comparison. The section ends with an example using real data. The
subsequent section continues the comparison: we discuss the results of simulation studies.
The Kaplan-Meier estimator (KME) of the survival function of the lifelength random variable $X$ is defined as follows:
$$\tilde P(X > t) = \begin{cases} 1 & \text{for } n_1 = 0 \text{ or } t < t_1,\ n_1 \ge 1, \\ \prod_{j=1}^{i-1} \big(1 - D_j / n\bar F_n(t_j^-)\big) & \text{for } t_{i-1} \le t < t_i,\ i = 2, \ldots, n_1, \\ \prod_{j=1}^{n_1} \big(1 - D_j / n\bar F_n(t_j^-)\big) & \text{for } t \ge t_{n_1},\ n_1 \ge 1. \end{cases}$$
To the prospective user of a survival function estimator, two fundamental
questions are, firstly, does the estimating function have the appearance of a
survival function, and secondly, is it easy to compute?
Considering the second question first, we observe that calculating the PEGE
involves only marginally more effort than calculating the KME. Therefore, both
estimators are accessible to users equipped with only hand calculators.
The first question is a deeper one. If the sample is small or if there are many ties among the uncensored observations in a large sample, the KME has only a few steps and consequently appears unrealistic. The PEGE, in contrast, reflects the continuity inherent in any life process by decreasing at every possible failure time, not only at the observed failure times. As the number of distinct uncensored
observations increases, both the PEGE and the KME become smoother: the many steps of the KME do allow it the appearance of a survival function, except possibly at the right extreme, where there is no way of extrapolating very far beyond the range of observation if the KME is used. (There are several ways of extrapolating from the PEGE.) At face value, therefore, the PEGE is at least as attractive as the KME.
A related consideration is whether the estimator provides a realistic estimate of the failure rate function. The KME, being a step function, does not. The seriousness of this omission becomes more apparent when the KME failure rate function is examined from a user's point of view: if an item of age t has a (perhaps large) chance of failing at its age, then claiming that a slightly older (or slightly younger) item cannot fail at its age seems unreasonable, particularly when it becomes evident that the claim is made on the grounds that none of the items on test happened to fail just after (or just before) time t. Intuitively, or from a frequentist's point of view, the very fact that one of the items on test failed at time t makes it less likely that another item in the sample will fail soon after t because the observed failure times should be scattered along the appropriate range according to the distribution function. Clearly, then, the gaps between observed failure times are a result of the fact that the sample is finite and are not indicative of zero (or very small) failure rates.
The PEGE, on the other hand, is constructed so that a failure at time t, say, affects the failure rate in the gap before t. Thus the PEGE compensates for the lack of observations at the possible (but unobserved) failure times. The resultant failure rate function, being a step function, is still naive, but it does at least take into account the continuity of life processes and it does provide reasonable estimates of the failure rates at all possible failure times.
A more aesthetic, but none the less important, issue is that of information loss. Here the PEGE is again at an advantage. Although interval information about the uncensored observations is used in spacing out the successive values of the KME, the failure rate estimators utilize only ordinal information. Moreover, the only information utilized from the censored observations is their positioning relative to the uncensored observations. Thus the information lost by the KME is of both the ordinal and interval types. In contrast, the PEGE failure rate estimators use interval information from all the observations: in particular, the positions of censored observations are taken into account precisely. In terms of information usage, then, the PEGE is far more desirable than the KME.
An apparently attractive feature of the KME is that its values are invariant under monotone transformation of the scale of measurement. The PEGE is not invariant under even linear transformation. However, in the light of the discussion about information loss, it is evident that the KME's invariance, and the PEGE's lack thereof, are results of their levels of sophistication rather than properties that can be used for comparison.
Having noted that the step function form of the KME is not pleasing, we now point out that it is also responsible for a statistical defect, namely, that the KME tends to overestimate the underlying survival function and its percentiles. The fact
that the KME consistently overestimates suggests that its form is inappropriate.
Some indications about the bias of the PEGE are given by considering the
relationship between the PEGE and the KME.
Under certain conditions (for example, if there are no ties among the uncensored observations), the PEGE and the KME interlace: within each failure interval, the PEGE crosses the KME once from above. This is not true in general, however. It turns out that the KME may have large steps in the presence of ties. In the case of the PEGE, however, the effect of the ties is damped and the PEGE decreases slowly relative to the KME. In general, therefore, it is possible to relate the PEGE and the KME only in a one-sided fashion: specifically, the PEGE at any observed failure time is larger than the KME at that time. Examples have been constructed to show that, in general, the PEGE cannot be bounded from above by the KME. The following theorem relates $\hat P$ (the PEGE) and $\tilde P$ (the KME).
THEOREM 4.1. (i) $\hat P(X > t_i) \ge \tilde P(X > t_i)$ for $i = 1, \ldots, n_1$.
(ii) If $n\bar F_n(t_{j-1}) / (n\bar F_n(t_{j-1}) + W_{j-1}) \le D_j / D_{j-1}$ for $j = 2, \ldots, i$, where $W_j$ denotes the number of censored observations at $t_j$ for $j = 1, \ldots, n_1$, then
$$\hat P(X > t_i) \le \tilde P(X > t_i^-) \quad \text{for } i = 1, \ldots, n_1.$$
It is evident that the condition in (ii) is met if there are no ties among the
uncensored observations: this is likely if the sample is small. From the relationships in the theorem, we infer that the bias of the PEGE is likely to be of the
same order of magnitude as that of the KME. Further indications about bias are
given later.
Having considered some of the practical and physical features of the PEGE and the KME, we turn briefly to asymptotic properties. We do so briefly because the PEGE $\hat P_n$ and the KME $\tilde P_n$ are asymptotically equivalent, that is,
$$P\big[(\forall k)\ \lim_{n\to\infty} \hat P_n(X > x_k) = \lim_{n\to\infty} \tilde P_n(X > x_k)\big] = 1.$$
The practical implication of this is that there is little reason for strong preference of either the PEGE or the KME if the sample is very large.
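The equivalence can be seen from the representation used in Section 3. The following is a sketch under the assumptions made there: on $\Omega^*$ and for $n \ge N$, the first $k$ observed failure times are $x_1, \ldots, x_k$, $D_{i,n}$ failures are observed at $x_i$, and $T_{i,n} = n\bar F_n(x_i^-)$ units are on test in the $i$th interval. The symbol $\tilde P_n$ for the KME is notation assumed here.

$$\hat P_n(X > x_k) = \prod_{i=1}^{k} \Big(1 - \frac{D_{i,n}}{T_{i,n}}\Big) = \prod_{i=1}^{k} \Big(1 - \frac{D_{i,n}}{n\bar F_n(x_i^-)}\Big) = \tilde P_n(X > x_k) \quad \text{for } n \ge N.$$

Under these assumptions the two estimators coincide at $x_1, \ldots, x_k$ once a failure has been observed at every possible failure time up to $x_k$, which is why their limits agree with probability one.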
We now compare the models assumed in using the KME and the PEGE. In the many studies of the KME, the most general model includes the assumption of independence between corresponding life and censoring random variables. Our most general model does not include this assumption. However this difference is not important because the assumption of independence is used only to facilitate the derivation of certain asymptotic properties of the KME: in fact, the definition of the KME does not depend on this assumption, and the KME and the PEGE are asymptotically equivalent under the conditions of the most general model of the PEGE. Therefore this assumption is not necessary for using the KME.
The other difference between the models assumed is that the PEGE is designed specifically for discrete life and censoring distributions while the Kaplan-Meier model makes no stipulations about the supports of these distributions. However,
distinguishing between continuous and discrete random variables in this context is merely a statistical convention. In fact, time to occurrence of some event is always measured along a continuous scale, and the set of observable values is always countable because it is defined by the precision of measurement. Since the process of estimating a life distribution requires measurements, it always entails the assumption of a discrete distribution: whether the support of the estimator is continuous or discrete depends on the way the user perceives the scale of measurement. In practice, therefore, there are no differences between the models underlying the PEGE and the KME: the PEGE is appropriate whenever the KME is, and vice versa.
Having pointed out that the PEGE may be used for estimating continuous survival functions, and having introduced the PEXE as the continuous counterpart of the PEGE, we compare the two. First we note that the PEXE is the continuous version of the PEGE because the construction of each is based on the assumption of constant failure rate between distinct observed failure times. The forms of the estimators differ because of the difference in the ways of expressing discrete and continuous survival functions in terms of failure rates. The PEGE and the PEXE are equally widely applicable since a minor modification of the PEXE can be made to allow for ties. (This estimator is defined in Whittemore and Keller (1983).)
The relationship between the PEGE and the modified PEXE, and their positioning relative to the KME, is summarized by the following theorem and the succeeding relationship.
THEOREM 4.2. Let $P^{**}(X > t)$ denote the modified PEXE of the survival probability $P(X > t)$ for $t > 0$, and let $\hat P$ and $\tilde P$ denote the PEGE and the KME respectively.
(i) $\hat P(X > t) < P^{**}(X > t)$.
(ii) If $n\bar F_n(t_{j-1}) / (n\bar F_n(t_{j-1}) + W_{j-1}) \le D_j / D_{j-1}$ for $j = 2, \ldots, i$, where $W_j$ denotes the number of censored observations at $t_j$ for $j = 1, \ldots, n_1$, then
$$P^{**}(X > t_i) \le \tilde P(X > t_{i-1}) \quad \text{for } i = 1, \ldots, n_1.$$
From Theorems 4.1(i) and 4.2(i), we have
$$\tilde P(X > t_i) \le \hat P(X > t_i) < P^{**}(X > t_i) \quad \text{for } i = 1, \ldots, n_1.$$
Consequently, if the condition in (ii) above is met (as it is when there are no ties among the uncensored observations), both the PEGE and the PEXE interlace with the KME: in each interval of the form $(t_{i-1}, t_i]$, the PEGE and the PEXE cross the KME once from above. Practical experience suggests that the condition in (ii) above is not a stringent one: even though this condition is violated in many of the data sets considered to date, the PEGE and the PEXE still interlace with the KME in the manner described. Another indication from practical experience is that the difference between the PEXE and the PEGE is negligible, even in small samples.
Finally, we present an example using the data of Freireich et al. (1963). The data are the remission times of 21 leukemia patients who have received 6 MP (a mercaptopurine used in the treatment of leukemia). The ordered remission times in weeks are: 6, 6, 6, 6+, 7, 9+, 10, 10+, 11+, 13, 16, 17+, 19+, 20+, 22, 23, 25+, 32+, 32+, 34+, 35+. The PEGE and the KME are presented in Figure 1. (Since the PEGE and the PEXE differ by at most 0.09, only the PEGE appears.) The graphs illustrate the smoothness of the PEGE in contrast with the jagged outline of the KME. The KME and the PEGE interlace even though the condition in Theorems 4.1(ii) and 4.2(ii) is violated. Since the PEGE is only slightly above the KME at the observed failure times and the PEGE crosses the KME early in each failure interval, the KME is considerably larger than the PEGE by the end of each interval. This behaviour is typical. We infer that the PEGE certainly does not overestimate: it may even tend to underestimate.
[Figure 1. The PEGE and the KME for the remission time data.]
We conclude that the PEGE (and the modified PEXE) have significant advantages over the KME, particularly in the cases of large samples containing many ties and small samples. It is only in the case of a large sample spread over a large range that the slight increase in computational effort required for the PEGE might merit using the KME because the PEGE and the KME are likely to be very similar.
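The comparison on the remission-time data can be reproduced approximately with the sketch below. It is an illustration under stated assumptions, not the authors' code: the function name `km_and_pege` is hypothetical, times are treated as integer weeks, a "+" is read as a censored time, and the PEGE at an observed failure time $t_i$ is taken to be $\prod_{j \le i} (1 - D_j/T_j)^{t_j - t_{j-1}}$ with $T_j$ the total time on test in $(t_{j-1}, t_j]$, which is the geometric form described in the text.

```python
def km_and_pege(times, events):
    """Evaluate the KME and a PEGE-style estimate at each distinct failure time.

    times  -- integer-valued observed times
    events -- 1 for a failure, 0 for a censored observation
    """
    fail_times = sorted({z for z, d in zip(times, events) if d == 1})
    kme, pege = {}, {}
    s_km = s_pg = 1.0
    prev = 0
    for t in fail_times:
        deaths = sum(1 for z, d in zip(times, events) if d == 1 and z == t)
        at_risk = sum(1 for z in times if z >= t)      # units on test just before t
        # Total time on test in (prev, t]: units at risk at each integer time.
        ttt = sum(sum(1 for z in times if z >= s) for s in range(prev + 1, t + 1))
        s_km *= 1.0 - deaths / at_risk
        s_pg *= (1.0 - deaths / ttt) ** (t - prev)     # assumed geometric form
        kme[t], pege[t] = s_km, s_pg
        prev = t
    return kme, pege

# Remission times in weeks for the 21 patients on 6 MP; status 0 marks a "+" time.
weeks  = [6, 6, 6, 6, 7, 9, 10, 10, 11, 13, 16, 17, 19, 20, 22, 23, 25, 32, 32, 34, 35]
status = [1, 1, 1, 0, 1, 0, 1, 0, 0, 1, 1, 0, 0, 0, 1, 1, 0, 0, 0, 0, 0]

kme, pege = km_and_pege(weeks, status)
for t in sorted(kme):
    print(t, round(kme[t], 3), round(pege[t], 3))
```

At each observed failure time the PEGE-style value is at least the KME value, in line with Theorem 4.1(i); the sketch evaluates both only at failure times and does not draw the curves between them.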
5. Small sample properties of the PEGE
In this section we give some indications of the small sample properties of the
PEGE by considering three simulation studies. In the first study, Kitchin (1980)
compares the small sample properties of the PEXE with those of the KME. In
the second study, Whittemore and Keller (1983) consider the small sample
behaviour of a number of estimators: we extract the results for the KME and a
particular version of the PEXE. In the third study, we make a preliminary
comparison of the KME and the PEGE. We expect the behaviour of the piecewise exponential estimators to resemble that of the PEGE because piecewise
exponential estimators are continuous versions of the PEGE and, moreover,
piecewise exponential estimators and the PEGE are similar when the underlying
life distribution is continuous.
The piecewise exponential estimator considered by Whittemore and Keller is
denoted $\hat{F}_{Q_4}$. It is constructed by averaging the PEXE failure rate function estimator with a variant of the PEXE failure rate function estimator; that is, $\hat{F}_{Q_4}$ is the
same as the PEXE except that the PEXE failure rate estimators $\hat{\lambda}_1^-, \ldots, \hat{\lambda}_{n_1}^-$ are
replaced by the failure rate estimators $\hat{\lambda}_1^*, \ldots, \hat{\lambda}_{n_1}^*$, defined as follows:

$$\hat{\lambda}_i^* = \tfrac{1}{2}\left(\hat{\lambda}_i^- + \hat{\lambda}_{i-1}^+\right) \quad \text{for } i = 1, \ldots, n_1,$$

where

$$\hat{\lambda}_i^- = D_i / \text{total time on test in } (t_{i-1}, t_i] \quad \text{for } i = 1, \ldots, n_1,$$

$$\hat{\lambda}_i^+ = D_i / \text{total time on test in } [t_i, t_{i+1}) \quad \text{for } i = 0, \ldots, n_1 - 1,$$

$$\hat{\lambda}_{n_1}^+ = D_{n_1} / \text{total time on test in } [t_{n_1}, \infty) \quad \text{if } \max_i Z_i > t_{n_1}.$$
Although Whittemore and Keller include in their study the two estimators
$\hat{F}_{Q_1}$ and $\hat{F}_{Q_2}$ constructed from $\hat{\lambda}_1^-, \ldots, \hat{\lambda}_{n_1}^-$ and $\hat{\lambda}_0^+, \ldots, \hat{\lambda}_{n_1}^+$ respectively, they
present the results for the hybrid estimator $\hat{F}_{Q_4}$ alone because they find that $\hat{F}_{Q_1}$
tends to be negatively biased and $\hat{F}_{Q_2}$ tends to be positively biased.
The same model is assumed in all three studies. The model is that of random
censorship: corresponding life and censoring random variables are independent
and the censoring random variables are identically distributed. Whittemore and
Keller generate 200 samples in each of the 6 × 3 × 4 = 72 situations that result
from considering six life distributions (representing failure rate functions that are
constant, linearly increasing, exponentially increasing, decreasing, U-shaped, and
discontinuous), three levels of censoring (P(Y < X) ≈ 0, 0.55, 0.76), and four
sample sizes (n = 10, 25, 50, 100). Kitchin obtains 1000 samples in each of a
variety of situations: he considers four life distributions (Exponential, Weibull with
parameter 2, Weibull with parameter ½ and Uniform), three levels of censoring
(P(Y<X) = 0, 0.5, 0.67), and four sample sizes (n = 10, 20, 50). Kitchin's study
is broader than that of Whittemore and Keller in that Kitchin considers Exponential, Weibull and Uniform censoring distributions while Whittemore and Keller
consider only Exponential censoring distributions. Kitchin apparently produces
the greater variety of sampling conditions because his results vary slightly according to the model, while Whittemore and Keller find so much similarity in the
results from the various distributions that they record only the results from the
Weibull distribution.
The conclusions we draw from the two studies are similar. Regarding mean
squared error (MSE), both Kitchin and Whittemore and Keller find that, in general:
(i) The MSE of the exponential estimator is smaller than that of the KME.
(ii) As the level of censoring increases, the increase in the MSE is smaller for
the exponential estimator than for the KME.
Kitchin reports that (i) and (ii) are not always true of the PEXE and the KME:
the exceptional cases occur in the tails of the distributions.
The conclusions about bias are not so straightforward. Whittemore and Keller
find that the PEXE tends to be negatively biased while Kitchin reports that the
bias of the PEXE is a monotone increasing function of time: examining his
figures, we find that the bias tends to be near zero at some point between the 40th
and 60th percentiles except when the life and censoring distributions are Uniform.
(In this case, the bias is positive only after the 90th percentile.) We conclude that
Whittemore and Keller merely avoid detailed discussion of bias. Regarding the
hybrid estimator, we find in the figures recorded some suggestions of the tendencies observed in the PEXE--specifically, monotone increasing bias and a tendency
for underestimation when the sample size is small and censoring is heavy.
Whether this behaviour is typical of the PEGE also remains to be seen.
In considering the magnitude of the bias of the estimators, we find the following.
(i) Both Kitchin and Whittemore and Keller report that the bias of the KME
is negligible except in the right tail of the distribution and in the case of a very
small sample (n = 10) and heavy censoring.
(ii) The PEXE is considerably more biased than the KME.
(iii) The bias of $\hat{F}_{Q_4}$ is negligible except in the case of a very small sample and
heavy censoring.
(iv) The bias of each estimator increases as the censoring becomes heavier and
it decreases as the sample size increases.
In view of these two studies, we conclude, firstly, that the PEGE is likely to
compare favourably with the KME in terms of MSE, and secondly, that the
PEGE is likely to be considerably more biased than the KME. We expect that
the discrete counterpart of $\hat{F}_{Q_4}$ performs well in terms of both MSE and bias.
Since the bias of this estimator is likely to be small, adjustment for its presumed
tendency to increase monotonically is deemed an unnecessary complication.
In the pilot study we generate three collections of data, each consisting of 100
samples of size 10, from independent Geometric life and censoring distributions.
In each case the life distribution has parameter p = exp(-0.1). The censoring
distributions are chosen so as to produce three levels of censoring: setting
p = exp(-λ), where λ = 0.00001, 0.1, 0.3, yields the censoring probabilities
P(Y < X) = 0, 0.475, 0.711 respectively.
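The censoring probabilities quoted above can be checked numerically. The sketch below assumes that both Geometric distributions are parametrized so that P(X > k) = p^k for k = 0, 1, 2, ... (support on the positive integers); this parametrization is an assumption of the example, although it does reproduce the quoted values. Under it, P(Y < X) = Σ_k P(Y = k) P(X > k) = (1 - q)p / (1 - pq), where p and q are the life and censoring parameters. The function name `censoring_probability` is hypothetical.

```python
import math


def censoring_probability(p_life, p_cens):
    """P(Y < X) for independent geometric X and Y with P(X > k) = p_life**k
    and P(Y > k) = p_cens**k on k = 0, 1, 2, ... (assumed parametrization)."""
    return (1.0 - p_cens) * p_life / (1.0 - p_life * p_cens)


p_life = math.exp(-0.1)
levels = {lam: censoring_probability(p_life, math.exp(-lam))
          for lam in (0.00001, 0.1, 0.3)}
for lam, prob in levels.items():
    print(f"lambda = {lam:<8} P(Y < X) = {prob:.3f}")
```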
The conventions followed for extrapolation in the range beyond the largest
observed failure time are as follows. For the KME,

$$\hat{P}(X > k) = \begin{cases} \hat{P}(X > t_{n_1}) & \text{for } t_{n_1} \le k < s_{n_2}, \\ 0 & \text{for } k \ge s_{n_2} \ge t_{n_1}, \end{cases}$$

and for the PEGE,

$$\hat{P}(X > k) = \hat{P}(X > t_{n_1})\,(1 - \hat{q}_{n_1})^{k - t_{n_1}} \quad \text{for } k \ge t_{n_1}.$$

This definition of the KME rests on the assumption that the largest observation
is uncensored, while the definition of the PEGE results from assuming that the
failure rate after the largest observed failure time is the same as the failure rate
in the interval $(t_{n_1 - 1}, t_{n_1}]$.
Our conventions for extrapolation differ from those of Kitchin and of Whittemore and Keller. Consequently our results involving right-hand tail probabilities
differ from theirs: a preliminary indication is that our extrapolation procedures
result in estimators that are more realistic than theirs.
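The design of such a pilot study can be mimicked with a small simulation. The sketch below is illustrative only and covers the KME alone: it assumes the parametrization P(X > k) = p^k for the Geometric distributions, treats an observation as a failure when X ≤ Y, and holds the KME flat beyond the last observed failure time, which is simpler than the extrapolation convention described above. The function names and the choice k = 7 (the median of the life distribution under the assumed parametrization) are hypothetical, and the output is not expected to reproduce the tabulated results.

```python
import math
import random


def geometric(p, rng):
    """Draw X on {1, 2, ...} with P(X > k) = p**k by inverse transform."""
    u = 1.0 - rng.random()  # u lies in (0, 1]
    return max(1, math.ceil(math.log(u) / math.log(p)))


def km_survival(times, observed, k):
    """Kaplan-Meier estimate of P(X > k), held flat after the last failure."""
    s = 1.0
    for t in sorted({t for t, d in zip(times, observed) if d and t <= k}):
        at_risk = sum(1 for u in times if u >= t)
        deaths = sum(1 for u, d in zip(times, observed) if d and u == t)
        s *= 1.0 - deaths / at_risk
    return s


def bias_and_mse(k, lam_cens, n=10, reps=100, seed=0):
    """Estimated bias and MSE of the KME of P(X > k) over `reps` samples."""
    rng = random.Random(seed)
    p_life, p_cens = math.exp(-0.1), math.exp(-lam_cens)
    true_value = p_life ** k
    errors = []
    for _ in range(reps):
        life = [geometric(p_life, rng) for _ in range(n)]
        cens = [geometric(p_cens, rng) for _ in range(n)]
        times = [min(x, y) for x, y in zip(life, cens)]
        observed = [x <= y for x, y in zip(life, cens)]
        errors.append(km_survival(times, observed, k) - true_value)
    bias = sum(errors) / reps
    mse = sum(e * e for e in errors) / reps
    return bias, mse


for lam in (0.00001, 0.1, 0.3):
    bias, mse = bias_and_mse(k=7, lam_cens=lam)
    print(f"lambda = {lam:<8} bias = {bias:+.4f}   MSE = {mse:.4f}")
```

With essentially no censoring the KME reduces to the empirical survival function, so its bias should be close to zero and its MSE close to P(X > k)(1 - P(X > k))/n.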
Although the size of the study precludes reaching more than tentative conclusions, we observe several tendencies.
Tables 1(a), 2(a) and 3(a) contain the estimated bias and mean squared error
(MSE) for the KME and the PEGE of P(X > k) for $k = \xi_p$, where $\xi_p$ is the pth
percentile of the underlying life distribution and p = 1, 5, 10, 20, 30, 40, 50, 60,
70, 80, 90, 95, 99. From these tables we make the following observations.
(i) The MSE of the PEGE is generally smaller than that of the KME. The [...] in the right-hand tail of the distribution [...] and heavy [...].
(ii) The [...] of each [...] as censoring [...].
(iii) The disparity in the MSE of the two estimators [...] more marked [...] the censoring [...] by relatively little as the censoring [...].
(iv) The [...] of the two [...] is, the MSE is smallest in the [...] is greatest [...] the median of the distribution.
(v) Both the KME and the PEGE generally exhibit negative bias: the magnitude of the bias of each estimator [...] of the PEGE [...] except in the right-hand [...] median of the distribution.

Table 1. Results of pilot study using 100 samples of size 10, Geometric (p = exp(-0.1)) life distribution, Geometric (p = exp(-0.00001)) censoring distribution and P(Y < X) ≈ 0: estimated bias and estimated MSE of (a) the survival function estimators and (b) the percentile estimators.
Table 2. Results of pilot study using 100 samples of size 10, Geometric (p = exp(-0.1)) life distribution, Geometric (p = exp(-0.1)) censoring distribution and P(Y < X) ≈ 0.475: estimated bias and estimated MSE of (a) the survival function estimators and (b) the percentile estimators.
(vi) The magnitude of the bias of the KME is consistently smaller than that
of the PEGE only when there is no censoring. Under conditions of moderate and
heavy censoring, the KME is less biased than the PEGE only at percentiles to
the left of the median: to the right of the median, the PEGE is considerably less
biased than the KME.
(vii) As censoring increases, the magnitude of the bias of the KME increases
faster than does that of the PEGE.
Table 3. Results of pilot study using 100 samples of size 10, Geometric (p = exp(-0.1)) life distribution, Geometric (p = exp(-0.3)) censoring distribution and P(Y < X) ≈ 0.711: estimated bias and estimated MSE of (a) the survival function estimators and (b) the percentile estimators.

Tables 1(b), 2(b) and 3(b) contain the estimated bias and MSE for the
Kaplan-Meier (KM) and piecewise geometric (PG) estimators of the percentiles
$\xi_p$, p = 1, 5, 10, 20, 30, 40, 50, 60, 70, 80, 90, 95, 99. From these tables we make
the following observations.
(i) With a few exceptions, the PG percentile estimator is less biased than the
KM percentile estimator.
(ii) Both estimators tend to be negatively biased.
(iii) At each level of censoring, the bias of the PG percentile estimator is
negligible for percentiles smaller than the 70th, and it is acceptably small for larger
percentiles, except perhaps the 99th percentile. In contrast, the KM percentile
estimators are almost unbiased only for percentiles smaller than the 60th: to the
right of the 60th percentile the bias tends to be very much larger than that of the
PG estimators. This tendency is particularly noticeable in the case of heavy censoring.
(iv) The MSE of the PG percentile estimator is smaller than that of the KM
percentile estimator only in certain ranges, viz.: p ≤ 70 for heavy censoring,
p ≤ 40 for moderate censoring, and 5 ≤ p ≤ 95 for no censoring. Since the PG
percentile estimator is almost unbiased outside these ranges, the large MSE must
be the result of having large variance.
On the basis of the observations involving the survival function estimators, we
conclude that the small sample behaviour of the PEGE resembles that of the
PEXE: specifically, when there is little or no censoring, the PEGE compares
favourably with the KME in terms of MSE but not in terms of bias. We expect
that this is true irrespective of the level of censoring when the sample size is
larger. It remains to be seen whether inversion of this general behaviour is typical
when the sample size is very small and censoring is heavy. It is evident that
increased censoring affects the bias and the MSE of the PEGE less than it affects
the bias and the MSE of the KME.
Our conclusions about the percentile estimators are even more tentative
because of the lack of results involving the behaviour of percentile estimators. The
fact that the PG percentile estimator is almost unbiased even in the presence of
heavy censoring, and even as far to the right as the 95th percentile, is of considerable interest because the KM extrapolation procedures are clearly inadequate
for estimating extreme right percentiles.
Regarding the MSE, we note that, under conditions of moderate or heavy
censoring, any estimator of the larger percentiles is expected to vary considerably
because there are likely to be very few observations in this range. The ad hoc
extrapolation procedure for the KM is expected to cause the estimators of the
extreme right percentiles to exhibit large negative bias and little variation. In view
of these considerations and the accuracy of the PG percentile estimators, we
conclude that the fact that the MSE of the PG percentile estimator of the larger
percentiles is greater than that of the KM percentile estimator is not evidence of
a breakdown in the reliability and efficiency of the PG percentile estimator.
The general indications of our pilot study are that the PEGE and the discrete
version of $\hat{F}_{Q_4}$ are attractive alternatives to the KME. In view of the resemblance
between the properties of the PEGE and those of the PEXE, the results for $\hat{F}_{Q_4}$
portend well for the new discrete estimator: we expect it to be almost unbiased
and to be not only more efficient than the KME but also more stable under
increased censoring. Moreover, we expect the corresponding percentile estimator
to have these desirable properties also because it is likely to behave at least as
well as the PG percentile estimator.
The properties involving relative efficiency are of considerable importance
because relative efficiency is a measure of the relative quantities of information
utilized by the estimators being compared. This interpretation of relative efficiency, and the fact that heavy censoring is often encountered in engineering
problems, make $\hat{F}_{Q_4}$ and its discrete counterpart even more attractive.
References

Aalen, O. (1976). Nonparametric inference in connection with multiple decrement models. Scandinavian J. Statist. 3, 15-27.
Aalen, O. (1978). Nonparametric estimation of partial transition probabilities in multiple decrement
models. Ann. Statist. 6, 534-545.
Breslow, N. and Crowley, J. (1974). A large sample study of the life table and product limit estimators
under random censorship. Ann. Statist. 2, 437-453.
Chen, Y. Y., Hollander, M. and Langberg, N. (1982). Small-sample results for the Kaplan-Meier
estimator. J. Amer. Statist. Assoc. 77, 141-144.
Cox, D. R. (1972). Regression models and life tables. J. Roy. Statist. Soc. Ser. B 34, 187-202.
Desu, M. M. and Narula, S. C. (1977). Reliability estimation under competing causes of failure. In:
I. Shimi and C. P. Tsokos, eds., The Theory and Applications of Reliability I. Academic Press, New York.
Efron, B. (1967). The two sample problem with censored data. In: Proceedings of the Fifth Berkeley
Symposium on Mathematical Statistics and Probability Vol. IV. University of California Press,
Berkeley, CA, 831-853.
Fleming, T. R. and Harrington, D. P. (1979). Nonparametric estimation of the survival distribution
in censored data. Technical Report No. 8, Section of Medical Research Statistics, Mayo Clinic,
Rochester, MN.
Freireich, E. J. et al. (1963). The effect of 6-Mercaptopurine on the duration of steroid-induced
remission in acute leukemia. Blood 21, 699-716.
Kaplan, E. L. and Meier, P. (1958). Nonparametric estimation from incomplete observations. J.
Amer. Statist. Assoc. 53, 457-481.
Kitchin, J. (1980). A new method for estimating life distributions from incomplete data. Unpublished
doctoral dissertation, Florida State University.
Kitchin, J., Langberg, N. and Proschan, F. (1983). A new method for estimating life distributions
from incomplete data. Statist. and Decisions 1, 241-255.
Langberg, N., Proschan, F. and Quinzi, A. J. (1981). Estimating dependent life lengths, with applications to the theory of competing risks. Ann. Statist. 9, 157-167.
Miller, R. G. (1981). Survival Analysis. Wiley, New York.
Mimmack, G. M. (1985). Piecewise geometric estimation of a survival function. Unpublished doctoral
dissertation, Florida State University.
Nelson, W. (1969). Hazard plotting for incomplete failure data. J. Quality Technology 1, 27-52.
Nelson, W. (1972). Theory and applications of hazard plotting for censored failure data. Technometrics 14, 945-966.
Peterson, A. V. (1977). Expressing the Kaplan-Meier estimator as a function of empirical subsurvival functions. J. Amer. Statist. Assoc. 72, 854-858.
Susarla, V. and Van Ryzin, J. (1976). Nonparametric Bayesian estimation of survival curves from
incomplete observations. J. Amer. Statist. Assoc. 71, 897-902.
Umholtz, R. L. (1984). Estimation of the exponential parameter for discrete data. Report, Aberdeen
Proving Ground.
Whittemore, A. S. and Keller, J. B. (1983). Survival estimation with censored data. Stanford
University Technical Report No. 69.
P. R. Krishnaiah and C. R. Rao, eds., Handbook of Statistics, Vol. 7
© Elsevier Science Publishers B.V. (1988) 281-311
Applications of Pattern Recognition in Failure
Diagnosis and Quality Control
L. F. Pau
Through its compliance with, and implications for, design, manufacturing,
quality control, testing, operations and maintenance (Figures 1 and 2), the field
of technical diagnostics has wide ranging consequences in all technical fields;
some measures of these consequences are:
- system availability;
- system survivability;
- safety;
- production yield;
- failure tolerance;
- system activation delays;
- costs (lifetime, operation);
- maintenance;
- warranties.

Fig. 1. Relation between failure diagnosis, reliability or degradation processes, safety and maintenance. If the repair is instantaneous ($\mu_0 = +\infty$), if there is no detection delay ($t_m + t_d = 0$), and if the
diagnostic system itself never fails, the asymptotic system availability in the stationary case is:
$A = \mathrm{Prob}(\text{UUT not failed})_{t \to +\infty} = \prod_{i=1}^{N-1} \mu_i / (\lambda_i + \mu_i)$. More general formulae may be derived, especially for finite repair times, and more general degradation processes.
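The asymptotic availability given in the caption of Fig. 1 is a product of one factor per failure mode, and can be evaluated as sketched below. The sketch assumes that λ_i and μ_i are the failure and repair rates associated with failure mode E_i, so that each factor is μ_i/(λ_i + μ_i); this reading of the symbols, the function name `asymptotic_availability` and the numerical rates are all assumptions made for illustration.

```python
import math


def asymptotic_availability(failure_rates, repair_rates):
    """Product over failure modes of mu_i / (lambda_i + mu_i).

    failure_rates -- assumed rates lambda_i of entering failure mode E_i
    repair_rates  -- assumed rates mu_i of returning from E_i to E_0
    """
    return math.prod(mu / (lam + mu)
                     for lam, mu in zip(failure_rates, repair_rates))


# Two hypothetical failure modes (rates per hour, illustrative values only).
availability = asymptotic_availability([0.01, 0.02], [1.0, 0.5])
print(f"A = {availability:.4f}")
```

Each factor tends to 1 as the corresponding repair rate grows, so slow repair of any single failure mode bounds the overall availability.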
We define here technical diagnostics as the field dealing with all methods,
processes, devices and systems whereby one can detect, localize, analyse and
monitor failure modes of a system, i.e., defects and degradations (see Section 2).
It is at this stage essential to stress that, whereas system reliability and safety
theories are concerned with a priori assessments of the probability that the system
will perform a required task under specified conditions, without failure, for a
specified period of time, the field of failure diagnosis is essentially focusing on
a posteriori and on-line processing and acquisition of all monitoring information
for later decision making.
Failure diagnosis has itself evolved from the utilization of stand-alone tools (e.g.
calibers), to heuristic procedures, later codified into maintenance manuals. At a
later stage, automatic test systems and non-destructive testing instruments, based
on specific test sequences and sensors, have assisted the diagnosis; examples are:
rotating machine vibration monitoring, signature analysis, optical flaw detection,
ultrasonics, ferrography, wear sensors, process parameters, thermography, etc.
More recently, however, there have been implementations and research on evolved
diagnostic processes, with heavier emphasis on sensor integration, signal/image
processing, software and communications. Finally, research is carried out on
automated failure diagnosis, and on expert systems to accumulate and structure
failure symptoms and diagnostic strategies (e.g. avionics, aircraft engines, software).
Although the number of application areas and the complexity of diagnostic
systems have increased, there is still a heavy reliance on 'ad-hoc' or heuristic
approaches to basing decisions on diagnostic information. But a number of fundamental diagnostic strategies have emerged from these approaches, which can be
found to be common to these very diversified applications.
After having introduced in Section 2 a number of basic concepts in technical
diagnostics, we will review in Section 3 some of the measurement problems. The
basic diagnostic strategies will be summarized in Section 4. Areas for future
research and progress will be proposed in Section 5.
2. Basic concepts in technical diagnostics
Although they may have parts in common, we will distinguish between
the system for which a diagnosis is sought (system/unit/process
under test: UUT), and the diagnostic system. The basic events in technical
diagnostics are well defined in terminology standards; they are: failure, defect,
degradation, condition.
Fig. 3. (Axis label: actual failure mode E.)
A failure mode is then the particular manner in which an omission of expected
occurrence or performance of task or mission happens; it is thus a combination
of failures, defects, and degradations. For a given task or mission, the N possible
failure modes will be denoted $E_0, E_1, \ldots, E_{N-1}$, where $E_0$ is the no-failure operating
mode fulfilling all technical specifications. $\hat{E}$ is the failure mode identified by the
diagnostic system.
2.1. The basic troubleshooting process (Figure 3)
2.1.1. Failure detection: This is the act of identifying the presence or absence of
a non-specified failure mode in a specified system carrying out a given task or
mission, or manufactured to a given standard.
2.1.2. Failure localization: If the outcome of failure detection is positive, then
failure localization designates the material, structures, components, processes,
systems or programs which have had a failure.
2.1.3. Failure diagnosis: The act or process of identifying a failure mode E upon
an evaluation of its signs and symptoms, including monitoring information. The
diagnostic process therefore breaks failure detection down into
individual failure modes.
2.1.4. Failure analysis: The process of retrieving via adequate sensors all possible
information, measurements, and non-destructive observations, altogether called
diagnostic information, about the life of the system prior and up to the failure;
it is also a method for correlating this information.
2.1.5. Failure monitoring: This is the act of observing indicative change of equipment condition or functional measurements, as warnings for possible needed
2.2. Performance of a diagnostic process
As a decision operator, any diagnostic system can make errors; each of the
following errors or performances can be specified either for a specific failure
mode, or in the expected sense over the set of all possible failure modes
$E_0, \ldots, E_{N-1}$. The probabilities 2.2.1-2.2.4 can be derived from the confusion
matrix (Figure 3). The overall effect of these performances is to affect system
availability, with or without test system linked to UUT (Pau, 1987b).
2.2.1. Probability of incorrect diagnosis: This is the probability of diagnosing a
failure mode different from the actual one, with everything else equal.
2.2.2. Probability of reject (or miss, or non-detection): This is the probability of
taking no decision (diagnosis or detection) when a failure mode is actually present.
2.2.3. Probability of false alarm: The probability of diagnosing that a failure mode
is present, when in fact none is present (except the normal condition $E_0$).
2.2.4. Probability of correct detection: The probability of detecting correctly a
failure mode to be present, when it actually is ($E_0$ excepted); when there is only
one possible failure mode $E_1$, it is the complement to one of the probability of
false alarm.
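The way the probabilities 2.2.1-2.2.4 follow from a confusion matrix can be sketched as below. The layout is an assumption of this example: row i holds the trials whose actual mode is E_i, columns 0 to N-1 hold the diagnosed modes E_0 to E_{N-1}, and one extra last column holds rejects (no decision). The counts are invented for illustration, the estimates are pooled over failure modes (the "expected sense" mentioned above), and the function name `diagnostic_probabilities` is hypothetical.

```python
def diagnostic_probabilities(counts):
    """Estimate the probabilities 2.2.1-2.2.4 from a confusion matrix.

    counts[i][j] -- number of trials with actual mode E_i and decision j;
    column N (the last one) is assumed to count rejects.
    """
    n_modes = len(counts)
    total = sum(sum(row) for row in counts)
    healthy = counts[0]          # actual mode E_0 (no failure)
    failed = counts[1:]          # actual modes E_1 .. E_{N-1}
    n_failed = sum(sum(row) for row in failed)

    return {
        # 2.2.1: diagnosed mode differs from the actual one
        "incorrect_diagnosis": sum(
            counts[i][j]
            for i in range(n_modes) for j in range(n_modes) if j != i
        ) / total,
        # 2.2.2: no decision although a failure mode is present
        "reject": sum(row[n_modes] for row in failed) / n_failed,
        # 2.2.3: some failure mode declared although the actual mode is E_0
        "false_alarm": sum(healthy[1:n_modes]) / sum(healthy),
        # 2.2.4: some failure mode declared when one is actually present
        "correct_detection": sum(sum(row[1:n_modes]) for row in failed) / n_failed,
    }


# Invented counts for N = 3 modes; columns: E_0, E_1, E_2, reject.
counts = [
    [90, 4, 3, 3],   # actual E_0
    [5, 40, 3, 2],   # actual E_1
    [2, 6, 38, 4],   # actual E_2
]
probs = diagnostic_probabilities(counts)
for name, value in probs.items():
    print(f"{name:20s} {value:.3f}")
```

Restricting the sums to a single row gives the same quantities for one specific failure mode instead of the pooled values.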
2.2.5. Failure coverage: This is the conditional probability that, given there exists
a failure mode of the UUT, this system is able to recover automatically and
continue operations. The process of automatic reconfiguration, and redundancy
management has the purpose of improving the coverage and making the system
2.2.6. Measurement time $t_m$: This is the total time required for acquiring all
diagnostic information (except a priori information) required for the failure
detection, localization and diagnosis. This time may be divided into subsequences, and estimated in expected value.
2.2.7. Detection (or diagnosis) delay $t_d$: This is the total time required to process
and analyze the diagnostic information, and also to display or transmit the failure
mode as determined. This time may be divided into subsequences, and
estimated in expected value.
2.2.8. Forecasting capability $t_f$: This is the lead time with which the occurrence of
a specific failure mode ($E_0$ excepted) can be forecast, given a confidence
interval or margin.
2.2.9. Risks and costs: Costs/risks are attached to each diagnosed failure mode $\hat{E}$,
obtained as a result of the decision process; they include: testing costs,
maintenance/repair costs, safety risks, lost usage costs, warranties, yields.
3.1. Degradation processes
The thorough knowledge about the failure mode occurrence process, and not
only about the normal operating mode $E_0$, is an absolute must. It requires the
understanding of all physical effects, as well as of software errors, besides design,
operations, human factors and procedures (Figure 4). Failure modes may also
occur because of interactions with other UUT's (machines, or communication
nodes working together).
The result of this knowledge is the derivation or inference of:
• categorized lists of failure modes, and their duration or extent;
• lists of features or characteristic symptoms for detection and diagnosis, with
measurement ranges;
• priorities among categorized failure modes vs.:
- probabilities (availability),
- safety (critical events),
- timing (triggering, windowing, etc.),
- fault-effect models (e.g. error propagation, stress-fracture relations).
This information is also used for the selection of sensors for technical diagnostics.
3.2. Sensors for technical diagnostics
It is important to distinguish between two classes of diagnostic sensors:
- passive sensors, with no interaction of probing energy with the UUT;
- active sensors, with interaction of probing energy with the UUT perturbing
the operations; this is carried out by personnel, automatic test systems, programmable systems, or other probing means.
In turn, the measurement process is either destructive, or non-destructive for
the UUT.
Needless to say, there is a very wide range of sensors, described and reported
in the technical diagnostics, measurement, and non-destructive testing literature.
These sensors are generally used in sequence. We will give below one application
area into which much sensor development research is going, and refer the reader
to the References for other fields.

Fig. 4. Main causes for a degradation.
EXAMPLE. Integrated circuits diagnosis. See Figure 5.
3.3. Data fusion and feature extraction
3.3.1. Data fusion: In evolved diagnostic systems, it is realized that efficient
diagnosis cannot, in many cases, be based on the acquisition of one single
measurement only, possibly with one single sensor only (Pau, 1987a).
Another fundamental approach is to strive towards the acquisition of the
measurement(s) by monitoring throughout the entire system life, including manufacturing, testing, operations, maintenance, modifications.
In order to cover those two requirements, evolved diagnostic systems are based
on sensor diversity, which also increases the global sensor reliability and
reduces the vulnerability (Figure 6).
3.3.2. Feature extraction: The features are then those combined symptoms derived
jointly from different sensors, these measurements being combined together by an
operation called feature extraction, to increase their usefulness for diagnosis. Data
fusion from diverse sensors usually leads to much improved features, and to
monitoring capabilities over the entire system life.

Fig. 5. Sensors and measurement processes for the diagnosis of integrated circuits.
Active sensors:
- Electrical signature
- Logic testing
- Micromanipulator probes (after removing die coat)
- Nematic LCD to highlight operating circuit
- LCD displays for comparative circuit nodal
- Soft failure testing
- Electron beam
- X-ray analysis
- Capacitive discharges
- Dynamic and monitored accelerated testing/burn-in
- Humidity, vibration, EMC testing
- Mechanical abrasion with ultrasonic probe
- Radiation testing
- Laser melt
- Photoresist etching
Passive sensors:
- Visual inspection
- Electron microscopy
- Electrical pin-to-pin
- Leak testing
- Auger analysis
- Infrared thermography
- Freon boiling of hot spot
- LCD to detect changes in electrical field
- Passive accelerated testing/burn-in
- Storage reliability testing
3.3.3. Sensor diversity: The diversity is in terms of:
- measurement processes,
- design,
- location,
- acquisition rate, bandwidth, gain, wavelength, etc.,
- environmental exposure,
with possible sensor redundancies (active, passive) and distributed sensor control.
3.4. Measurement problems in technical diagnostics
In addition to the classical issues of calibration, measurement stability, process
consistency/stability, environment, noise, the specific concerns are:
3.4.1. Observability: This is a possible property of dynamic systems which
expresses the ability to infer or estimate the system condition at a given past
instant in time, from quantified records of all measurements made on it at later
points in time. This property does not hold for most UUTs, first because of
missing measurements/data, and next because of time dependent changes of the
system condition which, in general, cannot be modelled.
3.4.2. Accessibility to measurement points: One of the main limitations to observability is bad accessibility of the main test or measurement points because of
inadequate design, and the insufficient number of such measurements. Another
source of limitation is inadequate selection of the measurement sampling frequency
(spatial or temporal or optical), so that fine features revealing incipient failures go
unnoticed. Measurement delays $t_m$ are also a problem.
3.4.3. Effect of control elements and protective elements: The observability is further
reduced for some parts of the UUT because of:
- physical protection: hybrid/composite structures, coatings, multilayers, casings;
- protection and failure recovery systems: protection networks, fault-tolerant
parts, active spares, safing modes;
- control elements: feedback controllers, limiters, and measurement effects due to
the detection delay $t_d$.
3.4.4. Sensor-UUT interactions: In case of electrical and mechanical measurements, impedance and bandwidth mismatch are introduced at the interface level,
resulting in signal distortion features which do not originate in system failure
modes. In the case of human observations, sources of observation errors are many,
as expected. In the case of active sensors, it is essential to understand and model
as well as possible:
- the propagation of the probing energy into the UUT and the interaction with
the defects or failures;
- the inverse problem, of how defect and failure features propagate to the sensor.
EXAMPLE. Effects of intrinsic fracture energy on brittle fractures vs. ductile
fracture under plasticity, external and internal chemistry, and structural loadings.
This leads to complex crack kinetics, and ductile vs. brittle process models.
3.4.5. Support structure: The support structure, casing or board may, by its
properties or behavior, interfere with both the sensor and UUT, e.g. because of
mechanical impedance, electromagnetic interference (EMI), etc.
3.4.6. Distortion: This is a classical problem in measurement, but added difficulties
result from the fact that the sensors themselves cannot be properly modelled
outside their normal operating bandwidth, whereas true measurements on
systems which fail are likely to be characterized by extremely large bandwidths. Such large
bandwidths also conflict with low noise and infrequent calibration requirements.
L. F. Pau
[Figure 6: sensor diversity by sensor/measurement type, location, design, data acquisition (bandwidth, gain, wavelength, data rate) and software, with possible redundancies (active, passive, software) and distributed control. Sensor measurement type 1: signals (analog; digital; radiation). Sensor measurement type 2: images, electromagnetic waves. Sensor measurement type 3: human text; procedures; software, behavior. Output: diagnostic information.]
Fig. 6. Feature extraction and data fusion with sensor diversity.
3.4.7. Sensor reliability: Failure analysis and diagnosis are only possible if the
sensors of all kinds survive system failures; this may require sensor redundancy
(physical or analytical), separate power supplies, and different technologies and
computing resources. Besides sensor and processor reliability, short reaction time
$t_m$ and good feature extraction are conflicting hardware requirements, all of which
contribute to increased costs, which in turn limit the extent of possible implementations. Any diagnostic subsystem, and any UUT subsystem which can be
activated separately, should be equipped with a time meter, unit or cycle counter.
3.4.8. Data transmission errors: Whether the UUT is autonomous or not, analog
or digital multiplexing will often be used, followed by data transmission, e.g. on
a common bus or local network. These transmission links may themselves
generate errors and fail. However, if the data acquisition rate is slow under good
operating conditions, data transmission sometimes becomes irrelevant: on-site
temporary data storage is then a convenient solution.
3.5. Research on sensors for diagnosis
The main trends are:
- development of cheap and reliable distributed sensor arrays (acoustic imaging,
fiber optic sensors, distributed position sensors, accelerometers, ...);
- sensor integration and measurement fusion, to enhance the detection and
diagnosis capabilities (vibrations/pressure, temperature/pressure, optical/temperature, pressure/acceleration/flow);
- in-built analog-to-digital, or optoelectronic conversion;
- in-built digital data error-detecting-correcting circuits;
- software controlled calibration;
- better impedance matching of active sensors;
- noise suppression.
Moreover, there is increased attention given to the processing of unstructured
verbal/written reports and actions by human operators: even if expressed in plain
language, they will often reveal essential diagnostic features.
As already mentioned in Section 1, there appear to exist essentially only a few
fundamental diagnostic processes. The discovery of those amidst the technicalities of specific implementations has actually led to substantial achievements across different application areas (e.g. from mechanical to control systems,
from software to mechanical processes). We will therefore review the:
- diagnostic strategies;
- diagnostic system architectures controlled by these strategies (active and passive);
- test generation.
4.1. Diagnostic strategies S (Figure 7)
4.1.1. Diagnostic strategies S are always sequential, in at least one of the
following aspects:
- UUT configuration D: Diagnosis is sequentially applied to:
  - systems obtained by stepwise integration of these units/components;
  - automata, software modules, operating systems obtained by stepwise integration
  of the UUT with other interfacing systems (sensors, displays, controls, etc.), the
  selection being guided by the diagnostic strategy.
- Diagnostic information Y: The diagnosis uses an increasing number of
diagnostic measurements coming from a diversity of sensors, the selection being
guided by S; when active sensors are considered, the diagnostic measurements are
the results of the probing, as applied to successive UUT decompositions D.
- A priori/learning information I: The diagnosis uses an increasing amount of a priori/learning information, the retrieval being guided by S; this information set I includes data on the degradation process (see Section 3.1).
As a result, a diagnostic strategy S is a sequential search process in the product
set (D × Y × I): it is clear that UUT parts registration and data labelling are both
needed, besides timing information.
4.1.2. There are essentially three basic diagnostic strategies S:
- Failure mode removal by analysis and inspection: The detection, diagnosis,
localization and removal of the failure mode which has occurred are carried out
in sequence; the removal affects, among others: requirements, design, control,
usage, parts, repair, programs, etc.
- Validation: Diagnosis cannot be considered complete until the UUT has
been demonstrated to meet the requirements that were set out in the UUT
specifications; validation consists in verifying that these are met.
- Exploring the operational envelope: The external specifications define the
operational envelope within which the UUT must perform correctly in mode $E_0$.
These performance limits, while representative of the real-world process, are not
necessarily accurate, and quite different system states may occur. These strategies
S therefore explore the behavior under circumstances not given as performance
requirements, including 'severe' operating environments.
4.1.3. Diagnostic strategy assessment: The assessment is done in terms of the
expected risk attached to a random failure mode E, as estimated in terms of the
various performance criteria listed in Section 2.2.
4.1.4. Example: classification of software testing strategies S: The known software
testing techniques can be classified into the three classes of Section 4.1.2; see Figure 8.
1. Failure removal:
- Sensitized path testing
- Fault seeding
- Hardware/software test points and monitoring software
- Code analyzers
- Dynamic test probes, injection of test patterns of bits
2. Validation:
- Proof-of-correctness
- Program verification by predicate testing
- Proof-of-loops
- Validation using a representation in a specification language
- Validation by simulation
3. Exploring the operational envelope:
- Endurance tests
- Derivation of tests outside the specifications, by a specification
- Automatic test case generation
- Behavior of specific routines in extreme cases
- Stress tests (inputs, time), saturation tests
Fig. 8. Classification of software testing strategies S.
4.2. Diagnostic system architectures
The diagnostic strategies S to be implemented control the utilization of, and access
to, the UUT configuration D, the diagnostic information Y, and the failure models and a priori
information I, all of which are part of the diagnostic system. The failure mode $\hat{E}$
is determined by the final diagnostic decision unit. Especially important in the
diagnostic system architecture are the sequential set-up vs. D, Y, I with backtrackings, and the following units:
4.2.1. Measurement/diagnostic information unit: This senses diagnostic information
by active and passive sensors, and performs a parametric UUT identification by
adjusting a parametric model of the UUT; the estimated parameters are fed into
the diagnostic decision unit.
If these parameters are all measurable, the diagnosis is called external; if they
are only observable (and estimated by e.g. modal analysis, Kalman filter, or
error-detection-correction), the diagnosis is called internal.
4.2.2. Failure model unit: For a given UUT configuration D, operational environment, and set of other learning information I, this unit identifies and prioritizes
the possible failure modes $E_0, E_1, \ldots, E_{N-1}$ (e.g. critical parts, active routines,
fracture locations). A failure mode effect model (FMEA analysis) is then adjusted
to a usage model of the UUT (incorporating e.g. fatigue, ductility, heating, cumulative failures, cumulative contents of registers) to derive predicted parameter values
for all possible failure modes $E_0, E_1, \ldots, E_{N-1}$, and the potential effects on the
UUT performances.
Note that under a sequential diagnostic strategy S, a whole hierarchy of models,
with corresponding adjustment factors (environment, specification of parts, usage)
are needed; these models usually take the simple form of multi-entry tables stored
in read-only memories (e.g. fault dictionaries).
EXAMPLE. Sneak circuit analysis (failure mode identification). This is, for electronic circuits, a systematic review of electric current and logic paths down to the
components and logic statements, to detect latent paths, timing errors, software
errors, hardware failures. It uses essentially the specifications and nodal/topological network analysis, in addition to state diagrams for the logic.
EXAMPLE. Failures of bearings (FMEA analysis). See Figure 10.
[Figure 10 lists the failure modes $E_1, \ldots, E_{N-1}$: fatigue of rolling elements/tracks; wear; cage failures; lubrication starvation, contamination. Examples of feature parameters: vibration parameters; fiber optic inspection; shock pulses; radial position changes in shaft position/deflection; frictional losses; temperature changes.]
Fig. 10. Failure modes of bearings (FMEA analysis).
4.2.3. Diagnostic decision unit (Figure 11). This decision logic determines the likely
failure mode $\hat{E}$ among $E_0, E_1, \ldots, E_{N-1}$, from the estimated and predicted
parameters, taking into account the cost/risk/time factors. This process, which may
also derive classification features from these data, is essentially a pattern
recognition process (signals, images, coded data, text, symbols, logic invariants);
the simplest case is straightforward comparison (template matching) between
estimated and predicted parameters (including event counts).
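As an illustration of the template-matching case, the following minimal Python sketch selects the failure mode whose predicted parameter vector lies closest to the estimated one. The function name `match_failure_mode`, the Euclidean distance and all numerical values are assumptions made for the example only; the cost/risk/time factors mentioned above are not modelled.

```python
from math import sqrt


def match_failure_mode(estimated, predicted_by_mode):
    """Return the failure mode whose predicted parameters are nearest
    (Euclidean distance) to the estimated parameters."""
    def distance(a, b):
        return sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

    return min(predicted_by_mode,
               key=lambda mode: distance(estimated, predicted_by_mode[mode]))


# Hypothetical fault dictionary: failure mode -> predicted parameter values.
fault_dictionary = {
    "E0": [0.0, 0.0],   # correct operation
    "E1": [1.0, 0.0],
    "E2": [0.0, 2.0],
}

decision = match_failure_mode([0.9, 0.2], fault_dictionary)
```

With these illustrative values, the estimated vector is closest to the template of `E1`. In practice the parameters would usually be rescaled before comparison, so that no single unit of measurement dominates the distance.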
When the diagnostic decision is used for the prediction of the remaining UUT
life, and passive sensors only are used, one would use the term non-destructive
evaluation (NDE) instead of technical diagnostics.
Extensions to the above are required within the context of knowledge based
systems or expert systems for diagnostics (Pau, 1986).
4.3. Test generation
This is the process whereby the active sensors, controlled by the diagnostic
strategy S, select and apply specific types of probing energy to the UUT. These
processes can be classified according to two criteria:
(i) functional testing (by cause-effect tables) vs. structural testing (by sensitizing
probing energy);
(ii) deterministic vs. random (by noise, Monte Carlo simulation, random
The possible failure modes, and the corresponding probing signals generated by
the active sensors, will usually be determined by the failure model unit
(Section 4.2.2).
However, the difficult design/selection issue to be resolved is whether these test
signals can also detect other failure modes than those which they should characterize. Test generation design will have both to minimize these overlaps, and to
find minimum test sequences to energize all hypothesized failure modes.
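Finding a short test sequence that energizes every hypothesized failure mode can be viewed as a set-covering problem. The sketch below uses a greedy heuristic, which is an assumption of this example rather than a method prescribed here: it yields a short sequence, but not necessarily a minimum one. The test names, failure mode labels and the function `greedy_test_sequence` are hypothetical.

```python
def greedy_test_sequence(tests, failure_modes):
    """Greedy set cover: repeatedly pick the test that energizes the
    largest number of still-uncovered failure modes.

    tests: dict mapping a test name to the set of failure modes it energizes.
    Returns the chosen sequence and the modes that no test can energize.
    """
    uncovered = set(failure_modes)
    sequence = []
    while uncovered:
        best = max(tests, key=lambda t: len(tests[t] & uncovered))
        gained = tests[best] & uncovered
        if not gained:
            break  # the remaining modes are not energized by any test
        sequence.append(best)
        uncovered -= gained
    return sequence, uncovered


# Hypothetical probing signals and the failure modes each one energizes.
tests = {
    "T1": {"E1", "E2", "E3"},
    "T2": {"E3", "E4"},
    "T3": {"E4"},
}
sequence, missed = greedy_test_sequence(tests, ["E1", "E2", "E3", "E4"])
```

Here `T1` and `T2` together cover all four modes; the overlap on `E3` is exactly the kind of ambiguity that test generation design tries to minimize.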
4.4. Design considerations for diagnostic system architectures
These architectures must meet conflicting criteria, which are essentially:
- maximum diagnostic system reliability, because it must in general be larger than
the UUT reliability;
- relative diagnostic system cost vs. UUT cost;
- ease of use for human operators; the diagnostic system must be either faster
or more intelligent;
- updating capabilities and traceability;
- simultaneous design of the UUT and diagnostic system.
4.5. Statistical pattern recognition methods used
The diagnostic decision (Section 4.2.3 and Figure 2) is explicitly a pattern
classification problem, as already stated (Pau, 1981). When the measurements Y are restricted to numerical values (signals, data), the statistical pattern
recognition methods (Fukunaga, 1972; Sebestyen, 1962) apply (Saeks and Liberty,
1977; Pau, 1981a, b; Rasmussen and Rouse, 1981). In view of the requirements
of the previous sections (especially 4.4), the standard methods used at each stage
for the diagnostic decision are (Section 2.1):
Features are selected and priority ranked among the following:
1. User traffic (demand)
2. Off-line teletraffic measurements and statistics on:
- each route or link (flows and intensities)
- around each traffic node (input-output measurements)
3. On-line teletraffic measurements for:
- flow control
- congestion control/windowing
- protocol use and interrupts
4. Hardware, software node condition monitoring
5. Error correction, propagation anomalies compensation, and disruption of links
6. Test and monitoring unit condition
7. Protection of transmission links carrying diagnostic information
Fig. 12. Features for data communications network tests and monitoring.
Failure detection
- Sequential hypothesis testing (Wald, 1947).
- Non-parametric sequential testing (Pau, 1978; Fu, 1968; Wald, 1947).
- Hypothesis testing (shift of the mean, variance) (Clark et al., 1975; Sebestyen, 1962).
- Bayes classification (Fukunaga, 1972).
- Discriminant analysis (Fukunaga, 1972; Sebestyen, 1962).
- Nearest neighbor classification rule (Fukunaga, 1972; Devijver, 1979).
- Sensor/observation error compensation (Pau and Kittler, 1980).
Failure localization
- Graph search algorithms (Saeks and Liberty, 1977; Rasmussen and Rouse,
1981; Slagle and Lee, 1971).
- Branch-and-bound algorithms (Narendra and Fukunaga, 1977).
- Dynamic programming (Pau, 1981a; Bellman, 1966).
- Logical inference (Pau, 1984).
Failure diagnosis
- Correspondence analysis (Pau, 1981a; Hill, 1974; Section 5).
- Discriminant analysis (Van de Geer, 1971; Benzecri, 1977).
- Canonical analysis (Hartman, 1960; Benzecri, 1977).
- Nearest neighbor classification rule (Fukunaga, 1972; Devijver, 1979).
- Knowledge based or expert systems for diagnostics (Pau, 1986).
Failure analysis
- Variance analysis, correlation analysis (Van de Geer, 1971).
- Principal components analysis (Pau, 1981a; Van de Geer, 1971; Chien and Fu, 1967).
- Scatter analysis (Van de Geer, 1971; Everitt, 1974).
- Clustering procedures, e.g. dynamic clusters algorithm (Pau, 1981a; Everitt,
1974; Hartigan, 1975).
- Multivariate probability density estimation (Parzen, kernel functions, k-nearest
neighbour estimators) (Fukunaga, 1972; Devijver, 1979; Parzen, 1962).
- Multivariate sampling plans (Pau et al., 1983).
Failure monitoring
- Statistics of level crossings, especially two-level crossings (Saeks and Liberty,
1977; Pau, 1981a).
- Spectral analysis and FFT (Chen, 1982).
- Kalman estimation (Pau, 1981a, 1977).
- Recursive least-squares estimators.
- Linear prediction ARMA, ARIMA estimators (Chen, 1982).
- Knowledge based or expert systems for failure monitoring (Pau, 1986).
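To make the first failure detection method in the list above concrete, here is a minimal sketch of Wald's sequential probability ratio test for a shift of the mean. It assumes independent Gaussian measurements with known standard deviation, a nominal mean `mu0` and a failed-state mean `mu1`, and uses Wald's approximate thresholds; the function name and the numerical values are illustrative assumptions, not taken from the references.

```python
from math import log


def sprt_mean_shift(samples, mu0, mu1, sigma, alpha=0.05, beta=0.05):
    """Sequential probability ratio test of H0: mean = mu0 (no failure)
    against H1: mean = mu1 (failure), Gaussian noise with known sigma.

    alpha: tolerated false alarm probability; beta: tolerated miss probability.
    Returns (decision, number of samples used).
    """
    upper = log((1.0 - beta) / alpha)   # cross it: decide "failure"
    lower = log(beta / (1.0 - alpha))   # cross it: decide "no failure"
    llr = 0.0                           # cumulative log-likelihood ratio
    for n, x in enumerate(samples, start=1):
        llr += (mu1 - mu0) * (x - (mu0 + mu1) / 2.0) / sigma ** 2
        if llr >= upper:
            return "failure", n
        if llr <= lower:
            return "no failure", n
    return "undecided", len(samples)


decision, used = sprt_mean_shift([1.0] * 20, mu0=0.0, mu1=1.0, sigma=1.0)
```

The test stops as soon as the accumulated evidence is sufficient, which is what makes it attractive when each additional measurement is costly or delays the diagnosis.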
The problem is to diagnose defective machines among 33 machines, each described
by 4 measurements, while deriving a sequential diagnostic strategy S and
satisfying, in that order, three detection criteria:
(α) maximum vibration level,
(β) minimum flow,
(γ) minimum electricity consumption.
5.1. Method
5.1.1. Introduction and problem analysis
(a) The case is set up as a clustering problem, where each of the 33 machines
considered is described by measurement attributes (vibration level, operating time,
electricity consumption, flow). The raw data are given in Figure 13. Some essential
characteristics of this problem are the following:
(i) the answer requested is to reduce the number of alternatives for the
diagnosis and failure location;
(ii) it is obvious, for technical reasons, that the four attributes are correlated;
(iii) the number of attributes measured on each machine is fairly small, and all
observations are real valued and non-negative.
Fig. 13. Raw data of machine diagnosis case (Section 5).
However, the parameters of these relations are unknown and they can only be
inferred from the sample of 33 machines.
(b) These characteristics justify the use of multivariate statistical analysis, and of correspondence analysis in particular because of its joint use
of information about the machines and about the diagnostic measurements. The
main steps of correspondence analysis are the following (Pau, 1981a; Chen, 1982):
Step 1. First, infer from the data estimated correlations between machines and
between diagnostic measurements, and a reduced set of independent feature measurements, according to which the 33 alternative machines may be ranked. As far as
this step is concerned, and this step only, correspondence analysis is comparable
to factor analysis (Van de Geer, 1971; Hartman, 1960), although the two differ
in the remaining steps.
Step 2. Next, interpret the nature of these statistically independent feature
measurements, by indicating the contribution to each of these by the original
attribute measurements, and determine the diagnosis in terms of these features.
Step 3. Thereafter, rank the feature measurements by decreasing contributions
to the reconstruction of the original 33 x 4 evaluation measurements; the best
feature measurement (e.g. the first) is, in correspondence analysis, the one
maximizing the variance in that direction; in other words, this is the feature
measurement which produces the ranking with the highest possible discrimination
among the 33 machines, thus reducing the doubt of the repairman.
Step 4. Finally, recommend to the failure location those machines which get
the most favorable ranking (in terms of the interpretation) on the first feature axis,
eventually also on the second axis.
(c) One essential advantage of this approach is that the decision maker will
be provided with a two-dimensional chart, which he may easily interpret, and on
which he may spot by eye, in a straightforward manner, the final reduced
set of candidate machines. Also, apart from the number of feature measurements
used in Step 4, no additional assumption is needed, because unsupervised multivariate statistical analysis is used. The effect of linear transformation and rescaling
of the initial data is indicated in Section 5.1.2.6.
5.1.2. Theory and use of correspondence analysis (Chen, 1982; Hill, 1974; Pau,
1981a).
Notation. Let $k(I, J)$ be the incidence table of non-negative numbers,
representing the attribute measurements $j \in J$, $j = 1, 2, 3, 4$, on the machines $i \in I$,
$i = 1, \ldots, 33$. The marginals are defined as follows:
$$k(i, \cdot) \triangleq \sum_{j \in J} k(i, j), \qquad k(\cdot, j) \triangleq \sum_{i \in I} k(i, j).$$
It is convenient to operate on the contingency table $p(I, J)$, rather than on the
incidence table $k(I, J)$:
$$p(i, j) \triangleq k(i, j) \Big/ \sum_{m \in I,\, n \in J} k(m, n),$$
with the corresponding marginals $p(i, \cdot)$, $p(\cdot, j)$.
$r$ will be the number of feature measurements extracted; here $r \le 4$.
Concepts and principles of interpretation. Generalizing the classical partition
of a contingency table by a $\chi^2$ test (Pearson), correspondence analysis yields
natural clusters made of rows $i \in I$ and columns $j \in J$ which go together to form
natural groups in the feature measurement space. Their construction is essentially
based upon geometrical proximities between rows $i \in I$ and/or columns $j \in J$; these
proximities may be identified by visual inspection, if only two feature measurements are considered, by building coordinate axes for all machines $i \in I$ and
attribute measurements $j \in J$. Such representations, called maps, are valuable tools
for visual clustering, and thus for diagnosing causality relations between measurements and machines.
By construction, all the effects of statistically dependent rows and columns such
that
$$k(i, j) = k(i, \cdot)\, k(\cdot, j)$$
will be removed. Equivalent machines will thus appear immediately as having very
close representations on the maps. The machine space $I$ is provided with a
distance measure, called the $\chi^2$ metric, defined by
$$d^2(i_1, i_2) \triangleq \sum_{j \in J} p(\cdot, j)\, [x(i_1, j) - x(i_2, j)]^2, \qquad x(i, j) \triangleq \frac{p(i, j)}{p(i, \cdot)\, p(\cdot, j)} - 1.$$
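The notation above can be exercised on a small table. The following sketch, with a hypothetical 3 × 2 incidence table and hypothetical function names, computes the contingency table, its marginals and the $\chi^2$ distance between two rows. It assumes that $x(i, j)$ is the ratio of $p(i, j)$ to the product of its marginals, minus one; the constant cancels in the difference, so the distance does not depend on it.

```python
def contingency(k):
    """Turn an incidence table k (list of rows) into the contingency table p
    and its row and column marginals p(i, .) and p(., j)."""
    total = float(sum(sum(row) for row in k))
    p = [[value / total for value in row] for row in k]
    p_row = [sum(row) for row in p]
    p_col = [sum(p[i][j] for i in range(len(p))) for j in range(len(p[0]))]
    return p, p_row, p_col


def chi2_distance(p, p_row, p_col, i1, i2):
    """Squared chi-square distance between rows (machines) i1 and i2."""
    d2 = 0.0
    for j in range(len(p_col)):
        x1 = p[i1][j] / (p_row[i1] * p_col[j]) - 1.0
        x2 = p[i2][j] / (p_row[i2] * p_col[j]) - 1.0
        d2 += p_col[j] * (x1 - x2) ** 2
    return d2


# Hypothetical table: 3 machines, 2 measurements; rows 0 and 2 are proportional.
k = [[2, 1], [1, 2], [4, 2]]
p, p_row, p_col = contingency(k)
d_01 = chi2_distance(p, p_row, p_col, 0, 1)
d_02 = chi2_distance(p, p_row, p_col, 0, 2)
```

Rows 0 and 2 have the same profile, so their distance vanishes: this is the sense in which equivalent machines have very close representations on the maps.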
Moreover, each machine $i \in I$ and each measurement $j \in J$ are assigned the
weights $p(i, \cdot)$ and $p(\cdot, j)$, respectively, for all variance computations using the
$\chi^2$ metric.
Theory of correspondence analysis: summary (Pau, 1981a; Chen, 1982; Hill, 1974).
(a) Correspondence analysis, or as it is also called, Fisher's canonical analysis
of contingency tables, amounts to looking for vectors
$$F = {}^t(F(1), \ldots, F(\mathrm{Card}(J))), \qquad G = {}^t(G(1), \ldots, G(\mathrm{Card}(I))),$$
where $\mathrm{Card}(\cdot)$ is the number of elements in the set, such that when the functions
$f$, $g$ of the random variables $(Y, X) = (j, i)$ are defined by the relations
$$f(Y) = F(j), \qquad g(X) = G(i),$$
then the correlation between the random variables $f(Y)$, $g(X)$ is maximum. Correspondence analysis can be applied to non-negative incidence tables $k(I, J)$, as
well as to contingency tables $p(I, J)$; the former will be considered in the
following.
(b) Let $k(I, \cdot)$ and $k(\cdot, J)$ be the diagonal matrices of row and column totals,
assuming none to be zero. The sequence of operations
$$F^{(1)} = (k(\cdot, J))^{-1}\, {}^t k(I, J)\, G^{(1)},$$
$$G^{(2)} = (k(I, \cdot))^{-1}\, k(I, J)\, F^{(1)},$$
$$F^{(2)} = (k(\cdot, J))^{-1}\, {}^t k(I, J)\, G^{(2)}, \quad \ldots$$
in which new vectors $F^{(m)}$, $G^{(m)}$ are successively derived from an initial vector
$G^{(1)}$, is referred to here as the $\mathrm{Co}(k(I, J))$ algorithm corresponding to the tableau
$k(I, J)$.
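The alternating sequence of operations in (b) can be sketched in a few lines of Python. The version below is only an illustration under stated assumptions: the function name `first_factor` is hypothetical, and the weighted centering and normalization applied at each pass are additions of this sketch, used to discard the trivial constant solution and to keep the iteration numerically bounded, so that it converges towards the first non-trivial axis.

```python
from math import sqrt


def first_factor(k, iterations=100):
    """Alternate F = D_col^-1 * t(k) * G and G = D_row^-1 * k * F on the
    incidence table k, and return (eigenvalue, F, G) for the first
    non-trivial axis. F and G are returned up to a scale factor."""
    n_rows, n_cols = len(k), len(k[0])
    row_tot = [float(sum(row)) for row in k]
    col_tot = [float(sum(k[i][j] for i in range(n_rows))) for j in range(n_cols)]
    total = sum(row_tot)
    g = [float(i) for i in range(n_rows)]      # arbitrary non-constant start
    f, lam = [0.0] * n_cols, 0.0
    for _ in range(iterations):
        # Remove the trivial constant solution (weighted centering).
        mean = sum(w * x for w, x in zip(row_tot, g)) / total
        g = [x - mean for x in g]
        norm = sqrt(sum(w * x * x for w, x in zip(row_tot, g)) / total)
        if norm == 0.0:
            break                               # no non-trivial structure left
        g = [x / norm for x in g]
        f = [sum(k[i][j] * g[i] for i in range(n_rows)) / col_tot[j]
             for j in range(n_cols)]
        g_next = [sum(k[i][j] * f[j] for j in range(n_cols)) / row_tot[i]
                  for i in range(n_rows)]
        # One full pass multiplies a unit-norm G by the eigenvalue.
        lam = sqrt(sum(w * x * x for w, x in zip(row_tot, g_next)) / total)
        g = g_next
    return lam, f, g


eigenvalue, F1, G1 = first_factor([[2, 1], [1, 2]])
```

For this hypothetical 2 × 2 table the iteration settles on the eigenvalue 1/9, which equals the $\chi^2$ statistic of the table divided by its total.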
(c) Its eigenvectors, as defined below, are the solutions of the correspondence
analysis problem, and the coordinates of the individuals and measurements in the
feature space are simply:
$$F(j, n) = F_n^*(j), \qquad G(i, n) = G_n^*(i),$$
where $n = 1, \ldots, \mathrm{Min}(\mathrm{Card}(I), \mathrm{Card}(J))$, and $F_n^*$, $G_n^*$ are the eigenvectors of rank
$n$ of the algorithm $\mathrm{Co}(k(I, J))$, when ranked by decreasing eigenvalues $\lambda_n$.
(d) Each triple $(\rho, F^*, G^*)$ is an eigensolution if:
$$\rho\, G^* = (k(I, \cdot))^{-1}\, k(I, J)\, F^*,$$
$$\rho\, F^* = (k(\cdot, J))^{-1}\, {}^t k(I, J)\, G^*.$$
Computational formulas.
(1) Define the dimension $1 \le r \le \mathrm{Min}(\mathrm{Card}(I), \mathrm{Card}(J))$ of the feature space
after data compression.
(2) (a) $G_n^*$ and $\lambda_n = \rho_n^2$ are respectively the $(n + 1)$st column eigenvector and
associated eigenvalue of the symmetrical semi-definite matrix $S = [s_{jl}]$:
$$s_{jl} = \sum_{i \in I} \frac{p(i, j)\, p(i, l)}{p(i, \cdot)\, \sqrt{p(\cdot, j)\, p(\cdot, l)}}, \qquad j, l \in J,$$
which has $\lambda_0 = 1$ as largest eigenvalue, for which all the coordinates of $G_0^*$ are equal.
(b) These eigenvectors $G_n^* = [G_n^*(i),\ i = 1, \ldots, \mathrm{Card}(I)]$ are ranked by decreasing eigenvalues $1 \ge \lambda_1 \ge \ldots \ge \lambda_r > 0$. They are the factor axes of the cluster
$N(I)$.
(3) The factor axes $F_n^*$ of the cluster $N(J)$ are associated to the same eigenvalues $\lambda_n$, and
$$F_n^* = (1/\sqrt{\lambda_n})\, (p(\cdot, J))^{-1}\, {}^t p(I, J)\, G_n^*,$$
where
$$(p(\cdot, J))^{-1}\, {}^t p(I, J) = [p(j, i)/p(\cdot, j)], \qquad i = \text{row};\ j = \text{column}.$$
(4) (a) The coordinate $G(i, n)$, $n = 1, \ldots, r$, of the individual $i \in I$ on the factor
axis $G_n^*$ is $G_n^*(i)$.
(b) The coordinate $F(j, n)$, $n = 1, \ldots, r$, of the measurement $j \in J$ on the factor
axis $F_n^*$ is $F_n^*(j)$.
(c) Both individuals $i \in I$ and measurements $j \in J$ may then be displayed in the
same $r$-dimensional feature space, with basis vectors $G_n^*$, $n = 1, \ldots, r$:
$$G(i, n) = \frac{1}{\sqrt{\lambda_n}\; p(i, \cdot)} \sum_{j \in J} p(i, j)\, F(j, n), \qquad i \in I,\ n = 1, \ldots, r.$$
(5) Data reconstruction formula:
$$p(i, j) = p(i, \cdot)\, p(\cdot, j) \Big[ 1 + \sum_{n = 1, \ldots, r} \sqrt{\lambda_n}\, F(j, n)\, G(i, n) \Big].$$
Contributions, and interpretations of the factor axes representing the feature
measurements. On a map, the squared Euclidean distance D between rows and/or
columns has the same value as the $\chi^2$ distance between the corresponding
profiles, and
$$\lambda_n = \sum_{j \in J} p(\cdot, j)\, (F(j, n))^2 = \sum_{i \in I} p(i, \cdot)\, (G(i, n))^2, \qquad n = 1, \ldots, r,$$
$$\tau_n = \lambda_n / \mathrm{Trace}(S).$$
This justifies the following definitions:
(i) $p(i, \cdot)\, (G(i, n))^2\, \mathrm{Sign}(G(i, n))$ is the contribution of the row/machine $i$ to the
factor axis $n$ of inertia $\lambda_n$;
(ii) $p(\cdot, j)\, (F(j, n))^2\, \mathrm{Sign}(F(j, n))$ is the contribution of the column/measurement
$j$ to the factor axis $n$ of inertia $\lambda_n$.
The rule is then to interpret the feature axis $n$ with reference only to those
machines and measurements which have the largest (or smallest) contributions to
that axis.
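A short sketch may help to read definitions (i) and (ii): the contribution of a point is its weight times its squared coordinate, carrying the sign of the coordinate, and the absolute contributions along one axis add up to the inertia of that axis. The function name and the two-machine example below are hypothetical; the weights and coordinates are chosen so that the axis has inertia 1/9.

```python
def signed_contributions(weights, coordinates):
    """Signed contribution of each point to one factor axis:
    weight * coordinate**2 * sign(coordinate)."""
    def sign(value):
        return 1.0 if value > 0 else -1.0 if value < 0 else 0.0

    return [w * c * c * sign(c) for w, c in zip(weights, coordinates)]


# Hypothetical axis: two machines of weight 1/2 with coordinates +1/3 and -1/3.
weights = [0.5, 0.5]
coordinates = [1.0 / 3.0, -1.0 / 3.0]
contributions = signed_contributions(weights, coordinates)
inertia = sum(abs(c) for c in contributions)
```

Sorting the machines by these signed values gives directly the ordering used when a criterion calls for large positive contributions to an axis.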
5.1.2.6. Effect of rescaling the data $k(I, J)$. If the attribute measurement $k(i, j)$ is
rescaled by a factor $a_j > 0$, and if the modified $x$ coordinates are noted $x_a$, then
$$x_a(i, j) \triangleq \frac{x(i, j) + 1}{1 + (a_j - 1)\, p(i, j)/p(i, \cdot)}.$$
If we assume aj small, Card(J) large, we may replace p(i, j) by its expected
value and get the approximation
xa(i, j) "~ 1
1 ] (x(i, j)+ 1)- i.
As a consequence, the modified $\chi^2$ distance becomes
da2(il, i2) = aj 1
aj - 1_] 2 d2(i,, i2).
In other words, if one attribute measurement j ~ J is rescaled, essentially only
the point representing this measurement will be moved, whereas all distances in
the machine space I will be multiplied by the same factor.
Rescaling does consequently not affect the relative positions of the machines,
and the machine diagnosis procedure does still apply.
[Figure 14: table with four panels: 1. Coordinates F of the measurements; 2. Coordinates G of the machines; 3. Eigenvalues and inertia; 4. Eigenvectors.]
Fig. 14. Coordinates of all measurements and machines (Section 5).
L. F. Pau
5.2. Case results
Following the procedure presented in Section 5.1, the theory of which was
summarized in Section 5.1.2, we will in the following interpret the numerical
results obtained, as displayed in the companion Figures 14, 15, 16.
5.2.1. Step 1: Computation of the feature measurements. First, r = 3 feature
measurements are extracted; they are the eigenvectors $G_1^*$, $G_2^*$, $G_3^*$.
5.2.2. Step 2: Interpretation of the feature measurements.
(a) They are obvious from the reading of the computed contributions of the
machines and measurements to $G_1^*$, $G_2^*$, and $G_3^*$ (see Figure 14).
(i) $G_1^*$: The first feature measurement opposes the operating time (contribution = 0.304 E-02) to the vibration level (contribution = -0.103 E-02),
while the flow has a weaker but here similar contribution to the operating time; this
first feature measurement is thus the vibration level per unit of operating time.
(ii) $G_2^*$: The second feature measurement opposes the flow (contribution = 0.691 E-03) to operating time (contribution = -0.244 E-03); the
second feature measurement is thus the flow required for running the machine.
(iii) $G_3^*$: The third feature measurement isolates the electricity consumption
alone (contribution = 0.181 E-04); this means that it has only a minor impact
on the machine diagnosis problem.
(b) The goals are to fulfill, in the given order, the following diagnostic criteria:
(α) maximize the vibration level per unit operating time, thus select machines
with large positive contributions and coordinates on $G_1^*$;
(β) minimize the flow, thus select machines with large positive contributions
and coordinates on $G_2^*$;
(γ) minimize the electricity consumption, thus select machines with large positive contributions and coordinates on $G_3^*$.
5.2.3. Step 3: Ranking the feature measurements. The numerical results from
Figure 14 yield:
$\lambda_1$, eigenvalue of $G_1^*$ = 0.4629 E-02, or $\tau_1$ = 82.07%,
$\lambda_2$, eigenvalue of $G_2^*$ = 0.993 E-03, or $\tau_2$ = 17.61%,
$\lambda_3$, eigenvalue of $G_3^*$ = 0.181 E-04, or $\tau_3$ = 0.32%.
Here, it is obvious that the machine diagnosis would essentially rely on the first
feature measurement (vibration level per unit of operating time) and possibly
to some extent on the second (flow). Our three-criteria problem has been reduced to
a two-criteria problem with $G_1^*$ as the leading diagnostic criterion to be maximized.
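The inertia percentages quoted in Step 3 can be checked directly from the three eigenvalues. The sketch below assumes that each percentage is the eigenvalue divided by the sum of the three non-trivial eigenvalues, which reproduces the quoted figures; the function name is hypothetical.

```python
def inertia_percentages(eigenvalues):
    """Share of the total inertia carried by each factor axis, in percent."""
    total = sum(eigenvalues)
    return [100.0 * value / total for value in eigenvalues]


# Eigenvalues of the three extracted feature measurements (Step 3).
eigenvalues = [0.4629e-2, 0.993e-3, 0.181e-4]
percentages = [round(p, 2) for p in inertia_percentages(eigenvalues)]
# percentages == [82.07, 17.61, 0.32]
```

The first two axes together carry more than 99% of the inertia, which is why the third criterion can be dropped from the ranking.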
5.2.4. Step 4: Machine diagnosis.
(a) Looking at the machines in the first quadrant of Figure 16, one sees that
the non-dominated points according to the two criteria (α) and (β) are 32, 23, 27,
3, 30, 2.
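Selecting the non-dominated points under two criteria to be maximized is a small Pareto filter, sketched below. The coordinates are invented for the illustration and are not those of Figure 16; the machine labels and the function name are likewise hypothetical.

```python
def non_dominated(points):
    """Return the names of the points that are not dominated, i.e. for which
    no other point is at least as good on both axes and strictly better on
    at least one. Both coordinates are to be maximized."""
    front = []
    for name, (a, b) in points.items():
        dominated = any(
            a2 >= a and b2 >= b and (a2 > a or b2 > b)
            for other, (a2, b2) in points.items()
            if other != name
        )
        if not dominated:
            front.append(name)
    return front


# Hypothetical (first axis, second axis) coordinates of five machines.
machines = {
    "M1": (3.0, 1.0),
    "M2": (2.0, 2.0),
    "M3": (1.0, 3.0),
    "M4": (1.0, 1.0),
    "M5": (2.0, 1.0),
}
candidates = non_dominated(machines)
```

With these values, M4 and M5 are dominated and drop out, leaving M1, M2 and M3 as candidates to be ordered by the leading criterion.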
Fig. 16. Map of all 4 measurements and 33 machines.
(b) Because we want the criterion (α) to dominate, we will have to make an
ordering within these non-dominated solutions. Figure 15, which contains the
contributions of the machines to $G_1^*$, Figure 14, which contains their coordinates,
and last but not least the map of Figure 16, give us, according to the rule (α), the
following sequence:
Diagnose as defective machine #32; if not: #30; if not: #23; if not: #2; if
not: #27; if not: #3; etc.
However, the first machine in this sequence also to have a large positive
contribution to $G_2^*$ (flow) according to criterion (β) is Machine 27, and the next
Machine 3, or Machine 6. Machines 30 and 2 have negative contributions to $G_2^*$,
and should be eliminated.
(c) By visual clustering, one could select right away the machines by the original
criteria on the vibration level, the operating time, the electricity consumption, or the flow per se, by looking on the factor map of Figure 16 for those
machines which are close to the points representing these criteria/measurements:
(i) Max vibration level: Machines 14, 19, 31, 24, 8, 20, 13, 18, close to PRIC.
(ii) Min operating time: Machines 2, 21, 1, 10, close to TIME.
(iii) Min electricity consumption: Machines 17, 16, close to CONS.
(iv) Min flow: Machines 6, 3, 27, 25, 11, close to WATR.
Notice the large differences between the previous selections (a), (b) according
to criteria (α) and (β), and the latter ones (c).
5.2.5. Conclusion. Because of the significant contributions of $G_1^*$ and $G_2^*$, and
because of the removal of correlated effects, we recommend the following reduced
diagnosis of defective machines:
Machines 32, 23, 27, 3 (in that order, the first being the most likely to have
The bibliography on statistical and pattern recognition approaches to failure
diagnosis is enormous, and scattered across many sections of the technical literature, often within the context of specific applications. Therefore, in addition to a
few recent references of a general nature, a number of major
public conferences dealing to a substantial extent with technical diagnostics are listed.
Neither list is by any means complete, but both are given to serve as starting
points.
Bellman, R. (1966). Dynamic programming, pattern recognition and location of faults in complex
systems. J. Appl. Probab. 3, 268-280.
Benzecri, J. P. (1977). L'Analyse des Données, Vol. 1 & 2. Dunod, Paris.
Chen, C. H. (1982). Digital Waveform Processing and Recognition. CRC Press, Boca Raton, FL.
Chien, Y. T. and Fu, K. S. (1967). On the generalized Karhunen-Loève expansion. IEEE Trans.
Inform. Theory 13, 518-520.
Clark, R. N. et al. (1975). Detecting instrument malfunctions in control systems. IEEE Trans.
Aerospace Electron. Systems 11 (4).
Collacott, R. A. (1976). Mechanical Fault Diagnosis and Condition Monitoring. Chapman & Hall,
Devijver, P. A. (1979). New error bounds with the nearest neighbor rule. IEEE Trans. Inform. Theory
25, 749-753.
Everitt, B. (1974). Cluster Analysis. Wiley, New York.
Fu, K. S. (1968). Sequential Methods in Pattern Recognition and Machine Learning. Academic Press,
New York.
Fukunaga, K. (1972). Introduction to Statistical Pattern Recognition. Academic Press, New York.
Hartigan, J. A. (1975). Clustering Algorithms. Wiley, New York.
Harman, H. (1960). Modern Factor Analysis. University of Chicago Press, Chicago, IL.
Hill, M. O. (1974). Correspondence analysis: a neglected multivariate method. Appl. Statist. Ser. C
23 (3), 340-354.
IEEE Spectrum (1981). Special issue on reliability, October 1981.
IMEKO (1980). TC-10: Glossary of terms and definitions recommended for use in technical
diagnostics and condition-based maintenance. IMEKO Secretariat, Budapest.
Narendra, P. M. and Fukunaga, K. (1977). A branch and bound algorithm for feature subset
selection. IEEE Trans. Comput. 26, 917-922.
Parzen, E. (1962). On estimation of a probability density function and mode. Ann. Math. Statist. 33,
Pau, L. F. (1977). An adaptive signal classification procedure: application to aircraft engine
monitoring. Pattern Recognition 9 (3), 121-130.
Pau, L. F. (1978). Classification du signal par tests séquentiels non-paramétriques. In: Proc. Conf.
Reconnaissance des formes et traitement des images. INRIA, Rocquencourt, pp. 159-168.
Pau, L. F. (1981a). Failure Diagnosis and Performance Monitoring. Marcel Dekker, New York.
Pau, L. F. (1981b). Applications of pattern recognition to failure analysis and diagnosis. In: K. S. Fu,
ed, Applications of Pattern Recognition. CRC Press, Boca Raton, FL, Chapter 5.
Pau, L. F. (1984). Failure detection processes by an expert system and hybrid pattern recognition.
Pattern Recognition Lett. 2, 419-425.
Pau, L. F. (1986). A survey of expert systems for failure diagnosis, test generation and maintenance.
Expert Systems J. 3 (2), 100-111.
Pau, L. F. (1987a). Knowledge representation approaches in sensor fusion. In: Proc. IFAC World
Congress. Pergamon Press, Oxford.
Pau, L. F. (1987b). System availability in presence of an imperfect test and monitoring system. IEEE
Trans. Aerospace Electron. Systems, 23(5), 625-633.
Pau, L. F. and Kittler, J. (1980). Automatic inspection by lots in the presence of classification errors.
Pattern Recognition 12 (4), 237-241.
Pau, L. F., Toghrai, C. and Chen, C. H. (1983). Multivariate sampling plans in quality control: a
numerical example. IEEE Trans. Reliability 32 (4), 359-365.
Rasmussen, J. and Rouse, W. B. (Editors) (1981). Human Detection and Diagnosis of System Failures.
NATO Conference series, Vol. 15, Series 3. Plenum Press, New York.
Saeks, R. and Liberty, S. (1977). Rational Fault Analysis. Marcel Dekker, New York.
Sebestyen, G. (1962). Decision Making Processes in Pattern Recognition. MacMillan, New York.
Slagle, J. R. and Lee, R. C. T. (1971). Application of game tree searching techniques to sequential
pattern recognition. Comm. ACM 14 (2), 103-110.
Van de Geer, J. P. (1971). Introduction to Multivariate Analysis for the Social Sciences. Freeman, San
Francisco, CA.
Wald, A. (1947). Sequential Analysis. Wiley, New York.
IEEE Automatic Testing Conference (AUTOTESTCON).
IEEE International Test Conference (Cherry Hill).
IEEE/AIAA Annual Reliability and Maintainability Conferences.
IEEE/IFIP International Conferences on Fault-Tolerant Computing.
IEEE Reliability Physics.
IEEE/ASME/AIAA American Automatic Control Conference.
ASME (American Society of Mechanical Engineers) International Conference on Non-destructive
ASNT (American Society for Non-destructive Testing) Annual QUALTEST Conference.
ASNT (American Society for Non-destructive Testing) Topical Conferences.
IFAC (International Federation on Automatic Control), SAFECOMP (Safe Computing) Conference.
IMEKO (International Measurement Confederation) International Conference on technical diagnostics.
IBE Conf. Ltd, International Conference on Terotechnology, England.
BINDT (British Institute of NDT), Annual Conference on Non-destructive Testing, England.
EFMS (European Federation of Maintenance Societies), European Maintenance Congress.
Mechanical Failure Prevention Group (MFPG), National Bureau of Standards, Conference on
Detection, Diagnosis and Prognosis.
ISTFA (International Society for Testing and Failure Analysis), Annual Testing and Failure Analysis
IFS Publ., International Conference on Automated Inspection and Product Control, England.
NETWORK Ltd, Annual Conference on Automatic Testing, England.
Institute of Environmental Sciences, Annual Conference, USA.
ASM (American Society of Metals), International Conference on Non-destructive Evaluation in the
Nuclear Industry.
ESPRIT-supported Conferences on Expert Systems for Failure Diagnosis.
P. R. Krishnaiah and C. R. Rao, eds., Handbook of Statistics, Vol. 7
© Elsevier Science Publishers B.V. (1988) 313-331
1 I~
Nonparametric Estimation of Density and Hazard
Rate Functions when Samples are Censored*
W. J. Padgett
1. Introduction
A common and very old problem in statistics is the estimation of an unknown
probability density function. In particular, the problem of nonparametric probability density estimation has been studied for many years. Summaries of results
on nonparametric density estimation based on complete (uncensored) random
samples have been listed recently by several authors, including Fryer [18], Tapia
and Thompson [52], Wertz and Schneider [60], and Bean and Tsokos [2]. Also,
a review of results for censored samples has been given by Padgett and
McNichols [39]. In addition to its importance in theoretical statistics, nonparametric density estimation has been utilized in hazard analysis, life testing, and
reliability, as well as in the areas of nonparametric discrimination and high energy
physics [20].
The purpose of this article is to present the different types of nonparametric
density estimates that have been proposed for the situation that the sample data
are censored or incomplete. This type of data arises in many life testing situations
and is common in survival analysis problems (see Lagakos [25] and Kalbfleisch
and Prentice [21], for example). In many of these situations, some observations
may be censored or truncated from the right, referred to as right-censorship. This
occurs often in medical trials when the patients may enter treatment at different
times and then either die from the disease under investigation or leave the study
before its conclusion. A similar situation may occur in industrial life testing when
items are removed from the test at random times for various reasons. It is of
interest to be able to estimate nonparametrically the unknown density of the
lifetime random variable from this type of data without ignoring or discarding the
right-censored information. The development of such nonparametric density estimators has only occurred in the past six or seven years and the avenues of
investigation have been similar to those for the complete sample case, except that
the problems are generally more difficult mathematically.
* This work was supported by the U.S. Air Force Office of Scientific Research and Army Research
The various types of estimators from right-censored samples that have been
proposed in the literature will be indicated and briefly discussed here. They
include histogram-type estimators, kernel-type estimators, maximum likelihood
estimators, Fourier series estimators, and Bayesian estimators. In addition, since
the hazard rate function estimation problem is closely related to the density
estimation problem, various types of nonparametric hazard rate estimators from
right-censored data will be briefly mentioned. Due to their computational simplicity and other properties, the kernel-type density estimators will be emphasized,
and some examples will be given in Section 7.
Before beginning the discussion of the various estimators, in the next section
the required definitions and notation will be presented.
2. Notation and preliminaries
Let $X_1^0, X_2^0, \ldots, X_n^0$ denote the true survival times of $n$ items or individuals
which are censored on the right by a sequence $U_1, U_2, \ldots, U_n$ which in general
may be either constants or random variables. It is assumed that the $X_i^0$'s are
nonnegative independent identically distributed random variables with common
unknown distribution function $F^0$. For the problem of density estimation, it is
assumed that $F^0$ is absolutely continuous with density $f^0$. The corresponding
hazard rate function is defined by $r^0 = f^0/(1 - F^0)$.
The observed right-censored data are denoted by the pairs $(X_i, \Delta_i)$, $i = 1, \ldots, n$, where
$$X_i = \min\{X_i^0, U_i\}, \qquad \Delta_i = \begin{cases} 1 & \text{if } X_i^0 \le U_i, \\ 0 & \text{if } X_i^0 > U_i. \end{cases}$$
Thus, it is known which observations are times of failure or death and which ones
are censored or loss times. The nature of the censoring mechanism depends on
the $U_i$'s:
(i) If $U_1, \ldots, U_n$ are fixed constants, the observations are time-truncated. If
all $U_i$'s are equal to the same constant, then the case of Type I censoring results.
(ii) If all $U_i = X_{(r)}^0$, the $r$th order statistic of $X_1^0, \ldots, X_n^0$, then the situation is
that of Type II censoring.
(iii) If $U_1, \ldots, U_n$ constitute a random sample from a distribution $H$ (which is
usually unknown) and are independent of $X_1^0, \ldots, X_n^0$, then $(X_i, \Delta_i)$,
$i = 1, 2, \ldots, n$, is called a randomly right-censored sample.
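The three censoring mechanisms can be made concrete with a short simulation. The following is a minimal sketch, not taken from the chapter: the function names are hypothetical, and the exponential censoring distribution in `random_censoring` is only an illustrative choice for H.

```python
import random

def censor(lifetimes, censor_times):
    """Pair each true lifetime x0 with its censoring time u.

    Returns (x, delta) with x = min(x0, u) and delta = 1 for an observed
    failure (x0 <= u), delta = 0 for a censored observation (x0 > u).
    """
    return [(min(x0, u), 1 if x0 <= u else 0)
            for x0, u in zip(lifetimes, censor_times)]

def type_one(lifetimes, c):
    """Type I censoring: every item shares the same fixed censoring time c."""
    return censor(lifetimes, [c] * len(lifetimes))

def type_two(lifetimes, r):
    """Type II censoring: observation stops at the r-th smallest lifetime."""
    cutoff = sorted(lifetimes)[r - 1]
    return censor(lifetimes, [cutoff] * len(lifetimes))

def random_censoring(lifetimes, censor_rate, seed=0):
    """Random censoring; exponential censoring times are an assumed H."""
    rng = random.Random(seed)
    return censor(lifetimes, [rng.expovariate(censor_rate) for _ in lifetimes])
```

Under Type II censoring with cutoff at the r-th order statistic, exactly r observations are uncensored when there are no ties.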
The random censorship model (iii) is attractive because of its mathematical
convenience. Many of the estimators discussed later are based on this model.
Assuming (iii), $\Delta_1, \ldots, \Delta_n$ are independent Bernoulli random variables and the
distribution function $F$ of each $X_i$, $i = 1, \ldots, n$, is given by
$1 - F = (1 - F^0)(1 - H)$. Under the Koziol and Green [24] model of random censorship,
which is the proportional hazards assumption of Cox [7], it is assumed that there
is a positive constant $\beta$ such that $1 - H = (1 - F^0)^{\beta}$. Then by a result of Chen,
Hollander and Langberg [6], the pairs $(X_i^0, U_i)$, $i = 1, \ldots, n$, follow the proportional hazards model if and only if $(X_1, \ldots, X_n)$ and $(\Delta_1, \ldots, \Delta_n)$ are independent. This Koziol-Green model of random censorship arises in several
situations (Efron [11], Csörgő and Horváth [8], Chen, Hollander and Langberg [6]). Note that $\beta$ is a censoring coefficient since $a = P(X_i^0 \le U_i) = (1 + \beta)^{-1}$,
which is the probability of an uncensored observation.
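The value of the uncensored probability under the Koziol-Green model follows in one line from the independence of lifetime and censoring time. The derivation below is a sketch added for the reader; it assumes that $F^0$ is continuous, so that the substitution $v = 1 - F^0(t)$ is valid.

```latex
\begin{align*}
a = P(X^0 \le U)
  &= \int_0^\infty [1 - H(t)]\, dF^0(t)
   = \int_0^\infty [1 - F^0(t)]^{\beta}\, dF^0(t) \\
  &= \int_0^1 v^{\beta}\, dv
   = \frac{1}{1 + \beta}.
\end{align*}
```

For instance, $\beta = 1$ (censoring and lifetime distributions identical) gives $a = 1/2$, and $\beta \to 0$ corresponds to no censoring.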
Based on the censored sample $(X_i, \Delta_i)$, $i = 1, \ldots, n$, a popular estimator of the
survival probability $S^0(t) = 1 - F^0(t)$ at $t \ge 0$ is the product-limit estimator, proposed by Kaplan and Meier [22] as the 'nonparametric maximum likelihood
estimator' of $S^0$. This estimator was shown to be 'self-consistent' by Efron [11].
Let $(Z_i, \Delta'_i)$, $i = 1, \ldots, n$, denote the ordered $X_i$'s along with their corresponding
$\Delta_i$'s. A value of the censored sample will be denoted by the corresponding lower
case letters $(x_i, \delta_i)$ or $(z_i, \delta'_i)$ for the unordered or ordered sample, respectively.
The product-limit estimator of $S^0$ is defined by [11]
$$\hat P_n(t) = \begin{cases} 1, & 0 \le t \le Z_1, \\ \prod_{i=1}^{k-1} \left( \dfrac{n-i}{n-i+1} \right)^{\Delta'_i}, & t \in (Z_{k-1}, Z_k],\ k = 2, \ldots, n, \\ 0, & t > Z_n. \end{cases}$$
Denote the product-limit estimator of $F^0(t)$ by $\hat F_n(t) = 1 - \hat P_n(t)$, and let $s_j$ denote
the jump of $\hat P_n$ (or $\hat F_n$) at $Z_j$, that is,
$$s_j = \begin{cases} 1 - \hat P_n(Z_2), & j = 1, \\ \hat P_n(Z_j) - \hat P_n(Z_{j+1}), & j = 2, \ldots, n-1, \\ \hat P_n(Z_n), & j = n. \end{cases} \qquad (2.1)$$
Note that $s_j = 0$ if and only if $\Delta'_j = 0$, $j < n$, that is, if $Z_j$ is a censored observation.
The product-limit estimator has played a central role in the analysis of censored
survival data (Miller [36]), and its properties have been studied extensively by
many authors, for example, Breslow and Crowley [4], Földes, Rejtő and
Winter [15], and Wellner [59]. Many of the nonparametric density estimators
from right-censored data are naturally based on the product-limit estimator,
beginning with the histogram-type and kernel-type estimators.
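Because most of the estimators that follow are built from the jumps of the product-limit estimator, it may help to see how those jumps are computed. The sketch below is an illustration rather than the chapter's own code: it assumes no tied observation times and follows the convention that all remaining mass is placed at the largest observation, whether or not it is censored. The name `product_limit_jumps` is hypothetical.

```python
def product_limit_jumps(sample):
    """Jumps of the product-limit estimator at the ordered observations.

    sample: list of (x_i, delta_i), delta_i = 1 for a failure, 0 if censored.
    Returns (z, s): the ordered times and the jump s_j at each z_j.
    Assumes no ties among the observation times.
    """
    ordered = sorted(sample)
    n = len(ordered)
    z = [t for t, _ in ordered]
    surv = 1.0          # value of the survival estimate at the current z_j
    s = []
    for j, (_, d) in enumerate(ordered, start=1):
        if j == n:
            s.append(surv)              # remaining mass goes to z_n
        else:
            nxt = surv * ((n - j) / (n - j + 1)) ** d
            s.append(surv - nxt)        # zero when z_j is censored
            surv = nxt
    return z, s
```

With no censoring every jump equals 1/n, which recovers the empirical distribution function.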
3. Histogram and kernel estimators
One of the simplest nonparametric estimators of the density function for
randomly right-censored samples is the histogram estimator. Although they are
simple to compute, histogram estimators are not smooth and are generally not
suited to sophisticated inference procedures.
Estimation of the density function and hazard rate of survival time based on
randomly right-censored data was apparently first studied by Gehan [19]. The life
table estimate of the survival function was used to estimate the density $f^0$ as
follows: The observations $(x_i, \delta_i)$, $i = 1, \ldots, n$, were grouped into $k$ fixed intervals
$[t_1, t_2), [t_2, t_3), \ldots, [t_k, \infty)$, with the finite widths denoted by $h_i = t_{i+1} - t_i$,
$i = 1, \ldots, k - 1$. Letting $n'_i$ denote the number of individuals alive at time $t_i$, $L_i$
be the number of individuals censored (lost or withdrawn from the study) in the
interval $[t_i, t_{i+1})$, and $d_i$ be the number of individuals dying or failing in the $i$th
interval (where time to death or failure is recorded from time of entry into the
study), define $\hat q_i = d_i/n_i$ and $\hat p_i = 1 - \hat q_i$, where $n_i = n'_i - \tfrac{1}{2} L_i$. Therefore, $\hat q_i$ is an
estimate of the probability of dying or failing in the $i$th interval, given exposure
to risk in the $i$th interval. Let $\hat\Pi_i = \hat p_{i-1} \hat\Pi_{i-1}$, where $\hat\Pi_1 \equiv 1$. Gehan's estimate of
$f^0$ at the midpoint $t_{mi}$ of the $i$th interval is then
$$\hat f(t_{mi}) = \frac{\hat\Pi_i - \hat\Pi_{i+1}}{h_i} = \frac{\hat\Pi_i \hat q_i}{h_i}, \qquad i = 1, \ldots, k - 1.$$
An expression for estimating the large sample approximation to the variance of
$\hat f(t_{mi})$ was also given in [19].
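The life-table density estimate is easy to compute once the counts per interval are tabulated. The sketch below assumes the standard actuarial adjustment (effective number at risk equal to the number alive at the start of the interval minus half the number censored in it) and estimates the density at each interval midpoint as the estimated survival to the start of the interval times the conditional failure probability, divided by the interval width. The function name and argument layout are hypothetical.

```python
def life_table_density(alive_at_start, censored, deaths, widths):
    """Life-table density estimates at the midpoints of finite intervals.

    alive_at_start[i]: number alive at the start of interval i
    censored[i]:       number censored within interval i
    deaths[i]:         number failing within interval i
    widths[i]:         width of interval i (finite intervals only)
    """
    surv = 1.0                       # estimated survival to start of interval
    estimates = []
    for n_prime, lost, d, h in zip(alive_at_start, censored, deaths, widths):
        n_eff = n_prime - 0.5 * lost     # actuarial exposure adjustment
        q = d / n_eff                    # conditional failure probability
        estimates.append(surv * q / h)
        surv *= 1.0 - q
    return estimates
```

For example, with 10 individuals at the start, 2 censored and 2 failing in a first interval of width 1, the effective exposure is 9 and the first density estimate is 2/9.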
Using the product-limit estimator $\hat F_n$ of $F^0$, Földes, Rejtő, and Winter [16]
defined a histogram estimator of $f^0$ on a specified interval $[0, T]$, $T > 0$. For
integer $n > 0$, let $0 = t_0^{(n)} < t_1^{(n)} < \cdots < t_{v_n}^{(n)} = T$ be a partition of $[0, T]$ into $v_n$
subintervals $I_i^{(n)}$, where
$$I_i^{(n)} = \begin{cases} [t_{i-1}^{(n)}, t_i^{(n)}), & 1 \le i < v_n, \\ [t_{v_n - 1}^{(n)}, T], & i = v_n. \end{cases}$$
Then their histogram estimator is
$$\hat f_n(x) = \frac{\hat F_n(t_i^{(n)}) - \hat F_n(t_{i-1}^{(n)})}{t_i^{(n)} - t_{i-1}^{(n)}}, \qquad x \in I_i^{(n)}. \qquad (3.1)$$
If $x \notin [0, T]$, $\hat f_n(x)$ is either undefined or defined arbitrarily. Notice that if none
of the observations are censored, $\hat F_n$ reduces to the empirical distribution function,
and (3.1) becomes the usual histogram estimator with respect to the given partition. The strong uniform consistency of $\hat f_n$ on $[0, T]$ was proven by Földes,
Rejtő, and Winter [16] under some conditions on the partition, provided that $f^0$
was continuous on $[0, T]$ and $H(T-) < 1$, where $H(T-)$ denotes the limit from
the left of $H$ at $T$. This last condition is common in obtaining consistency
properties under random right-censorship and ensures that uncensored observations can be obtained from the entire interval of interest.
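A histogram estimator of this kind only needs the product-limit distribution estimate evaluated at the partition points. The following sketch is illustrative: it assumes no ties, uses a left-continuous product-limit estimate that puts all remaining mass at the largest observation, and takes the bin height to be the increment of that estimate over the bin divided by the bin width. The names `pl_cdf` and `pl_histogram` are hypothetical.

```python
def pl_cdf(sample):
    """Return t -> product-limit estimate of the lifetime distribution function.

    sample: list of (x_i, delta_i). Left-continuous version; the estimate
    equals 1 for t beyond the largest observation. Assumes no ties.
    """
    ordered = sorted(sample)
    n = len(ordered)

    def cdf(t):
        surv = 1.0
        for i, (z, d) in enumerate(ordered, start=1):
            if z >= t:
                break                    # only observations below t contribute
            if i == n:
                return 1.0               # t lies beyond the largest observation
            surv *= ((n - i) / (n - i + 1)) ** d
        return 1.0 - surv

    return cdf

def pl_histogram(sample, edges):
    """Bin heights: increment of the estimate over each bin / bin width."""
    cdf = pl_cdf(sample)
    return [(cdf(b) - cdf(a)) / (b - a) for a, b in zip(edges, edges[1:])]
```

With an uncensored sample the heights coincide with the ordinary relative-frequency histogram, as the text notes.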
Burke and Horváth [5] defined general density estimators which included
histogram-type and kernel-type estimators with appropriate choices of the defining
functions. They also obtained asymptotic distribution results for these estimators.
In fact, their results were obtained for the more general situation of the k independent competing risks model. When k = 2, this reduces to the random right-censorship model.
The histogram estimator can be obtained as a special case of the kernel density
estimators. The kernel-type estimators have been perhaps the most popular estimators in practice due to their relative computational simplicity, smoothness, and
other properties. Kernel-type estimators from randomly right-censored data have
been studied only since around 1978, beginning with the work of Blum and
Susarla [3]. The investigation of kernel estimators for right-censored samples has
been attempted along the same lines as for the complete sample case. However,
due to mathematical difficulties introduced by the censoring, some of the
analogous theory to the complete sample case has not yet been obtained.
Blum and Susarla [3] generalized the complete sample results of Rosenblatt [45] concerning maximum deviation of density estimates by the kernel
method. To define the Blum-Susarla density estimator, let $\{h_n\}$ be a positive
sequence, called the bandwidth sequence, such that $\lim_{n \to \infty} h_n = 0$, and let
$N^+(x)$ denote the number of observed $X_i$'s that are greater than $x$. Define
$$H_n^*(x) = \prod_{j=1}^{n} \left\{ \frac{1 + N^+(X_j)}{2 + N^+(X_j)} \right\}^{[\Delta_j = 0,\ X_j \le x]},$$
where $[A]$ denotes the indicator function of the event $A$. By a modification of the
product-limit estimator, it can be shown that $H_n^*$ is a good estimate of
$H^* = 1 - H$. For a kernel function $K$ satisfying certain conditions, the
Blum-Susarla density estimator is given by
$$f_n^*(x) = [n h_n H_n^*(x)]^{-1} \sum_{j=1}^{n} K\left( \frac{x - X_j}{h_n} \right) [\Delta_j = 1]. \qquad (3.2)$$
For example, $K$ can be a bounded density function with support in the interval
$[-A, A]$ for some $A > 0$ and absolutely continuous on $[-A, A]$ with derivative
$K'$ which is square integrable on $[-A, A]$. By following standard arguments,
$$(f^0 H^*)_n(x) \equiv (n h_n)^{-1} \sum_{j=1}^{n} K((x - X_j)/h_n) [\Delta_j = 1]$$
and $H_n^*(x)$ can be shown to be good estimators of $f^0(x) H^*(x)$ and $H^*(x)$, respectively. This motivates the use of (3.2) as an estimator of $f^0(x)$. Blum and Susarla
also obtain limit theorems for the maximum over a finite interval of a normalized
deviation of the density estimator (3.2). These results are useful for goodness-of-fit
tests and tests of hypotheses about the unknown lifetime density $f^0$.
It was conjectured by Blum and Susarla [3] that the kernel-type estimator
$$h_n^{-1} \int_{-\infty}^{\infty} K((x - t)/h_n)\, dF_n^*(t)$$
behaved in the same way as $f_n^*$, where $F_n^*$ was an estimator of $F^0$ such as the
product-limit estimator. In fact, Földes, Rejtő, and Winter [16] proved uniform
almost sure convergence of $\hat f_n$ to $f^0$ when $F_n^*$ was taken to be $\hat F_n$. Specifically, one of their results was that
$\sup_{a < x < b} |\hat f_n(x) - f^0(x)| \to 0$ almost
surely as $n \to \infty$ provided $f^0$ was bounded and had a bounded derivative on
$(a, b)$, $-\infty \le a < b \le \infty$, $K$ was right-continuous and of bounded variation,
$h_n (n/\log n)^{1/8} \to \infty$, and $H(T_{F^0}) < 1$, where $T_{F^0} = \sup\{x: F^0(x) < 1\}$. Again, the
last condition ensured that observed lifetimes in the entire support of $F^0$ would
be available. It should be noted that if no censoring is present, then
$$\hat f_n(x) = h_n^{-1} \int_{-\infty}^{\infty} K((x - t)/h_n)\, d\hat F_n(t) \qquad (3.3)$$
reduces to the Parzen [43] estimator.
McNichols and Padgett [32] wrote (3.3) in the form
$$\hat f_n(x) = h_n^{-1} \sum_{j=1}^{n} s_j K((x - Z_j)/h_n), \qquad (3.4)$$
where $s_j$ is given by (2.1). They considered the mean, variance, and mean squared
error of (3.4) under the Koziol-Green model of random censorship described in
Section 2. This model allowed the expected value of $\hat f_n(x)$ to be evaluated by
using the independence of $(X_1, \ldots, X_n)$ and $(\Delta_1, \ldots, \Delta_n)$. In particular, if $K$ is a
Borel function such that $\sup_t |K(t)| < \infty$, $\int_{-\infty}^{\infty} |K(t)|\, dt < \infty$, $\lim_{|t| \to \infty} |tK(t)| = 0$,
and $\int_{-\infty}^{\infty} K(t)\, dt = 1$, then
$$E[\hat f_n(x)] = a h_n^{-1} \int_0^{\infty} g_n(t) f(t) K((x - t)/h_n)\, dt + (1 - a) p_n(a) h_n^{-1} E[K((x - Z_n)/h_n)], \qquad (3.5)$$
where $a = (1 + \beta)^{-1}$, $b = 1 - a$,
$$p_n(a) = \prod_{i=1}^{n-1} \frac{n - i + b}{n - i + 1}, \qquad g_n(t) = \sum_{j=0}^{n-1} \binom{n - 1 + b}{j} [1 - F(t)]^{n-1-j} [F(t)]^{j},$$
$$\binom{n + b}{k} = \frac{(n + b)(n + b - 1) \cdots (n + b - k + 1)}{k!},$$
$F = 1 - (1 - H)(1 - F^0)$, and $f$ is a density for $F$. Furthermore, it was shown
that if $h_n \to 0$, then $\lim_{n \to \infty} E[\hat f_n(x)] = f^0(x)$. Thus, under the
Koziol-Green model, $\hat f_n(x)$ is asymptotically unbiased for $f^0(x)$, similar to the
complete sample case (the conditions on $K$ and $h_n$ are those imposed by
Parzen [43]). Second moment convergence was also obtained under the conditions that $n h_n \to \infty$ and $b = P(\text{a censored observation}) < 1$ in addition to the
conditions required for asymptotic unbiasedness above [32].
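A kernel estimator that weights each ordered observation by the corresponding product-limit jump can be sketched in a few lines. This is an illustration under stated assumptions, not the authors' implementation: no tied observations, all remaining mass placed at the largest observation, and a Gaussian kernel chosen purely for convenience. The names `pl_jumps` and `kernel_density` are hypothetical.

```python
import math

def pl_jumps(sample):
    """Ordered times z and product-limit jumps s (no ties assumed)."""
    ordered = sorted(sample)
    n = len(ordered)
    surv, s = 1.0, []
    for j, (_, d) in enumerate(ordered, start=1):
        nxt = 0.0 if j == n else surv * ((n - j) / (n - j + 1)) ** d
        s.append(surv - nxt)
        surv = nxt
    return [t for t, _ in ordered], s

def gaussian(u):
    """Standard normal density, used here as the kernel K."""
    return math.exp(-0.5 * u * u) / math.sqrt(2.0 * math.pi)

def kernel_density(sample, h, kernel=gaussian):
    """Return x -> (1/h) * sum_j s_j * K((x - z_j) / h)."""
    z, s = pl_jumps(sample)
    return lambda x: sum(sj * kernel((x - zj) / h) for zj, sj in zip(z, s)) / h
```

Because the jumps sum to one and the kernel integrates to one, the estimate is itself a density; censored observations receive zero weight but shift mass to later failure times.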
For the kernel estimator (3.4), it is desirable to allow the data to play a role
in how much smoothing is done. Since, for a fixed $n$, $h_n$ is the 'smoothing
constant', it would be reasonable to allow $h_n$ to be a function of the right-censored
sample. McNichols and Padgett [35] consider this type of modification, which
extends the work of Wagner [54] to censored data. This modified kernel estimator is
$$\hat f_n(x) = \Gamma_n^{-1} \sum_{j=1}^{n} s_j K((x - Z_j)/\Gamma_n),$$
where $\Gamma_n = \Gamma_n(X_1, \ldots, X_n)$ is some function of the censored data. For this
estimator it was shown that if $H(T_{F^0}) < 1$, $K$ has bounded variation,
$\lim_{|x| \to \infty} xK(x) = 0$, $\Gamma_n \to 0$ in probability (almost surely), and
$n^{1/2} (\log\log n)^{-1/2} \Gamma_n \to \infty$ in probability (almost surely), then $\hat f_n(x) \to f^0(x)$ in probability (almost
surely) at each $x$ for which $f^0$ is continuous. One choice of $\Gamma_n$ satisfying the above
conditions is as follows: If $\gamma_n = [n^{\alpha}]$, $\tfrac{1}{2} < \alpha < 1$, where $[\cdot]$ denotes the greatest
integer function, let $D_{jn}$ be the distance from $Z_j$ to its $\gamma_n$-nearest neighbor among
$Z_1, \ldots, Z_{j-1}, Z_{j+1}, \ldots, Z_n$, $1 \le j \le n$, and select $\Gamma_n$ to be $D_{jn}$ with probability $s_j$.
The practical choice of the bandwidth $h_n$ for a given censored sample is a
problem which must be addressed in order to calculate the kernel estimator. For
complete samples, several 'data-based' procedures for selecting a 'good' value of
$h_n$ for a given set of data have been proposed (see Scott and Factor [46], for
example). Among these procedures when samples are right-censored, the maximum likelihood approach seems to be feasible. This will be discussed further in
Section 6.
With the exception of the expressions for the mean, $E[\hat f_n(x)]$, in (3.5) and
for $E[\hat f_n^2(x)]$ under the Koziol-Green model [32], very little has been done
concerning the small-sample properties of $\hat f_n$ or any of the other kernel-type
density estimators in the censored data case. Padgett and McNichols [40] have
performed Monte Carlo simulations for several parametric families of lifetime
distributions, uniform and exponential censoring distributions, several kernel
functions, and several bandwidths to determine the small-sample behavior of
$\hat f_n$ with respect to bias and mean squared error.
For estimating the hazard rate function $r^0$ from randomly right-censored data,
Földes, Rejtő, and Winter [16] considered estimators of the form
$$\hat r_n(x) = \frac{\hat f_n(x)}{1 - \hat F_n(x) + 1/n},$$
where $\hat f_n$ denoted either their histogram estimator (3.1) or their kernel-type
estimator (3.3). The $1/n$ in the denominator simply prevents dividing by zero.
Strong consistency results for $\hat r_n$ similar to those for (3.1) and (3.3) were proven.
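A ratio-type hazard estimator of this kind divides a density estimate by an estimate of the survival function, with 1/n added to the denominator as a guard against division by zero. The sketch below is illustrative only: it assumes no ties, a Gaussian kernel, a left-continuous product-limit distribution estimate, and all remaining mass placed at the largest observation. The function names are hypothetical.

```python
import math

def pl_jumps(sample):
    """Ordered times z and product-limit jumps s (no ties assumed)."""
    ordered = sorted(sample)
    n = len(ordered)
    surv, s = 1.0, []
    for j, (_, d) in enumerate(ordered, start=1):
        nxt = 0.0 if j == n else surv * ((n - j) / (n - j + 1)) ** d
        s.append(surv - nxt)
        surv = nxt
    return [t for t, _ in ordered], s

def gaussian(u):
    return math.exp(-0.5 * u * u) / math.sqrt(2.0 * math.pi)

def ratio_hazard(sample, h):
    """Return x -> f_hat(x) / (1 - F_hat(x) + 1/n)."""
    z, s = pl_jumps(sample)
    n = len(z)

    def hazard(x):
        dens = sum(sj * gaussian((x - zj) / h) for zj, sj in zip(z, s)) / h
        cdf = sum(sj for zj, sj in zip(z, s) if zj < x)   # left-continuous
        return dens / (1.0 - cdf + 1.0 / n)

    return hazard
```

Beyond the largest observation the survival estimate is zero, so only the 1/n term keeps the ratio finite there.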
McNichols and Padgett [34] considered the kernel-type estimator of $r^0$ given by
$$\hat r_n(x) = h_n^{-1} \int K((x - t)/h_n) [1 - \hat F_n(t)]^{-1}\, d\hat F_n(t),$$
for $x$ such that $F(x) < 1$,
under the Koziol-Green model of random censorship. Expressions for $E[\hat r_n(x)]$
and $\mathrm{var}[\hat r_n(x)]$ were obtained, and it was shown that $\hat r_n(x)$ was asymptotically
unbiased, and converged in mean square and in probability to $r^0(x)$, extending
Watson and Leadbetter's [55, 56] results.
Tanner and Wong [50] also studied a kernel-type estimator of $r^0$ based on the
ordered censored sample $(Z_i, \Delta'_i)$, $i = 1, \ldots, n$, given by
$$\hat r(x) = \sum_{j=1}^{n} (n - j + 1)^{-1} \Delta'_j K_h(x - Z_j),$$
for $x$ such that $F(x) < 1$,
where $K$ was a symmetric integrable kernel with $K_h(y) = h^{-1} K(y/h)$. They derived
expressions for $E[\hat r(x)]$ and $\mathrm{var}[\hat r(x)]$ and proved under the conditions on $K$
stated by Watson and Leadbetter [55, 56] that $\hat r(x)$ was asymptotically unbiased
if $h_n \to 0$ and $n h_n \to \infty$. The conditions assumed here were essentially the same as
those required by McNichols and Padgett [34], except for the proportional
hazards (Koziol-Green) model assumption which gave somewhat different
expressions for the mean and variance. The asymptotic variance was also
obtained, and Hajek's projection method was used to establish asymptotic normality under conditions on $K$, $F^0$, $H$, and $h_n$. Tanner and Wong [51] studied a
class of estimators of the same general form as $\hat r(x)$ with $K_h$ replaced by $K_{\theta}$, where
$\theta$ was a positive-valued 'smoothing vector' chosen to maximize a likelihood
function. Hence, for this estimator the smoothing parameters were chosen based
on the observed data.
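A kernel hazard estimator of the kind studied by Tanner and Wong smooths the increments of the cumulative hazard: each uncensored ordered observation contributes one over the number still at risk. The sketch below assumes the form sum over j of delta_j / (n - j + 1) times K_h(x - z_j), with the scaled kernel taken as K_h(y) = K(y/h)/h; both the scaling and the function names are assumptions made for illustration, and ties are not handled.

```python
def uniform_kernel(u):
    """Uniform density on [-1, 1], a symmetric integrable kernel."""
    return 0.5 if abs(u) <= 1.0 else 0.0

def kernel_hazard(sample, h, kernel=uniform_kernel):
    """Return x -> sum_j delta_j / (n - j + 1) * K((x - z_j) / h) / h.

    sample: list of (x_i, delta_i); j indexes the ordered observations.
    """
    ordered = sorted(sample)
    n = len(ordered)

    def hazard(x):
        return sum(d / (n - j + 1) * kernel((x - z) / h) / h
                   for j, (z, d) in enumerate(ordered, start=1))

    return hazard
```

Censored observations contribute nothing directly, but they still count in the risk set through the factor n - j + 1 of later failures.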
Tanner [49] considered a modified kernel-type estimator of $r^0$ in the form
$$\tilde r_n(x) = (2R_k)^{-1} \sum_{i=1}^{n} \frac{\Delta'_i}{n - i + 1} K((x - Z_i)/2R_k),$$
where $R_k$ was the distance from $x$ to the $k$th nearest of the uncensored observations among $X_1, \ldots, X_n$. This estimator allowed the data to play a role in determining the degree of smoothing that would occur in the estimate. Assuming that
$S^0$ and $f^0$ were continuous in a neighborhood about $x$, $k = [n^{\alpha}]$, $\tfrac{1}{2} < \alpha < 1$, where
$[\cdot]$ was the greatest integer function, that $K$ had bounded variation and compact
support on the interval $[-1, 1]$, and that $r^0$ was continuous at $x$, it was shown
that $\tilde r_n(x)$ was strongly consistent.
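The nearest-neighbor idea replaces a fixed bandwidth by one that adapts to how densely the uncensored observations lie around the point of estimation. The following sketch assumes the estimator has the form (2 R_k)^(-1) times the sum over i of delta_i / (n - i + 1) times K((x - z_i) / (2 R_k)), where R_k is the distance from x to its k-th nearest uncensored observation; the function name and the uniform kernel are illustrative choices, and ties are not handled.

```python
def uniform_kernel(u):
    """Uniform density on [-1, 1] (bounded variation, compact support)."""
    return 0.5 if abs(u) <= 1.0 else 0.0

def knn_hazard(sample, k, x, kernel=uniform_kernel):
    """Nearest-neighbor kernel hazard estimate at the point x.

    sample: list of (x_i, delta_i); k: which nearest uncensored neighbor
    sets the bandwidth 2 * R_k.
    """
    ordered = sorted(sample)
    n = len(ordered)
    dists = sorted(abs(x - z) for z, d in ordered if d == 1)
    if k < 1 or k > len(dists):
        raise ValueError("k must be between 1 and the number of failures")
    width = 2.0 * dists[k - 1]
    if width == 0.0:
        raise ValueError("zero bandwidth: x coincides with its k-th neighbor")
    total = sum(d / (n - i + 1) * kernel((x - z) / width)
                for i, (z, d) in enumerate(ordered, start=1))
    return total / width
```

In sparse regions R_k is large and the estimate is smoothed heavily; where failures are dense the window narrows automatically.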
Blum and Susarla [3] considered the estimator (in the notation of
Equation (3.2))
$$r_n^*(x) = \frac{(f^0 H^*)_n(x)}{S_n^*(x)}, \qquad x \ge 0,$$
where $S_n^*(x) = (\text{number of } Z_j\text{'s} > x)/n$. This estimator was also of the kernel type,
and limiting results similar to those stated for the density estimator (3.2) were
obtained for $r_n^*$.
Ramlau-Hansen [44] used martingale techniques to treat the general multiplicative intensity model. His results are very general and include the kernel estimators
of hazard rate functions of Földes, Rejtő and Winter [16] and Yandell [61]. The
martingale techniques yielded local asymptotic properties of many of the hazard
rate estimators in a simpler manner than classical procedures.
Finally, in a recent paper Liu and Van Ryzin [26] obtained a histogram
estimator of the hazard rate function from randomly right-censored data based on
spacings in the order statistics. They showed the estimator to be uniformly
consistent in a bounded interval and asymptotically normal under suitable conditions. An efficiency comparison of their estimator with the kernel estimator of
hazard rate was also given. Also, Liu and Van Ryzin [27] gave the large sample
theory for the normalized maximal deviation of a hazard rate estimator under
random censoring which was based on a histogram estimate of the subsurvival
density of the uncensored observations.
4. Likelihood methods
One approach to estimating a density function nonparametrically is that of
maximum likelihood. Nonparametric maximum likelihood estimates of a probability density function do not exist in general. That is, the likelihood function for
a complete sample is unbounded over the class of all possible densities. However,
by suitably restricting the class of densities, a nonparametric maximum likelihood
estimator (MLE) may be found within the restricted class. For complete samples,
the maximum likelihood estimator of a density g was given by Barlow,
Bartholomew, Bremner and Brunk [1] if g was assumed to be either decreasing
(nonincreasing) or unimodal with known mode. Wegman [57, 58] assumed unimodality with unknown mode and found the MLE of the density and studied its
properties for complete samples.
McNichols and Padgett [33] studied maximum likelihood estimation of decreasing or unimodal densities based on arbitrarily right-censored data. The censoring
variables U1. . . . . Un could be either constants or continuous random variables.
They first assumed that $f^0$ was decreasing (nonincreasing) on $[0, \infty)$ and let $F_D$
be the set of distributions with decreasing left-continuous densities on $[0, \infty)$. For
the ordered censored observations $(z_i, \delta'_i)$, $i = 1, \ldots, n$, the likelihood function
was written as
$$L(f^0) = \prod_{i=1}^{n} [f^0(z_i)]^{\delta'_i} [S^0(z_i)]^{1 - \delta'_i},$$
where $S^0 = 1 - F^0$. It was shown that a maximum likelihood estimator of $f^0$
must be a step function.
The estimator was found by maximizing the likelihood function $L(f^0)$ over $F_D$
subject to the decreasing density constraint. Equivalently, the constrained optimization problem to be solved was
maximize $L(y_1, \ldots, y_n)$
subject to (i) $y_1 \ge y_2 \ge \cdots \ge y_n \ge 0$,
(ii) $\sum_{j=1}^{n} y_j (z_j - z_{j-1}) \le 1$,
where $z_0 = 0$ and $y_j$ is the value of the step function on $(z_{j-1}, z_j]$. This function to be maximized was shown to be concave and the
problem was shown to have a unique solution, say $y_1^*, \ldots, y_n^*$. Then any density
of the form
$$f^*(x) = y_j^*, \qquad z_{j-1} < x \le z_j, \quad j = 1, \ldots, n + 1,$$
was a maximum likelihood estimator of $f^0$, where $y_{n+1}^*$, some value less than or
equal to $y_n^*$, and $z_{n+1}$ ($> z_n$) were chosen so that
$$1 - \sum_{j=1}^{n} y_j^* (z_j - z_{j-1}) = y_{n+1}^* (z_{n+1} - z_n).$$
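To see what is being maximized, the censored-data likelihood can be evaluated for any candidate nonincreasing step density. The sketch below assumes the density takes the value y_j on the interval from z_(j-1) to z_j (with z_0 = 0), so that the survival function at z_i is one minus the accumulated mass up to z_i; it works on the log scale, which is a choice made here for numerical convenience. It only evaluates a candidate and does not perform the constrained maximization; the function name is hypothetical.

```python
import math

def step_log_likelihood(z, delta, y):
    """Log-likelihood of a nonincreasing step density for censored data.

    z:     ordered observation times z_1 < ... < z_n (z_0 = 0 is implied)
    delta: 1 if z_i is a failure, 0 if it is censored
    y:     candidate heights, y[j] being the density on (z_{j-1}, z_j]
    """
    if not (len(z) == len(delta) == len(y)):
        raise ValueError("z, delta and y must have the same length")
    if min(y) < 0 or any(a < b for a, b in zip(y, y[1:])):
        raise ValueError("heights must be nonnegative and nonincreasing")
    mass, prev, loglik = 0.0, 0.0, 0.0
    for zi, di, yi in zip(z, delta, y):
        mass += yi * (zi - prev)         # probability assigned up to z_i
        prev = zi
        if mass > 1.0 + 1e-12:
            raise ValueError("total mass exceeds one")
        term = yi if di == 1 else 1.0 - mass
        if term <= 0.0:
            return -math.inf
        loglik += math.log(term)
    return loglik
```

Comparing two feasible candidates with this function shows the trade-off: raising the height at a failure time increases its contribution but uses up mass that censored observations need in their survival terms.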
Similarly, $f^0$ was estimated by maximum likelihood assuming that $f^0$ was
increasing (nondecreasing) on $[0, M]$, $M > 0$ known. Then, if $M$ denoted the
known mode of the unknown unimodal density, the two maximum likelihood
estimators on $[0, M]$ and on $(M, \infty)$ found as above could be combined to
estimate the unimodal density. If $f^0$ was assumed to be unimodal with unknown
mode $M$, then McNichols and Padgett [33] applied the above procedure for
known mode, assuming $z_{j-1} < M < z_j$ for each $j = 1, \ldots, n$, obtaining $n$ solutions
for $f^0$. These $n$ solutions gave $n$ corresponding values of the likelihood function.
The maximum likelihood estimator of $f^0$ was then taken to be the solution with
the largest of the $n$ likelihood values, analogous to Wegman's [57, 58] procedure
for complete samples.
Another approach to the problem of nonparametric maximum likelihood estimation of a density from complete samples was proposed by Good and
Gaskins [20]. This method allowed any smooth integrable function on the interval
of interest (a, b) (which may be finite or infinite) as a possible estimator, but
added a 'penalty function' to the likelihood. The penalty function penalized a
density for its lack of smoothness, so that a very 'rough' density would have a
smaller likelihood than a 'smooth' density, and hence, would not be admissible.
De Montricher, Tapia, and Thompson [9] proved the existence and uniqueness
of the maximum penalized likelihood estimator (MPLE) for complete samples.
Lubecke and Padgett [30] assumed that the sample was arbitrarily right-censored,
$(X_i, \Delta_i)$, $i = 1, \ldots, n$, and showed the existence and uniqueness of a solution to
the problem:
maximize $L(g)$
subject to $g(t) \ge 0$ for all $t \in \Omega$,
$\int_{\Omega} g(t)\, dt = 1$,
and $g \in H(\Omega)$, (4.1)
where
$$L(g) = \prod_{i=1}^{n} [g(x_i)]^{\delta_i} [1 - G(x_i)]^{1 - \delta_i} \exp[-\Phi(g)],$$
$\Omega$ is a finite or infinite interval, $H(\Omega)$ is a manifold, and $G$ is the distribution
function for density $g$. In particular, letting $u = g^{1/2}$ and using Good and
Gaskins' [20] first penalty function, the problem (4.1) becomes:
maximize L(u) = f i [u(x;)] ~' I 1f i=1
x, u2(t) dt ]1/20
where $x_i > 0$, $i = 1, \ldots, n$, $\int_0^{\infty} u^2(t)\, dt = 1$, and $u(t) \ge 0$, $t > 0$.
Let $x_{-i} = -x_i$ and $\delta_{-i} = \delta_i$, $i = 1, \ldots, n$, and define $\tilde u(x) = u(|x|)$ for $x \in R \setminus \{0\}$
and $\tilde u(0) = \lim_{x \to 0+} u(x)$. Then define the following problem:
maximize L(fi)= f i
[~(x,.)] ~i 2 -
× exp [ - 2 a f _ ~ (~'(t))2 dtl ,
where $\int_{-\infty}^{\infty} \tilde u^2(t)\, dt = 2$, $\tilde u \in H^1(-\infty, \infty)$, and
$H^1(-\infty, \infty)$ is the Sobolev space of real-valued functions such that the function
and its first derivative are square integrable.
If $\tilde u^*$ solves (4.3), then it can be shown that $u^*(t) = \tilde u^*(t)$, $t \ge 0$, and $u^*(t) = 0$,
$t < 0$, solves (4.2). Lubecke and Padgett [30] showed that a solution to (4.3) was
a function $\tilde u^*$ which solves the linear integral equation
~ ( t ) = C(t; x, ~, )~) + (8~2)-'/2
f/E ,1,,
~ {z~-(x~.) I( . . . . . ](Izl)
× sinh [(2/2~) 1/2 (t - z)]fi~(z) dz,
where the forcing function is of the form
- bi(2~)~)-1/2 [exp(-(2/2e) 1/z It
C(t; x, ~, 2) - 1 { i~_
+ exp(-(A/2~)
It + x;I)]
c;(1 - hi) [exp(- (2/2o~)1/2t) + exp((2/2~)l/2t)]~,
I,I = 1
for a $\lambda > 0$. The integral equation (4.4) can be transformed to a second-order
differential equation whose solution $\tilde u^*$ can be numerically obtained. Then $(\tilde u^*)^2$
is the MPLE of the density $f^0$ based on the first penalty function of Good and Gaskins [20].
The nonparametric maximum likelihood estimation of the hazard rate function
r ° based on the arbitrarily right-censored sample (X;, A;), i = 1, 2, . . . , n, was
considered by Padgett and Wei [41] in the class of increasing failure rate (IFR)
distributions. The techniques of order restricted inference were used to obtain the
estimator following an argument similar to that of Marshall and Proschan [31] for
the complete sample case. A closed form solution to the likelihood function of r °
subject to the IFR condition was found to be a nondecreasing step function.
Small sample properties of their estimator were indicated by a Monte Carlo study.
Mykytyn and Santner [37] considered the same problem of maximum likelihood
estimation of r ° under arbitrary right censorship assuming either IFR, decreasing
failure rate (DFR), or U-shaped failure rate. Their estimator was essentially
equivalent to Padgett and Wei's estimator and was shown to be consistent by
using a total time on test transform. This estimator was maximum likelihood in
the Kiefer-Wolfowitz sense.
Friedman [17] also considered maximum likelihood estimation from survival
data. Let $n$ survival times be observed over a time period divided into $I(n)$
intervals and assume that the hazard rate function of the time to failure of
individual $j$, $r_j(t)$, is constant and equal to $r_{ij} > 0$ on the $i$th interval. The maximum
likelihood estimate $\hat{\lambda}$ of the vector $\lambda = \{\log r_{ij} : j = 1, \ldots, n;\ i = 1, \ldots, I(n)\}$
gave a simultaneous estimate of the hazard rate function. Friedman gave conditions for the existence of $\hat{\lambda}$ and studied the asymptotic properties of linear functionals of $\hat{\lambda}$ in the general case when the true hazard rate
is not a step function. This piecewise smooth estimate of the hazard rate can be
regarded as giving piecewise smooth density estimates.
5. Some other methods
Nonparametric density estimators based on Fourier series representations have
been proposed for censored data. Kimura [23] considered the problem of estimating density functions and cumulatives by using estimated Fourier series. A
method for generating a useful class of orthonormal families was first developed
for the complete sample case and the results were then generalized to the case
of censored data. Variance expressions for the quantity $\int_{-\infty}^{\infty} \varphi(x)\,d\hat{F}_n(x)$ were
obtained, where $\varphi$ was chosen so that the variance existed and $\hat{F}_n$ was the
product-limit estimator. Finally, Monte Carlo simulation was used to test the
methods developed.
Tarter [53] obtained a new maximum likelihood estimator of the survival
function $S^0$ by using Fourier series estimators of the probability densities of the
uncensored observations and censored observations separately. That is, the
density estimates were $\hat{f}$ and $\tilde{f}$, obtained from the $n_1$ observed uncensored
$X_i$'s and the $n_2$ observed censored $X_i$'s, respectively, where $n_1 + n_2 = n$. It was
shown that as $n \to \infty$ the new likelihood estimator approached the product-limit
estimator from above. It should be noted that the series-type density estimators
$\hat{f}$ and $\tilde{f}$ used here were obtained by the usual complete-sample formulas.
The final series-type estimator to be mentioned here is the general estimator of
the density in the $k$ competing risks model of Burke and Horváth [5]. It could
be considered as a Fourier-type estimator by appropriate choices of the form of
the defining functions.
Another method that has been used for estimating hazard rate and density
functions is that of Bayesian nonparametric estimation. Since the work of
Ferguson [12, 13], many authors have been concerned with the Bayesian nonparametric estimation of a distribution function or related functions with respect
to the Dirichlet process or other random probability measures as prior distributions. For censored data Susarla and Van Ryzin [47, 48] considered the estimation of the survival function with respect to Dirichlet process priors, while
Ferguson and Phadia [14] used neutral to the right processes as prior distributions.
Padgett and Wei [42] obtained Bayesian nonparametric estimators of the
survival function, density function, and hazard rate function of the lifetime distribution using pure jump processes as prior distributions on the hazard rate
function, assuming an increasing hazard rate. Both complete and right-censored
samples were considered. The pure jump process prior was appealing because it
had an intuitive physical interpretation as shocks occurring randomly in time that
caused the hazard rate to increase a constant small amount at each shock, which
also closely approximated the (random) increasing failure rate by a (random) step
function.
Dykstra and Laud [10] also considered a prior distribution on the hazard rate
function in order to produce smooth nonparametric Bayes estimators. Their prior
was an extended gamma process and the posterior distribution was found for
right-censored data. The Bayes estimators of the survival and hazard rate
functions with respect to a squared error loss were obtained in terms of a
one-dimensional integral.
Lo [28, 29] estimated densities and hazard rates, as well as other general rate
functions, from a Bayesian nonparametric approach by constructing a prior
random density as a convolution of a kernel function with the Dirichlet random
probability. His estimator of the density with respect to squared error loss was
essentially a mixture of an initial or prior guess at the density and a sample
probability density function. His technique can be used for complete or censored
data.
6. Numerical examples of some kernel density estimators
Of the many types of nonparametric density estimators available, probably the
most often used in practice are the kernel-type estimators. They are relatively
simple to calculate and can produce smooth, pleasing results. In this section
numerical examples will be given for the kernel estimator (3.4) and the modified
estimator (3.6) with the nearest neighbor-type procedure for selecting Fn.
One problem in using kernel density estimators is that of how to choose the
'best' value of the bandwidth $h_n$ to use with a given set of data. This question has
been addressed in the complete sample case by several authors (see Scott and
Factor [46], for example), and 'data-based' choices of $h_n$ have been proposed
using maximum likelihood, mean squared error, or other criteria. For the estimator
(3.4) no expressions for the mean squared error for finite sample sizes exist at
present, except for those very complicated ones given by McNichols and
Padgett [32] under the Koziol-Green model. Hence, selection of $h_n$ to minimize
mean squared error does not seem to be feasible. However, Monte Carlo simulation results of Padgett and McNichols [40] indicate that at each $x$ there is a value
of $h_n$ which minimized the estimated mean squared error of $f_n(x)$ in (3.4).
Similar results were also obtained in [40] for the Blum-Susarla estimator $f^*(x)$
defined by (3.2). These simulation results indicated a range of values of $h_n$ which
gave small estimated mean squared errors of $f_n(x)$ and $f^*(x)$ at fixed $x$. The
maximum likelihood criterion for selecting $h_n$ for a given censored sample is
feasible for $f_n$ but does not seem to be tractable, even using numerical
methods, for $f^*$ due to the complications introduced by the term $H^*(x)$ in the
likelihood expression. The maximum likelihood approach will be used in the
following example for $f_n$.
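Following a similar approach to expressions (2.8) and (2.9) of Scott and
Factor [46], consider choosing $h_n$ to be a value of $h \ge 0$ which maximizes the
likelihood
$$L(h) = \prod_{j=1}^{n} [f_n(z_j)]^{\delta_j} \left[\int_{z_j}^{\infty} f_n(u)\,du\right]^{1-\delta_j}. \tag{6.1}$$
Obviously, by definition of $f_n$, the maximum of (6.1) is $+\infty$ at $h = 0$.
Hence, the following modified likelihood criterion is considered:
$$\text{maximize } L_1(h) = \prod_{k=1}^{n} [f_{nk}(z_k)]^{\delta_k} \left[\int_{z_k}^{\infty} f_{nk}(u)\,du\right]^{1-\delta_k}. \tag{6.2}$$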
For the standard normal kernel $K(u) = (2\pi)^{-1/2}\exp(-u^2/2)$, the logarithm of
(6.2) becomes
b~)log h
62 log
#(2r0- z/2 exp ( - (zk - zj)2/2h 2)
where $\Phi$ denotes the standard normal distribution function. An approximate
(local) maximum of (6.3) with respect to $h$ can be easily found by numerical
methods for a given set of censored observations, and this estimated $h$, denoted
by $\hat{h}_n$, can be used in (3.4) to calculate $f_n(x)$.

Table 1
Failure times (in millions of operations) of switches

Fig. 1. Density estimates for switch data. [Curves shown: $h = 0.18$; $\alpha = 0.60$; $\alpha = 0.75$.]
For this example of the density estimation procedure given by (6.3) and (3.4),
the life test data for $n = 40$ mechanical switches reported by Nair [38] are used.
Two failure modes, A and B, were recorded and Nair estimated the survival
function of mode A, assuming the random right-censorship model. Table 1 shows
the 40 observations with corresponding $\delta_i$ values, where $\delta_i = 1$ indicates failure
mode A and $\delta_i = 0$ denotes a censored value (or failure mode B). Using these data,
the function $\log L_1(h)$ had a maximum in the interval [0, 1] at $\hat{h}_{40} \approx 0.18$.
Hence, $f_{40}$ was computed from (3.4) with bandwidth 0.18. This estimate is
shown in Figure 1. This maximum likelihood approach to selecting $h_n$ does not
produce the smoothest estimate, but is one criterion that can be used.
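The fixed-bandwidth calculation can be sketched in a few lines. This is only an illustration under stated assumptions: estimator (3.4) is not restated in this section, so the code assumes the usual form for right-censored data, $f_n(x) = h^{-1} \sum_j s_j K((x - z_j)/h)$, where $s_j$ is the jump of the product-limit estimator at $z_j$ and $K$ is the standard normal kernel. The function names and the four-point data set are hypothetical (they are not Nair's switch data), and tied observations are assumed absent.

```python
import math

def km_jumps(z, delta):
    """Sort the observations and return (times, jumps), where jumps[j] is the
    drop of the product-limit (Kaplan-Meier) survival estimate at times[j].
    Censored points (delta = 0) get a zero jump. Assumes no ties."""
    pairs = sorted(zip(z, delta))
    n = len(pairs)
    surv, jumps = 1.0, []
    for i, (_, d) in enumerate(pairs):
        at_risk = n - i                      # items still under observation
        new_surv = surv * (1.0 - 1.0 / at_risk) if d == 1 else surv
        jumps.append(surv - new_surv)
        surv = new_surv
    return [t for t, _ in pairs], jumps

def kernel_density(x, times, jumps, h):
    """Assumed form of (3.4): product-limit jumps smoothed by a normal kernel."""
    total = 0.0
    for t, s in zip(times, jumps):
        u = (x - t) / h
        total += s * math.exp(-0.5 * u * u) / math.sqrt(2.0 * math.pi)
    return total / h

# Hypothetical data: the observation at 2.0 is censored.
times, jumps = km_jumps([3.0, 1.0, 4.0, 2.0], [1, 1, 1, 0])
density_at_3 = kernel_density(3.0, times, jumps, h=0.5)
```

Because the largest observation in this toy sample is uncensored, the jumps sum to one and the estimate integrates to one; when the largest observation is censored the total mass is less than one.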
Shown also in Figure 1 are the modified kernel estimates calculated from (3.6)
with the '7,-nearest neighbor' calculation of F, for the smoothing parameter values
$\alpha = 0.60$ and 0.75. The estimate was also calculated for $\alpha = 0.55$, but was very
close to the fixed bandwidth estimate $f_{40}$ with $h = 0.18$ and, hence, is not
shown. The modified estimator (3.6) with $\alpha = 0.75$ is pleasingly smooth, but with
the small sample and only 17 uncensored observations, the value of $\alpha = 0.60$
might be a compromise between the very smooth ($\alpha = 0.75$) and somewhat rough
($\alpha = 0.55$) estimates.
References
[1] Barlow, R. E., Bartholomew, D. J., Bremner, J. M., and Brunk, H. D. (1972). Statistical Inference
Under Order Restrictions. Wiley, New York.
[2] Bean, S. J. and Tsokos, C. P. (1980). Developments in nonparametric density estimation. Intern.
Statist. Rev. 48, 215-235.
[3] Blum, J. R. and Susarla, V. (1980). Maximal deviation theory of density and failure rate
function estimates based on censored data. In: P. R. Krishnaiah, ed., Multivariate Analysis V.
North-Holland, Amsterdam, New York, 213-222.
[4] Breslow, N. and Crowley, J. (1974). A large sample study of the life table and product limit
estimates under random censorship. Ann. Statist. 2, 437-453.
[5] Burke, M. and Horváth, L. (1982). Density and failure rate estimation in a competing risks
model. Preprint, Dept. of Math. and Statist., University of Calgary, Canada.
[6] Chen, Y. Y., Hollander, M. and Langberg, N. A. (1982). Small sample results for the
Kaplan-Meier estimator. J. Amer. Statist. Assoc. 77, 141-144.
[7] Cox, D. R. (1972). Regression models and life-tables. J. Roy. Statist. Soc. Ser. B 34, 187-220.
[8] Csörgő, S. and Horváth, L. (1981). On the Koziol-Green model for random censorship. Biometrika
68, 391-401.
[9] De Montricher, G. F., Tapia, R. A. and Thompson, J. R.(1975). Nonparametric maximum
likelihood estimation of probability densities by penalty function methods. Ann. Statist. 3,
[10] Dykstra, R. L. and Laud, P. (1981). A Bayesian nonparametric approach to reliability. Ann.
Statist. 9, 356-367.
[11] Efron, B. (1967). The two sample problem with censored data. In: Proc. Fifth Berkeley Symp.
Math. Statist. Prob. Vol. 4, 831-853.
[12] Ferguson, T. S. (1973). A Bayesian analysis of some nonparametric problems. Ann. Statist. 1,
[13] Ferguson, T. S. (1974). Prior distributions on spaces of probability measures. Ann. Statist. 2,
[14] Ferguson, T. S. and Phadia, E. G. (1979). Bayesian nonparametric estimation based on
censored data. Ann. Statist. 7, 163-186.
[15] Földes, A., Rejtő, L. and Winter, B. B. (1980). Strong consistency properties of nonparametric
estimators for randomly censored data, I: The product-limit estimator. Periodica Mathematica
Hungarica 11, 233-250.
[16] Földes, A., Rejtő, L. and Winter, B. B. (1981). Strong consistency properties of nonparametric
estimators for randomly censored data, Part II: Estimation of density and failure rate. Periodica
Mathematica Hungarica 12, 15-29.
[17] Friedman, M. (1982). Piecewise exponential models for survival data with covariates. Ann.
Statist. 10, 101-113.
[18] Fryer, M. J. (1977). A review of some non-parametric methods of density estimation. J. Inst.
Math. Appl. 20, 335-354.
[19] Gehan, E. (1969). Estimating survival functions from the life table. J. Chron. Dis. 21, 629-644.
[20] Good, I. J. and Gaskins, R. A. (1971). Nonparametric roughness penalties for probability
densities. Biometrika 58, 255-277.
[21] Kalbfleisch, J. D. and Prentice, R. L. (1980). The Statistical Analysis of Failure Time Data. Wiley,
New York.
[22] Kaplan, E. L. and Meier, P. (1958). Nonparametric estimation from incomplete observations.
J. Amer. Statist. Assoc. 53, 457-481.
[23] Kimura, D. K. (1972). Fourier Series Methods for Censored Data, PhD. Dissertation, University of Washington.
[24] Koziol, J. A. and Green, S. B. (1976). A Cramér-von Mises statistic for randomly censored
data. Biometrika 63, 465-473.
[25] Lagakos, S. W. (1979). General right censoring and its impact on the analysis of survival data.
Biometrics 35, 139-156.
[26] Liu, R. Y. C. and Van Ryzin, J. (1984). A histogram estimator of the hazard rate with censored
data. Ann. Statistics .
[27] Liu, R. Y. C. and Van Ryzin, J. (1984). The asymptotic distribution of the normalized maximal
deviation of a hazard rate estimator under random censoring. Colloquia Mathematica Societatis
Janos Bolyai, Debrecen (Hungary).
[28] Lo, A. Y. (1978). On a class of Bayesian nonparametric estimates. I: Density estimates. Dept.
of Math. and Statist. Tech. Rep., University of Pittsburgh.
[29] Lo, A. Y. (1978). Bayesian nonparametric method for rate function. Dept. of Math. and Statist.
Tech. Rep., University of Pittsburgh.
[30] Lubecke, A. M. and Padgett, W. J. (1985). Nonparametric maximum penalized likelihood
estimation of a density from arbitrarily right-censored observations. Comm. Statist.-Theory
Meth. .
[31] Marshall, A. W. and Proschan, F. (1965). Maximum likelihood estimation for distributions with
monotone failure rate. Ann. Math. Statist. 36, 69-77.
[32] McNichols, D. T. and Padgett, W. J. (1981). Kernel density estimation under random censorship. Statistics Tech. Rep. No. 74, University of South Carolina.
[33] McNichols, D. T. and Padgett, W. J. (1982). Maximum likelihood estimation of unimodal and
decreasing densities on arbitrarily right-censored data. Comm. Statist.-Theory Meth. 11,
[34] McNichols, D. T. and Padgett, W. J. (1983). Hazard rate estimation under the Koziol-Green
model of random censorship. Statistics Tech. Rep. No. 79, University of South Carolina.
[35] McNichols, D. T. and Padgett, W. J. (1984). A modified kernel density estimator for randomly
right-censored data. South African Statist. J. 18, 13-27.
[36] Miller, R. G. (1981). Survival Analysis. Wiley, New York.
[37] Mykytyn, S. and Santner, T. A. (1981). Maximum likelihood estimation of the survival function
based on censored data under hazard rate assumptions. Comm. Statist.-Theory Meth. A 10,
[38] Nair, V. N. (1984). Confidence bands for survival functions with censored data: A comparative
study. Technometrics 26, 265-275.
[39] Padgett, W. J. and McNichols, D. T. (1984). Nonparametric density estimation from censored
data. Comm. Statist.-Theory Meth. 13, 1581-1611.
[40] Padgett, W. J. and McNichols, D. T. (1984). Small sample properties of kernel density
estimators from right-censored data. Statistics Tech. Rep. No. 102, University of South
[41] Padgett, W. J. and Wei, L. J. (1980). Maximum likelihood estimation of a distribution function
with increasing failure rate based on censored observations. Biometrika 67, 470-474.
[42] Padgett, W. J. and Wei, L. J. (1981). A Bayesian nonparametric estimator of survival probability
assuming increasing failure rate. Comm. Statist.-Theory Meth. A 10, 49-63.
[43] Parzen, E. (1962). On estimation of a probability density function and mode. Ann. Math. Statist.
33, 1065-1076.
[44] Ramlau-Hansen, H. (1983). Smoothing counting process intensities by means of kernel functions.
Ann. Statist. 11, 453-466.
[45] Rosenblatt, M. (1976). On the maximal deviation of k-dimensional density estimates. Ann.
Probab. 4, 1009-1015.
[46] Scott, D. W. and Factor, L. E. (1981). Monte Carlo study of three data-based nonparametric
probability density estimators. J. Amer. Statist. Assoc. 76, 9-15.
[47] Susarla, V. and Van Ryzin, J. (1976). Nonparametric Bayesian estimation of survival curves
from incomplete observations. J. Amer. Statist. Assoc. 71, 897-902.
[48] Susarla, V. and Van Ryzin, J. (1978). Large sample theory for a Bayesian nonparametric
survival curve estimator based on censored samples. Ann. Statist. 6, 755-768.
[49] Tanner, M. A. (1983). A note on the variable kernel estimator of the hazard function from
randomly censored data. Ann. Statist. 11, 994-998.
[50] Tanner, M. A. and Wong, W. H. (1983). The estimation of the hazard function from randomly
censored data by the kernel method. Ann. Statist. 11, 989-993.
[51] Tanner, M. A. and Wong, W. H. (1983). Data-based nonparametric estimation of the hazard
function with applications to model diagnostics and exploratory analysis. J. Amer. Statist. Assoc.
[52] Tapia, R. A. and Thompson, J. R. (1978). Nonparametric Probability Density Estimation. The
Johns Hopkins Univ. Press, Baltimore, MD.
[53] Tarter, M. E. (1979). Trigonometric maximum likelihood estimation and application to the
analysis of incomplete survival information. J. Amer. Statist. Assoc. 74, 132-139.
[54] Wagner, T. (1975). Nonparametric estimates of probability densities. IEEE Trans. Inform.
Theory 21, 438-440.
[55] Watson, G. S. and Leadbetter, M. R. (1964). Hazard Analysis I. Biometrika 51, 175-184.
[56] Watson, G. S. and Leadbetter, M. R. (1964). Hazard analysis II. Sankhyā Ser. A 26, 110-116.
[57] Wegman, E. J. (1970). Maximum likelihood estimation of a unimodal density function. Ann.
Math. Statist. 41, 457-471.
[58] Wegman, E. J. (1970). Maximum likelihood estimation of a unimodal density, II. Ann. Math.
Statist. 41, 2160-2174.
[59] Wellner, J. (1982). Asymptotic optimality of the product limit estimator. Ann. Statist. 10,
[60] Wertz, W. and Schneider, B. (1979). Statistical density estimation: A bibliography. Internat.
Statist. Rev. 47, 155-175.
[61] Yandell, B. S. (1982). Nonparametric inference for rates and densities with censored serial data.
Biostatistics Program Tech. Rep., University of California, Berkeley.
P. R. Krishnaiah and C. R. Rao, eds., Handbook of Statistics, Vol. 7
© Elsevier Science Publishers B.V. (1988) 333-351
17
Multivariate Process Control
Frank B. Alt and Nancy D. Smith
There are many situations in which it is necessary to simultaneously monitor
two or more correlated quality characteristics. Such problems are referred to as
multivariate quality control problems.
To illustrate the need for a multivariate approach, consider a manufacturing
plant where the product is plastic film. The usefulness of the film depends on its
transparency ($X_1$) and its tear resistance ($X_2$). It is assumed that these two quality
characteristics are jointly distributed as a bivariate normal. The standard values
are: $\mu_{01} = 90$, $\mu_{02} = 30$, $\sigma_{01} = 9$ and $\sigma_{02} = 3$. Furthermore, it has been determined
that there is a negative correlation of $\rho_0 = -0.3$ between these two characteristics.
These values can be displayed in a (2 × 1) vector of means, denoted by $\boldsymbol{\mu}_0$, and
a (2 × 2) covariance matrix, denoted by $\Sigma_0$:
$$\boldsymbol{\mu}_0 = \begin{pmatrix} \mu_{01} \\ \mu_{02} \end{pmatrix} = \begin{pmatrix} 90 \\ 30 \end{pmatrix}; \qquad \Sigma_0 = \begin{pmatrix} \sigma_{01}^2 & \rho_0\sigma_{01}\sigma_{02} \\ \rho_0\sigma_{01}\sigma_{02} & \sigma_{02}^2 \end{pmatrix} = \begin{pmatrix} 81 & -8.1 \\ -8.1 & 9 \end{pmatrix}.$$
A sample of, say, ten items is drawn from the process at regular intervals and
measurements are obtained on both variables.
For the time being, attention will be focused on monitoring the process means.
One approach would be to ignore the correlation between the characteristics and
monitor each process mean separately. For each sample of size ten, an estimate
of $\mu_{01}$, denoted by $\bar{x}_1$, is obtained and plotted against time on an $\bar{X}$-chart with
the following control limits:
$$\mathrm{UCL}_1 = \mu_{01} + 3(\sigma_{01}/\sqrt{n}) = 98.54, \qquad \mathrm{CL}_1 = \mu_{01} = 90, \qquad \mathrm{LCL}_1 = \mu_{01} - 3(\sigma_{01}/\sqrt{n}) = 81.46. \tag{1}$$
Since 3-sigma limits were used in determining (1), the type I error for this chart
equals 0.0027. Another $\bar{X}$-chart would be set up to monitor the process mean of
the tear resistance variable. The limits are:
$$\mathrm{UCL}_2 = 32.85, \qquad \mathrm{CL}_2 = 30, \qquad \mathrm{LCL}_2 = 27.15. \tag{2}$$
If both sample means plot within their respective control limits, the process is
deemed to be in control. The use of separate $\bar{X}$-charts is equivalent to plotting
$(\bar{x}_1, \bar{x}_2)$ on a single chart formed by superimposing one $\bar{X}$-chart over the other,
as shown in Figure 1. If the pair of sample means plots within the rectangular
control region, the process is considered to be in control.
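The 3-sigma limits in equations (1) and (2) and the rectangular in-control check can be reproduced with a short sketch. The function names and the pair of sample means passed to the check are hypothetical; the standard values are those of the plastic film example.

```python
import math

def xbar_limits(mu0, sigma0, n, k=3.0):
    """Return (LCL, CL, UCL) for an X-bar chart with k-sigma limits."""
    half_width = k * sigma0 / math.sqrt(n)
    return mu0 - half_width, mu0, mu0 + half_width

def in_rectangle(xbars, limits):
    """True if every sample mean lies within its own chart's limits."""
    return all(lcl <= x <= ucl for x, (lcl, _, ucl) in zip(xbars, limits))

# Standard values from the plastic film example, samples of size ten.
limits = [xbar_limits(90.0, 9.0, 10), xbar_limits(30.0, 3.0, 10)]
inside = in_rectangle((95.0, 31.0), limits)  # hypothetical pair of sample means
```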
Fig. 1. The elliptical and rectangular control regions.
The use of separate control charts or the equivalent rectangular region can be
very misleading. It will be shortly demonstrated that the true control region is
elliptical in nature, and the process is judged out of control only if the pair of
means $(\bar{x}_1, \bar{x}_2)$ plots outside this elliptical region. However, if the rectangular
region is used, it may be erroneously concluded that both process means are in
control (Region A), that one is out of control and the other is in (Region B), or that both
are out of control (Region C). The degree of correlation between the two variables
affects the size of these regions and their respective errors. Furthermore, the
probability that both sample means will plot within the elliptical region when the
process is in control is exactly $1 - \alpha$, whereas with the rectangular region, this
probability is at least $1 - \alpha$.
Although the use of separate charts to individually monitor each process mean
suffers from the weakness of ignoring the correlation between the variables, these
$\bar{X}$-charts can sometimes assist in determining which process mean is out of
control. When $\bar{X}$-charts are used in this supplemental fashion, it is recommended
that the type I error rate of each one be set equal to $\alpha/p$, where $p$ is the number
of variables and $\alpha$ is the overall type I error. When $p = 2$ and $\alpha = 0.0054$, the
type I error of each chart would be set at 0.0027, which implies 3-sigma limits as
used in equations (1) and (2).
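A quick numerical check of this allocation is sketched below: splitting the overall type I error equally over the charts and converting the per-chart rate to a two-sided normal multiplier. The function name is hypothetical; `statistics.NormalDist` is part of the Python standard library.

```python
from statistics import NormalDist

def bonferroni_multiplier(alpha, p):
    """Split an overall type I error alpha equally over p charts and return
    (per-chart error, two-sided standard normal multiplier for each chart)."""
    per_chart = alpha / p
    z = NormalDist().inv_cdf(1.0 - per_chart / 2.0)
    return per_chart, z

per_chart, z = bonferroni_multiplier(0.0054, 2)  # z is very close to 3
```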
In the sequel, control charts will be presented for both Phase I and Phase II
with the presentation for Phase II being first. In both cases, the charts are
referred to as multivariate Shewhart charts.
Phase II control charts
In some instances, estimates of $\boldsymbol{\mu}_0$ and $\Sigma_0$ may be derived from such a large
amount of past data that these values may be treated as parameters and not their
corresponding estimates. Duncan [10] states that the values for the parameters
could also have been selected by management to attain certain objectives. These
are referred to as standard or target values. Phase II comprises both scenarios.
Control charts for the mean
When there is only one quality characteristic, which is normally distributed with
mean $\mu_0$ and standard deviation $\sigma_0$, the probability is $(1 - \alpha)$ that a sample mean
will fall between $\mu_0 \pm z_{\alpha/2}(\sigma_0/\sqrt{n})$, where $z_{\alpha/2}$ is the standard normal percentile
such that $P(Z > z_{\alpha/2}) = \alpha/2$. This is the basis for the control charts presented in
equations (1) and (2). It is customary to use 3.0 for $z_{\alpha/2}$, in which case $\alpha = 0.0027$.
Therefore, if an $\bar{x}$ falls outside the control limits, there is very strong evidence that
assignable causes of variation are present.
Suppose random samples of a given size are taken from a process at regular
intervals and an $\bar{X}$-chart is used to determine whether or not the process mean
is at the standard value $\mu_0$. This is equivalent to repeated significance tests of the
form $H_0: \mu = \mu_0$ vs. $H_1: \mu \ne \mu_0$. Furthermore, instead of using an $\bar{X}$-chart with
upper and lower control limits, one could use a control chart with only an upper
control limit on which values of $[\sqrt{n}(\bar{x} - \mu_0)/\sigma_0]^2$ are plotted. In this case,
$\mathrm{UCL} = \chi^2_{1,\alpha}$, where $\chi^2_{1,\alpha}$ denotes the $\chi^2$-percentile. Note that $\chi^2_{1,0.0027} = 9.0$.
Admittedly, the simplicity of construction of the $\chi^2$-chart is offset somewhat by
the fact that runs above and below the mean will be harder to detect since they
are intermingled. However, the hypothesis testing viewpoint and $\chi^2$-chart concept
provide the foundation for extending the univariate to the multivariate case.
The univariate hypothesis on the mean is rejected if
$$n(\bar{x} - \mu_0)(\sigma_0^2)^{-1}(\bar{x} - \mu_0) > \chi^2_{1,\alpha}.$$
A natural generalization is to reject $H_0: \boldsymbol{\mu} = \boldsymbol{\mu}_0$ vs. $H_1: \boldsymbol{\mu} \ne \boldsymbol{\mu}_0$ if
$$\chi_0^2 = n(\bar{\mathbf{x}} - \boldsymbol{\mu}_0)'\,\Sigma_0^{-1}(\bar{\mathbf{x}} - \boldsymbol{\mu}_0) > \chi^2_{p,\alpha},$$
where $\bar{\mathbf{x}}$ denotes the ($p$ × 1) vector of sample means and $\Sigma_0^{-1}$ is the inverse of
the ($p$ × $p$) variance-covariance matrix. For the case of two quality characteristics,
$$\chi_0^2 = n(1 - \rho_0^2)^{-1}\left[(\bar{x}_1 - \mu_{01})^2\sigma_{01}^{-2} + (\bar{x}_2 - \mu_{02})^2\sigma_{02}^{-2} - 2\rho_0\sigma_{01}^{-1}\sigma_{02}^{-1}(\bar{x}_1 - \mu_{01})(\bar{x}_2 - \mu_{02})\right],$$
which is the equation of an ellipse centered at $(\mu_{01}, \mu_{02})$. Thus, for two quality
characteristics, a control region could be constructed which is the interior and
boundary of such an ellipse. If a particular vector of sample means plots outside
the region, the process is said to be out of control and visual inspection may
reveal which characteristic is responsible for this condition. Refer to Figure 1.
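The bivariate statistic can be evaluated directly from its closed form, as in the sketch below. The function name and the pair of sample means are hypothetical. The upper control limit uses the fact that a chi-square variable with two degrees of freedom has upper tail probability $\exp(-x/2)$, so $\chi^2_{2,\alpha} = -2\ln\alpha$; this shortcut holds only for $p = 2$.

```python
import math

def chi2_stat_bivariate(xbar, mu0, sd0, rho0, n):
    """Closed-form chi-square statistic for two quality characteristics."""
    z1 = (xbar[0] - mu0[0]) / sd0[0]
    z2 = (xbar[1] - mu0[1]) / sd0[1]
    return n / (1.0 - rho0 ** 2) * (z1 * z1 + z2 * z2 - 2.0 * rho0 * z1 * z2)

# Standard values from the plastic film example.
mu0, sd0, rho0, n = (90.0, 30.0), (9.0, 3.0), -0.3, 10

# Upper alpha point of chi-square with 2 degrees of freedom (p = 2 only).
ucl = -2.0 * math.log(0.0054)

stat = chi2_stat_bivariate((95.0, 31.0), mu0, sd0, rho0, n)  # hypothetical means
out_of_control = stat > ucl
```

For these hypothetical sample means the statistic is about 5.83, well inside the elliptical region bounded by the upper control limit of about 10.44.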
When there are two or more quality characteristics, the vector of process means
could be monitored by using a control chart with $\mathrm{UCL} = \chi^2_{p,\alpha}$. If $\chi_0^2 > \mathrm{UCL}$, the
process is deemed out of control and an assignable cause would be sought. It may
be possible to determine this by the supplemental use of individual $\bar{X}$-charts where
the type I error of each chart is set equal to $\alpha/p$. By Bonferroni's inequality, the
probability that each of the sample means plots within its respective control limits
when each process mean is at the standard value is at least $1 - \alpha$. Refer to
Alt [2, 3].
The $\chi^2$-chart has associated with it an operating characteristic (OC) curve or,
equivalently, a power curve. The power shows the probability of detecting a shift
in the process mean on the first sample taken after the occurrence of the shift.
Let $\pi(\lambda)$ denote the power of the chart. Then
$$\pi(\lambda) = P(\chi'^2_{p,\lambda} > \chi^2_{p,\alpha}),$$
where $\chi'^2_{p,\lambda}$ denotes the noncentral chi-square random variable with $p$ degrees of
freedom and noncentrality parameter $\lambda = n(\boldsymbol{\mu} - \boldsymbol{\mu}_0)'\,\Sigma_0^{-1}(\boldsymbol{\mu} - \boldsymbol{\mu}_0)$. For $p = 2$,
$$\lambda = n(1 - \rho_0^2)^{-1}\left[(\mu_1 - \mu_{01})^2\sigma_{01}^{-2} + (\mu_2 - \mu_{02})^2\sigma_{02}^{-2} - 2\rho_0\sigma_{01}^{-1}\sigma_{02}^{-1}(\mu_1 - \mu_{01})(\mu_2 - \mu_{02})\right].$$
The power is strictly increasing in $\lambda$ for fixed significance level $\alpha$ and fixed sample
size $n$. Wiener [25] presents tables of the noncentrality parameter $\lambda$ for significance levels ($\alpha$) of 0.100, 0.050, 0.025, 0.010, 0.005 and 0.001, degrees of freedom
equal to 1(1)30(2)50(5)100, and for power values ($\pi$) of 0.10(0.02)0.70(0.01)0.99.
For example, suppose $\alpha = 0.005$, $n = 10$, $\rho_0 = -0.4$, and it is important to detect
a shift of magnitude 0.5 standard deviations in the mean of each variable. Then
$\pi(\lambda) = 0.42$. If $\rho_0 = -0.2$, the power decreases to 0.28. When there are two
positively correlated characteristics and one of the process standard deviations
($\sigma_1$) can be adjusted, Alt et al. [6] found that the power is not a monotonically
decreasing function of $\sigma_1$ as it is in the univariate case.
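The two power values in this example can be checked without tables. The sketch below is restricted to $p = 2$ and is not Wiener's tabulation method: it writes the noncentral chi-square tail as a Poisson mixture of central chi-square tails, which have a closed form when the degrees of freedom are even. The function names are hypothetical, and shifts are expressed in units of the process standard deviations.

```python
import math

def chi2_sf_even(x, df):
    """P(chi-square with even df > x) = exp(-x/2) * sum_{k<df/2} (x/2)^k / k!."""
    term, total = 1.0, 0.0
    for k in range(df // 2):
        total += term
        term *= (x / 2.0) / (k + 1)
    return math.exp(-x / 2.0) * total

def noncentrality_p2(n, rho0, shift1, shift2):
    """Noncentrality for p = 2; shifts are in units of sigma_01 and sigma_02."""
    return n / (1.0 - rho0 ** 2) * (
        shift1 ** 2 + shift2 ** 2 - 2.0 * rho0 * shift1 * shift2)

def power_p2(lam, alpha, n_terms=200):
    """Power of the chi-square chart for p = 2 via the Poisson mixture form."""
    ucl = -2.0 * math.log(alpha)       # upper alpha point for 2 d.f.
    weight = math.exp(-lam / 2.0)      # Poisson(lam / 2) probability of j = 0
    power = 0.0
    for j in range(n_terms):
        power += weight * chi2_sf_even(ucl, 2 + 2 * j)
        weight *= (lam / 2.0) / (j + 1)
    return power

power_a = power_p2(noncentrality_p2(10, -0.4, 0.5, 0.5), 0.005)  # about 0.42
power_b = power_p2(noncentrality_p2(10, -0.2, 0.5, 0.5), 0.005)  # about 0.28
```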
A fundamental assumption in the development of the $\chi^2$-chart is that the
underlying distribution of the quality characteristics is multivariate normal. In the
univariate case, the effect of nonnormality on the control limits of $\bar{X}$-charts was
studied by Schilling and Nelson [24].
By minimizing the average run length of an out-of-control process for a large
fixed value of the average run length of an in-control process, Alt and Deutsch
determined the sample size ($n$) and control limit ($\chi^2_{2,\alpha}$) when there are
two quality characteristics. They found that (i) for a relatively large positive correlation, a larger sample size is needed to detect large positive shifts in the means
than small positive shifts, and (ii) a larger sample size is needed to detect shifts
for $\rho > 0$ than when $\rho < 0$. Montgomery and Klatt [21, 22] present a cost model
for a multivariate quality control procedure to determine the optimal sample size,
sampling frequency, and control chart constant.

Table 1
Control chart data for the mean: standards given

Sample no.    $\bar{x}_1$    $\bar{x}_2$    $\chi_0^2$
12 a          ...            ...            ...
13 a          ...            ...            ...
14 a          ...            ...            ...
15 a          98.75 c        ...            11.16 e
16 a          99.87 c        ...            14.09 e
17 a          101.00 c       ...            17.36 e
18 a          102.12 c       ...            20.97 e
19 a          103.25 c       ...            24.93 e
20 b          ...            ...            ...
21 b          ...            ...            ...
22 b          ...            ...            13.07 e
23 b          98.75 c        ...            18.25 e
24 b          99.87 c        ...            24.32 e
25 b          101.00 c       ...            31.29 e
26 b          102.12 c       32.93 d        39.14 e
27 b          103.25 c       33.31 d        47.90 e

Control limits
UCL_1 = 98.54    CL_1 = 90.00    LCL_1 = 81.45
UCL_2 = 32.85    CL_2 = 30.00    LCL_2 = 27.15
UCL = $\chi^2_{2,0.0054}$ = 10.44

a For these samples, $\mu_{01}$ was increased in increments of $0.125\sigma_{01}$ from 91.1250 to 99.0000.
b For these samples, $\mu_{01}$ was increased in increments of $0.125\sigma_{01}$ and $\mu_{02}$ was increased in increments of $0.125\sigma_{02}$ from 30.3750 to 33.0000.
c These values of $\bar{x}_1$ plot outside the control limits stated in equation (1).
d These values of $\bar{x}_2$ plot outside the control limits stated in equation (2).
e For these samples, $\chi_0^2 >$ UCL = 10.44.
Although Hotelling [15, 16] proposed the use of the $\chi^2$ random variable in a
control chart setting for the testing of bombsights, he did not actually use $\chi^2$-control charts since the variance-covariance matrix ($\Sigma_0$) was unknown. His papers
are primarily devoted to the case for $\Sigma_0$ unknown.
To illustrate the use of the $\chi^2$-control chart, consider the data listed in Table 1
for the plastic film extruding plant described in the Introduction. The sample size
is ten. To assess the impact of changes in either one or both process means, note
that $\mu_{01}$ was increased by increments of $0.125\sigma_{01}$ for data sets 12 through 19 while
$\mu_{01}$ and $\mu_{02}$ were each increased by increments of $0.125\sigma_{0i}$, $i = 1, 2$, for data sets
20 through 27. Since type I error was set equal to 0.0054, $\mathrm{UCL} = \chi^2_{2,0.0054} = 10.44$.
The $\chi^2$-control chart is illustrated in Figure 2. When only $\mu_{01}$ was
changed (sample numbers 12 to 19), the value of the test statistic ($\chi_0^2$) exceeded
the UCL as soon as $\mu_{01}$ was increased by at least 0.5 standard deviations (sample
numbers 15 to 19). Furthermore, when $\mu_{01}$ and $\mu_{02}$ were simultaneously altered
(sample numbers 20 to 27), $\chi_0^2 >$ UCL as soon as each process mean had been
increased by 0.375 standard deviations (sample numbers 22 to 27). The control
limits for the individual control charts were presented in equations (1) and (2).
For sample numbers 12 to 19, the $\bar{X}$-chart for transparency ($X_1$) performed as well
as the $\chi^2$-chart. This result is not surprising since the process mean for this
variable alone increased. However, when both process means were increased
(sample numbers 20 to 27), the individual charts did not perform as well as the
$\chi^2$-chart. Specifically, the $\bar{X}$-chart for transparency did not detect an increase until
$\mu_{01}$ had increased by at least 0.5 standard deviations and the $\bar{X}$-chart for tear
resistance did not plot out-of-control until $\mu_{02}$ had increased by at least 0.875
standard deviations.

Fig. 2. [$\chi^2$-control chart; horizontal axis: Sample No.]
Control charts for process dispersion (Phase II)
In the univariate case, even if the process mean is at the standard value but
the process standard deviation has increased, the end result is a greater fraction
of nonconforming product. This is illustrated in Montgomery [20]. Thus, it is
important to monitor both the mean and the variability of a process. Methods for
tracking process dispersion are presented in this section. The case of one quality
characteristic is reviewed first.
To determine whether the process variance is at the standard value (ao2), several
different control charts can be used. All of the control charts assume that a
random sample of size n is available and that the characteristic is normally
For small sample sizes ($n \le 10$), the range chart is the one most frequently used
to monitor process dispersion. It can be shown that $E(R) = \sigma_0 d_2$ and
$\mathrm{Var}(R) = d_3^2\sigma_0^2$. Since most of the distribution of R is contained in the interval
$E(R) \pm 3[\mathrm{Var}(R)]^{1/2}$, the control limits for the R-chart are as follows:
$$\mathrm{UCL} = \sigma_0(d_2 + 3d_3) = D_2\sigma_0,$$
$$\mathrm{CL} = \sigma_0 d_2, \qquad (8)$$
$$\mathrm{LCL} = \sigma_0(d_2 - 3d_3) = D_1\sigma_0.$$
Values of $d_2$, $d_3$, $D_1$, and $D_2$ are presented in Table M of Duncan [10] for n = 2
to n = 25. Duncan [10] also gives details for constructing a percentage point chart
based on the distribution of $W = R/\sigma_0$.
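The R-chart limits above can be checked numerically. The following is a minimal sketch, not the authors' procedure: the function name is hypothetical, the constants $d_2 = 3.078$ and $d_3 = 0.797$ are the commonly tabulated values for n = 10, and $\sigma_0 = 9$ is an assumption chosen because it appears consistent with the transparency limits reported in Table 2.

```python
def r_chart_limits(sigma0, d2, d3):
    """Return (LCL, CL, UCL) for an R-chart when the standard deviation is known."""
    center = sigma0 * d2                 # CL = sigma0 * d2
    half_width = 3.0 * sigma0 * d3       # three standard deviations of R
    lcl = max(center - half_width, 0.0)  # a range cannot be negative
    return lcl, center, center + half_width


# Assumed inputs: tabulated constants for n = 10 and sigma0 = 9 (illustrative).
lcl, cl, ucl = r_chart_limits(sigma0=9.0, d2=3.078, d3=0.797)
print(round(lcl, 2), round(cl, 2), round(ucl, 2))
```

Under these assumptions the three values agree, to two decimals, with the R-chart limits listed for transparency in Table 2.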
Another chart that makes use only of the first two moments of the sample
statistic is the S-chart, where S denotes the sample standard deviation with a
divisor of (n - 1). It is known that $E(S^2) = \sigma_0^2$ and $E(S) = \sigma_0 c_4$, where
$$c_4 = \left[\frac{2}{n-1}\right]^{1/2}\frac{\Gamma(n/2)}{\Gamma((n-1)/2)}. \qquad (9)$$
Thus, $\mathrm{Var}(S) = E(S^2) - [E(S)]^2 = \sigma_0^2(1 - c_4^2)$. Since most of the probability distribution of S is within 3 standard deviations of E(S), the control limits for the
S-chart are as follows:
$$\mathrm{UCL} = \sigma_0\left[c_4 + 3\sqrt{1 - c_4^2}\right] = B_6\sigma_0,$$
$$\mathrm{CL} = \sigma_0 c_4, \qquad (10)$$
$$\mathrm{LCL} = \sigma_0\left[c_4 - 3\sqrt{1 - c_4^2}\right] = B_5\sigma_0.$$
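The constant $c_4$ and the S-chart limits can be computed directly from the gamma function rather than read from a table. This is a sketch under stated assumptions: the function names are hypothetical, and the inputs n = 10 and $\sigma_0 = 9$ are illustrative values that appear consistent with the transparency column of Table 2.

```python
import math


def c4(n):
    """E(S)/sigma0 for a normal sample of size n (S uses the n - 1 divisor)."""
    return math.sqrt(2.0 / (n - 1)) * math.gamma(n / 2) / math.gamma((n - 1) / 2)


def s_chart_limits(sigma0, n):
    """Return (LCL, CL, UCL) for the S-chart with a known standard deviation."""
    c = c4(n)
    spread = 3.0 * math.sqrt(1.0 - c * c)  # 3 * sd(S) / sigma0
    lcl = sigma0 * max(c - spread, 0.0)    # clamp at zero for small n
    return lcl, sigma0 * c, sigma0 * (c + spread)


# Assumed inputs for illustration only.
print([round(v, 2) for v in s_chart_limits(9.0, 10)])
```

The factors multiplying $\sigma_0$ here play the role of $B_5$, $c_4$ and $B_6$ in the limits above.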
Table 2
Control chart data for process dispersion: standards

Surviving entries of the per-sample data columns: -6.93, -13.61, -3.56, -15.83, -1.07, -3.70

Univariate control limits
Chart       Transparency (X1)                          Tear resistance (X2)
R-chart     UCL1 = 49.22, CL1 = 27.70, LCL1 = 6.18     UCL2 = 16.41, CL2 = 9.23, LCL2 = 2.06
S-chart     UCL1 = 15.02, CL1 = 8.75, LCL1 = 2.48      UCL2 = 5.01, CL2 = 2.92, LCL2 = 0.83
S^2-chart   UCL1 = 226.44                              UCL2 = 25.16

Multivariate control limits
|S|^{1/2}-chart (probability limits)   UCL = 51.95, LCL = 6.60
|S|^{1/2}-chart (3-sigma limits)       UCL = 47.17, CL = 22.90, LCL = 0.00
W*-chart (alpha = 0.01)                UCL = 12.38
Values of $c_4$, $B_5$, and $B_6$ are presented in Table M of Duncan [10] for n = 2 to
n = 25. A variation of the S-chart is the sigma chart, on which are plotted values
of the sample standard deviation where the divisor is n. In this case, the upper
and lower control limits are given by
$$\sigma_0\left[c_2 \pm 3\sqrt{(n - 1 - n c_2^2)/n}\right],$$
where $c_2 = c_4\sqrt{(n-1)/n}$.
A control chart can also be based on the unbiased sample variance, $S^2$. Since
$(n-1)S^2/\sigma_0^2$ is distributed as a chi-square random variable with (n - 1) degrees
of freedom, it follows that
$$P\left[\sigma_0^2\chi^2_{n-1,1-(\alpha/2)}/(n-1) \le S^2 \le \sigma_0^2\chi^2_{n-1,\alpha/2}/(n-1)\right] = 1 - \alpha.$$
The control limits for the $S^2$-chart are as follows:
$$\mathrm{UCL} = \sigma_0^2\chi^2_{n-1,\alpha/2}/(n-1),$$
$$\mathrm{LCL} = \sigma_0^2\chi^2_{n-1,1-(\alpha/2)}/(n-1). \qquad (11)$$
However, Guttman, Wilks, and Hunter [11] point out that it is customary to use
only an upper control limit; specifically, $\mathrm{UCL} = \sigma_0^2\chi^2_{n-1,\alpha}/(n-1)$. Note that the
$S^2$-chart is equivalent to repeated tests of significance of the form $H_0\colon \sigma^2 = \sigma_0^2$ vs.
$H_1\colon \sigma^2 \ne \sigma_0^2$, where the critical region for this test is equivalent to the regions
above the UCL and below the LCL, as stated in equation (11). The power of the
test is given by:
$$\pi(\lambda) = 1 - P\left[\lambda^{-2}\chi^2_{n-1,1-(\alpha/2)} \le \chi^2_{n-1} \le \lambda^{-2}\chi^2_{n-1,\alpha/2}\right],$$
where $\lambda = \sigma_1/\sigma_0$. Operating characteristic curves for this test are presented in
Bowker and Lieberman [8] for $\alpha$ = 0.05 and 0.01. This test is significantly
affected when the assumption of sampling from a normal distribution is violated.
Summary statistics and the control limits for all three univariate charts
(R-chart, S-chart, and $S^2$-chart) are recorded in Table 2. For the $S^2$-chart, the
type I error of each chart was set equal to 0.0027, where $\chi^2_{9,0.0027} = 25.16$. When
the data were generated, there was no intentional increase in $\sigma_{01}$ or $\sigma_{02}$. Thus, it
is not surprising that all of the sample measures of dispersion plot in control.
In the multivariate case, attention thus far has been focused on monitoring the
process mean vector. It is also desired that the covariance matrix of the process
remain at the standard value $\Sigma_0$. To check this, a random sample of size n is
obtained and the value of some sample statistic is determined from the (p x n)
data matrix. Let S denote the (p x p) sample variance-covariance matrix:
$$S = \begin{bmatrix} s_1^2 & s_{12} & \cdots & s_{1p} \\ s_{12} & s_2^2 & \cdots & s_{2p} \\ \vdots & \vdots & & \vdots \\ s_{1p} & s_{2p} & \cdots & s_p^2 \end{bmatrix},$$
where the diagonal elements are the sample variances and the off-diagonal
elements are the sample covariances. For the case of two quality characteristics:
$$S = \frac{1}{n-1}\begin{bmatrix} \sum_k (x_{1k} - \bar{x}_1)^2 & \sum_k (x_{1k} - \bar{x}_1)(x_{2k} - \bar{x}_2) \\ \sum_k (x_{1k} - \bar{x}_1)(x_{2k} - \bar{x}_2) & \sum_k (x_{2k} - \bar{x}_2)^2 \end{bmatrix}.$$
Recall that the sample correlation coefficient for the ith and jth variables, denoted
by $r_{ij}$, is defined as $r_{ij} = s_{ij}/(s_i s_j)$.
The sample generalized variance, denoted by |S|, is a widely used scalar
measure of multivariate dispersion. When p = 2, $|S| = s_1^2 s_2^2 - s_{12}^2 = s_1^2 s_2^2(1 - r_{12}^2)$. A geometrical interpretation of |S| for two variables
will now be presented. Let D denote the (2 x n) data matrix after centering:
$$D = \begin{bmatrix} x_{11} - \bar{x}_1 & x_{12} - \bar{x}_1 & \cdots & x_{1n} - \bar{x}_1 \\ x_{21} - \bar{x}_2 & x_{22} - \bar{x}_2 & \cdots & x_{2n} - \bar{x}_2 \end{bmatrix} = \begin{bmatrix} d_1' \\ d_2' \end{bmatrix}.$$
Note that $S = (n-1)^{-1}DD'$.
Specifically, $s_i^2 = (n-1)^{-1}d_i'd_i$, i = 1, 2,
$s_{12} = (n-1)^{-1}d_1'd_2$, and $r_{12} = d_1'd_2/(\sqrt{d_1'd_1}\sqrt{d_2'd_2})$,
which is the cosine of the
angle $\theta$ between $d_1$ and $d_2$. Thus, $\sin^2\theta = 1 - r_{12}^2$ and $|S| = s_1^2 s_2^2\sin^2\theta$. However,
the square of the area of the parallelogram formed by using $d_1$ and $d_2$ as principal
edges is $(n-1)^2 s_1^2 s_2^2\sin^2\theta$. It follows that $|S| = (n-1)^{-2}(\text{area})^2$.
This result generalizes to p variables as follows: $|S| = (n-1)^{-p}(\text{volume})^2$.
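The area interpretation for two variables can be verified on a small data set. The sketch below uses made-up observations and a hypothetical function name; it computes |S| from the sample variances and covariance, and separately the squared area of the parallelogram spanned by the deviation vectors, using the identity $|d_1|^2|d_2|^2\sin^2\theta = (d_1'd_1)(d_2'd_2) - (d_1'd_2)^2$.

```python
def generalized_variance_2d(x1, x2):
    """Return (|S|, squared parallelogram area) for two equally long samples."""
    n = len(x1)
    m1, m2 = sum(x1) / n, sum(x2) / n
    d1 = [v - m1 for v in x1]            # deviation vector for variable 1
    d2 = [v - m2 for v in x2]            # deviation vector for variable 2
    dot11 = sum(a * a for a in d1)
    dot22 = sum(b * b for b in d2)
    dot12 = sum(a * b for a, b in zip(d1, d2))
    s11, s22, s12 = dot11 / (n - 1), dot22 / (n - 1), dot12 / (n - 1)
    det_s = s11 * s22 - s12 ** 2         # generalized sample variance |S|
    area_sq = dot11 * dot22 - dot12 ** 2 # (|d1| |d2| sin(theta))^2
    return det_s, area_sq


# Illustrative data only (not from the plastic film example).
x1 = [1.0, 2.0, 3.0, 4.0]
x2 = [2.0, 1.0, 4.0, 3.0]
det_s, area_sq = generalized_variance_2d(x1, x2)
print(det_s, area_sq / (len(x1) - 1) ** 2)
```

The two printed numbers coincide up to rounding, which is the relation $|S| = (n-1)^{-2}(\text{area})^2$ stated above.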
Johnson and Wichern [18] point out the following properties of the generalized
sample variance:
(i) the volume will increase as the length of any deviation vector ($d_i$) increases;
(ii) for deviation vectors of fixed length, the volume will increase until the
deviation vectors are at right angles to each other;
(iii) if one of the sample variances is small, the volume will be small;
(iv) if one of the deviation vectors lies nearly in the hyperplane formed by the
others, the volume will be small; and
(v) distinctly different covariance matrices can have the same generalized
variance.
In view of the last property, it is recommended that any procedure based on
|S| be accompanied by the appropriate univariate procedures to monitor dispersion.
The first chart to be considered is the $|S|^{1/2}$-chart, which is the multivariate
analogue of the S-chart. The first approach makes use of the distributional
properties of $|S|^{1/2}$. When p = 2, Hoel [14] shows that
$2(n-1)|S|^{1/2}/|\Sigma_0|^{1/2}$ is distributed as $\chi^2_{2n-4}$. By pivoting on this expression, it
follows that control limits for the $|S|^{1/2}$-chart are as follows:
$$\mathrm{UCL} = |\Sigma_0|^{1/2}\chi^2_{2n-4,\alpha/2}/[2(n-1)],$$
$$\mathrm{LCL} = |\Sigma_0|^{1/2}\chi^2_{2n-4,1-(\alpha/2)}/[2(n-1)], \qquad (12)$$
where $|\Sigma_0|^{1/2} = \sigma_{01}\sigma_{02}\sqrt{1 - \rho_0^2}$. For the plastic film example, $|\Sigma_0|^{1/2} = 25.76$.
Thus, for each random sample of size n, $|S|^{1/2} = (s_1^2 s_2^2 - s_{12}^2)^{1/2} = s_1 s_2\sqrt{1 - r_{12}^2}$
is computed. If $|S|^{1/2} > \mathrm{UCL}$ or $|S|^{1/2} < \mathrm{LCL}$, the dispersion of the process is
deemed to be out of control and assignable causes are sought. Although the exact
distribution of $|S|^{1/2}$ for p > 2 is unknown, several approximations are available
and discussed in Alt [1]. The second approach utilizes only the first two moments
of $|S|^{1/2}$ and the property that most of the probability distribution of $|S|^{1/2}$ is
contained in the interval $E(|S|^{1/2}) \pm 3\sqrt{\mathrm{Var}(|S|^{1/2})}$. Since
$|S| = (n-1)^{-p}|\Sigma_0|\prod_{k=1}^{p}\chi^2_{n-k}$, where the chi-square random variables are independent, it follows that
$$E(|S|^r) = (n-1)^{-pr}\,2^{pr}\,|\Sigma_0|^r\prod_{k=1}^{p}\Gamma(r + (n-k)/2)/\Gamma((n-k)/2).$$
Thus,
$$E(|S|^{1/2}) = |\Sigma_0|^{1/2}(2/(n-1))^{p/2}\,\Gamma(n/2)/\Gamma((n-p)/2) = |\Sigma_0|^{1/2} b_3, \qquad (13)$$
$$E(|S|) = |\Sigma_0|(n-1)^{-p}\prod_{k=1}^{p}(n-k) = |\Sigma_0| b_1, \qquad (14)$$
$$\mathrm{Var}(|S|^{1/2}) = E(|S|) - [E(|S|^{1/2})]^2 = |\Sigma_0|(b_1 - b_3^2). \qquad (15)$$
Since the upper and lower control limits are given by
$E(|S|^{1/2}) \pm 3\sqrt{\mathrm{Var}(|S|^{1/2})}$,
it follows that the control limits for a $|S|^{1/2}$-chart are given by
$$\mathrm{UCL} = |\Sigma_0|^{1/2}\left(b_3 + 3\sqrt{b_1 - b_3^2}\right),$$
$$\mathrm{CL} = |\Sigma_0|^{1/2} b_3, \qquad (16)$$
$$\mathrm{LCL} = |\Sigma_0|^{1/2}\left(b_3 - 3\sqrt{b_1 - b_3^2}\right).$$
When p = 1, $b_3 = c_4$ as stated in equation (9), $b_1 = 1$, $|\Sigma_0|^{1/2} = \sigma_0$, and the control
limits presented in equation (16) reduce to those stated in equation (10). When
p = 2,
$$b_1 = (n-2)/(n-1) \quad\text{and}\quad b_3 = (2/(n-1))[\Gamma(n/2)/\Gamma((n-2)/2)].$$
When n = 10, $b_1 = b_3 = 0.889$ and $b_1 - b_3^2 = 0.099$. Thus
$\mathrm{UCL} = 1.831\,|\Sigma_0|^{1/2}$, $\mathrm{CL} = 0.889\,|\Sigma_0|^{1/2}$, and LCL is set to 0 since the computed value is negative.
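The factors $b_1$ and $b_3$ follow directly from the moment formulas for |S| and can be evaluated for any p and n. The sketch below is illustrative only: the function names are hypothetical, and the lower factor is clamped at zero in the same way the text sets a negative LCL to 0.

```python
import math


def b1(n, p):
    """E(|S|) / |Sigma0| for a sample of size n and p characteristics."""
    prod = 1.0
    for k in range(1, p + 1):
        prod *= (n - k)
    return prod / (n - 1) ** p


def b3(n, p):
    """E(|S|^(1/2)) / |Sigma0|^(1/2)."""
    return (2.0 / (n - 1)) ** (p / 2) * math.gamma(n / 2) / math.gamma((n - p) / 2)


def three_sigma_factors(n, p):
    """Multipliers of |Sigma0|^(1/2) giving (LCL, CL, UCL) for the 3-sigma chart."""
    center = b3(n, p)
    spread = 3.0 * math.sqrt(b1(n, p) - center ** 2)
    return max(center - spread, 0.0), center, center + spread


print(three_sigma_factors(10, 2))  # bivariate case with n = 10
print(b1(10, 1), b3(10, 1))        # p = 1: b1 is 1 and b3 equals c4
```

For p = 2 and n = 10 the center factor is 8/9 and the upper factor is close to the 1.831 quoted above; small differences in the third decimal come from rounding.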
The final chart to monitor process dispersion in the multivariate case is the
analogue of the $S^2$-chart, which was equivalent to repeated tests of significance.
Anderson [7] shows that the likelihood ratio test of $H_0\colon \Sigma = \Sigma_0$ vs. $H_1\colon \Sigma \ne \Sigma_0$,
modified to be unbiased (the power of the test is greater than or equal to the
significance level), is based on the following statistic:
$$W^* = -p(n-1) - (n-1)\ln(|S|) + (n-1)\ln(|\Sigma_0|) + (n-1)\,\mathrm{tr}(\Sigma_0^{-1}S), \qquad (17)$$
where $\mathrm{tr}(\Sigma_0^{-1}S)$ is the sum of the diagonal elements of $\Sigma_0^{-1}S$. When p = 2,
$$\mathrm{tr}(\Sigma_0^{-1}S) = (1 - \rho_0^2)^{-1}\left[(s_1^2/\sigma_{01}^2) + (s_2^2/\sigma_{02}^2) - 2\rho_0(s_{12}/\sigma_{01}\sigma_{02})\right].$$
Anderson shows that W* is asymptotically distributed as $\chi^2_{p(p+1)/2}$. Although an
improved asymptotic approximation is also presented, the upper 5% and 1%
points of the exact distribution of W* have been tabulated and appear in [7] for
p = 2(1)10 and various values of (n - 1). For p = 2 and (n - 1) = 9, $H_0$ is rejected
at the 5% level if W* > 8.52 and at the 1% level if W* > 12.38. For successive
random samples of size n, the process dispersion is considered to be out of
control if the values of W* exceed UCL.
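For two characteristics, W* can be computed with nothing more than 2 x 2 determinants and a trace. The following sketch is not the authors' code: the function name is hypothetical, and the standard matrix used in the usage lines is an illustrative assumption rather than a value quoted in the text. The sample matrix uses the subgroup statistics 75.53, 6.78 and -3.88 that appear later in the chapter.

```python
import math


def w_star(S, Sigma0, n):
    """Modified likelihood ratio statistic W* for 2 x 2 covariance matrices.

    S and Sigma0 are nested sequences ((a, b), (b, c)); n is the sample size.
    """
    p = 2
    det_s = S[0][0] * S[1][1] - S[0][1] * S[1][0]
    det_0 = Sigma0[0][0] * Sigma0[1][1] - Sigma0[0][1] * Sigma0[1][0]
    # Inverse of the 2 x 2 standard matrix.
    inv = ((Sigma0[1][1] / det_0, -Sigma0[0][1] / det_0),
           (-Sigma0[1][0] / det_0, Sigma0[0][0] / det_0))
    # tr(Sigma0^{-1} S) = sum over i, j of inv[i][j] * S[j][i]
    trace = sum(inv[i][j] * S[j][i] for i in range(2) for j in range(2))
    return (n - 1) * (-p - math.log(det_s) + math.log(det_0) + trace)


sigma0 = ((81.0, -8.1), (-8.1, 9.0))      # assumed standard, for illustration
sample = ((75.53, -3.88), (-3.88, 6.78))  # one subgroup's sample matrix
print(w_star(sample, sigma0, n=10))
print(w_star(sigma0, sigma0, n=10))       # zero (up to rounding) when S equals Sigma0
```

The statistic is zero when the sample matrix equals the standard and positive otherwise, so only an upper control limit is needed.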
When there are multiple characteristics, three procedures have been presented
for monitoring the variability of a process. Although $|S|^{1/2}$ is plotted on each of
the first two charts, the distinction is that the control limits for the first chart are
probability limits (equation (12)) while the control limits for the second chart are
3-sigma limits (equation (16)). The third procedure is based on the modified
likelihood ratio test, and values of W* (equation (17)) are plotted on a control
chart with the upper control limit determined by a specified significance level. For
$\alpha$ = 0.01, UCL = 12.38. Summary statistics for all three charts are recorded in
Table 2. The statistics for all three charts plot in control. It is concluded that the
variability of the process is in control. Although the range chart was used to
monitor the variability of each quality characteristic, the multivariate analogue was
not presented since it is relatively intractable.
Phase I control charts
In Phase II, control charts are used to determine whether the process is in
control at the standard values ($\mu_0$, $\Sigma_0$). During the initial stages of process surveillance, $\mu_0$ and $\Sigma_0$ are usually unknown and must be estimated from preliminary
samples taken when the process is believed to be in control. These preliminary
samples are referred to as rational subgroups and m is used to denote the number
of subgroups. When there is one quality characteristic, the procedure ordinarily
used to construct the Phase I control chart limits is to replace the standard values
in the Phase II charts by unbiased estimates obtained from the m rational
subgroups. For example, $\mu_{01}$ in equation (1) would be replaced by the average of
the sample means obtained from each rational subgroup, and any one of several
measures of variability would be used in place of $\sigma_{01}$. However, Hillier [13] and
Yang and Hillier [26] have developed a two-stage procedure using probability
limits for determining whether the data for the first m subgroups came from a
process that was in control (Stage I) and whether future subgroup data from this
process exhibit statistical control. This was extended to the multivariate case by
Alt et al. [5].
Stage I control limits for the mean
For each of the m subgroups, a random sample of size n is obtained and the
(p x 1) vector of sample means ($\bar{x}_i$) is calculated as is the (p x p) sample
variance-covariance matrix ($S_i$). If statistical control existed within each subgroup,
then unbiased estimates of the process mean vector and the process variance-covariance matrix are given by
$$\bar{\bar{x}} = \frac{1}{m}\sum_{i=1}^{m}\bar{x}_i \quad\text{and}\quad \bar{S} = \frac{1}{m}\sum_{i=1}^{m}S_i,$$
respectively. For the plastic film extrusion process, m was chosen to be 10. The
elements of the sample mean vectors and covariance matrices are recorded in
Table 3.
Table 3
Statistics for control charts for the mean (Phase I, Stage I)

Surviving column headings: $\bar{x}_{2,i}$, $s^2_{1,i}$, $T^2_{0,1}$ (a)
Surviving per-subgroup entries: -3.88, -16.16, -6.11, -14.30, -10.74, -9.04, -5.84, -25.76, -0.87

Pooled statistics and UCL
Subgroups used: 1-10; 1-3, 5-7, 9, 10
Surviving entries (the row also reports c(m, n, p)): -8.92, 29.69 (a), 91.18 (a), -9.65 (a)

(a) Revised value after subgroups 4 and 8 have been excluded.

When standard values for $\mu$ and $\Sigma$ are available, the test statistic is stated in
equation (4). If the standard values are replaced by their unbiased estimates, the
resulting statistic is:
$$T^2_{0,1} = n(\bar{x}_i - \bar{\bar{x}})'\,\bar{S}^{-1}(\bar{x}_i - \bar{\bar{x}}), \qquad (18)$$
i = 1, 2, ..., m. For the case of two quality characteristics,
$$T^2_{0,1} = (n/\det(\bar{S}))\left[(\bar{x}_{1,i} - \bar{\bar{x}}_1)^2\bar{s}_2^2 + (\bar{x}_{2,i} - \bar{\bar{x}}_2)^2\bar{s}_1^2 - 2\bar{s}_{12}(\bar{x}_{1,i} - \bar{\bar{x}}_1)(\bar{x}_{2,i} - \bar{\bar{x}}_2)\right],$$
where $\det(\bar{S}) = \bar{s}_1^2\bar{s}_2^2 - \bar{s}_{12}^2$, $\bar{s}_1^2 = (1/m)\sum_{i=1}^{m}s^2_{1,i}$, $\bar{s}_2^2 = (1/m)\sum_{i=1}^{m}s^2_{2,i}$, and
$\bar{s}_{12} = (1/m)\sum_{i=1}^{m}s_{12,i}$. Alt et al. [5] show that $T^2_{0,1}$ is
distributed as $c_1(m, n, p)F_{p,\,mn-m-p+1}$, where
$$c_1(m, n, p) = p(m-1)(n-1)/(mn - m - p + 1).$$
To determine whether the process was in control when the first m subgroups
were obtained, the m values of $T^2_{0,1}$ are plotted on a chart with
$\mathrm{UCL} = c_1(m, n, p)F_{p,\,mn-m-p+1,\,\alpha}$ and LCL = 0. If $T^2_{0,1}$ for one or more of the m initial
subgroups plots out of control, the corresponding subgroups are discarded and
the first stage control limits are recalculated on the basis of the remaining
subgroups. This procedure is illustrated for the plastic film extrusion process; the
summary statistics are recorded in Table 3. To simulate an out-of-control process,
each process mean was increased by one standard deviation for subgroups 4 and
8. Note that $T^2_{0,1}$ exceeded the UCL (with $\alpha$ = 0.001) for these two subgroups.
As a consequence, these subgroups were discarded, $\bar{\bar{x}}$ and $\bar{S}$ were recomputed,
and new control limits were determined using m = 8. For the remaining eight
subgroups, the recalculated values of $T^2_{0,1}$ are less than the revised UCL. The
process appears to be in control with respect to its mean.
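The Stage I computations for two characteristics reduce to a few lines of arithmetic. The sketch below is a hedged illustration: the function names are hypothetical, the pooled statistics in the usage lines are invented numbers, and the F percentage point needed for the UCL is left to the reader (it would come from tables or a statistics library).

```python
def c1(m, n, p):
    """Stage I constant multiplying the F percentage point."""
    return p * (m - 1) * (n - 1) / (m * n - m - p + 1)


def t2_bivariate(xbar_i, grand_mean, pooled, n):
    """T^2 for one subgroup with p = 2.

    xbar_i and grand_mean are (mean1, mean2); pooled is (s1_sq, s2_sq, s12),
    the averages of the subgroup variances and covariances.
    """
    s1_sq, s2_sq, s12 = pooled
    det = s1_sq * s2_sq - s12 ** 2
    d1 = xbar_i[0] - grand_mean[0]
    d2 = xbar_i[1] - grand_mean[1]
    return (n / det) * (d1 * d1 * s2_sq + d2 * d2 * s1_sq - 2.0 * s12 * d1 * d2)


# Hypothetical pooled values, for illustration only.
print(c1(10, 10, 2))  # 162 / 89
print(t2_bivariate((95.0, 31.0), (92.0, 30.0), (90.0, 8.0, -9.0), n=10))
```

A subgroup whose statistic exceeds $c_1$ times the chosen F point would be dropped, and the pooled means, variances and covariance recomputed from the remaining subgroups before the limits are redrawn.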
For the case when p = 1, $\mathrm{UCL} = ((m-1)/m)F_{1,\,m(n-1),\,\alpha}$ and
$T^2_{0,1} = n(\bar{x}_i - \bar{\bar{x}})^2/\bar{s}^2$, where $\bar{s}^2$ was previously defined as the average of the sample
variances obtained from each subgroup. Since $F_{1,\,m(n-1),\,\alpha} = t^2_{m(n-1),\,\alpha/2}$, it follows
that
$$1 - \alpha = P\left[(\bar{x}_i - \bar{\bar{x}})^2/\bar{s}^2 \le ((m-1)/m)F_{1,\,m(n-1),\,\alpha}\right]$$
$$= P\left[|\bar{x}_i - \bar{\bar{x}}| \le \sqrt{\bar{s}^2((m-1)/m)}\;t_{m(n-1),\,\alpha/2}\right]$$
$$= P\left[\bar{\bar{x}} - A_4\sqrt{\bar{s}^2} \le \bar{x}_i \le \bar{\bar{x}} + A_4\sqrt{\bar{s}^2}\right],$$
where $A_4 = \sqrt{(m-1)/m}\;t_{m(n-1),\,\alpha/2}$. Thus, the multivariate result reduces to the
univariate result previously obtained by Yang and Hillier [26]. Furthermore,
Bonferroni intervals for the individual characteristics are obtained by using
$A_4 = \sqrt{(m-1)/m}\;t_{m(n-1),\,\alpha/2p}$. For p = 2, the upper and lower control limits for
each variable are given by $\bar{\bar{x}} \pm A_4\sqrt{\bar{s}^2}$. Setting m = 10, n = 10, and $\alpha$ = 0.001
yields the following control limits:
$$\mathrm{UCL}_1 = 124.81, \quad \mathrm{UCL}_2 = 39.67,$$
$$\mathrm{LCL}_1 = 59.57, \quad \mathrm{LCL}_2 = 20.77.$$
Although each process mean had increased by one standard deviation for
subgroups 4 and 8 and this was detected by $T^2_{0,1}$, these increases failed to show
up on the univariate charts.
Stage II control limits for the mean
After the Stage I upper control limit has been revised and the test statistics for
the remaining subgroups do not exceed this upper control limit, a Stage II control
chart is started for future subgroups. Let $\bar{x}_f$ denote the (p x 1) vector of sample
means for a future subgroup. Substituting $\bar{x}_f$ for $\bar{x}_i$ in equation (18) yields the
Stage II test statistic:
$$T^2_{0,2} = n(\bar{x}_f - \bar{\bar{x}})'\,\bar{S}^{-1}(\bar{x}_f - \bar{\bar{x}}),$$
where $\bar{\bar{x}}$ and $\bar{S}$ are obtained from Stage I. It is shown in Alt et al. [5] that $T^2_{0,2}$
is distributed as $c_2(m, n, p)F_{p,\,mn-m-p+1}$, where
$$c_2(m, n, p) = p(n-1)(m+1)/(mn - m - p + 1).$$
In order to determine whether the mean remains in control during Stage II, values
of $T^2_{0,2}$ for each future subgroup are plotted on a control chart with
$\mathrm{UCL} = c_2(m, n, p)F_{p,\,mn-m-p+1,\,\alpha}$ and LCL = 0. If $T^2_{0,2}$ exceeds the UCL, an
assignable cause is sought. Yang and Hillier [26] suggest that $\bar{\bar{x}}$, $\bar{S}$ and the UCL
be updated fairly often in the beginning, with less updating after the process has
stabilized. The $T^2_{0,2}$-chart can be supplemented by charts for each quality characteristic. For p = 2, the upper and lower control limits for each variable are
$\bar{\bar{x}} \pm A_4^*\sqrt{\bar{s}^2}$, where $A_4^* = \sqrt{(m+1)/m}\;t_{m(n-1),\,\alpha/2p}$.
The summary statistics for the plastic film extrusion process are recorded in
Table 4. Since each mean was increased by one standard deviation for f = 4 and
8, it is not surprising that $T^2_{0,2} > \mathrm{UCL}$ (with $\alpha$ = 0.005) for these two subgroups.
Since the test statistics for eight more subgroups have plotted in control, $\bar{\bar{x}}$ and
$\bar{S}$ may be recomputed using the sixteen subgroups where $T^2_{0,1}$ and $T^2_{0,2}$ plotted
in control.

Table 4
Statistics for control charts for the mean (Phase I, Stage II)
Column headings: $\bar{x}_{1,f}$, $\bar{x}_{2,f}$, $T^2_{0,2}$
UCL = $c_2(8, 10, 2)F_{2,\,71,\,0.005}$ = 13.03
Control charts for process dispersion (Phase I)
The procedure used to monitor the mean of a multivariate process during
Phase I is based on probability limit charts for Stages I and II. However, this
approach will not be employed to monitor the dispersion since the methodology
has not been completed at this time. Rather, the course used will correspond to
the univariate method of replacing the population parameter ($\sigma_0$) by an unbiased
estimate obtained from m rational subgroups.
When there is only one quality characteristic, the R-chart is frequently used to
analyze the variability of past data. If $R_i$ denotes the range of each subgroup and
$\bar{R} = (1/m)\sum_{i=1}^{m}R_i$, then an unbiased estimate of $\sigma_0$ is $\bar{R}/d_2$. The Phase I control
limits for the R-chart are obtained by substituting this unbiased estimate for $\sigma_0$
in equation (8). Usually, the control limits are written as $\mathrm{UCL} = D_4\bar{R}$ and
$\mathrm{LCL} = D_3\bar{R}$. Values of $D_3$ and $D_4$ can be found in Table M of Duncan [10].
Another possibility for analyzing variability is the S-chart. Let $\bar{S}$ denote the average
of the sample standard deviations from the m subgroups. Then the control limits
for an S-chart are $\mathrm{UCL} = B_4\bar{S}$ and $\mathrm{LCL} = B_3\bar{S}$, where
$$B_3 = 1 - (3/c_4)\sqrt{1 - c_4^2},$$
$$B_4 = 1 + (3/c_4)\sqrt{1 - c_4^2}. \qquad (23)$$
Values of $B_3$ and $B_4$ are tabulated in Duncan. The $B_3$ and $B_4$ constants used in
an S-chart are obtained by substituting the unbiased estimate, $\bar{S}/c_4$, for $\sigma_0$ in
equation (10) and simplifying. Another alternative to control process variability is
the $S^2$-chart, where $\overline{S^2}$ is the average of the sample variances. Note that $\bar{S} \ne \sqrt{\overline{S^2}}$
since $\bar{S}$ is the average of the sample standard deviations. When the unbiased
estimate ($\overline{S^2}$) of $\sigma_0^2$ is substituted in equation (11), the following Phase I control
limits are obtained:
$$\mathrm{UCL} = \overline{S^2}\,\chi^2_{n-1,\alpha/2}/(n-1),$$
$$\mathrm{LCL} = \overline{S^2}\,\chi^2_{n-1,1-(\alpha/2)}/(n-1).$$
It is customary to use only an upper control limit where the percentage point is
$\chi^2_{n-1,\alpha}$.
By using the summary statistics recorded in Table 3, control limits can be
determined for the S- and $S^2$-charts. For the transparency variable ($X_1$), $\bar{S}_1 = 9.15$;
for the tear resistance variable ($X_2$), $\bar{S}_2 = 2.68$. From Table M of Duncan, it is
seen that, for n = 10, $B_3 = 0.284$ and $B_4 = 1.716$. For $X_1$, the control limits are
$\mathrm{UCL}_1 = 15.70$ and $\mathrm{LCL}_1 = 2.60$; for $X_2$, $\mathrm{UCL}_2 = 4.60$ and $\mathrm{LCL}_2 = 0.76$. Since
none of the points falls outside the S-control limits for either variable, the
variability of the process is deemed to be under control during this preliminary
period of 10 subgroups. For the $S^2$-charts, $\mathrm{UCL}_1 = (90.98)(25.16)/9 = 254.34$
and $\mathrm{UCL}_2 = (7.64)(25.16)/9 = 21.36$. Again, none of the points exceeds the upper
control limits and the same conclusion would be reached.
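The Phase I S-chart and $S^2$-chart limits quoted in this paragraph can be reproduced with a short calculation. This is a sketch with hypothetical function names; the averages 9.15, 2.68, 90.98 and 7.64 and the percentage point 25.16 are taken from the text, and $c_4$ is evaluated from its gamma-function definition instead of being read from a table.

```python
import math


def c4(n):
    """E(S)/sigma0 for a normal sample of size n."""
    return math.sqrt(2.0 / (n - 1)) * math.gamma(n / 2) / math.gamma((n - 1) / 2)


def s_chart_factors(n):
    """Return (B3, B4) for Phase I S-chart limits B3 * S_bar and B4 * S_bar."""
    c = c4(n)
    ratio = 3.0 * math.sqrt(1.0 - c * c) / c
    return max(1.0 - ratio, 0.0), 1.0 + ratio


def s2_chart_ucl(mean_variance, chi2_point, n):
    """Upper limit of the S^2-chart for a supplied chi-square percentage point."""
    return mean_variance * chi2_point / (n - 1)


b3_factor, b4_factor = s_chart_factors(10)
for s_bar in (9.15, 2.68):  # average standard deviations quoted in the text
    print(round(b3_factor * s_bar, 2), round(b4_factor * s_bar, 2))
for v_bar in (90.98, 7.64):  # average variances quoted in the text
    print(round(s2_chart_ucl(v_bar, 25.16, 10), 2))
```

The printed values match the limits stated above to two decimals.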
When there are multiple quality characteristics, two variations of the $|S|^{1/2}$-chart
were presented for monitoring process dispersion during Phase II. The first
was a probability limit chart with control limits as stated in equation (12). These
particular limits are applicable only when p = 2. Let $|S^*|^{1/2}$ denote the average
of the square roots of the generalized sample variances. That is,
$|S^*|^{1/2} = (1/m)\sum_{i=1}^{m}|S_i|^{1/2}$. Since $|S^*|^{1/2}/b_3$ is an unbiased estimate of $|\Sigma_0|^{1/2}$,
Phase I control limits are as follows:
$$\mathrm{UCL} = |S^*|^{1/2}\chi^2_{2n-4,\alpha/2}/[2b_3(n-1)],$$
$$\mathrm{LCL} = |S^*|^{1/2}\chi^2_{2n-4,1-(\alpha/2)}/[2b_3(n-1)].$$
The constant $b_3$ was defined in equation (13). The other Phase II chart for $|S|^{1/2}$
used the 3-sigma limits stated in equation (16) and was appropriate for any
number of quality characteristics. Thus, the Phase I limits are obtained by substituting $|S^*|^{1/2}/b_3$ for $|\Sigma_0|^{1/2}$ in equation (16). The resulting limits are:
$$\mathrm{UCL} = |S^*|^{1/2}\left[1 + (3/b_3)\sqrt{b_1 - b_3^2}\right],$$
$$\mathrm{CL} = |S^*|^{1/2},$$
$$\mathrm{LCL} = |S^*|^{1/2}\left[1 - (3/b_3)\sqrt{b_1 - b_3^2}\right].$$
When p = 1, the above control limits for a $|S^*|^{1/2}$-chart are identical to those for
the S-chart with the $B_3$ and $B_4$ factors stated in equation (23). Another procedure
that could be used for investigating process dispersion during Phase I is obtained
from equation (17), which was the likelihood ratio statistic for testing $H_0\colon \Sigma = \Sigma_0$.
To obtain the corresponding Phase I procedure, unbiased estimates of $|\Sigma_0|$ and
$\Sigma_0^{-1}$ are needed. Let $|S_0|$ denote the average of the generalized sample variances
from the m subgroups. That is, $|S_0| = (1/m)\sum_{i=1}^{m}|S_i|$. By using the result stated
in equation (14), it can be shown that $|S_0|/b_1$ is an unbiased estimate of $|\Sigma_0|$.
Let $S_i^{-1}$ denote the inverse of the sample variance-covariance matrix for subgroup
i, i = 1, ..., m. Kshirsagar [19] shows that $(n-p-2)S_i^{-1}/(n-1)$ is an unbiased
estimate of $\Sigma_0^{-1}$. Thus, if $S_*^{-1} = (1/m)\sum_{i=1}^{m}S_i^{-1}$, then $(n-p-2)S_*^{-1}/(n-1)$ is
an unbiased estimate of $\Sigma_0^{-1}$ obtained from the m rational subgroups. The Phase I
procedure is obtained by substituting $|S_0|/b_1$ for $|\Sigma_0|$ and $(n-p-2)S_*^{-1}/(n-1)$
for $\Sigma_0^{-1}$ in equation (17). The revised values of $W_i^*$, i = 1, ..., m, would still be
plotted on a control chart with $\mathrm{UCL} = \chi^2_{p(p+1)/2,\,\alpha}$.
The control limit factors used during Phase I for both one and more than one
quality characteristic were independent of the number of subgroups. Some authors
argue that these factors should also be a function of m, the number of subgroups.
Such factors are presented in Alt [1].
The 3-sigma $|S^*|^{1/2}$-chart will be used to investigate the variability of the
plastic film process. The value of $|S_i|^{1/2}$ for each of the initial ten subgroups can
be obtained from the summary statistics presented in Table 3. For example,
$$|S_1|^{1/2} = \sqrt{(75.53)(6.78) - (-3.88)^2} = 22.29.$$
For ease of reference, these values are recorded in Table 5. It was previously
stated that $b_1 = b_3 = 0.889$ when n = 10. Thus, UCL = (21.46)(2.06) = 44.21 and
LCL = (21.46)(0.06) = 1.29. Since none of the values of $|S_i|^{1/2}$ falls outside the
control limits, it appears that the variability of the process is under control.

Table 5
Statistics for 3-sigma $|S^*|^{1/2}$-chart
$|S_i|^{1/2}$: 22.29, 16.85, 30.13, 18.50, 22.97, 30.91, 22.21, 18.15, 24.66
$|S^*|^{1/2}$ = 21.46, UCL = 44.21, LCL = 1.29
Other approaches
The control charts presented in this paper were Shewhart charts. When there
is one quality characteristic, the cumulative sum (CUSUM) control chart has
smaller average run lengths than the Shewhart chart when used to detect small
shifts in the process mean. Recently, three multivariate CUSUM charts have been
proposed. Let $\mu_0$ denote the standard value of the process mean and $\Sigma$ the
variance-covariance matrix. Crosier [9] defines
$$C_n = \left[(S_{n-1} + X_n - \mu_0)'\,\Sigma^{-1}(S_{n-1} + X_n - \mu_0)\right]^{1/2}$$
and proposes the following CUSUM:
$$S_n = 0 \quad\text{if } C_n \le k,$$
$$S_n = (S_{n-1} + X_n - \mu_0)(1 - k/C_n) \quad\text{if } C_n > k,$$
where $S_0 = 0$ and k > 0. Crosier's scheme signals when $S_n'\Sigma^{-1}S_n > h^2$.
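Crosier's recursion is easy to follow in code. The sketch below is a minimal illustration, not a reference implementation: the function names are hypothetical, the inverse covariance matrix is passed in directly, and the numbers in the usage lines (identity covariance, k = 1, h = 3) are arbitrary choices for demonstration.

```python
import math


def quad_form(v, a):
    """Return v' A v for a vector v and a square matrix A (nested lists)."""
    size = len(v)
    return sum(v[i] * a[i][j] * v[j] for i in range(size) for j in range(size))


def crosier_step(s_prev, x, mu0, sigma_inv, k):
    """One update of the cumulative sum vector S_n."""
    v = [s + xi - m for s, xi, m in zip(s_prev, x, mu0)]
    c = math.sqrt(quad_form(v, sigma_inv))  # C_n
    if c <= k:
        return [0.0] * len(v)               # reset when C_n <= k
    shrink = 1.0 - k / c                     # shrink toward zero by k
    return [shrink * vi for vi in v]


def crosier_signal(s, sigma_inv, h):
    """True when S_n' Sigma^{-1} S_n exceeds h^2."""
    return quad_form(s, sigma_inv) > h * h


identity = [[1.0, 0.0], [0.0, 1.0]]         # illustrative inverse covariance
s1 = crosier_step([0.0, 0.0], [3.0, 4.0], [0.0, 0.0], identity, k=1.0)
print(s1, crosier_signal(s1, identity, h=3.0))
```

With these inputs $C_1 = 5$, so the observation vector is shrunk by the factor 0.8 and the resulting sum has length 4, which exceeds h = 3.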
Healy [12] developed a CUSUM procedure based on the sequential probability
ratio test. Let $\delta$ denote the shift from $\mu_0$ that is important to detect. Define
$D = \sqrt{\delta'\Sigma^{-1}\delta}$ and $a' = \delta'\Sigma^{-1}/D$. Then Healy's scheme has
$$S_n = \max\{S_{n-1} + a'(x_n - \mu_0) - 0.5D,\; 0\}.$$
Healy's scheme signals when $S_n > L$, where L is an appropriately chosen constant. Healy also presents a CUSUM scheme for detecting a shift in the
covariance matrix. He shows that this CUSUM is equivalent to a CUSUM
proposed by Pignatiello et al. [23] for detecting a shift in the mean.
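Healy's scheme for the mean reduces the multivariate observation to a scalar along the direction of the shift to be detected. The sketch below is illustrative only: the function names are hypothetical, and the shift vector and identity inverse covariance in the usage lines are arbitrary assumptions.

```python
import math


def healy_direction(delta, sigma_inv):
    """Return (a, D) with D = sqrt(delta' Sigma^{-1} delta), a' = delta' Sigma^{-1} / D."""
    size = len(delta)
    # Row vector delta' Sigma^{-1}
    w = [sum(delta[i] * sigma_inv[i][j] for i in range(size)) for j in range(size)]
    d = math.sqrt(sum(w[j] * delta[j] for j in range(size)))
    return [wj / d for wj in w], d


def healy_step(s_prev, x, mu0, a, d):
    """One update S_n = max(S_{n-1} + a'(x_n - mu0) - 0.5 D, 0)."""
    projection = sum(aj * (xj - mj) for aj, xj, mj in zip(a, x, mu0))
    return max(s_prev + projection - 0.5 * d, 0.0)


identity = [[1.0, 0.0], [0.0, 1.0]]  # illustrative inverse covariance
a, d = healy_direction([3.0, 4.0], identity)
print(a, d)
print(healy_step(0.0, [3.0, 4.0], [0.0, 0.0], a, d))  # observation at the shifted mean
print(healy_step(0.0, [0.0, 0.0], [0.0, 0.0], a, d))  # observation at the standard
```

An observation at the standard mean leaves the sum at zero, while one at the shifted mean raises it by 0.5D per sample; a signal would be declared once the sum exceeds the chosen constant L.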
Jackson [17] presents an overview of principal components and its relation to
quality control as well as several other recent developments, such as Andrews
References
[1] Alt, F. B. (1973). Aspects of multivariate control charts. M.S. thesis, Georgia Institute of
Technology, Atlanta.
[2] Alt, F. B. (1982). In S. Kotz and N. L. Johnson, eds., Encyclopedia of Statistical Sciences, Vol. 1,
Wiley, New York, 294-300.
[3] Alt, F. B. (1985). In S. Kotz and N. L. Johnson, eds., Encyclopedia of Statistical Sciences, Vol. 1,
Wiley, New York, 110-122.
[4] Alt, F. B. and Deutsch, S. J. (1978). Proc. Seventh Ann. Meeting, Northeast Regional Conf. Amer.
Inst. Decision Sci., 109-112.
[5] Alt, F. B., Goode, J. J., and Wadsworth, H. M. (1976). Ann. Tech. Conf. Trans. ASQC, 170-176.
[6] Alt, F. B., Walker, J. W., and Goode, J. J. (1980). Ann. Tech. Conf. Trans. ASQC, 754-759.
[7] Anderson, T. W. (1984). An Introduction to Multivariate Statistical Analysis, 2nd ed., Wiley, New
York.
[8] Bowker, A. H. and Lieberman, G. J. (1959). Engineering Statistics. Prentice-Hall, Englewood
Cliffs, NJ.
[9] Crosier, R. B. (1986). Technometrics 28, 187-194.
[10] Duncan, A. J. (1974). Quality Control and Industrial Statistics. 4th ed. Richard D. Irwin,
Homewood, IL.
[11] Guttman, I. and Wilks, S. S. (1965). Introductory Engineering Statistics. Wiley, New York.
[12] Healy, J. D. (1987). Technometrics. To appear.
[13] Hillier, F. S. (1969). J. Qual. Tech. 1, 17-26.
[14] Hoel, P. G. (1937). Ann. Math. Stat. 8, 149-158.
[15] Hotelling, H. (1947). In: C. Eisenhart, H. Hastay, and W. A. Wallis, eds., Techniques of
Statistical Analysis, McGraw-Hill, New York, 111-184.
[16] Hotelling, H. (1951). Proceedings of the Second Berkeley Symposium on Mathematical Statistics and
Probability. University of California Press, Berkeley, CA, 23-41.
[17] Jackson, J. E. (1985). Commun. Statist.-Theor. Meth. 14, 2657-2688.
[18] Johnson, R. A. and Wichern, D. W. (1982). Applied Multivariate Statistical Analysis. Prentice-Hall, Englewood Cliffs, NJ.
[19] Kshirsagar, A. M. (1972). Multivariate Analysis, Marcel Dekker, New York.
[20] Montgomery, D. C. (1985). Introduction to Statistical Quality Control, Wiley, New York.
[21] Montgomery, D. C. and Klatt, P. J. (1972). Manag. Sci. 19, 76-89.
[22] Montgomery, D. C. and Klatt, P. J. (1972). AIIE Trans. 4, 103-110.
[23] Pignatiello, J. J., Runger, G. C. and Korpela, K. S. (1986). Truly multivariate CUSUM charts.
Working Paper # 86-024, College of Engineering, University of Arizona, Tucson, AZ.
[24] Schilling, E. G. and Nelson, P. R. (1976). J. Qual. Tech. 8.
[25] Wiener, H. L. (1975). A Fortran program for rapid computations involving the non-central
chi-square distribution. NRL Memorandum Report 3106, Washington, DC.
[26] Yang, C.-H. and Hillier, F. S. (1970). J. Qual. Tech. 2, 9-16.
P. R. Krishnaiah and C. R. Rao, eds., Handbook of Statistics, Vol. 7
© Elsevier Science Publishers B.V. (1988) 353-373
1 u
QMP/USP: A Modern Approach to Statistical
Quality Auditing
Bruce Hoadley
1. Introduction and summary
An important activity of Quality Assurance is to conduct quality audits of
manufactured, installed and repaired products. These audits are highly structured
inspections done continually on a sampling basis. During a time interval called
a rating period, samples of product are inspected for conformance to engineering
and manufacturing requirements. Defects found are accumulated over a rating
period and then compared to a quality standard established by quality engineers.
The quality standard is a current target value for defects per unit, which reflects
a trade-off between manufacturing cost, maintenance costs, customer need, and
quality improvement opportunities and resources. The comparison to the standard
is called rating and is done statistically. The output of rating is an exception
report, which guides quality improvement activities. For the purpose of sampling
and rating, product and tests are organized into strata called rating classes.
Specific examples of rating classes are: (i) a functional test audit for digital hybrid
integrated circuits, (ii) a workmanship audit for peripheral switching frames.
The purpose of this chapter is to describe the Universal Sampling Plan (USP)
and the Quality Measurement Plan (QMP), which were implemented throughout
Western Electric in late 1980. USP and QMP are modern methods of audit
sampling and rating respectively. They replaced methods that evolved from the
work of Shewhart, Dodge and others, starting in the 1920's and continuing
through to the middle 1950's [5-8]. More generally, USP and QMP provide a
modern foundation for sampling inspection theory.
This chapter is a summary of the material published in [ 1-4, 10, 12-17]. Those
papers considered primarily attribute data in the form of defects, defectives or
weighted defects (called demerits in [8]). Here, we consider Poisson defects only.
However, the general case can be transformed into the Poisson case via the
concept of equivalent defects [ 1, p. 229].
1.1. Summary of USP
The first step in a quality audit is to select samples of all the rating classes to
be inspected. The cost per period of these inspections cannot exceed an inspection
budget.
The traditional audit sampling consisted of six sampling curves developed by
Dodge and Torrey [8]. The curves provide sample size as a function of production. Each product is assigned a curve based on criteria such as complexity and
homogeneity. To quote Dodge and Torrey, 'These are empirical curves chosen
after careful consideration of the varied classes of product to which they were to
be applied as well as of the quantities of production to be encountered.' There
is no known theoretical foundation for the curves.
The traditional sampling plan did not account for many factors that relate to
sampling. For example, (i) cost of auditing, (ii) field cost of defects, (iii) quality
history, (iv) statistical operating characteristics of rating, (v) audit budget
constraints. USP provides a theoretical foundation for audit sampling, which
accounts for all these factors.
The fundamental concept of USP is that more extensive audits provide more
effective feedback, which results in better quality. The cost benefit is less field
maintenance cost; but, this must be compared to the larger audit costs.
We assume that the field maintenance cost affected by the audit is
(Production) x [Defects sent to the field per unit produced] x [Field maintenance cost per defect sent to the field].
The audit affects the second quantity in this expression via a feedback mechanism (Figure 1). Under this feedback model, the production process is a controlled
stochastic process. When the process is at the standard level (one on an index
scale), there is a small probability per period that the process will change to a
substandard level. Given this substandard level, there is a probability per period
that the audit will detect this change. This probability depends on the audit
sample size. When detection occurs, management acts and forces the process
back towards the standard. This phenomenon is empirically observed in audit
data. So, sample size affects average long run quality, because it affects detection
power.

Fig. 1. The USP feedback model.

In this feedback model, we ignore the possibility of false detection when the
process is at standard. There is an implicit assumption that the cost of false
detection is large and therefore, the producer's risk is small. When detection
occurs, management is supposed to act. Such action frequently involves the
expenditure of substantial resources on a quality improvement program. So the
whole audit strategy is founded on the integrity of exception reporting. Otherwise,
management would pay little attention to the results.
Our model for the total audit cost per period per product is
[Inspection cost per unit] x [Number of units inspected].
The USP sample sizes for all products are determined jointly by minimizing the
audit costs plus the field maintenance costs subject to an inspection budget
constraint. The approximate solution to this problem is the following formula for
e = [expected number of defects in the audit sample per period]:
\[ e = \begin{cases} 0 \ (\text{i.e., no audit}) & \text{when } \sqrt{B P r s N} < 1, \\ \sqrt{B P r s N} & \text{otherwise,} \end{cases} \]
where
e = Expected number of defects in the audit sample, given standard quality (called expectancy) = ns,
n = Sample size,
s = Standard defects per unit,
N = Production,
r = C_f/C_a,
C_f = Field maintenance cost per defect,
C_a = Audit cost per unit of expectancy,
P = Probability per period that the process will change to a substandard level = Process control factor,
B = Budget control factor (monotonically related to the Lagrange multiplier associated with the budget constraint).
Note that we express the solution in terms of the expectancy in the sample--not
the sample size itself. Expectancy is the natural metric for describing the size of
an audit; because, the detection power depends on the sample size through
expectancy. Five switching frames with a standard of one defect per unit generate
as much information as 5000 transistors with a standard of 0.001 defects per unit.
For most applications, the expectancy in the sample ranges from 1 to 10, whereas
sample sizes range from 5 to 10000.
EXAMPLE. Test audit for a small business system. This example illustrates USP in
its simplest form. For a test audit, the quality measurement is total test defects.
The standard is s = 0.06 defects per unit. An analysis of past audit data yielded
P = 0.04. Economic analyses yielded C_a = $430 and C_f = $280, so
r = $280/$430 = 0.65. If B = 2 (used by Western Electric), then the expectancy
formula is
\[ e = \sqrt{2(0.04)(0.65)(0.06)N} = (0.056)\sqrt{N}. \]
The sample size version of this formula is \(n = e/(0.06) = (0.93)\sqrt{N}\). If the production is N = 2820, then \(e = (0.056)\sqrt{2820} = 3.0\). The sample size is
3.0/(0.06) = 50. Under the traditional plan [8], the sample size would have been
\(\sqrt{2}\sqrt{N} = \sqrt{2(2820)} = 75\), a 50 percent increase.
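The arithmetic of this example can be checked with a short script. The sketch below assumes the approximate solution has the form e = sqrt(B P r s N), which is the form the worked numbers imply; the function names are illustrative and are not part of USP.

```python
import math

def usp_expectancy(P, r, s, N, B=2.0):
    """Approximate USP expectancy, assuming e = sqrt(B * P * r * s * N)."""
    return math.sqrt(B * P * r * s * N)

def traditional_sample_size(N):
    """Sample size sqrt(2N) quoted in the example for the traditional plan [8]."""
    return math.sqrt(2.0 * N)

s = 0.06            # standard defects per unit
P = 0.04            # process control factor
r = 280.0 / 430.0   # field maintenance cost per defect / audit cost per unit of expectancy
N = 2820            # production

e = usp_expectancy(P, r, s, N)   # expectancy, close to 3.0
n = e / s                        # sample size, close to 50
print(round(e, 2), round(n, 1), round(traditional_sample_size(N), 1))
```

Working in expectancy first and converting to a sample size only at the end mirrors the text's point that expectancy, not sample size, is the natural metric.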
USP--A foundation for sampling inspection
The general concept of USP is to select product inspections which minimize
inspection costs plus field maintenance costs subject to an inspection budget
constraint. The tradeoff between inspection and maintenance costs is a result of
the feedback model of Figure 1. These concepts are fundamental and can be
applied to handle many sampling inspection complexities beyond those presented
in this chapter. The following complexities have been treated in Bell Laboratories
and Bell Communications Research memoranda by the author and others:
1. Demerits. Defects are weighted by demerits according to their seriousness.
Demerits can be transformed into equivalent defects [1, p. 229].
2. Pass through. An in process audit can detect defects that cannot pass
through to the field because of subsequent processing. So the audit does not
prevent field maintenance costs associated with those defects. For details see
[12, 13].
3. Clustering. The standard defects per unit, s, can be very large; e.g., s = 10
for the installation audit of a whole switching system (a cluster). In this case, true
quality may vary from system to system; so, the Poisson assumption does not
hold. In this case, the audit expectancy is increased by a cluster factor to account
for the between cluster heterogeneity.
4. Fractional coverage. Sometimes it makes sense to inspect only a fraction of
a unit of product. For example, a fraction of the connections on a frame of wired
equipment. In this case, the decision variable is two-dimensional: number of
frames and fraction of each frame.
5. Attribute rating. Sometimes attributes of products are rated rather than products themselves. For example, solder connections on all products at a factory are
an attribute rating class. Another example is product labeling. When a frame is
inspected, data for several attribute rating classes is generated. The decision
variable is now multi-dimensional: number of frames and fractions of the frame
for all the attributes rated.
6. Reliability. For reliability audits the decision variable is two-dimensional:
number of units and time on test. Also, quality is defined by the failure rate curve
rather than defects per unit. BELLCORE-STD-200 [14] is a reliability audit
plan based on these concepts.
7. Lot-by-lot acceptance sampling. Mathematically, there is no difference
between an audit and lot-by-lot acceptance sampling. An audit period is analogous
to a lot and an exception report is analogous to a rejected lot. Often, acceptance
sampling is effective because of feedback rather than the screening of rejected lots.
Application [10] of QMP/USP to acceptance sampling yields a plan that has
many features in common with MIL-STD-105D, but also some important differences.
8. Skip lot acceptance sampling. Here the decision variable is two-dimensional:
fraction of lots to sample and sample size per lot. This is the right approach when
there is a large inspection setup cost for each lot. BELLCORE-STD-100
[12, 13, 16] is a lot-by-lot/skip lot plan based on these concepts.
No doubt the list goes on and on. For example, the application of QMP/USP
to sequential and multi-stage sampling has not been investigated.
1.2. Summary of QMP
After the product samples are chosen, they are inspected for conformance to
engineering and manufacturing requirements. These inspections produce data in
the form of defects. QMP is a method of analyzing a time series of defect data.
The details of QMP are in [1, 12, 15, 17].
As an introduction to QMP, consider Figure 2. This is a comparison of the
QMP reporting format (a) with the old T-rate reporting format (b) which is based
on the Shewhart control chart [6]. Each year is divided into eight periods. In
Figure 2b, the T-rate is plotted each period and measures the difference between
the observed and standard defect rates in units of sampling standard deviation
(given standard quality). The idea is that if the T-rate is, e.g., less than minus two
or three, then the hypothesis of standard quality is rejected.
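As a rough illustration of this definition, the sketch below computes a T-rate from a defect count and an expectancy. It assumes Poisson sampling, so that the sampling standard deviation under standard quality is the square root of the expectancy, and it assumes the sign convention that negative values mean worse than standard; both function names are hypothetical.

```python
import math

def t_rate(defects, expectancy):
    """(expected - observed) / sqrt(expected); negative means worse than standard."""
    return (expectancy - defects) / math.sqrt(expectancy)

def index_for_t_rate(t, expectancy):
    """Sample index (observed / expected defects) that produces a given T-rate."""
    return 1.0 - t / math.sqrt(expectancy)

print(t_rate(defects=5, expectancy=5))      # exactly at standard
print(round(index_for_t_rate(-3, 5), 2))    # index that gives a T-rate of -3 at expectancy 5
```

Under these assumptions, an expectancy of five pairs a sample index of about 2.34 with a T-rate of -3 and about 3.24 with a T-rate of -5, which agrees with the values quoted in the discussion of QMP dynamics.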
The T-rate is simple, but it has problems. For example, it does not measure
quality. A T-rate of - 6 does not mean that quality is twice as bad as when the
T-rate is - 3. The T-rate is only a measure of statistical evidence with respect to
the hypothesis of standard quality. Also, implicit in the use of the T-rate is the
assumption of Normality. For small sample sizes, the Normal distribution is a
poor model for the distribution of defects. QMP was designed to alleviate the
problems with the T-rate and to use modern statistics.
Under QMP, a box and whisker plot (Figure 2a) is plotted each period. The box
plot is a graphical representation of the posterior distribution of current population
quality on an index scale. The index value one is the standard on the index scale
Fig. 2. QMP vs. the T-rate (Shewhart Control Chart).
and the value two means twice as many defects as expected under the quality
standard. The posterior probability that the population index is larger than the top
whisker is 0.99. The top of the box, the bottom of the box and the bottom whisker
correspond to probabilities of 0.95, 0.05, 0.01 respectively.
For the Western Electric application of QMP, exceptions are declared when
either the top of the box or the top whisker is below standard (i.e., greater
than one on the index scale). This makes the producer's risk small, as explained
in Section 1.1.
The posterior distribution of current population quality is derived under the
assumption that population quality varies at random from period to period. This
random process has unknown process average and process variance. These two
unknown parameters have a joint prior distribution, which describes variation
across product.
The heavy 'dot' is a Bayes estimate of the process average; the 'x' is the
observed value in the current sample; and the 'dash' is the posterior mean of the
current population index and is called the Best Measure of current quality. This
is like an empirical Bayes estimate--a shrinkage towards the process average. The
process averages ('dots') are joined to show trends.
Although the T-rate chart and the QMP chart sometimes convey similar
messages, there are differences. The QMP chart provides a measure of quality;
the T-rate chart does not. For example, in period 6, 1978 both charts imply that
the quality is substandard, but the QMP chart also implies that the population
index is somewhere between one and three. Comparing period 6, 1977 with
period 4, 1978 reveals similar T-rates, but QMP box plots with different messages.
The QMP chart is a modern control or feedback chart for defect rates. However,
the important outputs of QMP are the estimated process distribution (sometimes
called the prior distribution) and the posterior distribution of current quality. In
other decision making contexts, such as Bayesian acceptance sampling [11], these
distributions could be used to optimally inspect quality into the product via the
screening of rejected lots [10]. So QMP provides a practical tool for applying
Bayesian acceptance sampling plans.
1.3. The QMP and USP models in perspective
For the USP model, population quality is either at the standard level (1) or at
the substandard level (b), Figure 1. For the QMP model, population quality varies
at random from period to period, with an unknown process average and process
variance. The two models seem to be inconsistent. But, there is a reason for the difference.
The QMP model is used primarily for statistical inference (the posterior distribution of current population quality). This inference should be robust to the real
behavior of the population quality process. The population quality process could
be very complex and contain elements of (i) random variation, (ii) random walks,
(iii) drifts, (iv) auto-correlation, and (v) feedback from out of control signals. But,
no matter what the process, it has a long run average and a long run variance.
The simple QMP model captures the first-order essence of any process. So, the
QMP inference has a kind of first-order robustness.
On the other hand, the reason for an audit is to provide a monitoring tool to
guide quality improvement programs. Therefore, the allocation of inspection
resources to the many audits should be based on a model of these monitoring
and quality improvement activities, e.g., the USP model.
The link between the two models is the USP process control factor, P, which
is defined as the probability per period of a change to the substandard level, b.
QMP is used to estimate this factor by the formula
P = Conditional probability that the population quality in the next
period will be worse than b, given all the data through the current period.
2. USP details
This section contains the important elements of the derivation in [4].
2.1. General theory
For a given product, define
e = Expectancy of the audit,
A(e) = Audit cost for audit of size e,
S(e) = Savings in field maintenance cost due to an audit of size e; S(0) = 0.
We assume:
(i) S' (e) and A' (e) > 0 exist for e > 0.
(ii) (dS/dA)(e) is monotonically decreasing for e > 0.
We deal with many products simultaneously; so, for product i, we use the
subscript i. The general USP problem is to select \(e_i\), i = 1, ..., I, to minimize
\(\sum_i [A_i(e_i) - S_i(e_i)]\) subject to the constraints: (i) \(e_i \ge 0\), (ii) \(\sum_i A_i(e_i) \le M\).
From Kuhn-Tucker theory [9], there exists a Lagrange multiplier, \(\lambda\), so that the
optimal \(e_i\)'s satisfy:
\[ A_i'(e_i) - S_i'(e_i) + \lambda A_i'(e_i) \ge 0, \quad i = 1, \ldots, I, \]
\[ e_i [A_i'(e_i) - S_i'(e_i) + \lambda A_i'(e_i)] = 0, \quad i = 1, \ldots, I, \]
\[ \sum_{i=1}^{I} A_i(e_i) \le M. \]
For \(\lambda \ge 0\), define
\[ e_i(\lambda) = \begin{cases} 0 & \text{if } S_i'(0)/A_i'(0) < 1 + \lambda, \\ \text{the solution to } S_i'(e_i)/A_i'(e_i) = 1 + \lambda & \text{otherwise.} \end{cases} \]
A very simple algorithm for solving the problem is:
1. Choose a value for \(\lambda\).
2. If \(\sum_i A_i(e_i(\lambda)) = M\), stop; otherwise increase or decrease \(\lambda\) according to whether
\(\sum_i A_i(e_i(\lambda))\) is greater or less than M.
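The two-step search above can be sketched as a bisection on the multiplier. The example is illustrative only: it assumes a hypothetical product model with linear audit cost A(e) = c e and savings S(e) = k e/(h + e), chosen because dS/dA is then decreasing as assumption (ii) requires; the tuple layout (c, k, h) and all function names are inventions for this sketch. It also stops at a multiplier of zero when the budget is not binding, which the text does not spell out.

```python
import math

def e_of_lambda(lam, c, k, h):
    """Optimal expectancy for one hypothetical product at multiplier lam.

    A(e) = c*e and S(e) = k*e/(h+e), so S'(e)/A'(e) = k*h / (c*(h+e)**2).
    """
    if k / (c * h) < 1.0 + lam:      # S'(0)/A'(0) < 1 + lam: no audit
        return 0.0
    return math.sqrt(k * h / (c * (1.0 + lam))) - h

def total_audit_cost(lam, products):
    """Sum of A_i(e_i(lam)) over all products."""
    return sum(c * e_of_lambda(lam, c, k, h) for c, k, h in products)

def solve_lambda(products, M):
    """Find the multiplier at which the audit budget M is just met."""
    if total_audit_cost(0.0, products) <= M:
        return 0.0                   # budget not binding
    lo, hi = 0.0, 1.0
    while total_audit_cost(hi, products) > M:
        hi *= 2.0                    # spending falls as lam rises
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if total_audit_cost(mid, products) > M:
            lo = mid
        else:
            hi = mid
    return hi

products = [(430.0, 5000.0, 2.0), (300.0, 9000.0, 1.5)]   # (c, k, h), made-up values
lam = solve_lambda(products, M=2000.0)
print(lam, [e_of_lambda(lam, *p) for p in products])
```

Bisection works here because total audit spending decreases monotonically as the multiplier increases.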
2.2. USP application
Most of the notation used in this section is defined in Section 1.1.
The audit cost function: A(e). We assume the linear audit cost function
\[ A(e) = C_a e. \]
The audit savings function: S(e). Let F(e) denote the field maintenance cost
associated with an audit of size e. Then
\[ S(e) = F(0) - F(e). \]
Now, according to Section 1.1,
\[ F(e) = \theta(e) s C_f N, \]
where
\(\theta(e)\) = Process average (on an index scale) that results from an audit of size e.
Recall that \(\theta(e)\) arises from the quality behavior model described in Figure 1.
When the process is at the standard level, there is a probability, P, per period of
a change to the substandard level (b). So, the expected waiting time until this
change is E[Y] = 1/P. When the process is at the substandard level (b), we
assume that in each period, the audit detects the substandard level if the number
of defects, x, exceeds an acceptance number, c (see the footnote). We assume that x has a Poisson
distribution with mean nsb = eb; so, the expected waiting time until detection
depends on e and we define D(e) by E[Z] = 1/D(e). Note that D(e) can be
interpreted as an average detection power. Hence,
\[ \theta(e) = \left[\frac{E(Y)}{E(Y) + E(Z)}\right](1) + \left[\frac{E(Z)}{E(Y) + E(Z)}\right](b) = 1 + \left[\frac{P}{P + D(e)}\right](b - 1). \]
Putting all this together yields
\[ S(e) = [\theta(0) - \theta(e)] s C_f N. \]
Footnote: When QMP is used for detection, the acceptance number is a random variable.
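Analysis. From the general theory, \(e(\lambda)\) is often the solution to
\[ S'(e)/A'(e) = 1 + \lambda. \]
For the USP application, this equation is
\[ \frac{P + D(e)}{[D'(e)]^{1/2}} = \left[\frac{(b - 1)}{(1 + \lambda)} P r N s\right]^{1/2}. \]
Furthermore, the condition \([S'(0)/A'(0)] < 1 + \lambda\) is
\[ \left[\frac{(b - 1)}{D'(0)(1 + \lambda)} P r N s\right]^{1/2} < P/D'(0). \]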
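The USP form of the optimality condition follows from the feedback model in a few lines. The derivation below is a sketch that uses only the definitions of this section, namely A(e) = C_a e, S(e) = F(0) - F(e), F(e) = \theta(e) s C_f N and r = C_f/C_a, together with the long-run process average \theta(e) = 1 + P(b - 1)/(P + D(e)) implied by the waiting times E[Y] = 1/P and E[Z] = 1/D(e).

```latex
\begin{align*}
S'(e) &= -\theta'(e)\, s C_f N
       = \frac{P\,(b-1)\,D'(e)}{[P + D(e)]^{2}}\; s C_f N, \\
A'(e) &= C_a, \\
\frac{S'(e)}{A'(e)} &= \frac{P r N s\,(b-1)\,D'(e)}{[P + D(e)]^{2}} = 1 + \lambda
\quad\Longrightarrow\quad
\frac{[P + D(e)]^{2}}{D'(e)} = \frac{(b-1)}{(1+\lambda)}\; P r N s .
\end{align*}
```

Taking square roots gives the equation for e(\lambda). Evaluating the ratio at e = 0, under the additional assumption D(0) = 0 (no audit, no detection), gives the no-audit condition.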
2.3. Approximate detection power function
The simplest approximate average detection power function which satisfies
condition (ii) of the general theory is
\[ D(e) = \frac{e}{a + e}. \]
For this case,
\[ e(\lambda) = \max\left\{0, \frac{\sqrt{B P r N s} - aP}{1 + P}\right\}, \]
where
\[ B = \frac{a(b - 1)}{(1 + \lambda)}. \]
B is called the budget control factor and is monotonically related to the Lagrange
multiplier, \(\lambda\), associated with the budget constraint.
For practical application of USP, we assume that P is small and use the simple
approximate solution
\[ e = \sqrt{B P r N s}. \]
For the Western Electric application, B = 2 is often used.
3. QMP details
3.1. Data format
For rating period t [t = 1, ..., T (current period)], the audit data is of the form
\(n_t\) = Audit sample size,
\(x_t\) = Defects observed in the audit sample,
\(e_t\) = Expected defects in the sample when the quality standard is met (called expectancy) = \(s n_t\),
where
s = Standard defects per unit.
In practice, defectives or weighted defects are sometimes used as the quality
measure. These cases can be treated via a transformation [1, p. 229] to equivalent defects.
We express the defect rate as a multiple of the standard defect rate; i.e., with
the index
\[ I_t = x_t/e_t. \]
So \(I_t = 2\) means that we observed twice as many defects as expected.
3.2. Statistical foundations of QMP
The formulas used for computing the QMP box plots shown in Figure 2a were
derived by an approximate Bayesian analysis of a statistical model [1]. The
assumptions of the model are:
(1) \(x_t\) is the observed value of a random variable, \(X_t\), whose sampling distribution is Poisson with mean \(n_t \lambda_t\), where \(\lambda_t\) is the true defect rate per unit. For
convenience, we reparameterize \(\lambda_t\) on an index scale as
\[ \theta_t = \text{True quality index} = \lambda_t/s. \]
So the standard value of \(\theta_t\) is 1.
(2) \(\theta_t\), t = 1, ..., T, is a random process (or random sample) from a Gamma
distribution with
\(\theta\) = process average,
\(\gamma^2\) = process variance,
which are unknown. Assumption (2) makes this a parametric empirical Bayes model.
(3) \(\theta\) and \(\gamma^2\) have a joint prior distribution. The physical interpretation of this
prior is that each product has its own value of \(\theta\) and \(\gamma^2\) and these vary at random
across products.
Assumption (3) makes this a Bayes empirical Bayes model. We never specify
the form of this joint prior; because, in our heuristic derivation, only its moments
are used.
This is now a full Bayesian model. It specifies the joint distribution of all
variables. The quality rating in QMP is based on the conditional or posterior
distribution of \(\theta_T\) given \(x = (x_1, \ldots, x_T)\).
3.3. Posterior distribution of current quality
The exact posterior distribution of \(\theta_T\) is computationally impractical. So we
approximate the posterior mean and variance of \(\theta_T\). The complex approximate
formulas given in the Appendix are those published in [1]. They resulted from a
lengthy fine tuning process conducted over 20000 audit data sets during a two
year trial of QMP. Improved QMP formulas are published in [12] and derived
in [15]. In this section, we provide only the structure of the formulas.
The posterior mean is approximately
\[ E[\theta_T \mid x] = \hat\theta_T = \hat\omega_T \hat\theta + (1 - \hat\omega_T) I_T, \]
where
\[ \hat\theta = E(\theta \mid x), \qquad \hat\omega_T = E(\omega_T \mid x), \qquad \omega_T = \frac{\theta/e_T}{\theta/e_T + \gamma^2}. \]
The posterior mean, \(\hat\theta_T\), is a weighted average of the estimated process
average, \(\hat\theta\), and the sample index, \(I_T\). It is the dynamics of the weight, \(\hat\omega_T\),
that causes the Bayes estimate to work so well. For any t, the sampling
variance of \(I_t\) (under the Poisson assumption) is \(\theta_t/e_t\). The forecasted value of this
is \(E[\theta_t/e_t] = \theta/e_t\). So the weight, \(\omega_T\), is
\[ \frac{[\text{Forecasted sampling variance}]}{[\text{Forecasted sampling variance}] + [\text{Process variance}]}. \]
If the process is stable, relative to the sampling variance, then the process
variance is relatively small and the weight is mostly on the process average; but
if the process is unstable, then the process variance is relatively large and the
weight is mostly on the current sample index. The reverse is true of the sampling
variance. If it is large (e.g., small expectancy), then the current data is weak and
the weight is mostly on the process average; but, if the sampling variance is small
(e.g., large expectancy), then the weight is mostly on the current sample index. In
other words, \(\omega_T\) is monotonically increasing with the ratio of sampling variance
to process variance.
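The behavior of the weight can be illustrated numerically. The sketch below treats the process average and process variance as known, which is a simplification: QMP itself uses posterior expectations of these quantities. The function names are illustrative.

```python
def shrinkage_weight(theta, gamma2, e):
    """Weight on the process average: sampling variance / (sampling + process variance)."""
    sampling_var = theta / e          # forecasted sampling variance of the index
    return sampling_var / (sampling_var + gamma2)

def best_measure(theta, gamma2, e, index):
    """Weighted average of the process average and the current sample index."""
    w = shrinkage_weight(theta, gamma2, e)
    return w * theta + (1.0 - w) * index

# Process at standard (theta = 1), process variance 0.2, expectancy 5.
print(shrinkage_weight(1.0, 0.2, 5.0))        # equal sampling and process variance
print(best_measure(1.0, 0.2, 5.0, index=3.0))

# A larger audit (expectancy 50) shifts weight toward the current sample index.
print(shrinkage_weight(1.0, 0.2, 50.0))
```

With equal sampling and process variance the weight is one half, so an observed index of 3 is pulled halfway back to the process average.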
The posterior variance of \(\theta_T\) is approximately
\[ V[\theta_T \mid x] = V_T = (1 - \hat\omega_T)\,\hat\theta_T/e_T + \hat\omega_T^2\, V(\theta \mid x) + (\hat\theta - I_T)^2\, V(\omega_T \mid x). \]
If the process average and variance were known, then the posterior variance of
\(\theta_T\) would be \((1 - \omega_T)\hat\theta_T/e_T\), which is estimated by the first term in \(V_T\). But
since the process average and variance are unknown, the posterior variance has
two additional terms. One contains the posterior variance of the process average
and the other contains the posterior variance of the weight.
The first term dominates. A large \(\hat\omega_T\) (relatively stable process), a small \(\hat\theta_T\)
(good current quality) and a large \(e_T\) (large audit) all tend to make the posterior
variance of \(\theta_T\) small. If \(\hat\omega_T\) is small, then the second term is negligible. This is
because the past data is not used much, so the uncertainty about the process
average is irrelevant. If the current sample index is far from the process average,
then the third term can be important. This is because outlying observations add
to our uncertainty.
If the process average and variance were known, then the posterior distribution
would be Gamma, so we approximate the form of the posterior distribution by
a Gamma. The parameters of the fitted Gamma distribution are \(\alpha\) = shape parameter
= \(\hat\theta_T^2/V_T\), \(\tau\) = scale parameter = \(V_T/\hat\theta_T\). And the approximate posterior
distribution function is
\[ \Pr[\theta_T \le z \mid x] = G_\alpha(z/\tau) = \int_0^{z/\tau} \frac{1}{\Gamma(\alpha)}\, x^{\alpha - 1} e^{-x}\, dx. \]
3.4. Q M P box and whisker plot
For the box and whisker plots shown in Figure 2a, let \(I_{99\%}\), \(I_{95\%}\), \(I_{05\%}\),
and \(I_{01\%}\) denote the top whisker, top of box, bottom of box, and bottom whisker
respectively. These percentiles are formally defined, for example, by
\[ 1 - G_\alpha(I_{95\%}/\tau) = 0.95, \text{ etc.} \]
So, a posteriori, there is a 95 percent chance that \(\theta_T\) is larger than \(I_{95\%}\).
3.5. Exception reporting
For QMP, there are two kinds of exceptions.
(a) Alert:
\[ I_{95\%} > 1 \text{ but } I_{99\%} \le 1; \]
(b) Below Normal:
\[ I_{99\%} > 1. \]
Products which meet these conditions are highlighted in an exception report.
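A minimal sketch of how the box plot percentiles and the exception rules might be computed from a posterior mean and variance is given below. It assumes the method-of-moments Gamma fit (shape = mean squared over variance, scale = variance over mean) and reads the Below Normal rule as "top whisker worse than standard", following the description in Section 1.2. The incomplete gamma function is evaluated by its power series, a choice made here only to keep the example dependency-free; all names are illustrative.

```python
import math

def gamma_cdf(alpha, y):
    """Gamma distribution function with shape alpha and unit scale (power series)."""
    if y <= 0.0:
        return 0.0
    term = 1.0 / alpha
    total = term
    n = 0
    while term > 1e-16 * total:
        n += 1
        term *= y / (alpha + n)
        total += term
    return total * math.exp(-y + alpha * math.log(y) - math.lgamma(alpha))

def upper_point(mean, var, p):
    """Index I such that the fitted Gamma puts probability p above I."""
    alpha = mean ** 2 / var          # shape parameter
    tau = var / mean                 # scale parameter
    lo, hi = 0.0, mean + 10.0 * math.sqrt(var)
    for _ in range(100):             # bisection: the tail probability falls as I rises
        mid = 0.5 * (lo + hi)
        if 1.0 - gamma_cdf(alpha, mid / tau) > p:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

def classify(mean, var):
    """Exception status from the top whisker (99%) and top of box (95%)."""
    i99 = upper_point(mean, var, 0.99)
    i95 = upper_point(mean, var, 0.95)
    if i99 > 1.0:
        return "Below Normal"
    if i95 > 1.0:
        return "Alert"
    return "Normal"

for mean, var in [(1.0, 0.04), (1.5, 0.09), (2.5, 0.25)]:
    print(mean, var, classify(mean, var))
```

Because the index scale treats larger values as worse, the 99 percent point is the smaller of the two; a product is flagged only when almost all of the posterior mass lies above the standard value of one.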
3.6. USP process control factor, P
As mentioned in Section 1.3, QMP is used to estimate the USP process control
factor, P, with the expression
\[ P = \Pr[\theta_{T+1} > b \mid x]. \]
We approximate the conditional distribution of \(\theta_{T+1}\) (quality in the next period),
given x, by a Gamma distribution fitted by the method of moments.
The general form of the moments is
\[ E[\theta_{T+1} \mid x] = \hat\theta, \]
\[ V[\theta_{T+1} \mid x] = V_{T+1} = E[\gamma^2 \mid x] + V[\theta \mid x] \]
(see the Appendix for detailed formulas). If the process average and variance were
known, then the conditional mean and variance of \(\theta_{T+1}\) would be simply \(\theta\) and
\(\gamma^2\). But since they are unknown, we use Bayes estimates of \(\theta\) and \(\gamma^2\) and add the
term \(V[\theta \mid x]\) to the conditional variance to account for the uncertainty in our
estimate of \(\theta\).
The parameters of the fitted Gamma distribution are \(\alpha_1\) = shape parameter = \(\hat\theta^2/V_{T+1}\), \(\tau_1\) = scale parameter = \(V_{T+1}/\hat\theta\). So,
\[ P = 1 - G_{\alpha_1}(b/\tau_1). \]
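The tail probability can also be approximated by simulation, which is useful as a check on a numerical Gamma routine. The sketch below is not the method of the text, which evaluates the Gamma distribution function directly; it substitutes Monte Carlo sampling from the fitted Gamma, and the function name, the number of draws and the seed are arbitrary choices.

```python
import math
import random

def process_control_factor(theta_hat, v_next, b, draws=200_000, seed=0):
    """Monte Carlo estimate of Pr[next-period quality index > b]."""
    alpha1 = theta_hat ** 2 / v_next      # shape, by the method of moments
    tau1 = v_next / theta_hat             # scale, by the method of moments
    rng = random.Random(seed)
    hits = sum(rng.gammavariate(alpha1, tau1) > b for _ in range(draws))
    return hits / draws

# With mean 1 and variance 1 the fitted Gamma is exponential,
# so the exact tail probability beyond b = 2 is exp(-2).
estimate = process_control_factor(theta_hat=1.0, v_next=1.0, b=2.0)
print(estimate, math.exp(-2.0))
```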
3.7. QMP dynamics
The Best Measure and the box plot percentiles are nonlinear functions of all
the data. So the dynamic behavior of these results is interesting.
Fig. 3. Dynamics of sudden degradation for expectancy = 5.
Dynamics of sudden degradation
Since QMP uses a long run average, it is natural to ask about responsiveness
of the box plot to sudden change. If there is a sudden degradation of quality,
Quality Assurance would like to detect it.
The history data in Figure 3a is a typical history for a product which is meeting
the quality standard. The expectancy of five is average for Western Electric audits.
The history is plotted on a T-rate chart along with six possible values for the
current T-rate (labeled A through F). So, the current period is anywhere from
standard (T-rate = 0) to well below standard (Index = 3.24, T-rate = -5).
Figure 3b shows the six possible current results plotted in QMP box plot form.
The box plot labeled A is the result of combining current result A with the past
five periods. The box plot labeled F is the result of combining current result F
with the same past history.
As you can see, the QMP result becomes Alert at about T-rate = -3 and
becomes Below Normal at about T-rate = -4. For the T-rate method of rating,
you would have a Below Normal at T-rate = -3. The good past history has the
effect of tempering the result of a T-rate = -3.
It is informative to study the relative behavior of the current sample index,
process average and Best Measure as you go from current value A to F. The
current index changes a lot (from 1.00 to 3.24) and the process average changes
a little (from 1.00 to 1.38), both in a linear way. The Best Measure also changes
substantially but in a nonlinear way. It changes slowly at first and then speeds
up. This is because the weight on the process average is changing from 0.71 to
0.32. The weight changes, because as the current data becomes more inconsistent
with the past, the process is becoming more unstable, while the sampling variance
is changing slowly in proportion to the process average.
Bogie contour plot
For a fixed past history and current expectancy, there is a Below Normal Bogie
for the current sample index. If the sample index is worse than the Bogie, then
the product is Below Normal. Figure 4 is a contour plot of the Bogie for an
expectancy of five. The axes are the mean and variance of the five past values
of the sample index; i.e.,
\[ \bar I = \tfrac{1}{5} \sum I_t, \qquad S^2 = \tfrac{1}{4} \sum (I_t - \bar I)^2, \]
where \(I_t\) is the sample index in past period t. For given values of \(\bar I\) and \(S^2\), we
used a standard pattern of \(I_t\)'s to compute the Bogie. The results are insensitive
to the pattern. The dashed curve is an upper bound for \(S^2\).
To see how the contour plot works, consider an example. Suppose \(\bar I = 0.8\) and
\(S^2 = 0.7\). The point (0.8, 0.7) falls on the contour labeled 2.6. This means that
if the current sample index exceeds 2.6, then the product will be Below Normal.
The contour labeled 2.6 is the set of all pairs \((\bar I, S^2)\) that yield a Bogie of 2.6.
Fig. 4. Below Normal Bogie contour plot for expectancy = 5.
This contour plot summarizes the Below Normal behavior of QMP for an
expectancy of five. As \(\bar I\) gets larger than one, the Bogie gets smaller. If \(\bar I\) exceeds
1.6, then the Bogie is smaller than 2.34, which corresponds to a T-rate of -3.
In this case, QMP Below Normal triggers earlier than a T-rate of -3.
For \(\bar I\) less than 1.4, as \(S^2\) gets larger, the Bogie gets smaller. This is because
large \(S^2\) implies large process variance, which makes an observed deviation more
likely to be significant.
For very small \(S^2\), as you move from \(\bar I = 0\) to \(\bar I = 1\), the Bogie increases from
2.6 (T-rate = -3.6) to 2.9 (T-rate = -4.2). This is an apparent paradox. The
better the process average, the less cushion the producer gets. This is not a
paradox, but an important characteristic of QMP. With QMP we are making an
inference about current quality, not long-run quality. If we have a stable past with
\(\bar I = 0.2\), and we suddenly get a sample index of 2.7, then this is very strong
evidence that the process has changed and is worse than standard. If we have
a stable past with \(\bar I = 1\), and we suddenly get a sample index of 2.7, then the
evidence of change is not as strong as with \(\bar I = 0.2\). The weight we put on the
past data depends on how consistent the past is with the present.
The Bogie contour plots provide the engineer with a manual tool to forecast the
number of defects that will be allowed by the end of a period.
Statistical jitter
Figure 5 illustrates statistical jitter in the T-rate. The expectancies are about 0.1,
so the T-rate jitters every time a defect occurs. The small expectancies are revealed
by the long box plots. Period 8, 1977 was Below Normal for the T-rate, but
normal for QMP.
Fig. 5. Statistical jitter in the T-rate.
QMP formulas
The QMP formulas are derived in [1]. In this appendix, we state the formulas
in the notation of [1, Section 4.5].
For rating period t [t = 1, ..., T (current period)] (see footnote 2), the raw audit data is of the
form
\(n_t\) = sample size,
\(x_t\) = Defects observed in the audit sample.
The mean and variance of \(x_t\) given standard quality (Est and Vst) are the same,
because \(x_t\) is Poisson. So,
\(x_t\) = Equivalent defects = defects,
\(e_t\) = Equivalent expectancy = expectancy.
Let x denote the set of data, \(\{x_t, t = 1, \ldots, T\}\).
In QMP, the prior distribution of the process average manifests itself as 'prior
data', which we denote \(x_0 = e_0 = 1\). Now for t = 0, 1, ..., T, compute the following:
Sample index:
\[ I_t = x_t/e_t, \]
Weighting factors for computing process average and variance:
1 + et/4
2.5 + (1.5)e t + (0.22)et2
Corresponding weights:
P, =
qt = gt
Now let \(\sum\) denote \(\sum_{t=0}^{T}\) and compute the following:
Process average:
\[ \hat\theta = \sum p_t I_t. \]
Footnote 2: The formulas also apply to lot-by-lot acceptance sampling data.
Degrees of freedom:
\[ df = \frac{2\left[\sum q_t (1/e_t)\right]^2}{\sum q_t^2 (1/e_t^3 + 2/e_t^2)} - 1. \]
Total observed variance:
( 1 4 . 4 ) a 2 + (df + 1) Y, qt(It - O) 2
Estimated average sampling variance:
\[ \sigma^2 = \sum q_t (I_t/e_t). \]
Variance ratio:
\[ R = S^2/\sigma^2. \]
F, G, and H:
a = 4.5 + ~ d f ,
B =
T(O) = 1,
a = -
H= [~][(
1)(1) + 1].
Current sampling variance:
Sampling variance ratio:
= 2 .
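Process variance:
\[ \hat\gamma^2 = F S^2 - \sigma^2 = (FR - 1)\sigma^2. \]
Weight:
\[ \hat\omega = \sigma^2/(\sigma^2 + \hat\gamma^2) = 1/(FR). \]
Best measure of current quality:
\[ \hat\theta_T = \hat\omega_T \hat\theta + (1 - \hat\omega_T) I_T. \]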
Posterior variance of current quality (07-):
O^ - IT) 2
[(r T - 1 ) ~ +
1] 4
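Posterior variance of future quality (\(\theta_{T+1}\)):
\[ V_{T+1} = [H S^2 - \sigma^2]\left[1 + \sum p_t^2\right] + \hat\theta \sum (p_t^2/e_t). \]
Posterior distribution of \(\theta_T\):
\[ Q(z) = \Pr[\theta_T > z \mid x] = 1 - G_\alpha(z/\tau), \]
\[ G_\alpha(y) = \int_0^y \frac{1}{\Gamma(\alpha)}\, x^{\alpha - 1} e^{-x}\, dx \quad (\text{the Gamma distribution function}). \]
Posterior distribution of \(\theta_{T+1}\):
\[ \alpha_1 = \hat\theta^2/V_{T+1}, \qquad \tau_1 = V_{T+1}/\hat\theta, \]
\[ P(z) = \Pr[\theta_{T+1} > z \mid x] = 1 - G_{\alpha_1}(z/\tau_1). \]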
[1] Hoadley, B. (1981). The Quality Measurement Plan. Bell System Technical J. 60 (2), 215-271.
[2] Hoadley, B. (1981). Empirical Bayes analysis and display of failure rates. In: Proceedings of the
IEEE 31st Electronic Components Conference, May 11-13, 1981, Atlanta, GA, pp. 499-505.
[3] Hoadley, B. (1986). Quality Measurement Plan. Encyclopedia of Statistical Sciences, Vol. 7.
Wiley, New York.
[4] Hoadley, B. (1981). The Universal Sampling Plan. In: 35th Annual Quality Congress Transactions
of ASQC, May 27-29, 1981, San Francisco, CA, pp. 80-87.
[5] Shewhart, W. A. (1958). Nature and Origin of standards of quality. Bell System Technical J. 37
(1), 1-22.
[6] Shewhart, W. A. (1931). Economic Control of Quality of Manufactured Product. Van Nostrand,
New York.
[7] Dodge, H. F. (1928). A method of rating manufactured product. Bell System Technical J. 7,
[8] Dodge, H. F. and Torrey, M. N. (1956). A check inspection and demerit rating plan, Indust.
Qual. Control 13 (1), 1-8.
[9] Hillier, F. S. and Lieberman, G. J. (1974). Operations Research. Holden-Day, San Francisco,
CA, Chapter 18.
[10] Buswell, G. and Hoadley, B. (1985). QMP/USP: A modern alternative to MIL-STD-105D.
Naval Logistics Quart. 32 (1), 95-111.
[11] Guthrie, D. Jr. and Johns, M. V. Jr. (1959). Bayes acceptance sampling procedures for large
lots. Ann. Math. Statist. 30, 896-925.
[12] Bell Communications Research (1985). BELLCORE-STD-100 and STD-200 inspection
resource allocation plans. Technical Reference TR-TSY-000016 Issue 1.
[13] Brush, G. G., Guyton, D. A., Hoadley, B., Huston, W. B. and Senior, R. A. (1984). BELL-STD-100:
An inspection resource allocation plan. IEEE Communications Society Global Telecommunications
Conference, November 26-29, 1984, Atlanta, GA.
[14] Guyton, D. A. and Hoadley, B. (1985). BELLCORE-STD-200 system reliability test sampling
plan. In: Proceedings of the Annual Reliability and Maintainability Symposium, January 22-24,
1985, pp. 426-431.
[15] Hoadley, B. (1984). QMP theory and algorithms. Bell Communications Research Released
Technical Memorandum TM-TSY-000238, October 26, 1984 (available from author).
[16] Hoadley, B. (1986). The theory of BELLCORE-STD-100: An inspection resource allocation
plan. In: Transactions of the International Conference on Reliability and Quality Control. North-Holland, Amsterdam.
[17] Guyton, D. A. and Tang, J. (1986). Reporting Current Quality and Trends: T Plots. In:
Proceedings of the 40th Annual Quality Congress, Anaheim, CA, May 1986.
P. R. Krishnaiah and C. R. Rao, eds., Handbook of Statistics, Vol. 7
© Elsevier Science Publishers B.V. (1988) 375-402
Review About Estimation of Change Points
P. R. Krishnaiah and B. Q. Miao
1. Introduction
Suppose that we have a sequence of observations \(x_1, \ldots, x_N\) with distribution
functions \(F_1, \ldots, F_N\) respectively. Generally the subscripts of the x's may be considered as time, but one should note that the observations may not
necessarily be taken at equally spaced times. For example, \(x_1, x_2, x_3, x_4\) may be
the unemployment rates of a nation in March, June, September and December
of 1986, while \(x_5, x_6, \ldots\) are the rates of successive months of 1987. A moment \(\tau\)
is said to be a change point in the sequence if \(F_{\tau+1}\) is vastly different from \(F_\tau\)
in some way. The precise nature of change is determined by the problem considered.
Such situations occur in a wide range of practical endeavor. In quality control
one takes successive observations of the process to see if something happened
causing the quality of the items produced to deviate from its pre-set standard
value. In econometrics, the variables reflecting the financial situation may change
drastically after a crash of the stock market. In the study of growth in biology,
it is commonly assumed that there exists a log linear relationship between the size
of two body parts and that this relationship persists throughout stable growth
period. A structural shift in this relationship may indicate that a new phase may
be of considerable interest (see Huxley, 1972; Oshumi, 1960).
Although the case in which at most one change is allowed is by far the most
important, situations arise where several changes occur. So in the most general
setting we have the observations $x_1, \ldots, x_N$ grouped into non-overlapping sets
$\{x_1, \ldots, x_{\tau_1}\}$, $\{x_{\tau_1+1}, \ldots, x_{\tau_2}\}$, $\ldots$, $\{x_{\tau_{q-1}+1}, \ldots, x_N\}$
such that within each group the distributions of the observations remain relatively stable, while abrupt
changes (in some sense) occur at $\tau_1, \ldots, \tau_{q-1}$, which are the change points.
When two or more change points are allowed, the problem becomes much more complicated, though in a number of cases the methods developed in the one-change
case can also be applied with some modifications.
The statistical inference problem for a change point model consists of: (1) determining
whether any change point exists in the sequence; (2) estimating the
number and position(s) of the change point(s), and other quantities of interest which
are related to the change, for example the magnitude of the jump of the mean.
In a way, the classical two-sample and multi-sample problem can be considered
as a special case of the general change point problem described above. The
important difference lies in that in the classical case the possible positions of
change are precisely known in advance, while in the above formulation, the most
important question is to determine these possible positions. Page (1954) first
proposed and studied such a formulation.
One frequently used formulation of the change point problem is as follows:
$$x(t) = \mu(t) + e(t), \qquad 0 < t \le 1, \qquad (1.1)$$
where $x(t)$ is the observation taken at time $t$, $e(t)$ is the random error,
$E\,e(t) = 0$, and $\mu(t)$ is an unknown left-continuous and piecewise smooth function.
A point $t_0 \in (0, 1]$ satisfying
$$\mu(t_0) \ne \mu(t_0 + 0)$$
is said to be a jump change point. If $\mu(t_0) = \mu(t_0 + 0)$ but
$(d/dt)\mu(t_0 - 0) \ne (d/dt)\mu(t_0 + 0)$, then $t_0$ is called a first order continuous change
point, usually abbreviated to 'continuous change point'.
This formulation is nonparametric in nature because the unknown $\mu(t)$ is not
assumed to have any specific form. In order to develop a more fruitful theory, one
imposes the restriction that $\mu$ belongs to some parametric class. An important
example is the segmented regression model:
$$\mu(t) = \begin{cases} \beta_1' h_1(t), & 0 < t \le t_0, \\ \beta_2' h_2(t), & t_0 < t \le 1, \end{cases} \qquad (1.2)$$
where $\beta_j \in R^{p_j}$, $j = 1, 2$, are unknown vectors of regression coefficients and $h_j$ is a
continuous function taking values in $R^{p_j}$, $j = 1, 2$. In the sequel we use $\beta'$ to
denote the transpose of the vector $\beta$, while $\beta$ itself is regarded as a column
vector. If $\beta_1' h_1(t_0) \ne \beta_2' h_2(t_0)$, $t_0$ is a jump change point; otherwise it is a continuous change point, or not a change point at all. We can also consider the case
that $x(t)$ is multi-dimensional; then $\beta_1$, $\beta_2$ above are matrices rather than vectors.
One should note that in the above formulation, the emphasis is on the possible
change of the mean, which is undoubtedly the most important type of change
point problem. For such problems the general formulation given at the beginning
of this section can easily be put in the form of (1.1): we may take $x(i/N)$ as $x_i$,
$i = 1, \ldots, N$. Since the observations may not be taken at equally spaced moments,
the variable $t$ in the model (1.1) cannot in general be understood as a uniform
time scale.
As mentioned above, the random error process $\{e(t), 0 < t \le 1\}$ is assumed to
be centered at 0. An often-made assumption is that it is an independent process,
though models in which $e(t)$ has some simple dependence structure can also be
studied. In the sequel we shall always keep the independence assumption. A
much-studied case is that $e(t) \sim N(0, \sigma^2)$ ($\sigma^2$ unknown) and $h_j(t)$, $j = 1, 2$, are
polynomials in $t$, of which the linear case is by far the most important. If $h_j(t)$,
$j = 1, 2$, are linear and $\beta_1' h_1(t_0) \ne \beta_2' h_2(t_0)$, the model is called a switching regression model.
This model has been investigated by many authors (see Quandt, 1958, 1960; Quandt and
Ramsey, 1978; Robison, 1968, and others). Hudson (1966) and other authors
discussed the estimation and hypothesis testing of continuous change points using
maximum likelihood (ML) and least squares (LS) techniques.
While estimators of the change positions can be obtained by these methods, the
distributions of the estimators are usually very complex, and determining them exactly is out of the question even in the simplest case. Some asymptotic results
are available.
When the two sections of the regression in (1.2) intersect, denote the abscissa
of the intersection by $\gamma$ and its MLE by $\hat{\gamma}$. Feder (1975) proved the asymptotic normality of
$\hat{\gamma}$. Hinkley (1971) derived an asymptotic distribution for $\hat{\gamma}$ which gives a better fit
to the sampling distribution for moderate sample sizes. Inference about $\gamma$ is also
proposed. When the regression sections are parallel, the asymptotic distribution
of the MLE $\hat{\tau}$ of the jump change point was derived by Hinkley (1970) by
random-walk considerations. Unfortunately, $\hat{\tau}$ turns out to be inconsistent.
Besides MLE and LSE, Bayesian methods play an active role. Suppose that
the positions of the changes obey an arbitrarily specified a priori probability distribution appropriate to the special case being studied, and assume that the jumps of
the mean are independently and normally distributed random variables with mean
0. Chernoff and Zacks (1964) derived a Bayesian estimator of the current mean
$\mu_n$ for an a priori uniform distribution on the whole real line, using a quadratic loss
function. This approach was extended to the one-parameter exponential family of
distributions (see Kander and Zacks, 1966). Bhattacharyya and Johnson (1968)
proposed an optimal invariant test for certain location shift alternatives. Numerous authors in this field have used Bayesian methods under various assumptions
on the model.
A large portion of the results in this field are derived under the assumption that
the number $q$ of change points is known. Situations occur in which $q$ is unknown
and is to be estimated. In a small-sample setting, this problem does not lend itself
to a satisfactory treatment, and some asymptotic approaches have been proposed. Vostrikova (1981)
investigated this problem in the multivariate case, suggesting a binary segmentation
procedure to estimate $q$ and proving that the resulting estimates are consistent. Pettitt
(1980) suggested another ad hoc sequential procedure based on the cumulative sum
(CUSUM) method. Krishnaiah, Miao, Subramanyam and Zhao (1986), (1987a,b)
considered large sample properties of change point estimators and obtained the MLE of
$q$ and of the positions of the change points by model selection considerations. These
estimators are proved to be consistent.
Under the assumption of independence, normality and the constraint
$\mu_1 \ge \cdots \ge \mu_N$, Krishnaiah, Miao and Zhao also derived the MLE of the number and
the positions of change points, and proved their consistency. Later, Yin (1986)
proposed a consistent estimator by a nonparametric approach; Chen (1987) and
Miao (1987) obtained the asymptotic distributions of these estimators for some
simple types of change points.
The MLE $\hat{\gamma}$ of the intersection $\gamma$ of two regression curves is discussed in
Section 2, and various non-Bayesian estimators for the jump change model are
presented in Section 3. Section 4 is devoted to Bayesian methods, and in the last
section the estimates of the positions and the number of change points in the large
sample case are discussed.
Some other methods have been proposed to study the estimates of change points.
Among them are dynamic programming and smooth approximation, to name a few.
2. The estimate of the intersection of regression curves
2.1. Weighted least squares estimation
Let $x_i = x(t_i)$, $i = 1, \ldots, N$, be observations drawn from model (1.1) and (1.2)
with only one change point $\alpha$. Here the continuity assumption
$$\beta_1' h_1(\alpha) = \beta_2' h_2(\alpha) \qquad (2.1)$$
plays the role of a constraint under which the parameters are estimated. Let $w_k$,
$k = 1, \ldots, N$, be a given set of positive real numbers, called weights. Set
$$Q(\beta, \alpha) = \sum_{k=1}^{\tau(\alpha)} w_k (x_k - \beta_1' h_1(t_k))^2 + \sum_{k=\tau(\alpha)+1}^{N} w_k (x_k - \beta_2' h_2(t_k))^2,$$
where $\tau(\alpha)$ is an integer such that $t_\tau \le \alpha < t_{\tau+1}$. Suppose that the unknown $\alpha$
belongs to a given set, say $A$. For convenience, we set
$$H(m) = \begin{pmatrix} H_1(m) & 0 \\ 0 & H_2(m) \end{pmatrix},$$
where $H_1(m)$ is the $m \times p_1$ design matrix with rows $h_1'(t_k)$, $1 \le k \le m$, and $H_2(m)$
is the $(N - m) \times p_2$ design matrix with rows $h_2'(t_k)$, $m < k \le N$,
$$W_1 = \mathrm{diag}(w_1, \ldots, w_m), \quad W_2 = \mathrm{diag}(w_{m+1}, \ldots, w_N), \quad W = \mathrm{diag}(w_1, \ldots, w_N),$$
$$B(\alpha) = (h_1'(\alpha), -h_2'(\alpha)), \qquad \beta = (\beta_1', \beta_2')'.$$
Rewriting the continuity assumption as $B(\alpha)\beta = 0$, we have the following
problem: find $\hat{\alpha}$, $\hat{\beta}$ such that
$$Q(\hat{\beta}, \hat{\alpha}) = \min \{ Q(\beta, \alpha) : \alpha \in A,\ \beta \in R^{p_1 + p_2},\ B(\alpha)\beta = 0 \}.$$
A solution of this problem is called a weighted least squares estimate (WLSE).
The WLSE can be found in two steps. First, fix $\alpha$ (so that $\tau = \tau(\alpha)$ is also fixed)
and find the WLSE of $\beta$ under the constraint (2.1). The estimate and the weighted residual sum of
squares are given by
$$\hat{\beta}(\alpha) = \hat{\beta}(\tau) - \{B(\alpha)(H'(\tau) W H(\tau))^{-} B'(\alpha)\}^{-1} \{H'(\tau) W H(\tau)\}^{-} B'(\alpha) B(\alpha) \hat{\beta}(\tau),$$
$$Q(\alpha) = Q_1(\tau) + Q_2(\tau) + (B(\alpha)\hat{\beta}(\tau))^2 \{B(\alpha)(H'(\tau) W H(\tau))^{-} B'(\alpha)\}^{-1}, \qquad (2.5)$$
where $\hat{\beta}(\tau)$ denotes the unconstrained WLSE of $\beta$ and $Q_1(\tau)$, $Q_2(\tau)$ are the sums
of the first $\tau$ and the last $N - \tau$ squares of weighted residuals. Set
$$Q(\tau) = Q_1(\tau) + Q_2(\tau).$$
Then $Q(\tau)$ is the total weighted residual sum of squares when $\tau$ is fixed.
The next step is to find the $\hat{\alpha} \in A$ minimizing $Q(\alpha)$ defined by (2.5):
$$Q(\hat{\alpha}) = \min_{\alpha \in A} Q(\alpha).$$
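2.2. Classification of $\hat{\alpha}$
To facilitate the calculation of $\hat{\alpha}$, Hudson (1966) classified the WLSE of $\alpha$ into
three types as shown in Table 1.
Table 1
Classification of WLSE
                          $\hat{\alpha} \ne t_i$     $\hat{\alpha} = t_i$
$g(\hat{\alpha}) \ne 0$     Type One                 Type Two
$g(\hat{\alpha}) = 0$       Type Three               Type Two
Let $\beta_1^*(i)$ and $\beta_2^*(i)$ be the unconstrained WLSE computed from $t_1, \ldots, t_i$ and
$t_{i+1}, \ldots, t_N$ respectively, and let $\alpha^*(i)$ be the point (or points) of intersection of $\beta_1^*(i)' h_1(t)$
and $\beta_2^*(i)' h_2(t)$ which lies in $(t_i, t_{i+1})$.
THEOREM 2.1 (Hudson, 1966).
(i) If $\hat{\alpha}$ is of Type One, and $\hat{\alpha} \in (t_i, t_{i+1})$, then
$$\hat{\beta}_j = \beta_j^*(i), \quad j = 1, 2, \qquad \hat{\alpha} = \alpha^*(i), \qquad \beta_1^*(i)' h_1(\alpha^*(i)) = \beta_2^*(i)' h_2(\alpha^*(i)).$$
(ii) If $\hat{\alpha}$ is of Type Two, $\hat{\alpha} = t_i$ for some $i$, then
$$\hat{\beta} = \hat{\beta}(t_i), \qquad Q(\hat{\beta}, \hat{\alpha}) = Q(t_i) \ge Q(i).$$
(iii) That $\hat{\alpha}$ is of Type Three implies
$$Q(\hat{\beta}, \hat{\alpha}) \ge Q(i), \qquad \text{if } t_i < \hat{\alpha} < t_{i+1}.$$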
2.3. Determination of $\hat{\alpha}$
If $i$ is unknown, but we know the joint is of Type One, we can find $\hat{\alpha}$ as follows.
For a value of $i$, find $\beta_j^*(i)$, $j = 1, 2$, as before. Find whether or not the curves join at least
one $\alpha^*(i)$ in the right place, i.e.
$$t_i < \alpha^*(i) < t_{i+1}. \qquad (2.9)$$
If (2.9) holds, put
$$T(i) = P_1^*(i) + P_2^*(i),$$
where $P_j^*(i)$, $j = 1, 2$, are the local residual sums of squares. If the curves do not
join, or if (2.9) is not satisfied, put
$$T(i) = \infty.$$
Carry out the computation for all relevant values of $i$. Finally, choose the critical
value of $i$ for which $T(i)$ is minimized. This procedure is based on the definition
of Type One and the theorem above.
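The Type One search just described can be sketched in Python for the simplest case of two straight lines, $h_1(t) = h_2(t) = (1, t)'$, with unit weights. This is an illustrative sketch only: the function names `fit_line` and `type_one_join` are hypothetical and do not come from Hudson (1966), and the requirement of at least two (distinct) design points per segment is an assumption made so that each local fit is defined.

```python
def fit_line(ts, xs):
    """Ordinary least squares line x = a + b*t; returns (a, b, rss)."""
    n = len(ts)
    tbar = sum(ts) / n
    xbar = sum(xs) / n
    stt = sum((t - tbar) ** 2 for t in ts)
    b = sum((t - tbar) * (x - xbar) for t, x in zip(ts, xs)) / stt
    a = xbar - b * tbar
    rss = sum((x - a - b * t) ** 2 for t, x in zip(ts, xs))
    return a, b, rss


def type_one_join(ts, xs):
    """Search for a Type One join of two straight lines.

    For each split i (first i points on the left) the two lines are
    fitted separately and intersected; the split is kept only if the
    intersection lies strictly inside (t_i, t_{i+1}), cf. (2.9).
    Returns (T, i, alpha) for the best split, or None if none qualifies.
    """
    best = None
    n = len(ts)
    for i in range(2, n - 1):  # at least two points in each segment
        a1, b1, p1 = fit_line(ts[:i], xs[:i])
        a2, b2, p2 = fit_line(ts[i:], xs[i:])
        if b1 == b2:
            continue  # parallel lines do not join: T(i) = infinity
        alpha = (a1 - a2) / (b2 - b1)
        if not (ts[i - 1] < alpha < ts[i]):
            continue  # join in the wrong place: T(i) = infinity
        total = p1 + p2  # T(i) = P1*(i) + P2*(i)
        if best is None or total < best[0]:
            best = (total, i, alpha)
    return best
```

A call such as `type_one_join(ts, xs)` returns the minimal $T(i)$ together with the split index and the abscissa of the join; a return value of `None` suggests that the joint is not of Type One and the other two types should be tried.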
If we know that the joint is of Type Two, i.e. $\hat{\alpha} = t_i$ for some $i$, we can easily
find the remaining parameters since we now have a model which is linear in these
parameters. We get the estimate by solving the least squares problem with the
linear constraint
$$\beta_1' h_1(t_i) = \beta_2' h_2(t_i).$$
One way of doing this is to find the unconstrained local least squares estimates
$\beta_j^*(i)$, and then make the relevant adjustment (see Gallant and Fuller, 1973;
Hudson, 1966). If $i$ is unknown, we have to carry out the above computation for
all possible values of $i$, and choose the critical value of $i$ for which the overall
residual sum of squares is minimized.
Finally, suppose the joint is of Type Three, that is,
$$t_i < \hat{\alpha} < t_{i+1}, \qquad \beta_1' h_1(\hat{\alpha}) = \beta_2' h_2(\hat{\alpha}), \qquad \frac{d}{du}(\beta_1' h_1(u))\Big|_{u=\hat{\alpha}} = \frac{d}{du}(\beta_2' h_2(u))\Big|_{u=\hat{\alpha}}.$$
In this case, depending on practical needs, the regression curve may be
assumed to consist of one smooth curve, or it may contain two segments of
smooth curves. The two cases are handled separately. Refer to Hudson (1966),
and Gallant and Fuller (1973).
In practice, usually we have no information concerning the type of a joint. Since
Hudson's classification is exhaustive, it suffices to try all three types in order to
arrive at the overall solution. We have to do so since there might be two or more
change points in the given set A, and the joints might belong to different types.
This can be done step by step using Hudson's theorem.
Previous discussions can be extended to the models in which r pieces of
segmented regression curves are present.
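2.4. Maximum likelihood estimation
Now we assume that the $e_t$'s in (1.1) are normal. The logarithm of the likelihood
function, $\log L(\beta, \sigma_1^2, \sigma_2^2, \alpha)$, can easily be written down. We want to find its
maximizing point subject to the restriction (2.1). Since the function
$\log L(\beta, \sigma_1^2, \sigma_2^2, \alpha)$ is not differentiable with respect to $\alpha$, we first
maximize $\log L(\beta, \sigma_1^2, \sigma_2^2, \alpha)$ for a given $\alpha$. With $\tau = \tau(\alpha)$ we have
$$\log L(\alpha) \equiv \sup \{ \log L(\beta, \sigma_1^2, \sigma_2^2, \alpha) : \beta \in R^{p_1 + p_2},\ \sigma_i^2 > 0,\ i = 1, 2,\ B(\alpha)\beta = 0 \}$$
$$= \log L(\hat{\beta}(\alpha), \hat{\sigma}_1^2(\alpha), \hat{\sigma}_2^2(\alpha), \alpha) = -\frac{N}{2} \log 2\pi - \frac{\tau}{2} \log \hat{\sigma}_1^2(\alpha) - \frac{N - \tau}{2} \log \hat{\sigma}_2^2(\alpha) - \frac{N}{2}.$$
Take any $\hat{\alpha} \in A$ such that
$$\log L(\hat{\alpha}) = \max_{\alpha \in A} \log L(\alpha), \qquad (2.10)$$
that is,
$$\hat{\alpha} = \arg\min_{\alpha \in A} \left( \frac{\tau}{2} \log \hat{\sigma}_1^2(\alpha) + \frac{N - \tau}{2} \log \hat{\sigma}_2^2(\alpha) \right).$$
Thus, the MLE is given by $(\hat{\beta}(\hat{\alpha}), \hat{\sigma}_1^2(\hat{\alpha}), \hat{\sigma}_2^2(\hat{\alpha}), \hat{\alpha})$.
Finally, (2.10) does not differ from the general MLE except that $\hat{\sigma}_1^2(\alpha)$ and $\hat{\sigma}_2^2(\alpha)$
are restricted by (2.1). When $A$ is an infinite set, e.g. an interval in $[t_1, t_N]$,
complications arise. Robison (1964) decomposed $A$ into a finite number of
disjoint sets
$$A_\tau = \{\alpha \in A : \tau(\alpha) = \tau\},$$
and used a reparametrization of the model based on constraint (2.1) and polynomial functions in order to solve (2.10).
If $\sigma_1^2 = \sigma_2^2$ a priori, then $\hat{\alpha}$ is the same as the ordinary least squares estimate.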
2.5. Asymptotic distribution of MLE of intersection
Consider the model, which is a special case of (1.1),
$$x_t = \begin{cases} \alpha_1 + \beta_1 u_t + \varepsilon_t, & t = 1, \ldots, \tau, \\ \alpha_2 + \beta_2 u_t + \varepsilon_t, & t = \tau + 1, \ldots, N, \end{cases} \qquad (2.11)$$
where the arguments $u_1, \ldots, u_N$ are ordered, i.e. $u_1 < u_2 < \cdots < u_N$, the error
terms $\varepsilon_1, \ldots, \varepsilon_N$ are independent $N(0, \sigma^2)$ and the parameters $\alpha_1$, $\alpha_2$, $\beta_1$, $\beta_2$ and
$\tau$ are unknown. Let $\gamma$ be the abscissa of the intersection of the two regression lines
of model (2.11); then it is easy to see that
$$\gamma = \frac{\alpha_1 - \alpha_2}{\beta_2 - \beta_1}.$$
Further, it is assumed here that $u_\tau < \gamma < u_{\tau+1}$. That is to say, the overall regression function is continuous, and only the slope changes at $\gamma$.
Important problems to be considered are estimating $\gamma$ and making inferences
about $\gamma$. Generally, the MLE $\hat{\gamma}$ of $\gamma$ can be obtained fairly easily. In order to
make inferences about $\gamma$, one way is to calculate the distribution of
$\hat{\gamma}$. Unfortunately, the explicit form of the distribution of $\hat{\gamma}$ is hard to obtain.
So we turn to the asymptotic distribution of $\hat{\gamma}$. Feder (1975a) proved the
asymptotic normality of $\hat{\gamma}$ under the more general conditions that the $\varepsilon_t$'s are i.i.d.
r.v.'s with mean 0, variance $\sigma^2$ and finite $(2 + \delta)$-th moment for some $\delta > 0$, and that the
number of unknown change points is finite, but may be larger than one.
Although $\hat{\gamma}$ is asymptotically normally distributed, an empirical study of the
distribution of $\hat{\gamma}$ suggests that for moderate sample sizes the normal approximation is inadequate. So it is necessary to find some other approximate distribution which fits better when the sample size is not large. This was considered
by Hinkley (1969). He discussed the relation between the two maximum likelihood
estimates $\hat{\gamma}$ and $\tilde{\gamma}$ of $\gamma$, the first with constraint (2.1) and the second
unconstrained. From the asymptotic normality of $\tilde{\gamma}$, he deduced the asymptotic
normality of $\hat{\gamma}$ and an approximate expression for its distribution.
If the two regression lines in the model (2.11) are parallel, in particular if $\beta_1 = \beta_2 = 0$
and $\alpha_1 \ne \alpha_2$, the MLE $\hat{\tau}$ of the change point $\tau$ can be derived. An asymptotic
distribution of $\hat{\tau}$ was derived by Hinkley (1969) using the random walk technique.
Unfortunately, his asymptotic distribution is too complicated to be of practical
use.
It is an interesting and important problem to find an estimate of a change point
$\gamma$ for which an asymptotic distribution can be determined explicitly, since the
construction of a confidence interval for $\gamma$ depends upon such an estimate. In some
simple models containing jump or slope change points, Chen (1987) and Krishnaiah
and Miao (1987) obtained such estimates; see Section 5.
3. Non-Bayesian estimates of change points in jump change model
3.1. Weighted least squares estimation (WLSE)
Consider model (1.1). Let $w_t$, $t = 1, \ldots, N$, be a given set of weights, and let
$$Q(\beta, \tau) = \sum_{k=1}^{\tau} w_k (x_k - \beta_1' h_1(t_k))^2 + \sum_{k=\tau+1}^{N} w_k (x_k - \beta_2' h_2(t_k))^2$$
be the weighted sum of squares. $(\hat{\beta}', \hat{\tau})'$ is called a WLSE of $\beta$ and $\tau$ if
$$Q(\hat{\beta}, \hat{\tau}) = \min \{ Q(\beta, \tau) : \beta_j \in R^{p_j},\ j = 1, 2,\ \tau \in J \},$$
where $\beta = (\beta_1', \beta_2')'$ and $J$ is the set of admissible values of $\tau$. If $w_t = 1$ for $t = 1, \ldots, N$, then $(\hat{\beta}', \hat{\tau})'$ is the ordinary least squares
estimate.
A WLSE can be calculated in two steps. First fix $\tau$ and find $\hat{\beta}(\tau)$ such that
$$Q(\tau) = Q(\hat{\beta}(\tau), \tau) = \min \{ Q(\beta, \tau) : \beta_j \in R^{p_j},\ j = 1, 2 \}.$$
In the next step take $\hat{\tau}$ as the solution which minimises $Q(\tau)$, i.e.,
$$Q(\hat{\tau}) = \min_{\tau} Q(\tau) = \min_{\tau} (Q_1(\tau) + Q_2(\tau)),$$
where $Q_1(\tau)$ and $Q_2(\tau)$ are the first $\tau$ and the last $N - \tau$ weighted residual sums
of squares, respectively. That is,
$$\hat{\tau} = \arg\min_{\tau} Q(\tau).$$
Generally, an explicit expression for $\hat{\tau}$ is not easy to obtain. The same
procedure can be used when $x_t$ is a $p \times 1$ vector observation.
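As a minimal illustration of the two-step procedure, the following Python sketch treats the simplest jump model, in which $h_1 = h_2 \equiv 1$, so that each segment has a constant mean and the inner minimisation is solved by the weighted segment means. The helper names `weighted_rss` and `wlse_change_point`, and the choice $J = \{1, \ldots, N-1\}$, are assumptions made for the example.

```python
def weighted_rss(xs, ws):
    """Weighted residual sum of squares about the weighted mean."""
    wsum = sum(ws)
    mean = sum(w * x for w, x in zip(ws, xs)) / wsum
    return sum(w * (x - mean) ** 2 for w, x in zip(ws, xs))


def wlse_change_point(xs, ws=None):
    """Two-step WLSE of tau for a piecewise-constant mean.

    Step 1: for each tau the weighted segment means minimise Q(beta, tau),
            giving Q(tau) = Q1(tau) + Q2(tau).
    Step 2: return the tau in {1, ..., N-1} that minimises Q(tau).
    """
    n = len(xs)
    if ws is None:
        ws = [1.0] * n  # unit weights: ordinary least squares
    best_tau, best_q = None, float("inf")
    for tau in range(1, n):
        q = weighted_rss(xs[:tau], ws[:tau]) + weighted_rss(xs[tau:], ws[tau:])
        if q < best_q:
            best_tau, best_q = tau, q
    return best_tau, best_q
```

The search costs $O(N^2)$ operations as written; running sums would reduce this to $O(N)$, but the direct form keeps the correspondence with $Q_1(\tau)$ and $Q_2(\tau)$ visible.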
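3.2. Maximum likelihood estimation (MLE)
Assume that $\{e(t)\}$ is independent and normally distributed in model (1.1). The
logarithm of the likelihood is given by
$$\log L(\beta, \sigma_1^2, \sigma_2^2, \tau) = -\frac{N}{2} \log 2\pi - \frac{1}{2} [\tau \log \sigma_1^2 + (N - \tau) \log \sigma_2^2] - \frac{1}{2} \left[ \sigma_1^{-2} \sum_{k=1}^{\tau} (x_k - \beta_1' h_1(t_k))^2 + \sigma_2^{-2} \sum_{k=\tau+1}^{N} (x_k - \beta_2' h_2(t_k))^2 \right].$$
Proceeding along the same lines as (3.3), we can obtain an MLE $(\hat{\beta}, \hat{\sigma}_1^2, \hat{\sigma}_2^2, \hat{\tau})$
of $(\beta, \sigma_1^2, \sigma_2^2, \tau)$ in two steps. The first step is to fix $\tau$ and find $\hat{\beta}(\tau)$, $\hat{\sigma}_i^2(\tau)$,
$i = 1, 2$, such that
$$\log L(\tau) \equiv \log L(\hat{\beta}(\tau), \hat{\sigma}_1^2(\tau), \hat{\sigma}_2^2(\tau), \tau) = \max \{ \log L(\beta, \sigma_1^2, \sigma_2^2, \tau) : \beta_j \in R^{p_j},\ \sigma_i^2 > 0,\ i = 1, 2 \}$$
$$= -\frac{N}{2} \log(2\pi) - \frac{1}{2} [\tau \log \hat{\sigma}_1^2(\tau) + (N - \tau) \log \hat{\sigma}_2^2(\tau) + N].$$
Then find $\hat{\tau}$ such that
$$L(\hat{\tau}) = \max_{\tau} L(\tau)$$
or, equivalently,
$$\hat{\tau} = \arg\min_{\tau} [\tau \log(\tau^{-1} S_1(\tau)) + (N - \tau) \log((N - \tau)^{-1} S_2(\tau))],$$
where $S_1(\tau)$ is the residual sum of squares of the first $\tau$ observations, and $S_2(\tau)$ is that
of the last $N - \tau$ observations, so that $\hat{\sigma}_1^2(\tau) = \tau^{-1} S_1(\tau)$ and $\hat{\sigma}_2^2(\tau) = (N - \tau)^{-1} S_2(\tau)$. This procedure extends easily to multi-dimensional
observations and to the case of more than one change point.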
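The criterion above can be evaluated directly. The sketch below does so in Python for a piecewise-constant mean with a separate variance in each segment; it is only an illustration, the function names are hypothetical, and the parameter `min_len` (a minimum segment length, here 2) is an assumption introduced because the criterion is unbounded when a segment has zero residual sum of squares.

```python
import math


def seg_rss(xs):
    """Residual sum of squares about the segment mean."""
    m = sum(xs) / len(xs)
    return sum((x - m) ** 2 for x in xs)


def mle_change_point(xs, min_len=2):
    """Minimise tau*log(S1/tau) + (N-tau)*log(S2/(N-tau)) over tau."""
    n = len(xs)
    best_tau, best_val = None, float("inf")
    for tau in range(min_len, n - min_len + 1):
        s1, s2 = seg_rss(xs[:tau]), seg_rss(xs[tau:])
        if s1 <= 0.0 or s2 <= 0.0:
            continue  # degenerate segment: the criterion is unbounded
        val = tau * math.log(s1 / tau) + (n - tau) * math.log(s2 / (n - tau))
        if val < best_val:
            best_tau, best_val = tau, val
    return best_tau
```

Because very short segments can have a tiny sample variance by chance, a larger `min_len` may be preferable in practice; that choice is left to the user and is not prescribed by the text.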
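Suppose the integers $k_1, \ldots, k_q$ satisfy $0 = k_0 < k_1 < \cdots < k_q < k_{q+1} = N$.
Then $\pi = (k_1, \ldots, k_q)$ is called a partition of $[1, N]$, and
$$K_q^{(N)} = \{\pi = (k_1, \ldots, k_q) : 0 < k_1 < \cdots < k_q < N\}. \qquad (3.6)$$
Consider the following $p$-dimensional model:
$$x_t = \mu_t + \varepsilon_t, \qquad t = 1, \ldots, N,$$
where $\{\varepsilon_t, t = 1, \ldots, N\}$ is a sequence of independent $p \times 1$ normal variables
such that $\varepsilon_t \sim N(0, \Lambda_t)$, $\Lambda_t > 0$. We consider two cases.
Case (i). $\Lambda_j = \Lambda > 0$ for $j = 1, \ldots, N$. Let $X = (x_1, \ldots, x_N)$ be the $p \times N$ observation matrix. Each $\pi \in K_q$, say $\pi = (k_1, \ldots, k_q)$, corresponds to the model
$$M_\pi: \ E x_t = \mu_j, \qquad k_{j-1} < t \le k_j, \quad j = 1, \ldots, q + 1.$$
To find the MLE under $M_\pi$, we proceed as follows. The supremum of the log
likelihood function over the means is given by
$$\log L(X, \pi, \Lambda) = -Npc - \frac{N}{2} \log |\Lambda| - \frac{N}{2} \sum_{j=1}^{q+1} \mathrm{trace}(\Lambda^{-1} A_j(N)),$$
where $c$ is a constant, $|\Lambda|$ denotes the determinant of the matrix $\Lambda$, and
$$A_j(N) = N^{-1} \sum_{t=k_{j-1}+1}^{k_j} (x_t - \bar{x}_{k_{j-1} k_j})(x_t - \bar{x}_{k_{j-1} k_j})', \qquad (3.9)$$
$$\bar{x}_{\alpha\beta} = (\beta - \alpha)^{-1} \sum_{t=\alpha+1}^{\beta} x_t, \qquad (3.10)$$
$$A_\pi(N) = \sum_{j=1}^{q+1} A_j(N). \qquad (3.11)$$
For fixed $\pi$, the MLE $\hat{\Lambda}$ of $\Lambda$ is
$$\hat{\Lambda} = A_\pi(N),$$
and, up to an additive constant not depending on $\pi$,
$$\log L(\pi) = \sup_{\Lambda} \log L(X, \pi, \Lambda) = -\frac{N}{2} \log |A_\pi(N)|.$$
Next, find $\hat{\pi} = (\hat{k}_1, \ldots, \hat{k}_q)$ such that
$$\log L(\hat{\pi}) = \max_{\pi \in K_q} \log L(\pi).$$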
Case (ii). $\Lambda_j > 0$, $j = 1, 2, \ldots, N$. Proceeding along the same lines as
(3.12)-(3.14), we get for the model ($\pi = (k_1, \ldots, k_q) \in K_q$)
$$M_\pi: \ E x_i = \mu_j, \qquad k_{j-1} < i \le k_j, \quad j = 1, \ldots, q + 1,$$
$$\log L(\pi) = \sup \{ \log L(X, \pi, \Lambda_1, \ldots, \Lambda_{q+1}) : \Lambda_i > 0,\ i = 1, \ldots, q + 1 \} = -\frac{N}{2} \sum_{j=1}^{q+1} \alpha_j \log |A_j(N)| - Npc,$$
where
$$\alpha_j = (k_j - k_{j-1})/N, \qquad k_j - k_{j-1} > p. \qquad (3.16)$$
Finally, find $\hat{\pi} = (\hat{k}_1, \ldots, \hat{k}_q)$ such that
$$\log L(\hat{\pi}) = \max_{\pi \in K_q} \log L(\pi), \qquad \text{i.e.} \quad \hat{\pi} = \arg\max_{\pi \in K_q} \log L(\pi).$$
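3.3. A type of maximum likelihood ratio estimation (MLRE)
Consider the model
$$x_t = \begin{cases} \mu_1 + e_t, & t = 1, \ldots, \tau, \\ \mu_2 + e_t, & t = \tau + 1, \ldots, N, \end{cases}$$
where $\tau$ is unknown, $\mu_1 \ne \mu_2$, and $e_t$, $t = 1, \ldots, N$, are i.i.d. with the common
distribution $N(0, \sigma^2)$; $\sigma^2$ is unknown.
Consider the null hypothesis $H_0$: $E x_t = \mu_2$, $t = 1, \ldots, N$, against the alternative
$$H_1: \ E x_t = \begin{cases} \mu_1, & t = 1, \ldots, \tau, \\ \mu_2, & t = \tau + 1, \ldots, N, \end{cases}$$
where $\tau$ is unknown. Write
$$\bar{x}_\tau = \tau^{-1} \sum_{t=1}^{\tau} x_t, \qquad \bar{x}_\tau^* = (N - \tau)^{-1} \sum_{t=\tau+1}^{N} x_t,$$
$$W_\tau = (N - 2)^{-1} \left[ \sum_{t=1}^{\tau} (x_t - \bar{x}_\tau)^2 + \sum_{t=\tau+1}^{N} (x_t - \bar{x}_\tau^*)^2 \right].$$
The standardized difference between the observations before and after the change
point is
$$y_\tau = (\tau(N - \tau)/N)^{1/2} (\bar{x}_\tau - \bar{x}_\tau^*),$$
and $V_\tau = y_\tau / \sqrt{W_\tau}$ has a $t$-distribution with $N - 2$ degrees
of freedom under $H_0$. The likelihood ratio test for unknown $\tau$ is based on the
maximum $t$-statistic
$$V(\hat{\tau}) = \max_{1 \le \tau \le N-1} |V_\tau|, \qquad \hat{\tau} = \arg\max_{1 \le \tau \le N-1} |V_\tau|.$$
This method was extended to the multi-dimensional case by Srivastava and
Worsley (1986). We need only note that if we substitute $(x - \bar{x}_\tau)(x - \bar{x}_\tau)'$ and
$(x - \bar{x}_\tau^*)(x - \bar{x}_\tau^*)'$ for $(x - \bar{x}_\tau)^2$ and $(x - \bar{x}_\tau^*)^2$ respectively, then under $H_0$,
$y_\tau' W_\tau^{-1} y_\tau$ is a Hotelling $T^2$ statistic. Take $\hat{\tau}$ such that
$$H(\hat{\tau}) = \max_{\tau} y_\tau' W_\tau^{-1} y_\tau = \max_{\tau} H(\tau), \qquad \text{i.e.} \quad \hat{\tau} = \arg\max_{\tau} H(\tau).$$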
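For the univariate case, the maximum $t$-type statistic can be computed as in the following Python sketch. It assumes the pooled variance estimate with $N - 2$ degrees of freedom in the denominator; the function name `max_t_change_point` is hypothetical, and splits with zero pooled variance are simply skipped.

```python
import math


def max_t_change_point(xs):
    """Return (tau_hat, max |V_tau|) over tau = 1, ..., N-1."""
    n = len(xs)
    best_tau, best_v = None, -1.0
    for tau in range(1, n):
        left, right = xs[:tau], xs[tau:]
        m1 = sum(left) / tau
        m2 = sum(right) / (n - tau)
        # pooled variance estimate with N - 2 degrees of freedom
        w = (sum((x - m1) ** 2 for x in left)
             + sum((x - m2) ** 2 for x in right)) / (n - 2)
        if w == 0.0:
            continue  # statistic undefined for perfectly separated data
        y = math.sqrt(tau * (n - tau) / n) * (m1 - m2)
        v = abs(y) / math.sqrt(w)
        if v > best_v:
            best_tau, best_v = tau, v
    return best_tau, best_v
```

Since the total sum of squares does not depend on $\tau$, maximising $|V_\tau|$ picks the same split as minimising the pooled residual sum of squares; the statistic is still useful because its maximum can be referred to a null distribution.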
3.4. Cumulative sum estimation (CUSUME)
Consider the model: $x_1, \ldots, x_N$ are independent 0-1 random variables such
that
$$P(x_i = 0) = 1 - P(x_i = 1), \qquad P(x_i = 1) = \begin{cases} \theta_0, & i = 1, \ldots, \tau, \\ \theta_1, & i = \tau + 1, \ldots, N, \end{cases}$$
where $\tau$ is unknown. Let
$$S_t = \sum_{i=1}^{t} x_i, \qquad v_t = N S_t - t S_N, \qquad t = 1, \ldots, N.$$
If $\theta_0 > \theta_1$, then it would be reasonable to estimate $\tau$ by the quantity $t$ maximizing
$v_t$. If there are several such $t$'s, we take
$$\hat{\tau}_1 = \inf \{t_0 : v_{t_0} \ge v_t,\ t = 1, \ldots, N\}.$$
Similar estimates are then defined when it is known in advance that $\theta_0 < \theta_1$,
or $\theta_0 \ne \theta_1$, as follows:
$$\hat{\tau}_2 = \sup \{t_0 : v_{t_0} \le v_t,\ t = 1, \ldots, N\}, \qquad \text{for } \theta_0 < \theta_1,$$
$$\hat{\tau}_3 = \inf \{t_0 : |v_{t_0}| \ge |v_t|,\ t = 1, \ldots, N\}, \qquad \text{for } \theta_0 \ne \theta_1.$$
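The three estimates can be read off from the sequence $v_t$ in a single pass. The Python sketch below follows the definitions as given here (first maximiser, last minimiser, first maximiser of $|v_t|$); the function name `cusum_estimates` is hypothetical.

```python
def cusum_estimates(xs):
    """CUSUM-type estimates of tau for a 0-1 sequence.

    v_t = N * S_t - t * S_N with S_t = x_1 + ... + x_t.
    Returns (tau1, tau2, tau3):
      tau1: first t maximising v_t     (case theta0 > theta1),
      tau2: last  t minimising v_t     (case theta0 < theta1),
      tau3: first t maximising |v_t|   (case theta0 != theta1).
    """
    n = len(xs)
    total = sum(xs)
    v, s = [], 0
    for t, x in enumerate(xs, start=1):
        s += x
        v.append(n * s - t * total)
    abs_v = [abs(u) for u in v]
    tau1 = v.index(max(v)) + 1            # inf of the maximisers
    tau2 = n - v[::-1].index(min(v))      # sup of the minimisers
    tau3 = abs_v.index(max(abs_v)) + 1    # inf of the |v| maximisers
    return tau1, tau2, tau3
```

Note that $v_N = 0$ always, so when $v_t \ge 0$ for every $t$ the estimate $\hat{\tau}_2$ returns $N$; only the estimate matching the known direction of the change is meaningful for a given data set.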
Under the null hypothesis $H_0$ (there is no change in probability) versus the
alternative $H_1$ (there is a change in probability at an unknown time), the
statistics $V_m^+$, $V_m^-$ and $V_m$ defined below have the same distributions as the null distributions of
$m(N - m) D^+_{m,N-m}$, $m(N - m) D^-_{m,N-m}$ and $m(N - m) D_{m,N-m}$, respectively, the
multi-dimensional extension of the Kolmogorov-Smirnov two-sample statistics:
$$V_m^+ = (v_{\hat{\tau}_1} \mid S_N = m), \qquad V_m^- = (v_{\hat{\tau}_2} \mid S_N = m), \qquad V_m = (v_{\hat{\tau}_3} \mid S_N = m).$$
Pettitt (1980) indicated that the CUSUME and MLE of $\tau$ are asymptotically equivalent. Monte Carlo simulations show that in many cases
Pr{CUSUME = MLE} is approximately one.
Hinkley (1971) also considered the change point for independent normal random
variables by means of the cumulative sum of sequential residual errors.
4. Bayesian estimate
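In this section we discuss the change point problem from the Bayesian
viewpoint. We only discuss the case of the jump model, since the treatment of the
continuous model is similar. Suppose we have a jump model with at most one
jump point:
$$x_t = \begin{cases} \alpha_1 + \beta_1 u_t + \varepsilon_t, & t = 1, \ldots, \tau, \\ \alpha_2 + \beta_2 u_t + \varepsilon_t, & t = \tau + 1, \ldots, N, \end{cases} \qquad (4.1)$$
where $u_1 < u_2 < \cdots < u_N$, $(\alpha_1, \beta_1) \ne (\alpha_2, \beta_2)$ and $\tau$ is unknown. A more general
form is
$$x_t = \begin{cases} \alpha_{10} + \alpha_{11} u_{t1} + \cdots + \alpha_{1q} u_{tq} + \varepsilon_t, & t = 1, \ldots, \tau, \\ \alpha_{20} + \alpha_{21} u_{t1} + \cdots + \alpha_{2q} u_{tq} + \varepsilon_t, & t = \tau + 1, \ldots, N. \end{cases}$$
For the sake of simplicity, we only consider the model (4.1). Further, we
assume that $\tau$ obeys some specified prior probability distribution and that the $\varepsilon_t$'s are independent and normally distributed:
$$\varepsilon_t \sim \begin{cases} N(0, \sigma_1^2), & t = 1, \ldots, \tau, \\ N(0, \sigma_2^2), & t = \tau + 1, \ldots, N. \end{cases}$$
The Bayesian estimators of $\tau$ are usually defined by (i) the posterior mode of
$\tau$ or (ii) the value minimizing the expected posterior loss for the quadratic loss
function $(\tau - \mu)^2$ with respect to the set $J$ of admissible values of $\tau$.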
Now we give a more detailed description of the results of Schulze (1982), which
generalize the results of Smith (1975), Ferreira (1975), Holbert and Broemeling
(1977) and Chin Choy and Broemeling (1980). We also mention that Chernoff and
Zacks (1964), Kander and Zacks (1966), Bhattacharyya and Johnson (1968),
Gardner (1969), MacNeill (1971), Sen and Srivastava (1973) and Booth and Smith
(1982) also investigated these problems within a Bayesian framework.
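First, Schulze (1982) considered improper prior distributions by assuming:
(i) The parameters $\theta = (\alpha, \beta)$, $\sigma^2$ and $\tau$ are all independently distributed.
(ii) The parameters $\theta_1$, $\theta_2$ are uniformly distributed over $R^2$.
(iii) The variances $\sigma_1^2$, $\sigma_2^2$ are independently distributed with improper densities
$p_0(\sigma_1^2) \propto (\sigma_1^2)^{\nu_1}$ and $p_0(\sigma_2^2) \propto (\sigma_2^2)^{\nu_2}$, where $\nu_1$ and $\nu_2$ are given integers, for
example, $\nu_1 = \nu_2 = -1$.
(iv) Specified a priori probabilities $p_0(\tau)$, $\tau \in J$, are given.
THEOREM 4.1 (Schulze, 1982). Under (i)-(iv) the prior densities are proportional to
$$p_0(\theta, \sigma^2, \tau) \propto (\sigma_1^2)^{\nu_1} (\sigma_2^2)^{\nu_2} p_0(\tau),$$
and the corresponding posterior probabilities $p_x(\tau)$ for the change point are
$$p_x(\tau) \propto C^1(x, \tau) p_0(\tau), \qquad x = (x_1, \ldots, x_N),$$
$$C^1(x, \tau) = |U_1(\tau)' U_1(\tau)|^{-1/2} |U_2(\tau)' U_2(\tau)|^{-1/2} \, \Gamma\!\left(\frac{\tau - 2\nu_1 - 4}{2}\right) S_1(\tau)^{-(\tau - 2\nu_1 - 4)/2} \, \Gamma\!\left(\frac{N - \tau - 2\nu_2 - 4}{2}\right) S_2(\tau)^{-(N - \tau - 2\nu_2 - 4)/2},$$
where $U_1(\tau)$ and $U_2(\tau)$ denote the design matrices, with rows $(1, u_t)$, of the observations $t = 1, \ldots, \tau$ and $t = \tau + 1, \ldots, N$, and
$S_1(\tau)$ and $S_2(\tau)$ denote the residual sums of squares of the least squares estimates
$\hat{\theta}_j$, $j = 1, 2$, based on the observations $t = 1, \ldots, \tau$ and $t = \tau + 1, \ldots, N$, respectively.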
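To define proper prior distributions for the parameters $\theta_j$, $\sigma_j^2$, $j = 1, 2$,
Schulze (1982) made another assumption:
(v) Conditional on $\tau = t$, the parameters $(\theta_j, \sigma_j^{-2})$, $j = 1, 2$, are independently
distributed as normal-gamma variables $N\Gamma(2, T_1(t))$ and $N\Gamma(2, T_2(t))$ with parameters
$$T_j(t) = (r_j(t), G_j(t), \theta_j, S_j(t)), \qquad j = 1, 2,$$
where $r_j(t) \ge 3$ and $G_j(t)$ is a positive definite matrix of order 2. (For the definition
of the $N\Gamma(p, T)$ distribution see Humak (1977, A, 2.38, p. 491).)
THEOREM 4.2 (Schulze, 1982). Under (iv) and (v), the prior densities are
$$p_0(\theta, \sigma^2, \tau) \propto p_0(\tau) \, p_{10}(\theta_1, \sigma_1^{-2} \mid \tau) \, p_{20}(\theta_2, \sigma_2^{-2} \mid \tau),$$
where $p_{j0}(\theta_j, \sigma_j^{-2} \mid \tau)$ denote the densities of the $N\Gamma(2, T_j(\tau))$-distributions, $j = 1, 2$, and the
corresponding posterior probabilities for $\tau$ are obtained in the form
$$p_x(\tau) \propto C^2(x, \tau) p_0(\tau),$$
$$C^2(x, \tau) = \prod_{j=1}^{2} \frac{\Gamma((r_{jx}(\tau) - 2)/2) \times S_j(\tau)^{(r_j(\tau) - 2)/2} \times |G_j(\tau)|^{1/2}}{\Gamma((r_j(\tau) - 2)/2) \times S_{jx}(\tau)^{(r_{jx}(\tau) - 2)/2} \times |G_{jx}(\tau)|^{1/2}},$$
where
$$r_{1x}(\tau) = r_1(\tau) + \tau, \qquad r_{2x}(\tau) = r_2(\tau) + N - \tau,$$
$$G_{jx}(\tau) = G_j(\tau) + U_j(\tau)' U_j(\tau),$$
$$S_{jx}(\tau) = S_j(\tau) + \sum x_i^2 + \theta_j' G_j(\tau) \theta_j - \theta_{jx}(\tau)' G_{jx}(\tau) \theta_{jx}(\tau),$$
$$\theta_{jx}(\tau)' = [\theta_j' G_j(\tau) + X_j(\tau)' U_j(\tau)] G_{jx}(\tau)^{-1},$$
$$X_1(\tau) = (x_1, \ldots, x_\tau)', \qquad X_2(\tau) = (x_{\tau+1}, \ldots, x_N)', \qquad j = 1, 2,$$
the sum $\sum x_i^2$ extends over the observations of the $j$-th segment, and $U_j(\tau)$ is the design matrix, with rows $(1, u_t)$, of the $j$-th segment.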
In order to find the estimate $\hat{\tau}$, one possible choice is
$$\hat{\tau} = \arg\max_{\tau \in J} p_x(\tau) = \arg\max_{\tau \in J} C^j(x, \tau) p_0(\tau), \qquad j = 1, 2.$$
Note that we have to calculate all $p_x(\tau)$ and thus all values $C^j(x, \tau)$, $j = 1, 2$,
according to the previous formula (4.8). But for each fixed $\tau \in J$, the calculation of
$C^j(x, \tau)$, especially of $C^2(x, \tau)$, is quite complicated. Most of the effort is devoted
to searching for an optimal $\hat{\tau}$.
Another choice of the estimate is defined by minimizing the expected
posterior loss:
$$\hat{\mu} = \arg\min_{\mu \in J} R(\mu), \qquad R(\mu) = \sum_{\tau \in J} (\tau - \mu)^2 p_x(\tau) = E_{p_x}(\tau - \mu)^2.$$
It is well known that $E(\tau \mid x) = \sum_{\tau \in J} \tau p_x(\tau) = \arg\min_{\mu \in R} R(\mu)$. But $E(\tau \mid x)$ is not
necessarily an element of $J$. So we can take
$$\hat{\mu} = \arg\min_{\mu \in J} (\mu - E(\tau \mid x))^2,$$
i.e., the estimate is the point in $J$ which is nearest to the posterior expectation of
$\tau$. Notice that
$$E(\tau \mid x) = \sum_{\tau \in J} \tau p_x(\tau) = \frac{\sum_{\tau \in J} \tau C^j(x, \tau) p_0(\tau)}{\sum_{\tau \in J} C^j(x, \tau) p_0(\tau)}.$$
This requires the calculation of $p_x(\tau)$, and thus of $C^j(x, \tau)$, for all $\tau \in J$ with $p_0(\tau) > 0$.
Therefore, the two methods presented above are comparable.
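Both Bayesian estimates can be sketched in Python for model (4.1) under the improper priors (i)-(iv). The sketch assumes that the posterior weight has the form $\log C^1(x, \tau) = \sum_{j=1}^{2} [-\tfrac12 \log |U_j' U_j| + \log \Gamma(d_j) - d_j \log S_j(\tau)]$ with $d_1 = (\tau - 2\nu_1 - 4)/2$ and $d_2 = (N - \tau - 2\nu_2 - 4)/2$, where $U_j$ has rows $(1, u_t)$; this is our reading of Theorem 4.1, not a verified transcription of Schulze (1982). All function names are hypothetical, and the admissible set must keep both $d_j$ and both residual sums of squares positive.

```python
import math


def _segment_terms(us, xs):
    """Return (|U'U|, residual sum of squares) for the line fit x = a + b*u."""
    n = len(us)
    su, sx = sum(us), sum(xs)
    suu = sum(u * u for u in us)
    sux = sum(u * x for u, x in zip(us, xs))
    det = n * suu - su * su  # determinant of U'U for rows (1, u_t)
    b = (n * sux - su * sx) / det
    a = (sx - b * su) / n
    rss = sum((x - a - b * u) ** 2 for u, x in zip(us, xs))
    return det, rss


def log_c1(us, xs, tau, nu1=-1, nu2=-1):
    """log C^1(x, tau) under the assumed form stated in the lead-in."""
    out = 0.0
    segments = ((us[:tau], xs[:tau], nu1), (us[tau:], xs[tau:], nu2))
    for seg_u, seg_x, nu in segments:
        det, rss = _segment_terms(seg_u, seg_x)
        d = (len(seg_u) - 2 * nu - 4) / 2.0
        out += -0.5 * math.log(det) + math.lgamma(d) - d * math.log(rss)
    return out


def posterior(us, xs, taus, prior=None):
    """Normalised posterior probabilities p_x(tau) on the admissible set."""
    prior = prior or {t: 1.0 for t in taus}  # uniform prior by default
    logs = {t: log_c1(us, xs, t) + math.log(prior[t]) for t in taus}
    top = max(logs.values())  # subtract the maximum for numerical stability
    w = {t: math.exp(v - top) for t, v in logs.items()}
    z = sum(w.values())
    return {t: v / z for t, v in w.items()}


def bayes_estimates(post):
    """Posterior mode, admissible tau nearest the posterior mean, and the mean."""
    mode = max(post, key=post.get)
    mean = sum(t * p for t, p in post.items())
    nearest = min(post, key=lambda t: (t - mean) ** 2)
    return mode, nearest, mean
```

Both estimates are obtained from the same table of posterior weights, which illustrates why the two methods require comparable computational effort.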
5. Large sample properties of the estimates of change points
In the sequel we consider the multivariate jump change model. Let $X(t)$ be an
independent $p$-dimensional process on $(0, 1]$ such that
$$X(t) = \mu(t) + V(t), \qquad 0 < t \le 1, \qquad (5.1)$$
where $\mu(t)$: $p \times 1$ is a non-random left-continuous step function and $V(t)$: $p \times 1$
is an independent normal process with mean vector 0 and covariance matrix
$\Lambda_j > 0$ in the $j$-th horizontal segment. Denote all the jump points of $\mu(t)$ by
$t_1, \ldots, t_q$, i.e. $\mu(t_j) \ne \mu(t_j + 0)$, $j = 1, \ldots, q$, where $0 < t_1 < \cdots < t_q < 1$. $t_1, \ldots, t_q$
are called change points of the process $X(t)$. Assume that $N$ samples are drawn
from $X(t)$ at equally spaced $t$, say $X(j/N)$, $j = 1, \ldots, N$. We are going to find a set
of numbers, say $\pi^{(N)} = (k_1^{(N)}, \ldots, k_q^{(N)})$, such that
$$E X(i/N) = \mu_j, \qquad \mathrm{Var}\, X(i/N) = \Lambda_j, \qquad \text{for } k_{j-1}^{(N)} < i \le k_j^{(N)}, \quad j = 1, 2, \ldots, q + 1, \qquad (5.3)$$
where $k_0^{(N)} = 0$, $k_{q+1}^{(N)} = N$, and $\mu_j \ne \mu_{j+1}$. The number $q$ of change points may be
assumed known or unknown, but it is known that $q$ is less than a given constant
$L$. Let
$$\tilde{K}_L^{(N)} = \{\pi^{(N)} = (k_1^{(N)}, \ldots, k_l^{(N)}) : 0 < k_1^{(N)} < \cdots < k_l^{(N)} < N,\ l = 1, \ldots, L\} = \bigcup_{q=1}^{L} K_q^{(N)},$$
where $K_q^{(N)}$ is defined by (3.6). For simplicity, we omit all the superscripts $N$ of
$X$, $k$, $\pi$ etc., for example, $k_j \equiv k_j^{(N)}$, $\pi \equiv \pi^{(N)}$, $K_q \equiv K_q^{(N)}$. Further, define
$X_j = X(j/N)$. We must always keep in mind that these quantities are dependent on
$N$.
By (3.3), for given $\pi = (k_1, \ldots, k_q) \in K_q$, under model $M_\pi$, we get
$$\sup_{\Lambda} \log L(X, \pi, \Lambda) = -\frac{N}{2} \log |A_\pi(N)| + b_1 \equiv G_\pi^{(1)} + b_1. \qquad (5.4)$$
Similar to (5.4), for given $\pi = (k_1, \ldots, k_q) \in \tilde{K}_L$, it follows that
$$\sup \{ \log L(X, \pi, \Lambda_1, \ldots, \Lambda_{q+1}) : \Lambda_i > 0,\ i = 1, \ldots, q + 1 \} = -\frac{N}{2} \sum_{j=1}^{q+1} \alpha_j \log |A_j(N)| + b_2 \equiv G_\pi^{(2)} + b_2,$$
where $b_1$ and $b_2$ are constants independent of $\pi$ and the $\Lambda_j$, and $A_\pi(N)$, $A_j(N)$ and
$\alpha_j$ are defined by (3.11), (3.9) and (3.16), respectively.
5.1. Estimate of change points when q is known
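(1) Assume that all the $\Lambda_j$ are equal. Take $\hat{\pi} = (\hat{k}_1, \ldots, \hat{k}_q)$ from $K_q$
such that
$$G_{\hat{\pi}}^{(1)} = \max_{\pi \in K_q} G_\pi^{(1)}, \qquad \text{i.e.} \quad \hat{\pi} = \arg\max_{\pi \in K_q} G_\pi^{(1)}.$$
(2) Assume that we have no prior information about $\Lambda_j$, $j = 1, \ldots, q + 1$. Take
$\hat{\pi} = (\hat{k}_1, \ldots, \hat{k}_q)$ from $K_q$ such that
$$G_{\hat{\pi}}^{(2)} = \max_{\pi \in K_q} G_\pi^{(2)}, \qquad \text{i.e.} \quad \hat{\pi} = \arg\max_{\pi \in K_q} G_\pi^{(2)}.$$
Then we have
THEOREM 5.1 (Krishnaiah, ... and Zhao). $(\hat{k}_1/N, \ldots, \hat{k}_q/N)$ is a
strongly consistent estimate of $(t_1, \ldots, t_q)$.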
5.2. Estimate of change points when q is unknown
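Let $\{C_N^{(j)}\}$, $\{D_N^{(j)}\}$, $j = 1, 2$, be two sequences such that
$$N \gg D_N^{(1)} \gg C_N^{(1)} \gg \log N, \qquad (5.8)$$
$$N \gg D_N^{(2)} \gg C_N^{(2)} \gg \log^2 N. \qquad (5.9)$$
Hereafter $\alpha_N \gg \beta_N$ means $\lim_{N \to \infty} \alpha_N / \beta_N = \infty$.
Suppose the number of change points is less than some known constant $L$.
Consider two cases.
(i) All the $\Lambda_j$ are equal. Let
$$Q_\pi^{(1)} = -\frac{N}{2} \log |A_\pi(N)| - \#(\pi) C_N^{(1)}, \qquad \pi \in \tilde{K}_L,$$
where $\#(\pi)$ denotes the number of cut-off points in $\pi$. Take $\hat{\pi} = (\hat{k}_1, \ldots, \hat{k}_{\hat{l}}) \in \tilde{K}_L$ such that
$$Q_{\hat{\pi}}^{(1)} = \max_{\pi \in \tilde{K}_L} Q_\pi^{(1)}.$$
(ii) There is no prior information about $\Lambda_1, \ldots, \Lambda_{q+1}$ except $q \le L$. Let
$$Q_\pi^{(2)} = -\frac{N}{2} \sum_{j=1}^{q+1} \alpha_j \log |A_j(N)| - \#(\pi) C_N^{(2)}, \qquad \pi \in \tilde{K}_L.$$
Take $\hat{\pi} = (\hat{k}_1, \ldots, \hat{k}_{\hat{l}}) \in \tilde{K}_L$ such that
$$Q_{\hat{\pi}}^{(2)} = \max_{\pi \in \tilde{K}_L} Q_\pi^{(2)}.$$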
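For case (i) with $p = 1$, the penalised criterion can be maximised by brute force when $N$ is small. The Python sketch below is meant only to make the criterion concrete: the function names are hypothetical, the penalty value `c_n` is a user-supplied stand-in for $C_N^{(1)}$, and the exhaustive enumeration of partitions grows combinatorially, so it is not a practical algorithm for large $N$.

```python
import math
from itertools import combinations


def pooled_variance(xs, cuts):
    """A_pi(N) for p = 1: within-segment sum of squares divided by N."""
    n = len(xs)
    bounds = [0, *cuts, n]
    total = 0.0
    for lo, hi in zip(bounds, bounds[1:]):
        seg = xs[lo:hi]
        m = sum(seg) / len(seg)
        total += sum((x - m) ** 2 for x in seg)
    return total / n


def penalised_partition(xs, max_cuts, c_n):
    """Maximise Q_pi = -(N/2) log A_pi(N) - #(pi) * c_n over 1..max_cuts cuts."""
    n = len(xs)
    best_q, best_pi = -math.inf, None
    for q in range(1, max_cuts + 1):
        for cuts in combinations(range(1, n), q):
            a = pooled_variance(xs, cuts)
            if a <= 0.0:
                continue  # degenerate fit; skipped in this sketch
            val = -0.5 * n * math.log(a) - q * c_n
            if val > best_q:
                best_q, best_pi = val, cuts
    return best_pi, best_q
```

The penalty plays the role of a model selection term: an extra cut point is accepted only if it raises the maximised log likelihood by more than `c_n`.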
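In both cases $\hat{k}_1, \ldots, \hat{k}_{\hat{l}}$ can be grouped into sets $M_1, M_2, \ldots$, by the
following procedure. Let $\hat{k}_1$ be an element of $M_1$. For $\hat{k}_2$, if $\hat{k}_2 - \hat{k}_1 < D_N^{(i)}$,
then $\hat{k}_2 \in M_1$, otherwise $\hat{k}_2 \in M_2$, where $i = 1, 2$ corresponds to cases (i) or
(ii), respectively. Continuing this procedure, we get, for $\hat{k}_l \in M_s$,
$$\hat{k}_{l+1} \in \begin{cases} M_s & \text{if } \hat{k}_{l+1} - \hat{k}_l < D_N^{(i)}, \\ M_{s+1} & \text{otherwise}, \end{cases} \qquad i = 1, 2.$$
Thus, $\hat{k}_1, \ldots, \hat{k}_{\hat{l}}$ are grouped into sets $M_1, \ldots, M_{\hat{q}}$. Here we note that the $M_j$
are all dependent on $N$. Krishnaiah, Miao, Subramanyam and Zhao (1987) proved
this result:
With probability one for large $N$,
$$(\hat{q}, \hat{k}_{t_1}/N, \ldots, \hat{k}_{t_{\hat{q}}}/N) \to (q, t_1, \ldots, t_q),$$
where $\hat{k}_{t_j}$ is any element of $M_j$.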
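The grouping step is a single pass over the ordered cut points, as the following Python sketch shows. The function name `group_cut_points` and the argument `d_n` (standing for $D_N^{(i)}$) are illustrative.

```python
def group_cut_points(ks, d_n):
    """Group ordered cut points: a point joins the current set when it lies
    less than d_n beyond the previous point, otherwise it opens a new set."""
    groups = []
    for k in sorted(ks):
        if groups and k - groups[-1][-1] < d_n:
            groups[-1].append(k)
        else:
            groups.append([k])
    return groups
```

The number of groups returned is the estimate $\hat{q}$, and any element of the $j$-th group, divided by $N$, may serve as the estimate of $t_j$.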
5.3. Local likelihood estimation (LLE)
The previous results are difficult to put into practical use if the number of
change points is rather large. In this connection Krishnaiah, Miao and Zhao
(1987) developed a new method, so-called local likelihood method, to estimate the
position and the number of change points. This procedure is feasible computationally. Now we introduce this procedure.
Consider model (5.1). For every k, k = m . . . . . N - m, we can construct that
Ak(N ) = ½(A lk(N) + Auk(N)),
P. R. Krishnaiah, B. Q. Miao
AI~(N) m-1
(xi -
l k)(Xi
-- X k - r n +
l k)'
A2k(N ) = m- '
(Xi -- X ~ + m ) ( X i
-- Xk~ +m)' ,
B k ( N ) = (2m)
(x, - ~ - m +,*+m)(X,
-- ~ * - m
GN(k )
+ ~ ~ +m)'
m log [Ak(N)[ - m log ]Bk(N)] ,
where X u is defined by (3.10).
When all the $A_j$ are equal to $A$, take $m = m_N$ which satisfies (5.8). Define
\[ D_N = \{k : k = m, m+1, \ldots, N-m,\ -G_N(k) > C_N^{(1)}\}, \]
\[ k_{1N} = \min\{k : k \in D_N\}, \]
\[ D_{1N} = \{k : k \in D_N,\ k - k_{1N} < 3m\}, \]
\[ k_{2N} = \min\{k : k \in D_N - D_{1N}\}, \]
\[ D_{2N} = \{k : k \in D_N - D_{1N},\ k - k_{2N} < 3m\}. \]
Continuing this procedure, we obtain
\[ D_N = D_{1N} + \cdots + D_{\hat q N}, \]
where each $D_{jN}$, $j = 1, \ldots, \hat q$, is not empty. Put
\[ \hat t_j = \frac{1}{2N}\bigl\{k_{jN} + \max(k : k \in D_{jN})\bigr\}, \qquad j = 1, \ldots, \hat q. \]
THEOREM 5.3 (Krishnaiah, Miao and Zhao, 1987b). $(\hat q, \hat t_1, \ldots, \hat t_{\hat q})$ is a strongly consistent estimate of $(q, t_1, \ldots, t_q)$.
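The construction of $D_N$, its blocks $D_{jN}$ and the estimates $\hat t_j$ can be sketched as follows. This is a hedged illustration rather than the authors' implementation: the function name `lle_estimates`, the dictionary `g` mapping $k$ to an already computed $G_N(k)$, and the numerical values are assumptions made for the example.

```python
def lle_estimates(g, m, c, n):
    """Return (q_hat, [t_hat_1, ...]) from local likelihood statistics.

    g : dict mapping k -> G_N(k) for k = m, ..., N - m
    m : half-window length, c : threshold C_N, n : sample size N
    """
    d_n = sorted(k for k, val in g.items() if -val > c)   # the set D_N
    t_hat = []
    while d_n:
        k_min = d_n[0]                                     # k_{jN}
        block = [k for k in d_n if k - k_min < 3 * m]      # D_{jN}
        t_hat.append((k_min + max(block)) / (2 * n))       # midpoint divided by N
        d_n = [k for k in d_n if k - k_min >= 3 * m]       # remove D_{jN}
    return len(t_hat), t_hat


# Hypothetical statistics: strongly negative near k = 30 and k = 70.
g = {k: 0.0 for k in range(5, 96)}
for k in list(range(28, 33)) + list(range(68, 74)):
    g[k] = -10.0
print(lle_estimates(g, m=5, c=5.0, n=100))
```

Each block collects the exceedances lying within $3m$ of its smallest element, and the estimate is the midpoint of the block rescaled to $(0, 1]$.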
Proceeding along the same lines as above, we can obtain a number of results
concerning this estimate. For example, we have
THEOREM 5.4. Suppose that there is no prior information about $A_j$, $j = 1, \ldots, q + 1$. Let
\[ G_N(k) = \frac{m}{2}\log|A_{1k}(N)| + \frac{m}{2}\log|A_{2k}(N)| - m\log|B_k(N)|, \]
where $A_{\gamma k}(N)$, $\gamma = 1, 2$, and $B_k(N)$ are defined in (5.15)-(5.17). Suppose $m = m_N$ satisfies (5.9). Define $D_N$, $D_{jN}$, $\hat t_j$, $j = 1, \ldots, \hat q$, by (5.19)-(5.23). Then
$(\hat q, \hat t_1, \ldots, \hat t_{\hat q})$ is a strongly consistent estimator of $(q, t_1, \ldots, t_q)$.
5.4. MLE of change points with restricted condition in mean
Bartholomew (1959) first proposed the following testing problem. Suppose
$X_1, \ldots, X_N$ are independently normally distributed, $X_i \sim N(\mu_i, \sigma_i^2)$, $i = 1, \ldots, N$,
and $\sigma_i$, $i = 1, \ldots, N$, are known. It is desired to test whether $X_1, \ldots, X_N$ have
the same mean when the rank order of these means is known. He introduced a
test statistic, but did not consider the estimation problem. Our local likelihood
method is especially suited to estimation of this type, whether the variances are
equal or not. The case where only one change point exists was investigated by Sen
and Srivastava (1975) and Holbert and Broemeling (1977), among others.
In the model (5.1), let $p = 1$ and let $X_i = X(i/N)$, $i = 1, \ldots, N$, be independent normal
variables whose means $\mu_j$, defined by (5.3), satisfy
\[ \mu_1 > \cdots > \mu_{q+1}, \]
\[ \operatorname{Var} X(i/N) = \lambda_j, \qquad k_{j-1} < i \le k_j, \quad j = 1, \ldots, q + 1. \]
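Take a positive integer $m = m_N < N$ which will be defined below. For $k = m, m + 1, \ldots, N - m$, we assume that
\[ E X_{k-m+1} = \cdots = E X_k = \mu^{(1)}, \qquad E X_{k+1} = \cdots = E X_{k+m} = \mu^{(2)}, \]
\[ \operatorname{Var} X_{k-m+1} = \cdots = \operatorname{Var} X_k = \lambda^{(1)}, \qquad \operatorname{Var} X_{k+1} = \cdots = \operatorname{Var} X_{k+m} = \lambda^{(2)}, \]
where $\lambda^{(i)}$, $\mu^{(i)}$, $i = 1, 2$, need not equal $\lambda_i$, $\mu_i$, $i = 1, 2$, respectively.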
Case (i). All the $\lambda_j$'s are equal to $\lambda$. The logarithm of the likelihood ratio
statistic for testing the null hypothesis $H_k$: $\mu^{(1)} = \mu^{(2)}$ against the alternative
$K_k$: $\mu^{(1)} > \mu^{(2)}$ is given by
\[ G_N(k) = m \log\bigl(A_k(N) B_k^{-1}(N)\bigr)\, I(\bar X_{k-m+1,k} > \bar X_{k+1,k+m}), \]
where $\bar X_{ij}$ is defined by (3.10), $I(A)$ denotes the indicator of a set $A$, and $A_k(N)$
and $B_k(N)$ are defined by (5.14) and (5.17).
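The effect of the indicator is that a window contributes only when the left mean exceeds the right mean, in line with the ordering $\mu_1 > \cdots > \mu_{q+1}$. A small Python sketch, assuming the statistic keeps the form $m \log A_k(N) - m \log B_k(N)$ of Section 5.3 for $p = 1$; the name `one_sided_stat` and the data are illustrative only.

```python
import math


def one_sided_stat(x, k, m):
    """One-sided local statistic for scalar data with decreasing means.

    Returns 0.0 when the left-window mean does not exceed the right-window
    mean (the indicator is zero); otherwise m*log(A_k) - m*log(B_k).
    """
    def mean(w):
        return sum(w) / len(w)

    def var(w):
        mu = mean(w)
        return sum((v - mu) ** 2 for v in w) / len(w)

    left, right = x[k - m:k], x[k:k + m]
    if mean(left) <= mean(right):
        return 0.0
    a_k = 0.5 * (var(left) + var(right))
    return m * (math.log(a_k) - math.log(var(x[k - m:k + m])))


down = [5.0, 6.0, 5.0, 6.0, 0.0, 1.0, 0.0, 1.0]   # mean drops: counted
up = list(reversed(down))                         # mean rises: ignored
print(one_sided_stat(down, 4, 4) < 0, one_sided_stat(up, 4, 4))  # True 0.0
```

An upward shift, which contradicts the assumed order of the means, therefore never enters $D_N$.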
Take $m = m_N$ and $C_N^{(1)}$ to satisfy (5.8). Define $D_N$, $D_{jN}$, $\hat t_j$, $j = 1, \ldots, \hat q$, by
(5.19)-(5.23). Then Krishnaiah, Miao and Zhao (1986) proved the following theorem.
THEOREM 5.5. Under case (i), $(\hat q, \hat t_1, \ldots, \hat t_{\hat q})$ is a strongly consistent estimate
of $(q, t_1, \ldots, t_q)$.
Case (ii). The only thing known about the $\lambda$'s is that $\lambda_i > 0$, $i = 1, \ldots, q + 1$. By the
same methodology, we have
THEOREM 5.6. Let $m = m_N$ and a positive number $C_N^{(2)}$ satisfy (5.9). Define $D_N$,
$D_{jN}$, $\hat t_j$, $j = 1, \ldots, \hat q$, by (5.19)-(5.23). Then $(\hat q, \hat t_1, \ldots, \hat t_{\hat q})$ obtained from the above
procedure is a strongly consistent estimate of $(q, t_1, \ldots, t_q)$.
5.5. Non-parametric estimation
Quite a lot of papers have appeared that handle the change point problem by nonparametric methodology. Since in this book Csörgő and Horváth have made a
detailed survey of this subject, we shall content ourselves with some supplementary remarks.
Yin (1986) proposed a method to search for the change points by comparisons
made locally. Specifically, he considered the model (1.1), in which the non-random
function $f$ may have discontinuity points $t_1, \ldots, t_q$ of the first type, which he
defined as the change points of the model. The function $f$ is supposed to obey
the Lipschitz condition within each interval $[a, b] \subset (0, 1]$ not containing
$t_1, \ldots, t_q$.
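Suppose that we have observed $x(i/N)$, $1 \le i \le N$. Choose a positive integer
$m = m_N$ appropriately, and define
\[ D_{Nk} = \frac{1}{m}\Bigl\{ \sum_{i=1}^{m} x\Bigl(\frac{k+i}{N}\Bigr) - \sum_{i=1}^{m} x\Bigl(\frac{k-m+i}{N}\Bigr) \Bigr\}. \]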
Intuitively it is clear that when $k/N \approx t_i$ for some $i$, $|D_{Nk}|$ tends to be large.
Otherwise it will be smaller. This simple observation suggests the following
procedure: Choose $h_N > 0$, $N = 1, 2, \ldots$, and define
$l_1$ = the smallest $k$ such that
\[ |D_{Nk}| - (k/N)h_N = \max_j \bigl(|D_{Nj}| - (j/N)h_N\bigr), \]
$l_2$ = the smallest $k$ such that
\[ |D_{Nk}| - (k/N)h_N = \max\{|D_{Nj}| - (j/N)h_N : |j - l_1| > 4m_N\}, \]
and, in general, $l_s$ = the smallest $k$ such that
\[ |D_{Nk}| - (k/N)h_N = \max\{|D_{Nj}| - (j/N)h_N : |j - l_i| > 4m_N,\ 1 \le i \le s - 1\}. \]
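The successive choice of $l_1, l_2, \ldots$ can be sketched in Python as below. The sketch assumes that $D_{Nk}$ is the difference between the averages of the $m$ observations just after and just before position $k$; the exact index ranges, the names `local_jump` and `select_points`, the argument `count` and the test series are assumptions of this illustration.

```python
def local_jump(x, k, m):
    """Average of the m observations after k minus the average of the m before."""
    return (sum(x[k:k + m]) - sum(x[k - m:k])) / m


def select_points(x, m, h, count):
    """Pick l_1, ..., l_count by repeatedly maximising |D_Nk| - (k/N)*h.

    Every new point must lie more than 4*m away from all earlier ones;
    ties are resolved by taking the smallest k.
    """
    n = len(x)
    score = {k: abs(local_jump(x, k, m)) - (k / n) * h
             for k in range(m, n - m + 1)}
    chosen = []
    for _ in range(count):
        allowed = [k for k in score
                   if all(abs(k - l) > 4 * m for l in chosen)]
        if not allowed:
            break
        best = max(score[k] for k in allowed)
        chosen.append(min(k for k in allowed if score[k] == best))
    return chosen


# Noise-free step function with a single jump of size 1 at the midpoint.
x = [0.0] * 50 + [1.0] * 50
print(select_points(x, m=5, h=0.1, count=2))  # [50, 5]
```

In this toy series the first selected point sits at the jump, while the second has a local difference of zero and is chosen only because the penalty $(k/N)h_N$ favours small $k$.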
The following theorem is true.
THEOREM 5.7 (Yin, 1986). Suppose that $m_N/N \to 0$, $h_N \to 0$, and $m_N/(N h_N) \to 0$.
Then with probability one we have
(i) $|l_j/N - t_j| \le 2m_N/N$, $1 \le j \le q$, for $N$ large,
(ii) $D_{N l_j} \to f(t_j + 0) - f(t_j - 0)$, the jump at $t_j$,
(iii) $D_{N l_j} = O(h_N)$, $j > q$.
Based on this theorem, if we choose $C_N = |h_N \log h_N|$ and pick those integers
$k_1, \ldots, k_{\hat q}$ such that $|D_{N k_i}| > C_N$, $1 \le i \le \hat q$, then with probability one the
number $\hat q$ of such integers tends to $q$, the number of change points, and
$k_1/N, \ldots, k_{\hat q}/N$ (for such $k$) tend to the change points $t_1, \ldots, t_q$.
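The thresholding step can be illustrated as follows. By Theorem 5.7 the local differences at true change points tend to nonzero jumps while the remaining ones are $O(h_N)$, so the sketch keeps a candidate when $|D|$ exceeds $C_N = |h_N \log h_N|$. The function name `count_change_points` and the candidate values are hypothetical.

```python
import math


def count_change_points(candidates, h):
    """Keep the candidates (k, D_Nk) whose |D_Nk| exceeds C_N = |h*log(h)|."""
    c_n = abs(h * math.log(h))
    kept = [(k, d) for k, d in candidates if abs(d) > c_n]
    return len(kept), kept


# Hypothetical candidates: one real jump and two small local differences.
q_hat, kept = count_change_points([(50, 1.0), (5, 0.0), (80, 0.05)], h=0.1)
print(q_hat, kept)  # 1 [(50, 1.0)]
```

With $h_N = 0.1$ the threshold is about $0.23$, which separates the unit jump from the two small differences.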
Chen (1987) considered the case where at most one change is allowed, and $f$
is a step function:
\[ f(t) = \begin{cases} a, & 0 < t \le t_0, \\ a + \theta, & t_0 < t \le 1, \end{cases} \]
where $a$, $\theta$ and $t_0$ are unknown. In this simple case Chen derived the asymptotic
distribution of the test statistic under the null hypothesis that no change point exists.
THEOREM 5.8 (Chen, 1987). Suppose that $x_1, \ldots, x_N$ are iid with a common
normal distribution $N(a, \sigma^2)$. Let $m = m_N$, $N = 1, 2, \ldots$, be positive integers such that
\[ \lim_{N \to \infty} m/N = 0, \qquad \lim_{N \to \infty} (\log N)^2/m = 0. \]
Yk = (2m) 1/2
t 2m
xi -
~u = max {I Ykl: k = m, . . . , N -