Comparison of Evidence Theory and Bayesian Theory
for Uncertainty Modeling
Prabhu Soundappan
Efstratios Nikolaidis¹
Mechanical, Industrial and Manufacturing Department
The University of Toledo
Toledo, OH-43606
USA
Email (Nikolaidis) [email protected]
R. T. Haftka
Department of Aerospace Engineering, Mechanics and Engineering Science
The University of Florida
Gainesville, FL 32611-6250
USA
Ramana Grandhi
Department of Mechanical and Materials Engineering
Wright State University
Dayton, OH 45435
USA
Robert Canfield
Air Force Institute of Technology
WPAFB, OH 45433
USA
Abstract
This paper compares Evidence Theory (ET) and Bayesian Theory (BT) for uncertainty
modeling and decision under uncertainty, when the evidence about uncertainty is
imprecise. The basic concepts of ET and BT are introduced and the ways these theories
model uncertainties, propagate them through systems and assess the safety of these
systems are presented. ET and BT approaches are demonstrated and compared on
challenge problems involving an algebraic function whose input variables are uncertain.
The evidence about the input variables consists of intervals provided by experts. It is
recommended that a decision-maker compute both the Bayesian probabilities of the
outcomes of alternative actions and their plausibility and belief measures when evidence
about uncertainty is imprecise, because this helps assess the importance of imprecision
and the value of additional information. Finally, the paper presents and demonstrates a
method for testing approaches for decision under uncertainty in terms of their
effectiveness in making decisions.
¹ Corresponding author
Introduction
The information in many problems of design under uncertainty, especially those
involving reducible (epistemic) uncertainty, is imprecise.
Reducible uncertainty is
uncertainty due to lack of knowledge, as opposed to random (aleatory) uncertainty, which
is due to inherent variability in a physical phenomenon. It is called reducible, because it
can be reduced or eliminated if one collects information.
The uncertainty in the
probability of getting heads in one flip of a bent coin is reducible, because it is due to
lack of knowledge. It can be reduced if we conduct experiments. On the other hand, there
is aleatory uncertainty when flipping a coin even if we know the probabilities of the events
heads and tails. This type of uncertainty cannot be reduced even if we conduct n
experiments, where n is a very large number. For example, we know that the probability of
heads and the probability of tails for a fair coin are both 0.5, but each time we flip the coin we
are uncertain about the outcome. So this uncertainty is also called irreducible uncertainty. Oberkampf et al. [1]
explained these two types of uncertainty and presented examples in which aleatory and
epistemic uncertainty are encountered in engineering problems.
There is no consensus about what the best theory is for modeling epistemic
uncertainty. Oberkampf et al. [1] studied the differences and similarities of epistemic and
random uncertainty.
In their study, they used a hybrid approach in which random
uncertainty was modeled using probability and epistemic uncertainty was modeled using
intervals bounding variables in which there was epistemic uncertainty. Theories for
modeling epistemic uncertainty, include Coherent Upper and Lower Previsions [2-4],
Possibility Theory [5-8], Evidence Theory [9-10], the Transferable Belief Model [11] and
Bayesian Theory [12-13]. Information gap-Decision Theory is another alternative for
decision making under uncertainty when information about uncertainty is scarce [14].
Information about epistemic uncertainty is usually in the form of intervals. For
example, if we show a glass jar containing beans to a person and ask her how many beans
are in the jar, then she is more likely to give a range rather than a precise number.
Similarly, if we ask an expert what he thinks the prime interest rate will be in 2007, he
will probably provide a range rather than a single number. The same is true, if we ask an
expert about the probability of getting “heads” in a flip of a bent coin.
The objective of the study presented in this paper is to compare two approaches,
one using Evidence Theory (ET), the other using Bayesian Theory (BT), for
characterizing uncertainty in situations, such as the ones presented in the previous
paragraph, where the information about uncertainty is imprecise.
Specifically, the following problem is considered:
The performance of a system is characterized by variable Y, which is a function of
uncertain variables X1,...,Xm. We know the relation between the performance variable Y
and variables X1,...,Xm, $y = f(x_1, \ldots, x_m)$.² We have information about the values of
variables X1,…,Xm, which is in the form of intervals obtained from n experts. These
intervals have the form $[x^l_{i,j}, x^u_{i,j}]$, where subscript i specifies the variable and j
specifies the expert. Suppose that a system survives if the performance variable Y falls in
a certain interval or collection of intervals denoted by S. There is no uncertainty in the
functional relation of the performance variable and the input variables, and in the
definition of survival. We want to model the uncertainty in the independent variables,
derive a model about the uncertainty in the performance variable Y and assess safety.
² In this report, capital letters denote variables and lower case letters denote the values that these variables assume.
First, the assumptions of the ET based approach are presented and then an
approach for constructing models of uncertainty is developed on the basis of these
assumptions, in section 2. A method for propagating uncertainty through a system to
estimate the uncertainty in the response from the uncertainty in the input is shown.
Finally, equations for computation of the Belief and Plausibility of failure of the system
are presented. A simple example of an algebraic function is used to demonstrate each
step of the approach.
Section 3 presents a Bayesian approach for constructing models of uncertainty.
First, Bayes rule for updating the prior mass function of a discrete variable or the
probability density function of a continuous variable is reviewed. Then an approach for
constructing a model of uncertainty using Bayes rule from expert evidence about the
random variables, which is in the form of intervals, is presented. An example
demonstrating each step of this approach is also included.
In section 4, ET and BT approaches are demonstrated and compared on a series of
challenge problems involving epistemic uncertainty proposed by the Epistemic
Uncertainty Group [15].
As mentioned earlier, decision-makers have an arsenal of different theories and
methods based on these theories for making decisions under uncertainty. There is no
consensus as to what method is most suitable for problems with epistemic uncertainty,
when information is scarce and imprecise. Comparisons of alternative approaches on the
basis of their effectiveness in making decisions under uncertainty could help us better
understand these methods and assess their effectiveness in modeling epistemic uncertainty.
Section 5 proposes an approach for comparing methods for the solution of the challenge
problems. The approach uses alternative methods for characterizing uncertainty to make
decisions, the outcomes of which are later evaluated through numerical simulations or
physical experiments.
2. Evidence Theory Approach
Assumptions
The following are the key assumptions of the ET approach:
1. If some of the evidence is imprecise we can quantify uncertainty about an event
by the maximum and minimum probabilities of that event. Maximum (minimum)
probability of an event is the maximum (minimum) of all probabilities that are
consistent with the available evidence.
2. The process of asking an expert about an uncertain variable is a random
experiment whose outcome can be precise or imprecise. There is randomness
because every time we ask a different expert about the variable we get a different
answer. The expert can be precise and give a single value or imprecise and
provide an interval. Therefore, if the information about uncertainty consists of
intervals from multiple experts, then we have uncertainty due to both imprecision
and randomness.
If all experts are precise they give us pieces of evidence pointing precisely to
specific values. In this case, we can build a probability distribution of the variable. But if
the experts provide intervals, we cannot build such a probability distribution because we
do not know what specific values of the random variables each piece of evidence
supports. In this case, we can use second order probability,³ or we can calculate the
maximum and minimum values of the probabilities of events. The latter approach does
not require any additional information beyond what is already available.
To demonstrate this philosophy of calculating maximum and minimum bounds
for the probability of an event when the evidence is imprecise, consider the following
problem. We roll a weighted die n times and videotape the results. The statistical error is
negligible because n is large. Later we discover that we cannot determine precisely the
outcomes in the experiments from the tape. We can only tell that 40% of the experiments
resulted in a number less than or equal to three and the other 60% in a number greater than
three.
In this case, we cannot estimate the probability of each of the numbers 1-6,
unless we make arbitrary assumptions about the likelihood of getting numbers between 1
and 3 and between 4 and 6. But we could estimate that there is a 0.4 probability of getting a
number between 1 and 3 and a 0.6 probability of getting a number between 4 and 6. Instead of making
additional assumptions, such as that all numbers between 1 and 3 are equally likely, we
could conclude that every number from 1 to 3 can have a probability as high as 0.4 and as
low as 0 and every number from 4 to 6 can have a probability as high as 0.6 and as low as
0.
Modeling uncertainty
First consider one variable X1. The information consists of intervals obtained from
n experts, $[x^l_{1,j}, x^u_{1,j}]$, each thought to enclose the precise value $x_{1,j}$. The intervals can be
nested, in which case we have consonant evidence, they may overlap, or they may be
³ Second order probability treats the variables associated with epistemic uncertainty as random variables
with their own probability distributions and computes a probability distribution of the probability of occurrence
of an event. For example, in an experiment that involves flipping a bent coin the probability of the event
“heads” is treated as a random variable.
disjoint, in which case we have conflicting evidence. When an expert provides an
interval instead of a value, then the expert is telling us that the true value of the variable
could be anywhere in this interval. Therefore, the evidence from the expert could or could
not support a particular value in that interval. The maximum probability of the variable
being equal to x is the ratio of the pieces of the imprecise evidence from the experts that
could support x to the total number of intervals. For example, if experts 1 and 2 told us
that the price of gas ten years from now will be between $1 and $5 and between $1 and $10,
respectively, then on the basis of this evidence, the probability of any value between $1
and $5 could be as high as 1 and the probability of any value greater than $5 and less than or
equal to $10 could be as high as 0.5.
The maximum probability of $X_1 = x$, $P^u(X_1 = x)$, can be found by solving the
following optimization problem:

Find $x_{1,1}, \ldots, x_{1,n}$

to maximize $P(X_1 = x) = \frac{1}{n} \sum_{i=1}^{n} I_i$     (1)

where $I_i$ is an indicator function: $I_i = \begin{cases} 1 & x_{1,i} = x \\ 0 & \text{otherwise} \end{cases}$

so that $x_{1,i} \in [x^l_{1,i}, x^u_{1,i}]$
This maximum probability will also be called Plausibility. The above formulation
indicates that the maximum probability of X1=x is the ratio of the number of intervals of
the experts containing x to the total number of intervals.
The minimum probability of $X_1 = x$, $P^l(X_1 = x)$, can be found by solving the following
dual optimization problem:

Find $x_{1,1}, \ldots, x_{1,n}$

to minimize $P(X_1 = x) = \frac{1}{n} \sum_{i=1}^{n} I'_i$     (2)

where $I'_i$ is an indicator function: $I'_i = \begin{cases} 1 & x^l_{1,i} = x^u_{1,i} = x \\ 0 & \text{otherwise} \end{cases}$

so that $x_{1,i} \in [x^l_{1,i}, x^u_{1,i}]$
This minimum probability will also be called Belief.
From the above formulation, we conclude that the minimum probability of X1=x
is the ratio of the number of intervals that coincide with point {x} to the total number of
intervals. This probability is zero unless there is precise evidence pointing at x.
One can easily extend the formulations of the above two optimization problems to
find the maximum and minimum probabilities of any event associated with variable X1,
such as the event that X1 assumes a value in a given interval or set of intervals. The
Plausibility and Belief can also be found by solving equations (8) and (9), once the body of
evidence of the input variables is resolved. The body of evidence is formed from the
intervals given by the experts using the mixing or averaging technique. We found this to
be the most intuitive technique when one does not have any knowledge about the experts.
The evidence can also be combined using several other techniques, such as Dempster's rule of
combination, the Discount+Combine method, Yager's modified Dempster's rule, and Inagaki's
unified combination rule. These combination rules are studied in detail
in [10].
The following assertion relates maximum and minimum probabilities to
Plausibility and Belief, which are used in evidence theory [6] to characterize one’s belief
about the occurrence of events. Evidence theory can be viewed as an extension of
probability theory.
It is suitable for characterizing uncertainty when evidence is
imprecise because it allows one to estimate probabilities of intervals instead of
probabilities of specific values. These intervals are called focal elements and their
probabilities basic probabilities.
Assertion
Consider the experts providing n intervals about a variable. These intervals are
considered as focal elements and have basic probability 1/n. Then the Plausibility and
Belief of any event associated with the variable are equal to the maximum and minimum
probabilities of the event, respectively.
Justification
The maximum probability of an event represented by a set C is equal to the ratio of
the number of focal elements (intervals provided by the experts) that intersect with C to
the total number of focal elements. Indeed, all of the evidence provided by the experts,
consisting of intervals intersecting with C, could support C because, according to these
experts, the true value of the variable could be in C. The rest of the evidence cannot
support C because the rest of the intervals and C are disjoint. That means the maximum
probability of C is equal to the sum of the basic probabilities of the focal elements that
intersect C, which is the Plausibility of C. Similarly, the minimum probability of C is the
number of the focal elements contained in C divided by the total number of focal
elements, which is the Belief of C.
The following examples are based on the challenge problems [15].
Example 1: Two experts said that variable A is in the following intervals: [0.1, 0.4] and
[0.3,0.6]. Two experts said that variable B is in the following intervals: [0.2, 0.5] and
[0.4,0.7]. Figure 1 shows the maximum probabilities (Plausibility) of A and B,
respectively.
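To make the assertion concrete, the following is a minimal sketch (our own illustration, not code from the paper) that treats each expert interval as a focal element with basic probability 1/n, and computes the Plausibility of an interval event as the fraction of focal elements that intersect it and the Belief as the fraction contained in it. The event at the end is a hypothetical example.

```python
# Plausibility/Belief of an interval event from expert intervals (focal elements with m = 1/n).

def plausibility(focal_elements, event):
    """Fraction of focal elements [lo, hi] that intersect the event interval."""
    c_lo, c_hi = event
    hits = sum(1 for lo, hi in focal_elements if hi >= c_lo and lo <= c_hi)
    return hits / len(focal_elements)

def belief(focal_elements, event):
    """Fraction of focal elements [lo, hi] fully contained in the event interval."""
    c_lo, c_hi = event
    hits = sum(1 for lo, hi in focal_elements if lo >= c_lo and hi <= c_hi)
    return hits / len(focal_elements)

# Expert intervals for A and B from Example 1.
A_intervals = [(0.1, 0.4), (0.3, 0.6)]
B_intervals = [(0.2, 0.5), (0.4, 0.7)]

# Illustrative event: "A is at most 0.35" (not an event discussed in the paper).
event = (0.0, 0.35)
print(plausibility(A_intervals, event), belief(A_intervals, event))  # 1.0 and 0.0
```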
Consider m variables. If there is no information about the correlation of the
variables and the experts are equally credible, we can transfer the evidence about each
variable into the m-dimensional space of all the variables using the following equation:
$m_{X_1 X_2 \ldots X_m}([x_1^l, x_1^u], [-\infty, +\infty], \ldots, [-\infty, +\infty]) = \frac{m_{X_1}([x_1^l, x_1^u])}{m}$     (3)

In the above equation, $m_{X_1}([x_1^l, x_1^u])$ is the basic probability of the interval
$[x_1^l, x_1^u]$, and $m_{X_1 X_2 \ldots X_m}([x_1^l, x_1^u], [-\infty, +\infty], \ldots, [-\infty, +\infty])$ is the basic
probability of the same interval in the m-dimensional space of the variables.
This
equation can be justified as follows: if an expert says that a variable is in a given interval
this is true for both the space of that variable and the m-dimensional space of the m
variables. But if we have m bodies of evidence for m variables, then the evidence must be
normalized by m when transferring evidence from the one-dimensional space of a
variable to the m-dimensional space.
If we know that the variables are independent (that is, information about any
group of variables does not change our belief about the others) then we can use the
following approach to combine the evidence about the variables into a single joint body
of evidence. a) Focal elements of the joint body of evidence are the elements of the
Cartesian product of the elements of the evidence about the individual variables. b) The
probability of each element in the m-dimensional space is the product of the individual
probabilities.
$m_{X_1 X_2 \ldots X_m}([x_1^l, x_1^u], \ldots, [x_m^l, x_m^u]) = m_{X_1}([x_1^l, x_1^u]) \cdot \ldots \cdot m_{X_m}([x_m^l, x_m^u])$     (4)
The above equation is a special case of Dempster’s rule of combination when the
bodies of evidence from different experts are independent and equally credible. Since the
rule has been justified in other publications, such as [9], we will not justify it here.
From the joint body of evidence, we can estimate the maximum joint probability
of the variables.
Example 2: This is a continuation of Example 1. If A and B are independent, the joint
body of evidence is shown in Fig. 2. Specifically, this figure shows the focal elements of
the joint body of evidence, which are the four rectangles (boxes) in Fig. 2. The four boxes
in Fig. 2 are the result of the Cartesian product of the individual bodies of evidence of
variables A and B. The maximum joint probability (Plausibility) of these variables is
shown in Figure 3.
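A short sketch of Eq. (4), again our own illustration: for independent variables, the joint focal elements are the Cartesian product of the marginal focal elements, and each joint basic probability is the product of the marginal basic probabilities.

```python
# Joint body of evidence for independent variables (Eq. 4).
from itertools import product

def joint_body_of_evidence(*marginal_bodies):
    """Each marginal body is a list of (interval, basic_probability) pairs.
    Returns a list of (box, basic_probability) pairs, where a box is a tuple of intervals."""
    joint = []
    for combo in product(*marginal_bodies):
        box = tuple(interval for interval, _ in combo)
        m = 1.0
        for _, prob in combo:
            m *= prob
        joint.append((box, m))
    return joint

body_A = [((0.1, 0.4), 0.5), ((0.3, 0.6), 0.5)]   # Example 1 intervals, m = 1/2 each
body_B = [((0.2, 0.5), 0.5), ((0.4, 0.7), 0.5)]
joint = joint_body_of_evidence(body_A, body_B)     # four boxes, each with m = 1/4
```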
Propagating uncertainty through a system
Here we compute the maximum probability of variable Y, which is related to the
input variables through the function $y = f(x_1, \ldots, x_m)$, from the joint body of evidence about
X1, …,Xm. First, we transform the joint body of evidence about the input variables into
evidence about variable Y. For this purpose, we map the focal elements in the m-dimensional
space of the independent variables into the elements in the space of variable
Y. We can do this by solving one maximization and one minimization problem to find
the limits of variable Y, when the input variables vary within each focal element in the
joint probability space. Mathematically, we solve a pair of optimization problems to find
the interval in the space of variable Y given the focal element in the space of the variables
X1, …,Xm.
Find $X_1, \ldots, X_m$

to minimize (maximize) $Y = f(X_1, \ldots, X_m)$     (5)

so that $X_i \in [X_i^l, X_i^u]$, where $X_i^l$ and $X_i^u$ are the lower and upper bounds of $X_i$
corresponding to each focal element in the joint space.
Example 3: This is an extension of Example 2. After calculating the joint body of
evidence of the variables, the limits of Y can be found using Eq. (5). In Fig. 2, we have four
boxes in the joint space of the variables as the outcome of the Cartesian product of the
individual variables. The shaded box in Fig. 2 is the product of the focal elements
A: [0.1, 0.4] and B: [0.2, 0.5]. The basic probability assigned to this shaded box (Fig. 2) is the
product of the probabilities of the individual focal elements. The corresponding limits of
Y can be found by solving Eq. (5), which gives [0.897, 1.039], the shaded ellipse in Figs. 4 and 5.
The basic probability assigned to this focal element is 1/4. Using the same procedure, the
rest of the focal elements and the basic probabilities are calculated to construct the body
of evidence of Y.
The above optimization problems can be solved using nonlinear programming or
Monte-Carlo simulation. This yields a set of intervals for Y, Ci, and the basic probabilities
of these intervals, mY(Ci). Then we compute the maximum probability of Y being equal
to y through the following equation:
$P^u(Y = y) = \sum_{y \in C_i} m_Y(C_i)$     (6)
In the above equation, Ci are the intervals that contain y.
We can also find the maximum and minimum cumulative probability distribution
functions (UCDF and LCDF, respectively) of Y. These functions provide the maximum
and minimum values of the probability of Y being less than or equal to a value y. The
UCDF is obtained using the following equation:
$F_Y^u(y) = \sum m_Y(C_i)$     (7)
The sum in the above equation includes all the elements Ci that intersect with
interval [-∞, y]. The LCDF is obtained from the same equation but the sum includes all
elements that are contained in the interval [-∞, y].
Example 4: Consider the function $Y = (A+B)^A$. The maximum probabilities of variables A and
B were found in Example 1 and the joint maximum probability of A and B in Example 2.
Example 3 explains how we construct the body of evidence of Y from the joint space.
Figures 4 and 5 show the body of evidence of variable Y, and the corresponding
maximum probability and the maximum and minimum cumulative probability
distribution functions of Y, respectively.
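The propagation step can be sketched as follows (our own illustration, not the authors' code). For each joint focal element (box) of Example 2, the image interval of $Y = (A+B)^A$ is approximated here by evaluating the function on a dense grid over the box, whereas the paper solves the min/max problems of Eq. (5) by nonlinear programming or Monte Carlo simulation; the resulting body of evidence of Y then gives the cumulative bounds of Eq. (7).

```python
# Propagation of the joint body of evidence through Y = (A + B)**A, and CDF bounds of Y.
import numpy as np

def image_interval(f, box, n_grid=200):
    """Approximate [min f, max f] over a rectangular box of input intervals (grid search)."""
    axes = [np.linspace(lo, hi, n_grid) for lo, hi in box]
    grids = np.meshgrid(*axes)
    values = f(*grids)
    return float(values.min()), float(values.max())

def cdf_bounds(body_Y, y):
    """Upper/lower cumulative probabilities of Y at y (Plausibility/Belief of (-inf, y])."""
    ucdf = sum(m for (lo, hi), m in body_Y if lo <= y)   # focal element intersects (-inf, y]
    lcdf = sum(m for (lo, hi), m in body_Y if hi <= y)   # focal element contained in (-inf, y]
    return ucdf, lcdf

f = lambda a, b: (a + b) ** a

# Joint body of evidence from Example 2 (four boxes, m = 1/4 each).
joint = [(((0.1, 0.4), (0.2, 0.5)), 0.25), (((0.1, 0.4), (0.4, 0.7)), 0.25),
         (((0.3, 0.6), (0.2, 0.5)), 0.25), (((0.3, 0.6), (0.4, 0.7)), 0.25)]

body_Y = [(image_interval(f, box), m) for box, m in joint]
print(cdf_bounds(body_Y, 1.0))   # (UCDF, LCDF) of Y at y = 1.0
```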
It is observed from Figure 4 that the maximum probability of Y = y is an overly
conservative measure of the likelihood of this event. Indeed, unless the probability
density function (PDF) of Y has a delta function at Y = y, the probability of this event is
zero, while the maximum probability of this event assumes a nonzero value in the range
from 0.81 to 1.17.
Figure 5 shows that there is a large gap between the maximum and minimum
bounds of the cumulative probability of Y. This indicates a large uncertainty in the true
value of this probability.
There are two reasons for this large uncertainty: a) The
intervals provided by the experts about the values of the independent variables A and B
are wide and they are nested. b) Only the information provided in the problem statement
was used to model the uncertain variables.
Assessing safety
As mentioned in the introduction, when the evidence is imprecise it is useful to
know how low and how high the probability of survival (or failure) of a system can be.
Indeed, when evidence is imprecise, it could be reasonable to design a system, whose
failure could have severe consequences, using the most conservative models that are
consistent with the available evidence. The maximum and minimum probabilities of
survival can be found using the following equations:
$P^u(S) = Pl(S) = \sum_{C_i \cap S \neq \emptyset} m_Y(C_i)$     (8)

$P^l(S) = Bel(S) = \sum_{C_i \subseteq S} m_Y(C_i)$     (9)
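As a brief continuation of the sketch following Example 4, Eqs. (8) and (9) amount to summing the basic probabilities of the focal elements of Y that intersect, or are contained in, a survival region S. The survival region used below (Y at most 1.0) is purely illustrative and is not taken from the paper.

```python
# Plausibility and Belief of survival (Eqs. 8 and 9) for an interval survival region S.
def survival_bounds(body_Y, s_lo, s_hi):
    pl = sum(m for (lo, hi), m in body_Y if hi >= s_lo and lo <= s_hi)   # C_i intersects S
    bel = sum(m for (lo, hi), m in body_Y if lo >= s_lo and hi <= s_hi)  # C_i contained in S
    return pl, bel

# body_Y is the body of evidence of Y built in the previous sketch.
pl_surv, bel_surv = survival_bounds(body_Y, float("-inf"), 1.0)
```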
Uncertainty Measures
ET considers two types of uncertainty. One is due to the imprecision in the
evidence; the other is due to the conflict. Nonspecificity and Strife measure the
uncertainty due to imprecision and conflict, respectively. Both measures are expressed in
bits of information. In the following, we briefly present these measures. A detailed
presentation can be found in [16, 17].
The larger the focal elements of a body of evidence, the more imprecise is the
evidence and, consequently, the higher is Nonspecificity. When the evidence is precise
(all of the focal elements consist of a single member), Nonspecificity is zero. In the
challenge problems, the broader the interval of the experts, the higher is Nonspecificity.
Strife measures the degree to which pieces of evidence contradict each other.
Consonant (nested) focal elements imply little or no conflict. Disjoint elements imply
high conflict in the evidence. For example, if the experts’ intervals are disjoint, the
experts contradict each other. Therefore, Strife is large. For finite sets, when evidence is
precise, Strife reduces to Shannon’s entropy, which measures conflict in probability
theory.
Nonspecificity measures the epistemic/reducible uncertainty, the uncertainty
associated with the sizes (cardinalities) of relevant sets of alternatives. Consider a body of
evidence <F,m>, where F represents the set of all focal elements and m their
corresponding basic probability assignments. Here N(m,µ) measures the Nonspecificity
in bits.
$N(m, \mu) = \sum_{A \in F} m(A) \cdot \mu(A)$     (10)

where $\mu(A) = \log_2 |A|$ for discrete domains and $\mu(A) = \ln(1 + |A|)$ for continuous
domains, $|A|$ being the cardinality of A in the discrete case and the Lebesgue measure of A in the
continuous case. The Lebesgue measure of an interval is its length.
Strife measures conflict among the various sets of alternatives in a body of
evidence. The Strife measure in evidence theory is given by

$S(m) = -\sum_{A \in F} m(A) \log_2 \left( \sum_{B \in F} m(B) \cdot SUB(A, B) \right)$     (11)

where $SUB(A, B) = \frac{|A \cap B|}{|A|}$ for finite sets and, for infinite sets,

$SUB(A, B) = \begin{cases} 1 & \text{if } A \equiv \emptyset \\ \dfrac{\lambda(A \cap B)}{\lambda(A)} & \text{if } \lambda(A) > 0 \\ 0 & \text{otherwise} \end{cases}$

Symbols $\lambda(A)$ and $\lambda(A \cap B)$ represent the Lebesgue measures of A and $A \cap B$, respectively.
3. Bayesian Approach
This section explains Bayes rule and presents an approach for constructing a
probability mass/density function of discrete and continuous variables using evidence
from experts. Methods for estimating the probability density function of the response of
a system and its probability of failure given the probability density function of the input
variables, such as Monte Carlo simulation and Fast Probability Integration, are well
documented [18]. Therefore, they will not be discussed here.
Bayes rule
Discrete case: Updating a Prior Probability Mass Function using evidence
Suppose we have information about variable X whose Prior Probability Mass
Function (PMF) is given by the set of possible values $x_1, \ldots, x_J$ and the corresponding
probabilities $P(X = x_j)$, j = 1,…,J. Then we observe a sample value of another variable
Y. The likelihood, $P(Y = y \mid X = x_j)$, is determined from the conditional PMF of
Y given X. Bayes rule can be applied to update the Prior PMF of X, when the sample
value of Y is observed, to estimate the Posterior PMF, P(X= xj | Y=y). Bayes rule for the
discrete case is:
$P(X = x_j \mid Y = y) = \dfrac{P(Y = y \mid X = x_j) \cdot P(X = x_j)}{\sum_{j=1}^{J} P(Y = y \mid X = x_j) \cdot P(X = x_j)}$     (12)
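A minimal sketch of the discrete update in Eq. (12) is shown below; the prior and likelihood values are illustrative numbers only, not data from the paper.

```python
# Discrete Bayes update (Eq. 12): posterior ∝ likelihood × prior, renormalized.
def bayes_update(prior, likelihood):
    """prior: dict x_j -> P(X = x_j); likelihood: dict x_j -> P(Y = y | X = x_j)."""
    unnormalized = {xj: likelihood[xj] * p for xj, p in prior.items()}
    total = sum(unnormalized.values())
    return {xj: v / total for xj, v in unnormalized.items()}

prior = {0.25: 1 / 3, 0.50: 1 / 3, 0.75: 1 / 3}          # three candidate values of X
likelihood = {0.25: 0.25, 0.50: 0.50, 0.75: 0.75}        # P(observed y | X = x_j)
posterior = bayes_update(prior, likelihood)              # {0.25: 1/6, 0.5: 1/3, 0.75: 1/2}
```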
Continuous case: Updating a Prior PDF using evidence [19]
The uncertainty in a continuous random variable, X, can be represented by its
PDF, $f_X^o(x)$, which can be updated on the basis of evidence, E, into a Posterior PDF,
$f_X(x \mid E)$. This function can be calculated using Bayes theorem:

$f_X(x \mid E) = \frac{1}{k} \cdot L(E \mid x) \cdot f_X^o(x)$     (13)

where $f_X(x \mid E)$ is the Posterior PDF of variable X and k is a normalization constant:

$k = \int_{-\infty}^{\infty} L(E \mid x) \cdot f_X^o(x)\, dx$     (14)

$L(E \mid x)$ is the Likelihood function, which is the conditional probability of
observing evidence E, given X = x.
If we do not know the Prior PDF of X, then we can assume a noninformative/maximum entropy Prior. If we only know that a variable is in a certain range
then the uniform probability density is the one with maximum entropy. If the evidence is
imprecise (i.e. it consists of intervals instead of single values) Bayes rule cannot be
directly applied. The analyst needs to make assumptions that are described below to
estimate the Likelihood of the evidence. The Posterior PDF of X can be sensitive to these
assumptions.
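Numerically, the update of Eqs. (13) and (14) can be carried out on a grid, as in the sketch below, with a uniform (maximum entropy) prior on a known range. The Gaussian likelihood is only a placeholder chosen to make the example runnable; in practice the likelihood comes from the analyst's assumptions about the evidence.

```python
# Continuous Bayes update (Eqs. 13-14) by numerical integration on a uniform grid.
import numpy as np

def grid_posterior(prior_pdf, likelihood, x_grid):
    unnormalized = likelihood(x_grid) * prior_pdf(x_grid)
    dx = x_grid[1] - x_grid[0]
    k = np.sum(unnormalized) * dx                # Eq. (14), numerical normalization constant
    return unnormalized / k                      # Eq. (13)

x = np.linspace(0.0, 1.0, 1001)
prior = lambda x: np.ones_like(x)                           # uniform prior on [0, 1]
like = lambda x: np.exp(-0.5 * ((x - 0.4) / 0.1) ** 2)      # illustrative likelihood
posterior = grid_posterior(prior, like, x)                  # integrates to ~1 over [0, 1]
```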
Example 5: Consider the problem solved in examples 1-4 using the ET approach. Assume
that an analyst only knows that variable A is between 0.1 and 1, and variable B is between
0 and 1. Since the analyst has no other information regarding the priors, the analyst
assumes uniform prior probability distributions for A and B, shown in Figure 6.
Method for combining evidence from experts to construct models of uncertainty
When the evidence given by experts about a random variable X consists of
intervals (Figure 7), we need a method to interpret the evidence and bring it into a form
so that it can be used in the Bayesian framework.
To apply Bayes theorem to this problem we need to estimate the Prior PDF of X
and the Likelihood $L(E \mid X = x)$. The following assumptions are made to combine evidence
from experts:
a) The analyst converts the interval provided by each expert about a variable into
a point estimate. This can be the midpoint of the interval or another point
obtained based on the analyst's judgment.
b) The point estimate of the expert is equal to the true value of the variable plus an
error. The analyst assumes a joint probability distribution of the errors of the
experts.
Suppose that we have evidence in the form of intervals from n experts,
$[x_i^{min}, x_i^{max}]$ for i = 1,…,n (Figure 7). The analyst can assume that the ith expert gives a
point estimate $\hat{x}_i$, which is the midpoint of the ith interval:

$\hat{x}_i = \dfrac{x_i^{min} + x_i^{max}}{2}$ for i = 1,…,n     (15)
Let x be the true value of variable X. If the error in the ith expert’s estimate is Di,
then:
Point Estimates = Random Variables + Errors of Experts, or

$\hat{X} = X + D$     (16)
where $\hat{X}$ is the vector of the point estimates of the experts (size n), X is a vector
whose elements are all equal to the true value x of variable X, and D is the vector of the
errors.
We need the PDF of D to estimate the likelihood of the evidence. As an example
we can assume that random vector D is normal with mean b and covariance matrix C.

$C = \begin{bmatrix} \sigma_{D_1}^2 & \rho_{1,2}\,\sigma_{D_1}\sigma_{D_2} & \cdots \\ \vdots & \ddots & \vdots \\ \rho_{n,1}\,\sigma_{D_n}\sigma_{D_1} & \cdots & \sigma_{D_n}^2 \end{bmatrix}$     (17)
where $\sigma_{D_i}$ is the standard deviation in the estimate of the ith expert and $\rho_{ij}$ is
the correlation coefficient of the estimates of two experts. The ith element of vector b, $b_i$,
is the bias of the ith expert. The analyst should estimate the above quantities.
As an example, an analyst could assume that:
a) Bias bi is zero,
b) The endpoints of the interval provided by the ith expert are equal to the midpoint ±3
standard deviations of the error, respectively.
On the basis of the above assumptions, the analyst can calculate the standard
deviation of the error in each expert estimate:
$\sigma_{D_i} = \dfrac{x_i^{max} - x_i^{min}}{6}$     (18)
Example 6: In Example 5 we assumed the prior PDFs of A and B. The next step is to
determine the experts' errors given the evidence from the experts. We also assume that the
bias is zero for all the experts (that is, the errors of the experts have zero mean). Figure 8
displays the experts' errors for variables A and B.
Then the Likelihood of the evidence is:

$L(E \mid X = x) = f_D(\hat{X} - X \mid X = x) = \dfrac{1}{(2\pi)^{n/2} \, |C|^{1/2}} \, e^{-\frac{1}{2} (\hat{X} - X - b)^T C^{-1} (\hat{X} - X - b)}$     (19)
The Posterior PDF of variable X can then be calculated from Eq. (13).
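Putting the pieces together, the following sketch applies the assumptions stated above (midpoint point estimates, error standard deviations equal to one sixth of the interval widths, unbiased and independent experts, uniform prior) to a single variable. Because the errors are independent, the joint normal density of Eq. (19) factorizes into a product of one-dimensional normal densities. This is our own illustration of the procedure, not the authors' code.

```python
# Posterior of one variable from expert intervals (Eqs. 15, 18, 19 and 13-14), on a grid.
import numpy as np

def posterior_from_intervals(intervals, x_grid, prior_pdf):
    midpoints = np.array([(lo + hi) / 2 for lo, hi in intervals])   # Eq. (15)
    sigmas = np.array([(hi - lo) / 6 for lo, hi in intervals])      # Eq. (18)

    def likelihood(x):
        # Independent, unbiased errors: product of one-dimensional normal densities.
        d = midpoints[None, :] - x[:, None]                         # errors D = x_hat - x
        return np.prod(np.exp(-0.5 * (d / sigmas) ** 2) / (np.sqrt(2 * np.pi) * sigmas), axis=1)

    unnormalized = likelihood(x_grid) * prior_pdf(x_grid)
    dx = x_grid[1] - x_grid[0]
    return unnormalized / (np.sum(unnormalized) * dx)               # Eqs. (13)-(14)

x = np.linspace(0.1, 1.0, 901)
prior = lambda x: np.ones_like(x) / 0.9                             # uniform prior on [0.1, 1]
intervals_A = [(0.1, 0.4), (0.3, 0.6)]                              # Example 1, variable A
post_A = posterior_from_intervals(intervals_A, x, prior)
```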
Example 7: Consider the function $Y = (A+B)^A$. The Prior PDFs and the errors of the
experts for variables A and B were found in Examples 5 and 6. The Likelihood is
calculated for this example using Eq. (19) and the Posterior PDF using Eq. (13). The
Likelihood PDF and the Posterior PDF of A and B are presented in Figure 9. Using the
convolution integral method we can readily compute the posterior PDF of the dependent
variable Y.
Uncertainty Measures
In standard and Bayesian probability theory, Shannon’s entropy measures the uncertainty
due to conflict. Since evidence is treated as if it were precise, Nonspecificity is zero.
Shannon’s entropy for finite sets is not directly applicable for continuous distributions as
a measure of uncertainty. When a probability density function of a continuous variable is
defined in a real interval, then Shannon’s entropy is defined in relative terms using a
reference probability density function. In this case, entropy can be positive (entropy in
the probability density function is greater than that in the reference probability density
function) or it can be negative. It can only be employed in a modified form:

$H(X) = -\int_{-\infty}^{\infty} p(x) \cdot \log \dfrac{p(x)}{g(x)}\, dx$     (20)
where X is the random variable with PDF p(x) and g(x) is the reference density function
of X. In this paper, the reference densities for the input variables are their prior PDFs. For
the dependent variable Y, the reference density is the PDF of Y corresponding to the prior PDFs
of the input variables.
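A sketch of the relative entropy of Eq. (20), computed numerically against a reference (prior) density, is given below; the densities shown are placeholders for the PDFs discussed above.

```python
# Relative entropy of Eq. (20): H = -integral p(x) log(p(x)/g(x)) dx.
import numpy as np

def relative_entropy(p, g, x_grid, eps=1e-12):
    """Negative of the Kullback-Leibler divergence of p from the reference density g."""
    dx = x_grid[1] - x_grid[0]
    integrand = p * np.log((p + eps) / (g + eps))
    return -np.sum(integrand) * dx

x = np.linspace(0.0, 1.0, 1001)
g = np.ones_like(x)                              # uniform reference density on [0, 1]
p = np.exp(-0.5 * ((x - 0.5) / 0.1) ** 2)        # illustrative posterior shape
p = p / (np.sum(p) * (x[1] - x[0]))              # normalize to a proper density
print(relative_entropy(p, g, x))                 # negative: p is more concentrated than g
```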
4. Demonstration and Comparison of ET and BT Approaches
The Epistemic Uncertainty Group [15] proposed solving the following
challenge problem using methods for modeling uncertainty, to understand how the
methods work when evidence is imprecise. Consider the function $Y = (A+B)^A$, where A and B
are independent variables, which means that knowledge about one variable does not alter
are independent variables, which means that knowledge about one variable does not alter
our belief about the other. There is no uncertainty in the functional relation between A, B
and Y. There is only uncertainty in the values of A and B. Experts provide information
about A and B in the form of intervals. The objective is to quantify the uncertainty in Y.
In this paper, the above problem is solved in seven cases using both ET and BT
approaches. The objective is to calculate and compare the models that these approaches
construct to characterize the uncertainty in variables A, B and Y.
Table 1 presents the
evidence from the experts. In case 1, we have only one expert providing evidence for
each variable, so there is only imprecision. In cases 2 and 3, two experts provided
evidence for each of the variables A and B and there is both imprecision and conflict.
Conflict is considerably lower in case 3 than in case 2, because the intervals are
overlapping in the former case and disjoint in the latter. The experts in case 4 are precise
(their intervals are very narrow) but they contradict each other.
In case 5, we have
highly imprecise experts (the intervals for A and B are wide) and there is no conflict (the
intervals are nested, which means that all experts could be right). This is the opposite
situation than in case 4 where conflict dominates over imprecision. In cases 6 and 7, we
have evidence from 3 and 4 experts for variables A and B, respectively. In case 6, the
evidence is nested, so there is no conflict. In case 7, the intervals of the experts are
disjoint and narrower than in case 6.
Therefore there is higher conflict and lower
imprecision than in case 6.
The analyst in the Bayesian approach makes the following assumptions for all the
cases:
1. The Priors of A and B are uniform from 0.1 to 1 and from 0 to 1, respectively.
2. The error of each expert is normal with standard deviation equal to 1/6th of the
width of the interval provided by the expert in all cases but 4a and 7a. If the
experts are unbiased the mean of the error is zero. If the experts are independent,
then the correlation coefficients of the errors are zero.
Consequently, the covariance matrix in Eq. (17) is diagonal.
In the Bayesian approach the analyst assumes the probability distributions of the
errors of the experts. In cases 1-7, the experts are assumed independent and unbiased. In
case 3a, the analyst still assumes zero bias but the errors of the two experts are positively
correlated ( ρ = 0.8 ) for variable A, and negatively correlated ( ρ = −0.8 ) for variable B.
In case 3b, the analyst assumes a bias of 0.1 for variable A and a bias of 0.05 for variable
B, but the experts are independent.
Cases 4 and 7 are challenging for the analyst who uses the Bayesian approach
because the experts contradict each other. This means that, based on the experts' intervals,
only one expert can be correct. The analyst could still assume that the standard deviations
of the experts' errors are equal to 1/6th of the widths of their intervals, but this is a poor
assumption, because if both experts were as accurate as this assumption implies then they
would not contradict each other. To overcome this difficulty, the analyst increases the
standard deviation of the error based on the degree of the conflict in the experts' evidence.
The standard deviation of the error obtained from Eq. (18) is increased by 0.05 and 0.2 for
both variables A and B in cases 4a and 7a, respectively.
Figures 10-16 present the results of the two methods in cases 1-7, respectively.
The ET method yields maximum and minimum cumulative probability distributions of
the input and the output variables. These curves envelop the true cumulative probability
distributions of the variables. They are also labeled Plausibility and Belief, respectively.
The Bayesian approach characterizes uncertainty using a single cumulative probability
distribution function. Because of the assumptions about the probability distributions of
the experts' errors, the maximum and minimum cumulative probability distributions do
not always envelop the Bayesian cumulative distribution.
One can assess the magnitude of each type of uncertainty by studying the
maximum and minimum cumulative probability distributions obtained from the ET
approach. A large horizontal distance between the maximum and minimum cumulative
distributions of a variable indicates high imprecision. For example, in case 5, there is
high imprecision in all variables. Uncertainty due to conflict in a variable can be assessed
by the width of the interval in which its cumulative probability distribution suggests that
it can vary. The flatter the cumulative distribution, the wider is the interval of variation
and the higher is the uncertainty due to conflict. For example, conflict is larger in cases 4
and 7 than in the other cases, because the cumulative Plausibility and Belief distributions
of the variables are flatter (have lower slope) in these cases than in the other cases. From
the Bayesian approach results in Figure 12, we observe that the uncertainty increases
when there is positive correlation in the experts' errors (variable A) and decreases when
there is negative correlation in the experts' errors (variable B). A shift in the cumulative
distribution function (Figure 12) accounts for the bias assumed in estimating the experts’
errors for variables A, B and Y in case 3b.
As mentioned earlier, conflict is largest in cases 4 and 7. If the Bayesian analyst
assumes that the standard deviations of the errors of the experts' estimates are 1/6th of the
widths of their intervals, then the analyst will seriously underestimate the uncertainty in
both the input variables and the response variable. For example, in Figures 13 and 16, the
uncertainty in all variables computed by the BT approach in cases 4 and 7 is small
compared to that predicted by the ET approach. But if the Bayesian analyst increases the
standard deviation of the errors of the experts then the analyst will assess the uncertainty
more accurately (cases 4a and 7a) and his/her conclusions will be consistent with those
from ET.
Both Shannon's entropy, used in BT, and Total Uncertainty, used in ET, indicate
that the uncertainty is largest in cases 4a and 7a (Figure 17). It is also observed from
Figure 17 that the conclusions of the BT approach are sensitive to the assumptions about
the experts’ errors. Assuming that the standard deviations of the errors of the expert’s
estimates are 1/6th of the widths of their intervals leads to the conclusion that entropy is
small (cases 4 and 7). But this assumption amounts to saying that both experts are
precise, which is wrong because the experts contradict each other.
Using proper
assumptions about the expert's errors a Bayesian analyst obtains consistent conclusions
with ET.
Table 2 presents the Nonspecificity, Conflict and Shannon’s entropy in cases 1 to
7. Even though both Strife and Shannon’s entropy measure conflict, they should not be
compared because Shannon’s entropy measures the conflict relative to a reference
probability density (a uniform probability density in this example). The negative numbers
for the entropy indicate that the uncertainty in variables A and B was reduced when the
prior distributions of these variables were updated based on the experts' evidence. In
In
case 1, Strife is zero showing that there is no uncertainty due to conflict. In case 2, Strife
is higher than Nonspecificity, which indicates that uncertainty due to conflict in the
evidence of the experts dominates.
In case 3, imprecision and conflict types of
uncertainty are comparable. In case 4, conflict dominates over imprecision because the
evidence is precise but conflicting. In case 5, imprecision dominates over conflict,
because the evidence consists of nested intervals. In Case 6, Nonspecificity and Strife are
comparable. In case 7, where the intervals are disjoint, Strife is larger than
Nonspecificity. These conclusions are consistent with those from Figure 18, which shows
that Nonspecificity dominates in cases 1, 3-3b, 5 and 6, whereas Strife dominates in cases 4
and 7. In the BT approach, one can only assess the total uncertainty in the input and
output variables, because both types of uncertainty are aggregated into one.
When imprecision is large, a decision-maker cannot estimate the probability of an
event accurately. For example, in case 5, the cumulative probability of Y can assume any
value between 0 and 1, for values of Y between 1 and 1.7. This indicates that one should
consider collecting more data before making a decision. However, if the decision-maker
has to make a decision now, then the results of ET do not tell the decision-maker what to
do. For example if two alternative designs have minimum and maximum probabilities of
failure 0.05 and 0.1, and 0.01 and 0.15, respectively, the ET approach does not tell the
decision-maker which design is safer.
Table 3 summarizes the differences between the two approaches. The BT approach does not
distinguish between the imprecision and conflict types of uncertainty. It provides single
estimates of the probabilities of events, which help a decision-maker rank alternative
designs in terms of their reliability. On the other hand, a decision-maker cannot tell if it is
worth buying more information by only examining the cumulative probability
distribution of a variable. One can assess the value of additional information by studying
the sensitivity of the results of the BT approach to the underlying assumptions about the
probability distributions of the random variables.
5. Experimental Comparison of Methods
Different methods have been proposed on how to model and propagate the
uncertainty, and provide information on the uncertainty in the function y in the challenge
problems. We assume that, in many cases, uncertainty is propagated for the purpose of
making decisions. Therefore, we propose a simulation method for testing methods in
terms of their effectiveness for making simple decisions.
The proposed simulations imitate the following physical experiment designed to
see which of two people (say John and Linda) can estimate better the relative weight of
two pieces of cake. We give John a cake, asking him to cut it knowing that Linda will
pick the heavier slice, and that his objective is to end up with the heaviest slice himself.
Under these conditions, John will try to cut the cake as evenly as possible. We then repeat
the experiment by giving Linda an identical cake to slice. Finally, we weigh the two
pieces that John ended up with and the two pieces that Linda has. If Linda has a
substantially heavier total, it would indicate that she estimates the relative weight of two
pieces of cake more accurately than John. This problem belongs to a wide class of real-life
problems in which two players are to divide a certain amount of resources or goods into
two parts. For example, a sales manager wants to divide a town between two salesmen
equitably. One way is for the manager to ask one salesman to divide the town into two
regions by drawing a straight line on the town map and then ask the other salesman to
select a region. Then the salesman who divided the town receives the remaining region.
The analogy is immediate when one wants to estimate the median of the
probability density function of the function y(x) (median: 0.5 probability to fall below it and
0.5 probability to exceed it). We have two methods for constructing models of the
uncertainty in variable X and propagating it to quantify the uncertainty in function y(x).
Each method has an advocate, called John and Linda. We ask John to divide the interval
[yl, yu] into two by picking one point inside the interval, so that Linda can select one
subinterval. We repeat the procedure by asking Linda to slice the interval and allow John
to pick a subinterval. Finally, we conduct an evaluation of who selected better. How we
do that is a crucial part of the proposed procedure, but we will discuss it later.
First, let us give a simple example. Assume that X is a scalar and that we know
only that it resides in the interval [0,1]. We will assume that the person who divides the
interval will do so assuming that the other person uses the same model for characterizing
uncertainty. We also assume that each player wants to get the portion with the highest
probability. We will assume that if the function is y = x, so that [yl, yu]=[0,1], both John
and Linda will slice [yl, yu] in the middle (at y=0.5). Now consider the function y = x2,
which also has [yl, yu]=[0,1]. This time, John may say that since he does not know
anything about the distribution of X, he does not know anything about the distribution of
Y so he will still divide [yl, yu] in the middle. Linda, on the other hand, may assume a
uniform distribution for X. Using this assumption she will select the interval [0, 0.5] and
leave John the interval [0.5,1]. Then, in her turn, she will divide [yl, yu] at y = 0.25 since
0.25 is the median of the probability distribution of Y. Using the same logic, John will
then pick the interval [0.25,1]. That is, in both cases John will end up with the right
interval, and Linda with the left one.
We now come to the issue of how to decide who made better decisions, John or
Linda. We assume that knowing only the interval where a variable lies is equivalent to
that variable being able to take any probability distribution supported on that interval
with each distribution having the same likelihood. This suggests the following possible
Monte Carlo simulation: Pick a distribution at random, and evaluate the outcomes for
John and Linda based on that distribution. Repeat the process many times and see
whether John or Linda emerge as winners when a large number of simulations has been
performed.
To make this process more manageable we assume that probability distributions
in common usage, such as the normal or the Weibull distributions, are popular because
they describe well uncertainties we encounter often in practical applications. This allows
us to limit the simulation to a set of the five or ten most popular distributions. Note that
even if we limited ourselves to a single distribution, we still can vary the parameters of
the distribution such as the mean and standard deviation in case of the normal
distribution.
To cater for the possibility that we left out some odd but important distributions,
we can add in some experimental distributions from various sources. For example, we
may ask a class of students to each pick a number from the range of X or a scaled version
of X.
We still need to deal with the question of how to simulate situations where the
information on X is more complex. For example, we may receive information from two
people, one saying that X is in the range [0, 1] and the other saying that it is in the range
[0, 0.5]. The best way of simulating this situation deserves some consideration and
debate. One possibility is to take half of the distributions from one interval and the other
half from the other interval (assuming that both sources of information are equally
credible).
In the following, we present methods for solving the interval splitting problem and
demonstrate them through examples.
Definition of Interval Splitting and Subinterval Selection Problem
Consider the following game played by John and Linda. The players consider a
variable X and a known function of this variable, Y(X). They are told that X is in the
interval I=[xl, xu]. Then Y is in the interval IY=[yl, yu], where yl and yu are the minimum
and maximum values of the function Y(X) when X varies in the interval I. Additional
evidence about X could be available. John divides IY into two subintervals by selecting a
point, y0. Then Linda selects one of the subintervals, IYC, and John gets the remaining
subinterval, IYB. Linda wins if she selects the interval with the higher probability. Find
y0 so that John does not lose. The game is repeated with John and Linda switching places
to create a symmetric game.
Solution
Even though the two players may have different models of uncertainty, we
assume that neither player knows the other's model. Therefore, we assume that each
solves the problem under the assumption that the other player uses the same model as
they do.
The interval splitting problem is formulated as follows:
Find y0,
to maximize the objective function $P(I_{YB}) - P(I_{YC})$, where $P(I_{YB})$ and $P(I_{YC})$ are the true
probabilities of John's and Linda's subintervals.
Under the assumption that Linda has the same uncertainty model as John, Linda
can always select the interval with highest probability. The best bet for John is to select
the median of Y as the dividing point, in which case he will break even (see the appendix
for a mathematical proof of this assertion).
John should select the subinterval with the highest probability; that is, he will select
the left subinterval if y0 is to the right of the median of Y (i.e., $F_Y(y_0) \ge 0.5$), and he will
select the right subinterval otherwise.
Examples
In the following, we present two solutions to both the interval splitting and
subinterval selection problems for function Y=X2, in two cases, in which John knows that
X is in an interval (Case A) and John has evidence about X in the form of three intervals
(Case B). We further assume that one player uses the principle of maximum entropy to
construct a probability distribution, while the other employs the minimum and
maximum cumulative probability distribution functions of Y obtained using ET for
dividing and selecting.
Case A:
The only evidence available is that X is between 0 and 1. Therefore, Y ranges
between 0 and 1, too.
Interval splitting problem: Find the splitting point, y0.
Maximum entropy solution: Using the maximum entropy principle, John assumes a
uniform probability density of X in [0, 1]. Then,
$F_Y(y_0) = 0.5 \Leftrightarrow P(Y \le y_0) = 0.5 \Leftrightarrow P(X^2 \le y_0) = 0.5 \Leftrightarrow P(X \le \sqrt{y_0}) = 0.5 \Leftrightarrow F_X(\sqrt{y_0}) = 0.5$

Therefore, the optimum value of Y is the square of the median of X, which is $x_0 = 0.5$, so
$y_0 = 0.5^2 = 0.25$.
ET solution: In this case, John does not assume a probability distribution for X or
Y. Instead, he will find the minimum and maximum cumulative probabilities of X that
are consistent with the available evidence and derive the minimum and maximum
cumulative probabilities of Y. Figure 19 shows the minimum and maximum cumulative
probability distributions of X, and Y, which are identical in this case. According to this
figure, John only knows that the cumulative probability can assume any value between 0
and 1 for any value of Y in the interval [0, 1]. Therefore, it does not matter what value
of Y he selects as long as it is in the interval [0, 1]. In cases where one only knows that
the optimum solution to a problem is in a certain interval it is reasonable to select the
midpoint of that interval for the solution because this will best protect against errors in
calculating the interval. On the basis of this argument, John will select y0 = 0.5.
Subinterval selection problem
Probabilistic solution: If y0 is less than 0.25, John will select the right interval, otherwise
he will select the left.
ET solution: If y0 is less than 0.5, John will select the right interval, otherwise he will
select the left one.
Based on these calculations we will get the following scenario: The maximum
entropy player will divide the interval at 0.25, and the ET player will select the right
subinterval. In the second game, the ET player will divide the interval at 0.5, and the
maximum entropy player will select the left subinterval.
Case B:
Three experts tell John that X ranges in the following intervals: [0, 1], [0.3, 0.7],
[0.4, 0.6].
Interval splitting problem
Maximum entropy solution: If we considered only the opinion of one expert we would
assume a uniform probability distribution in the expert's interval. Assuming that each
distribution is correct one third of the time we obtain the probability density function of X
shown in Figure 20. Based on this function, the median of X is x0 = 0.5. Therefore, the
optimum value of Y is again y0 = 0.25.
ET solution: The body of evidence for variable X introduced by the experts is shown in
Figure 21, where m ([a, b]) is the basic probability assignment of interval [a, b].
Figure 22 presents the basic probability assignment of Y . Using this body of evidence
we can compute the minimum and maximum cumulative probabilities of variable Y. For
example, the minimum cumulative probability distribution at point y is equal to the belief
of interval [-∞, y], which is:
$F_Y^{min}(y) = Bel([-\infty, y]) = \sum_{A \subseteq [-\infty, y]} m(A)$     (21)
The minimum and maximum cumulative distributions of Y in Figure 23 show
that the median of Y cannot assume any value less than 0.09 or greater than 0.49.
Therefore, the optimum value of Y can be anywhere in the range [0.09, 0.49]. A
reasonable choice for y0 is the midpoint; that is, the splitting point is y0 = 0.29.
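A small sketch (our own check, not part of the paper) recovers this range: mapping the three expert intervals through Y = X² and scanning for values of y at which the Plausibility of [-∞, y] is at least 0.5 while the Belief is at most 0.5 gives, approximately, the interval [0.09, 0.49] for the median.

```python
# Bounds on the median of Y = X**2 from the Belief/Plausibility of [-inf, y] (Case B).
def median_bounds(body_Y, grid):
    feasible = []
    for y in grid:
        pl = sum(m for (lo, hi), m in body_Y if lo <= y)    # Pl([-inf, y])
        bel = sum(m for (lo, hi), m in body_Y if hi <= y)   # Bel([-inf, y]), Eq. (21)
        if pl >= 0.5 and bel <= 0.5:
            feasible.append(y)
    return min(feasible), max(feasible)

x_intervals = [(0.0, 1.0), (0.3, 0.7), (0.4, 0.6)]
body_Y = [((lo ** 2, hi ** 2), 1 / 3) for lo, hi in x_intervals]  # X**2 is increasing on [0, 1]
grid = [i / 1000 for i in range(1001)]
print(median_bounds(body_Y, grid))   # close to (0.09, 0.49), as quoted above
```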
Subinterval selection problem
Probabilistic solution: If y0 is less than 0.25, John will select the right interval, otherwise
he will select the left.
ET solution: If y0 is less than 0.29, John will select the right interval, otherwise he will
select the left.
The results in this case are similar to Case A. In the first game the maximum
entropy player will select 0.25 as the splitting point, and the ET player will select the
right interval. In the second game, the splitting point will be 0.29, and the maximum
entropy player will select the left interval.
Observations
Consider the problem $Y = X^n$, where n is large (e.g. 100). The person who used the
maximum entropy principle would select an extremely small y0 ($y_0 = 0.5^{100}$). If the only
information available to John was that X is in the interval [0, 1] and John used an ET
approach, then he would not know what value of Y to select in the interval [0, 1]. He
could select the midpoint y0 = 0.5. In this case he would lose practically all the time by a
wide margin for most probability distributions.
Figure 24 shows John's objective (payoff) function, when John splits the interval, in a case
where, although both players assumed that $Y = X^{100}$, the true value of the exponent of the
function was 100 times a bias factor ranging from 0.5 to 1.5. Two cases are considered: one in
which John uses probability and Linda uses minimum and maximum probabilities, and vice
versa. Since John is splitting the interval, he has a disadvantage compared to Linda. When
John uses probability his payoff function is close to 0, which means he will break even. But when
John uses maximum and minimum probability he will almost always lose and his payoff
function is practically -1, which is the lowest value this function can assume. The reason
is that he will split the interval in half and Linda will select the left interval, which almost
always has the higher probability.
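The kind of Monte Carlo comparison described in this section can be sketched as follows. The family of true distributions for X (X = U^c with U uniform on [0, 1] and a random exponent c) and the parameter range are our own illustrative assumptions, not the authors'; the two splitting points compared (the maximum entropy median 0.5^100 and the interval midpoint 0.5) follow the discussion above.

```python
# Average payoff of John for a fixed splitting point when the true distribution of X is random.
import random

def john_payoff(split_y, cdf_y):
    """Linda picks the subinterval with the higher true probability; John gets the other."""
    left = cdf_y(split_y)
    right = 1.0 - left
    return min(left, right) - max(left, right)

def average_payoff(split_y, n_exp, trials=10000):
    total = 0.0
    for _ in range(trials):
        c = random.uniform(0.2, 5.0)                        # random member of the family X = U**c
        cdf_y = lambda y: y ** (1.0 / (c * n_exp))          # F_Y(y) = F_X(y**(1/n)) = y**(1/(c*n))
        total += john_payoff(split_y, cdf_y)
    return total / trials

n_exp = 100
print(average_payoff(0.5 ** n_exp, n_exp))   # maximum entropy splitting point (median for uniform X)
print(average_payoff(0.5, n_exp))            # midpoint splitting point; close to -1, as argued above
```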
Now consider the same problem, but the evidence consists of three intervals: [0, 1],
[0.3, 0.7] and [0.4, 0.6]. In this case, if John used the probabilistic method described
above, John would still get the correct value of y0. On the other hand, if John used the
approach based on minimum and maximum probability he would conclude that the
splitting point should be in the interval $[0.3^{100}, 0.7^{100}]$. This solution is much better than the
optimum solution of the same method when only the interval [0, 1] is available. It appears
that the performance of the approach based on minimum and maximum probability can
be very poor when there is a severe information deficit. This is interesting because our
intuition tells us the opposite: the fewer assumptions a method makes, the less sensitive it is
to lack of information.
6. Conclusions
Two approaches, one based on ET and the other on BT, have been presented.
These approaches can be used for modeling uncertainty and assessing the safety of a
system when the available evidence consists of intervals bounding the values of the input
variables. Experts provide these intervals.
The Evidence theory approach does not require the user to assume anything
beyond what is already available. This approach treats uncertainty due to imprecision
differently than uncertainty due to randomness. The approach yields maximum and
minimum bounds of the probability of survival (and/or the probability of failure) of a
system, which can help assess the relative importance of the two types of uncertainty.
These results could help a decision-maker decide if it is worth collecting additional data
to reduce imprecision. On the other hand, if the gap between maximum and minimum
probabilities were large, the decision-maker would have difficulty ranking alternative
options. If the decision-maker has to make a decision now, then the ET approach does
not tell the decision-maker what option is better.
The Bayesian approach requires the analyst to make strong assumptions about the
credibility of the experts to estimate the likelihood of the available evidence. On the other
hand, this approach is more flexible than the Evidence theory approach because it
accounts for the credibility and correlation of the experts. This approach yields a single
estimate of the probability of failure of the system, which makes it easier for a
decision-maker to rank alternative options. On the other hand, it does not help the
decision-maker assess the importance of imprecision relative to random uncertainty.
It is recommended that a decision-maker compute both the Bayesian probability
of events and their minimum and maximum probabilities when there is considerable
imprecision. A large gap between the minimum and maximum probability suggests that
more information should be collected before making a decision. If this is not feasible,
then Bayesian probabilities can help make a decision.
A procedure for testing alternative methods for solving the challenge problems,
based on the outcomes of the decisions obtained from these methods, was presented. A
simple set of test problems, mimicking real-life decision problems in which a given
amount of resources is to be equally divided, was used to test the methods. These test
problems reveal useful lessons about the efficacy of alternative methods that are difficult
to learn by examining their theoretical foundations alone. For example, it was found that
a decision-maker who uses an ET approach performs worse, in the long run, than an
opponent who uses probability, even when the information about uncertainty is scarce.
Acknowledgements
The work presented in this report has been partially supported by the grant
"Analytical Certification and Multidisciplinary Integration" provided by The Dayton
Area Graduate Study Institute (DAGSI) through the Air Force Institute of Technology.
Appendix: Mathematical derivation of the solution
As mentioned in the main body of this report, John will select the median of the
probability distribution of Y as the dividing point to overcome Linda's advantage of
knowing the true probability distribution of Y. Here we will prove this assertion.
John will assume that Linda will select the subinterval with the higher true
probability; that is, the probability of Linda selecting the left subinterval, p(y0), is:
$$
p(y_0) \;=\;
\begin{cases}
1 & \text{if } F_Y(y_0) \ge 0.5\\
0 & \text{if } F_Y(y_0) < 0.5
\end{cases}
\tag{22}
$$
where FY(y0) is the value of the true cumulative probability distribution of Y at Y
= y0. The decision tree is shown in Figure 25.
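Combining Eq. (22) with the payoffs at the leaves of the decision tree (1 - 2F_Y(y0) when Linda takes the left subinterval, 2F_Y(y0) - 1 when she takes the right), John's objective function can be written explicitly; this is only a restatement of the argument in the notation above:

$$
J(y_0) = p(y_0)\bigl[1 - 2F_Y(y_0)\bigr] + \bigl[1 - p(y_0)\bigr]\bigl[2F_Y(y_0) - 1\bigr]
= -\bigl|2F_Y(y_0) - 1\bigr| \le 0,
$$

with equality only when $F_Y(y_0) = 0.5$, that is, when $y_0$ is the median of $Y$.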
Figure 26 shows the objective function as a function of the value of the
cumulative probability of Y at y0, FY(y0). It is observed that the optimum value of y0 is
the one for which FY(y0) is 0.5, that is, y0 is the median of Y. This is the only choice for
which John breaks even; he loses for all other values. Q.E.D.
References
[1] Oberkampf, W.L., DeLand, S.M., Rutherford, B.M., Diegert, K.V., Alvin, K.F., 2000,
“Estimation of Total Uncertainty in Modeling and Simulation”, Sandia Report
SAND2000-0824, Albuquerque, NM.
[2] Walley, P., 1991, Statistical Reasoning with Imprecise Probabilities, Chapman and
Hall, New York.
[3] Walley, P., 1998, “Coherent Upper and Lower Previsions.” Available at
http://ippserv.rug.ac.be. The Imprecise Probabilities Project.
[4] Kyburg, H. E., 1998, "Interval-valued Probabilities." Available at
http://ippserv.rug.ac.be. The Imprecise Probabilities Project.
[5] Dubois, D. and Prade, H., 1998, Possibility Theory, Plenum Press, New York.
[6] Joslyn, C. A., 1994, Possibilistic Processes for Complex Systems Modeling, Ph.D.
Dissertation, State University of New York at Binghamton.
[7] Giles, R., 1982, “Foundations for a Theory of Possibility,” Fuzzy Information and
Decision Processes, North-Holland Publishing Company.
[8] Langley, R. S., 2000, “Unified Approach to Probabilistic and Possibilistic Analysis of
Uncertain Systems,” Journal of Engineering Mechanics, ASCE, Vol. 126, No. 11, pp.
1163-1172.
[9] Shafer, G., 1976, A Mathematical Theory of Evidence, Princeton University Press,
Princeton.
[10] Sentz, K., Ferson, S., 2002, “Combination of Evidence in Dempster-Shafer Theory,”
Sandia Report SAND2002-0835, Albuquerque, NM.
[11] Smets, P., "Belief Functions and the Transferable Belief Model." Available at
http://ippserv.rug.ac.be. The Imprecise Probabilities Project.
[12] Berger, J. O., 1985, Decision Theory and Bayesian Analysis, Springer-Verlag, New
York, pp. 109-113.
[13] Winkler, R. L., 1972, Introduction to Bayesian Inference and Decision, Holt,
Rinehart and Winston, Inc.
[14] Ben-Haim, Y., 2001, Information-gap Decision Theory: Decisions Under Severe
Uncertainty, Academic Press.
[15] Oberkampf, W.L., Helton, J. C., Joslyn, C. A., Wojtkiewicz, S. F., and Ferson, S.,
“Challenge Problems: Uncertainty in Series Systems Response Given Uncertain
Parameters,” (this issue).
[16] Klir, G. J., Wierman, M. J., 1998, Uncertainty-Based Information, A Springer-Verlag
Company.
[17] Rocha, L. M., "Relative Uncertainty and Evidence Sets: A Constructivist
Framework," International Journal of General Systems, Vol. 26, No. 1-2, pp. 35-61.
[18] Melchers, R. E., 1987, Structural Reliability, Analysis and Prediction, Ellis Horwood
Limited, West Sussex, England.
[19] Wu, J. S., Apostolakis, G. E., and Okrent, D., 1990, "Uncertainties in System
Analysis: Probabilistic Versus Nonprobabilistic Theories," Reliability Engineering and
System Safety, 30, pp. 163-181.
Figure Captions
Figure 1: Plausibility of A and B. The belief is zero.
Figure 2: Variables A and B in joint space.
Figure 3: Joint maximum probability (Plausibility) of A and B.
Figure 4: Body of Evidence and Plausibility of variable Y. Belief is zero.
Figure 5: Body of Evidence and Cumulative Plausibility and Belief of Y.
Figure 6: Prior PDF’s of variables A and B.
Figure 7: Evidence from experts.
Figure 8: PDFs of errors of experts for input variables.
Figure 9: Likelihood and Posterior Probability Density Functions of Variables A and B
Figure 10: Cumulative Plausibility and Belief along with Cumulative Bayesian PDF for
Case 1.
Figure 11: Cumulative Plausibility and Belief along with Cumulative Bayesian PDF for
Case 2.
Figure 12: Cumulative Plausibility and Belief along with Cumulative Bayesian PDF for
Cases 3,3a and 3b.
Figure 13: Cumulative Plausibility and Belief along with Cumulative Bayesian PDF for
Cases 4 and 4a.
Figure 14: Cumulative Plausibility and Belief along with Cumulative Bayesian PDF for
case 5.
Figure 15: Cumulative Plausibility and Belief along with Cumulative Bayesian PDF for
Case 6.
Figure 16: Cumulative Plausibility and Belief along with Cumulative Bayesian PDF for
Cases 7 and 7a.
Figure 17: Uncertainty in ET vs. Uncertainty in BT
Figure 18: Nonspecificity (ET) vs. Strife (ET)
Figure 19: Maximum and Minimum Cumulative Probabilities of variables X and Y.
Figure 20: PDF of X.
Figure 21: Body of Evidence of variable X.
Figure 22: Body of Evidence of variable Y.
Figure 23: Maximum and Minimum Cumulative Probability of Y.
Figure 24: Payoff function graph, Probability vs. Min-Max probability.
Figure 25: Decision Tree
Figure 26: Objective function vs. FY (y0).
Case | Variable A | Variable B | Sizes and positions of intervals | Conflict | Imprecision
1 | [0.2, 0.5] | [0.3, 0.6] | Wide | None | High
2 | [0.1, 0.5], [0.55, 0.95] | [0, 0.5], [0.52, 1] | Wide, Disjoint | High | High
3, 3a, 3b | [0.1, 0.5], [0.2, 0.6] | [0, 0.5], [0.2, 0.7] | Wide, Overlapping | Low | High
4, 4a | [0.5, 0.52], [0.6, 0.62] | [0.6, 0.62], [0.7, 0.72] | Narrow, Conflicting | High | Low
5 | [0.1, 1.0] | [0.6, 0.8], [0.4, 0.85], [0.2, 0.9], [0.0, 1.0] | Wide, Overlapping | Low | High
6 | [0.5, 0.7], [0.3, 0.8], [0.1, 1.0] | [0.59, 0.61], [0.4, 0.85], [0.2, 0.9], [0.0, 1.0] | Nested | Low | High
7, 7a | [0.8, 1.0], [0.5, 0.7], [0.1, 0.4] | [0.8, 1.0], [0.5, 0.7], [0.1, 0.4], [0.0, 0.2] | Disjoint | High | High
Table 1: Evidence from experts.
Case | Non-specificity (ET) | Strife (ET) | Shannon's entropy (BT)
1 | A = 0.2624, B = 0.2624, Y = 0.1747 | A = 0, B = 0, Y = 0 | A = -1.484, B = -1.586, Y = -1.62
2 | A = 0.37, B = 0.40, Y = 0.417 | A = 1, B = 1, Y = 0.854 | A = -1.53, B = -1.43, Y = -2.08
3 | A = 0.3365, B = 0.4055, Y = 0.2846 | A = 0.193, B = 0.322, Y = 0.308 | A = -1.53, B = -2.043, Y = -2.46
3a | A = 0.3365, B = 0.4055, Y = 0.2846 | A = 0.193, B = 0.322, Y = 0.308 | A = -1.237, B = -2.2173, Y = -3.08
3b | A = 0.3365, B = 0.4055, Y = 0.2846 | A = 0.193, B = 0.322, Y = 0.308 | A = -1.58, B = -2.11, Y = -2.573
4 | A = 0.0198, B = 0.0198, Y = 0.0247 | A = 1, B = 1, Y = 1.924 | A = -3.59, B = -3.87, Y = -4.03
4a | A = 0.0198, B = 0.0198, Y = 0.0247 | A = 1, B = 1, Y = 1.924 | A = -1.84, B = -1.93, Y = -2.35
5 | A = 0.64, B = 0.444, Y = 0.714 | A = 0, B = 0.358, Y = 0.123 | A = -0.169, B = -2.18, Y = -1.76
6 | A = 0.41, B = 0.404, Y = 0.516 | A = 0.36, B = 0.465, Y = 0.375 | A = -1.81, B = -3.56, Y = -2.58
7 | A = 0.209, B = 0.203, Y = 0.249 | A = 1.585, B = 1.75, Y = 1.805 | A = -2.5, B = -2.24, Y = -2.84
7a | A = 0.209, B = 0.203, Y = 0.249 | A = 1.585, B = 1.75, Y = 1.805 | A = -0.46, B = -0.74, Y = -1.08
Table 2: Uncertainty Measures
ET Approach | BT Approach
Analyst does not need to make any additional assumptions beyond what is already available. | Analyst assumes the prior and the error in the estimates of the experts. These assumptions can affect the results significantly.
Treats uncertainty due to imprecision and conflict separately. Yields maximum and minimum bounds of probabilities of events from which the relative magnitude of these two types of uncertainty can be assessed. | Does not distinguish between imprecision and conflict. Gives a single value of the probability of an event.
Reliability and correlation between experts cannot be taken into account. | Expert's reliability and correlation can be taken into account.
When the intervals provided by the experts are very broad, then the gap between the maximum and minimum probabilities of failure can be very large. A decision-maker might be unable to rank alternative designs in terms of their reliability because of this gap. | Since a single value of the probability of failure is provided, it is easier for the decision-maker to rank designs in terms of their reliability.
Table 3: Comparison of ET and BT approaches
Figure 1: Plausibility of A and B. The belief is zero.
Figure 2: Variables A and B in Joint Space.
Figure 3: Joint maximum probability (Plausibility) of A and B.
Figure 4: Body of Evidence and Plausibility of Variable Y. Belief is zero.
Figure 5: Body of Evidence and Cumulative Plausibility and Belief of Y.
Figure 6: Prior PDF’s of variables A and B
Figure 7: Evidence from experts.
Figure 8: PDFs of errors of experts for input variables.
Figure 9: Likelihood and Posterior Probability Density Functions of Variables A and B.
Figure 10: Cumulative Plausibility and Belief along with
Cumulative Bayesian PDF for Case 1.
Figure 11: Cumulative Plausibility and Belief along with
Cumulative Bayesian PDF for Case 2.
Figure 12: Cumulative Plausibility and Belief along with
Cumulative Bayesian PDF for Cases 3, 3a and 3b.
Figure 13: Cumulative Plausibility and Belief along with
Cumulative Bayesian PDF for Cases 4 and 4a.
Figure 14: Cumulative Plausibility and Belief along with
Cumulative Bayesian PDF for case 5.
Figure 15: Cumulative Plausibility and Belief along with
Cumulative Bayesian PDF for Case 6.
Figure 16: Cumulative Plausibility and Belief along with
Cumulative Bayesian PDF for Cases 7 and 7a.
Figure 17: Uncertainty in ET versus Uncertainty in BT (Total Uncertainty, i.e., Nonspecificity + Strife, plotted against Shannon's Entropy for each case).
Figure 18: Nonspecificity (ET) versus Strife (ET).
Figure 19: Maximum and Minimum Cumulative Probabilities of variables X and Y.
Figure 20: PDF of X.
Figure 21: Body of Evidence of variable X.
Figure 22: Body of Evidence of variable Y (focal elements [0, 1], [0.09, 0.49], and [0.16, 0.36], each with m = 1/3).
Figure 23: Maximum and Minimum Cumulative Probability of Y.
Figure 24: Payoff function graph, Probability vs. Min-Max Probability (John's payoff function versus the exponent bias factor, for the cases in which John uses probability and in which he uses min-max probability).
Figure 25: Decision Tree (John selects the splitting point y0; Linda selects the left subinterval with probability p(y0), in which case John's objective function is 1 - 2FY(y0), or the right subinterval, with probability 1 - p(y0), in which case it is 2FY(y0) - 1).
Figure 26: Objective function vs. FY(y0).