Comparison of Evidence Theory and Bayesian Theory for Uncertainty Modeling

Prabhu Soundappan, Efstratios Nikolaidis(1)
Mechanical, Industrial and Manufacturing Department, The University of Toledo, Toledo, OH 43606, USA
Email (Nikolaidis): [email protected]

R. T. Haftka
Department of Aerospace Engineering, Mechanics and Engineering Science, The University of Florida, Gainesville, FL 32611-6250, USA

Ramana Grandhi
Department of Mechanical and Materials Engineering, Wright State University, Dayton, OH 45435, USA

Robert Canfield
Air Force Institute of Technology, WPAFB, OH 45433, USA

(1) Corresponding author

Abstract

This paper compares Evidence Theory (ET) and Bayesian Theory (BT) for uncertainty modeling and decision under uncertainty, when the evidence about uncertainty is imprecise. The basic concepts of ET and BT are introduced, and the ways these theories model uncertainties, propagate them through systems and assess the safety of these systems are presented. The ET and BT approaches are demonstrated and compared on challenge problems involving an algebraic function whose input variables are uncertain. The evidence about the input variables consists of intervals provided by experts. It is recommended that a decision-maker compute both the Bayesian probabilities of the outcomes of alternative actions and their plausibility and belief measures when evidence about uncertainty is imprecise, because this helps assess the importance of imprecision and the value of additional information. Finally, the paper presents and demonstrates a method for testing approaches for decision under uncertainty in terms of their effectiveness in making decisions.

Introduction

The information in many problems of design under uncertainty, especially those involving reducible (epistemic) uncertainty, is imprecise. Reducible uncertainty is uncertainty due to lack of knowledge, as opposed to random (aleatory) uncertainty, which is due to inherent variability in a physical phenomenon. It is called reducible because it can be reduced or eliminated if one collects information. The uncertainty in the probability of getting heads in one flip of a bent coin is reducible, because it is due to lack of knowledge; it can be reduced if we conduct experiments. On the other hand, there is aleatory uncertainty when flipping a coin even if we know the probabilities of the events heads and tails. This type of uncertainty cannot be reduced even if we conduct a very large number of experiments. For example, we know that the probability of heads and of tails for a fair coin is 0.5, but each time the coin is flipped we are uncertain about the outcome. This uncertainty is therefore also called irreducible uncertainty. Oberkampf et al. [1] explained these two types of uncertainty and presented examples in which aleatory and epistemic uncertainty are encountered in engineering problems.

There is no consensus about what the best theory is for modeling epistemic uncertainty. Oberkampf et al. [1] studied the differences and similarities of epistemic and random uncertainty. In their study, they used a hybrid approach in which random uncertainty was modeled using probability and epistemic uncertainty was modeled using intervals bounding the variables in which there was epistemic uncertainty. Theories for modeling epistemic uncertainty include Coherent Upper and Lower Previsions [2-4], Possibility Theory [5-8], Evidence Theory [9-10], the Transferable Belief Model [11] and Bayesian Theory [12-13].
Information-gap Decision Theory is another alternative for decision making under uncertainty when information about uncertainty is scarce [14].

Information about epistemic uncertainty is usually in the form of intervals. For example, if we show a glass jar containing beans to a person and ask her how many beans are in the jar, she is more likely to give a range than a precise number. Similarly, if we ask an expert what he thinks the prime interest rate will be in 2007, he will probably provide a range rather than a single number. The same is true if we ask an expert about the probability of getting "heads" in a flip of a bent coin.

The objective of the study presented in this paper is to compare two approaches, one using Evidence Theory (ET), the other using Bayesian Theory (BT), for characterizing uncertainty in situations, such as the ones presented in the previous paragraph, where the information about uncertainty is imprecise. Specifically, the following problem is considered. The performance of a system is characterized by variable Y, which is a function of uncertain variables X_1, ..., X_m. We know the relation between the performance variable Y and the variables X_1, ..., X_m,

y = f(x_1, ..., x_m)

and we have information about the values of the variables X_1, ..., X_m, which is in the form of intervals obtained from n experts. These intervals have the form [x^l_{i,j}, x^u_{i,j}], where subscript i specifies the variable and j specifies the expert. (In this report, capital letters denote variables and lower case letters denote the values that these variables assume.) Suppose that a system survives if the performance variable Y falls in a certain interval or collection of intervals denoted by S. There is no uncertainty in the functional relation between the performance variable and the input variables, or in the definition of survival. We want to model the uncertainty in the independent variables, derive a model of the uncertainty in the performance variable Y, and assess safety.

Section 1 presents the assumptions of the ET-based approach and then develops an approach for constructing models of uncertainty on the basis of these assumptions. A method for propagating uncertainty through a system, to estimate the uncertainty in the response from the uncertainty in the input, is shown. Finally, equations for the computation of the Belief and Plausibility of failure of the system are presented. A simple example of an algebraic function is used to demonstrate each step of the approach. Section 2 presents a Bayesian approach for constructing models of uncertainty. First, Bayes' rule for updating the prior mass function of a discrete variable or the probability density function of a continuous variable is reviewed. Then an approach for constructing a model of uncertainty using Bayes' rule from expert evidence about the random variables, which is in the form of intervals, is presented. An example demonstrating each step of this approach is also included. In Section 3, the ET and BT approaches are demonstrated and compared on a series of challenge problems involving epistemic uncertainty proposed by the Epistemic Uncertainty Group [15].

As mentioned earlier, decision-makers have an arsenal of different theories, and methods based on these theories, for making decisions under uncertainty. There is no consensus as to what method is most suitable for problems with epistemic uncertainty, when information is scarce and imprecise.
Comparisons of alternative approaches on the basis of their effectiveness in making decisions under uncertainty could help us understand these methods better and assess their effectiveness in modeling epistemic uncertainty. Section 4 proposes an approach for comparing methods for the solution of the challenge problems. The approach uses alternative methods for characterizing uncertainty to make decisions, the outcomes of which are later evaluated through numerical simulations or physical experiments.

1. Evidence Theory Approach

Assumptions

The following are the key assumptions of the ET approach:

1. If some of the evidence is imprecise, we can quantify uncertainty about an event by the maximum and minimum probabilities of that event. The maximum (minimum) probability of an event is the maximum (minimum) of all probabilities that are consistent with the available evidence.

2. The process of asking an expert about an uncertain variable is a random experiment whose outcome can be precise or imprecise. There is randomness because every time we ask a different expert about the variable we get a different answer. The expert can be precise and give a single value, or imprecise and provide an interval.

Therefore, if the information about uncertainty consists of intervals from multiple experts, then we have uncertainty due to both imprecision and randomness. If all experts are precise, they give us pieces of evidence pointing precisely to specific values; in this case, we can build a probability distribution of the variable. But if the experts provide intervals, we cannot build such a probability distribution, because we do not know what specific values of the random variables each piece of evidence supports. In this case, we can use second-order probability, or we can calculate the maximum and minimum values of the probabilities of events. (Second-order probability treats the variables associated with epistemic uncertainty as random variables with their own probability distributions and computes a probability distribution of the probability of occurrence of an event. For example, in an experiment that involves flipping a bent coin, the probability of the event "heads" is treated as a random variable.) The latter approach does not require any additional information beyond what is already available.

To demonstrate this philosophy of calculating maximum and minimum bounds for the probability of an event when the evidence is imprecise, consider the following problem. We roll a weighted die n times and videotape the results. The statistical error is negligible because n is large. Later we discover that we cannot determine precisely the outcomes of the experiments from the tape. We can only tell that 40% of the experiments resulted in a number less than or equal to three and the other 60% in a number greater than three. In this case, we cannot estimate the probability of each of the numbers 1-6 unless we make arbitrary assumptions about the likelihood of getting numbers between 1 and 3 and between 4 and 6. But we could estimate that there is a 0.4 probability of getting a number between 1 and 3 and a 0.6 probability of getting a number between 4 and 6. Instead of making additional assumptions, such as that all numbers between 1 and 3 are equally likely, we could conclude that every number from 1 to 3 can have a probability as high as 0.4 and as low as 0, and every number from 4 to 6 can have a probability as high as 0.6 and as low as 0.

Modeling uncertainty

First consider one variable X_1. The information consists of intervals [x^l_{1,j}, x^u_{1,j}] obtained from n experts, each thought to enclose the precise value x_{1,j}.
The intervals can be nested, in which case we have consonant evidence; they may overlap; or they may be disjoint, in which case we have conflicting evidence. When an expert provides an interval instead of a value, the expert is telling us that the true value of the variable could be anywhere in this interval. Therefore, the evidence from the expert could or could not support a particular value in that interval. The maximum probability of the variable being equal to x is the ratio of the number of pieces of imprecise evidence from the experts that could support x to the total number of intervals. For example, if experts 1 and 2 told us that the price of gas ten years from now could be between $1 and $5 and between $1 and $10, respectively, then on the basis of this evidence the probability of any value between $1 and $5 could be as high as 1, and the probability of any value greater than $5 and less than or equal to $10 could be as high as 0.5.

The maximum probability of X_1 = x, P^u(X_1 = x), can be found by solving the following optimization problem:

Find x_{1,1}, ..., x_{1,n}
to maximize P(X_1 = x) = \frac{1}{n} \sum_{i=1}^{n} I_i    (1)
where I_i is an indicator function, with I_i = 1 if x_{1,i} = x and I_i = 0 otherwise,
so that x_{1,i} \in [x^l_{1,i}, x^u_{1,i}].

This maximum probability will also be called Plausibility. The above formulation indicates that the maximum probability of X_1 = x is the ratio of the number of expert intervals containing x to the total number of intervals. The minimum probability of X_1 = x, P^l(X_1 = x), can be found by solving the following dual optimization problem:

Find x_{1,1}, ..., x_{1,n}
to minimize P(X_1 = x) = \frac{1}{n} \sum_{i=1}^{n} I'_i    (2)
where I'_i is an indicator function, with I'_i = 1 if x^l_{1,i} = x^u_{1,i} = x and I'_i = 0 otherwise,
so that x_{1,i} \in [x^l_{1,i}, x^u_{1,i}].

This minimum probability will also be called Belief. From the above formulation we conclude that the minimum probability of X_1 = x is the ratio of the number of intervals that coincide with the point {x} to the total number of intervals. This probability is zero unless there is precise evidence pointing at x. One can easily extend the formulations of the above two optimization problems to find the maximum and minimum probabilities of any event associated with variable X_1, such as the event that X_1 assumes a value in a given interval or set of intervals.

The Plausibility and Belief can also be found by solving equations (8) and (9), once the body of evidence of the input variables is resolved. The body of evidence is formed from the intervals given by the experts using the mixing or averaging technique. We found this to be the most intuitive technique when one does not have any knowledge about the experts. The evidence can also be combined using several other techniques, such as Dempster's rule of combination, the Discount+Combine method, Yager's modified Dempster's rule, Inagaki's unified combination rule, and several others. These combination rules are studied in detail in [10].

The following assertion relates maximum and minimum probabilities to Plausibility and Belief, which are used in evidence theory [6] to characterize one's belief about the occurrence of events. Evidence theory can be viewed as an extension of probability theory.
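For events that are themselves intervals, the maximum and minimum probabilities of Eqs. (1) and (2) reduce to counting which expert intervals intersect, or are contained in, the event. The short Python sketch below is not part of the original paper; it illustrates this counting for variable A of Example 1 below, with a hypothetical event interval [0, 0.35].

```python
# Minimal sketch: maximum and minimum probabilities (Plausibility and Belief)
# of the event that a variable falls in a target interval, given n expert intervals.
# Each expert interval is treated as a focal element with basic probability 1/n.

def plausibility(expert_intervals, event):
    """Fraction of expert intervals that intersect the event interval."""
    el, eu = event
    hits = sum(1 for (lo, hi) in expert_intervals if hi >= el and lo <= eu)
    return hits / len(expert_intervals)

def belief(expert_intervals, event):
    """Fraction of expert intervals that are contained in the event interval."""
    el, eu = event
    hits = sum(1 for (lo, hi) in expert_intervals if lo >= el and hi <= eu)
    return hits / len(expert_intervals)

if __name__ == "__main__":
    # Variable A of Example 1: two experts give [0.1, 0.4] and [0.3, 0.6].
    intervals_A = [(0.1, 0.4), (0.3, 0.6)]
    event = (0.0, 0.35)                      # hypothetical event: A in [0, 0.35]
    print(plausibility(intervals_A, event))  # 1.0 -> both intervals could support the event
    print(belief(intervals_A, event))        # 0.0 -> no interval is contained in the event
```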
Evidence theory is suitable for characterizing uncertainty when evidence is imprecise because it allows one to estimate probabilities of intervals instead of probabilities of specific values. These intervals are called focal elements, and their probabilities basic probabilities.

Assertion: Consider experts providing n intervals about a variable. These intervals are considered as focal elements and have basic probability 1/n. Then the Plausibility and Belief of any event associated with the variable are equal to the maximum and minimum probabilities of the event, respectively.

Justification: The maximum probability of an event represented by a set C is equal to the ratio of the number of focal elements (intervals provided by the experts) that intersect C to the total number of focal elements. Indeed, all of the evidence provided by the experts that consists of intervals intersecting C could support C, because, according to these experts, the true value of the variable could be in C. The rest of the evidence cannot support C, because the remaining intervals and C are disjoint. This means the maximum probability of C is equal to the sum of the basic probabilities of the focal elements that intersect C, which is the Plausibility of C. Similarly, the minimum probability of C is the number of focal elements contained in C divided by the total number of focal elements, which is the Belief of C.

The following examples are based on the challenge problems [15].

Example 1: Two experts said that variable A is in the following intervals: [0.1, 0.4] and [0.3, 0.6]. Two experts said that variable B is in the following intervals: [0.2, 0.5] and [0.4, 0.7]. Figure 1 shows the maximum probabilities (Plausibility) of A and B, respectively.

Consider m variables. If there is no information about the correlation of the variables and the experts are equally credible, we can transfer the evidence about each variable into the m-dimensional space of all the variables using the following equation:

m_{X_1 X_2 ... X_m}([x_1^l, x_1^u], [-\infty, +\infty], ..., [-\infty, +\infty]) = \frac{m_{X_1}([x_1^l, x_1^u])}{m}    (3)

In the above equation, m_{X_1}([x_1^l, x_1^u]) is the basic probability of the interval [x_1^l, x_1^u], and m_{X_1 X_2 ... X_m}([x_1^l, x_1^u], [-\infty, +\infty], ..., [-\infty, +\infty]) is the basic probability of the same interval in the m-dimensional space of the variables. This equation can be justified as follows: if an expert says that a variable is in a given interval, this is true for both the space of that variable and the m-dimensional space of the m variables. But if we have m bodies of evidence for m variables, then the evidence must be normalized by m when transferring it from the one-dimensional space of a variable to the m-dimensional space.

If we know that the variables are independent (that is, information about any group of variables does not change our belief about the others), then we can use the following approach to combine the evidence about the variables into a single joint body of evidence: a) the focal elements of the joint body of evidence are the elements of the Cartesian product of the elements of the evidence about the individual variables; b) the basic probability of each element in the m-dimensional space is the product of the individual basic probabilities:

m_{X_1 X_2 ... X_m}([x_1^l, x_1^u], ..., [x_m^l, x_m^u]) = m_{X_1}([x_1^l, x_1^u]) \cdot ... \cdot m_{X_m}([x_m^l, x_m^u])    (4)

The above equation is a special case of Dempster's rule of combination when the bodies of evidence from different experts are independent and equally credible.
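As a small illustration of item (b) and Eq. (4), the sketch below (not from the paper) forms the joint focal elements and their basic probabilities for the two-expert bodies of evidence of Example 1, where each input interval carries basic probability 1/2.

```python
# Sketch of Eq. (4): joint focal elements of independent variables are the Cartesian
# product of the individual focal elements, with product basic probabilities.
from itertools import product

def joint_body_of_evidence(*bodies):
    """Each body is a list of (interval, basic_probability) pairs for one variable.
    Returns a list of (box, basic_probability), where a box is a tuple of intervals."""
    joint = []
    for combo in product(*bodies):
        box = tuple(interval for interval, _ in combo)
        bpa = 1.0
        for _, m in combo:
            bpa *= m
        joint.append((box, bpa))
    return joint

if __name__ == "__main__":
    # Example 1/2: two equally credible experts per variable, so each interval has m = 1/2.
    body_A = [((0.1, 0.4), 0.5), ((0.3, 0.6), 0.5)]
    body_B = [((0.2, 0.5), 0.5), ((0.4, 0.7), 0.5)]
    for box, m in joint_body_of_evidence(body_A, body_B):
        print(box, m)   # four boxes, each with basic probability 1/4
```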
Since Dempster's rule has been justified in other publications, such as [9], we will not justify it here. From the joint body of evidence, we can estimate the maximum joint probability of the variables.

Example 2: This is a continuation of Example 1. If A and B are independent, the joint body of evidence is shown in Fig. 2. Specifically, this figure shows the focal elements of the joint body of evidence, which are the four rectangles (boxes) in Fig. 2. The four boxes in Fig. 2 are the result of the Cartesian product of the individual bodies of evidence of variables A and B. The maximum joint probability (Plausibility) of these variables is shown in Figure 3.

Propagating uncertainty through a system

Here we compute the maximum probability of variable Y, which is related to the input variables through the function y = f(x_1, ..., x_m), from the joint body of evidence about X_1, ..., X_m. First, we transform the joint body of evidence about the input variables into evidence about variable Y. For this purpose, we map the focal elements in the m-dimensional space of the independent variables into elements in the space of variable Y. We can do this by solving one maximization and one minimization problem to find the limits of variable Y when the input variables vary within each focal element in the joint probability space. Mathematically, we solve a pair of optimization problems to find the interval in the space of variable Y given a focal element in the space of the variables X_1, ..., X_m:

Find X_1, ..., X_m
to minimize (maximize) Y = f(X_1, ..., X_m)    (5)
so that X_i \in [X_i^l, X_i^u], where X_i^l and X_i^u are the lower and upper bounds of X_i corresponding to each focal element in the joint space.

Example 3: This is an extension of Example 2. After calculating the joint body of evidence of the variables, the limits of Y can be found using Eq. (5). In Fig. 2, we have four boxes in the joint space of the variables as the outcome of the Cartesian product of the individual bodies of evidence. The shaded box in Fig. 2 is the product of the focal elements A: [0.1, 0.4] and B: [0.2, 0.5]. The basic probability assigned to this shaded box (Fig. 2) is the product of the basic probabilities of the individual focal elements. The corresponding limits of Y, found by solving Eq. (5), are [0.897, 1.039], shown as the shaded ellipse in Figs. 4 and 5. The basic probability assigned to this focal element is 1/4. Using the same procedure, the rest of the focal elements and their basic probabilities are calculated to construct the body of evidence of Y.

The above optimization problems can be solved using nonlinear programming or Monte Carlo simulation. This yields a set of intervals for Y, C_i, and the basic probabilities of these intervals, m_Y(C_i). Then we compute the maximum probability of Y being equal to y through the following equation:

P^u(Y = y) = \sum_{y \in C_i} m_Y(C_i)    (6)

In the above equation, the C_i are the intervals that contain y. We can also find the maximum and minimum cumulative probability distribution functions (UCDF and LCDF, respectively) of Y. These functions provide the maximum and minimum values of the probability of Y being less than or equal to a value y. The UCDF is obtained using the following equation:

F_Y^u(y) = \sum m_Y(C_i)    (7)

The sum in the above equation includes all the elements C_i that intersect the interval [-\infty, y]. The LCDF is obtained from the same equation, but the sum includes all elements that are contained in the interval [-\infty, y].
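The following sketch (not from the paper) strings these steps together for Y = (A + B)^A and the bodies of evidence of Examples 1 and 2: each joint box is mapped to a Y-interval by a coarse grid search standing in for the optimization of Eq. (5), and the bounds of Eq. (7) are then read off the resulting Y-intervals.

```python
# Sketch of Eqs. (5)-(7): map each joint focal element (box) to an interval of
# Y = (A + B)**A by a coarse grid search (a stand-in for the optimization in Eq. (5)),
# then read off the maximum/minimum cumulative probabilities (UCDF/LCDF) of Y.
import numpy as np

def propagate_box(f, box, n=51):
    """Return (ymin, ymax) of f over a rectangular box by grid search."""
    grids = [np.linspace(lo, hi, n) for lo, hi in box]
    mesh = np.meshgrid(*grids, indexing="ij")
    vals = f(*mesh)
    return float(vals.min()), float(vals.max())

def cdf_bounds(y_focal, y):
    """UCDF: mass of Y-intervals intersecting (-inf, y]; LCDF: mass of those contained in it."""
    ucdf = sum(m for (lo, hi), m in y_focal if lo <= y)
    lcdf = sum(m for (lo, hi), m in y_focal if hi <= y)
    return lcdf, ucdf

if __name__ == "__main__":
    f = lambda a, b: (a + b) ** a
    body_A = [((0.1, 0.4), 0.5), ((0.3, 0.6), 0.5)]
    body_B = [((0.2, 0.5), 0.5), ((0.4, 0.7), 0.5)]
    y_focal = []
    for ia, ma in body_A:
        for ib, mb in body_B:
            y_focal.append((propagate_box(f, (ia, ib)), ma * mb))
    print(y_focal)                   # four Y-intervals, basic probability 1/4 each
    print(cdf_bounds(y_focal, 1.0))  # (Belief, Plausibility) of the event Y <= 1.0
```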
Example 4: Consider the function Y = (A + B)^A. The maximum probabilities of variables A and B were found in Example 1 and the joint maximum probability of A and B in Example 2. Example 3 explains how we construct the body of evidence of Y from the joint space. Figures 4 and 5 show the body of evidence of variable Y, together with the corresponding maximum probability and the maximum and minimum cumulative probability distribution functions of Y, respectively. It is observed from Figure 4 that the maximum probability of Y = y is an overly conservative measure of the likelihood of this event. Indeed, unless the probability density function (PDF) of Y has a delta function at Y = y, the probability of this event is zero, while the maximum probability of this event assumes a nonzero value in the range from 0.81 to 1.17. Figure 5 shows that there is a large gap between the maximum and minimum bounds of the cumulative probability of Y. This indicates a large uncertainty in the true value of this probability. There are two reasons for this large uncertainty: a) the intervals provided by the experts about the values of the independent variables A and B are wide and they are nested; b) only the information provided in the problem statement was used to model the uncertain variables.

Assessing safety

As mentioned in the introduction, when the evidence is imprecise it is useful to know how low and how high the probability of survival (or failure) of a system can be. Indeed, when evidence is imprecise, it could be reasonable to design a system whose failure could have severe consequences using the most conservative models that are consistent with the available evidence. The maximum and minimum probabilities of survival can be found using the following equations:

P^u(S) = Pl(S) = \sum_{C_i \cap S \neq \emptyset} m_Y(C_i)    (8)

P^l(S) = Bel(S) = \sum_{C_i \subseteq S} m_Y(C_i)    (9)

Uncertainty Measures

ET considers two types of uncertainty: one is due to the imprecision in the evidence, the other is due to conflict. Nonspecificity and Strife measure the uncertainty due to imprecision and conflict, respectively. Both measures are expressed in bits of information. In the following, we briefly present these measures; a detailed presentation can be found in [16, 17].

The larger the focal elements of a body of evidence, the more imprecise is the evidence and, consequently, the higher is the Nonspecificity. When the evidence is precise (all of the focal elements consist of a single member), Nonspecificity is zero. In the challenge problems, the broader the intervals of the experts, the higher is the Nonspecificity. Strife measures the degree to which pieces of evidence contradict each other. Consonant (nested) focal elements imply little or no conflict. Disjoint elements imply high conflict in the evidence. For example, if the experts' intervals are disjoint, the experts contradict each other; therefore, Strife is large. For finite sets, when evidence is precise, Strife reduces to Shannon's entropy, which measures conflict in probability theory.

Nonspecificity measures the epistemic (reducible) uncertainty, the uncertainty associated with the sizes (cardinalities) of the relevant sets of alternatives. Consider a body of evidence <F, m>, where F represents the set of all focal elements and m their corresponding basic probability assignments. The Nonspecificity N(m, \mu), in bits, is

N(m, \mu) = \sum_{A \in F} m(A) \cdot \mu(A)    (10)

where \mu(A) = \log_2 |A| for discrete domains and \mu(A) = \ln(1 + \lambda(A)) for continuous domains; \lambda(A) is the Lebesgue measure of A, and the Lebesgue measure of an interval is its length.
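As a small illustration (not from the paper), the following sketch evaluates Eq. (10) for interval focal elements; for the two width-0.3 intervals of Example 1 it gives ln(1.3), about 0.262.

```python
# Sketch of Eq. (10): Nonspecificity of a body of evidence whose focal elements
# are real intervals, using mu(A) = ln(1 + |A|) with |A| the interval length.
import math

def nonspecificity(body):
    """body: list of ((lo, hi), basic_probability) pairs."""
    return sum(m * math.log(1.0 + (hi - lo)) for (lo, hi), m in body)

if __name__ == "__main__":
    # Variable A of Example 1: two intervals, each of width 0.3 and with m = 1/2.
    body_A = [((0.1, 0.4), 0.5), ((0.3, 0.6), 0.5)]
    print(nonspecificity(body_A))   # = ln(1.3) ~ 0.262, since both intervals have width 0.3
```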
Strife measures the conflict among the various sets of alternatives in a body of evidence. The Strife measure in evidence theory is given by

S(m) = - \sum_{A \in F} m(A) \log_2 \sum_{B \in F} m(B) \cdot SUB(A, B)    (11)

where SUB(A, B) = |A \cap B| / |A| for finite sets and, for infinite sets, SUB(A, B) = \lambda(A \cap B) / \lambda(A) if \lambda(A) > 0, SUB(A, B) = 1 if A \equiv \emptyset, and SUB(A, B) = 0 otherwise. Here \lambda(A) and \lambda(A \cap B) denote the Lebesgue measures of A and A \cap B, respectively.

2. Bayesian Approach

This section explains Bayes' rule and presents an approach for constructing a probability mass/density function of discrete and continuous variables using evidence from experts. Methods for estimating the probability density function of the response of a system and its probability of failure given the probability density functions of the input variables, such as Monte Carlo simulation and Fast Probability Integration, are well documented [18]; therefore, they will not be discussed here.

Bayes' rule

Discrete case: updating a prior probability mass function using evidence

Suppose we have information about variable X whose prior Probability Mass Function (PMF) is given by the set of possible values x_1, ..., x_J and the corresponding probabilities P(X = x_j), j = 1, ..., J. Then we observe a sample value of another variable Y. The likelihood, P(Y = y | X = x_j), is determined from the conditional PMF of Y given X. Bayes' rule can be applied to update the prior PMF of X, when the sample value of Y is observed, to estimate the posterior PMF, P(X = x_j | Y = y). Bayes' rule for the discrete case is:

P(X = x_j | Y = y) = \frac{P(Y = y | X = x_j) \cdot P(X = x_j)}{\sum_{j=1}^{J} P(Y = y | X = x_j) \cdot P(X = x_j)}    (12)

Continuous case: updating a prior PDF using evidence [19]

The uncertainty in a continuous random variable X can be represented by its PDF, f_X^o(x), which can be updated on the basis of evidence E into a posterior PDF, f_X(x | E). This function can be calculated using Bayes' theorem:

f_X(x | E) = \frac{1}{k} \cdot L(E | x) \cdot f_X^o(x)    (13)

where f_X(x | E) is the posterior PDF of variable X and k is a normalization constant:

k = \int_{-\infty}^{\infty} L(E | x) \cdot f_X^o(x) \, dx    (14)

L(E | x) is the likelihood function, which is the conditional probability of observing evidence E given X = x. If we do not know the prior PDF of X, then we can assume a noninformative (maximum entropy) prior. If we only know that a variable is in a certain range, then the uniform probability density is the one with maximum entropy.

If the evidence is imprecise (i.e., it consists of intervals instead of single values), Bayes' rule cannot be applied directly. The analyst needs to make the assumptions described below to estimate the likelihood of the evidence. The posterior PDF of X can be sensitive to these assumptions.

Example 5: Consider the problem solved in Examples 1-4 using the ET approach. Assume that an analyst only knows that variable A is between 0.1 and 1, and variable B is between 0 and 1. Since the analyst has no other information regarding the prior, the analyst assumes the uniform prior probability distributions for A and B shown in Figure 6.
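Before turning to interval evidence, a minimal grid-based sketch of the update in Eqs. (13)-(14) may be helpful. This is not from the paper; the Gaussian likelihood centered at 0.35 with standard deviation 0.05 is purely an illustrative assumption.

```python
# Grid-based sketch of Bayes' rule for a continuous variable, Eqs. (13)-(14):
# posterior ~ likelihood * prior, normalized numerically. The Gaussian likelihood
# (hypothetical point estimate 0.35, sigma 0.05) is an illustrative assumption.
import numpy as np

def posterior_on_grid(x, prior, likelihood):
    """Return the normalized posterior density on the grid x (Eqs. 13-14)."""
    unnormalized = likelihood(x) * prior(x)
    k = np.trapz(unnormalized, x)              # normalization constant, Eq. (14)
    return unnormalized / k

if __name__ == "__main__":
    x = np.linspace(0.1, 1.0, 1001)
    prior = lambda x: np.full_like(x, 1.0 / 0.9)              # uniform prior on [0.1, 1] (Example 5)
    like = lambda x: np.exp(-0.5 * ((0.35 - x) / 0.05) ** 2)  # hypothetical evidence
    post = posterior_on_grid(x, prior, like)
    print(np.trapz(post, x))     # ~1.0, posterior integrates to one
    print(x[np.argmax(post)])    # posterior mode near 0.35
```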
Method for combining evidence from experts to construct models of uncertainty

When the evidence given by experts about a random variable X consists of intervals (Figure 7), we need a method to interpret the evidence and bring it into a form in which it can be used in the Bayesian framework. To apply Bayes' theorem to this problem, we need to estimate the prior PDF of X and the likelihood L(E | X = x). The following assumptions are made to combine evidence from experts: a) the analyst converts the interval provided by each expert about a variable into a point estimate, which can be the midpoint of the interval or another point obtained based on the analyst's judgment; b) the point estimate of the expert is equal to the true value of the variable plus an error, and the analyst assumes a joint probability distribution of the errors of the experts.

Suppose that we have evidence in the form of intervals from n experts, [x_i^{min}, x_i^{max}] for i = 1, ..., n (Figure 7). The analyst can assume that the i-th expert gives a point estimate \hat{x}_i, which is the midpoint of the i-th interval:

\hat{x}_i = \frac{x_i^{min} + x_i^{max}}{2}, for i = 1, ..., n    (15)

Let x be the true value of variable X. If the error in the i-th expert's estimate is D_i, then the point estimates are the true value plus the errors of the experts, or

\hat{X} = X + D    (16)

where \hat{X} is the vector of the point estimates of the experts (of size n), X is a vector whose elements are all equal to the true value x of variable X, and D is the vector of the errors. We need the PDF of D to estimate the likelihood of the evidence. As an example, we can assume that the random vector D is normal with mean b and covariance matrix C,

C = \begin{bmatrix} \sigma_{D_1}^2 & \rho_{1,2} \sigma_{D_1} \sigma_{D_2} & \cdots \\ \vdots & \ddots & \vdots \\ \rho_{n,1} \sigma_{D_n} \sigma_{D_1} & \cdots & \sigma_{D_n}^2 \end{bmatrix}    (17)

where \sigma_{D_i} is the standard deviation of the error in the estimate of the i-th expert and \rho_{ij} is the correlation coefficient of the estimates of two experts. The i-th element of vector b, b_i, is the bias of the i-th expert. The analyst should estimate the above quantities. As an example, an analyst could assume that: a) the bias b_i is zero; b) the endpoints of the interval provided by the i-th expert are equal to the midpoint plus or minus three standard deviations of the error, respectively. On the basis of these assumptions, the analyst can calculate the standard deviation of the error in each expert's estimate:

\sigma_{D_i} = \frac{x_i^{max} - x_i^{min}}{6}    (18)

Example 6: In Example 5 we assumed the prior PDFs of A and B. The next step is to determine the experts' errors given the evidence from the experts. We also assume that the bias is zero for all the experts (that is, the errors of the experts have zero mean). Figure 8 displays the PDFs of the experts' errors for variables A and B.

The likelihood of the evidence is then

L(E | X = x) = f_D(\hat{X} - X | X = x) = \frac{1}{(2\pi)^{n/2} |C|^{1/2}} \, e^{-\frac{1}{2} (\hat{X} - X - b)^T C^{-1} (\hat{X} - X - b)}    (19)

The posterior PDF of variable X can then be calculated from Eq. (13).

Example 7: Consider the function Y = (A + B)^A. The prior PDFs and the errors of the experts for variables A and B were found in Examples 5 and 6. The likelihood is calculated for this example using Eq. (19) and the posterior PDF using Eq. (13). The likelihood functions and the posterior PDFs of A and B are presented in Figure 9. Using the convolution integral method we can readily compute the posterior PDF of the dependent variable Y.
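The chain of Eqs. (15)-(19) can be sketched numerically as follows. This is an illustration, not the paper's implementation: it treats a single variable, assumes independent, unbiased experts, and evaluates the posterior on a grid.

```python
# Sketch of the interval-evidence update in Eqs. (15)-(19) for one variable:
# midpoints as point estimates (Eq. 15), sigma = width/6 (Eq. 18), independent
# unbiased experts, Gaussian error model (Eq. 19), posterior on a grid (Eqs. 13-14).
import numpy as np

def posterior_from_intervals(intervals, x_grid, prior):
    """intervals: list of (lo, hi) expert intervals; prior: density values on x_grid."""
    mid = np.array([(lo + hi) / 2.0 for lo, hi in intervals])   # Eq. (15)
    sig = np.array([(hi - lo) / 6.0 for lo, hi in intervals])   # Eq. (18)
    # Independent experts -> the joint error density is a product of 1-D normals (Eq. 19).
    like = np.ones_like(x_grid)
    for m, s in zip(mid, sig):
        like *= np.exp(-0.5 * ((m - x_grid) / s) ** 2) / (s * np.sqrt(2 * np.pi))
    post = like * prior
    return post / np.trapz(post, x_grid)                        # Eqs. (13)-(14)

if __name__ == "__main__":
    # Variable A of Examples 5-6: uniform prior on [0.1, 1], experts give [0.1, 0.4] and [0.3, 0.6].
    x = np.linspace(0.1, 1.0, 2001)
    prior = np.full_like(x, 1.0 / 0.9)
    post_A = posterior_from_intervals([(0.1, 0.4), (0.3, 0.6)], x, prior)
    print(x[np.argmax(post_A)])    # posterior concentrates between the two interval midpoints
```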
Uncertainty Measures

In standard and Bayesian probability theory, Shannon's entropy measures the uncertainty due to conflict. Since the evidence is treated as if it were precise, Nonspecificity is zero. Shannon's entropy for finite sets is not directly applicable to continuous distributions as a measure of uncertainty. When the probability density function of a continuous variable is defined on a real interval, Shannon's entropy is defined in relative terms using a reference probability density function. In this case, the entropy can be positive (the entropy of the probability density function is greater than that of the reference probability density function) or it can be negative. It can only be employed in the following modified form:

H(X) = - \int_{-\infty}^{\infty} p(x) \cdot \log \frac{p(x)}{g(x)} \, dx    (20)

where X is the random variable with PDF p(x) and g(x) is the reference density function of X. In this paper, the reference densities for the input variables are their prior PDFs. For the dependent variable Y, the reference density is the PDF of Y corresponding to the prior PDFs of the input variables.

3. Demonstration and Comparison of ET and BT Approaches

The Epistemic Uncertainty Group [15] proposed solving the following challenge problem using methods for modeling uncertainty, in order to understand how these methods work when evidence is imprecise. Consider the function Y = (A + B)^A, where A and B are independent variables, which means that knowledge about one variable does not alter our belief about the other. There is no uncertainty in the functional relation between A, B and Y; there is only uncertainty in the values of A and B. Experts provide information about A and B in the form of intervals. The objective is to quantify the uncertainty in Y.

In this paper, the above problem is solved for seven cases using both the ET and BT approaches. The objective is to calculate and compare the models that these approaches construct to characterize the uncertainty in variables A, B and Y. Table 1 presents the evidence from the experts. In case 1, we have only one expert providing evidence for each variable, so there is only imprecision. In cases 2 and 3, two experts provided evidence for each of the variables A and B, and there is both imprecision and conflict. Conflict is considerably lower in case 3 than in case 2, because the intervals are overlapping in the former case and disjoint in the latter. The experts in case 4 are precise (their intervals are very narrow) but they contradict each other. In case 5, we have highly imprecise experts (the intervals for A and B are wide) and there is no conflict (the intervals are nested, which means that all experts could be right); this is the opposite situation to case 4, where conflict dominates over imprecision. In cases 6 and 7, we have evidence from three and four experts for variables A and B, respectively. In case 6, the evidence is nested, so there is no conflict. In case 7, the intervals of the experts are disjoint and narrower than in case 6; therefore, there is higher conflict and lower imprecision than in case 6.

The analyst in the Bayesian approach makes the following assumptions for all the cases:

1. The priors of A and B are uniform from 0.1 to 1 and from 0 to 1, respectively.
2. The error of each expert is normal, with standard deviation equal to 1/6th of the width of the interval provided by the expert, in all cases except 4a and 7a. If the experts are unbiased, the mean of the error is zero. If the experts are independent, then the correlation coefficients of the errors are zero and, consequently, the correlation matrix in Eq. (17) is diagonal.

In the Bayesian approach the analyst assumes the probability distributions of the errors of the experts. In cases 1-7, the experts are assumed independent and unbiased. In case 3a, the analyst still assumes zero bias, but the errors of the two experts are positively correlated (ρ = 0.8) for variable A and negatively correlated (ρ = -0.8) for variable B. In case 3b, the analyst assumes a bias of 0.1 for variable A and a bias of 0.05 for variable B, but the experts are independent.

Cases 4 and 7 are challenging for the analyst who uses the Bayesian approach because the experts contradict each other.
This means that, based on the experts' intervals, only one expert can be correct. The analyst could still assume that the standard deviations of the experts' errors are equal to 1/6th of the widths of their intervals, but this is a poor assumption, because if both experts were as accurate as this assumption implies then they would not contradict each other. To overcome this difficulty, the analyst increases the standard deviation of the error based on the degree of conflict in the experts' evidence. The standard deviation of the error obtained from Eq. (18) is increased by 0.05 and 0.2 for both variables A and B in cases 4a and 7a, respectively.

Figures 10-16 present the results of the two methods for cases 1-7, respectively. The ET method yields maximum and minimum cumulative probability distributions of the input and output variables. These curves envelop the true cumulative probability distributions of the variables; they are also labeled Plausibility and Belief, respectively. The Bayesian approach characterizes uncertainty using a single cumulative probability distribution function. Because of the assumptions about the probability distributions of the experts' errors, the maximum and minimum cumulative probability distributions do not always envelop the Bayesian cumulative distribution.

One can assess the magnitude of each type of uncertainty by studying the maximum and minimum cumulative probability distributions obtained from the ET approach. A large horizontal distance between the maximum and minimum cumulative distributions of a variable indicates high imprecision. For example, in case 5 there is high imprecision in all variables. Uncertainty due to conflict in a variable can be assessed by the width of the interval in which its cumulative probability distribution suggests that it can vary. The flatter the cumulative distribution, the wider is the interval of variation and the higher is the uncertainty due to conflict. For example, conflict is larger in cases 4 and 7 than in the other cases, because the cumulative Plausibility and Belief distributions of the variables are flatter (have lower slope) in these cases than in the other cases.

From the Bayesian approach results in Figure 12, we observe that the uncertainty increases when there is positive correlation in the experts' errors (variable A) and decreases when there is negative correlation in the experts' errors (variable B). A shift in the cumulative distribution function (Figure 12) accounts for the bias assumed in estimating the experts' errors for variables A, B and Y in case 3b.

As mentioned earlier, conflict is largest in cases 4 and 7. If the Bayesian analyst assumes that the standard deviations of the errors of the experts' estimates are 1/6th of the widths of their intervals, then the analyst will seriously underestimate the uncertainty in both the input variables and the response variable. For example, in Figures 13 and 16, the uncertainty in all variables computed by the BT approach in cases 4 and 7 is small compared to that predicted by the ET approach. But if the Bayesian analyst increases the standard deviation of the errors of the experts, then the analyst will assess the uncertainty more accurately (cases 4a and 7a) and his/her conclusions will be consistent with those from ET. Both Shannon's entropy, used in BT, and Total Uncertainty, used in ET, indicate that the uncertainty is largest in cases 4a and 7a (Figure 17).
It is also observed from Figure 17 that the conclusions of the BT approach are sensitive to the assumptions about the experts' errors. Assuming that the standard deviations of the errors of the experts' estimates are 1/6th of the widths of their intervals leads to the conclusion that the entropy is small (cases 4 and 7). But this assumption amounts to saying that both experts are precise, which is wrong because the experts contradict each other. Using proper assumptions about the experts' errors, a Bayesian analyst obtains conclusions consistent with those from ET.

Table 2 presents the Nonspecificity, Strife and Shannon's entropy in cases 1 to 7. Even though both Strife and Shannon's entropy measure conflict, they should not be compared, because Shannon's entropy measures the conflict relative to a reference probability density (a uniform probability density in this example). The negative numbers for the entropy indicate that the uncertainty in variables A and B was reduced when the prior distributions of these variables were updated based on the experts' evidence. In case 1, Strife is zero, showing that there is no uncertainty due to conflict. In case 2, Strife is higher than Nonspecificity, which indicates that uncertainty due to conflict in the evidence of the experts dominates. In case 3, the imprecision and conflict types of uncertainty are comparable. In case 4, conflict dominates over imprecision because the evidence is precise but conflicting. In case 5, imprecision dominates over conflict, because the evidence consists of nested intervals. In case 6, Nonspecificity and Strife are comparable. In case 7, where the intervals are disjoint, Strife is larger than Nonspecificity. These conclusions are consistent with those from Figure 18, which shows that Nonspecificity dominates in cases 1, 3-3b, 5 and 6, whereas Strife dominates in cases 4 and 7. In the BT approach one can only assess the total uncertainty in the input and output variables, because both types of uncertainty are aggregated into one.

When imprecision is large, a decision-maker cannot estimate the probability of an event accurately. For example, in case 5, the cumulative probability of Y can assume any value between 0 and 1 for values of Y between 1 and 1.7. This indicates that one should consider collecting more data before making a decision. However, if the decision-maker has to make a decision now, then the results of ET do not tell the decision-maker what to do. For example, if two alternative designs have minimum and maximum probabilities of failure of 0.05 and 0.1, and of 0.01 and 0.15, respectively, the ET approach does not tell the decision-maker which design is safer.

Table 3 summarizes the differences between the two approaches. The BT approach does not distinguish between the imprecision and conflict types of uncertainty. It provides single estimates of the probabilities of events, which help a decision-maker rank alternative designs in terms of their reliability. On the other hand, a decision-maker cannot tell whether it is worth buying more information by only examining the cumulative probability distribution of a variable. One can assess the value of additional information by studying the sensitivity of the results of the BT approach to the underlying assumptions about the probability distributions of the random variables.

4. Experimental Comparison of Methods

Different methods have been proposed for modeling and propagating uncertainty, and for providing information on the uncertainty in the function y in the challenge problems.
We assume that, in many cases, uncertainty is propagated for the purpose of making decisions. Therefore, we propose a simulation method for testing methods in terms of their effectiveness in making simple decisions.

The proposed simulations imitate the following physical experiment, designed to determine which of two people (say John and Linda) can better estimate the relative weight of two pieces of cake. We give John a cake, asking him to cut it, knowing that Linda will pick the heavier slice and that his objective is to end up with the heavier slice himself. Under these conditions, John will try to cut the cake as evenly as possible. We then repeat the experiment by giving Linda an identical cake to slice. Finally, we weigh the two pieces that John ended up with and the two pieces that Linda has. If Linda has a substantially heavier total, it would indicate that she estimates the relative weight of two pieces of cake more accurately than John. This problem belongs to a wide class of real-life problems in which two players are to divide a certain amount of resources or goods in two parts. For example, a sales manager wants to divide a town equitably between two salesmen. One way is for the manager to ask one salesman to divide the town into two regions by drawing a straight line on the town map and then ask the other salesman to select a region. The salesman who divided the town then receives the remaining region.

The analogy is immediate when one wants to estimate the median of the probability density function of a function y(x) (median: 0.5 probability to fall below it and 0.5 probability to exceed it). We have two methods for constructing models of the uncertainty in variable X and propagating it to quantify the uncertainty in the function y(x). Each method has an advocate, called John and Linda. We ask John to divide the interval [yl, yu] into two by picking one point inside the interval, so that Linda can select one subinterval. We repeat the procedure by asking Linda to slice the interval and allow John
We assume that knowing only the interval where a variable lies is equivalent to that variable being able to take any probability distribution supported on that interval with each distribution having the same likelihood. This suggests the following possible Monte Carlo simulation: Pick a distribution at random, and evaluate the outcomes for John and Linda based on that distribution. Repeat the process many times and see 28 whether John or Linda emerge as winners when a large number of simulations has been performed. To make this process more manageable we assume that probability distributions in common usage, such as the normal or the Weibull distributions, are popular because they describe well uncertainties we encounter often in practical applications. This allows us to limit the simulation to a set of the five or ten most popular distributions. Note that even if we limited ourselves to a single distribution, we still can vary the parameters of the distribution such as the mean and standard deviation in case of the normal distribution. To cater for the possibility that we left out some odd but important distributions, we can add in some experimental distributions from various sources. For example, we may ask a class of students to each pick a number from the range of X or a scaled version of X We still need to deal with the question of how to simulate situations where the information on X is more complex. For example, we may receive information from two people, one saying that X is in the range [0, 1] and the other saying that it is in the range [0, 0.5]. The best way of simulating this situation deserves some consideration and debate. One possibility is to take half of the distributions from one interval and the other half from the other interval (assuming that both sources of information are equally credible). In the following, we present methods for solving the interval splitting problem and demonstrate them through examples. Definition of Interval Splitting and Subinterval Selection Problem 29 Consider the following game played by John and Linda. The players consider a variable X and a known function of this variable, Y(X). They are told that X is in the interval I=[xl, xu]. Then Y is in the interval IY=[yl, yu], where yl and yu are the minimum and maximum values of the function Y(X) when X varies in the interval I. Additional evidence about X could be available. John divides IY into two subintervals by selecting a point, y0. Then Linda selects one of the subintervals, IYC, and John gets the remaining subinterval, IYB. Linda wins if he or she selects the interval with higher probability. Find y0 so that John does not lose. The game is repeated with John and Linda switching places to create a symmetric game. Solution Even though the two players may have different models of uncertainty, we assume that they do not know what these models are. Therefore, we assume that they solve the problem under the assumption that the other player uses the same model as they do. The interval splitting problem is formulated as follows: Find y0, to maximize the objective function = P(IYB)-P(IYC), where P(IYB) and P(IYC) are the true probabilities of the John's and Linda's subintervals. Under the assumption that Linda has the same uncertainty model as John, Linda can always select the interval with highest probability. The best bet for John is to select the median of Y as the dividing point, in which case he will break even (see the appendix for a mathematical proof of this assertion). 
For the subinterval selection problem, John should select the subinterval with the highest probability; that is, he will select the left subinterval if y0 is to the right of the median of Y (i.e., FY(y0) ≥ 0.5), and he will select the right subinterval otherwise.

Examples

In the following, we present two solutions to both the interval splitting and the subinterval selection problems for the function Y = X^2, in two cases: John knows only that X is in an interval (Case A), and John has evidence about X in the form of three intervals (Case B). We further assume that one player uses the principle of maximum entropy to construct a probability distribution, while the other employs the minimum and maximum cumulative probability distribution functions of Y, obtained using ET, for dividing and selecting.

Case A: The only evidence available is that X is between 0 and 1. Therefore, Y ranges between 0 and 1 too.

Interval splitting problem: find the splitting point y0.

Maximum entropy solution: Using the maximum entropy principle, John assumes a uniform probability density of X in [0, 1]. Then

F_Y(y_0) = 0.5 \Leftrightarrow P(Y \le y_0) = 0.5 \Leftrightarrow P(X^2 \le y_0) = 0.5 \Leftrightarrow P(X \le \sqrt{y_0}) = 0.5 \Leftrightarrow F_X(\sqrt{y_0}) = 0.5.

Therefore, the optimum value of y0 is the square of the median of X, which is x_0 = 0.5, so y_0 = 0.5^2 = 0.25.

ET solution: In this case, John does not assume a probability distribution for X or Y. Instead, he finds the minimum and maximum cumulative probabilities of X that are consistent with the available evidence and derives the minimum and maximum cumulative probabilities of Y. Figure 19 shows the minimum and maximum cumulative probability distributions of X and Y, which are identical in this case. According to this figure, John only knows that the cumulative probability can assume any value between 0 and 1 for any value of Y in the interval [0, 1]. Therefore, it does not matter what value of Y he selects, as long as it is in the interval [0, 1]. In cases where one only knows that the optimum solution to a problem is in a certain interval, it is reasonable to select the midpoint of that interval, because this best protects against errors in calculating the interval. On the basis of this argument, John will select y0 = 0.5.

Subinterval selection problem

Probabilistic solution: If y0 is less than 0.25, John will select the right subinterval; otherwise he will select the left one.

ET solution: If y0 is less than 0.5, John will select the right subinterval; otherwise he will select the left one.

Based on these calculations, we will get the following scenario: the maximum entropy player will divide the interval at 0.25, and the ET player will select the right subinterval. In the second game, the ET player will divide the interval at 0.5, and the maximum entropy player will select the left subinterval.

Case B: Three experts tell John that X ranges in the following intervals: [0, 1], [0.3, 0.7], [0.4, 0.6].

Interval splitting problem

Maximum entropy solution: If we considered only the opinion of one expert, we would assume a uniform probability distribution in that expert's interval. Assuming that each distribution is correct one third of the time, we obtain the probability density function of X shown in Figure 20. Based on this function, the median of X is x0 = 0.5. Therefore, the optimum value of Y is again y0 = 0.25.
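A quick numerical check of this maximum entropy solution is sketched below (not part of the paper): sampling X from the equally weighted mixture of the three uniform distributions and taking the empirical median of Y = X^2 gives a value close to 0.25.

```python
# Monte Carlo check of the maximum entropy solution for Case B: X is drawn from one
# of the three expert intervals with equal probability, uniformly within that interval;
# the empirical median of Y = X**2 should be close to 0.25.
import random

def sample_x(intervals):
    lo, hi = random.choice(intervals)       # each expert weighted 1/3
    return random.uniform(lo, hi)

if __name__ == "__main__":
    random.seed(0)
    intervals = [(0.0, 1.0), (0.3, 0.7), (0.4, 0.6)]
    ys = sorted(sample_x(intervals) ** 2 for _ in range(200_000))
    print(ys[len(ys) // 2])                 # empirical median of Y, close to 0.25
```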
ET solution: The body of evidence for variable X introduced by the experts is shown in Figure 21, where m([a, b]) is the basic probability assignment of the interval [a, b]. Figure 22 presents the basic probability assignment of Y. Using this body of evidence, we can compute the minimum and maximum cumulative probabilities of variable Y. For example, the minimum cumulative probability distribution at point y is equal to the Belief of the interval [-\infty, y], which is

F_Y^{min}(y) = Bel([-\infty, y]) = \sum_{A \subseteq [-\infty, y]} m(A)    (21)

The minimum and maximum cumulative distributions of Y in Figure 23 tell us that the median of Y cannot assume any value less than 0.09 or greater than 0.49. Therefore, the optimum value of y0 should be somewhere in the range [0.09, 0.49]. A reasonable choice for y0 is the midpoint; that is, the splitting point is y0 = 0.29.

Subinterval selection problem

Probabilistic solution: If y0 is less than 0.25, John will select the right subinterval; otherwise he will select the left one.

ET solution: If y0 is less than 0.29, John will select the right subinterval; otherwise he will select the left one.

The results in this case are similar to Case A. In the first game, the maximum entropy player will select 0.25 as the splitting point, and the ET player will select the right subinterval. In the second game, the splitting point will be 0.29, and the maximum entropy player will select the left subinterval.

Observations

Consider the problem Y = X^n, where n is large (e.g., 100). The person who used the maximum entropy principle would select an extremely small y0 (y0 = 0.5^100). If the only information available to John was that X is in the interval [0, 1], and John used an ET approach, then he would not know what value of Y to select in the interval [0, 1]. He could select the midpoint, y0 = 0.5. In this case he would lose practically all the time, by a wide margin, for most probability distributions. Figure 24 shows John's objective (payoff) function when John splits the interval, in the case where, although both players assumed that Y = X^100, the true value of the exponent of the function was 100 times a bias factor ranging from 0.5 to 1.5. Two cases, in which John uses probability and Linda uses minimum and maximum probabilities and vice versa, are considered. Since John is splitting the interval, he has a disadvantage compared to Linda. When John uses probability, his payoff function is close to 0, which means he will break even. But when John uses maximum and minimum probability, he will almost always lose, and his payoff function is practically -1, which is the lowest value this function can assume. The reason is that he will split the interval in half, and Linda will select the left subinterval, which almost always has the higher probability.

Now consider the same problem, but with evidence consisting of the three intervals [0, 1], [0.3, 0.7] and [0.4, 0.6]. In this case, if John used the probabilistic method described above, he would still get the correct value of y0. On the other hand, if John used the approach based on minimum and maximum probability, he would conclude that the splitting point should be in the interval [0.3^100, 0.7^100]. This solution is much better than the optimum solution of the same method when only the interval [0, 1] is available. It appears that the performance of the approach based on minimum and maximum probability can be very poor when there is a severe information deficit. This is interesting because our intuition tells us the opposite: the fewer assumptions a method makes, the less sensitive it should be to a lack of information.
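The payoff calculation behind these observations can be sketched in a few lines. This is an illustration rather than the paper's exact simulation: here the "true" model used for evaluation is simply X uniform on [0, 1], without the bias factor on the exponent used in Figure 24.

```python
# Sketch of the payoff in the splitting game for Y = X**n with X ~ Uniform(0, 1):
# John splits [0, 1] at y0, Linda takes the subinterval with the higher true probability,
# and John's payoff is P(John's subinterval) - P(Linda's subinterval).
def payoff(y0, n):
    p_left = y0 ** (1.0 / n)            # true P(Y <= y0) when X ~ Uniform(0, 1)
    p_right = 1.0 - p_left
    john = min(p_left, p_right)         # Linda grabs the larger piece
    return john - max(p_left, p_right)

if __name__ == "__main__":
    n = 100
    print(payoff(0.5 ** n, n))          # maximum entropy split at the median: payoff 0.0
    print(payoff(0.5, n))               # midpoint split (ET player): payoff close to -1
```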
5. Conclusions

Two approaches, one based on ET and the other on BT, have been presented. These approaches can be used for modeling uncertainty and assessing the safety of a system when the available evidence consists of intervals, provided by experts, that bound the values of the input variables.

The Evidence Theory approach does not require the user to assume anything beyond what is already available. This approach treats uncertainty due to imprecision differently from uncertainty due to randomness. The approach yields maximum and minimum bounds of the probability of survival (and/or the probability of failure) of a system, which can help assess the relative importance of the two types of uncertainty. These results could help a decision-maker decide whether it is worth collecting additional data to reduce imprecision. On the other hand, if the gap between the maximum and minimum probabilities is large, the decision-maker will have difficulty ranking alternative options. If the decision-maker has to make a decision now, then the ET approach does not tell the decision-maker which option is better.

The Bayesian approach requires the analyst to make strong assumptions about the credibility of the experts in order to estimate the likelihood of the available evidence. On the other hand, this approach is more flexible than the Evidence Theory approach because it accounts for the credibility and correlation of the experts. This approach yields a single estimate of the probability of failure of the system, which makes it easier for a decision-maker to rank alternative options. On the other hand, it does not help the decision-maker assess the relative importance of imprecision over random uncertainty.

It is recommended that a decision-maker compute both the Bayesian probabilities of events and their minimum and maximum probabilities when there is considerable imprecision. A large gap between the minimum and maximum probabilities suggests that more information should be collected before making a decision. If this is not feasible, then the Bayesian probabilities can help make a decision.

A procedure for testing alternative methods for solving the challenge problems, based on the outcomes of decisions obtained from these methods, was presented. A simple set of test problems, mimicking real-life decision problems in which a given amount of resources is to be divided equally, is used for testing the methods. Using these test problems, we can learn useful lessons about the efficacy of alternative methods that are difficult to learn from an examination of their theoretical foundations. For example, it was found that a decision-maker who uses an ET approach performs worse, in the long run, than an opponent who uses probability, even when the information about uncertainty is scarce.

Acknowledgements

The work presented in this report has been partially supported by the grant "Analytical Certification and Multidisciplinary Integration" provided by The Dayton Area Graduate Study Institute (DAGSI) through the Air Force Institute of Technology.

Appendix: Mathematical derivation of the solution

As mentioned in the main body of this report, John will select the median of the probability distribution of Y as the dividing point to overcome Linda's advantage of knowing the true probability distribution of Y. Here we prove this assertion.
John assumes that Linda will select the subinterval with the higher true probability; that is, the probability of Linda selecting the left subinterval, p(y0), is:

p(y_0) = \begin{cases} 1 & \text{if } F_Y(y_0) \ge 0.5 \\ 0 & \text{if } F_Y(y_0) < 0.5 \end{cases}        (22)

where FY(y0) is the value of the true cumulative probability distribution of Y at Y = y0. The decision tree is shown in Figure 25. Figure 26 shows the objective function as a function of the value of the cumulative probability of Y at y0, FY(y0). It is observed that the optimum value of y0 is the one for which FY(y0) is 0.5, that is, y0 is the median of Y. This is the only choice for which John breaks even; he loses for all other values. Q.E.D.
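To spell out the expectation underlying Figure 26 (a brief sketch consistent with the decision tree of Figure 25; the symbol G, denoting John's payoff, is introduced here only for convenience):

E[G(y_0)] = p(y_0)\,[\,1 - 2F_Y(y_0)\,] + [\,1 - p(y_0)\,]\,[\,2F_Y(y_0) - 1\,] = -\left|\,2F_Y(y_0) - 1\,\right|,

where the last equality follows from Eq. (22). This quantity is never positive and vanishes only when FY(y0) = 0.5, that is, only when y0 is the median of Y.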
References

[1] Oberkampf, W. L., DeLand, S. M., Rutherford, B. M., Diegert, K. V., and Alvin, K. F., 2000, "Estimation of Total Uncertainty in Modeling and Simulation," Sandia Report SAND2000-0824, Albuquerque, NM.
[2] Walley, P., 1991, Statistical Reasoning with Imprecise Probabilities, Chapman and Hall, New York.
[3] Walley, P., 1998, "Coherent Upper and Lower Previsions," The Imprecise Probabilities Project. Available at http://ippserv.rug.ac.be.
[4] Kyburg, H. E., 1998, "Interval-Valued Probabilities," The Imprecise Probabilities Project. Available at http://ippserv.rug.ac.be.
[5] Dubois, D., and Prade, H., 1998, Possibility Theory, Plenum Press, New York.
[6] Joslyn, C. A., 1994, Possibilistic Processes for Complex Systems Modeling, Ph.D. Dissertation, State University of New York at Binghamton.
[7] Giles, R., 1982, "Foundations for a Theory of Possibility," Fuzzy Information and Decision Processes, North-Holland Publishing Company.
[8] Langley, R. S., 2000, "Unified Approach to Probabilistic and Possibilistic Analysis of Uncertain Systems," Journal of Engineering Mechanics, ASCE, Vol. 126, No. 11, pp. 1163-1172.
[9] Shafer, G., 1976, A Mathematical Theory of Evidence, Princeton University Press, Princeton.
[10] Sentz, K., and Ferson, S., 2002, "Combination of Evidence in Dempster-Shafer Theory," Sandia Report SAND2002-0835, Albuquerque, NM.
[11] Smets, P., "Belief Functions and the Transferable Belief Model," The Imprecise Probabilities Project. Available at http://ippserv.rug.ac.be.
[12] Berger, J. O., 1985, Decision Theory and Bayesian Analysis, Springer-Verlag, New York, pp. 109-113.
[13] Winkler, R. L., 1972, Introduction to Bayesian Inference and Decision, Holt, Rinehart and Winston, Inc.
[14] Ben-Haim, Y., 2001, Information-Gap Decision Theory: Decisions Under Severe Uncertainty, Academic Press.
[15] Oberkampf, W. L., Helton, J. C., Joslyn, C. A., Wojtkiewicz, S. F., and Ferson, S., "Challenge Problems: Uncertainty in Series Systems Response Given Uncertain Parameters," (this issue).
[16] Klir, G. J., and Wierman, M. J., 1998, Uncertainty-Based Information, Springer-Verlag.
[17] Rocha, L. M., "Relative Uncertainty and Evidence Sets: A Constructivist Framework," International Journal of General Systems, Vol. 26, Nos. 1-2, pp. 35-61.
[18] Melchers, R. E., 1987, Structural Reliability: Analysis and Prediction, Ellis Horwood Limited, West Sussex, England.
[19] Wu, J. S., Apostolakis, G. E., and Okrent, D., 1990, "Uncertainties in System Analysis: Probabilistic Versus Nonprobabilistic Theories," Reliability Engineering and System Safety, 30, pp. 163-181.

Figure Captions

Figure 1: Plausibility of A and B. The belief is zero.
Figure 2: Variables A and B in joint space.
Figure 3: Joint maximum probability (Plausibility) of A and B.
Figure 4: Body of Evidence and Plausibility of variable Y. Belief is zero.
Figure 5: Body of Evidence and Cumulative Plausibility and Belief of Y.
Figure 6: Prior PDFs of variables A and B.
Figure 7: Evidence from experts.
Figure 8: PDFs of errors of experts for input variables.
Figure 9: Likelihood and Posterior Probability Density Functions of variables A and B.
Figure 10: Cumulative Plausibility and Belief along with Cumulative Bayesian PDF for Case 1.
Figure 11: Cumulative Plausibility and Belief along with Cumulative Bayesian PDF for Case 2.
Figure 12: Cumulative Plausibility and Belief along with Cumulative Bayesian PDF for Cases 3, 3a and 3b.
Figure 13: Cumulative Plausibility and Belief along with Cumulative Bayesian PDF for Cases 4 and 4a.
Figure 14: Cumulative Plausibility and Belief along with Cumulative Bayesian PDF for Case 5.
Figure 15: Cumulative Plausibility and Belief along with Cumulative Bayesian PDF for Case 6.
Figure 16: Cumulative Plausibility and Belief along with Cumulative Bayesian PDF for Cases 7 and 7a.
Figure 17: Uncertainty in ET vs. Uncertainty in BT.
Figure 18: Nonspecificity (ET) vs. Strife (ET).
Figure 19: Maximum and Minimum Cumulative Probabilities of variables X and Y.
Figure 20: PDF of X.
Figure 21: Body of Evidence of variable X.
Figure 22: Body of Evidence of variable Y.
Figure 23: Maximum and Minimum Cumulative Probability of Y.
Figure 24: Payoff function graph, Probability vs. Min-Max probability.
Figure 25: Decision Tree.
Figure 26: Objective function vs. FY(y0).

Case | Variable A | Variable B | Sizes and positions of intervals | Conflict | Imprecision
1 | [0.2, 0.5] | [0.3, 0.6] | Wide | None | High
2 | [0.1, 0.5], [0.55, 0.95] | [0, 0.5], [0.52, 1] | Wide, Disjoint | High | High
3, 3a, 3b | [0.1, 0.5], [0.2, 0.6] | [0, 0.5], [0.2, 0.7] | Wide, Overlapping | Low | High
4, 4a | [0.5, 0.52], [0.6, 0.62] | [0.6, 0.62], [0.7, 0.72] | Narrow, Conflicting | High | Low
5 | [0.1, 1.0] | [0.6, 0.8], [0.4, 0.85], [0.2, 0.9], [0.0, 1.0] | Wide, Overlapping | Low | High
6 | [0.5, 0.7], [0.3, 0.8], [0.1, 1.0] | [0.59, 0.61], [0.4, 0.85], [0.2, 0.9], [0.0, 1.0] | Nested | Low | High
7, 7a | [0.8, 1.0], [0.5, 0.7], [0.1, 0.4] | [0.8, 1.0], [0.5, 0.7], [0.1, 0.4], [0.0, 0.2] | Disjoint | High | High

Table 1: Evidence from experts.

Case | Non-specificity (ET) | Strife (ET) | Shannon's entropy (BT)
1 | A = 0.2624, B = 0.2624, Y = 0.1747 | A = 0, B = 0, Y = 0 | A = -1.484, B = -1.586, Y = -1.62
2 | A = 0.37, B = 0.40, Y = 0.417 | A = 1, B = 1, Y = 0.854 | A = -1.53, B = -1.43, Y = -2.08
3 | A = 0.3365, B = 0.4055, Y = 0.2846 | A = 0.193, B = 0.322, Y = 0.308 | A = -1.53, B = -2.043, Y = -2.46
3a | A = 0.3365, B = 0.4055, Y = 0.2846 | A = 0.193, B = 0.322, Y = 0.308 | A = -1.237, B = -2.2173, Y = -3.08
3b | A = 0.3365, B = 0.4055, Y = 0.2846 | A = 0.193, B = 0.322, Y = 0.308 | A = -1.58, B = -2.11, Y = -2.573
4 | A = 0.0198, B = 0.0198, Y = 0.0247 | A = 1, B = 1, Y = 1.924 | A = -3.59, B = -3.87, Y = -4.03
4a | A = 0.0198, B = 0.0198, Y = 0.0247 | A = 1, B = 1, Y = 1.924 | A = -1.84, B = -1.93, Y = -2.35
5 | A = 0.64, B = 0.444, Y = 0.714 | A = 0, B = 0.358, Y = 0.123 | A = -0.169, B = -2.18, Y = -1.76
6 | A = 0.41, B = 0.404, Y = 0.516 | A = 0.36, B = 0.465, Y = 0.375 | A = -1.81, B = -3.56, Y = -2.58
7 | A = 0.209, B = 0.203, Y = 0.249 | A = 1.585, B = 1.75, Y = 1.805 | A = -2.5, B = -2.24, Y = -2.84
7a | A = 0.209, B = 0.203, Y = 0.249 | A = 1.585, B = 1.75, Y = 1.805 | A = -0.46, B = -0.74, Y = -1.08

Table 2: Uncertainty Measures.
ET Approach | BT Approach
Analyst does not need to make any additional assumptions beyond what is already available. | Analyst assumes the prior and the error in the estimates of the experts. These assumptions can affect the results significantly.
Treats uncertainty due to imprecision and conflict separately. Yields maximum and minimum bounds of the probabilities of events, from which the relative magnitude of these two types of uncertainty can be assessed. | Does not distinguish between imprecision and conflict. Gives a single value of the probability of an event.
Reliability of and correlation between experts cannot be taken into account. | Experts' reliability and correlation can be taken into account.
When the intervals provided by the experts are very broad, the gap between the maximum and minimum probabilities of failure can be very large. A decision-maker might be unable to rank alternative designs in terms of their reliability because of this gap. | Since a single value of the probability of failure is provided, it is easier for the decision-maker to rank designs in terms of their reliability.

Table 3: Comparison of ET and BT approaches.

Figure 1: Plausibility of A and B. The belief is zero.
Figure 2: Variables A and B in joint space. (Axes: Variable A and Variable B; focal elements labeled m = 1/2.)
Figure 3: Joint maximum probability (Plausibility) of A and B.
Figure 4: Body of Evidence and Plausibility of variable Y. Belief is zero.
Figure 5: Body of Evidence and Cumulative Plausibility and Belief of Y.
Figure 6: Prior PDFs of variables A and B.
Figure 7: Evidence from experts.
Figure 8: PDFs of errors of experts for input variables.
Figure 9: Likelihood and Posterior Probability Density Functions of variables A and B.
Figure 10: Cumulative Plausibility and Belief along with Cumulative Bayesian PDF for Case 1.
Figure 11: Cumulative Plausibility and Belief along with Cumulative Bayesian PDF for Case 2.
Figure 12: Cumulative Plausibility and Belief along with Cumulative Bayesian PDF for Cases 3, 3a and 3b.
Figure 13: Cumulative Plausibility and Belief along with Cumulative Bayesian PDF for Cases 4 and 4a.
Figure 14: Cumulative Plausibility and Belief along with Cumulative Bayesian PDF for Case 5.
Figure 15: Cumulative Plausibility and Belief along with Cumulative Bayesian PDF for Case 6.
Figure 16: Cumulative Plausibility and Belief along with Cumulative Bayesian PDF for Cases 7 and 7a.
Figure 17: Uncertainty in ET versus Uncertainty in BT. (Axes: Shannon's entropy vs. total uncertainty, i.e., nonspecificity plus strife.)
Figure 18: Nonspecificity (ET) versus Strife (ET).
Figure 19: Maximum and Minimum Cumulative Probabilities of variables X and Y.
Figure 20: PDF of X.
Figure 21: Body of Evidence of variable X.
Figure 22: Body of Evidence of variable Y. (Focal elements: m([0, 1]) = 1/3, m([0.09, 0.49]) = 1/3, m([0.16, 0.36]) = 1/3.)
Figure 23: Maximum and Minimum Cumulative Probability of Y.
Figure 24: Payoff function graph, Probability vs. Min-Max Probability. (John's payoff function vs. the exponent bias factor, from 0.5 to 1.5; curves labeled "John uses probability" and "John uses min-max probability"; cases labeled "Bidder splits interval" and "John splits interval".)
Figure 25: Decision Tree. (John selects y0; Linda selects the left subinterval with probability p(y0), giving objective function 1 - 2FY(y0), or the right subinterval with probability 1 - p(y0), giving objective function 2FY(y0) - 1.)
Figure 26: Objective function vs. FY(y0). (Axes: FY(y0) from 0 to 1; objective function from -1 to 0.)