UNCERTAINTY, MEASUREMENTS, AND ERROR ANALYSIS
Uncertainty—The Description of Random Events
A. Some introductory definitions
1. Event/realization: the rolling of a pair of dice, the taking of a measurement, the performing of an
experiment
2. Outcome: the result of rolling the dice, taking the measurement, etc.
3. Deterministic event: an event whose outcome can be predicted realization after realization, e.g.,
the measured length of a table to the nearest cm.
4. Random event/process/variable: an event/process that is not and cannot be made exact and,
consequently, whose outcome cannot be predicted, e.g. the sum of the numbers on two rolled
dice.
5. Probability: an estimate of the likelihood that a random event will produce a certain outcome.
B. What's deterministic and what's random depends on the degree to which you take into account all the relevant parameters.
1. Mostly deterministic—only a small fraction of the outcome cannot be accounted for
a) The measurement of the length of a table—temperature/humidity variation, measurement
resolution, instrument/observer error; and, at the quantum level, intrinsic uncertainty.
b) The measurement of a volume of water—evaporation, meniscus, temperature dependency.
2. Half and half—a significant fraction of an outcome cannot be accounted for
a) The measurement of tidal height in an ocean basin—measurement is contaminated by effects
from wind and boats.
3. Mostly random—most of an outcome cannot be accounted for
a) The sum of the numbers on two rolled dice
b) The trajectory of a given molecule in a solution
c) The stock market
d) Fluctuations in water pressure in a municipal water supply
C. The “sample space”—the set of all possible outcomes
1. A rolled die: 1, 2, 3, 4, 5, 6 (discrete outcome)
2. A flipped coin: H, T (discrete outcome)
3. The measured width of a room (continuous outcome)
4. The measured period of a pendulum (continuous outcome)
D. Statistical regularity—if a random event is repeated many times, it will produce a distribution of
outcomes. This distribution will approach an asymptotic form as the number of events increases.
1. If the distribution is represented as the number of occurrences of each outcome, the distribution is
called the frequency distribution function.
2. If the distribution is represented as the percentage of occurrences of each outcome, the
distribution is called the probability distribution function.
3. We attribute probability to the asymptotic distribution of outcome occurrences.
a) A “fair” die will produce each of its six outcomes with equal likelihood; therefore we say that
the probability of getting a “1” is 1/6.
E. Description of a random variable X
1. Distribution of discrete outcomes
a) \Pr(X = x_i) = f(x_i) → the probability that the variable X takes the value x_i is a function of the value x_i.
b) f(x_i) is called the probability distribution function
[Figure: probability distribution function f(x_i) plotted against x, for outcomes x_1 through x_N]
c) Properties of discrete probabilities
i. \Pr(X = x_i) = f(x_i) \ge 0 for all i
ii. \sum_{i=1}^{k} \Pr(X = x_i) = \sum_{i=1}^{k} f(x_i) = 1 for k possible discrete outcomes
iii. \Pr(a < X \le b) = F(b) - F(a) = \sum_{a < x_i \le b} f(x_i)
d) Cumulative discrete probability distribution function
\Pr(X \le x') = F(x') = \sum_{i=1}^{j} f(x_i), where x_j is the largest discrete value of X less than or equal to x'. \Pr(X \le x_k) = 1.
[Figure: cumulative distribution function F(x) vs. x, stepping up from x_1 and reaching 1 at x_k]
e) Examples of discrete probability distribution functions
i. Distribution function for throwing a die:
f(x_i) = 1/6 for i = 1, ..., 6
ii. Distribution function for the sum of two thrown dice:
f(x_1) = 1/36 for x_1 = 2
f(x_2) = 2/36 for x_2 = 3
f(x_3) = 3/36 for x_3 = 4
f(x_4) = 4/36 for x_4 = 5
f(x_5) = 5/36 for x_5 = 6
f(x_6) = 6/36 for x_6 = 7
f(x_7) = 5/36 for x_7 = 8
f(x_8) = 4/36 for x_8 = 9
f(x_9) = 3/36 for x_9 = 10
f(x_10) = 2/36 for x_10 = 11
f(x_11) = 1/36 for x_11 = 12
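The statistical regularity described in D can be checked numerically. Below is a minimal Python sketch (standard library only; the function name two_dice_trials and the sample sizes are illustrative choices, not part of the original notes) that simulates the sum of two dice and compares the empirical frequencies with the f(x_i) tabulated above. As N grows, the frequency distribution approaches the probability distribution.

    # Sketch: statistical regularity for the sum of two dice (standard library only).
    # Empirical frequencies approach the theoretical f(x_i) as the number of trials grows.
    import random
    from collections import Counter

    def two_dice_trials(n, seed=0):
        rng = random.Random(seed)
        return Counter(rng.randint(1, 6) + rng.randint(1, 6) for _ in range(n))

    # Theoretical distribution from the table above: 1/36, 2/36, ..., 6/36, ..., 1/36
    theory = {s: (6 - abs(s - 7)) / 36 for s in range(2, 13)}

    for n in (100, 10_000, 1_000_000):
        counts = two_dice_trials(n)
        print(f"N = {n}")
        for s in range(2, 13):
            print(f"  sum {s:2d}: empirical {counts[s] / n:.4f}   theoretical {theory[s]:.4f}")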
2. Distribution of continuous outcomes
a) Cumulative distribution function
\Pr(X \le x) = F(x) = \int_{-\infty}^{x} f(x')\,dx'
[Figure: cumulative distribution function F(x) vs. x, rising to 1.0]
b) Probability density (distribution) function (p.d.f.)
f(x) = dF(x)/dx
[Figure: probability density function f(x) vs. x; total area under the curve = 1.0]
c) Properties of F(x) and f(x)
i. F(-∞) = 0,  0 ≤ F(x) ≤ 1,  F(∞) = 1
ii. \Pr(a < X \le b) = F(b) - F(a) = \int_{a}^{b} f(x)\,dx
d) Examples of continuous p.d.f.s
i. "Top hat" or uniform distribution:
f(x) = 1/(2a) for |x| ≤ a
f(x) = 0 for |x| > a
ii. Gaussian distribution:
f(x) = \frac{1}{\sqrt{2\pi}\,\sigma} \exp\left[-\frac{(x-\mu)^2}{2\sigma^2}\right], where μ and σ are given constants
iii. Poisson and Binomial are other important named distributions
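As an illustration of using F(b) − F(a) with a named p.d.f., here is a small Python sketch (using the standard-library error function; the values of mu, sigma, a, and b are made-up examples, not from the notes) that evaluates Pr(a < X ≤ b) for a Gaussian distribution.

    # Sketch: Pr(a < X <= b) = F(b) - F(a) for a Gaussian with mean mu and std dev sigma.
    from math import erf, sqrt

    def gaussian_cdf(x, mu, sigma):
        """Cumulative distribution function F(x) of the Gaussian p.d.f."""
        return 0.5 * (1.0 + erf((x - mu) / (sigma * sqrt(2.0))))

    mu, sigma = 10.0, 2.0          # illustrative constants
    a, b = mu - sigma, mu + sigma  # one standard deviation either side of the mean
    prob = gaussian_cdf(b, mu, sigma) - gaussian_cdf(a, mu, sigma)
    print(f"Pr({a} < X <= {b}) = {prob:.4f}")   # about 0.6827 for +/- one sigma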
3. Another way of representing a distribution function: moments
a) The rth moment about the origin: \nu_r = \sum_{i=1}^{k} x_i^r f(x_i)
i. The 1st moment about the origin is the mean: \mu = \nu_1 = \sum_{i=1}^{k} x_i f(x_i)
b) The 2nd moment about the mean, \mu_2, is the variance: \sigma^2 = \sum_{i=1}^{k} (x_i - \mu)^2 f(x_i)
c) For many distribution functions, knowing all of the moments of f(x) is equivalent to knowing
f(x) itself.
4. Important moments
a) The mean μ, the “center of gravity”
b) The variance σ²: a measure of spread
i. The standard deviation: \sigma = \sqrt{\sigma^2}
c) The skewness \mu_3 / [\sigma^2]^{3/2}: a measure of asymmetry
d) The kurtosis \mu_4 / [\sigma^2]^2: a measure of "peakedness"
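A short Python sketch of these definitions, applied to the two-dice distribution from the earlier example (the variable names are illustrative only), computes the mean, variance, skewness, and kurtosis directly from f(x_i).

    # Sketch: moments of a discrete distribution, using the definitions above.
    # Illustrated with the sum-of-two-dice distribution f(x_i).
    f = {s: (6 - abs(s - 7)) / 36 for s in range(2, 13)}

    mu = sum(x * p for x, p in f.items())                    # 1st moment about the origin
    var = sum((x - mu) ** 2 * p for x, p in f.items())       # 2nd moment about the mean
    mu3 = sum((x - mu) ** 3 * p for x, p in f.items())
    mu4 = sum((x - mu) ** 4 * p for x, p in f.items())

    sigma = var ** 0.5
    skewness = mu3 / var ** 1.5       # mu_3 / (sigma^2)^(3/2)
    kurtosis = mu4 / var ** 2         # mu_4 / (sigma^2)^2

    print(f"mean {mu:.4f}, variance {var:.4f}, std dev {sigma:.4f}")
    print(f"skewness {skewness:.4f}, kurtosis {kurtosis:.4f}")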
F. Estimation of random variables (RVs)
1. Assumptions/procedures
a) There exists a stable underlying p.d.f. for the RV
b) Investigating the characteristics of the RV consists of obtaining sample outcomes and making
inferences about the underlying distribution
2. Sample statistics on a random variable X (or from a large sample “population”)
a) The ith sample outcome of the variable X is denoted Xi
b) The sample mean: \bar{X} = \frac{1}{N} \sum_{i=1}^{N} X_i, where N is the sample size
c) The sample variance: s^2 = \frac{1}{N-1} \sum_{i=1}^{N} (X_i - \bar{X})^2
d) \bar{X} and s² are only estimates of the mean μ and variance σ² of the underlying p.d.f. (\bar{X} and s² are estimates computed from the sample; μ and σ² are characteristics of the population from which the sample was taken)
e) \bar{X} and s² are themselves random variables. As such, they have their own means and variances (which can be calculated)
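A minimal Python sketch of these sample statistics (the population parameters, sample size, and seed below are made-up illustrations): repeated samples from the same population give different values of the sample mean and sample variance, showing that they are themselves random variables.

    # Sketch: sample mean X-bar and sample variance s^2 (with the N-1 divisor)
    # as estimates of the population mu and sigma^2.
    import random

    def sample_stats(data):
        n = len(data)
        xbar = sum(data) / n
        s2 = sum((x - xbar) ** 2 for x in data) / (n - 1)
        return xbar, s2

    rng = random.Random(1)
    mu, sigma = 5.0, 0.5                 # population parameters (illustrative)
    for trial in range(3):
        sample = [rng.gauss(mu, sigma) for _ in range(20)]   # N = 20 measurements
        xbar, s2 = sample_stats(sample)
        print(f"trial {trial}: X-bar = {xbar:.3f}, s^2 = {s2:.3f}")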
3. Expected value
a) The expected value of a random variable X is written E(X). It is the value obtained if a very large
number of samples were averaged together.
b) As defined above:
i. 𝐸[𝑋̅] = 𝜇, i.e., the expected value of the sample mean is the population mean
ii. 𝐸[𝑠 2 ] = 𝜎 2 , i.e., the expected value of the sample variance is the population variance
c) Expectation allows us to use sample statistics to infer population statistics
d) Properties of expectation:
i. 𝐸[𝑎𝑋 + 𝑏𝑌] = 𝑎𝐸[𝑋] + 𝑏𝐸[𝑌] where a, b are constants
ii. If Z = g(X), then E[Z] = E[g(X)] = \sum_{\text{all values } x \text{ of } X} g(x)\,\Pr(X = x)
Example: Throw a die. If the die shows a "6" you win $5; else, you lose $1. What's the expected value Z of this "game"? (A numerical sketch follows at the end of this list.)
Pr(X=1) = 1/6, g(1) = -1
Pr(X=2) = 1/6, g(2) = -1
Pr(X=3) = 1/6, g(3) = -1
Pr(X=4) = 1/6, g(4) = -1
Pr(X=5) = 1/6, g(5) = -1
Pr(X=6) = 1/6, g(6) = 5
E[Z] = (-1)·5·(1/6) + 5·(1/6) = 0, i.e., you would expect to neither win nor lose
iii. E[XY] = E[X]E[Y] provided X and Y are "independent", i.e., samples of X cannot be used to predict anything about samples of Y (and vice versa)
Example: You have error-prone measurements for the height X and width Y of a picture. What
is the expected value of the area of the picture XY? Answer: E[X]E[Y]
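The numerical sketch for the die-game example above (Python; the payoff function g mirrors the notation in the notes, while the script itself is only an illustration):

    # Sketch: E[Z] = sum over all values x of X of g(x) * Pr(X = x), with Pr(X = x) = 1/6.
    from fractions import Fraction

    def g(x):
        return 5 if x == 6 else -1       # payoff: win $5 on a six, lose $1 otherwise

    expected_z = sum(g(x) * Fraction(1, 6) for x in range(1, 7))
    print(f"E[Z] = {expected_z}")        # 0 -> expect to neither win nor lose on average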
G. Selected engineering uses of statistics
1. Measurement and errors
a) “Best” estimate of value based on error-prone measurements, e.g. three different
measurements give three different answers
b) Compound measurements each of which is error-prone, e.g., the volume of a box whose sides
are measured with error
c) Standard error (standard deviation) of “best” estimate, i.e., error bars
d) Techniques to reduce the standard error
i. Repeated measurements
ii. Different measurement strategy
2. Characterizing random populations
a) Distribution of traffic accidents at an intersection
b) Quantity of lumber in a forest
c) Distribution of "seconds" on an assembly line
d) Voltage fluctuation characteristics on a transmission line
Measurements and Significant Figures
A. All measurements have errors
1. Instrument Error
2. Measurement Error
3. Variations in item being measured
B. Estimating and Accuracy – when measuring tool graduations require estimation
1. Estimate with a single reading (take ½ the smallest division)
2. Independently measure several times and take an average – try to make each trial independent of
previous measurement (different ruler, different observer)
C. Accuracy vs. Precision
D. Significant Figures
1. Actual Measured Values
2. Defined Numbers:
a) Unit conversions, e.g. 2.54 cm in one inch
b) Pi
c) e, base of natural logarithms
d) Integers, e.g. counting, what calendar year
e) Rational fractions, e.g. 2/5
3. EXACT NUMBERS HAVE AN INFINITE NUMBER OF SIGNIFICANT FIGURES
4. Rounding:
a) If you do not round after a computation, you imply a greater accuracy than you actually measured
b) Determine how many digits you will keep
c) Look at the first rejected digit
i. If the digit is less than 5, round down
ii. If the digit is more than 5, round up
iii. If the digit is 5, round up or down in order to leave an even number as your last significant figure (see the sketch at the end of this section)
d) Multiplication or Division - the # of sig figs in the result is equal to the # of sig figs in the least accurate value used in the computation
e) Addition or Subtraction - place of the last sig fig is important
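A sketch of the round-half-to-even rule in c) iii, using Python's standard decimal module (the specific values are illustrative; the module is simply one convenient way to apply the rule, not part of the original notes):

    # Sketch: "round half to even" with the decimal module.
    # The quantize() exponent selects how many digits to keep.
    from decimal import Decimal, ROUND_HALF_EVEN

    for value in ("2.345", "2.355", "2.351"):
        rounded = Decimal(value).quantize(Decimal("0.01"), rounding=ROUND_HALF_EVEN)
        print(f"{value} -> {rounded}")
    # 2.345 -> 2.34 (rejected digit is 5: round down to leave the even digit 4)
    # 2.355 -> 2.36 (rejected digit is 5: round up to leave the even digit 6)
    # 2.351 -> 2.35 (first rejected digit 1 is less than 5: round down)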
Error Analysis
A. Measured value of x = x_best ± δx
x_best = best estimate or measurement of x
δx = uncertainty or error in the measurement
B. Error – difference between an observed/measured value and a true value.
1. We usually don’t know the true value
2. We usually do have an estimate
C. Systematic Errors
1. Faulty calibration, incorrect use of instrument
2. User bias
3. Change in conditions – e.g., temperature rise
D. Random Errors
1. Statistical variation
2. Small errors of measurement
3. Mechanical vibrations in apparatus
E. Percent Error
F. Relative Error
G. What is the error if you add or subtract numbers?
The absolute error is the sum of the absolute errors (an upper bound):
for w = x ± y ± z,  δw = δx + δy + δz
H. What is the error if you multiply or divide?
The relative error is the sum of the relative errors (an upper bound):
δw/w = δx/x + δy/y + δz/z
I. What is the error if you have a power law?
The relative error of the result is the exponent times the relative error:
for w = x^n,  δw/w = n·(δx/x)
J. What is the error when you use trigonometric functions?
For w = sin(x):  δw = sin(x + δx) − sin(x)
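A short Python sketch applying the worst-case rules above (the measurements, their errors, and the function names are made-up illustrations):

    # Sketch: worst-case ("upper bound") error propagation for sums/products/powers.
    def add_sub_error(*abs_errors):
        """delta_w for w = x +/- y +/- z: absolute errors add."""
        return sum(abs_errors)

    def mul_div_error(w, pairs):
        """delta_w for products/quotients: relative errors add. pairs = [(x, dx), ...]"""
        rel = sum(dx / abs(x) for x, dx in pairs)
        return abs(w) * rel

    # Example: area A = L * W with L = 20.0 +/- 0.1 cm and W = 10.0 +/- 0.1 cm
    L, dL = 20.0, 0.1
    W, dW = 10.0, 0.1
    A = L * W
    dA = mul_div_error(A, [(L, dL), (W, dW)])
    print(f"A = {A:.1f} +/- {dA:.1f} cm^2")      # 200.0 +/- 3.0

    # Power law: w = x**3 with x = 5.0 +/- 0.05 -> relative error is 3 * (0.05 / 5.0)
    x, dx = 5.0, 0.05
    w = x ** 3
    dw = w * 3 * (dx / x)
    print(f"w = {w:.1f} +/- {dw:.2f}")           # 125.0 +/- 3.75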
OPTIONAL DISCUSSION: THE CALCULUS OF ERRORS
Suppose you want to calculate the volume of a structure that consists of a cone resting upon a rectangular
parallelepiped. The total volume of this structure is:
V = V_c + V_p = \frac{1}{3}\pi R^2 H_c + LWH
where V_c is the volume of the cone (base radius R, height H_c) and V_p is the volume of the rectangular parallelepiped (length L, width W, height H).
You will not measure V directly, but rather you will calculate V by taking measurements of R, Hc, L, W, and H. But suppose these measurements are not perfectly accurate. The question, then, is how much error you will induce in your calculation of V by using these inaccurate values for the measured variables.
If each measurement is in error by its own Δ, then the calculated volume would consist of the true
volume V plus an error ΔV. The relation between the error-borne measurements and the
resulting calculated volume would be:
V + \Delta V = \frac{1}{3}\pi (R + \Delta R)^2 (H_c + \Delta H_c) + (L + \Delta L)(W + \Delta W)(H + \Delta H)
Expanding this equation, then subtracting out the equation for V, one obtains:
\Delta V = \frac{1}{3}\pi \left[ 2R H_c \Delta R + R^2 \Delta H_c + H_c (\Delta R)^2 + 2R \Delta R \Delta H_c + \Delta H_c (\Delta R)^2 \right] + HW\Delta L + HL\Delta W + LW\Delta H + H\Delta L\Delta W + L\Delta H\Delta W + W\Delta H\Delta L + \Delta L\Delta W\Delta H
If ΔHc, ΔR, ΔL, ΔW, and ΔH are all small, then terms containing two or more of these Δs will be much
smaller than terms containing only a single Δ. Consequently, if we ignore these smaller terms, ΔV
can be approximated as:
\Delta V \approx \frac{1}{3}\pi \left( 2R H_c \Delta R + R^2 \Delta H_c \right) + HW\Delta L + HL\Delta W + LW\Delta H
Thus, if we have some estimate for the errors in our measurements, we can use the above expression
to evaluate the error in our calculated V.
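For instance, a short Python sketch (with made-up values for the dimensions and their measurement errors) compares the exact change in V with the linearized approximation above:

    # Sketch: exact vs. linearized Delta-V for the cone-on-parallelepiped example.
    from math import pi

    def volume(R, Hc, L, W, H):
        return (1.0 / 3.0) * pi * R**2 * Hc + L * W * H

    R, Hc, L, W, H = 2.0, 3.0, 4.0, 5.0, 6.0             # "true" values (illustrative)
    dR, dHc, dL, dW, dH = 0.02, 0.03, 0.04, 0.05, 0.06   # measurement errors (illustrative)

    V = volume(R, Hc, L, W, H)
    exact_dV = volume(R + dR, Hc + dHc, L + dL, W + dW, H + dH) - V
    approx_dV = (1.0 / 3.0) * pi * (2 * R * Hc * dR + R**2 * dHc) \
                + H * W * dL + H * L * dW + L * W * dH
    print(f"V             = {V:.4f}")
    print(f"exact dV      = {exact_dV:.4f}")
    print(f"linearized dV = {approx_dV:.4f}")   # close, because the Deltas are small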
Another way of representing this error is by percentages. If the total volume V is separated into its
constituent pieces V=Vc+Vp, where the subscripts c and p refer to the cone and parallelepiped,
respectively, then the above equation can be rewritten as:
\Delta V = V_c \left( 2\frac{\Delta R}{R} + \frac{\Delta H_c}{H_c} \right) + V_p \left( \frac{\Delta H}{H} + \frac{\Delta W}{W} + \frac{\Delta L}{L} \right)
This equation shows that percentage errors in the parallelepiped measurements, e.g., ∆H/H, are linearly additive with weight V_p, whereas a percentage error in the measurement of R is doubly additive with weight V_c.
This error calculation can be presented more formally as follows: If F is a differentiable function
depending on n variables x1, x2,…, xn, then infinitesimal variations in F are determined by infinitesimal
variations in the x_i as:
dF(x_1, x_2, \ldots, x_n) = \frac{\partial F}{\partial x_1} dx_1 + \frac{\partial F}{\partial x_2} dx_2 + \cdots + \frac{\partial F}{\partial x_n} dx_n
dF is called the "total differential". For additional references, look under "total differential" or "calculus of errors" in elementary calculus books.
dF(x_1, x_2, \ldots, x_n) is the actual error in F associated with the errors dx_i in a single set of measurements. If all our measurement errors have the same sign, then dF(x_1, x_2, \ldots, x_n) is the maximum error we can have. But measurement errors are typically not all of the same sign. So we can ask the question: what is the typical error we might expect in F? To answer that question, we must make three assumptions: if we were to carry out the measurements many times, 1) the average value of each error is zero, \overline{dx_i} = 0 (i.e., the errors are unbiased); 2) the errors dx_i do not depend on the values of the measurements x_i; and 3) the correlation between the measurement errors is zero. In a very compact notation this last assumption can be expressed as \overline{dx_i\,dx_j} = 0 for i ≠ j. Finally, if i = j, then we can write \overline{dx_i^2} = \sigma_i^2, i.e., the variance of the ith measurement error.
The “squared error” of 𝑑𝐹(𝑥1 , 𝑥2 , … , 𝑥𝑛 ) is:
dF^2 = \left(\frac{\partial F}{\partial x_1}\right)^2 (dx_1)^2 + \left(\frac{\partial F}{\partial x_2}\right)^2 (dx_2)^2 + \cdots + \left(\frac{\partial F}{\partial x_n}\right)^2 (dx_n)^2 + \frac{\partial F}{\partial x_1}\frac{\partial F}{\partial x_2} dx_1 dx_2 + \frac{\partial F}{\partial x_1}\frac{\partial F}{\partial x_3} dx_1 dx_3 + \cdots + \frac{\partial F}{\partial x_{n-1}}\frac{\partial F}{\partial x_n} dx_{n-1} dx_n
Now consider the “mean squared error”. This is written as:
\overline{dF^2} = \overline{\left(\frac{\partial F}{\partial x_1}\right)^2 (dx_1)^2} + \overline{\left(\frac{\partial F}{\partial x_2}\right)^2 (dx_2)^2} + \cdots + \overline{\left(\frac{\partial F}{\partial x_n}\right)^2 (dx_n)^2} + \overline{\frac{\partial F}{\partial x_1}\frac{\partial F}{\partial x_2} dx_1 dx_2} + \overline{\frac{\partial F}{\partial x_1}\frac{\partial F}{\partial x_3} dx_1 dx_3} + \cdots + \overline{\frac{\partial F}{\partial x_{n-1}}\frac{\partial F}{\partial x_n} dx_{n-1} dx_n}
Since we have assumed that the measurement errors do not depend on the values of the
measurements themselves, we can separate the averaging of each term into two parts as:
\overline{dF^2} = \left(\frac{\partial F}{\partial x_1}\right)^2 \overline{(dx_1)^2} + \left(\frac{\partial F}{\partial x_2}\right)^2 \overline{(dx_2)^2} + \cdots + \left(\frac{\partial F}{\partial x_n}\right)^2 \overline{(dx_n)^2} + \frac{\partial F}{\partial x_1}\frac{\partial F}{\partial x_2} \overline{dx_1 dx_2} + \frac{\partial F}{\partial x_1}\frac{\partial F}{\partial x_3} \overline{dx_1 dx_3} + \cdots + \frac{\partial F}{\partial x_{n-1}}\frac{\partial F}{\partial x_n} \overline{dx_{n-1} dx_n}
Because \overline{dx_i\,dx_j} = 0 for i ≠ j, all the cross terms disappear and we end up with:
\overline{dF^2} = \left(\frac{\partial F}{\partial x_1}\right)^2 \sigma_1^2 + \left(\frac{\partial F}{\partial x_2}\right)^2 \sigma_2^2 + \cdots + \left(\frac{\partial F}{\partial x_n}\right)^2 \sigma_n^2
So, the “expected” or “root-mean-squared” error in F is:
\sqrt{\overline{dF^2}} = \sqrt{\left(\frac{\partial F}{\partial x_1}\right)^2 \sigma_1^2 + \left(\frac{\partial F}{\partial x_2}\right)^2 \sigma_2^2 + \cdots + \left(\frac{\partial F}{\partial x_n}\right)^2 \sigma_n^2}