2.0 Probability Concepts
• definitions: randomness, parent population, random variable, probability, statistical independence, probability of multiple events, derived random variables
• the probability density function, pdf
• the cumulative distribution function, cdf
• discrete pdf for rolling two dice
• continuous pdf for a fluorescence decay

Randomness
• randomness: when repeated measurements vary in an unpredictable way, they are said to be random
• parent population: the complete set of possible random outcomes, often written as a set, {...}
  § finite: enumerable number of outcomes, e.g. tossing a coin {H,T}, rolling a die {1,2,3,4,5,6}, drawing from a shuffled deck of cards
  § infinite: an infinite number of outcomes
    > continuous: real-numbered values such as time or voltage
    > event trigger: tossing a coin until H is observed
  § pseudo-infinite: enumerable but very large, e.g. a mole, which is a pseudo-real number
• random variable: an ordered listing of all possible outcomes of the parent population
• a numeric random variable can be discrete or continuous
• discrete example: whole numbers, multiples of π
• continuous example: any real number between 0 and 1

Probability
• the probability, p, of observing a given value of the random variable is the fraction of a very large number of measurements yielding that outcome
• the sum of the probabilities of all possible outcomes is 1
Example: A coin is tossed 10,000 times yielding 5,013 heads and 4,987 tails. Thus, p(H) = 5013/10000 = 0.5013 and p(T) = 0.4987. Note that p(H) + p(T) = 1.
• theory can often be used: rolling a die has the outcomes {1...6}, each with the same probability p = 1/6; drawing an ace from a deck of playing cards, p = 4/52.
• for discrete random variables each outcome has a finite probability
• for a continuous random variable only ranges of outcomes have non-zero probability
Example: Consider a continuous random variable, x, which has a uniform probability over the range, 0 ≤ x ≤ 1.
p(0.45 ≤ x ≤ 0.55) = (0.55 - 0.45)/(1 - 0) = 0.1
p(0.495 ≤ x ≤ 0.505) = 0.01
....
p(0.49999999995 ≤ x ≤ 0.50000000005) = 0.0000000001

Miscellaneous
Measurements are statistically independent when the knowledge that one outcome has been observed will not influence the outcome of a second observation.
Example: tossing a coin
Counter Example: drawing an ace from a deck of cards (without replacement, each draw changes what remains in the deck)
The probability of observing multiple independent events is given by the product of the individual probabilities.
Example: the probability of tossing a coin to observe H and rolling a die to observe 5 is given by (1/2)×(1/6) = 1/12.
Many random variables used to test hypotheses are derived from one or more random measurements. The manner in which the randomness is transferred during the calculation is of extreme importance.
Example: rolling two dice and summing their face values
Example: computing the area of a circle by measuring its diameter

Probability Functions
The probability density function, f(x), describes how probability is distributed over the random variable. For a discrete random variable, p(x) = f(x), and f(x) has no units. For a continuous random variable, dp(x) = f(x)dx, and f(x) has units of x⁻¹ (the reciprocal of the units of x).
The cumulative distribution function, F(x), describes how probability accumulates as the range of allowed outcomes is increased. The accumulation starts with the first enumerated outcome of the random variable. F(x) has units of probability for both discrete and continuous random variables.
F(x) is described as a sum for discrete random variables and an integral for continuous random variables.
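As a rough numerical illustration of the sum and integral descriptions of F(x), the sketch below accumulates probability for a fair die (discrete) and for the uniform variable on 0 ≤ x ≤ 1 used in the earlier example (continuous). The names `die_pdf`, `discrete_cdf`, `uniform_pdf`, and `continuous_cdf` are hypothetical helpers introduced only for this illustration, and the midpoint-rule integration is an assumption made for simplicity, not a method taken from the slides.

```python
from fractions import Fraction

# Discrete case (assumed example): a fair die, f(x) = 1/6 for x in {1,...,6}.
die_pdf = {x: Fraction(1, 6) for x in range(1, 7)}

def discrete_cdf(pdf, m):
    """F(m) as a sum of f(x) over all enumerated outcomes x <= m."""
    return sum((p for x, p in pdf.items() if x <= m), Fraction(0))

def uniform_pdf(x):
    """Uniform density on 0 <= x <= 1; the value 1.0 carries units of 1/x."""
    return 1.0 if 0.0 <= x <= 1.0 else 0.0

def continuous_cdf(pdf, a, m, steps=10_000):
    """F(m) as a midpoint-rule approximation of the integral of f(x)dx from a to m."""
    dx = (m - a) / steps
    return sum(pdf(a + (i + 0.5) * dx) for i in range(steps)) * dx

# Probability accumulated by the first three faces of the die.
print(discrete_cdf(die_pdf, 3))

# p(0.45 <= x <= 0.55) for the uniform variable, as a difference of two cdf values.
print(continuous_cdf(uniform_pdf, 0.0, 0.55) - continuous_cdf(uniform_pdf, 0.0, 0.45))
```

The discrete result is exact because `Fraction` is used; the continuous result is only approximate and should be close to the value 0.1 quoted in the uniform example.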
Discrete: $$F(m) = \sum_{x=a}^{m \le b} f(x)$$
Continuous: $$F(m) = \int_{x=a}^{m \le b} f(x)\,dx$$
where a ≤ x ≤ b is the range of the random variable.

Discrete PDF - Two Dice
Consider an experiment where a red die and a blue die are rolled. The value of the blue die will be subtracted from the value of the red die. The possible outcomes are x = {-5,-4,-3,-2,-1, 0, 1, 2, 3, 4, 5}.
Probability can be computed by counting the number of ways the value of each outcome can be obtained. -5 can be obtained only one way, (1 - 6), thus f(-5) = 1×(1/6)×(1/6) = 1/36. 0 can be obtained six ways, (1-1)...(6-6), thus f(0) = 6/36.
The pdf can also be written as a two-part function.
$$f(x) = \frac{6 + x}{36}, \qquad -5 \le x \le 0$$
$$f(x) = \frac{6 - x}{36}, \qquad 0 < x \le 5$$
[Figure: pdf f(x) versus x for the two-dice difference, -5 ≤ x ≤ 5]

Discrete CDF - Two Dice
The cdf can be obtained by summing up the individual probabilities, starting with -5.
F(-5) = 1/36
F(-4) = 3/36
F(-3) = 6/36
...
F(0) = 21/36
...
F(5) = 36/36
[Figure: cdf versus x for the two-dice difference, -5 ≤ x ≤ 5]
The cdf can also be written as a two-part function.
$$F(m) = \sum_{x=-5}^{m} \frac{6 + x}{36} = \frac{1}{36} \sum_{y=1}^{m+6} y = \frac{(m+6)(m+7)}{72}, \qquad -5 \le m \le 0$$
$$F(m) = \frac{42}{72} + \sum_{x=1}^{m} \frac{6 - x}{36} = \frac{42}{72} + \sum_{x=1}^{m} \frac{6}{36} - \sum_{x=1}^{m} \frac{x}{36} = \frac{m(11 - m) + 42}{72}, \qquad 0 < m \le 5$$

Probability Calculation - Two Dice
The cdf is used to compute probability over an interval of the random variable. The probability that x falls in the range, a < x ≤ b, is given by F(b) - F(a).
What is the probability that the difference between two dice will have values of -1, 0, or +1?
$$p(-2 < x \le 1) = F(1) - F(-2)$$
$$F(1) = \frac{1(11 - 1) + 42}{72} = \frac{52}{72}$$
$$F(-2) = \frac{(-2 + 6)(-2 + 7)}{72} = \frac{20}{72}$$
$$p(-2 < x \le 1) = \frac{52 - 20}{72} = \frac{32}{72} = \frac{16}{36}$$

Continuous PDF - Exponential Decay
Consider an exponential fluorescence decay that has a lifetime, τ. The intensity of fluorescence as a function of time is normally written as, I(t) = I0×exp(-t/τ) where I0 is the intensity at t = 0.
Now suppose photons are measured and we wish to compute the probability of observing photons at various times, 0 ≤ t ≤ ∞. We first need to write the probability density function, remembering that f(t) has to have units of t⁻¹ so that f(t)dt is unitless. Note that τ has to have units of t so that the exponent is unitless.
$$f(t) = \frac{1}{\tau} \exp(-t/\tau), \qquad 0 \le t \le \infty$$
The graph at the right shows the pdf for a 5-ns fluorescence decay.
[Figure: pdf f(t) versus t (ns) for τ = 5 ns]

Continuous CDF - Exponential Decay
The cdf for an exponential decay can be obtained by integration of the pdf.
$$F(m) = \frac{1}{\tau} \int_{0}^{m} \exp(-t/\tau)\,dt = \left[-\exp(-t/\tau)\right]_{0}^{m} = 1 - \exp(-m/\tau)$$
An important check is to make sure the cdf goes to unity over the range of the random variable, in this case over the range of 0 to ∞.
$$F(\infty) = 1 - \exp(-\infty/\tau) = 1 - 0 = 1$$
The graph at the right shows the cdf for a 5-ns fluorescence decay.
[Figure: cdf F(t) versus t (ns) for τ = 5 ns]

Probability Calculation - Decay
When using time-filtered detection of fluorescence, it is important to collect as many photons as possible. The fraction of photons collected directly affects the sensitivity of the method.
Two interferences prevent one from collecting all of the photons. At short times Rayleigh and Raman scatter will be erroneously added to the signal. In contrast, photomultiplier dark counts will be distributed evenly over time. Thus, the temporal filter might start at 1 ns and end 3τ later.
The cdf can be used to compute the fraction of the fluorescence collected. Do the calculation with τ = 5 ns.
$$p(1 < t < 3 \times 5 + 1) = F(16) - F(1)$$
$$p(1 < t < 3 \times 5 + 1) = 1 - \exp(-16/5) - 1 + \exp(-1/5) = \exp(-1/5) - \exp(-16/5)$$
$$p(1 < t < 3 \times 5 + 1) = 0.82 - 0.04 = 0.78$$
Note that where you start has a greater impact on the sensitivity than where you end. This means that a temporally narrower gate following a laser pulse will dramatically improve the measurement.
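The two worked probability calculations (the two-dice difference and the gated fluorescence decay) can be checked numerically with a short sketch. It assumes Python's standard library only; the names `pdf`, `dice_cdf`, `decay_cdf`, `p_dice`, and `p_gate` are hypothetical and introduced just for this check. The dice pdf is built by enumerating all 36 rolls rather than from the two-part closed form, so it serves as an independent cross-check and not as a restatement of the slides' derivation.

```python
import math
from fractions import Fraction
from itertools import product

# Two-dice difference: enumerate all 36 equally likely (red, blue) rolls.
pdf = {}
for red, blue in product(range(1, 7), repeat=2):
    x = red - blue
    pdf[x] = pdf.get(x, Fraction(0)) + Fraction(1, 36)

def dice_cdf(m):
    """F(m): sum of f(x) over all outcomes x <= m, starting at x = -5."""
    return sum((p for x, p in pdf.items() if x <= m), Fraction(0))

# p(-2 < x <= 1) = F(1) - F(-2), i.e. a difference of -1, 0, or +1.
p_dice = dice_cdf(1) - dice_cdf(-2)

def decay_cdf(m, tau):
    """F(m) = 1 - exp(-m / tau) for an exponential decay with lifetime tau."""
    return 1.0 - math.exp(-m / tau)

tau = 5.0                  # lifetime in ns
start = 1.0                # gate opens at 1 ns
stop = start + 3.0 * tau   # gate closes 3*tau later, at 16 ns
p_gate = decay_cdf(stop, tau) - decay_cdf(start, tau)

print(p_dice)              # 4/9, the reduced form of 16/36
print(round(p_gate, 2))    # 0.78
```

Changing `start` and `stop` separately is a quick way to see the closing remark in numbers: moving the gate opening by a nanosecond changes `p_gate` far more than moving the gate closing by the same amount.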