
1 - Introduction
2 - Exploratory Data Analysis
3 - Probability Theory
4 - Classical Probability Distributions
5 - Sampling Distributions / Central Limit Theorem
6 - Statistical Inference
7 - Correlation and Regression
(8 - Survival Analysis)

What is the connection between probability and random variables?
Events (and their corresponding probabilities) that involve experimental measurements can be described by random variables.

POPULATION: random variable X
Example: X = Cholesterol level (mg/dL)

  Pop values xi    Probabilities p(xi)
  x1               p(x1)
  x2               p(x2)
  x3               p(x3)
  ⋮                ⋮
  Total            1

SAMPLE of size n: x1, x2, x3, x4, x5, x6, …etc…, xn

  Data values xi   Relative Frequencies p(xi) = fi / n
  x1               p(x1)
  x2               p(x2)
  x3               p(x3)
  ⋮                ⋮
  xk               p(xk)
  Total            1

Probability Histogram
The "density" f(x) is the height of each bar, $\Delta x$ is its width, and the probability p(x) is its area: $p(x) = f(x)\,\Delta x$. Total Area = 1.

p(x) = Probability that the random variable X is equal to a specific value x, i.e.,
$p(x) = P(X = x)$, the "probability mass function" (pmf).

Consider the following discrete random variable…
Example: X = "value shown on a single random toss of a fair die (1, 2, 3, 4, 5, 6)"
X is said to be uniformly distributed over the values 1, 2, 3, 4, 5, 6.

  Probability Table
  x        p(x)
  1        1/6
  2        1/6
  3        1/6
  4        1/6
  5        1/6
  6        1/6
  Total    1

Probability Histogram: density f(x) = P(X = x), six bars of height 1/6, Total Area = 1.

"What is the probability of rolling a 4?"
$p(4) = P(X = 4) = \frac{1}{6}$

F(x) = Probability that the random variable X is less than or equal to a specific value x, i.e.,
$F(x) = P(X \le x)$, the "cumulative distribution function" (cdf).

Motivation ~ Consider again the fair die, uniformly distributed over the values 1, 2, 3, 4, 5, 6.

  Cumulative distribution
  x        p(x) = P(X = x)    F(x) = P(X ≤ x)
  1        1/6                1/6
  2        1/6                2/6
  3        1/6                3/6
  4        1/6                4/6
  5        1/6                5/6
  6        1/6                1
  Total    1

The graph of F(x) is a "staircase graph" that increases from 0 to 1.

POPULATION: random variable X
Example: X = Cholesterol level (mg/dL)

  Pop values x    pmf p(x)    cdf F(x) = P(X ≤ x)
  x1              p(x1)       F(x1) = p(x1)
  x2              p(x2)       F(x2) = p(x1) + p(x2)
  x3              p(x3)       F(x3) = p(x1) + p(x2) + p(x3)
  ⋮               ⋮           ⋮
  Total           1           (increases from 0 to 1)

Calculating "interval probabilities"…
$F(b) = P(X \le b)$
$F(a^-) = P(X \le a^-)$, where $a^-$ is the point marked just below a on the X axis.
$F(b) - F(a^-) = P(X \le b) - P(X \le a^-) = P(a \le X \le b) = \sum_{x=a}^{b} p(x)$

FUNDAMENTAL THEOREM OF CALCULUS (discrete form)
Continuous form: $\int_a^b f(x)\,dx = F(b) - F(a)$
Discrete form: $\sum_{x=a}^{b} f(x)\,\Delta x = F(b) - F(a^-)$, where $f(x)\,\Delta x = p(x)$.

POPULATION: random variable X
Example: X = Cholesterol level (mg/dL)

  Pop values x    Probabilities (pmf) p(x)
  x1              p(x1)
  x2              p(x2)
  x3              p(x3)
  ⋮               ⋮
  Total           1

Just as the sample mean $\bar{x}$ and sample variance $s^2$ were used to characterize "measure of center" and "measure of spread" of a dataset, we can now define the "true" population mean $\mu$ and population variance $\sigma^2$, using probabilities.
• Population mean $\mu = \sum x\,p(x)$
  Also denoted by E[X], the "expected value" of the variable X.
• Population variance $\sigma^2 = \sum (x - \mu)^2\,p(x)$

Example 1:
POPULATION: random variable X
Example: X = Cholesterol level (mg/dL)

  Pop values xi    Probabilities p(xi)
  210              1/6
  240              1/3
  270              1/2
  Total            1

$\mu = \sum x\,p(x) = (210)(1/6) + (240)(1/3) + (270)(1/2) = 250$
$\sigma^2 = \sum (x - \mu)^2\,p(x) = (-40)^2(1/6) + (-10)^2(1/3) + (20)^2(1/2) = 500$

Example 2:
POPULATION: random variable X
Example: X = Cholesterol level (mg/dL)
Equally likely outcomes result in a "uniform distribution."

  Pop values xi    Probabilities p(xi)
  180              1/3
  210              1/3
  240              1/3
  Total            1

$\mu = \sum x\,p(x) = (180)(1/3) + (210)(1/3) + (240)(1/3) = 210$ (clear from symmetry)
$\sigma^2 = \sum (x - \mu)^2\,p(x) = (-30)^2(1/3) + (0)^2(1/3) + (30)^2(1/3) = 600$

To summarize…

POPULATION: Discrete random variable X
  Probability Table: pop values xi with pmf p(xi), summing to 1
  Probability Histogram: Total Area = 1
  $\mu = \sum x\,p(x)$
  $\sigma^2 = \sum (x - \mu)^2\,p(x)$

SAMPLE of size n: x1, x2, x3, x4, x5, x6, …etc…, xn
  Frequency Table: data values x1, …, xk with relative frequencies p(xi) = fi / n, summing to 1
  Density Histogram: Total Area = 1
  $\bar{x} = \sum x\,p(x)$
  $s^2 = \frac{n}{n-1} \sum (x - \bar{x})^2\,p(x)$

(The same framework is revisited later for a continuous random variable X.)

One final example…

Example 3: TWO INDEPENDENT POPULATIONS

  X1 = Cholesterol level (mg/dL)         X2 = Cholesterol level (mg/dL)
  x        p1(x)                         x        p2(x)
  210      1/6                           180      1/3
  240      1/3                           210      1/3
  270      1/2                           240      1/3
  Total    1                             Total    1
  $\mu_1 = 250$, $\sigma_1^2 = 500$             $\mu_2 = 210$, $\sigma_2^2 = 600$

D = X1 - X2 ~ ???

The outcomes of D are NOT EQUALLY LIKELY! Counting outcomes alone would suggest 1/9, 2/9, 3/9, 2/9, 1/9, which is wrong. Each outcome (x1, x2) has probability p1(x1) p2(x2), via independence.

  d      Outcomes (x1, x2)                      Probabilities p(d)
  -30    (210, 240)                             (1/6)(1/3) = 1/18
  0      (210, 210), (240, 240)                 (1/6)(1/3) + (1/3)(1/3) = 3/18
  +30    (210, 180), (240, 210), (270, 240)     (1/6)(1/3) + (1/3)(1/3) + (1/2)(1/3) = 6/18
  +60    (240, 180), (270, 210)                 (1/3)(1/3) + (1/2)(1/3) = 5/18
  +90    (270, 180)                             (1/2)(1/3) = 3/18

Probability Histogram: bars of height 1/18, 3/18, 6/18, 5/18, 3/18 at d = -30, 0, +30, +60, +90.

$\mu_D = (-30)(1/18) + (0)(3/18) + (30)(6/18) + (60)(5/18) + (90)(3/18) = 40$, so $\mu_D = \mu_1 - \mu_2$.
$\sigma_D^2 = (-70)^2(1/18) + (-40)^2(3/18) + (-10)^2(6/18) + (20)^2(5/18) + (50)^2(3/18) = 1100$, so $\sigma_D^2 = \sigma_1^2 + \sigma_2^2$.

General: TWO INDEPENDENT POPULATIONS
Mean(X1 - X2) = Mean(X1) - Mean(X2)
Var(X1 - X2) = Var(X1) + Var(X2)
IF the two populations are dependent, then the formula for the mean still holds, BUT
Var(X1 - X2) = Var(X1) + Var(X2) - 2 Cov(X1, X2).
These two formulas are valid for continuous as well as discrete distributions.

NOTICE TO STAT 324
• Slides 29-41 contain more details on properties of Expected Values. They are not required for Stat 324, but if you are experiencing difficulty with the formulas, you may find them of some benefit.
• Special note regarding Slide 41: Similar to the "alternate computational formula" for sample variance $s^2$, such a formula also exists for population variance $\sigma^2$, derived there.
Stat 324 material picks up with the Binomial Distribution.
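The arithmetic in Examples 1 to 3 can be checked numerically. What follows is a minimal sketch in R, the language the slides use elsewhere; the helper names `pop_mean` and `pop_var` are illustrative and do not come from the slides. It builds the pmf of D = X1 - X2 by multiplying marginal probabilities, which is valid only under the independence assumption stated in Example 3.

```r
# Population mean and variance from a pmf table (helper names are illustrative)
pop_mean <- function(x, p) sum(x * p)
pop_var  <- function(x, p) sum((x - pop_mean(x, p))^2 * p)

# Example 1 and Example 2 populations
x1 <- c(210, 240, 270); p1 <- c(1/6, 1/3, 1/2)
x2 <- c(180, 210, 240); p2 <- c(1/3, 1/3, 1/3)

mu1 <- pop_mean(x1, p1); var1 <- pop_var(x1, p1)   # 250 and 500
mu2 <- pop_mean(x2, p2); var2 <- pop_var(x2, p2)   # 210 and 600

# Example 3: D = X1 - X2, joint probabilities via independence
d_vals  <- as.vector(outer(x1, x2, "-"))   # all 9 differences x1 - x2
d_probs <- as.vector(outer(p1, p2, "*"))   # matching products p1(x1) * p2(x2)
tab <- tapply(d_probs, d_vals, sum)        # add up probabilities of equal differences

d   <- as.numeric(names(tab))              # -30, 0, 30, 60, 90
p_d <- as.vector(tab)                      # 1/18, 3/18, 6/18, 5/18, 3/18

mu_D  <- pop_mean(d, p_d)                  # 40   = mu1 - mu2
var_D <- pop_var(d, p_d)                   # 1100 = var1 + var2
```

If X1 and X2 were dependent, the `outer(p1, p2, "*")` step would no longer give the joint probabilities, and the variance would pick up the covariance term from the general formula.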
General Properties of "Expectation" of X (Slides 29-41)

POPULATION: random variable X
Example: X = Cholesterol level (mg/dL)

  Pop values x    Probabilities (pmf) p(x)
  x1              p(x1)
  x2              p(x2)
  x3              p(x3)
  ⋮               ⋮
  Total           1

Mean: $\mu_X = E[X] = \sum x\,p(x)$
Variance: $\sigma_X^2 = E[(X - \mu_X)^2] = \sum (x - \mu_X)^2\,p(x)$

Suppose X is transformed to another random variable, say h(X). Then by definition,
$\mu_{h(X)} = E[h(X)] = \sum h(x)\,p(x)$

Properties of the mean

• Suppose X is constant, say b, throughout the entire population. Then by definition,
  $E[b] = \sum b\,p(x) = b \sum p(x) = b \cdot 1 = b$
• Multiply X by any constant a (pop values a x1, a x2, a x3, … with the same probabilities). Then by definition,
  $E[aX] = \sum a\,x\,p(x) = a \sum x\,p(x) = a\,E[X]$, i.e., $\mu_{aX} = a\,\mu_X$
• Add any constant b to X (pop values x1 + b, x2 + b, x3 + b, … with the same probabilities). Then
  $E[X + b] = \sum (x + b)\,p(x) = \sum x\,p(x) + \sum b\,p(x) = E[X] + E[b] = E[X] + b$, i.e., $\mu_{X+b} = \mu_X + b$
• Combining the two:
  $E[aX + b] = a\,E[X] + b$, i.e., $\mu_{aX+b} = a\,\mu_X + b$

Properties of the variance

• Multiply X by any constant a; then $\mu_X$ is also multiplied by a.
  $\sigma_{aX}^2 = E[(aX - a\mu_X)^2] = E[a^2 (X - \mu_X)^2] = a^2\,E[(X - \mu_X)^2] = a^2 \sigma_X^2$
  i.e., $\mathrm{Var}(aX) = a^2\,\mathrm{Var}(X)$
  $\sigma_{aX} = |a|\,\sigma_X$, i.e., $\mathrm{SD}(aX) = |a|\,\mathrm{SD}(X)$
• Add any constant b to X; then b is also added to $\mu_X$.
  $\sigma_{X+b}^2 = E[((X + b) - (\mu_X + b))^2] = E[(X - \mu_X)^2] = \sigma_X^2$
  i.e., $\mathrm{Var}(X + b) = \mathrm{Var}(X)$
  $\sigma_{X+b} = \sigma_X$, i.e., $\mathrm{SD}(X + b) = \mathrm{SD}(X)$
• Alternate formula (Slide 41):
  $\sigma_X^2 = E[(X - \mu_X)^2] = E[X^2 - 2\mu_X X + \mu_X^2] = E[X^2] - 2\mu_X E[X] + \mu_X^2 E[1] = E[X^2] - 2\mu_X^2 + \mu_X^2 = E[X^2] - \mu_X^2$
  so that
  $\sigma_X^2 = E[X^2] - \mu_X^2 = \sum x^2\,p(x) - \mu_X^2$, or equivalently $\sigma_X^2 = E[X^2] - (E[X])^2 = \sum x^2\,p(x) - \left(\sum x\,p(x)\right)^2$
  This is the analogue of the "alternate computational formula" for the sample variance $s^2$.

~ The Binomial Distribution ~

• Used only when dealing with binary outcomes (two categories: "Success" vs. "Failure"), with a fixed probability of Success ($\pi$) in the population.
• Calculates the probability of obtaining any given number of Successes in a random sample of n independent "Bernoulli trials."
• Has many applications and generalizations, e.g., multiple categories, variable probability of Success, etc.

POPULATION: 40% Male, 60% Female
For any randomly selected individual, define a binary random variable:
  Y = 1 if Male, with prob $\pi = 0.4$
  Y = 0 if Female, with prob $1 - \pi = 0.6$

RANDOM SAMPLE, n = 100
Discrete random variable X = # Males in sample (0, 1, 2, 3, …, 99, 100)

How can we calculate the probability of
  p(x) = P(X = x), for x = 0, 1, 2, 3, …, 100?
  F(x) = P(X ≤ x), for x = 0, 1, 2, 3, …, 100?

Example: How can we calculate the probability p(25) = P(X = 25)?
Solution: Model the sample as a sequence of n = 100 independent coin tosses, with 1 = Heads (Male), 0 = Tails (Female), where P(H) = 0.4, P(T) = 0.6.

How many possible outcomes of n = 100 tosses exist?
Answer: $2^{100}$

How many possible outcomes of n = 100 tosses exist with X = 25 Heads?
Label the 25 Heads { H1, H2, H3, …, H25 } and place them among the 100 slots.
There are 100 possible open slots for H1 to occupy.
For each one of them, there are 99 possible open slots left for H2 to occupy.
For each one of them, there are 98 possible open slots left for H3 to occupy.
…etc…etc…etc…
For each one of them, there are 77 possible open slots left for H24 to occupy.
For each one of them, there are 76 possible open slots left for H25 to occupy.
Hence, there are $100 \times 99 \times 98 \times \cdots \times 77 \times 76$ possible outcomes. This value is the number of permutations of 25 among 100 coins, denoted 100P25.

This number unnecessarily includes the distinct permutations of the 25 Heads among themselves, all of which have Heads in the same positions. We would not want to count these as distinct outcomes. How many is that? By the same logic, $25 \times 24 \times 23 \times \cdots \times 3 \times 2 \times 1$, i.e., "25 factorial", denoted 25!.

Answer:
$\frac{100 \times 99 \times 98 \times \cdots \times 77 \times 76}{25 \times 24 \times 23 \times \cdots \times 3 \times 2 \times 1} = \frac{100!}{25!\,75!}$
This is "100-choose-25", denoted $\binom{100}{25}$ or 100C25. It counts the number of combinations of 25 Heads among 100 coins.
R: choose(100, 25)
Calculator: 100 nCr 25

What is the probability of each such outcome?
Recall that, per toss, P(Heads) = $\pi$ = 0.4 and P(Tails) = $1 - \pi$ = 0.6.
Answer: Via independence in binary outcomes between any two coins,
$0.4 \times 0.6 \times 0.6 \times 0.4 \times 0.6 \times \cdots \times 0.6 \times 0.4 \times 0.4 \times 0.6 = (0.4)^{25} (0.6)^{75}$

Therefore, the probability P(X = 25) is equal to
$\binom{100}{25} (0.4)^{25} (0.6)^{75}$
R: dbinom(25, 100, .4)

Question: What if the coin were "fair" (unbiased), i.e., $\pi = 1 - \pi = 0.5$?
Then each outcome has probability $(0.5)^{25} (0.5)^{75} = (0.5)^{100} = 1/2^{100}$. This is the "equally likely" scenario! Therefore
$P(X = 25) = \binom{100}{25} (1/2)^{100} = \binom{100}{25} \Big/ 2^{100}$

In general:
POPULATION: "Success" vs. "Failure"
For any randomly selected individual, define a binary random variable:
  Y = 1 ("Success"), with prob $\pi$
  Y = 0 ("Failure"), with prob $1 - \pi$

RANDOM SAMPLE, size n
Discrete random variable X = # "Successes" in sample (0, 1, 2, 3, …, n)

Model the sample as a sequence of n Bernoulli trials with P("Success") = $\pi$, P("Failure") = $1 - \pi$: independent, with constant probability ($\pi$) per trial.
Then X is said to follow a Binomial distribution, written X ~ Bin(n, $\pi$), with "probability mass function"
$p(x) = \binom{n}{x} \pi^x (1 - \pi)^{n - x}$, x = 0, 1, 2, …, n.
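The counting argument and the resulting binomial pmf can be reproduced step by step. This is a small sketch in R using only the calls the slides already mention (`choose`, `dbinom`); the variable names are illustrative assumptions.

```r
n <- 100; prob <- 0.4; x <- 25

# Number of outcomes with exactly 25 Heads: 100! / (25! 75!)
n_outcomes <- choose(n, x)

# Probability of any one such outcome: (0.4)^25 (0.6)^75
p_each <- prob^x * (1 - prob)^(n - x)

# P(X = 25), assembled by hand and via the built-in binomial pmf
p_manual  <- n_outcomes * p_each
p_builtin <- dbinom(x, n, prob)

# Fair-coin case: all 2^100 outcomes are equally likely
p_fair <- choose(n, x) / 2^n

# A pmf must sum to 1 over x = 0, 1, ..., n
total <- sum(dbinom(0:n, n, prob))
```

Comparing `p_manual` with `p_builtin`, and `p_fair` with `dbinom(25, 100, 0.5)`, is a quick way to confirm that the formula and the R function agree.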
Example: Blood Type probabilities, revisited

                  Rh Factor
  Blood Type      +        -        Total
  O               .384     .077     .461
  A               .323     .065     .388
  B               .094     .017     .111
  AB              .032     .007     .039
  Total           .833     .166     .999

Suppose n = 10 individuals are to be selected at random from the population.
Probability table for X = #(Type O)

Binomial model applies? Check:
1. Independent outcomes? Reasonably assume that outcomes "Type O" vs. "Not Type O" between two individuals are independent of each other. ✓
2. Constant probability $\pi$? From the table, $\pi$ = P(Type O) = .461 throughout the population. ✓

Binomial model applies. X ~ Bin(10, .461)
$p(x) = \binom{10}{x} (.461)^x (.539)^{10 - x}$
R: dbinom(0:10, 10, .461)

  x        p(x)       F(x)
  0        0.00207    0.00207
  1        0.01770    0.01977
  2        0.06813    0.08790
  3        0.15538    0.24328
  4        0.23257    0.47585
  5        0.23870    0.71455
  6        0.17013    0.88468
  7        0.08315    0.96783
  8        0.02667    0.99450
  9        0.00507    0.99957
  10       0.00043    1.00000
  Total    1

The probability histogram can be drawn in R:

    n = 10
    p = .461
    pmf = function(x) dbinom(x, n, p)
    N = 100000
    x = 0:10
    # Artificial dataset in which each value x appears in proportion to pmf(x)
    bin.dat = rep(x, round(N * pmf(x)))
    # Density histogram with one unit-width bar centered on each x
    hist(bin.dat, freq = FALSE, breaks = c(-.5, x + .5), col = "green")
    axis(1, at = x)
    axis(2)

Also, can show mean $\mu = \sum x\,p(x) = n\pi = (10)(.461) = 4.61$
and variance $\sigma^2 = \sum (x - \mu)^2\,p(x) = n\pi(1 - \pi) = 2.48$

Example: Blood Type probabilities, revisited
Suppose n = 1500 individuals are to be selected at random from the population (same table as above).
Probability table for X = #(Type AB-)
Binomial model applies. X ~ Bin(1500, .007)
Therefore,
$p(x) = \binom{1500}{x} (.007)^x (.993)^{1500 - x}$, x = 0, 1, 2, …, 1500.
Also, can show mean $\mu = \sum x\,p(x) = n\pi = 10.5$
and variance $\sigma^2 = \sum (x - \mu)^2\,p(x) = n\pi(1 - \pi) = 10.43$

RARE EVENT! The distribution has a long positive skew as x → 1500, but the contribution of those values is ≈ 0.
Is there a better alternative?

Poisson distribution
$p(x) = \frac{e^{-\mu} \mu^x}{x!}$, x = 0, 1, 2, …,
where mean and variance are $\mu = n\pi = 10.5$ and $\sigma^2 = n\pi = 10.5$.
X ~ Poisson(10.5), so $p(x) = \frac{e^{-10.5} (10.5)^x}{x!}$
Notation: Sometimes the symbol $\lambda$ ("lambda") is used instead of $\mu$ ("mu").

Ex: Probability of exactly X = 15 Type (AB-) individuals = ?
  Binomial: $\binom{1500}{15} (.007)^{15} (.993)^{1485}$
  Poisson: $\frac{e^{-10.5} (10.5)^{15}}{15!}$
  (both ≈ .0437)

Example: Deaths in Wisconsin
Assuming deaths among young adults are relatively rare, we know the following:
• Average $\lambda$ = 584 deaths per year
• Mortality rate ($\alpha$) seems constant.
Therefore, the Poisson distribution can be used as a good model to make future predictions about the random variable X = "# deaths" per year, for this population (15-24 yrs)… assuming current values will still apply.

• Probability of exactly X = 600 deaths next year:
  $P(X = 600) = \frac{e^{-584} (584)^{600}}{600!} \approx 0.0131$
  R: dpois(600, 584)
• Probability of exactly X = 1200 deaths in the next two years:
  Mean of 584 deaths per yr ⇒ mean of 1168 deaths per two yrs, so let $\lambda$ = 1168:
  $P(X = 1200) = \frac{e^{-1168} (1168)^{1200}}{1200!} \approx 0.00746$
• Probability of at least one death per day:
  $\lambda = \frac{584 \text{ deaths / yr}}{365 \text{ days / yr}} = 1.6$ deaths/day
  P(X ≥ 1) = P(X = 1) + P(X = 2) + P(X = 3) + … True, but not practical.
  $P(X \ge 1) = 1 - P(X = 0) = 1 - \frac{e^{-1.6} (1.6)^0}{0!} = 1 - e^{-1.6} = 0.798$
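The Poisson figures above, and the closeness of the Poisson approximation to the exact binomial in the rare-event example, can be checked directly. A minimal sketch in R, using `dbinom` and `dpois` as the slides do; the variable names are illustrative, and the daily rate uses the slides' rounded value of 1.6 rather than 584/365 exactly.

```r
# Rare-event example: X = #(Type AB-) among n = 1500, with pi = .007
n <- 1500; prob <- 0.007
mu <- n * prob                       # Poisson mean, 10.5
p_binom <- dbinom(15, n, prob)       # exact binomial P(X = 15)
p_pois  <- dpois(15, mu)             # Poisson approximation to the same probability

# Deaths example: lambda = 584 deaths per year
p_600  <- dpois(600, 584)            # exactly 600 deaths next year
p_1200 <- dpois(1200, 2 * 584)       # exactly 1200 deaths in the next two years

# At least one death on a given day, via the complement of zero deaths
lambda_day <- 1.6                    # rounded rate taken from the slides
p_at_least_one <- 1 - dpois(0, lambda_day)   # equals 1 - exp(-1.6)
```

Using the complement avoids summing the infinite series P(X = 1) + P(X = 2) + …, which is the point the slide makes with "True, but not practical."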
Poisson Distribution (discrete) For x = 0, 1, 2, …, this calculates P(x Events) in a random sample of n trials coming from a population with rare P(Event) = . But it may also be used to calculate P(x Events) within a random interval of time units, for a “Poisson process” having a known “Poisson rate” α. 0 X = # “clicks” on a Geiger counter in normal background radiation. T Poisson Distribution (discrete) For x = 0, 1, 2, …, this calculates P(x Events) in a random sample of n trials coming from a population with rare P(Event) = . But it may also be used to calculate P(x Events) within a random interval of time units, for a “Poisson process” having a known “Poisson rate” α. 0 T X = #time “clicks” between on a “clicks” Geiger on counter a in Geiger normalcounter background in normal radiation. background radiation. failures, deaths, births, etc. • “Time-to-Event Analysis” • “Time-to-Failure Analysis” • “Reliability Analysis” • “Survival Analysis” Time between events is often modeled by the Exponential Distribution (continuous). ● Binomial ~ X = # Successes in n trials, P(Success) =  ● Poisson ~ As above, but n large,  small, i.e., Success RARE ● Negative Binomial ~ X = # trials for k Successes, P(Success) =  ● Geometric ~ As above, but specialized to k = 1 ● Hypergeometric ~ As Binomial, but  changes between trials ● Multinomial ~ As Binomial, but for multiple categories, with 1 + 2 + … + last = 1 and x1 + x2 + … + xlast = n POPULATION random variable X Example: X = Cholesterol level (mg/dL) Example: X = “reaction time” “Pain Threshold” Experiment: Volunteers place one hand on metal plate carrying low electrical current; measure duration till hand withdrawn. 
[Figure: histograms of the SAMPLE of reaction times, drawn with class intervals of 5.0, 2.0, 1.0, and 0.5 secs; Total Area = 1 in each]

In principle, as the # of individuals in the samples increases without bound, the class interval widths can be made arbitrarily small, i.e., the scale at which X is measured can be made arbitrarily fine, since it is continuous.

“In the limit…” we obtain a density curve f(x).

f(x) = probability density function (pdf)
• f(x) ≥ 0
• Area = 1

Cumulative probability F(x) = P(X ≤ x) = Area under density curve up to x. F(x) increases continuously from 0 to 1.

As with discrete variables, the density f(x) is the height, NOT the probability p(x) = P(X = x). In fact, the zero area “limit” argument would seem to imply P(X = x) = 0 ??? (Later…) However, we can define “interval probabilities” of the form P(a ≤ X ≤ b), using the cdf F(x).

An “interval probability” P(a ≤ X ≤ b) can be calculated as the amount of area under the curve f(x) between a and b, or the difference P(X ≤ b) − P(X ≤ a), i.e., F(b) − F(a).

(Ordinarily, finding the area under a general curve requires calculus techniques… unless the “curve” is a straight line, for instance.
Examples to follow…)

Consider the following continuous random variable…
Example: X = “Ages of children from 1 year old to 6 years old”
Further suppose that X is uniformly distributed over the interval [1, 6].

Density: f(x) = 0.20 > 0

Check? Base = 6 – 1 = 5, Height = 0.2, so Total Area = 5 × 0.2 = 1 ✓

“What is the probability that a random child is 4 years old?” doesn’t mean P(X = 4.000000000......), which equals 0. A single value is one point out of an infinite continuum of points on the real number line. The probability that a continuous random variable is exactly equal to any single value is ZERO!

“What is the probability that a random child is 4 years old?” actually means “…between 4 and 5 years old”:

P(4 ≤ X < 5) = (5 – 4)(0.2) = 0.2

NOTE: Since P(X = 5) = 0, no change for P(4 ≤ X ≤ 5), P(4 < X ≤ 5), or P(4 < X < 5).

Cumulative probability F(x) = P(X ≤ x) = Area under density curve up to x.

For any x in [1, 6], the area under the curve up to x is F(x) = 0.2 (x – 1), or equivalently

$$F(x) = \int_{1}^{x} 0.2 \, dt.$$

F(x) increases continuously from 0 to 1.
(Compare with the “staircase graph” for the discrete case.)

Consider the following continuous random variable…
Example: X = “Ages of children from 1 year old to 6 years old”
Further suppose that X is uniformly distributed over the interval [1, 6].

Density f(x) = 0.20; cumulative probability F(x) = P(X ≤ x) = 0.2 (x – 1) = Area under density curve up to x.

“What is the probability that a random child is under 5 years old?”
F(5) = P(X ≤ 5) = 0.2 (5 – 1) = 0.8

“What is the probability that a random child is under 4 years old?”
F(4) = P(X ≤ 4) = 0.2 (4 – 1) = 0.6

“What is the probability that a random child is between 4 and 5 years old?”
Using the cdf F(x) = 0.2 (x – 1), with F(5) = 0.8 and F(4) = 0.6:

P(4 ≤ X ≤ 5) = P(X ≤ 5) − P(X ≤ 4) = F(5) − F(4) = 0.8 – 0.6 = 0.2

Consider the following continuous random variable…
Example: X = “Ages of children from 1 year old to 6 years old”
Now suppose instead that X has the (non-uniform) density

$$f(x) = .08\,(x - 1) \ge 0, \quad 1 \le x \le 6.$$

Check? Area = ½ × Base × Height = ½ (6 – 1)(0.4) = 1 ✓

Cumulative Distribution Function F(x)

Cumulative probability F(x) = P(X ≤ x) = Area under density curve up to x:

$$F(x) = \tfrac{1}{2} \cdot \underbrace{(x - 1)}_{\text{base}} \cdot \underbrace{.08\,(x - 1)}_{\text{height}} = .04\,(x - 1)^2, \quad \text{i.e.,} \quad F(x) = \int_{1}^{x} .08\,(t - 1)\, dt.$$

“What is the probability that a child is under 4 years old?” P(X < 4) = F(4)
“What is the probability that a child is under 5 years old?” P(X < 5) = F(5)
“What is the probability that a child is between 4 and 5?” P(4 ≤ X ≤ 5) = F(5) − F(4)

In summary…

A continuous random variable X corresponds to a probability density function (pdf) f(x), whose graph is a density curve. f(x) is NOT a pmf!

Cumulative distribution function (cdf):

$$F(x) = P(X \le x) = \int_{-\infty}^{x} f(t)\, dt$$

• $F'(x) = f(x) \ge 0$ (Fundamental Theorem of Calculus)
• $\int_{-\infty}^{\infty} f(x)\, dx = 1$
• P(X = any constant a) = 0, not f(a)
• F(x) increases continuously from 0 to 1.
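The two age examples above (uniform and triangular densities on [1, 6]) can be checked numerically. This is a minimal sketch in base R; the function names `F_unif`, `f_tri`, and `F_tri` are illustrative, and `integrate` is used only as an independent check on the closed-form cdf.

```r
# Uniform density on [1, 6]: f(x) = 0.2, F(x) = 0.2 * (x - 1).
F_unif <- function(x) punif(x, min = 1, max = 6)
p_unif <- F_unif(5) - F_unif(4)              # P(4 <= X <= 5)

# Triangular density on [1, 6]: f(x) = 0.08 * (x - 1), zero elsewhere.
f_tri <- function(x) ifelse(x >= 1 & x <= 6, 0.08 * (x - 1), 0)
F_tri <- function(x) 0.04 * (x - 1)^2        # closed form, valid for 1 <= x <= 6

total_area <- integrate(f_tri, lower = 1, upper = 6)$value   # should be 1
p_tri      <- F_tri(5) - F_tri(4)                            # via the cdf
p_tri_num  <- integrate(f_tri, lower = 4, upper = 5)$value   # via integration

print(c(uniform = p_unif, triangular = p_tri,
        numeric = p_tri_num, area = total_area))
```

For the triangular density, F(4) = .04 (3)² = 0.36 and F(5) = .04 (4)² = 0.64, so the interval probability is 0.28, larger than the 0.2 obtained under the uniform model because the density rises with age.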
Moreover, for a continuous random variable X with pdf f(x) and cdf F(x):

$$P(a \le X \le b) = \int_{a}^{b} f(x)\, dx = F(b) - F(a)$$

Mean:

$$\mu = E[X] = \int_{-\infty}^{\infty} x\, f(x)\, dx$$

Variance:

$$\sigma^2 = E\big[(X - \mu)^2\big] = \int_{-\infty}^{\infty} (x - \mu)^2 f(x)\, dx = E[X^2] - (E[X])^2 = \int_{-\infty}^{\infty} x^2 f(x)\, dx - \mu^2$$

SECTION 4.3 IN POSTED LECTURE NOTES

Four Examples: 1

For any b > 0, consider the following probability density function (pdf)...

$$f(x) = \begin{cases} \dfrac{2}{b^2}\, x, & 0 \le x \le b \\ 0, & \text{else} \end{cases}$$

Determine the cumulative distribution function (cdf) F(x) = P(X ≤ x).

For any x < 0, it follows that F(x) = P(X ≤ x) = 0.

For any 0 ≤ x ≤ b, it follows that F(x) = P(X ≤ x) = x²/b²:

without calculus... (area of a triangle)

$$F(x) = \tfrac{1}{2}\,(x - 0)\left(\frac{2x}{b^2} - 0\right) = \frac{x^2}{b^2}$$

with calculus...

$$F(x) = \int_{-\infty}^{x} f(t)\, dt = 0 + \int_{0}^{x} \frac{2}{b^2}\, t\, dt = \frac{x^2}{b^2}$$
Since F(x) = x²/b² on 0 ≤ x ≤ b, note that F(b) = b²/b² = 1. For any x ≥ b, it follows that F(x) = P(X ≤ x) = 1.

Altogether:

$$f(x) = \begin{cases} \dfrac{2}{b^2}\, x, & 0 \le x \le b \\ 0, & \text{else} \end{cases} \qquad F(x) = P(X \le x) = \begin{cases} 0, & x < 0 \\ \dfrac{x^2}{b^2}, & 0 \le x \le b \\ 1, & x > b \end{cases}$$

F(x) is monotonic and continuous from 0 to 1.

Four Examples: 2

For any b > a > 0, consider the probability density function (pdf)...

$$f(x) = \begin{cases} \dfrac{2x}{ab}, & 0 \le x \le a \\ \dfrac{2(x - b)}{b(a - b)}, & a < x \le b \\ 0, & \text{else} \end{cases}$$

Determine the cumulative distribution function (cdf) F(x) = P(X ≤ x).

For any x ≤ 0, it follows that F(x) = 0.

For any 0 ≤ x ≤ a, it follows that

$$F(x) = 0 + \int_{0}^{x} \frac{2t}{ab}\, dt = \frac{x^2}{ab}, \quad \text{so that } F(a) = \frac{a}{b}.$$

For any a ≤ x ≤ b, it follows that

$$F(x) = F(a) + \int_{a}^{x} \frac{2(t - b)}{b(a - b)}\, dt = 1 + \frac{(x - b)^2}{b(a - b)}.$$

For any x ≥ b, it follows that F(x) = F(b) = 1 + 0 = 1.

Determine the mean:

$$\mu = E[X] = \int_{-\infty}^{\infty} x\, f(x)\, dx = \int_{0}^{a} x\, \frac{2x}{ab}\, dx + \int_{a}^{b} x\, \frac{2(x - b)}{b(a - b)}\, dx$$

Determine the variance:

$$\sigma^2 = E[X^2] - (E[X])^2 = \int_{-\infty}^{\infty} x^2 f(x)\, dx - \mu^2 = \int_{0}^{a} x^2\, \frac{2x}{ab}\, dx + \int_{a}^{b} x^2\, \frac{2(x - b)}{b(a - b)}\, dx - \mu^2$$

Four Examples: 3

Consider the following probability density function (pdf)...
2  , x 1 f ( x)   x3 0, x 1 WARNING: “IMPROPER INTEGRAL” Confirm pdf    f ( x) dx    1    E[ X ]     x f ( x) dx   2 2    x 3 dx   2 dx 1 1 x x 1 c c x   lim 2 x 2 dx  lim 2   1 c  c   1  1 c c c c x  2 3 dx  lim 2 1 x dx  lim 2   3 c  c   2 x  1 c 1 1    lim   2   1  lim 2  1 c  c  c  x 1 2 2  2  lim     2  lim  2 c  c  c  x 1 Four Examples: 4 3 Consider the following probability density function (pdf)...  12  , x 1 f ( x)   x 23 0, x 1 WARNING: “IMPROPER INTEGRAL” Confirm pdf    f ( x) dx    1 1 2 c c  x x   cc 21 23 dx  lim 2  xx dx dx  lim 2    32 c  1 1 c c   x   12 1 1 cc 1 1 11    lim   2  11lim lim 2 1 1 c  c  c cc x   1 1    E[ X ]     x f ( x) dx   12 12    x 23 dx   2dxdx 1 1 x x 1 c c c x  2  lim 2  x x1dx dx  lim 2   c  1 1 c   1  1 c c 2  2  lim     2  lim  2 c  c  c  x 1 Four Examples: 4 3 Consider the following probability density function (pdf)...  12  , x 1 f ( x)   x 23 0, x 1 WARNING: “IMPROPER INTEGRAL” Confirm pdf    f ( x) dx    1 1 2 c c  x x   cc 21 23 dx  lim 2  xx dx dx  lim 2    32 c  1 1 c c   x   12 1 1 cc 1 1 11    lim   2  11lim lim 2 1 1 c  c  c cc x   1 1    E[ X ]     1   x f ( x) dx 1 1 x 2 dx   dx 1 x x  lim  x 1 dx  lim  ln | x |1 c c  1  lim (ln c) c  c c c    Time intervals intervals = 1.0 = 5.0 2.0 1.0 secs secs Time 0.5 “Density” Interval widths can be made arbitrarily small, i.e, the scale at which X is measured can be made arbitrarily fine, since it is continuous. f ( x) p ( x) (height) (area) | x x (width) pmf P( X  x)  p( x)  f ( x) x  As x  0 and # rectangles  ∞, this “Riemann sum” approaches the area under the density curve f(x), expressed as a definite integral. 
For the pdf f(x):

$$\int_{-\infty}^{\infty} f(x)\, dx = 1 \ \ (\text{Total Area}), \qquad P(a \le X \le b) = \int_{a}^{b} f(x)\, dx$$

~ The Normal Distribution ~ (a.k.a. “The Bell Curve”)

X ~ N(μ, σ), with mean μ and standard deviation σ

Johann Carl Friedrich Gauss, 1777-1855

• Symmetric, unimodal
• Models many (but not all) natural systems
• Mathematical properties make it useful to work with

Standard Normal Distribution

Z ~ N(0, 1), with density function

$$\varphi(z) = \frac{1}{\sqrt{2\pi}}\, e^{-z^2/2}, \qquad \text{Total Area} = 1$$

The cumulative distribution function (cdf) is denoted by Φ(z). It is not expressible in explicit, closed form, but is tabulated, and computable in R via the command pnorm.

Example: Find Φ(1.2) = P(Z ≤ 1.2), where 1.2 is the “z-score”.

• Use the included table (Lecture Notes Appendix…).
• Use R:

    > pnorm(1.2)
    [1] 0.8849303

So P(Z ≤ 1.2) = 0.88493 and P(Z > 1.2) = 0.11507.

Note: Because this is a continuous distribution, P(Z = 1.2) = 0, so there is no difference between P(Z > 1.2) and P(Z ≥ 1.2), etc.

Why be concerned about the standard normal distribution, when most “bell curves” don’t have mean = 0 and standard deviation = 1? Any normal distribution X ~ N(μ, σ) can be transformed to the standard normal distribution Z ~ N(0, 1) via a simple change of variable:

$$Z = \frac{X - \mu}{\sigma}$$

Example

POPULATION Random Variable X = Age at first birth (Year 2010), X ~ N(25.4, 1.5), i.e., μ = 25.4 and σ = 1.5.

Question: What proportion of the population had their first child before the age of 27.2 years old? P(X < 27.2) = ?

The x-score = 27.2 must first be transformed to a corresponding z-score.
With μ = 25.4 and σ = 1.5:

$$Z = \frac{X - \mu}{\sigma} = \frac{27.2 - 25.4}{1.5} = 1.2$$

P(X < 27.2) = P(Z < 1.2) = 0.88493

• Using R:

    > pnorm(27.2, 25.4, 1.5)
    [1] 0.8849303

Standard Normal Distribution, Z ~ N(0, 1)

What symmetric interval about the mean 0 contains 95% of the population values? That is, with area 0.95 in the middle and 0.025 in each tail, find −z.025 = ? and +z.025 = ?

• Use the included table (Lecture Notes Appendix…).
• Use R:

    > qnorm(.025)
    [1] -1.959964
    > qnorm(.975)
    [1] 1.959964

The “.025 critical values” are −z.025 = −1.96 and +z.025 = +1.96.

For X ~ N(25.4, 1.5): What symmetric interval about the mean age of 25.4 contains 95% of the population values?

$$Z = \frac{X - 25.4}{1.5} = \pm 1.96 \;\Rightarrow\; X = 25.4 \pm (1.96)(1.5) = 25.4 \pm 2.94$$

22.46 ≤ X ≤ 28.34 yrs

    > areas = c(.025, .975)
    > qnorm(areas, 25.4, 1.5)
    [1] 22.46005 28.33995

Similarly… What symmetric interval about the mean 0 contains 90% of the population values? Now the area is 0.90 in the middle and 0.05 in each tail; find −z.05 = ? and +z.05 = ?

• Use the included table: the cumulative area 0.95 ≈ average of 0.94950 and 0.95053… so average 1.64 and 1.65.
• Use R:

    > qnorm(.05)
    [1] -1.644854
    > qnorm(.95)
    [1] 1.644854

The “.05 critical values” are −z.05 = −1.645 and +z.05 = +1.645.
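The standardization and critical-value calculations above can be collected into a short R sketch. It relies only on base R's `pnorm` and `qnorm`; `central_interval` is a hypothetical helper written for this illustration, not a built-in function.

```r
mu    <- 25.4   # mean age at first birth (from the example above)
sigma <- 1.5    # standard deviation

# P(X < 27.2) two ways: standardize first, or pass mu and sigma directly.
z        <- (27.2 - mu) / sigma
p_via_z  <- pnorm(z)
p_direct <- pnorm(27.2, mean = mu, sd = sigma)

# Hypothetical helper: symmetric interval about the mean that contains
# a given proportion (`level`) of a normal population.
central_interval <- function(level, mean = 0, sd = 1) {
  tail_area <- (1 - level) / 2
  z_crit    <- qnorm(1 - tail_area)          # upper critical value
  mean + c(-1, 1) * z_crit * sd
}

ci95_z <- central_interval(0.95)                        # standard normal, 95%
ci95_x <- central_interval(0.95, mean = mu, sd = sigma) # ages, 95%
ci90_z <- central_interval(0.90)                        # standard normal, 90%

print(c(z = z, via_z = p_via_z, direct = p_direct))
print(ci95_z)
print(ci95_x)
print(ci90_z)
```

Both routes to P(X < 27.2) agree because `pnorm(x, mean, sd)` performs the same change of variable internally that the z-score makes explicit.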
In general, what symmetric interval about the mean 0 contains 100(1 – α)% of the population values? The interval from −z_{α/2} to +z_{α/2}, the “α/2 critical values”, which leaves area 1 – α in the middle and α/2 in each tail. (For 90%, α = 0.10 and the “.05 critical values” are ±1.645.)

Normal (continuous) Approximation to the Binomial (discrete) Distribution

Suppose a certain outcome exists in a population, with constant probability π. We select a random sample of n individuals, so that the binary “Success vs. Failure” outcome of any individual is independent of the binary outcome of any other individual, i.e., n Bernoulli trials (e.g., coin tosses).

Discrete random variable X = # Successes in sample (0, 1, 2, 3, …, n)
P(Success) = π
P(Failure) = 1 – π

Then X is said to follow a Binomial distribution, written X ~ Bin(n, π), with “probability function”

$$p(x) = \binom{n}{x}\, \pi^{x} (1 - \pi)^{n - x}, \quad x = 0, 1, 2, \ldots, n.$$

For X ~ Bin(100, .2), the area of the single histogram rectangle at x = 10 is P(X = 10):

    > dbinom(10, 100, .2)
    [1] 0.00336282

and the total area of the rectangles up to and including x = 10 is P(X ≤ 10):

    > pbinom(10, 100, .2)
    [1] 0.005696381

Therefore, if X ~ Bin(n, π) with nπ ≥ 15 and n(1 – π) ≥ 15, then

$$X \approx N\!\left( n\pi,\ \sqrt{n\pi(1 - \pi)} \right).$$

That is,

$$\hat{\pi} = \frac{X}{n} \approx N\!\left( \pi,\ \sqrt{\frac{\pi(1 - \pi)}{n}} \right),$$

the “Sampling Distribution” of $\hat{\pi}$.

Classical continuous distributions:
● Normal distribution
● Log-Normal ~ X is not normally distributed (e.g., skewed), but Y = “logarithm of X” is normally distributed
● Student’s t-distribution ~ Similar to normal distribution, more flexible
● F-distribution ~ Used when comparing multiple group means
● Chi-squared distribution ~ Used extensively in categorical data analysis
● Others for specialized applications ~ Gamma, Beta, Weibull…
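The normal approximation to the Binomial described above can be compared against the exact Binomial probability in R. This is a minimal sketch for X ~ Bin(100, .2); the cutoff of 25 is an arbitrary illustrative choice, and the continuity correction (adding 0.5) is a common refinement that is an assumption here, since the slides do not discuss it.

```r
n <- 100
p <- 0.2

# Rule of thumb quoted above: n*p >= 15 and n*(1 - p) >= 15.
rule_ok <- (n * p >= 15) && (n * (1 - p) >= 15)

mu    <- n * p                      # mean of X
sigma <- sqrt(n * p * (1 - p))      # standard deviation of X

exact  <- pbinom(25, size = n, prob = p)      # exact P(X <= 25)
approx <- pnorm(25, mean = mu, sd = sigma)    # normal approximation

# Assumption (not from the slides): continuity correction for a discrete X.
approx_cc <- pnorm(25 + 0.5, mean = mu, sd = sigma)

# Standard deviation of the sample proportion pi-hat = X / n.
se_prop <- sqrt(p * (1 - p) / n)

print(c(rule_ok = rule_ok, exact = exact, normal = approx,
        normal_cc = approx_cc, se_prop = se_prop))
```

Dividing `mu` and `sigma` by n gives the mean and standard deviation of the sample proportion, matching the sampling distribution stated above.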
