Principles of Biostatistics, 2nd Edition, by Marcello Pagano & Kimberlee Gauvreau (2000)
Chapter 7: Theoretical Probability Distributions
Liang-Yi Wang, Department of Public Health, NCKU
2010/3/17

Contents
• Probability distributions
• The binomial distribution
• The Poisson distribution
• The normal distribution

Variable
• Variable: a characteristic that can be measured or categorized.
• Random variable: a variable that can assume a number of different values such that any particular outcome is determined by chance.
• Discrete random variable
  ◦ Takes only a finite or countable number of outcomes. Ex: marital status (single, married, divorced, widowed); the number of infections an infant develops during his or her first year of life.
• Continuous random variable
  ◦ Can take on any value within a specified interval or continuum. Ex: weight, height.

Probability Distribution
• A probability distribution applies the theory of probability to describe the behavior of a random variable.
  ◦ Discrete random variable: it determines the probabilities of the possible outcomes of the random variable.
  ◦ Continuous random variable: it determines the probabilities associated with specified ranges of values.

Probability Distribution
• Empirical probability distribution
  ◦ Probabilities that are calculated from a finite amount of data.
• Theoretical probability distribution
  ◦ Probabilities that are determined from theoretical considerations.
  ◦ N, mean, variance, ...

THEORETICAL PROBABILITY DISTRIBUTION
• The binomial distribution
• The Poisson distribution
• The normal distribution

Theoretical Probability Distribution: What Do We Have to Know?
1. Function of P(X), mean, variance, SD
2. Parameters
3. The probability distribution

THE BINOMIAL DISTRIBUTION

What is the Binomial Distribution?
• Dichotomous random variable (Y)
  ◦ Life and death; male and female; sickness and health; failure and success
  ◦ Bernoulli random variable (two mutually exclusive outcomes)
  ◦ Ex: 29% of the adults in the US are smokers (Y = 1, smoker; Y = 0, nonsmoker)
• The number of persons in the sample who are smokers (X)
  ◦ Ex: sample size n = 2, X can take on 0, 1, or 2.
  ◦ Bernoulli trial

What is the Binomial Distribution? (n = 2, p = 0.29)
Number of smokers x    P(X = x)
0                      0.504
1                      0.412
2                      0.084
[Figure: bar chart of the binomial distribution (n = 2, p = 0.29); x-axis: number of smokers x; y-axis: probability X = x]

Assumptions (the Binomial Distribution)
1) There are a fixed number of trials n, each of which results in one of two mutually exclusive outcomes.
2) The outcomes of the n trials are independent.
3) The probability of success p is constant for each trial.

The Binomial Distribution (n = 3, p = 0.29)

The Binomial Distribution
• Parameters: n, p
• Function of P(X), mean, variance, SD
  ◦ P(X = x) = C(n, x) p^x (1−p)^(n−x), for x = 0, 1, ..., n
  ◦ Mean = np
  ◦ Variance = np(1−p)
  ◦ SD = √(np(1−p))
• Try to use Table A.1 in Appendix A.

The Binomial Distribution (n = 10, p = 0.29)
• Mean = np = 2.9
• SD = √(np(1−p)) = √(10 × 0.29 × 0.71) ≈ 1.43

(Review) Theoretical Probability Distribution: What Do We Have to Know?
1. Function of P(X), mean, variance, SD
2. Parameters
3. The probability distribution

THE POISSON DISTRIBUTION

What is the Poisson Distribution?
• Ex: We are interested in the probability distribution of the number of individuals involved in a motor vehicle accident each year (X) in a town of the US.
  ◦ p = 0.00024, n = 10,000
  ◦ Difficult to calculate using the binomial distribution!
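
The binomial figures quoted so far can be checked numerically. The sketch below is only an illustration, not part of the original slides: the function names `binom_pmf` and `poisson_pmf` are hypothetical helpers, and the Poisson formula P(X = x) = e^(−λ) λ^x / x! with λ = np is the standard one, used here ahead of its introduction in the next section. It reproduces the n = 2 and n = 10 smoker examples and then places the exact binomial probabilities for the accident example (tedious by hand, easy by machine) next to their Poisson counterparts.

```python
from math import comb, exp, factorial, sqrt


def binom_pmf(x, n, p):
    """P(X = x) for a binomial random variable with parameters n and p."""
    return comb(n, x) * p**x * (1 - p) ** (n - x)


def poisson_pmf(x, lam):
    """P(X = x) for a Poisson random variable with mean lam."""
    return exp(-lam) * lam**x / factorial(x)


# Smoker example, n = 2, p = 0.29: probabilities of 0, 1, and 2 smokers.
smokers_n2 = [binom_pmf(x, 2, 0.29) for x in range(3)]

# Smoker example, n = 10, p = 0.29: mean np and SD sqrt(np(1 - p)).
mean_n10 = 10 * 0.29
sd_n10 = sqrt(10 * 0.29 * (1 - 0.29))

# Accident example: exact binomial next to the Poisson value with lam = n * p.
# Only small x are evaluated; the probabilities are negligible beyond that.
n, p = 10_000, 0.00024
lam = n * p
comparison = [(x, binom_pmf(x, n, p), poisson_pmf(x, lam)) for x in range(6)]

if __name__ == "__main__":
    print("n=2:", [round(q, 3) for q in smokers_n2])
    print("n=10: mean", round(mean_n10, 2), "SD", round(sd_n10, 2))
    for x, exact, approx in comparison:
        print(x, round(exact, 4), round(approx, 4))
```

For these inputs the two columns of the comparison agree to about four decimal places, which is the sense in which the Poisson distribution approximates the binomial when n is large and p is small.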

What is the Poisson Distribution?
• When n is very large and p is very small, the binomial distribution is well approximated by another theoretical probability distribution: the Poisson distribution.
• The Poisson distribution is used to model discrete events that occur infrequently in time or space.

The Poisson Distribution: Assumptions
1) The probability that a single event occurs within an interval is proportional to the length of the interval.
2) Within a single interval, an infinite number of occurrences of the event are theoretically possible. We are not restricted to a fixed number of trials.
3) The events occur independently both within the same interval and between consecutive intervals.

The Poisson Distribution
• P(X = x) = e^(−λ) λ^x / x!, for x = 0, 1, 2, ...
  ◦ X (0 → ∞): the number of occurrences of some event of interest over a given interval
  ◦ λ = np (constant): the average number of occurrences of the event in an interval
  ◦ e: the base of the natural logarithm, ≈ 2.71828

The Poisson Distribution
• Function of P(X), mean, variance
  ◦ Binomial: mean = np, variance = np(1−p) ≈ np when p is very small
  ◦ Poisson: mean = variance = np = λ
• Try to use Table A.2 in Appendix A.

The Poisson Distribution
• Ex: We are interested in the probability distribution of the number of individuals involved in a motor vehicle accident each year (X) in a town of the US.
  ◦ p = 0.00024, n = 10,000, so λ = np = 2.4

The Poisson Distribution
• Small λ → the Poisson distribution is highly skewed.
• As λ increases → the distribution becomes more symmetric.

THE NORMAL DISTRIBUTION

What is the Continuous Distribution?
• The binomial and Poisson distributions are for discrete random variables (x takes integer values), not for continuous random variables.
• The probability density

The Normal Distribution (also called the Gaussian distribution or bell-shaped curve)
• The most common continuous distribution
• Symmetrical; mean = median = mode
• The probability density: f(x) = 1/(σ√(2π)) · e^(−(x−μ)²/(2σ²)), for −∞ < x < ∞
  ◦ μ (mu): mean
  ◦ σ (sigma): standard deviation
  ◦ π (pi): a constant, ≈ 3.14159

The Normal Distribution (μ = 129, σ = 19.8)
[Figure: normal curve, μ = 129, σ = 19.8; labeled value: 145 mm Hg]
• The total area under the curve must be 1.
• P(X = 90.2) = 0
• P(X < 90.2) = 0.025
• P(90.2 < X < 167.8) = 0.95

• If p is constant but n approaches infinity in a binomial distribution, or λ approaches infinity in a Poisson distribution → the shape approaches that of a normal distribution.

The Standard Normal Distribution
• It's impossible to tabulate the area associated with every normal distribution curve.
• We can use the standard normal distribution
  ◦ μ = 0, σ = 1
• Use Table A.3 in Appendix A.

The Standard Normal Distribution: Table A.3 in Appendix A
• For a particular value of z, the entry in the body of the table specifies the area beneath the curve to the right of z, or P(Z > z).

The Standard Normal Distribution
• If P(Z > z) = 0.10, z = ?
• Use Table A.3 in Appendix A.

The Standard Normal Distribution
• Normal distribution (μ = 2, σ = 0.5) → standard normal distribution
• Z score: Z = (X − μ)/σ
• By transforming X into Z, we can use Table A.3 (standard normal curve) to estimate probabilities associated with X.

The Standard Normal Distribution
• Let X be a random variable that represents systolic blood pressure. For the population of 18- to 74-year-old males in the US, systolic blood pressure is approximately normally distributed with mean 129 mm Hg and standard deviation 19.8 mm Hg (μ = 129, σ = 19.8).

The Standard Normal Distribution (μ = 129, σ = 19.8)
• P(90.2 < X < 167.8) = ?
  ◦ When X = 90.2 → Z = (90.2 − 129)/19.8 = −1.96
  ◦ When X = 167.8 → Z = (167.8 − 129)/19.8 = 1.96
  ◦ P(−1.96 < Z < 1.96) = 0.95 → P(90.2 < X < 167.8) = 0.95
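
The table lookups in the systolic blood pressure example can be cross-checked in software. The sketch below is an illustration rather than the slides' own method: it assumes Python 3.8 or later and uses `statistics.NormalDist` from the standard library in place of Table A.3; the helper names `z_score` and `upper_tail` are hypothetical. It standardizes X with Z = (X − μ)/σ, computes the areas quoted above, and solves the P(Z > z) = 0.10 exercise.

```python
from statistics import NormalDist

MU, SIGMA = 129.0, 19.8      # systolic blood pressure example from the slides
std_normal = NormalDist()    # standard normal: mean 0, standard deviation 1


def z_score(x, mu=MU, sigma=SIGMA):
    """Transform a value x of X into its standard normal deviate Z."""
    return (x - mu) / sigma


def upper_tail(z):
    """P(Z > z): the area to the right of z, the quantity Table A.3 lists."""
    return 1.0 - std_normal.cdf(z)


z_low = z_score(90.2)        # roughly -1.96
z_high = z_score(167.8)      # roughly +1.96

# P(90.2 < X < 167.8) equals P(z_low < Z < z_high).
p_between = std_normal.cdf(z_high) - std_normal.cdf(z_low)

# P(X < 90.2) equals P(Z < z_low).
p_below = std_normal.cdf(z_low)

# The z with P(Z > z) = 0.10 is the 90th percentile of the standard normal.
z_for_10_percent = std_normal.inv_cdf(1 - 0.10)

if __name__ == "__main__":
    print(round(z_low, 2), round(z_high, 2))
    print(round(p_between, 3), round(p_below, 3))
    print(round(z_for_10_percent, 2), round(upper_tail(z_for_10_percent), 2))
```

The computed values match the slide's 0.95 and 0.025 to three decimal places, and the exercise works out to z of about 1.28, which is the row and column one would locate in Table A.3.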