Theoretical distributions: the other distributions

The Aim
By the end of this lecture, students will be aware of other theoretical distributions.

The Goals
• List the important properties of the t-, Chi-squared, F- and Lognormal distributions
• Explain when each of these distributions is particularly useful
• List the important properties of the Binomial and Poisson distributions
• Explain when the Binomial and Poisson distributions are each particularly useful

Other theoretical distributions
A - Continuous probability distributions
  - t-distribution
  - Chi-squared (χ²) distribution
  - F-distribution
  - Lognormal distribution
B - Discrete probability distributions
  - Binomial distribution
  - Poisson distribution

More continuous probability distributions
• These distributions are based on continuous random variables.
• Often it is not a measurable variable that follows such a distribution but a statistic derived from the variable.
• The total area under the probability density function represents the probability of all possible outcomes, and is equal to one.

The t-distribution
- Derived by W. S. Gosset, who published under the pseudonym 'Student'; it is often called Student's t-distribution.
- The parameter that characterizes the t-distribution is the degrees of freedom (df = n - 1), so we can draw the probability density function if we know the equation of the t-distribution and its degrees of freedom. Note that the degrees of freedom are closely tied to the sample size.
- Its shape is similar to that of the Standard Normal distribution, but it is more spread out, with longer tails. Its shape approaches Normality as the degrees of freedom increase.
- It is particularly useful for calculating confidence intervals for, and testing hypotheses about, one or two means.

Sec. 10.1 t Distribution for Inferences about a Mean
• The following diagram is a comparison between the standard normal distribution and two different t distributions, for sample sizes n = 3 and n = 12.
  [Figure: standard normal distribution and t distributions for n = 3 and n = 12]
  - As you can see, they are very similar in shape, and as the sample size increases, the t distribution becomes more and more normal.
• The t statistic is
  $$t = \frac{\bar{X} - \mu}{s / \sqrt{n}}$$
  where s is the estimated standard deviation.
• The test statistic has a t distribution (assuming the underlying population really is normally distributed).
• The distribution has n - 1 degrees of freedom.

Use of the t-distribution
• The t is often thought of as a small-sample technique.
• But, strictly speaking, the t should be used whenever the population standard deviation σ is not known.
• Some practitioners use z whenever the sample is large:
  - Central Limit Theorem
  - There isn't much difference between t and z

Student's t Distribution
Note: t → Z as n increases.
t-distributions are bell-shaped and symmetric, but have 'fatter' tails than the normal.
[Figure: density curves centred at 0 for the Standard Normal (t with df = ∞), t (df = 13) and t (df = 5)]
From "Statistics for Managers Using Microsoft® Excel", 4th Edition, Prentice-Hall 2004

t distribution values, with comparison to the Z value

Confidence level   t (10 d.f.)   t (20 d.f.)   t (30 d.f.)   Z
.80                1.372         1.325         1.310         1.28
.90                1.812         1.725         1.697         1.64
.95                2.228         2.086         2.042         1.96
.99                3.169         2.845         2.750         2.58

Note: t → Z as n increases.

The Chi-squared (χ²) distribution
• It is a right-skewed distribution taking positive values.
• It is characterized by its degrees of freedom.
• Its shape depends on the degrees of freedom; it becomes more symmetrical and approaches Normality as the degrees of freedom increase.
• It is particularly useful for analyzing categorical data.

The Chi-Square Distribution
k - 1 degrees of freedom (where k = the number of categories). See Table P.495.
[Figure: chi-squared density curves for df = 3, df = 5 and df = 10; x-axis: χ²]

The F-distribution
• It is skewed to the right.
• It is defined by a ratio. The distribution of a ratio of two estimated variances calculated from Normal data approximates the F-distribution.
• The two parameters which characterize it are the degrees of freedom of the numerator and the denominator of the ratio.
• The F-distribution is particularly useful for comparing two variances, and for comparing more than two means using the analysis of variance (ANOVA).

The F-distribution
- Let $s_1^2$ and $s_2^2$ represent the sample variances of two different populations.
- If both populations are normal and the population variances $\sigma_1^2$ and $\sigma_2^2$ are equal, then the sampling distribution of
  $$F = \frac{s_1^2}{s_2^2}$$
  is called an F-distribution.
Larson/Farber 4th ed

Properties of the F-distribution
- F-values are always greater than or equal to 0.
- For all F-distributions, the mean value of F is approximately equal to 1.
[Figure: F-distribution density curves for (d.f.N = 1, d.f.D = 8), (d.f.N = 8, d.f.D = 26), (d.f.N = 16, d.f.D = 7) and (d.f.N = 3, d.f.D = 11); x-axis: F, from 1 to 4]

The Lognormal distribution
- It is the probability distribution of a random variable whose log (e.g. to base 10 or e) follows the Normal distribution.
- It is highly skewed to the right (Fig. 8.3a).
- If, when we take logs of our raw data that are skewed to the right, we produce an empirical distribution that is nearly Normal (Fig. 8.3b), our data approximate the Lognormal distribution.
- Many variables in medicine follow a Lognormal distribution. We can use the properties of the Normal distribution to make inferences about these variables after transforming the data by taking logs.
- If a data set has a Lognormal distribution, we can use the geometric mean as a summary measure of location.
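The last point can be illustrated with a minimal Python sketch: the geometric mean is obtained by averaging the logs of the data and back-transforming. The helper name `geometric_mean` and the sample values are hypothetical, chosen only to show the effect on right-skewed data.

```python
import math

def geometric_mean(values):
    """Back-transform the arithmetic mean of the natural logs."""
    if not values or any(v <= 0 for v in values):
        raise ValueError("values must be non-empty and strictly positive")
    mean_log = sum(math.log(v) for v in values) / len(values)
    return math.exp(mean_log)

# Hypothetical right-skewed measurements (illustrative only, not real data).
sample = [1.0, 2.0, 4.0, 8.0, 64.0]
arithmetic = sum(sample) / len(sample)
geometric = geometric_mean(sample)
print(f"arithmetic mean = {arithmetic:.2f}, geometric mean = {geometric:.2f}")
```

For skewed data like this, the single large value pulls the arithmetic mean upward, while the geometric mean stays closer to the bulk of the observations.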
Discrete probability distributions
• The random variable that defines the probability distribution is discrete.
• The sum of the probabilities of all possible mutually exclusive events is one.

The Binomial distribution
• Suppose, in a given situation, there are only two outcomes, 'success' and 'failure'.
• For example, we may be interested in whether a woman conceives (a success) or does not conceive (a failure) after in vitro fertilization (IVF).
• If we look at n = 100 unrelated women undergoing IVF (each with the same probability of conceiving), the Binomial random variable is the observed number of conceptions (successes).
• Often this concept is explained in terms of n independent repetitions of a trial (e.g. 100 tosses of a coin) in which the outcome is either success (e.g. head) or failure.
• The two parameters that describe the Binomial distribution are:
  - n, the number of individuals in the sample (or repetitions of a trial), and
  - p, the true probability of success for each individual (or in each trial).
• Its mean (the value of the random variable that we expect if we look at n individuals, or repeat the trial n times) is np. Its variance is np(1 - p).
• When n is small:
  - if p < 0.5, the distribution is skewed to the right;
  - if p > 0.5, the distribution is skewed to the left.
• The distribution becomes more symmetrical as the sample size increases (Fig. 8.4) and approximates the Normal distribution if both np and n(1 - p) are greater than 5.
• We can use the properties of the Binomial distribution when making inferences about proportions.
• In particular, we often use the Normal approximation to the Binomial distribution when analyzing proportions.

Notation for Binomial Experiments

Symbol     Description
n          The number of times a trial is repeated.
p = P(S)   The probability of success in a single trial.
q = P(F)   The probability of failure in a single trial (q = 1 - p).
x          The random variable representing a count of the number of successes in n trials: x = 0, 1, 2, 3, ..., n.

Binomial Probabilities
There are several ways to find the probability of x successes in n trials of a binomial experiment. One way is to use the binomial probability formula.

Binomial Probability Formula
In a binomial experiment, the probability of exactly x successes in n trials is:
$$P(x) = {}_nC_x \, p^x q^{n-x} = \frac{n!}{(n-x)!\,x!} \, p^x q^{n-x}$$

Ex: Finding Binomial Probabilities
A six-sided die is rolled 3 times. Find the probability of rolling exactly one 6. You could use a tree diagram:

Roll 1   Roll 2   Roll 3   Frequency         # of 6's   Probability
6        6        6        (1)(1)(1) = 1     3          1/216
6        6        not 6    (1)(1)(5) = 5     2          5/216
6        not 6    6        (1)(5)(1) = 5     2          5/216
6        not 6    not 6    (1)(5)(5) = 25    1          25/216
not 6    6        6        (5)(1)(1) = 5     2          5/216
not 6    6        not 6    (5)(1)(5) = 25    1          25/216
not 6    not 6    6        (5)(5)(1) = 25    1          25/216
not 6    not 6    not 6    (5)(5)(5) = 125   0          125/216

There are three outcomes that have exactly one six, and each has a probability of 25/216. So, the probability of rolling exactly one six is 3(25/216) ≈ 0.347.

Another way to answer the question is to use the binomial probability formula. In this binomial experiment, rolling a 6 is a success while rolling any other number is a failure. The values for n, p, q, and x are n = 3, p = 1/6, q = 5/6 and x = 1. The probability of rolling exactly one 6 is:
$$P(1) = \frac{3!}{(3-1)!\,1!} \left(\frac{1}{6}\right)^{1} \left(\frac{5}{6}\right)^{3-1} = 3\left(\frac{1}{6}\right)\left(\frac{5}{6}\right)^{2} = 3\left(\frac{1}{6}\right)\left(\frac{25}{36}\right) = 3\left(\frac{25}{216}\right) = \frac{25}{72} \approx 0.347$$

By listing the possible values of x with the corresponding probability of each, you can construct a binomial probability distribution.

The Poisson distribution
• The Poisson random variable is the count of the number of events that occur independently and randomly in time or space at some average rate, μ.
• For example, the number of hospital admissions per day typically follows the Poisson distribution.
• We can use our knowledge of the Poisson distribution to calculate the probability of a certain number of admissions on any particular day.
• The parameter that describes the Poisson distribution is the mean, i.e. the average rate, μ.
• The mean equals the variance in the Poisson distribution.
• It is a right-skewed distribution if the mean is small, but becomes more symmetrical as the mean increases, when it approximates a Normal distribution.

The Poisson Distribution
The Poisson distribution is defined by:
$$f(x) = \frac{\mu^x e^{-\mu}}{x!}$$
where
- f(x) is the probability of x occurrences in an interval,
- μ is the expected value, or mean number, of occurrences within an interval,
- e is the base of the natural logarithm, e ≈ 2.71828.

Poisson Distribution, example
The Poisson distribution models counts, such as the number of new cases of SARS that occur in women in New England next month. The distribution tells you the probability of all possible numbers of new cases, from 0 to infinity. If X = number of new cases next month and X ~ Poisson(μ), then the probability that X = k (a particular count) is:
$$p(X = k) = \frac{\mu^k e^{-\mu}}{k!}$$

Example: Mercy Hospital
• Poisson Probability Function
Patients arrive at the emergency room of Mercy Hospital at the average rate of 6 per hour on weekend evenings. What is the probability of 4 arrivals in 30 minutes on a weekend evening?

μ = 6/hour = 3/half-hour, x = 4
$$f(4) = \frac{3^4 (2.71828)^{-3}}{4!} = 0.1680$$

Using Excel to Compute Poisson Probabilities
Cell A1 holds the mean number of occurrences (μ = 3). Rows 4 to 10 hold the number of arrivals x in column A and the probability f(x) in column B, shown here both as the formula entered and as the resulting value.

Row   A: Number of Arrivals (x)   B: Probability f(x), formula    B: Probability f(x), value
4     0                           =POISSON(A4,$A$1,FALSE)         0.0498
5     1                           =POISSON(A5,$A$1,FALSE)         0.1494
6     2                           =POISSON(A6,$A$1,FALSE)         0.2240
7     3                           =POISSON(A7,$A$1,FALSE)         0.2240
8     4                           =POISSON(A8,$A$1,FALSE)         0.1680
9     5                           =POISSON(A9,$A$1,FALSE)         0.1008
10    6                           =POISSON(A10,$A$1,FALSE)        0.0504
… and so on

Example: Mercy Hospital
Poisson Distribution of Arrivals
[Figure: bar chart "Poisson Probabilities"; x-axis: Number of Arrivals in 30 Minutes, 0 to 10 (actually, the sequence continues: 11, 12, …); y-axis: Probability, 0.00 to 0.25]

Summary
Other theoretical distributions
A - Continuous probability distributions
  - t-distribution
  - Chi-squared (χ²) distribution
  - F-distribution
  - Lognormal distribution
B - Discrete probability distributions
  - Binomial distribution
  - Poisson distribution
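
The two worked examples in this lecture (exactly one 6 in three rolls of a die, and the number of arrivals in 30 minutes at Mercy Hospital) can be checked with a short Python sketch. It is only an illustration using the standard library; the function names `binomial_pmf` and `poisson_pmf` are hypothetical and are not part of the original slides, which use Excel's POISSON function instead.

```python
import math

def binomial_pmf(x, n, p):
    """P(exactly x successes in n trials) = nCx * p^x * q^(n - x)."""
    q = 1.0 - p
    return math.comb(n, x) * p**x * q**(n - x)

def poisson_pmf(x, mu):
    """P(exactly x events in an interval) = mu^x * e^(-mu) / x!."""
    return mu**x * math.exp(-mu) / math.factorial(x)

# Die example from the slides: n = 3 rolls, p = 1/6, x = 1 six.
p_one_six = binomial_pmf(1, 3, 1 / 6)
print(f"P(exactly one 6 in three rolls) = {p_one_six:.3f}")

# Mercy Hospital example: 6 arrivals per hour gives mu = 3 per half-hour.
mu = 6 * 0.5
for x in range(7):
    print(f"f({x}) = {poisson_pmf(x, mu):.4f}")
```

The loop covers x = 0 to 6, the same range as the Excel worksheet, so the printed probabilities can be compared with the worksheet values row by row.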