Statistics for Managers Using Microsoft Excel, 3rd Edition
Chapter 5: The Normal Distribution and Sampling Distributions

Chapter Topics
  The normal distribution
  The standardized normal distribution
  Evaluating the normality assumption
  The exponential distribution

Chapter Topics (continued)
  Introduction to sampling distribution
  Sampling distribution of the mean
  Sampling distribution of the proportion
  Sampling from finite population

Continuous Probability Distributions
  Continuous random variable
    Values from interval of numbers
    Absence of gaps
  Continuous probability distribution
    Distribution of continuous random variable
  Most important continuous probability distribution
    The normal distribution

The Normal Distribution
  "Bell shaped"
  Symmetrical
  Mean, median and mode are equal
  Interquartile range equals 1.33 $\sigma$
  Random variable has infinite range
  [Figure: bell-shaped curve of f(X) against X, with mean, median and mode at the center]

The Mathematical Model
  \[ f(X) = \frac{1}{\sqrt{2\pi\sigma^2}} \, e^{-\frac{1}{2\sigma^2}(X-\mu)^2} \]
  $f(X)$: density of random variable X
  $\pi \approx 3.14159$; $e \approx 2.71828$
  $\mu$: population mean
  $\sigma$: population standard deviation
  $X$: value of random variable ($-\infty < X < \infty$)

Expectation
  \[ E(X) = \frac{1}{\sqrt{2\pi}\,\sigma} \int_{-\infty}^{\infty} x \, e^{-(x-\mu)^2/2\sigma^2} \, dx \]
  \[ = \frac{1}{\sqrt{2\pi}\,\sigma} \int_{-\infty}^{\infty} (x-\mu) \, e^{-(x-\mu)^2/2\sigma^2} \, dx + \mu \, \frac{1}{\sqrt{2\pi}\,\sigma} \int_{-\infty}^{\infty} e^{-(x-\mu)^2/2\sigma^2} \, dx = 0 + \mu = \mu \]

Variance
  \[ E(X-\mu)^2 = \frac{1}{\sqrt{2\pi}\,\sigma} \int_{-\infty}^{\infty} (x-\mu)^2 \, e^{-(x-\mu)^2/2\sigma^2} \, dx = \frac{\sigma^2}{\sqrt{2\pi}} \int_{-\infty}^{\infty} y^2 e^{-y^2/2} \, dy = \sigma^2, \qquad y = \frac{x-\mu}{\sigma} \]

Many Normal Distributions
  There are an infinite number of normal distributions
  By varying the parameters $\sigma$ and $\mu$, we obtain different normal distributions

Finding Probabilities
  Probability is the area under the curve!
  $P(c \le X \le d) = ?$
  [Figure: f(X) against X with the area between c and d shaded]

Which Table to Use?
  An infinite number of normal distributions means an infinite number of tables to look up!
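The statement that probability is the area under the curve can be checked numerically. The following is a minimal sketch, not part of the original slides: it evaluates the density from the mathematical model and approximates the area with the trapezoidal rule. The function names and the values of mu, sigma, c and d are illustrative assumptions.

```python
import math

def normal_pdf(x, mu, sigma):
    """Density f(X) of a normal distribution with mean mu and std dev sigma."""
    coeff = 1.0 / (math.sqrt(2.0 * math.pi) * sigma)
    return coeff * math.exp(-((x - mu) ** 2) / (2.0 * sigma ** 2))

def area_under_curve(c, d, mu, sigma, steps=10_000):
    """Approximate P(c <= X <= d) with the trapezoidal rule."""
    width = (d - c) / steps
    total = 0.5 * (normal_pdf(c, mu, sigma) + normal_pdf(d, mu, sigma))
    for i in range(1, steps):
        total += normal_pdf(c + i * width, mu, sigma)
    return total * width

# Illustrative values (assumptions chosen for this sketch)
mu, sigma = 5.0, 10.0
prob = area_under_curve(2.9, 7.1, mu, sigma)

# The whole curve should enclose an area of 1; ten standard deviations
# on either side of the mean is used here as a stand-in for infinity.
total_area = area_under_curve(mu - 10 * sigma, mu + 10 * sigma, mu, sigma)

print(f"P(2.9 <= X <= 7.1) is approximately {prob:.4f}")
print(f"Total area is approximately {total_area:.4f}")
```

Numerical integration works for any choice of mu and sigma, but it has to be repeated for every distribution, which is the practical problem that a single standardized table avoids.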
Solution: The Cumulative Standardized Normal Distribution
  Cumulative Standardized Normal Distribution Table (Portion), $\mu_Z = 0$, $\sigma_Z = 1$

    Z     .00     .01     .02
    0.0   .5000   .5040   .5080
    0.1   .5398   .5438   .5478
    0.2   .5793   .5832   .5871
    0.3   .6179   .6217   .6255

  For Z = 0.12 (row 0.1, column .02) the cumulative probability is .5478
  Only one table is needed
  [Figure: standardized normal curve with the area to the left of Z = 0.12 shaded; shaded area exaggerated]

Standardizing Example
  \[ Z = \frac{X - \mu}{\sigma} = \frac{6.2 - 5}{10} = 0.12 \]
  Normal distribution: $\mu = 5$, $\sigma = 10$, X = 6.2
  Standardized normal distribution: $\mu_Z = 0$, $\sigma_Z = 1$, Z = 0.12
  [Figure: the two curves side by side; shaded area exaggerated]

Example: $P(2.9 \le X \le 7.1) = .1664$
  \[ Z = \frac{X - \mu}{\sigma} = \frac{2.9 - 5}{10} = -.21 \qquad Z = \frac{X - \mu}{\sigma} = \frac{7.1 - 5}{10} = .21 \]
  Normal distribution: $\mu = 5$, $\sigma = 10$
  Standardized normal distribution: $\mu_Z = 0$, $\sigma_Z = 1$
  The area on each side of the mean is .0832
  [Figure: area between 2.9 and 7.1 on the X scale, and between -0.21 and 0.21 on the Z scale; shaded area exaggerated]

Example: $P(2.9 \le X \le 7.1) = .1664$ (continued)
  From the table portion above, Z = 0.21 (row 0.2, column .01) gives a cumulative probability of .5832

Example: $P(2.9 \le X \le 7.1) = .1664$ (continued)
  Cumulative Standardized Normal Distribution Table (Portion), $\mu_Z = 0$, $\sigma_Z = 1$

    Z      .00     .01     .02
    -0.3   .3821   .3783   .3745
    -0.2   .4207   .4168   .4129
    -0.1   .4602   .4562   .4522
    0.0    .5000   .4960   .4920

  Z = -0.21 (row -0.2, column .01) gives a cumulative probability of .4168
  Therefore $P(2.9 \le X \le 7.1) = .5832 - .4168 = .1664$

Normal Distribution in PHStat
  PHStat | Probability & Prob. Distributions | Normal …
  Example in Excel spreadsheet

Example: $P(X \ge 8) = .3821$
  \[ Z = \frac{X - \mu}{\sigma} = \frac{8 - 5}{10} = .30 \]
  Normal distribution: $\mu = 5$, $\sigma = 10$
  Standardized normal distribution: $\mu_Z = 0$, $\sigma_Z = 1$
  [Figure: area to the right of X = 8, equivalently Z = 0.30, shaded; shaded area exaggerated]

Example: $P(X \ge 8) = .3821$ (continued)
  From the table portion above, Z = 0.30 (row 0.3, column .00) gives a cumulative probability of .6179
  Therefore $P(X \ge 8) = 1 - .6179 = .3821$

Finding Z Values for Known Probabilities
  What is Z given probability = 0.1217?
  Standardized normal distribution: $\mu_Z = 0$, $\sigma_Z = 1$
  The shaded area of .1217 lies between 0 and Z, so the cumulative area is .5000 + .1217 = .6217
  Cumulative Standardized Normal Distribution Table (Portion)

    Z     .00     .01     .02
    0.0   .5000   .5040   .5080
    0.1   .5398   .5438   .5478
    0.2   .5793   .5832   .5871
    0.3   .6179   .6217   .6255

  The value .6217 sits in row 0.3, column .01, so Z = .31
  [Figure: standardized normal curve with the area between 0 and Z = .31 shaded; shaded area exaggerated]

Recovering X Values for Known Probabilities
  Normal distribution: $\mu = 5$, $\sigma = 10$
  Standardized normal distribution: $\mu_Z = 0$, $\sigma_Z = 1$, Z = 0.30
  \[ X = \mu + Z\sigma = 5 + (.30)(10) = 8 \]
  [Figure: the two curves side by side, with areas .1179 and .3821 marked]

Assessing Normality
  Not all continuous random variables are normally distributed
  It is important to evaluate how well the data set is approximated by a normal distribution

Assessing Normality (continued)
  Construct charts
    For small- or moderate-sized data sets, do the stem-and-leaf display and box-and-whisker plot look symmetric?
    For large data sets, does the histogram or polygon appear bell-shaped?
  Compute descriptive summary measures
    Do the mean, median and mode have similar values?
    Is the interquartile range approximately 1.33 $\sigma$?
    Is the range approximately 6 $\sigma$?

Assessing Normality (continued)
  Observe the distribution of the data set
    Do approximately 2/3 of the observations lie between mean $\pm$ 1 standard deviation?
    Do approximately 4/5 of the observations lie between mean $\pm$ 1.28 standard deviations?
    Do approximately 19/20 of the observations lie between mean $\pm$ 2 standard deviations?
  Evaluate normal probability plot
    Do the points lie on or close to a straight line with positive slope?

Assessing Normality (continued)
  Normal probability plot
    Arrange data into ordered array
    Find corresponding standardized normal quantile values
    Plot the pairs of points with observed data values on the vertical axis and the standardized normal quantile values on the horizontal axis
    Evaluate the plot for evidence of linearity

Assessing Normality (continued)
  Normal Probability Plot for Normal Distribution
  [Figure: X (30 to 90) on the vertical axis against Z (-2 to 2) on the horizontal axis]
  Look for straight line!
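The steps for building a normal probability plot can be sketched in code. This is a minimal illustration rather than the procedure PHStat uses: the plotting position i/(n+1) for the i-th ordered value is an assumed convention, the sample data are hypothetical, and the correlation check is only one informal way to judge linearity.

```python
import math
from statistics import NormalDist

def normal_plot_points(data):
    """Pair each ordered observation with a standardized normal quantile."""
    ordered = sorted(data)                      # step 1: ordered array
    n = len(ordered)
    std_normal = NormalDist(mu=0.0, sigma=1.0)
    # step 2: Z value whose cumulative area is i/(n+1) (assumed convention)
    quantiles = [std_normal.inv_cdf(i / (n + 1)) for i in range(1, n + 1)]
    # step 3: (horizontal Z, vertical X) pairs ready for plotting
    return list(zip(quantiles, ordered))

def linearity(points):
    """Pearson correlation of the plotted pairs; near 1 suggests a straight line."""
    zs = [z for z, _ in points]
    xs = [x for _, x in points]
    z_bar = sum(zs) / len(zs)
    x_bar = sum(xs) / len(xs)
    cov = sum((z - z_bar) * (x - x_bar) for z, x in points)
    spread = math.sqrt(
        sum((z - z_bar) ** 2 for z in zs) * sum((x - x_bar) ** 2 for x in xs)
    )
    return cov / spread

# Hypothetical sample, invented for this sketch
sample = [62, 48, 71, 55, 59, 66, 52, 60, 45, 75, 58]
points = normal_plot_points(sample)
r = linearity(points)

for z, x in points:
    print(f"{z:6.2f}  {x}")
print(f"correlation = {r:.3f}")
```

Plotting the printed pairs with Z on the horizontal axis and X on the vertical axis gives the chart described above. A correlation close to 1 is consistent with a straight line, but the plot itself should still be inspected for curvature.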
Normal Probability Plot
  [Figure: four panels of X (30 to 90) against Z (-2 to 2): Left-Skewed, Right-Skewed, Rectangular, U-Shaped]

Exponential Distributions
  \[ P(\text{arrival time} < X) = 1 - e^{-\lambda X} \]
  $X$: any value of continuous random variable
  $\lambda$: the population average number of arrivals per unit of time
  $1/\lambda$: average time between arrivals
  $e \approx 2.71828$
  e.g.: drivers arriving at a toll bridge; customers arriving at an ATM

Exponential Distributions (continued)
  Describes time or distance between events
    Used for queues
  Density function
    \[ f(x) = \lambda e^{-\lambda x} \]
  Parameters
    $\mu = 1/\lambda$, $\sigma = 1/\lambda$
  [Figure: f(X) against X for $\lambda = 0.5$ and $\lambda = 2.0$]

Example
  Customers arrive at the checkout line of a supermarket at the rate of 30 per hour. What is the probability that the arrival time between consecutive customers is greater than five minutes?
  $\lambda = 30$, $X = 5/60$ hours
  \[ P(\text{arrival time} > X) = 1 - P(\text{arrival time} \le X) = 1 - \left(1 - e^{-30(5/60)}\right) = e^{-2.5} = .0821 \]

Exponential Distribution in PHStat
  PHStat | Probability & Prob. Distributions | Exponential
  Example in Excel spreadsheet

Why Study Sampling Distributions
  Sample statistics are used to estimate population parameters
    e.g.: $\bar{X} = 50$ estimates the population mean $\mu$
  Problems: different samples provide different estimates
    Large samples give better estimates; large samples cost more
    How good is the estimate?
    Approach to solution: the theoretical basis is the sampling distribution

Sampling Distribution
  Theoretical probability distribution of a sample statistic
  Sample statistic is a random variable
    Sample mean, sample proportion
  Results from taking all possible samples of the same size

Developing Sampling Distributions
  Assume there is a population …
  Population size N = 4 (individuals A, B, C, D)
  Random variable, X, is age of individuals
  Values of X: 18, 20, 22, 24, measured in years

Developing Sampling Distributions (continued)
  Summary Measures for the Population Distribution
  \[ \mu = \frac{\sum_{i=1}^{N} X_i}{N} = \frac{18 + 20 + 22 + 24}{4} = 21 \]
  \[ \sigma = \sqrt{\frac{\sum_{i=1}^{N} (X_i - \mu)^2}{N}} = 2.236 \]
  [Figure: uniform distribution P(X) over A (18), B (20), C (22), D (24)]

Developing Sampling Distributions (continued)
  All Possible Samples of Size n = 2
  16 samples taken with replacement

    1st Obs   2nd Observation
              18      20      22      24
    18        18,18   18,20   18,22   18,24
    20        20,18   20,20   20,22   20,24
    22        22,18   22,20   22,22   22,24
    24        24,18   24,20   24,22   24,24

  16 Sample Means

    1st Obs   2nd Observation
              18   20   22   24
    18        18   19   20   21
    20        19   20   21   22
    22        20   21   22   23
    24        21   22   23   24

Developing Sampling Distributions (continued)
  Sampling Distribution of All Sample Means
  The 16 sample means in the table above form the sample means distribution
  [Figure: P($\bar{X}$) against $\bar{X}$ for the values 18, 19, 20, 21, 22, 23, 24]

Developing Sampling Distributions (continued)
  Summary Measures of Sampling Distribution
  \[ \mu_{\bar{X}} = \frac{\sum_{i=1}^{N} \bar{X}_i}{N} = \frac{18 + 19 + 19 + \cdots + 24}{16} = 21 \]
  \[ \sigma_{\bar{X}} = \sqrt{\frac{\sum_{i=1}^{N} (\bar{X}_i - \mu_{\bar{X}})^2}{N}} = \sqrt{\frac{(18 - 21)^2 + (19 - 21)^2 + \cdots + (24 - 21)^2}{16}} = 1.58 \]
  Here N = 16 is the number of sample means

Comparing the Population with its Sampling Distribution
  Population, N = 4: $\mu = 21$, $\sigma = 2.236$
  Sample means distribution, n = 2: $\mu_{\bar{X}} = 21$, $\sigma_{\bar{X}} = 1.58$
  [Figure: P(X) over A (18), B (20), C (22), D (24) beside P($\bar{X}$) over 18 to 24]

Properties of Summary Measures
  $\mu_{\bar{X}} = \mu$, i.e.
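  $\bar{X}$ is unbiased
  Standard error (standard deviation) of the sampling distribution, $\sigma_{\bar{X}}$, is less than the standard error of other unbiased estimators
  For sampling with replacement: as n increases, $\sigma_{\bar{X}}$ decreases
    \[ \sigma_{\bar{X}} = \frac{\sigma}{\sqrt{n}} \]

Unbiasedness
  [Figure: P($\bar{X}$) for an unbiased estimator, centered at $\mu$, and for a biased estimator]

Less Variability
  [Figure: sampling distribution of the mean compared with the sampling distribution of the median]

Effect of Large Sample
  [Figure: P($\bar{X}$) for a larger sample size compared with a smaller sample size]

When the Population is Normal
  Population distribution: $\mu = 50$, $\sigma = 10$
  Central tendency: $\mu_{\bar{X}} = \mu$
  Variation: $\sigma_{\bar{X}} = \sigma / \sqrt{n}$ (sampling with replacement)
  Sampling distributions, both centered at $\mu_{\bar{X}} = 50$
    n = 4: $\sigma_{\bar{X}} = 5$
    n = 16: $\sigma_{\bar{X}} = 2.5$

When the Population is Not Normal
  Population distribution: $\mu = 50$, $\sigma = 10$
  Central tendency: $\mu_{\bar{X}} = \mu$
  Variation: $\sigma_{\bar{X}} = \sigma / \sqrt{n}$ (sampling with replacement)
  Sampling distributions, both centered at $\mu_{\bar{X}} = 50$
    n = 4: $\sigma_{\bar{X}} = 5$
    n = 30: $\sigma_{\bar{X}} = 1.8$

Central Limit Theorem
  As sample size gets large enough … the sampling distribution becomes almost normal regardless of shape of population

How Large is Large Enough?
  For most distributions, n > 30
  For fairly symmetric distributions, n > 15
  For normal distribution, the sampling distribution of the mean is always normally distributed

Example: $\mu = 8$, $\sigma = 2$, $n = 25$
  $P(7.8 \le \bar{X} \le 8.2) = ?$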
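  \[ P(7.8 \le \bar{X} \le 8.2) = P\left( \frac{7.8 - 8}{2/\sqrt{25}} \le \frac{\bar{X} - \mu_{\bar{X}}}{\sigma_{\bar{X}}} \le \frac{8.2 - 8}{2/\sqrt{25}} \right) = P(-.5 \le Z \le .5) = .3830 \]
  Sampling distribution: $\mu_{\bar{X}} = 8$, $\sigma_{\bar{X}} = 2/\sqrt{25} = .4$
  Standardized normal distribution: $\mu_Z = 0$, $\sigma_Z = 1$
  The area on each side of the mean is .1915
  [Figure: area between 7.8 and 8.2 on the $\bar{X}$ scale, and between -0.5 and 0.5 on the Z scale]

Population Proportions
  Categorical variable
    e.g.: gender, voted for Bush, college degree
  Proportion of population having a characteristic: $p$
  Sample proportion provides an estimate of $p$
    \[ p_S = \frac{X}{n} = \frac{\text{number of successes}}{\text{sample size}} \]
  If two outcomes, X has a binomial distribution
    Possess or do not possess characteristic

Sampling Distribution of Sample Proportion
  Approximated by normal distribution when
    $np \ge 5$
    $n(1 - p) \ge 5$
  Mean: $\mu_{p_S} = p$
  Standard error: $\sigma_{p_S} = \sqrt{\dfrac{p(1 - p)}{n}}$
  $p$ = population proportion
  [Figure: sampling distribution P($p_S$) against $p_S$ from 0 to 1]

Standardizing Sampling Distribution of Proportion
  \[ Z = \frac{p_S - \mu_{p_S}}{\sigma_{p_S}} = \frac{p_S - p}{\sqrt{\dfrac{p(1 - p)}{n}}} \]
  Sampling distribution: mean $\mu_{p_S}$, standard error $\sigma_{p_S}$
  Standardized normal distribution: $\mu_Z = 0$, $\sigma_Z = 1$

Example: $n = 200$, $p = .4$, $P(p_S \le .43) = ?$
  \[ P(p_S \le .43) = P\left( \frac{p_S - \mu_{p_S}}{\sigma_{p_S}} \le \frac{.43 - .4}{\sqrt{\dfrac{.4(1 - .4)}{200}}} \right) = P(Z \le .87) = .8078 \]
  [Figure: area to the left of $p_S = .43$, equivalently Z = .87, shaded]

Sampling from Finite Population
  Modify standard error if sample size (n) is large relative to population size (N)
    $n > .05N$ or $n/N > .05$
  Use finite population correction factor (fpc)
  Standard error with FPC
    \[ \sigma_{\bar{X}} = \frac{\sigma}{\sqrt{n}} \sqrt{\frac{N - n}{N - 1}} \]
    \[ \sigma_{p_S} = \sqrt{\frac{p(1 - p)}{n}} \sqrt{\frac{N - n}{N - 1}} \]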
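The finite population correction can be checked by brute force on a small population. The sketch below is an illustration, not part of the original slides: it reuses the four ages 18, 20, 22, 24 from the earlier N = 4 example, enumerates every ordered sample of size 2 drawn without replacement, and compares the standard deviation of those sample means with the corrected formula. The function names are assumptions, and the population size of 1000 in the proportion example is a hypothetical value.

```python
import math
from itertools import permutations

def standard_error_mean(sigma, n, N=None):
    """sigma / sqrt(n), multiplied by the fpc when a population size N is given."""
    se = sigma / math.sqrt(n)
    if N is not None:
        se *= math.sqrt((N - n) / (N - 1))
    return se

def standard_error_proportion(p, n, N=None):
    """sqrt(p(1-p)/n), multiplied by the fpc when a population size N is given."""
    se = math.sqrt(p * (1 - p) / n)
    if N is not None:
        se *= math.sqrt((N - n) / (N - 1))
    return se

population = [18, 20, 22, 24]          # ages from the N = 4 example
N, n = len(population), 2
mu = sum(population) / N
sigma = math.sqrt(sum((x - mu) ** 2 for x in population) / N)

# Every ordered sample of size n drawn WITHOUT replacement (12 samples)
means = [sum(s) / n for s in permutations(population, n)]
mean_of_means = sum(means) / len(means)
se_enumerated = math.sqrt(
    sum((m - mean_of_means) ** 2 for m in means) / len(means)
)

se_plain = standard_error_mean(sigma, n)       # with replacement
se_fpc = standard_error_mean(sigma, n, N)      # without replacement

# Proportion example with a hypothetical population of 1000 (n/N = .2 > .05)
se_prop_plain = standard_error_proportion(0.4, 200)
se_prop_fpc = standard_error_proportion(0.4, 200, N=1000)

print(f"sigma / sqrt(n)        = {se_plain:.4f}")
print(f"with fpc               = {se_fpc:.4f}")
print(f"enumerated, no replace = {se_enumerated:.4f}")
print(f"proportion, plain/fpc  = {se_prop_plain:.4f} / {se_prop_fpc:.4f}")
```

Because n/N = .5 here, well above the .05 threshold, the uncorrected standard error overstates the variability of the sample mean, while the corrected value agrees with the enumeration.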