Probability & Statistics Review I

1. Normal Distribution
2. Sampling Distribution
3. Inference - Confidence Interval

1. Normal Distribution: $N(\mu, \sigma^2)$

The normal probability density function is
• 'Bell shaped' and symmetrical
• Location is characterized by the mean, $\mu$
• Spread is characterized by the standard deviation, $\sigma$

Changing $\mu$ shifts the distribution left or right. Changing $\sigma$ increases or decreases the spread.

Standard Normal Distribution: $N(0, 1^2)$

The standard normal distribution has a mean of 0 and a standard deviation of 1. A random variable X following the normal distribution $N(\mu, \sigma^2)$ can be translated to the random variable Z following the standard normal distribution by subtracting the mean of X and dividing by its standard deviation:

$$Z = \frac{X - \mu}{\sigma}$$

Ex) If X is distributed normally with a mean of 100 and a standard deviation of 50, the Z value for X = 200 is

$$Z = \frac{X - \mu}{\sigma} = \frac{200 - 100}{50} = 2.0$$

Standard Normal Probability Table

The Standardized Normal table in the textbook (Table I) gives the probability less than a desired value for Z (i.e., from negative infinity to Z).

Example: P(Z < 2.00) = .9772

The row shows the value of Z to the first decimal point. The column gives the value of Z to the second decimal point.

  Z      0.00    0.01    0.02    …
  0.0
  0.1
  ...
  2.0    .9772

The value within the table gives the probability from $Z = -\infty$ up to the desired Z value.

Finding Normal Probability

To find P(a < X < b) when X is distributed normally:
• Draw the normal curve for the problem in terms of X.
• Translate X-values to Z-values: $Z = \frac{X - \mu}{\sigma}$
• Use the Standardized Normal Table.

Ex) Suppose X is normal with mean 8.0 and standard deviation 5.0. Find P(X < 8.6) and P(X > 8.6).
[Figure: normal curve for X ($\mu = 8$, $\sigma = 5$) with X = 8.6 marked, next to the standard normal curve for Z ($\mu = 0$, $\sigma = 1$) with Z = 0.12 marked]

$$Z = \frac{X - \mu}{\sigma} = \frac{8.6 - 8.0}{5.0} = 0.12$$

P(X < 8.6) = P(Z < 0.12) = 0.5478
P(X > 8.6) = 1.0 - .5478 = .4522

  Z      .00     .01     .02
  0.0    .5000   .5040   .5080
  0.1    .5398   .5438   .5478
  0.2    .5793   .5832   .5871
  0.3    .6179   .6217   .6255

Ex) Suppose X is normal with mean 8.0 and standard deviation 5.0. Find P(8 < X < 8.6).

P(8 < X < 8.6) = P(0 < Z < 0.12) = P(Z < 0.12) - P(Z ≤ 0) = .5478 - .5000 = .0478

(Both values are read from the same table rows shown above.)

[Standard Normal Distribution Table]

Ex) Suppose X is normal with mean 20 and standard deviation 5. Find the following:
1) P(X > 30)
2) P(X < 15)
3) P(10 < X < 25)

2. Sampling Distributions

POPULATION: A population consists of all the items or individuals about which you want to draw a conclusion.
SAMPLE: A sample is the portion of a population selected for analysis.
PARAMETER: A parameter is a numerical measure that describes a characteristic of a population.
STATISTIC: A statistic is a numerical measure that describes a characteristic of a sample.

Measures used to describe the population are called parameters. Measures computed from sample data are called statistics.

• A sampling distribution is a distribution of all of the possible values of a statistic for a given sample size selected from a population.

Ex) Suppose you sample 50 students from your college and record their mean GPA. If you drew many different samples of 50, you would compute a different mean for each sample. We are interested in the distribution of all the mean GPAs we might calculate from any given sample of 50 students.
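The table lookups in the normal-distribution examples of Section 1 can be cross-checked numerically. The following is a minimal sketch using Python's standard-library `statistics.NormalDist`; the variable names are illustrative and not part of the original slides.

```python
from statistics import NormalDist

# X ~ N(mean 8.0, standard deviation 5.0), as in the worked example
x_dist = NormalDist(mu=8.0, sigma=5.0)

# Standardize X = 8.6: Z = (X - mu) / sigma
z = (8.6 - x_dist.mean) / x_dist.stdev

# Cumulative probabilities replace the table lookup
p_below = x_dist.cdf(8.6)                      # P(X < 8.6)
p_above = 1.0 - p_below                        # P(X > 8.6)
p_between = x_dist.cdf(8.6) - x_dist.cdf(8.0)  # P(8 < X < 8.6)

print(f"Z = {z:.2f}")
print(f"P(X < 8.6)     = {p_below:.4f}")
print(f"P(X > 8.6)     = {p_above:.4f}")
print(f"P(8 < X < 8.6) = {p_between:.4f}")
```

Rounded to four decimals, the results should agree with the table values .5478, .4522 and .0478 used above. The same three lines with `mu=20, sigma=5` can be used to check answers to the exercise.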
Probability and Statistics

• Probability (population → sample): We specify the population and study the behavior of samples selected from the population.
• Statistics (sample → population): We make inferences about populations based on the information contained in a sample.

Statistics

Descriptive Statistics: Collecting, summarizing, and describing data
• Collect data, ex. Survey
• Present data, ex. Tables and graphs
• Characterize data, ex. Sample mean $\bar{X} = \frac{\sum X_i}{n}$

Inferential Statistics: Drawing conclusions and/or making decisions concerning a population based only on sample data
• Estimation
  - Point Estimation, ex) Estimate the population mean weight using the sample mean weight
  - Interval Estimation, ex) Confidence Interval
• Hypothesis testing, ex. Test the claim that the population mean weight is 120 pounds

Sampling Distributions: Sample Mean Example

• Suppose your population is "brothers and sisters" in your family.
• Population size N = 4
• Random variable, X, is the age of individuals
• Values of X: 18, 20, 22, 24 (years)

  x         18     20     22     24
  P(X=x)    0.25   0.25   0.25   0.25

[Figure: bar chart of P(x) for x = 18, 20, 22, 24, each bar at height 0.25]

The population has a Uniform distribution with mean = 21 and variance = 5.

Population mean:
$$\mu = \frac{\sum x_i}{N} = \frac{18 + 20 + 22 + 24}{4} = 21$$

Population variance:
$$\sigma^2 = \frac{\sum (x_i - \mu)^2}{N} = 5$$

• Suppose we draw samples of size n = 2 from this population with replacement.
16 possible samples => 16 sample means

  Sample (X1, X2)   Sample Mean     Sample (X1, X2)   Sample Mean
  18, 18            18              22, 18            20
  18, 20            19              22, 20            21
  18, 22            20              22, 22            22
  18, 24            21              22, 24            23
  20, 18            19              24, 18            21
  20, 20            20              24, 20            22
  20, 22            21              24, 22            23
  20, 24            22              24, 24            24

Sample Means Distribution

[Figure: bar chart of $P(\bar{X})$ for $\bar{X}$ = 18, 19, 20, 21, 22, 23, 24] => Not a Uniform Distribution

$$\mu_{\bar{X}} = \frac{\sum \bar{X}_i}{N} = \frac{18 + 19 + 21 + \cdots + 24}{16} = 21$$

$$\sigma_{\bar{X}}^2 = \frac{\sum (\bar{X}_i - \mu_{\bar{X}})^2}{N} = \frac{(18 - 21)^2 + (19 - 21)^2 + \cdots + (24 - 21)^2}{16} = 2.5$$

(Here N = 16 is the number of possible samples.)

                        Population (N = 4)    Sample means (n = 2)
  Mean                  μ = 21                μ_X̄ = 21
  Variance              σ² = 5                σ²_X̄ = 2.5
  Standard deviation    σ = 2.236             σ_X̄ = 1.58

[Figure: bar chart of P(X) over 18, 20, 22, 24 next to bar chart of $P(\bar{X})$ over 18 to 24]

A measure of the variability in the mean from sample to sample is given by the Standard Error of the Mean:

$$\sigma_{\bar{X}} = \frac{\sigma}{\sqrt{n}}$$

Note that the standard error of the mean decreases as the sample size increases.

Sampling Distribution of Normal Populations

• If a population is normal with mean $\mu$ and standard deviation $\sigma$, the sampling distribution of the mean is also normally distributed with
$$\mu_{\bar{X}} = \mu \quad \text{and} \quad \sigma_{\bar{X}} = \frac{\sigma}{\sqrt{n}}$$
• That is, $\bar{X} \sim N(\mu, \sigma^2/n)$. Nice property! => Recall from the previous example that if X ~ Uniform, $\bar{X}$ does not follow a Uniform distribution.
• Z-value for the sampling distribution of the sample mean:
$$Z = \frac{\bar{X} - \mu_{\bar{X}}}{\sigma_{\bar{X}}} = \frac{\bar{X} - \mu}{\sigma / \sqrt{n}} \sim N(0, 1^2)$$
• As n increases, $\sigma_{\bar{X}}$ decreases.

[Figure: sampling distributions of $\bar{X}$ centered at $\mu$; the larger sample size gives the narrower curve, the smaller sample size the wider curve]

Statistics for Managers Using Microsoft Excel, 5e © 2008

Sampling Distributions of Non-Normal Populations

• The Central Limit Theorem states that as the sample size (that is, the number of values in each sample) gets large enough, the sampling distribution of the mean is approximately normally distributed. This is true regardless of the distribution of the individual values in the population.
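The n = 2 example above (ages 18, 20, 22, 24 sampled with replacement) is small enough to verify by brute-force enumeration. The sketch below uses only the Python standard library; the variable names are illustrative assumptions, not part of the slides.

```python
from itertools import product
from math import sqrt
from statistics import mean, pvariance

population = [18, 20, 22, 24]  # ages from the example
n = 2                          # sample size, drawn with replacement

# All 4**2 = 16 ordered samples and their sample means
samples = list(product(population, repeat=n))
sample_means = [mean(s) for s in samples]

mu = mean(population)              # population mean
sigma2 = pvariance(population)     # population variance (divide by N)
mu_xbar = mean(sample_means)       # mean of the 16 sample means
var_xbar = pvariance(sample_means) # variance of the 16 sample means

# Standard error two ways: from the enumeration and from sigma / sqrt(n)
se_enumerated = sqrt(var_xbar)
se_formula = sqrt(sigma2) / sqrt(n)

print(len(samples), mu, sigma2, mu_xbar, var_xbar)
print(round(se_enumerated, 2), round(se_formula, 2))
```

The enumeration should reproduce the values in the comparison above: the sample means are centered on the population mean, and their variance equals $\sigma^2/n$, which is why the standard error formula $\sigma/\sqrt{n}$ holds here.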
• Measures of the sampling distribution:
$$\mu_{\bar{X}} = \mu, \qquad \sigma_{\bar{X}} = \frac{\sigma}{\sqrt{n}}$$

[Figure: a non-normal population distribution, and the sampling distribution of $\bar{X}$, which becomes normal as n increases; the larger sample size gives the narrower curve]

Sampling Distributions

• For most distributions, n > 30 will give a sampling distribution that is nearly normal.
• For fairly symmetric distributions, n > 15 will give a sampling distribution that is nearly normal.
• For normal population distributions, the sampling distribution of the mean is always normally distributed.

Example

• Suppose a population has mean $\mu = 8$ and standard deviation $\sigma = 3$. Suppose a random sample of size n = 36 is selected.
• What is the probability that the sample mean is between 7.75 and 8.25?
• Even if the population is not normally distributed, the central limit theorem can be used (n > 30).
• So, the distribution of the sample mean is approximately normal with

$$\mu_{\bar{X}} = 8, \qquad \sigma_{\bar{X}} = \frac{\sigma}{\sqrt{n}} = \frac{3}{\sqrt{36}} = 0.5$$

$$Z = \frac{7.75 - 8}{3 / \sqrt{36}} = -0.5, \qquad Z = \frac{8.25 - 8}{3 / \sqrt{36}} = 0.5$$

$$P(7.75 \le \bar{X} \le 8.25) = P(-0.5 \le Z \le 0.5) = 2(.5000 - .3085) = 2(.1915) = 0.3830$$

[Figure: population distribution with $\mu = 8$; sampling distribution of $\bar{X}$ with the area between 7.75 and 8.25 shaded; standardized normal distribution with the area between Z = -0.5 and Z = 0.5 shaded]

Sampling Distributions: The Proportion

Let p be the proportion of the population having some characteristic. The sample proportion ($\hat{p}$) provides an estimate of p:

$$\hat{p} = \frac{X}{n} = \frac{\text{number of items in the sample having the characteristic of interest}}{\text{sample size}}$$

X ~ Binomial distribution, Bin(n, p), with $X = \sum Y_i$ and $Y_i$ ~ Bernoulli(p). Thus, the sample proportion $\hat{p}$ can be expressed as a sample mean from a sample with Bernoulli distribution:

$$\hat{p} = \bar{Y}$$

The Central Limit Theorem can therefore be applied, with

$$\sigma_{\hat{p}} = \sqrt{\frac{p(1 - p)}{n}}$$

Z-value for the proportion:

$$Z = \frac{\hat{p} - p}{\sigma_{\hat{p}}} = \frac{\hat{p} - p}{\sqrt{\dfrac{p(1 - p)}{n}}} \sim N(0, 1^2)$$

Example

• If the true proportion of voters who support Proposition A is p = 0.4, what is the probability that a sample of size 200 yields a sample proportion between 0.40 and 0.45? In other words, if p = 0.4 and n = 200, what is $P(0.40 \le \hat{p} \le 0.45)$?
$$\sigma_{\hat{p}} = \sqrt{\frac{p(1 - p)}{n}} = \sqrt{\frac{0.4(1 - 0.4)}{200}} = 0.03464$$

$$P(.40 \le \hat{p} \le .45) = P\left(\frac{0.40 - 0.40}{0.03464} \le Z \le \frac{0.45 - 0.40}{0.03464}\right) = P(0 \le Z \le 1.44) = 0.4251$$

3. Inference - Confidence Interval

• A point estimate is a single number.
Ex) - For the population mean, a point estimate is the sample mean.
    - For the population standard deviation, a point estimate is the sample standard deviation.
• A confidence interval provides additional information about variability.

[Figure: the point estimate lies between the lower confidence limit and the upper confidence limit; the distance between the two limits is the width of the confidence interval]

• A confidence interval (C.I.) is stated in terms of a confidence level. Ex) 95% confidence, 99% confidence
• The confidence level is a percentage expressing how confident we are that the interval will contain the unknown population parameter.
• Ex) Confidence level = 95% ($1 - \alpha = 0.95$)
  This does not mean P(a specific C.I. contains the true parameter) = 0.95. It means that 95% of all the C.I.s constructed in this way will contain the true parameter.
• The general formula for all confidence intervals is:
  Point Estimate ± (Critical Value) (Standard Error)

Confidence Interval for μ with σ Known

Assumptions
– Population standard deviation σ is known
– Population is normally distributed
– If population is not normal, use large sample (CLT)

Confidence interval estimate:

$$\bar{X} \pm Z_{\alpha/2} \frac{\sigma}{\sqrt{n}} \quad \text{or} \quad \left( \bar{X} - Z_{\alpha/2} \frac{\sigma}{\sqrt{n}},\ \bar{X} + Z_{\alpha/2} \frac{\sigma}{\sqrt{n}} \right)$$

(where $Z_{\alpha/2}$ is the standardized normal distribution critical value for a probability of α/2 in each tail)

Critical Value: $Z_{\alpha/2}$

Consider a 95% confidence interval: $1 - \alpha = .95$, so $\alpha/2 = .025$ in each tail.

$$Z_{1-\alpha/2} = -1.96, \qquad Z_{\alpha/2} = 1.96$$

[Figure: standard normal curve with central area .95 between Z = -1.96 and Z = 1.96 and area .025 in each tail]

Commonly used confidence levels are 90%, 95%, and 99%.

Example

A sample of 11 circuits from a normal population has a mean resistance of 2.20 ohms. We know from past testing that the population standard deviation is 0.35 ohms. Determine a 95% confidence interval for the true mean resistance of the population.
$$\bar{X} \pm Z_{\alpha/2} \frac{\sigma}{\sqrt{n}} = 2.20 \pm 1.96 \left( 0.35 / \sqrt{11} \right) = 2.20 \pm .2068 \ \Rightarrow\ (1.9932,\ 2.4068)$$

Confidence Interval for μ with σ Unknown

• If the population standard deviation σ is unknown, we can substitute the sample standard deviation, S.
• This introduces extra uncertainty, since S varies from sample to sample => Use the t distribution instead of the normal distribution.

Assumptions
– Population standard deviation is unknown
– Population is normally distributed
– If population is not normal, use large sample

• Confidence Interval Estimate:

$$\bar{X} \pm t_{\alpha/2,\, n-1} \frac{S}{\sqrt{n}} \quad \text{or} \quad \left( \bar{X} - t_{\alpha/2,\, n-1} \frac{S}{\sqrt{n}},\ \bar{X} + t_{\alpha/2,\, n-1} \frac{S}{\sqrt{n}} \right)$$

(where $t_{\alpha/2,\, n-1}$ is the critical value of the t distribution with n - 1 d.f. and an area of α/2 in each tail)

Student's t Distribution

• t-distributions are symmetric and bell shaped but have heavier (fatter) tails than the normal distribution.
• The t value depends on the degrees of freedom (d.f.).
• As d.f. goes to infinity, the t-distribution -> $N(0, 1^2)$.

[Figure: standard normal curve (t with df = ∞) compared with t (df = 13) and t (df = 5), all centered at 0]

[Table of the t-distribution]

Example

A random sample of n = 25 has sample mean 50 and sample standard deviation S = 8. Form a 95% confidence interval for μ.
– d.f. = n - 1 = 24, so $t_{\alpha/2,\, n-1} = t_{0.025,\, 24} = 2.0639$
– The confidence interval is

$$\bar{X} \pm t_{\alpha/2,\, n-1} \frac{S}{\sqrt{n}} = 50 \pm (2.0639) \frac{8}{\sqrt{25}} \ \Rightarrow\ (46.698,\ 53.302)$$

Ex) The following is the fuel efficiency data (km per liter) of a new car model:

  17.2  16.9  17.6  18.0  17.4  16.3  15.8  17.2  17.3  16.0

Suppose that the km per liter follows a normal distribution. Calculate the 95% confidence interval for the mean km per liter.

Confidence Intervals for the Population Proportion, p

Recall that the distribution of the sample proportion is approximately normal if the sample size is large, with standard deviation

$$\sigma_{\hat{p}} = \sqrt{\frac{p(1 - p)}{n}}$$

Confidence interval:

$$\hat{p} \pm Z_{\alpha/2} \sqrt{\frac{\hat{p}(1 - \hat{p})}{n}}$$

Example

A random sample of 100 people shows that 25 wear glasses. Form a 95% confidence interval for the true proportion of the population who wear glasses.
$$\hat{p} \pm Z_{\alpha/2} \sqrt{\hat{p}(1 - \hat{p})/n} = 25/100 \pm 1.96 \sqrt{0.25(0.75)/100} = 0.25 \pm 1.96\,(0.0433) \ \Rightarrow\ (0.1651,\ 0.3349)$$

Note: We are 95% confident that the true percentage of people wearing glasses in the population is between 16.51% and 33.49%. Although the interval from .1651 to .3349 may or may not contain the true proportion, 95% of intervals formed from samples of size 100 in this manner will contain the true proportion.
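The three worked confidence intervals in this section all follow the general formula Point Estimate ± (Critical Value)(Standard Error), so they can be checked with one small helper. This is a minimal sketch, not part of the slides: the function name `interval` and the variable names are hypothetical, and because the Python standard library has no t-distribution quantile, the t critical value 2.0639 is simply taken from the example above.

```python
from math import sqrt
from statistics import NormalDist


def interval(point_estimate, critical_value, std_error):
    """Return (lower, upper) = point estimate -/+ critical value * standard error."""
    margin = critical_value * std_error
    return point_estimate - margin, point_estimate + margin


# Z critical value for 95% confidence: area alpha/2 = 0.025 in each tail
alpha = 0.05
z_crit = NormalDist().inv_cdf(1 - alpha / 2)   # about 1.96

# Mean with sigma known: 11 circuits, mean 2.20 ohms, sigma = 0.35 ohms
ci_mean_z = interval(2.20, z_crit, 0.35 / sqrt(11))

# Mean with sigma unknown: n = 25, mean 50, S = 8, t(0.025, 24) from the table
t_crit = 2.0639
ci_mean_t = interval(50, t_crit, 8 / sqrt(25))

# Proportion: 25 of 100 people wear glasses
p_hat = 25 / 100
ci_prop = interval(p_hat, z_crit, sqrt(p_hat * (1 - p_hat) / 100))

for name, ci in [("z, mean", ci_mean_z), ("t, mean", ci_mean_t), ("proportion", ci_prop)]:
    print(f"{name}: ({ci[0]:.4f}, {ci[1]:.4f})")
```

The computed limits should agree with the slide values after rounding. Small differences in the last digit can arise because the code uses the unrounded Z critical value rather than 1.96.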