Section 5.1 Normal Distributions
Statistics: Unlocking the Power of Data Lock5

Outline
• Density curves
• Normal distribution
• Finding normal probabilities (technology)
• Finding normal endpoints (technology)
• Standard normal

Some Bootstrap and Randomization Distributions
[Figure: six dot plots of bootstrap and randomization distributions]
• Correlation: Malevolent uniforms (r)
• Slope: Restaurant tips (slope, in thousandths)
• Diff means: Finger taps (Diff)
• Mean: Body temperatures (Nullxbar)
• Proportion: Owners/dogs (phat)
• Mean: Atlanta commutes (xbar)
What do you notice? All bell-shaped distributions

Density Curve
A density curve is a theoretical model to describe a variable's distribution. Think of a density curve as an idealized histogram, where:
(1) The total area under the curve is one.
(2) The proportion of the population in any interval is the area over that interval.

Density for Bootstrap Means for Atlanta Commutes
What proportion are between 30 and 31? Area is about 0.15

Normal Distribution
A normal distribution has a symmetric bell-shaped density curve.

Parameters of a Normal
Two features distinguish one normal density from another:
• The mean is its center of symmetry (μ).
• The standard deviation controls its spread (σ).
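The two density-curve properties above can be checked numerically. The sketch below is only an illustration, not part of the original slides: it assumes the Atlanta bootstrap means follow a normal density with mean 29.11 and standard deviation 0.93 (the approximation these slides use for that distribution), and `area_under` is a hypothetical helper that applies the trapezoid rule.

```python
from statistics import NormalDist


def area_under(density, a, b, steps=10_000):
    """Approximate the area under `density` between a and b (trapezoid rule)."""
    width = (b - a) / steps
    total = 0.5 * (density(a) + density(b))
    for i in range(1, steps):
        total += density(a + i * width)
    return total * width


# Assumed model for the bootstrap means: normal, mean 29.11, std. dev. 0.93.
commute_means = NormalDist(mu=29.11, sigma=0.93)

# Property (1): the total area is one (20 to 40 spans nearly all of the curve).
total_area = area_under(commute_means.pdf, 20, 40)

# Property (2): the proportion between 30 and 31 is the area over that interval.
area_30_31 = area_under(commute_means.pdf, 30, 31)

print(f"total area    = {total_area:.3f}")
print(f"area 30 to 31 = {area_30_31:.3f}")
```

Under this assumed model, the second area comes out close to the value of about 0.15 read off the density plot.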
Notation: X~N(μ,σ)

[Figure: N(μ,σ) density curve, horizontal axis marked at μ−2σ, μ−σ, μ, μ+σ, μ+2σ]

Example: A Population
Verbal SAT ~ N(580, 70)

Example: Bootstrap Distribution
x̄'s for Atlanta commutes ≈ N(29.11, 0.93)
(29.11 is the original sample mean; 0.93 is the bootstrap std. dev., the SE)

Ex: Randomization Distribution
p̂'s for dog/owner matches ≈ N(0.5, 0.10)
(0.5 comes from H0; 0.10 is the randomization std. dev., the SE)

How can we find areas under a normal density?
For X~N(μ,σ), the area between a and b is
\[ \text{Area} = \int_a^b \frac{1}{\sigma\sqrt{2\pi}} \, e^{-\frac{(x-\mu)^2}{2\sigma^2}} \, dx \]
Calculus! We need technology!

StatKey
[Figure: StatKey normal distribution screen with callouts: Pick the tail; Adjust μ, σ; Probability; Endpoint]

Example: Verbal SAT scores
Suppose that verbal SAT scores for applicants at a college follow a normal distribution with mean μ = 580 and std. dev. σ = 70. What proportion of applicants have SAT scores above 700?
About 4.3% of applicants will have verbal SAT scores above 700.

Example: Bootstrap Means
Suppose that the bootstrap distribution of means for samples of 500 Atlanta commute times is N(29.11, 0.93). Find an endpoint (percentile) so that just 5% of the bootstrap means are smaller.
About 5% of the bootstrap means will be less than 27.58 minutes.

Finding Probabilities for N(μ,σ)
Note: All that really matters is the number of standard deviations from the mean.
About what proportion should be within one std. dev. of the mean?
≈68%

Standard Normal
μ = 0, σ = 1: Z~N(0,1)
To convert any X~N(μ,σ) to Z~N(0,1):
Z = (X − μ)/σ (z-score)
"Standardize" the endpoint(s), then use Z~N(0,1)

Ex: Dog/Owner randomization proportions, p̂ ≈ N(0.5, 0.1)
Original p̂ = 0.64
X-area above 0.64 = Z-area above (0.64 − 0.5)/0.1 = 1.40
area = 0.081

Converting Normals
From X~N(μ,σ) to Z~N(0,1): Z = (X − μ)/σ
From Z~N(0,1) to X~N(μ,σ): X = μ + Zσ

Example: Percentile for Verbal SAT
The 25%-tile (Q1) for Z~N(0,1) is -0.674. Find the 25%-tile for the Verbal SAT~N(580,70) distribution.
X = μ + Zσ = 580 + (-0.674)(70) = 533

GOALS
If we can approximate a bootstrap distribution with a normal … construct a confidence interval.
If we can approximate a randomization distribution with a normal … compute a p-value.
If we can find an easy way to estimate SE, we can even do this without generating the distribution!

What is the area below -1.50 in a standard normal distribution?
A. 0.067
B. 0.933
C. 0.241
D. 0.759
E. 0.500

What is the area above 2.20 in a standard normal distribution?
A. 0.003
B. 0.997
C. 0.014
D. 0.986
E. 0.037

What is the area between 0.8 and 1.4 in a standard normal distribution?
A. 0.247
B. 0.185
C. 0.028
D. 0.476
E. 0.131

What is the endpoint z in a standard normal distribution if the area to the right of z is 0.03?
A. 0.247
B. 1.881
C. 0.897
D. 2.158
E. 1.751

What is the endpoint z in a standard normal distribution if the area to the left of z is 0.18?
A. 0.915
B. -1.254
C. 1.762
D. -2.158
E. -0.915

What is the endpoint z in a standard normal distribution if the area between z and -z is 0.60?
A. 0.200
B. 0.842
C. 1.168
D. -2.158
E. 0.400

Summary
• Statistical inference is drawing conclusions about a population based on a sample.
• We use a sample statistic to estimate a population parameter.
• To assess the uncertainty of a statistic, we need to know how much it varies from sample to sample.
• To create a sampling distribution, take many samples of the same size from the population, and compute the statistic for each.
• Standard error is the standard deviation of a statistic.
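The slides use StatKey for normal probabilities and endpoints. As an alternative sketch, the worked examples in this section can be reproduced with Python's standard-library `statistics.NormalDist`. The helper names `area_above` and `endpoint_for_left_area` are hypothetical, introduced here only for illustration. Both go through the standard normal Z~N(0,1), following the conversions Z = (X − μ)/σ and X = μ + Zσ.

```python
from statistics import NormalDist

# Standard normal, Z ~ N(0, 1)
Z = NormalDist(mu=0, sigma=1)


def area_above(x, mu, sigma):
    """Area above x for X ~ N(mu, sigma): standardize, then use Z."""
    z = (x - mu) / sigma
    return 1 - Z.cdf(z)


def endpoint_for_left_area(area, mu, sigma):
    """Endpoint of X ~ N(mu, sigma) with the given area to its left."""
    z = Z.inv_cdf(area)      # endpoint on the standard normal scale
    return mu + z * sigma    # convert back: X = mu + Z * sigma


# Verbal SAT ~ N(580, 70): proportion above 700
sat_above_700 = area_above(700, 580, 70)

# Atlanta bootstrap means ~ N(29.11, 0.93): endpoint with 5% below
commute_5th = endpoint_for_left_area(0.05, 29.11, 0.93)

# Dog/owner randomization proportions ~ N(0.5, 0.1): area above 0.64
dog_z = (0.64 - 0.5) / 0.1
dog_area = area_above(0.64, 0.5, 0.1)

# Verbal SAT first quartile (25th percentile)
sat_q1 = endpoint_for_left_area(0.25, 580, 70)

# Proportion within one standard deviation of the mean
within_one_sd = Z.cdf(1) - Z.cdf(-1)

print(f"SAT above 700:         {sat_above_700:.3f}")
print(f"Commute 5th pct:       {commute_5th:.2f}")
print(f"Dog/owner z and area:  {dog_z:.2f}, {dog_area:.3f}")
print(f"SAT Q1:                {sat_q1:.0f}")
print(f"Within one std. dev.:  {within_one_sd:.2f}")
```

The same two helpers can be pointed at the multiple-choice questions above, for example `area_above(2.20, 0, 1)` for the area above 2.20 in a standard normal distribution.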