Lesson 9 - 3
Introduction to the Practice of Statistics

Objectives
• Define statistics and statistical thinking
• Understand the process of statistics
• Distinguish between qualitative and quantitative variables
• Distinguish between discrete and continuous variables

Vocabulary
• Central Limit Theorem: as the sample size increases, the sampling distribution of the sample mean from any underlying distribution gets closer to a Normal distribution
• Standard error of the mean: the standard deviation of the sampling distribution of x̄

Sample Mean, x̄
The behavior of x̄ in repeated sampling is much like that of the sample proportion, p-hat.
• Sample mean x̄ is an unbiased estimator of the population mean μ
• Spread is less than that of X: the standard deviation of x̄ is smaller than that of X by a factor of 1/√n

Sample Spread of x̄
If the random variable X has a normal distribution with a mean of 20 and a standard deviation of 12:
– If we choose samples of size n = 4, then the sample mean will have a normal distribution with a mean of 20 and a standard deviation of 12/√4 = 6
– If we choose samples of size n = 9, then the sample mean will have a normal distribution with a mean of 20 and a standard deviation of 12/√9 = 4

Example 1
The height of all 3-year-old females is approximately normally distributed with μ = 38.72 inches and σ = 3.17 inches. Compute the probability that a simple random sample of size n = 10 results in a sample mean greater than 40 inches.
P(x̄ > 40)
μ = 38.72, σ = 3.17, n = 10
σx̄ = σ/√n = 3.17/√10 = 1.00244
Z = (x̄ - μ)/σx̄ = (40 - 38.72)/1.00244 = 1.28/1.00244 = 1.277
normalcdf(1.277, E99) = 0.1008
normalcdf(40, E99, 38.72, 1.002) = 0.1007

Example 2
We’ve been told that the average weight of giraffes is 2400 pounds with a standard deviation of 300 pounds. We’ve measured 50 giraffes and found that the sample mean was 2600 pounds. Is our data consistent with what we’ve been told?
P(x̄ > 2600)
μ = 2400, σ = 300, n = 50
σx̄ = σ/√n = 300/√50 = 42.4264
Z = (x̄ - μ)/σx̄ = (2600 - 2400)/42.4264 = 200/42.4264 = 4.714
normalcdf(4.714, E99) ≈ 0.0000012
normalcdf(2600, E99, 2400, 42.4264) ≈ 0.0000012
A sample mean this far above 2400 would almost never occur if the stated values were correct, so the data are not consistent with what we’ve been told.

Example 3
Young women’s height is distributed as N(64.5, 2.5). What is the probability that a randomly selected young woman is taller than 66.5 inches?
P(x > 66.5)
μ = 64.5, σ = 2.5, n = 1
σx̄ = σ/√n = 2.5/√1 = 2.5 (with n = 1, the standard error is just σ)
Z = (x - μ)/σx̄ = (66.5 - 64.5)/2.5 = 2/2.5 = 0.80
normalcdf(0.80, E99) = 1 - 0.7881 = 0.2119
normalcdf(66.5, E99, 64.5, 2.5) = 0.2119

Example 4
Young women’s height is distributed as N(64.5, 2.5). What is the probability that the mean height of an SRS of 10 young women is greater than 66.5 inches?
P(x̄ > 66.5)
μ = 64.5, σ = 2.5, n = 10
σx̄ = σ/√n = 2.5/√10 = 0.79
Z = (x̄ - μ)/σx̄ = (66.5 - 64.5)/0.79 = 2/0.79 = 2.53
normalcdf(2.53, E99) = 1 - 0.9943 = 0.0057
normalcdf(66.5, E99, 64.5, 2.5/√10) = 0.0057

Central Limit Theorem
Regardless of the shape of the population, the sampling distribution of x̄ becomes approximately normal as the sample size n increases.
Caution: this applies only to the shape, not to the mean or standard deviation.
[Figure: population distribution of X, random samples drawn from the population, and the resulting distribution of x̄]

Central Limit Theorem in Action
[Figure: sampling distributions of x̄ for n = 1, n = 2, n = 10, and n = 25]

Example 5
The time a technician requires to perform preventive maintenance on an air conditioning unit is governed by the exponential distribution (similar to curve a from the “in Action” slide). The mean time is μ = 1 hour and σ = 1 hour. Your company has a contract to maintain 70 of these units in an apartment building. In budgeting your technician’s time, should you allow an average of 1.1 hours or 1.25 hours for each unit?
P(x̄ > 1.1) vs P(x̄ > 1.25)
μ = 1, σ = 1, n = 70
σx̄ = σ/√n = 1/√70 = 0.120
Z = (x̄ - μ)/σx̄ = (1.1 - 1)/0.12 = 0.1/0.12 = 0.83
normalcdf(0.83, E99) = 1 - 0.7967 = 0.2033

Example 5 cont
P(x̄ > 1.25)
μ = 1, σ = 1, n = 70
σx̄ = σ/√n = 1/√70 = 0.120
Z = (x̄ - μ)/σx̄ = (1.25 - 1)/0.12 = 0.25/0.12 = 2.083
normalcdf(2.083, E99) = 1 - 0.9818 = 0.0182

Summary of Distribution of x̄
• Population is normal with mean μ and standard deviation σ
– Shape: regardless of sample size n, the distribution of x̄ is normal
– Center: μx̄ = μ
– Spread: σx̄ = σ/√n
• Population is not normal, with mean μ and standard deviation σ
– Shape: as sample size n increases, the distribution of x̄ becomes approximately normal
– Center: μx̄ = μ
– Spread: σx̄ = σ/√n

Summary and Homework
• Summary
– Take an SRS and use the sample mean x̄ to estimate the unknown parameter μ
– x̄ is an unbiased estimator of μ
– An increase in sample size decreases the standard deviation of x̄ (by a factor of 1/√n)
– If the population is normal, then so is x̄
– Central Limit Theorem: for large n, the sampling distribution of x̄ is approximately normal for any population (with a finite σ)
• Homework
– Day 1: pg 595-6; 9.31-4
– Day 2: pg 601-4; 9.35, 36, 38, 42-44
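
Checking the examples in code
The calculator steps in Examples 1 through 5 can also be sketched in Python. This is a minimal sketch, not part of the original lesson: the function names sample_mean_tail and simulated_tail are made up for illustration, and NormalDist from Python's standard statistics module stands in for the calculator's normalcdf. The simulation assumes the Example 5 maintenance times are exponential with mean 1 hour and compares the Central Limit Theorem approximation with a direct Monte Carlo estimate.

```python
import math
import random
from statistics import NormalDist, fmean


def sample_mean_tail(mu, sigma, n, threshold):
    """Approximate P(x-bar > threshold), treating x-bar as N(mu, sigma / sqrt(n))."""
    standard_error = sigma / math.sqrt(n)  # standard deviation of x-bar
    return 1.0 - NormalDist(mu, standard_error).cdf(threshold)


def simulated_tail(n, threshold, trials=5000, seed=0):
    """Monte Carlo estimate of P(x-bar > threshold) for exponential times with mean 1.

    Assumption for illustration: each maintenance time is an independent
    exponential draw with rate 1 (so mu = 1 and sigma = 1, as in Example 5).
    """
    rng = random.Random(seed)  # seeded so repeated runs give the same estimate
    hits = 0
    for _ in range(trials):
        xbar = fmean(rng.expovariate(1.0) for _ in range(n))
        if xbar > threshold:
            hits += 1
    return hits / trials


if __name__ == "__main__":
    # (label, mu, sigma, n, threshold) taken from the worked examples
    cases = [
        ("Example 1", 38.72, 3.17, 10, 40),
        ("Example 2", 2400, 300, 50, 2600),
        ("Example 3", 64.5, 2.5, 1, 66.5),
        ("Example 4", 64.5, 2.5, 10, 66.5),
        ("Example 5 (1.1 h)", 1, 1, 70, 1.1),
        ("Example 5 (1.25 h)", 1, 1, 70, 1.25),
    ]
    for label, mu, sigma, n, threshold in cases:
        print(f"{label}: {sample_mean_tail(mu, sigma, n, threshold):.7f}")

    # Compare the normal approximation with simulation for Example 5
    for threshold in (1.1, 1.25):
        print(f"simulated P(x-bar > {threshold}): {simulated_tail(70, threshold):.4f}")
```

Because the code does not round σx̄ or Z along the way, its results can differ slightly in the last digits from the hand calculations. The simulated values for Example 5 will also differ a little from the normal approximation, since with n = 70 the distribution of x̄ for an exponential population is only approximately normal.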