Download Sampling Distributions

Sampling Distributions Welcome to inference!!!! Chapter 9.1/9.3 Parameter  A Parameter is a number that describes the population.   A parameter always exists but in practice we rarely know it’s value b/c of the difficulty in creating a census. We use Greek letters to describe them (like μ or σ). If we are talking about a proportion parameter, we use rho (ρ) Ex: If we wanted to compare the IQ’s of all American and Asian males it would be impossible, but it’s important to realize that μAmericans and μmales exist.  Ex: If we were interested in whether there is a greater percentage of women who eat broccoli than men, we want to know whether ρwomen > ρmen  Statistic    A statistic is a number that describes a sample. The value of a statistic can always be found when we take a sample . It’s important to realize that a statistic can change from sample to sample. Statistics use variables like 𝑥 , s, and 𝑝 (non greek). We often use statistics to estimate an unknown parameter.    Ex: I take a random sample of 500 American males and find their IQ’s. We find that 𝑥 = 103.2. I take a random sample of 200 women and find that 40 like broccoli. Then 𝑝w = .2 IMPORTANT! A POPULATION NEEDS TO BE AT LEAST 10 TIMES AS BIG AS A SAMPLE TAKEN FROM IT. IF NOT, YOU NEED A SMALLER SAMPLE Bias  We say something is biased if it is a poor predictor Variability *Variability of population doesn’t change- (scoop example) size of scoop matters How can we use samples to find parameters if they give us different results?      Imagine an archer shooting many arrows at a target: 4 situations can occur a) High bias, low variability b) Low bias, low variability c) High bias, high variability d) Low bias, low variability--IDEAL Here’s an example:    Suppose our goal was to estimate μamerican male IQ. We can’t take a census, so we take many samples. We find the average IQ of american males in each sample (𝑥). Describe a situation the matches each of the 4 possibilities described.  a) if our many samples of IQ’s are consistent but higher than the true average IQ of AM’s, then we have a situation with high bias and low variability  b) If our many samples are inconsistent- some high, some low than the true mean of AM’s Iqs, we have low bias and high variability.  If our sample means are not close to each other but all higher than μAM then we have high bias and high variability (c)  Finally, if our samples all just slightly higher or slightly lower than μAM, we have our desired situation: low bias and low variability. But…  You aren’t taking a bunch of samples…you’re only going to take 1! and we want it to predict μam  If we used the data from situation d) then any of the samples would provide a good predictor for μam  We already know some ways to get a good sample- using an SRS and being very sure to have no bias when choosing our sample.  Inference is using our sample statistic (assuming it’s a good sample) to predict our parameter with a certain degree of confidence. The Sampling Distribution  The sampling distribution of a statistic is the distribution of means of all possible samples of the same size from the population.  When we sample, we sample with replacement.  A sampling distribution is a sample space - it describes everything that can happen when we sample. Cool demo Using Sampling Distributions  If the population is Normal, then the sampling distribution will also be Normal.  WE CAN ONLY DO CALCULATIONS BASED ON NORMAL DISTRIBUTIONS!  The Sampling Distribution of means has a mean of σ μ and a standard deviation of 𝑛  N(μ,   σ 𝑛 ) THIS IS ONLY TRUE IF (a) the population is Normal …or (b) what we will look at next class Example  The true average study time for a final exam in history is found to be 6 hours and 25 minutes with a standard deviation of 1 hour and 45 minutes. Assume the distribution is normal. N(6.417, 1.75)    What is the probability that a student chosen at random spends more than 7 hours studying? Normalcdf(7,100,6.417,1.75) = 37% What is the probability that an SRS of 4 students will average more than 7 hours in studying? Normalcdf(7,100,6.417,1.75/√4) = 25.3%. Why did the probability go down? A student to study more than 7 hours is not probable…a group of 4 to average more than 7 is less probable. Sampling Distributions  At the sample size increases, the standard deviation of the sampling distribution decreases (the variability decreases) Homework:  Study Guide 9.3  Exercises 9.11-9.13, 9.17, 9.31-9.34  (hint for 9.33…use algebra!)  Next Class: Bring 25 pennies/pair of students.

Top subcategories

Top subcategories

Top subcategories

Top subcategories

Top subcategories

Top subcategories

Top subcategories

Top subcategories

Download Sampling Distributions