Statistics for the Social Sciences
Psychology 340, Spring 2005
Sampling distribution

Outline
• Review 138 stuff:
  – What are sample distributions
  – Central limit theorem
  – Standard error (and estimates of)
  – Test statistic distributions as transformations

Flipping a coin example
• n = 3 flips gives $2^n = 2^3 = 8$ total outcomes

Outcome   Number of heads
HHH       3
HHT       2
HTH       2
HTT       1
THH       2
THT       1
TTH       1
TTT       0

• Distribution of possible outcomes (n = 3 flips)

X (number of heads)   f   p
3                     1   .125
2                     3   .375
1                     3   .375
0                     1   .125

[Figure: probability of each number of heads, 0 to 3, for n = 3 flips]

Hypothesis testing
• Distribution of possible outcomes (of a particular sample size, n): we can make predictions about the likelihood of outcomes based on this distribution.
• In hypothesis testing, we compare our observed samples with the distribution of possible samples (transformed into standardized distributions)
• This distribution of possible outcomes is often Normally distributed

Distribution of sample means
• Comparison distributions considered so far were distributions of individual scores
• Mean of a group of scores
  – Comparison distribution is distribution of means

Distribution of sample means
• A simple case
  – Population: 2 4 6 8
  – All possible samples of size n = 2 (assumption: sampling with replacement)
  – There are 16 of them

Sample   Mean
2, 2     2
2, 4     3
2, 6     4
2, 8     5
4, 2     3
4, 4     4
4, 6     5
4, 8     6
6, 2     4
6, 4     5
6, 6     6
6, 8     7
8, 2     5
8, 4     6
8, 6     7
8, 8     8

• In the long run, the random selection of tiles leads to a predictable pattern

[Figure: frequency of each sample mean, 2 to 8, across the 16 samples]
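Both enumerations above (three coin flips, and samples of size n = 2 from the population 2, 4, 6, 8) can be reproduced by listing every equally likely outcome and counting. The following is a minimal sketch of that idea; the helper names `outcome_distribution`, `count_heads`, and `sample_mean` are made up for illustration and are not from the slides.

```python
from collections import Counter
from fractions import Fraction
from itertools import product


def outcome_distribution(values, n, statistic):
    """Enumerate every ordered sample of size n (with replacement) and
    return {statistic value: probability} as exact fractions."""
    samples = list(product(values, repeat=n))
    counts = Counter(statistic(sample) for sample in samples)
    total = len(samples)
    return {value: Fraction(count, total) for value, count in sorted(counts.items())}


def count_heads(flips):
    """Number of heads in one sequence of flips."""
    return sum(1 for flip in flips if flip == "H")


def sample_mean(scores):
    """Exact mean of one sample."""
    return Fraction(sum(scores), len(scores))


# Three coin flips: 2**3 = 8 equally likely outcomes.
heads_dist = outcome_distribution("HT", 3, count_heads)

# Population 2, 4, 6, 8 with samples of size n = 2: 4**2 = 16 samples.
means_dist = outcome_distribution((2, 4, 6, 8), 2, sample_mean)

for label, dist in (("heads", heads_dist), ("mean", means_dist)):
    for value, prob in dist.items():
        print(f"{label} = {value}: p = {float(prob):.4f}")
```

Using exact fractions keeps the probabilities identical to the hand-counted tables instead of introducing rounding.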
Distribution of sample means
• Sample problem:
  – What’s the probability of getting a sample with a mean of 6 or more?

$\bar{X}$   f   p
8           1   0.0625
7           2   0.1250
6           3   0.1875
5           4   0.2500
4           3   0.1875
3           2   0.1250
2           1   0.0625

$P(\bar{X} \geq 6) = .1875 + .1250 + .0625 = 0.375$

• Same as before, except now we’re asking about sample means rather than single scores

Distribution of sample means
• Distribution of sample means is a “virtual” distribution between the sample and population

[Figure: Population, Distribution of sample means, Sample]

Properties of the distribution of sample means
• Shape
  – If the population is Normal, then the dist of sample means will be Normal
  – If the sample size is large (n > 30), the dist of sample means will be approximately Normal regardless of the shape of the population
• Center
  – The mean of the dist of sample means is equal to the mean of the population (same numeric value, different conceptual values)
  – Consider our earlier example
    Population: 2 4 6 8
    $\mu = \frac{2+4+6+8}{4} = 5$
    Distribution of sample means:
    $\mu_{\bar{X}} = \frac{2+3+4+5+3+4+5+6+4+5+6+7+5+6+7+8}{16} = 5$
• Spread
  – The standard deviation of the distribution of sample means depends on two things: the standard deviation of the population and the sample size
  – Standard deviation of the population: the smaller the population variability, the closer the sample means are to the population mean
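The center and spread properties, and the sample problem, can be checked exactly on the 2, 4, 6, 8 population by enumerating every sample of a given size. This is only a sketch under the slides' assumption of sampling with replacement; the names `all_sample_means` and `mean_and_sd` are hypothetical helpers, and the sample sizes 1, 2, and 3 are chosen just to keep the enumeration small.

```python
from fractions import Fraction
from itertools import product
from math import sqrt

POPULATION = (2, 4, 6, 8)


def all_sample_means(population, n):
    """Means of every ordered sample of size n drawn with replacement."""
    return [Fraction(sum(s), n) for s in product(population, repeat=n)]


def mean_and_sd(values):
    """Mean and standard deviation of a complete list of values
    (divides by the number of values, not by one less)."""
    mean = sum(values) / len(values)
    variance = sum((v - mean) ** 2 for v in values) / len(values)
    return mean, sqrt(variance)


pop_mean, pop_sd = mean_and_sd([Fraction(x) for x in POPULATION])
print(f"population: mean = {float(pop_mean)}, sd = {pop_sd:.3f}")

# Center stays at the population mean; spread shrinks as n grows.
for n in (1, 2, 3):
    center, spread = mean_and_sd(all_sample_means(POPULATION, n))
    print(f"n = {n}: mean of sample means = {float(center)}, sd = {spread:.3f}")

# Sample problem: probability that a sample of size 2 has a mean of 6 or more.
means_n2 = all_sample_means(POPULATION, 2)
p_six_or_more = Fraction(sum(1 for m in means_n2 if m >= 6), len(means_n2))
print(f"P(mean >= 6) = {float(p_six_or_more)}")
```

Exact enumeration is only practical for tiny populations and sample sizes; for anything larger a random simulation would be the usual substitute.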
Properties of the distribution of sample means
• Spread
  – Sample size: the larger the sample size, the smaller the spread

[Figure: distributions of sample means for n = 1, n = 10, and n = 100]

  – Putting the standard deviation of the population and the sample size together, we get the standard deviation of the distribution of sample means: $\sigma_{\bar{X}} = \frac{\sigma}{\sqrt{n}}$
  – Commonly called the standard error

Standard error
• The standard error is the average amount that you’d expect a sample mean (for samples of size n) to deviate from the population mean
  – In other words, it is an estimate of the error that you’d expect by chance (or by sampling)

Distribution of sample means
• Keep your distributions straight by taking care with your notation
  – Population: $\mu$, $\sigma$
  – Distribution of sample means: $\mu_{\bar{X}}$, $\sigma_{\bar{X}}$
  – Sample: $\bar{X}$, $s$

Properties of the distribution of sample means
• All three of these properties are combined to form the Central Limit Theorem
  – For any population with mean $\mu$ and standard deviation $\sigma$, the distribution of sample means for sample size n will approach a normal distribution with a mean of $\mu$ and a standard deviation of $\frac{\sigma}{\sqrt{n}}$ as n approaches infinity (good approximation if n > 30).

Performing your statistical test
• What are we doing when we test the hypotheses?
  – Computing a test statistic: generic test
    test statistic = observed difference / difference expected by chance
  – Observed difference: could be the difference between a sample and a population, or between different samples
  – Difference expected by chance: based on the standard error or an estimate of the standard error

Hypothesis Testing With a Distribution of Means
• It is the comparison distribution when a sample has more than one individual
• Find a Z score of your sample’s mean on a distribution of means
  $Z = \frac{\bar{X} - \mu_{\bar{X}}}{\sigma_{\bar{X}}}$

“Generic” statistical test
An example: One sample z-test
Memory example experiment:
• We give n = 16 memory patients a memory improvement treatment.
• After the treatment they have an average score of $\bar{X} = 55$ memory errors.
• How do they compare to the general population of memory patients, whose memory errors follow a Normal distribution with $\mu = 60$, $\sigma = 8$?

• Step 1: State your hypotheses
  – H0: The memory of the treatment sample is the same as (or worse than) that of the population of memory patients: $\mu_{Treatment} \geq \mu_{pop}$, where $\mu_{pop} = 60$
  – HA: Their memory is better than that of the population of memory patients: $\mu_{Treatment} < \mu_{pop}$, where $\mu_{pop} = 60$
• Step 2: Set your decision criteria
  – $\alpha = 0.05$, one-tailed
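The decision criterion ($\alpha = 0.05$, one-tailed) corresponds to a cutoff z score on the standard normal comparison distribution. The sketch below computes that cutoff with Python's standard-library `statistics.NormalDist`; the slides do not state the cutoff numerically here, and the names `critical_z` and `in_rejection_region` are illustrative assumptions.

```python
from statistics import NormalDist

ALPHA = 0.05  # decision criterion for the one-tailed test

# Standard normal comparison distribution (mean 0, sd 1).
standard_normal = NormalDist(mu=0.0, sigma=1.0)

# HA predicts fewer errors, so the rejection region is the lower tail:
# the z value that cuts off the lowest ALPHA of the distribution.
critical_z = standard_normal.inv_cdf(ALPHA)


def in_rejection_region(z, critical=critical_z):
    """True when an observed z score lies in the lower-tail rejection region."""
    return z < critical


print(f"critical z (alpha = {ALPHA}, one-tailed, lower tail) = {critical_z:.3f}")
```

Any observed z below this cutoff would lead to rejecting H0 under the stated criterion.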
An example: One sample z-test
• Step 3: Collect your data
  – n = 16, $\bar{X} = 55$ memory errors
• Step 4: Compute your test statistic
  $z_{\bar{X}} = \frac{\bar{X} - \mu_{\bar{X}}}{\sigma_{\bar{X}}} = \frac{55 - 60}{8 / \sqrt{16}} = \frac{-5}{2} = -2.5$
• Step 5: Make a decision about your null hypothesis
  – $z_{\bar{X}} = -2.5$ falls in the lowest 5% of the comparison distribution (below the one-tailed cutoff of about $-1.645$), which is the rejection region for $\alpha = 0.05$
  – Reject H0
  – Support for our HA: the evidence suggests that the treatment decreases the number of memory errors
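The five steps of the memory example can be collected into one small function. This is a sketch rather than the course's own procedure: the function name `one_sample_z_test` is hypothetical, and it makes the decision by comparing a lower-tail p-value with $\alpha$, which is equivalent to checking whether z falls in the lower-tail rejection region.

```python
from math import sqrt
from statistics import NormalDist


def one_sample_z_test(sample_mean, pop_mean, pop_sd, n, alpha=0.05):
    """Lower-tailed one-sample z-test (HA: treatment mean < population mean)."""
    # Standard error: sd of the distribution of sample means.
    standard_error = pop_sd / sqrt(n)
    # Test statistic: observed difference / difference expected by chance.
    z = (sample_mean - pop_mean) / standard_error
    # Area of the standard normal distribution below the observed z.
    p_value = NormalDist().cdf(z)
    return {
        "standard_error": standard_error,
        "z": z,
        "p_value": p_value,
        "reject_h0": p_value < alpha,
    }


# Memory example: n = 16 patients, sample mean 55, population mean 60, sd 8.
result = one_sample_z_test(sample_mean=55, pop_mean=60, pop_sd=8, n=16)
print(result)
```

For the memory example this gives a standard error of 2 and z = -2.5, matching the hand calculation, and the function reports that H0 is rejected.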