Survey
* Your assessment is very important for improving the workof artificial intelligence, which forms the content of this project
* Your assessment is very important for improving the workof artificial intelligence, which forms the content of this project
Chapter 5 Introduction to inferential Statistics: Sampling and the Sampling Distribution Copyright © 2012 by Nelson Education Limited. 5-1 In this presentation you will learn about: • The basic logic and terminology of inferential statistics • Random sampling • The sampling distribution Copyright © 2012 by Nelson Education Limited. 5-2 Basic Logic and Terminology of Inferential Statistics – Population and Sample – Parameter and Statistic – Representative and EPSEM Copyright © 2012 by Nelson Education Limited. 5-3 Population and Sample • Problem: The populations we wish to study are almost always so large that we are unable to gather information from every case. Copyright © 2012 by Nelson Education Limited. 5-4 Population and Sample (continued) • Solution: We choose a sample -a carefully chosen subset of the population – and use information gathered from the cases in the sample to generalize to the population. Copyright © 2012 by Nelson Education Limited. 5-5 Parameter and Statistic • Statistics are mathematical characteristics of samples. • Parameters are mathematical characteristics of populations. • Statistics are used to estimate parameters. PARAMETER STATISTIC Copyright © 2012 by Nelson Education Limited. 5-6 Representative and EPSEM • Sample must be representative of the population. – Representative: The sample has the same characteristics as the population. • How can we ensure a sample is representative? – Samples drawn according to the rule of EPSEM (equal probability of selection method), in which every case in the population has the same chance of being selected for the sample are likely to be representative. Copyright © 2012 by Nelson Education Limited. 5-7 Random Sampling Techniques • The most basic EPSEM sampling technique is simple random sampling (SRS). • There are variations of this technique, such as sstratified random sampling and cluster sampling. – The textbook, however, considers only simple random sampling since it is the most straightforward application of EPSEM. Copyright © 2012 by Nelson Education Limited. 5-8 Simple Random Sampling (SRS) • To draw a simple random sample, we need: – A list of the population. – A method for selecting cases from the population so each case has an equal probability of being selected (EPSEM). Copyright © 2012 by Nelson Education Limited. 5-9 Simple Random Sampling (SRS): An Example • You want to know what % of students at a large university work during the semester. • Draw a sample of 500 from a list of all students at the university (N = 20,000). • Assume the list is available from the Registrar. Copyright © 2012 by Nelson Education Limited. 5-10 Simple Random Sampling (SRS): An Example (continued) • How can you draw names so every student has the same chance of being selected? – Each student has a unique 6 digit Student ID Number that ranges from 000000 to 999999. • Use a Table of Random Numbers (or a computer program) to select 500 ID numbers with 6 digits each. • Each time a randomly selected 6-digit number matches the ID of a student, that student is selected for the sample. Copyright © 2012 by Nelson Education Limited. 5-11 Simple Random Sampling (SRS): An Example (continued) • Below is an excerpt from a Table of Random Numbers: • To select a random sample from a numbered population list, begin anywhere in the Table and select groups of digits the same length as the largest number on your population list. – Since the population list numbers in our example are in the 100,000s (i.e. Student ID Numbers range from 000000 to 999999), we select digits in groups of six. • Every time a number selected from the Table corresponds to the identification number of a case on the population list, that case is selected for the sample. • Continue the process until the desired number of cases (500 in our example) is selected for the sample. – Disregard duplicate numbers. – Ignore cases in which the randomly selected number does not match an identification number on the population list. • e.g. ignore cases in which no student ID matches the randomly selected number. Copyright © 2012 by Nelson Education Limited. 5-12 Simple Random Sampling (SRS): An Example (continued) • For example, if we decide to begin from the left of the top row of the Table, and work across the row, we take the first set of six digits (i.e., 104801, as boxed in the table below). – If this number (104801) corresponds to the identification number of a case on the population list (i.e., a Student's ID Number), that case is selected for the sample. • Next, we take the second set of six digits (501101). • If this number corresponds to a Student's ID Number, that case is selected for the sample. • We continue this process until 500 names are selected for the sample. Copyright © 2012 by Nelson Education Limited. 5-13 Simple Random Sampling (SRS): An Example (continued) • Part of the list of 500 selected students might look like this: ID # 501101 691791 255958 Student Andrea Oja Sasha Albright Keisha Evans Copyright © 2012 by Nelson Education Limited. 5-14 Simple Random Sampling (SRS): An Example (continued) • After questioning each of these 500 students, you find that 368 (74%) work during the semester. Copyright © 2012 by Nelson Education Limited. 5-15 Applying Logic and Terminology of Inferential Statistics • In the research situation above, identify each the following: – Population – Sample – Statistic – Parameter Copyright © 2012 by Nelson Education Limited. 5-16 Applying Logic and Terminology of Inferential Statistics (continued) • Population = All 20,000 students. • Sample = The 500 students selected and interviewed. • Statistic =74% (% of sample that held a job during the semester). • Parameter = % of all students in the population who held a job. Copyright © 2012 by Nelson Education Limited. 5-17 The Sampling Distribution • The single most important concept in inferential statistics. • Definition: The theoretical, probabilistic distribution of a statistic for all possible samples of a given size (n). • The sampling distribution is a theoretical concept. Copyright © 2012 by Nelson Education Limited. 5-18 The Sampling Distribution (continued) • Every application of inferential statistics involves 3 different distributions: 1. Population: empirical; unknown 2. Sampling Distribution: theoretical; known 3. Sample: empirical; known • Information from the sample is linked to the population via the sampling distribution. Population Sampling Distribution Copyright © 2012 by Nelson Education Limited. Sample 5-19 The Sampling Distribution: Properties 1. Normal in shape. 2. Has a mean equal to the population mean. 3. Has a standard deviation (called the standard error) equal to the population standard deviation, σ, divided by the square root of n. Copyright © 2012 by Nelson Education Limited. 5-20 The Sampling Distribution: First Theorem • Tells us the shape of the sampling distribution and defines its mean and standard deviation. – If repeated random samples of size n are drawn from a normal population with mean µ and standard deviation σ, then the sampling distribution of sample means will be normal with a mean µ and a standard deviation (standard error) of σ/n Copyright © 2012 by Nelson Education Limited. 5-21 The Sampling Distribution: First Theorem (continued) • Again, if we begin with a trait that is normally distributed across a population (height, weight) and take an infinite number of equally sized random samples from that population, the sampling distribution of sample means will be normal. Copyright © 2012 by Nelson Education Limited. 5-22 The Sampling Distribution: Second (Central Limit) Theorem • If repeated random samples of size n are drawn from any (e.g., non-normal) shaped population with mean µ and standard deviation σ, then as n becomes large the sampling distribution of sample means will approach normality, with a mean µ and standard deviation of σ/n – Again, for any trait or variable, even those that are not normally distributed in the population (income, wealth) as sample size grows larger, the sampling distribution of sample means will become normal in shape. Copyright © 2012 by Nelson Education Limited. 5-23 The Sampling Distribution: Second (Central Limit) Theorem (continued) • The importance of the Central Limit Theorem is that it removes the constraint of normality in the population. Copyright © 2012 by Nelson Education Limited. 5-24 The Sampling Distribution: Theorem Proofs Copyright © 2012 by Nelson Education Limited. 5-25 The Sampling Distribution: Theorem Proofs (continued) Copyright © 2012 by Nelson Education Limited. 5-26 The Sampling Distribution: Theorem Proofs (continued) Copyright © 2012 by Nelson Education Limited. 5-27 The Sampling Distribution: Theorem Proofs (continued) Proof: population mean, 5, equals mean of the sampling distribution, 5, and population standard deviation divided by the square root of n, 2.236/2, is equal to standard deviation of the sampling distribution, 1.581. Copyright © 2012 by Nelson Education Limited. 5-28 Sampling Distribution: Key Points 1. Definition: Distribution of a statistic for all possible outcomes of a certain size. • The statistic for example can be a mean or a proportion. 2. Shape: Normal • Since the sampling distribution is normal, we can use Appendix A to find areas. 3. Central tendency and dispersion: Mean is same as the population mean; standard deviation of sampling distribution – the standard error – is equal to the population standard deviation divided by the square root of n. Copyright © 2012 by Nelson Education Limited. 5-29 Sampling Distribution: Key Points (continued) • Role of sampling distribution in inferential statistics: Links the sample with the population: Population Sampling Distribution Sample Copyright © 2012 by Nelson Education Limited. 5-30 Three Distributions and their Symbols Copyright © 2012 by Nelson Education Limited. 5-31