SAMPLING AND SAMPLE DISTRIBUTIONS

To find out about a population we can:
1. sample the entire population, or
2. select a random sample
   - a sample is quicker and cheaper

statistical inference - use a sample to draw conclusions about a population

POINT ESTIMATE
- calculate a statistic from a sample
- use it as an estimate of the population parameter:
    $\bar{x}$ (sample mean) used to estimate $\mu$ (population mean)
    $s$ (sample s.d.) used to estimate $\sigma$ (population s.d.)
    $\bar{p}$ (sample proportion) used to estimate $p$ (pop. proportion)
- point estimate as opposed to an interval estimate (e.g. an estimate that the population parameter is between two numbers)

SIMPLE RANDOM SAMPLING
simple random sampling
- probability of any combination of data points being selected is equal
- selection is independent

Finite Populations
i.e. 1, 2, 3, 4, 5
- if a sample is drawn with n = 2, then there are
    $\frac{N!}{n!\,(N-n)!} = \frac{5!}{2!\,3!} = 10$
  different samples.
- if an item that has been drawn cannot be drawn again (no replacement), then the number of items left to choose from decreases and the probability of choosing any remaining item is greater
- therefore, to be random (and therefore independent), each item drawn must be replaced before the next draw.

Infinite Populations (or very large populations)
- little or no change in probabilities even if there is no replacement

THE SAMPLING DISTRIBUTION OF MEANS
- when we select a sample from a population:
    not all the values are the same, and
    not all the values are equal to the sample mean
- the sample values follow some distribution that has variation (and a mean)
- when we take the means of several samples from the same pop.:
    not all the sample means are the same, and
    not all the sample means are equal to the pop. mean
- the sample means follow some distribution that has variation (and a mean)

Eg: A group of 4 students' weekly study times are:

Student   Hrs/wk
A         4
B         6
C         8
D         10

To create a sampling distribution of the means: list all possible samples (let's make the sample size n = 2), the mean of each sample, and the probability of each mean.
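The counting and listing steps described above can be checked with a short Python sketch. It assumes samples are drawn without replacement and that order does not matter, which is what the combinations formula counts; the variable names are illustrative only and are not part of the notes.

```python
from collections import Counter
from fractions import Fraction
from itertools import combinations
from math import comb

# Finite population 1..5 with n = 2: N! / (n! (N - n)!) possible samples.
samples_from_five = comb(5, 2)

# Study-time example: hours per week for students A to D.
hours = {"A": 4, "B": 6, "C": 8, "D": 10}
n = 2

# Every unordered sample of n students, drawn without replacement.
samples = list(combinations(hours, n))
num_samples = len(samples)

# Mean of each sample, kept exact with Fraction.
means = [Fraction(sum(hours[s] for s in sample), n) for sample in samples]

# Probability of each distinct mean = (# samples with that mean) / (# samples).
counts = Counter(means)
sampling_distribution = {
    mean: Fraction(count, num_samples) for mean, count in sorted(counts.items())
}

for sample, mean in zip(samples, means):
    print("".join(sample), mean)
for mean, prob in sampling_distribution.items():
    print(f"mean = {mean}, p = {prob}")
```

Exact fractions are used so the probabilities sum to exactly 1; note that `Fraction` prints 2/6 in lowest terms as 1/3.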
Sample   Students' Hours   Sample Mean $\bar{x}$   $p(\bar{x})$
AB       4, 6              (4+6)/2 = 5             1/6
AC       4, 8              (4+8)/2 = 6             1/6
AD       4, 10             (4+10)/2 = 7            2/6 (AD and BC together)
BC       6, 8              (6+8)/2 = 7
BD       6, 10             (6+10)/2 = 8            1/6
CD       8, 10             (8+10)/2 = 9            1/6

[Figures: distribution of the population values, p(x) against x = 4, 6, 8, 10; distribution of the sample means, $p(\bar{x})$ against $\bar{x}$ = 5, 6, 7, 8, 9]

Mean of the pop. values:
    $\mu = \frac{4+6+8+10}{4} = 7$
Mean of the sample means:
    $\mu_{\bar{x}} = \frac{5+6+7+7+8+9}{6} = 7$
mean of the sampling distribution of means = the pop. mean
    $\mu_{\bar{x}} = \mu$

Standard Deviation of Sampling Distribution of Means
S.D. of the pop. values:
    $\sigma = \sqrt{\frac{\sum (x-\mu)^2}{N}} = \sqrt{5} = 2.236$
S.D. of the sample means:
    $\sigma_{\bar{x}} = \sqrt{\frac{\sum (\bar{x}-\mu_{\bar{x}})^2}{\text{\# of samples}}} = \sqrt{5/3} = 1.291$
The usual formula
    $\sigma_{\bar{x}} = \frac{\sigma}{\sqrt{n}}$
- called the standard error of the mean
gives
    $\frac{\sigma}{\sqrt{n}} = \frac{\sqrt{5}}{\sqrt{2}} = 1.5811$
Why does it not match? Because the sample size is a large proportion of the pop.
When the sample size is more than 5% of the population (n/N > .05), we must adjust the standard error using the finite population correction factor:
    $\sigma_{\bar{x}} = \frac{\sigma}{\sqrt{n}} \sqrt{\frac{N-n}{N-1}} = \frac{\sqrt{5}}{\sqrt{2}} \sqrt{\frac{2}{3}} = 1.291$

Type of Distribution
If the population is normally distributed, then the sampling distribution of $\bar{x}$ is normally distributed.
If the population is not normally distributed, but the sample size is large, then the sampling distribution of $\bar{x}$ is approximately normally distributed.

Central Limit Theorem
If we draw a random sample of size n from a population, the distribution of the sample mean $\bar{x}$ can be approximated by a normal probability distribution as the sample size becomes large ($n \ge 30$).

If the population is not normally distributed, and the sample size is small, then the sampling distribution of $\bar{x}$ is not necessarily normally distributed.

Sampling Distribution of the Sample Proportion
- when interested in the proportion of items in a population that have a particular characteristic:
    The Registrar wants to know the proportion of female students. The characteristic is “female”.
    A manufacturer of computer chips wants to know the proportion of nondefective chips in a production run. The characteristic of interest is “nondefective” chips.
    What proportion of the people will vote for party A in an election?

Proportion
    Proportion = (# of items that have the characteristic) / (total # of items) = x / n
    $p$ = population proportion
    $\bar{p}$ = sample proportion
    $\mu_{\bar{p}}$ = mean of the sample proportions = $p$
    $\sigma_{\bar{p}} = \sqrt{\frac{pq}{n}}$
Note: q = (1 - p)

For Finite Populations:
    $\sigma_{\bar{p}} = \sqrt{\frac{pq}{n}} \sqrt{\frac{N-n}{N-1}}$   if $\frac{n}{N} > 0.05$

Type of Distribution
Central Limit Theorem
If we draw a random sample of size n from a population, the distribution of the sample proportion $\bar{p}$ can be approximated by a normal probability distribution as the sample size becomes large ($np \ge 5$, $nq \ge 5$).
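The standard-error formulas in these notes can be tried out numerically. The sketch below is a minimal illustration, not a definitive implementation: the function names are made up for this example, and the values used for the proportion case (p = 0.4, n = 50, N = 500) are hypothetical rather than taken from the notes. The correction factor is applied only when n/N > 0.05, following the rule stated above.

```python
from math import sqrt


def finite_population_correction(N, n):
    """Correction factor sqrt((N - n) / (N - 1))."""
    return sqrt((N - n) / (N - 1))


def standard_error_mean(sigma, n, N=None):
    """sigma / sqrt(n), corrected when the sample exceeds 5% of a finite population."""
    se = sigma / sqrt(n)
    if N is not None and n / N > 0.05:
        se *= finite_population_correction(N, n)
    return se


def standard_error_proportion(p, n, N=None):
    """sqrt(p q / n) with q = 1 - p, corrected when n / N > 0.05."""
    se = sqrt(p * (1 - p) / n)
    if N is not None and n / N > 0.05:
        se *= finite_population_correction(N, n)
    return se


def normal_approximation_ok(p, n):
    """Rule of thumb from the notes: np >= 5 and nq >= 5."""
    return n * p >= 5 and n * (1 - p) >= 5


# Study-time example: sigma = sqrt(5), n = 2, N = 4.
se_uncorrected = standard_error_mean(sqrt(5), n=2)
se_corrected = standard_error_mean(sqrt(5), n=2, N=4)
print(round(se_uncorrected, 4), round(se_corrected, 3))

# Hypothetical proportion example (values assumed for illustration only).
se_prop = standard_error_proportion(0.4, n=50, N=500)
print(se_prop, normal_approximation_ok(0.4, 50))
```

For the study-time example, the corrected value agrees with the standard deviation computed directly from the six sample means, while the uncorrected value reproduces the mismatch discussed in the notes.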