Chapter 7
Statistics and Parameters

Generalizing from a Sample to a Population
In inferential statistics, we speak in terms of probability rather than certainty. WHY?
Overgeneralization: to ascribe to all the members of a group what fits only some members
Hasty Conclusion: a premature judgment made without sufficient evidence

Random Sampling
•procedure where each member of a population has an equal probability of being included in a sample
[Figure labels: Population, Sample]
[Figure labels: Random Number Generator, Group 1, Group 2]

Stratified Sampling
•assures that you will be able to represent not only the overall population, but also key subgroups of the population

Situation Sampling
•involves selecting a sample in units or naturally occurring groups

Sampling Error
Sampling Error is a normal and expected deviation (it is not a mistake)
-must assume that the measures obtained from a sample would not match the measures we would get if we were able to measure the entire population
-it is the difference between the sample mean (M) and the population mean (μ)
**Sampling error = M – μ
-M is just as often below μ as it is above it
*i.e., M should overestimate μ 50% of the time & underestimate μ 50% of the time
-NOTE: Greek symbols usually represent a population parameter
Bias has occurred when the sample differs systematically from the population
-it is constant sampling error in one direction
*i.e.,
if M consistently overestimates μ but never underestimates it, or vice versa
**EX: you want to know the average height of college students & you measure only the basketball team
-the M will definitely overestimate μ (and never underestimate it)
-this is a mistake that needs to be avoided
Outlier: a score that is really far from the mean
-at least 3 or 4 standard deviations away
-due either to a distribution that is not normal (unique snowflake) or to measurement error
-outliers cause skew and are usually just thrown out

Sampling Distributions
•In a sampling distribution, each point on the abscissa represents the mean of a group’s performance (so far we have been looking at individual performance)
-based on infinite random sampling from an infinite population
•Mean of the Distribution of Means:
-NOTE: MM = μ
*EX: We want to find the average IQ of all students at Omega College (pop = 6000)
-randomly select 30 students, give them the IQ test & find a mean of 118 (M1 = 118)
-then randomly select another 30 students, give them the IQ test & find a mean of 121 (M2 = 121)
-since we are selecting 30 at a time, we would continue this process until M20
-next, we calculate the mean of the means (MM, which equals μ) to find the average IQ for all students at Omega College

Sampling Distributions
•Standard Deviation of the Distribution of Means: symbolized by σM
-NOTE: σ is pronounced “sigma”
-use the same formula to calculate σM that you use to calculate SD
*just use the sample means instead of raw scores
-referred to as the Standard Error of the Mean
-provides a measure of sampling error variability
*i.e., it tells us how much the sample means vary around the population mean
-Q: What if you didn’t want to keep selecting random sample after random sample to find the standard error of the mean?
*A: if you had all the raw scores in a population, you could calculate the SD (σ); the standard error of the mean (σM) could then be calculated using: σM = σ / √N, where N is the number of scores in each sample

Sampling Distributions
•Central Limit Theorem: when successive random samples are taken from a single population, the means of these samples assume the shape of a normal curve, regardless of whether or not the distribution of individual scores is normal.
-i.e., even if the distribution of individual scores is skewed to the left or right, the sampling distribution of means will be approximately normal
-WHY? because of sample size (each mean is based on many scores rather than on one individual score, so extreme scores tend to balance out)
*the more people we have in each sample, the more the distribution of means looks normal
-remember, the normal curve mirrors observations that occur in nature
-it is the reason that we can use the z-score table with sample means
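
The ideas on these slides can be checked with a small simulation. The Python sketch below is only an illustration under stated assumptions: the population is a made-up, right-skewed set of 6000 scores (not real IQ data), and the names `sample_means`, `POP_SIZE`, `SAMPLE_SIZE`, and `NUM_SAMPLES` are hypothetical. It draws many random samples of 30, then compares the mean of the means with μ and the standard deviation of the means with σ / √N, the standard formula for the standard error of the mean.

```python
import math
import random
import statistics

# Hypothetical, deliberately right-skewed population of 6000 scores
# (illustrative values only; these are not real IQ data).
POP_SIZE = 6000
SAMPLE_SIZE = 30      # N: number of scores in each sample
NUM_SAMPLES = 5000    # how many random samples we draw

pop_rng = random.Random(42)
population = [pop_rng.expovariate(1 / 20) for _ in range(POP_SIZE)]

mu = statistics.fmean(population)       # population mean (mu)
sigma = statistics.pstdev(population)   # population SD (sigma)


def sample_means(scores, n, k, seed):
    """Return the means of k random samples of size n drawn from scores."""
    rng = random.Random(seed)
    return [statistics.fmean(rng.sample(scores, n)) for _ in range(k)]


means = sample_means(population, SAMPLE_SIZE, NUM_SAMPLES, seed=1)

mean_of_means = statistics.fmean(means)           # MM, should be close to mu
empirical_se = statistics.pstdev(means)           # SD of the sample means (sigma_M)
theoretical_se = sigma / math.sqrt(SAMPLE_SIZE)   # sigma / sqrt(N)

# Sampling error (M - mu) for the first sample only.
first_sampling_error = means[0] - mu

# For a normal curve, about 68% of values fall within 1 SD of the mean.
share_within_one_se = sum(abs(m - mu) <= theoretical_se for m in means) / NUM_SAMPLES

print(f"mu = {mu:.2f}, mean of means = {mean_of_means:.2f}")
print(f"sigma_M from the sample means = {empirical_se:.2f}")
print(f"sigma / sqrt(N)               = {theoretical_se:.2f}")
print(f"sampling error of sample 1    = {first_sampling_error:+.2f}")
print(f"share of means within 1 sigma_M of mu = {share_within_one_se:.2%}")
```

Even though the individual scores are skewed, the share of sample means within one σM of μ should land near the 68% expected for a normal curve, and the two standard error values should nearly agree. Each sample is drawn without replacement from a finite population, so the match with σ / √N is approximate rather than exact.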