Chapter 7
Statistics and Parameters
Generalizing from a Sample
to a Population
In inferential statistics, we
speak in terms of probability
rather than certainty.
WHY?
Overgeneralization: to
ascribe to all the
members of a group what
fits only some members
Hasty Conclusion: A
premature judgment
made without sufficient
evidence
Random Sampling
•procedure where each member of a population has an equal probability of being
included in a sample
Population
Sample
Random Number Generator
Group 1
Group 2
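A minimal sketch of random sampling as defined above, using Python's standard library. The roster of 6000 ID numbers and the split into two groups of 15 are assumptions made for illustration; the slides do not specify them.

```python
import random

# Hypothetical population: ID numbers 1..6000 (an assumed roster).
population = list(range(1, 6001))

rng = random.Random(42)  # fixed seed so the sketch is repeatable

# Draw 30 members without replacement; every member is equally likely to be chosen.
sample = rng.sample(population, k=30)

# Randomly assign the sampled members to two groups of equal size (an assumed design).
shuffled = sample[:]
rng.shuffle(shuffled)
group_1, group_2 = shuffled[:15], shuffled[15:]

print(len(sample), len(group_1), len(group_2))
```

Because the draw is made without replacement, no member can appear in the sample twice.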
Stratified Sampling
•assures that you will be able to represent not only the overall population, but
also key subgroups of the population
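One way to sketch stratified sampling is to draw the same fraction at random from every subgroup. The class-year strata, their sizes, and the helper name `stratified_sample` are all hypothetical.

```python
import random
from collections import defaultdict

rng = random.Random(0)

# Hypothetical population: (student_id, class_year) pairs; strata sizes are assumptions.
years = ["freshman"] * 400 + ["sophomore"] * 300 + ["junior"] * 200 + ["senior"] * 100
population = list(enumerate(years))

def stratified_sample(units, fraction, rng):
    """Draw the same fraction at random from every stratum (proportional allocation)."""
    strata = defaultdict(list)
    for unit_id, stratum in units:
        strata[stratum].append(unit_id)
    chosen = {}
    for stratum, members in strata.items():
        k = max(1, round(len(members) * fraction))  # at least one member per stratum
        chosen[stratum] = rng.sample(members, k)
    return chosen

sample = stratified_sample(population, fraction=0.10, rng=rng)
for stratum, members in sample.items():
    print(stratum, len(members))
```

Every subgroup is guaranteed to appear in the sample, which a simple random draw of the same total size cannot promise.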
Cluster Sampling
•involves selecting a sample in units or naturally occurring groups
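Selecting whole naturally occurring groups, rather than individuals, might look like the sketch below. The 20 classrooms of 25 students are an assumed example, not data from the slides.

```python
import random

rng = random.Random(1)

# Hypothetical population: 20 classrooms (naturally occurring groups) of 25 students each.
classrooms = {
    f"room_{r}": [f"room_{r}_student_{s}" for s in range(25)]
    for r in range(20)
}

# Randomly select whole groups, then include every member of each selected group.
selected_rooms = rng.sample(sorted(classrooms), k=4)
sample = [student for room in selected_rooms for student in classrooms[room]]

print(selected_rooms, len(sample))
```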
Sampling Error
Sampling Error is a normal and expected deviation (it is not a mistake)
-we must assume that the measures obtained from a sample will not exactly match the measures
we would get if we were able to measure the entire population
-it is the difference between the sample mean (M) and the population mean (μ)
**Sampling error = M – μ
-with random sampling, M is just as often below μ as it is above it
*i.e., M should overestimate μ about 50% of the time & underestimate μ about 50% of the time
-NOTE: Greek symbols usually represent a population parameter
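The claim that M lands above and below μ about equally often can be checked with a small simulation. This is a sketch under assumed values: a population of 6000 normally distributed scores with mean 100 and SD 15, and samples of 30.

```python
import random
import statistics

rng = random.Random(7)

# Hypothetical population of 6000 scores (mean 100 and SD 15 are assumed values).
population = [rng.gauss(100, 15) for _ in range(6000)]
mu = statistics.fmean(population)  # population mean

# Draw many random samples and record the sampling error M - mu for each.
errors = []
for _ in range(2000):
    sample = rng.sample(population, k=30)
    errors.append(statistics.fmean(sample) - mu)

over = sum(e > 0 for e in errors) / len(errors)
print(f"share of samples with M above mu: {over:.2f}")
print(f"average sampling error: {statistics.fmean(errors):.3f}")
```

The share of overestimates should come out near one half, and the average sampling error near zero.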
Bias has occurred when the sample differs systematically from the population
-it is constant sampling error in one direction
*i.e., M consistently overestimates μ but never underestimates it, or vice versa
**EX: you want to know the average height of college students & you measure only the basketball team
-M will almost certainly overestimate μ (and essentially never underestimate it)
-this is a mistake that needs to be avoided
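To contrast bias with ordinary sampling error, the sketch below compares random samples with samples drawn only from the tallest students, a stand-in for the basketball team. All heights and group sizes are assumptions for illustration.

```python
import random
import statistics

rng = random.Random(3)

# Hypothetical heights in cm for 6000 students (mean 170 and SD 10 are assumed).
students = [rng.gauss(170, 10) for _ in range(6000)]
mu = statistics.fmean(students)

# "Basketball team" stand-in: the biased procedure only ever measures the tallest 300.
tallest = sorted(students)[-300:]

random_errors = [statistics.fmean(rng.sample(students, 15)) - mu for _ in range(500)]
biased_errors = [statistics.fmean(rng.sample(tallest, 15)) - mu for _ in range(500)]

random_over = sum(e > 0 for e in random_errors) / 500
biased_over = sum(e > 0 for e in biased_errors) / 500
print("random samples overestimating mu:", random_over)
print("biased samples overestimating mu:", biased_over)
```

The random samples miss in both directions, while the biased samples err in one direction only.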
Outlier: a score that is very far from the mean
-at least 3 or 4 standard deviations away
-due to either a distribution that is not normal (unique snowflake) or measurement error
-outliers skew the distribution and are usually just thrown out
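A minimal sketch of flagging outliers with the 3-standard-deviation rule of thumb. The scores are made up, and the cutoff of 3 is a convention rather than a fixed rule.

```python
import statistics

# Hypothetical scores; the last value is an implausible entry (e.g. a recording error).
scores = [98, 102, 95, 101, 99, 103, 97, 100, 104, 96,
          101, 99, 98, 102, 100, 97, 103, 99, 101, 180]

mean = statistics.fmean(scores)
sd = statistics.pstdev(scores)  # SD computed over all scores, outlier included

# Flag scores at least 3 standard deviations from the mean.
outliers = [x for x in scores if abs(x - mean) / sd >= 3]
print(outliers)
```

Note that the outlier itself inflates both the mean and the SD, so in very small data sets a single extreme score may never reach 3 standard deviations.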
Sampling Distributions
• In a sampling distribution, each point on the abscissa represents the mean of a group’s
performance (so far we have been looking at individual performance)
-based on infinite random sampling from an infinite population
•Mean of the Distribution of Means:
-NOTE: MM=μ
*EX: We want to find the average IQ of all students at Omega College (pop=6000)
-randomly select 30 students, give them the IQ test & find a mean of 118 (M1=118)
-then, we randomly select another 30 students, give them the IQ test & find a mean of 121 (M2=121)
-since we are selecting 30 at a time, we would continue this process until M200 (6000 / 30 = 200 samples)
-Next, we calculate the mean of the means (MM or μ) to find the average IQ for all students at
Omega College
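The Omega College example can be sketched as a simulation. With 6000 students and samples of 30, measuring every student exactly once takes 6000 / 30 = 200 samples; reading the example this way is an assumption, as are the IQ mean of 120 and SD of 15 used to generate the scores.

```python
import random
import statistics

rng = random.Random(11)

# Hypothetical IQ scores for all 6000 students (mean 120 and SD 15 are assumptions).
population = [rng.gauss(120, 15) for _ in range(6000)]
mu = statistics.fmean(population)

# Shuffle once, then cut the roster into 200 non-overlapping samples of 30,
# so every student is measured exactly once.
rng.shuffle(population)
sample_means = [
    statistics.fmean(population[i:i + 30])
    for i in range(0, len(population), 30)
]

mean_of_means = statistics.fmean(sample_means)
print(len(sample_means), round(mean_of_means, 2), round(mu, 2))
```

Individual sample means scatter around μ, but because the samples are equal in size and together cover the whole population, their mean matches μ.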
•Standard Deviation of the Distribution of Means: symbolized by σM
-Note: σ is pronounced “sigma”
-use the same formula to calculate σM that you use to calculate SD
*just use the sample means instead of the raw scores
-referred to as the Standard Error of the Mean
-provides a measure of sampling error variability
*i.e., it tells us the amount of variability between the sample means and the population mean
-Q: What if you didn’t want to continue selecting random sample after random sample to find
the standard error of the mean (σM)?
*A: if you had all the raw scores in the population, you could calculate the population SD (σ); the
standard error of the mean (σM) could then be calculated using: σM = σ / √N, where N is the size of each sample
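The two routes to the standard error of the mean can be compared numerically: the standard shortcut σM = σ / √N, and the long way of taking the SD of many sample means. The population values (mean 100, SD 15), the sample size of 30, and the variable names are assumptions for this sketch.

```python
import math
import random
import statistics

rng = random.Random(5)

# Hypothetical population of raw scores (mean 100 and SD 15 are assumed values).
population = [rng.gauss(100, 15) for _ in range(6000)]
sigma = statistics.pstdev(population)  # population SD from all raw scores
n = 30                                 # size of each sample

# Shortcut: standard error of the mean from sigma and the sample size.
sigma_m_formula = sigma / math.sqrt(n)

# Long way: draw many samples (with replacement) and take the SD of their means.
sample_means = [statistics.fmean(rng.choices(population, k=n)) for _ in range(5000)]
sigma_m_empirical = statistics.pstdev(sample_means)

print(round(sigma_m_formula, 2), round(sigma_m_empirical, 2))
```

Sampling with replacement is used here so that the simulation matches the "infinite population" idea behind the formula.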
•Central Limit Theorem: when successive random samples are taken from a
single population, the means of these samples approach the shape of a normal
curve as the sample size increases, regardless of whether or not the distribution of individual scores is normal.
-i.e., even if the distribution of individual scores is skewed to the left or right, the sampling
distribution of means will be approximately normal
-WHY? because of sample size: each mean averages many individual scores, so extreme scores tend to cancel each other out
*the more people we have in each sample, the more the distribution of means looks normal
-remember, the normal curve mirrors many observations that occur in nature
-it is the reason that we can use the z-score table
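The Central Limit Theorem can be illustrated by comparing the skew of individual scores with the skew of sample means. The exponential distribution, the sample size of 30, and the `skewness` helper are choices made for this sketch, not part of the slides.

```python
import random
import statistics

rng = random.Random(9)

def skewness(values):
    """Population skewness: the mean cubed z-score (0 for a symmetric distribution)."""
    m = statistics.fmean(values)
    sd = statistics.pstdev(values)
    return statistics.fmean([((v - m) / sd) ** 3 for v in values])

# Strongly right-skewed individual scores (exponential distribution, an assumed choice).
scores = [rng.expovariate(1.0) for _ in range(20000)]

# Means of 3000 random samples of 30 scores each.
means = [statistics.fmean(rng.choices(scores, k=30)) for _ in range(3000)]

skew_scores = skewness(scores)
skew_means = skewness(means)
print(f"skew of individual scores: {skew_scores:.2f}")
print(f"skew of sample means:      {skew_means:.2f}")
```

The sample means are far less skewed than the raw scores, and raising the sample size above 30 should push their skew closer to zero.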