Download Shavelson (chapters 5-12)

Shavelson – Descriptive Statistics
Variability
Range
Variance
SD
Shavelson Chapter 5
S5-1. Define, be able to create and recognize graphic representations of a normal
distribution (115-121).
Normal distribution: Provides a good
model of relative frequency
distribution found in behavioral
research.
Shavelson Chapter 5
S5-2. Know the four properties of the normal distribution (120-121).
Unimodal: the greater the distance a
score lies from the mean, the lower the
frequency of that score.
Symmetrical
Mean, mode, and median are all the same
Asymptotic: the curve never touches the
abscissa
Note that the mean and variance can
differ, thus “a family of normal
distributions”
Shavelson Chapter 5
S5-3. You should know what is meant by the phrase “a family of normal distributions” (121,3). I will also cover in class the general issues of “distributions” which are
frequently used in statistical analyses.
From:http://www.gifted.uconn.edu/siegle/research/Normal/instructornotes.html
Clicker question (refers to a figure of distributions that is not reproduced here):
1. These distributions have the same mode, different median and SD
2. These distributions have different mode, same median, different SD
3. These distributions have different means, modes and variances
4. These distributions have the same mode, mean and median, but different SDs
Clicker question (refers to a figure of distributions that is not reproduced here):
1. These distributions have the same mean, but different SDs
2. These distributions have different means and medians, but the same modes and SDs
3. These distributions have different means, modes, and ...
4. Nothing is the same with these two!
Shavelson Chapter 5
S5-4. Know the areas under the curve of a normal distribution (roughly, e.g. 34.13%,
13.59%, 2.14 % and .13% on either side of the mean)
From:http://www.gifted.uconn.edu/siegle/research/Normal/instructornotes.html
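The areas listed in S5-4 can be checked numerically. The sketch below is only an illustration, not part of the original slides: it computes the proportion of a normal distribution that falls in each band on one side of the mean, using the standard normal cumulative distribution written with Python's `math.erf`. The helper names `normal_cdf` and `area_between` are made up for this example.

```python
import math

def normal_cdf(z):
    """Proportion of a standard normal distribution that lies below z."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def area_between(z_low, z_high):
    """Proportion of scores between two z values."""
    return normal_cdf(z_high) - normal_cdf(z_low)

# Bands on ONE side of the mean; the other side is the mirror image.
bands = {
    "0 to 1 SD": area_between(0, 1),
    "1 to 2 SD": area_between(1, 2),
    "2 to 3 SD": area_between(2, 3),
    "beyond 3 SD": 1.0 - normal_cdf(3),
}

for label, area in bands.items():
    print(f"{label}: {area * 100:.2f}%")
```

The four printed values match the rough percentages to memorize (34.13%, 13.59%, 2.14%, .13%).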
Shavelson Chapter 5
S5-5a. What is a standard score (z-score) (123,3)? Be able to calculate the z-score,
given a raw score, mean, and standard deviation.
$$ Z = \frac{X - \text{Mean}}{S} $$
X = raw score
Mean = mean of distribution
S = standard deviation
Notice that to calculate the Z score you need the mean
and S of a distribution of scores.
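As a quick illustration of the z-score formula, here is a minimal sketch. The function name `z_score` is chosen just for this example; the numbers are the ones used in the practice questions in this unit.

```python
def z_score(raw, mean, sd):
    """Number of standard deviations a raw score lies from the mean."""
    if sd <= 0:
        raise ValueError("standard deviation must be positive")
    return (raw - mean) / sd

# Mean = 10, X = 18, S = 4: the score lies above the mean
print(z_score(18, 10, 4))

# Mean = 80, X = 69, S = 5: a negative z means the score lies below the mean
print(z_score(69, 80, 5))
```

The size of the result gives the distance from the mean in standard deviations, and the sign gives the direction.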
Shavelson Chapter 5
S5-5b. What two bits of information does the z-score provide us (125, 1-2)?
Z scores provide the following information:
1. The size of the Z score indicates the number of standard
deviations the raw score is from the mean
2. The sign (+ or -) indicates if the raw score is above the
mean (+) or below the mean (-)
A z score of -1.8 means…
1. The mean of the distribution is 1.8
2. The distribution is skewed
3. The raw score lies 1.8 means above the mean
4. The raw score lies 1.8 standard deviations below the mean
5. The raw score lies 1.8 standard deviations above the mean
Mean = 10, X = 18, S = 4, what is
the Z score?
1. 4
2. -4
3. 2
4. Not listed
5. -2
Shavelson Chapter 5
S5-6. Know what a Standard(ized) distribution is.
Convert all raw scores of a distribution into Z scores,
and put them into a frequency distribution.
– Mean = 0
– Std. Dev. and Variance = 1
[Figure: standardized distribution with the axis marked at Z = -2, -1, 0, +1, +2]
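To see why a standardized distribution always has a mean of 0 and a standard deviation of 1, the sketch below converts a small set of raw scores to Z scores. The scores are made up for illustration, and the sketch assumes S is computed with N in the denominator (`statistics.pstdev`); the text may define S with N - 1 instead.

```python
from statistics import mean, pstdev

# Hypothetical raw scores, chosen so the mean is 5 and S is 2
raw_scores = [2, 4, 4, 4, 5, 5, 7, 9]

score_mean = mean(raw_scores)
score_sd = pstdev(raw_scores)  # assumption: SD computed with N in the denominator

# Convert every raw score to a Z score
z_scores = [(x - score_mean) / score_sd for x in raw_scores]

print(z_scores)
print("mean of Z scores:", mean(z_scores))
print("SD of Z scores:", pstdev(z_scores))
```

The shape of the distribution does not change; only the scale of the axis does.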
Shavelson Chapter 5
S5-8. Know how to calculate the proportion of scores that lie above or below a
given raw score
1. Convert the raw score to a Z score
2. Do a rough estimate on a standard normal distribution
3. Look up in table B (swap labels on 3 & 4 if it is a negative Z
value).
Mean = 80
S=5
X = 69
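The example above (Mean = 80, S = 5, X = 69) can be worked through in code as a check on the table lookup. This is a sketch that stands in for table B by using `statistics.NormalDist` from the Python standard library, and it assumes the scores are normally distributed.

```python
from statistics import NormalDist

mean, sd, raw = 80, 5, 69

# Step 1: convert the raw score to a Z score
z = (raw - mean) / sd

# Steps 2-3: the cumulative proportion replaces the table B lookup
proportion_below = NormalDist().cdf(z)
proportion_above = 1 - proportion_below

print(f"z = {z:.2f}")
print(f"proportion below = {proportion_below:.4f}")
print(f"proportion above = {proportion_above:.4f}")
```

Because the Z score is negative, the smaller area is the one below the raw score, which is what the "swap" reminder for table B is about.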
Let's do a few more!
Mean = 18, X = 5, S = 7.1. What percentile
was the person who scored the X in?
1. 1.83
2. 96.64
3. 3.44
4. 46.64
Shavelson Chapter 8
S8-1. Know the definition of a statistic, parameter, and
estimator
Statistic: describes characteristic of sample e.g.
sample mean x-bar as opposed to population mean
mu (μ)
Parameter: describes characteristic of population
Estimator: statistic that estimates a population
parameter
The mean is an example of…
1. Parameter
2. Statistic
3. Estimator
4. All of the above
Shavelson Chapter 8
S8-2. Know the role of statistics, as well as the difference between inferential and
descriptive statistics.
Role of Stats:
• Guidelines for summarizing/describing data
• Method for drawing inferences from sample to population
• Help set effective methodology
Descriptive Stats
• Organize/summarize/depict/describe collections of data
Inferential Stats
• Draw inferences about population from sample
Shavelson Chapter 8
S8-3. Know and be able to recognize and provide examples of the two types of questions
asked about a population (Case 1 and Case II research). (217)
Case I Research:
Was a particular sample of observations drawn
from a particular (known) population?
Example: all students in US took GRE on same
day, means of all scores…look at one state in
particular…mean is higher…they are from a
different population with a higher mean.
Take one sample from the population (get mean)
and compare to the overall population mean.
Actually answer: what is the probability that a
sample was drawn from a particular (known)
population.
Shavelson Chapter 8
S8-5. Know the general approach for conducting case I and case II hypothesis testing. That is, you
should be able to list and briefly describe the steps your author lists at the end of each section
(case I, 220-221, 4 steps; case II 223-224, 5 steps). Be able to describe the various alternative
hypotheses (step two of each)
Case II research:
Are the observations from two different samples
drawn from the same population?
• (Do observations on two groups of subjects differ from one another?)
• Actually answers: given that a difference exists between two samples (e.g. the means), what is the probability that this difference is caused by chance alone? If not from chance alone, they must be from different populations, e.g. our treatment changed them!
Shavelson Chapter 8
S8-5
Case I research steps:
1. Set your hypotheses
• Ho: µ = specific value
• H1: µ ≠ some specific value – usually pop mean (two tailed)
• H1: µ > some specific value (one tailed)
• H1: µ < some specific value (one tailed)
2. Randomly select participants for your study
3. Decide to reject the null or not based on the comparison of the sample
mean to the population mean
– Reject null means that the difference between the population
mean and the sample mean is not likely to have occurred by
chance (it was probably due to whatever you were studying!)
– Failure to reject the null means there is a fairly good chance
that the difference between the sample mean and the
population mean could have occurred simply by chance (not
due to whatever you were studying)
Shavelson Chapter 8
Case II
1. Set your hypotheses
• Ho: µe = µc
• µe = Experimental group
• µc = Control group
• H1: µe ≠ µc (two tailed)
• H1: µe > µc (one tailed)
• H1: µe < µc (one tailed)
2. Randomly select, then randomly assign, participants to
experimental and control groups.
3. Perform the experiment – apply the IV and measure the
DV
4. Decide to reject the null hypothesis or not
– Reject the null means that the difference between the
experimental and control group is not likely to have occurred
by chance (thus was probably your IV!)
– Failure to reject the null means that it is likely that the
difference between the control group and the experimental
group was due to chance and not your IV.
Shavelson Chapter 8
S8-6. Know the two types of statistical errors: Type 1 and type 2. Be
able to provide and recognize examples of each.
Types of errors in statistical inference
Type I: Reject the null when it is true (say
there is a treatment effect when there is
not)
Type II: Not reject null when it should have
been (say there is no treatment effect
when there was)
Shavelson Chapter 8
S8-6. Know the two types of statistical errors: Type 1 and type 2. Be
able to provide and recognize examples of each.
                                     The way it really is
Your decision                        Vit A has no effect    Vit A had an effect
Reject null (Vit A had an effect)    Type 1 error           Correct decision
Accept null (Vit A had no effect)    Correct decision       Type 2 error
Shavelson Chapter 9
Probability
Event: any specified outcome
Outcome space: all possible outcomes
• P(e)= the probability of some event
• P(e)= # events/# outcomes in outcome
space
• Ex. Dice
Outcome space = {1,2,3,4,5,6} (= six items)
E = {2} (=1 item)
Probability of getting a 2 = 1/6=.17
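The dice example can be written out directly from the definition P(e) = # events / # outcomes in the outcome space. The sketch below assumes a fair die, so every outcome is equally likely; the function name `probability` is chosen just for this example.

```python
from fractions import Fraction

outcome_space = {1, 2, 3, 4, 5, 6}  # all possible outcomes of one roll
event = {2}                         # the outcome we are interested in

def probability(event, outcome_space):
    """P(e) for equally likely outcomes: favourable outcomes / all outcomes."""
    favourable = event & outcome_space
    return Fraction(len(favourable), len(outcome_space))

p = probability(event, outcome_space)
print(p, "=", round(float(p), 2))
```

Keeping the result as a fraction avoids rounding until the last step.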
Probability: what is the probability of getting a two by chance alone?
First Question   Second Question   Third Question   Score on test
Correct          Correct           Correct          3
Correct          Correct           Incorrect        2
Correct          Incorrect         Correct          2
Correct          Incorrect         Incorrect        1
Incorrect        Correct           Correct          2
Incorrect        Correct           Incorrect        1
Incorrect        Incorrect         Correct          1
Incorrect        Incorrect         Incorrect        0
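The three-question example can be enumerated in code. This is a sketch that assumes each question is equally likely to be answered correctly or incorrectly by chance (as with guessing on a true/false item), so all eight paths are equally likely; the slides do not state this explicitly.

```python
from collections import Counter
from fractions import Fraction
from itertools import product

# Every possible pattern of correct (True) / incorrect (False) on 3 questions
outcome_space = list(product([True, False], repeat=3))

# Score on the test = number of correct answers in the pattern
score_counts = Counter(sum(pattern) for pattern in outcome_space)

# Probability of each score = # patterns with that score / # patterns
score_probability = {
    score: Fraction(count, len(outcome_space))
    for score, count in sorted(score_counts.items())
}

for score, p in score_probability.items():
    print(f"score {score}: {p}")
```

Three of the eight equally likely patterns give a score of 2.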
Shavelson Chapter 10
10-1. Two fundamental ideas of conducting case I
research:
The null hypothesis is assumed to be true.
• (that is, the difference between the sample and population mean is assumed to be due to chance alone)
A sampling distribution is used to determine the
probability of obtaining a particular sample
mean.
• In this case the sampling distribution is composed of group means
Shavelson Chapter 10
10-2. What is the central limit theorem?
The Central Limit Theorem is a statement about the characteristics of
the sampling distribution of means of random samples from a
given population. That is, it describes the characteristics of the
distribution of values we would obtain if we were able to draw an
infinite number of random samples of a given size from a given
population and we calculated the mean of each sample.
The Central Limit Theorem consists of three statements:
[1] The mean of the sampling distribution of means is equal to the mean
of the population from which the samples were drawn.
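[2] The variance of the sampling distribution of means is equal to the
variance of the population from which the samples were drawn
divided by the size of the samples. (Its standard deviation, the standard
error of the mean, is therefore the population standard deviation divided
by the square root of the size of the samples.)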
[3] If the original population is distributed normally (i.e. it is bell shaped),
the sampling distribution of means will also be normal. If the
original population is not normally distributed, the sampling
distribution of means will increasingly approximate a normal
distribution as sample size increases. (i.e. when increasingly large
samples are drawn)
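Statements [1] and [2] can be verified exactly on a tiny population. The sketch below is an illustration only: it uses a hypothetical population of six scores and lists every possible sample of size 2 drawn with replacement, which plays the role of the "infinite number of random samples". Exact fractions are used so the comparison involves no rounding.

```python
from fractions import Fraction
from itertools import product

population = [1, 2, 3, 4, 5, 6]  # hypothetical population of scores
sample_size = 2

def mean(values):
    values = list(values)
    return Fraction(sum(values)) / len(values)

def variance(values):
    """Population variance (N in the denominator)."""
    values = list(values)
    m = mean(values)
    return sum((Fraction(v) - m) ** 2 for v in values) / len(values)

# Every possible sample of the given size, drawn with replacement
sample_means = [mean(s) for s in product(population, repeat=sample_size)]

print("population mean:", mean(population))
print("mean of sample means:", mean(sample_means))
print("population variance / N:", variance(population) / sample_size)
print("variance of sample means:", variance(sample_means))
```

Statement [3], about the shape of the distribution, is not checked here; it shows up only as the sample size grows.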
Shavelson Chapter 10
10-3. Know the characteristics of a sampling distribution of means.
Characteristics of the sampling distribution of
means
1. normally distributed (even if pop. is
skewed - if N = 30 or more)
2. mean of the sampling distribution = population mean
3. standard dev (standard error of the mean):
$$ \sigma_{\bar{X}} = \frac{\sigma}{\sqrt{N}} $$
where σ = pop. S.D. and N = sample size
Shavelson Chapter 10
10-4. Know what happens to the SEM as
sample size increases.
SEM decreases as N increases
$$ \text{SEM} = \sigma_{\bar{X}} = \frac{\sigma}{\sqrt{N}} $$
where σ = pop. S.D. and N = sample size
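A short numerical sketch of how the SEM shrinks as the sample size grows. The population standard deviation of 15 is a made-up value for illustration, and the function name is chosen just for this example.

```python
import math

def standard_error_of_mean(pop_sd, n):
    """SEM = population standard deviation / square root of sample size."""
    if n < 1:
        raise ValueError("sample size must be at least 1")
    return pop_sd / math.sqrt(n)

pop_sd = 15  # hypothetical population standard deviation

for n in (1, 9, 25, 100):
    print(f"N = {n:3d}  SEM = {standard_error_of_mean(pop_sd, n):.2f}")
```

Because N sits under a square root, quadrupling the sample size only halves the SEM.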
Shavelson Chapter 10
10-5. Know how one could create a sampling distribution of means
Sampling Distribution of means
A distribution composed of sample means
How to construct
1. Pull a sample of size N from the population
2. Find the mean of the sample
3. Repeat this many times (all samples of size N)
4. Create a frequency distribution of the means
(actually, convert it to relative frequencies =
proportions!)
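The four steps can be simulated. The sketch below is only an approximation of the idea: it uses a made-up population of the scores 1 to 100, draws a finite number of samples (2000) rather than all possible samples, samples with replacement, and groups the sample means into whole-number bins for the frequency distribution. All of these choices are assumptions for illustration.

```python
import random
from collections import Counter
from statistics import mean

rng = random.Random(0)            # fixed seed so the run is repeatable
population = list(range(1, 101))  # hypothetical population of scores
sample_size = 30
num_samples = 2000

# Steps 1-3: pull a sample, find its mean, repeat many times
sample_means = [
    mean(rng.choices(population, k=sample_size))
    for _ in range(num_samples)
]

# Step 4: frequency distribution of the means, as relative frequencies
counts = Counter(round(m) for m in sample_means)
relative_freq = {
    value: count / num_samples for value, count in sorted(counts.items())
}

print("population mean:", mean(population))
print("mean of the sample means:", round(mean(sample_means), 2))
for value, proportion in relative_freq.items():
    print(value, round(proportion, 3))
```

The relative frequencies pile up around the population mean, which is what lets the distribution be used as a probability distribution for sample means.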
Shavelson Chapter 10
10-5. What is the function of a sampling distribution of means?
Used as a probability distribution to
determine the likelihood of obtaining a
particular sample mean, given that the
null hypothesis is true.
null hypothesis is true = same thing as “by
chance alone”
Shavelson Chapter 10
S10-6. As your author does, be able to calculate the probability of obtaining a particular sample mean, given
the appropriate data (e.g. the mean of the sampling distribution and the standard error). If I ask for
this on the test I will either supply table B or will have the Zx fall on a whole value (e.g. 1 or 2, or 3).
You should thus review the probabilities under the normal curve as you will be expected to be able
to apply this information) (260-262)
μ = 100 (mean of the population and the sampling distribution)
σx = 25 (SEM)
X = 175 (mean of the sample we used in our study)
What is the probability of obtaining a sample mean of 175 by
chance alone (i.e. when the null is true, Ho: μ = 100)?
$$ Z_{\bar{X}} = \frac{\bar{X} - \mu}{\sigma_{\bar{X}}} = \frac{175 - 100}{25} = 3 $$
Use table B if needed!
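The example above can be checked in code. This is a sketch in which `statistics.NormalDist` stands in for table B, and it reports the probability of a sample mean of 175 or higher (the upper tail), which is an assumption about which area the question is after.

```python
from statistics import NormalDist

pop_mean = 100     # mean of the population and of the sampling distribution
sem = 25           # standard error of the mean
sample_mean = 175  # mean of the sample used in the study

# Z score of the sample mean within the sampling distribution of means
z_obtained = (sample_mean - pop_mean) / sem

# Probability of a sample mean this high or higher when the null is true
p_upper_tail = 1 - NormalDist().cdf(z_obtained)

print("Z =", z_obtained)
print(f"p = {p_upper_tail:.4f}")
```

The result agrees with the roughly .13% of the area that lies beyond 3 standard deviations on one side of the mean.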
Shavelson Chapter 10
S10-7. What is meant by the terms "unlikely" and "likely"? You should
be able to answer this in terms of accepting or rejecting the null
hypothesis, or in terms of what is meant by "significance level"
(263-264)
Level of significance = what we consider to
be “unlikely”
Generally set at a 5% or 1% chance of obtaining
a sample mean by chance alone
Alpha = .05 or alpha = .01
Thus: decisions to reject the null are based on
your alpha level
Reject the null if the probability of obtaining your sample mean is equal to, or
less than, your alpha level.
You get all the scores of the folks in CA who took the GRE and find that
their average score is 675 (for verbal). The overall (entire population)
mean is 500 and the SEM is 100. Is the California mean statistically
significant (the diff from the pop mean)? Alpha = .05
1. Yes
2. No
3. Huh?
Shavelson Chapter 10
S10-7
Decisions to reject the null are based on
your alpha level
“Reject the null hypothesis if the probability
of obtaining a sample mean is less than
or equal to .05 (.01); otherwise, don’t
reject the null hypothesis”
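The decision rule can be sketched in code using the California GRE numbers from this unit (sample mean 675, population mean 500, SEM 100, alpha = .05). The sketch assumes a one-tailed (directional) test, which the slide does not specify, and uses `statistics.NormalDist` in place of table B; the function name `decide` is made up for this example.

```python
from statistics import NormalDist

def decide(p_value, alpha):
    """Reject the null if the probability is less than or equal to alpha."""
    return "reject" if p_value <= alpha else "do not reject"

pop_mean, sem, sample_mean, alpha = 500, 100, 675, 0.05

z_obtained = (sample_mean - pop_mean) / sem

# Assumption: one-tailed test, so only the upper tail counts
p_one_tailed = 1 - NormalDist().cdf(z_obtained)
decision = decide(p_one_tailed, alpha)

print("Z =", z_obtained)
print(f"p (one-tailed) = {p_one_tailed:.4f}")
print("decision:", decision)
```

With a two-tailed test the same Z score would be compared against half of alpha in each tail, so the choice of tails matters for borderline values like this one.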
Shavelson Chapter 10
10-8 Calculating Zx (critical)
(The Zx score at which we say it is “unlikely” to obtain this
value by chance alone)
at the α = .05 level of significance Zx (critical) = 1.65
(from table B)
at the α = .01 level of significance Zx (critical) = 2.33 (from
table B)
Example:
μ = 42
σx = 8
X = 30
Reject the Ho or not at the .05 level of significance?
translate alpha level into z-score
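The example above (μ = 42, σx = 8, X = 30) can be worked as a comparison of the obtained Z with the critical Z. The sketch uses the table B critical values quoted on this slide and assumes a one-tailed test in the direction of the observed difference (here, H1: µ < 42); that direction is an assumption, not something stated on the slide.

```python
# One-tailed critical values quoted from table B on this slide
Z_CRITICAL_ONE_TAILED = {0.05: 1.65, 0.01: 2.33}

def z_obtained(sample_mean, pop_mean, sem):
    """Z score of a sample mean within the sampling distribution of means."""
    return (sample_mean - pop_mean) / sem

def reject_null(z, alpha):
    """Reject when the obtained Z is at least as extreme as the critical Z.

    Assumes a one-tailed test whose predicted direction matches the sign of z.
    """
    return abs(z) >= Z_CRITICAL_ONE_TAILED[alpha]

z = z_obtained(sample_mean=30, pop_mean=42, sem=8)
print("Z obtained =", z)
print("reject Ho at .05?", reject_null(z, 0.05))
```

The obtained Z does not reach the critical value in the lower tail, so the null is not rejected at the .05 level under these assumptions.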
Shavelson Chapter 10
10-8 Calculating Zx (critical)
Two ways to reject the null: find the probability of
obtaining the Z score (obtained), or find the Z score
that lies at the alpha level (critical). Then:
Either compare the probability of getting the Zobtained
(e.g. .03) to the alpha level (e.g. .05). In this case you
would reject the null - we show statistical
significance.
Or compare the Zobtained to the Zcritical, in this case
1.88 (obtained) and 1.65 (critical). Since
the Zobtained is greater than Zcritical, we reject the
null - we show statistical significance.
Shavelson Chapter 10
10-9. Know the difference between directional and non directional tests,
and when to use each!
1. A one tail may be supported by previous research or theory
2. When in doubt, choose two tailed!
Tails are specified by alternative hypotheses.
Ho: xbar = μ
H1: xbar ≠ μ (2 tailed: both)
Or
H1: xbar < μ (1 tailed: left)
Or
H1: xbar > μ (1 tailed: right)
Easier to show statistical significance with 1-tailed test.
Directional vs. non-directional tests
Directional uses only one tail of the sampling distribution
Non-directional uses both tails
Thus: If alpha = .05 and one tailed, all .05 is in one tail (Z critical = 1.65 or -1.65)
If alpha = .05 and two tailed, .025 is in one tail (1.96) and .025 is in the other tail (-1.96)
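The critical values for one- and two-tailed tests can be computed rather than looked up. The sketch below uses `statistics.NormalDist().inv_cdf` as a stand-in for table B; the function name `critical_z` is chosen for this example. The computed one-tailed value at .05 is about 1.645, which the table rounds to 1.65.

```python
from statistics import NormalDist

STANDARD_NORMAL = NormalDist()

def critical_z(alpha, tails):
    """Positive critical Z for a one-tailed (1) or two-tailed (2) test."""
    if tails not in (1, 2):
        raise ValueError("tails must be 1 or 2")
    # A two-tailed test splits alpha evenly between the two tails
    return STANDARD_NORMAL.inv_cdf(1 - alpha / tails)

for alpha in (0.05, 0.01):
    one = critical_z(alpha, tails=1)
    two = critical_z(alpha, tails=2)
    print(f"alpha = {alpha}: one-tailed {one:.3f}, two-tailed +/-{two:.3f}")
```

The two-tailed critical value is always larger, which is why it is easier to show statistical significance with a one-tailed test.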
Shavelson, Chapter 10
• If conducting case II research, how could
you determine the probability of getting a
particular difference between 2 means
• (which is what we are looking at for case II).
Shavelson Chapter 10
Sampling distribution of differences between means gives the
probability of obtaining a particular difference between means.
(Case II)
Theoretically you could….
Make a sampling distribution of differences between means,
then find a z-score, compare it to the alpha level, and accept or reject the
null hypothesis.
Case II
xbar1-xbar2 = 2
xbar1-xbar2 = 3
xbar1-xbar2 = -1
graph the frequency of each difference
make a freq distribution of differences between sample means
Can also calculate the SD and determine the likelihood of obtaining a
difference between means by chance alone
Shavelson Chapter 10
Characteristics of the sampling distribution
of differences between means
1. normally distributed
2. Mean = 0
3. Standard deviation (called the standard error
of the difference between means)
is equal to:
$$ \sigma_{\bar{X}_1 - \bar{X}_2} = \sqrt{\sigma_{\bar{X}_1}^2 + \sigma_{\bar{X}_2}^2} $$
Note: variance = sigma squared
Shavelson Chapter 10
Calculate a Z score for the difference between means
$$ Z_{\bar{X}_1 - \bar{X}_2} = \frac{\bar{X}_e - \bar{X}_c}{\sigma_{\bar{X}_1 - \bar{X}_2}} $$
Example:
Xe = 24
Xc = 30
σx1-x2 = 2.8
H1: µe ≠ µc
Z crit?
Z obs?
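A sketch of the Case II calculation, using the example values above. The standard error of the difference (2.8) is given on the slide; the helper `standard_error_of_difference` shows how it would be built from two standard errors of the mean, with made-up inputs (3 and 4). The two-tailed critical value of 1.96 at alpha = .05 is assumed because H1 is non-directional.

```python
import math

def standard_error_of_difference(sem_1, sem_2):
    """Square root of the sum of the two squared standard errors of the mean."""
    return math.sqrt(sem_1 ** 2 + sem_2 ** 2)

def z_for_difference(mean_e, mean_c, se_difference):
    """Z score for an observed difference between two sample means."""
    return (mean_e - mean_c) / se_difference

# Hypothetical standard errors, only to show how the helper combines them
print(standard_error_of_difference(3, 4))

# Example from the slide: Xe = 24, Xc = 30, standard error of difference = 2.8
z_observed = z_for_difference(24, 30, 2.8)
z_critical = 1.96  # assumption: two-tailed test at alpha = .05

print(f"Z observed = {z_observed:.2f}")
print("reject Ho?", abs(z_observed) >= z_critical)
```

For a two-tailed test only the size of the observed Z matters, so the comparison uses its absolute value.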
Shavelson Chapter 11
S11-1. Know the definition and recognize/generate examples of the two types of
errors (Type I and Type II)(also see table 11-1)This is similar to what we did
last unit. How does one adjust the probability of making a type I error? (313).
                                     The way it really is
Your decision                        Vit A has no effect    Vit A had an effect
Reject null (Vit A had an effect)    Type 1 error           Correct decision
Accept null (Vit A had no effect)    Correct decision       Type 2 error
Shavelson Chapter 11
S11-2. Know the definition of "power" and how it is
calculated. (314)
Power = 1-Beta
The probability of correctly rejecting a false null
hypothesis. OR: Power is the probability of you
detecting a true treatment effect.
(What researchers are really interested in! Detecting
a true difference if it exists.)
Power = .27 (27%)…very low. Want higher power,
want higher number.
Shavelson Chapter 12
S12-1. What is the purpose of a t test in general (334,3). Also how is a t test used for
case I research? (that is, what question does it answer?(334,3). As in previous
chapters the function of the t test is to determine the probability of observing a
particular sample mean, given that the null hypothesis is true. You should know
this point. You should also know how the standard deviation is estimated for the
population when using the t distribution (334)
T-test is used to…
A. Determine the probability that a sample was drawn from a
hypothesized population (given a true Ho)
B. Used when the population standard deviation is not known
C. Calculated standard deviation (SEM) is:
How would one go about doing this?
$$ s_{\bar{X}} = \frac{s}{\sqrt{N}} $$
where s = standard dev. of the sample and N = sample size
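A minimal sketch of estimating the standard error of the mean from a sample when the population standard deviation is not known. The scores are made up for illustration, and the sketch assumes the sample standard deviation is computed with N - 1 in the denominator (`statistics.stdev`).

```python
import math
from statistics import stdev

# Hypothetical sample of 9 scores (mean 6, sample standard deviation 2)
sample = [3, 4, 4, 6, 6, 7, 7, 8, 9]

sample_sd = stdev(sample)  # assumption: N - 1 in the denominator
estimated_sem = sample_sd / math.sqrt(len(sample))

print("s =", sample_sd)
print("estimated SEM =", round(estimated_sem, 3))
```

The only change from the Z-test version is that the sample standard deviation takes the place of the population standard deviation.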
Shavelson Chapter 12
S12-2. You should be able to describe the t distribution and what it is used for
(determining the probability of obtaining a particular sample mean) (335-336). Know the important differences between the t distribution and the
normal distribution. (335,5,-335,7) (there are three points made).
A. T(observed):
$$ t_{\text{observed}} = \frac{\bar{X} - \mu}{s_{\bar{X}}} $$
(= the number of standard deviations that a particular t lies from the mean)
The t distribution is created from numerous same sized samples from the
population – just like a sampling distribution!
The t(observed) can be compared to the t distribution to determine the
probability of obtaining that particular sample mean (given the Ho is
true)
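A sketch of computing t(observed) from raw scores. The sample and the hypothesized population mean of 4 are made up for illustration; the sample standard deviation is assumed to use N - 1 in the denominator. The function name `t_observed` is chosen for this example.

```python
import math
from statistics import mean, stdev

def t_observed(sample, hypothesized_mean):
    """(sample mean - hypothesized mean) / estimated standard error of the mean."""
    estimated_sem = stdev(sample) / math.sqrt(len(sample))
    return (mean(sample) - hypothesized_mean) / estimated_sem

# Hypothetical sample of 9 scores and a hypothesized population mean
sample = [3, 4, 4, 6, 6, 7, 7, 8, 9]
t = t_observed(sample, hypothesized_mean=4)

print("t observed =", round(t, 2))
print("degrees of freedom =", len(sample) - 1)
```

The observed t would then be compared with the critical t for the matching sample size from a t table, since the critical value changes with N.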
Shavelson Chapter 12
T-distribution vs. Normal Distribution:
1. T has a different distribution for every sample size (N)
2. More values lie in the tails of t; thus critical values for t are
higher than for Z
3. As sample size increases, t becomes closer and closer to the normal
distribution.