Download class notes - rivier.instructure.com.

CLASS NOTES: Introduction to the t Statistic / t Ratio / t Test
1st set: Basic t-test
2nd set: t-test for independent measures
3rd set: t-test for repeated measures
CONCEPT
CALCULATION/EXAMPLES
APPLICATION
* As opposed to comparing sample
scores w/ population scores, for
t-scores, we are now comparing 2
samples.
* A t-statistic is oftentimes
preferred over a z-statistic. The
z-score formula requires more
information than is usually
available (i.e. population data).
* Remember that the reason for
conducting hypothesis testing is to
gain knowledge about an unknown
population.
* We have talked about sampling
w/ replacement being a
requirement for a true experiment.
In these cases, when sampling w/
replacement is not feasible, you
select a large population relative to
smaller sample sizes.
The t Statistic : An Alternative
to z
Requirements for Using the t
Test
1. The samples have been
randomly selected.
2. The traits being measured do not
depart significantly from normality
w/in the population.
3. The standard deviations of the
two samples must be fairly similar.
4. The two samples are
independent of each other.
5. Comparisons are made only b/t
measures of the same trait.
6. The sample scores provide at
least interval data or ratio data.
Hypothesis Testing & the Single
Sample t-Test
State your hypothesis
Step 1:
Null Hypothesis:
Ho = there is no difference
The null hypothesis is always
stated first, followed by the
alternative hypothesis.
Alternative Hypothesis:
H1 = there is a difference
Locate your critical region
Step 2:
α = .05
two-tailed
df = 9
Gather your sample & follow
thru w/ the calculations
Calculate the mean
Step 3:
$M_1 = \dfrac{\sum X_1}{n_1}$
Remember that the alpha level is
determined by the researcher.
Whether or not you are using a
one- or two-tailed test is
determined by how you write your
hypothesis statements (whether or
not you use directionality terms in
your statements)
Calculate standard deviation
$s_1 = \sqrt{\dfrac{\sum X_1^2 - \dfrac{(\sum X_1)^2}{n}}{n - 1}}$
Remember we are using values & formulas related to “sample means,” thus the ‘sample’ SD formula & the SE of M.
Remember that s stands for your sample standard deviation.
Estimated Standard Error: Estimates the standard distance b/t the sample mean & the population mean.
$s_M$ or $SE_M = \sqrt{\dfrac{s^2}{n}}$
OR
$SE_M = \dfrac{s}{\sqrt{n}}$
Here, you see 2 formulas for the Estimated SE. Both give you the same outcome, but the second one requires less calculation.
This second formula is the one we use when calculating your final SE of M value. It requires the least amount of calculation & is similar to the formula we used when calculating z-scores for our sample means. sM or SEM is the notation for the estimated standard error for t tests.
Calculate your t-ratio
$t = \dfrac{M - \mu}{SE_M}$
Although these formulas are similar to those we used for sample means, for a single-sample t the population mean is generally available to the researcher.
t-ratio: Used to test hypotheses about an unknown population mean µ when the value of σ is unknown. The formula for the t-statistic has the same structure as the z-score formula, except that the t-statistic uses the estimated standard error in the denominator.
$t = \dfrac{\text{sample mean (data)} - \text{pop. mean (hypothesized from } H_0)}{\text{estimated standard error (computed from the sample data)}}$
$t = \dfrac{M - \mu}{SE_M}$
So, the only difference b/t the z-score formula & the t-score formula is that the z-score uses the actual population variance & the t-score uses the corresponding sample variance when the population is unknown.
* Using t tests, we do not know the population standard deviation of that distribution since our population is unknown.
* The t ratio tells us specifically how far the sample mean deviates from the population mean in units of standard errors of the mean.
* We will always be comparing two means: a known sample mean w/ an assumed population mean.
* In the calculation of the single-t ratio, the assumed population parameter is presented.
Step 1: Compute the mean of your sample.
Step 2: Subtract your hypothesized population mean from the sample mean.
Step 3: Calculate your estimated standard error.
Step 4: Divide your numerator by your denominator.
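The four steps above can be sketched in Python as follows. This is only an illustration: the function name `single_sample_t` and the ten scores are hypothetical (they are not from the notes), and the hypothesized mean of 5 is an assumed value standing in for the µ from Ho.

```python
import math

def single_sample_t(scores, mu):
    """Return (M, s, SEM, t) for a single-sample t test against a hypothesized mean mu."""
    n = len(scores)
    sum_x = sum(scores)
    sum_x2 = sum(x * x for x in scores)
    mean = sum_x / n                      # Step 1: sample mean
    # Computational formula for the sample standard deviation (n - 1 in the denominator)
    s = math.sqrt((sum_x2 - sum_x ** 2 / n) / (n - 1))
    sem = s / math.sqrt(n)                # Step 3: estimated standard error
    t = (mean - mu) / sem                 # Steps 2 and 4: numerator divided by denominator
    return mean, s, sem, t

# Hypothetical data: n = 10, so df = 9
scores = [4, 6, 5, 7, 8, 6, 5, 7, 6, 6]
mean, s, sem, t = single_sample_t(scores, mu=5)
print(f"M = {mean:.2f}, s = {s:.3f}, SEM = {sem:.3f}, t = {t:.3f}")
```

The printed t-ratio is the value you would then compare against the cut-off from the t-distribution table.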
Critical Region for the t Test
Look in Appendix B in the back of
your text book to find the critical
regions for a t-distribution.
Your critical region is based upon
your df (degrees of freedom) &
your chosen alpha level for either a
one-tailed or two-tailed test.
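Once the cut-off t-value has been read from the table, comparing it w/ the calculated t-ratio can be sketched as below. The function name `decide` is hypothetical, the one-tailed branch assumes the critical region lies in the upper tail, and 2.262 is the standard two-tailed table value for α = .05 w/ df = 9.

```python
def decide(t_ratio, t_critical, tails=2):
    """Return the decision for a calculated t-ratio and a table cut-off value."""
    if tails == 2:
        # Two-tailed: the critical region is in both tails, so compare the absolute value
        in_critical_region = abs(t_ratio) >= t_critical
    else:
        # One-tailed: assumes the critical region is in the upper (positive) tail
        in_critical_region = t_ratio >= t_critical
    return "reject the null" if in_critical_region else "fail to reject the null"

T_CRIT = 2.262  # two-tailed, alpha = .05, df = 9
print(decide(2.74, T_CRIT))
print(decide(-1.50, T_CRIT))
```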
Make a decision
Step 4:
Either reject the null (there is a difference b/t the sample & population values)
Or fail to reject the null (there is no difference b/t the sample & population values)
Draw out a distribution & mark on that distribution 1) where your t-value, or cut-off point, falls (this is the t-value you obtained from your table in the back of the text) & 2) where your calculated t-ratio falls. If your t-ratio falls w/in the “critical region” (the space beginning w/ the t-value cut-off point & beyond in the tail), then you will “reject the null.” If your calculated t-ratio falls outside of the “critical region” (before the cut-off point, w/in the “null” area), then you will “fail to reject the null.”
The value you obtain from the t-distribution table will also be a t-value. But this t-value represents the cut-off point between your critical region & your null region. It is not to be confused with your calculated t-ratio.
Whatever your outcome, you need to write first whether you reject or fail to reject the null, & then you need to repeat the hypothesis statement (from Step 1 in the hypothesis testing process) that is associated w/ your outcome.
Hypothesis Tests & Effect Size
A large value for the t-statistic
(either positive or negative)
indicates that the obtained
difference is greater than would be
expected by chance.
Measuring Effect Size: Whenever a treatment effect is found to be statistically significant, it is recommended that you report a measure of the absolute magnitude of the effect. The 2 measures are Cohen's d & r2; r2 accounts for the percentage of variance.
Use estimated-d formula
Estimated Cohen's d:
$d = \dfrac{t}{\sqrt{n}}$
Percentage of Variance
$r^2 = \dfrac{t^2}{t^2 + df}$
This formula expresses your effect size in terms of percentage. r2 represents the percentage of variance, t represents your t-statistic & df, as you know, represents your degrees of freedom.
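Both effect-size formulas can be computed directly from the t-ratio. A minimal sketch, assuming a hypothetical single-sample result of t = 2.74 w/ n = 10 (so df = 9); the function names are illustrative only.

```python
import math

def cohens_d_from_t(t, n):
    """Estimated Cohen's d for a single-sample t test: d = t / sqrt(n)."""
    return t / math.sqrt(n)

def r_squared(t, df):
    """Proportion of variance accounted for: r^2 = t^2 / (t^2 + df)."""
    return t ** 2 / (t ** 2 + df)

t, n = 2.74, 10  # hypothetical values
d = cohens_d_from_t(t, n)
r2 = r_squared(t, n - 1)
print(f"d = {d:.3f}, r^2 = {r2:.3f} ({r2 * 100:.1f}% of variance)")
```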
OUTLINE OF MATERIAL
I. t-Statistic: Basic principles
II. Hypothesis testing for the t-statistic
A. State your hypothesis
1. Null: Ho : M = µ
2. Alternative: H1 : M ≠ µ
B. Locate the critical region
1. Select your alpha level
2. Determine your df value
3. Draw out your distribution & locate your critical regions based upon your choice
of a one-tailed or two-tailed test
C. Collect the sample data & compute the t statistic
1. Sample mean(s)
2. Standard deviation(s)
3. Estimated Standard Error(s)
4. Estimated Standard Error of Difference (When 2 samples are being used)
5. t-Ratio
D. Make a decision to either reject the Null hypothesis or fail to reject the null hypothesis
E. Calculate your effect size
CLASS NOTES: The t Test for Two Independent Samples
CONCEPT
CALCULATION/EXAMPLES
APPLICATION
Independent Measures Design /
Between Measures Design: Uses a
separate sample for each treatment
condition (or for each population).
* The goal of an independent-measures
research study is to
evaluate the mean difference b/t 2
populations (or b/t 2 treatment
conditions).
* It allows us to make a probability
statement regarding whether two
independently selected samples
represent a single population.
Example:
Men vs. Women
*There is a risk involved in independent / b/t measures designs where some individuals in one sample may be mismatched to individuals in the other sample (i.e., one may have a higher IQ, etc.) that can affect the outcome.
*The two separate samples are equal in size (n1 = n2)
Stating your Hypothesis for the
Independent Measures t- test
The independent-measures t-statistic
uses the data from 2
separate samples to help decide
whether there is a significant mean
difference b/t two populations or
b/t 2 treatment conditions.
µ1 = notation for population 1
µ2 = notation for population 2
Null hypothesis:
Ho : µ1 - µ2 = 0
Remember that your null
hypothesis (Ho) states that there is
no change or no effect (there is no
difference b/t the means of the 2
independent samples)
Alternative Hypothesis:
H1 : µ1 - µ2 ≠ 0
Your alternative hypothesis (H1)
states that there is a change or there
is an effect (there is a difference b/t
the means of the 2 independent
samples; the means are not equal)
Remember your inferential
statistics. You are using your
samples to make inferences about
unknown or theoretical
populations.
Estimated Standard Error
Calculating the standard error of both the first sample (represented by a subscript of “1”) & the second sample (represented by a subscript of “2”):
$s_{M_1}$ or $SE_{M_1} = \dfrac{s_1}{\sqrt{n_1}}$
$s_{M_2}$ or $SE_{M_2} = \dfrac{s_2}{\sqrt{n_2}}$
Standard Error of Difference
Pooled Variance: Corrects for
biases in sample variances by
combining two sample variances
into a single value. The variances
are obtained by averaging or
pooling the two sample variances
using a “procedure” that allows the
bigger sample to carry more weight
in determining the final value.
* This “procedure” or formula is
structured so that each sample is
weighted by the magnitude of its df
value; thus, the larger sample
carries more weight in determining
the final value.
Formulas for an Independent-Measures Hypothesis Test:
$SE_D = \sqrt{SE_{M_1}^2 + SE_{M_2}^2}$
$SE_D = \sqrt{\dfrac{(n_1 - 1)s_1^2 + (n_2 - 1)s_2^2}{n_1 + n_2 - 2} \times \left(\dfrac{1}{n_1} + \dfrac{1}{n_2}\right)}$
Step 1: Calculate the n value for each sample
Step 2: Calculate the standard deviation for each sample
Step 3: For the numerator in the first bracket, subtract 1 from the n of your first sample.
Step 4: Multiply that number by that sample's variance (its standard deviation squared)
Step 5: Repeat steps 3 & 4 for the second sample & add the two results together.
Step 6: For the denominator in the first bracket, add your two n values from each sample & then subtract 2 from this value. Divide the numerator by this denominator.
Step 7: For the next bracket, divide the number 1 by the n value of your first sample.
Step 8: Do the same for your second sample.
Step 9: Add these 2 values together
Step 10: Multiply the value from your first bracket by the value from your second bracket
Step 11: Take the square root of this remaining value
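The pooled-variance steps above can be sketched as a short Python function. The name `pooled_sed` and the sample sizes & standard deviations in the example are hypothetical, chosen only to show the arithmetic.

```python
import math

def pooled_sed(n1, s1, n2, s2):
    """Estimated standard error of difference using pooled variance.

    n1, n2 are the sample sizes; s1, s2 are the sample standard deviations.
    """
    # First bracket: each sample variance weighted by its df, divided by the total df
    pooled_variance = ((n1 - 1) * s1 ** 2 + (n2 - 1) * s2 ** 2) / (n1 + n2 - 2)
    # Second bracket: 1/n1 + 1/n2
    size_term = 1 / n1 + 1 / n2
    # Multiply the two brackets, then take the square root
    return math.sqrt(pooled_variance * size_term)

# Hypothetical unequal samples: the larger sample (n = 15) carries more weight
print(round(pooled_sed(10, 2.0, 15, 3.0), 4))
```

When the two samples are the same size, this value matches the simpler formula that squares & adds the two standard errors of the mean.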
$t = \dfrac{M_1 - M_2}{SE_D}$
We will use the standard error of
difference since we are working w/
2 different sample groups,
representing a “growth” or addition
to our SE of M formulas.
* If sample sizes are not equal, then
the results of the first (unpooled) SED formula are biased,
which is why the pooled variance is used
* If the variances are obtained from
a large sample, it will be a more
accurate estimate of σ2 than the
variance obtained from a small
sample.
* The pooled variance is actually
an average of the 2 sample
variances & the value of the pooled
variance will always be located b/t
the 2 sample variances.
$t = \dfrac{\text{sample mean diff.} - \text{population mean diff.}}{\text{estimated standard error of the difference}}$
The independent-measures t-statistic uses the difference b/t two sample means to evaluate the difference b/t two population means.
Your subscript “1” represents data
from your first sample. Your
subscript “2” represents data from
your second sample.
The basic structure of the t-statistic
is the same for both the
independent measures & the
single-sample hypothesis tests.
Step 1: Find the mean for each group
$M_1 = \dfrac{\sum X_1}{n_1}$
$M_2 = \dfrac{\sum X_2}{n_2}$
Step 2: Find the standard deviation for each group
$s_1 = \sqrt{\dfrac{\sum X_1^2 - \dfrac{(\sum X_1)^2}{n_1}}{n_1 - 1}}$
$s_2 = \sqrt{\dfrac{\sum X_2^2 - \dfrac{(\sum X_2)^2}{n_2}}{n_2 - 1}}$
Since we are working w/ 2 samples, we use the sample computational standard deviation formula.
Step 3: Find the estimated standard error of the mean for each group.
$SE_{M_1} = \dfrac{s_1}{\sqrt{n_1}}$
$SE_{M_2} = \dfrac{s_2}{\sqrt{n_2}}$
Step 4: Find the estimated standard error of difference for both groups combined.
$SE_D = \sqrt{SE_{M_1}^2 + SE_{M_2}^2}$
The Estimated Standard Error of Difference (SED): Distribution of Differences with two populations.
* Used as an estimate of the real standard error (σM) when the value of σ (the population standard deviation) is unknown.
* SED is based on the estimated standard errors that have been obtained from each of the two samples.
* The estimated standard error of difference is based on the information contained in just two samples.
Step 1: Take the standard error of
your first sample & square it.
Step 2: Then square the standard error
of the second sample & add it to the
value from Step 1.
Step 3: Last but not least, take the
square root of that sum to
come up w/ your estimated standard
error of difference.
Or the alternative formula for estimated standard error when using pooled variance:
$SE_p = \sqrt{\dfrac{(n_1 - 1)s_1^2 + (n_2 - 1)s_2^2}{n_1 + n_2 - 2} \times \left(\dfrac{1}{n_1} + \dfrac{1}{n_2}\right)}$
Remember that “s” represents your standard deviation for samples.
Remember your standard error of the mean formula.
* Since we measure only one pair of samples to generate this value, the estimated standard error of difference is a statistic & not a parameter.
* It is computed from the sample variance or sample standard deviation & provides an estimate of the standard distance b/t a sample mean M & the population mean µ.
Step 5: Calculate the t-ratio
$t = \dfrac{M_1 - M_2}{SE_D}$
Degrees of Freedom: The degrees of freedom for the independent-measures t-statistic are determined by the degrees of freedom values for the 2 separate samples.
df for the first sample + df for the second sample
$= df_1 + df_2$
$= (n_1 - 1) + (n_2 - 1)$
OR
Your total number of values in your whole study minus 2
$n_T - 2$
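Steps 1 through 5 & the df rule for two independent samples can be sketched in Python as follows. The function name `independent_t` and both data sets are hypothetical; the samples are equal in size, so the simple (unpooled) SED formula is used here.

```python
import math
from statistics import mean, stdev  # stdev uses n - 1, i.e. the sample formula

def independent_t(sample1, sample2):
    """Return (t, df) for an independent-measures t test with the unpooled SED."""
    n1, n2 = len(sample1), len(sample2)
    m1, m2 = mean(sample1), mean(sample2)          # Step 1: means
    s1, s2 = stdev(sample1), stdev(sample2)        # Step 2: standard deviations
    sem1 = s1 / math.sqrt(n1)                      # Step 3: standard errors of the mean
    sem2 = s2 / math.sqrt(n2)
    sed = math.sqrt(sem1 ** 2 + sem2 ** 2)         # Step 4: standard error of difference
    t = (m1 - m2) / sed                            # Step 5: t-ratio
    df = (n1 - 1) + (n2 - 1)                       # same as nT - 2
    return t, df

group1 = [5, 7, 6, 8, 9]   # hypothetical scores, sample 1
group2 = [3, 4, 5, 4, 4]   # hypothetical scores, sample 2
t, df = independent_t(group1, group2)
print(f"t = {t:.3f}, df = {df}")
```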
Final step in the hypothesis testing process: State your decision
Step 4:
Either reject the null (there is a difference b/t the first sample & the second sample)
Or fail to reject the null (there is no difference b/t the first sample & the second sample)
Remember that to reject the null hypothesis, your calculated t-ratio must fall beyond the t-value obtained from the table & w/in the critical region, whereas to “fail to reject the null,” your calculated t-ratio must fall outside of the critical region & w/in the “null” region (the rest of the distribution that falls outside of the critical region).
* The subscript “p” in the estimated standard error formula represents the “pooled variance” that is used in the independent-measures research design t-statistic.
* This is used when the sample sizes used are not equal.
For the t-ratio: your mean from sample 1 minus your mean of sample 2, divided by the Standard Error of the Difference.
* Remember your Degrees of Freedom or df: It describes the number of scores in a sample that are independent & free to vary.
* Remember also that your critical region is based upon your df & your set alpha level for either a one-tailed or two-tailed test.
The symbol nT represents the “total” number of ‘n’ values (subscript “T” representing “total”)
Whatever your outcome, you need to write first whether you reject or fail to reject the null, & then you need to repeat the hypothesis statement (from Step 1 in the hypothesis testing process) that is associated w/ your outcome.
Effect Size for Independent Measures t-Statistic
If your sample sizes are equal, then you can use the paired-t formula:
$d = \dfrac{t}{\sqrt{n}}$
n is representing the number of pairs of samples.
Estimated d, using the pooled variance:
$\text{estimated } d = \dfrac{M_1 - M_2}{\sqrt{s_p^2}}$
This is the formula for the effect size. Your subscript “p” comes from your pooled variance.
Percentage of Variance
$r^2 = \dfrac{t^2}{t^2 + df}$
This formula expresses your effect size in terms of percentage. r2 (for now) represents the percentage of variance, t represents your t-statistic & df, as you know, represents your degrees of freedom.
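A minimal sketch of the effect-size calculation for two independent samples, assuming the standard pooled-variance definition of estimated d (mean difference divided by the square root of the pooled variance). The function names, the means, the variances & the t-ratio are all hypothetical values used for illustration.

```python
import math

def pooled_variance(n1, var1, n2, var2):
    """Pooled variance: each sample variance weighted by its df."""
    return ((n1 - 1) * var1 + (n2 - 1) * var2) / (n1 + n2 - 2)

def estimated_d(m1, m2, sp2):
    """Estimated Cohen's d: mean difference divided by the pooled standard deviation."""
    return (m1 - m2) / math.sqrt(sp2)

def r_squared(t, df):
    """Proportion of variance accounted for: r^2 = t^2 / (t^2 + df)."""
    return t ** 2 / (t ** 2 + df)

# Hypothetical summary values for two samples of n = 5 each
sp2 = pooled_variance(5, 2.5, 5, 0.5)
d = estimated_d(7.0, 4.0, sp2)
r2 = r_squared(3.873, 8)
print(f"pooled variance = {sp2:.2f}, d = {d:.3f}, r^2 = {r2:.3f}")
```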
OUTLINE OF MATERIAL
I. t-Statistic for Independent Samples / Basic principles
II. Hypothesis testing for the t-statistic
A. State your hypothesis
1. Null: Ho : µ1 - µ2 = 0
2. Alternative: H1 : µ1 - µ2 ≠ 0
B. Locate the critical region
1. Select your alpha level
2. Determine your df value
3. Draw out your distribution & locate your critical regions based upon your choice
of a one-tailed or two-tailed test
C. Collect the sample data & compute the t statistic
1. Sample mean(s)
2. Standard deviation(s)
3. Estimated Standard Error(s)
4. Estimated Standard Error of Difference (When 2 samples are being used)
5. Pooled variance (When 2 samples are not equal in size)
6. t-Ratio
D. Make a decision to either reject the Null hypothesis or fail to reject the null hypothesis
E. Calculate your effect size
CLASS NOTES: The t Test for Two Related Samples / Repeated Measures Design
CONCEPT
CALCULATION/EXAMPLES
APPLICATION
Repeated Measures Design: A single sample of individuals is measured more than once on the same dependent variable. The same sample is used in all of the treatment conditions. 2 sets of data are obtained from the same sample of individuals.
Example:
A sample group is measured for symptoms reported before therapy & again after therapy.
* Remember your definition of dependent variable (The variable that is observed in order to assess the effect of the treatment)
* The goal is to use a sample of
difference scores to answer
questions about the general
population. Basically, we would
like to know what would happen if
every individual in the population
were measured in two treatment
conditions (X1 & X2) & the
difference score (D) were
computed for everyone. So, we are
therefore also interested in the mean
for the population of difference
scores.
* The main advantage of a
repeated-measures study is that it
uses exactly the same individuals in
all treatment conditions & there is
no risk of a difference in
participants from one sample to
another.
* A repeated measures design
typically requires fewer subjects
than an independent-measures
design & uses the subjects more
efficiently.
* It works well for studying
learning, development or other
changes that take place over time.
* Observations must be
independent
* Population distribution of
difference scores must be normal.
Formulas for Repeated Measures Hypothesis Test: Uses
the difference b/t two sample
means to evaluate the difference b/t
two population means.
Step 1: Find the mean for each
group
Step 2: Find the standard deviation
for each group
Step 3: Find the estimated standard
error of the mean for each group.
Step 4: Find the estimated standard
error of difference for both groups
combined.
Step 5: Calculate your t-statistic
$M_1 = \dfrac{\sum X_1}{n_1}$
$M_2 = \dfrac{\sum X_2}{n_2}$
$s_1 = \sqrt{\dfrac{\sum X_1^2 - \dfrac{(\sum X_1)^2}{n}}{n - 1}}$
$s_2 = \sqrt{\dfrac{\sum X_2^2 - \dfrac{(\sum X_2)^2}{n}}{n - 1}}$
$SE_{M_1} = \dfrac{s_1}{\sqrt{n}}$
$SE_{M_2} = \dfrac{s_2}{\sqrt{n}}$
$SE_D = \sqrt{SE_{M_1}^2 + SE_{M_2}^2}$
$t = \dfrac{M_1 - M_2}{SE_D}$
Although we are using one sample group, we are obtaining 2 sets of separate values, before & after. Therefore, we use the sample computational standard deviation formula.
These steps are the same as those presented in the above processes.
The mean of your first set of values minus the mean of your second set of values divided by your Standard Error of the Difference.
Hypothesis Testing Process
State your Hypothesis
Step 1:
Null hypothesis:
Ho : µD = 0
Alternative Hypothesis:
H1 : µD ≠ 0
You must write out your hypothesis statements.
Oftentimes in a related-samples t-test, the researcher has a specific prediction regarding the direction of the treatment effect, therefore he or she will often use a directional or one-tailed test.
Set the criteria for a decision by selecting your alpha level & locating your critical regions by drawing out your distribution & using your t-distribution table
Step 2:
two-tailed
α = .05
df = (n - 1) = 4
Collect your data & compute
your sample statistics
Step 3:
Calculate the means for the samples (ex. Before/after)
$M_1 = \dfrac{\sum X_1}{n_1}$
$M_2 = \dfrac{\sum X_2}{n_2}$
Calculate the standard deviation for each value set
$s_1 = \sqrt{\dfrac{\sum X_1^2 - \dfrac{(\sum X_1)^2}{n}}{n - 1}}$
$s_2 = \sqrt{\dfrac{\sum X_2^2 - \dfrac{(\sum X_2)^2}{n}}{n - 1}}$
Calculate the standard error of the mean for each sample
$SE_{M_1} = \dfrac{s_1}{\sqrt{n}}$
$SE_{M_2} = \dfrac{s_2}{\sqrt{n}}$
As opposed to the independent measures-t, which combines the degrees of freedom for each sample set, in the repeated measures-t we only have one sample set, so the df formula remains n-1.
Calculate the standard error of difference
$SE_D = \sqrt{SE_{M_1}^2 + SE_{M_2}^2}$
Calculate the t-ratio
$t = \dfrac{M_1 - M_2}{SE_D}$
The mean from the first set of values minus the mean from the second set of values divided by the Standard Error of the Difference.
Make a decision
Step 4:
Either reject the null (there is a difference)
Or fail to reject the null (there is no difference)
Whatever your outcome, you need
to write first whether or not you
reject or fail to reject the null, &
then you need to repeat the
hypothesis (from Step 1 in the
hypothesis testing process)
statement that associates w/ your
outcome.
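The concept notes describe computing a difference score (D) for each individual & testing the mean of those differences (µD). A minimal sketch of that difference-score calculation follows, assuming the standard related-samples form t = MD / (sD / √n) w/ df = n - 1, which these notes do not write out. It will generally give a different value than combining the two separate standard errors, because it uses the pairing of the scores. The function name `related_samples_t` & the before/after scores are hypothetical.

```python
import math
from statistics import mean, stdev  # stdev uses n - 1, i.e. the sample formula

def related_samples_t(before, after):
    """Return (t, df) for a repeated-measures t test on difference scores D = after - before."""
    if len(before) != len(after):
        raise ValueError("each individual needs one score in each condition")
    diffs = [a - b for b, a in zip(before, after)]
    n = len(diffs)
    m_d = mean(diffs)                       # mean of the difference scores
    se_d = stdev(diffs) / math.sqrt(n)      # estimated standard error of the mean difference
    return m_d / se_d, n - 1

# Hypothetical symptom counts for 5 individuals, so df = 4 as in Step 2
before = [10, 12, 9, 11, 13]
after = [8, 9, 8, 9, 10]
t, df = related_samples_t(before, after)
print(f"t = {t:.3f}, df = {df}")
```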
OUTLINE OF MATERIAL
I. t-Statistic for repeated measures design: Basic principles
II. Hypothesis testing for the t-statistic
A. State your hypothesis
1. Null: Ho : µD = 0
2. Alternative: H1 : µD ≠ 0
B. Locate the critical region
1. Select your alpha level
2. Determine your df value
3. Draw out your distribution & locate your critical regions based upon your choice
of a one-tailed or two-tailed test
C. Collect the sample data & compute the t statistic
1. Sample mean(s)
2. Standard deviation(s)
3. Estimated Standard Error(s)
4. Estimated Standard Error of Difference (When 2 samples are being used)
5. t-Ratio
D. Make a decision to either reject the Null hypothesis or fail to reject the null hypothesis
E. Calculate your effect size