EDFI 641
Statistics in Education
Course Packet
Dr. Rachel Vannatta
Table of Contents
Video #1—Introduction to Statistics................................................................2
Video #2—Frequency Distributions ...................................................................6
Video #3—Central Tendencies & Variability .................................................. 10
Video #4—Probability & z Score ...................................................................... 18
M-n-M Activity........................................................................................ 19
Video #5—Distribution of Sample Means...................................................... 24
Video #6—Hypothesis Testing......................................................................... 28
Video #7—t Test ................................................................................................. 38
Video #8—t Test of Independent Samples................................................... 44
Interpreting Research ......................................................................... 49
Video #9—t Test of Related Samples............................................................ 50
Interpreting Research ......................................................................... 54
Coke vs. Pepsi Experiment ................................................................... 55
Video #10—ANOVA............................................................................................. 57
Interpreting Research ......................................................................... 63
Video #11—Correlation & Regression.............................................................. 64
Interpreting Research ......................................................................... 73
Video #12—Chi Square ...................................................................................... 75
Interpreting Research .......................................................................... 81
Statistical Test Grid ............................................................................ 82
Unit Normal (z-score) Table ............................................................................. 84
t Distribution Table ............................................................................................ 88
F Distribution (ANOVA) table .......................................................................... 89
Pearson Correlation Table.................................................................................. 90
Chi Square Distribution Table ........................................................................... 91
Video #1—Introduction to Statistics
Page 2
Population—the entire group of individuals that the researcher WISHES to study.
Sample—a set of individuals selected from population, intended to represent the population
Parameter—value that describes the population
Statistic—value that describes the sample
Two major types of statistical methods
• descriptive stats—summarize, organize and simplify data (e.g., mean, standard deviation, tables,
graphs, distributions)
• data
• raw score
• inferential stats—techniques that allow us to study samples and make generalizations about the
population from which they were selected (e.g., t test, ANOVA, correlation)
• sampling error—amount of error between the sample statistic and the population parameter
(degree to which the sample differs from the population)
• random sampling—used to minimize error between sample and population
Inferential statistics also allow us to study relationships between/among variables that the sample holds.
• variable—characteristic/condition that differs among individuals (gender, height, test scores, IQ)
• construct—hypothetical concepts/theory to organize observations
• operational definition—defines a construct in terms of how it is measured
Types of Variables
• categorical variable (discrete)—consists of separate categories (e.g., gender, religion,
classification of personality)
• quantitative variable (continuous)—can be divided into an infinite number of fractional parts
(e.g., height, time, age)
• independent variable—usually a treatment that has been manipulated (control group versus
experimental group), usually categorical
• dependent variable—usually the effect, usually quantitative
• confounding variable—an uncontrolled variable that creates a difference between the control and experimental groups
Variables determine type of relationship being studied
• mutual
• causal
• Groups must be compared to examine cause and effect → groups are created by a categorical variable

Relationship | Independent Variable | Dependent Variable | Key Words
Causal | Categorical | Quantitative | Cause, Effect, Increase/Decrease, Difference
Mutual | Quantitative | Quantitative | Relate, Relationship, Predict, Associate
Page 3
Class #1: In-Class Practice Problems
In the following research questions, identify the independent and dependent variables and indicate whether each is categorical or quantitative.
1. Is there a significant relationship between college GPA and SAT scores among college freshmen?
independent variable—
dependent variable—
research design—
2. Does receiving a special diet of oat bran significantly decrease cholesterol levels among middle-age
adults? Note: Researcher compared a treatment group to a control group. Groups were created
using random selection and assignment.
independent variable—
dependent variable—
research design—
3. Does socio-economic status (low, middle, high) affect reading achievement among preschoolers?
independent variable—
dependent variable—
research design—
4. Does receiving whole-language reading instruction increase reading achievement among elementary
students? Note: Research compared treatment group (whole-language) to control group (traditional).
Existing groups were used.
independent variable—
dependent variable—
research design—
Page 4
Research Designs
• Correlational—studies relationships among 2 or more variables to explain or predict behaviors
  • usually both IV and DV are quantitative
  • example: Teacher studies the relationship between English grades and overall GPA.
• Experimental—examines cause and effect; manipulates a treatment and tests the outcome; compares the experimental and control groups (groups are randomly created)
  • IV=nominal; DV=interval/ratio
  • example: Researcher compares grades of a group of students that receive computer-assisted instruction to a group that receives none. Groups were created through random assignment.
• Quasi-Experimental—examines cause and effect; indirectly manipulates a treatment and tests the outcome; compares the experimental and control groups (uses existing groups)
  • IV=nominal; DV=interval/ratio
  • example: Researcher compares grades of a group of students that receive computer-assisted instruction to a group that receives none. Existing groups were used.
• Causal Comparative—examines cause and effect (cautiously); compares groups created by some categorical characteristic (gender, religion, ethnicity)
  • IV=nominal; DV=interval/ratio
  • example: Researcher compares final grades of male and female students.
Most research is guided by a hypothesis,
a prediction about the effect of the treatment.
Measurement Scales
• Nominal—numbers have NO numerical value but represent categories (religion, ethnicity, occupation, gender)
• Ordinal—numbers represent a rank (1 being the best); intervals between ranks can vary (e.g., class rank, Olympic ordinals)
• Interval—numbers have typical numerical value; intervals are equal; no real zero (e.g., temperature, test score)
• Ratio—same as interval but has a real zero (e.g., money, time)
Page 5
Identify the measurement scale (nominal, ordinal, interval, ratio) for each.
_________________ 5. Size of school district (small, medium, large)
_________________ 6. Rank of faculty on their teaching ratings
_________________ 7. Social security number
_________________ 8. Color of person’s eyes
_________________ 9. IQ scores
_________________ 10. Degree in Fahrenheit
_________________ 11. Religious affiliation
_________________ 12. Medalists in an Olympic event
_________________ 13. Income in actual dollars
Page 6
Video #2—Frequency Distributions
Frequency distribution—table/graph of the number of individuals located in each category
• lists scores from highest to lowest
• groups together all individuals who have the same score

X | f
10 | 1
9 | 4
8 | 5
6 | 6
5 | 2
4 | 2
Proportion and Percents of Frequency Distributions
• Proportion—relative frequencies; measures the fraction of the total group that is associated
with each score; most often appear as decimals
  • proportion: $p = \frac{f}{N}$
• Percentage—percent of the total group that is associated with each score
  • percentage: $p(100) = \frac{f}{N}(100)$

X | f | p = f/N | % = p(100) | cum f | cum %
10 | 1 | 1/20 = .05 | 5 | 1 | 5
9 | 4 | 4/20 = .20 | 20 | 5 | 25
8 | 5 | 5/20 = .25 | 25 | 10 | 50
6 | 6 | 6/20 = .30 | 30 | 16 | 80
5 | 2 | 2/20 = .10 | 10 | 18 | 90
4 | 2 | 2/20 = .10 | 10 | 20 | 100
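The table above can be rebuilt from raw scores with a few lines of Python. This is only a sketch: the function name `frequency_table` is made up for illustration, and the cumulative columns are accumulated from the highest score downward because that is how the packet's table is laid out.

```python
from collections import Counter

def frequency_table(scores):
    """Return rows of (X, f, p, percent, cum_f, cum_percent), highest score first."""
    counts = Counter(scores)
    n = len(scores)
    rows = []
    cum_f = 0
    for x in sorted(counts, reverse=True):
        f = counts[x]
        cum_f += f                      # running total, starting at the top row
        p = f / n                       # proportion = f / N
        rows.append((x, f, round(p, 2), round(p * 100), cum_f, round(cum_f / n * 100)))
    return rows

# Raw scores that produce the frequencies shown in the packet's table (N = 20).
scores = [10] * 1 + [9] * 4 + [8] * 5 + [6] * 6 + [5] * 2 + [4] * 2
table = frequency_table(scores)
for row in table:
    print(row)
```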
Page 7
Grouped Frequency Distribution Table
• used when data covers a wide range of values; groups are based on class intervals
• to construct a grouped frequency distribution table, follow these rules:
  • rule 1—number of intervals—shoot for 8-12 intervals, 10 intervals being the ideal
  • rule 2—interval width—use appropriate width to reach appropriate # of intervals
  • rule 3—interval starting pt—should be a multiple of the width
  • rule 4—all intervals should be the same width
Helpful Hints
• use the following equation to determine the number of intervals and the width of intervals that is appropriate for the data
  • $\text{number of intervals} = \frac{\text{highest score} - \text{lowest score} + 1}{\text{interval width}}$ *
  • ALWAYS round up the number of intervals! It is impossible to have a fourth of an interval at the end of the distribution. So even if the number of intervals (using the above formula) equals 8.25, round up to 9!
  • try different widths until an appropriate number of intervals is calculated
Example: N=25
51, 55, 57, 60, 63, 66, 68, 69, 70, 72, 74, 74, 74, 75, 77, 79, 83, 84, 85, 85, 88, 90, 92, 95, 98
• $\text{number of intervals} = \frac{98 - 51 + 1}{5} = \frac{48}{5} = 9.6$ (round up to 10)

X | f
95-99 | 2
90-94 | 2
85-89 | 3
80-84 | 2
75-79 | 3
70-74 | 5
65-69 | 3
60-64 | 2
55-59 | 2
50-54 | 1

* keep in mind that since a continuous variable contains an infinite number of points, a score is not assigned a single point but rather an interval with boundaries, also called real limits, that separate a score from the adjacent scores.
Example: X=88
• upper real limit = 88.5
• lower real limit = 87.5
• therefore, a score of 87.75 would fall in the interval of X=88
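The interval rules can be checked with a short script. This is a minimal sketch that assumes whole-number scores; the helper names `number_of_intervals` and `grouped_frequencies` are illustrative and not part of the course materials.

```python
import math

def number_of_intervals(high, low, width):
    # (highest score - lowest score + 1) / interval width, always rounded up
    return math.ceil((high - low + 1) / width)

def grouped_frequencies(scores, width):
    # Rule 3: each interval starts at a multiple of the width.
    bottom = (min(scores) // width) * width
    top = (max(scores) // width) * width
    table = []
    for lower in range(top, bottom - 1, -width):   # highest interval first
        upper = lower + width - 1
        f = sum(1 for x in scores if lower <= x <= upper)
        table.append((f"{lower}-{upper}", f))
    return table

scores = [51, 55, 57, 60, 63, 66, 68, 69, 70, 72, 74, 74, 74, 75, 77, 79,
          83, 84, 85, 85, 88, 90, 92, 95, 98]
k = number_of_intervals(max(scores), min(scores), 5)
grouped = grouped_frequencies(scores, 5)
print(k)
for interval, f in grouped:
    print(interval, f)
```

Trying a different width (for example 10) shows quickly whether the result still lands in the 8-12 interval range.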
Frequency Distribution Graphs
• Uses an x-axis to represent scores and a y-axis to represent frequencies
• List scores increasing in value from left to right
• List frequencies in increasing value from bottom to top
• The height of the y-axis should be approximately 2/3 to 3/4 of the length of the x-axis
Page 8
• Creating a Grouped Frequency Histogram—follow rules for Grouped Frequency Table
  • Histogram—used for interval/ratio data; a bar represents an interval (real limits of the score or class interval); bars touch each other to represent the continuous nature of the data; height corresponds to frequency
  • Example: Using data from the Grouped Frequency Table on previous page
[Figure: grouped frequency histogram; x-axis: class intervals 50-54 through 95-99; y-axis: frequency (f), 1 to 5. Annotations: starting pt. is a multiple of the width (5); interval width is 5 in order to generate 10 intervals; 10 intervals meet the 8-12 interval requirement.]
Other Types of Frequency Distribution Graphs and Polygons
• Bar Graph—used for nominal/ordinal data; a bar represents a category, bars do not touch
• Frequency Distribution Polygons—used for interval/ratio data; a single dot represents an individual score or a class interval; dots are connected
• Distribution Curve—shows relative frequencies for the population; smooth
  • Normal—symmetrical; greatest frequency in the middle, smallest frequency in the extremes (tails)
  • Positively Skewed—smallest frequency in the positive (right) end of the distribution
  • Negatively Skewed—smallest frequency in the negative (left) end of the distribution
Video # 2
In-Class Practice Problems
Page 9
10, 15, 18, 22, 25, 26, 29, 31, 33, 33, 34, 37, 38, 39, 39, 40, 40, 40, 41, 42, 42, 43, 44, 45, 46, 46, 47,
48, 49, 50
1. Using the data above, do the following:
a. construct a histogram based upon the grouped frequency distribution
b. determine the distribution type (normal, positive, negative) from the histogram
Video #3: Central Tendency
Page 10
Measure of Central Tendency
• describes a group of individuals with a single measurement that is most representative of all individuals
• Types: mean, median, and mode
• Mean—arithmetic average
  • used for interval/ratio (quantitative) data
  • computed by adding all the scores and dividing by the number of scores
  • Population mean: $\mu = \frac{\sum X}{N}$
  • Sample mean: $\bar{X} = \frac{\sum X}{n}$
• Median—the midpoint; the score that divides the distribution exactly in half; 50% are above and below the median
  • used for ordinal data or when: there is a skewed distribution, some scores are undetermined, or there is an open-ended distribution
  • Calculating the median when N is an odd number
    • make sure scores are in order; find the middle score
  • Calculating the median when N is an even number
    • make sure scores are in order; find the two middle scores; add the two scores & divide by 2
• Mode—the most frequent score
  • used especially for nominal data
  • represented by the highest point in the frequency distribution
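The three measures can be computed with Python's standard `statistics` module. The sketch below uses two made-up data sets (they are not from the packet): one symmetric, and one with a single extreme high score to show how the mean gets pulled toward the extreme while the median and mode stay put.

```python
from statistics import mean, median, mode

# Hypothetical symmetric sample: all three measures agree.
symmetric = [2, 3, 3, 4, 4, 4, 5, 5, 5, 5, 6, 6, 6, 7, 7, 8]
sym_mean, sym_median, sym_mode = mean(symmetric), median(symmetric), mode(symmetric)

# Hypothetical sample with one extreme high score.
skewed = [1, 2, 2, 3, 3, 3, 4, 20]
skew_mean, skew_median, skew_mode = mean(skewed), median(skewed), mode(skewed)

print(sym_mean, sym_median, sym_mode)
print(skew_mean, skew_median, skew_mode)
```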
Central Tendency and the Shape of Distributions
• Normal distribution—mean, median, and mode are equal and smack-dab in the middle of the distribution
• Skewed Distributions
  • not symmetrical
  • mean, median, mode are different
  • extreme scores on one end of the distribution
  • Mean is most affected by extreme scores, so it will be furthest out in the tail
Negatively Skewed—extreme scores are on the low end of the distribution
[Figure: negatively skewed curve; order from left to right: Mean, Median, Mode]
Page 11
Positively Skewed—extreme scores are on the high end of the distribution
[Figure: positively skewed curve; order from left to right: Mode, Median, Mean]
Variability
Variability—a measure that describes how spread out or close together the scores are within
the distribution
• Range—distance between the highest score and the lowest score in the distribution; easiest measure of variability
  • range = (high score - low score)
Distribution 1: Range = 10, Mean = 6, Median = 6, Mode = 6, SD = 2.45
[Figure: frequency histogram of Distribution 1; x-axis: scores 1 to 11; y-axis: frequency 0 to 7]
Page 12
• Standard Deviation from the Mean
  • most common measure of variability
  • average distance of scores from the mean
[Figure: frequency histograms of Distribution 1 and Distribution #2 for comparing variability; x-axis: scores 1 to 11; y-axis: frequency]
Page 13
Page 14
Standard Deviation Activity
o Need 16 pieces of candy (M-n-M’s, Skittles, etc.)
o You must use all 16 pieces for each distribution.
o Use Distribution Graph from Blackboard Course Site (located in Course Documents)
Steps
1. For distribution A create a normal distribution like Dr. Vannatta’s with your candy. Trace outline of distribution.
Now on your own, complete the following:
2. For distribution B, move candy around to create a distribution that has greater
variability than A. Trace outline of distribution.
3. For distribution C, move candy around to create a distribution that has less
variability than A. Trace outline of distribution.
4. For distribution D, move candy around to create a distribution that has the
least possible amount of variability.
Variability Key Concepts
• Variability shows how spread out scores are in the distribution.
  • Range only takes into account the two extreme scores (highest and lowest)
  • Standard deviation compares all scores to the mean
• When scores are close to the mean, then variability is less.
• When scores are far from the mean (outliers, extreme ends of the distribution), then variability is more.
Calculating Standard Deviation
• standard deviation for population: $\sigma = \sqrt{\frac{\sum (X - \mu)^2}{N}}$
• standard deviation for sample: $s = \sqrt{\frac{\sum (X - \bar{X})^2}{n - 1}}$
• degrees of freedom (df = n - 1)—an adjustment of sample bias; to calculate the standard deviation, we must know the sample mean—this places a restriction on sample variability since only (n - 1) scores are free to vary once we know the sample mean.
Page 15
• Example for calculating the standard deviation for a sample

$X$ | $\bar{X}$ | $X - \bar{X}$ | $(X - \bar{X})^2$
2 | 5 | -3 | 9
3 | 5 | -2 | 4
3 | 5 | -2 | 4
4 | 5 | -1 | 1
4 | 5 | -1 | 1
4 | 5 | -1 | 1
5 | 5 | 0 | 0
5 | 5 | 0 | 0
5 | 5 | 0 | 0
5 | 5 | 0 | 0
6 | 5 | 1 | 1
6 | 5 | 1 | 1
6 | 5 | 1 | 1
7 | 5 | 2 | 4
7 | 5 | 2 | 4
8 | 5 | 3 | 9
SS = 40

• Sum of squares—sum of squared deviation scores or sum of squared differences
  • $SS = \sum (X - \bar{X})^2$; also $SS = s^2(n - 1)$
• Variance—mean of squared deviation scores; sum of squares divided by the number of scores minus 1
  • $s^2 = \frac{\sum (X - \bar{X})^2}{n - 1} = \frac{SS}{n - 1} = \frac{40}{15} \approx 2.67$
• Standard deviation—square root of the variance
  • $s = \sqrt{\frac{\sum (X - \bar{X})^2}{n - 1}} = \sqrt{\frac{SS}{n - 1}} = \sqrt{2.67} \approx 1.63$

Steps to Calculate Standard Deviation
1. Calculate mean ($\bar{X}$)
2. Calculate the difference between each score and the mean ($X - \bar{X}$)
3. Square each difference: $(X - \bar{X})^2$
4. Add the squared differences
   • This is the Sum of Squares: $SS = \sum (X - \bar{X})^2$
5. Divide SS by degrees of freedom (df = n-1)
   • This is Variance: $s^2 = \frac{\sum (X - \bar{X})^2}{n - 1}$
6. Take the square root of variance
   • This is the Standard Deviation: $SD = \sqrt{\frac{\sum (X - \bar{X})^2}{n - 1}}$
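The six steps translate almost line for line into code. The sketch below uses the sixteen scores from the worked example; the function name `sample_standard_deviation` is illustrative, and the result is compared against `statistics.stdev`, which also divides by n - 1.

```python
import math
import statistics

def sample_standard_deviation(scores):
    n = len(scores)
    mean = sum(scores) / n                          # step 1: mean
    deviations = [x - mean for x in scores]         # step 2: X minus the mean
    squared = [d ** 2 for d in deviations]          # step 3: square each difference
    ss = sum(squared)                               # step 4: sum of squares (SS)
    variance = ss / (n - 1)                         # step 5: divide by df = n - 1
    sd = math.sqrt(variance)                        # step 6: square root
    return ss, variance, sd

scores = [2, 3, 3, 4, 4, 4, 5, 5, 5, 5, 6, 6, 6, 7, 7, 8]
ss, variance, sd = sample_standard_deviation(scores)
library_sd = statistics.stdev(scores)               # standard-library check
print(ss, round(variance, 2), round(sd, 2))
```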
Page 16
Standard Deviation Calculation Practice
a. Calculate the standard deviation for the following data ($\bar{X}$=6).

$X$ | $\bar{X}$ | $X - \bar{X}$ | $(X - \bar{X})^2$
2 | | |
2 | | |
8 | | |
8 | | |
10 | | |
SS=

b. Calculate the standard deviation for the following data ($\bar{X}$=6). Notice the mean is the same, but three scores have been changed to 6.

$X$ | $\bar{X}$ | $X - \bar{X}$ | $(X - \bar{X})^2$
2 | | |
6 | | |
6 | | |
6 | | |
10 | | |
SS=

How does the change in data affect the SD? Why?
Characteristics of standard deviation
• a small standard deviation indicates that scores are close together
• a large standard deviation indicates that scores are spread out
• adding a constant to each score will not change the standard deviation
• multiplying each score by a constant causes the standard deviation to be multiplied by that same constant
• research articles usually use (SD) to refer to the standard deviation
• Standard deviation and the normal distribution
  • three standard deviations on each side of the mean
[Figure: normal curve with the x-axis marked at -3σ, -2σ, -1σ, mean, +1σ, +2σ, +3σ]
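The two constant rules listed among the characteristics of standard deviation are easy to confirm numerically. This is a small sketch with an arbitrary data set and arbitrary constants (10 and 3), chosen only for illustration.

```python
import math
from statistics import stdev

scores = [2, 3, 3, 4, 4, 4, 5, 5, 5, 5, 6, 6, 6, 7, 7, 8]

shifted = [x + 10 for x in scores]   # add a constant to every score
scaled = [x * 3 for x in scores]     # multiply every score by a constant

sd_original = stdev(scores)
sd_shifted = stdev(shifted)          # expected: unchanged
sd_scaled = stdev(scaled)            # expected: 3 times the original

print(round(sd_original, 2), round(sd_shifted, 2), round(sd_scaled, 2))
```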
Video #3: In-Class Practice Problems
For the following sample of scores: 1, 2, 3, 3, 4, 4, 5, 5, 5, 5, 6, 6, 7, 7, 8, 9
Page 17
This data is slightly different from what is presented in the video so that a “cleaner” mean would be calculated.
a. Sketch a frequency distribution histogram.
b. Calculate the following:
mean = ____________________
median = ____________________
mode = ____________________
range = ____________________
degrees of freedom = ____________________
standard deviation = ____________________
c. From your calculations, identify the distribution type.
Page 18
Video #4: Probability
Probability is used to:
• determine the types of sample we are likely to obtain from a population
• make conclusions about the population from the sample
Probability—fraction, proportion or percent of selecting a specific outcome out of the
total number of possible selections
• probability of A: $p(A) = \frac{\text{number of A's}}{\text{total number of possible outcomes}}$
• probability of selecting a heart out of a deck of cards
  • $p(\text{heart}) = \frac{13}{52} = \frac{1}{4} = .25$ → 25%
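The same fraction, proportion, and percent can be produced with Python's `fractions` module, which reduces 13/52 automatically. The helper name `probability` is only illustrative.

```python
from fractions import Fraction

def probability(favorable, total):
    # number of A's divided by the total number of possible outcomes
    return Fraction(favorable, total)

p_heart = probability(13, 52)
print(p_heart)                # reduced fraction
print(float(p_heart))         # proportion
print(float(p_heart) * 100)   # percent
```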
Probability and the Normal Distribution
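• A normal distribution holds 100% of the individuals in it
• the mean, median and mode are all equal and divide the distribution in half
• 50% of distribution is above and below the mean
• When the percent is divided by the standard deviations, it looks like this:

Region | Percent of scores
below -3σ | .13%
-3σ to -2σ | 2.14%
-2σ to -1σ | 13.59%
-1σ to mean | 34.13%
mean to +1σ | 34.13%
+1σ to +2σ | 13.59%
+2σ to +3σ | 2.14%
above +3σ | .13%

Within ±1σ: 68%; within ±2σ: 95%; within ±3σ: 99.7%

Position | Cumulative percent below
-3σ | 0.13%
-2σ | 2.28%
-1σ | 15.87%
mean | 50%
+1σ | 84.13%
+2σ | 97.72%
+3σ | 99.87%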
Page 19
Page 20
z Scores
z score—measure of relative position; identifies position of a raw score in terms of the number of
standard deviations it falls above or below the mean
• Use z scores to convert raw score into percentile rank
$z = \frac{X - \mu}{\sigma}$
Example: Jill gets a raw score of 55 on a standardized math test (μ=50, σ=10). What is Jill’s z score?
• $z = \frac{X - \mu}{\sigma} = \frac{55 - 50}{10} = \frac{5}{10} = .5$
• So Jill is .5 standard deviation above the mean.
[Figure: normal curve with the x-axis marked in z scores (-3z, -2z, -1z, mean, +1z, +2z, +3z); areas between adjacent z scores: .13%, 2.14%, 13.59%, 34.13%, 34.13%, 13.59%, 2.14%, .13%; cumulative percent at each z score: 0.13%, 2.28%, 15.87%, 50%, 84.13%, 97.72%, 99.87%; 68%, 95%, and 99.7% of scores fall within 1z, 2z, and 3z of the mean]
View area under the normal curve in terms of probability and percent:
• What is the probability of selecting a score that falls beyond 1z? p=.1587
• What is the probability of selecting a score that falls below -2z? p=.0228
• What is the percentile rank of someone who has a z score of 2? 98th %tile
• What is the percentile rank of someone who has a z score of 1? 84th %tile
• What if we have a z-score of 1.2, how can we find the probability or percentile rank?
  • we use the table of z scores provided in your course packet (see statistical tables on page 84)
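For z scores that fall between whole numbers, software can stand in for the printed z table. The sketch below uses `statistics.NormalDist` from the Python standard library (Python 3.8 or later); the helper names `z_score` and `percentile_rank` are illustrative, and values may differ from the printed table in the last decimal place because of rounding.

```python
from statistics import NormalDist

def z_score(x, mu, sigma):
    # number of standard deviations a raw score falls above or below the mean
    return (x - mu) / sigma

def percentile_rank(z):
    # percent of the normal distribution falling below z
    return NormalDist().cdf(z) * 100

z_jill = z_score(55, mu=50, sigma=10)        # Jill's example from the packet
beyond_1z = 1 - NormalDist().cdf(1)          # area above +1z
below_minus_2z = NormalDist().cdf(-2)        # area below -2z
rank_for_1_2 = percentile_rank(1.2)          # the z = 1.2 question

print(z_jill, round(beyond_1z, 4), round(below_minus_2z, 4), round(rank_for_1_2, 2))
```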
Page 21
Putting it all together
• Suppose Jack receives a raw score of 540 on the SAT-math (μ=500, σ=100). What is Jack’s z score and percentile rank?
  • $z = \frac{540 - 500}{100} = .4$
  • Proportion (p) = .6554
  • Rank = 65.54 %tile
[Figure: normal curve for SAT-math with Jack’s score marked; x-axis: raw scores 200, 300, 400, 500, 600, 700, 800 aligned with -3z, -2z, -1z, 0z, +1z, +2z, +3z]
• Use z score to determine an unknown raw score
  • Suppose an individual scored at the 70th percentile on a standardized test (μ=100, σ=10), but for some reason we don’t know his raw score and need to calculate it.
  1. Use the equation: raw score = μ + zσ
  2. Use the percentile rank and convert it to a probability (example: 70% → .7000).
  3. Use the z-table to identify the z-score associated with the probability
     • .7000 corresponds to a z-score of z=.52 (Notice that we could not find a probability of exactly .7000 but had to find a probability that was closest to .7000, which was .6985).
  • Now just plug z, μ, and σ into our equation
    • raw score = μ + zσ
    • raw score = 100 + .52(10)
    • raw score = 105.2
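Going from a percentile back to a raw score can also be done in code. This sketch uses `NormalDist.inv_cdf` from the standard library in place of the table lookup; the function name `raw_score_from_percentile` is illustrative. Because it does not round z to two decimals before multiplying, the answer can differ slightly from the hand calculation.

```python
from statistics import NormalDist

def raw_score_from_percentile(percentile, mu, sigma):
    # step 2: percentile -> probability; step 3: probability -> z
    z = NormalDist().inv_cdf(percentile / 100)
    # step 1: raw score = mu + z * sigma
    return z, mu + z * sigma

z_70, raw_70 = raw_score_from_percentile(70, mu=100, sigma=10)
print(round(z_70, 2), round(raw_70, 1))
```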
Video #4: In-Class Practice Problems
Page 22
For the problems 1-4, apply the parameters (μ = 50, σ = 5).
1. Draw the distribution. Include z-scores and mean and standard deviation.
2. Bebe scored 48. Place Bebe’s score on the distribution. What is her z-score and percentile rank?
3. Kenny scored 63. Place Kenny’s score on the distribution. What is his z-score and percentile rank?
4. Sally is at the 71st percentile. Place Sally on the distribution. What is her z-score and raw score?
Page 23
For the problems 5-7, use the following parameters from the GRE (μ = 500, σ = 100).
5. Mary scored 570. What is her z-score and percentile rank?
6. Dick scored 340. What is his z-score and percentile rank?
7. Jill is at the 38th percentile. What is her raw score?
For the problems 8-10, use the parameters from an IQ test (μ = 100, σ = 15).
8. Wendy scored at the 90th percentile. What is her raw score?
9. What percent falls between the scores of 100 and 115?
10. Jack scored 80. What is his z score and percentile rank?
Answers for Class #4 In-Class Problems:
5) z=.7, percentile rank=75.8; 6) z=-1.6, percentile rank=5.48; 7) z=-.31, raw score = 469; 8) z=1.28, raw
score = 119.2; 9) 34.13% fall between the mean and 1z; 10) z=-1.33, percentile rank = 9.2
Video #5: Distribution of Sample Means
Page 24
With statistics, we are usually trying to make conclusions/inferences about the population from the
studied sample.
• Consequently, we want to compare the sample to the population of similar samples. But in doing so, two issues arise:
  • How do we know if a sample is representative of the population when every sample is different?
  • How can we transform a population distribution of individuals to a population distribution of sample means?
• Every sample is different from the population; this is known as sampling error, or the discrepancy/error between the sample and the population.
  • Random sampling is used to minimize sampling error, which can occur randomly
If we were to take a population distribution of individuals. . .
• randomly group individuals into similar sized samples
• then calculated the means of these samples and placed them into a frequency distribution
• a normal curve would form—this distribution is known as the distribution of sample means.
• any distribution that is of sample statistics and NOT individual scores is referred to as a
sampling distribution.
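The thought experiment above can be imitated with a small simulation. This is only a sketch under stated assumptions: the population is taken to be normal with μ = 100 and σ = 15, the sample size (25), number of samples (2000), and random seed are arbitrary, and the function name `simulate_sample_means` is made up for illustration.

```python
import random
import statistics

def simulate_sample_means(mu, sigma, n, num_samples, seed=0):
    rng = random.Random(seed)                 # fixed seed so the run is repeatable
    means = []
    for _ in range(num_samples):
        sample = [rng.gauss(mu, sigma) for _ in range(n)]   # one random sample
        means.append(statistics.fmean(sample))              # its sample mean
    return means

means = simulate_sample_means(mu=100, sigma=15, n=25, num_samples=2000)
mean_of_means = statistics.fmean(means)       # should sit close to mu = 100
sd_of_means = statistics.stdev(means)         # should sit close to 15 / sqrt(25) = 3
print(round(mean_of_means, 2), round(sd_of_means, 2))
```

The spread of the sample means is much narrower than the spread of individual scores, which is the pattern the distribution of sample means is meant to capture.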
Characteristics of the distribution of sample means
• will approach a normal distribution as sample size increases (a sample size greater than 30 is considered normal)
• the mean of the distribution of sample means is equal to the population mean of individuals and is also known as the expected value of $\bar{X}$.
• standard deviation of this new distribution is called the standard error of $\bar{X}$.
• standard error ($\sigma_{\bar{X}}$)—measures the standard distance between the sample mean ($\bar{X}$) and the population mean (μ); indicates how good an estimate $\bar{X}$ will be for μ.
  • $\sigma_{\bar{X}} = \frac{\sigma}{\sqrt{n}}$
  • as sample size increases, the standard error will decrease → which means that the samples are more representative of the population
Page 25
Probability and the Distribution of Sample Means
We can now use the distribution of sample means to find the probability of obtaining a specific
sample mean from the population of samples
• Example: What is the probability of getting a sample mean of 515 or higher on the SAT-math (μ=500, σ=100) with a random sample of n = 400?
  • Calculate the standard error for samples of n=400.
    • $\sigma_{\bar{X}} = \frac{\sigma}{\sqrt{n}} = \frac{100}{\sqrt{400}} = \frac{100}{20} = 5$
  • Draw distribution of sample means

z | pop of individuals | pop of samples (n=400)
-3z | 200 | 485
-2z | 300 | 490
-1z | 400 | 495
0z | 500 | 500
+1z | 600 | 505
+2z | 700 | 510
+3z | 800 | 515

  • A sample mean of 515 corresponds to +3z
  • Using the z table, +3z corresponds to a probability of .0013 (.13%)
Page 26
• What if the sample mean does not correspond to a whole z score?
  • Use $z = \frac{\bar{X} - \mu}{\sigma_{\bar{X}}}$
• Example: What is the probability of getting a sample mean of 104 or higher on an IQ test (μ=100, σ=15) with a random sample of n = 36?
  • Calculate the standard error for samples of n=36.
    • $\sigma_{\bar{X}} = \frac{\sigma}{\sqrt{n}} = \frac{15}{\sqrt{36}} = \frac{15}{6} = 2.5$
  • Draw distribution of sample means

z | pop of individuals | pop of samples (n=36)
-3z | 55 | 92.5
-2z | 70 | 95
-1z | 85 | 97.5
0z | 100 | 100
+1z | 115 | 102.5
+2z | 130 | 105
+3z | 145 | 107.5

  • A sample mean of 104 corresponds to +1.6z
    • $z = \frac{\bar{X} - \mu}{\sigma_{\bar{X}}} = \frac{104 - 100}{2.5} = \frac{4}{2.5} = 1.6$
  • Using the z table, +1.6z corresponds to a probability of .0548 (5.48%)
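Both worked examples follow the same three moves: standard error, z, then the tail probability. The sketch below bundles them into one hypothetical helper, `prob_sample_mean_at_least`, and uses `statistics.NormalDist` in place of the z table.

```python
import math
from statistics import NormalDist

def prob_sample_mean_at_least(sample_mean, mu, sigma, n):
    standard_error = sigma / math.sqrt(n)          # sigma / sqrt(n)
    z = (sample_mean - mu) / standard_error        # z for the sample mean
    p = 1 - NormalDist().cdf(z)                    # area in the upper tail
    return standard_error, z, p

# SAT-math example: mean of 515 or higher with n = 400
se_sat, z_sat, p_sat = prob_sample_mean_at_least(515, mu=500, sigma=100, n=400)
# IQ example: mean of 104 or higher with n = 36
se_iq, z_iq, p_iq = prob_sample_mean_at_least(104, mu=100, sigma=15, n=36)

print(se_sat, z_sat, round(p_sat, 4))
print(se_iq, z_iq, round(p_iq, 4))
```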
Page 27
Video #5: In-Class Practice Problems
1. A normal population has μ = 70 and σ = 12.
a. Sketch the population distribution. What proportion of the scores have values greater than a score of X = 73?
b. Sketch the distribution of sample means for samples of size n = 16. What proportion of the means have values greater than a mean of $\bar{X}$ = 73?
[Blank chart to fill in; columns: -3z, -2z, -1z, 0z, +1z, +2z, +3z; rows: pop of individuals, pop of samples (n=16)]
2. For a normal population with μ = 70 and σ = 20, what is the probability of obtaining a sample mean greater than $\bar{X}$ = 75
a. For a random sample of n = 4?
b. For a random sample of n = 16?
c. For a random sample of n = 100?
[Blank chart to fill in; columns: -3z, -2z, -1z, 0z, +1z, +2z, +3z; rows: pop of samples (n=4), pop of samples (n=16), pop of samples (n=100)]
Video #6: Hypothesis Testing
Page 28
Hypothesis Testing—using sample data to evaluate a hypothesis (prediction) about the population so
conclusions/inferences can be made about the population from the sample
• We are testing a hypothesis to determine if the treatment has caused a significant change in the population
• the majority of sample means are in the middle of the distribution; so for a sample to be significantly different, it should be with the extreme means in the tails of the distribution, where the probability is very low
Steps in Hypothesis Testing
1. Stating the Hypotheses
2. Establish significance criteria
3. Collect and analyze data
4. Evaluate null hypothesis
5. Draw conclusion
Step 1—Stating the Hypotheses
• hypotheses should be stated in terms of the population
• like a research question, your hypothesis should include three parts: variables, relationship, and sample
• two hypotheses must be developed—an alternative and a null
  • alternative hypothesis—the actual prediction about the change or relationship that may occur in the population
  • null hypothesis—statement that the treatment has no effect on the population
  • Write alternative hypothesis in statement form
  • Write notation for both alternative and null
• hypotheses can also be directional or non-directional
  • non-directional—just a prediction of a change/effect
    • Key words: effect, impact, difference, cause
  • directional—a prediction of increase or decrease
    • Key words: increase, decrease, higher, lower, positive, negative
Summary of Hypotheses Notation
(applying example values of μ=60)

Test | Alternative | Null
One-tailed (Directional) | H1: μsprog > 60 | H0: μsprog ≤ 60
One-tailed (Directional) | H1: μsprog < 60 | H0: μsprog ≥ 60
Two-tailed (Non-directional) | H1: μsprog ≠ 60 | H0: μsprog = 60
Page 29
• Example: Suppose that a local school district implemented an experimental program for science education. After one year, 100 children in the special program obtained a mean score of $\bar{X}$=63 on a national science achievement test (μ=60, σ=12). Did the program have an impact on the participants’ science achievement?
  • alternative—The science program will significantly affect science achievement among program participants. This is an example of a non-directional hypothesis.
    • H1: μsprog ≠ 60
  • null—The science program will NOT significantly affect science achievement among program participants.
    • H0: μsprog = 60
Step 2—Establish significance criteria
• How much does the population need to change to show a significant effect from the treatment?
• Is the change due to the treatment or sampling error?
• Typically to be significantly different, we require the sample to be different from 95% or 99% of the population
  • By setting a benchmark or criterion that requires the change in the population mean to be quite large and the probability of this change being due to chance to be very low, we decrease our chance of a Type I error
• this criterion is known as the level of significance or alpha level (α)
• most commonly used alpha levels are .05 (5%) and .01 (1%)
• these levels of significance correspond with specific z scores, which depend upon whether the hypothesis is directional or non-directional
  • non-directional hypothesis → 2-tailed test
    • .05 level → zcritical = ± 1.96
    • .01 level → zcritical = ± 2.58
    [Figure: normal curve with both tails marked at ±1.96z (middle 95%) and ±2.58z (middle 99%)]
  • directional hypothesis → 1-tailed test
    • .05 level → zcritical = + or - 1.65
    • .01 level → zcritical = + or - 2.33
    [Figure: normal curve with one tail marked at +1.65z (95% below) and +2.33z (99% below)]
• when the sample mean exceeds the limit, then it differs significantly so we would reject the null
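The critical values quoted for each alpha level can be recovered from the normal distribution itself. This sketch uses `NormalDist.inv_cdf` from the standard library; the function name `z_critical` is illustrative. Note that the one-tailed .05 value comes out as 1.645, which printed tables commonly round to 1.65 (or 1.64).

```python
from statistics import NormalDist

def z_critical(alpha, two_tailed):
    # A two-tailed test splits alpha between both tails; a one-tailed test does not.
    tail_area = alpha / 2 if two_tailed else alpha
    return NormalDist().inv_cdf(1 - tail_area)

two_tailed_05 = z_critical(0.05, two_tailed=True)
two_tailed_01 = z_critical(0.01, two_tailed=True)
one_tailed_05 = z_critical(0.05, two_tailed=False)
one_tailed_01 = z_critical(0.01, two_tailed=False)

print(round(two_tailed_05, 2), round(two_tailed_01, 2))
print(round(one_tailed_05, 3), round(one_tailed_01, 2))
```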
Page 30
Step 3—Collect & analyze sample data--random selection highly recommended so that
sample is representative of population
•
•
Recall that when a test statistic is calculated by hand, you need to identify the critical value
(zcritical), which is then compared to the test statistic (zcalculated) to determine significance.
Computer automatically determines the probability of obtaining a test statistic due to chance.
Consequently, when determining significance you do NOT compare zcalculated to zcritical, rather you
examine the p-value or level of significance.
•
•
If p (or sig) is less than alpha level (.05 or .01) Ætest statistic is significantÆreject
the null.
If p (or sig) is greater than alpha level (.05 or .01) Ætest statistic is NOT
significantÆfail to reject the null.
Decision-making Table
                    Comparison                 Significance?    Decision?             Conclusion
Hand Calculations   zcalculated ≥ zcritical    Significance!    Reject Null           Restate Alternative
                    zcalculated < zcritical    Not!             Fail to Reject Null   Restate Null
Computer            p ≤ alpha                  Significance!    Reject Null           Restate Alternative
                    p > alpha                  Not!             Fail to Reject Null   Restate Null
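The computer rows of the decision-making table can be written as a tiny function. This is only an illustrative sketch; the name `decide` and the returned strings are assumptions chosen to mirror the table's wording.

```python
def decide(p_value: float, alpha: float) -> tuple[str, str]:
    """Return (decision, conclusion) following the computer rows of the table."""
    if p_value <= alpha:
        # Significant result: reject the null and restate the alternative.
        return "reject null", "restate alternative"
    # Not significant: fail to reject the null and restate it.
    return "fail to reject null", "restate null"

print(decide(0.03, 0.05))  # same p-value, more lenient alpha
print(decide(0.03, 0.01))  # same p-value, stricter alpha
```

The same p-value can lead to different decisions depending on the alpha level chosen in Step 2, which is why alpha must be set before the data are analyzed.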
Step 4: Evaluate the null hypothesis
• Compare the data with the null
  • if the sample data is significantly different, then reject the null
  • if the sample data is NOT significantly different, then fail to reject the null
Step 5: Draw conclusion
• If the null is rejected → restate the alternative hypothesis as the conclusion.
• If you fail to reject the null → state the null hypothesis as the conclusion.
Errors in Hypothesis Testing: Two types of errors are possible when testing a hypothesis:
• Type I Error: we could make the mistake of rejecting the null when H0 is really true, that is, when there really isn't a significant change due to the treatment
  • this kind of error may be due to sampling error (the sample was above the population mean even before the treatment)
  • minimize a Type I error by setting a low alpha (α) level (low probability of making an error)
  • Type I error is more serious!
• Type II Error: we could make the mistake of not rejecting the null when we should have, that is, when there really is a significant change due to the treatment
  • the treatment effect was not big enough, most likely due to sampling error (the sample was below the population mean even before the treatment)
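The link between alpha and Type I error can be illustrated with a small simulation. The sketch below is an assumption-laden illustration, not part of the course materials: the function name, the sample size of 25, the number of trials, and the seed are all made up for the demonstration. It repeatedly samples from a population in which the null is true (no treatment effect) and counts how often a two-tailed z test rejects the null anyway.

```python
import random
from math import sqrt
from statistics import NormalDist, fmean

def type_i_error_rate(alpha=0.05, mu=60.0, sigma=12.0, n=25, trials=5000, seed=0):
    """Estimate how often a two-tailed z test rejects a TRUE null hypothesis."""
    rng = random.Random(seed)
    z_crit = NormalDist().inv_cdf(1 - alpha / 2)
    se = sigma / sqrt(n)
    rejections = 0
    for _ in range(trials):
        # Draw a sample from the null population (no treatment effect at all).
        sample = [rng.gauss(mu, sigma) for _ in range(n)]
        z = (fmean(sample) - mu) / se
        if abs(z) >= z_crit:
            rejections += 1  # a Type I error: the null was true but rejected
    return rejections / trials

print("alpha = .05:", type_i_error_rate(alpha=0.05))
print("alpha = .01:", type_i_error_rate(alpha=0.01))
```

The estimated rejection rate should land near the chosen alpha, so lowering alpha lowers the chance of a Type I error.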
Page 31
Putting it all together → Example of a two-tailed test
• Let's go back to our previous example of the science program: After one year, 100 children in the special program obtained a mean score of 63 on a national science achievement test (μ=60, σ=12). Did the program have an impact on the participants' science achievement? Test at the .05 level.
• Step 1: Develop hypotheses
  • State alternative: Special science program will significantly affect science achievement among program participants.
  • Determine if it is a one-tailed or two-tailed test.
    • It is a non-directional hypothesis ------> two-tailed
    • Notation: H1: μsprog ≠ 60
                H0: μsprog = 60
• Step 2: Establish significance criteria
  • Computer → α = .05
  • Hand calculations → identify z scores used for the alpha level and the appropriate test.
    • two-tailed test at .05 corresponds to zcritical = ± 1.96
• Step 3: Collect and analyze sample data
  • Computer → enter and analyze data
  • Hand calculations →
    • Calculate standard error
      $\sigma_{\bar{X}} = \dfrac{\sigma}{\sqrt{n}} = \dfrac{12}{\sqrt{100}} = \dfrac{12}{10} = 1.2$
    • Draw distribution of sample means and shade in critical region (middle 95% unshaded; critical region beyond -1.96z and +1.96z)

      z score                        -3z    -2z    -1z     0z    +1z    +2z    +3z
      pop of individuals              24     36     48     60     72     84     96
      pop of sample means (n=100)   56.4   57.6   58.8     60   61.2   62.4   63.6
Page 32
• Step 4: Compare sample data to null
  • Computer →
    • Identify test statistic and level of significance (p-value) in output
      • z = 2.49, p = .0128 (two-tailed)
    • Compare level of significance with alpha level
      • p-value of .0128 is less than .05 → it is significant → reject null
  • Hand calculations →
    • Calculate test statistic
    • Convert sample mean into z score to determine if it falls in critical region.
      $z = \dfrac{\bar{X} - \mu}{\sigma_{\bar{X}}} = \dfrac{63 - 60}{1.2} = \dfrac{3}{1.2} = 2.5$
    • It exceeds +1.96z, so it is significant; reject the null.
• Step 5: Draw conclusion
  • Null is rejected, so the alternative hypothesis is restated as the conclusion.
  • Participation in the science program did significantly affect science achievement scores among program participants.
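The hand calculation for the two-tailed example can be checked with a few lines of Python. This is a minimal sketch using only the standard library; the function name `two_tailed_z_test` is an assumption for illustration. It uses the summary values from the example (sample mean 63, μ = 60, σ = 12, n = 100).

```python
from math import sqrt
from statistics import NormalDist

def two_tailed_z_test(sample_mean, mu, sigma, n):
    """Return (standard error, z, two-tailed p) for a single-sample z test."""
    se = sigma / sqrt(n)
    z = (sample_mean - mu) / se
    # Two-tailed p: area beyond |z| in both tails of the normal curve.
    p = 2 * (1 - NormalDist().cdf(abs(z)))
    return se, z, p

se, z, p = two_tailed_z_test(sample_mean=63, mu=60, sigma=12, n=100)
print(f"SE = {se:.2f}, z = {z:.2f}, p = {p:.4f}")
print("reject null" if p <= 0.05 else "fail to reject null")
```

Either route leads to the same decision: the z score exceeds the critical value of 1.96, and the p-value falls below the .05 alpha level.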
Example of a one-tailed test: Suppose we took the same example, but hypothesized that the program would cause a significant increase in achievement scores. This would be a directional hypothesis. In addition, let's change the level of significance to .01.
Recall: n = 100, X̄ = 63, μ = 60, σ = 12
• Step 1: Develop hypotheses
  • State alternative: Special science program will significantly increase science achievement scores among program participants.
  • Determine if it is a one-tailed or two-tailed test.
    • It is a directional hypothesis ------> one-tailed
      H1: μsprog > 60
      H0: μsprog ≤ 60
• Step 2: Establish significance criteria
  • Computer → α = .01
  • Hand calculations → Identify z scores used for the alpha level and the appropriate test.
    • one-tailed test at .01 corresponds to zcritical = +2.33; since we are looking for an increase, we are focusing on the positive end of the distribution
• Step 3: Collect and analyze sample data
  • Computer → enter and analyze data
  • Hand calculations →
    • Calculate standard error
      $\sigma_{\bar{X}} = \dfrac{\sigma}{\sqrt{n}} = \dfrac{12}{\sqrt{100}} = \dfrac{12}{10} = 1.2$
Page 33
    • Draw distribution of sample means and shade in critical region (lower 99% unshaded; critical region beyond +2.33z)

      z score                        -3z    -2z    -1z     0z    +1z    +2z    +3z
      pop of individuals              24     36     48     60     72     84     96
      pop of sample means (n=100)   56.4   57.6   58.8     60   61.2   62.4   63.6
• Step 4: Compare sample data to null
  • Computer →
    • Identify test statistic and level of significance (p-value) in output
      • z = 2.49, p = .0064 (one-tailed)
    • Compare level of significance with alpha level
      • p-value of .0064 is less than .01 → it is significant → reject null
  • Hand calculations →
    • Calculate test statistic
    • Convert sample mean to z score to determine if it falls into the critical region.
      $z = \dfrac{\bar{X} - \mu}{\sigma_{\bar{X}}} = \dfrac{63 - 60}{1.2} = \dfrac{3}{1.2} = 2.5$
    • It exceeds +2.33z, so it is significant; reject the null.
• Step 5: Draw conclusion
  • Null is rejected, so the alternative hypothesis is restated as the conclusion.
  • Participation in the science program did significantly increase achievement scores among program participants.
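For the directional version of the example, only the upper tail of the distribution counts toward the p-value. The sketch below shows that calculation; the function name `upper_tailed_z_test` is an assumption for illustration, and it covers only the "increase" direction used in this example.

```python
from math import sqrt
from statistics import NormalDist

def upper_tailed_z_test(sample_mean, mu, sigma, n):
    """Return (z, one-tailed p) for H1: the population mean is GREATER than mu."""
    z = (sample_mean - mu) / (sigma / sqrt(n))
    # One-tailed p: area above z in the upper tail only.
    p = 1 - NormalDist().cdf(z)
    return z, p

z, p = upper_tailed_z_test(sample_mean=63, mu=60, sigma=12, n=100)
print(f"z = {z:.2f}, one-tailed p = {p:.4f}")
print("reject null" if p <= 0.01 else "fail to reject null")
```

For the same z score, the one-tailed p-value is half of the two-tailed p-value, which is why a directional hypothesis can reach significance at a stricter alpha level.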
Page 34
Assumptions for Hypothesis Testing with z Scores
• random sampling and independent observations
• population standard deviation will remain the same after the treatment; it is like adding a constant: the mean changes but σ does not
• normal sampling distribution
Reporting of Results of the Statistical Test
• p-value is reported as:
  • reject the null: p<.05
  • fail to reject the null: p>.05
• z test results statement includes the following parts:
  • sample mean; (M=63)
  • z calculated with the degrees of freedom in parentheses; (z(99) = 2.5)
    • to calculate degrees of freedom (df): df = n - 1
    • in our example, n=100, so df = n - 1 = 100 - 1 = 99
  • alpha level; (p<.05)
  • two-tailed or one-tailed
  • population mean and SD (μ=60, σ=12)
• Example from one-tailed test: Participation (M=63) in the science program did significantly increase achievement scores; z(99)=2.5, p<.01, one-tailed; when compared to the population (μ=60, σ=12).
Video #6: In-Class Practice Problems
Page 35
Complete the process of hypothesis testing for each of the scenarios.
1. A high school counselor created a preparation course for the SAT-verbal (μ=500, σ=100). A random sample of n = 16 students completes the course and then takes the SAT. The sample had a mean score of X̄ = 554. Does the course have a significant effect on SAT scores? Test at the .01 level.
Z-test results:
μ - mean of Variable (Std. Dev. = 100)
H0 : μ = 500
HA : μ not equal 500

Variable   n    Sample Mean   Std. Err.   Z-Stat   P-value
var1       16   554           25          2.16     0.0308

a. Alternative hypothesis in sentence form.
b. Circle:   one-tailed   or   two-tailed
c. Write the alternative and null hypotheses using correct notation.
   H1:
   H0:
d. zcalculated =
e. Level of significance (p) =
f. Circle:   reject null   or   fail to reject null
g. Write your conclusion in sentence form.
Page 36
2. A researcher believes that children who grow up as an only child develop vocabulary skills at a faster
rate than children in large families. To test this, a sample of n = 25 four-year-old only children are
tested on a standardized vocabulary test (μ=60, σ=10). The sample obtains a mean of X = 63.8. Test
at the .05 level.
Z-test results:
μ - mean of Variable (Std. Dev. = 10)
H0 : μ = 10
HA : μ > 10

Variable   n    Sample Mean   Std. Err.   Z-Stat   P-value
var1       25   63.8          2           26.9     <0.0001

a. Alternative hypothesis in sentence form.
b. Circle:   one-tailed   or   two-tailed
c. Write the alternative and null hypotheses using correct notation.
   H1:
   H0:
d. zcalculated =
e. Level of significance (p) =
f. Circle:   reject null   or   fail to reject null
g. Write your conclusion in sentence form.

Note: There was an error when conducting this test. The population mean is NOT 10 but rather 60. The result is still significant, but the z statistic would have been (63.8 - 60) / 2 = 1.90 with p = .03.
Page 37
3. A psychologist investigates IQ among autistic children to determine if their IQ is
significantly different from the norm. Using a standardized IQ test (μ=100, σ=10), he tests
10 autistic children, all age 12. The following output was generated using StatCrunch. Test
at α = .05. Sample data are: 105, 110, 130, 150, 185, 100, 125, 95, 85, 120
Z-test results:
μ - mean of Variable (Std. Dev. = 10)
H0 : μ = 100
HA : μ not equal 100

Variable   n    Sample Mean   Std. Err.   Z-Stat      P-value
var1       10   120.5         3.1622777   6.4826694   <0.0001

a. Alternative hypothesis in sentence form.
b. Circle:   one-tailed   or   two-tailed
c. Write the alternative and null hypotheses using correct notation.
   H1:
   H0:
d. zcalculated =
e. Level of significance (p) =
f. Circle:   reject null   or   fail to reject null
g. Write your conclusion in sentence form.
Page 38
Video #7: The t Statistic
To use the z score as a test statistic, we must know the population standard deviation in order to calculate the standard error of sample means. Unfortunately, most of the time we do not know σ, so what do we do?
The t statistic, commonly known as a t test, allows us to compare the sample to the null by using the sample standard deviation to estimate the standard error of sample means.
estimated standard error: $s_{\bar{X}} = \dfrac{s}{\sqrt{n}}$
The t statistic uses a formula very similar to z but instead utilizes the estimated standard error.
$z = \dfrac{\bar{X} - \mu}{\sigma_{\bar{X}}} \qquad t = \dfrac{\bar{X} - \mu}{s_{\bar{X}}}$
Tip on when to use which:
• if you know σ, then use z
• if you don’t know σ, use t
Since we are comparing a single sample mean to a population mean, this t test is called Single
Sample t Test or One Sample t Test.
The t Distribution
Since the t statistic utilizes the estimated standard error ($s_{\bar{X}}$), the t distribution only approximates the normal distribution and is based on degrees of freedom (df = n - 1), not the total sample size.
• as df and sample size increase, the more closely s represents σ, and the better the t distribution approximates the normal (z) distribution
• since the t distribution has more variability, it is more spread out and flatter
• we use the t statistic in a very similar way as we used z, in that we use a t distribution table to find the probability of a t statistic
• note: since the t statistic is dependent on degrees of freedom, the critical t statistics corresponding to levels of significance (α) vary with the degrees of freedom, unlike the critical z scores (where a two-tailed test at .05 always corresponds to zcritical = ± 1.96)
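The dependence on degrees of freedom can be seen numerically. The sketch below is an illustration under stated assumptions: the function names `t_pdf` and `two_tailed_p` are made up, and the tail area is approximated by numerically integrating the t density with Simpson's rule rather than by looking it up in a t table. It reports the two-tailed probability of a statistic of 1.96 for several df values.

```python
from math import exp, lgamma, pi, sqrt

def t_pdf(x: float, df: int) -> float:
    """Density of Student's t distribution with df degrees of freedom."""
    c = exp(lgamma((df + 1) / 2) - lgamma(df / 2)) / sqrt(df * pi)
    return c * (1 + x * x / df) ** (-(df + 1) / 2)

def two_tailed_p(t: float, df: int, steps: int = 2000) -> float:
    """Two-tailed p-value via Simpson's rule on [0, |t|] (steps must be even)."""
    t = abs(t)
    h = t / steps
    total = t_pdf(0.0, df) + t_pdf(t, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    area_0_to_t = total * h / 3
    # Half of the distribution lies above 0; subtract the area from 0 to |t|.
    return 2 * (0.5 - area_0_to_t)

for df in (5, 11, 30, 1000):
    print(f"df = {df:4d}: two-tailed p for t = 1.96 is {two_tailed_p(1.96, df):.4f}")
```

With small df, a value of 1.96 leaves more than 5% in the tails, so a larger critical value is needed; as df grows, the probability approaches the .05 that 1.96 marks off under the normal curve.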
Summary Table of Hypotheses Notation (applies values from following example)
              Alternative    Null
One-tailed    H1: μ > 27     H0: μ ≤ 27
Two-tailed    H1: μ ≠ 27     H0: μ = 27
Page 39
Reporting of Results of the t Test
t Test results statement includes the following parts:
• results with sample mean and standard deviation; (M = 24.58 , SD = 3.48 )
• t calculated with the degrees of freedom in parentheses; (t(11) = -2.40)
• alpha level or p-value; (p< .05)
• two-tailed or one-tailed
Example:
Subjects (M = 24.58 , SD = 3.48) spent significantly less time talking to parents than the
therapist’s claim; t(11) = -2.40, p< .05, two-tailed.
Assumptions of the t test: independent observations, normal population
Putting it all together → Example of a two-tailed t test
A family therapist states that parents talk to their teens an average of 27 minutes per week. Surprised by this claim, a counselor collects data on 12 teens and finds the following: X̄ = 24.58, s = 3.48. Does the amount of parent talk for the sample significantly differ from the therapist's claim? Test at the .05 level.
• Step 1: Develop hypotheses
  • State alternative: Amount of parent talk for sample will significantly differ from the norm.
  • Determine if it is a one-tailed or two-tailed test.
    • It is a non-directional hypothesis ------> two-tailed
    • H1: μ ≠ 27 (samples will be different)
    • H0: μ = 27 (samples will NOT be different)
• Step 2: Establish significance criteria
  • Computer → α = .05
  • Hand calculations → Identify tcritical used for the alpha level, the appropriate test, & df
    • two-tailed test at .05 (df = 11) corresponds to tcritical = ± 2.201
• Step 3: Collect and analyze sample data
  • Computer → enter and analyze data
  • Hand calculations →
    • Calculate estimated standard error
      $s_{\bar{X}} = \dfrac{s}{\sqrt{n}} = \dfrac{3.48}{\sqrt{12}} = \dfrac{3.48}{3.46} = 1.01$
Page 40
• Step 4: Compare sample data to null ------> calculate test statistic
  • Computer → Identify test statistic and p-value in output
    o t(11) = -2.396, p = .019
    o p-value (.019) is less than alpha (.05) → so it is significant → reject null
  • Hand calculations →
    • Convert the sample mean into a t statistic to determine if it falls into the critical region.
      $t_{\text{calculated}} = \dfrac{\bar{X} - \mu}{s_{\bar{X}}} = \dfrac{24.58 - 27}{1.01} = \dfrac{-2.42}{1.01} = -2.396$
    • It falls beyond -2.201 in the lower tail, so it is significant; reject null
• Step 5: Draw conclusion
  • Amount of parent talk for sample (M = 24.58, SD = 3.48) significantly differs from the norm; t(11) = -2.396, p<.05, two-tailed.
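The single sample t test above can be reproduced from the summary statistics. This is a minimal sketch: the function name `one_sample_t` is an assumption for illustration, and the critical value 2.201 is taken from the example rather than computed. Because the code does not round the standard error to 1.01, it gives a t of about -2.41 instead of -2.396; the decision is the same.

```python
from math import sqrt

def one_sample_t(sample_mean, s, n, mu):
    """Return (estimated standard error, t, df) for a single sample t test."""
    se = s / sqrt(n)               # estimated standard error of the mean
    t = (sample_mean - mu) / se
    return se, t, n - 1

T_CRITICAL = 2.201  # two-tailed, alpha = .05, df = 11 (value from the example)

se, t, df = one_sample_t(sample_mean=24.58, s=3.48, n=12, mu=27)
print(f"SE = {se:.3f}, t({df}) = {t:.3f}")
# Two-tailed test: compare the absolute value of t with the critical value.
print("reject null" if abs(t) >= T_CRITICAL else "fail to reject null")
```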
Page 41
Video #7: In-Class Practice Problems
1. On a standardized spatial skills task, normative data reveals that people typically get μ = 15
correct solutions. A psychologist tests n = 7 individuals who have brain injuries in the right
cerebral hemisphere. For the following data, determine whether or not right-hemisphere
damage results in reduced performance on the spatial skills task. Test at the .05 level.
Data: 12, 16, 9, 8, 10, 17, 10
T-test results:
μ - mean of Variable
H0 : μ = 15
HA : μ < 15

Variable   Sample Mean   Std. Err.   DF   T-Stat       P-value
var1       11.714286     1.3222327   6    -2.4849744   0.0237

a. Independent Variable =                    Scale (circle):   Categorical   Quantitative
b. Dependent Variable =                      Scale (circle):   Categorical   Quantitative
c. Circle:   One-tailed   Two-tailed
d. Alternative hypothesis in sentence form.
e. Write the alternative and null hypotheses using correct notation.
   H1:
   H0:
f. tcalculated =
g. Level of significance (p) =
h. Circle:   reject null   or   fail to reject null
i. Write your conclusion in sentence form.
Page 42
2. A researcher would like to examine the effects of humidity on eating behavior. It is known that laboratory rats normally eat an average of μ = 21 grams of food each day. The researcher selects a random sample of n = 25 rats and places them in a controlled-atmosphere room where the relative humidity is maintained at 90%. On the basis of this sample, can the researcher conclude that humidity affects eating behavior? Test at the .05 level.
T-test results:
μ - mean of Variable
H0 : μ = 21
HA : μ not equal 21

Variable   Sample Mean   Std. Err.    DF   T-Stat       P-value
var1       16.12         0.79229623   24   -6.1593122   <0.0001

a. Independent Variable =                    Scale (circle):   Categorical   Quantitative
b. Dependent Variable =                      Scale (circle):   Categorical   Quantitative
c. Circle:   One-tailed   Two-tailed
d. Alternative hypothesis in sentence form.
e. Write the alternative and null hypotheses using correct notation.
   H1:
   H0:
f. tcalculated =
g. Level of significance (p) =
h. Circle:   reject null   or   fail to reject null
i. Write your conclusion in sentence form.
Page 43
3. Does the average age of students enrolled in EDFI 641 differ significantly from the
average age of BGSU grad students (24 years)? Test at the .01 level.
T-test results:
μ - mean of Variable
H0 : μ = 24
HA : μ not equal 24

Variable   Sample Mean   Std. Err.   DF   T-Stat      P-value
var1       27.125        1.4314183   15   2.1831493   0.0453

a. Independent Variable =                    Scale (circle):   Categorical   Quantitative
b. Dependent Variable =                      Scale (circle):   Categorical   Quantitative
c. Circle:   One-tailed   Two-tailed
d. Alternative hypothesis in sentence form.
e. Write the alternative and null hypotheses using correct notation.
   H1:
   H0:
f. tcalculated =
g. Level of significance (p) =
h. Circle:   reject null   or   fail to reject null
i. Write your conclusion in sentence form.
Page 44
Video #8: t Test of Independent Samples
So far, we have only used one sample to draw inferences about one population. What if we want to
compare two different groups, such as male vs female or Treatment A students vs Treatment B
students?
t Test of Independent Samples draws conclusions about two populations by comparing two samples; since we are looking at differences between the two samples and the two populations, the t statistic reflects these multiple comparisons:
$t_{\text{single sample}} = \dfrac{\bar{X} - \mu}{s_{\bar{X}}}$
$t_{\text{ind samples}} = \dfrac{(\bar{X}_1 - \bar{X}_2) - (\mu_1 - \mu_2)}{s_{\bar{X}_1 - \bar{X}_2}}$, where $s_{\bar{X}_1 - \bar{X}_2} = \sqrt{\dfrac{s_1^2}{n_1} + \dfrac{s_2^2}{n_2}}$
Recall that for the single sample t test, we calculated the estimated standard error. Since we are now comparing two samples to two populations, we calculate the standard error of sample mean differences.
Standard error of sample mean differences: total amount of error involved in using two sample means to approximate two population means (averages the error of the two sources).
• However, the preceding formula for $s_{\bar{X}_1 - \bar{X}_2}$ is only appropriate when the two samples are the same size. To correct for the bias in sample variances, we need to combine the two sample variances into a single value called pooled variance.
Pooled Variance: averages the two sample variances in a way that allows the bigger sample to carry more weight.
$s_p^2 = \dfrac{SS_1 + SS_2}{df_1 + df_2}$
• Using the pooled variance, we can now calculate an unbiased measure of the standard error of sample mean differences:
$s_{\bar{X}_1 - \bar{X}_2} = \sqrt{\dfrac{s_p^2}{n_1} + \dfrac{s_p^2}{n_2}}$
Hypothesis Testing with t Test of Independent Samples
t Test of Independent Samples is used to test a hypothesis about the mean difference between two populations
• null hypothesis reflects no difference
• alternative hypothesis reflects a difference

              Alternative                          Null
One-tailed    H1: μ1 > μ2  OR  H1: μ1 − μ2 > 0     H0: μ1 ≤ μ2  OR  H0: μ1 − μ2 ≤ 0
Two-tailed    H1: μ1 ≠ μ2  OR  H1: μ1 − μ2 ≠ 0     H0: μ1 = μ2  OR  H0: μ1 − μ2 = 0

• rejection of null ------> data indicate a significant difference between the two populations
• failure to reject null ------> data indicate NO significant difference between the two populations
Assumptions about t test of independent samples: independent observations, each population must be normal and have equal variances (homogeneity of variance).
Page 45
Putting it all together → Example of a one-tailed t test
A psychologist would like to examine the effects of fatigue on mental alertness. An attention test is prepared that requires subjects to sit in front of a blank TV screen and press a response button each time a dot appears on the screen. A total of 110 dots are presented during a 90 minute period, and the psychologist records the number of errors for each subject. Two groups of subjects are selected. The first group (n = 5) is tested after they have been awake for 24 hours (X̄ = 34, SS = 63). The second group (n = 10) is tested in the morning after a full night's sleep (X̄ = 24, SS = 100). Can the psychologist conclude that fatigue significantly increases errors on an attention task? Test at the .05 level.
• Step 1: Develop hypotheses
  • State alternative: Fatigue will significantly increase the number of errors on an attention task.
  • It is a directional hypothesis ------> one-tailed
    H1: μfatigue > μrested
    H0: μfatigue ≤ μrested
• Step 2: Establish significance criteria
  • Computer → α = .05
  • Hand calculations → Identify tcritical used for the alpha level, the appropriate test, and df
    • one-tailed test at .05 (df = 13) corresponds to tcritical = +1.771
• Step 3: Collect and analyze sample data
  • Computer → enter and analyze data
  • Hand calculations →
    • Calculate pooled variance
      $s_p^2 = \dfrac{SS_1 + SS_2}{df_1 + df_2} = \dfrac{63 + 100}{4 + 9} = \dfrac{163}{13} = 12.54$
    • Calculate standard error of sample mean differences
      $s_{\bar{X}_1 - \bar{X}_2} = \sqrt{\dfrac{s_p^2}{n_1} + \dfrac{s_p^2}{n_2}} = \sqrt{\dfrac{12.54}{5} + \dfrac{12.54}{10}} = \sqrt{2.51 + 1.25} = 1.94$
• Step 4: Compare sample data to null ------> calculate test statistic
  • Computer → review output

    Two Sample T-test results (with pooled variances):
    μ1 - mean of var2 where var1=1
    μ2 - mean of var2 where var1=2
    H0 : μ1 - μ2 = 0
    HA : μ1 - μ2 > 0

    Difference   Sample Mean   Std. Err.   DF   T-Stat      P-value
    μ1 - μ2      10            1.9360149   13   5.1652493   <0.0001

    • Identify test statistic and p-value in output → t(13) = 5.17, p<.0001
    • Compare p-value to alpha level → p is less than .05 → reject null
  • Hand calculations → Calculate t
    $t_{\text{ind samples}} = \dfrac{(\bar{X}_1 - \bar{X}_2) - (\mu_1 - \mu_2)}{s_{\bar{X}_1 - \bar{X}_2}} = \dfrac{(34 - 24) - 0}{1.94} = \dfrac{10}{1.94} = 5.15$
    • tcalculated > tcritical, reject null
Page 46
• Step 5: Draw conclusion
  • Null is rejected, so the alternative hypothesis is restated as the conclusion.
  • Fatigue significantly increased the number of errors on the attention task; t(13) = 5.17, p<.0001, one-tailed.
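The hand calculations for the fatigue example can be bundled into one function that works directly from each group's mean, SS, and n. This is a sketch for illustration only; the function name `independent_t` is an assumption. Since the code carries full precision instead of rounding the pooled variance to 12.54, its t (about 5.16) differs slightly from the hand value of 5.15.

```python
from math import sqrt

def independent_t(mean1, ss1, n1, mean2, ss2, n2):
    """Return (pooled variance, standard error, t, df) for two independent samples."""
    df = (n1 - 1) + (n2 - 1)
    pooled_variance = (ss1 + ss2) / df
    # Standard error of the difference between the two sample means.
    se = sqrt(pooled_variance / n1 + pooled_variance / n2)
    # Under the null hypothesis, mu1 - mu2 = 0.
    t = (mean1 - mean2) / se
    return pooled_variance, se, t, df

pooled, se, t, df = independent_t(mean1=34, ss1=63, n1=5, mean2=24, ss2=100, n2=10)
print(f"pooled variance = {pooled:.2f}, SE = {se:.2f}, t({df}) = {t:.2f}")
print("reject null" if t >= 1.771 else "fail to reject null")  # one-tailed, alpha = .05
```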
Some additional thoughts when comparing groups:
• Create frequency polygons for each group to decide which measure of central tendency is appropriate and whether they follow a normal distribution
• If possible, use information about known groups, such as norms from standardized tests, to compare sample data
• Calculate effect size as a measure of the magnitude of a difference between the two groups. This has become very important in recent years.
  • A t test will not calculate effect size. You must calculate it by hand.
    o A common index of effect size is r², the percentage of variance accounted for:
      $r^2 = \dfrac{t^2}{t^2 + df}$
  • Typically an effect size of 0.50 (50%) or larger signifies an important difference
• Use inferential statistics very cautiously, especially when dealing with non-random samples; be very careful in generalizing your results to the population
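As a quick illustration of the effect size formula, the sketch below applies it to the fatigue example reported earlier (t(13) = 5.17). The function name `effect_size_r2` is an assumption made for this example.

```python
def effect_size_r2(t: float, df: int) -> float:
    """Return r squared, the proportion of variance accounted for."""
    return t ** 2 / (t ** 2 + df)

r2 = effect_size_r2(t=5.17, df=13)
print(f"r2 = {r2:.2f}")
```

For the fatigue example this gives roughly two-thirds of the variance accounted for, which is above the 0.50 benchmark mentioned above.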
Page 47
In-Class Practice Problems
1. Extensive data indicate that first-born children develop different characteristics than later-born
children. For example, first-borns tend to be more responsible, hard working, higher achieving, and
more self-disciplined than their later-born siblings. The following data represent scores on a test
measuring self-esteem and pride. Samples of n=10 first-born college freshmen and n=20 later-born
freshmen were each given the self-esteem test. Do these data indicate a significant difference?
Test at the .05 level.
Summary statistics for var2 grouped by var1

var1   n    Mean   Variance    Std. Dev.   Std. Err.   Median   Range   Min   Max   Q1   Q3
1      10   43.1   17.211111   4.1486278   1.3119112   43.5     14      36    50    40   46
2      20   36.8   25.010527   5.0010524   1.1182693   36.5     18      30    48    33   40

Two Sample T-test results (with pooled variances):
μ1 - mean of var2 where var1=1
μ2 - mean of var2 where var1=2
H0 : μ1 - μ2 = 0
HA : μ1 - μ2 not equal 0

Difference   Sample Mean   Std. Err.   DF   T-Stat      P-value
μ1 − μ2      6.3           1.8372631   28   3.4290135   0.0019

a. Independent Variable =                    Scale (circle):   Categorical   Quantitative
b. Dependent Variable =                      Scale (circle):   Categorical   Quantitative
c. Circle:   One-tailed   Two-tailed
d. Alternative hypothesis in sentence form.
e. Write the alternative and null hypotheses using correct notation.
   H1:
   H0:
f. tcalculated =
g. Level of significance (p) =
h. Circle:   reject null   or   fail to reject null
i. Write your conclusion in sentence form.
j. effect size r2 =
Page 48
2. Does level of anxiety (measured on a scale from 1 to 10) when enrolling in a statistics class differ by gender? Test at the .05 level.
Summary statistics for var2 grouped by var1

var1   n    Mean   Variance   Std. Dev.   Std. Err.   Median   Range   Min   Max   Q1   Q3
1      10   7.1    8.1        2.8460498   0.9         8        7       3     10    4    10
2      10   5.6    6.711111   2.5905812   0.8192137   5        7       3     10    4    7

Two Sample T-test results (with pooled variances):
μ1 - mean of var2 where var1=1
μ2 - mean of var2 where var1=2
H0 : μ1 - μ2 = 0
HA : μ1 - μ2 not equal 0

Difference   Sample Mean   Std. Err.   DF   T-Stat      P-value
μ1 - μ2      1.5           1.2170091   18   1.2325299   0.2336

a. Independent Variable =                    Scale (circle):   Categorical   Quantitative
b. Dependent Variable =                      Scale (circle):   Categorical   Quantitative
c. Circle:   One-tailed   Two-tailed
d. Alternative hypothesis in sentence form.
e. Write the alternative and null hypotheses using correct notation.
   H1:
   H0:
f. tcalculated =
g. Level of significance (p) =
h. Circle:   reject null   or   fail to reject null
i. Write your conclusion in sentence form.
j. effect size r2 =
Page 49
Additional Practice: Interpreting Research Articles
t-test of Independent Sample
Read the following excerpt to complete the questions on the next page:
Researchers studied women enlisted in the Navy and examined the impact of sexual harassment
on their satisfaction with the military. Among the participants, 436 were sexually harassed and 582
were not. Participants completed a 7-item questionnaire that utilized a 5-point scale in which higher scores
indicate more positive perceptions. Item 3 scores have been reversed to align with the positive nature of
the other items.
Table 1. Mean responses and t-test results

Question                                                                   Mean Harassed   Mean Not Harassed   t
1. I would recommend the Navy to others.                                   3.31            3.60                3.76*
2. I am satisfied with my rating.                                          3.24            3.56                4.02*
3. I plan to leave the Navy because I am dissatisfied.                     3.17            3.67                5.89*
4. My experiences have encouraged me to stay in the Navy.                  2.24            2.58                4.56*
5. This command provides the information people need to make
   decisions about staying in the Navy.                                    2.71            3.00                3.80*
6. In general, I am satisfied with the Navy.                               3.29            3.68                5.41*
7. I intend to stay in the Navy for at least 20 years.                     2.66            3.22                5.63*

* indicates p<.001
Source: Newell, C.E., Rosenfeld, P., & Culbertson, A. L. (1995). Sexual harassment experiences and equal
opportunity perceptions of Navy women. Sex Roles, 32, 159-168.
1. Which group of Navy women is more likely to recommend the Navy to others? In other words, which
group has the higher mean for item one?
2. Is the mean difference for item 1 statistically significant?
3. Should we reject the null hypothesis for item 1? Explain.
4. How many items generated statistically significant mean differences?
5. In general, what can we conclude about sexual harassment and navy satisfaction?
Answers: 1) Those who have NOT been sexually harassed have the higher mean and are more likely to recommend
the Navy to others; 2) Yes, it is significant at the p<.001 level. 3) Yes, the t result is significant at p<.001.; 4) all
items were significant; 5) Navy women who have NOT been sexually harassed are more satisfied with the Navy
than those who have been sexually harassed.
Video #9: t Test of Related Samples
Page 50
Many times research evaluates the effect of a treatment by using a pretreatment and posttreatment design with a single sample; this is called a repeated measures study.
• since the test uses the same sample, there is no risk that one group is different from another even before the treatment begins.
• researchers try to build upon this concept when studying two samples by matching subjects from the two groups; this helps to eliminate pretreatment differences
• t test of related samples compares the differences between the pre and post treatment scores of the sample to pre-post differences in the population.
• difference score: $D = X_2 - X_1$
• mean of differences: $\bar{D} = \dfrac{\sum D}{n}$
Computing the t of related samples
• Recall $t_{\text{single sample}} = \dfrac{\bar{X} - \mu}{s_{\bar{X}}}$
• For t of related samples, the sample data are the difference scores (D) and the population data we are interested in is NOT the population mean but the population mean difference (μD); therefore,
  $t_{\text{related samples}} = \dfrac{\bar{D} - \mu_D}{s_{\bar{D}}}$, where $s_{\bar{D}} = \dfrac{s}{\sqrt{n}}$
• We are not comparing means of the pre and post; rather, the pre and post scores for each individual are compared!
Developing the hypotheses:
              Alternative    Null
One-tailed    H1: μD > 0     H0: μD ≤ 0
Two-tailed    H1: μD ≠ 0     H0: μD = 0
Assumptions of the related samples t test
• independent observations, normal distribution of pop of differences
Page 51
Putting it all together → Example of a one-tailed t test
A researcher is interested in studying the effects of endorphins (the feeling-good chemical that is released in the brain at the end of aerobic exercise) on pain tolerance. A sample of 16 subjects is obtained; each person's tolerance for pain is tested before and after a 50 minute session of aerobic exercise. On the average, the pain tolerance for the sample was D̄ = 10.5 higher after exercise than it was before. The SS for the sample difference scores was SS = 960. Do these data indicate a significant increase in pain tolerance following exercise? Test at the .01 level.
• Step 1: Develop hypotheses
  • State alternative: Exercise will significantly increase pain tolerance
  • It is a directional hypothesis ------> one-tailed
    H1: μD > 0
    H0: μD ≤ 0
• Step 2: Establish significance criteria
  • Computer → α = .01
  • Hand calculations → Identify tcritical used for the alpha level, the appropriate test, and df
    • one-tailed test at .01 (df = 15) corresponds to tcritical = +2.602
• Step 3: Collect and analyze sample data
  • Computer → enter and analyze data
  • Hand calculations →
    • Calculate sample mean of D: D̄ = 10.5
    • Calculate standard deviation of D scores
      $s = \sqrt{\dfrac{SS}{n - 1}} = \sqrt{\dfrac{960}{15}} = \sqrt{64} = 8$
    • Calculate estimated standard error of D̄
      $s_{\bar{D}} = \dfrac{s}{\sqrt{n}} = \dfrac{8}{\sqrt{16}} = \dfrac{8}{4} = 2$
• Step 4: Compare sample data to null ------> calculate test statistic
  • Computer →
    • Identify test statistic and p-value; t(15) = 5.25, p<.001
    • Compare p-value with alpha level
      • p<.001 is less than .01 → reject null
  • Hand calculations → Calculate t
    $t_{\text{related samples}} = \dfrac{\bar{D} - \mu_D}{s_{\bar{D}}} = \dfrac{10.5 - 0}{2} = 5.25$
    • It exceeds tcritical → reject null
• Step 5: Draw conclusion
  • Aerobic exercise significantly increased pain tolerance; t(15) = 5.25, p<.001, one-tailed.
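The related samples calculation can be sketched the same way, starting from the mean difference, the SS of the difference scores, and n. The function name `related_samples_t` is an assumption for illustration; the critical value 2.602 is the one quoted in the example.

```python
from math import sqrt

def related_samples_t(mean_d, ss_d, n, mu_d=0.0):
    """Return (s, standard error of the mean difference, t, df) for related samples."""
    s = sqrt(ss_d / (n - 1))   # standard deviation of the difference scores
    se = s / sqrt(n)           # estimated standard error of the mean difference
    t = (mean_d - mu_d) / se
    return s, se, t, n - 1

s, se, t, df = related_samples_t(mean_d=10.5, ss_d=960, n=16)
print(f"s = {s:.2f}, SE = {se:.2f}, t({df}) = {t:.2f}")
print("reject null" if t >= 2.602 else "fail to reject null")  # one-tailed, alpha = .01
```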
Page 52
In-Class Practice Problems
1. An investigator for NASA examines the effect of cabin temperature on reaction time. A random sample of 10 astronauts and pilots is selected. Each person's reaction time to an emergency light is measured in a simulator where the cabin temperature is maintained at 70 degrees F and again the next day at 95 degrees. Using the results of this experiment, can the investigator conclude that temperature has a significant effect on reaction time? Test at the .01 level.
Summary statistics

Column   n    Mean   Variance    Std. Dev.   Std. Err.   Median   Range   Min   Max   Q1    Q3
var1     10   203    381.55554   19.533447   6.177018    205.5    55      176   231   183   216
var2     10   223    417.1111    20.423298   6.458414    224      65      190   255   206   240

Paired T-test results:
μD - mean of the differences between var1 and var2
H0 : μD = 0
HA : μD not equal 0

Difference    Sample Diff.   Std. Err.   DF   T-Stat       P-value
var1 - var2   -20            1.67332     9    -11.952286   <0.0001

a. Independent Variable =                    Scale (circle):   Categorical   Quantitative
b. Dependent Variable =                      Scale (circle):   Categorical   Quantitative
c. Circle:   One-tailed   Two-tailed
d. Alternative hypothesis in sentence form.
e. Write the alternative and null hypotheses using correct notation.
   H1:
   H0:
f. tcalculated =
g. Level of significance (p) =
h. Circle:   reject null   or   fail to reject null
i. Write your conclusion in sentence form.
Page 53
2. Does eating oatmeal decrease cholesterol levels? A researcher implements a 30-day treatment that consists of eating a bowl of oatmeal every day for breakfast. Cholesterol is measured before (var1) and after (var2) the treatment for the 10 participants. An α = .05 was utilized.
Summary statistics

Column   n    Mean    Variance    Std. Dev.   Std. Err.   Median   Range   Min   Max   Q1    Q3
var1     10   258.2   192.4       13.870832   4.3863425   257.5    40      240   280   245   270
var2     10   222     269.33334   16.411379   5.1897335   221      56      190   246   210   230

Paired T-test results:
μD - mean of differences between var1 and var2
H0 : μD = 0
HA : μD > 0

Difference    Sample Diff.   Std. Err.   DF   T-Stat     P-value
var1 - var2   36.2           4.319979    9    8.379669   <0.0001

a. Independent Variable =                    Scale (circle):   Categorical   Quantitative
b. Dependent Variable =                      Scale (circle):   Categorical   Quantitative
c. Circle:   One-tailed   Two-tailed
d. Alternative hypothesis in sentence form.
e. Write the alternative and null hypotheses using correct notation.
   H1:
   H0:
f. tcalculated =
g. Level of significance (p) =
h. Circle:   reject null   or   fail to reject null
i. Write your conclusion in sentence form.
Page 54
Additional Practice: Interpreting Research Articles
t-test of Related Samples
Read the following excerpt to complete the questions on the next page:
Seventy-four drug users participated in a Behavioral Counseling Program to reduce drug use.
Among the participants, 75% were male, 75% were adults, 12% were minority, and 25% were mandated
to obtain counseling by a public agency. With respect to drug use, about 50% used cocaine and 75% used
marijuana. The Behavioral Counseling Program consisted of three parts: 1) stimulus control, including
competing response training; 2) urge control procedure for interrupting incipient drug use urges,
thoughts, and actions; and 3) behavior contracting, especially between youth and parents. Drug use was
measured at the beginning of treatment, the end of treatment, and one month after treatment. Drug
use decreased substantially from pretreatment to the end of treatment ( t=4.28, p<.001) with slight,
nonsignificant decrease from end of treatment to the follow-up month ( t=.92,p=.72). The decrease from
pretreatment to follow-up remained statistically significant ( t=4.42, p<.001).
Source: Azrin, N. H., Acierno, R., Kogan, E. S., Donohue, B., Besalel, V. A., & McMahon, P.T. (1996). Follow-up results
of supportive versus behavioral therapy for illicit drug use. Behavior Research and Therapy, 34, 41-46.
1. As is customary in journal articles, the researchers did not state the null hypothesis. Write the
appropriate null hypothesis for the first t-test result reported in the excerpt.
2. Should the null hypothesis written for item 1 be rejected? Explain.
3. Should the null hypothesis be rejected for the second t test reported in the excerpt? Explain.
4. The last difference in the excerpt was statistically significant at the .001 level. Was it also
significant at the .05 level?
Answers: 1) The Behavioral Counseling Program will NOT significantly reduce drug use among
participants. 2) Yes, since the p-value is less than .05. 3) No, the p-value is greater than .05. 4) Yes, if it is
significant at p<.001 then it is also significant at p<.05.
Coke vs. Pepsi Experiment: t tests
Page 55
We are going to conduct an experiment using the Coke vs. Pepsi Taste Test that investigates two
research questions:
1) Are diet drinkers (when compared to regular drinkers) more accurate in tasting the
difference between Coke and Pepsi?
• This question will utilize a t-test of independent samples, which you can complete for 5
points of extra credit (Extra Credit #1).
2) When tasting the difference between Coke and Pepsi, is one’s prediction of accuracy
significantly different from one’s actual ability/accuracy?
• This question will utilize a t-test of related samples, which you will complete for 5
points of extra credit (Extra Credit #2).
In order to complete this experiment, you need at least one other person (who has the same pop
preference as you) to participate. It would be great if you can find 2-4 more individuals.
Directions:
1. Identify your pop preference (Diet or Regular).
• If you prefer diet pop, purchase one can/bottle of Diet Coke and one of Diet Pepsi.
• If you prefer regular, purchase one can/bottle of Coke and one of Pepsi.
2. In addition to the pop, you will need the following supplies to complete this experiment.
• 5 small paper cups for each participant
• Pen or pencil
• Napkins in case you spill
• Pretzels or chips for “cleansing one’s palate”
3. Once you have your supplies and participants together, record each participant’s name in the first
column of the data grid below and one’s preference (diet=1, regular=2) in the second column.
Data Grid
Name
Preference
Prediction %
Actual %
4. Have each participant predict how accurate they will be in identifying the pop as Coke or Pepsi.
Since each person will be given 5 cups of pop, predict how many times out of 5 chances you will be
correct in the identification process (e.g., 3/5). Then, convert that fraction into a percent (e.g.,
3/5=60%). Record this percent in the third column of the grid.
5. Determine who will complete the taste test first. Have that person turn away while another
participant fills 5 cups with pop (make sure that some cups have Pepsi and other cups have Coke
and that you know which cups have which pop). Hint: Don’t write the name of the pop on the
bottom of the cup; it will show through as the person drinks the pop.
6. Have the taste tester proceed in identifying the pop in each cup, while another participant
records the accuracy. Don’t tell the results to the taster until all 5 cups have been tasted.
Calculate the number of correct tastes out of five. Convert that fraction into a percent and
record the percent in column 4 of the grid.
7. Once you and your fellow participants have finished the taste test, add your results to the
spreadsheet below.
8. Go to StatCrunch and enter ALL the data from the spreadsheet (including the data provided for
15 individuals). You should have a minimum of n=17 for your sample. Proceed with the t-test
directions.
Extra Credit Worksheets are in Computer Lab Packet!
Video #10: Analysis of Variance
Page 57
Analysis of Variance (ANOVA) is a hypothesis testing procedure that evaluates mean
differences between two or more treatments or groups; a t test can only compare two groups.
Single Factor Design: studies the effect that one factor (independent variable) has on the
dependent variable. Note that although there is only one factor, this factor can have more than two
categories, so we may be comparing more than two groups/treatments.
Hypothesis Testing for ANOVA
• Null hypothesis states that there is no difference among the groups or treatments
  • H0: μ1 = μ2 = μ3
• Alternative hypothesis states that at least one mean is different from the others
  • H1: At least one mean will differ
ANOVA Test Statistic
ANOVA creates a test statistic called an F-ratio that is similar to the t statistic.
• Recall that
$$t = \frac{\text{obtained difference between sample means}}{\text{difference expected by chance (error)}}, \qquad t_{\text{single}} = \frac{\bar{X} - \mu}{s_{\bar{X}}}$$
• F is similar to t, but since there can be more than two means to compare, variance is
used to represent the differences between all the means being compared.
$$F = \frac{\text{variance (differences) between sample means}}{\text{variance (differences) expected by chance (error)}}$$
• Like t, a large F value indicates a treatment effect (mean differences) that is unlikely
to be due to chance.
  • When the treatment has no effect, so that the means are the same (H0 is true),
  the F-ratio will be close to 1.00.
Distribution of F-ratios
• Like t, F has its own sampling distribution
• But the F distribution is not normal; it is positively skewed, and its exact shape depends
upon the degrees of freedom of the two variances.
  • large df → nearly all F-ratios are clustered around 1.00
  • small df → the F-ratios are more spread out
• Since the F distribution is positively skewed, we look for the difference in only one tail.
As a result we don’t need to indicate if the test is one or two tailed.
  • Recall: we expect F near 1.00 if the null is true and a large F if the null is false
  • therefore, significant F-ratios will be in the tail of the F distribution
$$F = \frac{\text{variance (differences) between group means}}{\text{variance (differences) expected by chance/error (within groups)}}$$
Variance (differences) between groups can be due to:
• treatment effect
• individual differences (subjects within the various groups are different even before
the treatment begins)
• experimental error (caused by poor equipment, lack of attention/knowledge on the
researcher’s part, unpredictable change of events)
Variance within groups can be due to:
• individual differences (subjects within the various groups are different even before
the treatment begins)
• experimental error (caused by poor equipment, lack of attention/knowledge on the
researcher’s part, unpredictable change of events)
Consequently, if we divide the variance between treatments by the variance within treatments,
individual differences and error appear in both the numerator and the denominator, so the ratio
isolates the treatment effect.
$$F = \frac{\text{variance between groups}}{\text{variance within groups}} = \frac{\text{treatment effect} + \text{individual differences} + \text{error}}{\text{individual differences} + \text{error}}$$
The last few steps of ANOVA require the following calculations:
• df between groups = k - 1, where k is the number of groups
• df within groups = N - k, where N is the total number of individuals in all groups
• MS between = variance between treatments = SSbetween / df between
• MS within = variance within treatments = SSwithin / df within
• F-ratio = MS between / MS within
Page 59
Putting it all together
Example: A number of studies on jetlag have found that jetlag seems to be worse when people are
traveling east. A researcher examines how many days it takes a person to adjust after taking a long
flight. One group flies west across time zones (NY to CA); a second group flies east (CA to NY); and a
third group takes a long flight within one time zone (San Francisco to Seattle). Perform an analysis of
variance to determine if jetlag varies by the direction of travel. Use the .05 level of significance.
Computer Results
Analysis of Variance results for var2 grouped by var1
Sample means:
Group  n  Mean  Std. Error
1      6  2.5   0.4281744
2      6  6     0.57735026
3      6  0.5   0.2236068
ANOVA table:
Source      df  SS   MS         F-Stat    P-value
Treatments  2   93   46.5       41.02941  <0.0001
Error       15  17   1.1333333
Total       17  110
Step 1: Develop hypotheses
• State alternative: Direction of travel will significantly affect jetlag.
• H0: μ1 = μ2 = μ3
  H1: At least one mean will differ
Step 2: Establish significance criteria
• Computer → α = .05
Step 3: Collect and analyze sample data
• Computer → enter data
Step 4: Compare sample data to null → calculate test statistic
• Computer →
  • Identify test statistic and p-value; F(2, 15) = 41.03, p < .0001
  • Compare p-value with alpha level
  • p < .0001 is less than .05 → reject null
Step 5: Draw conclusion
• Direction of travel significantly affected jetlag.
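The final ANOVA calculations can be retraced from the sums of squares in the table above. This is a minimal sketch, assuming only the SS values and group sizes reported for the jetlag example; the variable names are illustrative.

```python
# Sums of squares copied from the jetlag ANOVA table above.
ss_between = 93.0   # SS for treatments (between groups)
ss_within = 17.0    # SS for error (within groups)
k = 3               # number of groups (west, east, same zone)
N = 18              # total number of participants (3 groups of 6)

# Degrees of freedom, mean squares, and the F-ratio.
df_between = k - 1
df_within = N - k
ms_between = ss_between / df_between
ms_within = ss_within / df_within
f_ratio = ms_between / ms_within

print(f"F({df_between}, {df_within}) = {f_ratio:.2f}")  # prints: F(2, 15) = 41.03
```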
Page 60
Post Hoc Tests
So far, we have only been able to determine if there is a significant difference (treatment had
an effect), but we are unable to determine which group is different.
We could do a t test for each comparison, but every additional hypothesis test increases the risk
of a Type I error. This accumulated risk is called the experimentwise alpha level: the overall
probability of a Type I error over a series of separate hypothesis tests.
Fortunately, there are some tests that are very conservative and allow us to determine which
group is different after ANOVA has been conducted and a difference has been found; these
are called Post Hoc Tests.
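To see why running several t tests is risky, the sketch below estimates the experimentwise alpha level for all pairwise comparisons among three groups. It assumes the separate tests are independent, which is a simplification not stated in the packet, so the result should be read as a rough illustration rather than an exact figure. The function name is hypothetical.

```python
from math import comb


def experimentwise_alpha(alpha, n_tests):
    """Chance of at least one Type I error across n_tests tests.

    Assumes the tests are independent (a simplifying assumption).
    """
    return 1 - (1 - alpha) ** n_tests


k = 3                        # number of groups in the jetlag example
n_comparisons = comb(k, 2)   # number of pairwise t tests: 3
overall = experimentwise_alpha(0.05, n_comparisons)

print(f"{n_comparisons} comparisons at alpha = .05 give an overall risk of about {overall:.3f}")
```

With three comparisons the overall risk comes out near .14 rather than .05, which is the reason conservative post hoc tests are preferred.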
The Scheffe Test is the safest post hoc test used to compare two groups/treatments. It is
safe because it uses the value of k to calculate the df and the critical F-ratio from the original
ANOVA to determine if it is significant.
Unfortunately, StatCrunch is unable to conduct Post Hoc tests!
Reporting of ANOVA Results
Much of the time an ANOVA summary table is presented that includes SS, df, and MS for each
source of variance as well as the F-ratio; in addition, a table of means and standard deviations
(here, standard errors) for each treatment will be presented. Using the previous example, the tables
would look like the following:
            M     SE
Westbound   2.5   0.43
Eastbound   6.0   0.58
Same zone   0.5   0.22
ANOVA SUMMARY
Source               SS    df   MS
Between treatments   93    2    46.5   F = 41.03
Within treatments    17    15   1.13
Total                110   17
When space is an issue, the results should include the F-ratio with both degrees of freedom in
parentheses and the p-value. Do NOT indicate one-tailed or two-tailed!
• Travel direction does affect jetlag; F(2, 15) = 41.03, p < .05.
Assumptions of ANOVA: independent observations, samples are selected from normal
populations that also have equal variances.
ANOVA
Page 61
In-Class Practice Problems
1. The extent to which a person’s attitude can be changed depends on how big a change you are trying
to produce. In a classic study on persuasion, Aronson, et al. (1985) obtained three groups of
subjects. One group listened to a persuasive message that differed only slightly from the subjects’
original attitudes. For the second group, there was a moderate discrepancy between the message and
the original attitudes. For the third group, there was a large discrepancy between the message and
the original attitudes. For each subject, the amount of attitude change was measured. Data were
entered for the three groups (small, moderate, large discrepancy) and an ANOVA was utilized to
determine if the amount of discrepancy between the original attitude and the persuasive argument
has a significant effect on the amount of attitude change. Test at the .05 level.
Analysis of Variance results for var2 grouped by var1
Group  n  Mean       Std. Error
1      6  1.5        0.4281744
2      6  6.6666665  0.71492034
3      6  1          0.2581989
Source      df  SS          MS         F-Stat    P-value
Treatments  2   118.111115  59.055557  38.79562  <0.0001
Error       15  22.833334   1.5222223
Total       17  140.94444
a. Independent Variable =
Scale (circle): Categorical Quantitative
b. Dependent Variable =
Scale (circle): Categorical Quantitative
c. Alternative hypothesis in sentence form.
d. Write the alternative and null hypotheses using correct notation.
H1:
H0:
e. Fcalculated =
f. Level of significance (p) =
g. Circle: reject null or fail to reject null
h. Write your conclusion in sentence form.
Page 62
2. A psychologist would like to examine the relative effectiveness of three therapy
techniques for treating mild phobias. A sample of N=15 individuals who display a
moderate fear of spiders is obtained. These individuals are randomly assigned to the
three therapies. After a certain amount of therapy, the psychologist measures the
degree of fear reported by each individual. ANOVA was conducted to determine if
there are any significant differences among the three therapies. Test at the .05
level.
Analysis of Variance results for var2 grouped by var1
Group  n  Mean  Std. Error
1      5  4     0.70710677
2      5  1.6   0.50990194
3      5  1.4   0.50990194
Source      df  SS         MS         F-Stat     P-value
Treatments  2   20.933332  10.466666  6.1568627  0.0145
Error       12  20.4       1.7
Total       14  41.333332
a. Independent Variable =
Scale (circle): Categorical
Quantitative
b. Dependent Variable =
Scale (circle): Categorical
Quantitative
c. Alternative hypothesis in sentence form.
d. Write the alternative and null hypotheses using correct notation.
H0:
H 1:
e. Fcalculated =
f. Level of significance (p) =
g. Circle:
reject null
or
fail to reject null
h. Write your conclusion in sentence form.
Page 63
Additional Practice: Interpreting Research Articles
ANOVA
Read the following excerpt to complete the questions on the next page:
Researchers examined the impact of teacher self-efficacy on classroom technology use. Participants
included 101 teachers from four elementary (K-6) schools in Northwest Ohio. Of the 101 participants, 13
were male. Teachers were administered the Teacher Attribute Survey (TAS) which measured classroom
technology use (teacher, student, and overall). Teacher self-efficacy was also measured in the
instrument and represented one’s belief in affecting student performance. Low, moderate, and high
levels of self-efficacy were created. As such, a teacher with low self-efficacy was defined as 3.29 or
below, medium self-efficacy as range from 3.3 to 4.6, and high self-efficacy as 4.61 and higher.
Table 1. Means and ANOVA results for Self-Efficacy groups and Technology Use
Technology Use Means by Level of Self-Efficacy
                   Low (n=12)  Moderate (n=78)  High (n=11)  ANOVA Results
Teacher Tech Use   1.73        2.15             2.36         F(2,98)=3.77, p<.05
Student Tech Use   1.24        1.49             1.81         F(2,98)=4.52, p<.05
Overall Tech Use   2.08        1.82             2.08         F(2,98)=4.71, p<.05
1. Which type of technology use is the highest among all levels of self-efficacy?
2. Which group of teachers (low, moderate, or high self-efficacy) report the highest technology use
among their students?
3. Write the null hypothesis for self-efficacy and overall technology use, where the ANOVA results
indicate: F(2,98)=4.71, p<.05.
4. Considering the null hypothesis that you wrote for item 3, should the null hypothesis be rejected?
Explain.
Answers: 1) teacher technology use; 2) teachers with high self-efficacy (M=1.81); 3) Self-efficacy will
NOT significantly impact overall technology use among teachers; 4) Reject the null, F(2,98)=4.71, p<.05.
Video #11: Correlation and Regression
Page 64
Correlation: statistical technique used to measure and describe a relationship between two quantitative
variables; correlation measures 3 characteristics:
• direction of relationship
  • positive: as one variable increases so does the other (food intake & weight)
  • negative (inverse): as one variable increases the other decreases (exercise & weight)
  [Figure: two scatterplots of y against x, one positive (r = +.90) and one negative (r = -.90)]
• form of relationship
  • linear: the relationship between x and y falls along a straight line
  • curvilinear: the relationship between x and y curves (age across the lifespan is a variable that
  often creates a curvilinear relationship)
• degree (strength) of relationship
  • degree of relationship is reflected in a correlation coefficient (usually r)
  • r ranges from -1 to +1; 0 indicates no relationship, +1 indicates a perfect positive
  relationship, and -1 indicates a perfect negative relationship
Pearson Correlation Coefficient
• measures the degree and direction of linear relationship between two variables
$$r = \frac{\text{degree to which X and Y vary together}}{\text{degree to which X and Y vary separately}} = \frac{SP}{\sqrt{SS_X SS_Y}}$$
• since we will be computing variability for each variable as well as their variability together, we will
be using SS and a new concept, SP, the sum of products.
• Sum of products is used to compute the amount of covariability of two variables
$$SP = \sum (X - \bar{X})(Y - \bar{Y})$$
Page 65
Correlation
• does NOT measure cause and effect
• when data have a limited range of scores, the value of the correlation can be distorted
• interpreting strength of coefficient (practical significance), using the absolute value of r:
  • r ≥ .8 is very strong
  • r = .6 - .79 is strong
  • r = .4 - .59 is fair
  • r < .4 is weak
• to describe how accurately one variable predicts the other, square r. For example, if r = .60,
then r² = .36, which can be interpreted as 36% of the variability in Y scores can be
predicted from the relationship with X. r² is called the coefficient of determination
because it measures the proportion of variability in one variable that can be determined
from the relationship with the other variable.
Hypothesis Testing (hypotheses use the Greek letter rho, ρ, for the population correlation)
             Alternative   Null
One-tailed   H1: ρ > 0     H0: ρ ≤ 0
Two-tailed   H1: ρ ≠ 0     H0: ρ = 0
Putting it all together
Example: To measure the relationship between anxiety level and test performance, a
psychologist obtains a sample of n=6 college students from an intro stats course. Students
arrive fifteen minutes prior to the exam and complete physiological measures of anxiety (heart
rate, skin resistance, blood pressure, etc.). Anxiety ratings and exam scores are listed below.
Compute the Pearson correlation to determine if a negative relationship exists between anxiety
and test performance. Test at the .05 level.
• Step 1: Develop hypotheses.
  • State Alternative: Anxiety and test performance will negatively relate.
  • It is a directional hypothesis → one-tailed
    H1: ρ < 0 (population shows negative correlation)
    H0: ρ ≥ 0 (population does not show negative correlation)
• Step 2: Establish significance criteria
  • Computer → StatCrunch does not calculate the p-value for the correlation coefficient.
  As a result, we must identify the rcritical for the chosen α, number of tails, and df
    • df = n - 2 = 6 - 2 = 4, rcritical = -.729
    • Notice that df is n - 2 for correlation, since we need two points to create a line.
  • Hand calculations → Identify the rcritical for the chosen α, number of tails, and df
Page 66
• Step 3: Utilize sample data to calculate r
  • Computer → enter data
  • Hand calculations → Calculate SP, SSX, SSY, r
Anxiety Rating (X)  Exam Score (Y)  (X - X̄)  (Y - Ȳ)  (X - X̄)(Y - Ȳ)  (X - X̄)²  (Y - Ȳ)²
5                   80              0        -3       0               0         9
2                   88              -3       5        -15             9         25
7                   80              2        -3       -6              4         9
7                   79              2        -4       -8              4         16
4                   86              -1       3        -3              1         9
5                   85              0        2        0               0         4
X̄ = 5               Ȳ = 83                            SP = -32        SSX = 18  SSY = 72
• Step 4: Compare sample data to null → calculate test statistic
  • Computer → Identify test statistic and compare rcalculated to rcritical
    • Correlation between var2 and var1 is: -0.8888889
    • r falls into the critical region, so it is significant → reject null
  • Hand calculations →
    • Calculate r:
$$r = \frac{SP}{\sqrt{SS_X SS_Y}} = \frac{-32}{\sqrt{18(72)}} = \frac{-32}{36} = -.889$$
    • Compare rcalculated to rcritical
    • r falls into the critical region → reject null
Step 5: Draw conclusion
• A negative relationship exists between anxiety and test performance, r(4)=-.889,
p<.05, one-tailed.
Computer Output
Correlation between var2 and var1 is: -0.8888889
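The hand calculations for this example can be retraced in a few lines. The sketch below uses the six anxiety ratings and exam scores from the table and the rcritical of -.729 identified in Step 2; the helper names are illustrative only.

```python
from math import sqrt

# Data from the worked example: anxiety ratings (X) and exam scores (Y).
anxiety = [5, 2, 7, 7, 4, 5]
exam = [80, 88, 80, 79, 86, 85]


def mean(values):
    return sum(values) / len(values)


def sum_of_products(xs, ys):
    """SP = sum of (X - mean of X)(Y - mean of Y); SS is SP of a variable with itself."""
    mean_x, mean_y = mean(xs), mean(ys)
    return sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))


sp = sum_of_products(anxiety, exam)        # -32
ss_x = sum_of_products(anxiety, anxiety)   # 18
ss_y = sum_of_products(exam, exam)         # 72
r = sp / sqrt(ss_x * ss_y)                 # -32 / 36

r_critical = -0.729                        # one-tailed, alpha = .05, df = 4 (from Step 2)
reject_null = r <= r_critical              # negative direction: r must fall at or below -.729

print(f"SP = {sp:.0f}, SSX = {ss_x:.0f}, SSY = {ss_y:.0f}")
print(f"r = {r:.3f}, r squared = {r ** 2:.2f}, reject null: {reject_null}")
```

Squaring r gives about .79, so roughly 79% of the variability in exam scores can be predicted from anxiety ratings in this small sample.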
Page 67
Regression
Regression: statistical technique for finding the best-fitting straight line for a set of data;
used when wanting to determine the ability of one variable to predict another variable (e.g.,
using SAT score to predict freshman college GPA)
Regression line: line that represents the linear relationship; represented by a linear equation
• Y = a + bX, where a = Y-intercept and b = slope
• Least-squares method helps determine the best-fitting line by minimizing the squared error
between the predicted & actual values of Y.
• Y = a + bX, where
$$b = \frac{SP}{SS_X} \quad \text{and} \quad a = \bar{Y} - b\bar{X}$$
Example: Using the correlation problem we just solved, let’s calculate the regression line.
• Step 1: Use X̄, Ȳ, SSX, SP to calculate b and a
  (previously calculated: X̄ = 5, Ȳ = 83, SP = -32, SSX = 18, SSY = 72)
  • b = SP / SSX = -32 / 18 = -1.778
  • a = Ȳ - bX̄
    a = 83 - (-1.778)(5)
    a = 83 + 8.889
    a = 91.889
• Step 2: Calculate regression equation
  • Y = a + bX
    Y = 91.89 - 1.78X
We can now use the regression equation to predict Y for a given value of X.
• If X = 7, what is the predicted value of Y?
• Y = 91.89 - 1.78X
• Y = 91.89 - 1.78(7) = 79.43
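The same regression line can be computed directly from the raw scores. This is a minimal sketch using the anxiety and exam data from the correlation example; the function name `predict` is illustrative. Because it keeps full precision instead of rounding a and b to two decimals, the prediction for X = 7 comes out at about 79.44 rather than the hand-calculated 79.43.

```python
# Data from the correlation example: anxiety ratings (X) and exam scores (Y).
anxiety = [5, 2, 7, 7, 4, 5]
exam = [80, 88, 80, 79, 86, 85]

n = len(anxiety)
mean_x = sum(anxiety) / n
mean_y = sum(exam) / n

# SP and SSX as defined for the Pearson correlation.
sp = sum((x - mean_x) * (y - mean_y) for x, y in zip(anxiety, exam))
ss_x = sum((x - mean_x) ** 2 for x in anxiety)

slope = sp / ss_x                      # b = SP / SSX
intercept = mean_y - slope * mean_x    # a = mean of Y - b * mean of X


def predict(x):
    """Predicted Y for a given X from the least-squares line."""
    return intercept + slope * x


print(f"Y = {intercept:.2f} + ({slope:.2f})X")
print(f"Predicted Y when X = 7: {predict(7):.2f}")
```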
Page 68
Computer Output
Computer Output: The output in the video will appear different, since a different version of StatCrunch was used.
Simple linear regression results:
Dependent Variable: var2
Independent Variable: var1
var2 = 91.888885 - 1.7777778 var1    (regression equation)
Sample size: 6
R (correlation coefficient) = -0.8889    (correlation coefficient)
R-sq = 0.79012346
Estimate of error standard deviation: 1.9436506
Parameter estimates:
Parameter  Estimate    Std. Err.   DF  T-Stat     P-Value
Intercept  91.888885   2.4241583   4   37.905483  <0.0001
Slope      -1.7777778  0.45812285  4   -3.88057   0.0178
Analysis of variance table for regression model:
Source  DF  SS         MS         F-stat     P-value
Model   1   56.88889   56.88889   15.058824  0.0178
Error   4   15.111111  3.7777777
Total   5   72
(Ignore these p-values since they are NOT for the correlation coefficient (r).)
Predicted values:
X value  Pred. Y   s.e.(Pred. y)  95% C.I.               95% P.I.
7        79.44444  1.2120792      (76.07917, 82.809715)  (73.08468, 85.80421)
(Pred. Y is the predicted value for Y when X = 7.)
Video #11
In-Class Practice Problems
Page 69
1. You probably have read about the relationship between years of education and salary potential. The
following hypothetical data represent a sample of n = 10 men who have been employed for five years.
Do these data indicate a significant relationship between years of higher education and salary? Test
at the .05 level. Also find the regression equation for predicting salary from education.
(X) Years of Higher Education: 4, 4, 2, 8, 0, 5, 10, 4, 12, 0
(Y)Salary (in $1000s): 31, 29, 28, 42, 23, 35, 45, 27, 44, 24
Simple linear regression results:
Dependent Variable: salary
Independent Variable: education
salary = 23.135265 + 1.9723947 education
Sample size: 10
R (correlation coefficient) = 0.9601
R-sq = 0.92169785
Estimate of error standard deviation: 2.4466708
Parameter estimates:
Parameter  Estimate   Std. Err.   DF  T-Stat    P-Value
Intercept  23.135265  1.2611643   8   18.34437  <0.0001
Slope      1.9723947  0.20325504  8   9.704039  <0.0001
Analysis of variance table for regression model:
Source  DF  SS         MS         F-stat     P-value
Model   1   563.71045  563.71045  94.168365  <0.0001
Error   8   47.88958   5.9861975
Total   9   611.6
Predicted values:
X value  Pred. Y   s.e.(Pred. y)  95% C.I.               95% P.I.
5        32.99724  0.77397215     (31.212456, 34.78202)  (27.07964, 38.91484)
Page 70
a. Independent Variable =
Scale: Categorical Quantitative
b. Dependent Variable =
Scale: Categorical Quantitative
c. Circle: One-Tailed OR Two-Tailed
d. Alternative hypothesis in sentence form.
e. Write the alternative and null hypotheses using correct notation.
H1:
H0:
f. rcritical =
g. rcalculated =
h. Circle: reject null or fail to reject null
i. Write your conclusion in sentence form.
j. Regression equation:
k. If one has 5 years of education, what is the predicted salary?
Page 71
2. Research has shown that similarity in attitudes, beliefs, and interests plays an important role in
interpersonal attraction. A therapist examines the correlation in attitudes between husbands (X)
and wives (Y). She administers a questionnaire that measures how liberal or conservative one’s
attitudes are. Low scores indicate that the person has liberal attitudes while high scores indicate
conservatism (scale 1-10). Ten couples participate. Test at the .01 level.
Simple linear regression results:
Dependent Variable: wife att
Independent Variable: hus att
wife att = 0.7785714 + 0.8035714 hus att
Sample size: 10
R (correlation coefficient) = 0.7869
R-sq = 0.61919034
Estimate of error standard deviation: 1.6673064
Parameter estimates:
Parameter  Estimate   Std. Err.   DF  T-Stat      P-Value
Intercept  0.7785714  1.4370375   8   0.54178923  0.6027
Slope      0.8035714  0.22280319  8   3.6066422   0.0069
Analysis of variance table for regression model:
Source  DF  SS         MS         F-stat     P-value
Model   1   36.160713  36.160713  13.007869  0.0069
Error   8   22.239286  2.7799108
Total   9   58.4
Predicted values:
X value  Pred. Y    s.e.(Pred. y)  95% C.I.                95% P.I.
5        4.7964287  0.57239175     (3.4764907, 6.1163664)  (0.73135275, 8.861505)
Page 72
a. Independent Variable =
Scale: Categorical Quantitative
b. Dependent Variable =
Scale: Categorical Quantitative
c. Circle: One-Tailed OR Two-Tailed
d. Alternative hypothesis in sentence form.
e. Write the alternative and null hypotheses using correct notation.
H1:
H0:
f. rcritical =
g. rcalculated =
h. Circle: reject null or fail to reject null
i. Write your conclusion in sentence form.
j. Regression equation:
k. If the husband has a moderate attitude of “5”, what is the predicted value of the wife’s attitude?
Page 73
Additional Practice: Interpreting Research Articles
Correlation
Read the following excerpt to complete the questions on the next page:
Boivin and Hymel (1997) examined the relationships among social behavior, peer experiences and
self-perception. A total of 793 French Canadian children participated in the study (393 girls, 400 boys). The
participants ranged from third to fifth grade, were from ten elementary schools and from a variety of
socioeconomic backgrounds. The following variables were measured:
• Aggression and withdrawal were measured by showing a picture of all classmates and asking each
student to choose two classmates who best fit each descriptor. For aggression, a score was obtained
for each child by summing the number of times he or she was selected for these descriptors: “gets
into lots of fights,” “loses temper easily,” “too bossy,” and “picks on other kids.” For withdrawal, a
score was obtained for each child by summing the number of times he or she was selected for these
descriptors: “rather play alone than with others” and “very shy.”
• Social preference was assessed by asking each child to name three other children they would like
most and like least for playing together, inviting others to a birthday party, and sitting next to each
other on a bus. (Higher scores indicate greater social preference.)
• Victimization by peers was measured by asking each child to nominate up to five other students who
could be described as being made fun of, being called names, and getting hit and pushed by other
kids. (Higher scores indicate greater victimization.)
• Number of affiliative links was measured by asking, “You have probably noticed children in class who
often hang around together and others who are more often alone. Could you name children who often
hang around together?” (Higher scores indicate a larger number of affiliative links.)
• Loneliness was measured with a 16-item questionnaire with higher scores indicating greater
loneliness.
• Perceived social acceptance and behavior-conduct were two aspects of self-concept measured with
Harter’s Self-Perception Profile for Children. Higher scores reflect a better self-concept in each of
the two domains.
Table 1. Correlations among the social behavior, peer expectation, and self-perception measures
1
1. Withdrawal
2. Aggression
3. Social Preference
4. Victimization by Peers
5. # of Affiliate Links
6. Loneliness
7. Perceived social acceptance
8. Perceived behavior-conduct
--.10
-.39
.42
-.35
.29
-.27
.06
2
--.44
.53
.05
.12
-.04
-.32
3
--.68
.35
-.34
.28
.17
4
--.21
.34
-.26
-.17
5
6
--.18
.18
-.06
--.69
-.35
7
-.39
Source: Boivin, M. & Hymel, S. (1997). Peer experiences and social self-perceptions: A sequential
model. Developmental Psychology, 33, 135-143.
Notice that the correlation coefficients are presented in a matrix. The column headers represent the same
variables presented in the row headers; however, the column headers use only the number to indicate a certain
variable. For example, the circled coefficient of .39 represents the correlation between “Perceived Social
Acceptance” and “Perceived Behavior Conduct”.
8
--
Page 74
1. What is the value of the Pearson r for the relationship between withdrawal and loneliness? Describe
this value in terms of strength and direction.
2. What is the value of the Pearson r for the relationship between social preference and victimization
by peers? Describe this value in terms of strength and direction.
3. Which variable has the strongest relationship with withdrawal?
4. Which variable has the weakest relationship with withdrawal?
5. The Pearson r for the relationship between withdrawal and loneliness indicates that those who tend
to be more lonely tend to be:
A. more withdrawn
B. less withdrawn
6. Which of the following pairs has the strongest relationship between them?
A. Perceived social acceptance and loneliness
B. Withdrawal and victimization by peers
C. Number of affiliate links and aggression
7. Which of the following pairs has the weakest relationship between them?
A. Withdrawal and social preference
B. Withdrawal and perceived social acceptance
C. Withdrawal and perceived behavior-conduct
Answers: 1) .29, weak and positive; 2) -.68, strong and negative; 3) Victimization by peers,
r=.42; 4) Perceived behavior-conduct, r=.06; 5) A, more withdrawn; 6) A; 7) C.
Video #12: Chi Square Test for Independence
Page 75
So far we have used parametric tests to evaluate a hypothesis about the population. Parametric
tests require certain assumptions about the population parameters, such as a normal
distribution, homogeneity of variance, and a quantitative (interval/ratio) dependent variable.
When these assumptions for parametric tests cannot be fulfilled, nonparametric tests can
be used.
Nonparametric tests
• usually do not state a hypothesis in terms of the population distribution, so they are often
called distribution-free tests
• are suited for data that utilize a nominal or ordinal scale
• are not as sensitive as parametric tests: they are more likely to fail in detecting a real
difference between two treatments
• one commonly used nonparametric test is the Chi Square Test for Independence.
Chi Square Test for Independence
• Used to test a relationship (differences) between two categorical variables
• If variables are independent of one another, then there is no relationship. As a result the
distribution of one variable will have the same shape for all the categories of the second
variable.
• Alternative hypothesis for Chi Square Test for Independence can be written to focus on the
relationship or on the differences.
  • H1: Gender is related to learning style.
  • H1: Learning style will differ by gender.
• Chi Square Test for Independence compares the observed and expected frequencies. Our
expected frequencies come from our null hypothesis and our observed data.
$$\chi^2 = \sum \frac{(f_o - f_e)^2}{f_e}$$
Building on our example of females and males with respect to learning styles, the table below
presents the data observed for a sample of 125 males and 75 females.
          Audio  Visual  Kinesthetic  Total
Males     30     30      65           125
Females   30     25      20           75
Total     60     55      85           200
Page 76
•
If the distribution for gender is predicted to be the same for the each learning style
category, then the same proportion/percent of males and females in each category would be
expected.
•
to calculate the expected frequency for each category this formula is used
• fe = fcfr
where fc = column total, fr = row total,
n
n = sample size
•
the table of expected frequencies would look something like this
            Audio             Visual            Kinesthetic       Total
Males       60(125)/200=38    55(125)/200=34    85(125)/200=53    125
Females     60(75)/200=22     55(75)/200=21     85(75)/200=32     75
Total       60                55                85
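The expected-frequency formula can be checked with a short Python sketch. It is only an illustration, not part of the course software: the dictionary layout and the function name `expected_frequencies` are assumptions made for this example. It applies fe = (fc)(fr)/n to the observed learning-style counts and prints the unrounded values, which the packet rounds to whole numbers (for example, 37.5 is shown as 38).

```python
# Observed counts from the packet's learning-style example.
observed = {
    "Males":   {"Audio": 30, "Visual": 30, "Kinesthetic": 65},
    "Females": {"Audio": 30, "Visual": 25, "Kinesthetic": 20},
}


def expected_frequencies(table):
    """Return fe = (column total * row total) / n for every cell."""
    row_totals = {row: sum(cells.values()) for row, cells in table.items()}
    col_totals = {}
    for cells in table.values():
        for col, count in cells.items():
            col_totals[col] = col_totals.get(col, 0) + count
    n = sum(row_totals.values())
    return {
        row: {col: col_totals[col] * row_totals[row] / n for col in cells}
        for row, cells in table.items()
    }


expected = expected_frequencies(observed)
for row, cells in expected.items():
    print(row, cells)
```

The unrounded values are 37.5, 34.375, and 53.125 for males and 22.5, 20.625, and 31.875 for females.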
Degrees of freedom are calculated a bit differently
• df = (R - 1)(C - 1), where R= number of rows, C=number of columns
• in our example, df = (2-1)(3-1) = 1(2) = 2
• using this and α=.05, our χ2critical = 5.99
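As a rough check on the critical value, the following Python sketch computes df and χ2critical. It relies on a shortcut that holds only when df = 2, where the chi-square upper-tail probability equals exp(-χ2/2). For other df values a chi-square table or statistical software is still needed. The function names are assumptions made for this illustration.

```python
import math


def degrees_of_freedom(n_rows, n_cols):
    """df = (R - 1)(C - 1) for a contingency table."""
    return (n_rows - 1) * (n_cols - 1)


def chi2_critical_df2(alpha):
    """Critical value for df = 2 ONLY, from P(chi2 > x) = exp(-x / 2)."""
    return -2.0 * math.log(alpha)


df = degrees_of_freedom(2, 3)        # 2 rows (gender), 3 columns (learning style)
critical = chi2_critical_df2(0.05)   # valid here because df == 2
print(df, round(critical, 2))        # prints: 2 5.99
```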
Page 77
Putting it all together
Example: Based upon the observed frequencies presented in the table below, can a researcher
conclude that learning styles differ by gender? Test at the .05 level.
            Audio    Visual    Kinesthetic    Total
Males         30       30          65          125
Females       30       25          20           75
Total         60       55          85
• Step 1: Develop hypotheses.
   • State Alternative: Learning style will significantly differ by gender.
• Step 2: Establish significance criteria
   • Computer → α = .05
   • Hand calculations → Identify χ2critical used for α and df
   • df = (2-1)(3-1) = 2; χ2critical = 5.99
• Step 3: Utilize sample data to calculate χ2
   • Computer → enter data
   • Hand calculations → Calculate expected frequencies (fe), fo-fe, (fo-fe)2, and (fo-fe)2/fe

                         fo     fe     fo-fe    (fo-fe)2    (fo-fe)2/fe
   male-audio            30     38      -8         64          1.68
   female-audio          30     22       8         64          2.91
   male-visual           30     34      -4         16          0.47
   female-visual         25     21       4         16          0.76
   male-kinesthetic      65     53      12        144          2.72
   female-kinesthetic    20     32     -12        144          4.50
                                                           Σ = 13.04
• Step 4: Compare sample data to null → calculate test statistic
   • Computer → Identify test statistic and compare p-value to α level

        Statistic     DF    Value     P-value
        Chi-square    2     13.042    0.0019

      o p-value is less than .05 → reject null
   • Hand calculations →
      • Calculate χ2 = Σ (fo-fe)2/fe = 13.04
      • Compare χ2calculated to χ2critical
      • Since χ2 = 13.04 exceeds χ2critical = 5.99, the null is rejected
• Step 5: Draw conclusion
   • Males and females differ in learning styles; χ2(2, n=200)=13.04, p<.05.
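The hand calculation in Steps 3 to 5 can be reproduced with the Python sketch below. It is an illustration under stated assumptions, not the course's software, and the list and function names are hypothetical. It sums (fo-fe)2/fe twice: once with the packet's whole-number expected frequencies, which gives the hand-calculated 13.04, and once with unrounded expected frequencies, which gives about 12.56. The p-value line uses a shortcut that is valid only for df = 2, and it matches the reported 0.0019 when the unrounded χ2 is used. Either way χ2 exceeds 5.99, so the decision to reject the null is the same.

```python
import math

# Cell order: male-audio, female-audio, male-visual, female-visual,
# male-kinesthetic, female-kinesthetic (same order as the packet's table).
fo = [30, 30, 30, 25, 65, 20]
fe_rounded = [38, 22, 34, 21, 53, 32]                    # packet's whole numbers
fe_exact = [37.5, 22.5, 34.375, 20.625, 53.125, 31.875]  # (fc * fr) / n, unrounded


def chi_square(observed, expected):
    """Sum of (fo - fe)^2 / fe over all cells."""
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))


chi2_hand = chi_square(fo, fe_rounded)
chi2_exact = chi_square(fo, fe_exact)
p_exact = math.exp(-chi2_exact / 2)  # upper-tail p-value, df = 2 only

critical = 5.99                      # from the packet: alpha = .05, df = 2
reject_null = chi2_hand > critical

print(round(chi2_hand, 2), round(chi2_exact, 2), round(p_exact, 4), reject_null)
```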
Page 78
Computer Output
Contingency table results:
Rows: var1 (1=male, 2=female)
Columns: var2 (1=audio, 2=visual, 3=kinesthetic)
Cell format: Count / Row percent / Column percent / Total percent

          1          2          3          Total
1         30         30         65         125
          24%        24%        52%        100.00%
          50%        54.55%     76.47%     62.5%
          15%        15%        32.5%      62.5%
2         30         25         20         75
          40%        33.33%     26.67%     100.00%
          50%        45.45%     23.53%     37.5%
          15%        12.5%      10%        37.5%
Total     60         55         85         200
          30%        27.5%      42.5%      100.00%
          100.00%    100.00%    100.00%    100.00%
          30%        27.5%      42.5%      100.00%

Statistic     DF    Value     P-value
Chi-square    2     13.042    0.0019
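Each cell of the contingency table output stacks four numbers: count, row percent, column percent, and total percent. A minimal Python sketch of how those percentages follow from the counts is shown here. The variable and function names are assumptions for illustration.

```python
counts = [
    [30, 30, 65],  # row 1 = male:   audio, visual, kinesthetic
    [30, 25, 20],  # row 2 = female: audio, visual, kinesthetic
]
row_totals = [sum(row) for row in counts]
col_totals = [sum(col) for col in zip(*counts)]
n = sum(row_totals)


def cell_percents(i, j):
    """Return (row percent, column percent, total percent) for cell (i, j)."""
    count = counts[i][j]
    return (
        round(100 * count / row_totals[i], 2),
        round(100 * count / col_totals[j], 2),
        round(100 * count / n, 2),
    )


print(cell_percents(0, 2))  # male, kinesthetic: (52.0, 76.47, 32.5)
```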
Assumptions of Chi Square Tests
• Random sampling
• Independence of observations
• Expected frequency for any cell MUST be greater than 5
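The expected-frequency assumption can be checked once the expected table has been computed. The sketch below uses a hypothetical helper name, the unrounded expected frequencies from the learning-style example, and one made-up table (not from the packet) that fails the rule.

```python
def meets_expected_frequency_rule(expected, minimum=5):
    """True when every expected frequency is greater than the minimum."""
    return all(fe > minimum for row in expected for fe in row)


# Unrounded expected frequencies for the learning-style example.
learning_style_expected = [
    [37.5, 34.375, 53.125],   # males
    [22.5, 20.625, 31.875],   # females
]

# Made-up expected table for illustration only; two cells fall below 5.
small_sample_expected = [
    [3.2, 6.8],
    [4.8, 10.2],
]

print(meets_expected_frequency_rule(learning_style_expected))  # True
print(meets_expected_frequency_rule(small_sample_expected))    # False
```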
Reporting Chi Square Results
Statement should include chi-square value with df and n in parentheses, and p-value:
• Males and females differ in learning styles; χ2(2, n=200)=13.04, p<.05.
Page 79
Video #12
In-Class Practice Problems
1. The US Senate recently considered a controversial amendment for school prayer.
The amendment did not get the required two-thirds majority, but the results of the
vote are interesting when viewed in terms of the party affiliation of the senators.
Does the vote on the prayer amendment (var2: 1=yes, 2=no) differ by political party
(var1: 1=demo, 2=rep)? Test at the .05 level.
Contingency table results:
Rows: var1
Columns: var2
          1          2          Total
1         19         26         45
          42.22%     57.78%     100.00%
          33.93%     59.09%     45%
          19%        26%        45%
2         37         18         55
          67.27%     32.73%     100.00%
          66.07%     40.91%     55%
          37%        18%        55%
Total     56         44         100
          56%        44%        100.00%
          100.00%    100.00%    100.00%
          56%        44%        100.00%

Statistic     DF    Value        P-value
Chi-square    1     6.3032928    0.0121
a. Independent Variable =                    Scale (circle): Categorical    Quantitative
b. Dependent Variable =                      Scale (circle): Categorical    Quantitative
c. Alternative hypothesis in sentence form.
d. χ2calculated =
e. Level of significance (p) =
f. Circle:    reject null    or    fail to reject null
g. Write your conclusion in sentence form. (1 pt)
Page 80
2. A stats instructor would like to know whether it is worthwhile to require students to do weekly
homework assignments. For one section of the course, homework is assigned, collected and
graded each week. For the second section, the same problems are recommended but not
required. At the end of the semester, all students complete the same final exam. Letter
grades (A, B, C, D, F) are tabulated for each student by section. Do these data indicate
significant grade differences for students with homework versus no homework? Test at the
.05 level.
Contingency table results:
Rows: var1
Columns: var2
          1          2          3          4          5          Total
1         6          5          5          2          2          20
          30%        25%        25%        10%        10%        100.00%
          66.67%     50%        45.45%     28.57%     40%        47.62%
          14.29%     11.9%      11.9%      4.762%     4.762%     47.62%
2         3          5          6          5          3          22
          13.64%     22.73%     27.27%     22.73%     13.64%     100.00%
          33.33%     50%        54.55%     71.43%     60%        52.38%
          7.143%     11.9%      14.29%     11.9%      7.143%     52.38%
Total     9          10         11         7          5          42
          21.43%     23.81%     26.19%     16.67%     11.9%      100.00%
          100.00%    100.00%    100.00%    100.00%    100.00%    100.00%
          21.43%     23.81%     26.19%     16.67%     11.9%      100.00%

Statistic     DF    Value        P-value
Chi-square    4     2.4870248    0.647
a. Independent Variable =                    Scale (circle): Categorical    Quantitative
b. Dependent Variable =                      Scale (circle): Categorical    Quantitative
c. Alternative hypothesis in sentence form.
d. χ2calculated =
e. Level of significance (p) =
f. Circle:    reject null    or    fail to reject null
g. Write your conclusion in sentence form. (1 pt)
Page 81
Additional Practice: Interpreting Research Articles
Read the following excerpt to complete the questions on the next page:
Researchers surveyed 120 college sophomores and juniors enrolled in general education
psychology courses. Participants were between the ages of 18 and 23 and completed a survey that
measured class absenteeism (cutting class) in the past month (for no valid reason) and nine negative
behaviors and two positive behaviors, all measured using yes/no responses. Negative behaviors included:
speeding, slapping/hitting someone, getting drunk, breaking the law, telling a significant lie, thinking about
dropping out of school, feeling depressed, getting a tattoo, and piercing the body. Positive behaviors were
reading a book that wasn’t required for class and visiting family.
Table 1. Number and percentage of students answering “yes” to behaviors by groups of students who have cut
class (n=68) and not cut class (n=52)

                                  Cutting         Not Cutting
Behavior                          N       %       N       %       χ2
Getting drunk                     59      87      24      46      22.79**
Speeding                          63      93      39      75      7.19*
Breaking law                      35      51      10      19      13.07**
Telling significant lie           14      21      8       15      0.53
Thoughts of dropping out          8       12      3       6       0.79
Feeling depressed                 7       10      5       10      0.02
Hitting/slapping                  8       12      11      21      1.95
Getting tattoo                    12      19      4       8       3.16
Piercing body                     18      26      7       13      3.17
Reading a non-required book       25      37      15      29      0.83
Visiting family                   62      91      40      77      4.61*
Note: * p<.05, ** p<.002
Source: Trice, A.D. , Holland, S. A., & Gagne, P.E. (2000). Voluntary class absences and other behaviors in college
students: An exploratory analysis. Psychological Reports, 87, 179-182.
1. What percentage of students who did not cut class report reading a non-required book?
2. Is the difference in frequencies for speeding significant for the two groups? Explain.
3. Write the null hypothesis for group differences in getting drunk.
4. Should the null hypothesis you wrote for item 3 be rejected? Explain.
5. What can you conclude about students who cut class and get drunk?
Answers: 1) 29%; 2) yes, χ2 = 7.19, p<.05; 3) Students who cut class will NOT significantly differ in the
behavior of getting drunk from students who do not cut class; 4) The null should be rejected since
χ2 = 22.79, p<.002; 5) Students who cut class are more likely to get drunk and vice versa.
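To see where a χ2 value in Table 1 comes from, the Python sketch below rebuilds the 2 x 2 table for getting drunk and applies the same observed-versus-expected calculation used earlier in this section. It is an illustration only. The "no" counts are not printed in Table 1 and are derived here as group size minus the "yes" count (68 - 59 and 52 - 24), and the function name is an assumption.

```python
def chi_square_from_counts(table):
    """Chi-square for a table of observed counts, with fe = (fc * fr) / n."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    chi2 = 0.0
    for i, row in enumerate(table):
        for j, fo in enumerate(row):
            fe = col_totals[j] * row_totals[i] / n
            chi2 += (fo - fe) ** 2 / fe
    return chi2


# Rows: cut class (n=68), did not cut class (n=52). Columns: yes, no.
getting_drunk = [
    [59, 68 - 59],
    [24, 52 - 24],
]

chi2 = chi_square_from_counts(getting_drunk)
print(round(chi2, 2))  # about 22.79, the value reported in Table 1
```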
Page 82
Statistical Test Grid

                        Independent Variable
Dependent Variable      Categorical                            Quantitative
Categorical             1. Chi Square Test of Independence
Quantitative            2. t test (2)                          3. Pearson Correlation (relate)
                              Single Sample                       Regression (predict)
                              Independent Samples
                              Related Samples
                           ANOVA (3+)
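The grid can also be read as a simple lookup from the scales of the two variables to a family of tests. The Python sketch below is one hypothetical way to encode it. The dictionary, the function name, and the lowercase scale labels are assumptions for illustration, and the combination the grid leaves blank is reported as not covered.

```python
# Keys are (independent variable scale, dependent variable scale).
TEST_GRID = {
    ("categorical", "categorical"): "Chi Square Test of Independence",
    ("categorical", "quantitative"): "t test (2) or ANOVA (3+)",
    ("quantitative", "quantitative"): "Pearson Correlation (relate) or Regression (predict)",
}


def suggest_test(independent_scale, dependent_scale):
    """Look up the grid cell for the given variable scales."""
    key = (independent_scale.lower(), dependent_scale.lower())
    return TEST_GRID.get(key, "not covered by the grid")


# Learning style by gender: both variables are categorical.
print(suggest_test("categorical", "categorical"))
```

For the learning-style example in this section, both variables are categorical, so the lookup returns the Chi Square Test of Independence.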
Overview Items
Page 83
1. Does disability category (LD, EBD, none, etc.) differ by gender?
2. Does gender affect GRE scores?
3. Are GRE scores related to graduate GPA?
4. Does SES (low, middle, high) affect reading preparedness (as measured by a test)
among preschoolers?
5. Does a seminar on self-esteem increase self-esteem scores? (Self-esteem was
measured before and after the seminar)
6. Does learning style type differ by hand preference?
7. Do ACT scores predict college freshman GPA?
8. Do BGSU’s GRE scores for entering graduate students significantly differ from the
population norm?
9. Does a reading intervention significantly increase 4th grade reading proficiency
scores? Note: one group receives intervention, while another group receives
traditional instruction.
10. Does foot size (small, medium, large) affect IQ?
Page 84