Download Social Science Reasoning Using Statistics

Reasoning in Psychology
Using Statistics
Psychology 138
2015
• Exam 2 in lecture and lab on Wednesday
• Be prepared to do calculations (including square
roots) on calculator
Announcements
• Mathematical cautions
– Different scales: convert to z-scores
– Restriction of range (e.g., age & height)
– Outliers (especially in small samples)
• Interpretive caution
– Causal claims
Cautions with Correlations
• Change all scores to z-scores
– Both variables on same scale
– Correlation stays the same
– What happens to means?
r = 28 / √(60.8 × 16) = 28 / 31.2 = 0.898
Convert X and Y to z-scores
[Figure: scatterplot of Y against X (both axes 1 to 6), re-plotted on z-score axes zx and zy (-1.5 to 1.5)]
Pearson’s r, z transformation
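The claim that the correlation stays the same after converting to z-scores can be checked numerically. The sketch below is only an illustration: it borrows the sleep and GPA survey data used later in this lecture (not the data plotted on this slide), and the helper names z_scores and pearson_r are made up for the example.

```python
from statistics import mean, stdev

# Hours of sleep and GPA for the 10 survey respondents used later in the lecture.
hours = [7, 6, 7, 8, 8, 7, 9, 5, 9, 6]
gpa = [2.4, 3.9, 3.5, 2.8, 3.0, 2.1, 3.9, 2.9, 3.6, 2.7]

def z_scores(values):
    """Convert raw scores to z-scores using the sample SD (n - 1)."""
    m, s = mean(values), stdev(values)
    return [(v - m) / s for v in values]

def pearson_r(xs, ys):
    """Pearson's r computed as SP / sqrt(SSX * SSY)."""
    mx, my = mean(xs), mean(ys)
    sp = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    ssx = sum((x - mx) ** 2 for x in xs)
    ssy = sum((y - my) ** 2 for y in ys)
    return sp / (ssx * ssy) ** 0.5

zx, zy = z_scores(hours), z_scores(gpa)
r_raw = pearson_r(hours, gpa)   # correlation on the original scales
r_z = pearson_r(zx, zy)         # correlation after the z transformation

print(round(r_raw, 3), round(r_z, 3))
print(round(mean(zx), 6), round(mean(zy), 6))
```

Both correlations agree, and the means of the z-scores come out at zero (up to floating-point error), which answers the slide's question about what happens to the means.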
• The total data show a positive correlation between SAT and GPA.
• What is the correlation between SAT and GPA among only those who were
admitted and studied (400 < SAT < 700)?
• Get r = 0
Restriction of range
• One extreme score can change correlation (especially in
small sample).
• On left, 5 observations,
high X associated with high
Y: good predictability.
Outliers
• On right, same 5 observations
plus 1 other, high X associated
with high or low Y: poor
predictability.
• We’d like to say:
– X causes Y
• To be able to do this:
1. The causal variable must come first
2. There must be co-variation between the two variables
3. Need to eliminate plausible alternative explanations
• Correlation procedures address point 2, but say
nothing about points 1 and 3.
• Careful: Do not make causal claims based on
correlations
Causal claims
• Directionality Problem
– Happy people sleep well
• Or is it that sleeping well makes you happy?
Causal claims
• Third Variable Problem:
– Happy people sleep well
– Or does sleeping well make you happy?
– OR something else makes people happy and sleep
well!
• Regular exercise
• Minimal use of drugs & alcohol
• Being a conscientious person
• Being in a good relationship
• Etc.
Causal claims
Statistical procedures to help organize, summarize &
simplify large sets of data
1. One variable (frequency distribution)
• Display results in a frequency distribution table & histogram
(or bar chart if categorical variable).
• Make a deviations table to get measures of central tendency
(mode, median, mean) & variability (range, standard deviation, variance).
2. Two variables (bivariate distribution)
• Display results: Make a scatterplot.
• Make a bivariate deviations table or z-table to get Pearson’s r.
3. Z-scores & normal distribution
Review for Exam 2:
Descriptive statistics
• Are hours sleeping related to GPA?
– You conduct a survey.
• Your sample of 10 gives these results for average hours per night
sleeping:
7, 6, 7, 8, 8, 7, 9, 5, 9, 6
• You also have respondents give their overall GPA:
2.4, 3.9, 3.5, 2.8, 3.0, 2.1, 3.9, 2.9, 3.6, 2.7
– We will focus on sleep results first and then both variables
together.
• What kind of scales are they?
• To find standard deviation, will we use formula for population or
sample?
Example
Hrs. sleep (n = 10): 7, 6, 7, 8, 8, 7, 9, 5, 9, 6

X    f    p = f/n   %     cf    c%
9    2    .2        20    10    100
8    2    .2        20    8     80
7    3    .3        30    6     60
6    2    .2        20    3     30
5    1    .1        10    1     10
∑    10   1.0       100

Will enter first two columns (X and f) as X and Y axes for the histogram.
Step 1: Frequency distribution & histogram
[Histogram of Hrs. sleep: FREQUENCY (vertical axis, 1 to 6) against SCORE (horizontal axis, 5 to 9); frequencies are 1, 2, 3, 2, 2 for scores 5 through 9]
Step 1: Frequency distribution & histogram
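The frequency table can be rebuilt from the raw scores. What follows is a minimal sketch, not the course's own procedure (the lab uses SPSS); the dictionary keys pct and cpct are names chosen here for the % and c% columns.

```python
from collections import Counter

hours = [7, 6, 7, 8, 8, 7, 9, 5, 9, 6]
n = len(hours)
counts = Counter(hours)

rows = []
for x in sorted(counts):  # ascending, so cf accumulates from the lowest score
    f = counts[x]
    cf = f + (rows[-1]["cf"] if rows else 0)
    rows.append({"X": x, "f": f, "p": f / n, "pct": 100 * f / n,
                 "cf": cf, "cpct": 100 * cf / n})

for row in reversed(rows):  # print the highest score first, as in the table
    print(row)
```

Accumulating from the lowest score upward is what makes cf for the top score equal n and c% equal 100.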
• Suppose that you combine two groups together.
– How do you compute the new group mean?
Group 1 (n1 = 7): 110, 110, 110, 110, 110, 110, 110; X̄1 = 110
Group 2 (n2 = 3): 140, 140, 140; X̄2 = 140
New Group
X̄N = (X̄1 n1 + X̄2 n2) / (n1 + n2)
= ((110 * 7) + (140 * 3)) / (7 + 3)
= 1190 / 10
= 119
A weighted mean
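As a quick check of the arithmetic, here is a small sketch of the weighted mean; the function name weighted_mean is hypothetical. It also shows the unweighted average of the two group means for contrast.

```python
def weighted_mean(means, sizes):
    """Combine group means, weighting each by its group size."""
    return sum(m * n for m, n in zip(means, sizes)) / sum(sizes)

combined = weighted_mean([110, 140], [7, 3])
naive = (110 + 140) / 2  # ignores group sizes

print(combined, naive)  # 119.0 125.0
```

The unweighted average overstates the combined mean here because the smaller group has the higher scores.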
Be careful computing the mean of this distribution; remember there are groups here.

X    f
9    2
8    2
7    3
6    2
5    1

Raw scores: 9, 9, 8, 8, 7, 7, 7, 6, 6, 5
• The mean
– Change/add/delete a given score, then the mean will change.
– Add/subtract a constant to each score, then the mean will change
by adding (subtracting) that constant.
– Multiply (or divide) each score by a constant, then the mean will
change by being multiplied (divided) by that constant.
• The standard deviation
– Change/add/delete a given score, then the mean will change, so the
standard deviation will generally change too.
– Add/subtract a constant to each score, then the standard deviation
will NOT change.
– Multiply (or divide) each score by a constant, then the standard
deviation will change by being multiplied (divided) by that constant.
Characteristics of a mean &
standard deviation
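These characteristics can be verified on the sleep data from this lecture. The sketch below assumes arbitrary constants (add 2, multiply by 3) chosen only for illustration.

```python
from statistics import mean, stdev

hours = [7, 6, 7, 8, 8, 7, 9, 5, 9, 6]

shifted = [x + 2 for x in hours]  # add a constant to each score
scaled = [x * 3 for x in hours]   # multiply each score by a constant

# Adding a constant moves the mean by that constant and leaves the SD alone.
print(mean(hours), mean(shifted))
print(round(stdev(hours), 3), round(stdev(shifted), 3))

# Multiplying by a constant multiplies both the mean and the SD.
print(mean(scaled), round(stdev(scaled), 3))
```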
Hrs. sleep, n = 10
Create table, sorted in descending order.

X      (X - X̄)    (X - X̄)²
9      1.8         3.24
9      1.8         3.24
8      .8          .64
8      .8          .64
7      -.2         .04
7      -.2         .04
7      -.2         .04
6      -1.2        1.44
6      -1.2        1.44
5      -2.2        4.84
∑ 72   0           15.6 = SS
X̄ = 7.2

Example row: (X - X̄) = 9 - 7.2 = 1.8; (X - X̄)² = 1.8² = 3.24

Mode = 7
Median = 7
Mean = (∑X)/n = 72/10 = 7.2
Range = 5 to 9
Step 2: Deviations table
SD for sample
s = √(∑(X - X̄)² / (n - 1)) = √(15.6/9) = √1.73 = 1.32
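The deviations table and the sample standard deviation can be reproduced in a few lines. This is a sketch of the hand calculation, using n - 1 in the denominator because the 10 respondents are a sample.

```python
hours = sorted([7, 6, 7, 8, 8, 7, 9, 5, 9, 6], reverse=True)
n = len(hours)

m = sum(hours) / n                     # mean = 72 / 10
deviations = [x - m for x in hours]    # (X - mean) column
squared = [d ** 2 for d in deviations] # (X - mean)^2 column
ss = sum(squared)                      # sum of squares (SS)
s = (ss / (n - 1)) ** 0.5              # sample standard deviation

for x, d, sq in zip(hours, deviations, squared):
    print(x, round(d, 1), round(sq, 2))
print(round(m, 1), round(ss, 1), round(s, 2))
```

The deviations sum to zero (apart from floating-point error), which is a useful check that the mean was computed correctly.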
Person   Hrs.   GPA
A        7      2.4
B        6      3.9
C        7      3.5
D        8      2.8
E        8      3.0
F        7      2.1
G        9      3.9
H        5      2.9
I        9      3.6
J        6      2.7

[Scatterplot: GPA (vertical axis, 1.0 to 4.0) against Hours of sleep (horizontal axis, 5 to 9)]
Step 3: Scatterplot
What does shape of envelope indicate about correlation?
low positive correlation
[Scatterplot: GPA (vertical axis, 1.0 to 4.0) against Hours of sleep (horizontal axis, 5 to 9), points labeled A to J]
Step 3: Scatterplot
Persons A to J as before, plus one added observation:
Person   Hrs.   GPA
K        5      1.0
What does shape of envelope indicate about correlation?
moderate positive correlation
[Scatterplot: GPA (vertical axis, 1.0 to 4.0) against Hours of sleep (horizontal axis, 5 to 9), points labeled A to K]
Step 3: Scatterplot, Effect of outlier
Persons A to J as before, plus one added observation:
Person   Hrs.   GPA
K        9      1.0
What does shape of envelope indicate about correlation?
low negative correlation
[Scatterplot: GPA (vertical axis, 1.0 to 4.0) against Hours of sleep (horizontal axis, 5 to 9), points labeled A to K]
Step 3: Scatterplot, Effect of outlier
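The effect of the single added point K can be quantified by recomputing Pearson's r for each version of the data. The sketch below uses the slide data; the function name pearson_r is an illustrative helper, not part of the course materials.

```python
def pearson_r(pairs):
    """Pearson's r for a list of (x, y) pairs, as SP / sqrt(SSX * SSY)."""
    n = len(pairs)
    mx = sum(x for x, _ in pairs) / n
    my = sum(y for _, y in pairs) / n
    sp = sum((x - mx) * (y - my) for x, y in pairs)
    ssx = sum((x - mx) ** 2 for x, _ in pairs)
    ssy = sum((y - my) ** 2 for _, y in pairs)
    return sp / (ssx * ssy) ** 0.5

# (hours of sleep, GPA) for persons A to J
base = [(7, 2.4), (6, 3.9), (7, 3.5), (8, 2.8), (8, 3.0),
        (7, 2.1), (9, 3.9), (5, 2.9), (9, 3.6), (6, 2.7)]

r_base = pearson_r(base)                  # no outlier
r_low_k = pearson_r(base + [(5, 1.0)])    # K sleeps 5 hours, GPA 1.0
r_high_k = pearson_r(base + [(9, 1.0)])   # K sleeps 9 hours, GPA 1.0

print(round(r_base, 2), round(r_low_k, 2), round(r_high_k, 2))
```

One extra respondent is enough to move r from low positive to moderate positive or to slightly negative, depending on where that point falls.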
X     (X - X̄)   (X - X̄)²   Y      (Y - Ȳ)   (Y - Ȳ)²   (X - X̄)(Y - Ȳ)
9     1.8        3.24        3.9    0.82       0.67        1.476
9     1.8        3.24        3.6    0.52       0.27        0.936
8     0.8        0.64        3.0    -0.08      0.01        -0.064
8     0.8        0.64        2.8    -0.28      0.08        -0.224
7     -0.2       0.04        3.5    0.42       0.18        -0.084
7     -0.2       0.04        2.4    -0.68      0.46        0.136
7     -0.2       0.04        2.1    -0.98      0.96        0.196
6     -1.2       1.44        3.9    0.82       0.67        -0.984
6     -1.2       1.44        2.7    -0.38      0.14        0.456
5     -2.2       4.84        2.9    -0.18      0.03        0.396
Sum   72    0.0  15.6 (SSX)  30.8   0.0        3.47 (SSY)  2.24 (SP)
Mean  7.2                    3.08
Step 4: Bivariate Deviations Table
n = 10
Note signs! +r or –r?
r = SP / √(SSX × SSY) = 2.24 / √(15.6 * 3.47) = 2.24 / √54.132 = 2.24 / 7.357 = .304
(SP: XY co-deviations; SSX and SSY: X deviations, Y deviations)
X̄ = ∑X / n = 72 / 10 = 7.2
sX = √(∑(X - X̄)² / (n - 1)) = √(15.6 / 9) = √1.73 = 1.32
Ȳ = ∑Y / n = 30.8 / 10 = 3.08
sY = √(SSY / (n - 1)) = √(3.47 / 9) = √0.386 = .62
Pearson’s r & summary statistics
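The summary statistics can be recomputed directly from the raw scores as a cross-check. This is a sketch of the same formulas; the variable names are chosen here for readability. Note that the unrounded SSY is 3.476, while the table's 3.47 is the sum of squared deviations already rounded to two places.

```python
hours = [7, 6, 7, 8, 8, 7, 9, 5, 9, 6]
gpa = [2.4, 3.9, 3.5, 2.8, 3.0, 2.1, 3.9, 2.9, 3.6, 2.7]
n = len(hours)

mean_x = sum(hours) / n
mean_y = sum(gpa) / n

ssx = sum((x - mean_x) ** 2 for x in hours)   # X deviations, squared and summed
ssy = sum((y - mean_y) ** 2 for y in gpa)     # Y deviations, squared and summed
sp = sum((x - mean_x) * (y - mean_y)          # XY co-deviations, summed
         for x, y in zip(hours, gpa))

r = sp / (ssx * ssy) ** 0.5
s_x = (ssx / (n - 1)) ** 0.5
s_y = (ssy / (n - 1)) ** 0.5

print(round(sp, 2), round(ssx, 1), round(ssy, 3))
print(round(r, 3), round(s_x, 2), round(s_y, 2))
```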
SRA (Scientific Reasoning Assessment) (fictional)
• Based on normative data: Normal, μ = 50.0, σ = 10.0
• Preparing for your analyses
– Write down what you know
– Make a sketch of the distribution
(make a note: population or sample)
[Sketch: normal curve with μ at the center and 40 and 60 marked]
An example
– Determine the shape
– What is best measure of center?
– What is best measure of variability?
– Mark the mean (center) and
standard deviation on your sketch
SRA (Scientific Reasoning Assessment) (fictional)
• Based on normative data: Normal distr., μ = 50.0, σ = 10.0
• Question 1
• If George got a 35 on the SRA, what
is his percentile rank?
z = (X - μ) / σ = (35 - 50) / 10 = -15 / 10 = -1.5
• Since a normal distribution, can use
Unit Normal Table to infer percentile.
Unit Normal Table: p(z ≤ -1.5) = 0.0668
That’s 6.68% at or
below this score
(definition of percentile)
[Sketch: normal curve with 40 (z = -1.0), μ, and 60 (z = 1.0) marked]
z-scores & Normal Distribution
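The Unit Normal Table lookup for Question 1 can be reproduced with Python's standard library. This is a sketch that stands in for the printed table; NormalDist is used here only as a convenient cumulative normal function.

```python
from statistics import NormalDist

sra = NormalDist(mu=50.0, sigma=10.0)  # normative SRA distribution

score = 35
z = (score - sra.mean) / sra.stdev     # z = (X - mu) / sigma
percentile = sra.cdf(score)            # proportion at or below the score

print(z, round(percentile, 4))  # -1.5 0.0668
```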
SRA (Scientific Reasoning Assessment) (fictional)
• Based on normative data: Normal distr., μ = 50.0, σ = 10.0
• Question 2
• What proportion of people get
between a 40 and 60 on the SRA?
z = (X - μ) / σ = (40 - 50) / 10 = -10 / 10 = -1.0
z = (X - μ) / σ = (60 - 50) / 10 = 10 / 10 = 1.0
• Since a normal distribution, can use
Unit Normal Table to infer percentile.
Unit Normal Table: 0.1587 below z = -1.0 and 0.1587 above z = 1.0
That’s about 32% outside
these two scores
That leaves 68% between
these two scores
z-scores & Normal Distribution
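Question 2 follows the same pattern: the proportion between two scores is the difference of two cumulative proportions. A minimal sketch, again using the standard library in place of the Unit Normal Table:

```python
from statistics import NormalDist

sra = NormalDist(mu=50.0, sigma=10.0)

below_40 = sra.cdf(40)             # tail below z = -1.0
above_60 = 1 - sra.cdf(60)         # tail above z = +1.0
between = sra.cdf(60) - sra.cdf(40)

print(round(below_40, 4), round(above_60, 4), round(between, 4))
```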
SRA (Scientific Reasoning Assessment) (fictional)
• Based on normative data: Normal distr., μ = 50.0, σ = 10.0
z = (X - μ) / σ
X = Zσ + μ (transformation)
• Question 3a
• Suppose that Chandra took a
different reasoning assessment (the
RSE: Based on normative data,
Normal distr., μ= 100, σ = 15). She
received a 130 on the RSE.
Assuming that they are highly
positively correlated, what is the
equivalent score on the SRA?
z-scores & Normal Distribution
SRA (Scientific Reasoning Assessment) (fictional)
• Based on normative data: Normal distr., μ = 50.0, σ = 10.0
• Question 3a
• Suppose that Chandra took a different
reasoning assessment (the RSE: Based on
normative data, Normal distr., μ= 100,
σ = 15). She received a 130 on the
RSE. Assuming that they are highly
positively correlated, what is the
equivalent score on the SRA?
z = (X - μ) / σ (for RSE) = (130 - 100) / 15 = 30 / 15 = 2.0
X = Zσ + μ (for SRA) (transformation)
X = (2.0)10 + 50
X = 70
• We now know that the predicted score is equivalent only
if rRSE,SRA = 1.0.
z-scores & Normal Distribution
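The two-step conversion (raw score to z on one test, then z back to a raw score on the other) can be sketched as a small function. The name equivalent_score is hypothetical, and the result is the equivalent score only under the assumption of a perfect correlation between the two tests.

```python
def equivalent_score(x, mu_from, sigma_from, mu_to, sigma_to):
    """Map a score between two scales by matching z-scores (assumes r = 1.0)."""
    z = (x - mu_from) / sigma_from   # z = (X - mu) / sigma on the source test
    return z * sigma_to + mu_to      # X = Z * sigma + mu on the target test

# Chandra's 130 on the RSE (mu = 100, sigma = 15), expressed on the SRA scale.
sra_score = equivalent_score(130, 100, 15, 50, 10)
print(sra_score)  # 70.0
```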
SRA (Scientific Reasoning Assessment) (fictional)
• Based on normative data: Normal distr., μ = 50.0, σ = 10.0
• Question 3c
• Suppose that Chandra took a different
reasoning assessment (the RSE: Based
on normative data, Normal distr., μ=
100, σ = 15). She received a 130 on
the RSE. Assuming that they are
perfectly positively correlated, what is
the equivalent score on the SRA?
z = (X - μ) / σ (for RSE) = (130 - 100) / 15 = 30 / 15 = 2.0
X = Zσ + μ (for SRA) (transformation)
X = (2.0)10 + 50
X = 70
• What percent of those taking either
test will score below Chandra?
Know z = 2
From Unit Normal Table, p(z ≥ 2) = .0228
p(z < 2) = 1 - .0228 = .9772 ≈ 98%
z-scores & Normal Distribution
SRA (Scientific Reasoning Assessment) (fictional)
• Based on normative data: Normal distr., μ = 50.0, σ = 10.0
• Question 3b
• Suppose that Chandra took a different
reasoning assessment (the RSE: Based on
normative data, Normal distr., μ= 100,
σ = 15). She received a 130 on the
RSE. Assuming that they are highly
positively correlated, what is the
equivalent score on the SRA?
• If rRSE,SRA = .8, what is our best
estimate of her actual score?
z = (X - μ) / σ (for RSE) = 2.0
zy = r zx so zSRA = .8 * 2 = 1.6
X = Zσ + μ (for SRA) (transformation)
X = (1.6)10 + 50
X = 66
• More on this later in course
z-scores, Normal Distribution,
& Correlation
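When the correlation is less than perfect, the predicted z-score shrinks toward the mean by a factor of r. The sketch below applies zy = r zx to Chandra's score; the function name predicted_score is made up for this example, and r = .8 is the value given in Question 3b.

```python
def predicted_score(x, mu_x, sigma_x, mu_y, sigma_y, r):
    """Best estimate on scale Y from a score on scale X, using z_y = r * z_x."""
    z_x = (x - mu_x) / sigma_x
    z_y = r * z_x                 # prediction shrinks toward the mean when r < 1
    return z_y * sigma_y + mu_y

estimate = predicted_score(130, 100, 15, 50, 10, r=0.8)
perfect = predicted_score(130, 100, 15, 50, 10, r=1.0)

print(round(estimate, 1), round(perfect, 1))  # 66.0 70.0
```

With r = 1.0 the function reduces to the straight z-score conversion from Question 3a.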
• In lab: continue to review, including SPSS
• Questions?
Wrap up