Lecture 24 - Interpersonal Research Laboratory

Practice
• As part of a program to reduce smoking, a
national organization ran an advertising
campaign to convince people to quit or reduce
their smoking. To evaluate the effectiveness
of their campaign, they had 15 subjects record
the average number of cigarettes smoked per
day in the week before and the week after
exposure to the advertisement. Determine if
the advertisements reduced their smoking
(Alpha = .05).
Subject   Before   After
1         45       43
2         16       20
3         20       17
4         33       30
5         30       25
6         19       19
7         33       34
8         25       28
9         26       23
10        40       41
11        28       26
12        36       40
13        15       16
14        26       23
15        32       34
Practice
• Dependent t-test
• t = .45
• Do not reject H0
• The advertising campaign did not significantly
reduce smoking
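The dependent t-test above can be checked with a short sketch. It uses the Before/After scores from the practice table; the one-tailed critical value of 1.761 (df = 14, alpha = .05) comes from a standard t table and is not on the slides, and the variable names are only illustrative.

```python
import math
import statistics

# Cigarettes per day for the 15 subjects (from the practice table).
before = [45, 16, 20, 33, 30, 19, 33, 25, 26, 40, 28, 36, 15, 26, 32]
after = [43, 20, 17, 30, 25, 19, 34, 28, 23, 41, 26, 40, 16, 23, 34]

# Difference scores: positive values mean the subject smoked less afterwards.
diffs = [b - a for b, a in zip(before, after)]
n = len(diffs)

mean_diff = statistics.mean(diffs)
sd_diff = statistics.stdev(diffs)      # sample SD (n - 1 in the denominator)
se_diff = sd_diff / math.sqrt(n)       # standard error of the mean difference

t_obs = mean_diff / se_diff
df = n - 1

# Assumption: one-tailed test, because the question asks about a reduction.
T_CRITICAL = 1.761
reject_h0 = t_obs > T_CRITICAL

print(f"t({df}) = {t_obs:.2f}, reject H0: {reject_h0}")
```

The observed t is well below the critical value, which matches the "do not reject H0" decision on the slide.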
Practice
• You wonder if there has been a significant
change (.05) in grading practices over the
years.
• In 1985 the grade distribution for the school
was:
• Grades in 1985
– A: 14%
– B: 26%
– C: 31%
– D: 19%
– F: 10%
• Grades last semester

Grade   Observed
A       32
B       61
C       64
D       31
F       12
Total   200
Step 1: State the Hypothesis
• H0: The data do fit the model
– i.e., Grades last semester are distributed the
same way as they were in 1985.
• H1: The data do not fit the model
– i.e., Grades last semester are not distributed the
same way as they were in 1985.
Step 2: Find χ² critical
• df = number of categories - 1
• df = 5 - 1 = 4
• α = .05
• χ² critical = 9.49
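The table value of 9.49 can be sanity-checked without a statistics library. The sketch below relies on the fact that, for df = 4, the upper-tail probability of the chi-square distribution has the closed form exp(-x/2) * (1 + x/2). The function name is hypothetical, and the formula applies to df = 4 only.

```python
import math

def chi_square_tail_df4(x):
    """Upper-tail probability P(chi-square > x) when df = 4.

    Valid for df = 4 only; other degrees of freedom need a different formula
    or a table lookup.
    """
    return math.exp(-x / 2) * (1 + x / 2)

# Area to the right of the critical value quoted on the slide.
tail = chi_square_tail_df4(9.49)
print(f"P(chi-square > 9.49 | df = 4) = {tail:.4f}")
```

The area to the right of 9.49 comes out very close to .05, which is what a critical value at alpha = .05 should give.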
Step 3: Create the data table

Grade   Observed   Expected Prop.
A       32         .14
B       61         .26
C       64         .31
D       31         .19
F       12         .10
Total   200
Step 4: Calculate the Expected Frequencies
• Expected Freq. = Expected Prop. × Total (e.g., .14 × 200 = 28)

Grade   Observed   Expected Prop.   Expected Freq.
A       32         .14              28
B       61         .26              52
C       64         .31              62
D       31         .19              38
F       12         .10              20
Total   200
Step 5: Calculate χ²
χ² = Σ (O − E)² / E
O = observed frequency
E = expected frequency

O     E     O − E   (O − E)²   (O − E)² / E
32    28     4      16         .57
61    52     9      81         1.55
64    62     2       4         .06
31    38    −7      49         1.29
12    20    −8      64         3.2
                         χ² =  6.67
Step 6: Decision
• Thus, if χ² > χ² critical
– Reject H0, and accept H1
• If χ² ≤ χ² critical
– Fail to reject H0
χ² = 6.67
χ² crit = 9.49
• 6.67 < 9.49, so fail to reject H0
Step 7: Put answer into words
• Fail to reject H0: the data do fit the model
• Grades last semester are not significantly different (α = .05)
from the way they were distributed in 1985.
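Steps 3 through 6 can be reproduced with a short sketch, using only the observed counts and 1985 proportions given above. Without intermediate rounding the statistic comes out at about 6.68 rather than the slide's 6.67, because the slide rounds each term before summing. The decision is the same either way. Variable names are illustrative.

```python
# Observed counts last semester and the 1985 proportions (from the slides).
observed = {"A": 32, "B": 61, "C": 64, "D": 31, "F": 12}
expected_prop = {"A": 0.14, "B": 0.26, "C": 0.31, "D": 0.19, "F": 0.10}

n = sum(observed.values())  # 200 grades in total

# Step 4: expected frequency = expected proportion * N
expected = {grade: prop * n for grade, prop in expected_prop.items()}

# Step 5: chi-square = sum over categories of (O - E)^2 / E
chi_square = sum(
    (observed[grade] - expected[grade]) ** 2 / expected[grade]
    for grade in observed
)

# Step 6: compare with the critical value for df = 4, alpha = .05
df = len(observed) - 1
CHI_SQUARE_CRITICAL = 9.49  # table value quoted on the slides
reject_h0 = chi_square > CHI_SQUARE_CRITICAL

print(f"chi-square({df}) = {chi_square:.2f}, reject H0: {reject_h0}")
```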
The Three Goals of this Course
• 1) Teach a new way of thinking
• 2) Self-confidence in statistics
• 3) Teach “factoids”
– e.g., formulas for the mean, r, and tobs = (X̄ − μ) / SX̄
What you have learned!
• Introduced to statistics and learned key
words
– Scales of measurement
– Populations vs. Samples
• Learned how to organize scores of one
variable using:
– frequency distributions
– graphs
– measures of central tendency
• Learned about the variability of
distributions
– range
– standard deviation
– variance
• Learned about combination statistics
– z-scores
– effect sizes
– box plots
• Learned about examining the relation
between two continuous variables
– correlation (expresses relationship)
– regression (predicts)
• Learned about probabilities
• Learned about the sampling distribution
– central limit theorem
– determine probabilities of sample means
– confidence intervals
• Learned about hypothesis testing
– using a t-test to see if the mean of a single
sample differs from a population value
• Extended hypothesis testing to two samples
– using a t-test to see if two means are
different from each other
• independent
• dependent
• Extended hypothesis testing to three or
more samples
– using an ANOVA to determine if three or
more means are different from each other
• Extended ANOVA to two or more IVs
– Factorial ANOVA
– Interaction
• Learned how to examine nominal variables
– Chi-Square test of independence
– Chi-Square test of goodness of fit
• CRN: 33496.0
Next Step
• Nothing new to learn!
• Just need to learn how to put it all together
Four Steps When Solving a
Problem
• 1) Read the problem
• 2) Decide what statistical test to use
• 3) Perform that procedure
• 4) Write an interpretation of the results
How do you know when to use
what?
• If you are given a word problem, would you
know which statistic you should use?
Example
• An investigator wants to predict a male
adult’s height from his length at birth. He
obtains records of both measures from a
sample of males.
Possible Answers
a. Independent t-test
b. Dependent t-test
c. One-Sample t-test
d. Goodness of fit Chi-Square
e. Independence Chi-Square
f. Confidence Interval
g. Correlation (Pearson r)
h. Scatter Plot
i. Line Graph
j. Frequency Polygon
k. Regression
l. Standard Deviation
m. Z-score
n. Mode
o. Mean
p. Median
q. Bar Graph
r. Range
s. ANOVA
t. Factorial ANOVA
Example
• An investigator wants to predict a male
adult’s height from his length at birth. He
obtains records of both measures from a
sample of males.
• Use regression
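A minimal sketch of what "use regression" means for this example. The slides give no data, so the birth lengths and adult heights below are hypothetical numbers made up for illustration. The slope and intercept use the usual least-squares formulas.

```python
# Hypothetical records (not from the lecture): length at birth and
# adult height, both in centimetres, for five males.
birth_length = [48, 50, 51, 53, 49]
adult_height = [170, 176, 178, 183, 173]

n = len(birth_length)
mean_x = sum(birth_length) / n
mean_y = sum(adult_height) / n

# Sum of cross-products and sum of squares for the predictor.
sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(birth_length, adult_height))
sxx = sum((x - mean_x) ** 2 for x in birth_length)

# Least-squares regression line: predicted Y = intercept + slope * X
slope = sxy / sxx
intercept = mean_y - slope * mean_x

def predict_height(length_at_birth):
    """Predict adult height from length at birth using the fitted line."""
    return intercept + slope * length_at_birth

print(f"predicted height = {intercept:.2f} + {slope:.2f} * birth length")
print(f"prediction for a 52 cm newborn: {predict_height(52):.1f} cm")
```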
Decision Tree
• First Question:
• Descriptive vs. Inferential
• Perhaps most difficult part
– Descriptive - a number or figure that
summarizes a set of data
– Inferential - use a sample to conclude
something about a population
• hint: these use confidence intervals or probabilities!
Decision Tree: Descriptive
• One or Two Variables
Decision Tree: Descriptive:
Two Variables
• Graph, Relationship, or Prediction
– Graph - visual display
– Relationship – Quantify the relation between two
continuous variables (CORRELATION)
– Prediction – Predict a score on one variable from a
score on a second variable (REGRESSION)
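To make the "Relationship" branch concrete, here is a sketch of Pearson r computed from deviation scores. The happiness and neuroticism scores are hypothetical values chosen only to show a negative relationship, and the function name is illustrative.

```python
import math

def pearson_r(x, y):
    """Pearson correlation between two equal-length lists of scores."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

# Hypothetical scores for five people (not from the lecture).
happiness = [2, 4, 5, 7, 9]
neuroticism = [22, 18, 15, 11, 6]

r = pearson_r(happiness, neuroticism)
print(f"r = {r:.2f}")  # negative: higher happiness goes with lower neuroticism
```

Correlation only quantifies the relationship. Predicting one score from the other is the regression branch.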
Decision Tree: Descriptive:
Two Variables: Graph
• Scatterplot vs. Line graph
– Scatterplot
– Line graph
• Both are used to show the relationship between two
variables (it is usually subjective which one is used)
Scatter Plot
[Figure: scatter plot of Neuroticism Score (y-axis, 0 to 25) against Happiness Score (x-axis, 0 to 10)]
Line Graph
[Figure: line graph of Neuroticism Score (y-axis, 0 to 25) against Happiness Score (x-axis, 0 to 10)]
Decision Tree: Descriptive:
One Variable
• Central Tendency, Variability, Z-Score, Graph
– Central Tendency – one score that represents an entire
group of scores
– Variability – indicates the spread of scores
– Z-Score – converts a score so that it conveys the score’s
relationship to the mean and SD of the other scores.
– Graph – Visual display
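A short sketch of the z-score conversion described above. The scores are hypothetical, and the sketch assumes the sample standard deviation (n - 1 in the denominator); a course that uses the population SD would swap in `statistics.pstdev`.

```python
import statistics

# Hypothetical set of scores (not from the lecture).
scores = [10, 12, 14, 16, 18]

mean = statistics.mean(scores)
sd = statistics.stdev(scores)  # assumption: sample SD

def z_score(x):
    """Number of standard deviations that x lies above or below the mean."""
    return (x - mean) / sd

for x in scores:
    print(f"score {x}: z = {z_score(x):+.2f}")
```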
Decision Tree: Descriptive:
One Variable: Central Tendency
• Mean, Median, Mode

           Mean   Median   Mode
Nominal    NO     NO       OK
Ordinal    NO     OK       OK
Interval   OK     OK       OK
Ratio      OK     OK       OK
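The table above can be turned into a small lookup, sketched here with Python's `statistics` module. The dictionary and function names are hypothetical, and the example data are made up.

```python
import statistics

# Which measures of central tendency are allowed for each scale (from the table).
ALLOWED = {
    "nominal": ["mode"],
    "ordinal": ["median", "mode"],
    "interval": ["mean", "median", "mode"],
    "ratio": ["mean", "median", "mode"],
}

MEASURES = {
    "mean": statistics.mean,
    "median": statistics.median,
    "mode": statistics.mode,
}

def central_tendency(values, scale):
    """Compute only the measures that make sense for the given scale."""
    return {name: MEASURES[name](values) for name in ALLOWED[scale]}

# Hypothetical data: majors are nominal, so only the mode is reported.
print(central_tendency(["Psychology", "Biology", "Psychology", "Math"], "nominal"))
# Hypothetical ratio data: all three measures are reported.
print(central_tendency([3, 5, 5, 8], "ratio"))
```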
Decision Tree: Descriptive:
One Variable: Variability
• Variance, Standard Deviation, Range/IQR
– Variance
– Standard Deviation
• Uses all of the scores to compute a measure of variability
– Range/IQR
• Only uses two scores to compute a measure of variability
• In general, variance and standard deviation are better to use as
measures of variability
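The point that the range uses only two scores can be seen in a quick sketch. Both data sets below are hypothetical. They share the same lowest and highest score, so the range cannot tell them apart, while the variance and standard deviation (sample formulas assumed) can.

```python
import statistics

# Two hypothetical sets of scores with the same minimum and maximum.
scores = [2, 4, 4, 5, 7, 9]
same_extremes = [2, 8, 8, 8, 8, 9]

def describe(values):
    """Range uses two scores; variance and SD use every score."""
    return {
        "range": max(values) - min(values),
        "variance": statistics.variance(values),  # sample variance (n - 1)
        "sd": statistics.stdev(values),           # sample SD (n - 1)
    }

print(describe(scores))
print(describe(same_extremes))
```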
Decision Tree: Descriptive:
One Variable: Graph
• Frequency Polygon, Histogram, Bar Graph
– Frequency Polygon
– Histogram
• Interchangeable graphs – both show frequency of continuous
variables
– Bar Graph
• Displays the frequencies of a qualitative (nominal) variable
Frequency Polygon
[Figure: frequency polygon of Neuroticism Score (x-axis, 8 to 38) with Frequency on the y-axis (0 to 20)]
Histogram
[Figure: histogram of Neuroticism Score (x-axis, 8 to 38) with Frequency on the y-axis (0 to 20)]
Bar Graph
[Figure: bar graph of Frequency (y-axis, 0 to 30) by Major (Biology, History, Math, Psychology, Sociology)]
Decision Tree: Inferential:
• Frequency Counts vs. Means w/ One IV vs. Means w/ Two or more
IVs vs. Confidence Interval
– Frequency Counts – data are in the form of qualitative (nominal) data
– Means w/ one IV – data can be computed into means (i.e., they are interval or
ratio) and there is only one IV
– Means w/ two or more IVs – data can be computed into means (i.e., they are
interval or ratio) and there are two or more IVs
– Confidence Interval – with some degree of certainty (usually 95%) you
establish a range around a mean
Decision Tree: Inferential:
Frequency Counts
• Goodness of Fit vs. Test of Independence
– Goodness of Fit – Used to determine if there is a good
fit between a qualitative theoretical distribution and the
qualitative data.
– Test of Independence – Tests to determine if two
qualitative variables are independent – that there is no
relationship.
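The goodness-of-fit test is worked through earlier in the lecture; the test of independence follows the same (O - E)² / E logic, except that the expected frequencies come from the row and column totals. The sketch below uses a hypothetical 2 × 2 table (the counts and labels are invented). The critical value 3.84 is the standard table value for df = 1 at alpha = .05.

```python
# Hypothetical 2 x 2 table of counts (rows: two groups, columns: yes / no).
observed = [
    [30, 20],
    [20, 30],
]

row_totals = [sum(row) for row in observed]
col_totals = [sum(col) for col in zip(*observed)]
n = sum(row_totals)

# Expected frequency for each cell = (row total * column total) / N
chi_square = 0.0
for i, row in enumerate(observed):
    for j, o in enumerate(row):
        e = row_totals[i] * col_totals[j] / n
        chi_square += (o - e) ** 2 / e

# df = (rows - 1) * (columns - 1)
df = (len(observed) - 1) * (len(observed[0]) - 1)
CHI_SQUARE_CRITICAL = 3.84  # table value for df = 1, alpha = .05
reject_h0 = chi_square > CHI_SQUARE_CRITICAL

print(f"chi-square({df}) = {chi_square:.2f}, reject H0: {reject_h0}")
```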
Decision Tree: Inferential: Means
with two or more IVs
– Factorial ANOVA
Decision Tree: Inferential: Means
with one IV
• One Sample, Two Samples, Three or more
– One Sample – Used to determine if a single sample is
different, >, or < than some value (usually a known
population mean; ONE-SAMPLE t-TEST)
– Two Samples – Used to determine if two samples are
different, >, or < than each other
– Three or more – Used to determine if three or more
samples are different from each other (ANOVA).
Decision Tree: Inferential: Means
with one IV: Two Samples
• Independent vs. Dependent
– Independent – there is no logical reason to pair
a specific score in one sample with a specific
score in the other sample
– Dependent (paired samples) – there is a logical reason to
pair specific scores (e.g., repeated measures,
matched pairs, natural pairs, etc.)
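The inferential branch of the decision tree can be summarized as a small function. This is only a sketch: the function and parameter names are hypothetical labels for the questions asked on the slides, and the confidence-interval branch is left out.

```python
def choose_inferential_test(data_type, n_variables=1, n_ivs=1, n_samples=1, paired=False):
    """Walk the inferential branch of the decision tree.

    data_type   -- "frequency" (nominal counts) or "means" (interval/ratio data)
    n_variables -- number of nominal variables (frequency branch only)
    n_ivs       -- number of independent variables (means branch)
    n_samples   -- number of samples or groups (means branch, one IV)
    paired      -- True if there is a logical reason to pair specific scores
    """
    if data_type == "frequency":
        if n_variables == 1:
            return "Chi-Square goodness of fit"
        return "Chi-Square test of independence"

    if data_type != "means":
        raise ValueError("data_type must be 'frequency' or 'means'")

    if n_ivs >= 2:
        return "Factorial ANOVA"
    if n_samples == 1:
        return "One-sample t-test"
    if n_samples == 2:
        return "Dependent t-test" if paired else "Independent t-test"
    return "ANOVA"

# The smoking problem: same subjects measured before and after.
print(choose_inferential_test("means", n_samples=2, paired=True))
# The grades problem: one nominal variable compared with the 1985 distribution.
print(choose_inferential_test("frequency", n_variables=1))
```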
Final Exam
• Test = 1 hour and 45 minutes
• 1:30 class is Mon, Dec 14 2:30-4:15
• 3:00 class is Tues, Dec 15 2:30-4:15
*Note about testing students
Cookbook
• Due: Final exam
• Early grade: Wednesday!