Inferential statistics 3
Maarten Buis
16/1/2006
outline
• recap computer lab
• significance of a correlation
• significance of a regression coefficient
• confidence interval
One, independent, or paired sample t-test
• If we compare a mean in one sample to a fixed value, then we do a one sample t-test.
• If we compare the means of one variable between two samples, then we do an independent sample t-test.
• If we compare the means of two variables asked of the same persons, then we do a paired sample t-test.
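The three cases can be sketched in a few lines of Python. This is only an illustration: the function names and the small data sets are made up, and the independent sample version assumes equal variances in both groups.

```python
import math
import statistics


def one_sample_t(sample, fixed_value):
    """t statistic and df for comparing a sample mean with a fixed value."""
    n = len(sample)
    se = statistics.stdev(sample) / math.sqrt(n)
    return (statistics.mean(sample) - fixed_value) / se, n - 1


def independent_t(sample_a, sample_b):
    """t statistic and df for two independent samples (equal variances assumed)."""
    na, nb = len(sample_a), len(sample_b)
    pooled_var = ((na - 1) * statistics.variance(sample_a)
                  + (nb - 1) * statistics.variance(sample_b)) / (na + nb - 2)
    se = math.sqrt(pooled_var * (1 / na + 1 / nb))
    diff = statistics.mean(sample_a) - statistics.mean(sample_b)
    return diff / se, na + nb - 2


def paired_t(first, second):
    """t statistic and df for two variables measured on the same persons."""
    differences = [a - b for a, b in zip(first, second)]
    # A paired test is a one sample test on the differences against 0.
    return one_sample_t(differences, 0)


# Made-up illustration data, not taken from the lecture.
t_one, df_one = one_sample_t([2, 4, 6, 8, 10], 5)
t_ind, df_ind = independent_t([2, 4, 6, 8, 10], [1, 3, 5, 7, 9])
t_pair, df_pair = paired_t([5, 7, 6, 9, 8], [4, 5, 5, 6, 5])
print(t_one, df_one, t_ind, df_ind, t_pair, df_pair)
```

The paired version reuses the one sample function because it tests whether the mean of the within-person differences is zero.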
one sample t-test
One-Sample Test (Test Value = 85)
Variable: age (age at day of interview)

t                   -80,062
df                  2704
Sig. (2-tailed)     ,000
Mean Difference     -14,42551
95% Confidence Interval of the Difference
  Lower             -14,7788
  Upper             -14,0722
independent sample t-test
Independent Samples Test
Variable: age (age at day of interview)

Levene's Test for Equality of Variances
  F       ,014
  Sig.    ,906

t-test for Equality of Means
                                     Equal variances    Equal variances
                                     assumed            not assumed
t                                    -,105              -,105
df                                   2703               2674,378
Sig. (2-tailed)                      ,917               ,917
Mean Difference                      -,03781            -,03781
Std. Error Difference                ,36091             ,36089
95% Confidence Interval of the Difference
  Lower                              -,74550            -,74547
  Upper                              ,66988             ,66985
paired sample t-test
Paired Samples Test
Pair 1: nwirtot (tot inst recei <0..44>) - nwertot (tot emot recei <0..44>)

Paired Differences
  Mean                -6,39412
  Std. Deviation      8,53306
  Std. Error Mean     ,16699
  95% Confidence Interval of the Difference
    Lower             -6,72157
    Upper             -6,06667
t                     -38,289
df                    2610
Sig. (2-tailed)       ,000
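As a check on how the numbers in the paired samples output hang together, the standard error and the t-value can be recomputed from the reported mean and standard deviation of the differences. This sketch assumes that the number of pairs is df + 1; decimal commas are written as decimal points.

```python
import math

# Values copied from the paired samples output.
mean_diff = -6.39412
sd_diff = 8.53306
df = 2610
n = df + 1  # assumption: number of pairs is df + 1

se = sd_diff / math.sqrt(n)  # standard error of the mean difference
t = mean_diff / se           # test statistic for H0: mean difference = 0

print(round(se, 5), round(t, 3))
```

Small deviations from the printed output in the last decimal are to be expected, because the inputs are themselves rounded.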
One sided vs. two sided
• Look up different critical values in
Appendix B, table 2
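The difference between the two kinds of critical value can be illustrated with the standard normal distribution, which the t-distribution approaches when the degrees of freedom are large. This is an approximation chosen for the sketch, not the table from the appendix; for small samples the tabulated t-values are larger.

```python
from statistics import NormalDist

alpha = 0.05
std_normal = NormalDist()  # mean 0, standard deviation 1

# Two sided: alpha is split over both tails.
two_sided_critical = std_normal.inv_cdf(1 - alpha / 2)

# One sided: all of alpha sits in one tail, so the critical value is smaller.
one_sided_critical = std_normal.inv_cdf(1 - alpha)

print(round(two_sided_critical, 3), round(one_sided_critical, 3))
```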
Reporting test results
• Specify H0, HA, and α.
• H0 is the hypothesis you want to reject
• report the test statistic (in this case the t-value), the degrees of freedom if applicable, and the p-value.
• report your decision (reject or not reject H0)
Tests for correlation coefficients
• Correlation coefficient can range between
-1 and 1.
• The sampling distribution can’t be
symmetric if the real correlation is close to
either 1 or -1.
• The sampling distribution is symmetric and
approximately normal if the real correlation
is zero.
sampling distribution of correlation coefficients
[Figure: two histograms of frequency against observed correlation, one panel for a real correlation of .91 and one for a real correlation of 0; 100,000 samples of 25 observations each]
Test for correlation coefficient
• If you are testing an H0 that the population correlation ρ is 0, then you can assume normality of the sampling distribution. Otherwise you can’t.
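$t = \frac{r_{obs} - \rho}{se}, \qquad se = \sqrt{\frac{1 - r_{obs}^2}{N - 2}}$
With $\rho = 0$ under H0 this becomes
$t = \frac{r_{obs}}{\sqrt{\frac{1 - r_{obs}^2}{N - 2}}}$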
• t only depends on observed r and N
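Because t depends only on the observed correlation and N, it is easy to compute by hand or in code. A minimal sketch, with a hypothetical function name and made-up input values:

```python
import math


def correlation_t(r_obs, n):
    """t statistic and df for testing H0: population correlation = 0."""
    se = math.sqrt((1 - r_obs ** 2) / (n - 2))
    return r_obs / se, n - 2


# Made-up example: an observed correlation of .5 in a sample of 25.
t_value, df = correlation_t(0.5, 25)
print(round(t_value, 3), df)
```

The resulting t-value is compared with the critical value for N - 2 degrees of freedom.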
Test for correlation
• You have to normalize the correlation if the value under H0 is not 0 or when testing differences between correlations
• Fisher's z-transformation, see appendix 2 table D
• The sampling distribution of the transformed correlation coefficient will be approximately normally distributed with a standard error of
$\frac{1}{\sqrt{N - 3}}$
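Instead of looking up the transformation in a table, it can be computed directly, since Fisher's z is the inverse hyperbolic tangent of the correlation. The sketch below tests an observed correlation against a non-zero H0 value; the function name and the example numbers are made up for illustration.

```python
import math
from statistics import NormalDist


def fisher_z_test(r_obs, rho_0, n):
    """z statistic and two sided p-value for H0: population correlation = rho_0."""
    z_obs = math.atanh(r_obs)   # Fisher's z of the observed correlation
    z_0 = math.atanh(rho_0)     # Fisher's z of the value under H0
    se = 1 / math.sqrt(n - 3)   # standard error of the transformed correlation
    z_stat = (z_obs - z_0) / se
    p_two_sided = 2 * (1 - NormalDist().cdf(abs(z_stat)))
    return z_stat, p_two_sided


# Made-up example: observed r = .6, H0 value .3, N = 28 (so se = 0.2).
z_stat, p_value = fisher_z_test(0.6, 0.3, 28)
print(round(z_stat, 3), round(p_value, 3))
```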
Significance testing in regression
Coefficients (a)
Model 1
                                 Unstandardized Coefficients    Standardized Coefficients
                                 B            Std. Error        Beta                         t         Sig.
(Constant)                       4845,644     235,959                                        20,536    ,000
age (age at day of interview)    -33,002      3,317             -,205                        -9,950    ,000
a. Dependent Variable: incmid (household income in guilders)
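The t-values in the coefficients table are the unstandardized coefficients divided by their standard errors. A short check with the numbers from the table (decimal commas written as decimal points):

```python
# (B, Std. Error) pairs copied from the coefficients table.
coefficients = {
    "(Constant)": (4845.644, 235.959),
    "age": (-33.002, 3.317),
}

# t = B / Std. Error for each coefficient.
t_values = {name: b / se for name, (b, se) in coefficients.items()}

for name, t in t_values.items():
    print(name, round(t, 3))
```

The recomputed values can differ from the table in the last decimal, because B and the standard error are rounded in the output.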
confidence intervals
• Until now we have made decisions about whether or not to reject the H0
• Sometimes we are more interested in a “good guess” about the mean in the population.
• The mean in the sample is our “best guess”
• But we can also make an interval of “good guesses”
• a small interval means a precise estimate, and a wide interval a less precise estimate
Confidence interval
• What is a “good” interval?
• A 95% confidence interval will contain the
true population parameter in 95% of all the
times it is computed.
• We are not 95% sure that the true value lies
in that interval.
• The confidence we have in the confidence
interval stems from the quality of the
procedure we have used.
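The statement that the procedure captures the true value 95% of the time can be illustrated with a small simulation. Everything in this sketch is an assumption made for illustration: a normally distributed population with a chosen mean and standard deviation, samples of 19 observations, and the two sided critical t-value 2.101 for 18 degrees of freedom.

```python
import math
import random
import statistics

T_CRIT_DF18 = 2.101  # two sided 5% critical t-value for df = 18


def coverage(true_mean=250.0, true_sd=100.0, n=19, repetitions=4000, seed=1):
    """Share of simulated 95% confidence intervals that contain the true mean."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(repetitions):
        sample = [rng.gauss(true_mean, true_sd) for _ in range(n)]
        mean = statistics.mean(sample)
        se = statistics.stdev(sample) / math.sqrt(n)
        lower = mean - T_CRIT_DF18 * se
        upper = mean + T_CRIT_DF18 * se
        if lower <= true_mean <= upper:
            hits += 1
    return hits / repetitions


share_covering = coverage()
print(share_covering)
```

Each individual interval either contains the true mean or it does not; the 95% describes how often the procedure succeeds over many samples.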
Data: rents of rooms
room      rent      room      rent
room 1    175       room 11   240
room 2    180       room 12   250
room 3    185       room 13   250
room 4    190       room 14   280
room 5    200       room 15   300
room 6    210       room 16   300
room 7    210       room 17   310
room 8    210       room 18   325
room 9    230       room 19   620
room 10   240
confidence interval for mean rent
$lb = \bar{x} - se \times t_{\alpha}$
$ub = \bar{x} + se \times t_{\alpha}$
• N=19, so df = 18
• look up the two sided critical t-value in Appendix B, table 2: 2.101
• mean is 258, s = 99, so $se = \frac{99}{\sqrt{19}} = 22.7$
• lb = 258 - 22.7*2.101 = 210
• ub = 258 + 22.7*2.101 = 306
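The same interval can be reproduced from the 19 rents with a few lines of Python. The critical value 2.101 is the one from the table lookup; the variable names are only for illustration.

```python
import math
import statistics

# The 19 room rents from the data slide.
rents = [175, 180, 185, 190, 200, 210, 210, 210, 230, 240,
         240, 250, 250, 280, 300, 300, 310, 325, 620]

t_crit = 2.101  # two sided critical t-value for df = 18

n = len(rents)
mean = statistics.mean(rents)
s = statistics.stdev(rents)   # sample standard deviation (divides by N - 1)
se = s / math.sqrt(n)         # standard error of the mean

lb = mean - se * t_crit
ub = mean + se * t_crit
print(round(mean), round(s), round(lb), round(ub))
```

The unrounded standard error is slightly larger than the 22.7 obtained from the rounded s = 99, but the bounds still round to 210 and 306.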