Statistics and Research methods
Mathematics for HMI (Wiskunde voor HMI)
Session 3
Relating statistics and
experimental design
Contents
Multiple regression
Inferential statistics
Basic research designs
Hypothesis testing
– Learn to select the appropriate statistical test in a particular research design
Multiple Regression

Multiple correlation
– The association between a criterion variable and two or more predictor variables
Multiple regression
– Making predictions with two or more predictor variables
Multiple Regression

Multiple regression prediction models
– Each predictor variable has its own regression coefficient
– e.g., the Z-score multiple regression formula with three predictor variables:
  Ẑ_Y = (β1)(Z_X1) + (β2)(Z_X2) + (β3)(Z_X3)
  where β1, β2 and β3 are the standardized regression coefficients
Multiple Regression



Note: the betas are not the same as the correlation coefficients for each predictor variable (because predictors “overlap”)
The standardized regression coefficient (beta) of a variable reflects the unique, distinctive contribution of that variable (overlap with the other predictors excluded)
There is also a corresponding raw-score prediction formula for multiple regression:
Ŷ = a + (b1)(X1) + (b2)(X2) + (b3)(X3)
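A minimal sketch of how the two prediction formulas above could be applied. The function names and all coefficient values are hypothetical and chosen only for illustration; they do not come from the slides or from any fitted model.

```python
def predict_z(betas, z_scores):
    """Z-score formula: predicted Z_Y = sum of beta_i * Z_Xi."""
    if len(betas) != len(z_scores):
        raise ValueError("need one Z score per standardized coefficient")
    return sum(beta * z for beta, z in zip(betas, z_scores))


def predict_raw(a, bs, xs):
    """Raw-score formula: predicted Y = a + sum of b_i * X_i."""
    if len(bs) != len(xs):
        raise ValueError("need one raw score per regression coefficient")
    return a + sum(b * x for b, x in zip(bs, xs))


# Hypothetical coefficients and scores for three predictors.
z_hat = predict_z(betas=(0.5, 0.3, 0.2), z_scores=(1.0, -1.0, 2.0))
y_hat = predict_raw(a=2.0, bs=(1.5, 0.5, 2.0), xs=(2.0, 4.0, 1.0))
print(round(z_hat, 2), y_hat)
```

Both functions are the same weighted sum; the raw-score version only adds the regression constant a.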
Multiple correlation coefficient




R
In SPSS output: Multiple R
R is usually smaller than the sum of the individual correlation coefficients from the bivariate regressions (because the predictors overlap)
R² is the proportionate reduction in error = proportion of variance accounted for
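As an illustration of "proportionate reduction in error", the sketch below compares the squared error when predicting from the mean with the squared error when predicting from a regression model. The function name and the example numbers are hypothetical, and the result equals R² only when the predictions come from the fitted least-squares regression.

```python
def proportionate_reduction_in_error(y, y_hat):
    """(SS_total - SS_error) / SS_total for observed y and predicted y_hat."""
    mean_y = sum(y) / len(y)
    # Error when every score is predicted from the mean.
    ss_total = sum((value - mean_y) ** 2 for value in y)
    # Error when every score is predicted from the regression model.
    ss_error = sum((value - pred) ** 2 for value, pred in zip(y, y_hat))
    return (ss_total - ss_error) / ss_total


# Hypothetical criterion scores and model predictions.
observed = [1.0, 2.0, 3.0, 4.0, 5.0]
predicted = [1.5, 2.0, 3.0, 4.0, 4.5]
print(proportionate_reduction_in_error(observed, predicted))
```

Perfect predictions give 1 and predicting the mean for everyone gives 0, which matches the reading of R² as the proportion of variance accounted for.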
Research
Example
Inferential Statistics

Make decisions about populations based on
information in samples (as opposed to
descriptive statistics, which summarize the
attributes of known data)

Notations in statistical test theory
Sample and population
                      Population Parameter          Sample Statistic
Basis                 Scores of entire population   Scores of sample only
Specificity           Usually unknown               Computed from data
Symbols
  Mean                μ                             M
  Standard Deviation  σ                             SD
  Variance            σ²                            SD²
The Normal Distribution (Z-scores)

Normal curve and percentage of scores
between the mean and 1 and 2 standard
deviations from the mean
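The percentages for the normal curve can be checked with a short calculation. This is a sketch using Python's standard library; the helper name is an assumption made for this example.

```python
from statistics import NormalDist

standard_normal = NormalDist()  # mean 0, standard deviation 1


def pct_between_mean_and(z):
    """Percentage of scores between the mean and z standard deviations above it."""
    return (standard_normal.cdf(z) - 0.5) * 100


for z in (1, 2):
    print(f"mean to +{z} SD: {pct_between_mean_and(z):.2f}%")
```

By symmetry the same percentages hold below the mean, so roughly 68% of scores fall within 1 SD and roughly 95% within 2 SD of the mean.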
Basic research methods

Experimental method
– Manipulation of variables and measurement of effects
Field studies (observation)
– No outside intervention, e.g. ethnography
Quasi-experimental method
– Combination of elements of the other two
We concentrate on experiments and quasi-experiments
Experimental method




Manipulation of (levels of) one or more independent variables (e.g. medication: pill or placebo; different versions of a user interface)
→ experimental conditions
Control (keep constant) other possibly intervening variables
Measure dependent variables (e.g. effectiveness, performance, satisfaction)
Test for differences between the conditions
Experimental design
How to assign subjects to conditions?

Between-subjects design
– A subject is assigned to only one of the conditions
Within-subjects design or repeated measures design
– Each subject receives all the experimental conditions
Between-subjects design


Randomization: assign subjects at random to the different conditions
Matching: random assignment, but controlling for a variable that is expected to be very relevant
Example (if sex is important): separately
– assign men to experimental groups
– assign women to experimental groups
Result: men and women are evenly distributed over the conditions.
“The subjects in each condition were matched on sex”
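One possible way to carry out matched random assignment is sketched below. The function name, the (identifier, sex) tuple layout and the condition labels are assumptions made for this example.

```python
import random


def matched_assignment(subjects, conditions, seed=None):
    """Randomly assign subjects to conditions separately within each sex.

    subjects: list of (subject_id, sex) tuples.
    Returns a dict mapping subject_id to a condition.
    """
    rng = random.Random(seed)
    by_sex = {}
    for subject_id, sex in subjects:
        by_sex.setdefault(sex, []).append(subject_id)

    assignment = {}
    for members in by_sex.values():
        rng.shuffle(members)  # random order within this group
        for index, subject_id in enumerate(members):
            # Deal the shuffled members out over the conditions in turn.
            assignment[subject_id] = conditions[index % len(conditions)]
    return assignment


# Hypothetical sample: four men and four women, two conditions.
sample = [(f"s{i}", "m" if i < 4 else "f") for i in range(8)]
print(matched_assignment(sample, ["pill", "placebo"], seed=1))
```

With group sizes that are a multiple of the number of conditions, every condition ends up with the same number of men and the same number of women.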
Between-subjects design
(continued)

Matched pairs
– Two subjects that are similar (on relevant variable(s)) are assigned to different conditions
Randomized blocks design
– Extension of matched pairs to more than two conditions, e.g. 3 conditions
– Form blocks of 3 similar subjects
– Assign the subjects in one block randomly to different conditions
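The block procedure can be sketched as follows. This assumes that "similar" means adjacent after sorting on a single matching score; the function name and the example scores are hypothetical.

```python
import random


def randomized_blocks(matching_scores, conditions, seed=None):
    """Form blocks of similar subjects and randomize conditions within each block.

    matching_scores: dict mapping subject -> score on the matching variable.
    Returns a dict mapping subject -> condition.
    """
    rng = random.Random(seed)
    block_size = len(conditions)
    ordered = sorted(matching_scores, key=matching_scores.get)
    if len(ordered) % block_size != 0:
        raise ValueError("number of subjects must be a multiple of the number of conditions")

    assignment = {}
    for start in range(0, len(ordered), block_size):
        block = ordered[start:start + block_size]  # neighbours in score order
        shuffled_conditions = list(conditions)
        rng.shuffle(shuffled_conditions)
        for subject, condition in zip(block, shuffled_conditions):
            assignment[subject] = condition
    return assignment


# Hypothetical matching scores for six subjects, three conditions.
scores = {"s1": 12, "s2": 30, "s3": 14, "s4": 29, "s5": 13, "s6": 31}
print(randomized_blocks(scores, ["A", "B", "C"], seed=1))
```

With two conditions the same function produces matched pairs.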
Between-subjects design
(continued)

Factorial designs
– More than one independent variable
– Study the separate effects of each variable (main effects) but also the interaction between variables
– Interaction effect: the impact of one variable depends on the level of the other variable
– Two-way factorial research design (two independent variables); three-way with three independent variables
– 2x2 if the independent variables have two levels (conditions) each, or 3x3 with three levels
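To make main effects and the interaction concrete, the sketch below works through a 2x2 design from its four cell means. All numbers and names are hypothetical and serve only as an illustration; whether such differences are significant is a separate question for a statistical test.

```python
# Hypothetical cell means for a 2x2 design (factor A: a1/a2, factor B: b1/b2).
cell_means = {
    ("a1", "b1"): 10.0, ("a1", "b2"): 14.0,
    ("a2", "b1"): 12.0, ("a2", "b2"): 22.0,
}


def level_mean(factor_index, level):
    """Mean of all cells that share one level of one factor (marginal mean)."""
    values = [mean for cell, mean in cell_means.items() if cell[factor_index] == level]
    return sum(values) / len(values)


# Main effects: difference between the marginal means of each factor.
main_effect_a = level_mean(0, "a2") - level_mean(0, "a1")
main_effect_b = level_mean(1, "b2") - level_mean(1, "b1")

# Interaction: does the effect of B differ between the levels of A?
effect_b_at_a1 = cell_means[("a1", "b2")] - cell_means[("a1", "b1")]
effect_b_at_a2 = cell_means[("a2", "b2")] - cell_means[("a2", "b1")]
interaction = effect_b_at_a2 - effect_b_at_a1

print(main_effect_a, main_effect_b, interaction)
```

An interaction value of zero would mean that B has the same effect at both levels of A.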
Within-subjects design


Same subjects in each experimental condition
Repeated measures design
– A within-subjects design is required if change is measured as a consequence of an experimental treatment (e.g. test scores before and after a training)
In other situations: carryover effects
– Experimental conditions need to be counterbalanced
– One half of the subjects gets sequence AB, the other half BA
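Counterbalancing two conditions could look like the sketch below. The function name and subject labels are assumptions made for this example.

```python
import random


def counterbalance(subjects, seed=None):
    """Randomly give one half of the subjects order AB and the other half BA."""
    rng = random.Random(seed)
    shuffled = list(subjects)
    rng.shuffle(shuffled)
    half = len(shuffled) // 2
    orders = {}
    for index, subject in enumerate(shuffled):
        orders[subject] = ("A", "B") if index < half else ("B", "A")
    return orders


# Hypothetical group of ten subjects.
print(counterbalance([f"s{i}" for i in range(10)], seed=1))
```

With an odd number of subjects the BA group gets the extra person.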
Quasi-experimental method

Combination of elements from experimental
methods and field research
Hypothesis Testing

H0: Null hypothesis – no difference
– The independent variable has no effect
– e.g. pill or placebo makes no difference
H1 (or Ha): Alternative hypothesis – significant difference
– The independent variable has an effect
Hypothesis Testing Errors

Type I Error:
– Null hypothesis is rejected but true. No effect, but you say there is.
– Alpha (α): probability of making a type I error
Type II Error:
– Null hypothesis is not rejected but false. Real effect, but you say there’s not.
– Beta (β): probability of making a type II error
Type I and II errors
              H0 Is True                H0 Is False
Reject H0     Type I error (α)          Right decision (1 − β)
Retain H0     Right decision (1 − α)    Type II error (β)
α usually 0.05 or 0.01
β usually 0.20
Statistical Power
Power: the probability that a test will correctly reject a false null hypothesis (1 − β)
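The slides give no formula for power. As a sketch under stated assumptions (a one-tailed Z test in which the true effect shifts the comparison distribution upward by a known number of standard errors), power can be computed as below. The function and parameter names are hypothetical.

```python
from statistics import NormalDist


def one_tailed_z_power(alpha, shift_in_standard_errors):
    """Power (1 - beta) of a one-tailed Z test for a given true shift."""
    standard_normal = NormalDist()
    z_cutoff = standard_normal.inv_cdf(1 - alpha)
    # Probability that the sample Z score exceeds the cutoff when the
    # true distribution is centred on the shift instead of on zero.
    return 1 - standard_normal.cdf(z_cutoff - shift_in_standard_errors)


print(round(one_tailed_z_power(0.05, 0.0), 3))   # no real effect: rejection rate equals alpha
print(round(one_tailed_z_power(0.05, 2.49), 3))  # a shift of about 2.49 standard errors
```

Under these assumptions a shift of roughly 2.49 standard errors is what it takes to reach the conventional β = 0.20 at α = 0.05.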
An Example of Hypothesis Testing



A person claims to be able to identify people of above-average intelligence (IQ) with her eyes closed
We devise a test: take her to a stadium full of randomly selected people from the population and ask her to pick, with her eyes closed, someone of above-average IQ.
If she does, we’ll be convinced. But she might pick someone with an above-average IQ just by chance.
Distribution of IQ Scores
Distribution of IQ scores is normal with M = 100 and SD = 15
IQ Score   Z Score   p
145        +3        .13%
130        +2        2%
115        +1        16%
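The p column is the chance of picking someone with at least that IQ from the population. A quick check with the standard library (the helper name is an assumption for this example):

```python
from statistics import NormalDist

iq_distribution = NormalDist(mu=100, sigma=15)


def pct_at_or_above(iq_score):
    """Percentage of the population scoring at or above iq_score."""
    return (1 - iq_distribution.cdf(iq_score)) * 100


for iq_score in (145, 130, 115):
    z = (iq_score - 100) / 15
    print(f"IQ {iq_score}  Z {z:+.0f}  p {pct_at_or_above(iq_score):.2f}%")
```

The exact values are about 0.13%, 2.28% and 15.87%; the table rounds the last two to 2% and 16%.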
So we set in advance a score by which we will be convinced.
% chance   Z score   IQ
2%         +2        130
1%         +2.33     135
5%         +1.64     124.6
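The cutoff scores can be derived from the chosen chance level by inverting the normal distribution. A minimal sketch, with a hypothetical function name:

```python
from statistics import NormalDist


def iq_cutoff(chance, mean=100.0, sd=15.0):
    """Return (Z cutoff, IQ cutoff) that only `chance` of the population exceeds."""
    z = NormalDist().inv_cdf(1 - chance)
    return z, mean + z * sd


for chance in (0.05, 0.02, 0.01):
    z, iq_score = iq_cutoff(chance)
    print(f"{chance:.0%}: Z = {z:+.2f}, IQ = {iq_score:.1f}")
```

The exact values differ slightly from the table, which uses rounded Z scores (for example Z = +2 for 2%, where the exact cutoff is about +2.05).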
The Hypothesis Testing Process
1. Restate the question as a research hypothesis and a null hypothesis about the populations
– Population 1
– Population 2
– Research hypothesis or alternative hypothesis
– Null hypothesis
The Hypothesis Testing Process
2. Determine the characteristics of the comparison distribution
– Comparison distribution: the distribution you would have if the null hypothesis were true.
The Hypothesis Testing Process
3. Determine the cutoff sample score on the comparison distribution at which the null hypothesis should be rejected
– Cutoff sample score
– Conventional levels of significance: p < .05, p < .01
The Hypothesis Testing Process
4. Determine your sample’s score on the comparison distribution
5. Decide whether to reject the null hypothesis
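Applied to the IQ example, steps 2 to 5 can be sketched as a one-tailed Z test on a single score. The function name and the IQ scores of the picked persons are hypothetical; the comparison distribution (M = 100, SD = 15) is taken from the example.

```python
from statistics import NormalDist


def single_score_z_test(score, pop_mean=100.0, pop_sd=15.0, alpha=0.01):
    """One-tailed Z test for a single score; returns (z, z_cutoff, reject_h0)."""
    # Step 2: the comparison distribution is normal with pop_mean and pop_sd.
    # Step 3: cutoff Z score for the chosen significance level.
    z_cutoff = NormalDist().inv_cdf(1 - alpha)
    # Step 4: the sample's score on the comparison distribution.
    z = (score - pop_mean) / pop_sd
    # Step 5: reject H0 only if the score is more extreme than the cutoff.
    return z, z_cutoff, z > z_cutoff


# Hypothetical outcomes: she picks someone with IQ 140, or someone with IQ 120.
for picked_iq in (140, 120):
    z, cutoff, reject = single_score_z_test(picked_iq)
    print(f"IQ {picked_iq}: Z = {z:.2f}, cutoff = {cutoff:.2f}, reject H0: {reject}")
```

Step 1 stays outside the code: the research hypothesis is that she can pick out people with above-average IQ, the null hypothesis that her picks are no better than chance.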
One-Tailed and Two-Tailed
Hypothesis Tests

Directional hypotheses
– One-tailed test
Nondirectional hypotheses
– Two-tailed test
Determining Cutoff Points With Two-Tailed Tests

Divide the significance level between the two tails (e.g. .05 becomes .025 in each tail)
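A sketch of how the split changes the Z cutoffs, using the standard library; the function names are assumptions made for this example.

```python
from statistics import NormalDist


def one_tailed_cutoff(alpha):
    """Z cutoff with the whole significance level in the upper tail."""
    return NormalDist().inv_cdf(1 - alpha)


def two_tailed_cutoffs(alpha):
    """Lower and upper Z cutoffs with alpha / 2 in each tail."""
    upper = NormalDist().inv_cdf(1 - alpha / 2)
    return -upper, upper


for alpha in (0.05, 0.01):
    low, high = two_tailed_cutoffs(alpha)
    print(f"alpha {alpha}: one-tailed {one_tailed_cutoff(alpha):.2f}, "
          f"two-tailed {low:.2f} and {high:.2f}")
```

The two-tailed cutoffs are more extreme than the one-tailed cutoff at the same significance level, because each tail holds only half of it.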