Goal \ Scale of Measurement | Interval/ratio | Ordinal | Nominal
--- | --- | --- | ---
Describe one group | Mean, standard deviation | Median; interquartile range | Proportion
Compare one group to a hypothetical value | One-sample t-test; Z-test (if the true variance is known, or the sample size is very big) | Wilcoxon signed rank test | Chi-square test
Compare two paired groups | Paired t-test | Wilcoxon signed rank test | McNemar's test
Compare two unpaired groups | Unpaired/two-sample t-test | Mann-Whitney U test | Chi-square test
Compare three or more paired groups | ANOVA repeated measures | Friedman test | Cochran's Q
Compare three or more unpaired groups | ANOVA | Kruskal-Wallis | Chi-square test
Quantify association between 2 variables | Pearson correlation | Spearman correlation | Contingency coefficients
Predict value from another measured variable | Simple linear regression | |
Test: One-sample Z-test

When to use: Compare one group (mean) to a hypothetical value if the variance is known or the sample is very large. For ratio/interval data.

Null hypothesis: H0: $\mu = \mu_0$

Assumptions: The sample comes from a normally distributed population; random sampling; independent observations.

Formula:

$$Z_{calc} = \frac{\bar{x} - \mu_0}{\sigma / \sqrt{n}}$$

Where:
$\bar{x}$ - sample mean (if paired sample, the mean of the differences);
$\mu_0$ - hypothetical value under the null hypothesis;
$\sigma$ - population standard deviation (SD), or, for a paired sample, standard deviation of the differences; for very large samples it is approximated by the sample standard deviation;
$n$ - sample size

Reject H0 if:

$$|Z_{calc}| > |Z_{\alpha}|$$

Where $\alpha$ is the significance level (generally 0.05); | | stands for absolute value.

Test: One-sample or paired t-test

When to use: Compare one group to a hypothetical value, or two paired groups. For ratio/interval data.

Null hypothesis: H0: $\mu = \mu_0$

Assumptions: The sample comes from a normally distributed population; random sampling.

Formula:

$$T_{calc} = \frac{\bar{x} - \mu_0}{s / \sqrt{n}}$$

Where:
$\bar{x}$ - sample mean (if paired sample, the mean of the differences);
$\mu_0$ - hypothetical population mean value under the null hypothesis (if paired sample, typically zero);
$s$ - sample standard deviation (for a paired sample, SD of the differences);
$n$ - sample size

Reject H0 if:

$$|T_{calc}| > |T_{\alpha;\nu}|$$

I.e., reject if the absolute $T_{calc}$ is above the absolute critical value $|T_{\alpha;\nu}|$.
Where:
$\alpha$ - significance level;
$\nu$ - degrees of freedom: $\nu = n - 1$
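The two statistics above can be sketched in a few lines of Python. This is a minimal illustration using only the standard library, not a definitive implementation: the function names, the paired measurements, and the value `sigma=1.0` are all hypothetical. The helper `z_critical` assumes the two-tailed reading of the rejection rule (alpha/2 in each tail); the critical t value is left to a table.

```python
import math
from statistics import NormalDist, mean, stdev


def z_statistic(sample, mu0, sigma):
    """Z_calc = (x_bar - mu0) / (sigma / sqrt(n)); sigma is the known population SD."""
    n = len(sample)
    return (mean(sample) - mu0) / (sigma / math.sqrt(n))


def t_statistic(sample, mu0=0.0):
    """T_calc = (x_bar - mu0) / (s / sqrt(n)); returns (T_calc, degrees of freedom)."""
    n = len(sample)
    s = stdev(sample)  # sample SD, computed with n - 1 in the denominator
    return (mean(sample) - mu0) / (s / math.sqrt(n)), n - 1


def z_critical(alpha=0.05):
    """Two-tailed critical value (assumption: alpha/2 in each tail)."""
    return NormalDist().inv_cdf(1 - alpha / 2)


# Hypothetical paired measurements, for illustration only.
before = [12.0, 15.0, 11.0, 14.0, 13.0]
after = [14.0, 18.0, 12.0, 17.0, 15.0]

# For paired data the test is run on the differences, with mu0 = 0.
differences = [a - b for a, b in zip(after, before)]

t_calc, dof = t_statistic(differences, mu0=0.0)
z_calc = z_statistic(differences, mu0=0.0, sigma=1.0)  # sigma assumed known

print(f"T_calc = {t_calc:.3f} with {dof} degrees of freedom")
print(f"Z_calc = {z_calc:.3f}, reject H0: {abs(z_calc) > z_critical(0.05)}")
```

For the t-test, `abs(t_calc)` would be compared with the tabulated critical value for `dof` degrees of freedom.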
Test: Unpaired/two-sample t-test

When to use: Compares the means of two independent groups. For ratio/interval data.

Null hypothesis: H0: $\mu_1 = \mu_2$ or, equivalently, H0: $\mu_1 - \mu_2 = 0$

Assumptions: The samples come from normally distributed populations; both populations have identical variances (i.e. $\sigma_1 = \sigma_2$); random sampling; independent observations.

Formula:

$$T_{calc} = \frac{(\bar{x}_1 - \bar{x}_2) - (\mu_1 - \mu_2)_0}{S_P \sqrt{\frac{1}{n_1} + \frac{1}{n_2}}}$$

Where:
$S_P$ – pooled standard deviation, calculated by:

$$S_P = \sqrt{\frac{(n_1 - 1) S_1^2 + (n_2 - 1) S_2^2}{n_1 + n_2 - 2}}$$

and
$\bar{x}_1$ - sample mean of population 1
$\bar{x}_2$ - sample mean of population 2
$S_1$ – sample SD of population 1
$S_2$ – sample SD of population 2
$n_1$ – sample size for population 1
$n_2$ – sample size for population 2
$(\mu_1 - \mu_2)_0$ – hypothesized difference between the means of the populations (generally zero)

Reject H0 if:

$$|T_{calc}| > |T_{\alpha;\nu}|$$

I.e., reject if the absolute $T_{calc}$ is above the absolute critical value ($|T_{\alpha;\nu}|$).
Where:
$\alpha$ - significance level;
$\nu$ - degrees of freedom: $\nu = n_1 + n_2 - 2$

Test: Wilcoxon signed-rank test

When to use: Compare one group to a hypothetical value, or two paired groups. For ordinal data (or ratio/interval if the sample is small and the data are not normal).

Null hypothesis: H0: $M = M_0$ (where M represents the median)

Assumptions: Distribution symmetrical (does not need to be normal); random sampling.

Formula:
$T_+$ = sum of the ranks having a positive sign
$T_-$ = sum of the ranks having a negative sign

How to do it:
1 For paired data: for each data point, calculate the difference between the 2 groups. For comparison of one group to a hypothetical value: from each data point, subtract the hypothesized median value.
2 Rank the absolute differences, from smallest to largest (i.e., 1 for the smallest absolute difference).
3 Add the corresponding signs (+ for originally positive differences, - for negative ones).
4 Calculate $T_+$ and $T_-$. Note: if one of them is calculated, the other can be obtained from:

$$T_+ = \frac{n(n+1)}{2} - T_-$$

Where n is the sample size.

Reject H0 if: Choose $T_+$ or $T_-$ (whichever is smallest) and compare it with the critical value (from a table). Reject if it is below or equal to the critical value.
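As a rough illustration of the two procedures above, the following Python sketch computes the pooled-variance t statistic and the signed-rank sums. Function names and data are hypothetical. Two choices are assumptions the text does not address: zero differences are dropped before ranking, and tied absolute differences share their average rank.

```python
import math
from statistics import mean, stdev


def pooled_t_statistic(x1, x2, hypothesized_diff=0.0):
    """Return (T_calc, degrees of freedom) for the equal-variance two-sample t-test."""
    n1, n2 = len(x1), len(x2)
    s1, s2 = stdev(x1), stdev(x2)
    # Pooled standard deviation S_P.
    sp = math.sqrt(((n1 - 1) * s1**2 + (n2 - 1) * s2**2) / (n1 + n2 - 2))
    t = ((mean(x1) - mean(x2)) - hypothesized_diff) / (sp * math.sqrt(1 / n1 + 1 / n2))
    return t, n1 + n2 - 2


def signed_rank_sums(differences):
    """Return (T_plus, T_minus) following steps 2 to 4 of the procedure."""
    nonzero = [d for d in differences if d != 0]  # assumption: zeros are dropped
    ordered = sorted(nonzero, key=abs)            # rank 1 = smallest |difference|
    ranks = [0.0] * len(ordered)
    i = 0
    while i < len(ordered):
        j = i
        # Extend j over a run of tied absolute differences.
        while j + 1 < len(ordered) and abs(ordered[j + 1]) == abs(ordered[i]):
            j += 1
        average_rank = (i + j + 2) / 2  # mean of the 1-based ranks i+1 .. j+1
        for k in range(i, j + 1):
            ranks[k] = average_rank
        i = j + 1
    t_plus = sum(r for d, r in zip(ordered, ranks) if d > 0)
    t_minus = sum(r for d, r in zip(ordered, ranks) if d < 0)
    return t_plus, t_minus


# Hypothetical data, for illustration only.
group_a = [5.0, 7.0, 9.0]
group_b = [2.0, 4.0, 6.0]
t_calc, dof = pooled_t_statistic(group_a, group_b)

paired_differences = [3, -1, 4, -2, 5]
t_plus, t_minus = signed_rank_sums(paired_differences)

print(f"T_calc = {t_calc:.3f}, degrees of freedom = {dof}")
print(f"T+ = {t_plus}, T- = {t_minus}, test value = {min(t_plus, t_minus)}")
```

With five nonzero differences the two rank sums add up to 5(5+1)/2 = 15, which gives a quick sanity check on the ranking.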
Test: Mann-Whitney U test (also called Wilcoxon rank-sum, or Wilcoxon-…)

When to use: Compare two unpaired groups. For ordinal data (or ratio/interval if the sample is small and the data are not normal).

Null hypothesis: H0: $M_1 = M_2$ (where M represents the median)

Assumptions: Random sampling; independent observations.

Formula: [equations for $U_1$ and $U_2$ not recoverable]

Alternatively, one can be obtained from the other through:

$$U_2 = n_1 n_2 - U_1$$

Where:
$R_1$ – sum of the ranks for population 1
$R_2$ – sum of the ranks for population 2
$n_1$ – sample size for population 1
$n_2$ – sample size for population 2

Reject H0 if: Choose $U_1$ or $U_2$ (whichever is smallest) and compare it with the critical value (from a table). Reject if it is below or equal to the critical value.

Test: Chi-square test

When to use: Compares two or more groups. For nominal data.

Null hypothesis: H0: the variable represented in the rows is independent of the variable represented in the columns.

Assumptions: Random sampling; independent observations.

Formula: First put the data in the form of a contingency table, then calculate:

$$X^2_{calc} = \sum_{i=1}^{R} \sum_{j=1}^{C} \frac{(O_{ij} - E_{ij})^2}{E_{ij}}$$

Where:
$R$ – total number of rows;
$C$ – total number of columns;
$O_{ij}$ – number of observations (frequency) in row i, column j;
$E_{ij}$ – expected frequency in row i, column j, which is calculated as:

$$E_{ij} = \frac{R_i C_j}{n}$$

Where:
$R_i$ – sum of all the frequencies of row i (calculated as $\sum_{j=1}^{C} O_{ij}$);
$C_j$ – sum of all the frequencies of column j (calculated as $\sum_{i=1}^{R} O_{ij}$);
$n$ – sample size, i.e. the sum of all frequencies (calculated as $\sum_{i=1}^{R} \sum_{j=1}^{C} O_{ij}$)

Reject H0 if:

$$X^2_{calc} > X^2_{\alpha;\nu}$$

I.e., reject if the calculated $X^2$ is above the critical value ($X^2_{\alpha;\nu}$).
Where:
$\alpha$ - significance level;
$\nu$ - degrees of freedom: $\nu = (R - 1)(C - 1)$
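A possible Python sketch of both statistics follows. The text names $U_1$ and $U_2$ without a surviving definition, so the code assumes one common textbook form, $U_1 = n_1 n_2 + n_1(n_1+1)/2 - R_1$ and likewise for $U_2$; other texts use an equivalent form that gives the same smaller value. For the chi-square test, the symbols for observed and expected frequencies are illustrative, and the degrees of freedom are taken as (rows - 1)(columns - 1), the usual count for a test of independence. All names and data are hypothetical.

```python
def rank_sums(x1, x2):
    """Rank the pooled observations (1 = smallest, ties share the average rank)
    and return (R1, R2), the rank sums of each sample."""
    pooled = sorted([(v, 0) for v in x1] + [(v, 1) for v in x2])
    sums = [0.0, 0.0]
    i = 0
    while i < len(pooled):
        j = i
        # Extend j over a run of tied values.
        while j + 1 < len(pooled) and pooled[j + 1][0] == pooled[i][0]:
            j += 1
        average_rank = (i + j + 2) / 2  # mean of the 1-based ranks i+1 .. j+1
        for k in range(i, j + 1):
            sums[pooled[k][1]] += average_rank
        i = j + 1
    return sums[0], sums[1]


def mann_whitney_u(x1, x2):
    """Return (U1, U2) under the assumed textbook definition."""
    n1, n2 = len(x1), len(x2)
    r1, r2 = rank_sums(x1, x2)
    u1 = n1 * n2 + n1 * (n1 + 1) / 2 - r1
    u2 = n1 * n2 + n2 * (n2 + 1) / 2 - r2
    return u1, u2


def chi_square_statistic(table):
    """Return (X2_calc, degrees of freedom) for a contingency table given as rows."""
    row_totals = [sum(row) for row in table]
    column_totals = [sum(column) for column in zip(*table)]
    n = sum(row_totals)
    x2 = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * column_totals[j] / n
            x2 += (observed - expected) ** 2 / expected
    dof = (len(table) - 1) * (len(table[0]) - 1)
    return x2, dof


# Hypothetical data, for illustration only.
u1, u2 = mann_whitney_u([1.0, 2.0, 4.0], [3.0, 5.0, 6.0])
x2_calc, dof = chi_square_statistic([[10, 20], [20, 10]])

print(f"U1 = {u1}, U2 = {u2}, test value = {min(u1, u2)}")
print(f"X2_calc = {x2_calc:.3f}, degrees of freedom = {dof}")
```

Whatever definition is used, `u1 + u2` should equal the product of the two sample sizes, which is a convenient check.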
Test: ANOVA

When to use: Compares three or more groups. For ratio/interval data.

Null hypothesis: H0: $\mu_1 = \mu_2 = \mu_3$ …

Assumptions: The samples come from normally distributed populations; all populations have identical variances (i.e. $\sigma_1 = \sigma_2 = \sigma_3$ …); random sampling; independent observations.

Formula:

$$F_{calc} = \frac{MSTR}{MSE}$$

Where:
MSTR – mean square of the treatment
MSE – mean squared (residual) error

Calculated as:

$$MSTR = \frac{SSTR}{df_{TR}} \quad \text{and} \quad MSE = \frac{SSE}{df_E}$$

Where:
SSTR – sum of squares of the treatment
SSE – sum of squares of the (residual) error
$df_{TR}$ – degrees of freedom treatment;
$df_E$ – degrees of freedom (residual) error;

Calculated as:

$$SSTR = \sum_{i=1}^{g} n_i (\bar{x}_i - \bar{x})^2$$

$$SSE = \sum_{i=1}^{g} \sum_{j=1}^{n_i} (x_{ij} - \bar{x}_i)^2$$

$$df_{TR} = g - 1 \quad \text{and} \quad df_E = n - g$$

Where:
$x_{ij}$ - observation j in group i
$\bar{x}_i$ - mean of group i
$n_i$ - sample size of group i
$\bar{x}$ - total mean (averaging all values, independently of the group)
$g$ – total number of groups
$n$ - total sample size

Also:

$$SST = SSTR + SSE$$

Where SST is the total sum of squares:

$$SST = \sum_{i=1}^{g} \sum_{j=1}^{n_i} (x_{ij} - \bar{x})^2$$

Reject H0 if:

$$F_{calc} > F_{\alpha;(df_{TR};df_E)}$$

I.e., reject if $F_{calc}$ is above the critical value ($F_{\alpha;(df_{TR};df_E)}$).
Where:
$\alpha$ - significance level;
$df_{TR}$ – degrees of freedom treatment: $df_{TR} = g - 1$
$df_E$ – degrees of freedom (residual) error: $df_E = n - g$
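The sums of squares above translate fairly directly into code. The sketch below is a minimal standard-library illustration with hypothetical names and data; it returns the intermediate quantities so that the identity SST = SSTR + SSE can be checked. The critical F value is left to a table.

```python
from statistics import mean


def one_way_anova(groups):
    """Compute the one-way ANOVA quantities for a list of groups of observations."""
    g = len(groups)                                  # total number of groups
    n = sum(len(group) for group in groups)          # total sample size
    total_mean = sum(sum(group) for group in groups) / n
    group_means = [mean(group) for group in groups]

    # Treatment sum of squares: group means against the total mean.
    sstr = sum(len(group) * (m - total_mean) ** 2
               for group, m in zip(groups, group_means))
    # Residual sum of squares: observations against their own group mean.
    sse = sum((x - m) ** 2
              for group, m in zip(groups, group_means)
              for x in group)
    # Total sum of squares: observations against the total mean.
    sst = sum((x - total_mean) ** 2 for group in groups for x in group)

    df_tr, df_e = g - 1, n - g
    f_calc = (sstr / df_tr) / (sse / df_e)           # MSTR / MSE
    return {"SSTR": sstr, "SSE": sse, "SST": sst,
            "dfTR": df_tr, "dfE": df_e, "F": f_calc}


# Hypothetical data, for illustration only.
result = one_way_anova([[1.0, 2.0, 3.0], [2.0, 3.0, 4.0], [5.0, 6.0, 7.0]])

print(f"F_calc = {result['F']:.3f} with ({result['dfTR']}, {result['dfE']}) degrees of freedom")
print(f"SSTR + SSE = {result['SSTR'] + result['SSE']:.3f}, SST = {result['SST']:.3f}")
```

H0 would be rejected when `result["F"]` exceeds the tabulated critical value for the chosen significance level and the two degrees of freedom.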