Quantitative approaches
Lesson 7: Statistical basics I

Plan
1. Types of analysis
2. Types of variables: nominal, ordinal, interval, metric
3. Measures of central tendency: mode, median, mean
4. Measures of variability: variance and standard deviation

Useful resources
Rice Virtual Lab in Statistics
http://onlinestatbook.com/rvls/index.html
http://www.socialresearchmethods.net/
1. Types of analysis

Types of analysis
- descriptive or inferential
- univariate, bivariate, multivariate

Descriptive vs. inferential analysis
"Descriptive analysis is about the data you have in hand. Inferential analysis involves making statements - inferences - about the world beyond the data you have in hand."
"When you say that the average age of a group of telephone survey respondents is 44.6 years, that's a descriptive analytic statement. When you say that there is a 95% statistical probability that the true mean of the population from which you drew your sample of respondents is between 42.5 and 47.5 years, that's an inferential statement. You infer something about the rest of the world from data in your sample."
(Bernard, 2000: 502)
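The difference between the two kinds of statement in the quote can be sketched in a few lines of Python. This is only an illustration: the ages are invented, the function name `mean_with_interval` is hypothetical, and the interval uses a normal approximation (for small samples a t-distribution would usually be preferred).

```python
from math import sqrt
from statistics import NormalDist, mean, stdev


def mean_with_interval(sample, confidence=0.95):
    """Return the sample mean and an approximate confidence interval for it."""
    n = len(sample)
    m = mean(sample)                           # descriptive: the data in hand
    standard_error = stdev(sample) / sqrt(n)   # stdev divides by n - 1
    z = NormalDist().inv_cdf(0.5 + confidence / 2)  # about 1.96 for 95%
    return m, (m - z * standard_error, m + z * standard_error)


# Hypothetical ages of ten survey respondents (not from the lesson).
ages = [23, 31, 38, 41, 44, 47, 52, 58, 63, 49]

m, (low, high) = mean_with_interval(ages)
print(f"Descriptive: the mean age in the sample is {m:.1f} years")
print(f"Inferential: the population mean is probably between "
      f"{low:.1f} and {high:.1f} years (95% level)")
```

The first printed line only summarises the sample; the second makes a claim about the population the sample was drawn from.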
Univariate, bivariate, multivariate
- univariate: uses 1 variable
- bivariate: uses 2 variables
- multivariate: uses 3 or more variables

"Univariate analysis involves getting to know data intimately by examining variables precisely and in detail. Bivariate analysis involves looking at associations between pairs of variables and trying to understand how those associations work. Multivariate analysis involves, among other things, understanding the effects of more than one independent variable at a time on a dependent variable."
(Bernard, 2000: 502)
Univariate, bivariate, multivariate: how to proceed
1. Look at the variables one by one: what is their range, mean, median, variance (is there variance?), distribution (univariate)
2. Inspect associations between pairs of variables. How does the independent variable "influence" the dependent variable? (bivariate)
3. Look at the associations of several variables simultaneously. How do two or more independent variables influence a dependent variable at the same time? (multivariate)

Multivariate analysis: questions to ask
1. How is a relationship between two variables changed if a third variable is controlled? (Multiple crosstabs, partial correlation, multiple regression, MANOVA)
2. What is the overall variance of a dependent variable that can be explained by several independent variables? What are the relative strengths of different predictors (independent variables)? (Multiple regression)
3. What groups of variables tend to correlate with each other, given a multitude of variables? (Factor analysis)
4. Which individuals tend to be similar concerning selected variables? (Cluster analysis)

Bivariate analysis: questions to ask
1. How big/important is the covariation? In other words, how much better could we predict the score of a dependent variable in our sample if we knew the score of some independent variable? Covariation coefficients answer this question.
2. Is the covariation statistically significant? Is it due to chance, or is it likely to exist in the overall population to which we want to generalize? Statistical tests answer this question.
3. What is its direction? (look at graphs)
4. What is its shape? Is it linear or nonlinear? (look at graphs)
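As a rough illustration of the first and third bivariate questions (size and direction), here is a minimal sketch that computes Pearson's r, one common covariation coefficient for two interval variables. The lesson does not name a specific coefficient, so this choice is an assumption, and the data and the name `pearson_r` are hypothetical.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson correlation coefficient of two equally long lists of numbers."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two lists of the same length (at least 2 values)")
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    # Sums of products and squares of the departures from the means.
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("r is undefined when a variable has no variance")
    return sxy / sqrt(sxx * syy)


# Hypothetical data: years of schooling (independent), income score (dependent).
schooling = [8, 10, 12, 12, 14, 16, 18]
income = [20, 24, 30, 28, 35, 41, 44]

r = pearson_r(schooling, income)
print(f"r = {r:.2f}")                       # size of the linear covariation
print("direction:", "positive" if r > 0 else "negative")
print(f"r squared = {r * r:.2f}")           # share of variance in common
```

The sign of r gives the direction and r squared gives the share of the dependent variable's variance that a linear prediction accounts for. Whether the shape is linear still has to be checked on a graph, and significance needs a separate statistical test.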
2. Types of variables: nominal, ordinal, interval, metric

Variables: nominal, ordinal, interval
Nominal: have no inherent order
  example: party preference, male-female
Ordinal: are ordered, but the distances are not quantifiable (we cannot add or subtract)
  example: agree a lot, agree a bit, disagree a bit, disagree a lot
Interval: can be measured numerically; it makes sense to do additions or subtractions
  example: height, weight, income, number of cars
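One practical consequence of the three levels is which summary makes sense for each. The sketch below is an illustration under assumptions, not part of the lesson: all data are invented, and `SCALE` and `ordinal_median` are hypothetical names. It counts nominal values, ranks ordinal values, and does arithmetic only on interval values.

```python
from statistics import mean, median_low, mode

# Hypothetical answers, invented for illustration only.
party = ["green", "red", "green", "blue", "green"]            # nominal
agreement = ["agree a bit", "disagree a lot", "agree a lot",
             "agree a bit", "disagree a bit"]                  # ordinal
cars = [0, 1, 1, 2, 3]                                         # interval

# The order of an ordinal scale has to be stated explicitly.
SCALE = ["disagree a lot", "disagree a bit", "agree a bit", "agree a lot"]


def ordinal_median(answers, scale):
    """Middle category of ordered answers (lower middle if n is even)."""
    ranks = [scale.index(answer) for answer in answers]
    return scale[median_low(ranks)]


print(mode(party))                        # nominal: only counting makes sense
print(ordinal_median(agreement, SCALE))   # ordinal: ranking makes sense
print(mean(cars))                         # interval: adding makes sense
```

The ranks are used only to find the middle category. They are never added or subtracted, in line with the definition of ordinal variables above.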
Levels of measurement and covariation: analysis
[Table: analysis technique by level of measurement of the dependent variable (columns) and of the independent variable (rows), each nominal, ordinal, or interval. Techniques listed: crosstabs, correlation, regression, ANOVA.]

3. Measures of central tendency: mode, median, mean
Definitions: mode, median, mean
Mode = value in the distribution of the variable that comes up most frequently
Median = value in the distribution that has 50% of the values "to its right" and 50% of the values "to its left"
Mean = sum of the values divided by n
Example: size of 11 dwarfs
Size of 11 dwarfs: 13, 7, 5, 12, 9, 15, 7, 11, 9, 7, 12 (cm)
Sorted: 5, 7, 7, 7, 9, 9, 11, 12, 12, 13, 15
Mode = 7
Median = 9
Mean ≈ 9.73

Calculating mean, mode, median
$$\text{mean} = \bar{y} = \frac{\sum y}{n} = \frac{5 + 7 + 7 + 7 + 9 + 9 + 11 + 12 + 12 + 13 + 15}{11} = \frac{107}{11} = 9.727273$$
median = 9, the 6th of the 11 sorted values: 5, 7, 7, 7, 9, [9], 11, 12, 12, 13, 15
mode = 7, the most frequent value: 5, [7, 7, 7], 9, 9, 11, 12, 12, 13, 15
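The same three values can be checked with Python's standard `statistics` module. This is a small sketch using the dwarf sizes from the example; the variable name `dwarfs` is just a label chosen here.

```python
from statistics import mean, median, mode

# Sizes of the 11 dwarfs in cm, as listed in the example.
dwarfs = [13, 7, 5, 12, 9, 15, 7, 11, 9, 7, 12]

print(sorted(dwarfs))           # sorting makes mode and median easy to see
print(mode(dwarfs))             # 7
print(median(dwarfs))           # 9
print(round(mean(dwarfs), 6))   # 9.727273
```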
4. Measures of variability: variance and standard deviation

Variance and standard deviation: definitions
Variance and standard deviation are measures of the "variability" of a variable. In other words: how much its values "vary" around the mean.

$$\text{mean} = \bar{y} = \frac{\sum y}{n}$$

Variance = the sum of the squares of the individual departures from the mean, divided by the degrees of freedom.

$$\text{variance} = s^2 = \frac{\text{sum of squares}}{\text{degrees of freedom}} = \frac{\sum (y - \bar{y})^2}{n - 1}$$

Standard deviation = the square root of the variance.

$$\text{standard deviation} = s = \sqrt{\frac{\sum (y - \bar{y})^2}{n - 1}}$$
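The two definitions translate almost word for word into code. The following is a minimal sketch; the function names are chosen here for illustration, and the data are the 11 dwarf sizes from the earlier example.

```python
from math import sqrt


def variance(values):
    """Sum of squared departures from the mean, divided by n - 1."""
    n = len(values)
    if n < 2:
        raise ValueError("variance needs at least two values")
    mean = sum(values) / n
    sum_of_squares = sum((y - mean) ** 2 for y in values)
    degrees_of_freedom = n - 1
    return sum_of_squares / degrees_of_freedom


def standard_deviation(values):
    """Square root of the variance."""
    return sqrt(variance(values))


# Sizes of the 11 dwarfs in cm, from the central tendency example.
dwarfs = [13, 7, 5, 12, 9, 15, 7, 11, 9, 7, 12]

print(f"variance = {variance(dwarfs):.2f} cm^2")
print(f"standard deviation = {standard_deviation(dwarfs):.2f} cm")
```

Python's built-in `statistics.variance` and `statistics.stdev` use the same n - 1 denominator, so they should give the same results.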
Example: Dwarfs in 3 gardens

Size of dwarfs in 3 gardens
Garden    A    B    C
          3    5    3
          4    5    3
          4    6    2
          3    7    1
          2    4   10
          3    4    4
          1    3    3
          3    5   11
          5    6    3
          2    5   10

mean(A) = $\bar{y}_A$ = 3    var(A) = $s_A^2$ = 1.3
mean(B) = $\bar{y}_B$ = 5    var(B) = $s_B^2$ = 1.3
mean(C) = $\bar{y}_C$ = 5    var(C) = $s_C^2$ = 14.2

Computing variance of dwarfs in garden C
$$\text{Var} = s^2 = \frac{\sum (y - \bar{y})^2}{n - 1}; \quad \bar{y} = 5$$
$$\text{Var}_C = \frac{(3-5)^2 + (3-5)^2 + (2-5)^2 + (1-5)^2 + (10-5)^2 + (4-5)^2 + (3-5)^2 + (11-5)^2 + (3-5)^2 + (10-5)^2}{10 - 1}$$
$$\text{Var}_C = \frac{(-2)^2 + (-2)^2 + (-3)^2 + (-4)^2 + 5^2 + (-1)^2 + (-2)^2 + 6^2 + (-2)^2 + 5^2}{9}$$
$$\text{Var}_C = \frac{4 + 4 + 9 + 16 + 25 + 1 + 4 + 36 + 4 + 25}{9} = \frac{128}{9} = 14.2$$

Showing differences between means and variance graphically with "boxplots"
Boxplot = graphical summary of the variability of a variable

Boxplots
[Figure: boxplot with labels for the 75% quartile, the median (50% quartile), and the 25% quartile]
Whiskers = highest and lowest data points that are not outliers or extreme values.

Outliers and extreme values in boxplots
Interquartile range = distance between the quartiles
Outliers = values that lie between 1.5 and 3 times the interquartile range beyond the box (above the 75% quartile or below the 25% quartile)
Extreme values = values that lie more than 3 times the interquartile range beyond the box
In boxplots, outliers and extreme values are represented by circles beyond the whiskers.
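The boxplot rules above can be sketched as a small function. Several things here are assumptions rather than part of the lesson: the name `boxplot_summary`, the invented sample, and the quartile convention (`method="inclusive"` in Python's `statistics.quantiles`; other software may compute quartiles slightly differently).

```python
from statistics import quantiles


def boxplot_summary(values):
    """Quartiles, whiskers, outliers and extreme values of a list of numbers."""
    data = sorted(values)
    q1, q2, q3 = quantiles(data, n=4, method="inclusive")
    iqr = q3 - q1                       # interquartile range

    def beyond_box(y):
        """Distance of y from the nearest edge of the box (0 if inside)."""
        if y > q3:
            return y - q3
        if y < q1:
            return q1 - y
        return 0.0

    extremes = [y for y in data if beyond_box(y) > 3 * iqr]
    outliers = [y for y in data if 1.5 * iqr < beyond_box(y) <= 3 * iqr]
    regular = [y for y in data if beyond_box(y) <= 1.5 * iqr]
    return {
        "q1": q1, "median": q2, "q3": q3, "iqr": iqr,
        "whiskers": (min(regular), max(regular)),
        "outliers": outliers,
        "extremes": extremes,
    }


# Hypothetical sample with one outlier (20) and one extreme value (40).
sample = [1, 2, 3, 4, 5, 6, 7, 8, 9, 20, 40]
for name, value in boxplot_summary(sample).items():
    print(f"{name}: {value}")
```

With this sample the box runs from 3.5 to 8.5, so the interquartile range is 5. The value 20 lies 11.5 above the box (between 7.5 and 15) and counts as an outlier. The value 40 lies 31.5 above the box (more than 15) and counts as an extreme value.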