INTRO TO SPSS
Part 1: Getting started
Nathan Walter
Annenberg School for Communication & Journalism
SPSS
• One of the most popular software packages for statistical analysis
• Originally tailored for the needs of social science scholars and practitioners
• Recent versions attempt to appeal to health sciences
• Very common in the industry!!!
What’s the type of scaling for the variable?
Scale Properties
Scale | Examples | Scale has an absolute zero? | Ratios equal (2/1 = 4/2); proportions appropriate | Interval between adjacent scale points same | Ordered | Arbitrary labeled categories | Mathematical operations (non-controversial use)
------|----------|------|------|------|------|------|------
Ratio | Age, number of sex acts | Y | Y | Y | Y | N | Division, Multiplication, Subtraction & Addition
Interval | Temperature (Fahrenheit) | N | N | Y | Y | N | Subtraction & Addition
Ordinal | Likert scales; ranked preferences | N | N | N | Y | N | Greater than or less than operations (for Likert and similar scales, Subtraction & Addition are typical)
Nominal | Gender, Race | N | N | N | N | Y | Counting
Measures of central tendency
What is the best way to describe the center of the distribution?
Mean – average
Median – middle
Mode – Most common
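As a minimal sketch, the three measures can be computed with Python's standard `statistics` module. The `ages` list is a made-up sample used only for illustration; it is not data from these slides.

```python
from statistics import mean, median, mode

# Hypothetical sample of ages (illustrative only, not from the slides)
ages = [22, 26, 26, 31, 37, 45, 58]

center_mean = mean(ages)      # arithmetic average of all scores
center_median = median(ages)  # middle value of the sorted scores
center_mode = mode(ages)      # most frequently occurring score

print(center_mean, center_median, center_mode)
```

The three values differ here because the sample has a tail of older ages, which pulls the mean above the median and the mode.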
With that said…
• Measures of central tendency describe the most “typical” value in the distribution, but they don’t tell us the whole story.
• For instance, the following two distributions have the same mean, mode, and median. Does it mean that they are the same?
[Figure: two frequency histograms (y-axis: Freq, x-axis: scores 1 to 4), one for the scores 3, 3, 3, 3 and one for the scores 4, 3, 3, 2]
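A quick check of the two distributions (3, 3, 3, 3 and 4, 3, 3, 2), sketched with Python's `statistics` module. The variable names are illustrative, and the spread is summarized with the population standard deviation (`pstdev`, dividing by N), which is an assumption made for this example.

```python
from statistics import mean, median, mode, pstdev

flat = [3, 3, 3, 3]    # every score identical
spread = [4, 3, 3, 2]  # same center, but scores vary

for scores in (flat, spread):
    # Center is identical for both; only the last number (spread) differs
    print(scores, mean(scores), median(scores), mode(scores), round(pstdev(scores), 2))
```

Both lists share a mean, median, and mode of 3, yet only the second one has any variability.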
Measures of dispersion
Range – the difference between the largest score in the distribution and the smallest score.
R = Max - Min
Interquartile range – the upper quartile minus the lower quartile.
IQR = Q3 - Q1
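A minimal sketch of both measures in Python. The `scores` list is hypothetical. Quartile conventions differ between packages; this example uses the default `exclusive` method of `statistics.quantiles`, so other software may report slightly different quartiles for the same data.

```python
from statistics import quantiles

# Hypothetical scores (illustrative only)
scores = [2, 4, 4, 5, 7, 9, 10, 12, 15]

data_range = max(scores) - min(scores)  # R = Max - Min

q1, q2, q3 = quantiles(scores, n=4)     # lower quartile, median, upper quartile
iqr = q3 - q1                           # IQR = Q3 - Q1

print(data_range, q1, q3, iqr)
```

The IQR ignores the most extreme quarter of scores on each side, so it is less sensitive to outliers than the range.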
Measures of dispersion
Variance – the mean of the squared deviations from the mean.
Standard deviation – the square root of the variance.
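Written out as formulas, under the assumption that the sum of squared deviations is divided by N (the population form, which matches the worked example in these slides that divides by 5). The symbols σ, N, and x̄ are notation chosen here, not taken from the slides; the sample form divides by N - 1 instead.

```latex
\sigma^2 = \frac{1}{N}\sum_{i=1}^{N}\left(x_i - \bar{x}\right)^2,
\qquad
\sigma = \sqrt{\sigma^2}
```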
With that said….
Group 1: 95, 96, 97, 98, 99
Group 2: 5, 6, 7, 8, 9
SD for group 1:
$$\sqrt{\frac{(99-97)^2 + (98-97)^2 + (97-97)^2 + (96-97)^2 + (95-97)^2}{5}} = \sqrt{\frac{4+1+0+1+4}{5}} \approx 1.41$$
SD for group 2:
$$\sqrt{\frac{(9-7)^2 + (8-7)^2 + (7-7)^2 + (6-7)^2 + (5-7)^2}{5}} = \sqrt{\frac{4+1+0+1+4}{5}} \approx 1.41$$
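The two hand calculations can be checked with a short Python sketch. `pstdev` divides by N, as the slide does; `stdev` divides by N - 1 and is shown only for comparison.

```python
from statistics import pstdev, stdev

group1 = [95, 96, 97, 98, 99]
group2 = [5, 6, 7, 8, 9]

# Population SD (divide by N = 5), matching the hand calculation
sd1 = pstdev(group1)
sd2 = pstdev(group2)

# Sample SD (divide by N - 1 = 4), for comparison only
sample_sd1 = stdev(group1)

print(round(sd1, 2), round(sd2, 2), round(sample_sd1, 2))
```

The groups sit 90 points apart, yet their standard deviations are identical: the SD describes spread around the mean, not where the mean is located.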
Output
Statistics
How old are you?

N               Valid      243
                Missing    0
Mean                       40.62
Median                     37.00
Mode                       26
Std. Deviation             14.931
Range                      66
Percentiles     25         28.00
                50         37.00
                75         52.00
Normal distribution
Distribution of IQ among students (N = 89)
Distribution for years of schooling (N = 27750)
Distribution of “resistance to change” (N = 154548)
Normal Distribution: Plots & Tests
• Kolmogorov-Smirnov test – compares the sample’s distribution with a normal distribution that has the same mean and standard deviation.
• Shapiro-Wilk test – the null hypothesis is that the sample comes from a normally distributed population (more power than Kolmogorov-Smirnov).
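To make the Kolmogorov-Smirnov idea concrete, here is a rough sketch of the test statistic D only: the largest gap between the sample's cumulative distribution and a normal curve with the sample's own mean and SD. The function name and both example samples are made up for illustration. No p-value is computed, and this is not a substitute for the test output of a statistics package.

```python
from statistics import NormalDist, mean, stdev

def ks_statistic_normal(data):
    """Largest distance between the empirical CDF and a fitted normal CDF."""
    xs = sorted(data)
    n = len(xs)
    reference = NormalDist(mean(xs), stdev(xs))  # normal with the sample's mean and SD
    d = 0.0
    for i, x in enumerate(xs, start=1):
        cdf = reference.cdf(x)
        # The empirical CDF jumps from (i - 1)/n to i/n at x; check both sides
        d = max(d, i / n - cdf, cdf - (i - 1) / n)
    return d

# Hypothetical samples: evenly spaced normal quantiles vs. a strongly skewed series
near_normal = [NormalDist(100, 15).inv_cdf((i - 0.5) / 50) for i in range(1, 51)]
skewed = [2 ** i for i in range(12)]

d_normal = ks_statistic_normal(near_normal)
d_skewed = ks_statistic_normal(skewed)
print(round(d_normal, 3), round(d_skewed, 3))
```

A small D means the sample tracks the normal curve closely; the skewed series produces a much larger D.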
Is there significant skew?
Why does this matter?
• Determining whether there is a significant skew:
z = Skew / Standard Error of Skew
* Is |z| greater than 3.29 (p < .001)?
If yes, conclude that the sample is apt to be from a population that is not normal.
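A minimal sketch of the skew check in Python. It assumes the bias-adjusted sample skewness and the standard-error formula that depends only on n, as commonly reported by statistics packages; the function name and the example data are hypothetical.

```python
from math import sqrt
from statistics import mean, stdev

def skewness_z(data):
    """Return (skew, standard error of skew, z = skew / SE). Needs n >= 3."""
    n = len(data)
    m, s = mean(data), stdev(data)
    # Bias-adjusted sample skewness
    skew = n / ((n - 1) * (n - 2)) * sum(((x - m) / s) ** 3 for x in data)
    # Standard error of skewness (a function of sample size only)
    se_skew = sqrt(6 * n * (n - 1) / ((n - 2) * (n + 1) * (n + 3)))
    return skew, se_skew, skew / se_skew

symmetric = [1, 2, 3, 4, 5]
right_tailed = [1, 1, 2, 2, 3, 20]  # one large score creates a long right tail

skew_sym, se_sym, z_sym = skewness_z(symmetric)
skew_rt, se_rt, z_rt = skewness_z(right_tailed)
print(round(z_sym, 2), round(z_rt, 2))
```

The resulting z is then compared against the 3.29 cutoff in absolute value.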
Positive skew: the tail extends toward higher values.
Negative skew: the tail extends toward lower values.
Kurtosis:
Vertical Abnormalities in Shape
Leptokurtic: Too peaked (positive)
Platykurtic: Too flat (negative)
Calculate: z = Kurtosis / Standard Error of Kurtosis
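The kurtosis check can be sketched the same way. This assumes the bias-adjusted excess kurtosis (0 for a normal distribution) and its usual standard-error formula; the function name and the data are hypothetical.

```python
from math import sqrt
from statistics import mean, stdev

def kurtosis_z(data):
    """Return (excess kurtosis, standard error, z = kurtosis / SE). Needs n >= 4."""
    n = len(data)
    m, s = mean(data), stdev(data)
    fourth = sum(((x - m) / s) ** 4 for x in data)
    # Bias-adjusted excess kurtosis
    kurt = (n * (n + 1) / ((n - 1) * (n - 2) * (n - 3)) * fourth
            - 3 * (n - 1) ** 2 / ((n - 2) * (n - 3)))
    # Standard error of kurtosis, built from the standard error of skewness
    se_skew = sqrt(6 * n * (n - 1) / ((n - 2) * (n + 1) * (n + 3)))
    se_kurt = 2 * se_skew * sqrt((n ** 2 - 1) / ((n - 3) * (n + 5)))
    return kurt, se_kurt, kurt / se_kurt

# Evenly spread scores are flatter than a normal curve (platykurtic, negative)
kurt, se_kurt, z_kurt = kurtosis_z([1, 2, 3, 4, 5])
print(round(kurt, 2), round(se_kurt, 2), round(z_kurt, 2))
```

With only five scores the standard error is large, so even a clearly flat distribution does not produce a significant z.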
Oh no, outliers!!!
• Assess the situation – Is it really that bad?!
What to do:
1. Remove the case
2. Transform the data
3. Change the score:
a. Next highest/lowest score +/- 1
b. Mean +/- 2 SD
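One way to read option 3b, sketched in Python: any score beyond the mean plus or minus 2 SD is pulled back to that boundary. The function name and the data are hypothetical, and the mean and SD are computed with the outlier still included, which is an assumption of this sketch.

```python
from statistics import mean, stdev

def clamp_to_2sd(scores):
    """Replace scores outside mean +/- 2 SD with the nearest boundary value."""
    m, s = mean(scores), stdev(scores)
    low, high = m - 2 * s, m + 2 * s
    return [min(max(x, low), high) for x in scores]

# Hypothetical scores with one extreme case (50)
scores = [10, 11, 12, 11, 10, 12, 11, 50]
adjusted = clamp_to_2sd(scores)
print(adjusted)
```

Only the extreme case changes; the remaining scores are returned untouched.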
Still a problem?
1. Robust methods
2. Bootstrapping
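Bootstrapping, roughly sketched: resample the observed scores with replacement many times and use the spread of the resampled means as a confidence interval. This is a simple percentile bootstrap; the function name, the defaults, and the data are assumptions made for illustration.

```python
import random
from statistics import mean

def bootstrap_ci_mean(data, n_resamples=2000, alpha=0.05, seed=0):
    """Percentile bootstrap confidence interval for the mean."""
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    n = len(data)
    # Mean of each resample drawn with replacement, sorted from low to high
    means = sorted(mean(rng.choices(data, k=n)) for _ in range(n_resamples))
    lower = means[int((alpha / 2) * n_resamples)]
    upper = means[int((1 - alpha / 2) * n_resamples) - 1]
    return lower, upper

# Hypothetical scores with one extreme case
scores = [10, 11, 12, 11, 10, 12, 11, 50]
lower, upper = bootstrap_ci_mean(scores)
print(lower, mean(scores), upper)
```

The interval makes no normality assumption, which is why it is an option when outliers remain a problem.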
Z (normal) distribution vs. T distribution
[Figure: relative frequency curves of the z (normal) and t distributions]
Analysis of group means – T tests
A family of statistical tests used to analyze differences between means: one sample against a known value, two independent groups, or paired observations.
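a) One sample T-test
$$t_i = \frac{\bar{x}_i - \mu_{\bar{x}}}{S_{\bar{x}}}$$
b) Independent samples T-test
$$t = \frac{(\bar{x}_1 - \bar{x}_2) - (\mu_1 - \mu_2)}{S_{\bar{x}_1 - \bar{x}_2}}$$
c) Paired samples T-test
$$t = \frac{\bar{d} - \mu_d}{SD_d / \sqrt{n}}$$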
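The three t statistics can be sketched in plain Python. The function names and the example data are hypothetical. The independent-samples version assumes equal variances and uses a pooled standard error, which is one common choice and not necessarily the one intended on the slide. Only the t value is computed, not the p-value.

```python
from math import sqrt
from statistics import mean, stdev, variance

def one_sample_t(sample, mu):
    """t = (sample mean - mu) / (s / sqrt(n))."""
    n = len(sample)
    return (mean(sample) - mu) / (stdev(sample) / sqrt(n))

def independent_t(a, b, mu_diff=0.0):
    """Pooled-variance t for two independent groups (assumes equal variances)."""
    n1, n2 = len(a), len(b)
    pooled_var = ((n1 - 1) * variance(a) + (n2 - 1) * variance(b)) / (n1 + n2 - 2)
    se_diff = sqrt(pooled_var * (1 / n1 + 1 / n2))  # SE of the mean difference
    return (mean(a) - mean(b) - mu_diff) / se_diff

def paired_t(before, after, mu_d=0.0):
    """t = (mean difference - mu_d) / (SD of differences / sqrt(n))."""
    diffs = [y - x for x, y in zip(before, after)]
    n = len(diffs)
    return (mean(diffs) - mu_d) / (stdev(diffs) / sqrt(n))

t_one = one_sample_t([5, 6, 7, 8, 9], mu=5)
t_ind = independent_t([95, 96, 97, 98, 99], [5, 6, 7, 8, 9])
t_pair = paired_t([10, 12, 9, 11, 13], [11, 14, 12, 13, 15])
print(round(t_one, 3), round(t_ind, 3), round(t_pair, 3))
```

Each t is then compared against the t distribution with the appropriate degrees of freedom (n - 1 for the one-sample and paired tests, n1 + n2 - 2 for the pooled independent test).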
Scales Construction
Factor analysis:
A. Development of objective tests (CFA)
B. Test theory (CFA)
C. Data Reduction (EFA)
Reliability check/ internal consistency (Cronbach’s alpha)
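For reference, Cronbach's alpha can be sketched from its standard formula: alpha = k / (k - 1) × (1 - sum of item variances / variance of total scores), where k is the number of items. The function name and the example responses are hypothetical.

```python
from statistics import variance

def cronbach_alpha(items):
    """items: one list of scores per item, respondents in the same order in each."""
    k = len(items)
    totals = [sum(scores) for scores in zip(*items)]  # each respondent's scale total
    item_var_sum = sum(variance(item) for item in items)
    return k / (k - 1) * (1 - item_var_sum / variance(totals))

# Hypothetical 3-item scale answered by 5 respondents
consistent_items = [
    [1, 2, 3, 4, 5],
    [1, 2, 3, 4, 5],
    [1, 2, 3, 4, 5],
]
alpha = cronbach_alpha(consistent_items)
print(round(alpha, 3))
```

Items that move together perfectly, as in this toy example, give the maximum alpha; real scales fall below that.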
Common rotation techniques
Technique | Type | Goals | Comments
----------|------|-------|---------
Varimax | Orthogonal | Minimize complexity of factors (simplify columns of loading matrix) by maximizing variance of loadings on each factor | Most commonly used rotation. Default recommended.
Direct oblimin | Oblique | Simplify factors by minimizing cross-products of loadings | Allows wide range of factor intercorrelations
Data manipulation
Questions?
Nathan Walter
[email protected]