PS 390.002 - SP 14
Week 5: Using the Normal Curve and Z-Scores
Nick Davis (Teaching Assistant)
Office: Bolton 656
Office Hour: Wed. 4-5pm
Web page: http://democracyobserver.org/390
Data for Week
•
this week you will use the ANES 2008 dataset for the lab exercises:
http://democracyobserver.org/data/nes2008.dta
•
recall that Stata commands . describe and . codebook varname
allow you to learn more about the data, where varname is a
variable name
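A minimal sketch of getting oriented in the data, assuming Stata can reach the course URL above from your machine. The keyword passed to lookfor is only an illustration; the actual variable names in the file are not listed here.

```stata
* Load the ANES 2008 data straight from the course URL given above
use "http://democracyobserver.org/data/nes2008.dta", clear

* List every variable with its storage type and label
describe

* Search variable names and labels for a keyword (here: "congress")
lookfor congress

* Then inspect one variable in detail; replace varname with a real name
* codebook varname
```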
Getting Add-ons for Stata
•
you need zcalc to calculate z-scores
•
the code . findit zcalc will open a viewer window; scroll to where
it says “1 package found” and click the link to install
•
to learn what zdemo2 needs, use . help zcalc
Normal Curve
•
the assumption that our variables take on a normal distribution
(looks like a bell curve) underlies our statistical analysis
[Figure: histogram of Feeling thermometer: CONGRESS; x-axis 0 to 100, y-axis Frequency 0 to 400]
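One way to draw a similar picture yourself is a histogram with a normal curve laid over it. This is only a sketch: it uses the auto data set that ships with Stata and its mpg variable as stand-ins, not the ANES thermometer shown on the slide.

```stata
* Stand-in data: the auto data set installed with Stata
sysuse auto, clear

* Histogram in frequency units with a normal density overlaid,
* to judge by eye how close the variable is to a bell curve
histogram mpg, frequency normal
```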
Description to Inference
•
we need to be able to describe before we can make inferences
about variables and their relationships with each other
•
normality (and mean, standard deviation) allow us to precisely
describe empirical distributions
•
normality allows us to make credible statements about data, since
a normally distributed variable will always have about 99.73% of
the distribution within 3 standard deviations (+/-) of the mean of
that variable
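The share of a normal distribution lying within 1, 2, and 3 standard deviations of the mean can be checked directly in Stata. The sketch below uses the built-in normal() function, which returns the cumulative standard normal probability; it needs no data in memory.

```stata
* Area between -k and +k standard deviations = normal(k) - normal(-k)
display %6.4f normal(1) - normal(-1)   // 0.6827
display %6.4f normal(2) - normal(-2)   // 0.9545
display %6.4f normal(3) - normal(-3)   // 0.9973
```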
Z-Scores
•
a z-score for a value of a variable tells you where that
observation falls on the curve (and what proportion of observations
lie above or below that point)
•
get the summary statistics of a variable using . tabstat varname,
s(n mean sd) by(varname2)
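A z-score is the distance of a value from the mean, measured in standard deviations: z = (x - mean) / sd. As a rough sketch, the same summary statistics can be used to standardize every observation of a variable. The auto data set and the variables mpg and foreign are stand-ins chosen only because they ship with Stata, and z_mpg is a made-up name.

```stata
* Stand-in data: the auto data set installed with Stata
sysuse auto, clear

* Summary statistics by group, as on the slide
tabstat mpg, s(n mean sd) by(foreign)

* summarize stores the overall mean and sd in r(mean) and r(sd)
summarize mpg

* z-score for each observation: (value - mean) / sd
generate z_mpg = (mpg - r(mean)) / r(sd)

* The standardized variable should have mean near 0 and sd near 1
summarize z_mpg
```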
Comparing Distributions
•
zdemo2 allows you to compare two distributions by entering the
mean and standard deviation of each group: . zdemo2 60.79 20.2
47.29 21.89
[Figure: Two Normal Distributions. Republican and Democrat curves on the Congress Feeling Thermometer; x-axis from -40.27 to 141.59, with the group means 47.29 and 60.79 marked]
Calculating Z-Scores
•
if you want to know how many observations lie above and below a
given point (respondent answers 75 out of 100) get the z-score:
. zcalc x mean sd
•
the z-score here is 0.94, which we can then use to find the
proportion of area under the normal curve below and above 75
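The arithmetic behind a z-score can also be done by hand with display. The mean and sd behind the 0.94 on this slide are not given here, so the inputs below are borrowed from the first group in the zdemo2 example purely for illustration and produce a different z. The scalar names are made up.

```stata
* Hypothetical inputs: a response of 75, with mean 60.79 and sd 20.2
scalar xval  = 75
scalar mu    = 60.79
scalar sigma = 20.2

* z = (x - mean) / sd
scalar z = (xval - mu) / sigma

* Prints about 0.70 for these inputs
display %5.2f z
```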
Area Under Normal Curve
•
proportion of the area under the normal curve that lies below
z=0.94: . display normprob(value)
•
proportion of the area under the normal curve that lies above
z=0.94: . display 1 - normprob(value)
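For z = 0.94 the two areas work out as below. This sketch uses normal(), the name current versions of Stata give to the cumulative standard normal function; the slides use the older name normprob(), which should return the same values where it is still accepted.

```stata
* Area under the normal curve below z = 0.94
display %6.4f normal(0.94)       // 0.8264

* Area above z = 0.94 (the two areas sum to 1)
display %6.4f 1 - normal(0.94)   // 0.1736
```

So roughly 83% of a normal distribution lies below a z-score of 0.94 and roughly 17% lies above it.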
Lab Exercises
•
use any data set other than ANES 2008 and find two different
variables for which you would like to calculate z-scores
•
pick a value for each variable and find its z-score
•
find the area of the normal curve above and below that value’s z-score
•
lab manual data sets can be found at:
http://democracyobserver.org/390