RESEARCH METHODOLOGY & STATISTICS
LECTURE 6: THE NORMAL DISTRIBUTION AND CONFIDENCE INTERVALS
Addictions Department
MSc (Addictions)
From sample to population…
[Diagram: a sample of units is drawn from the population; inference runs from the sample back to the population]
Background to statistical inference
• normal distribution
• sampling distribution
• standard normal distribution
• area under the curve
• percentage points
• confidence intervals
• p-values (significance)
The normal distribution
• Reasonable description of many continuous variables
– given large enough sample size
• Location determined by the mean
• Shape (spread) determined by the standard deviation
• Total area under the curve sums to 1
• Example (height of men): mean = 171.5cm, standard deviation = 6.5cm
The standard normal distribution
• Has a mean of 0 and a standard deviation of 1
• Relates to any normally-distributed variable by conversion:
standard normal deviate = (observation – variable mean) / variable standard deviation
• Calculations using the standard normal distribution can be
converted to those for a distribution with any mean
and standard deviation
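The conversion above can be sketched in a few lines of Python. This is only an illustration: the function names are made up for this example, and the numbers are the height figures used elsewhere in the lecture (mean 171.5cm, standard deviation 6.5cm).

```python
def standard_normal_deviate(observation, mean, sd):
    """Number of standard deviations the observation lies from the mean."""
    return (observation - mean) / sd


def from_standard_normal_deviate(z, mean, sd):
    """Reverse conversion: turn a standard normal deviate back into the variable's units."""
    return mean + z * sd


# Height example from the lecture: mean 171.5cm, standard deviation 6.5cm.
z = standard_normal_deviate(180, 171.5, 6.5)
print(round(z, 2))                                  # 1.31
print(from_standard_normal_deviate(z, 171.5, 6.5))  # back to (approximately) 180
```

The second function shows why results from the standard normal distribution carry over to a distribution with any mean and standard deviation: the conversion is reversible.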
Area under the curve of the normal distribution
• Percentage of men taller than 180cm?
– Area under the frequency distribution curve above 180cm
– Standard normal deviate: (180 - 171.5)/6.5 = 1.31
– Area under the standard normal curve above 1.31: 0.0951
• Percentage of men taller than 180cm is 9.51%
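As a rough check of the table lookup, the same area can be computed with Python's standard library (`statistics.NormalDist`, available from Python 3.8). The variable names are chosen for this sketch only. Rounding the deviate to 1.31 first reproduces the table value; using the unrounded deviate gives a slightly different answer.

```python
from statistics import NormalDist

heights = NormalDist(mu=171.5, sigma=6.5)  # height example from the lecture

# Table-style calculation: round the deviate to 2 decimals, then look up the upper tail.
z = round((180 - heights.mean) / heights.stdev, 2)   # 1.31
upper_tail_table = 1 - NormalDist().cdf(z)           # area above 1.31

# Same question answered directly, without rounding the deviate.
upper_tail_direct = 1 - heights.cdf(180)

print(round(upper_tail_table, 4))   # 0.0951
print(round(upper_tail_direct, 3))
```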
Area under the curve of a normal distribution
• Percentage between 165cm and 175cm?
– Find proportions below and above this
– Standard normal deviates: (165 - 171.5)/6.5 = -1 and (175 - 171.5)/6.5 = 0.54
– Area below -1 is 0.1587; area above 0.54 is 0.2946
– Subtract from 1 (remember: total area under the curve is 1)
– 1 – 0.2946 – 0.1587 = 0.5467
• About 54.7% of men have a height between 165cm and 175cm
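The subtraction can be sketched in Python as follows, again using `statistics.NormalDist` and the lecture's height figures; the variable names are illustrative. Note that a table lookup with the upper deviate rounded to 0.54 gives 0.5467, while the unrounded calculation comes out marginally lower (about 0.546), so small differences in the last digit are expected.

```python
from statistics import NormalDist

heights = NormalDist(mu=171.5, sigma=6.5)  # height example from the lecture
standard = NormalDist()                    # mean 0, standard deviation 1

# Table-style calculation with deviates rounded to 2 decimals (-1 and 0.54).
below_165 = standard.cdf(-1.0)             # area below -1
above_175 = 1 - standard.cdf(0.54)         # area above 0.54
between_table = 1 - above_175 - below_165

# Direct calculation without rounding the deviates.
between_direct = heights.cdf(175) - heights.cdf(165)

print(round(below_165, 4), round(above_175, 4))  # 0.1587 0.2946
print(round(between_table, 4))                   # 0.5467
print(round(between_direct, 3))
```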
Percentage points of the normal distribution
• The SND expresses variable values as the number of standard deviations away from the mean
• 95% of the standard normal distribution lies between -1.96 and 1.96
– 2.5% lies below -1.96 and 2.5% lies above 1.96
– The z-score of 1.96 is therefore the two-sided 5% percentage point
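The 1.96 figure can be recovered from the standard normal distribution rather than read from a table. A minimal sketch using `statistics.NormalDist` (variable names are illustrative):

```python
from statistics import NormalDist

standard = NormalDist()  # mean 0, standard deviation 1

# Two-sided 5% percentage point: leave 2.5% in each tail.
z_5_percent = standard.inv_cdf(1 - 0.05 / 2)

# Area between -1.96 and 1.96.
central_area = standard.cdf(1.96) - standard.cdf(-1.96)

print(round(z_5_percent, 2))   # 1.96
print(round(central_area, 3))  # 0.95
```

Changing 0.05 to another value (for example 0.01) gives the percentage point for a different level in the same way.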
COMPUTER EXERCISE
The normal distribution
Distributions and the area under the curve
www.intmath.com/counting-probability/normal-distributiongraph-interactive.php
Exercises
1. Drag the mean and standard deviation left and right to
see the effect on the bell curve
2. My variable has mean = 6 and standard deviation = 0.9
• What proportion of observations are between 5 and 7?
• How does this change when standard deviation = 2?
Hint: click on “Show probability calculation”
3. Verify that approximately 95% of observations are within 2
standard deviations of the mean for any normal distribution
Hint: the red dashed lines are standard deviation units
Sampling distributions and confidence intervals
Sampling distributions
[Diagram: repeated samples are drawn from the population and the mean of each sample is recorded (e.g. 6, 5, 6, …); the collected sample means form the sampling distribution of the mean]
Relationship between distributions
• The distribution of the sample mean is approximately normal, given a large enough sample size, even if the distribution of the variable is not
[Diagram: population, sample and sampling distribution, each centred on its mean]
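A small simulation can illustrate this statement. The sketch below draws repeated samples from a clearly skewed (exponential) population and looks at the resulting sample means. The choice of population, the sample size of 40, the number of samples and the seed are all assumptions made for this illustration, not part of the lecture.

```python
import math
import random
import statistics

random.seed(6)  # fixed seed so the sketch is repeatable

SAMPLE_SIZE = 40   # assumed for illustration
N_SAMPLES = 5000   # assumed for illustration

# Exponential population with mean 1 and standard deviation 1 (strongly skewed).
sample_means = [
    statistics.fmean([random.expovariate(1.0) for _ in range(SAMPLE_SIZE)])
    for _ in range(N_SAMPLES)
]

mean_of_means = statistics.fmean(sample_means)
sd_of_means = statistics.stdev(sample_means)

# If the sample means are roughly normal, about 95% fall within 1.96 SDs of their mean.
within = sum(
    abs(m - mean_of_means) <= 1.96 * sd_of_means for m in sample_means
) / N_SAMPLES

print(round(mean_of_means, 2))                            # close to the population mean of 1
print(round(sd_of_means, 3), round(1 / math.sqrt(SAMPLE_SIZE), 3))
print(round(within, 3))                                   # close to 0.95
```

The spread of the sample means is close to the population standard deviation divided by the square root of the sample size, even though the population itself is far from normal.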
Relationship between distributions
• The standard error measures how precisely the population mean is estimated by the sample mean
– it is the standard deviation of the sampling distribution
• standard error = standard deviation / √(sample size)
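The standard error formula is a one-line calculation. In the sketch below the function name and the sample sizes are hypothetical; the standard deviation of 6.5cm is the lecture's height example.

```python
import math


def standard_error(sd, sample_size):
    """Standard error of the mean: standard deviation / sqrt(sample size)."""
    return sd / math.sqrt(sample_size)


# Height example (s.d. = 6.5cm) with hypothetical sample sizes.
for n in (25, 100, 400):
    print(n, round(standard_error(6.5, n), 3))
# 25 1.3
# 100 0.65
# 400 0.325
```

Quadrupling the sample size halves the standard error, so precision improves with the square root of the sample size rather than in direct proportion to it.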
95% confidence interval for a mean
• 95% probability that the sample mean is within 1.96 standard errors of the population mean
• Equivalently, for 95% of samples the unknown population mean is within 1.96 standard errors of the sample mean
• 95% confidence interval: sample mean - 1.96 x s.e. to sample mean + 1.96 x s.e.
• With s.e. = s.d./√size: sample mean - 1.96 x s.d./√size to sample mean + 1.96 x s.d./√size
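The interval can be computed directly from a sample mean, standard deviation and sample size. This is a minimal sketch: the function name is made up, and the sample size of 100 is a hypothetical value chosen for illustration (the lecture does not state one). The mean and standard deviation are the height figures used earlier.

```python
import math


def confidence_interval_95(mean, sd, sample_size):
    """95% confidence interval for a mean: mean -/+ 1.96 x s.d./sqrt(size)."""
    se = sd / math.sqrt(sample_size)
    return mean - 1.96 * se, mean + 1.96 * se


# Hypothetical sample of 100 men with mean 171.5cm and s.d. 6.5cm.
low, high = confidence_interval_95(171.5, 6.5, 100)
print(round(low, 2), round(high, 2))  # 170.23 172.77
```

The multiplier 1.96 relies on the normal approximation, which is reasonable for large samples.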
Sampling and inference
[Diagram: the sample mean, together with the sampling distribution, is used to infer the unknown population mean]
Interpreting confidence intervals
• Don’t say:
“There is a 95% probability that the population mean
lies within the confidence interval”
• The population mean is unknown but it is a fixed
number
• The confidence interval varies between samples
1. Take multiple random, independent samples
2. For each, calculate 95% confidence interval
3. On average, 19/20 (95%) of the confidence intervals will
contain the true population mean
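The three steps above can be tried out in a short simulation. Everything specific here is an assumption for illustration: a normal population with the lecture's height figures, samples of 50, 2000 repeats, and a fixed random seed.

```python
import math
import random
import statistics

random.seed(1)  # fixed seed so the sketch is repeatable

POP_MEAN, POP_SD = 171.5, 6.5   # "true" population values, known only because we simulate
SAMPLE_SIZE = 50                # assumed for illustration
REPEATS = 2000                  # assumed for illustration


def ci95(sample):
    """95% confidence interval from one sample: mean -/+ 1.96 x s.d./sqrt(size)."""
    mean = statistics.fmean(sample)
    se = statistics.stdev(sample) / math.sqrt(len(sample))
    return mean - 1.96 * se, mean + 1.96 * se


covered = 0
for _ in range(REPEATS):
    # 1. Take a random, independent sample.
    sample = [random.gauss(POP_MEAN, POP_SD) for _ in range(SAMPLE_SIZE)]
    # 2. Calculate its 95% confidence interval.
    low, high = ci95(sample)
    # 3. Check whether the interval contains the true population mean.
    if low <= POP_MEAN <= high:
        covered += 1

coverage = covered / REPEATS
print(round(coverage, 3))  # close to 0.95
```

The population mean never changes in the loop; only the intervals move from sample to sample, which is why the 95% refers to the procedure rather than to any single interval.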
COMPUTER EXERCISE
Confidence intervals
Modify Java settings
1. Go to the Java Control Panel (on Windows, click Start
and then type Configure Java)
2. Click on the Security tab
3. Click on the Edit Site List button
4. Click the Add button
5. Type http://wise.cgu.edu
6. Click the Add button again
7. Click Continue and OK on the security window dialogue
box
Creating confidence intervals
http://wise.cgu.edu/ci_creation/ci_creation_applet/index.html
Exercises
1. How does altering the sample size affect the confidence
intervals calculated?
2. When the population distribution is skewed, how does
this affect the confidence intervals calculated?