MBP1010 - Lecture 2: January 14, 2009
1. Density curves and standard normal distribution
2. Sampling distribution of the mean
3. Confidence interval for the mean
4. Hypothesis testing (1 sample t test)
Reading: Introduction to the Practice of Statistics:
1.3, 3.4, 5.2, 6.1-6.4 and 7.1
Standard deviation vs standard error
for describing data
Table 1. Characteristics of study subjects (n=35)

Variable                     Mean    Standard Deviation   Standard Error
Age (yrs)                    43.5    4.78                 0.81
Height (cm)                  165.8   5.66                 0.97
Weight (kg)                  64.3    8.61                 1.46
Blood Cholesterol (mmol/l)   5.00    0.94                 0.16
Importance of Normal Distribution*
1. Distributions of real data are often close to normal.
2. Mathematically easy to work with, so many statistical tests are designed for normal (or close to normal) distributions.
3. If the mean and SD of a normal distribution are
known, you can make quantitative predictions about
the population.
* also called Gaussian curve
Red bars = scores  6
Proportion = 0.303
Red area under the
density cure are  6.
Proportion = 0.293
Cumulative proportion for value x is the proportion of all
observations that are  x; this is the area to the left of the curve.
Mean = 64.5 inches
SD = 2.5 inches
“The 68-95-99.7 Rule”
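A small R sketch (not from the lecture) that checks the 68-95-99.7 rule with pnorm(), using the height values above (mean 64.5 inches, SD 2.5 inches) as an assumed example:
R code:
mu <- 64.5; sigma <- 2.5                                          # assumed normal population
pnorm(mu + sigma, mu, sigma) - pnorm(mu - sigma, mu, sigma)       # ~0.68 within 1 SD
pnorm(mu + 2*sigma, mu, sigma) - pnorm(mu - 2*sigma, mu, sigma)   # ~0.95 within 2 SD
pnorm(mu + 3*sigma, mu, sigma) - pnorm(mu - 3*sigma, mu, sigma)   # ~0.997 within 3 SD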
The standard normal distribution is:
a normal distribution with a mean of 0 and a SD of 1.
Normal distributions can be transformed to standard normal distributions by the formula:
z = (X - μ) / σ
where X is a score from the original normal distribution, μ is the mean of the original normal distribution, and σ is the standard deviation of the original normal distribution.
The standard normal distribution is sometimes called the
z distribution.
Standardized Normal Distribution
Z-score
A z score always reflects the number of standard deviations
above or below the mean a particular score is.
Ex. If a person scored 70 on a test with a mean of 50 and SD of 10, then they scored 2 standard deviations above the mean. Converting the test scores to z scores, an X of 70 would be:
z = (70 - 50) / 10 = 2
So, a z score of 2 means the original score was 2 SD
above the mean.
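The same arithmetic in R (a trivial sketch; values taken from the example above):
R code:
z <- (70 - 50) / 10   # (X - mean) / SD
z                     # 2: the score is 2 SD above the mean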
Z Scores
-Provide a meaningful way to compare individuals from
different normal distributions – on the same scale
Ie. How many SD above or below the mean?
Eg, - bone density measures
- growth charts – height of children at different ages
- “normalized” data
Quantile-Quantile (Q-Q) Plot
A Q-Q plot shows the theoretical quantiles versus the empirical quantiles. If the distribution is “normal”, we should observe approximately a straight line.
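A minimal R sketch of a normal Q-Q plot; the data vector x is hypothetical and would be replaced by the observations of interest:
R code:
x <- rnorm(50, mean = 64.5, sd = 2.5)  # hypothetical sample for illustration
qqnorm(x)                              # empirical vs theoretical normal quantiles
qqline(x)                              # reference line; points close to it suggest normality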
Rice Virtual Lab in Statistics
http://onlinestatbook.com/rvls/
Hyperstat Online
Section 5. Normal Distribution
- theory
Sampling and Estimation
Populations and Samples
Population: the entire group of individuals that we want information about
Sample: a part of the population that we actually examine in order to gather information
Goal: to try to draw conclusions about the population from the sample
[Diagram] Whole population: mean = μ, SD = σ → take a sample → Sample: mean = x̄, SD = s → inference back to the population
Parameter:
- a number that describes the population
- its value is fixed, but in practice we do not know it (e.g., μ)
Statistic:
- a number that describes a sample (e.g., x̄)
- its value is known when we take a sample, but it can change from sample to sample
- often used to estimate an unknown parameter
Statistical inference is the process by which we draw conclusions about the population from the results observed in a sample.
Two main methods used in inferential statistics:
estimation and hypothesis testing.
In estimation, the sample is used to estimate a
parameter and a confidence interval about the
estimate is constructed.
Random Sampling is Key!
- every individual in the population sampled must have a
chance of being included in the sample
- the choice of one subject does not influence the
chance of other subjects being chosen
- use a method of sampling in which chance alone operates
- toss of a coin, draw from a hat
- random number generators
- random assignment in clinical trials results in randomly
selected groups
Simple Random Sampling (SRS)
- the chance of being selected is equal for each individual in the population
- every possible sample has an equal chance of being chosen
Stratified Sampling
- divide the population into strata
- choose SRS in each stratum
- combine these SRS to form full sample
e.g. strata: prognostic factors in cancer patients; male/female, age
- consult a statistician for more complex sampling
Sample mean (x̄) as an estimator of the population mean (μ)
What would happen if we repeated the sample
several times?
Sampling variability:
- repeated samples from the same population
will not have the same mean
- depends partly on how variable the underlying
population is and on the size of the sample
selected
Sampling Distribution of x̄
- the distribution of values taken by the mean (x̄) in all possible samples of the same size from the same population
1. Mean of the sampling distribution of x̄ = μ
2. SD of the sampling distribution = σ/√n
- called the standard error of the mean
3. Shape of the sampling distribution is approximately a normal curve, regardless of the shape of the population distribution, provided n is large enough (Central Limit Theorem)
Simulation of Sampling Distribution
Central Limit Theorem
Rice Virtual Lab in Statistics
http://onlinestatbook.com/rvls/
Population: all MBP1010 students (n = 37): μ = 1.00 cup, σ = 1.07 cups
One randomly selected sample (n = 12): x̄ = 0.875, s = 0.78
Sampling distribution (1000 repeated samples of n = 12): mean = 1.00, SD = 0.26 (the SEM)
Estimated from the one sample (n = 12): SEM = s/√n = 0.78/√12 = 0.23
Confidence Interval of the Mean
Standard Normal Distribution: 95% Confidence Interval
[Figure] The central area between z = -1.96 (2.5th percentile) and z = +1.96 (97.5th percentile) is 0.95; each tail has area 0.025.
95% Confidence Interval for a population mean
If the population σ is known (not realistic):
Express x̄ in standardized form (the z statistic):
Pr(-1.96 ≤ z ≤ 1.96) = 0.95
Pr(-1.96 ≤ (x̄ - μ)/(σ/√n) ≤ 1.96) = 0.95
Pr(x̄ - 1.96 σ/√n ≤ μ ≤ x̄ + 1.96 σ/√n) = 0.95
x̄ - 1.96(σ/√n) and x̄ + 1.96(σ/√n) are the 95 percent confidence limits on the population mean μ.
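A short R sketch of this z-based interval, reusing the sample values from the MBP1010 example above and treating σ = 1.07 as known (assumed values, for illustration only):
R code:
xbar <- 0.875; sigma <- 1.07; n <- 12             # assumed values
xbar + c(-1, 1) * qnorm(0.975) * sigma / sqrt(n)  # 95% CI when sigma is known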
24 out of 25 samples included μ (96%).
In the long run, 95% of all samples will have an interval that includes μ.
90% Confidence Interval
[Figure] The central area between z = -1.645 (5th percentile) and z = +1.645 (95th percentile) is 0.90; each tail has area 0.05.
Confidence Interval for a population mean
Population σ NOT known (usual):
- use the sample standard deviation (s) as an estimate of σ
- therefore, σ/√n is estimated from the sample using s/√n (the standard error of the mean; SE)
- the SE of the sample is the estimate of the SD that would be obtained from the means of a large number of samples drawn from that population
Problem:
Critical ratio = (x̄ - μ)/(s/√n) is not normally distributed
- need to consider the reliability of both x̄ and s as estimators of μ and σ, respectively
- the shape of the distribution depends on the sample size n
Therefore (x̄ - μ)/(s/√n) follows the t distribution.
t distribution
- a family of distributions indexed by the
degrees of freedom (n-1)
- degrees of freedom refer to number of
independent quantities among a series of
numerical quantities
Degrees of Freedom
For SD:
- there are n deviations around the mean
- there is one restriction: the sum of the deviations = 0
- therefore, once we have calculated n - 1 deviations around the mean, the last one is already determined, since the sum must be 0 (i.e., not independent)
- for n deviations around the mean there are n - 1 degrees of freedom (DF)
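A quick R check of this restriction (any small made-up sample will do):
R code:
x <- c(2.1, 3.4, 1.8, 2.9, 4.0)  # hypothetical sample
sum(x - mean(x))                 # 0 (up to rounding error): only n-1 deviations are free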
95% Confidence Interval for a population mean
Population σ NOT known (usual)
A sample consists of 25 mice with a mean tumor size of 2.1 cm and SD = 1.9 cm.
95% CI: x̄ - t24,0.975 × s/√n  to  x̄ + t24,0.975 × s/√n
t24,0.975 = 2.064 (from tables of the t distribution)
2.1 - (2.064 × 1.9/√25)  to  2.1 + (2.064 × 1.9/√25)
= 1.32 to 2.88 cm
Confidence interval for a Mean
Estimate of mean tumor size = 2.1 cm; n=25.
95% CI = 1.32 to 2.88 cm
Interpretation:
- 95% of the intervals that could be constructed from repeated random samples of size 25 contain the true population mean μ
- we are 95% confident that the mean tumor size
is between 1.32 and 2.88 cm.
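The same interval computed in R from the summary statistics on the slide (a sketch; qt() supplies the t critical value):
R code:
xbar <- 2.1; s <- 1.9; n <- 25
t.crit <- qt(0.975, df = n - 1)          # 2.064
xbar + c(-1, 1) * t.crit * s / sqrt(n)   # 1.32 to 2.88 cm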
Factors affecting the length of the confidence interval
x  t n-1, .975 x s/n
s/n = SE
Sample size: as n increases, length of the CI decreases
variation:
as s, which reflects variability of the distribution
of observations, increases, the length of the
CI increases
level of
confidence: as the confidence desired increases (ie 90,95,
99% CI), the length of the CI increases.
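A small R sketch of the sample-size effect, holding s fixed at 1.9 (an assumed value) and printing the 95% CI half-width for a few values of n:
R code:
s <- 1.9
for (n in c(10, 25, 100)) {
  cat("n =", n, " half-width =", qt(0.975, n - 1) * s / sqrt(n), "\n")
}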
Standard deviation vs standard error
for describing data
Table 1. Characteristics of study subjects (n=35)

Variable                     Mean    Standard Deviation   Standard Error
Age (yrs)                    43.5    4.78                 0.81
Height (cm)                  165.8   5.66                 0.97
Weight (kg)                  64.3    8.61                 1.46
Blood Cholesterol (mmol/l)   5.00    0.94                 0.16
Standard deviation vs standard error
for describing data
If the purpose is to describe the data (e.g. to see if subjects are typical): standard deviation
- variability of the observations
If the purpose is to describe the results (outcome) of the study: standard error or confidence interval
- precision of the estimate of a population parameter
Note:
- can calculate one from the other
- indicate clearly whether reporting SD or SE
What Formal Statistical Inference Cannot Do
-tell you what population you should be interested in
- ensure that you sampled properly from the population
- determine whether measurements made are
biased (systematically wrong)
What it DOES do:
- give a quantitative indication of how much random
variation may have affected your results
What/who are we trying to study?
Target Population:   patients with rheumatoid arthritis          |  all voters
Population Sampled:  patients admitted to a particular hospital  |  telephone listings
Sample Studied:      sample of records of the above patients     |  sample of the above listings
Hypothesis Testing
[Schematic box plots] Dietary fat intake in the low fat and control groups (n = 151 intervention and 187 control); GROUP 1 = Low Fat, GROUP 2 = Control.
[Schematic box plots] Blood HDL-cholesterol levels in the low fat and control groups (n = 163 intervention and 199 control); GROUP 1 = Low Fat, GROUP 2 = Control.
Energy intake: mean = 1684 kcal/day; SD = 380.5 kcal/day
Examples of conclusions of hypothesis tests
The mean intake of dietary fat is significantly lower
in the low-fat group as compared to the control
group (17.5 vs 28.3 percent energy from fat; p < 0.001). (2 sample t test)
Does the energy intake of women in a sample
differ from the “recommended” level of 1850 kcal?
(1 sample t test)
Hypotheses
- hypotheses stated in terms of the population
parameters (true means)
- null hypothesis: Ho
- statement of no effect or no difference
- assess the strength of evidence against
null hypothesis
- alternative hypothesis: Ha
- what we expect/hope to see
- Usually a 2 sided test
Control group: true mean μC, sample mean x̄C  vs  Intervention group: true mean μT, sample mean x̄T
Ho: μC = μT
Overview of hypothesis testing
Compute the probability of obtaining a
difference as large or larger than the observed
difference assuming that, in fact, there is no
difference in the true means.
If the probability is not very small, we conclude that observing such a difference is plausible even when the true means are equal, i.e. the data do not provide evidence that the true means are different.
If the probability is very small, we conclude that there is a difference between the means.
A significance test answers the question:
Is chance (sampling variation) a likely explanation of the discrepancy between the sample result and the null hypothesis population value?
Yes: the sample result is compatible with the idea that the sample is from a population in which the null hypothesis is true.
No: the discrepancy is unlikely to be due to chance variation; the sample result is not compatible with the idea that the sample is from a population in which the null hypothesis is true.
Steps in Hypothesis Testing
1. State hypothesis.
2. Specify the significance level.
3. Calculate the test statistic.
4. Determine p value.
5. State conclusion.
One Sample T test
One Sample T test: Energy intake in women
For a sample of 29 randomly selected women:
Mean energy intake = 1,684 kcal/day
Standard deviation (s) = 380.5 kcal/day
Does the energy intake of women in this
study differ from the “recommended” level
of 1850 kcal?
Example of energy intakes
1. State hypotheses:
Ho: the true mean energy intake of women in the
trial is not different from 1,850 kcal/day
Ha: the true mean energy intake of women in the
trial is different from 1,850 kcal/day
Specific notation:
Ho: μ = 1,850
Ha: μ ≠ 1,850 (2 sided)
2. Significance Level
- how much evidence against Ho we require
to reject Ho (determine in advance)
- compare the p value with a fixed value
that is considered decisive
- this value is called significance level
- denoted as α
- commonly use α = 0.05
Significance Level
 = 0.05
- require that the data give evidence against
Ho so strong that it would happen not more
than 5% of the time (1 in 20), when Ho is true.
 = 0.01
- require that the data give evidence against
Ho so strong that it would happen not more
than 1% of the time (1 in 100), when Ho is true.
3. Calculate the test statistic
- test statistic measures compatibility between
null hypothesis and the data
- to assess how far the estimate is from parameter:
standardize the estimate
- z statistic (when σ is known)
- t statistic (when σ is not known)
One Sample t test
- use the t distribution when the population standard deviation (σ) is not known
To test the hypothesis Ho: μ = μ0 based on an SRS of size n, compute the t statistic:
t = (x̄ - μ0) / (s/√n)
degrees of freedom = n - 1
Step 3. Calculate the test statistic.
Based on the sample of 29 women:
x̄ = 1684 kcal/day; standard deviation (s) = 380.5 kcal/day
t = (x̄ - μ0) / (s/√n) = (1684 - 1850) / (380.5/√29) = -2.35
Determine the p value
- probability of getting an outcome as extreme
or more extreme than the actually observed
outcome
- extreme: far from what would be expected if the null hypothesis is true
- the smaller the p value, the stronger the evidence against the null hypothesis
Energy Intake in Women
t = (1684 - 1850) / (380.5/√29) = -2.35
2 sided test:
P(t ≤ -2.35 or t ≥ 2.35)
P(t ≤ -2.35) = 0.0130
P(t ≥ 2.35) = 1 - 0.9870 = 0.0130
P value = 2 × P(t ≤ -2.35) = 0.026
Step 4. Determine the p value.
[t distribution figure] The area beyond t = -2.35 in the lower tail is 0.0130 and beyond t = +2.35 in the upper tail is 0.0130; 2 sided p = 0.026.
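The t statistic and two-sided p value can be reproduced in R from the summary statistics (a sketch; pt() gives the cumulative probability of the t distribution):
R code:
xbar <- 1684; mu0 <- 1850; s <- 380.5; n <- 29
t.stat <- (xbar - mu0) / (s / sqrt(n))   # -2.35
2 * pt(-abs(t.stat), df = n - 1)         # ~0.026, matching the t.test() output below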
What does a “small” p value mean? Either:
1. An unlikely event occurred (getting a large value for the test statistic by chance), or
2. The null hypothesis is false.
P value for a 2 sided test:
Probability of getting an outcome as extreme or
more extreme than the actually observed outcome
in either direction, if the null hypothesis is
true.
Statistical Significance
In the example:
p value = 0.026
2.6% chance of observing a sample mean energy intake as far from 1850 kcal/day as 1684 kcal/day (or farther), even if the true mean is not different from the recommended level of 1850 kcal/day.
What do we conclude?
Statistical Significance
p value = 0.026
We reject the null hypothesis, Ho.
The mean energy intake of women is significantly
lower than the recommended intake (p < 0.05).
The mean energy intake of women is significantly
lower than the recommended intake (p = 0.03).
(Significant at the 5% but not the 1% level)
Using R – One Sample t-test
R code: t.test(energy.intake, mu=1850)
R Output:
One Sample t-test
data: energy.intake
t = -2.3493, df = 28, p-value = 0.02610
alternative hypothesis: true mean is not equal to 1850
95 percent confidence interval:
1539.260 1828.741
sample estimates:
mean of x
1684.001
Statistical Significance
If the recommended level is 1750 kcal/day, then p = 0.36.
36% chance of observing a sample mean energy intake as far from 1750 kcal/day as 1684 kcal/day (or farther), even if the true mean is not different from the recommended level of 1750 kcal/day.
What do we conclude?
Statistical Significance
p value = 0.36
We do not reject the null hypothesis, Ho.
The data do not provide evidence that the mean energy intake of women is different from the recommended level.
The mean energy intake of women in the study is not significantly different from the recommended level of 1750 kcal/day (p = 0.36).
One sided test
Ho: μ = 1850
Ha: μ < 1850
p = 0.0130
Probability values for one-tailed tests are one half the value for two-tailed tests, as long as the effect is in the specified direction.
One-sided vs two-sided tests
- one sided tests are rarely justified
- decide on appropriate test prior to experiment
- Do not decide on a one-sided test after
looking at the data
e.g. p value for the 2 sided test is 0.09; p value for the 1 sided test is 0.045
If in any doubt: choose the 2 sided test!
General guidelines for stating significance

If:                   results are:
0.01 ≤ p < 0.05       significant
0.001 ≤ p < 0.01      highly significant
p < 0.001             very highly significant
p > 0.05              not statistically significant (NS)
0.05 ≤ p < 0.10       trend towards statistical significance
Reporting actual p values
A. p value = 0.0512
Conclude: result is NS, p > 0.05
If the effect is interesting and potentially important, one would probably want to:
- repeat study
- check power of study
B. p value = 0.75
Conclude: result is NS, p > 0.05
- likely no effect
Comments/Cautions about hypothesis testing
Statistical vs clinical significance
- look at the size of effect not just p value
- look at confidence interval for parameter of
interest
- with a large sample size, a very small effect
may be statistically significant
Exploratory data analysis vs hypothesis testing
- exploratory data analysis is important
- but cannot test a hypothesis on the same data
that first suggested it
- if reporting such findings, clearly state that they are post hoc
- need to design a new study to test the hypothesis
Relationship between confidence interval
and p value
95% Confidence interval for a population mean
A sample consists of 25 mice with a mean tumor
size of 2.1 cm and SD = 1.9 cm.
x - t 24,0.975 x s/n,
x + t 24,0.975 x s/n
t 24,0.975 = 2.064 (from tables of t dist)
2.1 - (2.064 x 1.9/  25), 2.1 + (2.064 x 1.9/  25)
= 1.32 , 2.88 cm
CI and Hypothesis Test
95% CI for mean tumor size = 1.32 to 2.88 cm
Ho: μ = 2.9
Ha: μ ≠ 2.9
x̄ = 2.1 cm, s = 1.9 cm
t = (x̄ - μ)/(s/√n) = (2.1 - 2.9)/(1.9/√25) = -2.105
p = 0.0459
Since 2.9 lies just outside the 95% CI, Ho: μ = 2.9 is rejected at the 5% level (p < 0.05): values outside the 95% CI give p < 0.05, values inside give p > 0.05.
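The same test worked in R from the summary statistics (a sketch):
R code:
xbar <- 2.1; s <- 1.9; n <- 25
t.stat <- (xbar - 2.9) / (s / sqrt(n))   # -2.105
2 * pt(-abs(t.stat), df = n - 1)         # ~0.046; 2.9 is just outside the 95% CI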