FUNDAMENTALS OF MEDICAL RESEARCH
Ed Gracely, Ph.D.
Family, Community, and Preventive Medicine
July 15, 2014
Descriptive statistics and standard error of the mean
Purpose and Goals
Attendees should be able to:
Define and interpret the most common descriptive and epidemiologic statistics, including
1. Averages (mean, median)
2. Measures of variability (SD, range, interquartile range)
3. The standard error of the mean
4. Relative risks and odds ratios.
5. Absolute risk reductions and number needed to treat.
Apply some basic rules for reading/creating graphs and tables.
A) Basic descriptive statistics
The sample is the set of subjects actually included in the study.
The population is the larger body of individuals, generally too large to study, to whom the
results are intended to generalize.
1) Averages:
The mean is the ordinary average. Add up the N values and divide by N.
The symbols X̄ ("X-bar", the sample mean) and μ (population mean) are commonly encountered.
The median is the number closest to the middle of a rank-ordered set: the 50th percentile.
The median is sometimes symbolized as X̃ ("X-tilde").
Decisions: The mean better uses all of the data, and is generally preferred. Its big fault is that a few
very high (or very low) values can pull it quite far from the center of the data. In those cases,
the median is often preferred.
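A minimal sketch of this decision in Python, using made-up salary figures (the numbers are hypothetical, chosen only so that one extreme value is present). It shows a single very high value pulling the mean well above the median.

```python
from statistics import mean, median

# Hypothetical salaries in thousands of dollars (not from any real source).
# One very high value is included on purpose.
salaries = [300, 350, 400, 450, 500, 550, 600, 5000]

sample_mean = mean(salaries)      # add up the N values and divide by N
sample_median = median(salaries)  # middle of the rank-ordered values

print(f"mean = {sample_mean}, median = {sample_median}")
```

Seven of the eight values are at or below 600, yet the mean is above 1000. The median stays near the center of the data.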
Q: The mean salary of major league baseball players in 1993 was $1.16 million. The median
was $490,000. (Source lost). What do the mean and median tell you about this group? a
Q: In a study of bacterial endocarditis, time till diagnosis had median = 60 days, mean = 82,
range 0 - 490. What can you conclude about this data set? (Source lost). b
2) Measures of variability
Standard deviation: Perhaps the most common measure of the variability of the data
around the mean. Roughly indicates how far the "typical" value is from the mean.
In a symmetrical distribution (in particular a bell-curve "normal" distribution), about 2/3 of the values will be
within 1 SD of the mean, and about 95% within 2 SD's. The SD is often reported with the mean.
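The "2/3" and "95%" figures are rounded. As a rough check, the sketch below uses Python's standard-library normal distribution to compute the fraction of a normal curve within 1 and 2 SD's of the mean. The helper name `fraction_within` is just an illustrative choice.

```python
from statistics import NormalDist

# Standard normal curve: mean 0, SD 1. Any normal curve gives the same fractions.
standard_normal = NormalDist(mu=0, sigma=1)

def fraction_within(k):
    """Fraction of a normal distribution lying within k SD's of the mean."""
    return standard_normal.cdf(k) - standard_normal.cdf(-k)

within_1_sd = fraction_within(1)
within_2_sd = fraction_within(2)

print(f"within 1 SD: {within_1_sd:.3f}")
print(f"within 2 SD: {within_2_sd:.3f}")
```

These fractions hold for a normal distribution only. For skewed data they can be well off.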
Ex: Mean ± SD for heart rate was 140 ± 35 in a group of children admitted for an attack of
asthma. (modified St Chris data).
Q: If heart rate is normally distributed, within what interval did about 2/3 of the heart rates
fall? How about 95%? c
A z-score is the number of standard deviations a value is above or below the mean,
with values below the mean having negative z scores.
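As a small illustration of that definition, the sketch below computes z-scores using the mean (140) and SD (35) from the heart rate example. The two individual heart rates are hypothetical values picked for the illustration.

```python
def z_score(value, mean, sd):
    """Number of SD's that value lies above (+) or below (-) the mean."""
    return (value - mean) / sd

# Mean and SD from the heart rate example; the individual values are hypothetical.
z_high = z_score(175, mean=140, sd=35)
z_low = z_score(122.5, mean=140, sd=35)

print(f"heart rate 175   -> z = {z_high}")
print(f"heart rate 122.5 -> z = {z_low}")
```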
DEXA scan results are reported as z scores, which are based on the mean and SD for the patient's own age
and sex. A DEXA "T" score is also a z-score, but using the mean and SD of a healthy young
population rather than the patient's own age group.
Q: An older woman has a DEXA z score of -0.8 and a T score of -2. What does this pattern
mean? d
3) Other measures of variability
Both are often used with the median.
Range: The difference between the largest and smallest values (and often just presented
as those two rather than the "difference").
Interquartile range: The difference between the 75th and 25th percentiles, also often
just given as those two values, rather than as the difference.
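A minimal sketch of both measures, using a hypothetical set of 11 "days till diagnosis" values (invented for illustration, not the data from the study mentioned earlier). Note that different software packages use slightly different rules for percentiles, so quartiles may differ a little from one program to another.

```python
from statistics import quantiles

# Hypothetical days-till-diagnosis values, already rank ordered.
days = [0, 10, 20, 30, 45, 60, 75, 90, 120, 200, 490]

# Range: usually presented as the smallest and largest values.
smallest, largest = min(days), max(days)

# Quartiles: the 25th, 50th (median), and 75th percentiles.
q1, q2, q3 = quantiles(days, n=4)
iqr = q3 - q1

print(f"range: {smallest} to {largest}")
print(f"median: {q2}, interquartile range: {q1} to {q3} (difference {iqr})")
```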
B) Standard error of the mean (SEM)
The SEM indicates how closely the sample mean approximates the population mean it
estimates.
You can be approximately 95% sure that the true mean (that is, the population mean) for the
parameter of interest is within 2 SEM's of the sample mean.
The standard error of the mean (SEM or just SE) is calculated as
SEM = SD / √N, where N is the number of subjects
Ex: "After 48 hours on oral rehydration therapy, a group of 36 children had a hematocrit of
37, SD = 4". The "37" is the sample mean.
So SEM = 4/√36 = 4/6 = 0.67.
Thus you can be 95% sure that the population mean hematocrit for subjects like those in the
example is between 37 - 2(.67) and 37 + 2(.67) = 35.66 to 38.34
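The same arithmetic can be sketched in Python. The function names `sem` and `approx_95_interval` are illustrative choices, and the "2 SEM's" rule is the approximation used in this handout. The output differs from the hand calculation in the second decimal place only because the handout rounds the SEM to 0.67 before multiplying.

```python
import math

def sem(sd, n):
    """Standard error of the mean: SD divided by the square root of N."""
    return sd / math.sqrt(n)

def approx_95_interval(mean, sd, n):
    """Approximate 95% interval for the population mean: mean +/- 2 SEM's."""
    margin = 2 * sem(sd, n)
    return mean - margin, mean + margin

# Hematocrit example: 36 children, mean 37, SD 4.
hematocrit_sem = sem(sd=4, n=36)
low, high = approx_95_interval(mean=37, sd=4, n=36)

print(f"SEM = {hematocrit_sem:.2f}")
print(f"about 95% sure the population mean is between {low:.2f} and {high:.2f}")
```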
Ex: 36 children with hyaline membrane disease had a mean ± SE birth weight of 1400 ± 50
gm (modified from Pediatrics, August 1976).
Q: Within what interval are you 95% sure the population mean birth weight of children like
those in the study falls? e
Key points/distinctions:

The SD tells the variability of individual subjects/values in either the sample or
the population.

The SEM tells how closely the sample mean approximates the population mean.

In trying to determine the variability in the data in an article, always check to
see if you are being given the SD or the SEM. Since the SEM divides the SD by
the square root of the sample size, it is always smaller than the SD (for N > 1), and often much smaller!
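To see how much the SEM depends on sample size, the sketch below holds a hypothetical SD fixed at 4 and computes the SEM for three hypothetical sample sizes. The variability of the individual values is identical in all three cases. Only N changes.

```python
import math

sd = 4  # hypothetical SD, held fixed

# SEM = SD / sqrt(N) for several hypothetical sample sizes.
sem_by_n = {n: sd / math.sqrt(n) for n in (4, 36, 400)}

for n, value in sem_by_n.items():
    print(f"N = {n:3d}: SD = {sd}, SEM = {value:.2f}")
```

A small SEM can therefore come from a large sample rather than from tightly clustered data.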
Standard errors for other parameters (percents, regression statistics) are sometimes encountered. They
are used in the same way as the SEM, but there is no SD involved, and the calculation is different.
Normally they will be given to you in the text.
C) Some simple graphs and tables guidelines
Ex: Consider the following data:
Note: Subjects were asked if they had any limitations (such as physical issues) on
activity.

Presence or absence of personal activity limits in adults surveyed, by educational attainment

Education                      No activity limit   Any limit    Total
Less than High School N (%)    546 (73%)           204 (27%)    750
High School or Higher N (%)    2,581 (89%)         311 (11%)    2,892

Rule: Whenever you have two or more groups to be compared on a yes/no (or other two-level)
dependent variable, the best approach is to find the percentage in the more interesting level of
the dependent (outcome) variable for each of the groups to be compared. Then the comparison
of interest is “Of those in group 1, x% had this characteristic. Of those in group 2, y% had that
same characteristic.”
Q: Take my example above and put it into this form. f
Ex: Consider the following data:
                 Age
            Young    Old    Total
Fail          30       6      36
Succeed       70      14      84
Total        100      20     120
A researcher argues: “70% of the Young group succeeded (70/100), whereas only 30% of
them failed (30/100). Furthermore, a full 83.3% of the successful subjects were in the young
group (70/84). These results clearly show the association between age group and success”.
Q: This argument is not valid. Why? Use my suggested approach above to make more
appropriate comparisons. g
Remember: In almost all applications in medicine that involve comparisons of groups on a yes/no
characteristic or outcome, we are comparing rates or percentages with the characteristic. Be sure you are
comparing apples to apples. And note that it is rarely appropriate to use raw numbers. There are
exceptions, but they represent special cases.
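A short sketch of why raw numbers can mislead, using hypothetical counts for two groups (invented for illustration, not taken from either table above). Group 1 has more subjects with the characteristic, but a lower percentage.

```python
def percent_with(count_with, group_total):
    """Percentage of a group that has the characteristic."""
    return 100 * count_with / group_total

# Hypothetical data: (number with the characteristic, group size).
groups = {"group 1": (40, 200), "group 2": (15, 50)}

percents = {name: percent_with(count, total) for name, (count, total) in groups.items()}

for name, pct in percents.items():
    print(f"Of those in {name}, {pct:.0f}% had this characteristic.")
```

Comparing 40 to 15 would suggest the characteristic is more common in group 1. Comparing the percentages shows the opposite.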
Rule: All graphs should be clearly labeled with a title and axis labels. Show units,
where appropriate. Ideally, a graph in a publication should stand alone, with
minimal need to refer to text or other graphs just to know what it shows.
Avoid undefined abbreviations or acronyms, unique undefined terms, and
so on.
Similar rules apply to tables. As for graphs, tables must be given a descriptive title. They should
also have clear row and column headers. It is common to give both numbers and percentages. Also,
indicate the units or contents of the table cells, like "N (%)".
D) Relative risks and odds ratios
Relative risk: the incidence in one (exposed) group divided by the incidence in the unexposed.
• When greater than 1, it is common to interpret a relative risk as how many times "more
  likely" something (like a disease) is to happen in one group than in another.
• When the relative risk is less than 1, it is often simpler to subtract it from 1 and interpret the
  result as a relative risk reduction.
Ex: The relative risk of lung cancer in smokers may be 10 compared to non-smokers. This would
mean that smokers were 10 times as likely to develop it as non-smokers.
Ex: The relative risk of some disorder in smokers who quit years ago may be 0.3 compared to
current smokers. The incidence in that group is 0.3 times that in current smokers. And we can
say that quitting reduced their risk by 1-0.3 = 0.7 (or 70%).
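The two calculations can be sketched as follows. The incidences (3% and 10%) are hypothetical values chosen to give a relative risk of 0.3, and the function names are illustrative.

```python
def relative_risk(incidence_exposed, incidence_unexposed):
    """Incidence in one group divided by the incidence in the comparison group."""
    return incidence_exposed / incidence_unexposed

def relative_risk_reduction(rr):
    """For a relative risk below 1: the proportion by which risk is reduced."""
    return 1 - rr

# Hypothetical incidences: 3% in those who quit, 10% in current smokers.
rr = relative_risk(0.03, 0.10)
rrr = relative_risk_reduction(rr)

print(f"relative risk = {rr:.2f}")
print(f"relative risk reduction = {rrr:.0%}")
```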
Q: A large group of healthy patients is followed for 4 years, with symptomatic diverticular
disease as an outcome of interest. NSAID users were found to have a relative risk of 2.24
compared to non-users for this outcome. What does this mean? (Aldoori et al, Arch Fam Med,
May-Jun 1998). h
Ex: Kearney et al, Am. J. Epidemiology May 1, 1996, followed a group forward after assessing
dietary variables. They were interested in colon cancer. The relative risk for that cancer was
about 0.6 in those with the highest calcium compared to those with the lowest.
Q: This suggests that people with high calcium had (pick all that apply): i
a. A risk 0.6 times that of people in the low calcium group of getting colon cancer.
b. A 0.6% probability of getting colon cancer.
c. A risk of colon cancer that was 1.6 times that in the low calcium group.
d. A risk that was 40% lower than those in the low calcium group.
Q: What relative risk would indicate no difference? j
After adjusting for some confounding variables, the relative risk was much larger and no
longer convincingly different from 1. We’ll discuss this kind of thing further in a later class in
this series.
Odds ratio: How many times the "odds" of something happening are increased in one group
compared to another.
Rule: when the outcomes are uncommon (say, < 10% in either underlying population), the
odds ratio approximates the relative risk. It is generally interpreted in that way.
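A rough sketch of this rule, taking the odds of an outcome to be p / (1 - p), where p is the probability of the outcome. The two pairs of outcome rates are hypothetical. Both have a relative risk of 2, but only the uncommon pair has an odds ratio close to 2.

```python
def odds(p):
    """Odds of an outcome that has probability p."""
    return p / (1 - p)

def odds_ratio(p_group1, p_group2):
    return odds(p_group1) / odds(p_group2)

def relative_risk(p_group1, p_group2):
    return p_group1 / p_group2

# Hypothetical outcome rates in two groups.
rr_rare, or_rare = relative_risk(0.02, 0.01), odds_ratio(0.02, 0.01)
rr_common, or_common = relative_risk(0.40, 0.20), odds_ratio(0.40, 0.20)

print(f"uncommon outcome: RR = {rr_rare:.2f}, OR = {or_rare:.2f}")
print(f"common outcome:   RR = {rr_common:.2f}, OR = {or_common:.2f}")
```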
EBM note: For clinical purposes, absolute risk reductions are more useful than relative risk
reductions. The absolute risk reduction is the difference between the two percentages.
Ex: 2% of cases of a certain infection recur after cure. With a new antibiotic, only 1% do.
This is a relative risk reduction of 50%.
Ex: 40% recur, but only 20% on a new antibiotic. This is also a 50% relative risk reduction.
Q: Which represents a bigger real impact (or are they the same)?
The absolute risk reductions are 1% and 20%. These tell you more about the value of the new
treatment.
The number needed to treat is the inverse of the absolute risk reduction (as a decimal):
1/.01 = 100.
1/.20 = 5
So, using the new antibiotic would prevent one patient in 100 from recurring with the first
disease, but fully 1 in 5 with the second.
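The two antibiotic examples can be worked through in a few lines. This is a sketch with illustrative function names. Risks are entered as decimals (2% = 0.02).

```python
def absolute_risk_reduction(risk_old, risk_new):
    """Difference between the two recurrence rates (as decimals)."""
    return risk_old - risk_new

def number_needed_to_treat(risk_old, risk_new):
    """Inverse of the absolute risk reduction."""
    return 1 / absolute_risk_reduction(risk_old, risk_new)

# First infection: 2% recur, 1% on the new antibiotic.
nnt_first = number_needed_to_treat(0.02, 0.01)
# Second infection: 40% recur, 20% on the new antibiotic.
nnt_second = number_needed_to_treat(0.40, 0.20)

print(f"first infection:  NNT = {round(nnt_first)}")
print(f"second infection: NNT = {round(nnt_second)}")
```

Both examples have the same 50% relative risk reduction, but very different numbers needed to treat.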
QUIZ
1. You are reporting on a sample of stroke patients, giving the number of years since their stroke. The
results are:
0.6, 1.2, 1.5, 2, 2.6, 3, 3.2, 5, 8, 25, 45
What single statistic best summarizes the average response?
a. The standard deviation
b. The standard error of the mean
c. The median
d. The mean
e. The relative risk
2. In your practice of hypertensive patients, you have a database that records their most recent SBP and
DBP. Your assistant calculates the mean ± SD for the SBP, and finds 120 ± 10.
a. Within what interval (assuming rough normality) would you expect about 95% of the SBP values
to fall?
b. About what % of your patients had a last SBP value of 140 or higher?
3. Wintemute et al. studied the criminal activities of handgun purchasers who had, or did not have, a prior
misdemeanor conviction as of the date of their purchase. Over the next 15 years, 10% of those without
an initial conviction committed a subsequent crime. For those with a prior conviction, the relative risk
of a second crime (compared to those initially crime-free) was 5. [JAMA, Dec 23/30, 1998]
What % of those with a prior conviction went on to commit a second crime?
5%   10%   20%   25%   50%   75%   All of them
4. You read papers on multiple sclerosis in two different major US cities. The paper for Philadelphia
presents an annual incidence rate. The one for Chicago presents a prevalence rate. You don't notice that
difference and think both are incidence rates. You are surprised at the data! Which city would have the
higher reported rate, and why?
5. The normal range for WBC is 5-10 (thousand per microliter). In a sample of patients with a certain
low-level infection, the range of WBC is given as 5 to 40, median 25, IQR 10 to 35. What % of this sample
is in the normal range for WBC? What % is above 25?
6. To help determine the distribution of individual subjects, one should use the (SD // SEM). To get a
margin of error for estimating the population mean, one should use the (SD // SEM).
7. You see that the mean on a certain blood parameter was 20, SEM = 1. Does this indicate that the
researcher has very little variability in the data?
a. Yes
b. No, because the SEM was 5% of the mean.
c. No, because the SEM depends on N and therefore does not directly indicate data variability.
d. No, because we don't have the relative risk.
8. In #7, where are you 95% sure that the true (population) mean lies? _____ to _____.
9. 10% of patients given the standard treatment recur. 8% of patients given the new treatment recur. How
many patients would have to get the new treatment rather than the old in order for 1 to benefit?
10. A researcher reports that 20 of the older children had attained a skill (like self-injection) after training,
but only 10 of the younger group did so. Your comment:
a. Interesting result. If this is beyond chance it would show the importance of age on learning this
skill.
b. I need the SD of the yes/no skill outcome in order to interpret this result.
c. I need the number of old and young kids and then the % successful in each age group to interpret
this result.
FMR.1_of_4.descriptives and SEM.doc
Quiz answers
1: c, since the data has high extremes.
2: a. 100-140. b. 2.5% would be 140 and above.
3: 50% (RR = 5, so 5 × 10%).
4: Chicago. Since people live many years with MS, the prevalence will be many times the annual incidence.
5: 25% in the normal range, 50% above 25.
6: SD, then SEM.
7: c
8: 18 to 22 (2 SEM's on either side of the mean).
9: 50 (the absolute risk reduction is 2% = 0.02, and 1/0.02 = 50).
10: c
a The few players who make very high salaries pull up the mean but don't affect the median.
b There are some very long times till diagnosis. The mean is > median, and neither is mid-range.
c 105-175 for 2/3. 70-210 for 95%.
d She is moderately low for her age group (0.8 SD below the mean) but more seriously low compared to a young
population (2 SD's below their mean; a T score < -1 is osteopenia).
e 1300-1500
f Of those with less than high school, 27% had any limit. Of those with HS or higher, 11% did.
g In the young group, 70% succeeded. In the old group, 70% (14/20) succeeded. The success percentages are the
same in the two age groups.
h NSAID users were 2.24 times as likely to develop the disease.
i (a) and (d).
j 1.0