ES 25 Quantitative Thinking: Homework 4
Due in Class on Tuesday, May 1st
Your Workbook will be COLLECTED. Neatness counts (particularly because it leads to
correct answers). Go through your workbook and make sure it is complete, and your
mathematical process is clear (can you understand it?). This will be a good way to
review for the Midterm Exam on Thursday, May 3rd.
If you need to redo any problems, please feel free to revise problems on a blank sheet of
paper and insert into the workbook (behind the original page).
1. Complete all workbook problems through page 54. (If you have attended all
classes and finished past workbook assignments, the only new problems should
be on pages 36, 46, and 50-54.) See partial solution for pages 50-51, below.
o Page 36 solution: The consumer ratings for Greenies-R-Us have just been
released, as well as the standard deviations for each category:
z-score = (observation – mean)/(standard deviation)
The z-score is the number of standard deviations an observation is away from the mean
(+ indicates above the mean, - indicates below the mean).
Category           Mean   Standard    Greenies-R-Us   Greenies-R-Us
                          deviation   score           z-score
Reliability        6.2    0.8         5.0             -1.5
Eco-Friendliness   4.5    2.2         6.5              0.9
Ergonomics         5.4    1.5         6.2              0.5
Customer Service   7.1    0.5         6.0             -2.2
a) Compute Greenies’ standardized test score for each category.
Show your work!
b) Based on the standard scores, on which test did Greenies’ score highest? Lowest?
**Lowest = Customer Service (note that Reliability “appears” lower using the raw score).
**Highest = Eco-Friendliness (note that there was a large standard deviation in this
category; this means that vehicles vary quite a bit in their “eco-friendliness,” while
vehicles seem to vary only a little bit in their “customer service” rating).
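The z-score formula above can be checked with a few lines of Python. This is only a sketch that reuses the numbers from the table; the dictionary layout and the function name `z_score` are my own choices, not part of the workbook.

```python
# Category: (mean, standard deviation, Greenies-R-Us score), copied from the table.
ratings = {
    "Reliability": (6.2, 0.8, 5.0),
    "Eco-Friendliness": (4.5, 2.2, 6.5),
    "Ergonomics": (5.4, 1.5, 6.2),
    "Customer Service": (7.1, 0.5, 6.0),
}


def z_score(observation, mean, sd):
    """Number of standard deviations the observation lies from the mean."""
    return (observation - mean) / sd


z_scores = {
    name: z_score(score, mean, sd) for name, (mean, sd, score) in ratings.items()
}

for name, z in z_scores.items():
    print(f"{name:18s} z = {z:+.1f}")

# Compare categories on the standardized scale, not the raw scale.
highest = max(z_scores, key=z_scores.get)
lowest = min(z_scores, key=z_scores.get)
print("Highest:", highest)
print("Lowest:", lowest)
```

Ranking by z-score rather than raw score is what moves "lowest" from Reliability to Customer Service.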
o Page 46 solution:
Based on the 1990 census, the number of hours per day adults spend watching television
varies with a mean of 5 hours and a standard deviation of 2 hours. Based on these results
alone, can you conclude that about 95% of the adult population spends between 1 and 9
hours per day watching television?
IF the distribution of “hours per day adults watch TV” is normally distributed, (with
the mean and standard deviation given), then we can conclude that 95% of the population
watches between 1 and 9 hours per day.
Be sure to draw a good picture that shows the 68/95/99.7 rule.
z(x = 9) = (9 – 5)/2 = 2 (so “9 hours” is two standard deviations above the mean)
Z(x = 1) = (1 – 5)/2 = -2 (so “1 hour” is two standard deviations below the mean).
Since about 95% of the data (in this case, each observation is a person’s reported daily hours
watching TV) is contained within 2 standard deviations of the mean, we can conclude that
about 95% of the population watches between 1 and 9 hours of television per day.
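If you want to check the "95% within 2 standard deviations" figure numerically, here is a minimal sketch. It assumes the distribution really is normal (the same IF as above) and builds the normal cumulative probability from `math.erf`; the helper name `normal_cdf` is my own.

```python
import math


def normal_cdf(z):
    """Standard normal cumulative probability P(Z <= z)."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


mean, sd = 5.0, 2.0          # hours of TV per day, from the problem
z_low = (1 - mean) / sd      # z-score for 1 hour
z_high = (9 - mean) / sd     # z-score for 9 hours

# Area under the normal curve between the two z-scores.
share = normal_cdf(z_high) - normal_cdf(z_low)
print(f"z from {z_low} to {z_high}: about {share:.1%} of adults")
```

The exact area is slightly above 95%, which is why the rule is stated as "about 95%."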
o Second problem, page 46.
The NAS and EPA also have concluded that 32 parts per billion of mercury in the blood
of pregnant women corresponds to approximately a doubling in the risk of abnormal
performance on a range of neurodevelopmental tests. If mercury concentrations in
pregnant women follow the distribution N(20, 7) ppb, what percent of children should we
expect to have at least double risk of neurodevelopment problems?
Note: we can NOT compute a probability (or percent) for an EXACT z-score, it must be
for a range of values (see #2, below). I revised the question in yellow highlighter to
reflect this fact.
z(32) = (32-20)/7 = 12/7 = 1.7 (so, 32 ppb is 1.7 standard deviations above the mean)
Since I know my 1-2-3, 68/95/99.7 rule, I know that the percentage is between 2.5%
(if it were 2 standard deviations) and 16% (if it were 1 standard deviation)…
To approximate the percentage, look at the area that you have shaded on your sketch
of the distribution, and guess what percentage of the total area under the curve the
shaded area represents. Using the website
(http://faculty.vassar.edu/lowry/tabs.html#z), I find that P(z >= 1.7) = 0.0446.
We should expect about 4.5% of children to be at least doubly at risk of
neurodevelopmental problems due to mercury levels in pregnant women.
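The table lookup can also be reproduced in Python. This sketch assumes mercury concentrations follow N(20, 7) exactly as stated in the problem; the helper name `upper_tail` is my own. It computes the tail area both for the rounded z = 1.7 and for the unrounded z = 12/7.

```python
import math


def upper_tail(z):
    """Standard normal upper-tail probability P(Z >= z)."""
    return 0.5 * math.erfc(z / math.sqrt(2.0))


z_exact = (32 - 20) / 7          # about 1.714
p_rounded = upper_tail(1.7)      # tail area using the rounded z-score
p_exact = upper_tail(z_exact)    # tail area using the unrounded z-score

print(f"P(z >= 1.7)   = {p_rounded:.4f}")
print(f"P(z >= {z_exact:.3f}) = {p_exact:.4f}")
```

Both values fall between the 2.5% and 16% bounds given by the 68/95/99.7 rule, so the rough estimate and the table agree.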
2. Read pages 37-42. Think about how the first sentence on the top of page 38
relates to the difficulty of assigning a probability to observing a pine needle of
length 5.42 (hunter problem, page 44). Remember, we had to find the probability
of 5.42, or more extreme to get an area under the curve (for the South Forest).
(for the North Forest, it seems to make the most sense to report the z-score for the
needle length, which would show just how close to the mean the needle length
was). No written answer is required, but put a star/note to yourself next to the
key sentence.
The key here is that the AREA under the curve represents the proportion of values that
are in the range you are investigating. To get an area, you have to multiply the height of
the density curve by a “width” (range) on the x-axis. Thus, you cannot evaluate a
probability at an exact point.
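To see why a single exact value gets no probability, here is a small sketch that shrinks the width of the range. For illustration it borrows the N(5, 2) television distribution from page 46 (the pine needle numbers are not reproduced in this handout); the helper names are my own.

```python
import math


def normal_cdf(x, mean, sd):
    """Cumulative probability P(X <= x) for a normal distribution."""
    return 0.5 * (1.0 + math.erf((x - mean) / (sd * math.sqrt(2.0))))


def prob_between(a, b, mean, sd):
    """Area under the density curve between a and b."""
    return normal_cdf(b, mean, sd) - normal_cdf(a, mean, sd)


mean, sd = 5.0, 2.0
center = 5.42  # any single value; chosen only for illustration

# Narrow the range around the value: the area shrinks toward zero.
areas = []
for half_width in (1.0, 0.1, 0.01, 0.0):
    area = prob_between(center - half_width, center + half_width, mean, sd)
    areas.append(area)
    print(f"width {2 * half_width:5.2f}: area = {area:.5f}")
```

With a width of zero there is no area at all, which is why we ask for "this value or more extreme" instead.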
3. IF you need more practice, repeat the problem on page 41, using a value of 70
mm instead of 66mm for parts 1-5. For parts 6-8, recalculate the answers, using a
value of 80 mm instead of 77.5 mm. You may do this problem on page 61 of your
workbook, in pencil, to keep all of your work together. Note: this is not a
required problem, but it is highly recommended if you need extra practice.
o 1. z= 1
o 2. .16 or 16% (by the 68/95/99.7 rule, (1 - .68)/2 tails = .16)
o 3. 320 bones
o 4. 84%
o 5. .68/2 = .34 or 34%
o 6. p(z>2) = .025 or 2.5%
o 7. 50 bones
o 8. 47.5%
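The answers above all come from the 68/95/99.7 rule, and the arithmetic can be sketched in Python. Note the assumption: the total of 2000 bones is not stated in this handout; I inferred it from answer 3 (320 bones / 0.16), so check it against page 41 of your workbook.

```python
within_1sd = 0.68   # share within 1 standard deviation of the mean
within_2sd = 0.95   # share within 2 standard deviations of the mean

above_z1 = (1 - within_1sd) / 2    # upper tail beyond z = 1
below_z1 = 1 - above_z1            # everything below z = 1
mean_to_z1 = within_1sd / 2        # between the mean and z = 1
above_z2 = (1 - within_2sd) / 2    # upper tail beyond z = 2
mean_to_z2 = within_2sd / 2        # between the mean and z = 2

total_bones = 2000  # ASSUMPTION: inferred from 320 / 0.16, not given here

print(f"above z = 1: {above_z1:.0%} -> {round(above_z1 * total_bones)} bones")
print(f"below z = 1: {below_z1:.0%}")
print(f"mean to z = 1: {mean_to_z1:.0%}")
print(f"above z = 2: {above_z2:.1%} -> {round(above_z2 * total_bones)} bones")
print(f"mean to z = 2: {mean_to_z2:.1%}")
```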
Sample solution for pages 50-51. I have attempted to explain my answers clearly,
for easy understanding (since we went through it quickly in class). Please use your
own words (and you can be much more concise) when you fill in your workbook.
Proper article citation:
Fried, Peter, B. Watkinson, D. James, and R. Gray, 2002. Current and former
marijuana use: preliminary findings of a longitudinal study of effects on IQ in young
adults. Canadian Medical Association Journal.
Online: http://www.ecmaj.ca/cgi/content/abstract/166/7/887 (click on Full
Text.pdf for entire article)
1. Interpret the p-value of < 0.001 for “prenatal substance exposure (mean),
Marijuana, joints/wk.”
The p-value always tells you the probability of getting the observed value(s), or more
extreme, if the null (no-effect) hypothesis is true. But what is the null hypothesis?
For each of the “characteristics” listed in bold on the left hand side of Table 1, the null
hypothesis is, “Characteristic has no effect on current level of marijuana use.”
For this question, the Null Hypothesis is: “average prenatal exposure to marijuana has no
effect on current level of marijuana use.”
o An associated p-value of < 0.001 means, “We would expect to get these values, or
values even more extreme, fewer than 1 in 1000 times, just by random chance, if prenatal
exposure to marijuana has no effect on current level of marijuana use.” That is SO
unlikely (improbable) that we don’t believe it (we reject the null hypothesis, at a
.05 significance level). There is support for the alternative hypothesis (“prenatal
exposure to marijuana has an effect on current level of marijuana use”).
o Notice that the statement is not specific as to which values (of 1.4, 1.4, 11.6, 1.5)
are too extreme, though I expect that smoking 11.6 joints/week, while
pregnant, affects an unborn baby. This value makes me suspicious of the
study. Does the sample of “heavy smokers, > 5 joints/wk” represent the entire
population of heavy pot smokers adequately? I doubt if worldwide, the
average heavy smoker’s mother smoked 11.6 joints/wk while pregnant.
o Interestingly, this p-value says nothing about the effect of prenatal
marijuana exposure on IQ (only on current smoking level). In the full
report, the authors say, “although some characteristics did differ across the
4 groups (such as father’s and mother’s education), none of these was
associated with the IQ difference score; therefore, they were not used as
covariates.”
2. Which of the “family characteristics” measured had a “significant” effect on current
users marijuana smoking?
Both “mother’s education” (p-value=.013) and “father’s education” (p-value=.0009) had
a statistically significant effect on current levels of marijuana use. Try writing a
sentence, “An associated p-value of 0.001 means…” for this problem. Think: what are
the null and alternative hypotheses?
3. Do you believe that light marijuana smoking actually leads to an increase in IQ score
(between preteen and young adult ages), compared to non-users?
o According to the chart, light smokers had a “within subject difference score of
5.8” which is the largest increase among all groups (and the highest overall
score). This suggests that smoking < 5 joints/wk improves IQ more than not
smoking (non-users had a 2.6 average gain in IQ). I found this hard to believe, so
I read more in the full document, which says, “For analyses in which number of
joints smoked per week was treated as a categorical variable, ANOVA with
Dunnett’s procedure indicated that the mean IQ difference score for the heavy
current user group was significantly different from that for non-users (-4.0 v 2.6,
p-<0.05) whereas no significant difference were evident in comparisons with the
light current users and former users (5.8 vs. 2.6 and 3.5 vs 2.6).” Thus, it seems
that the p-values in Table 1 only tell us that (at least) two of the four values
presented are different enough from each other to make the difference statistically
significant.
o A larger issue is that there are only 9 people in the light user group. This small
sample size makes me question the entire study.
o Before I believe that marijuana either does or does not affect IQ, I would want to
know if a causal mechanism has been suggested to explain how the drug affects
IQ. This reasoning is similar to the controversy about low frequency electromagnetic
fields, and the possibility of the exposure causing cancer. Though the two have
been correlated, a causal mechanism has not been explained (or accepted by the
scientific community). (http://www.who.int/mediacentre/factsheets/fs263/en/)
4. What other factors (confounding variables) might explain the observed relationship
between marijuana use and increase, or decrease, in IQ score?
o As stated above, the only “statistically significant” effect of marijuana use on IQ
(change from preteen to young adult) appeared to be between the non-users and
heavy users (> 5 joints/wk).
o Perhaps the heavy smoking teens stopped attending class, thereby
lowering their IQ. “Not going to class” is the confounding variable.
o Perhaps these heavy smoking adolescents are social deviants, who do not
place high value on IQ tests, and thus do not try as hard. Thus, effort
confounds the relationship between “current use” and “mean IQ score
difference.”
Sample Solution for page 54:
Childhood leukemia and parents' occupational and home exposures.
Lowengart RA, Peters JM, Cicioni C, Buckley J, Bernstein L, Preston-Martin S,
Rappaport E.
A case-control study of children of ages 10 years and under in Los Angeles County was
conducted to investigate the causes of leukemia. The mothers and fathers of acute
leukemia cases and their individually matched controls were interviewed regarding
specific occupational and home exposures as well as other potential risk factors
associated with leukemia. Analysis of the information from the 123 matched pairs
showed an increased risk of leukemia for children whose fathers had occupational
exposure after the birth of the child to chlorinated solvents [odds ratio (OR) = 3.5, P =
.01], spray paint (OR = 2.0, P = .02), dyes or pigments (OR = 4.5, P = .03), methyl ethyl
ketone (CAS: 78-93-3; OR = 3.0, P = .05), and cutting oil (OR = 1.7, P = .05) or whose
fathers were exposed during the mother's pregnancy with the child to spray paint (OR =
2.2, P = .03). For all of these, the risk associated with frequent use was greater than for
infrequent use. There was an increased risk of leukemia for the child if the father worked
in industries manufacturing transportation equipment (mostly aircraft) (OR = 2.5, P =
.03) or machinery (OR = 3.0, P = .02). An increased risk was found for children whose
parents used pesticides in the home (OR = 3.8, P = .004) or garden (OR = 6.5, P = .007)
or who burned incense in the home (surprising to me) (OR = 2.7, P = .007). The risk was
greater for frequent use. Risk of leukemia was related to mothers' employment in
personal service industries (OR = 2.7, P = .04) but not to specified occupational
exposures. Risk related to fathers' exposure to chlorinated solvents, employment in the
transportation equipment-manufacturing industry, and parents' exposure to household or
garden pesticides and incense remains statistically significant after adjusting for the other
significant findings.
This implies that other reported risks (such as spray paint) are NOT significant, once adjusted
(presumably for other confounding factors).
How the Studies Are Done (Cancer in Children and Pesticide Exposure Summary - M.
Moses M.D)
Epidemiology is the study of diseases and their causes in human populations. It compares
groups of people with an exposure to those without it, or people with a disease to those
without it. In the studies in this table, groups of children with cancer or with pesticide
exposure are the “cases”. Groups of children without cancer or without exposure to
pesticides are the “controls”.
The aim is to find out if the children with cancer (the cases) are more likely to have
exposure to pesticides than the children without cancer (the controls). Or to find out if the
children with pesticide exposure (the cases) are more likely to have cancer than children
without pesticide exposure (the controls).
How Study Results are Reported
Study results are reported as risk ratios. These ratios indicate whether the children with
cancer were more likely to be exposed to pesticides (at increased risk), equally likely to
be exposed to pesticides (no difference in risk), or less likely to be exposed to pesticides
(at decreased risk) than the children without cancer. Or whether the children with
pesticide exposure were more likely to have cancer (at increased risk), equally likely to
have cancer (no difference in risk), or less likely to have cancer (at decreased risk) than
the children without pesticide exposure.
For example: In a study of leukemia**, the cases would be children with leukemia, and
the controls children without it. There are three possible outcomes. The children with
leukemia could be more likely, equally likely, or less likely to have exposure to
pesticides.
1. More likely: If the ratio is greater than 1 (> 1), this means that the children with
leukemia were more likely to have exposure to pesticides – that pesticide exposure
increases the risk of leukemia. The size of the ratio indicates how much the risk is
increased. A ratio of 1.4 means a 40% increase in risk. A ratio of 2.0 means a doubling of
the risk, or a 100% increase. At least a doubling of the risk is considered more important
than ratios less than 2.
2. Equally likely - If the ratio is equal to one ( = 1) this means that there was no
difference in pesticide exposure found in the children with or without leukemia –
pesticides did not increase the risk of leukemia in the study.
3. Less likely - If the ratio is less than one (< 1), this means that children with leukemia
were less likely to be exposed to pesticides than children without it, or the risk was
decreased. The smaller the number the lower the risk. A ratio of 0.80 means that children
with leukemia are 20% less likely to have been exposed to pesticides. A ratio of 0.40, that
they are 60% less likely.
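The three cases above boil down to one subtraction: the percent change in risk is (ratio - 1) x 100. A minimal Python sketch follows; the function name `percent_change_in_risk` is my own, and the ratios are the example values from the text.

```python
def percent_change_in_risk(ratio):
    """Percent increase (positive) or decrease (negative) implied by a risk ratio."""
    return round((ratio - 1.0) * 100.0, 1)


for ratio in (1.4, 2.0, 1.0, 0.80, 0.40):
    change = percent_change_in_risk(ratio)
    if change > 0:
        label = f"{change:.0f}% increase"
    elif change < 0:
        label = f"{-change:.0f}% decrease"
    else:
        label = "no difference"
    print(f"ratio {ratio:.2f}: {label}")
```

Note that a ratio of 2.0 comes out as a 100% increase: doubling means adding 100% of the original risk.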
When studying humans, it is impossible to determine every factor that might influence
the results of a study. It might have occurred anyway, by chance (p-value). It is possible
that any increase in risk was not from pesticides, but something else (if true cause is
correlated with pesticide use, pesticide use would be a confounding factor). This could be
something the researcher didn’t think of, or didn’t even ask about. Or it could be from
pesticide exposure in combination (interesting.. pesticides + shaving cream?? ;) with
other unknown or unstudied factors.
Therefore, finding an increase in risk does not mean that pesticides “cause” leukemia.
This is why it is common to report an increase in risk by stating that “pesticide exposure
increases the risk of leukemia in children”, or “pesticide exposure is a risk factor for
leukemia in children”, and not that pesticides “cause” leukemia.
Are the Study Results “Significant”?
There are methods to determine how strong the link or associations between leukemia
and pesticides are, and if they occurred by chance (p-value!). They are called tests of
statistical significance. The statistical part is usually left out, and the results reported as
“significant” or “not significant”.
“p” value: This tests whether the findings could have occurred by chance 5% of the time
or less. The 5% is converted to a decimal and written as 0.05. For example, you will see
the results as “p = 0.05” (read as p equals point 0 5), or “p < 0.05” (read as p less than
point 0 5), or “p ≤ 0.05” (read as p less than or equal to point 0 5).
If the “p” value is less than or equal to 0.05, the findings are considered to be statistically
significant (very arbitrary); that is, they are unlikely to have occurred by chance. The
smaller the “p” value, the more significant the findings. For example, “p ≤ 0.01” (read as p
less than or equal to point 0 1) means that it could have occurred by chance 1% of the
time or less.
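The decision rule described here is just a comparison of the p-value with the chosen cutoff. The sketch below applies it to a few of the p-values reported in the leukemia abstract above; the function name `is_significant` and the dictionary are my own, and the 0.05 default is only the conventional cutoff.

```python
def is_significant(p_value, alpha=0.05):
    """True when the p-value is at or below the significance level."""
    return p_value <= alpha


# p-values copied from the abstract (fathers' and home exposures).
findings = {
    "chlorinated solvents": 0.01,
    "methyl ethyl ketone": 0.05,
    "home pesticides": 0.004,
    "garden pesticides": 0.007,
}

for alpha in (0.05, 0.01):
    kept = [name for name, p in findings.items() if is_significant(p, alpha)]
    print(f"significant at {alpha}: {kept}")
```

Tightening the cutoff from 0.05 to 0.01 drops methyl ethyl ketone from the list, which shows how much the label "significant" depends on an arbitrary choice.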

Which of the “treatments” was correlated with the largest increase in risk of
childhood leukemia?
Children whose parents use pesticides in the garden [OR = 6.5, p = .007]. It is a little
misleading; this actually means, “Children with leukemia are 6.5 times more likely (in
terms of odds) to have parents who use pesticides in the garden than kids without
leukemia.” IT IS NOT THE SAME as “Parents who use pesticides in the garden are 6.5
times more likely to have kids with leukemia.”
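To make "odds ratio" concrete, here is a sketch of the calculation for a simple (unmatched) case-control table. The counts are hypothetical, invented only for illustration; they are NOT from the Lowengart study, which used matched pairs and would be analyzed differently.

```python
# HYPOTHETICAL counts, for illustration only (not from the study).
cases_exposed, cases_unexposed = 30, 70        # children with leukemia
controls_exposed, controls_unexposed = 10, 90  # children without leukemia

# Odds of exposure within each group.
odds_cases = cases_exposed / cases_unexposed
odds_controls = controls_exposed / controls_unexposed
odds_ratio = odds_cases / odds_controls

# Share of each group that was exposed, and the ratio of those shares.
share_cases = cases_exposed / (cases_exposed + cases_unexposed)
share_controls = controls_exposed / (controls_exposed + controls_unexposed)
share_ratio = share_cases / share_controls

print(f"odds ratio  = {odds_ratio:.2f}")
print(f"share ratio = {share_ratio:.2f}")
```

With these made-up counts the cases are 3 times as likely to be exposed, but the odds ratio is larger (about 3.9). An odds ratio compares odds, so reading it as "times more likely" is only an approximation.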

The study reports, “…increased risk of leukemia for children whose fathers
had occupational exposure after the birth of the child to methyl ethyl ketone
(CAS: 78-93-3 (chemical reference number); OR = 3.0, P = .05).”
o Write a sentence interpreting the OR and p-value for someone who has not
taken a statistics class.
Kids with leukemia were about 3 times as likely (in terms of odds) to have fathers who
worked with MEK as kids without leukemia. If MEK had no real link to leukemia, we would
expect to get this odds ratio, or an odds ratio even more extreme, about 5% of the time
(by chance alone).
o Search www.wikipedia.org for methyl ethyl ketone. Do you feel that
this chemical is “safe” or “dangerous”? Explain.
It sounds pretty harmless. Before I start adding my 2 cents to Wiki, I think I will review
a broader literature on links between leukemia and MEK (or other common solvents). In
general, it is always good to look at multiple sources of information before deciding what
to believe.
I would be much more easily convinced that MEK actually causes leukemia (the study
just reports the risk associated with correlation) if a mechanism was proposed to explain
how MEK leads to leukemia.
I also try to look at who the author is, their sponsoring organization, reputation, etc.

What “significance level” (α) was used for this study? Offer an argument for a
higher or lower significance level (you choose).
They used α (significance) = 0.05. Thus, if an odds ratio has less than a 5% chance of
occurring just by chance, we believe that the effect is real. If I want to ‘take action sooner,’ and
don’t need quite as compelling evidence, I might choose .10. Thus, if the effect (odds
ratio, in this case… but think of “pine needle length” in the hunter problem) has a 10%
chance of occurring, or less, I will take action (“reject the null hypothesis that there is no
effect”). However, if the costs of “reacting” are high (or the costs of “not reacting” are
low), we could set α (significance) = 0.01. In this case, we won’t reject the hypothesis
that there was “no effect” beyond random chance until the probability of getting the
effect is 1/100 or less.
In general, “statistically significant” means “reject null,” “investigate
further,” or “take action.”
If you want action to be taken sooner, choose a larger value for the
significance level.