Shaken Baby Syndrome On Trial:
A Statistical Analysis of Arguments Made by the
Defense and Prosecution
April 19, 2016
Maria Cuellar1
Carnegie Mellon University
Statistics Department and Heinz College School of Public Policy
Committee: Stephen E. Fienberg2, Joseph Kadane3, and Amelia Haviland4.
Advisor: Stephen E. Fienberg.
Abstract
Over 1,100 individuals are in prison today on charges related to the diagnosis of
Shaken Baby Syndrome (SBS). In recent years this diagnosis has come under scrutiny,
and more than 20 convictions made on the basis of SBS have been overturned (Medill
Justice Project, 2015b). The overturned convictions have fueled a controversy about
alleged cases of SBS. In this paper, I review the arguments made by the prosecution
and defense in cases related to SBS and point out two problems: much of the evidence
used has contextual bias, and the expert witnesses and attorneys ask the wrong causal
questions. To resolve the problem of asking the wrong causal questions, I suggest that
a Causes of Effects framework be used in formulating the causal questions and answers
given by attorneys and expert witnesses. To resolve the problem of bias, I suggest that
only the task-relevant information be provided to the individual who determines the
diagnosis. I also suggest that in order for this to be possible, there must be a change
in the definition of SBS so it does not include the manner in which the injuries were
caused. I close with recommendations to researchers in statistics and the law about
how to use scientific results in court.
1 Email: [email protected]. Current address: 5000 Forbes Ave., Pittsburgh, PA, USA
2 Department of Statistics and Heinz School of Public Policy. Email: [email protected]
3 Department of Statistics. Email: [email protected]
4 Heinz School of Public Policy. Email: [email protected]
Contents
1 Introduction
2 The debate about Shaken Baby Syndrome
  2.1 The Trial of Trudy Muñoz
  2.2 Problems With Trudy Muñoz’s Trial
  2.3 The history of Shaken Baby Syndrome
  2.4 A battle of the experts
  2.5 Specific arguments made in court
3 Asking the wrong causal questions
  3.1 Problem: The research asks and answers the wrong causal questions
  3.2 A solution: Effects of Causes (EoC) and Causes of Effects (CoE)
  3.3 Examples of asking the wrong causal questions
    3.3.1 Example 1: Deaths caused by short falls
    3.3.2 Example 2: A statistical model to predict abuse
4 Biased evidence
  4.1 Problem: Multiple sources of bias in Shaken Baby Syndrome
  4.2 A solution: Blinding, i.e. using task-relevant information
  4.3 Examples of biased evidence
    4.3.1 Example 1: Bias in data about Shaken Baby Syndrome
    4.3.2 Example 2: Bias in the outcome variable in a predictive model
    4.3.3 Example 3: Racial bias in convictions of Shaken Baby Syndrome
5 These problems occur outside of Shaken Baby Syndrome
  5.1 Asking the wrong causal questions in the Motherisk controversy
  5.2 Bias in the prediction of child abuse by Best et al.
6 Research agenda
7 Recommendations
8 Conclusion
1 Introduction
The use of scientific evidence in court has come under increasing scrutiny in the past 10
years. A report from the National Research Council of the National Academies (Committee on Identifying the Needs of the Forensic Science Community, 2009) revealed serious
problems both with the way forensic evidence is gathered and with the way it is used by
attorneys and expert witnesses in criminal cases. This has given rise to an expanding literature on the scientific and statistical basis of forensic analyses (Cole and Edmond, 2015),
as well as the formation of groups of scientists that are reviewing the science and statistics
used in forensics today (see for example the Center for Statistics and Applications in
Forensic Evidence (2016)). Driving this work are a number of issues, the most important
of which is the worry that individuals are being wrongly convicted due to the misuse of
forensic evidence (Kafadar, 2015; Spiegelman and Tobin, 2013). Though the introduction
of DNA analysis has helped identify many cases of wrongful convictions (Thompson, 2013),
there are other cases that cannot be decided through DNA evidence, yet they have been
found to be wrongful convictions for other reasons (Innocence Project, 2016; University of
Michigan, Michigan Law, Innocence Clinic, 2016).
Cases of alleged child abuse form a particularly controversial and important set that has
been decided through methods other than DNA analysis. In such cases, the testimony of
the child victim is often unreliable or unavailable, there are typically no witnesses, and
experiments are not possible because they would be unethical. Therefore, there are only
“medical findings” to point to what caused the injuries (Dunstan et al., 2002; Dixon et al.,
2005; Truman and Ayoub, 2002). In particular, in cases of Shaken Baby Syndrome5, a
specific type of child abuse, certain clinical findings have been used since the early 1970s
to determine whether the child was abused by shaking or had an accident (Caffey, 1972).
Shaken Baby Syndrome is relevant today because over 1,100 individuals are currently
in prison on charges related to the diagnosis, and there are at least one hundred new cases
per year in the United States alone (Medill Justice Project, 2015b). If the medical findings
commonly used are in fact not indicative of abuse (i.e. have low specificity), there could
be many wrongfully convicted individuals who are already in prison and in child registries,
or who might continue to be wrongfully convicted.
In recent years there have been over 30 recognized cases of wrongful convictions related
to Shaken Baby Syndrome (Innocence Project, 2016). This has caused a heated debate
between a group who argues that it is possible to determine whether shaking caused the
head trauma in children and a group who argues that it is not possible to determine
whether it was shaking or an accident that caused the child’s head trauma. There have been
numerous studies and disagreements between expert witnesses in court about the research
and the mechanism driving the head trauma found in cases of Shaken Baby Syndrome.
5 The American Academy of Pediatrics suggested that “Shaken Baby Syndrome” be called “Abusive
Head Trauma”. The terms are often used interchangeably in court and in the media, so I use only
“Shaken Baby Syndrome” in this study, for consistency.
One reason there are two competing views about Shaken Baby Syndrome is that
the problem is not well structured. Statisticians and other researchers have been unable to
analyze this problem in depth because it has been presented in such a way as to prevent
serious theoretical or empirical analysis.
In this study, I bring to bear a methodology of Causes of Effects and Effects of Causes,
which has not been previously used in the analysis of Shaken Baby Syndrome. The methodology of CoE and EoC can be consistently applied to this problem, and it can provide a
framework that might make the problem more amenable to analysis by researchers. In my study, I
show that this is the correct way to look at the issue because it helps generate the proper
causal questions and answers for legal and medical applications.
I also study the way contextual biases might be affecting the research and arguments
that have been made about Shaken Baby Syndrome so far. Analyses of contextual biases
have been explored in other forensics applications, but in cases of child abuse they are less
well-defined. In this study, I give a more structured approach to the analysis that could be
made about the diagnosis by applying the previous work about contextual bias to cases of
Shaken Baby Syndrome.
Specifically, in this paper I review the medical literature and testimonies from court.
I find that there are problems with biased evidence and asking the wrong causal questions. To resolve the issues of bias, I suggest that attorneys use the task-relevant versus
task-irrelevant framework proposed by the National Commission on Forensic Science (McCormack et al., 2015). I also point to some difficulties in using task-relevant information
in the medical setting for Shaken Baby Syndrome. To resolve the issue of asking the wrong
causal questions, I suggest that attorneys and researchers use the Causes of Effects and
Effects of Causes methodology from Dawid et al. (2014). I close by making suggestions
about the data that could be used to study Shaken Baby Syndrome.
2 The debate about Shaken Baby Syndrome
2.1 The Trial of Trudy Muñoz
In a heavily publicized case in 2009, 45-year-old nanny Trudy Muñoz was charged with two
felony counts related to Shaken Baby Syndrome (Commonwealth of Virginia vs. Rueda,
2009; Bazelon, 2011; Cenziper, 2015).
Muñoz is originally from Peru, and she moved to the United States in 2001 with her
husband, who was trained as an attorney. She used to run a travel agency and teach college
courses for tour guides in Peru. In Virginia she started a daycare center in her home so
she could work while spending time with her two daughters.
On the afternoon of April 20, 2009, she was taking care of five-month-old Noah Whitmer, and he started crying. According to her testimony (Commonwealth of Virginia vs.
Rueda, 2009), she gave him the bottle to help put him to sleep. She noticed that Noah’s
eyes rolled toward the back of his head and he went limp, so Muñoz called 911.
When Noah arrived at the hospital, his physicians gave him a computerized tomography
(CT) scan, which showed a subdural hemorrhage (bleeding in a space between the skull and
the brain) and a cerebral edema (a swelling brain). He also had an ophthalmological exam
that revealed retinal hemorrhaging (bleeding at the back of the eyes). One expert witness
who testified, Dr. Futterman, said, “...when you have a child who has...had massive retinal
hemorrhages in the back of the eyes. He had significant injury to the substance of the
brain with no evidence of any kind of violent injury like a car crash. This tells me there
are tremendous acceleration forces going into the substance of the brain itself... and this
child was shaken or shaken and slammed against something.” For decades, these three
symptoms have been considered the telltale signs of Shaken Baby Syndrome (also called
Abusive Head Trauma).
Before April 20, Noah was a healthy baby. At the hospital he had no external marks on
his body. He had no bruises or cuts or fractures (no sign that he was forcefully gripped and
no evident neck injury that would seem to result from vigorous shaking). But a magnetic
resonance image (M.R.I.) confirmed the CT scan findings. The physicians tested Noah
for clotting disorders that can cause these kinds of hemorrhages. The tests came back
negative.
The defense attorney, James R. Kearney, asked Dr. Futterman, “If there was a [chronic
hematoma] and it in fact re-bled, could that have caused the type of damage you saw in
Noah Whitmer’s brain?” And Dr. Futterman’s answer was, “No sir.”
The physicians told Noah’s parents that they strongly suspected Noah was violently
shaken in the moments before he stopped breathing. Noah lapsed into a coma as soon as
he arrived at the hospital, and the police proceeded to interrogate Muñoz. A day later
Muñoz was arrested.
The trial was held one year later. The prosecutor presented six physicians who testified
that Noah’s brain scans showed he had been abused. One said the findings were “inconsistent with accidental trauma,” and another said “the child was shaken, or shaken and then
slammed against something.” Another said that the onset of the injuries is very rapid, so
since Muñoz was with the child when he became ill, she was the one who shook him. The
defense presented a physician, Ronald Uscinski, who has testified in several Shaken Baby
Syndrome cases. Uscinski said he believed the bleeding on the scans was the result of birth
trauma and on April 20 rebleeding had occurred spontaneously. However, a physician
from the prosecution said there was no evidence of a rebleed from a chronic hematoma
from birth.
Muñoz had nothing suspect in her past. In fact, several mothers who had hired her
as a nanny testified at the trial saying she was “more patient than all of us.” But the
physicians’ readings of the scans were the main evidence that a crime had taken place, of
the crime’s timing, and even of Muñoz’s state of mind since they agreed that only an act of
great violence could inflict such injuries. Muñoz testified not only that she did not shake
Noah, but also that she had never admitted that she shook him. But she had spoken to
the police and social worker without an attorney or recordings, and both the police and
social worker testified that Muñoz told them she did shake Noah. In addition, Muñoz did
not have an interpreter when she spoke to the police (despite the fact that her level of
English is very basic and she did not understand several of the questions she was asked
in court). The social worker had written a note that said, “Might have shaken him about
three times, but not sure.”
After five days of testimony, the jury deliberated for five hours and Muñoz was convicted
of two felony counts: abuse of a child causing serious injury and cruelty to a child. Muñoz
is currently serving a 10 1⁄2-year prison sentence. Noah came out of the coma and had as
many as 32 seizures per day for the next year. He has had developmental delays since, and
his prognosis is uncertain.
2.2 Problems With Trudy Muñoz’s Trial
From a scientific point of view, there are several problems with Trudy Muñoz’s case. They
fall into two categories: biased evidence and asking the wrong causal questions.
Listing these problems and asking detailed questions about this case is of interest because
Muñoz’s trial is not unique. There are hundreds of cases related to the Shaken Baby
Syndrome diagnoses every year in the United States, and the arguments made in Muñoz’s
case are similar to those made in many other cases.
First, I list some problems with biased evidence. In the days surrounding the incident,
when Muñoz spoke with the police and social worker, there were no recordings (partially
because Muñoz declined being recorded). There were only notes from the social worker.
Since the only evidence about her confessions comes from individuals’ accounts, is it possible that the accounts (and perhaps even the individuals’ memories) were biased by the prosecution’s
argument in court, i.e. by allegiance bias? The problem here is that testimonies from individuals could be distorted by the information given in court or provided through prior
interactions with attorneys.
Trudy Muñoz is from Peru, and her level of English is very basic. When the police
interrogated her, she did not have an interpreter, and the conversation was not recorded.
While giving her testimony in court, she did not understand several of the questions she
was asked. Is it possible that her inability to communicate comfortably and
clearly could have biased the opinions of the police, social workers, and even the jury? The
problem here is that since the individuals involved were responsible for some interpretation
of the evidence, it is possible that they could have misinterpreted Muñoz’s testimony.
It is likely that the physicians who saw Noah have seen similar cases of brain injuries in
the past. Is it possible that the physicians were affected by the results from previous cases?
If so, this would likely be recorded in the transcripts. The prosecution usually establishes
the expertise of its witnesses, and prior SBS cases would help the prosecution. The
problem here is that the physicians’ diagnosis at the hospital could have been biased by
having this case be similar to other cases that were considered to be Shaken Baby Syndrome.
Second, I list some problems with asking the wrong causal questions. The six physicians
called to testify by the prosecution said that Noah’s brain scans showed that he had been
abused. How did the physicians know that Noah was abused? Is some medical expertise
necessary to understand the relationship between shaking and injuries like Noah’s? Is
it equivalent to ask these three questions? i) “If a healthy child is shaken, how likely is it
that he will get brain injuries like Noah’s?,” ii) “If a child has brain injuries like Noah’s,
how likely is it that he was shaken?,” and iii) “If a healthy child was shaken and he got
these brain injuries, how likely is it that the shaking, and not something else, caused the
injuries?” The problem here is that it is not clear why the research that was cited in court
supports the claim that Muñoz abused Noah.
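The gap between questions i) and ii) can be made concrete with Bayes' rule. The sketch below is illustrative only: every probability in it is a hypothetical placeholder and not an estimate from the medical literature or from the trial, and the function name `posterior_shaken` is introduced here for illustration. It shows that the answer to question ii) depends on quantities that question i) does not supply, namely how often the injuries occur without shaking and how common shaking is to begin with. Question iii) is a counterfactual question about an individual case and is not computed here.

```python
def posterior_shaken(p_injury_given_shaken, p_injury_given_not_shaken, prior_shaken):
    """Return P(shaken | injury) from P(injury | shaken) using Bayes' rule."""
    numerator = p_injury_given_shaken * prior_shaken
    denominator = numerator + p_injury_given_not_shaken * (1.0 - prior_shaken)
    return numerator / denominator


# All values below are hypothetical, chosen only to illustrate the calculation.
P_INJURY_GIVEN_SHAKEN = 0.5        # answer to question i), assumed
P_INJURY_GIVEN_NOT_SHAKEN = 0.001  # injuries arising from other causes, assumed

if __name__ == "__main__":
    for prior in (0.0001, 0.001, 0.01):  # assumed base rates of shaking
        posterior = posterior_shaken(
            P_INJURY_GIVEN_SHAKEN, P_INJURY_GIVEN_NOT_SHAKEN, prior
        )
        print(f"prior={prior:.4f}  P(injury|shaken)={P_INJURY_GIVEN_SHAKEN}  "
              f"P(shaken|injury)={posterior:.3f}")
```

With the same answer to question i), the answer to question ii) changes substantially as the assumed base rate changes, which is why the two questions cannot be treated as interchangeable.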
I also note a problem about missing information, which influences the answers to the
causal questions. When Noah arrived at the hospital after Muñoz called 911, the physicians
performed a test for clotting disorders that can cause the kinds of hemorrhages that were
observed. The test was negative. Is it possible that the test was not good enough to detect
the disorders, but Noah actually did have them? What are the sensitivity and specificity of
the test?6 Are there other conditions that were not tested for, but could have caused the
same injuries? The problem here is that we do not know whether Noah had a different
disease. The reason this affects the causal questions is that if there is another possible
cause for the injuries, then it is possible that even if Noah was shaken, it was not shaking
that caused the injuries, but something else.
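The question about the clotting test can be phrased as a probability: given a negative result, how likely is it that the disorder is nonetheless present? The following is a minimal sketch under stated assumptions. The sensitivity, specificity, and prevalence values are hypothetical, since the paper does not report them for the test Noah received, and the function name is introduced here for illustration.

```python
def prob_disorder_given_negative(sensitivity, specificity, prevalence):
    """Return P(disorder present | test negative) using Bayes' rule.

    sensitivity: P(test positive | disorder present)
    specificity: P(test negative | disorder absent)
    prevalence:  P(disorder present) before testing
    """
    false_negative = (1.0 - sensitivity) * prevalence
    true_negative = specificity * (1.0 - prevalence)
    return false_negative / (false_negative + true_negative)


if __name__ == "__main__":
    # Hypothetical values, for illustration only.
    for sensitivity in (0.99, 0.90, 0.70):
        p = prob_disorder_given_negative(sensitivity, specificity=0.95, prevalence=0.05)
        print(f"sensitivity={sensitivity:.2f}  P(disorder | negative)={p:.4f}")
```

The lower the sensitivity, the less a negative result rules out the disorder, which is the point of asking for the test's error rates before treating the alternative cause as excluded.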
The problems with Trudy Muñoz’s case help to provide a framework to analyze the statistical arguments made in court and the medical literature about Shaken Baby Syndrome
in the two categories I mentioned (biased evidence and asking the wrong causal questions).
I will speak about these problems in more detail later, but first I begin by providing a brief
history of Shaken Baby Syndrome.
2.3 The history of Shaken Baby Syndrome
The first study about Shaken Baby Syndrome was written by British pediatric neurosurgeon
Norman Guthkelch in 1971. In his paper, Guthkelch reviewed 13 cases of infant subdural
hematoma, five of which had no external signs of violence to the head. After a mother
admitted that she had shaken her child, Guthkelch hypothesized that, “perhaps the babies’
heads could have been shaken.” A year later, Caffey (1972) reviewed 27 cases thought to
involve force that included shaking. He said it was possible that the injuries might have
been caused by “the grabbing and gripping of an infant or younger child by the extremities
or by one leg or arm and then shaking.” Caffey noted that babies were often shaken by
their parents in a whiplash motion, and it often resulted in grave permanent damage to
their brains and eyes. He ended his 1974 article with a warning: “The wide practice of
habitual whiplash-shaking for trivial reasons warrants a massive nationwide educational
campaign to alert everyone responsible for the welfare of infants on its potential and actual
pathogenicity.”
6 This is an important question because if the test is not sensitive enough, or has a high false negative
rate, it is possible that Noah did have the clotting disorder, but the test just did not report it. Assuming
that the disease was not present because the test did not detect it is a form of the ignorance fallacy (i.e. P
has never been absolutely proven and is therefore certainly false). The phrase, “Absence of evidence is not
evidence of absence,” is often used to recall that the disease could have been present. But this is an issue
that I do not cover in depth in this paper. I believe that asking the wrong causal question is a deeper
problem because the testing problem could be fixed with better information.
In the 1980s, the term Shaken Baby Syndrome came into broad use, and a national
prevention and awareness campaign was launched in the United States.7 At this point,
the syndrome was defined as characterized by a constellation of three symptoms called the
triad: subdural hemorrhage (bleeding in a space between the skull and the brain), cerebral
edema (a swelling brain), and retinal hemorrhaging (bleeding at the back of the eyes).
A number of articles address biomechanics experiments regarding Shaken Baby Syndrome. I describe three noteworthy ones here. Ommaya (1968) anesthetized rhesus monkeys, secured them in a contoured chair, and then accelerated them along a 20-foot track
until they crashed into a wall. He concluded that whiplash can cause cerebral concussions
and brain injury, including bleeding on the surface of the brain. Duhaime et al. (1987)
reviewed the autopsy results of 13 babies with symptoms associated with Shaken Baby
Syndrome, but the authors found evidence that the trauma was actually caused by a blunt
impact. The authors worked with biomechanical engineers to create infant-sized dummies
equipped with sensors to measure acceleration. From the biomechanical results and the
autopsies, they concluded that “shaking alone in an otherwise normal baby is unlikely to
cause the shaken baby syndrome.” Weber (1983, 1985) conducted 50 pediatric cadaver
drop tests for forensic research on child abuse. He concluded that “we should no longer
assume that the skull of infants is not damaged after falls from table height.” This biomechanical evidence has shown that short falls can cause brain injuries like the ones seen in
Shaken Baby Syndrome cases.
Some studies also showed that the triad could be caused by other conditions. A child
abuse textbook calls these conditions “mimics” of Shaken Baby Syndrome. Sirotnak (2006)
says some of the mimics are, “prenatal and perinatal conditions, including birth trauma;
congenital malformations; genetic conditions; metabolic disorders; coagulation disorders;
infectious disease; vasculitis and autoimmune conditions; oncology; toxins and poisons;
nutritional deficiencies; complications from medical-surgical procedures, including lumbar
puncture; falls; motor vehicle crashes; and playground injuries.” Barnes (2011) also has an
article listing mimics in imaging for Shaken Baby Syndrome, and he says it is very likely
that other causes are still undiscovered.
In 1992 the National Center on Child Abuse and Neglect funded a national campaign
to raise awareness about SBS. As a part of it, prosecutors around the country were trained
to investigate and pursue charges based solely on the triad (Tuerkheimer, 2009). Almost
two decades later, a 2009 position paper from the American Academy of Pediatrics, written
by Cindy Christian, recommended that doctors use the more general term Abusive Head
Trauma, but also called shaking an “important mechanism of such trauma.” Physicians
and the media still use the term Shaken Baby Syndrome as a synonym for Abusive Head
Trauma.
7 Other countries, notably the United Kingdom, Sweden, and Australia, have had similar but smaller
national prevention campaigns (Moran et al., 2012).
As the diagnosis became more prevalent, physicians started using different groups of
symptoms to diagnose Shaken Baby Syndrome. Thus, in 2012 the Centers for Disease
Control and Prevention issued a report titled “Pediatric Abusive Head Trauma: Recommended Definitions for Public Health Surveillance and Research” (Parks et al., 2012a),
which provides the following definition of Shaken Baby Syndrome:
CDC definition of Shaken Baby Syndrome/Abusive Head Trauma:
Pediatric abusive head trauma is defined as an injury to the skull or intracranial
contents of an infant or young child (< 5 years of age) due to inflicted blunt impact
and/or violent shaking.
The American Academy of Pediatrics says the injuries in infants with
Shaken Baby Syndrome may include:
Bleeding over the surface of the brain (subdural hemorrhages); other brain injuries,
including brain swelling and injuries to the white matter of the brain; bleeding on the
back surface of the eyes (retinal hemorrhages). Some victims have evidence of blunt
impact to the head (others do not), and some victims have other evidence of physical
abuse, including bruises, abdominal injuries, and recent or healing broken bones
(others do not).
However, the definition of Shaken Baby Syndrome appears to be constantly shifting as new
trials are held and new expert witnesses testify in court (Moran et al., 2012).
After the campaign from the 1980s, the number of trials related to the Shaken Baby
Syndrome diagnosis rose dramatically. In the 1980s there were fifteen cases in the appeals
courts, in the 1990s there were over two hundred cases, and in the 2000s there were over
eight hundred.
The CDC estimates that the reported conviction rate in fatal child abuse cases is 88%.
The incidence is estimated to be 20 to 30 cases per 100,000 children under one year of age
with a case fatality rate exceeding 20% and significant disability for about two-thirds of
the survivors (Parks et al., 2012a).
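To see what these rates imply in absolute terms, the arithmetic can be sketched as follows. This is an illustration under stated assumptions: the case fatality rate is treated as exactly 20% although the source says it exceeds 20%, so the death counts are lower bounds and the survivor counts are upper bounds, and the function name is hypothetical.

```python
def expected_outcomes(cases_per_100k, fatality_rate=0.20, disability_share=2 / 3):
    """Split an incidence figure into deaths, survivors, and disabled survivors.

    fatality_rate is assumed to be exactly 0.20 here; the source reports
    a rate exceeding 20%, so deaths are a lower bound.
    """
    deaths = cases_per_100k * fatality_rate
    survivors = cases_per_100k - deaths
    disabled = survivors * disability_share
    return deaths, survivors, disabled


if __name__ == "__main__":
    for cases in (20, 30):  # incidence range per 100,000 children under one year
        deaths, survivors, disabled = expected_outcomes(cases)
        print(f"{cases} cases per 100,000: at least {deaths:.1f} deaths, "
              f"at most {survivors:.1f} survivors, "
              f"about {disabled:.1f} with significant disability")
```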
2.4 A battle of the experts
Today, there is a heated debate between two groups. On the one side, a group of physicians
who treat child abuse say that decades of clinical observation, as well as confessions, show
that it is possible to determine whether shaking caused the triad of subdural and retinal
bleeding and brain swelling (Jenny et al., 1999). On the other side are the physicians and
attorneys who say that it is not possible to determine whether it was shaking or an accident
that caused the child’s injuries or death (Moran et al., 2012; Squier, 2008). Some of the
members of this latter group go even further to suspect that since there has never been an
observed case in which a child is shaken and then experiences the triad of brain injuries,
shaking simply does not produce these features (Barnes, 2015).
In court, what has ensued is what some have called a “battle of experts” (Tuerkheimer,
2009). The prosecution calls expert witnesses who belong to the former group and the
defense calls expert witnesses who belong to the latter. Then the jury is responsible for
deciding who is correct and whether the child was abused.
There are legal standards that determine what scientific evidence is admissible in court.
These standards function as a judicial screening device to exclude potentially inaccurate and unreliable
evidence based on a new untested theory or methodology. Daubert is the federal court
standard (Daubert v. Merrell Dow Pharmaceuticals, 509 U.S. 579, 1993), replacing the
Frye standard (Frye v. United States, 293 F. 1013 D.C. Cir., 1923), but in state and local
courts, many jurisdictions continue to use Frye standards.
There are certain differences between the two standards, but generally what they require
is that if there is a scientific consensus about a scientific statement, then it is admissible
in court. The former group argues that a scientific consensus has been in place since
Guthkelch (1971) that the constellation of injuries characteristic of Shaken Baby Syndrome
is indicative of abuse. It also argues that the latter group is a very small group of physicians
who are working against the medical establishment and speaking outside their area of expertise.
The latter group says there is new research showing that there is not enough evidence to say
that some of the accused adults have inflicted child abuse (Cenziper, 2015; Haberman, 2015; Smith, 2016).
A documentary from 2014 created by Meryl Goldsmith8 describes this debate in detail.
It follows Dr. Ronald Uscinski and other physicians who have questioned the syndrome in
their daily lives.
In 2009 and 2012, respectively, the American Academy of Pediatrics and the Centers for Disease
Control and Prevention issued policy statements reassuring readers that the diagnosis is valid and updating the clinical features that can be used to identify the syndrome
according to new findings (Christian et al., 2009; Parks et al., 2012b).
In March 2016, this debate became so polarized and publicized that pediatric neuropathologist Waney Squier’s medical license was revoked. In a trial on March 11, 2016,
the Medical Practitioners Tribunal Service panel, the disciplinary arm of the General Medical Council in the United Kingdom, ruled that in Squier’s “written and oral evidence, [she
was] dogmatic, inflexible and unreceptive to any other view.” The panel said that “Dr.
Squier went well outside her sphere of expertise and gave evidence and opinions in relation
to areas of medicine and physiology in which she was not an expert.”9 Squier can no longer
practice medicine. As this paper is being written, the latter group, which believes there
may be many wrongful convictions related to Shaken Baby Syndrome, is meeting to
discuss actions to stop the former group from removing physicians’ licenses for expressing
their professional opinions.
8 The Syndrome documentary will be available starting April 15, 2016, http://www.resetfilms.com/.
Last accessed: April 3, 2016.
9 Online at: http://www.yourlocalguardian.co.uk/news/national/14339261.display/. Last accessed:
April 9, 2016.
2.5 Specific arguments made in court
To analyze the specific arguments made in court by the defense and prosecution, it is useful
to refer to the testimonies from the trials related to Shaken Baby Syndrome. Out of the
thousands of trials that have been held, I read fifty of their testimonies. There are recurring
arguments used by the prosecution and the defense. A manual from 2010 written by the
National District Attorneys Association for the prosecution includes many of the recurring
arguments. There is also an article from the American Bar Association for the defense that
includes other arguments (Judson, 2015).
The manual for the prosecution is called “Overcoming Defense Expert Testimony in
Abusive Head Trauma Cases,” and it was written by the National Center for Prosecution
of Child Abuse of the National District Attorneys Association (Odom et al., 2010). It was
written partly in response to “a group of physicians who testify frequently and convincingly for
the defense in AHT cases, even though many of their opinions are outside the consensus
of the medical community.” The manual outlines a set of arguments commonly made in
Shaken Baby Syndrome cases by the defense and the prosecution. In Table 1 I copy the
specific arguments written by the National District Attorneys Association. The article
for the defense also provides good arguments that are worth investigating in detail in the
future.
Table 1: Defense and prosecution claims from a manual that helps prosecutors prepare
for trial written by the National District Attorneys Association. From Odom et al. (2010).
Defense claim: The evidence that children can sustain brain injuries from having been shaken
is too unreliable to be admitted in court under Daubert, Frye, or state-specific
admissibility tests.
Prosecution claim:
• Every jurisdiction that has considered the issue currently
holds that AHT evidence is admissible under Daubert, Frye,
or state-specific admissibility tests.

Defense claim: It is impossible to cause brain injuries through shaking absent a concomitant
neck injury.
Prosecution claim:
• The defense relies upon a biomechanical study by Faris
Bandak that has been heavily criticized in the medical community.
• Shaking rarely results in soft tissue and bone injuries
to the infant’s neck, and cervical nerve injuries are often
present but difficult to detect.

Defense claim: Short falls commonly produce the types of fatal cranial injuries seen in AHT
cases.
Prosecution claim:
• Contact head injuries from falls produce brain injuries
that differ from the types of injuries caused by shaking.
• Short falls are frequently provided as false histories when
in fact child abuse has occurred.
• The risk of a short fall causing fatal injuries in infants is
less than one in a million.

Defense claim: A hypoxic rather than traumatic event might cause subdural hemorrhage.
Prosecution claim:
• Geddes’s research is highly experimental, and does not
demonstrate that hypoxia causes subdural hematomas.
• Courts and subsequent research have rejected hypoxia as
a cause of subdural hematoma.

Defense claim: A child’s previously existing subdural hematoma may “rebleed,” either
spontaneously or as a result of a trivial injury, and cause severe brain injuries.
Prosecution claim:
• There is no evidence that an infant’s subdural hematoma
resulting from a traumatic event will spontaneously rebleed
as a result of a minor injury.
• The literature does not support the notion that an infant’s
chronic subdural hematoma can spontaneously rebleed.
• If the birth process causes subdural bleeding in an asymptomatic newborn, the subdural bleed will quickly resolve
without ever becoming symptomatic.
• The “rebleed” theory is not accepted by the vast majority
of the medical community.

Defense claim: The infant may have suffered a head injury without manifesting symptoms until
hours or days later (a “lucid interval”). The best study, of which I’m aware, says
that 75 percent will occur within the first 24 hours, and the remaining 25 percent
can take up to three or four days.
Defense claim: Retinal hemorrhages are nonspecific injuries and do not indicate child abuse.
Prosecution claims:
• There is no evidence of infants having lucid intervals following non-contact head injuries that lead to death.
• Lucid intervals are primarily associated with epidural, not
subdural hematomas.
• The vast majority and the most reliable evidence to date
indicates that infants who sustain lethal inflicted as opposed
to accidental head trauma experience neurological deterioration and loss of consciousness very rapidly, if not immediately following, craniocerebral trauma.
• This position is supported by an overwhelming consensus
of the medical community.
• Retinal hemorrhages, particularly when multilayered, bilateral, and covering the whole retina to its edges, are overwhelmingly associated with abusive head injury characterized by repetitive acceleration-deceleration injury with or
without blunt head trauma.
• Recent animal studies indicate that accelerationdeceleration forces can cause retinal hemorrhages in infants.
• Retinal hemorrhages are not caused by CPR or bleeding
disorders.
• Increased intracranial pressure (ICP) does not cause
the diffuse retinal hemorrhages commonly associated with
AHT.
12
It is biomechanically impossible to
cause massive brain injuries including
subdural hematoma in children through
shaking alone.
• Duhaime’s experiment did not accurately measure the
amount of force caused by shaking an infant because she
used dummies with necks that did not mechanically resemble an infant’s neck.
• Cory and Jones found that using injury thresholds based
upon impact data may be necessary when calculating the
forces generated by shaking, but Duhaime calculated her
results using injury thresholds based upon angular acceleration data only.
• Duhaime derived injury thresholds from studies in which
primates were exposed to a single angular accelerationdeceleration force, which fails to account for the cumulative
damage caused by repeated shaking incidents.
• Duhaime used simple mass-scaling in extrapolating injury
thresholds from an adult primate’s brain to a human infant’s
brain, and simple brain mass scaling does not accurately
predict thresholds for traumatic axonal injury in immature
brains.
• Duhaime’s models were only shaken in a single direction,
which does not account for the increased acceleration found
by shaking in various directions.
In the following sections I review some of the claims made by the defense and prosecution, and I provide suggestions to improve them.
3 Asking the wrong causal questions
3.1 Problem: The research asks and answers the wrong causal questions
In cases related to Shaken Baby Syndrome, often the wrong causal questions are asked. In
the trial of Trudy Muñoz (Section 2.1) I mentioned a few problems: the physicians tested
for a clotting disorder and since the test was negative they decided that Noah must have
Shaken Baby Syndrome, and the six physicians who testified in the case said Noah had
been abused after listening to the evidence from the case. To determine whether the proper
causal question is being asked, I turn to the Effects of Causes (EoC) and Causes of Effects
(CoE) framework.
3.2 A solution: Effects of Causes (EoC) and Causes of Effects (CoE)
I use the Causes of Effects and Effects of Causes methodology from Dawid et al. (2016)
to review several arguments that have been made in the clinical, statistical, and legal
literatures about Shaken Baby Syndrome. In this section, I describe the difference between
Effects of Causes and Causes of Effects, and then I generate a formulation to answer
questions about the legal cases.
In court, expert witnesses are sometimes called to speak about the statistical or epidemiological evidence that could be used to determine whether the defendant committed
a crime. There are papers that address the questions of how judges should manage cases
involving complex scientific and technical evidence (Green et al., 2011), and there are
standards stating which scientific arguments are admissible in court (Daubert v. Merrell
Dow Pharmaceuticals, 509 U.S. 579, 1993; Frye v. United States, 293 F. 1013 D.C. Cir.,
1923). But only a small number of papers by one group of authors (Dawid, 2000, 2011,
2015; Dawid et al., 2014, 2015, 2016) note the difference between Effects of Causes and
Causes of Effects analyses and give a detailed methodology for the types of statistical
arguments that should be made in court.
Dawid and co-authors argue that on one hand, statisticians and quantitative social
scientists typically study the Effects of Causes (EoC), which can be addressed by using
experimental design and epidemiology. On the other hand, attorneys and the Courts are
more concerned with understanding the Causes of Effects. The authors argue that the
evidence that is cited in court is often useful for answering EoC questions. But, because it
does not focus on CoE, that evidence is not relevant for assigning blame
to a specific individual.
This EoC versus CoE formulation is somewhat controversial. Indeed, it has been said
that the search for causes used to be “cocktail party chatter that is outside the realm
of science.” Now researchers are starting to dismiss this criticism and focus on possible
formulations for CoE (Gelman and Imbens, 2013).
Some authors (Pearl, 2014; Gelman and Imbens, 2013) disagree about the difference
between EoC and CoE. In Gelman and Imbens’s opinion, the difference between them is
similar to the difference between forward and backward causality. They say there is no need
for new methodology because the EoC framework can be used to answer questions about
CoE.
Dawid and co-authors say that the approach by Gelman and Imbens deals with a
population (non-individual) case where the effect is known, and they are asking: “What,
among many possibilities, might be the cause of a general effect in the population?” This
approach assumes we do not know the cause, and it is asking precisely, what is the cause?
Gelman and Imbens use a model checking approach to the Effect of Causes framework.
Dawid and co-authors are instead looking at a specific cause, and asking, “What is the
probability that this was the single cause of the observed effect?”
In Pearl’s opinion, there is a real distinction between EoC and CoE, but the probability
of causation should be calculated using Pearl’s “probability of necessity.” This probability
of necessity captures what Pearl calls the “but for” criterion. That is, according to this
criterion, “judgment in favor of a plaintiff should be made if and only if it is more probable
than not that the damage would not have occurred but for the defendant’s action”.
Dawid et al. (2014) respond to Pearl by saying that his framework “begs too many deep
philosophical questions and offers little guidance as to how any given real problem should
be modeled and how the empirical evidence should be used in the complex real-world
settings of legal disputes and courtroom expert testimony.”
In addition, Dawid and co-authors say that Pearl omitted one of the two implicit
assumptions required for their analysis (that of “sufficiency”, as defined by Dawid and
co-authors), and this renders their analysis invalid. However, if he had used sufficiency,
then their analysis would still be valid. So, it is possible that Pearl does not disagree given
this assumption. This research is quite recent, and it is possible that new arguments are
being currently published.
Because I believe it provides the correct causal reasoning, in this study I adopt the
personalist Bayesian framework set out by Dawid et al. (2014) to provide a structure for
the diagnosis of Shaken Baby Syndrome.
3.2.1 The methodology of EoC and CoE
In this section, I outline the Effects of Causes and Causes of Effects methodology that
is described in greater detail in Dawid et al. (2016). In Section 3.2 I apply this method
to the Shaken Baby Syndrome problem. Before diving into the probabilities of causation,
I illustrate the difference between EoC and CoE by presenting two questions with a
simple example about taking aspirin for a headache from Dawid et al. (2016), which was
inspired by Rubin (1974):
CoE and EoC in the aspirin example:
Effects of Causes (EoC): Ann has a headache. She is wondering whether to take
aspirin. Would that cause her headache to disappear in, say, 30 minutes?
Causes of Effects (CoE): Ann had a headache and took aspirin. Her headache
went away after 30 minutes. Was that caused by the aspirin?
The reader will notice that to answer the EoC question, the exposure can be taken
as given (Ann takes aspirin) and the response is unknown, which makes it a source of
uncertainty. To answer the CoE question, on the other hand, we notice that both the
exposure (Ann takes aspirin) and the response (Ann’s headache goes away) are known, so
there is no uncertainty left coming from these two sources. Then how can we arrive at
an answer to the second question in terms of the probability of causation if there is no
uncertainty? This can be answered with a Causes of Effects analysis.
3.2.2 Calculating probabilities of causation for EoC and CoE
The EoC question can be answered by calculating the difference between two conditional
probabilities. The probability of causation is the probability that Ann has a recovery
(denoted by Response R = 1) given that she took the aspirin (denoted by Exposure E = 1)
minus the probability that Ann has a recovery given that she did not take the aspirin
(denoted by Exposure E = 0). We write this as
EoC probability of causation:
\[ \mathrm{PC} = \Pr(R = 1 \mid E = 1) - \Pr(R = 1 \mid E = 0). \tag{1} \]
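As a minimal sketch of Equation 1, the difference can be estimated from the recovery counts in a two-arm trial. The function name and the counts below are hypothetical and serve only to illustrate the arithmetic; they are not taken from any study.

```python
def eoc_probability_of_causation(recovered_exposed, n_exposed,
                                 recovered_unexposed, n_unexposed):
    """Estimate Pr(R=1 | E=1) - Pr(R=1 | E=0) from trial counts (Equation 1)."""
    if n_exposed <= 0 or n_unexposed <= 0:
        raise ValueError("each arm must contain at least one subject")
    p_exposed = recovered_exposed / n_exposed        # Pr(R=1 | E=1)
    p_unexposed = recovered_unexposed / n_unexposed  # Pr(R=1 | E=0)
    return p_exposed - p_unexposed


# Hypothetical counts: 30 of 100 aspirin takers recover, 12 of 100 non-takers recover.
pc_eoc = eoc_probability_of_causation(30, 100, 12, 100)
print(f"EoC probability of causation: {pc_eoc:.2f}")
```

This quantity describes the population of trial subjects. It does not by itself say whether the aspirin caused Ann's recovery.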
The CoE question can be answered by calculating a single conditional probability, as
described by Dawid et al. (2016). But first, we define the potential outcomes of the
response R.
Definition of R0 and R1 , the potential outcomes of the response:
We define R0 (resp. R1 ), the potential outcome of the response R that will eventuate
if the exposure E equals 0 (resp. E = 1). This potential response exists before E is
determined. So, the disappearance of Ann’s headache is caused by taking the aspirin
only if R1 = 1 and R0 = 0. In other words, Ann’s headache disappears if she takes
the aspirin, but does not disappear if she does not take it. This requirement eliminates the possibility that another cause, say drinking water, caused Ann’s headache
to go away. Note that Ann taking the aspirin (E = 1) and her headache going away
(R1 = 1) are only causally connected if R0 = 0.
Now that we have defined the potential outcomes R0 and R1 , we can define the CoE
probability of causation as the probability that the potential outcome of R if E = 0 equals
0, given all the background knowledge I have about Ann (denoted by H), the fact that
Ann took the aspirin (E = 1), and that the potential outcome of R if E = 1 equals 1. We
write this as
CoE probability of causation:
\[ \mathrm{PC}_A = \operatorname{Pr}_A(R_0 = 0 \mid H, E = 1, R_1 = 1), \tag{2} \]
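where Pr_A denotes my probability distribution over attributes of Ann. Since we can never
observe R0 and R1 on the same individual, we cannot evaluate Equation 2. But we can
bound it,
\[ \min\left\{1, \frac{\operatorname{Pr}_A(R_0 = 0 \mid H, E = 1)}{\operatorname{Pr}_A(R_1 = 1 \mid H, E = 1)}\right\} \ge \mathrm{PC}_A \ge \max\left\{0, 1 - \frac{\operatorname{Pr}_A(R_0 = 1 \mid H, E = 1)}{\operatorname{Pr}_A(R_1 = 1 \mid H, E = 1)}\right\}, \tag{3} \]
where the ratio found in the lower bound is the inverse of the risk ratio for Ann,
\[ \mathrm{RR}_A := \frac{\operatorname{Pr}_A(R_1 = 1 \mid H, E = 1)}{\operatorname{Pr}_A(R_0 = 1 \mid H, E = 1)}. \tag{4} \]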
If RR_A > 2 then PC_A must exceed 50% and if RR_A < 2 we cannot be sure that PC_A
exceeds 50%. The upper bound is more subtle, so for simplicity, Dawid et al. (2016) use
the upper bound of 1, so the bounds reduce to the simplified bounded CoE probability of
causation,
\[ 1 \ge \mathrm{PC}_A \ge \max\left\{0, 1 - \frac{1}{\mathrm{RR}_A}\right\}. \tag{5} \]
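The bounds in Equations 3 and 5 can be sketched as follows. This is only an illustration under stated assumptions: the function names are mine, and the input probabilities are hypothetical values for Ann rather than estimates from data.

```python
def coe_bounds(p_r1, p_r0):
    """Bounds of Equation 3 on the CoE probability of causation.

    p_r1 stands for Pr_A(R1 = 1 | H, E = 1) and p_r0 for Pr_A(R0 = 1 | H, E = 1).
    Returns (lower, upper).
    """
    if not (0.0 < p_r1 <= 1.0 and 0.0 <= p_r0 <= 1.0):
        raise ValueError("need 0 < p_r1 <= 1 and 0 <= p_r0 <= 1")
    lower = max(0.0, 1.0 - p_r0 / p_r1)    # 1 - 1/RR_A, floored at 0
    upper = min(1.0, (1.0 - p_r0) / p_r1)  # Pr(R0 = 0) / Pr(R1 = 1), capped at 1
    return lower, upper


def simplified_lower_bound(risk_ratio):
    """Lower bound of Equation 5, which keeps the trivial upper bound of 1."""
    if risk_ratio <= 0.0:
        raise ValueError("the risk ratio must be positive")
    return max(0.0, 1.0 - 1.0 / risk_ratio)


# Hypothetical inputs: the upper bound is informative only in the second case.
print(coe_bounds(0.30, 0.12))
print(coe_bounds(0.90, 0.30))
# A risk ratio of exactly 2 puts the lower bound at 50%.
print(simplified_lower_bound(2.0))
```

The second call shows why the upper bound is described as more subtle: when recovery is common among the exposed, the ratio in the upper bound can fall below 1.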
We can further simplify the risk ratio by assuming that my background knowledge H
of Ann is sufficiently detailed, so at the point before Ann has decided whether to take
the aspirin my uncertainty, conditional on H, about the way her treatment decision E
will be made does not further depend on the potential responses (R0 , R1 ). The sufficiency
assumption is thus
\[ (R_0, R_1) \perp\!\!\!\perp_A E \mid H, \tag{6} \]
where \perp\!\!\!\perp_A denotes conditional independence in my distribution P_A for Ann’s characteristics.
Note that if I cannot make the sufficiency assumption, then I cannot replace the counterfactual denominator of RR_A by something that I can estimate from the data. Under the
sufficiency assumption, Equation 2 becomes
\[ \mathrm{PC}_A = \operatorname{Pr}_A(R_0 = 0 \mid H, R_1 = 1), \tag{7} \]
and Equation 4 becomes
\[ \mathrm{RR}_A = \frac{\operatorname{Pr}_A(R_1 = 1 \mid H)}{\operatorname{Pr}_A(R_0 = 1 \mid H)}. \tag{8} \]
3.2.3 Using EoC quantities to calculate CoE bounds
It would be very useful to translate from EoC to CoE quantities, and this can be done,
although not trivially. We need to account for the fact that the CoE question deals with
an individual and not a population, as well as that it can be bounded only by certain
quantities. Dawid et al. (2016) list four strong assumptions that are necessary for being
able to use EoC quantities to calculate the CoE bounds.
Assumption 1:
Conditional on my knowledge of the pre-treatment characteristics of Ann and the trial
subjects, I regard Ann’s potential responses as exchangeable with those of the treated
subjects having characteristics H.
Assumption 2:
Conditional on my knowledge of the pre-treatment characteristics of Ann and the trial
subjects, I regard Ann’s potential responses as exchangeable with those of the untreated
subjects having characteristics H.
Assumption 3:
• H is exogenous.
• H is sufficient for Ann’s response.
• Conditional on H, Ann’s potential responses are exchangeable with those of the trial
subjects.
Assumption 4:
The sufficiency assumption in Equation 6 holds.
Under these assumptions, the CoE probability of causation in Equation 2 becomes
\[ \mathrm{PC}_A = \Pr(R_0 = 0 \mid H, R_1 = 1), \tag{9} \]
and the bounds in Equation 3 become
\[ 1 \ge \mathrm{PC}_A \ge \max\left\{0, 1 - \frac{1}{\mathrm{ORR}}\right\}, \tag{10} \]
where ORR is the observable risk ratio, or the population counterpart of RR_A,
\[ \mathrm{ORR} := \frac{\Pr(R = 1 \mid H, E = 1)}{\Pr(R = 1 \mid H, E = 0)}. \tag{11} \]
This ORR is exciting, because it is a quantity that we can actually calculate from data.
Under Assumptions 1-4, these bounds apply to any individual.
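To make the step from data to bounds concrete, here is a minimal sketch that estimates the ORR of Equation 11 from counts of subjects sharing the characteristics H, and then applies Equation 10. The function names and the counts are hypothetical, and the result is meaningful only if Assumptions 1-4 are accepted.

```python
def observable_risk_ratio(resp_exposed, n_exposed, resp_unexposed, n_unexposed):
    """Estimate ORR = Pr(R=1 | H, E=1) / Pr(R=1 | H, E=0) from counts (Equation 11).

    All counts are assumed to come from subjects with characteristics H.
    """
    if n_exposed <= 0 or n_unexposed <= 0:
        raise ValueError("each group must contain at least one subject")
    p_exposed = resp_exposed / n_exposed
    p_unexposed = resp_unexposed / n_unexposed
    if p_unexposed == 0.0:
        raise ValueError("ORR is undefined when no unexposed subject responds")
    return p_exposed / p_unexposed


def pc_bounds_from_orr(orr):
    """Return (lower, upper) for the probability of causation (Equation 10)."""
    if orr <= 0.0:
        return 0.0, 1.0  # no positive association: the bound is uninformative
    return max(0.0, 1.0 - 1.0 / orr), 1.0


# Hypothetical counts for subjects like Ann: 30/100 exposed and 12/100 unexposed recover.
orr = observable_risk_ratio(30, 100, 12, 100)
lower, upper = pc_bounds_from_orr(orr)
print(f"ORR = {orr:.2f}, bounds on PC: [{lower:.2f}, {upper:.2f}]")
```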
3.2.4 Accounting for uncertain exposure
A complication arises if we are not certain that Ann took the aspirin. So far we have
assumed we know that E = 1 and R = 1 and do not know whether there is a causal link
between the two. What about the situations in which we observe the response but are not
sure whether the individual was exposed at all? In these cases, we need to multiply the
probability of causation by the probability of exposure, conditioned on the (known) fact
that there was a positive response. Thus, we write the modified probability of causation
from Equation 2 as
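\[ \mathrm{PC}^{*}_A = \mathrm{PC}_A \cdot \operatorname{Pr}_A(E = 1 \mid H, R = 1), \tag{12} \]
and the bounds in Equation 10 as
\[ \Pr(E = 1 \mid H, R = 1) \ge \mathrm{PC}^{*} \ge \max\left\{0, 1 - \frac{\Pr(E = 0 \mid H, R = 1)}{\Pr(E = 0 \mid H)}\right\}. \tag{13} \]
3.2.5 EoC and CoE questions about Shaken Baby Syndrome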
I now rephrase the questions from Section 3.2 as they pertain to cases of Shaken Baby
Syndrome (SBS). The formulation is somewhat different because in the cases above, Ann
took the aspirin herself. In cases of SBS the child cannot “shake himself,” so I chose an
angry father to be the “shaker” for the sake of example. In these cases I take exposure
E = 1 to denote abuse and R = 1 to denote the triad of brain injuries.
CoE and EoC in Shaken Baby Syndrome:
Effects of Causes: A healthy child is crying and has no brain injuries. An angry
father is wondering whether to shake the child. Would that cause the child
to get brain injuries after, say, 30 minutes?
Causes of Effects: A healthy child was crying and had no brain injuries, and then
he was shaken. The child got brain injuries after 30 minutes. Was that caused
by shaking?
Although these questions may sound contrived, the difference between them should not
be dismissed as unnecessary.
In court, we might be interested in performing three different tasks listed below in the
gray box. I will use these three tasks as guidelines to evaluate whether certain arguments
used in court were valid.
Three different questions:
Forecasting: If the child is abused, what is the probability the child will suffer the
triad of brain injuries? i.e. What is Prc (triad of brain injuries|abuse)?
Backcasting: If the child suffers the triad of brain injuries, what is the probability
the child was abused? i.e. What is Prc (abuse|triad of brain injuries)?
Attribution: If the child suffers the triad of brain injuries, what is the probability
that this was caused by abuse? This is the same as the probability of causation
for the Causes of Effects framework. i.e. What is PC*_c (as in Equation 12),
where
\[ \mathrm{PC}^{*}_c = \operatorname{Pr}_c(R_0 = 0 \mid H, R_1 = 1) \cdot \operatorname{Pr}_c(E = 1 \mid H, R = 1)? \tag{14} \]
Here, E is abuse, R is the triad of brain injuries, H denotes all the background
knowledge I have about the child, R0 [resp., R1 ] is the potential value of the response
R that will eventuate if in fact E = 0 [resp., E = 1], and Prc is the probability
distribution over the child’s attributes.
The connection between CoE/EoC and forecasting/backcasting/attribution is the following: Causes of Effects asks the same question as Attribution. So, to answer questions
of attribution one can use the Causes of Effects probability of causation. Effects of Causes
is related to forecasting, but is not the same. The probability of forecasting, P(positive
response | exposure), has a counterfactual probability, P(positive response | no exposure).
If we calculate the probability of forecasting minus the probability of the counterfactual of
forecasting, then we get the Effects of Causes probability of causation:
P(positive response | exposure) − P(positive response | no exposure).
(15)
For legal settings, it is often useful to ask questions of attribution, which you can answer
with the CoE probability of causation.
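The gap between forecasting and backcasting can be made concrete with Bayes' rule, which is the standard way to reverse a conditional probability. The sketch below is only an illustration: the function names are mine, and the three input probabilities are arbitrary placeholders, not estimates for Shaken Baby Syndrome or any other setting.

```python
def backcasting(p_response_given_exposed, p_response_given_unexposed, p_exposed):
    """Pr(exposure | response), obtained from forecasting quantities by Bayes' rule."""
    numerator = p_response_given_exposed * p_exposed
    denominator = numerator + p_response_given_unexposed * (1.0 - p_exposed)
    if denominator == 0.0:
        raise ValueError("the response has probability zero under these inputs")
    return numerator / denominator


def eoc_difference(p_response_given_exposed, p_response_given_unexposed):
    """Equation 15: forecasting probability minus its counterfactual counterpart."""
    return p_response_given_exposed - p_response_given_unexposed


# Arbitrary placeholder probabilities, chosen only to show that the numbers differ.
forecast = 0.5          # Pr(response | exposure)
counterfactual = 0.001  # Pr(response | no exposure)
prior = 0.01            # Pr(exposure)

print(f"forecasting: {forecast:.3f}")
print(f"backcasting: {backcasting(forecast, counterfactual, prior):.3f}")
print(f"EoC (Equation 15): {eoc_difference(forecast, counterfactual):.3f}")
```

None of these three numbers is the attribution quantity in Equation 14, which additionally conditions on the potential outcomes of the individual in question.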
3.3 Examples of asking the wrong causal questions
3.3.1 Example 1: Deaths caused by short falls
In recent years, there have been several SBS cases in the United States (Medill Justice
Project, 2015a) in which the defendant has argued that the impact from a short fall, not
shaking, caused the child’s trauma and consequently death, and therefore that the cause
of death was accidental. In some of these cases, Chadwick et al.’s 2008 paper is invoked,
which states that the annual risk of death resulting from short falls among young children
is 0.48 deaths per million. This answers the question, “How likely is it that a child will die
from a short fall?”
Plunkett (2001) and Moran et al. (2012) also study short fall deaths, but they do not
provide other values for the risk of death due to short falls in children. Plunkett (2001)
lists 13 cases of children who had a short fall and died from the injuries, but does not
provide an estimate of prevalence of deaths due to short falls. Moran et al. (2012) argue
that Chadwick et al.’s quantity is calculated improperly because the database it uses is
biased (in the sense that cases categorized as a death due to a short fall might be listed as
death due to shaking), among other reasons, and thus the correct quantity is likely higher
than 0.48 in a million.
Chadwick et al.’s paper is valuable to attorneys because no other paper estimates the
prevalence of deaths due to short falls in infants. In addition, Chadwick et al. provide
specific definitions of short falls, which helps define the scope of the problem and makes
the estimate more transparent.
3.3.1.1 How is Chadwick et al.’s quantity calculated?
Chadwick et al. selected data from an injury database compiled by the State of California
Department of Health Services called the California Injury Data Online and provided by the
Epidemiology and Prevention for Injury Control Branch (EPIC) for the years 1999–2003
to create an estimate of the mortality rate due to short falls.10
The EPIC database contains information from discharges and death certificates submitted by all California hospitals and county medical examiners, respectively. The authors selected the short-fall deaths by selecting the 20 fall death subcategories (denoted by
ICD-10 codes) and separating them into groups called “short fall,” “long fall,” and “not
applicable.”
The authors selected the relevant cases to include in their analysis by determining which
types of falls recorded in the data counted as short falls. They included the following as
“short falls” in their analysis: fall on same level involving ice and snow, fall on same level
from slipping, tripping, and stumbling, other fall on same level attributable to collision
with, or pushing by, another person, fall while being carried, fall involving wheelchair, fall
involving bed, fall involving chair, fall involving other furniture, and fall on and from stairs
and steps. They excluded the following because they considered them to be “long falls”:
fall on and from ladder, fall on and from scaffolding, fall from, out of, or through building
or structure, fall from tree, fall from cliff, and other fall from one level to another. They
excluded the following because they considered them “not applicable”: fall involving ice
skates, skis, roller skates, or skateboards, fall involving playground equipment, diving or
jumping into water causing injury, and unspecified fall.
Footnote 10: Today, the data can be found online at
http://epicenter.cdph.ca.gov/ReportMenus/InjuryDataByTopic.aspx. Last accessed on October 20, 2015.
Chadwick et al. found that there were at most 13 short-fall deaths in the population of
2.5 million California children who were less than 5 years of age. Seven of the 13 cases were
dismissed due to coincidence of suffocation, falling from a two-story window that was too
high for the criterion of “short fall”, falling from an undetermined height, falling onto rocks
in the arms of an adult, and crush injuries from heavy furniture falling. The six remaining
cases were considered possibly valid short-fall deaths, which yielded the calculation,
\[ \frac{\text{Number of infants who have died from a short fall in California in a year}}{\text{Number of all infants in California in a year}} = \frac{6 \text{ cases} / 2.5 \text{ million children}}{5 \text{ years}} = \frac{0.48 \text{ cases} / 1 \text{ million children}}{\text{year}}. \tag{16} \]
This can be written as the conditional probability
Pr( Individual had a short fall and died | Individual is an infant in California ).
(17)
Chadwick et al. also noted that since some of the short fall histories in the set were
“incorrect,” the true incidence of short-fall deaths is likely less than 0.48 cases per 1 million
children.
I repeated the analysis by Chadwick et al. by using the updated EPIC database from
1999–2013, which now has 25 cases (there have been 12 cases in the 2004–2013 period in
addition to the 13 cases from 1999–2003). Chadwick et al.’s procedure yields:
\[ \frac{25 \text{ cases} / 2.5 \text{ million children}}{14 \text{ years}} = \frac{0.71 \text{ cases} / 1 \text{ million children}}{\text{year}}. \tag{18} \]
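The arithmetic behind Equations 16 and 18 can be checked with a few lines. This is a minimal sketch: the function name is mine, and the inputs are the case counts, population size, and numbers of years stated in the text above.

```python
def annual_rate_per_million(cases, population, years):
    """Cases per one million children per year."""
    if population <= 0 or years <= 0:
        raise ValueError("population and years must be positive")
    return cases / population / years * 1_000_000


# Equation 16: 6 cases among 2.5 million children over 5 years.
rate_1999_2003 = annual_rate_per_million(6, 2_500_000, 5)
# Equation 18: 25 cases among 2.5 million children over 14 years, as used in the text.
rate_updated = annual_rate_per_million(25, 2_500_000, 14)

print(f"{rate_1999_2003:.2f} cases per million children per year")
print(f"{rate_updated:.2f} cases per million children per year")
```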
This shows that, indeed, the fatality codes for short falls are used very rarely in infants.
Chadwick et al. also reviewed other sources to check whether they could be used to calculate
an estimate of the incidence of death from short falls, including the Consumer Product
Safety Commission Data, five studies of “multiply and reliably witnessed falls, 25 studies
of child care-related injuries, 12 studies using biomechanical analyses, over 50 studies of
large clinical populations, seven studies comparing abusive and unintentional injuries, and
others.” In addition to using the EPIC data, the authors proposed another analysis, by
using the Consumer Product Safety Commission Data, called the National Electronic Injury
Surveillance System (NEISS), to provide another measure of incidence, which is that 0.625
cases per 1 million young children per year die from short falls. However, the authors
say that these data might not be “sufficiently reliable for the purposes of estimating the
incidence because they do not include information about violence leading to the falls and
due to the nature of the data they may miss deaths resulting from short falls that are not
involved with products.”
Chadwick et al. finally settle on the quantity of 0.48 cases per 1 million children per year
as an upper bound on the rate of short-fall deaths in children 0–5 years of age (see footnote 11).
Footnote 11: Chadwick et al.’s quantity is specific to the state of California, for which the data were readily
available. They do not address the question of how this quantity might translate to the entire United
States, or to other individual states. But that is out of the scope of this paper.
3.3.1.2 How is Chadwick et al.’s quantity used in court?
Prosecutors have used Chadwick’s quantity to make the argument that since the probability
that a child will die from a short fall is so low (0.48 in 1 million), it must be that the
defendant has an alibi that is extremely unlikely, so the probability the child died from
shaking is very high. Therefore, the defendant is guilty of child abuse and possibly murder
(see Figure 1).
[Figure 1 (flow diagram): “Short falls almost never cause deaths” → “Defendant is relying on an argument that is virtually impossible” → “Defendant must have abused the child”]
Figure 1: Argument used by prosecutors that uses Chadwick et al.’s quantity to argue
that the defendant must have abused the child.
In terms of probabilities, the prosecutors’ argument is
\[ \Pr(\text{Individual had a short fall and died} \mid \text{Individual is an infant in California}) = 0.000048\% \implies \Pr(\text{Abuse} \mid \text{Injuries and death}) = 1 - 0.000048\% = 99.999952\%. \]
Although it is not
mentioned explicitly, the prosecution assumes that the proportion calculated for California
is the same proportion for the United States, which is probably not true, but I will not focus
on this in this paper because there is a larger problem with the argument. This argument
has been made numerous times in court (People vs. Bailey, 2014; State of Florida vs.
Kareem Daniel Farrell, 2013; Cathy Lynn Henderson Hearing, 2009; State of Florida vs.
Ramgoolie, 2014; State of Wisconsin vs. Patrick L. Donley, 2014).
More explicitly, the National Center for Prosecution of Child Abuse from the National
District Attorneys Association published a manual for prosecutors titled “Overcoming Defense Expert Testimony in Abusive Head Trauma Cases” (Odom et al., 2010). This manual
says that Chadwick et al.’s study “confirms that short falls do not result in death or serious
injury to children.”
Determining whether a specific child with head trauma (who might have died or might
still be alive) was abused is a difficult task. It makes sense, therefore, that prosecutors resort
to using the only quantity available. However, I argue that for the statement, “The child
seemed healthy, and then had a short fall,” the question, “Could this child’s brain injuries
and death have been caused by a short fall?” cannot be answered by using Chadwick et
al.’s quantity.
It seems that the prosecutors’ argument is attempting to address a question of backcasting. If the prosecutors’ argument were addressing backcasting, then the question they
should have asked for backcasting (which is still not the correct one to ask in trial) is,
Backcasting question:
“If the child suffers the triad of brain injuries and death, what is the probability the
child had a short fall?”
But in fact, the prosecutors’ argument does not address backcasting or either of the
other two tasks from the gray box in Section 3.2: forecasting or attribution.
3.3.1.3 The proper way to address short fall deaths
The proper way to address this issue is by treating it as a question of attribution, as shown
in the gray box in Section 3.2 (see footnote 12). The proper question to ask is,
Attribution question:
“The child was healthy and had a short fall. The child suffered the triad of brain
injuries and death. Was that caused by the short fall?”
So, we need to use the Causes of Effects probability of causation for the child adjusted
by the probability of exposure (which is a version of Equation 21),
\[ \mathrm{PC}^{*}_c = \operatorname{Pr}_c(R_0 = 0 \mid H, R_1 = 1) \cdot \operatorname{Pr}_c(E = 1 \mid H, R = 1), \tag{19} \]
where E denotes whether the child had a short fall, R denotes whether the child suffered brain
injuries and died, H denotes all the background knowledge I have about the child, and R0
[resp., R1 ] is the potential value of the response R that will eventuate if in fact E = 0
[resp., E = 1].
However, it is possible that the child got the brain injuries and death from a different
non-abusive cause that was not a short fall. Therefore, the quantity that could actually
be helpful in court to determine whether the child was abused is the same as above, but
instead of E denoting whether the child had a short fall, E denotes whether the child was
abused. That is, we need to answer the question:
Attribution question:
“If the child suffers the triad of brain injuries and death, what is the probability
that this was caused by abuse?”
Footnote 12: The reader might think that using a likelihood ratio might be the correct way to approach this
problem. A likelihood ratio could compare the probability that the child was shaken given the child’s
injuries to the probability that the child was not shaken given the child’s injuries. However, this would be
answering a question of backcasting rather than attribution.
We can arrive at a bound for PC*_c only if we can make all the assumptions in the gray
box in Section 3.2, by using Equation 13, copied here for the reader’s convenience,
\[ \Pr(E = 1 \mid H, R = 1) \ge \mathrm{PC}^{*} \ge \max\left\{0, 1 - \frac{\Pr(E = 0 \mid H, R = 1)}{\Pr(E = 0 \mid H)}\right\}, \tag{20} \]
where E denotes whether the child was abused, R denotes whether the child got brain
injuries and died, H denotes all the background knowledge I have about the child, R0
[resp., R1 ] is the potential value of the response R that will eventuate if in fact E = 0
[resp., E = 1].
• To calculate Pr(E = 0|H) we could look for data on infants who have very similar
demographic and medical characteristics as the child in question (denoted by H),
and find the ones out of those who were not abused.
• To calculate Pr(E = 0|H, R = 1) we could look for data on infants who have very
similar demographic and medical characteristics as the child in question and had
injuries and death13 , and find the ones out of those who were not abused.
• To calculate Pr(E = 1|H, R = 1) we could look for data on infants who have very
similar demographic and medical characteristics as the child in question and had
injuries and death, and find the ones out of those who were abused.
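As a minimal sketch of how the three quantities in the bullets above would feed into the bounds in Equation 20, the following Python snippet estimates each probability as a simple proportion and then forms the interval. The function name and all counts are hypothetical, chosen only for illustration; they are not taken from any database, and the resulting interval is meaningful only under the assumptions in Section 3.2 and only if the abuse variable itself is trustworthy.

```python
def pc_star_bounds(n_h, n_h_not_abused, n_hr, n_hr_not_abused):
    """Bounds on PC* as in Equation 20, estimated from raw counts.

    n_h:             infants matching the background characteristics H
    n_h_not_abused:  of those, infants who were not abused (E = 0)
    n_hr:            infants matching H who had the injuries and died (R = 1)
    n_hr_not_abused: of those, infants who were not abused (E = 0)
    """
    p_e0_h = n_h_not_abused / n_h        # Pr(E = 0 | H)
    p_e0_hr = n_hr_not_abused / n_hr     # Pr(E = 0 | H, R = 1)
    p_e1_hr = 1 - p_e0_hr                # Pr(E = 1 | H, R = 1)
    lower = max(0.0, 1 - p_e0_hr / p_e0_h)
    upper = p_e1_hr
    return lower, upper


# Hypothetical counts, for illustration only.
lower, upper = pc_star_bounds(
    n_h=10_000, n_h_not_abused=9_900, n_hr=50, n_hr_not_abused=30
)
print(f"{lower:.3f} <= PC* <= {upper:.3f}")
```

Because Pr(E = 0 | H) is at most 1, the lower bound can never exceed the upper bound, so the function always returns a valid (possibly very narrow) interval.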
It seems that we could proceed to run a numerical analysis to estimate an interval for
the probability of causation. But I will stop here because determining whether a child was
abused is a nontrivial task, and not because of lack of data. There are databases that
include all this information (see Section 4.3.1), but the determination of abuse is based
partially on confessions, which have been shown to be unreliable for a variety of reasons
(Leo and Ofshe, 1998), and circular reasoning from the variables about the physical injuries.
True reports by witnesses are extremely rare, and it is impossible to tell whether the reports
are true or not.
3.3.2 Example 2: A statistical model to predict abuse
Due in part to the “battle of the experts” mentioned earlier, UK Doctor Sabine Maguire
and her research team saw the need for a scientific tool that could help make the diagnosis
of Abusive Head Trauma more objective. They thus designed a tool called PredAHT that
they claim can help the medical practitioner decide whether the case deals with abuse or
accident in cases where children have brain injuries. This tool was described in a paper
published in August 2015, although two earlier papers helped develop the theory used to
generate the tool.
13
The bolded words show the difference between this statement and the previous one.
3.3.2.1 How does Maguire et al.’s model work?
The way the tool works is (heuristically) as follows: A practitioner first determines whether,
in addition to brain injury, an infant has any of the following clinical indicators: apnea,
retinal hemorrhages, rib, skull, and long-bone fractures, seizures, and head and/or neck
bruising. Then the practitioner can input these values into the tool, and the tool will output the probabilities that the specific infant has abusive head trauma versus non-abusive
head trauma, along with some level of uncertainty. At first glance this seems like a gold
mine for the diagnosis of Abusive Head Trauma because the tool seemingly allows practitioners to make decisions about a controversial diagnosis with an objective and scientific
basis.
Maguire et al. first made a model in 2009 that predicted whether an individual had
abusive or non-abusive head trauma based on individual clinical features, then in 2011 they
generated a similar model that could analyze combinations of the features, and finally in
2015 they tested this model with a new dataset.
3.3.2.2 What do Maguire et al.’s papers say specifically?
In the first paper (Maguire et al., 2009) the authors performed a key word search in the
literature to select their data sources. After performing an all-language search of key
terms (although they did not describe how they selected these key terms) in 20 databases
and other sources, medical professionals who were specifically trained for this purpose
reviewed 320 studies twice. They finally selected 14 studies according to specific selection
criteria. The criteria included that the study had to have both infants with inflicted
and non-inflicted brain injury. They then determined which individual clinical features are
indicative of abusive versus non-abusive head trauma. They did this by running a multilevel
logistic regression to find the positive predictive value and odds ratios for individual clinical
indicators (n=1,655 children, 779 with inflicted brain injury).
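To make the two summary quantities concrete, the sketch below computes a positive predictive value and an odds ratio for a single clinical indicator from a 2×2 table. This is a deliberate simplification: Maguire et al. used a multilevel logistic regression across studies, whereas this example uses one pooled table. The function name and the counts are hypothetical and are not taken from their data.

```python
def ppv_and_odds_ratio(abuse_with, abuse_without, nonabuse_with, nonabuse_without):
    """PPV and odds ratio for one binary indicator (hypothetical 2x2 table).

    abuse_with / abuse_without:       cases classified as abusive, with / without the indicator
    nonabuse_with / nonabuse_without: cases classified as non-abusive, with / without the indicator
    """
    # PPV: among children with the indicator, the share classified as abused.
    ppv = abuse_with / (abuse_with + nonabuse_with)
    # Odds ratio: odds of the indicator among abusive versus non-abusive cases.
    odds_ratio = (abuse_with * nonabuse_without) / (abuse_without * nonabuse_with)
    return ppv, odds_ratio


# Hypothetical counts, for illustration only.
ppv, odds_ratio = ppv_and_odds_ratio(80, 120, 20, 280)
print(f"PPV = {ppv:.2f}, odds ratio = {odds_ratio:.2f}")
```

Both quantities are computed from the recorded abusive/non-abusive classification, so any error or bias in that classification carries over directly into them.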
In the second paper (Maguire et al., 2011) Maguire et al. used six data sets from
previous studies that had collections of cases of infants with some of the relevant clinical
indicators for AHT and non-AHT in children younger than three years of age (a total of
1053 infants). Again, they ran a multilevel logistic regression, but this time they identified
combinations of the clinical features indicative of abusive and non-abusive head trauma
(n=1,053 children, 348 with abusive head trauma).
In the third paper (Cowley et al., 2015) they used two data sources: one containing
patients from Cardiff University School of Medicine in the UK and another with patients
from Lille University Hospital in France. They identified the individuals at the former
by searching in the database of children undergoing neuroimaging, a Pediatric Intensive
Care database, and a Child Protection database. M. Vinchon and colleagues from Lille
University Hospital provided the latter sample as anonymized cases, which were obtained
in a large-scale prospective study of abusive and accidental head injuries in infants. The
American Academy of Pediatrics recently posted a summary of this third paper on their
website under the title “Research Validates New Prediction Tool for Shaken Baby Syndrome
(Abusive Head Trauma)”.14
There are several problems with the papers by Maguire et al. (2009), Maguire et al.
(2011), and Cowley et al. (2015): bias in the determination of the outcome variable, sample
selection, imputation of missing values, and the omission of covariates in the final regression
of the 2011 study. I describe the bias in the determination of the outcome variable in section
4.3.2 and discuss the other problems in the appendix.
3.3.2.3 The proper way to address the statistical predictive models
Maguire et al. (2009), Maguire et al. (2011), and Cowley et al. (2015) argue that by
generating their model, they “contribute to a more refined tool to inform clinical decisions
about the likelihood of Abusive Head Trauma [Shaken Baby Syndrome].” It may seem
that the authors are attempting to address a question of backcasting. If that were the
case, then the question they were attempting to answer is,
Backcasting question:
“If the child suffers a specific set of injuries, what is the probability the child had
Abusive Head Trauma/Shaken Baby Syndrome?”
i.e. What is Prc (AHT/SBS | specific set of injuries)? According to the authors, the set of
specific injuries is: apnea, retinal hemorrhages, rib fractures, long bone fractures, bruising
to the head and/or neck, seizures, skull fractures.
But in fact, if the authors wanted to provide a tool to help with diagnosis, they should
be focusing on the question of attribution. In other words, they should be asking the
question,
Attribution question:
“If the child suffers a specific set of injuries, what is the probability that this was
caused by abuse?”
i.e. What is $\mathrm{PC}^*_c$ (as defined in Equation 21)? Now, to calculate a bound on $\mathrm{PC}^*_c$ one would
have to make some strong assumptions (see Section 3.2) and have adequate data. In this
study, however, the data have some serious problems.
The authors assume that the abuse variable in the data is correct, as long as it follows
criteria 1 and 2. This is problematic because it is possible that the outcome variable was
defined precisely by the combination of clinical features, either at the level of the physician
14
The American Academy of Pediatrics posting can be found online at
https://www.aap.org/en-us/about-the-aap/aap-press-room/pages/Research-Validates-New-Prediction-Tool-for-Shaken-Baby-Syndrome-(Abusive-Head-Trauma).aspx. Last checked: March 22, 2016.
or of the interdisciplinary team. So, there is an issue of circularity. In addition, the
other ways of determining abuse include confessions by the perpetrator, but there are many
documented false confessions. So, in my opinion, there is no way to properly calculate $\mathrm{PC}^*_c$
with these databases.
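The circularity concern can be illustrated with a small simulation. The sketch below is an extreme, entirely hypothetical scenario, not a description of the actual databases: the clinical features are generated independently of true abuse, and the recorded outcome is assigned from the features alone. The base rate, feature frequencies, and labelling rule are all invented for illustration.

```python
import random

random.seed(0)  # fixed seed so the sketch is reproducible

cases = []
for _ in range(10_000):
    truly_abused = random.random() < 0.2   # hypothetical base rate of abuse
    retinal_hem = random.random() < 0.3    # features drawn independently of abuse
    rib_fracture = random.random() < 0.3
    # Circular labelling: the recorded outcome is decided from the features alone.
    labelled_abused = retinal_hem and rib_fracture
    cases.append((truly_abused, retinal_hem, rib_fracture, labelled_abused))

# Restrict to cases that show both features.
both = [c for c in cases if c[1] and c[2]]
share_labelled = sum(c[3] for c in both) / len(both)
share_truly = sum(c[0] for c in both) / len(both)

print(f"Pr(labelled abused | both features) = {share_labelled:.2f}")
print(f"Pr(truly abused | both features)    = {share_truly:.2f}")
```

In this setup, any model fit to the recorded label would report that the two features predict abuse perfectly, even though by construction they carry no information about true abuse. Real data would fall somewhere between this extreme and an unbiased outcome variable, and the databases alone cannot tell us where.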
3.3.2.4 A note on the difference between clinical and statistical diagnoses
Some physicians have noted that a statistical model like the one proposed by Maguire
et al. (2009), Maguire et al. (2011), and Cowley et al. (2015) is inadequate for reasons
other than the EoC and CoE distinction. Some argue that clinicians will always be better
at making diagnoses for individual cases for two reasons. First, they have years of experience
in diagnosing these cases. Second, they are able to account for unforeseen details that
a statistical model would never be flexible enough to capture.15
British neurosurgeon Norman Guthkelch, who wrote the first paper on Shaken Baby
Syndrome, suggested in his 2012 paper that it is also important not to rely too heavily on
statistics. He said that it might be attractive to use statistics when seeking answers about
the syndrome. However, he warned that
Statistics are helpful when we are dealing with relationships between well-defined populations, but this stage has not been achieved in the study of SBS/AHT. Instead, cases involving retino-dural hemorrhage of infancy encompass
varying age groups, genetic characteristics, underlying conditions, and potential causes, including birth injuries, dehydration, metabolic disorders, illness
and seizure disorders.
Guthkelch and the child abuse specialist from the Children’s Hospital of Pittsburgh
argue that since there are so many different types of individuals with the symptoms, a
database could never capture all the possible variables that could be causing the symptoms.
For example, what if the cause of the triad of symptoms is exposure to radioactivity, and
this was not included in the database? Then a statistical model would miss this, and it
would likely attribute the cause to something else.
These are not the reasons that we need to be cautious when using statistics. We need
to be cautious because it is easy to misinterpret the results of a statistical model. This is
especially true if data are available and the model can be run easily with some piece of
software. It is important to understand the data generation process and justify why the
data and the model are answering the question at hand.
Statistics can be used to study a disease that has different types of individuals. As
long as there is a representative survey of the population, with detailed information, a
small proportion of values that are missing at random, and an objective determination of
15
I obtained this information from personal communication with a pediatrician and child abuse
specialist from the Children’s Hospital of Pittsburgh.
the clinical features, then a statistical model can reveal two things. First, it can reveal if
any of the variables already included in the database occur frequently with the triad of
symptoms. Second, it can reveal if there is another latent variable that is not included in
the database that occurs frequently with the triad. Then further analysis can reveal more
information about this latent variable.
Meehl (1954) wrote about the difference between statistical and clinical prediction in
the field of psychology. He analyzed the claim that mechanical (formal, algorithmic, statistical) methods of data combination outperform clinical (e.g. subjective, informal, “in
the head”) methods when such combinations are used to arrive at a prediction of behavior.
Meehl (1954) argued that mechanical methods of prediction would, used correctly, make
more efficient decisions about patients’ prognosis and treatment. Meehl noted that
mechanical prediction tools often incorporate clinical judgments, properly coded, in their
predictions. The defining characteristic is that, once the data to be combined are given, the
mechanical tool will make a prediction that is always reliable. That is, it will make exactly
the same prediction for exactly the same data every time. Clinical prediction, on the other
hand, does not guarantee this.
Making explicit the differences between clinical and statistical predictions in Shaken
Baby Syndrome could be useful because it might make researchers more aware of how
diagnoses are made in clinical settings, and it may make physicians more willing to cooperate with developing adequate statistical models. After this short tangent, I turn to cases
outside Shaken Baby Syndrome that have similar problems.
4 Biased evidence
4.1 Problem: Multiple sources of bias in Shaken Baby Syndrome
In cases related to Shaken Baby Syndrome, there are often multiple sources of contextual
bias. In the trial of Trudy Muñoz (Section 2.1) I mentioned several problems: testimonies
from individuals could be distorted by the information given in court, Muñoz’s testimonies could have been misinterpreted because of her basic level of English, and the
physician’s diagnosis at the hospital could have been biased by the similarity of this case
to other cases that were considered to be Shaken Baby Syndrome. These are just a few
examples of possible sources of contextual bias in these cases.
Contextual bias is present when “judgment is influenced by information irrelevant or
inappropriate to the task.”16 The National Academies 2009 Report (Committee on Identifying the Needs of the Forensic Science Community, 2009) stated that forensic science
experts are vulnerable to cognitive and contextual bias, which “renders experts vulnerable
to making erroneous identifications.” The authors of this report recommend that forensic
16
William Thompson, SAMSI conference, August 2015.
http://www.samsi.info/sites/default/files/Thompson august2015.pdf Last accessed: April 2, 2016.
science develop “rigorous protocols to guide these subjective interpretations, but
so far there is no evidence that the community has made a sufficient effort to address this
issue.” The report cites the work of Dror et al. (2006) and several other researchers to
demonstrate that there is extensive contextual bias in forensic evidence in diverse fields
such as document examination, fingerprint interpretation, crime scene analysis, bite mark
analysis, DNA interpretation, blood spatter analysis, and forensic anthropology.
Since cases of Shaken Baby Syndrome specifically, and child abuse in general, have very
little evidence outside of the medical diagnosis determined by the physician, this leaves
ample room for contextual bias. As per the recommendation of the National Academies,
it is important to make an effort to address this issue.
4.2 A solution: Blinding, i.e. using task-relevant information
Using only task-relevant information (sometimes referred to as “blinding”) is a solution that
was recently recommended by the United States government. The National Commission on
Forensic Science of the US Department of Justice recently wrote a report called “Ensuring
That Forensic Analysis Is Based Upon Task-Relevant Information” (McCormack et al.,
2015). This report describes the proper evidentiary basis for a forensic science
opinion. The goal of the report is to answer the question, “What facts should forensic
science service providers (FSSPs) consider and what facts should they not consider when
drawing conclusions from physical evidence?” Their answer is that FSSPs should rely
solely on task-relevant information when performing forensic analyses.
For example, when an FSSP is given the task of determining whether a fingerprint
that was found at the crime scene matches a fingerprint in a database, the FSSP should
have access only to the two fingerprints. The FSSP could also have access to other task-relevant information, such as the material from which the prints were lifted, as this might
help determine the distortion in the image. The FSSP should not have access to the
name, race, age, or other demographic information of the individual whose fingerprint is in
the database, the location of the crime scene, or any other contextual information.
In cases of Shaken Baby Syndrome it would be ideal if the individual determining the
diagnosis (for live infants) or cause or manner of death (for dead infants) had access only
to the physical information, that is, the findings from examining the child’s body, the results
of the CT scan and MRI, the results of the ophthalmological exam, and tests for signs
of other diseases. Even names should be blinded because names can be predictive of race.
However, as I show in the sections below, it is not so simple.
4.2.1 The person who diagnoses Shaken Baby Syndrome
Unfortunately, diagnosis of Shaken Baby Syndrome varies both by state and by the
circumstances in which the child’s injuries are evaluated. This makes blinding difficult.
In cases of Shaken Baby Syndrome a child can go to the hospital and live, go to the
hospital and die, die before arriving at the hospital, or die after arriving at the hospital.
If the child lives, then the physician who sees the child is responsible for providing principal and secondary diagnoses. If a death is sudden, unexpected, or results from violence
(this includes the children who die before or after going to the hospital), a medicolegal
investigator (e.g. coroner, medical examiner17, forensic pathologist, physician’s assistant)
is responsible for determining whether a homicide, suicide, or accident occurred and certifying the cause and manner of death (Committee on Identifying the Needs of the Forensic
Science Community, 2009).
Medicolegal investigators are generally given the task of determining both the cause
and manner of death. In fact, both cause and manner of death are fields in the death
certificate that they must fill out. The cause of death is that which produces the fatality
and without which the end result would not have occurred (e.g. a heart attack, a gunshot
wound, or a skull fracture). The manner of death explains how the cause of death arose,
and whether it was a natural (disease) or non-natural death (accident, homicide, suicide,
or undetermined). The manner of death requires knowing the contextual evidence about
the case.
If an individual is shot in the head, the manner of death requires determining who shot
the gun and the intention of the shooter. If the individual himself shot the gun, did
he mean to do it (suicide) or did he not (accident)? The cause of death does not require
knowing the contextual evidence. The cause of death in this case would be “gunshot
wound.”
But, if one individual is in charge of determining both the cause and manner of death,
it is possible that he or she will have access to the contextual information throughout the
analysis, and it is possible that this can bias the results of the analysis. This is a problem
with medicolegal investigators in general: They need only some information to determine
the cause of death, and they need all the information available to determine the manner of
death. But blinding themselves to some of the information (after seeing it) to determine
the cause of death is difficult or impossible to do, even if they have the intention to do so
(Thompson et al., 2013).
In cases of Shaken Baby Syndrome in which the child died after going to the hospital,
there is not only the problem of bias in the cause of death from knowing the contextual
information, but also the bias from the additional input given by the physician. If a
physician has determined that the child has Shaken Baby Syndrome, he or she will most
likely communicate with the medicolegal investigator.18 So the physician’s opinion can
easily bias the medical examiner’s or coroner’s opinion.
17
The National Academy report (Committee on Identifying the Needs of the Forensic Science
Community, 2009) suggests that since coroners do not necessarily have medical training, they should be
replaced by medical practitioners. But this issue is outside the scope of my study.
18
I obtained the information about the medicolegal investigators speaking with the physicians by
means of personal telephone communication with Dr. Karl Williams, the medical examiner of Allegheny
County, on April 2, 2016.
4.2.2 Difficulties with blinding in the diagnosis of Shaken Baby Syndrome
Physicians are usually expected to have as much information as possible about their
patients. This includes, for example, who brought the child into the hospital; that person’s criminal
history, race, and place of origin; the police’s opinion; the social worker’s opinion;
and the reported situation in which the child started exhibiting changes. With this
information, in addition to the medical findings, the physician makes a diagnosis. This
diagnosis likely suffers from contextual bias, since it was made by including task-irrelevant
information.
But how can a physician blind himself or herself to the task-irrelevant information?
This would go against the protocols that physicians follow to diagnose all other pathologies.
Thus, this problem cannot be solved completely unless there is a change in the definition of
the diagnosis since the very diagnosis conflates the manner and cause! One cannot remove
the task-irrelevant information in these cases unless there is a change in the definition of
the syndrome or in the physician’s task.
Norman Guthkelch, who wrote the first paper on Shaken Baby Syndrome, wrote a new
paper in 2012 recognizing this problem of the syndrome’s name. He pointed out that other
syndromes are named either after their discoverer or for a prominent clinical feature, but in
contrast, Shaken Baby Syndrome implies both the mechanism and the intent. So, he
suggested that the elements of the classic triad of symptoms would be better defined in
terms of their medical features. Then, if necessary, other medical findings can be added to
the name.
4.3 Examples of biased evidence
4.3.1 Example 1: Bias in data about Shaken Baby Syndrome
The data about Shaken Baby Syndrome also suffer from contextual bias. They contain
information that states whether the child was abused or had an accident. This can be seen
as a variable for the diagnosis that says the child had Shaken Baby Syndrome, or as a
variable that states directly whether the child was abused. The bias (and also circularity)
is present in the determination of this variable.
Data sources that could be used to study Shaken Baby Syndrome
Some sources of data that have been used to study Shaken Baby Syndrome are hospital
emergency department data (such as the National Hospital Ambulatory Medical Care Survey and the State Emergency Department Data), hospital inpatient discharge data (such
as the Healthcare Cost and Utilization Project Nationwide Inpatient Sample, the Healthcare Cost and Utilization Project Kids Inpatient Database, and the National Hospital
Discharge Survey), and fatal data sources with vital statistics data (such as the National
Vital Statistics System and the California Epidemiology and Prevention for Injury Control
Branch (EPIC) database). There are also private datasets that are collected by physicians
in hospitals, such as the data used in Maguire et al. (2011).
4.3.2 Example 2: Bias in the outcome variable in a predictive model
In the Maguire papers discussed earlier (see Section 3.3.2), the authors did not address
the potential biases in the classification of the outcome variable, which denotes whether
the infant has abusive or non-abusive head trauma. The authors acknowledged that there
could be circularity in the decision of abusive versus non-abusive head trauma, so they
attempted to minimize it by excluding studies in which “the decision of abuse had relied
solely on clinical features.” They also implemented their “ranking of abuse”, which states
that an infant is abused if he or she “had an outcome confirmed by multi-agency child
protection teams, legal decision, witnessed abuse or perpetrator admission” (rank 1-2).
The full ranking of abuse determined by their previously published paper is the following
“Criteria used to define abuse”:
1. Abuse confirmed at case conference or civil, family or criminal court proceedings or
admitted by the perpetrator, or independently witnessed.
2. Abuse confirmed by stated criteria including multi-disciplinary assessment.
3. Diagnosis of abuse defined by stated criteria.
4. Abuse stated as occurring, but no supporting detail given as to how it was determined.
5. Abuse stated simply as ‘suspected’; no details on whether it was confirmed or not.
The practitioners who generated the data had some method of determining whether the
head trauma suffered by each infant was abusive or non-abusive. For the cases of abuse,
they allegedly followed the description in criteria 1 or 2, or both. The following are some
questions that are unanswered in the papers:
• The authors say they exclude studies in which the decision of abuse had relied solely
on clinical features, but is it possible that the interdisciplinary team determined that
it was abuse precisely thanks to those indicators? In this case the circularity would
still be present.
• For the cases in which “abuse was admitted by the perpetrator,” was the defendant
who was being questioned filing for an appeal? That is, is it possible that they
confessed to get out of the long-term sentence or interrogation room?
• In which part of the process was the abuse ranking determined? Was it after the
trial, or during the medical examination? What if the medical examiner’s diagnosis
was questioned in the trial?
• It seems pertinent to have more clarification for the determination of abuse in each
case.
Examples of this kind of bias can be found in other datasets, such as the ones mentioned
in Section 4.3.1.
4.3.3 Example 3: Racial bias in convictions of Shaken Baby Syndrome
Researchers say that race is a risk factor for Shaken Baby Syndrome,19 as it appears that
African American/Black children are more prone to Shaken Baby Syndrome. Indeed,
researchers have shown that African American/Black children are overrepresented among
children investigated for child abuse and neglect (Lanier et al., 2014). Some researchers
believe that the issue is poverty, not race.20
There are no reliable sources that provide the race distribution of individuals who have
been convicted due to child abuse related to Shaken Baby Syndrome. Since Shaken Baby
Syndrome is not itself a criminal charge, it would be necessary to review the file of every
person who is in prison due to child abuse to determine whether the case was related to the
syndrome. It would also be possible to review a random sample of these, but in order to get
a random sample, it would be necessary to have a sampling frame (i.e. a list) of all the cases
of this type. In addition, a possibly large proportion of the individuals who are accused
of child abuse plead guilty, so they get a shorter sentence in prison and perhaps less time
on the child abuse registry. I am not aware of any unified dataset
(sample or census) that could allow one to make inferences about the racial distribution of
individuals who have been found guilty of committing child abuse related to Shaken Baby
Syndrome.
5 These problems occur outside of Shaken Baby Syndrome
Shaken Baby Syndrome is not the only field that has these problems with bias and asking
the wrong causal questions. There is a whole collection of legal cases that are similar for
a variety of reasons. Below I mention the Canadian Motherisk controversy and the paper
by Nicky Best et al.
5.1 Asking the wrong causal questions in the Motherisk controversy
There has been a recent controversy over a facility in Canada that was testing hair strands
for traces of drugs and alcohol. The Motherisk facility has been reviewed independently by
Justice Susan Lang, a judge of the Court of Appeal for Ontario. Lang (2015) stated that
19
For more information about racial distribution of Shaken Baby Syndrome, the post titled
“Substantiated Cases of Child Abuse and Neglect, by Race/Ethnicity” is online at
http://www.kidsdata.org/topic/7/childabuse-cases-race/table#fmt=2323&loc=2,127,347,1763,331,348,
336,171,321,345,357,332,324,369,358,362,360,337,327,364,356,217,353,328,354,323,352,320,339,334,365,
343,330,367,344,355,366,368,265,349,361,4,273,59,370,326,333,322,341,338,350,342,329,325,359,351,363,
340,335&tf=79&ch=7,11,8,10,9&sortColumnId=0&sortType=asc. Last accessed: April 2, 2016.
20
Two articles about poverty being a cause of child abuse: http://wolterskluwer.com/company/
newsroom/news/health/2014/09/poverty-not-bias-explains-racial-ethnic-differences-in-child-abuse.html,
http://www.theroot.com/articles/culture/2011/03/
black childabuse statistics report debunks bias assumptions.html. Last accessed: April 8, 2016.
“the hair-strand drug and alcohol testing used by the Motherisk Drug Testing Laboratory
between 2005 and April 2015 was inadequate and unreliable for use in child protection
and criminal proceedings and that the Laboratory did not meet internationally recognized
forensic standards.” Thus, the report states, the evidence from the Motherisk facility has
serious implications for the fairness of the criminal proceedings. This has led to a new
effort that will review the 16,000 test results that were provided during that time.
The reason Lang found that the results were unreliable was that the lab was “regularly
relying on enzyme-linked immunosorbent assay (ELISA) test results, and those tests were
meant to be used for screening,” that is, for determining whether there is a drug present in the system
or not. If the ELISA test indicates that a drug is present in the system,
a different test must be carried out to determine the prevalence of the drug. According
to the review, Motherisk communicated all “maybe” results as “positives” to their clients
from child protection agencies, instead of carrying out a follow-up test.
The review also found that Motherisk lab technicians failed to routinely wash hair samples
before analysis, which could alter the results of the tests, and that there were no written records
of any kind. The reviewers also said they found “significant reporting anomalies or errors when they
looked through randomly selected case files.”
Unlike the short falls argument and statistical models used in Shaken Baby Syndrome
(see Sections 3.3.1 and 3.3.2), this task uses individual data to make inferences about
individuals. It seems that the criminal proceedings were using the test results to answer
the question, “If we get a positive result in the ELISA test, what is the probability the
individual was using the drug?” There are other possible reasons for which the ELISA
test was marked as having a positive result. For example, the test gave erroneous results
sometimes, the technician marked an inconclusive result as positive (this can be further
exacerbated by the pressure from attorneys and social workers to hurry the proceedings to
attempt to help a child), or the test is so sensitive that it detected trace amounts of the
drug even if the individual was not using it. To see how prevalent drug traces are in the
U.S., one need only read the research by forensic science academics showing that
90 percent of bills in the United States contain traces of cocaine (Negrusz et al., 1997).
It seems that the criminal proceedings were asking a question of backcasting,
Backcasting question:
“If the Motherisk test results show that there are traces of drugs, what is the probability that the individual was using drugs?”
They should have been asking a question of attribution,
Attribution question:
“If the Motherisk test results show that there are traces of drugs, what is the probability that this was caused by drug use?”
i.e. What is $\mathrm{PC}^*_A$ (as defined in Equation 21)?
\[
\mathrm{PC}^*_A = \Pr\nolimits_A(R_0 = 0 \mid H, R_1 = 1) \cdot \Pr\nolimits_c(E = 1 \mid H, R = 1) \tag{21}
\]
Here, $E$ is the drug use, $R$ is the fact that the test was positive for detecting drugs, $H$
denotes all the background knowledge I have about the individual, $R_0$ [resp., $R_1$] is the
potential value of the response $R$ that will eventuate if in fact $E = 0$ [resp., $E = 1$], and
$\Pr_A$ is the probability distribution over the attributes of individual A.
To calculate a bound on $\mathrm{PC}^*_A$ one would have to make some strong assumptions (see Section
3.2) and have adequate data. First, we would need to know the rates of false positives and
false negatives produced by the test. Second, we would need to know what other reasons
there could be, and what would be the probability that each would occur, for the test to
yield a positive result. Only then could the evidence be presented in criminal proceedings
in a way that gives the correct probabilities of causation using Causes of Effects.
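A minimal sketch of how these ingredients would combine, with Equation 21 as the final product, is given below. All inputs are hypothetical: the prior, the sensitivity, the overall false positive rate (which here lumps together every non-use reason for a positive result), and the individual-level term Pr_A(R0 = 0 | H, R1 = 1) would each have to come from adequate data and from the assumptions in Section 3.2. The function names are mine and are used only for illustration.

```python
def prob_exposed_given_positive(prior, sensitivity, false_positive_rate):
    """Pr(E = 1 | R = 1) by Bayes' theorem.

    prior:               Pr(E = 1), probability of drug use before testing
    sensitivity:         Pr(R = 1 | E = 1)
    false_positive_rate: Pr(R = 1 | E = 0), from all non-use causes combined
    """
    true_pos = sensitivity * prior
    false_pos = false_positive_rate * (1 - prior)
    return true_pos / (true_pos + false_pos)


def pc_star(p_no_response_without_exposure, p_exposed_given_positive):
    """Equation 21: PC* = Pr_A(R0 = 0 | H, R1 = 1) * Pr(E = 1 | H, R = 1)."""
    return p_no_response_without_exposure * p_exposed_given_positive


# Hypothetical inputs, for illustration only.
p_e1 = prob_exposed_given_positive(prior=0.10, sensitivity=0.95, false_positive_rate=0.10)
pc = pc_star(p_no_response_without_exposure=0.8, p_exposed_given_positive=p_e1)
print(f"Pr(E = 1 | R = 1) = {p_e1:.3f}, PC* = {pc:.3f}")
```

The first function answers only the backcasting question; the attribution quantity is obtained after multiplying by the individual-level term, which is why it can be substantially smaller than the probability of drug use given a positive test.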
5.2 Bias in the prediction of child abuse by Best et al.
In Best et al. (2013), there is a fundamental question about the contextual bias arising
from the data they used for their model.
Best et al. (2013) developed a formal Bayesian methodology to quantify the intrinsic uncertainty in complex clinical diagnostic problems. They used a motivating case-study of the
diagnosis of abuse in an infant presenting with an acute life threatening event and a nosebleed. They used Bayes' theorem to formulate the diagnosis in terms of prior and inverse
conditional probabilities and adapted systematic review methodology and Bayesian evidence synthesis to estimate these and to propagate the associated uncertainty. They found
that the estimated probability of abuse was far more uncertain than might be supposed
from either expert advice or an informal reading of the literature. Also, their estimates
depended crucially on assumptions that were made about the conditional independence of
multiple signs of abuse. This study highlights the importance of having a formal statistical
methodology such as this to assist clinicians in reaching a diagnosis.
Dawid et al. (2016) say that Best et al. (2013) focused on the backcasting task of
assessing whether or not abuse has in fact taken place, based on the data on the individual
case and on relevant statistical studies. To make a diagnosis, the authors should have been
working with a question of attribution. So, Dawid et al. (2016) use the data from Best et al.
(2013) to calculate the probability of abuse by using a Causes of Effects analysis. They find
several intervals for the probability of causation that depend on the priors. Under certain
assumptions, they find that the interval is (0, 0.043), which shows that the probability of
causation is very low.
Dawid et al. (2016) presented a very careful analysis, and I agree that Best et al. (2013)
should have been asking the attribution question rather than the backcasting question.
However, the authors did not address the question of whether the determination of abuse
in their data was valid. How, and by whom, was it determined whether the children in
the data provided by Best et al. (2013) were abused? This potential contextual bias could render the
analysis invalid, in the same way that bias undermines the Maguire data reviewed earlier.
6 Research agenda
Future research is required to properly address the issues with the legal and medical arguments used in court. The following are research projects that would be informative and
enlightening to the debate about the medical diagnosis of Shaken Baby Syndrome.
1. A review of all the arguments in Table 1 using the Causes of Effects framework would be very useful for ensuring that correct scientific arguments are made
in court. This analysis would involve generating the questions and answers that are
required to arrive at a proper causal claim. Are there other arguments that are commonly made in court? Are there new arguments that started being used after the
publication of the manual of the National District Attorneys Association?
2. A search for more datasets in addition to the ones mentioned in this study (see
Section 4.3.1), along with an evaluation of the data, would be useful. Answering the
statistical questions posed in the previous point requires high-quality data. How are the data collected? What is the process that generated the
data, from beginning to end? Are there confidential datasets that could be more
useful for answering the questions from the previous point?
3. Answering the Causes of Effects questions from point 1 by using the data from point
2 would provide a numerical value (or interval) that could be used in court by the
expert witnesses or attorneys to argue that an individual abused or did not abuse a
child; only in this way would they be using careful statistical arguments. What priors
should be used to arrive at the final values? How do the assumptions affect the final
intervals?
4. Designing a clear and simple way to explain the Causes of Effects intervals to a
lay audience would help get the proper information to the entire audience in court.
Would the jury understand these values? How can the attorneys or expert witnesses
explain their choice of priors, or the sensitivity of their choice? Would a pilot study
help in assessing whether the scientific arguments are understandable to the jury,
judge, attorneys, and expert witnesses?
5. Performing an analysis about the benefits and limitations of clinical and statistical
predictions in Shaken Baby Syndrome could be useful because it might make researchers more aware of how diagnoses are made in clinical settings, and it may make
physicians more willing to cooperate with developing adequate statistical models.
Meehl (1954) posed the question of whether clinical or statistical combinations of data
yielded better predictions in applied psychology. A similar analysis could be made
about Shaken Baby Syndrome.
6. A case-control study might be useful for addressing the problem of selection bias in the
data about Shaken Baby Syndrome cases. A case-control study compares patients
who have a disease with patients who do not have the disease, and looks back retrospectively to compare how frequently the exposure to a risk factor is present in each
group to determine the relationship between the risk factor and the disease. This
could shine a light on other similarities that might be undiscovered between patients
who have Shaken Baby Syndrome-type injuries.
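The retrospective comparison described in point 6 is usually summarized by an odds ratio. The following is a minimal sketch, assuming a single binary risk factor and a 2×2 table of cases and controls; the counts are invented for illustration and the function name is hypothetical. The confidence interval uses the usual large-sample approximation on the log-odds scale.

```python
import math


def odds_ratio_with_ci(exposed_cases, unexposed_cases,
                       exposed_controls, unexposed_controls, z=1.96):
    """Odds ratio and approximate 95% confidence interval from a 2x2 case-control table."""
    odds_ratio = (exposed_cases * unexposed_controls) / (unexposed_cases * exposed_controls)
    # Standard error of log(odds ratio): square root of the sum of reciprocal cell counts.
    se_log_or = math.sqrt(1.0 / exposed_cases + 1.0 / unexposed_cases
                          + 1.0 / exposed_controls + 1.0 / unexposed_controls)
    lower = math.exp(math.log(odds_ratio) - z * se_log_or)
    upper = math.exp(math.log(odds_ratio) + z * se_log_or)
    return odds_ratio, lower, upper


# Invented counts: 100 cases and 100 controls, classified by exposure to the risk factor.
or_estimate, ci_lower, ci_upper = odds_ratio_with_ci(
    exposed_cases=30, unexposed_cases=70,
    exposed_controls=10, unexposed_controls=90,
)
print(f"Odds ratio: {or_estimate:.2f} (approx. 95% CI {ci_lower:.2f} to {ci_upper:.2f})")
```

Such an estimate is only as good as the classification of cases and controls, so the concerns about contextual bias in determining abuse apply here as well.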
These projects can be completed with more time and careful analysis of the information
that is available today. Perhaps physicians and attorneys could help in providing new
guidance if there is interest in the community.
7 Recommendations
I provide four suggestions to medical researchers and attorneys. The first two address the problems
of contextual bias and asking the wrong causal questions in cases related to Shaken Baby
Syndrome; the last two concern the data and the statistical models used to study it.
1. To resolve the problem of contextual bias in the diagnosis of Shaken Baby Syndrome,
I suggest that only the task-relevant information be provided to the individual who
determines the diagnosis. This individual could be the physician or the medical examiner/coroner, depending on whether the child has died. For medical examiners/coroners, the National Academies 2009 report has recommended that only task-relevant
information be provided for determining the cause of death, and all the information
from the case (including task-irrelevant information) be provided for determining the
manner of death. Since it is a physician who determines the diagnosis if the child is
alive (or has been alive at some point in the hospital), it is necessary to devise some
system such that the physician does not have access to task-irrelevant information
before making the diagnosis. However, since physicians generally are expected to
have all the information for each patient, it seems difficult to make an exception for
Shaken Baby Syndrome cases. This calls for a change in the definition of Shaken Baby
Syndrome. The definition of Shaken Baby Syndrome (see Section 2.3) should not
include the section about the manner in which the injuries were caused (i.e. “inflicted
blunt impact and/or violent shaking”). The only persons who should be allowed to
determine this cause should be the medical examiner/coroner when determining the
manner of death, and the jury of the defendant’s peers.21
2. To resolve the problem of asking the wrong causal questions in Shaken Baby Syndrome trials, I suggest that a Causes of Effects (CoE) framework be used in formulating the causal questions and answers. For example, say we are interested in a child
who had the triad of brain injuries and died, and we wonder whether this was caused
by a short fall. Instead of asking “If the child suffers the triad of brain injuries and
death, what is the probability the child had a short fall?,” we should be asking “If the
child suffers the triad of brain injuries and death, what is the probability that this
was caused by a short fall?” It is somewhat useful to provide Effects of Causes (EoC)
statistics and experimental results to determine whether the defendant committed
the crime, but it is necessary to understand that even after some strong assumptions,
these will only bound the desired probability.
3. To improve the sources of data that could be used to study Shaken Baby Syndrome,
it is important to: a) have a representative sample of all children (which includes
some cases of infants with Shaken Baby Syndrome), b) explicitly state how it was
determined that the child was abused, c) have no or few missing values in the conditions.
Having a representative sample is a difficult task in these situations because one
would like to collect as many cases as possible that are relevant to study Shaken
Baby Syndrome, and since it is a rare occurrence, it is easier to collect all the cases
that were determined to be abuse instead of collecting a random sample of individuals. The random sample, or if possible a census of all the children in a region,
allows researchers to make inferences about the population (including things like the
prevalence of the syndrome, the racial distribution, and other relationships between
physical symptoms).
Stating how it was determined that the child was abused is helpful because the determination could suffer from contextual bias. Perhaps some types of determinations
are different from others, and knowing how the determination was made could help
explore these patterns. Having a standardized set of tests that are performed when a
child is suspected to have Shaken Baby Syndrome is essential to having useful data.
This can help ensure that the statistical models will provide less biased results. It is
also important to have the probabilities for undercounts and overcounts of short fall
deaths. Some immigrant deaths might not be reported, and some short fall deaths
might be recorded as something else. Just as some recordings of short fall deaths were
wrong, some recordings of deaths as not due to short falls could also be wrong.
21 Further analysis is required to determine whether the definitions of other diseases imply the cause of
the disease. It is possible that there are some, but if they do not have serious legal consequences they are
not as controversial as Shaken Baby Syndrome.
4. Physicians are accustomed to making diagnoses without using statistical models.
However, it is important to note that high-quality statistical models can make predictions
that could be useful for studying difficult diagnoses, such as Shaken Baby Syndrome.
Statistical models could expose certain patterns that were not evident to the physician, and clinical diagnoses could help inform the statistics. Having the physicians
communicate with the statisticians about the benefits and limitations of statistical
and clinical diagnoses could help improve both.
8 Conclusion
This is a case study of how scientific evidence can be misused in court, as illustrated
by the Shaken Baby Syndrome cases. It is a study at the intersection of statistics, medicine,
and the law. It can give statisticians a way to think about medicine, human rights, and
justice, and it can give attorneys and physicians a way to view old problems in a new light.
To summarize the problem, there is an ongoing debate about the diagnosis of Shaken
Baby Syndrome. One side says physicians can tell whether a child was abused when
they find certain clinical features. The other side says no one can tell what the cause
of the injuries was, and we need to be very careful when deciding that the child was
abused, especially if there are other diseases that mimic Shaken Baby Syndrome. Getting
a diagnosis wrong could either let a child abuser go free, or it could imprison an innocent
individual.
The two competing views about Shaken Baby Syndrome could be reconciled if the
problem were better structured. Until now, it has been difficult for researchers to address
this problem because of the lack of data and serious scientific evidence. In this paper I
propose the methodology of Causes of Effects and Effects of Causes, which has not been
previously used to analyze Shaken Baby Syndrome. The methodology of CoE and EoC
can be consistently applied to the problem of Shaken Baby Syndrome in addition to other
legal problems in child abuse.
I described the particular case of Trudy Muñoz, a Peruvian nanny who is currently
serving a 10½-year prison sentence due to charges related to Shaken Baby Syndrome.
Muñoz’s trial exemplifies the fact that these cases, and in general cases of child abuse,
have very little evidence other than the clinical features that are observed by the physician
and the police and social worker reports about the interrogations of the individual.
Because there is not much evidence, it is essential to be very careful in how the scientific
evidence is combined and presented to a jury. It is also necessary to track the possible
sources of contextual bias that could be affecting the interpretation of the limited evidence.
By using the framework provided by this case, I stated that the major problems were the
sources of contextual bias in the diagnosis of the syndrome, and thus in the data that are
collected, and asking the wrong causal questions in court.
The Causes of Effects framework described by Dawid et al. (2014) provides a useful
way to evaluate whether the scientific statements that are being used in court, by the
attorneys or the expert witnesses, are relevant to the specific case in question. Statistics
has usually been used by scientists to answer causal questions about future events in the
population, such as, “Will taking aspirin remove a headache?” But in legal settings the
relevant questions are about individuals, and they are about events that already happened,
such as, “I had a headache and took an aspirin. The headache went away. Did it go away
because of the aspirin?” The proper way to use statistics to answer the second question
is very different from the way to answer the first. Thus, attorneys and expert witnesses should be aware of
the Causes of Effects methodology before they start presenting statistics that are irrelevant
to the case and might be misleading to the jury in the trial, juries in future trials, and
physicians in future emergency rooms who need to determine whether a specific case deals
with abuse.
The National Commission on Forensic Science 2015 report (McCormack et al., 2015) and
the National Academies’ 2009 report (Committee on Identifying the Needs of the Forensic
Science Community, 2009) provide useful guides to preventing sources of contextual bias
from contaminating the evidence. However, for cases of Shaken Baby Syndrome it is not
as easy as restricting the information given to the fingerprint examiner to just the two
fingerprints. Since the very diagnosis conflates the manner and cause, it is essential to
change the definition of Shaken Baby Syndrome so it does not include the contextual
evidence.
In 2012, Norman Guthkelch—the author of the 1971 seminal paper on Shaken Baby
Syndrome—expressed deep concern with the convictions related to Shaken Baby Syndrome.
He said that in 1971 he had suggested shaking as a hypothesis for why the children had
this specific constellation of symptoms, not a fact (Guthkelch, 2012). When asked about
having prosecutors use his science as a basis to convict people, Guthkelch answered that he was
“absolutely and utterly shocked. Desperately disappointed. I was against defining this
thing as a syndrome in the first place.”22 Guthkelch (2012) provided a warning: “What
follows is a Serious Call. . . to members of the medical and legal professions to consider these
problems with restraint. It is, in short, a call for civility in scientific discourse.” With this
warning, Guthkelch urged the medical and legal communities to use science properly and
stop convicting innocent individuals. In a sense, with this paper I am echoing his concerns.
22 New York Times video, “Discovering Shaken Baby Syndrome”. Available at:
http://www.nytimes.com/video/us/100000003906972/retro-report-voices-the-doctor.html. Last accessed:
April 15, 2016.
A meeting among pediatricians, social workers, defense attorneys, prosecuting
attorneys, statisticians, medical examiners, and coroners might be very useful for reducing
the number of wrongful convictions that are occurring today in cases related to Shaken
Baby Syndrome. Some of the main institutions that could be present, and might have
opposing views, are the National Commission on Forensic Science, the Centers for Disease
Control and Prevention, and the American Academy of Pediatrics. At this meeting, the
members could discuss the problems with contextual bias and misuse of scientific evidence
mentioned here. They could discuss the possibility of changing the definition of Shaken
Baby Syndrome so it does not include the manner in which the injuries were caused, and
they could discuss new datasets that could be gathered in a way that makes it possible to
run high-quality statistical models to study Shaken Baby Syndrome. This is an ambitious
goal, but it could help make the United States legal system more just. It could help
clarify a class of cases that is frustrating for both prosecution and defense, and that has
the potential either to allow child abusers to go free or to break up families through wrongful
incarcerations.
Appendix: Problems unrelated to Causes of Effects in Maguire et al. (2009, 2011) and Cowley et al. (2015)
There are several problems with the papers by Maguire et al. (2009), Maguire et al. (2011),
and Cowley et al. (2015): bias in the determination of the outcome variable, sample selection, imputation of missing values, and the omission of covariates in the final regression of
the 2011 study. Here I discuss all the problems except the bias in the determination of the
outcome variable, which can be found in Section 4.3.2.
Sample selection
In all three papers, the sample of patients was collected in a non-randomized manner. Since
brain injury is a rare condition, and databases that contain both abusive and non-abusive
head injury are even rarer, it is not surprising that the authors would use whatever data
they could find, and it is even laudable that they were able to find so many cases. However,
they omit a careful review of the method by which the cases were gathered, as well as the
way in which this method could affect their results.
Imputation of missing values
The data that Maguire et al. used in the three papers had a high percentage of missing
values. For example, in the 2011 paper, apnea was recorded in only 301 of the 1053 cases,
so it is missing in more than two thirds of the cases. The authors therefore imputed the missing values in
order to run the model.
In 2009 Maguire et al. used what they called an “extremely conservative imputation
approach” to impute the missing values. For the infants with inflicted brain injury, if
they had a missing value for some features, they assigned a negative value for the missing features. On the other hand, for the infants with non-inflicted brain injury, if they
had missing values, the authors assigned a positive value for the missing features. This
could have unexpected effects on the model and does not guarantee that the model is
conservative.
In 2011, the authors imputed the data by copying the values from other individuals in
the data set that did not have missing values. This approach, the authors acknowledge,
is “statistically valid provided the data are missing at random.” It seems that a feature
like long-bone fracture, which was missing for 70% of the cases in one study and 100% in
another, is likely to be missing because it was absent rather than because the doctor did
not know or forgot to write it down (we assume a long-bone fracture is something that
should be difficult to miss). So, imputation assuming the data are missing at random when
they are not is likely to result in an overestimation of the presence of the features.
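The overestimation can be illustrated with a small simulation. This is a minimal sketch under invented assumptions, not a reproduction of the procedure in the 2011 paper: a feature with a true prevalence of 20% is almost always recorded when present but recorded only 30% of the time when absent, and missing values are filled by copying the value of a randomly chosen individual with a recorded value. All variable names and rates are hypothetical.

```python
import random

random.seed(0)

n = 10_000
true_prevalence = 0.20        # assumed true rate of the feature
p_recorded_if_present = 0.95  # assumed: a present feature is almost always written down
p_recorded_if_absent = 0.30   # assumed: an absent feature is often left blank

# True status of the feature for each individual.
truth = [random.random() < true_prevalence for _ in range(n)]

# Whether the feature was recorded; missingness depends on the true value,
# so the data are NOT missing at random.
recorded = [
    random.random() < (p_recorded_if_present if present else p_recorded_if_absent)
    for present in truth
]

# Imputation that treats the data as missing at random: each missing value
# is copied from a randomly chosen individual whose value was recorded.
donors = [present for present, seen in zip(truth, recorded) if seen]
imputed = [
    present if seen else random.choice(donors)
    for present, seen in zip(truth, recorded)
]

true_rate = sum(truth) / n
imputed_rate = sum(imputed) / n
print(f"True prevalence of the feature:    {true_rate:.3f}")
print(f"Prevalence after naive imputation: {imputed_rate:.3f}")
```

Because the recorded cases over-represent individuals who have the feature, copying from them inflates the estimated prevalence well above the true rate.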
In the 2015 paper, the authors used multiple imputation by chained equations (MICE), a
method that they again acknowledge depends on a missing at random assumption. The
same problem therefore holds as for the 2011 paper.
Omission of covariates in final regression of 2011 study
The authors ran three multilevel logistic regressions: the first included age and gender, the
second included all seven clinical features (skull fractures, rib fractures, long-bone fractures,
retinal hemorrhages, head and/or neck bruising, apnea, seizures), and the third included
only the features that had a statistically significant coefficient in the second regression
(they removed age, gender, and skull fractures in the third regression). They showed the
results of the second and third regressions (Tables 4 and 5 in the paper).
The omission of covariates from the third model because they did not have statistically significant coefficients in the second model could lead to omitted variable bias. For
the coefficients and their significance to be valid, all the variables need to remain in the
regression. The authors should analyze their results from Table 4, not Table 5. Fortunately,
the results did not change dramatically, but analyzing the results from Table 4 is more appropriate.
It is unclear whether they did this in the 2015 paper as well.
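Omitted variable bias can be seen in a small simulation. The sketch below is a simplified linear analogue, not the multilevel logistic model used in the 2011 study: two correlated covariates both affect the outcome, and dropping the second one shifts the estimated coefficient of the first. The coefficients, the correlation, and all variable names are assumptions made for illustration.

```python
import random

random.seed(1)

n = 20_000
beta1, beta2 = 1.0, 0.5  # assumed true coefficients

# x2 is correlated with x1; both affect the outcome y.
x1 = [random.gauss(0.0, 1.0) for _ in range(n)]
x2 = [0.8 * a + random.gauss(0.0, 0.6) for a in x1]
y = [beta1 * a + beta2 * b + random.gauss(0.0, 1.0) for a, b in zip(x1, x2)]


def mean(values):
    return sum(values) / len(values)


def cov(u, v):
    """Sample covariance (population form) of two equal-length lists."""
    mu, mv = mean(u), mean(v)
    return sum((a - mu) * (b - mv) for a, b in zip(u, v)) / len(u)


# Regression that omits x2: slope of y on x1 alone.
short_slope = cov(x1, y) / cov(x1, x1)

# Regression that keeps both covariates: solve the 2x2 normal equations.
s11, s22, s12 = cov(x1, x1), cov(x2, x2), cov(x1, x2)
s1y, s2y = cov(x1, y), cov(x2, y)
det = s11 * s22 - s12 ** 2
long_slope_x1 = (s22 * s1y - s12 * s2y) / det
long_slope_x2 = (s11 * s2y - s12 * s1y) / det

print(f"Coefficient of x1 with x2 omitted:  {short_slope:.3f}")
print(f"Coefficient of x1 with x2 included: {long_slope_x1:.3f}")
print(f"Coefficient of x2 when included:    {long_slope_x2:.3f}")
```

With the second covariate omitted, the first coefficient absorbs part of its effect; the size of the shift depends on how strongly the two covariates are correlated.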
References
Barnes, P. D. (2011), “Imaging of Nonaccidental Injury and the Mimics: Issues and Controversies in the Era
of Evidence-Based Medicine,” Radiologic Clinics of North America, 49, 205–229.
— (2015), “The Significance of Macrocephaly or Enlarging Head Circumference in Infants With the Triad
Further Evidence of Mimics of Shaken Baby Syndrome,” American Journal of Forensic Medicine and
Pathology, 36, 111–120.
Bazelon, E. (2011), “Shaken Baby Syndrome Faces New Questions in Court,” The New York Times,
http://www.nytimes.com/2011/02/06/magazine/06baby-t.html.
Best, N., Ashby, D., Dunstan, F., Foreman, D., and McIntosh, N. (2013), “A Bayesian approach to complex
clinical diagnoses: a case-study in child abuse,” Journal of the Royal Statistical Society, 176, 53–96.
Caffey, J. (1972), “On the Theory and Practice of Shaking Infants: Its Potential Residual Effects of Permanent Brain Damage and Mental Retardation,” American Journal of Diseases of Children, 124, 161–169.
Cathy Lynn Henderson Hearing (2009), District Court 229th Judicial District Travis County, Texas, Travis
County District, Attorney’s Office, P.O. Box 1748, Austin, Texas 78767.
Center for Statistics and Applications in Forensic Evidence (2016), http://forensic.stat.iastate.edu/. Last
accessed: April 2, 2016.
Cenziper, D. (2015), “A disputed diagnosis imprisons parents,” The Washington Post,
https://www.washingtonpost.com/graphics/investigations/shaken-baby-syndrome/.
Christian, C. W., Block, R., and the Committee on Child Abuse and Neglect (2009), “Abusive Head Trauma
in Infants and Children,” Pediatrics, 123, 1409–1411.
Cole, S. A. and Edmond, G. (2015), “Science without Precedent: The Impact of the National Research
Council Report on the Admissibility and Use of Forensic Science Evidence in the United States,” British
Journal of American Legal Studies, 4, 585–617.
Committee on Identifying the Needs of the Forensic Science Community (2009), Strengthening Forensic
Science in the United States: A Path Forward, National Research Council of The National Academies,
500 Fifth Street N.W. Washington, DC 20001.
Commonwealth of Virginia vs. Rueda (2009), Commonwealth of Virginia Circuit Court testimony: Commonwealth of Virginia vs. Trudy Eliana Munoz Rueda, Circuit Courtroom 4J, Fairfax County Courthouse,
Fairfax, Virginia.
Cowley, L. E., Morris, C. B., Maguire, S. A., Farewell, D. M., and Kemp, A. M. (2015), “Validation of a
Prediction Tool for Abusive Head Trauma,” Pediatrics, 136.
Daubert v. Merrell Dow Pharmaceuticals, 509 U.S. 579 (1993).
Dawid, A. P. (2011), Perspectives on Causation, Oxford: Hart Publishing, chap. The role of scientific and
statistical evidence in assessing causality, pp. 133–147.
— (2015), “Statistical Causality from a Decision-Theoretic Perspective,” Annual Review of Statistics and
Its Applications, 2, 273–303.
Dawid, A. P., Faigman, D. L., and Fienberg, S. E. (2014), “Fitting Science Into Legal Contexts: Assessing
Effects of Causes or Causes of Effects? (with Discussion),” Sociological Methods and Research, 43, 359–390.
— (2015), “On the Causes of Effects: Response to Pearl,” Sociological Methods and Research, 44, 165–174.
Dawid, A. P., Musio, M., and Fienberg, S. E. (2016), “From Statistical Evidence to Evidence of Causality,”
Bayesian Analysis, TBA, 1–28.
Dawid, A. P. (2000), “Causal Inference Without Counterfactuals,” Journal of the American Statistical Association, 95, 407.
Dixon, L., Browne, K., and Hamilton-Giachritsis, C. (2005), “Risk factors of parents abused as children:
A mediational analysis of the intergenerational continuity of child maltreatment (part 1),”
Journal of Child Psychology and Psychiatry, 46, 47–57.
Dror, I. E., Charlton, D., and Peron, A. E. (2006), “Contextual information renders experts vulnerable to
making erroneous identifications,” Forensic Science International, 156, 74–78.
Duhaime, A.-C., Gennarelli, T. A., Thibault, L. E., Bruce, D. A., Margulies, S. S., and Wiser, R. (1987),
“The shaken baby syndrome: A clinical, pathological, and biomechanical study,” Journal of Neurosurgery,
66, 409–415.
Dunstan, F. D., Guildea, Z. E., Kontos, K., Kemp, A. M., and Sibert, J. R. (2002), “A scoring system for
bruise patterns: A tool for identifying abuse,” Archives of Disease in Childhood, 330–333.
Frye v. United States, 293 F. 1013 D.C. Cir. (1923).
Gelman, A. and Imbens, G. (2013), “Why ask Why? Forward Causal Inference and Reverse Causal Questions,” Working Paper 19614, National Bureau of Economic Research, Cambridge, MA, http://www.stat.columbia.edu/~gelman/research/unpublished/reversecausal 13oct05.pdf Last accessed: March 23, 2016.
Green, M. D., Freedman, M. D., and Gordis, L. (2011), “Reference Guide on Epidemiology,” in Reference
Manual on Scientific Evidence: Third Edition, Washington, D.C.: The National Academies Press, pp.
549–632.
Guthkelch, A. (2012), “Problems of infant retino-dural hemorrhage with minimal external injury,” Houston
Journal of Health Law and Policy, 12, 201–208.
Guthkelch, N. (1971), “Infantile Subdural Haematoma and its Relationship to Whiplash Injuries,” British
Medical Journal, 2, 430–431.
Haberman, C. (2015), “Shaken Baby Syndrome: A Diagnosis That Divides the Medical World,” The New
York Times, http://www.nytimes.com/2015/09/14/us/shaken-baby-syndrome-a-diagnosis-that-divides-the-medical-world.html.
Innocence Project (2016), “The Causes of Wrongful Conviction,”
http://www.innocenceproject.org/causes-wrongful-conviction, Last accessed: April 2, 2016.
Jenny, C., Hymel, K. P., Ritzen, A., Reinert, S. E., and Hay, T. C. (1999), “Analysis of missed cases
of abusive head trauma,” Journal of the American Medical Association, 281, 621–626.
Judson, K. (2015), “What Child Welfare Attorneys Need to Know about Shaken Baby Syndrome,”
http://apps.americanbar.org/litigation/committees/childrights/content/articles/spring2015-0315-welfare-attorneys-shaken-baby-syndrome.html Last accessed: April 12, 2016.
Kafadar, K. (2015), “Statistical Issues in Assessing Forensic Evidence,” International Statistical Review, 83,
111–134.
Lang, T. H. S. E. (2015), “Report of the Motherisk Hair Analysis Independent Review,” http://www.m-hair.
ca/docs/default-source/default-document-library/motherisk enbfb30b45b7f266cc881aff0000960f99.pdf?
sfvrsn=2 last accessed: March 24, 2016, Ontario Ministry of the Attorney General.
Lanier, P., Maguire-Jack, K., Walsh, T., Drake, B., and Hubel, G. (2014), “Race and ethnic differences in
early childhood maltreatment in the United States,” Journal of Developmental and Behavioral Pediatrics,
35, 419–426.
Leo, R. A. and Ofshe, R. J. (1998), “The consequences of false confessions: Deprivations of liberty and
miscarriages of justice in the age of psychological interrogation,” The Journal of Criminal Law and Criminology (1973-), 88, 429–496.
Maguire, S., Pickerd, N., Farewell, D., Mann, M., Tempest, V., and Kemp, A. (2009), “Which clinical
features distinguish inflicted from non-inflicted brain injury? A systematic review,” Archives of Disease
in Childhood, 94, 860–867.
Maguire, S. A., Kemp, A. M., Lumb, R. C., and Farewell, D. M. (2011), “Estimating the Probability of
Abusive Head Trauma: A Pooled Analysis,” Pediatrics, 128.
McCormack, B. M., Epstein, J., Albright, T., Champagne, G., Fienberg, S., Pulaski, P., Sah, S., Ambrosino,
M., Cole, S., Dror, I., Farid, H., Leben, D., Christian Meissner, P., Nerheim, M., Risinger, M., Scheck,
B., Sudkamp, L., Thompson, W., and Hollway, J. (2015), Ensuring That Forensic Analysis Is Based
Upon Task-Relevant Information, National Commission on Forensic Science of the National Institute of
Standards and Technology, 950 Pennsylvania Avenue, NW Washington, DC 20530-0001, 1st ed.
Medill Justice Project (2015a), “Short fall arguments often used in court,” Northwestern University, Correspondence.
—
(2015b), “U.S. Shaken-Baby Syndrome Database,”
http://www.medilljusticeproject.org/u-s-shaken-baby-syndrome-database/ Last accessed: March 24, 2016.
Meehl, P. E. (1954), Clinical versus statistical prediction: A theoretical analysis and a review of the evidence,
University of Minnesota Press.
Moran, D. A., Findley, K. A., Barnes, P. D., and Squier, W. (2012), “Shaken Baby Syndrome, Abusive
Head Trauma, and Actual Innocence: Getting it Right,” Houston Journal of Health Law and Policy, 12,
209–312.
Negrusz, A., Moore, C., and Perry, J. (1997), “Detection of Cocaine on Various Denominations of United
States Currency,” Journal of Forensic Sciences, 43.
Odom, E., Appelbaum, A., and Pendle, D. (2010), Overcoming Defense Expert Testimony in Abusive Head
Trauma Cases, National Center for Prosecution of Child Abuse at the National District Attorneys Association.
Ommaya, A. K. (1968), “Whiplash Injury and Brain Damage: An Experimental Study,” Journal of the
American Medical Association, 204, 285–289.
Parks, S. E., Annest, J. L., Hill, H. A., and Karch, D. L. (2012a), Pediatric Abusive Head Trauma: Recommended Definitions for Public Health Surveillance and Research, Centers for Disease Control and Prevention., Division of Violence Prevention, Atlanta, Georgia.
— (2012b), Pediatric Abusive Head Trauma: Recommended Definitions for Public Health Surveillance and
Research, Centers for Disease Control and Prevention, National Center for Injury Prevention and Control,
Division of Violence Prevention, Atlanta, Georgia.
Pearl, J. (2014), “Causes of Effects and Effects of Causes,” Sociological Methods and Research, 44, 149–164.
People vs. Bailey (2014), State of New York, County Court testimony: The people of the state of New York
vs. Renee S. Bailey, Hall of Justice, 99 Exchange Boulevard, Rochester, New York 14614.
Plunkett, J. (2001), “Fatal pediatric head injuries caused by short-distance falls,” American Journal of
Forensic Medicine and Pathology, 22, 1–12.
Rubin, D. B. (1974), “Estimating Causal Effects of Treatments in Randomized and Nonrandomized Studies,”
Journal of Educational Psychology, 66, 688–701.
Sirotnak, A. P. (2006), Abusive Head Trauma in Infants and Children: A Medical, Legal, and Forensic
Reference, G W Medical Publishing, Inc., chap. Medical Disorders that Mimic Abusive Head Trauma, 1st
ed., pp. 191–226.
Smith, C. S. (2016), “This shaken baby syndrome case is a dark day for science – and for justice,” The Guardian,
http://www.theguardian.com/commentisfree/2016/mar/14/shaken-baby-syndrome-science-doctor-challenging-theory-infant-waney-squier.
Spiegelman, C. and Tobin, W. A. (2013), “Analysis of experiments in forensic firearms/toolmarks practice
offered as support for low rates of practice error and claims of inferential certainty,” Law, Probability and
Risk, 12, 115–133.
Squier, W. (2008), “Shaken baby syndrome: the quest for evidence,” Developmental Medicine and Child
Neurology, 50, 10–14.
State of Florida vs. Kareem Daniel Farrell (2013), Circuit Court of the Nineteenth Judicial Circuit In and
For St. Lucie County, State of Florida, Office of the Public Defender, 216 South Second Street, Fort Pierce,
Florida 34950.
State of Florida Vs. Ramgoolie (2014), In the Circuit Court of the Fifth Judicial Circuit of the State of
Florida, In and For Marion County, Marion County Courthouse, 110 NW 1st Avenue, Ocala, FL 34785.
State of Wisconsin vs. Patrick L. Donley (2014), State of Wisconsin, Circuit Court Branch V, Brown County,
Law Enforcement Center, Green Bay, Wisconsin 54301.
Thompson, W. (2013), Genetic Explanations: Sense and Nonsense, Harvard University Press, chap. Forensic
DNA Evidence: The Myth of Infallibility, pp. 227–255.
Thompson, W., Vuille, J., Biedermann, A., and Taroni, F. (2013), “The role of prior probability in forensic
assessments.” Frontiers in Genetics, 4, 220–223.
Truman, T. L. and Ayoub, C. C. (2002), “Considering suffocatory abuse and Munchausen by proxy in the
evaluation of children experiencing apparent life-threatening events and sudden infant death syndrome,”
Child Maltreatment, 7, 138–148.
Tuerkheimer, D. (2009), “The Next Innocence Project: Shaken Baby Syndrome And The Criminal Courts,”
Washington University Law Review, 87, 1–58.
University of Michigan, Michigan Law, Innocence Clinic (2016), “Causes of Wrongful Convictions,”
http://www.law.umich.edu/special/exoneration/Pages/detaillist.aspx. Last accessed: April 2, 2016.
Weber, W. (1983), “Experimental studies of skull fractures in infants,” Journal of legal medicine, 92, 87–94.
— (1985), “Biomechanical fragility of the infant skull,” Z Rechtsmed, 94, 93–101.