```Assessing the Effect of Visualizations on
Bayesian Reasoning through Crowdsourcing
Luana Micallef
Pierre Dragicevic
Jean-Daniel Fekete
The probability that a woman at age 40 has breast cancer is 1%.
The probability that the disease is detected by a mammography
is 80%.
The probability that the test misdetects the disease although the
patient does not have it is 9.6%.
If a woman at age 40 is tested as positive, what is
the probability that she indeed has breast cancer?
0% - 30%
30% - 60%
60% - 100%
ATTENTION
P ( Cancer | Positive Mammography ) = 7.8%
95 doctors out of 100
said the answer is between 70% and 80%
Why is the correct answer so low?
Bayes’ Theorem
P ( cancer | +ve mammography )
=
P (+ve mammography | cancer) P (cancer)
/ [ P (+ve mammography | cancer) P (cancer) + P (+ve mammography | no cancer) P (no cancer) ]
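Worked check (a sketch that uses only the three percentages stated in the problem; the 99% for women without cancer is assumed to be 100% - 1%):
P ( cancer | +ve mammography )
= (0.80 × 0.01) / (0.80 × 0.01 + 0.096 × 0.99)
= 0.008 / (0.008 + 0.09504)
= 0.008 / 0.10304
≈ 0.078, i.e. 7.8%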
[Figure: diagram of the mammography problem, built up alongside the problem text, with regions labeled "women with cancer" and "women without cancer"; answer shown: 7.8%]
Can such visualizations facilitate Bayesian reasoning?
Proposed Visualizations
contingency table
signal detection curves
bar-grain boxes
Bayesian boxes
trees
+
Euler diagram
frequency grid
Euler diagram + glyphs
Previous Studies
Mainly in Psychology
Claim that
Bayesian problem representation impacts comprehension
but …
Inconsistent findings
Most effective Bayesian problem representation? UNCLEAR
Inconsistent and sometimes inappropriate diagram designs
Diagrams do not match textual information
(Sloman et al., 2003)
Area-Proportional
Not Area-Proportional
and the subjects
…
Specific background
usually highly-focused university students
Specific age group
Sometimes,
specific department
carried out as part of their course
so …
cannot generalize their findings to
a more diverse population of laypeople
Our Work
Assessing the Effect of Visualizations
on Bayesian Reasoning through
Crowdsourcing
to identify…
- the most effective visualization for the crowd
- whether hybrid visualizations are helpful
- the link between the visualizations and different
spatial and numeracy abilities
but…
how appropriate is
Amazon MTurk
Used and evaluated for research and InfoVis
Demographics of workers are well-understood
Captures aspects of real-world problem solving better
- a large diverse population with different
backgrounds, education, occupations, age, gender
- workers carry out tasks rapidly but accurately to improve their rating
- reduces experimental biases, such as demand characteristics
http://www.eulerdiagrams.org/eulerGlyphs
Experiment
168 workers
with MTurk approval rate ≥ 95%
Demographics
25 min
$1
3 Bayesian problems
classics in Psychology
in natural frequencies format
followed by
objective and subjective numeracy tests
paper folding spatial abilities test
brief questionnaire
Results
We failed to replicate previous findings
subjects’ accuracy was remarkably lower
visualizations exhibited no measurable benefit
even though …
Exact answers, overall: 12%
no visualization: 6%
with a visualization: 14%, 11%, 21%, 7%, 11%, 14%
[Figure: Combined Error per visualization type, V0 (no vis) to V6, axis from 0.0 to 1.2; annotations "6% exact" and "21% exact". Caption: Answer errors for all three Bayesian problems combined per visualization type (N = 24 each)]
our study: 12%
previous studies: 40% - 80%
Thus
we failed to demonstrate measurable
benefits from visualizations to
facilitate Bayesian reasoning.
Qualitative Feedback
53 out of the 168 subjects
participated
89% ‘somehow’ used the diagram
Most found the diagram very useful
BUT
Several did not understand the diagram
Some doubted the diagram’s credibility
However
subjects must understand and trust the diagram:
the answer is in the visualization
[Figure: the mammography problem diagram, with regions labeled "women with cancer" and "women without cancer"; answer shown: 7.8%]
How?
either
help them understand and relate the diagram to the text
or
force them to get the answer from the diagram:
change the text
Another Experiment
480 workers
with MTurk approval rate ≥ 95%
did not participate in experiment 1
1 Bayesian problem
the Mammography problem
10 out of every 1,000 women at age forty who participate in routine screening
have breast cancer.
8 of every 10 women with breast cancer will get a positive
mammography.
95 out of every 990 women without breast cancer will also get a
positive mammography.
classic
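Worked check (a sketch that uses only the counts in the text above): 8 women with breast cancer and 95 women without breast cancer get a positive mammography, so the chance that a woman with a positive mammography has breast cancer is
8 / (8 + 95) = 8 / 103 ≈ 7.8%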
10 out of every 1,000 women at age forty who participate in routine screening
have breast cancer (compare the red dots in the diagram below with
the total number of dots).
8 of every 10 women with breast cancer will get a positive
mammography (compare the red dots that have a black border with
the total number of red dots).
95 out of every 990 women without breast cancer will also get a
positive mammography (compare the blue dots that have a black
border with the total number of blue dots).
with instructions
A small minority of women at age forty who participate in routine
screening have breast cancer.
A large proportion of women with breast cancer will get a positive
mammography.
A small proportion of women without breast cancer will also get a
positive mammography.
without numbers
Results
The Most Effective
Textual Representation
A small minority of women at age forty who participate in routine
screening have breast cancer.
A large proportion of women with breast cancer will get a positive
mammography.
A small proportion of women without breast cancer will also get a
positive mammography.
without numbers
Presentation types (N=120 each)
classic text + no visualization
classic text + visualization
text with instructions + visualization
text without numbers + visualization
[Figure: Mammography Error for four presentation types: classic + no vis (V0), classic + vis (V4), with instructions + vis (V4a), without numbers + vis (V4b); axis from 0.0 to 2.5. Caption: Answer errors for the Mammography Bayesian problem per presentation type (N = 120 each)]
Conclusion
Using crowdsourcing, we assessed
6 visualizations and text alone for
3 classic Bayesian problems
We failed to replicate previous findings
subjects’ accuracy was remarkably lower
visualizations exhibited no measurable benefit
A follow-up experiment confirmed …
simply adding a visualization to a textual Bayesian
problem does not help
diagrams can help but numerical values have to be
removed and the text should be used to merely
set the scene
We need …
novel visualizations that holistically combine
text and visualization and promote the use of
estimation rather than calculation
more studies in settings that better capture
real-life rapid decision making
To …
facilitate reasoning about statistical information
for both laypeople and professionals
Thanks
Luana Micallef
Pierre Dragicevic
Jean-Daniel Fekete
error = log 10