Assessing the Effect of Visualizations on Bayesian Reasoning through Crowdsourcing
Luana Micallef, Pierre Dragicevic, Jean-Daniel Fekete

The probability that a woman at age 40 has breast cancer is 1%. The probability that the disease is detected by a mammography is 80%. The probability that the test misdetects the disease although the patient does not have it is 9.6%. If a woman at age 40 is tested as positive, what is the probability that she indeed has breast cancer?
- 0% - 30%
- 30% - 60%
- 60% - 100%

P(Cancer | Positive Mammography) = 7.8%
95 doctors out of 100 said the answer is between 70% and 80%

Why the correct answer is so low
Bayes’ Theorem

$$P(\text{cancer} \mid \text{+ve mammography}) = \frac{P(\text{+ve mammography} \mid \text{cancer})\,P(\text{cancer})}{P(\text{+ve mammography} \mid \text{cancer})\,P(\text{cancer}) + P(\text{+ve mammography} \mid \text{no cancer})\,P(\text{no cancer})}$$

The probability that a woman at age 40 has breast cancer is 1%.
[Diagram: women with cancer, women without cancer]
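As a worked check of the 7.8% figure, the three probabilities from the mammography problem can be plugged into Bayes’ theorem. The sketch below is only illustrative: the function name `posterior` and its parameter names are assumptions made for this example and do not come from the slides.

```python
def posterior(prior: float, sensitivity: float, false_positive_rate: float) -> float:
    """Return P(cancer | positive test) using Bayes' theorem."""
    # Probability of having cancer AND testing positive.
    true_positive_mass = sensitivity * prior
    # Probability of not having cancer AND testing positive anyway.
    false_positive_mass = false_positive_rate * (1.0 - prior)
    return true_positive_mass / (true_positive_mass + false_positive_mass)


# Values from the problem statement: 1% prevalence, 80% detection, 9.6% misdetection.
p = posterior(prior=0.01, sensitivity=0.80, false_positive_rate=0.096)
print(f"{p:.1%}")  # 7.8%
```

The answer is low because the 9.6% of misdetections applies to the 99% of women without cancer, which outweighs the 80% detection rate applied to the 1% with cancer.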
[Diagram with the full problem text: women with cancer, women without cancer; answer 7.8%]

Can such visualizations facilitate Bayesian reasoning?

Proposed Visualizations
contingency table, signal detection curves, bar-grain boxes, Bayesian boxes, trees + Euler diagram, frequency grid, Euler diagram + glyphs

Previous Studies
Mainly in Psychology
Claim that Bayesian problem representation impacts comprehension, but …
- Inconsistent findings: the most effective Bayesian problem representation is UNCLEAR
- Inconsistent and sometimes inappropriate diagram designs: diagrams do not match textual information (Sloman et al., 2003); area-proportional vs. not area-proportional
and the subjects …
- Specific background: usually highly focused university students
- Specific age group
- Sometimes a specific department, with the study carried out as part of their course
so … the findings cannot be generalized to a more diverse population of laypeople

Our Work
Assessing the Effect of Visualizations on Bayesian Reasoning through Crowdsourcing
to identify…
- the most effective visualization for the crowd
- whether hybrid visualizations are helpful
- the link between the visualizations and different spatial and numeracy abilities

but… how appropriate is Amazon MTurk?
- Used and evaluated for research and InfoVis
- Demographics of workers are well-understood
- Captures aspects of real-world problem solving better
  - a large, diverse population with different backgrounds, education, occupations, age, gender
  - workers carry out tasks rapidly but accurately to improve their rating
  - reduces experimental biases, such as demand characteristics

http://www.eulerdiagrams.org/eulerGlyphs

Experiment
- 168 workers with MTurk approval rate ≥ 95%
- [Figure: Demographics]
- 25 min, $1
- 3 Bayesian problems: classics in Psychology, in natural frequencies format
- followed by objective and subjective numeracy tests, a paper folding spatial abilities test, and a brief questionnaire

Results
We failed to replicate previous findings
- subjects’ accuracy was remarkably lower
- visualizations exhibited no measurable benefit
even though … subjects were reasonably confident with their answer

Exact answers: 12% overall
- no visualization: 6%
- with visualizations: 14%, 11%, 21%, 7%, 11%, 14%

[Figure: Combined Error. Answer errors for all three Bayesian problems combined per visualization type, V0 (no vis) to V6 (N = 24 each); error axis from 0.0 to 1.2; 6% exact and 21% exact are highlighted]

Exact answers: 12% in our study vs. 40% - 80% in previous studies

Thus, we failed to demonstrate measurable benefits of visualizations for facilitating Bayesian reasoning.

Qualitative Feedback
- 53 out of the 168 subjects participated
- 89% ‘somehow’ used the diagram
- Most found the diagram very useful
BUT
- Several did not understand the diagram
- Some doubted the diagram’s credibility

However
Subjects must understand and trust the diagram: the answer is in the visualization
[Diagram with the full problem text: women with cancer, women without cancer; answer 7.8%]

How
- either help them understand and relate the diagram to the text
- or force them to get the answer from the diagram: change the text

Another Experiment
- 480 workers with MTurk approval rate ≥ 95% who did not participate in experiment 1
- 1 Bayesian problem: the Mammography problem

Classic text:
10 out of every 1,000 women at age forty who participate in routine screening have breast cancer. 8 of every 10 women with breast cancer will get a positive mammography. 95 out of every 990 women without breast cancer will also get a positive mammography.

Text with instructions:
10 out of every 1,000 women at age forty who participate in routine screening have breast cancer (compare the red dots in the diagram below with the total number of dots). 8 of every 10 women with breast cancer will get a positive mammography (compare the red dots that have a black border with the total number of red dots). 95 out of every 990 women without breast cancer will also get a positive mammography (compare the blue dots that have a black border with the total number of blue dots).

Text without numbers:
A small minority of women at age forty who participate in routine screening have breast cancer. A large proportion of women with breast cancer will get a positive mammography. A small proportion of women without breast cancer will also get a positive mammography.

Results
The Most Effective Textual Representation: the text without numbers

Exact answers
- classic text + no visualization: 3.3%
- classic text + visualization: 5%
- text with instructions + visualization: 5%
- text without numbers + visualization: 1 exact answer (N=120)

[Figure: Mammography Error. Answer errors for the Mammography Bayesian problem per presentation type (N = 120 each): V0 classic + no vis, V4 classic + vis, V4a with instructions + vis, V4b without numbers + vis; error axis from 0.0 to 2.5]

Conclusion
Using crowdsourcing, we assessed 6 visualizations and text alone for 3 classic Bayesian problems
We failed to replicate previous findings
- subjects’ accuracy was remarkably lower
- visualizations exhibited no measurable benefit
A follow-up experiment confirmed …
- simply adding a visualization to a textual Bayesian problem does not help
- diagrams can help, but numerical values have to be removed and the text should be used merely to set the scene
We need …
- novel visualizations that holistically combine text and visualization and promote the use of estimation rather than calculation
- more studies in settings that better capture real-life rapid decision making
To … facilitate reasoning about statistical information for both laypeople and professionals

Thanks
Luana Micallef, Pierre Dragicevic, Jean-Daniel Fekete

$$\text{error} = \log_{10}\left(\frac{\text{answer}_{\text{given}}}{\text{answer}_{\text{expected}}}\right)$$
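The error measure that closes the slides can be illustrated with the natural-frequency version of the Mammography problem. This is a minimal sketch that reads the formula as the base-10 logarithm of the ratio between the given and the expected answer. The function names `expected_answer` and `log_error`, and the sample answer of 80%, are assumptions made for illustration and are not taken from the study.

```python
import math


def expected_answer(true_positives: float, false_positives: float) -> float:
    """Share of positive mammographies that come from women with cancer."""
    return true_positives / (true_positives + false_positives)


def log_error(answer_given: float, answer_expected: float) -> float:
    """Log-ratio error: 0 for an exact answer, positive for overestimates,
    negative for underestimates. Undefined when answer_given is 0."""
    return math.log10(answer_given / answer_expected)


# Natural frequencies from the classic text: 8 of the 10 women with cancer and
# 95 of the 990 women without cancer get a positive mammography.
expected = expected_answer(true_positives=8, false_positives=95)
print(round(expected, 3))                    # 0.078
print(log_error(expected, expected))         # 0.0
print(round(log_error(0.80, expected), 2))   # 1.01
```

Under this reading, an error of about 1 means the given answer is roughly ten times the expected one, which is the order of magnitude of answering 80% when the expected answer is 7.8%.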