Effective Displays of Data Need More Attention in Statistics Education

Thomas E. Bradstreet, Ph.D., Experimental Medicine Statistics, Merck Research Labs
Michael Nessly, M.S., Clinical Biostatistics, Merck Research Labs
Thomas H. Short, Ph.D., Mathematics Department, Indiana University of PA
JSMs 2006

Outline
- Motivation
- Philosophy, strategy, approach
  - Information continuum
  - Perception, design, and construction
  - Educational objectives
  - Hands-on activities, interactive discussion, real data
- Course content
  - Parallel coverage for graphs (70%) and tables (30%)
  - Interactive discussions
  - Workshops and examples
- Student evaluations
- Questions and comments

Motivation

Why are graphs and tables so important?

Importance of Graphs and Tables: In General
- Data analysis
  - Graphs reveal structure and patterns.
  - Tables organize and document findings.
- Communicate results from experiments and surveys
  - Oral presentations
  - Written reports and refereed publications
- Target audiences
  - Unfamiliar with details of the data
  - Less skilled quantitatively
  - More statistically naïve
  - Internal or external to an organization

Importance of Graphs and Tables: Internal Communications – Industry
- Document past activities; summarize ongoing efforts; support decisions on future initiatives
- Example: Pharmaceutical research (animal and human)
  - Science is communicated through a series of oral presentations and written reports.
  - Critical review by different scientific disciplines and levels of management
  - Competition for resources
  - Education and training
  - Interdisciplinary communication

Importance of Graphs and Tables: External Activities – In General
- Professional meetings: presentations
- Refereed journals: publications
- Competitions: best written paper, best oral presentation, best data analysis

Importance of Graphs and Tables: External Activities – Industry and Academia
- Industry
  - Product research, development, and marketing
  - Productivity and fiscal health
  - Recruiting efforts
- Academia
  - Grant writing
  - Seminars and colloquia
  - Consulting and contract work
  - Student internships
  - Job interviews
- Academic – Industrial collaborations

Importance of Graphs and Tables: Academic Preparation
- Teaching: statistics, many service disciplines
- Course work: data analysis, modeling, simulation
- Research (Ph.D., M.S., B.S. honors): oral and written presentation
- Qualifying exams: RSS Examination Board concerns and guidance

Course Philosophy, Strategy, and Approach

Information Continuum
[Diagram: a continuum running from Data Analysis to Presentation]
  Exploration    Understanding    Communication
  Discovery      Inference        Clarity
  Insight        Decisions        Efficiency

Effective Communication = Perception ∩ Design ∩ Construction
[Venn diagram: Perception, Design, and Construction, with Effective Communication at their intersection]

Educational Objectives
1. Provide exposure to the principles of perception, design, and construction.
2. Be able to construct, revise, critique, and interpret graphic and tabular displays.
3. Take a more informed leadership role in effective communication and strategic decision making.
4. Build upon the intellectual tools and resources provided by the course.

Pedagogical Strategy
- Workshop and example driven
- Interactive discussions
- Real examples and data
  - Merck studies
  - Scientific literature
  - Mass media

Course Content
Parallel coverage for graphs (70%) and tables (30%).

Course Content: Introduction
- Importance of graphs and tables
- Graphs vs. tables vs. text
- Context, common sense
- “Grables”
- Motivating examples

“Grables” (More Graph Than Table)
[Figure: Three Bioequivalence Trials. y-axis: AUC Ratio (Test/Standard), with ticks at 0.50, 0.80, 1.00, 1.25, 1.80, and 2.70; x-axis: Trial 1, Trial 2, Trial 3; numeric values are annotated in parentheses on the plot.]

“Grables” (More Table Than Graph)
[Figure: Individual and Mean Percent Changes From Baseline, Hour 24. Rows: 10.0 mg, 5.0 mg, 2.0 mg, 1.0 mg, 0.5 mg, and 0.2 mg Fasted, Placebo (Panel B), Placebo (Panel A). Tabular columns N, Mean, SD, Min, and Max sit beside a plot of Change (%).]

Course Content: Design and Construction
- Anatomy
  - Graphs: Flatland; small multiples; multifunctioning graphical elements; specific components, …
  - Tables: Vertical and horizontal alignment, specific components, …
- Guidelines
  - Graphs: Effective graphs; erase non-data-ink and redundant data-ink; data density; small multiples, …
  - Tables: Create a logical visual pattern; rounding numbers, …
- Workshops 1 and 2

Course Content: Perception, Clarity, and Communication
- Graphs: Lie factor, visual area vs. numeric measure, proportionality, aspect ratio, mental subtraction, chartjunk, scales, scale breaks, zero, plotting symbols, reference lines, color, Cleveland’s ordered perceptual tasks, …
- Tables: Illustrative, archival/storage, presentation, text, matrix, …
- Workshop 3

Course Content
- Overheads
- Software

Visual Area vs. Numerical Measure: Dot Chart vs. Pie Chart
[Figure: nine labeled amounts (A through I, ranging from 20 to 220) drawn as a pie chart and as a dot chart; dot chart axes: Labels, Amounts.]

Proportionality: Data, Lines, Curves
Square Scatter Plot
[Figure: square scatter plot of NET AUC (pg•hr/mL × 10³), Drug D vs. Placebo, both axes running from 80 to 280.]

Proportionality: Data, Lines, Curves
Does physical slope = algebraic slope?
[Figure: the same Drug D vs. Placebo scatter plot (both axes 80 to 280) drawn in portrait and in landscape orientation.]

Avoid Mental Subtraction
What is going on here?
[Figure: Mean Response over Time for Drug A and Drug B.]

Avoid Mental Subtraction
Average Change in Supine Blood Pressure Following Rizatriptan and Placebo (left) and Average Difference Between Rizatriptan and Placebo Change (right)
[Figure: Mean Change in Supine Blood Pressure Following MK and Placebo (left) and Difference Between MK and Placebo Mean Change (right) at Each Time Point. Panels: Supine Diastolic B.P. (mmHg) and Supine Systolic B.P. (mmHg); x-axis: Hours Postdosing (0, 2, 4, 8, 12). Baselines: diastolic MK = 76.1, Pbo = 77.2 (MK − Pbo = −1.1); systolic MK = 116.6, Pbo = 116.8 (MK − Pbo = −0.2).]

Moiré Effects

Error Bars (No)
[Figure: Plasma Nicorandil Concentration (ng/ml) vs. Time (hrs) for the 10 mg, 20 mg, 40 mg, and 60 mg doses.]

Workshops
- Workshop 1: Constructing Graphs
  - An “Unusual Episode” and “Favorite Datasets”
- Workshop 2: Constructing Tables
  - Age discrimination data and “Favorite Datasets”
- Workshop 3: Revised Graphs and Tables
  - Existing graphs and tables, internal to Merck, published in literature, and others

Workshop Approach and Benefits
- Mix of categorical and continuous examples
- Groups of six prepare and present results
- Variability of approaches interests class and staff
- Variability of backgrounds and responsibilities leads to discussion and division of labor
- Workshops break up lecture presentations

Workshop 1: Oral Toxicity in Dogs

Workshop 2: Alcohol Interaction Study in Men

Workshop 3: Space Shuttle Data

Student Course Evaluations
Of those completing the course evaluation form …
- 26/27 (96.3%): The course will assist them in preparing effective displays of data.
- 28/29 (96.6%): Strongly agreed or agreed that the workshops were interesting, helpful, and fun.
- 28/29 (96.6%): Rated the course as either excellent (62.1%) or good (34.5%).
- 27/27 (100%): Would recommend this course to others in their discipline with similar job responsibilities.

Some students’ comments:
- “I was able to add my comments where necessary in areas I knew about. In areas I was not knowledgeable, I watched and learned.”
- “The workshops were really valuable in reinforcing or giving meaning to the things we learned about and to consider when trying to communicate well with graphical displays of data.”
- “A great course filling a core need within Merck. I will be discussing with RA management how attendance will improve effective external communications and enhance regulatory interactions.”
- “I agree that it (the course) needs to be made mandatory for really many disciplines, Clin. Pharm., Clin. Research, Drug Metabolism, Safety Assessment, BARDS, Reg. Affairs, Epidemiology, WCDMO.”
- “I never got this or saw it offered before in school or at Merck.”
- “Relevant to the job, relevant to push Merck back to the top of big pharma.”
- “Everyone can walk away with something they didn’t know before.”

Summary
- Both students and working professionals need to be better equipped in creating and interpreting graphical and tabular displays of data.
- Participants in the course see the immediate value of their course experience, and they are motivated to continue to improve their skill sets.
- Our activity-based approach was well received.
- In our scientific and regulatory environment, effective displays of data are essential to communication, decision making, and competition. This applies to other environments as well.

References
Anscombe, F.J. (1973). “Graphs In Statistical Analysis”, The American Statistician, 27: 17-21.
Becker, R.A. and Keller-McNulty, S. (1996). “Presentation Myths”, The American Statistician, 50: 112-115.
Block, J.R., and H.E. Yuker (1992). Can You Believe Your Eyes?, New York: Brunner/Mazel Publishers.
Bradstreet, T.E. (1999). “Graphical Excellence – The Importance of Sound Principles and Practices for Effective Communication”, Bulletin of the International Statistical Institute, Book 2, 52nd Session, Helsinki, Finland, August 10-18, 1999, 271-274.
Chambers, J.M., W.S. Cleveland, B. Kleiner, and P.A. Tukey (1983). Graphical Methods for Data Analysis, Belmont, CA: Wadsworth International Group and Boston, MA: Duxbury Press.
Cleveland, W.S. (1984). “Graphical Methods for Data Presentation: Full Scale Breaks, Dot Charts, and Multibased Logging”, The American Statistician, 38(4): 270-280.
Cleveland, W.S. (1985, 1994). The Elements of Graphing Data, Monterey, CA: Wadsworth Advanced Books and Software and Summit, NJ: Hobart Press.
Cleveland, W.S. (1988). The Collected Works of John W. Tukey, Volume V, Graphics: 1965-1985, Pacific Grove, CA: Wadsworth and Brooks/Cole Advanced Books and Software.
Cleveland, W.S. (1993). Visualizing Data, Summit, NJ: Hobart Press.
Cleveland, W.S. and M.E. McGill (1988). Dynamic Graphics for Statistics, Belmont, CA: Wadsworth and Brooks/Cole Advanced Books and Software.
Cleveland, W.S. and R. McGill (1984). “Graphical Perception: Theory, Experimentation, and Application to the Development of Graphical Methods”, Journal of the American Statistical Association, 79(387): 531-554.
Dalal, S.R., E.B. Fowlkes, and B. Hoadley (1989). “Risk Analysis of the Space Shuttle: Pre-Challenger Prediction of Failure”, Journal of the American Statistical Association, 84(408): 945-957.
Dalal, S. and B. Hoadley (1991). “Comment”, Journal of the American Statistical Association, 86(416): 921-922.
David, H. (1998). “Pictures, Please!”, RSS News, 26(1): 7.
Ehrenberg, A.S.C. (1977). “Rudiments of Numeracy”, Journal of the Royal Statistical Society, Series A, 140(3): 277-297.
Farquhar, A.B. and H. Farquhar (1891). Economic and Industrial Delusions, New York: G.P. Putnam’s Sons.
Gould, A.L., H. Kaplan, P.A. Lachenbruch, and K. Monti (1999). “Guidelines for Preparing Effective Presentations”, http://www.enar.org/presentationguidelines.htm.
Kosslyn, S.M. (1994). Elements of Graph Design, New York: W.H. Freeman and Company.
Lavine, M. (1991). “Problems in Extrapolation Illustrated With Space Shuttle O-Ring Data”, Journal of the American Statistical Association, 86(416): 919-921.
Oliver, F. (1998). “How to Present Information in Graphics and Diagrams”, Notes on Behalf of the Examinations Board, Royal Statistical Society.
Pikounis, B., T.E. Bradstreet, and S.P. Millard (2001). “Graphical Insight and Data Analysis for the 2,2,2 Crossover Design”, Chapter 7 in S.P. Millard and A. Krause (eds.), Applied Statistics in the Pharmaceutical Industry with Case Studies Using S-PLUS, New York: Springer-Verlag.
Robbins, N.B. (2005). Creating More Effective Graphics, Hoboken: John Wiley & Sons.
Short, T.H. and T.E. Bradstreet (1997, 2001). http://www.math.iup.edu/~tshort/bradstreet/
Sprent, P. (1998). “Conference Presentations Need Improving”, RSS News, 26(4): 12-13.
Tufte, E.R. (1983). The Visual Display of Quantitative Information, Cheshire, CT: Graphics Press.
Tufte, E.R. (1990). Envisioning Information, Cheshire, CT: Graphics Press.
Tufte, E.R. (1997). Visual Explanations, Cheshire, CT: Graphics Press.
Tufte, E.R. (2001). The Visual Display of Quantitative Information, Second Edition, Cheshire, CT: Graphics Press.
Tufte, E.R. (2003). The Cognitive Style of PowerPoint, Cheshire, CT: Graphics Press.
Tukey, J.W. (1977). Exploratory Data Analysis, Reading, MA: Addison-Wesley Publishing Company.
Tukey, J.W. (1990). “Data-Based Graphics: Visual Display in the Decades to Come”, Statistical Science, 5: 327-339.
Tukey, J.W. (1993). “Graphic Comparisons of Several Linked Aspects: Alternatives and Suggested Principles” (with Discussions and Rejoinder), Journal of Computational and Graphical Statistics, 2: 1-49.
Wainer, H. (1984). “How to Display Data Badly”, The American Statistician, 38: 137-147.
Wainer, H. (1997). Visual Revelations, New York: Copernicus, Springer-Verlag.
Wainer, H. (2005). Graphic Discovery, Princeton: Princeton University Press.
Wilkinson, L. (2005). The Grammar of Graphics, Second Edition, New York: Springer.

Acknowledgements
- Cindy White (Sr. Statistician Assistant)
- Bert Gunter (Biometrics Research)
- Larry Gould (Investigative Research)
- Vanessa Radcliff (Administrative Assistant)

Questions and Comments

Thank You!

Backup Slides

Use Common Sense
[Figure: two time series for 1960 to 1990: Sales of SuperCaff Soda on College Campuses ($ Billions) and Number of Ph.D. Degrees Awarded on College Campuses (Thousands).]

Motivating Example
What does this graph tell you?
[Figure: Serum Alkaline Phosphatase (y-axis 0 to 80, ticked every 5 units) by Dose (Placebo, 5 mg, 10 mg, 100 mg); legend: Day1, Day2, Day3, Total; annotated p-value = .03.]

Effective Graphs
- Serve a defined purpose: exploration, understanding, communication.
- Show the data.
- Tell the truth.
- Encourage comparison of different pieces of data.
- Reveal a large amount of quantitative information in a small region.
- Reveal the data at several levels of detail; effectiveness increases with the complexity of the data.
- Are only as complex as required by the task that they are designed to perform; they avoid pomposity.
- Provide impact: communicate with clarity, precision, and efficiency.
- Are a visual metaphor for the data.
- Are closely integrated with statistical and verbal descriptions of the data.

Erase Non-Data-Ink
[Figure: Drug A vs. Drug B scatter plots of AUC (nMol*hr), axes 45 to 205, points plotted with the symbols 1 and 2; a cluttered version (NO) beside a cleaned-up version (YES).]

Vertical Alignment
A pleasing vertical pattern makes a table appealing and highlights outliers.

Visual Area vs. Numerical Measure: Stacked Bar Charts
Clinic by Age Categories: What is going on?
[Figure: stacked bar chart of Count (0 to 100) for Clinic 1 through Clinic 4, stacked by age category: 21 to 30, 31 to 40, 41 to 50, 51 to 60, and 61 to 72 yrs.]

Visual Area vs. Numerical Measure: Dot Chart
Age Categories by Clinic: What is going on?
[Figure: dot chart of Count (0 to 50) by Age (yrs.): 21 to 30, 31 to 40, 41 to 50, 51 to 60, 61 to 72, for Clinic 1 through Clinic 4.]

Proportionality: Data, Lines, Curves
Does physical slope = algebraic slope?
Non-Zero Origin (105, 210)
[Figure: the same line drawn with Broken Axes and with Unbroken Axes; x-axis ticks 0, 105, 110, 115; y-axis ticks 210, 220, 230.]

Avoid Mental Subtraction
Mean Response Over Time
[Figure: four panels (Outcome 1 through Outcome 4) of Mean Response (0 to 200) vs. Days (0 to 200) for Drug A and Drug B.]

Avoid Mental Subtraction
Difference (Drug A – Drug B) in Mean Responses
[Figure: Difference vs. Days (0 to 200).]

Abusive Tick Marks and Labels
[Figure: Mean Response, % Reflux Time, with y-axis tick marks and labels at every integer from 0 to 13. Placebo: 11.3; 40 mg h.s.: 7.4; 20 mg b.i.d.: 5.9; 40 mg b.i.d.: 2.5.]

Abusive Tick Marks and Labels
(All that is really needed is …)
[Figure: Mean % Reflux Time on an axis starting at 0, with each value labeled directly. Pbo: 11.3; 40 mg h.s.: 7.4; 20 mg b.i.d.: 5.9; 40 mg b.i.d.: 2.5.]

Small Multiples (Do Not Do This)
[Figure: Mean Change From Baseline in Supine Diastolic Blood Pressure, in four panels (Day 1, Hour 6; Day 1, Hour 24; Day 5, Hour 6; Day 5, Hour 24), each with its own mmHg scale; x-axis: Dose (mg): P, 50, 100, 150, AC.]

Small Multiples (Try This)
[Figure: Mean Change From Baseline in Supine Diastolic Blood Pressure, with Day 1 and Day 5 shown under Hour 6 and Hour 24 on one common mmHg scale; x-axis: Dose (mg): P, 50, 100, 150, AC.]
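
The difference between the two small-multiples slides comes down to whether each panel picks its own y-axis or all panels share one. A minimal sketch of computing a common scale is given below. The function name `shared_axis_limits`, the 5-unit tick step, and all of the numbers are assumptions made for illustration; the values are hypothetical and are not read from the slides.

```python
import math


def shared_axis_limits(panels, step=5.0):
    """Return one (low, high) axis range that covers every panel.

    panels: dict mapping a panel title to its list of y-values.
    step:   tick spacing; limits are pushed outward to multiples of it.
    """
    values = [y for ys in panels.values() for y in ys]
    low = math.floor(min(values) / step) * step
    high = math.ceil(max(values) / step) * step
    # Always include zero so the "no change" reference is in every panel.
    return min(low, 0.0), max(high, 0.0)


# Hypothetical mean changes from baseline (mmHg), for illustration only.
panels = {
    "Day 1, Hour 6": [-1.0, -6.5, -9.0, -14.2, -11.0],
    "Day 1, Hour 24": [-0.5, -3.0, -4.1, -7.9, -6.0],
    "Day 5, Hour 6": [-2.0, -8.0, -12.5, -18.3, -13.0],
    "Day 5, Hour 24": [-1.2, -4.4, -6.0, -11.7, -8.5],
}

limits = shared_axis_limits(panels)
for title in panels:
    # Every panel is drawn with the same limits, so heights are comparable.
    print(f"{title}: y-axis from {limits[0]:g} to {limits[1]:g} mmHg")
```

With these made-up values every panel is assigned the range from -20 to 0 mmHg, so a reader can compare panels by eye instead of re-reading four different scales.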