Effective Displays of Data Need More Attention in Statistics Education
Thomas E. Bradstreet, Ph.D.
Experimental Medicine Statistics, Merck Research Labs
Michael Nessly, M.S.
Clinical Biostatistics, Merck Research Labs
Thomas H. Short, Ph.D.
Mathematics Department, Indiana University of PA
JSM 2006
Outline
Motivation
Philosophy, strategy, approach
Information continuum
Perception, design, and construction
Educational objectives
Hands-on activities, interactive discussion, real data
Course content
Parallel coverage for graphs (70%) and tables (30%)
Interactive discussions
Workshops and examples
Student evaluations
Questions and comments
Motivation
Why are graphs and tables so important?
Importance of Graphs and Tables
In General
Data analysis
Graphs reveal structure and patterns.
Tables organize and document findings.
Communicate results from experiments and surveys
Oral presentations
Written reports and refereed publications
Target audiences
Unfamiliar with details of the data
Less skilled quantitatively
More statistically naïve
Internal or external to an organization
Importance of Graphs and Tables
Internal Communications – Industry
Document past activities; summarize ongoing efforts; support decisions on future initiatives
Example: Pharmaceutical research (animal and human)
Science is communicated through a series of oral presentations and written reports.
Critical review by different scientific disciplines and levels of management
Competition for resources
Education and training
Interdisciplinary communication
Importance of Graphs and Tables
External Activities – In General
Professional Meetings: presentations
Refereed journals: publications
Competitions: best written paper, best oral presentation, best data analysis
Importance of Graphs and Tables
External Activities – Industry and Academia
Industry
Product research, development, and marketing
Productivity and fiscal health
Recruiting efforts
Academia
Grant writing
Seminars and colloquia
Consulting and contract work
Student internships
Job interviews
Academic – Industrial collaborations
Importance of Graphs and Tables
Academic Preparation
Teaching: statistics, many service disciplines
Course work: data analysis, modeling, simulation
Research (Ph.D., M.S., B.S. honors): oral and written presentation
Qualifying exams: RSS Examination Board concerns and guidance
Course Philosophy, Strategy, and Approach
Information Continuum
Data Analysis
Presentation
Exploration
Understanding
Communication
Discovery
Inference
Clarity
Insight
Decisions
Efficiency
Effective Communication = Perception ∩ Design ∩ Construction
Figure: Venn diagram of Perception, Design, and Construction, with Effective Communication at their intersection.
Educational Objectives
1. Provide exposure to the principles of perception, design, and construction.
2. Be able to construct, revise, critique, and interpret graphic and tabular displays.
3. Take a more informed leadership role in effective communication and strategic decision making.
4. Build upon the intellectual tools and resources provided by the course.
Pedagogical Strategy
Workshop and example driven
Interactive discussions
Real examples and data
Merck studies
Scientific literature
Mass media
Course Content
Parallel coverage for graphs (70%) and tables (30%).
Course Content
Introduction
Importance of graphs and tables
Graphs vs. tables vs. text
Context, common sense
“Grables”
Motivating examples
“Grables”
(More Graph Than Table)
Figure: Three Bioequivalence Trials. Vertical axis: AUC Ratio (Test/Standard); horizontal axis: Trial 1, Trial 2, Trial 3. Numeric values are printed directly on the plot.
“Grables”
(More Table Than Graph)
Individual (Ο) and Mean Percent Changes From Baseline, Hour 24

Treatment            N    Mean     SD     Min    Max
10.0 mg Fasted       3    0.47   0.06     0.4    0.5
5.0 mg Fasted        6   -0.23   0.31    -0.6    0.2
2.0 mg Fasted        6    0.28   0.31    -0.3    0.6
1.0 mg Fasted        6    0.63   0.65    -0.4    1.5
0.5 mg Fasted        6    0.02   0.35    -0.5    0.4
0.2 mg Fasted        6   -0.22   0.26    -0.6    0.0
Placebo (Panel B)    5    0.06   0.26    -0.3    0.4
Placebo (Panel A)    6    0.13   0.37    -0.4    0.5

Figure: each table row is paired with a plot of the individual and mean values on a Change (%) axis marked at -1, 0, and 1.
Course Content
Design and Construction
Anatomy
Graphs: Flatland; small multiples; multifunctioning graphical elements; specific components, …
Tables: Vertical and horizontal alignment, specific components, …
Guidelines
Graphs: Effective graphs; erase non-data-ink and redundant data-ink; data density; small multiples, …
Tables: Create a logical visual pattern; rounding numbers, …
Workshops 1 and 2
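The table guideline on rounding numbers can be illustrated with a small helper. This is only a sketch: the slides do not say how many digits to keep, so the two-significant-digit default and the function name `round_sig` are assumptions made for illustration.

```python
from math import floor, log10


def round_sig(x: float, digits: int = 2) -> float:
    """Round x to the given number of significant digits (default 2, an assumption)."""
    if x == 0:
        return 0.0
    # Position of the leading digit: 3 for 1234.5, -2 for 0.04567.
    exponent = floor(log10(abs(x)))
    return round(x, digits - 1 - exponent)


# Hypothetical table entries, before and after rounding.
raw = [1234.5, 87.62, 0.04567]
rounded = [round_sig(v) for v in raw]
print(rounded)
```

Rounded entries make the dominant pattern in a column easier to scan; the unrounded values can still be kept in an archival table.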
Course Content
Perception, clarity, and communication
Graphs: Lie factor, visual area vs. numeric measure, proportionality, aspect ratio, mental subtraction, chartjunk, scales, scale breaks, zero, plotting symbols, reference lines, color, Cleveland’s ordered perceptual tasks, …
Tables: Illustrative, archival/storage, presentation, text, matrix, …
Workshop 3
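Of the perception topics listed above, the lie factor is the easiest to compute. The sketch below follows Tufte's definition (size of the effect shown in the graphic divided by size of the effect in the data); the function names and the example numbers are hypothetical.

```python
def effect_size(first: float, second: float) -> float:
    """Relative change from first to second (1.0 means a 100% change)."""
    if first == 0:
        raise ValueError("first value must be non-zero")
    return abs(second - first) / abs(first)


def lie_factor(data_first: float, data_second: float,
               drawn_first: float, drawn_second: float) -> float:
    """Effect shown in the graphic divided by the effect in the data."""
    data_effect = effect_size(data_first, data_second)
    if data_effect == 0:
        raise ValueError("the data show no change, so the ratio is undefined")
    return effect_size(drawn_first, drawn_second) / data_effect


# Hypothetical example: the data double (10 to 20), but the symbol's drawn
# area quadruples (1.0 to 4.0) because both its width and height were doubled.
print(lie_factor(10, 20, 1.0, 4.0))  # 3.0
```

A value near 1 means the drawn sizes are proportional to the numbers; values well above or below 1 indicate that visual area and numeric measure disagree.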
Course Content
Overheads
Software
Visual Area vs. Numerical Measure
Dot Chart vs. Pie Chart
Figure: nine labeled amounts (labels A through I; amounts 20, 25, 40, 50, 70, 90, 150, 180, and 220) shown as a pie chart and as a dot chart. Dot chart axes: Labels, Amounts.
Proportionality: Data, Lines, Curves
Square Scatter Plot
Figure: NET AUC (pg•hr/mL × 10^-3), Drug D (vertical) versus Placebo (horizontal), with both axes running from 80 to 280.
Proportionality: Data, Lines, Curves
Does physical slope = algebraic slope?
Figure: the Drug D versus Placebo scatter plot, with both axes running from 80 to 280, drawn in Portrait and in Landscape orientation.
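The question of whether physical slope equals algebraic slope can be worked through numerically. The sketch below assumes linear axes and converts a slope in data units into the slope of the line as drawn; the function name and the plot dimensions are hypothetical, while the 80 to 280 axis range comes from the slides.

```python
def physical_slope(algebraic_slope: float,
                   x_range: tuple, y_range: tuple,
                   width: float, height: float) -> float:
    """Slope of the drawn line, given axis ranges and plot size in the same length unit."""
    x_lo, x_hi = x_range
    y_lo, y_hi = y_range
    x_scale = width / (x_hi - x_lo)    # drawn length per data unit, horizontally
    y_scale = height / (y_hi - y_lo)   # drawn length per data unit, vertically
    return algebraic_slope * y_scale / x_scale


axes = (80, 280)  # both axes run from 80 to 280, as on the slides

# A line of identity (algebraic slope 1) under three hypothetical plot shapes.
square = physical_slope(1.0, axes, axes, width=4.0, height=4.0)
portrait = physical_slope(1.0, axes, axes, width=3.0, height=6.0)
landscape = physical_slope(1.0, axes, axes, width=6.0, height=3.0)
print(square, portrait, landscape)
```

Only the square plot with equal ranges on both axes draws the line of identity at 45 degrees; the portrait version steepens it and the landscape version flattens it.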
Avoid Mental Subtraction
What is going on here?
Figure: Mean Response versus Time for Drug A and Drug B.
Avoid Mental Subtraction
Average Change in Supine Blood Pressure following Rizatriptan and Placebo (left) and Average Difference between Rizatriptan and Placebo Change (right) at Each Time Point
Figure: two rows of panels, Supine Diastolic B.P. (mmHg) and Supine Systolic B.P. (mmHg), plotted against Hours Postdosing (0, 2, 4, 8, 12). Left panels: Mean Change from Baseline for Placebo and MK. Right panels: Difference in Mean Changes (MK - Pbo).
Baseline, diastolic: MK = 76.1, Pbo = 77.2; MK - Pbo = -1.1
Baseline, systolic: MK = 116.6, Pbo = 116.8; MK - Pbo = -0.2
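The idea behind avoiding mental subtraction is to compute the difference once and plot it, rather than leave readers to estimate the gap between two curves. A minimal sketch follows, using the baseline values quoted on the slide; the function name and the rounding to one decimal place are assumptions.

```python
def paired_differences(treatment, control, ndigits=1):
    """Element-wise treatment minus control, rounded for display."""
    if len(treatment) != len(control):
        raise ValueError("both series need one value per time point")
    return [round(t - c, ndigits) for t, c in zip(treatment, control)]


# Baseline supine blood pressure (mmHg) from the slide: diastolic, then systolic.
mk_baseline = [76.1, 116.6]
pbo_baseline = [77.2, 116.8]
print(paired_differences(mk_baseline, pbo_baseline))  # [-1.1, -0.2]
```

The same call applied to the mean changes at each time point would give the series for a difference panel.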
Moiré Effects
Error Bars (No)
Figure: Plasma Nicorandil Concentration (ng/ml), 0 to 600, versus Time (hrs), 0 to 10, for doses of 10 mg, 20 mg, 40 mg, and 60 mg.
Workshops
Workshop 1: Constructing Graphs
An “Unusual Episode” and “Favorite Datasets”
Workshop 2: Constructing Tables
Age discrimination data and “Favorite Datasets”
Workshop 3: Revised Graphs and Tables
Existing graphs and tables, internal to Merck, published in literature, and others
Workshop Approach and Benefits
Mix of categorical and continuous examples
Groups of six prepare and present results
Variability of approaches interests class and staff
Variability of backgrounds and responsibilities leads to discussion and division of labor
Workshops break up lecture presentations
Workshop 1: Oral Toxicity in Dogs
Workshop 2: Alcohol Interaction Study in Men
Workshop 3: Space Shuttle Data
Student Course Evaluations
Of those completing the course evaluation form …
26/27 (96.3%): Said the course will assist them in preparing effective displays of data.
28/29 (96.6%): Strongly agreed or agreed that the workshops were interesting, helpful, and fun.
28/29 (96.6%): Rated the course as either excellent (62.1%) or good (34.5%).
27/27 (100%): Would recommend this course to others in their discipline with similar job responsibilities.
Some students’ comments:
“I was able to add my comments where necessary in areas I knew about. In areas I was not knowledgeable, I watched and learned.”
“The workshops were really valuable in reinforcing or giving meaning to the things we learned about and to consider when trying to communicate well with graphical displays of data.”
“A great course filling a core need within Merck. I will be discussing with RA management how attendance will improve effective external communications and enhance regulatory interactions.”
“I agree that it (the course) needs to be made mandatory for really many disciplines, Clin. Pharm., Clin. Research, Drug Metabolism, Safety Assessment, BARDS, Reg. Affairs, Epidemiology, WCDMO.”
“I never got this or saw it offered before in school or at Merck.”
“Relevant to the job, relevant to push Merck back to the top of big pharma.”
“Everyone can walk away with something they didn’t know before.”
Summary
Both students and working professionals need to be better equipped in creating and interpreting graphical and tabular displays of data.
Participants in the course see the immediate value of their course experience, and they are motivated to continue to improve their skill sets.
Our activity-based approach was well received.
In our scientific and regulatory environment, effective displays of data are essential to communication, decision making, and competition. This applies to other environments as well.
References
Anscombe, F.J. (1973). “Graphs in Statistical Analysis”, The American Statistician, 27: 17-21.
Becker, R.A. and S. Keller-McNulty (1996). “Presentation Myths”, The American Statistician, 50: 112-115.
Block, J.R. and H.E. Yuker (1992). Can You Believe Your Eyes?, New York: Brunner/Mazel Publishers.
Bradstreet, T.E. (1999). “Graphical Excellence – The Importance of Sound Principles and Practices for Effective Communication”, Bulletin of the International Statistical Institute, Book 2, 52nd Session, Helsinki, Finland, August 10-18, 1999, 271-274.
Chambers, J.M., W.S. Cleveland, B. Kleiner, and P.A. Tukey (1983). Graphical Methods for Data Analysis, Belmont, CA: Wadsworth International Group and Boston, MA: Duxbury Press.
Cleveland, W.S. (1984). “Graphical Methods for Data Presentation: Full Scale Breaks, Dot Charts, and Multibased Logging”, The American Statistician, 38(4): 270-280.
Cleveland, W.S. (1985, 1994). The Elements of Graphing Data, Monterey, CA: Wadsworth Advanced Books and Software and Summit, NJ: Hobart Press.
Cleveland, W.S., ed. (1988). The Collected Works of John W. Tukey, Volume V Graphics: 1965-1985, Pacific Grove, CA: Wadsworth and Brooks/Cole Advanced Books and Software.
Cleveland, W.S. (1993). Visualizing Data, Summit, NJ: Hobart Press.
Cleveland, W.S. and M.E. McGill (1988). Dynamic Graphics for Statistics, Belmont, CA: Wadsworth and Brooks/Cole Advanced Books and Software.
Cleveland, W.S. and R. McGill (1984). “Graphical Perception: Theory, Experimentation, and Application to the Development of Graphical Methods”, Journal of the American Statistical Association, 79(387): 531-554.
Dalal, S.R., E.B. Fowlkes, and B. Hoadley (1989). “Risk Analysis of the Space Shuttle: Pre-Challenger Prediction of Failure”, Journal of the American Statistical Association, 84(408): 945-957.
Dalal, S. and B. Hoadley (1991). “Comment”, Journal of the American Statistical Association, 86(416): 921-922.
David, H. (1998). “Pictures, Please!”, RSS News, 26(1): 7.
Ehrenberg, A.S.C. (1977). “Rudiments of Numeracy”, Journal of the Royal Statistical Society, Series A, 140(3): 277-297.
Farquhar, A.B. and H. Farquhar (1891). Economic and Industrial Delusions, New York: G.P. Putnam’s Sons.
Gould, A.L., H. Kaplan, P.A. Lachenbruch, and K. Monti (1999). “Guidelines for Preparing Effective Presentations”, http://www.enar.org/presentationguidelines.htm.
Kosslyn, S.M. (1994). Elements of Graph Design, New York: W.H. Freeman and Company.
Lavine, M. (1991). “Problems in Extrapolation Illustrated With Space Shuttle O-Ring Data”, Journal of the American Statistical Association, 86(416): 919-921.
Oliver, F. (1998). “How to Present Information in Graphics and Diagrams”, Notes on Behalf of the Examinations Board, Royal Statistical Society.
Pikounis, B., T.E. Bradstreet, and S.P. Millard (2001). “Graphical Insight and Data Analysis for the 2,2,2 Crossover Design”, Chapter 7 in S.P. Millard and A. Krause (eds.), Applied Statistics in the Pharmaceutical Industry with Case Studies Using S-PLUS, New York: Springer-Verlag.
Robbins, N.B. (2005). Creating More Effective Graphics, Hoboken: John Wiley & Sons.
Short, T.H. and T.E. Bradstreet (1997, 2001). http://www.math.iup.edu/~tshort/bradstreet/
Sprent, P. (1998). “Conference Presentations Need Improving”, RSS News, 26(4): 12-13.
Tufte, E.R. (1983). The Visual Display of Quantitative Information, Cheshire, CT: Graphics Press.
Tufte, E.R. (1990). Envisioning Information, Cheshire, CT: Graphics Press.
Tufte, E.R. (1997). Visual Explanations, Cheshire, CT: Graphics Press.
Tufte, E.R. (2001). The Visual Display of Quantitative Information, Second Edition, Cheshire, CT: Graphics Press.
Tufte, E.R. (2003). The Cognitive Style of PowerPoint, Cheshire, CT: Graphics Press.
Tukey, J.W. (1977). Exploratory Data Analysis, Reading, MA: Addison-Wesley Publishing Company.
Tukey, J.W. (1990). “Data-Based Graphics: Visual Display in the Decades to Come”, Statistical Science, 5: 327-339.
Tukey, J.W. (1993). “Graphic Comparisons of Several Linked Aspects: Alternatives and Suggested Principles” (with Discussions and Rejoinder), Journal of Computational and Graphical Statistics, 2: 1-49.
Wainer, H. (1984). “How to Display Data Badly”, The American Statistician, 38: 137-147.
Wainer, H. (1997). Visual Revelations, New York: Copernicus, Springer-Verlag.
Wainer, H. (2005). Graphic Discovery, Princeton: Princeton University Press.
Wilkinson, L. (2005). The Grammar of Graphics, Second Edition, New York: Springer.
Acknowledgements
Cindy White (Sr. Statistician Assistant)
Bert Gunter (Biometrics Research)
Larry Gould (Investigative Research)
Vanessa Radcliff (Administrative Assistant)
Questions and Comments
Thank You!
Back-up Slides
Use Common Sense
Figure: two time series for 1960 to 1990 shown side by side. One panel: Sales of SuperCaff Soda on College Campuses ($ Billions). Other panel: Number of Ph.D. Degrees Awarded on College Campuses (Thousands).
Motivating Example
What does this graph tell you?
Figure: Serum Alkaline Phosphatase (vertical axis labeled from 0 to 80 in steps of 5) by Dose (Placebo, 5 mg, 10 mg, 100 mg), with legend entries Day1, Day2, Day3, and Total, and the annotation p-value = .03.
Effective Graphs
Serve a defined purpose: Exploration, understanding, communication.
Show the data.
Tell the truth.
Encourage comparison of different pieces of data.
Reveal a large amount of quantitative information in a small region.
Reveal the data at several levels of detail; effectiveness increases with the complexity of the data.
Are only as complex as required by the task that they are designed to perform; they avoid pomposity
Provide impact: Communicate with clarity, precision, and efficiency.
Are a visual metaphor for the data
Are closely integrated with statistical and verbal descriptions of the data
Erase Non-Data-Ink
Figure: AUC (nMol*hr), Drug A (vertical) versus Drug B (horizontal), with both axes marked at 45, 85, 125, 165, and 205 and points plotted with the symbols 1 and 2. Two versions of the plot are shown, labeled NO and YES.
Vertical Alignment
A pleasing vertical pattern makes a table appealing and highlights outliers.
Visual Area vs. Numerical Measure
Stacked Bar Charts
Clinic by Age Categories: What is going on?
Figure: stacked bars of Count (0 to 100) for Clinic 1 through Clinic 4, segmented by age category (21 to 30, 31 to 40, 41 to 50, 51 to 60, and 61 to 72 yrs.).
Visual Area vs. Numerical Measure
Dot Chart
Age Categories by Clinic: What is going on?
Figure: dot chart of Count (0 to 50) by Age (yrs.) category (21 to 30, 31 to 40, 41 to 50, 51 to 60, and 61 to 72) for Clinic 1 through Clinic 4.
Proportionality: Data, Lines, Curves
Does physical slope = algebraic slope?
Non-Zero Origin (105, 210)
Figure: the same plot drawn with Broken Axes and with Unbroken Axes. Horizontal axis marked at 0, 105, 110, and 115; vertical axis marked at 0, 210, 220, and 230.
Avoid Mental Subtraction
Mean Response Over Time
Figure: four panels (Outcome 1 through Outcome 4) of Mean Response (0 to 200) versus Days (0 to 200) for Drug A and Drug B.
Avoid Mental Subtraction
Difference (Drug A – Drug B) in Mean Responses
Figure: Difference (-10 to 40) versus Days (0 to 200).
Abusive Tick Marks and Labels
Figure: Mean Response, % Reflux Time, with tick marks and labels at every integer from 0 to 13, for Placebo (11.3), 40 mg h.s. (7.4), 20 mg b.i.d. (5.9), and 40 mg b.i.d. (2.5).
Abusive Tick Marks and Labels
(All that is really needed is …)

Treatment       Mean % Reflux Time
Pbo                   11.3
40 mg h.s.             7.4
20 mg b.i.d.           5.9
40 mg b.i.d.           2.5
Small Multiples
(Do Not Do This)
Mean Change From Baseline in Supine Diastolic Blood Pressure
Figure: four separate panels (Day 1, Hour 6; Day 1, Hour 24; Day 5, Hour 6; Day 5, Hour 24) of mmHg versus Dose (mg) (P, 50, 100, 150, AC), each drawn with its own vertical scale.
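Small Multiples
(Try This)
Mean Change From Baseline in Supine Diastolic Blood Pressure
Figure: panels for Hour 6 and Hour 24, with Day 1 and Day 5 identified within the panels, sharing one vertical axis (mmHg, 0 to -20) and one Dose (mg) axis (P, 50, 100, 150, AC).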