Statistical Thinking and Analysis
Donald E. Lighter, MD, MBA, FAAP, FACHE
Professor, University of Tennessee
© 2011 Jones and Bartlett Publishers, LLC

Deming – Theory of Profound Knowledge
• Systems thinking
– System is more than a sum of its parts
– Understanding that the parts interact to produce the end product
– Coordination and collaboration of parts increase productivity of system
– Interconnected subsystems and processes affect each other
• Process variation
– All processes have variation that can either be inherent in the process or due to external influences
– Variation is a major source of nonconforming output leading to reduced quality and higher cost
– Identifying and reducing sources of variation is a major undertaking for performance improvement initiatives
• Theory of knowledge
– Information must be tempered by experience and theory to become knowledge
– Effective managers combine experience and theory to create organizational knowledge
• Psychology
– Understanding the variation in people is as important as understanding the variation in processes
– Successful managers use human psychology to effectively coordinate, collaborate, and motivate workers to optimize system outcomes

Statistical thinking
• Ron Snee (1986): “… statistical thinking is used to describe the thought processes that acknowledge the ubiquitous nature of variation and that its identification, characterization, quantification, control, and reduction provide a unique opportunity for improvement.” “… Every enterprise is made up of a collection of interconnected processes whose input, control variable, and output are subject to variation. This leads to the conclusion that statistical thinking must be used routinely at all levels of the organization.”
Snee, R. D. (1986). "In Pursuit of Total Quality." Quality Progress 20(8): 25-31.

Shewhart’s concept of process variation
• Common cause
– Variation is based on the process materials and procedures;
– Variation is predictable using mathematics related to probability and chance;
– Variation is irregular, i.e. shows no particular pattern; and
– High and low values within the measurements are statistically indistinguishable
• Special (attributable) cause
– Variation due to new, unanticipated, emergent, or previously unknown factors within the system;
– Variation that is entirely unpredictable, even using statistical probability techniques;
– Variation that is outside historical trends; and
– Variation that indicates an underlying change in the system or some previously unidentified factor.

Translated into statistical thinking:
• Common Cause variation
– Data points from process fall inside control limits
– Data points are statistically indistinguishable
• Special Cause variation
– Data points from process fall outside control limits (3 standard deviations)
– < 0.3% (3 chances per 1000) probability of occurrence
• Tampering
– Deming’s concept of treating common cause variation like a special cause

Why is this stuff important?
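
The translation of common and special cause variation into control limits can be illustrated with a small sketch. It assumes the data are roughly normal and that the center line and standard deviation are already known; the function names and sample values are hypothetical and do not come from the slides.

```python
import math


def outside_probability(k: float = 3.0) -> float:
    """Probability that a normally distributed point falls more than k
    standard deviations from the mean, in either direction."""
    return math.erfc(k / math.sqrt(2))


def classify_points(points, center, sigma, k=3.0):
    """Label each point 'common' (inside the limits) or 'special' (outside)."""
    lower, upper = center - k * sigma, center + k * sigma
    return ["common" if lower <= x <= upper else "special" for x in points]


# Hypothetical measurements; center line and sigma are assumed known.
labels = classify_points([10.2, 9.1, 11.5, 17.3], center=10.0, sigma=2.0)
```

Under the normality assumption, `outside_probability()` comes out near 0.0027, which is where the "< 0.3%" figure comes from. Reacting to a point labelled `common` as if it were `special` is what Deming calls tampering.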

Let’s Review Some Data Collection Rules…

Data collection principles
• Understand need for information collected
• Collect everything you think might be needed
• Least invasive methods of collection
• Operational definition of each data element
• Appropriate format for analysis
• Before starting, review the study to ensure correctness

Common data sources in health care
• Claims
– Pro: Data used to pay claims; analyzed for errors by edits in payer computer system
– Con: Data entry errors (insurer, provider); paucity of information (limited clinical info); inconsistent payments for same services; upcoding; capitation effects
• Medicaid
– Pro: Consistent coding systems within state; population fairly uniform
– Con: Same as above; varying types of plans around the US; tendency to upcode more pronounced
• Medicare
– Pro: Relatively consistent data set; edits tend to reduce coding errors
– Con: Upcoding still a problem; payment schedules vary by region, more than by specialty
• Provider Billing Systems
– Pro: Source data from point of care; usually consistent within a practice
– Con: Broad variation in coding between practices; coding variation also for same services; variety of formats
• Patient Charts
– Pro: Original source data from point of care; complete record of clinical encounter
– Con: Expensive to review; variation in recording; handwriting; variety of recording conventions
• Surveys
– Pro: Measures customer opinions directly; often can be done simply
– Con: Lack of scientific approach, leading to bias; selection bias; validation

Statistical process control (SPC) – a method to understand variation
• Shows trends in the process mean over time
• Evaluates process variability at each point in time
• Provides graphic evidence that process is in control (or not) at each point

Two primary types of data
• Attributes
– Counts of individual items
– Examples?
• Continuous (variables)
– Variables along a measurement scale
– Real numbers, no “gaps” between measures

Types of Control Charts
• Attribute data charts
– p and np charts
– c and u charts
• Continuous data charts
– IX-MR charts
– X-bar and R charts
– X-bar and s charts

Commonly used control charts: Control Charts for Attributes Data

Attribute chart selection
• p-charts
– Proportion of nonconforming units
– Example: C-section rates
• np-charts
– Number of nonconforming units
– Example: maternal deaths
• c-chart
– Nonconformities per inspection unit, constant number of inspection units
– Examples: housekeeping errors per room; missed appointments per day
• u-chart
– Nonconformities per inspection unit, like c, BUT…
– Used when the number of inspection units varies

Attributes data limb of decision tree…

Example c-chart
[Figure: c-chart for XYZ Clinic. Vertical axis: Number of Nonconformities; horizontal axis: Day of Study; series: Nonconformities, Mean, UCL, LCL]

u-charts
u-chart for St. Elsewhere Food Service
Note “wavy” control limit line – why?
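
One way to see why the u-chart limits are "wavy" is to compute them. The sketch below uses the usual textbook u-chart formula (mean rate plus or minus three times the square root of the mean rate divided by that period's number of inspection units); the formula is not stated on the slides, and the counts and unit totals are hypothetical. Flooring the lower limit at zero is a choice made here, not something the slides specify.

```python
import math


def u_chart_limits(counts, units):
    """Return the mean rate and, per period, (u, LCL, UCL).

    counts: nonconformities observed in each period
    units:  number of inspection units in each period (may vary)
    """
    u_bar = sum(counts) / sum(units)
    rows = []
    for c, n in zip(counts, units):
        half_width = 3 * math.sqrt(u_bar / n)
        lcl = max(0.0, u_bar - half_width)  # assumption: floor LCL at zero
        rows.append((c / n, lcl, u_bar + half_width))
    return u_bar, rows


# Hypothetical food-service data: errors found and trays inspected per day.
u_bar, rows = u_chart_limits(counts=[4, 6, 5], units=[100, 200, 100])
```

Because the half-width depends on each period's number of inspection units, a day with more units gets tighter limits than a day with fewer, so the limit lines move up and down from point to point.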
[Figure: u-chart. Vertical axis: u-value; horizontal axis: Day of Data Collection; series: u, Mean u, UCL, LCL]

Commonly used control charts: Control Charts for Continuous Variables

Continuous (variables) data limb of decision tree…

IX-MR chart creation
• Example: ALOS for a hospital
– Data obtained from a hospital over 24 months
– Calculate mean of all samples, plot as center line
– Calculate the moving ranges (MR) and their average, MR-bar
– Control limits: MR chart UCL = D4 * MR-bar (D4 is the “correction factor”, see Table 5.7, p 191 in the text); IX chart limits = mean ± 2.66 * MR-bar (2.66 is 3/d2 for a moving range of two points)
– Plot on graph
Remember: software does this work for you…

IX-MR chart
[Figure: IX-MR chart. IX Chart panel: Average 7.3636, UCL 24.7170, LCL -9.9897; horizontal axis: Date/Time/Period. MR Chart panel: vertical axis Range, center line 6.5238, UCL 21.3133; horizontal axis: Date/Time/Period/Number]
Note out of control MR chart point!
IX chart has sample size of 1
Moving range is the absolute difference between successive points and is a surrogate for standard deviation (with correction factor)

IX-MR chart
• What’s important?
– MR Chart – what does it mean if the MR is out of control?
– IX Chart – what does it mean if an IX value is out of control?
– What other analyses could we do?
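
The IX-MR procedure above can be sketched in a few lines. This is a minimal illustration, assuming the standard constants for a moving range of two points (2.66 for the IX limits and D4 = 3.267 for the MR upper limit); the function name and sample data are hypothetical.

```python
E2 = 2.66    # 3 / d2, with d2 = 1.128 for a moving range of two points
D4 = 3.267   # upper-limit factor for ranges of two points


def ix_mr_limits(values):
    """Compute center lines and control limits for an IX-MR chart."""
    moving_ranges = [abs(b - a) for a, b in zip(values, values[1:])]
    mean = sum(values) / len(values)
    mr_bar = sum(moving_ranges) / len(moving_ranges)
    return {
        "ix_center": mean,
        "ix_ucl": mean + E2 * mr_bar,
        "ix_lcl": mean - E2 * mr_bar,
        "mr_center": mr_bar,
        "mr_ucl": D4 * mr_bar,  # the MR lower limit is zero for two points
    }


# Hypothetical monthly average length-of-stay values.
limits = ix_mr_limits([5.0, 7.0, 6.0, 9.0])
```

As a check against the chart on the slide: with a center line of 7.3636 and an average moving range of 6.5238, 7.3636 ± 2.66 × 6.5238 gives about 24.717 and -9.990, and 3.267 × 6.5238 gives about 21.313, which agree with the plotted limits.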

Common control charts depend on…
• “Reasonable” conformity of the data set to a Gaussian (normal, bell shaped) distribution
• Most analysis programs will provide a histogram of the data to determine if data are normally distributed

[Figure: IX-MR Histogram. Vertical axis: Number; horizontal axis runs from about -9.99 to 24.71, the IX chart control limits]
Note bimodal distribution of data, indicating reason for MR chart in previous slide to be out of control; thus, IX-MR may not be appropriate for this data set

What if the MR is out of control?
• Determine special cause using root cause analysis and eliminate
• Re-run the analysis with special cause eliminated
• Track data through more cycles to ensure that attributable cause was correctly identified
• Other options:
– Data transformation, e.g. natural log of each point
– Usually better to identify special cause

Other types of continuous variable charts

X-bar and Range Chart
• Similar to IX-MR chart, except:
– Subgroup size = 2 – 9
– Measure of variation is range
• Procedure:
– Mean of each subgroup plotted
– Mean of those means is centerline
– Range of each subgroup plotted
– Mean of those ranges is centerline
– D4 is used to adjust ranges to control limits
– A2 is used to create X-bar control limits

X-bar-R chart
[Figure: X-bar-R chart, “Phlebotomist Time - Notify to Draw”. Upper panel: Average Time (X-bar) by Day of Study. Lower panel, “R Chart - Phlebotomist Time”: Subgroup Range by Day of Study]
Note the R chart is in control

The histogram is “reasonably” normally distributed…
[Figure: Histogram. Vertical axis: Number; horizontal axis runs from about -22.82 to 59.54 and beyond]
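
The X-bar and R procedure described above can be sketched as follows. The constants are the commonly tabulated A2, D3, and D4 values for subgroup sizes 2 through 9; D3, the lower-limit factor for the range chart, is not mentioned on the slides and is included here as an assumption. The function name and data are hypothetical.

```python
# subgroup size n: (A2, D3, D4), commonly tabulated SPC constants
XBAR_R_CONSTANTS = {
    2: (1.880, 0.0, 3.267),
    3: (1.023, 0.0, 2.574),
    4: (0.729, 0.0, 2.282),
    5: (0.577, 0.0, 2.114),
    6: (0.483, 0.0, 2.004),
    7: (0.419, 0.076, 1.924),
    8: (0.373, 0.136, 1.864),
    9: (0.337, 0.184, 1.816),
}


def xbar_r_limits(subgroups):
    """Compute X-bar and R chart limits for equal-sized subgroups."""
    n = len(subgroups[0])
    if any(len(g) != n for g in subgroups):
        raise ValueError("all subgroups must have the same size")
    a2, d3, d4 = XBAR_R_CONSTANTS[n]
    means = [sum(g) / n for g in subgroups]
    ranges = [max(g) - min(g) for g in subgroups]
    grand_mean = sum(means) / len(means)
    r_bar = sum(ranges) / len(ranges)
    return {
        "xbar_center": grand_mean,
        "xbar_ucl": grand_mean + a2 * r_bar,
        "xbar_lcl": grand_mean - a2 * r_bar,
        "r_center": r_bar,
        "r_ucl": d4 * r_bar,
        "r_lcl": d3 * r_bar,
    }


# Hypothetical notify-to-draw times (minutes), three draws per day.
limits = xbar_r_limits([[10, 12, 14], [11, 13, 12], [9, 12, 15]])
```

The range chart is usually read first: if the ranges are out of control, the X-bar limits, which are built from the average range, are not trustworthy.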

The last commonly used continuous variable chart…

X-bar and s chart
• Similar to others, except
– Subgroup size >9
– Measure of variation = sample standard deviation
• Procedure
– Mean of each subgroup plotted
– Mean of those means is centerline
– s of each subgroup calculated and plotted
– Mean of those s-values is centerline
– B4 and B3 are used to adjust s to control limits
– A3 is used to create X-bar control limits

Airflow Example
• Airflow measurements on a clinical unit
• Ten measurements a day, spaced throughout the day
• Subgroup size = 10
• Subgroup time period = 1 day
• Measurements then plotted on x-bar s chart

The Airflow Example
[Figure: X-bar and s chart for Days 1 to 25. Xbar panel: UCL 37.43, CL 33.66, LCL 29.89. SD panel: UCL 6.62731, CL 3.86207, LCL 1.09683]
Note similarity to X-bar-R chart

Data conversion…
• Used when raw data are not normally distributed
• Used when raw data sample sizes are not uniform
• Types of conversions
– Lognormal
– Arcsin
– z-score
• How do we calculate z-scores?

Example Z-score plot
[Figure: Z-Score Chart. Center line (Mean) at 0.000, UCL at 3.000, LCL at -3.000]
Z-scores are the x-values minus the mean, divided by the standard deviation: z = (x - mean) / s

In summary…
• Data types (attribute vs. continuous variables) determine type of control chart
• Control charts have center line (average of control chart means) and upper and lower control limits (±3s)
• For attribute charts, data points are nonconformity values or rates
• For continuous variable charts, data points are sample values or averages of sample values
• Measures of variation for control charts are corrected using bias correction tables

Other useful analyses
• ANOM
• ANOVA
• Regression

Rankings are used in health care
• Concept of rankings
– How are they used?
– Are they valid?
– What about control limits?
  • Measures falling within control limits are common cause - statistically indistinguishable
  • Can’t be ranked!
– Time factor - most rankings are for specific period of time
• Physician or provider profiles – experiences?

Ranking – some approaches to validation
• 95% Confidence Intervals
– Not time series based, usually single point in time
– Help establish the level of variation in the measurement used for the ranking (higher variation, less predictive ability from ranks)
– Still difficult to identify outliers

Percentiles
• Often used for comparisons
– Examples
  • Percent mortality post-op
  • Nosocomial infection rates
  • Error rate for claims entry
  • Others?
• Problems with percentages
– Denominator size may vary, making comparisons potentially invalid
– Case mix adjustment not often done to adjust for sampling bias

Now for something a little different… Analysis of Means!
• Not time series data
• Used for attribute (count) data with unequal subgroup sizes
– Rate of particular measure of count data
– Examples?
  • C-section rates
  • Antibiotic utilization rates
  • Infection rates post-op
  • Others?
• Does provide adjustment for issues like case mix, if done correctly

ANOM example: C-section rates
[Figure: ANOM Chart - Comparison of Proportion Data. Vertical axis: Proportion (0.000 to 0.600); horizontal axis: Subject Number (1 to 30); series: Proportions, Lower Common Cause Limits, Upper Common Cause Limits]
C-section rates among providers doing deliveries
UCCL = upper common cause (control) limit for each provider
LCCL = lower common cause (control) limit for each provider
Control limits adjusted for opportunities, i.e. cases, that provider treats

Analysis of variance (ANOVA)
• Test hypotheses about differences between two or more means
• Used in DOE to determine if changes in mean in one intervention subgroup statistically differ from other intervention subgroups
• See Example 5.6 (p 220)

Regression analysis
• Test hypotheses of relationships between a response variable (Y) and one or more predictor variables (X)
• Determination of statistical significance of relationships (r value)
• Sign of coefficient (b) for predictor variable determines if effect is positive or negative
• R² value provides predictive level of model (i.e. how much of the variation in Y is explained by the selected predictor variables)

Types of regression
• Simple linear regression – relates one x-variable to one dependent y-variable
[Figure: Linear Model. Scatter plot with fitted line y = 4.007x + 8.663, R² = 0.9793]

Types of regression
• Multiple regression
– One dependent variable with multiple predictor variables
– Graphic output is a multidimensional surface, so usually not provided
– Output includes:
  • Coefficients (b) and levels of significance (p-value) for each x-variable
  • r value
  • R² value

Design of Experiments
A scientific approach to improvement

DOE – when evidence is needed
• Method for validating processes and determining which factors are most important
• Just like in science class – multiple runs, varying “factors” (predictor variables) at different “levels”
• Statistically valid approach to identify “main effects” (primary effect of each factor) and “interaction effects” (effects caused by combinations of factors)
• Optimization of experiment is desirable to ensure identification of salient factors
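
To make "main effects" and "interaction effects" concrete, here is a minimal sketch for the simplest case: two factors, each run at a low (-1) and high (+1) level. The design, the factor names A and B, and the response values are all hypothetical; the slides do not specify a particular design.

```python
def two_factor_effects(response):
    """Estimate effects for a two-level, two-factor experiment.

    response maps (a_level, b_level), each -1 or +1, to the mean
    outcome observed at that combination of factor levels.
    """
    low_low = response[(-1, -1)]
    high_low = response[(1, -1)]
    low_high = response[(-1, 1)]
    high_high = response[(1, 1)]
    # Main effect: average outcome at the high level minus at the low level.
    main_a = ((high_low + high_high) - (low_low + low_high)) / 2
    main_b = ((low_high + high_high) - (low_low + high_low)) / 2
    # Interaction: how much the effect of A changes with the level of B.
    interaction = ((high_high + low_low) - (high_low + low_high)) / 2
    return {"A": main_a, "B": main_b, "AB": interaction}


# Hypothetical mean outcomes for each of the four runs.
effects = two_factor_effects({(-1, -1): 20, (1, -1): 30, (-1, 1): 25, (1, 1): 45})
```

In this made-up example factor A has the larger main effect, and the nonzero AB term shows that A's effect is bigger when B is at its high level. Whether such differences are statistically meaningful is the question ANOVA is used to answer.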
