9/17/2015

Basic Statistics for the Healthcare Professional
FRANK COHEN, MBB, MPA
DIRECTOR OF ANALYTICS
DOCTORS MANAGEMENT, LLC

Purpose of Statistics
• Provide a numerical summary of the data being analyzed.
  Data (n): Factual information organized for analysis. Numerical or other information represented in a form suitable for processing by computer. Values from scientific experiments.
• Provide the basis for making inferences about the future.
• Provide the foundation for assessing process capability.
• Provide a common language to be used throughout an organization to describe processes.
Relax... it won't be that bad!

Objectives
• Identify the basic tenets of statistics and statistical theory
• Define mean, median and other central measurements
• Define standard deviation, interquartile range and other measurements of variability
• Explain the difference between data analysis and statistics
• Describe hypothesis testing and other tests of statistical significance
• Articulate how to build relationships through regression analysis

3 Degrees of Separation
• Measures of Location (central tendency): Mean, Median and Mode. 1st statistical moment.
• Measures of Variation (dispersion): Range, Standard deviation, Interquartile Range. 2nd statistical moment.
• Measures of Error (estimation): Standard error and confidence intervals
• Measures of Relationships: Covariance, Correlation, Regression

Descriptive Statistics
• Descriptive statistics are numbers that are used to summarize and describe data
  • Mean conversion factor
  • Cost per RVU
  • Average collection by provider
  • Work RVUs that define 1 FTE
  • New office visits to initial consults
• Measures of central tendency include the mean, median, and mode
• Measures of variability include the range, variance, and standard deviation.
• Descriptive statistics are just descriptive and do not involve generalizing beyond the data at hand

Inferential Statistics
• Inferential statistics depend on a sample of a population to draw (or infer) conclusions about the population as a whole
• Inferences are made based on central tendency or any of a number of other aspects of a distribution
• For example, it is not practical to review every chart for a practice, so extrapolation is used to assess overpayment
• No given sample will represent the population exactly, so distribution techniques and sample error calculations are very important

Important Definitions
• Universe: The complete set of objects included within the database in question
• Sample Frame: A homogeneous set of objects that the investigator is interested in studying
• Sample: A subset of the population that is actually being studied
• Variable: A characteristic of an individual or object that can have different values (as opposed to a constant)
• Independent variable: The variable that is systematically manipulated or measured by the investigator to determine its impact on the outcome
• Dependent variable: The outcome variable of interest
• Data: The measurements that are collected by the investigator
• Statistic: Summary measure of a sample
• Parameter: Summary measure of a population

Types of Data
• Attribute Data (Qualitative): Is always binary; there are only two possible values (0, 1)
  • Yes, No
  • Go, No go
  • Pass/Fail
• Variable Data (Quantitative): Variables are properties or characteristics of some event, object, or person that can take on different values or amounts (as opposed to constants such as π that do not vary)
  • Independent Variables: Variables that are manipulated by the experimenter
  • Dependent Variables: A variable that measures the experimental outcome

Discrete Variables
• Discrete variables are whole numbers (count numbers) that do not pass through the space between each number.
  • The number of patients seen in a single day
  • The number of surgical procedures reported by a provider
  • The number of 99213 codes reported for the practice
  • The number of different specialties
  • The number of charts with coding errors

Continuous Variables
• Continuous variables are real numbers that can take any of an infinite number of values between two discrete values
  • Frequency distribution of E/M codes within a category
  • Total number of calculated FTEs in a practice
  • Minutes per work RVU
  • Ratio of initial consults to new office visits
  • Cost per RVU
  • Charge per hour

Frequency Distribution
WELCOME TO THE FAMILY!

Frequency Distributions
• A frequency distribution shows the number of observations falling into each of several ranges of values
• Frequency distributions are portrayed as frequency tables, histograms, or polygons
• Frequency distributions can show either the actual number of observations falling in each range or the percentage of observations
• From the frequency distribution table, you can calculate the mean, median, mode, and range

Normal Distribution
• In many natural processes, random variation conforms to the normal distribution
• Characteristics: symmetric, unimodal, extends to +/- infinity
• Completely described by two parameters: mean and standard deviation
• 68.27% of the data will fall within +/- 1 standard deviation
• 95.45% of the data will fall within +/- 2 standard deviations
• 99.73% of the data will fall within +/- 3 standard deviations

Normal Distribution - Illustrated
[Figure: normal curve; horizontal axis ticks from -6 to +6]

Properties of the Normal Distribution
1. It is bell-shaped
2. The mean, median and mode are equal and located in the center of the distribution
3. It is unimodal (has only one mode)
4. The curve is symmetric about the mean
5. The curve is continuous (for each value of x there is a corresponding value of y)
6. The curve never touches the x-axis (it extends to infinity in both directions)
7. The total area under the curve is 1.00
8. The area under the curve within 1, 2 and 3 standard deviations of the mean is approximately 68%, 95% and 99.7% respectively

Same Mean, Different Standard Deviations
[Figure]

Different Mean, Same Standard Deviation
[Figure]

Different Mean, Different Standard Deviation
[Figure]

Looking at the Curve
• Skew: Skew measures the degree to which the distribution is biased (or skewed) right or left of normal expectation
• Kurtosis (it's not a disease): Beyond skewness, kurtosis tells us when our distribution may have high or low variance, even if normal.

Skewness – the 3rd Moment
• The third moment is used to define the skewness of a distribution
• Skewness is a measure of the symmetry of the shape of a distribution
• If a distribution is symmetric, the skewness will be zero
• If there is a long tail in the positive direction, skewness will be positive, while if there is a long tail in the negative direction, skewness will be negative

Kurtosis – The 4th Moment
• The fourth moment is used to define the kurtosis of a distribution
• Kurtosis is a measure of the flatness or peakedness of a distribution
• Flat-looking distributions are referred to as "platykurtic," while peaked distributions are referred to as "leptokurtic"
• A kurtosis value of 3 represents a normally distributed data set. Values below 3 tend toward platykurtic, while values above 3 tend toward leptokurtic.

Normal or Not?
                              Normal                     Not Normal
Report                        Summary Report for Data    Summary Report for Frequency
Anderson-Darling A-Squared    0.13                       61.61
Anderson-Darling P-Value      0.986                      <0.005
Mean                          0.00949                    9.1974
StDev                         1.00143                    5.1472
Variance                      1.00287                    26.4937
Skewness                      -0.0073412                 1.36825
Kurtosis                      -0.0347934                 2.62067
N                             10000                      2507
Minimum                       -3.50651                   1.0000
1st Quartile                  -0.66662                   5.0000
Median                        0.01314                    8.0000
3rd Quartile                  0.68064                    11.0000
Maximum                       3.95809                    41.0000
95% CI for Mean               -0.01014 to 0.02912        8.9959 to 9.3990
95% CI for Median             -0.01225 to 0.04014        8.0000 to 8.0000
95% CI for StDev              0.98774 to 1.01551         5.0086 to 5.2938

Individual standard deviations are used to calculate the intervals.

Benford's Distribution
• The distribution of first digits in any series of naturally occurring numbers, according to Benford's law.
• Each bar represents a digit, and the height of the bar is the percentage of numbers that start with that digit.
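The bar heights that Benford's law predicts can be computed directly. The following is a minimal sketch, assuming the usual statement of the law, P(d) = log10(1 + 1/d), which the slide itself does not spell out. The function names are illustrative, and the powers-of-2 series is just one convenient test input.

```python
import math
from collections import Counter

def benford_expected_percent():
    """Expected percentage of numbers whose first digit is d, for d = 1..9."""
    return {d: 100 * math.log10(1 + 1 / d) for d in range(1, 10)}

def first_digit(x):
    """Leading non-zero digit of a non-zero number, read from scientific notation."""
    return int(f"{abs(x):.15e}"[0])

def observed_percent(values):
    """Observed first-digit percentages for a list of non-zero numbers."""
    counts = Counter(first_digit(v) for v in values if v != 0)
    total = sum(counts.values())
    return {d: 100 * counts.get(d, 0) / total for d in range(1, 10)}

expected = benford_expected_percent()
observed = observed_percent([2 ** k for k in range(1, 101)])  # powers of 2
for d in range(1, 10):
    print(f"digit {d}: expected {expected[d]:5.2f}%  observed {observed[d]:5.2f}%")
```

Comparing `observed` against `expected` for a real data set (for example, charges per procedure code) is the kind of check the chart on this slide illustrates.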
[Chart: Benford's Distribution; x-axis: First Digit; y-axis: Percent Distribution]

First Digit    Percent
1              30.10
2              17.61
3              12.49
4              9.69
5              7.92
6              6.69
7              5.80
8              5.12
9              4.58

Even We Are Obliged to Follow Benford
[Chart: Benford's Distribution; legend: Benford, Powers of 2, Area, Population; x-axis: First Digit; y-axis: Percent Distribution]
• Weighted average charges for 11,066 individual procedure codes
• Taken from the 100% Medicare claims database

Measures of Position and Central Tendency
MEAN, MEDIAN, MODE, PERCENTILES

Measures of Central Tendency
• In the study of statistics there are many measurements of central tendency. The three most common are the mean, median, and mode.
• These metrics are used to identify the approximate location of the center of the data

Mean
• The mean, or arithmetic average, is found by adding a group of numbers and dividing the sum by the number of items added.
• The mean is the best known and most used measure of central tendency.
• The group of numbers is sometimes referred to as the data or data set.
• The mean measures the central location of the values within the database
• The mean is useful for predicting but only where there are no extreme values

The Mean (or Average)
[Figure]

Arithmetic Mean (average)
• Create a metric for each code using the same method, i.e., divide the charge by the RVU total
• Add each of the results together to get a grand total
• Divide the grand total by the number of samples

\[ \bar{x}_w = \frac{\sum_{i=1}^{n} w_i x_i}{\sum_{i=1}^{n} w_i} \]

With every weight w_i = 1, this reduces to the arithmetic mean: the sum of the values divided by n.

• Pros:
  • Easy to calculate
  • Eliminates frequency bias
• Cons:
  • Does not take into account the frequency of occurrence
  • Not accurate if data is not normally distributed

Code       CF
99201      80.83
99202      81.61
99203      81.67
99204      106.79
99205      78.81
99211      86.96
99212      87.78
99213      68.43
99214      70.59
99215      75.73
Total      819.20
Count      10.00
Average    81.92

Mean Sensitivity to Outliers
• CF With Outliers – 89.8
• CF Without Outliers – 77.6

Median
• The median is the middle number in a set of data that is arranged in either ascending or descending order.
• One-half of the numbers will be on either side of the median.
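The arithmetic mean in the table above, and the median as just defined, can be checked with Python's standard library. This is a small sketch using the ten CF values from the table. The extra value of 250.00 is a hypothetical outlier added only to show how differently the two measures react; it does not come from the slides.

```python
from statistics import mean, median

# Conversion factors (CF) for the ten codes in the Arithmetic Mean table.
cf_values = [80.83, 81.61, 81.67, 106.79, 78.81,
             86.96, 87.78, 68.43, 70.59, 75.73]

cf_mean = mean(cf_values)      # grand total divided by the count
cf_median = median(cf_values)  # average of the two middle values (even count)

# Hypothetical outlier, for illustration only.
with_outlier = cf_values + [250.00]

print(f"mean   without outlier: {cf_mean:.2f}, with outlier: {mean(with_outlier):.2f}")
print(f"median without outlier: {cf_median:.2f}, with outlier: {median(with_outlier):.2f}")
```

The mean moves by many points when the outlier is added, while the median barely shifts, which is the pattern the outlier-sensitivity slides describe.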
• The median is good for use with non-normally distributed data as it is far less affected by outliers
• To find the median:
  • Order the data in ascending order
  • Count the number of records and divide by two
  • Pick the middle number
  • If there is an even number of records, take the average of the middle two

Example of Median Calculation

Odd Number of Records
Code     Fee    Frequency    RVU      CF
99213    111    6,219        1.622    68.43
99214    174    1,563        2.465    70.59
99215    254    75.73        3.354    75.73
99205    368    129          4.854    75.81
99201    78     25           0.965    80.83
99202    138    87           1.691    81.61
99203    205    246          2.51     81.67
99211    46     206          0.529    86.96
99204    400    654          3.854    103.79
The median is 99201 (CF 80.83)

Even Number of Records
Code     Fee    Frequency    RVU      CF
99213    111    6,219        1.622    68.43
99214    174    1,563        2.465    70.59
99215    254    75.73        3.354    75.73
99205    368    129          4.854    75.81
99201    78     25           0.965    80.83
99202    138    87           1.691    81.61
99203    205    246          2.51     81.67
99212    82     2,080        0.993    82.58
99211    46     206          0.529    86.96
99204    400    654          3.854    103.79
The median is halfway between 99201 and 99202: (80.83 + 81.61) / 2 = 81.22

Median is Less Sensitive to Outliers
• CF With Outliers – 81.0
• CF Without Outliers – 80.0

Mean vs. Median
• This table shows that the median wage is substantially less than the average wage.
• The reason for the difference is that the distribution of workers by wage level is highly skewed.
  0.01% = $23,846,950
  0.10% = $2,802,020
  1.00% = $1,019,089
  10.0% = $161,139

Mode
• The mode is the one value within a data set that is reported the most
• There can be more than one mode
  • A single mode is called unimodal
  • Two modes is called bimodal
  • Many modes is called multimodal
• A multimodal distribution can indicate groups of variables with different characteristics within the same data set
  • Paid amounts for E&M visits vs. surgical procedures

UniModal
[Figure: Histogram of Procedure Code; x-axis: Procedure Code (0 to 96000); y-axis: Frequency (0 to 35000)]

Bimodal
Summary Report for Mean Universe
Anderson-Darling A-Squared    23.62
Anderson-Darling P-Value      <0.005
Mean                          44.853
StDev                         13.976
Variance                      195.323
Skewness                      -0.236830
Kurtosis                      -0.773360
N                             1014
Minimum                       0.000
1st Quartile                  34.619
Median                        46.872
3rd Quartile                  56.557
Maximum                       79.756
90% CI for Mean               44.131 to 45.576
90% CI for Median             40.797 to 48.913
90% CI for StDev              13.485 to 14.507

Multimodal
[Figure: Histogram of Payment; x-axis: Payment (0 to 245); y-axis: Frequency (0 to 4000)]

Percentiles
• The p-th percentile is the (n × p / 100)-th observation when the set of observations is arranged in order of magnitude, where n is the sample size.
• Percentiles report the value for a given variable where a certain percent of observations fall above and below the value
• For example: at the 20th percentile, some 20% of values are lower and some 80% of values are higher
• A percentile is 1/100 of the total of an ordered data set; it splits the data into hundredths
• All data are ordered around the median (50th percentile)

Other Standard Metrics
• Quartiles divide the data into four equal parts
  • 1st quartile = 25th percentile
  • 2nd quartile = 50th percentile (median)
  • 3rd quartile = 75th percentile
• Deciles divide the data into 10 equal parts

Quartiles and Deciles

Quartiles:
\[ Q_i = L_i + \frac{h}{f_i}\left(\frac{i\,n}{4} - F\right), \quad i = 1, 2, 3 \]
  L_i = Lower limit of the i-th quartile class
  n = Total number of observations in the distribution
  h = Class width of the i-th quartile class
  f_i = Frequency of the i-th quartile class
  F = Cumulative frequency of the class prior to the i-th quartile class

Deciles:
\[ P_i = L_i + \frac{h}{f_i}\left(\frac{i\,n}{10} - F\right), \quad i = 1, 2, \ldots, 9 \]
  L_i = Lower limit of the i-th decile class
  n = Total number of observations in the distribution
  h = Class width of the i-th decile class
  f_i = Frequency of the i-th decile class
  F = Cumulative frequency of the class prior to the i-th decile class

Interquartile Range
• The interquartile range (IQR) is a measure of variability, based on dividing a data set into quartiles.
• The IQR identifies the middle 50% of the data

Example: Age Distribution
• For the age distribution, n = 121
• The 75th percentile for the age distribution is the (121 × 75)/100 = 90.75 ≈ 91st observation when the ages are arranged in increasing order of magnitude
• The 75th percentile of the ages is therefore 31 years; the 25th, 50th and 80th percentiles are the 31st, 61st, and 97th observations respectively, as shown in the table below

Age      Frequency    Cumulative Frequency
21       6            6
22       16           22
23       11           33       25th percentile: the 31st observation falls in this group
24       9            42
25       17           59
26       13           72       50th percentile: the 61st observation falls in this group
27       6            78
28       5            83
29       4            87
30       3            90
31       1            91       75th percentile: the 91st observation falls in this group
32       4            95
33       3            98       80th percentile: the 97th observation falls in this group
34       2            100
35+      21           121
Total    121

Box Plot
[Figure: box plot on a quantitative scale, annotated from top to bottom]
  • 75th percentile + 1.5 IQR (referred to as whisker)
  • 75th percentile
  • Average/mean
  • 50th percentile/median
  • 25th percentile
  • 25th percentile - 1.5 IQR (referred to as whisker)
  • Individual box symbol
IQR: Interquartile range, which is calculated by subtracting the 25th percentile of the data from the 75th percentile; consequently, it contains the middle 50% of the observations.
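The percentile positions in the age distribution example can be reproduced from the frequency table. This is a minimal sketch: the position is n × p / 100 rounded up to the next whole observation, which is an assumption about the rounding convention that happens to match all four positions quoted on the slide (31st, 61st, 91st, 97th). The function name is illustrative.

```python
import math

# Age distribution from the slide: (age group, frequency). "35+" is open-ended.
AGE_FREQUENCIES = [
    ("21", 6), ("22", 16), ("23", 11), ("24", 9), ("25", 17),
    ("26", 13), ("27", 6), ("28", 5), ("29", 4), ("30", 3),
    ("31", 1), ("32", 4), ("33", 3), ("34", 2), ("35+", 21),
]

def percentile_group(freq_table, p):
    """Return (position, group) for the p-th percentile of a frequency table."""
    n = sum(freq for _, freq in freq_table)
    position = max(1, math.ceil(n * p / 100))  # assumed convention: round up
    cumulative = 0
    for group, freq in freq_table:
        cumulative += freq
        if cumulative >= position:
            return position, group
    raise ValueError("p must be between 0 and 100")

for p in (25, 50, 75, 80):
    position, group = percentile_group(AGE_FREQUENCIES, p)
    print(f"{p}th percentile: observation {position}, age {group}")

# Interquartile range: 75th percentile minus 25th percentile.
q1 = int(percentile_group(AGE_FREQUENCIES, 25)[1])
q3 = int(percentile_group(AGE_FREQUENCIES, 75)[1])
iqr = q3 - q1
print(f"IQR = {q3} - {q1} = {iqr} years")
```

Other percentile conventions (for example, interpolating between neighbouring observations) can give slightly different answers on small data sets, so the rounding rule is worth stating whenever percentiles are reported.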
Box Plots
[Figure]

Other Ways to Graph Mean, Median and Mode
Summary Report for Fee - 27137
Anderson-Darling A-Squared    72.45
Anderson-Darling P-Value      <0.005
Mean                          5881.1
StDev                         4069.2
Variance                      16558200.7
Skewness                      2.5239
Kurtosis                      10.9663
N                             1632
Minimum                       231.7
1st Quartile                  3594.8
Median                        5008.4
3rd Quartile                  6952.5
Maximum                       39796.0
95% CI for Mean               5683.5 to 6078.7
95% CI for Median             4933.5 to 5154.7
95% CI for StDev              3934.2 to 4213.8
[Figure: Dotplot of Fee for 27137; x-axis: Fee (0 to 36000); each symbol represents up to 8 observations]

Measures of Variability

Variance
• A measure of the average squared distance of possible values from the expected value (arithmetic average)

\[ S^2 = \frac{\sum_{i=1}^{n} (x_i - \bar{x})^2}{n - 1} \]

Code     Fee    Frequency    RVU      CF        Difference    Squared Difference
99201    78     25           0.965    80.83     0.029         0.001
99202    138    87           1.691    81.61     0.809         0.654
99203    205    246          2.51     81.67     0.873         0.763
99204    400    654          3.854    103.79    22.988        528.461
99205    368    129          4.854    75.81     (4.986)       24.863
99211    46     206          0.529    86.96     6.157         37.903
99212    82     2,080        0.993    82.58     1.778         3.161
99213    111    6,219        1.622    68.43     (12.366)      152.917
99214    174    1,563        2.465    70.59     (10.212)      104.280
99215    254    75.73        3.354    75.73     (5.070)       25.700
Mean / Sum                            80.80     0.000         878.702

• Note that the differences (not squared) always add up to zero
• The squared differences sum to 878.702; dividing by n - 1 = 9 gives a variance of 97.634

Standard Deviation

\[ S = \sqrt{\frac{\sum_{i=1}^{n} (x_i - \bar{x})^2}{n - 1}} \]

• The measure of spread of values around the mean
• Using the same table as for the variance, the square root of the variance (878.702 / 9 = 97.634) equals the standard deviation (9.881)
• For normally distributed data:
  • 68.2% of the population values will fall between 70.92 and 90.68
  • 95.4% of the population values will fall between 61.04 and 100.56

Coefficient of Variation
• The coefficient of variation (CV) is calculated by dividing the standard deviation by the mean
• The coefficient of variation is a measure of spread that describes the amount of variability relative to the mean
• Because the coefficient of variation is unitless, you can use it instead of the standard deviation to compare the spread of data sets that have different units or different means
• A large CV means that the data are more dispersed, while a lower CV means that the data are less dispersed

CV for Two Different Codes

99213
Specialty    Fee       Stdev    CV
GS           125.11    27.81    22.23%
CA           120.86    19.36    16.02%
FP           162.11    25.84    15.94%
GE           118.21    20.56    17.39%
IM           111.67    19.40    17.37%
OS           130.12    21.99    16.90%
PM           113.60    18.84    16.58%

99205
Specialty    Fee       Stdev    CV
GS           383.37    83.45    21.77%
CA           369.92    65.81    17.79%
FP           304.10    45.37    14.92%
GE           349.83    76.19    21.78%
IM           328.28    64.04    19.51%
OS           380.78    74.71    19.62%
PM           339.65    72.98    21.49%

Interquartile Range
• The interquartile range (IQR) is the distance between the 75th percentile and the 25th percentile
• The IQR is essentially the range of the middle 50% of the data
• Because it uses the middle 50%, the IQR is not affected by outliers or extreme values
• IQR is often represented using boxplots

Box Plots
[Figure]

Error and Confidence Intervals

\[ \bar{x} \pm t_{(1 - \alpha/2),\,(n-1)\,df} \cdot \frac{S}{\sqrt{n}} \]

\[ \hat{p} \pm Z_{(1 - \alpha/2)} \cdot SE_{\hat{p}}, \quad \text{where } SE_{\hat{p}} = \sqrt{\frac{\hat{p}(1 - \hat{p})}{n}} \]

What is Sample Error
• In statistics, sampling error is the error caused by observing a sample instead of the whole population.
  [Bunns & Grove, 2009]
• The sample is never identical to the population
• Basically, all samples have some error when used to predict a value within the population

Causes of Sampling Error
• Population specific error: Not understanding who or what to sample
• Sample frame error: Occurs when the wrong sub-population is identified and/or used
• Selection error: When data points are not selected correctly
• Non-response error: Occurs when data are missing, variable fields are zero or other similar issues
• Sampling error: Can occur when the wrong sample type is selected (e.g. SVRS, Cluster, Convenience)

Calculating the Margin of Error
• Margin of error rules go something like this:
  • The larger the sample, the smaller the error
  • The smaller the variance, the smaller the error
• SE can be calculated using two primary assessment types
  • Attribute
  • Variable

Example of SE for Variable Assessment
• A sample of average charges for 99213 was taken from 50 practices in a given area
• Mean = $82.40 and STDev = $15.55
• Assume normal distribution: 68.26% of values fall between $66.85 and $97.95
• SE = Stdev/sqrt(N), or 15.55/sqrt(50), or 15.55/7.07 = 2.2
• The standard error for our estimate of the mean of $82.40 is $2.20

Example for SE Calculation for Attribute Assessment
• In a chart review, a practice finds that $1,809 out of $5,742 was paid in error
• This equates to a paid error rate of 31.5%
• To calculate the sample error, we use this formula: \( SE = \sqrt{p(1-p)/n} \)
  • Where p = .315, (1-p) = 1 - .315 = .685, n = 5,742
  • p(1-p)/n = .216 / 5742 = 0.0000376 (0.00376%)
  • The square root of 0.0000376 is 0.00613 (0.613%)
  • Plus and minus p = 30.89% to 32.11%

Example for SE Calculation for Attribute Assessment
• In a chart review, a practice finds that 6 out of 30 charts contained a medical necessity error
• This equates to a coding error rate of 20%
• To calculate the sample error, we use this formula: \( SE = \sqrt{p(1-p)/n} \)
  • Where p = .2, (1-p) = 1 - .2 = .8, n = 30
  • p(1-p)/n = .16 / 30 = 0.0053
  • The square root of 0.0053 is 0.073 (7.3%)
  • Plus and minus p = 12.7% to 27.3%

Common "Z" levels of confidence
• Z-score is a measure of standard distance in a normal distribution
• Many believe that if the data set is large (n > 30), you can assume a normal or near normal distribution (NOT TRUE)
• Commonly used confidence levels are 90%, 95%, and 99%

Margin of Error (1/2 Interval)
• The margin of error is the z or t score times the standard error
• z and t values depend on how wide or narrow you want the margin of error to be
• The higher the value, the higher the margin of error
• Sample of 50, 95% margin of error
  • Mean = 82.40, stdev = 15.55, SE = 2.20
  • Margin of error = (z or t) times SE
  • Using z: 1.96 × 2.20 = 4.31
  • Using t: 2.009 × 2.20 = 4.42

What is a Confidence Interval (CI)?
• The purpose of a confidence interval is to validate a point estimate; it tells us how far off our estimate is likely to be
• A confidence interval specifies a range of values within which the unknown population parameter may lie
• Normal CI values are 90%, 95%, 99% and 99.9%
• The width of the interval gives us some idea as to how uncertain we are about an estimate
• A very wide interval may indicate that more data should be collected before anything very definite can be inferred from the data
• This means when a sample is drawn there are ?? chances in 100 that the sample will reflect the sampling frame at large within the sampling error

Interpreting the CI
• Using our average charge example: Mean = 82.40, SD = 15.55, SE = 2.20, ME = 4.42 (t-score)
• CI = 82.40 +/- 4.42, or 95% CI = $77.98 to $86.82
• False Statement: I am 95% confident that the true average charge for this code is somewhere between $77.98 and $86.82
• True Statement: If many samples were drawn and an interval built from each in the same way, about 95% of those intervals would contain the population mean; $77.98 to $86.82 is the interval from this sample

Attribute 95% Confidence Interval
• In attribute example 1, the SE was 0.00613 (.613%)
  • The 95% half interval is .00613 × 1.96 = 0.012 (1.2%)
  • Plus and minus the p of 31.5% = 30.3% to 32.7%
• In attribute example 2, the SE was 0.073 (7.3%)
  • The 95% half interval is .073 × 1.96 = 0.143 (14.3%)
  • Plus and minus the p of 20% = 5.7% to 34.3%
• The confidence interval range has a huge impact in inferential statistics

95% Confidence Intervals
[Figure]

To Change the Confidence Interval

\[ \bar{x} \pm Z \frac{s}{\sqrt{n}} \]

• To narrow the confidence interval
  • Decrease the variability
  • Lower your z/t score
  • Increase the sample size
• To get a better level of confidence
  • Decrease the variability
  • Accept a broader CI (i.e., 80%)
  • Increase the sample size

CI Applications – Physician Productivity
• In physician productivity studies, the 95% CI gives us a range of work RVU values
• In 95% of the samples we take, the true mean for the population would fall somewhere between the lower and upper bound

Work RVUs per FTE - National
Specialty    Lower       Mean        Upper
GP           4,289.35    4,543.80    4,798.25
EM           6,153.49    6,732.48    7,109.50
PD           4,220.96    4,618.12    4,876.73
CV           4,005.59    4,382.48    4,627.90
PY           3,232.74    3,536.91    3,734.98
GS           4,435.13    4,852.44    5,124.18
FP           4,104.91    4,491.15    4,742.65
OS           4,394.81    4,808.32    5,077.59

CI Applications – Auditing
• If a hundred similar audits were performed, in 95 of them the actual mean damage would be somewhere between $107.02 and $164.44
• Assume the universe is 10,000 claims
• The difference between the mean and the lower bound of the 95% CI is $28.71
• Across 10,000 claims, this translates to a damage estimate of $1,070,200 rather than $1,357,300 (a difference of $287,100)

Summary Report for Overpaid
Anderson-Darling A-Squared    0.69
Anderson-Darling P-Value      0.066
Mean                          135.73
StDev                         89.77
Variance                      8059.52
Skewness                      0.15625
Kurtosis                      -1.16048
N                             40
Minimum                       8.27
1st Quartile                  34.48
Median                        131.82
3rd Quartile                  222.77
Maximum                       300.72
95% CI for Mean               107.02 to 164.44
95% CI for Median             96.97 to 174.64
95% CI for StDev              73.54 to 115.27

For More Information
Frank D. Cohen
www.drsmgmt.com
[email protected]
800.635.4040

To Get the Toolbox
• Go to www.frankcohengroup.com
• Click on the Library tab (at top of page)
• Click on Workshop Toolboxes
• Select the Statistics toolbox
• Password is 68487122
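The standard error, margin of error, and confidence interval calculations from the slides above can be sketched in a few lines. This is a hedged illustration, not the author's tooling: the function names are made up for the example, the t value of 2.009 (49 degrees of freedom) is the one quoted on the margin-of-error slide, and the t value of 2.0227 (39 degrees of freedom) for the audit example is an assumption read from a standard t table rather than from the slides.

```python
import math
from statistics import NormalDist

def mean_interval(mean, stdev, n, critical_value):
    """SE, margin of error, and CI for a mean, given a z or t critical value."""
    se = stdev / math.sqrt(n)
    margin = critical_value * se
    return se, margin, (mean - margin, mean + margin)

def proportion_interval(p, n, confidence=0.95):
    """SE, margin of error, and CI for a proportion, using a z critical value."""
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)  # about 1.96 for 95%
    se = math.sqrt(p * (1 - p) / n)
    margin = z * se
    return se, margin, (p - margin, p + margin)

# Variable assessment: average charge for 99213 across 50 practices.
charge_se, charge_me, charge_ci = mean_interval(82.40, 15.55, 50, critical_value=2.009)

# Attribute assessment: 6 of 30 charts with a medical necessity error.
chart_se, chart_me, chart_ci = proportion_interval(6 / 30, 30)

# Audit example: overpayment per claim, n = 40 (t value is an assumption, see above).
audit_se, audit_me, audit_ci = mean_interval(135.73, 89.77, 40, critical_value=2.0227)

# Extrapolating the mean and the lower bound to an assumed universe of claims.
universe_size = 10_000
point_estimate = 135.73 * universe_size
lower_bound_estimate = audit_ci[0] * universe_size

print(f"charge: SE {charge_se:.2f}, CI {charge_ci[0]:.2f} to {charge_ci[1]:.2f}")
print(f"charts: SE {chart_se:.3f}, CI {chart_ci[0]:.1%} to {chart_ci[1]:.1%}")
print(f"audit:  SE {audit_se:.2f}, CI {audit_ci[0]:.2f} to {audit_ci[1]:.2f}")
print(f"extrapolated: {lower_bound_estimate:,.0f} (lower bound) vs {point_estimate:,.0f} (mean)")
```

Passing the critical value in explicitly keeps the choice between z and t visible, which matters for small samples such as the 40-claim audit.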