Data Mining Journal Entries for Fraud Detection: A Pilot Study
by Roger S. Debreceny & Glen L. Gray
Discussed by Severin Grabski

Objective
• Explore research issues related to the application of statistical data mining to fraud detection in journal entries
  – Is this important?
  – YES! Most significant frauds are not conducted by the users of the ERP systems; they are done “outside” of these well-controlled systems.
• Was this accomplished?
  – Maybe

Accomplished?
• Used Benford’s Law in examining Journal Entries
• Statistically significant differences in First Digit distributions were found (Chi Square test); should these be investigated?
  – A 0% difference (Omicron) gives a statistically significant p < 0.015. What does this tell me?
  – Is a 1% difference between observed and predicted indicative of a problem?
  – Could use Mean Absolute Deviation

Entity    Total Dev    MAD
Beta      0.19         0.0211
Chi       0.03         0.0033
ChiEta    0.06         0.0067
ChiNu     0.11         0.0122
ChiPi     0.30         0.0333
Delta     0.06         0.0067
Eta       0.20         0.0222
EtaNu     0.10         0.0111
EtaPi     0.08         0.0089
Nu        0.34         0.0378

[Figure: Benford’s Law & First 5 Firms. First-digit frequencies (0% to 35%) for digits 1 to 9; series: Benford, Beta, Chi, ChiEta, ChiNu, ChiPi]

Accomplished?
• Identification of “violations” of the Benford’s First Digit Law only provides a preliminary indication
  – Nigrini and Mittermaier (1997) recommend using the first digit as an initial test of reasonableness

Other “Benford’s Law” Digit Tests
• Second Digit Test
  – This also only gives a preliminary indication
• First Two Digits Test
  – Provides more direction
• Number Duplication
  – Identify and rank order duplicate numbers

Other Benford’s Law Research
• Carslaw (1988) found support for rounding up of income figures using the expected second digit frequencies (more 0s, fewer 9s than expected).
• Thomas (1989), again using second digits, found support for rounding up of income and down for losses.
• Nigrini
  – (1994) used first two digit frequencies to analyze payroll fraud, and
  – (1996) used first two digit frequencies to examine tax compliance

Fourth Digit Test
• Chi Square to test for distributional difference of fourth digit
  – “…distribution of the fourth digit for each organization for all dollar amounts over $999.”
  – Was this the fourth digit to the left or right?
  – What if the transaction was for $100,000?
• While statistically significant differences were found, should these be investigated?

Three Digit Test
• Examined Last (Three) Digits in dollar amounts
  – Used the “top 5” of the last three digit pattern
  – Found that 4 of 29 entities had 30-60% of their transactions consisting of the top 5 last three digit patterns
• Would be interesting to note if these were the entities that “failed” Benford’s Law

Data Mining J/E Questions
• Would have liked a more reasoned/theoretical approach in specifying where and why data mining techniques should be applied
• Sources of J/E?
  – Influence Data Mining
• Unusual patterns between classes of J/Es?
• Class of J/E influence nature of J/E (i.e., does any type of J/E have a higher probability of fraud)?
• Evidence from Benford’s Law or Right Most Digits?
• Underlying issues that will guide effective and efficient data mining of JEs

Descriptive Statistics
• Any way to group the firms by industry?
• What can be found based upon grouping and analyzing by size?

Other Questions
• What other approaches (than Benford’s Law) can be applied to mining journal entries?
• What is currently done by audit teams for computerized analysis of journal entries?
• The analysis expects to see a “large enough” number of Journal Entries in order to highlight that fraud might be occurring. What if only a few JEs are made? What is the sensitivity of this approach?

Confusion
• Number of organizations?
  – 36 organizations
  – 8 data sets had less than 1 year
  – 1 data set was incomplete
  – 27 remain, so why 29 observations?
• Did you count each year for the 2 organizations that provided 2 years of data as separate observations?
  – What is the justification?
  – Why not do a year-to-year comparison for those organizations?

What’s Missing?
• Interpretation and more detailed analysis of the data
  – Know that there are “violations” but never know if there is really fraudulent activity
• What are the other data mining techniques that are planned?
• Analytical reasoning as to what tests should be done or what is revealed by certain tests

Data Mining Extensions
• Compare the entities with “larger” average line items per journal entry (e.g. >10) in one pool?
• Alternatively look at those in which the maximum number of line items is large (e.g. >100)

Summary
• Objective: explore research issues related to the application of statistical data mining to fraud detection in journal entries
• Good first step, and this is a pilot study
• Would like more theoretical motivation for tests & research issues
• Would have liked more data analysis
• Could I apply this in an audit?
• I’m not sure; more research is needed

Thank You
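
The Mean Absolute Deviation suggestion from the first “Accomplished?” slide can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the method used by Debreceny & Gray: the function names are hypothetical, amounts are assumed to be dollars with two decimals, and 15.507 is the standard chi-square critical value for 8 degrees of freedom at the 5% level. MAD is taken as the total absolute deviation over the nine first digits divided by 9, which agrees with the table (for example Beta: 0.19 / 9 ≈ 0.0211).

```python
import math
from collections import Counter

# Expected first-digit proportions under Benford's Law: P(d) = log10(1 + 1/d).
BENFORD = {d: math.log10(1 + 1 / d) for d in range(1, 10)}

# Standard chi-square critical value, 8 degrees of freedom, 5% level.
CHI2_CRIT_8DF_05 = 15.507


def first_digit_proportions(amounts):
    """Return (observed first-digit proportions, number of usable amounts)."""
    counts = Counter()
    for amount in amounts:
        cents = round(abs(amount) * 100)  # assumption: dollars with two decimals
        if cents > 0:                     # zero amounts have no leading digit
            counts[int(str(cents)[0])] += 1
    n = sum(counts.values())
    if n == 0:
        raise ValueError("no non-zero amounts")
    return {d: counts[d] / n for d in range(1, 10)}, n


def mean_absolute_deviation(observed):
    """Average absolute gap between observed and Benford proportions."""
    return sum(abs(observed[d] - BENFORD[d]) for d in range(1, 10)) / 9


def chi_square(observed, n):
    """Chi-square statistic for observed proportions based on n entries."""
    return sum(n * (observed[d] - BENFORD[d]) ** 2 / BENFORD[d]
               for d in range(1, 10))


# Hypothetical entity: digit 1 is 1 point too frequent, digit 9 is 1 point too rare.
shifted = dict(BENFORD)
shifted[1] += 0.01
shifted[9] -= 0.01

mad_shifted = mean_absolute_deviation(shifted)
chi2_small = chi_square(shifted, 1_000)    # same proportions, 1,000 entries
chi2_large = chi_square(shifted, 100_000)  # same proportions, 100,000 entries

print(f"MAD = {mad_shifted:.4f}")
print(f"chi-square at n=1,000:   {chi2_small:.2f}")
print(f"chi-square at n=100,000: {chi2_large:.2f}")
```

The chi-square statistic grows with the number of journal entries while MAD does not, so the same 1% shift falls below the critical value at 1,000 entries and far above it at 100,000. That is one way to read the question of whether a statistically significant difference is worth investigating.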
