Data Mining Application in the U.S. Crop
Insurance Program
Alexis Garcia
ISQS 7342
U.S. Crop Insurance Primer
• Crop insurance is a vital component of many farm
operations throughout the nation
• Farming is an inherently risky enterprise
• Producers rely on insurance policies to protect
their investments in land, livestock, seed, and
crops
U.S. Crop Insurance Primer
• In 2004, the program provided producers with over
$47B in liability protection across about 1.2 million
policies at a cost of $3.6B
• The program incurred a loss of $160M as a result of
waste, fraud, and abuse
Application of Data Mining
• Through ARPA 2000 (the Agricultural Risk Protection
Act of 2000), data mining of crop insurance data was
funded in an effort to detect abuse and schemes for
filing fraudulent insurance claims
• The CAE-USDA RMA database incorporates weather
data, soils data, and other agronomically relevant
factors that aid in formulating farm policy strategies
Application of Data Mining
• The database contains more than 2 terabytes of
information and enables linkage of data across time
to allow multi-year comparisons
• The CAE-USDA RMA partnership has produced more
than 200 data mining research products; among these
is the SPOTCHECK Program.
Spotcheck Program
• Designed to identify suspicious patterns indicating
possible program abuse (fraudulent insurance
claims)
• The data mining algorithms are designed from
starting points such as anecdotes from the field or
the experience of investigators, producers, agents, or
adjusters about schemes to exploit the program
• These schemes are analyzed to determine whether
they occur in the national data, where and to what
extent, and whether or not each scheme is structured
and results in personal benefit
Results of the U.S. Crop Insurance
Data Mining Program
• Decrease in the number of fraudulent crop
insurance claims
• Decrease in the amount of loss due to fraud, waste,
and abuse
• RMA identified $300M in fraudulent claims
between 2001 and 2004
Popular Articles on U.S. Crop Insurance
with Data Mining Application
• Using Data Mining to Detect Anomalous Producer
Behavior: An Analysis of Soybean Production and
the Federal Crop Insurance Program (Olson, Little,
Lovell, 2003)
• Collusion in the U.S. Crop Insurance Program:
Applied Data Mining (Little, Johnston, Lovell,
Rejesus, Steed, 2003)
Using Data Mining to Detect
Anomalous Producer Behavior
• Objective: develop a data mining algorithm and apply
it to identify anomalous producers and counties
within Land Resource Regions (LRRs) based upon the
percentage of acres harvested
• LRRs are used to group insured producers spatially
into agronomically homogeneous groups to account
for natural resource availability.
Using Data Mining to Detect
Anomalous Producer Behavior
• Dependent variable – percentage of acres
harvested
• Other variables include state and county code, LRR,
reinsurance year, crop code, practice code, acres
planted, acres harvested, liability, indemnity,
producer risk premium
• The data cover 625,031 unique producers and 2.58
million observations from reinsurance years 1994 to 2001
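The slides name the dependent variable but do not give its formula. A minimal sketch, assuming it is acres harvested divided by acres planted, expressed as a percentage; the record layout and field names below are hypothetical and only mirror the variables listed above.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class PolicyRecord:
    # Hypothetical field names modeled on the variables listed in the slide.
    producer_id: str
    lrr: str
    reinsurance_year: int
    acres_planted: float
    acres_harvested: float


def pct_acres_harvested(rec: PolicyRecord) -> Optional[float]:
    """Percentage of planted acres that were harvested (assumed definition)."""
    if rec.acres_planted <= 0:
        return None  # undefined when nothing was planted
    return 100.0 * rec.acres_harvested / rec.acres_planted


# Made-up records, for illustration only.
records = [
    PolicyRecord("P001", "M", 1999, 200.0, 150.0),
    PolicyRecord("P002", "M", 1999, 0.0, 0.0),
]
pcts = [pct_acres_harvested(r) for r in records]
```

Returning `None` for records with no planted acres keeps them out of later averages instead of treating them as 0% harvested.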
Using Data Mining to Detect
Anomalous Producer Behavior
• An exceptionally low or high percentage of acres
harvested could be an indicator of anomalous
producer behavior
• The dependent variable was smoothed using a
five-year moving average
• The percentage was then normalized as a z-score
within each LRR
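The smoothing and normalization steps can be sketched as follows. The slides do not say whether the five-year window is trailing or centered, or which standard deviation was used; this sketch assumes a trailing window (shortened while fewer than five years of history exist) and the population standard deviation. All function names are illustrative.

```python
from collections import defaultdict
from statistics import mean, pstdev


def moving_average(values, window=5):
    """Trailing moving average over a producer's yearly percentages.

    Assumption: early years average over however many years are available.
    """
    smoothed = []
    for i in range(len(values)):
        start = max(0, i - window + 1)
        smoothed.append(mean(values[start:i + 1]))
    return smoothed


def zscores_within_lrr(rows):
    """rows: iterable of (producer_id, lrr, value); returns {producer_id: z}."""
    rows = list(rows)
    by_lrr = defaultdict(list)
    for _, lrr, value in rows:
        by_lrr[lrr].append(value)
    # Mean and population standard deviation per Land Resource Region.
    stats = {lrr: (mean(vals), pstdev(vals)) for lrr, vals in by_lrr.items()}
    z = {}
    for producer_id, lrr, value in rows:
        mu, sigma = stats[lrr]
        z[producer_id] = (value - mu) / sigma if sigma > 0 else 0.0
    return z


# Made-up values, for illustration only.
smoothed = moving_average([90.0, 80.0, 85.0, 95.0, 100.0, 60.0])
z = zscores_within_lrr([("A", "M", 90.0), ("B", "M", 80.0), ("C", "M", 70.0)])
```

Normalizing within each LRR means a producer is compared only with producers facing similar agronomic conditions, which is the stated reason for grouping by LRR.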
Using Data Mining to Detect
Anomalous Producer Behavior
• After normalization, an outlier detection method
was used to identify producers with anomalous
behavior
• Producers were identified as anomalous if they were
at or below the 5th percentile with p ≤ 0.01, or at
the 1st percentile with p ≤ 0.01
• Normal producers were then profiled vis-à-vis
anomalous producers
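The flagging rule can be sketched as below. The slides do not state how the percentile or the p-value was computed, so this sketch assumes an empirical percentile rank within one LRR and a one-sided lower-tail probability under a standard normal distribution. The cutoffs are parameters; the defaults follow the 5th percentile and p ≤ 0.01 values given above.

```python
from bisect import bisect_right
from statistics import NormalDist


def flag_anomalous(z_by_producer, pct_cutoff=5.0, p_cutoff=0.01):
    """Return sorted producer ids whose z-score is both low-ranked and improbable.

    Assumptions: percentile rank is the share of producers with a z-score
    at or below this one; p is the lower-tail standard normal probability.
    """
    ordered = sorted(z_by_producer.values())
    n = len(ordered)
    std_normal = NormalDist()
    flagged = []
    for producer_id, z in z_by_producer.items():
        rank = 100.0 * bisect_right(ordered, z) / n  # empirical percentile
        p = std_normal.cdf(z)                        # lower-tail probability
        if rank <= pct_cutoff and p <= p_cutoff:
            flagged.append(producer_id)
    return sorted(flagged)


# Made-up z-scores for 100 producers in one LRR, for illustration only.
z_scores = {"P%03d" % i: (i - 50) / 25.0 for i in range(100)}
z_scores["P000"] = -3.5  # one producer far below the rest
flagged = flag_anomalous(z_scores)
```

This covers only the low tail. Flagging exceptionally high percentages, which the slides also mention as a possible indicator, would apply the same test to the upper tail.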