Data Mining Using
Recursive Partitioning
Peter Westfall
With some help from
Dr. Barry Macy, Dr. Seul-Hee Yoo, and
TTU Institutional Research
Business Intelligence
= Transforming Business Data into Action
What Data?
Lots of data.
http://www.pcworld.com/news/article/0,aid,113170,00.asp
Text, numeric, sound, pictures, video.
Old and New Learning Paradigms
Old: THEORY → DATA → ANALYSIS
New: DATA → ANALYSIS → THEORY
Typical Data Mining Methods
• Clustering (e.g., customer segmentation)
• Affinity (e.g., which items people buy together)
• Exception analysis (e.g., credit card fraud, terrorism)
• Predictive modeling (e.g., deciding loans, predicting employee turnover, predicting likely customers)
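To make the affinity bullet concrete, here is a minimal sketch that counts how often pairs of items appear in the same basket. The baskets and the function name `pair_counts` are made up for illustration and are not from the talk; real affinity tools add measures such as support and confidence on top of counts like these.

```python
from collections import Counter
from itertools import combinations

def pair_counts(baskets):
    """Count how many baskets contain each unordered pair of items."""
    counts = Counter()
    for basket in baskets:
        # sorted(set(...)) removes duplicates and gives each pair one canonical order
        for pair in combinations(sorted(set(basket)), 2):
            counts[pair] += 1
    return counts

# Hypothetical toy baskets, invented for this illustration
baskets = [
    ["bread", "milk"],
    ["bread", "butter", "milk"],
    ["butter", "jam"],
    ["bread", "milk", "jam"],
]

counts = pair_counts(baskets)
top_pair, top_count = counts.most_common(1)[0]
print(top_pair, top_count)
```

In this toy data, bread and milk are the pair bought together most often, which is the kind of pattern an affinity analysis is meant to surface.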
Recent Horizons in Data Mining
• Visualizations
• Text mining
• Audio mining
• Video mining
Requirements of DM Tools
• Simple (even an MBA can use it)
• Actionable results
• Flexible, open-ended (“Analysis at the speed of thought”)
• Scale-Up: Can handle massive data sets
• Drill-Down: Ability to investigate sub-units
Recursive Partitioning
• A predictive modeling tool
• Also called “Decision Trees”, “CART”
• Works by recursively splitting the data set
• Software:
– SAS Enterprise Miner
– SPSS Clementine
– S-PLUS
– Lots of freeware
– Demo: “Partitionator” from Eureka! Technologies: http://www.eurekatechnologies.com/MoreDetails.aspx
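The idea of recursively splitting a data set can be sketched in a few lines. The code below is only an illustration, not the algorithm used by any of the packages listed above: it assumes numeric predictors, a categorical outcome, and Gini impurity as the splitting criterion, and the function names and toy data are hypothetical.

```python
from collections import Counter

def gini(labels):
    """Gini impurity of a list of class labels (0.0 means all labels agree)."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())

def best_split(rows, labels):
    """Return (score, feature, threshold) minimizing weighted impurity, or None."""
    best = None
    n = len(rows)
    for f in range(len(rows[0])):
        for t in sorted({r[f] for r in rows}):
            left = [y for r, y in zip(rows, labels) if r[f] <= t]
            right = [y for r, y in zip(rows, labels) if r[f] > t]
            if not left or not right:
                continue  # a split must put data on both sides
            score = (len(left) * gini(left) + len(right) * gini(right)) / n
            if best is None or score < best[0]:
                best = (score, f, t)
    return best

def grow(rows, labels, depth=0, max_depth=3):
    """Recursively partition the data; a leaf is a label, a node is a dict."""
    majority = Counter(labels).most_common(1)[0][0]
    if depth >= max_depth or gini(labels) == 0.0:
        return majority
    split = best_split(rows, labels)
    if split is None:
        return majority
    _, f, t = split
    left = [i for i, r in enumerate(rows) if r[f] <= t]
    right = [i for i, r in enumerate(rows) if r[f] > t]
    return {
        "feature": f,
        "threshold": t,
        "left": grow([rows[i] for i in left], [labels[i] for i in left],
                     depth + 1, max_depth),
        "right": grow([rows[i] for i in right], [labels[i] for i in right],
                      depth + 1, max_depth),
    }

def predict(tree, row):
    """Follow the splits from the root down to a leaf label."""
    while isinstance(tree, dict):
        branch = "left" if row[tree["feature"]] <= tree["threshold"] else "right"
        tree = tree[branch]
    return tree

# Hypothetical toy data: two numeric predictors, one two-level outcome
rows = [(1, 0), (2, 0), (3, 1), (6, 0), (7, 1), (8, 1)]
labels = ["low", "low", "low", "high", "high", "high"]

tree = grow(rows, labels)
print(tree)
print(predict(tree, (2, 1)), predict(tree, (9, 0)))
```

Each call to `grow` picks one split, divides the rows into two groups, and then repeats the same procedure on each group, which is what makes the partitioning recursive. The depth limit is one simple stopping rule chosen for this sketch; production tools typically also prune the tree.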
Example 1: Survey of Innovative Organizations
• Action Orientation: Which management levers lead to better performance?
• V24 = earned profit in the last 5 years:
– 1 = all 5
– 2 = most of 5
– 3 = some of 5
– 4 = none of 5
Interesting Variables
• V617B = Number of years that elimination of perks for certain groups of people has been in effect
• V894A = Percent of workforce involved in SPC/SQC/TQC training
– 1 = None
– 2 = 1-20%
– …
– 7 = 100%
Example 2: Texas Tech University Ratings by Thesis Students
• Who is satisfied? Who is not satisfied?
• Action Orientation:
– Improve pockets where students are dissatisfied.
– Emulate pockets where students are satisfied.
Example 3: Business Dress Styles Rated
Lower Rated Dress Types
Final Tree – Dress Ratings
Questions?
Comments?
Poison-tipped darts?