Data Mining Using Recursive Partitioning
Peter Westfall
With some help from Dr. Barry Macy, Dr. Seul-Hee Yoo, and TTU Institutional Research
Business Intelligence = Transforming Business Data into Action
What Data?
Lots of data.
http://www.pcworld.com/news/article/0,aid,113170,00.asp
Text, numeric, sound, pictures, video.
Old and New Learning Paradigms
Old: Theory → Data → Analysis
New: Data → Analysis → Theory
Typical Data Mining Methods
• Clustering (e.g., customer segmentation)
• Affinity (e.g., what items do people buy together)
• Exception analysis (e.g., credit card fraud, terrorism)
• Predictive Modeling (e.g., deciding loans, predicting employee turnover, predicting likely customers)
Recent Horizons in Data Mining
• Visualizations
• Text mining
• Audio mining
• Video mining
Requirements of DM Tools
• Simple (even an MBA can use it)
• Actionable results
• Flexible, open-ended (“Analysis at the speed of thought”)
• Scale-Up: Can handle massive data sets
• Drill-Down: Ability to investigate sub-units
Recursive Partitioning
• A predictive modeling tool
• Also called “Decision Trees”, “CART”
• Works by recursively splitting data set
• Software:
– SAS Enterprise Miner
– SPSS Clementine
– SPLUS
– Lots of Freeware
– Demo: “Partitionator” of Eureka! Technologies. http://www.eurekatechnologies.com/MoreDetails.aspx
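To make "recursively splitting" concrete, here is a minimal Python sketch of the idea. It is not the algorithm used by any of the packages listed above. It assumes numeric predictors, a categorical outcome, Gini impurity as the split criterion, and a simple depth limit as the stopping rule. The function names and the toy data (two made-up predictors and a "low"/"high" outcome) are invented for illustration only.

```python
from collections import Counter

def gini(labels):
    """Gini impurity of a list of class labels (0.0 means a pure node)."""
    n = len(labels)
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())

def best_split(rows, labels):
    """Find the (feature index, threshold) that most reduces weighted impurity.

    Returns None when no split improves on the unsplit node.
    """
    best, best_score = None, gini(labels)
    n = len(rows)
    for j in range(len(rows[0])):
        # Skip the largest value: splitting there would leave the right side empty.
        for t in sorted({r[j] for r in rows})[:-1]:
            left = [y for r, y in zip(rows, labels) if r[j] <= t]
            right = [y for r, y in zip(rows, labels) if r[j] > t]
            score = (len(left) * gini(left) + len(right) * gini(right)) / n
            if score < best_score:
                best, best_score = (j, t), score
    return best

def grow(rows, labels, depth=0, max_depth=3):
    """Recursively partition the data; leaves hold the majority class."""
    split = best_split(rows, labels) if depth < max_depth else None
    if split is None:
        return Counter(labels).most_common(1)[0][0]
    j, t = split
    left = [(r, y) for r, y in zip(rows, labels) if r[j] <= t]
    right = [(r, y) for r, y in zip(rows, labels) if r[j] > t]
    return {
        "feature": j,
        "threshold": t,
        "left": grow([r for r, _ in left], [y for _, y in left], depth + 1, max_depth),
        "right": grow([r for r, _ in right], [y for _, y in right], depth + 1, max_depth),
    }

def predict(tree, row):
    """Walk from the root to a leaf and return the leaf's class."""
    while isinstance(tree, dict):
        branch = "left" if row[tree["feature"]] <= tree["threshold"] else "right"
        tree = tree[branch]
    return tree

# Made-up toy data: (predictor 0, predictor 1) -> outcome. Not from the surveys.
rows = [(1, 0), (2, 1), (2, 5), (6, 0), (7, 2), (5, 6)]
labels = ["low", "low", "high", "high", "high", "high"]

tree = grow(rows, labels)
print(tree)
print(predict(tree, (7, 0)))
```

Each internal node of the printed tree is one split of the data set, and each subtree is the same procedure applied to one side of that split. Production tools add refinements this sketch leaves out, such as pruning, handling of categorical predictors, and missing-value rules.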
Example 1: Survey of Innovative Organizations
• Action Orientation: Which management levers lead to better performance?
• V24 = earned profit in last 5 years:
– 1 = all five
– 2 = most of five
– 3 = some of five
– 4 = none of five
Interesting Variables
• V617B = Number of years that elimination of perks for certain groups of people has been in effect
• V894A = Percent of workforce involved in SPC/SQC/TQC training
– 1 = None
– 2 = 1-20%
– …
– 7 = 100%
Example 2: Texas Tech University Ratings By Thesis Students
• Who is satisfied? Who is not satisfied?
• Action Orientation:
– Improve pockets where students are dissatisfied.
– Emulate pockets where students are satisfied.
Example 3: Business Dress Styles Rated
Lower Rated Dress Types
Final Tree – Dress Ratings
Questions?
Comments?
Poison-tipped darts?