The GDB Cup: Applying “Real World” Financial Data Mining in an Academic Setting
Gary D. Boetticher
University of Houston - Clear Lake
Houston, Texas, USA

What is the GDB Cup?
• Modeled after the KDD Cup
• Start with $100,000 + Financial Data + Data Mining Techniques = Make As Much Money as Possible

Motivation
• Availability of Data
• Gain Experience with DM Process
• Synthesize ML + Domain Knowledge
• Pragmatic implications

Availability of Data
• Different Time Series Perspectives
  – 1 minute to monthly
• Different Financial Instruments
  – Stocks, Futures, Options, Mutual Funds
• Large Sample Size
  – 400 - 700 Stocks (Daily, 2.5 Years)
  – EMini Future (5 Minute, 2 Years)
• Inexpensive or Free Sources
  – www.anfutures.com
  – www.ashkon.com
  – Screen Scraping (finance.yahoo.com)

DM Process: Data Cleansing
• Low = 0
• Volume = 0
• Missing Data (e.g. no Open)
• Missing Time Periods

Build Models (Synthesize ML & Domain Knowledge)
• Machine Learners
  – Supervised NN, GP, SVM, Neuro Fuzzy, SOM, ILP, etc.
• Tech. Analysis
  – Moving Averages, RSI, MACD, Stochastics, PNF, etc.
  – www.equis.com/Education/TAAZ

Validating Models
• Statistical Validation
• Financial Validation
• Ignore Market Conditions (Buy & Hold)
  – Start Date Value
  – End Date Value
• Unrealistic Conditions (e.g. Drawdown)
• Standardize portfolio management
• Validate with EXCEL models

Results - 1

Fall 2002
12/31/99 - 5/31/02, 452 stocks
Amount
  $1,036,137
  $851,283
  $454,649
  $342,496
  $187,336
  $172,635
  $165,000
  $100,000
Annual ROI = 270%

Spring 2003
12/31/99 - 5/31/02, 712 stocks
Amount
  $1,405,760.17
  $1,074,124.63
  $605,763.09
  $264,609.14
  $207,142.87
  $137,397.37
  $135,706.63
  $146,908.68
  $5,789.71
Annual ROI = 310%

Fall 2003
6/14/02 - 6/12/03, S&P EMini (5 Min.)
Amount
  $852,453.20
  $783,681.20
  $624,417.80
  $499,154.00
  $239,402.40
  $213,199.70
  $125,655.00
Annual ROI = 852%

Results - 2

Spring 2004 (Train)
10/12/01 - 12/26/03, S&P EMini (5 Min.)
Amount
  $51,454,740
  $35,484,449
  $3,643,309
  $1,088,176
  $1,085,189
  $976,923
  $958,883
  $192,226
Annual ROI = 23,300%

Spring 2004 (Test)
12/29/03 - 04/16/04, S&P EMini (5 Min.)
Amount
  $270,588
  $814,621
  $85,268
  $75,074
  $185,435
  $120,244
  $99,154
  $100,000
Annual ROI = 2,172%

Demo

Conclusions
• Effective way to understand DM Process
  – Data Cleansing
  – Data Validation
• Very Good Results
  – ROI > 250% in all four cases
• Pragmatic implications
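
Illustration: data cleansing checks

The four checks on the "DM Process: Data Cleansing" slide (Low = 0, Volume = 0, missing fields such as Open, and missing time periods) could be sketched roughly as follows. This is a minimal illustration and not the procedure used in the course: the `Bar` record, the `find_issues` function, and the sample values are all hypothetical, and the gap test assumes bars arrive at a fixed interval, so it would also flag overnight and weekend breaks unless those are filtered with a trading calendar.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import List, Optional, Tuple


@dataclass
class Bar:
    """One price record (hypothetical layout); None marks a missing field."""
    timestamp: datetime
    open: Optional[float]
    high: Optional[float]
    low: Optional[float]
    close: Optional[float]
    volume: Optional[float]


def find_issues(bars: List[Bar], step: timedelta) -> List[Tuple[datetime, str]]:
    """Return (timestamp, reason) pairs for records that fail a cleansing check."""
    issues = []
    previous = None
    for bar in sorted(bars, key=lambda b: b.timestamp):
        fields = {"open": bar.open, "high": bar.high, "low": bar.low,
                  "close": bar.close, "volume": bar.volume}
        # Missing Data (e.g. no Open)
        for name, value in fields.items():
            if value is None:
                issues.append((bar.timestamp, f"missing {name}"))
        # Low = 0 and Volume = 0
        if bar.low == 0:
            issues.append((bar.timestamp, "low = 0"))
        if bar.volume == 0:
            issues.append((bar.timestamp, "volume = 0"))
        # Missing Time Periods: assumes a fixed bar interval of `step`
        if previous is not None and bar.timestamp - previous > step:
            issues.append((bar.timestamp, "gap after " + previous.isoformat()))
        previous = bar.timestamp
    return issues


# Made-up 5-minute bars, for illustration only.
start = datetime(2003, 1, 2, 9, 30)
five_min = timedelta(minutes=5)
sample = [
    Bar(start, 10.0, 10.5, 9.8, 10.2, 1200),
    Bar(start + five_min, None, 10.6, 0.0, 10.4, 900),
    Bar(start + 3 * five_min, 10.4, 10.7, 10.3, 10.5, 0),
]
for when, reason in find_issues(sample, five_min):
    print(when, reason)
```

Whether a flagged record is dropped, repaired, or interpolated is a separate decision that the slides do not specify.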
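
Illustration: buy & hold baseline and drawdown

The "Validating Models" slide mentions a buy & hold comparison (start date value versus end date value) and drawdown as a check for unrealistic conditions. The slides do not say how either was computed, so the sketch below uses common textbook definitions as an assumption: buy & hold return as the end value divided by the start value minus one, and maximum drawdown as the largest peak-to-trough decline of the account value, expressed as a fraction of the peak. The function names and the sample equity curve are hypothetical and are not taken from the reported results.

```python
from typing import Sequence


def buy_and_hold_return(start_value: float, end_value: float) -> float:
    """Fractional return from buying at the start date and holding to the end date."""
    return end_value / start_value - 1.0


def max_drawdown(equity: Sequence[float]) -> float:
    """Largest peak-to-trough decline of an equity curve, as a fraction of the peak."""
    peak = equity[0]
    worst = 0.0
    for value in equity:
        peak = max(peak, value)                    # highest value seen so far
        worst = max(worst, (peak - value) / peak)  # decline from that peak
    return worst


# Made-up account values, for illustration only.
equity_curve = [100_000, 120_000, 90_000, 150_000, 135_000]
print(f"max drawdown: {max_drawdown(equity_curve):.0%}")
print(f"buy & hold:   {buy_and_hold_return(100.0, 110.0):.0%}")
```

A model whose final amount looks strong but whose equity curve shows a very deep drawdown along the way may not be tradable in practice, which is presumably why the slide lists drawdown under unrealistic conditions.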