The Case for a Data Mining Approach to Technical Analysis

If I'm so smart, how come I'm not rich yet?

The Case for Data Mining
[Figure: "You Before Finance 9790" and "You After Finance 9790".]

1. TA is a multivariate recurrent prediction problem.
2. The five tasks of a recurrent prediction problem:
   1) Define the target (Y)
   2) Propose a list of candidate predictors (X's)
   3) Build a database of solved examples
   4) Select the X's
   5) Determine the prediction function
3. Humans and computers have complementary information processing abilities.
   Humans are uniquely able to handle Tasks 1, 2 & 3 but are poor at Tasks 4 & 5.
   Data mining algorithms are optimal for Tasks 4 & 5.
4. TA practitioners should partner up with data mining algorithms.
5. TA practitioners should abandon outdated methods and focus on their proper role in a human / machine partnership.
[Figure: Databases, Data Mining Practitioner, Data Mining Software.]

There Are Two Kinds of Prediction Problems
1. Regression: predicting the future value of a continuous variable
2. Classification: predicting the class of an object (situation)

In both regression and classification:
• The target variable concerns something that is not yet known.
• We use information that is known to make the prediction.
Regression: we wish to predict the future value of a continuous variable
• This variable is referred to as the dependent variable, the target variable, or Y.
• The target variable in a regression problem is continuous: it can assume any value within a range.
Example: the % change in the S&P 500 from now (t0) to a point in time 90 days into the future (t+90)

Classification: we wish to predict the class of an object whose class is not yet known
• The target variable in a classification problem is a discrete variable. It assumes a limited number of discrete values or names: (0, 1), (+1, 0, -1), (benign / malignant).
Example 1: the future class of a company with respect to solvency (bankrupt / non-bankrupt)
Example 2: the future trend of the market over the next 90 days (up / down)

What Is a Recurrent Multivariate Prediction Problem?
1. The same type of prediction is required over and over again.
2. The same set of information is available each time a prediction is required.
   • The information is a set of values for each of a multitude of variables.
   • These variables are referred to as independent variables, predictors, candidate predictors, indicators, etc.

Examples of Recurrent Decision Problems: Classification
Does the object belong to Class 1 or Class 2?
1. The same type of prediction is required over and over again.
– Medicine: Is a given tumor malignant or benign?
– Oil Exploration: At a given location, is there oil or no oil? (Drill / Don't Drill)
– Marketing: Is a given consumer a likely buyer or non-buyer of our product or service?
– Credit Approval: Is a given loan applicant likely to repay or default? (Lend / Don't Lend)
– Technical Analysis: Is the market more likely to advance or decline? (Buy / Sell)

Examples of Recurrent Decision Problems: Regression
The future value of a continuous Y variable
1. The same type of prediction is required over and over again.
– Medicine: survival time for someone with disease X
– Oil Exploration: amount of oil a new well is likely to produce
– Marketing: the likely sales of a product
– Technical Analysis:
  • How much will the S&P 500 appreciate over the next month?
  • By how much will stock A beat the market over the next month?

Recurrent Decision Problem
2. The same set of information is available each time a decision is required.
• The information is a set of values for a multitude of variables.

Multivariate Information Set: measured values for a multitude of variables
– Medicine: set of results on medical tests (blood pressure, cholesterol level, blood sugar, etc.)
– Oil Exploration: set of values for various geological parameters
– Marketing: set of demographic factors describing the person (zip code, owns car yes/no, etc.)
– Credit Approval: set of credit factors describing the loan applicant (number of years at current address, number of credit cards, payment history)

Technical Analysis Information Set: a multitude of indicator readings at a given point in time
1. Close / moving average = 1.075
2. 10-day MA / 50-day MA = 1.067
3. RSI indicator = 74
4. 5-day MA volume / 25-day MA volume
5. VIX (implied volatility on stock options)
6. Ratio of insider sales / purchases
7. Ratio of upside / downside volume

[Figure: two points in time, each characterized by three indicator values: (75.5, -2.1, -0.55) and (62.1, +0.1, -0.02).]
In other words, there are 3 candidate predictor variables.

We can treat this as a classification problem:
Class 1: market return over the next 20 days is > 0
Class 2: market return over the next 20 days is < 0
The target variable, the thing we wish to predict, is a discrete variable that can assume 2 values, > 0 or < 0 (we can call these Class 1 and Class 2).

Do these predictors (indicators) enable us to classify (discriminate) future up-moves from future down-moves?
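The idea of characterizing a point in time by indicator readings and then labeling it Class 1 or Class 2 can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the deck's own procedure: the function names, the 10- and 50-day windows, and the choice to leave an exactly zero return unlabeled are all hypothetical.

```python
from statistics import mean


def moving_average(values, window, t0):
    """Mean of the `window` values ending at index t0 (inclusive)."""
    if t0 + 1 < window:
        raise ValueError("not enough history for this window")
    return mean(values[t0 - window + 1 : t0 + 1])


def indicator_snapshot(closes, t0, short=10, long=50):
    """Two ratio indicators at t0, computed only from data up to t0.

    The window lengths are assumptions made for this example.
    """
    long_ma = moving_average(closes, long, t0)
    return {
        "close_over_ma": closes[t0] / long_ma,
        "short_ma_over_long_ma": moving_average(closes, short, t0) / long_ma,
    }


def target_class(closes, t0, horizon=20):
    """Class 1 if the return from t0 to t0 + horizon is > 0, Class 2 if < 0.

    Returns None when the future close is not yet known, or when the
    return is exactly 0 (a case the slide does not define).
    """
    if t0 + horizon >= len(closes):
        return None
    forward_return = closes[t0 + horizon] / closes[t0] - 1.0
    if forward_return > 0:
        return 1
    if forward_return < 0:
        return 2
    return None
```

The indicators look only backward from t0, while the class label looks only forward, so a date near the end of the series has indicator values but no label yet.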
[Figure: two points in time t0, with indicator values (75.5, -2.1, -0.55) and (62.1, +0.1, -0.02), each paired with t+20; Class 1 versus Class 2.]

Getting Matters of Time Straight: t0 and t+20
• t0 refers to the date on which the prediction or classification is made.
  – This is the date of the most recent values of the predictor variables.
• t+20 or t+n refers to a time in the future that the target variable (Y) refers to.
  – In the bankruptcy prediction problem it is any time over the following two years.
  – So the forward-looking horizon of the target need not be a fixed date.

Value of Y is based on future information
Values of X's are based on past and current information
[Figure: timeline with t0 in the middle. Past: values of the predictors (X) are based on what happens from t-n up to t0. Future: value of the target (Y) is based on what happens from t0 until t+n.]

Task 1: Define the Target Variable (Y)
The single variable we wish to predict
1. Define the type of the problem: classification or regression
A. Classification (discrimination): Y is defined as a class, with 2 or more distinct classes
• Benign / Malignant
• Lend / Don't Lend
• Buy / Sell
• Strong Buy / Weak Buy / Weak Sell / Strong Sell
B. Regression: a continuous quantity (linear regression)
• Future % increase in the market
• Predicted amount of future purchases

Task 2: Propose Candidate Predictors (X's)
These are merely candidates because we don't know yet if any will be useful for predicting the target Y.
Predictors must be based on data known at the time the prediction is made: look back in time from the present.
– Tomorrow's closing price: No
– Today's closing price or prior closing prices: Yes
Not all indicators need to be useful, but some must be. Success in predictive modeling requires that some candidate predictors have useful information about the quantity or class to be predicted (Y).

Task 2 Is Crucial! If not done well, all is lost.
1. The task of the domain expert (you)
2. The expert must know which raw data series may contain relevant information:
   1. Price
   2. Volume
   3. Open interest
   4. Interest rates, etc.
3. The expert proposes useful ways to transform raw series into indicators
   – that expose the information in the raw data series to the data mining algorithm.
   – For example, in our problem the X's must be stationary.

Skipping Task 3 for a Moment
Building the database of solved examples from which the DM algorithm learns the model

Tasks 4 & 5
4. Selecting indicators from the candidate list that warrant a place in the prediction model
   – Determining which candidates contain relevant, non-redundant information about Y
   – The set of indicators that work synergistically
5. Determining the prediction function
   – What is the mathematical or logical formula for combining the values of the X's to best estimate the value of Y?
   – A complex configural reasoning problem

What Is a Prediction Function?
• A mathematical or logical formula for combining the selected indicators to produce a best estimate of the target variable.
• Simplest case: a one-predictor model with a linear shape: y = a x1 + b
  – b is the value of the Y intercept of the line
  – a is the slope of the line

Simplest Prediction Model
One predictor and a flat model surface (no hills or valleys)
[Figure: a straight line in the (X1, Y) plane with Y intercept b; for a given value of X1, the model predicts the corresponding value of Y on the line.]

Multiple Linear Regression
Combines two or more X's in a linear way to predict the value of Y
• In multiple linear regression the combining function is assumed to be linear (a weighted sum).
• Y = a1X1 + a2X2 + a3X3 + ... + anXn + c
Regression coefficients (weights) are found by the method of least squares.

Modern data miners need not assume a linear form. They allow the data mining algorithm to discover it. It may be non-linear and arbitrarily complex.

Linear Model: Flat Response (Y) Surface
Y is a linear function of two features, X1 and X2:
Y = A X1 + B X2 + C
[Figure: a tilted plane over the (X1, X2) axes, with slope A along X1, slope B along X2, and intercept C.]

The Linear Model Is the Best-Fitting Tilted Flat Surface to the Data

The model's prediction is the altitude of the Y surface corresponding to the values of X1 and X2.

Thinking of a Prediction Model's Output as a Super Indicator
A new indicator that condenses and combines the information in two or more indicators (variables) into a new, or super, indicator

Model Output as a "Super Indicator"
• The output of a prediction model is a new variable, produced by a function found by regression analysis.
• The function is a weighted sum of the indicators serving as inputs to the model (X1, X2, etc.).
• The function's weights have been optimized to transform values of the inputs into a best estimate of the target (Y).
  – The method of least squares is used to find the optimal weights.
  – The weights cause the line or plane to fit the historical data.

Multiple Linear Regression
Combines two or more X's in a linear way to predict the value of Y
• In multiple linear regression the combining function is assumed to be linear (additive).
• Y = a1X1 + a2X2 + a3X3 + ... + anXn + c

But what if the true shape of the relationship between the indicators (X1 ... Xn) and Y is not a tilted flat surface, but something more complex?

Modern data miners do not assume the model surface is linear (free of hills and valleys). They allow the data mining algorithm to discover its shape, which may be non-linear.

Suppose the authentic relationship between X1 & X2 and Y looks like this:
[Figure: a curved surface Y = f(X1, X2) over the (X1, X2) axes.]

Forcing a Linear Model to Describe a Non-Linear Phenomenon Misses the Boat!
The model fails to capture the authentic patterns in the data.
[Figure: Y (future trend) plotted against X1 and X2 (TA indicators), with regions where the linear model's predictions are too low and regions where they are too high.]

Financial markets are most likely to be complex non-linear systems.

Tasks 4 & 5 Must Be Performed by Data Mining Software
[Figure: candidate predictors X1, X2, X3, X4, X5, ... Xn (a set of indicators proposed by the human expert) feed a complex system with combining function Y = f(x), which produces the outcome to predict, Y.]
Task 4: Which, if any, of the candidate predictors contain information relevant to Y?
Task 5: What is the shape of the mathematical function that best combines the indicators into a predicted value of Y?
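To see how a forced linear fit can miss a non-linear pattern, here is a small sketch that uses one predictor instead of two. The data are hypothetical (a U-shaped relationship, y equal to x squared), and `fit_line` is an illustrative helper written for this example, not part of any particular data mining package.

```python
def fit_line(xs, ys):
    """Least-squares slope a and intercept b for the model y = a*x + b."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    a = sxy / sxx
    b = mean_y - a * mean_x
    return a, b


# Hypothetical data: Y is completely determined by X, but not linearly.
xs = [-2.0, -1.0, 0.0, 1.0, 2.0]
ys = [x ** 2 for x in xs]

a, b = fit_line(xs, ys)

# Prediction minus actual: negative means the line predicts too low,
# positive means it predicts too high.
errors = [(a * x + b) - y for x, y in zip(xs, ys)]
```

On this data the least-squares line comes out flat (slope 0, intercept 2). Its predictions are too low at both extremes of X and too high in the middle, even though X carries all the information needed to predict Y.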
Note: when the DM method used is multiple linear regression, the prediction function is assumed to be linear.

Human Experts & Data Mining Algorithms Have Different but Complementary Information Processing Abilities
They synergize:
– Where humans are strong, DM algorithms are weak.
– Where human experts are weak, DM algorithms are strong.

Definition: Configural Thinking
A multitude of variables (indicators) must be considered simultaneously as an inseparable configuration (pattern). Considering each variable individually will not provide the correct conclusion.

Human Intelligence
Strengths
• Creative
  – Posing problems (Y)
  – Proposing candidate indicators (X's)
Weaknesses
• Weak configural reasoning
  – Distinguishing relevant from irrelevant X's
  – Combining multiple variables

Machine Intelligence (Data Mining)
Weaknesses
• Lacks creativity
  – Unable to pose questions (define Y)
  – Unable to propose candidate indicators (define X's)
Strengths
• Excellent configural ability: handles numerous variables simultaneously
  – Can identify relevant, non-redundant indicators
  – Can formulate multivariate prediction functions

Who or What Should Handle the 5 Tasks?
1. Define Y
2. Propose candidate indicators (X's)
3. Build database of solved cases
4. Indicator selection: which candidate X's are relevant and non-redundant
5. Determining the optimal combining function: a mathematical model that combines useful X's into a prediction or classification decision
Tasks 4 and 5: a task for automated data mining algorithms

The Evidence
Studies of human experts solving multivariate recurrent prediction problems show:
1. Experts realize the necessity for configural reasoning (combining variables in a complex, non-linear fashion).
2. Experts are under the impression that they are combining information in a complex configural manner, but studies show otherwise.
3. Experts rely primarily on simple linear rules for combining information.
4. Their performance is poor.
   – Inconsistent: the same set of information elicits different decisions on different occasions (correlation 0.6)
   – Correlation among experts is also low

Technical Analyst Faced With a Large Set of Conflicting Indicators
Let each bullish factor = +1 and each bearish factor = -1
5 bullish factors and 3 bearish factors
Sum of bullish factors = +5; sum of bearish factors = -3

Human Experts (Technical Analysts) Rely on Intuitive Linear Combining
+5 - 3 = +2: "I'm bullish"

Comparing the Subjective Predictions of Experts With Multiple Linear Regression Models
Studies began in 1954.
The question: How accurate are the predictions of humans compared to multiple linear regression models given the same set of indicators?

[Figure: Expert's Subjective Predictions vs. Multiple Regression Models. r² of predicted vs. actual for expert and model in each study; study labels: Sales Effective., Academic, Stocks, Cancer survival, Student Att., Mental ill., Teach. effective, Business Failure. Model mean r² = 0.38; expert mean r² = 0.11.]

Meta-Analysis of 135 Similar Studies
Draws a conclusion from multiple independent studies (Study 1, Study 2, Study 3, ... Study n)
Swets, Monahan & Dawes 2000
• Meta-analysis of >135 studies comparing 3 decision-making methods:
1. Expert / intuitive (subjective) judgment based on anecdotal experience and informal reasoning
2. Statistical models
3. Combination of methods #1 & #2

A Wide Variety of Disciplines Were Examined in the 135 Studies
• Fields
  – Medical diagnosis
  – Penology (parole recidivism, violence)
  – Psychology (diagnosis and treatment selection)
  – Education (predicting success in academics)
  – Predicting football game outcomes
• Results were quite consistent across fields.

Results of the Meta-Analysis of 135 Studies
• In 96% of the studies, regression models beat or were equal to expert judgment.
• In medical diagnosis, expert judgment was always worse than the regression model.
• Experts beat statistical models in only 6 studies.

The Question: With all this evidence, why do experts insist on making subjective / intuitive predictions and decisions?

Bottom Line for Technical Analysis (Aronson's Editorial Opinion)
When making predictions, rely on objective statistical models, not subjective judgment.
TA practitioners should partner up with data mining algorithms.

Task #3: Build the Database of Solved Examples
The data (experience) base is used by the data mining algorithms to learn how to build the prediction model.
This task often takes 90-95% of the time when developing a data-mined model.

Database of Solved Examples: Known Values of "Y"
• What is a "solved example"? A case (situation, example, etc.) for which the value of the target variable is known, as well as the values of the X's (candidate predictors).
  – The value of Y is known because the case happened in the past.
  – Even though Y is forward looking, the case occurred long enough ago that the value of Y is known.
• Each case in the database is described by 2 kinds of information:
  1. The value of the target variable Y
  2. The values of the candidate predictors

Examples of a Solved Case
A. 1 day of market history for the S&P 500
   1. Y value: % change over the month following the date of the case (regression)
   2. X values: values of the indicators on the date of the case
B. An oil drilling site
   1. Y value: did the site produce oil or not (class)
   2. X values: values of 10 geophysical parameters characterizing the site
C. 1 company
   1. Y value: company failed or did not fail within the next 2 years
   2. X values: values of various financial ratios taken from the most recent balance sheet and income statement

Database of Solved Examples
• Contains many cases (typically thousands)
  – Why so many? Data density.
• From the many cases the DM algorithm tries to discover:
  – Which, if any, of the candidate predictors can solve the regression or classification problem (Task #4)
  – How the selected predictors should be combined mathematically or logically to give the most accurate estimate possible of the value of the target Y (Task #5)

[Figure: Matrix of examples with known values of both X's and Y. Columns: candidate indicators X1, X2, ... XN, and Y; rows: Example 1, Example 2, Example 3, Example 4, ... Case N.]

Human Intelligence: Unchanging
Computer Power & Machine Intelligence: Growing Exponentially
Moore's Law: an increasing competitive advantage to the data miners
[Figure: power (arithmetic scale) plotted against time.]

The A, B, C's of Being an Intelligent Technical Analyst
A. Know how to use data mining tools
B. Know how to define data mining problems (define Y)
C. Know how to define a list of information-rich candidate predictors (X's)
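Points B and C amount to supplying a target definition and a list of indicator definitions; once those exist, the database of solved examples from Task #3 can be assembled mechanically. The sketch below is one hedged way to do that in Python. The function `build_solved_examples`, the toy indicator `change_2d`, the toy target `up_or_down`, and the short price list are all hypothetical and chosen only to keep the example small.

```python
def build_solved_examples(series, indicators, target, lookback, horizon):
    """Return one row per date t0 for which all X's and Y are known.

    series     : list of raw values (for example, daily closes)
    indicators : dict mapping a name to a function of the history up to t0
    target     : function(series, t0, horizon) giving the known value of Y
    lookback   : number of past values the indicators need before t0
    horizon    : how far ahead of t0 the target looks
    """
    rows = []
    # Stop `horizon` steps before the end: later dates have no known Y yet.
    for t0 in range(lookback, len(series) - horizon):
        history = series[: t0 + 1]  # the X's may see only past and present
        row = {name: fn(history) for name, fn in indicators.items()}
        row["Y"] = target(series, t0, horizon)
        rows.append(row)
    return rows


def change_2d(history):
    """Toy indicator: fractional change over the last 2 periods."""
    return history[-1] / history[-3] - 1.0


def up_or_down(series, t0, horizon):
    """Toy target: 1 if the value is higher `horizon` periods later, else 0."""
    return 1 if series[t0 + horizon] > series[t0] else 0


closes = [10, 11, 12, 11, 10, 9, 10, 12]
rows = build_solved_examples(
    closes, {"change_2d": change_2d}, up_or_down, lookback=2, horizon=2
)
```

With eight closes, a lookback of 2, and a horizon of 2, only four dates qualify as solved examples. The two most recent dates are excluded because their Y is not yet known, which is the same reason a real database must end some time before the present.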