Commodities Futures Price Prediction: An Artificial Intelligence Approach
Thesis Defense

Commodities Markets
• Commodity – a good that can be processed and resold
  – Examples: corn, rice, silver, coal
• Spot market
• Futures market

Futures Markets
• Origin
• Motivation
  – Hedgers
    • Producers
    • Consumers
  – Speculators
• Size and scope – CBOT (2002)
  – 260 million contracts
  – 47 different products

Profit in the Futures Market
• Information
  – Supply
    • Optimal production
    • Weather
    • Labor
    • Pest damage
  – Demand
    • Industrial
    • Consumer
• Time series analysis

Time Series Analysis - Background
• Time series examples
  – River flow and water levels
  – Electricity demand
  – Stock prices
  – Exchange rates
  – Commodities prices
  – Commodities futures prices
• Patterns

Time Series Analysis - Methods
• Linear regression
• Non-linear regressions
• Rule-based systems
• Artificial neural networks
• Genetic algorithms

Data
• Daily price data for soybean futures
• Chicago Board of Trade
• Jan. 1, 1980 – Jan. 1, 1990
• Source: Datastream
• Normalized

Why Use an Artificial Neural Network (ANN)?
• Excellent pattern recognition
• Other uses of ANNs in financial time series analysis
  – Estimating a generalized option pricing formula
  – Standard & Poor's 500 index futures day trading system
  – Standard & Poor's 500 futures options prices

ANN Implementation
• Stuttgart Neural Network Simulator, version 4.2
• Resilient propagation (RPROP)
  – Improvement over standard back propagation
  – Uses only the sign of the error derivative (sketched in code after the Evaluation Function slide below)
• Weight decay
• Parameters
  – Number of inputs: 10 and 100
  – Number of hidden nodes: 5, 10, 100
  – Weight decay: 5, 10, 20
  – Initial weight range: ±1.0, 0.5, 0.25, 0.125, 0.0625

ANN Data Sets
• Training set: Jan. 1, 1980 – May 2, 1983
• Testing set: May 3, 1983 – Aug. 29, 1986
• Validation set: Sept. 2, 1986 – Jan. 1, 1990

ANN Results
• Mean error (cents per bushel)
  – 100-input network: 12.00, 24.93
  – 10-input network: 10.62, 25.88

Why Evolve the Parameters of an ANN?
• Selecting suitable parameters is a difficult, poorly understood task
• The search space is different for each task
• Trial and error is time consuming
• Evolutionary techniques provide powerful search capabilities for finding acceptable network parameters

Genetic Algorithm Implementation
• GAlib, version 4.5 (MIT)
• Custom code to implement RPROP with weight decay
• Real-number representation
  – Number of input nodes (1 – 100)
  – Number of hidden nodes (1 – 100)
  – Initial weight range (0.0625 – 2.0)
  – Initial step size (0.0625 – 1.0)
  – Maximum step size (10 – 75)
  – Weight decay (0 – 20)

Genetic Algorithm Implementation (continued)
• Roulette wheel selection
• Single-point crossover
• Gaussian random mutation
• High mutation rate
(A sketch of these operators follows the Evaluation Function slide below.)

Evaluation Function
• Decode the parameters and instantiate a network using them
• Train the ANN for 1000 epochs
• Report the lowest total sum of squared error over both the training and testing data sets
• Fitness equals the inverse of the total error reported
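To make the RPROP slide concrete, here is a minimal NumPy sketch of the sign-based update with per-weight step sizes and a simple weight-decay term, on a toy next-step prediction task. This is not the thesis's SNNS code: the network size, step-size constants, decay value, and synthetic data are all illustrative assumptions.

import numpy as np

rng = np.random.default_rng(0)

# Toy stand-in for the normalized futures series: predict the next value
# of a sine wave from the previous 10 values (the thesis used 10 or 100
# lagged daily soybean prices; this data is synthetic).
series = np.sin(np.linspace(0, 20 * np.pi, 2000))
X = np.array([series[i:i + 10] for i in range(len(series) - 10)])
y = series[10:].reshape(-1, 1)

# One hidden layer; the sizes here are assumptions, not the thesis's.
n_in, n_hid, n_out = 10, 5, 1
W1 = rng.uniform(-0.5, 0.5, (n_in, n_hid))
W2 = rng.uniform(-0.5, 0.5, (n_hid, n_out))

def gradients(X, y, W1, W2, decay):
    """Forward pass plus backprop gradients, with weight decay added."""
    h = np.tanh(X @ W1)
    err = h @ W2 - y
    g2 = h.T @ err
    g1 = X.T @ ((err @ W2.T) * (1 - h ** 2))
    return g1 + decay * W1, g2 + decay * W2, float((err ** 2).sum())

def rprop_step(W, g, g_prev, step):
    """RPROP: use only the sign of the error derivative, adapting a
    per-weight step size instead of using the gradient magnitude."""
    same = g * g_prev > 0                              # sign unchanged -> grow step
    flip = g * g_prev < 0                              # sign flipped -> shrink step
    step[same] = np.minimum(step[same] * 1.2, 50.0)    # eta+ = 1.2; max step assumed
    step[flip] = np.maximum(step[flip] * 0.5, 1e-6)    # eta- = 0.5; min step assumed
    g = np.where(flip, 0.0, g)                         # skip the update after a sign flip
    W -= np.sign(g) * step
    return W, g

step1, step2 = np.full_like(W1, 0.05), np.full_like(W2, 0.05)
g1_prev, g2_prev = np.zeros_like(W1), np.zeros_like(W2)

for epoch in range(200):
    g1, g2, sse = gradients(X, y, W1, W2, decay=1e-4)
    W1, g1_prev = rprop_step(W1, g1, g1_prev, step1)
    W2, g2_prev = rprop_step(W2, g2, g2_prev, step2)

print("final training SSE:", sse)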
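Similarly, the GA configuration above (real-number representation, roulette-wheel selection, single-point crossover, Gaussian mutation, fitness as the inverse of the error) can be sketched as a generic real-coded GA. This is written for illustration, not with the GAlib API, and evaluate() is a cheap synthetic stand-in for the actual 1000-epoch RPROP training run.

import numpy as np

rng = np.random.default_rng(1)

# Genome layout and ranges from the implementation slide:
# [n_inputs, n_hidden, init_weight_range, init_step, max_step, weight_decay]
LO = np.array([1.0,   1.0,   0.0625, 0.0625, 10.0, 0.0])
HI = np.array([100.0, 100.0, 2.0,    1.0,    75.0, 20.0])

def evaluate(genome):
    """Stand-in for the thesis's evaluation function, which decodes the
    genome (rounding node counts to integers, the genotype/phenotype
    issue noted in the results), trains the ANN for 1000 epochs, and
    reports the lowest total SSE. A synthetic error surface keeps this
    sketch runnable; the target vector is arbitrary."""
    target = np.array([10.0, 5.0, 0.5, 0.125, 50.0, 10.0])
    error = float(np.sum(((genome - target) / (HI - LO)) ** 2)) + 1e-9
    return 1.0 / error                 # fitness = inverse of the error

def roulette(pop, fits):
    """Roulette-wheel selection: pick parents in proportion to fitness."""
    return pop[rng.choice(len(pop), p=fits / fits.sum())]

def crossover(a, b):
    """Single-point crossover on the real-valued genome."""
    cut = rng.integers(1, len(a))
    return np.concatenate([a[:cut], b[cut:]])

def mutate(g, rate=0.3):
    """Gaussian random mutation; the slides note a high mutation rate."""
    mask = rng.random(len(g)) < rate
    g = g + mask * rng.normal(0.0, 0.1 * (HI - LO))
    return np.clip(g, LO, HI)          # keep genes inside the slide's ranges

pop = rng.uniform(LO, HI, size=(30, 6))
for gen in range(50):
    fits = np.array([evaluate(g) for g in pop])
    pop = np.array([mutate(crossover(roulette(pop, fits),
                                     roulette(pop, fits)))
                    for _ in range(len(pop))])

best = max(pop, key=evaluate)
print("best decoded parameters:", np.round(best, 3))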
Parameter Evolution - Results
• GANN mean error: 10.82
• NN mean error: 10.62
• Conclusions
  – GANN performance is close, and it outperforms the majority of networks generated via trial and error
  – Genotype / phenotype issue
  – Other, possibly better GA techniques
    • Multipoint crossover
    • Tournament selection

Evolving the Weights of an ANN
• Avoid local minima
• Avoid a tedious trial-and-error search for learning parameters
• Search a broad, poorly understood solution space and maximize the values of the function parameters

Weight Evolution - Implementation
• GAlib, version 4.5 (MIT)
• Custom-written neural network code
• Real-number representation
• Gaussian mutation
• Two-point crossover
• Roulette wheel selection

Weight Evolution - Objective Function
• Instantiate a neural network with the weight vector (i.e., the individual)
• Feed one epoch of the training data
• Fitness equals the inverse of the sum of the squared network error returned
(A sketch of this objective function follows the Summary below.)

Weight Evolution - Keeping the Best Individual
• The fitness function evaluates against the training set only
• The objective function evaluates against the training set as well, but only for retention of the candidate best network
• Meta-fitness, or the meta-elite individual

Weight Evolution - Results
• Mean error
  – GANN-Weight: 10.67
  – GANN: 10.82
  – NN: 10.61
• Much faster
• Fewer man-hours

Summary
• The pure ANN approach is very man-hour intensive, and expert experience is valuable
• Evolving the network parameters requires few man-hours, but many hours of computational resources
• Evolving the network weights provides most of the performance at a smaller cost in both human and computer time
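Finally, a minimal sketch of the weight-evolution objective function referenced above: the GA individual is the network's flattened weight vector, one epoch of training data is fed forward, and fitness is the inverse of the summed squared error, with the best candidate retained across generations as the meta-elite. Again NumPy with assumed sizes and operator settings, not the thesis's custom code; retention here reuses the training SSE for brevity.

import numpy as np

rng = np.random.default_rng(2)

# Synthetic stand-in for the normalized soybean series: next-step
# prediction from 10 lagged values.
series = np.sin(np.linspace(0, 20 * np.pi, 1200))
X = np.array([series[i:i + 10] for i in range(len(series) - 10)])
y = series[10:].reshape(-1, 1)

n_in, n_hid, n_out = 10, 5, 1
n_weights = n_in * n_hid + n_hid * n_out

def sse(weights):
    """Objective: instantiate a network from the weight vector (the
    individual), feed one epoch of data, return the summed squared error."""
    W1 = weights[:n_in * n_hid].reshape(n_in, n_hid)
    W2 = weights[n_in * n_hid:].reshape(n_hid, n_out)
    out = np.tanh(X @ W1) @ W2
    return float(((out - y) ** 2).sum())

def fitness(weights):
    return 1.0 / (sse(weights) + 1e-9)     # inverse of the error

pop = rng.uniform(-0.5, 0.5, size=(40, n_weights))
meta_elite, meta_err = None, np.inf        # best individual ever seen

for gen in range(100):
    fits = np.array([fitness(ind) for ind in pop])
    # Meta-elite retention: keep the best network across all generations.
    best = pop[fits.argmax()]
    if sse(best) < meta_err:
        meta_elite, meta_err = best.copy(), sse(best)
    # Roulette-wheel selection, two-point crossover, Gaussian mutation,
    # matching the operators on the implementation slide.
    p = fits / fits.sum()
    new = []
    for _ in range(len(pop)):
        a, b = pop[rng.choice(len(pop), 2, p=p)]
        c1, c2 = sorted(rng.choice(n_weights, 2, replace=False))
        child = np.concatenate([a[:c1], b[c1:c2], a[c2:]])
        child += rng.normal(0.0, 0.05, n_weights) * (rng.random(n_weights) < 0.2)
        new.append(child)
    pop = np.array(new)

# meta_elite now holds the best weight vector found across all generations.
print("meta-elite SSE after evolution:", meta_err)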