First Presentation, Final Year Project, 2013
Analyzing Stock Quotes using
Data Mining Techniques
Name of Student: To Yi Fun
University Number: 2010149103
Flow of Presentation
• Aim of this classification for stock trading
• Theory of Classification
• Decision Tree making
• Introduction of the application
• Structure and technologies used in this application
• Preparation
• Interface
• Demonstration
• Data Analysis
• What to do next
• Q&A
Aim
• Find a model that expresses the class attribute as a function of
the other attributes, so that previously unseen records can be
assigned to a class
• e.g. find a classifier for historic stock prices;
group companies into different classes for
inspection
• Classifiers: decision tree, rule-based classifier
Theory for Decision Tree
• A series of test conditions that sorts the
instances into classes
• Greedy: at each step, split the records on the attribute
that best satisfies the splitting criterion
• Splitting on a discrete attribute: 2-way split or
multi-way split
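The two kinds of split on a discrete attribute can be sketched as follows. This is a minimal illustration, not the project's own code: the function names, the "industry" attribute and the sample records are all hypothetical.

```python
from collections import defaultdict


def multiway_split(records, attribute):
    """Partition records into one group per distinct value of the attribute."""
    groups = defaultdict(list)
    for record in records:
        groups[record[attribute]].append(record)
    return dict(groups)


def two_way_split(records, attribute, left_values):
    """Partition records into two groups: values in left_values vs. all others."""
    left = [r for r in records if r[attribute] in left_values]
    right = [r for r in records if r[attribute] not in left_values]
    return left, right


# Hypothetical records; attribute names and values are illustrative only.
records = [
    {"industry": "Finance", "buy": True},
    {"industry": "Energy", "buy": False},
    {"industry": "Pharma", "buy": True},
    {"industry": "Finance", "buy": False},
]

by_industry = multiway_split(records, "industry")           # one group per industry
left, right = two_way_split(records, "industry", {"Finance"})  # Finance vs. the rest
```

A multi-way split creates one branch per attribute value, while a 2-way split has to choose which values go to each side, so a tree builder usually evaluates several candidate groupings.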
Theory for Decision Tree
• Best split
-Gini index: a generalization of variance impurity
-Entropy: the amount of impurity in a set
• Aim: use a training set
to build a classifier for
classifying the testing set
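For a node whose records fall into classes with proportions p_1 ... p_k, the usual definitions are Gini = 1 - sum(p_i^2) and Entropy = -sum(p_i * log2(p_i)). The sketch below computes both and scores a candidate split by the size-weighted impurity of its child groups; the lower the score, the better the split. The function names and the "buy"/"hold" labels are assumptions made for illustration.

```python
import math
from collections import Counter


def gini(labels):
    """Gini index of a list of class labels: 1 - sum(p_i^2)."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((count / n) ** 2 for count in Counter(labels).values())


def entropy(labels):
    """Entropy (in bits) of a list of class labels: -sum(p_i * log2(p_i))."""
    n = len(labels)
    if n == 0:
        return 0.0
    return -sum((count / n) * math.log2(count / n)
                for count in Counter(labels).values())


def weighted_impurity(groups, impurity):
    """Impurity of a split: child impurities weighted by child size."""
    total = sum(len(group) for group in groups)
    return sum(len(group) / total * impurity(group) for group in groups)


# Hypothetical class labels for four records.
parent = ["buy", "buy", "hold", "hold"]
pure_split = [["buy", "buy"], ["hold", "hold"]]    # separates the classes
mixed_split = [["buy", "hold"], ["buy", "hold"]]   # separates nothing

# A greedy tree builder would prefer pure_split under either measure.
best = min([pure_split, mixed_split], key=lambda s: weighted_impurity(s, gini))
```

Both measures are zero for a pure node and largest when the classes are evenly mixed, which is why they often, though not always, pick the same split.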
Application Structure
[Diagram boxes: Download; CSV2MYSQLGENERATOR; Processed Data; Filter Query (Splitting). Stage labels: Raw data; Data processing; Information presentation and arithmetic operation]
Preparation
• Download the historic stock data for the 30 Dow shares, e.g. Pfizer,
Bank of America, American Express, Exxon
• Convert to .csv files to be processed by the
CSV2MYSQLGENERATOR program; the result is a lengthy list of SQL
commands
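The internals of CSV2MYSQLGENERATOR are not shown in these slides. As a rough sketch of the idea only, the conversion from a quote .csv file to SQL commands might look like the following, assuming a header row of Date, Open, High, Low, Close, Volume and a hypothetical table named quotes; the sample rows are made-up values.

```python
import csv
import io


def csv_to_insert_statements(csv_text, table, symbol):
    """Turn each CSV row of daily quotes into one SQL INSERT statement."""
    reader = csv.DictReader(io.StringIO(csv_text))
    statements = []
    for row in reader:
        # Numeric fields are parsed first so malformed rows fail early.
        statements.append(
            "INSERT INTO {t} (symbol, trade_date, open, high, low, close, volume) "
            "VALUES ('{s}', '{d}', {o}, {h}, {l}, {c}, {v});".format(
                t=table, s=symbol, d=row["Date"],
                o=float(row["Open"]), h=float(row["High"]),
                l=float(row["Low"]), c=float(row["Close"]),
                v=int(row["Volume"]),
            )
        )
    return statements


# Made-up sample rows; column names and table name are assumptions.
sample_csv = (
    "Date,Open,High,Low,Close,Volume\n"
    "2013-01-02,25.5,26.0,25.25,25.75,1000\n"
    "2013-01-03,25.75,26.5,25.5,26.25,1200\n"
)
statements = csv_to_insert_statements(sample_csv, "quotes", "PFE")
```

Building SQL text by string formatting is only reasonable for generating a script from trusted files; when inserting through a database connection directly, parameterised queries are the safer choice.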
Data Processing
• Categorize the stocks into different types by industry
• Dow 30 as the training set and 8 more stocks as the testing set, mainly
large-scale companies
Data Processing
• Attributes Setting
-HL_30DaysAverage: Tendency
-HL_ChangeDaily: Change
-HL_ChangePerc: Difference
-HL_VolChange: Popularity
• Class:
-B_RiseMore3Perc5Day: Buy Signal
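The slides do not define how the class attribute is computed. Reading the name B_RiseMore3Perc5Day as "the price rises by more than 3% within the next 5 trading days" (an assumption, as are the function name, its parameters and the sample prices), the labelling could be sketched as:

```python
def rise_more_3perc_5day(closes, threshold=0.03, horizon=5):
    """Label each day True if any close in the next `horizon` days
    exceeds that day's close by more than `threshold`.

    Days without a full `horizon` of future data are labelled None.
    This interpretation of the attribute name is an assumption.
    """
    labels = []
    for i, price in enumerate(closes):
        future = closes[i + 1 : i + 1 + horizon]
        if len(future) < horizon:
            labels.append(None)  # not enough future data to decide
        else:
            labels.append(max(future) > price * (1 + threshold))
    return labels


# Made-up closing prices for eight consecutive trading days.
closes = [100, 101, 102, 104, 103, 102, 101, 100]
labels = rise_more_3perc_5day(closes)
```

Because the label looks ahead in time, the most recent days of any series cannot be labelled and would have to be left out of the training set.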
User Interface
• Use the MySQL connector to load the processed data into
the C# application
• Three Major Components:
-Input
-Result Log
-Test
Demonstration
Result
Result Analysis
• Attributes Setting
-HL_30DaysAverage: Tendency
-HL_ChangeDaily: Change
-HL_ChangePerc: Difference
-HL_VolChange: Popularity
What to do Next
• Implement a more user-friendly UI for presenting the stock prices,
visualizing the tree and providing a query service
• Implement a splitting algorithm using Gini and compare the
results generated by these algorithms
Q&A