Core Data Analysis
Boris Mirkin / Борис Григорьевич Миркин
• Professor, Data Analysis and AI, NRU HSE, Moscow
• Professor Emeritus, Computer Science, UL, London
Presentation:
1. Intro
2. Course philosophy and contents
3. A most successful example of data analysis
4. Differences in approaches of:
   • Mathematical statistics
   • Machine Learning
   • Data Mining
   • Data Analysis
Main text (1):
From Computing Reviews of the ACM, June 27, 2011:
• “There is an unforgettable scene in the film Lawrence of Arabia where T.E. Lawrence is asked why he is obsessed with the desert. His reply: ‘It’s clean.’
• Core Concepts in Data Analysis is clean and devoid of any fuzziness.
• The author presents his theses with a refreshing clarity seldom seen in a text of this sophistication. The entire text is rich in solved examples, case studies, projects, and introspective questions.”
Data Analysis: Methods for enhancing knowledge (2)
• Core Data Analysis: methods for enhancing structural knowledge
• Elements of structural knowledge:
  ◦ concepts
  ◦ statements of relation among concepts:
    Ohm’s law in physics – quantitative
    Rule A → B – categorical
CoDA contents: Structural knowledge enhancing (2)
• Generic: two pathways, two formats
  ◦ Summarization (concept) methods:
    • Quantitative: Principal component analysis (PCA)
    • Categorical: Cluster analysis
  ◦ Correlation (relation) methods:
    • Quantitative: Regression
    • Categorical: Classifier
• Preliminary:
  • 1D Data Analysis: Histograms
  • 2D Data Analysis: Correlation/Association
Example of Data Analysis (3)
Laws of planetary motion: J. Kepler (circa 1605), using data of Tycho Brahe (1546-1601):
• 1st Law: Planets revolve around the Sun in ellipses
• 2d Law: The further away from the Sun, the slower the speed (equal sectors in equal time)
Example of Data Analysis: 3d Law (3.1)
Planet     Period (year)   Distance (average, relative to that of Earth)
Mercury    0.241           0.39
Venus      0.615           0.72
Earth      1.00            1.00
Mars       1.88            1.52
Jupiter    11.8            5.20
Saturn     29.5            9.54
Uranus     84.0            19.18
Neptune    165             30.06
Pluto      248             39.44

Is there any relation between speed/period and distance?
Example of Data Analysis: 3d Law (3.2)
3d Kepler’s Law: Is there any relation between speed/period and distance?
Fits no line…
Example of Data Analysis: 3d Law (3.3)
3d Kepler’s Law (1619):
[ J. Napier invented logarithms (1614) ]
Transform data:
$\log P = \tfrac{3}{2} \log D$
$P^2 = D^3$
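The log transform on this slide can be checked numerically. The sketch below is not part of the original slides: it fits a straight line to log(Period) against log(Distance) by ordinary least squares, using the table from slide 3.1, and then computes the ratio P^2 / D^3 for each planet. The helper name `fit_line` is made up for illustration.

```python
import math

# (planet, period in years, average distance relative to Earth's),
# copied from the table on slide 3.1.
PLANETS = [
    ("Mercury", 0.241, 0.39),
    ("Venus", 0.615, 0.72),
    ("Earth", 1.00, 1.00),
    ("Mars", 1.88, 1.52),
    ("Jupiter", 11.8, 5.20),
    ("Saturn", 29.5, 9.54),
    ("Uranus", 84.0, 19.18),
    ("Neptune", 165, 30.06),
    ("Pluto", 248, 39.44),
]


def fit_line(xs, ys):
    """Ordinary least squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


# Fit the transformed data: log P against log D.
log_d = [math.log10(d) for _, _, d in PLANETS]
log_p = [math.log10(p) for _, p, _ in PLANETS]
slope, intercept = fit_line(log_d, log_p)
print(f"log P = {slope:.3f} * log D + {intercept:.3f}")

# If P^2 = D^3 holds, this ratio should be close to 1 for every planet.
ratios = {name: p ** 2 / d ** 3 for name, p, d in PLANETS}
for name, ratio in ratios.items():
    print(f"{name:8s} P^2 / D^3 = {ratio:.3f}")
```

On these nine rows the fitted slope comes out close to 3/2 and the intercept close to 0, and every ratio stays near 1, which is what the law states in these units.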
Example of Data Analysis: 3d Law. So what? (3.4)
Kepler’s three laws: what is so grand about them?
Substantiated theoretically by R. Hooke (1635-1703) and I. Newton (1642-1727):
UNIVERSAL GRAVITATION LAW!
A mathematical equation, a CORNERSTONE of science
Data Analysis Differs from
- Math Statistics,
- Data Mining,
- Machine Learning
• Mathematical methods for data processing are the same
BUT
• Different questions asked
Math Statistics Approach
3d Kepler’s Law: Is there any relation between speed/period and distance?
Needs a probabilistic model: Period = f(Distance, Error).
Proof: Statistical criteria
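One way to make the statistical view concrete is sketched below. The slides name neither a model nor a criterion, so everything specific here is an assumption: the model is taken to be log10(P) = a + b * log10(D) + e with independent Gaussian errors e, and the criterion is a t-test and a 95% confidence interval for the slope b. The constant 2.365 is the standard two-sided 95% Student t quantile for 7 degrees of freedom.

```python
import math

# (planet, period in years, average distance relative to Earth's), from slide 3.1.
PLANETS = [
    ("Mercury", 0.241, 0.39), ("Venus", 0.615, 0.72), ("Earth", 1.00, 1.00),
    ("Mars", 1.88, 1.52), ("Jupiter", 11.8, 5.20), ("Saturn", 29.5, 9.54),
    ("Uranus", 84.0, 19.18), ("Neptune", 165, 30.06), ("Pluto", 248, 39.44),
]

# Assumed model (not stated on the slide):
#   log10(P) = a + b * log10(D) + e,  e ~ independent Gaussian noise
xs = [math.log10(d) for _, _, d in PLANETS]
ys = [math.log10(p) for _, p, _ in PLANETS]
n = len(xs)

mean_x = sum(xs) / n
mean_y = sum(ys) / n
sxx = sum((x - mean_x) ** 2 for x in xs)
sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))

b = sxy / sxx              # least-squares slope
a = mean_y - b * mean_x    # least-squares intercept

# Residual variance with n - 2 degrees of freedom, then the slope's standard error.
residuals = [y - (a + b * x) for x, y in zip(xs, ys)]
s2 = sum(r * r for r in residuals) / (n - 2)
se_b = math.sqrt(s2 / sxx)

T_CRIT = 2.365  # two-sided 95% Student t quantile, n - 2 = 7 degrees of freedom

t_stat = b / se_b  # test statistic for H0: b = 0 (no relation)
ci_low, ci_high = b - T_CRIT * se_b, b + T_CRIT * se_b

print(f"slope b = {b:.4f}, standard error = {se_b:.4f}, t = {t_stat:.1f}")
print(f"95% confidence interval for b: [{ci_low:.4f}, {ci_high:.4f}]")
print("H0 'no relation' rejected:", abs(t_stat) > T_CRIT)
```

Under these assumptions, the "proof" is that the hypothesis of no relation is rejected and the interval for the slope covers 3/2.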
Data Mining Approach
3d Kepler’s Law: Is there any relation between speed/period and distance?
Take many Fs: Period=F(Distance)
- F(x)=log(xa)
- F(x)= ax+b
- F(x)=ax
………….. Which one is most interesting?
Proof: Usage
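The "take many Fs" idea can be sketched as a small search over candidate function families. The three families below (linear, proportional, power law) are illustrative choices, not a decoding of the list on the slide, and ranking by mean relative error is only one possible stand-in for "most interesting". All function names are hypothetical.

```python
import math

# (planet, period in years, average distance relative to Earth's), from slide 3.1.
PLANETS = [
    ("Mercury", 0.241, 0.39), ("Venus", 0.615, 0.72), ("Earth", 1.00, 1.00),
    ("Mars", 1.88, 1.52), ("Jupiter", 11.8, 5.20), ("Saturn", 29.5, 9.54),
    ("Uranus", 84.0, 19.18), ("Neptune", 165, 30.06), ("Pluto", 248, 39.44),
]
ds = [d for _, _, d in PLANETS]
ps = [p for _, p, _ in PLANETS]


def least_squares(xs, ys):
    """Return (slope, intercept) of the ordinary least squares line."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    slope = (sum((x - mx) * (y - my) for x, y in zip(xs, ys))
             / sum((x - mx) ** 2 for x in xs))
    return slope, my - slope * mx


def fit_linear(ds, ps):
    """F(x) = a*x + b, fitted on the raw data."""
    a, b = least_squares(ds, ps)
    return lambda x: a * x + b


def fit_proportional(ds, ps):
    """F(x) = a*x, least squares through the origin."""
    a = sum(d * p for d, p in zip(ds, ps)) / sum(d * d for d in ds)
    return lambda x: a * x


def fit_power(ds, ps):
    """F(x) = a * x**b, fitted as a straight line in log-log space."""
    b, log_a = least_squares([math.log(d) for d in ds],
                             [math.log(p) for p in ps])
    return lambda x: math.exp(log_a) * x ** b


# Illustrative candidate families (an assumption, not the slide's exact list).
CANDIDATES = {
    "linear": fit_linear,
    "proportional": fit_proportional,
    "power": fit_power,
}


def mean_relative_error(f, ds, ps):
    """Average of |F(d) - p| / p over all planets."""
    return sum(abs(f(d) - p) / p for d, p in zip(ds, ps)) / len(ds)


scores = {name: mean_relative_error(fit(ds, ps), ds, ps)
          for name, fit in CANDIDATES.items()}
best = min(scores, key=scores.get)

for name, score in sorted(scores.items(), key=lambda item: item[1]):
    print(f"{name:12s} mean relative error = {score:.4f}")
print("best-fitting family:", best)
```

Goodness of fit only narrows the list of candidates; the slide's criterion, usage, is a judgment that a score like this cannot make by itself.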
Machine Learning Approach
3d Kepler’s Law: Relation between speed/period and distance?
Needs a function f to predict: Period = f(Distance, Error) for Uranus, Pluto.
Deep network for f? - ok.
Proof: Small error.
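The predictive view can be sketched as a hold-out experiment: fit f on the planets other than Uranus and Pluto, then measure the prediction error on those two. The slide mentions a deep network; the sketch below swaps in a much simpler regressor (a straight line in log-log space) purely to keep the example short and runnable, so it illustrates the evaluation protocol and not the model choice. The names `HELD_OUT` and `fit_power_law` are hypothetical.

```python
import math

# (planet, period in years, average distance relative to Earth's), from slide 3.1.
PLANETS = [
    ("Mercury", 0.241, 0.39), ("Venus", 0.615, 0.72), ("Earth", 1.00, 1.00),
    ("Mars", 1.88, 1.52), ("Jupiter", 11.8, 5.20), ("Saturn", 29.5, 9.54),
    ("Uranus", 84.0, 19.18), ("Neptune", 165, 30.06), ("Pluto", 248, 39.44),
]

# Planets whose periods are to be predicted, as named on the slide.
HELD_OUT = {"Uranus", "Pluto"}
train = [row for row in PLANETS if row[0] not in HELD_OUT]
test = [row for row in PLANETS if row[0] in HELD_OUT]


def fit_power_law(rows):
    """Fit log P = b * log D + c on the given rows and return a predictor.

    A simple stand-in for the learned function f; not a deep network.
    """
    xs = [math.log(d) for _, _, d in rows]
    ys = [math.log(p) for _, p, _ in rows]
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    b = (sum((x - mx) * (y - my) for x, y in zip(xs, ys))
         / sum((x - mx) ** 2 for x in xs))
    c = my - b * mx
    return lambda d: math.exp(c + b * math.log(d))


predict = fit_power_law(train)

# Relative prediction error on the planets that were not used for fitting.
errors = {name: abs(predict(d) - p) / p for name, p, d in test}
for name, p, d in test:
    print(f"{name:7s} observed P = {p:6.1f}  predicted P = {predict(d):6.1f}  "
          f"relative error = {errors[name]:.4f}")
```

Here the only evidence offered is the small error on the held-out planets; nothing in the procedure requires f to be interpretable.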
Data Analysis Approach
3d Kepler’s Law: Relation between speed/period and distance?
Needs a function f to add to the theory: Period = f(Distance)
$P^2 = D^3$
Proof: Good interpretation (like