國立雲林科技大學
National Yunlin University of Science and Technology
Local linear correlation analysis with the SOM
Advisor: Dr. Hsu
Presenter: Ching-Wen Hong
Authors: Antonio Piras, Alain Germond
Intelligent Database Systems Lab
Outline
N.Y.U.S.T.
I. M.

Motivation

Objective

Linear correlation

Local linear correlation

The computing of local linear correlation in SOM

Application to electrical load forecasting

Conclusion

My opinion
Motivation

When two variables are not linearly correlated overall, it does not mean
that they are not linearly correlated in some local region. For
example, temperature and electrical energy consumption are not
linearly correlated over a whole year, but the linear correlation is
negative in winter and positive in summer. The global linear correlation
coefficient is therefore not a good tool in some applications.
Objective

This paper uses a local linear correlation to select relevant input
variables for non-linear regression. The method extends the
concepts of the SOM and of linear correlation.

The method is based on the SOM, which makes it possible to compute and
analyse the local linear correlation between variables in neighbouring
subspaces.
Linear correlation

Consider two variables x and y, both with zero mean.

Data set D={(x(t),y(t)) | t=1,…,N}
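For reference, a sketch of the standard Pearson coefficient for zero-mean variables over the data set D. This is assumed to be the definition the slide intends; the transcript does not preserve the slide's own formula.

```latex
r_{xy} = \frac{\sum_{t=1}^{N} x(t)\,y(t)}
              {\sqrt{\sum_{t=1}^{N} x(t)^{2}}\;\sqrt{\sum_{t=1}^{N} y(t)^{2}}}
```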
Linear correlation
-1 ≤ rxy ≤ 1
rxy = 0: x and y have no linear correlation
rxy = 1: x and y have a perfect positive linear correlation
rxy = -1: x and y have a perfect negative linear correlation
Local linear correlation

When two variables are not linearly correlated
over the whole data set, it does not follow that
they are unrelated. In some regions of the
definition space the variables may be linearly
correlated, and in other regions not.

For example, temperature and electrical energy
consumption are not linearly correlated, but the
linear correlation is negative in winter and
positive in summer.
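The effect can be illustrated with a minimal sketch on synthetic data. The V-shaped relation between temperature and load below is hypothetical (it is not the data from the paper), and `pearson` is an illustrative helper name.

```python
import math

def pearson(xs, ys):
    """Pearson linear correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

# Hypothetical V-shaped relation: the load is lowest at 15 degrees
# and rises on both the cold and the warm side.
temps = [float(t) for t in range(0, 31)]
loads = [100.0 + 3.0 * abs(t - 15.0) for t in temps]

# Correlation over the whole range versus each side separately.
r_global = pearson(temps, loads)
cold = [(t, y) for t, y in zip(temps, loads) if t < 15.0]
warm = [(t, y) for t, y in zip(temps, loads) if t > 15.0]
r_cold = pearson(*zip(*cold))
r_warm = pearson(*zip(*warm))
```

In this constructed case the global coefficient vanishes, while the cold side is perfectly negatively correlated and the warm side perfectly positively correlated.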
Computing the local linear correlation in the SOM



The SOM is a vector quantization algorithm that divides
the manifold on which v=(x,y) lies into S subspaces.
mi is the centre of subspace vi, i=1,…,S.
Both mi and the local linear correlation rxy,i
can be computed with the classical Kohonen learning rule.
The classical Kohonen learning rule.
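For reference, a sketch of the rule in its usual textbook form. The learning rate α(t), the neighbourhood function h_{ci}(t) and the winner index c are standard notation assumed here; the transcript does not give the slide's own notation.

```latex
c = \arg\min_{i} \lVert v(t) - m_i(t) \rVert, \qquad
m_i(t+1) = m_i(t) + \alpha(t)\, h_{ci}(t)\, \bigl[ v(t) - m_i(t) \bigr]
```

Here v(t) is the input vector presented at step t and m_i(t) is the current centre of unit i.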
Measuring local linear correlations
The local linear correlation rxy,i can be viewed as a
weighted measure of neighbouring correlations.
To identify the important variables quickly, the local linear
correlations are summarised in a single index.
The measuring index of local linear correlation: RMSLC
S is the number of clusters, and rxy,i is the local correlation
between x and y in the cluster vi, i=1,…,S
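The RMSLC formula itself is not preserved in the transcript. Assuming the name abbreviates a root mean square of the S local correlations, a minimal sketch could look as follows. The function name `rmslc` and the sample values are hypothetical, and the paper's definition may differ.

```python
import math

def rmslc(local_correlations):
    """Root mean square of the local correlations (assumed reading of RMSLC)."""
    s = len(local_correlations)
    return math.sqrt(sum(r * r for r in local_correlations) / s)

# Hypothetical local correlations: negative in two clusters, positive in two.
local_r = [-0.8, -0.8, 0.8, 0.8]
index = rmslc(local_r)                 # squaring keeps opposite signs from cancelling
mean_r = sum(local_r) / len(local_r)   # a plain average cancels to zero
```

Under this reading, a variable whose local correlations change sign across clusters still scores high, whereas a plain average would hide it.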
Application to electrical load forecasting
The input variables:
1. The maximum and minimum temperatures of the forecast day
and of the day before, at three recording points:
TMIN1, TMAX1, TMIN2, TMAX2, TMIN3, TMAX3 and
TMIN1(-1), TMAX1(-1), TMIN2(-1), TMAX2(-1), TMIN3(-1), TMAX3(-1)
2. The load of the forecast day, of 10 days before
and of 1 day before: Y, Y(-10), Y(-1)

The training of closed Kohonen ring


One network with 12 units was trained on a set
of 2 years of historical data (700 patterns).
After training we obtain:
1. the mean and the standard deviation
of the input vector in each cluster;
2. the covariance and the correlation coefficient
between the load Y and the other variables in
the input vector, according to Eqs. (4)-(7).
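The procedure above can be sketched on synthetic data. Eqs. (4)-(7) are not reproduced in the transcript, so the neighbourhood-weighted batch estimate below is an assumption standing in for them. The helper names (`ring_distance`, `best_unit`, `train_ring`, `local_correlations`), the decay schedules and the V-shaped temperature/load data are all hypothetical.

```python
import math
import random

def ring_distance(i, j, n_units):
    """Shortest distance between units i and j on a closed ring."""
    d = abs(i - j)
    return min(d, n_units - d)

def best_unit(units, v):
    """Index of the unit whose centre is closest to v."""
    return min(range(len(units)),
               key=lambda i: sum((a - b) ** 2 for a, b in zip(units[i], v)))

def train_ring(data, n_units=12, epochs=20, seed=0):
    """Train a closed Kohonen ring with decaying rate and neighbourhood width."""
    rng = random.Random(seed)
    units = [list(rng.choice(data)) for _ in range(n_units)]
    steps = epochs * len(data)
    step = 0
    for _ in range(epochs):
        for v in rng.sample(data, len(data)):
            frac = step / steps
            alpha = 0.5 * (1.0 - frac) + 0.01   # assumed learning-rate schedule
            sigma = 3.0 * (1.0 - frac) + 0.5    # assumed neighbourhood schedule
            c = best_unit(units, v)
            for i, m in enumerate(units):
                h = math.exp(-ring_distance(i, c, n_units) ** 2 / (2 * sigma ** 2))
                for k in range(len(m)):
                    m[k] += alpha * h * (v[k] - m[k])   # Kohonen update
            step += 1
    return units

def local_correlations(data, units, sigma=1.0):
    """Neighbourhood-weighted correlation between x and y at each unit."""
    n_units = len(units)
    winners = [best_unit(units, v) for v in data]
    result = []
    for i in range(n_units):
        w = [math.exp(-ring_distance(i, c, n_units) ** 2 / (2 * sigma ** 2))
             for c in winners]
        total = sum(w)
        mx = sum(wi * v[0] for wi, v in zip(w, data)) / total
        my = sum(wi * v[1] for wi, v in zip(w, data)) / total
        sxx = sum(wi * (v[0] - mx) ** 2 for wi, v in zip(w, data)) / total
        syy = sum(wi * (v[1] - my) ** 2 for wi, v in zip(w, data)) / total
        sxy = sum(wi * (v[0] - mx) * (v[1] - my) for wi, v in zip(w, data)) / total
        result.append(sxy / math.sqrt(sxx * syy))
    return result

# Synthetic stand-in for the 700 historical patterns: (temperature, load).
rng = random.Random(1)
data = []
for _ in range(700):
    t = rng.uniform(0.0, 30.0)
    data.append((t, 100.0 + 3.0 * abs(t - 15.0) + rng.gauss(0.0, 2.0)))

units = train_ring(data)
r_local = local_correlations(data, units)
```

Each entry of `r_local` is the correlation seen from one unit of the ring, so units on the cold side of the synthetic data should report a negative value and units on the warm side a positive one.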
Once training of the closed Kohonen ring has terminated,
the neurons correspond to the seasons:
Spring: neurons 3-4-5-6
Summer: neurons 1-2
Autumn: neurons 10-11-12
Winter: neurons 7-8-9
A clear negative correlation in winter (r=-0.8).
An insignificant positive correlation in summer (r=0.2).
Y(-1) is significantly positively correlated with today's load.
Y(-10) is not correlated with today's load.
TMIN3 was more important than TMAX3.
A clear negative correlation in winter (r=-0.6).
An insignificant negative correlation in summer (r=-0.2).
Y(-1) is significantly positively correlated with today's load.
Y(-10) is not correlated with today's load.
TMAX3 was more important than TMIN3.
Quantifying the importance of each variable
(for 1 p.m. data, networks of 1 and 12 units)
We knew that temperature is an important explanatory
variable, but this fact cannot be shown with the simple
linear correlation coefficient computed over a year of
observations.
Conclusion



The paper presents a method to compute the linear
correlation between variables in neighbouring
subspaces, based on the SOM.
Visualising the local linear correlation
computed at each unit of the SOM makes it possible to
understand the varying dependency between
variables.
Future work will focus on applying the
method to theoretical input distributions.
My opinion


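1. We could present a method to compute the multiple linear
correlation between variables in
neighbouring subspaces, based on the SOM.
2. If the data are first clustered with the SOM and the correlation
coefficient is then computed within each cluster, how would the
results differ from those in this paper?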