Analysis of the Gene Expression Data with 4ft-Miner
07.10.2005
Emilia Ylirinne
Tampere University of Technology
Finland
Outline
• GUHA method in brief
• 4ft-quantifiers and 4ft-Miner
• Data Mining process
• Results
• Conclusions
GUHA Method
• General Unary Hypotheses Automaton
• Introduced in the 1960s by Hájek
• Exploratory data analysis based on association rules:
  φ ≈ ψ
  Boolean attributes φ and ψ are associated in the sense of the 4ft-quantifier ≈.
GUHA Method
• Also conditional association rules:
  φ ≈ ψ / χ
• Four-fold table corresponding to φ ≈ ψ in data matrix M:

  M     ψ    ¬ψ
  φ     a    b
  ¬φ    c    d
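The four-fold table can be computed by counting rows: a is the number of rows of M satisfying both attributes, b those satisfying only the antecedent, c those satisfying only the succedent, and d those satisfying neither. The following is a minimal sketch of that counting, assuming the two attributes are given as equally long 0/1 lists; the function name `four_fold_table` and the toy data are hypothetical, not part of 4ft-Miner.

```python
from typing import Sequence, Tuple


def four_fold_table(phi: Sequence[int], psi: Sequence[int]) -> Tuple[int, int, int, int]:
    """Return the frequencies (a, b, c, d) for two 0/1 columns of equal length."""
    if len(phi) != len(psi):
        raise ValueError("columns must have the same length")
    a = b = c = d = 0
    for x, y in zip(phi, psi):
        if x and y:
            a += 1      # phi and psi both hold
        elif x:
            b += 1      # phi holds, psi does not
        elif y:
            c += 1      # psi holds, phi does not
        else:
            d += 1      # neither holds
    return a, b, c, d


# Toy columns (illustrative values only).
phi = [1, 1, 1, 0, 0, 0]
psi = [1, 1, 0, 1, 0, 0]
table = four_fold_table(phi, psi)
print(table)
```

The four counts always sum to the number of rows, which is a quick sanity check on any table reported by a mining run.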
Examples of 4ft-quantifiers
Founded implication (FUI) =>p,Base, where 0 < p ≤ 1 and Base > 0, is satisfied when:
a/(a+b) ≥ p and a ≥ Base

  M     ψ    ¬ψ
  φ     a    b
  ¬φ    c    d
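The founded implication condition translates directly into a predicate on the frequencies a and b. This is a small sketch under the definition given on the slide; the function name `founded_implication` and the example numbers are hypothetical.

```python
def founded_implication(a: int, b: int, p: float, base: int) -> bool:
    """Check a/(a+b) >= p and a >= Base for 0 < p <= 1 and Base > 0."""
    if not (0 < p <= 1) or base <= 0:
        raise ValueError("need 0 < p <= 1 and Base > 0")
    # a >= base > 0 guarantees a + b > 0, so the division is safe.
    return a >= base and a / (a + b) >= p


# Illustrative frequencies only.
print(founded_implication(a=12, b=1, p=0.9, base=10))  # 12/13 is above 0.9
print(founded_implication(a=12, b=3, p=0.9, base=10))  # 12/15 is below 0.9
print(founded_implication(a=4, b=0, p=0.9, base=10))   # too few supporting rows
```

Note that c and d play no role here: the quantifier only looks at the rows in which the antecedent holds.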
Examples of 4ft-quantifiers
Double founded implication (DFUI) <=>p,Base, where 0 < p ≤ 1 and Base > 0, is satisfied when:
a/(a+b+c) ≥ p and a ≥ Base

  M     ψ    ¬ψ
  φ     a    b
  ¬φ    c    d
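Compared with the founded implication, the double founded implication also counts the rows where only the succedent holds (c), so it is symmetric in the two attributes and stricter. The sketch below, with a hypothetical function name and made-up frequencies, shows a table on which a/(a+b) passes the threshold while a/(a+b+c) does not.

```python
def double_founded_implication(a: int, b: int, c: int, p: float, base: int) -> bool:
    """Check a/(a+b+c) >= p and a >= Base for 0 < p <= 1 and Base > 0."""
    if not (0 < p <= 1) or base <= 0:
        raise ValueError("need 0 < p <= 1 and Base > 0")
    # a >= base > 0 guarantees a + b + c > 0, so the division is safe.
    return a >= base and a / (a + b + c) >= p


# Illustrative table: the antecedent almost always implies the succedent,
# but the succedent often occurs without the antecedent (large c).
a, b, c = 12, 1, 10
one_way = a / (a + b)          # ratio used by the founded implication
two_way = a / (a + b + c)      # ratio used by the double founded implication
print(round(one_way, 3), round(two_way, 3))
print(double_founded_implication(a, b, c, p=0.9, base=10))
```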
4ft-Miner
• Part of the academic system LISp-Miner
• http://lispminer.vse.cz/
• Mines for both ordinary and conditional association rules
Data Mining
• A small dataset was used: a 74 x 822 gene expression matrix
• We tried to find potential synexpression groups in the data set
• Preprocessing was based on the work of Becquet et al. (2002)
• With the mid-range-based approach we obtained a matrix of Boolean values 0 and 1
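The slide does not spell out the discretization, so the following is only a sketch of one plausible reading of a mid-range cutoff: for each gene (column), the threshold is the midpoint between its lowest and highest expression value, and values strictly above it become 1. The function name `midrange_binarize`, the strict comparison, and the toy matrix are assumptions; the exact rule in Becquet et al. (2002) may differ.

```python
from typing import List


def midrange_binarize(matrix: List[List[float]]) -> List[List[int]]:
    """Binarize each column around the midpoint of its minimum and maximum."""
    n_cols = len(matrix[0])
    result = [[0] * n_cols for _ in matrix]
    for j in range(n_cols):
        column = [row[j] for row in matrix]
        cutoff = (min(column) + max(column)) / 2   # assumed mid-range threshold
        for i, value in enumerate(column):
            result[i][j] = 1 if value > cutoff else 0
    return result


# Toy matrix: 3 biological situations (rows) x 2 genes (columns).
expression = [
    [0.0, 10.0],
    [4.0, 30.0],
    [10.0, 20.0],
]
boolean_matrix = midrange_binarize(expression)
print(boolean_matrix)
```

Thresholding per column keeps genes with very different expression ranges comparable, which is why the cutoff is not a single global value in this sketch.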
Data Mining Tasks
                          Task 1      Task 2
4ft-quantifier            DFUI        DFUI
Input value for p         0.8         1.0
Type of rule              ordinary    conditional
Input value for Base      10          10
Length of antecedent      1           1…5
Length of succedent       1           1
Number of output rules    2           73
Time of solution          3 secs      19 secs
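A task of the Task 1 kind (one attribute on each side, DFUI quantifier) can be imitated by brute force: enumerate every pair of Boolean columns and keep those that satisfy the quantifier. The sketch below is not how 4ft-Miner is implemented, which the slides do not describe; the function name `mine_dfui_pairs`, the gene labels, and the data are hypothetical.

```python
from itertools import combinations
from typing import Dict, List, Tuple


def mine_dfui_pairs(columns: Dict[str, List[int]], p: float,
                    base: int) -> List[Tuple[str, str, float]]:
    """Return (name_x, name_y, ratio) for every column pair satisfying DFUI."""
    rules = []
    # DFUI is symmetric in its two attributes, so unordered pairs suffice.
    for (name_x, x), (name_y, y) in combinations(columns.items(), 2):
        a = sum(1 for u, v in zip(x, y) if u and v)
        b = sum(1 for u, v in zip(x, y) if u and not v)
        c = sum(1 for u, v in zip(x, y) if not u and v)
        # a >= base > 0 is tested first, so the division cannot hit zero.
        if a >= base and a / (a + b + c) >= p:
            rules.append((name_x, name_y, a / (a + b + c)))
    return rules


# Toy Boolean matrix stored column-wise (illustrative values only).
toy_columns = {
    "g1": [1, 1, 1, 1, 0, 0],
    "g2": [1, 1, 1, 0, 0, 0],
    "g3": [0, 0, 1, 1, 1, 1],
}
rules = mine_dfui_pairs(toy_columns, p=0.7, base=3)
print(rules)
```

With 822 columns this loop would examine about 337,000 pairs, which is consistent in spirit with a run time of a few seconds, although no timing claim is made for this sketch.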
Results
• Example of Task 1
  AAGACAGTGG <=>85%,11 AAGGAGATGG

          Suc    ¬Suc
  Ant     11     0
  ¬Ant    2      61
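As a check, reading the table entries as a = 11, b = 0, c = 2, d = 61 and assuming the rule was mined with the Task 1 settings p = 0.8 and Base = 10, the double founded implication condition works out as follows. The last line confirms that the four frequencies add up to the 74 rows of the data matrix.

```latex
\[
\frac{a}{a+b+c} = \frac{11}{11+0+2} = \frac{11}{13} \approx 0.846 \ge 0.8,
\qquad a = 11 \ge 10,
\]
\[
a + b + c + d = 11 + 0 + 2 + 61 = 74 .
\]
```

The subscript of the reported rule, 85% and 11, therefore appears to give the observed ratio and the observed frequency a rather than the input thresholds.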
Results
• Example of Task 2
  GGCAAGAAGA TCACAAGCAA TGTGCTAAAT TGTGTTGAGA
  <=>100%,10 GCTTTTAAGG / TACAAGAGGA

          Suc    ¬Suc
  Ant     10     0
  ¬Ant    0      3
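A conditional rule is evaluated on the rows that satisfy the condition only, which is why the table above sums to 13 rather than 74. The sketch below assumes that a multi-attribute antecedent is read as a conjunction; the function name `conditional_four_fold` and the toy columns are hypothetical. It shows a pair that is not an equivalence on the whole matrix but becomes one once the rows are restricted.

```python
from typing import List, Sequence, Tuple


def conditional_four_fold(antecedent: Sequence[Sequence[int]], succedent: Sequence[int],
                          condition: Sequence[int]) -> Tuple[int, int, int, int]:
    """Four-fold table (a, b, c, d) computed only on rows where condition is 1."""
    a = b = c = d = 0
    for i, keep in enumerate(condition):
        if not keep:
            continue                                   # row is outside the condition
        ant = all(col[i] for col in antecedent)        # conjunction of antecedent attributes
        suc = bool(succedent[i])
        if ant and suc:
            a += 1
        elif ant:
            b += 1
        elif suc:
            c += 1
        else:
            d += 1
    return a, b, c, d


# Toy data with 8 rows (illustrative values only).
x1 = [1, 1, 1, 0, 0, 1, 0, 1]
x2 = [1, 1, 1, 1, 0, 1, 1, 0]
suc = [1, 1, 1, 0, 0, 0, 1, 1]
cond = [1, 1, 1, 1, 1, 0, 0, 0]

whole: List[int] = [1] * len(suc)                      # trivial condition: keep every row
unconditional = conditional_four_fold([x1, x2], suc, whole)
conditional = conditional_four_fold([x1, x2], suc, cond)
print(unconditional, conditional)
```

On the full toy matrix the ratio a/(a+b+c) is 3/6, whereas under the condition it is 3/3, mirroring how a 100% rule can appear only in the conditional setting.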
Conclusions
• This study was very preliminary, but 4ft-Miner offers clear advantages
• LISp-Miner contains 15 quantifiers
• We found numerous results
• Even pure equivalences can be found with conditional association rules
• A biologist should be consulted about the significance of these results