Analysis of the Gene Expression Data with 4ft-Miner
07.10.2005
Emilia Ylirinne
Tampere University of Technology, Finland

Outline
• GUHA method in brief
• 4ft-quantifiers and 4ft-Miner
• Data Mining process
• Results
• Conclusions

GUHA Method
• General Unary Hypotheses Automaton
• Introduced in the 1960s by Hájek
• Exploratory data analysis based on association rules φ ≈ ψ: Boolean attributes φ and ψ are associated in the sense of the 4ft-quantifier ≈
• Also conditional association rules φ ≈ ψ / χ
• Four-fold table of φ and ψ in the data matrix M:

    M      ψ     ¬ψ
    φ      a     b
    ¬φ     c     d

Examples of 4ft-quantifiers
• Founded implication (FUI) =>p,Base, where 0 < p ≤ 1 and Base > 0, is satisfied when a/(a+b) ≥ p and a ≥ Base
• Double founded implication (DFUI) <=>p,Base, where 0 < p ≤ 1 and Base > 0, is satisfied when a/(a+b+c) ≥ p and a ≥ Base
• Here a, b, c and d are the frequencies of the four-fold table above

4ft-Miner
• A part of the academic system LISp-Miner
• http://lispminer.vse.cz/
• Mines for both association rules and conditional association rules

Data Mining
• A small dataset was used: a 74 x 822 gene expression matrix
• We tried to find potential synexpression groups in the data set
• Preprocessing was based on the work of Becquet et al. (2002)
• With the mid-range based approach we obtained a matrix with Boolean values 0 and 1

Data Mining Tasks

                              Task 1      Task 2
    4ft-quantifier            DFUI        DFUI
    Input value for p         0.8         1.0
    Type of rule              ordinary    conditional
    Input value for Base      10          10
    Length of antecedent      1           1…5
    Length of succedent       1           1
    Number of output rules    2           73
    Time of solution          3 secs      19 secs

Results
• Example of Task 1: AAGACAGTGG <=>85%,11 AAGGAGATGG

             Suc    ¬Suc
    Ant      11     0
    ¬Ant     2      61

  Here a/(a+b+c) = 11/13 ≈ 0.85 and a = 11.

• Example of Task 2: GGCAAGAAGA ∧ TCACAAGCAA ∧ TGTGCTAAAT ∧ TGTGTTGAGA <=>100%,10 GCTTTTAAGG / TACAAGAGGA

             Suc    ¬Suc
    Ant      10     0
    ¬Ant     0      3

  Here a/(a+b+c) = 10/10 = 1.0 and a = 10; the table counts only the rows that satisfy the condition TACAAGAGGA.

Conclusions
• This study was very preliminary, but 4ft-Miner offers some advantages
• LISp-Miner contains 15 quantifiers
• We found numerous results
• Even pure equivalences can be found with conditional association rules
• A biologist should be consulted about the significance of these results
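
The quantifier conditions and the reported tables can be checked with a few lines of Python. The following is only a minimal sketch, not the 4ft-Miner implementation: all function and class names are hypothetical, the Task 1 columns are synthetic vectors built to reproduce the reported counts (they are not the original expression data), and the mid-range cutoff is assumed to be the midpoint between a gene's minimum and maximum value.

```python
from typing import NamedTuple, Optional, Sequence


class FourFoldTable(NamedTuple):
    a: int  # rows satisfying antecedent and succedent
    b: int  # rows satisfying antecedent only
    c: int  # rows satisfying succedent only
    d: int  # rows satisfying neither


def mid_range_booleanize(values: Sequence[float]) -> list:
    """Assumed mid-range rule: True where a value exceeds (min + max) / 2."""
    cutoff = (min(values) + max(values)) / 2
    return [v > cutoff for v in values]


def four_fold_table(ant: Sequence, suc: Sequence,
                    cond: Optional[Sequence] = None) -> FourFoldTable:
    """Count a, b, c, d over the rows; if cond is given, skip rows where it is false."""
    if cond is None:
        cond = [True] * len(ant)
    a = b = c = d = 0
    for x, y, keep in zip(ant, suc, cond):
        if not keep:
            continue
        if x and y:
            a += 1
        elif x:
            b += 1
        elif y:
            c += 1
        else:
            d += 1
    return FourFoldTable(a, b, c, d)


def founded_implication(t: FourFoldTable, p: float, base: int) -> bool:
    """FUI: a/(a+b) >= p and a >= Base (written without division)."""
    return t.a >= base and t.a >= p * (t.a + t.b)


def double_founded_implication(t: FourFoldTable, p: float, base: int) -> bool:
    """DFUI: a/(a+b+c) >= p and a >= Base (written without division)."""
    return t.a >= base and t.a >= p * (t.a + t.b + t.c)


# Synthetic columns reproducing the Task 1 table: 11 + 0 + 2 + 61 = 74 rows.
ant = [1] * 11 + [0] * 2 + [0] * 61
suc = [1] * 11 + [1] * 2 + [0] * 61
task1 = four_fold_table(ant, suc)
print(task1, double_founded_implication(task1, p=0.8, base=10))

# Task 2 table as reported (rows restricted to the condition).
task2 = FourFoldTable(a=10, b=0, c=0, d=3)
print(task2, double_founded_implication(task2, p=1.0, base=10))
```

Writing the conditions as `a >= p * (a + b)` rather than as a quotient avoids a division by zero when no row satisfies the antecedent. Both reported rules pass the DFUI check with the input values of their tasks.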