Laboratory of Parasitic Diseases, NIAID
MINING FOR MEANING:
Data mining & knowledge extraction
Data interpretation
Experiment
Results
Knowledge
Conclusions
High-throughput screening
High publication rates
GENE EXPRESSION PROFILING
Identify relevant genes
Identify expression patterns
[Figure: fold change (log2) per gene in DC for pathogens Bm 5, Bm 50, Mtb, Ld, Lm, Tg. Gene groups: induced by intracellular pathogens; Leishmania & TB; Toxoplasma & TB; Toxoplasma]
FUNCTIONAL PROFILING
Identify functional implications
MINING FOR MEANING:
FUNCTIONAL PROFILING
Elutriated human monocytes, 7 donors
  M-CSF: Mac
  IL-4 + GM-CSF: DC
Infection with 5 pathogens (overnight), RNA pools, U95
Pathogens:
  Toxoplasma gondii
  Brugia malayi
  Mycobacterium tuberculosis
  Leishmania major
  Leishmania donovani
Comparisons:
  Intracellular vs Extracellular
  Protozoan vs Bacteria
  Leishmania major vs Leishmania donovani
Gene expression profiling

DC dataset, Mac dataset
Extraction
Filter: 12,000 genes to 1200 to 75
[Figure: DC, fold change (log2) per gene for pathogens Bm 5, Bm 50, Mtb, Ld, Lm, Tg. Gene groups: induced by intracellular pathogens; Leishmania & TB; Toxoplasma & TB; Toxoplasma]
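The expression plots are scaled as log2 fold change. As a minimal sketch, assuming each infected sample is compared against an uninfected control of the same cell type, the value and a simple induction filter could be computed as below. The control comparison, the cutoff of 1.0, the function names and all gene names and intensities are illustrative assumptions; the slides do not state the actual filter criteria.

```python
import math

def log2_fold_change(infected, control):
    """Return log2(infected / control) for two positive expression values."""
    return math.log2(infected / control)

def induced_genes(expression, control, threshold=1.0):
    """Return genes whose log2 fold change meets the threshold, with their values.

    expression, control: dicts mapping gene name -> signal intensity.
    threshold: hypothetical cutoff; 1.0 corresponds to a 2-fold induction.
    """
    result = {}
    for gene, value in expression.items():
        lfc = log2_fold_change(value, control[gene])
        if lfc >= threshold:
            result[gene] = lfc
    return result

# Made-up intensities for three placeholder genes (not data from the study).
control = {"geneA": 100.0, "geneB": 200.0, "geneC": 50.0}
infected = {"geneA": 800.0, "geneB": 210.0, "geneC": 25.0}

hits = induced_genes(infected, control)
print(hits)
```

On a log2 scale an 8-fold induction reads as 3 and a 2-fold repression as -1, which is why the plot axes extend below zero.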
MINING FOR MEANING:
FUNCTIONAL PROFILING
WITH GENE ONTOLOGIES - GO
http://www.geneontology.org/
http://apps1.niaid.nih.gov/david/
WITH LITERATURE ONTOLOGIES - MESH
http://www.nlm.nih.gov/mesh/MBrowser.html
http://array.ucsd.edu/hapi/
http://132.239.155.52/HAPI/TEST_377.HTML
WITH LITERATURE ABSTRACTS
12 million references
MINING FOR MEANING:
FUNCTIONAL PROFILING
WITH LITERATURE ABSTRACTS
- Co-citation Network
http://www.pubgene.com/
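A co-citation network links two genes when they are mentioned in the same abstract. The sketch below is only an illustration of that idea, not the method used by the tool at the URL above: it matches gene symbols as whole words and counts shared abstracts per gene pair. The function name and the three example abstracts are made up for the example.

```python
from collections import Counter
from itertools import combinations

def cocitation_edges(abstracts, gene_symbols):
    """Count, for each gene pair, the abstracts that mention both genes."""
    edges = Counter()
    for text in abstracts:
        # Crude tokenization: uppercase, drop commas and periods, split on spaces.
        tokens = set(text.upper().replace(",", " ").replace(".", " ").split())
        mentioned = sorted(g for g in gene_symbols if g.upper() in tokens)
        for pair in combinations(mentioned, 2):
            edges[pair] += 1
    return edges

# Invented abstracts, for illustration only.
abstracts = [
    "IL12B and CCL5 are induced in dendritic cells.",
    "CCL5 expression follows IL12B, and TNF is unchanged.",
    "TNF alone was measured.",
]
edges = cocitation_edges(abstracts, ["IL12B", "CCL5", "TNF"])
for pair, count in sorted(edges.items()):
    print(pair, count)
```

The edge weight is the number of shared abstracts, so frequently co-mentioned genes end up strongly connected. Real gene names need synonym handling that this sketch leaves out.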
- Natural Language Processing
- Literature Profiling
1. Gene - Literature indexation
Retrieve relevant literature (abstracts) for each gene: Gene A, Gene B, Gene C… Gene X

2. Analysis of abstract contents
Determine term occurrence in abstracts

Term occurrences in abstracts
           Gene A   Gene B   Gene C… Gene X
Term 1     0%       54%      2%
Term 2     12%      0%       6%
Term 3…    60%      1%       35%
Term y

3. Term filtering
Select relevant terms: Discrimination, Co-occurrence

Analyze functional relationships
[Figure: term occurrence in abstracts, scale 2.5% to 25%. Term groups: Chemokines; Death / Apoptosis; Interferon response; Dendrite elongation; MHC class I pathway]
- Identify functional relationships
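Steps 2 and 3 above can be sketched as follows, under stated assumptions. The slides do not define "Discrimination" or "Co-occurrence", so the code reads them as "the term is rare for at least one gene" and "the term is frequent for at least two genes". Those readings, the cutoffs, the function names and the toy abstracts are all hypothetical.

```python
def term_occurrence(abstracts_by_gene, terms):
    """Percent of each gene's abstracts that contain each term."""
    table = {}
    for term in terms:
        row = {}
        for gene, abstracts in abstracts_by_gene.items():
            hits = sum(1 for a in abstracts if term.lower() in a.lower())
            row[gene] = 100.0 * hits / len(abstracts) if abstracts else 0.0
        table[term] = row
    return table

def filter_terms(table, high=25.0, low=5.0, min_genes=2):
    """Keep terms that pass both assumed filters.

    Discrimination (assumed): at least one gene at or below `low` percent.
    Co-occurrence (assumed): at least `min_genes` genes at or above `high` percent.
    """
    kept = []
    for term, row in table.items():
        values = list(row.values())
        discriminates = min(values) <= low
        cooccurs = sum(v >= high for v in values) >= min_genes
        if discriminates and cooccurs:
            kept.append(term)
    return kept

# Toy abstracts standing in for the literature retrieved in step 1.
abstracts_by_gene = {
    "geneA": ["chemokine receptor signaling", "apoptosis and chemokine release",
              "unrelated growth study", "chemokine gradient"],
    "geneB": ["chemokine secretion", "membrane transport"],
    "geneC": ["apoptosis pathway", "apoptosis of infected host"],
}
table = term_occurrence(abstracts_by_gene, ["chemokine", "apoptosis", "transport"])
kept = filter_terms(table)
print(kept)
```

Terms that survive both filters are shared by a subset of genes without being ubiquitous, which makes them candidates for grouping genes by function.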
- Translate gene lists into keywords
- Interpret data
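Translating a gene list into keywords can be illustrated by ranking terms on their average occurrence across the genes in the list. This is a sketch of one possible scoring, not the procedure used in the presentation. It reuses the example percentages from the term occurrence table on the slide and assumes the third column belongs to Gene C. The function name and the choice of mean occurrence as the score are assumptions.

```python
def keywords_for_gene_list(table, gene_list, top_n=2):
    """Rank terms by mean percent occurrence across the genes in gene_list."""
    scores = {}
    for term, row in table.items():
        scores[term] = sum(row.get(g, 0.0) for g in gene_list) / len(gene_list)
    ranked = sorted(scores.items(), key=lambda kv: kv[1], reverse=True)
    return ranked[:top_n]

# Percentages copied from the slide's example table (term -> gene -> percent).
table = {
    "Term 1": {"Gene A": 0.0, "Gene B": 54.0, "Gene C": 2.0},
    "Term 2": {"Gene A": 12.0, "Gene B": 0.0, "Gene C": 6.0},
    "Term 3": {"Gene A": 60.0, "Gene B": 1.0, "Gene C": 35.0},
}

keywords = keywords_for_gene_list(table, ["Gene A", "Gene C"])
print(keywords)
```

For the list Gene A and Gene C, Term 3 ranks first because it is frequent in the abstracts of both genes, while Term 1 is mostly specific to Gene B and drops out.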
Experimental system: DC, pathogens Bm 5, Bm 50, Mtb, Ld, Lm, Tg
Gene List: induced by intracellular pathogens; Leishmania & TB; Toxoplasma & TB; Toxoplasma
12 million references
KNOWLEDGE
MINING FOR MEANING:
FUNCTIONAL PROFILING
LITERATURE MINING
MINING FOR MEANING:
DATA MINING
LITERATURE MINING
DOCUMENT CLUSTERING
NLP
LITERATURE PROFILING