Maximal Extraction of Biological Information from Genetic Interaction Data
Greg Carter
Galitski Lab
Institute for Systems Biology (Seattle)
Genetic Interaction
Pairwise perturbation: two genes combine to affect phenotype (Hereford & Hartwell 1974)
Measure a phenotype for 4 strains:
1. Wild-type reference genotype
2. Perturbation of gene A
3. Perturbation of gene B
4. Double perturbation of A and B
• Loss-of-function, gain-of-function, dominant-negative, etc.
• Interaction depends on the phenotype measured.
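A minimal sketch of how the four measurements could be turned into a phenotype inequality string. The function name, the tolerance parameter, and the numeric values are hypothetical and chosen only for illustration; the slides do not say how equality between phenotypes was decided.

```python
# Hypothetical helper: build an inequality such as "AB<A<WT=B" from the
# phenotype values of the four strains (WT, A, B, AB).
CANONICAL_ORDER = ["WT", "A", "B", "AB"]  # label order used inside a tie


def phenotype_inequality(values, tol=0.0):
    """values maps each of 'WT', 'A', 'B', 'AB' to a measured phenotype.

    Strains are sorted by value; a strain joins the current tie group when
    it is within `tol` of the previous strain in sorted order (an assumed,
    simplistic equality criterion).
    """
    items = sorted(values.items(), key=lambda kv: kv[1])
    groups = [[items[0]]]
    for label, val in items[1:]:
        if abs(val - groups[-1][-1][1]) <= tol:
            groups[-1].append((label, val))
        else:
            groups.append([(label, val)])
    parts = []
    for group in groups:
        labels = sorted((lab for lab, _ in group), key=CANONICAL_ORDER.index)
        parts.append("=".join(labels))
    return "<".join(parts)


# Illustrative values only, not measured data.
print(phenotype_inequality({"WT": 1.0, "A": 0.5, "B": 1.0, "AB": 0.2}))
# prints AB<A<WT=B
```

In practice the equality criterion would come from replicate measurements and their errors rather than a fixed tolerance.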
Genetic Interaction
Example: flo11 and sfl1 for yeast invasion.
[Figure: invasion assay, pre-wash and post-wash, for strains WT, flo11, sfl1, and flo11sfl1]
~2000 interactions measured (Drees et al, 2005)
Classification of Interactions
45 possible phenotype inequalities
Classified into 9 rules (Drees, et al. 2005)
WT=A=B=AB, A=B=WT<AB, AB<A<WT=B, WT=A=AB<B,
WT=A<B=AB, A<B<WT=AB, A<B<WT<AB, etc…
Yeast Invasion Network
Distribution of Rules
2000 interactions among 130 genes
Extracting Biological Statements
Statistical associations of a gene interacting with a function
PhenotypeGenetics plug-in for Cytoscape
www.cytoscape.org
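The slides do not state which statistical test produces these associations. As one common choice for this kind of question, a hypergeometric tail probability asks how surprising it is that k of a gene's n interaction partners (via one rule) carry a given functional annotation, when K of the N genes in the network carry it. The function name and all counts except N = 26 (the size of the MMS fitness network) are hypothetical.

```python
from math import comb


def hypergeom_tail(k, n, K, N):
    """P(X >= k) when drawing n genes without replacement from N genes,
    K of which carry the annotation. An assumed test, not necessarily
    the one used by the PhenotypeGenetics plug-in."""
    total = comb(N, n)
    upper = min(n, K)
    hits = sum(comb(K, x) * comb(N - K, n - x) for x in range(k, upper + 1))
    return hits / total


# Hypothetical example: 26 genes in the network, 5 annotated with a
# function, and a gene whose 6 partners via one rule include 4 of them.
p = hypergeom_tail(k=4, n=6, K=5, N=26)
print(f"P = {p:.4g}")
```

A small P suggests the gene interacts with that function via that rule more often than chance would predict; multiple-testing correction would still be needed across genes, rules, and functions.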
Classification Problem
Can the 45 interactions be classified in a more informative way?
How many rules?
Distribution of interactions?
WT=A=B=AB, A=B<WT<AB, AB<A<WT=B, WT=A=AB<B,
WT=A<B=AB, A<B<WT=AB, A<B<WT<AB, etc…
Context-dependent Complexity
Requirements for a complexity metric Y:
1. Adding a gene with random interactions adds no information
2. Duplicating a gene adds no information
3. Should depend on
   (i) the information content of each gene’s interactions, and
   (ii) the information content of all gene-gene relationships.
General requirements for biological information (see poster).
Context-dependent Complexity
$$Y = \sum_{\text{pairs } ij} K_i \, m_{ij} \, (1 - m_{ij})$$
$K_i$ is the information of node $i$,
$m_{ij}$ is the mutual information between $i$ and $j$,
$0 \le m_{ij} \le 1$ and $0 \le Y \le 1$
Applied to (see poster):
• Sets of bit strings (sequences)
• Network architecture
• Dynamic Boolean networks
• Genetic interaction networks…
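A minimal sketch of the sum in the definition of Y, to show why the factor mij (1 – mij) fits the first two requirements: a pair contributes nothing when the two genes are unrelated (mij = 0) or identical (mij = 1), and contributes most at mij = 1/2. The slides do not show how Ki and mij are computed from a network, how the sum is normalized to keep Y between 0 and 1, or whether pairs are ordered; the function below takes Ki and mij as given inputs, sums over ordered pairs i ≠ j, and applies no normalization. All of that is assumed for illustration.

```python
def complexity_sum(K, m):
    """Unnormalized sum of K[i] * m[i][j] * (1 - m[i][j]) over ordered
    pairs i != j. K and m are assumed to be precomputed inputs."""
    n = len(K)
    total = 0.0
    for i in range(n):
        for j in range(n):
            if i != j:
                total += K[i] * m[i][j] * (1.0 - m[i][j])
    return total


K = [1.0, 1.0]  # hypothetical node information values

unrelated = [[0.0, 0.0], [0.0, 0.0]]  # m = 0: no shared information
duplicate = [[0.0, 1.0], [1.0, 0.0]]  # m = 1: one gene copies the other
partial = [[0.0, 0.5], [0.5, 0.0]]    # m = 1/2: partially related

print(complexity_sum(K, unrelated))  # 0.0
print(complexity_sum(K, duplicate))  # 0.0
print(complexity_sum(K, partial))    # 0.5
```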
Genetic Interaction Networks
• Invasion network of Drees, et al. Genome Biology 2005: 130 genes, 2000 interactions
• MMS fitness network of St Onge, et al. Nature Genetics 2007: 26 genes, 325 interactions
Determined networks of maximum complexity Y.
Network                  Invasion Data               MMS Fitness Data
Classification Scheme    Y      biological           Y      biological
                                statements                  statements
Drees, et al.            0.57   52                   0.27   28
Segré, et al.            0.52   47                   0.32   19
St Onge, et al.          -      -                    0.16   10
Maximum Y                0.79   72                   0.62   32
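As a small illustration of how Y and the number of biological statements can be compared, the sketch below computes a Pearson correlation over the four MMS fitness entries listed above. Four points are far too few to support a conclusion, and this is not the correlation reported in the talk over all candidate networks; the helper name is hypothetical.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Plain Pearson correlation coefficient of two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)


# MMS fitness data: Drees, Segré, St Onge, Maximum Y (values as listed).
mms_Y = [0.27, 0.32, 0.16, 0.62]
mms_statements = [28, 19, 10, 32]

print(round(pearson_r(mms_Y, mms_statements), 2))
```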
Complexity and Biological Information
Number of biological statements is correlated with Y
115k possible MMS fitness networks, r = 0.80
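The slides do not say where the 115k figure comes from. One reading that matches the number, offered here as an assumption: if each candidate network is a grouping of the observed inequalities into rules, then the candidates are the set partitions of those inequalities, and ten inequalities (the count listed for the maximally complex MMS fitness network) give a Bell number of 115,975. The sketch below counts and enumerates set partitions; the function names are hypothetical.

```python
def bell_number(n):
    """Number of ways to partition n labeled items into non-empty groups,
    computed with the Bell triangle."""
    row = [1]
    for _ in range(n):
        new_row = [row[-1]]
        for value in row:
            new_row.append(new_row[-1] + value)
        row = new_row
    return row[0]


def set_partitions(items):
    """Yield every partition of `items` as a list of groups (lists)."""
    if not items:
        yield []
        return
    first, rest = items[0], items[1:]
    for partition in set_partitions(rest):
        # Place `first` into each existing group in turn...
        for i in range(len(partition)):
            yield partition[:i] + [[first] + partition[i]] + partition[i + 1:]
        # ...or give it a group of its own.
        yield [[first]] + partition


print(bell_number(10))  # 115975
print(sum(1 for _ in set_partitions(list(range(4)))))  # 15
```

Under this reading, an exhaustive search would score every partition by Y and keep the maximum, which is feasible at this size.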
Genetic Interaction Networks
Maximally complex MMS fitness network

Rule   Frequency   Inequalities                 Classical Interpretation (Drees et al. 2005)
1      120         P_AB = P_A < P_B < P_WT      epistatic
2      55          P_AB < P_A = P_B < P_WT      additive
3      92          P_AB < P_A < P_B < P_WT      additive
4, 5   30, 26      P_AB = P_A = P_B < P_WT      asynthetic
                   P_AB = P_A < P_B = P_WT      non-interactive
                   P_AB < P_A = P_B = P_WT      synthetic
                   P_A < P_AB = P_B < P_WT      epistatic
                   P_AB = P_A = P_B = P_WT      non-interactive
                   P_AB < P_A < P_B = P_WT      conditional
                   P_A < P_AB < P_B < P_WT      single-nonmonotonic

(The transcript does not preserve which of the last seven inequalities belong to Rule 4 and which to Rule 5; the frequencies 30 and 26 are listed in the order extracted.)
Genetic Interaction Networks
Biological statements from the maximally complex MMS fitness network

gene   interacts via   with genes               P
PSY3   Rule 1          meiotic recombination    0.0011
SGS1   Rule 5          error-free DNA repair    0.00014
SWC5   Rule 2          error-free DNA repair    0.00056
CSM2   Rule 4          error-free DNA repair    0.0026
SHU2   Rule 4          error-free DNA repair    0.0030
SHU1   Rule 4          error-free DNA repair    0.0065

St Onge, et al. Figure 5d
Conclusion and Future Work
For a given data set, maximizing Y facilitates unsupervised, maximal
information extraction by balancing over-generalized and over-specific
classification schemes.
Network-based methods are needed to interpret the maximally complex
interaction rules. Interpretations will depend on the system and are
specific to the phenotype measured and the perturbations performed.
See poster for more details
Thanks to
Becky Drees
Alex Rives
Marisa Raymond
Iliana Avila-Campillo
Paul Shannon
James Taylor
Susanne Prinz
Vesteinn Thorsson
Tim Galitski
Matti Nykter
Nathan Price
Ilya Shmulevich
David Galas