
Supplementary Figures
Supplementary Figure 1. Validation of the association score reliability in the DPL matrix.
(A) Distribution of Z-values by subcellular localization-disease association scores.
(B) Examples of subcellular localization profiles in the DPL matrix of two diseases. Each
number is the subcellular localization-disease association score, and the color ranges
from white to dark green according to the enrichment of the score.
Supplementary Figure 2. (A) Hierarchical clustering of subcellular localizations and 487
diseases that have more than two disease-associated proteins.
(B) Number of observed diseases enriched in a particular subcellular localization (red arrow)
compared with the distribution of the expected number of diseases (black). Disease sets with
more than two disease-associated proteins were used.
Supplementary Figure 3. (A) Hierarchical clustering of subcellular localizations and 882
diseases.
(B) The Pearson correlation coefficient (PCC) was calculated for each pair of subcellular
localization profiles of the same disease, one derived from OMIM data and one from
disease-associated protein complex data, and compared with that of random selections.
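
As an illustration of the PCC comparison described in this caption, the following is a minimal sketch of a Pearson correlation between two subcellular localization profiles. The profile vectors and the function name `pearson` are hypothetical; the source does not specify how profiles were encoded or which software was used.

```python
import math

def pearson(x, y):
    """Pearson correlation coefficient between two equal-length profiles."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("profiles must have the same length (at least 2)")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    dx = [a - mean_x for a in x]
    dy = [b - mean_y for b in y]
    numerator = sum(a * b for a, b in zip(dx, dy))
    denominator = math.sqrt(sum(a * a for a in dx) * sum(b * b for b in dy))
    if denominator == 0:
        raise ValueError("PCC is undefined for a constant profile")
    return numerator / denominator

# Hypothetical association scores of one disease over four localizations,
# derived from two different data sources (illustrative values only).
profile_source_1 = [0.8, 0.1, 0.1, 0.0]
profile_source_2 = [0.7, 0.2, 0.1, 0.0]
pcc = pearson(profile_source_1, profile_source_2)
```

A PCC close to 1 would indicate that the two data sources give similar localization profiles for the disease.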
Supplementary Figure 4. Examples of disease classes enriched in specific subcellular
localizations.
(A) In connective tissue diseases, disease-associated proteins are significantly
enriched in the extracellular region.
(B) Developmental diseases are a disease class whose associated proteins are not
significantly enriched in any subcellular localization.
Supplementary Figure 5. Diagram depicting the informatics workflow.
Supplementary Figure 6. Relative Risk and φ-correlation increase along with the subcellular
localization similarity (PCC). Disease pairs that have more than two associated
proteins are considered.
(A) Relative Risk for disease pairs as subcellular localization similarities increase.
(B) Average φ-correlations, a complementary measure of comorbidity tendency, also increase
as subcellular localization similarities increase.
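
For readers unfamiliar with the two comorbidity measures named in this caption, the sketch below computes Relative Risk and φ-correlation for one disease pair. It assumes the commonly used count-based definitions, RR = C_ij N / (P_i P_j) and φ = (C_ij N - P_i P_j) / sqrt(P_i P_j (N - P_i)(N - P_j)); the source does not state its formulas, and all names and numbers here are hypothetical.

```python
import math

def relative_risk(c_ij, p_i, p_j, n):
    """Observed co-occurrence divided by the count expected under independence.

    c_ij: patients diagnosed with both diseases
    p_i, p_j: patients diagnosed with disease i and disease j, respectively
    n: total number of patients
    """
    return (c_ij * n) / (p_i * p_j)

def phi_correlation(c_ij, p_i, p_j, n):
    """Pearson correlation of the two binary diagnosis indicators."""
    numerator = c_ij * n - p_i * p_j
    denominator = math.sqrt(p_i * p_j * (n - p_i) * (n - p_j))
    return numerator / denominator

# Illustrative counts only: 1000 patients, 100 with disease i,
# 50 with disease j, 10 with both.
rr = relative_risk(10, 100, 50, 1000)     # 2.0
phi = phi_correlation(10, 100, 50, 1000)  # positive, since rr > 1
```

Under these definitions, RR greater than 1 and φ greater than 0 both indicate that the two diseases co-occur more often than expected by chance.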
Supplementary Figure 7. Average comorbidity tendencies measured from disease pairs
connected by subcellular localization and co-expression.
Supplementary Figure 8. Examples of disease modules representing clusters of interacting
proteins connected by subcellular localization. Disease modules of (A) Cerebral
degenerations in childhood, (B) Encephalitis, and (C) Glycoprotein Ia deficiency are shown.
Colored nodes represent the disease-associated proteins in each disease module. Gray nodes
are connected by subcellular localization and linked by protein-protein interactions with
disease-associated proteins.
Supplementary Figure 9. Validation of the relationship between subcellular localization and
comorbidity tendency based on different sets of mitochondrial proteins.
(A) Subcellular localization annotations of MitoCarta proteins. Disease-associated
mitochondrial proteins and their subcellular localization information are shown.
(B) Relative Risk for disease pairs as subcellular localization similarities increase.
Blue, yellow, and green indicate the sets of subcellular localization information from
Swiss-Prot, ConLoc, and MitoCarta, respectively.
Supplementary Figure 10. Subcellular localization profiles of disease subtypes.
(A) Fraction of disease pairs at the given subcellular localization PCC. A combined disease
is a set of disease subtypes merged into a single disease. The subcellular localization PCC
was calculated between each combined disease and its subtypes. Diseases that have more than
five disease subtypes are considered. The random control set was constructed by randomly
assigning subcellular localizations to the diseases (P = 1.94 × 10^-64; Mann-Whitney test).
(B) Examples of the subcellular localization profiles of combined diseases and their subtypes.
Single-line boxes indicate the subcellular localization profiles of combined diseases, and
multi-line boxes show the subcellular localization profiles of each disease subtype.
Supplementary Figure 11. Subcellular localization enrichment of the diseases from the Genetic
Association Database (GAD).
(A) Hierarchical clustering of the subcellular localization enrichment of 427 diseases in GAD.
(B) Examples of complex diseases in GAD enriched in specific subcellular localizations.
Supplementary Figure 12. Comorbidity tendency of disease pairs based on the UMLS
mapping.
(A) Average comorbidity tendencies (RR) for disease pairs with increasing
subcellular localization similarities. Note that no subcellular localization similarity was
detected in the range from 0.6 to 0.8. (B) Average comorbidity tendencies of disease pairs sharing
genes or co-expression, linked by PPIs, and connected by subcellular localization. (C) The
numbers of disease pairs that share genes or co-expression, linked by PPIs, and connected via
subcellular localization.
(D) Average comorbidity tendencies between disease pairs
connected via subcellular localization and the link distances.