Supplementary Figures

Supplementary Figure 1. Validation of the association score reliability in the DPL matrix. (A) Distribution of Z-values by subcellular localization-disease association scores. (B) Examples of subcellular localization profiles in the DPL matrix for two diseases. The numbers indicate the subcellular localization-disease association scores, and the color scale runs from white to dark green with increasing enrichment of the score.

Supplementary Figure 2. (A) Hierarchical clustering of subcellular localizations and 487 diseases that have more than two disease-associated proteins. (B) Number of observed diseases enriched in a particular subcellular localization (red arrow) compared with the distribution of the expected number of diseases (black). Disease sets with more than two disease-associated proteins were used.

Supplementary Figure 3. (A) Hierarchical clustering of subcellular localizations and 882 diseases. (B) The Pearson correlation coefficient (PCC) for each pair of subcellular localization profiles was calculated for the same diseases from OMIM data and disease-associated protein complex data, and compared with that of random selections.

Supplementary Figure 4. Examples of disease classes enriched in specific subcellular localizations. (A) In connective tissue diseases, disease-associated proteins are significantly enriched in the extracellular region. (B) Developmental diseases are a disease class whose associated proteins are not significantly enriched in any subcellular localization.

Supplementary Figure 5. Diagram depicting the informatics workflow.

Supplementary Figure 6. Relative Risk and φ-correlation increase along with the subcellular localization similarity (PCC). Disease pairs that have more than two associated proteins are considered. (A) Relative Risk for disease pairs as subcellular localization similarities increase. (B) Average φ-correlations, a complementary measure of comorbidity tendency, also increase as subcellular localization similarities increase.
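The captions name three quantities without defining them: Relative Risk, the φ-correlation (the second comorbidity measure in Supplementary Figure 6), and the PCC between subcellular localization profiles. The following is a minimal sketch of how such quantities are commonly computed, assuming the widely used comorbidity definitions RR = C·N / (P_i·P_j) and φ = (C·N − P_i·P_j) / sqrt(P_i·P_j·(N − P_i)·(N − P_j)). The source does not spell out its formulas, so these definitions, the function names, and the argument names are all illustrative assumptions, not the authors' implementation.

```python
import math


def relative_risk(c_ij, p_i, p_j, n):
    """Relative Risk of co-occurrence (assumed definition).

    c_ij: patients diagnosed with both diseases
    p_i, p_j: patients diagnosed with disease i and disease j
    n: total number of patients
    """
    return c_ij * n / (p_i * p_j)


def phi_correlation(c_ij, p_i, p_j, n):
    """Phi-correlation of two binary disease indicators (assumed definition)."""
    numerator = c_ij * n - p_i * p_j
    denominator = math.sqrt(p_i * p_j * (n - p_i) * (n - p_j))
    return numerator / denominator


def pearson(x, y):
    """Pearson correlation coefficient between two equal-length profiles."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("profiles must have the same length (>= 2)")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("PCC is undefined for a constant profile")
    return cov / math.sqrt(var_x * var_y)


if __name__ == "__main__":
    # Hypothetical counts, chosen only to exercise the functions.
    print(relative_risk(20, 100, 200, 10_000))
    print(phi_correlation(20, 100, 200, 10_000))
    # Hypothetical localization profiles (association scores per compartment).
    print(pearson([0.1, 2.3, 0.0, 1.1], [0.2, 1.9, 0.1, 0.8]))
```

Under these definitions, two diseases that co-occur exactly as often as expected by chance give RR = 1 and φ = 0; values above these indicate a comorbidity tendency.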
Supplementary Figure 7. Average comorbidity tendencies measured from disease pairs connected by subcellular localization and co-expression.

Supplementary Figure 8. Examples of disease modules representing clusters of interacting proteins connected by subcellular localization. Disease modules of (A) Cerebral degenerations in childhood, (B) Encephalitis, and (C) Glycoprotein Ia deficiency are shown. Colored nodes represent the disease-associated proteins in each disease module. Gray nodes are connected by subcellular localization and linked by protein-protein interactions with disease-associated proteins.

Supplementary Figure 9. Validation of the relationship between subcellular localization and comorbidity tendency based on different sets of mitochondrial proteins. (A) Subcellular localization annotations of MitoCarta proteins. Disease-associated mitochondrial proteins and their subcellular localization information are shown. (B) Relative Risk for disease pairs as subcellular localization similarities increase. Blue, yellow, and green indicate the sets of subcellular localization information from Swiss-Prot, ConLoc, and MitoCarta, respectively.

Supplementary Figure 10. Subcellular localization profiles of disease subtypes. (A) Fraction of disease pairs at the given subcellular localization PCC. A combined disease denotes disease subtypes merged into a single disease. The subcellular localization PCC was calculated between each combined disease and its subtypes. Diseases that have more than five disease subtypes are considered. The random control set was constructed by randomly assigning subcellular localizations to the diseases (P = 1.94 × 10⁻⁶⁴; Mann-Whitney test). (B) Examples of the subcellular localization profiles of combined diseases and their subtypes. Single-line boxes indicate the subcellular localization profiles of combined diseases, and multi-line boxes show the subcellular localization profiles of each disease subtype.

Supplementary Figure 11. Subcellular localization enrichment of the diseases from the Gene Association Database (GAD). (A) Hierarchical clustering of the subcellular localization enrichment of 427 diseases in GAD. (B) Examples of complex diseases in GAD enriched in specific subcellular localizations.

Supplementary Figure 12. Comorbidity tendency of disease pairs based on the UMLS mapping. (A) Average comorbidity tendencies (RR) for disease pairs with increasing subcellular localization similarities. Note that no disease pairs were detected with a subcellular localization similarity in the range from 0.6 to 0.8. (B) Average comorbidity tendencies of disease pairs sharing genes or co-expression, linked by PPIs, and connected by subcellular localization. (C) The numbers of disease pairs that share genes or co-expression, are linked by PPIs, or are connected via subcellular localization. (D) Average comorbidity tendencies between disease pairs connected via subcellular localization and the link distances.