Bioinformatics Core (B)
Progress and Future Goals
www.functionalglycomics.org

Advancing Glycomics: Key Issues & Challenges

Non-template biosynthesis
Ensemble or group of glycan structures resulting from coordinated expression of several biosynthetic enzymes
Challenges
• Chemical heterogeneity
• Challenges in isolation and analysis
• Challenges in representation and processing of glycan information

Glycan-Protein Interaction; Glycan Structure-Function Relationships
Multivalent interactions involving multiple glycan motifs with multivalent CRDs on proteins
Challenges
• Understanding the biochemical basis for glycan-protein interactions
• How glycan-specific genotype governs whole-organism phenotype
• Constructing biological pathways and interaction networks

Consortium for Functional Glycomics
• Active collaborative effort to advance glycomics
• Goal: Understand the role of glycan-protein interactions in cell-cell communication
• Develop technologies and resources to obtain data at various levels, from molecule to mouse
• Integrated approach to glycan structure-function

Bioinformatics Core (B): Vision & Approach

Vision: Present the dynamic “face” of the Consortium via Internet to best utilize the value of the resources and datasets generated by its participants and the broader scientific community.

Approach
• Static pages (SHTML) for the web site
• Dynamic pages (JSPs) for database-driven entry, dissemination and queries
• Middleware implemented using entity-class operations and Servlets in Java
• Object-based relational database implemented using Oracle

[Figure: Part of CFG database overall ontology map (entity-relationship diagram). Entities shown: CARBORDER, CPATHEXP, CELLSOURCE, BIOLOGICAL_SOURCE, ENZYMESEQUENCE, ENZYME, GENE, CARBINVENTORY, DNASEQ, PROTEINSEQUENCE, CARBPATHWAY, PROTEIN, BIOFUNCTION_CATEGORY, CARBOHYDRATE, CARBEXP, CPBINDING, CARBEXPIMG, PROTIENREF, CARBPATHREF, CPBEXPIMG, CPROFILEEXPIMG, GLYCOARRAY, GAEXP, TARGETPROTEIN, PAPER, ENZEXPIMG, CBPEXPIMG, CPATHREAGENT, REAGENT, CPROFILEID, CPATHINSTRUMENT, CPROFILEEXP, MOUSESTRAIN, MOUSEHISTOLOGY, MAREF, MICROARRAY, INSTRUMENT, MAEXP, MAEXPANALYSIS, DBUSER, TARGETGENE, KOGENE, MOUSEHEMA, CBPEXP, PROTOCOL, CARBPROFILE, ENZEEXP, CPBREF, CPATHEXPIMG, PROTOCOLREF, CPBEXP, PAPER_REFERENCE, MOUSEIND, MOUSEIMMU, MOUSEBEHAV, MOUSEMETABOLISM. Each entity lists typed columns (NUMBER, INTEGER, VARCHAR2, CHAR, DATE, BLOB); for example, CPBINDING carries Cpb_key, Cpb_ID, Cpb_IDtype, Carb_key, Protein_key, Cpb_flag, Biofunction_key, Cpb_BiofunctionDesc, Cpb_Keq, Cpb_DG, Cpb_DH, Cpb_DS and Cpb_listdate.]

Core B Organization

Administration
• Ram Sasisekharan (Core Coordinator), Rahul Raman (Core Director)
• Ada Ziolkowski (Admin Staff)
MIT Information Systems
• Database Administration
• Server Maintenance
Core Operations & Management
• Bioinformatics: Scientific liaisons, user specifications, bioinformatics applications
• Information Technology: Database, web, software applications, user interface development
Core B Team
Maha Venkataraman, Subu Ramakrishnan, Savitri Subramanian, Thomas Lutteke, Ganesh Venkataraman, Wei Lang, Eric Berry, Nishla Keiser, Ishan Capila, Chipong Kwan and Consultants

Implementation Strategy: Databases and Interfaces

[Diagram labels]
• Consortium DB: Data objects, relationships, raw and parsable data
• User Interface: Data Acquisition
• Public Databases: GBP, Glycans, Glycoenzymes
• Consortium Data Dissemination: User interface for information dissemination
• GBP Molecule Pages
• GlycoEnzyme Molecule Pages
• Web-based Forms
• Annotation Tools
• Glycan Structures Database

Implementation Strategy: Schema of CFG Data Objects and Integration
Glycobiology, 16(5), 82R-90R

Highlights of Progress: Years 1-4

Years 1-2 and Years 3-4
• Infrastructure set-up
• Acquisition of Core data
• Database implementation
• Public release of databases
• Software development
• CFG data dissemination
• Interactions with Cores for data acquisition and dissemination
• GBP molecule pages
• Glycan structures DB

Highlights of Progress: Quarters 1-3 of Year 5

CFG data acquisition and dissemination
• Streamlining the process to upload printed array data from Cores D
and H
• Enhancing interfaces for disseminating scientific Core data
• Implementation of data dissemination and research tracking interfaces for PIs

Specialized databases and molecule pages
• Data organization and implementation of the glycosyltransferase database
• Updating expert contribution fields for C-type lectins in the GBP database

Other key highlights
• Bioinformatics satellite session at the Society for Glycobiology 2005 meeting: discussion of glycan analysis methods and data exchange formats
• Renewal application and preparation for Council review

Publications
• “Glycomics: An integrated systems approach to structure-function relationships of glycans.” Nature Methods, 2(11): 817-24
• “Advancing glycomics: implementation strategies at the consortium for functional glycomics.” Glycobiology, 16(5): 82R-90R

CFG Scientific Core Data: Data Organization

• Experiment Information: Brief description of the scope of the experiment or analysis
• Protocols: Standardized protocols used in an experiment or analysis
• Sample/Resources: Detailed information on the sample or resource utilized [mouse, tissue, cell, GBP]
• Data Summary: Summary of the interpretation of data by the Scientific Core
• Raw Data: Raw data files generated by the analysis (images, Excel, binary formats)
• Annotated Data: Processed information on entities such as gene, mouse, GBP, etc. stored in DB tables

CFG Scientific Core Data: Statistics (as of June 2006)

Gene Microarray (E)
• 47 Experiments (33)
• 617 Samples (387)
• 3546 data files (1826)
Glycotechnology (C)
• Mouse: 11 tissues, 8 KO strains, 92 MALDI-MS spectra
• Human: 11 tissues, 108 MALDI-MS spectra
• Cell lines: 12 cell lines, 27 spectra
Mouse Phenotyping (G)
• 16 KO Strains (11)
• 266 Experiments (149)
• 3116 data files (1100)
GBP-Glycan (H)
• 247 Samples (143)

CFG Scientific Core Data: Enhanced dissemination interfaces

Navigation and downloading of CFG data: Gene Expression, MALDI-MS Glycan Profiling, Mouse Phenotyping and Glycan Array (DEMO)

Gene Microarray Data

Mouse Phenotyping Data

Mouse Phenotyping Protocols

Glycan Array Data

CFG Data Integration
Expression of glycan-related genes in wildtype mouse spleen

CFG Data Integration
Glycan profile of spleen of FucT-VII KO mice with differences in histology staining

Tracking PI Research Progress: Data dissemination and resource tracking interface

[Workflow diagram labels]
• PI: Submit Online Request Form
• Copy of submitted request emailed with Request ID in database
• Core A, SC: Request Approval Process (Prioritizing/Approved)
• Post Status
• Core: Upload Data with Request ID
• Database: Request ID, Data IDs

Tracking PI Research Progress: Data dissemination and resource tracking interface
Navigating the PI information page to find associated resource requests and data (DEMO)

GBP Molecule Page Interfaces: Three main components

Molecule Information Portal
• Automated Acquisition: Data from public databases, links to public resources
• Data from Cores: Interface to CFG resources and data
  • Glycan array data
  • Mouse phenotyping data
  • Transgenic mouse line
• Expert Contribution: Filling out fields as experts on the molecule; contribution from experts obtained for C-type lectins

GBP Molecule Pages: Updating expert contributions for C-type lectins
GBP Molecule Page example for C-type lectins with filled expert contribution fields (DEMO)

GBP Molecule Pages: DC-SIGN Molecule Page

DC-SIGN Molecule Page: Expert Contributions
PDB IDs of DC-SIGN complexed with glycan ligands identified using the PDB2LINUCS tool

GT Database Development: Expert annotation of glycosylation pathways

[Diagram labels: Type II; Extension/Termination; Composite Structure; FucTs expert annotation]

GT family     Genes
Sialyl-T      40
Fucosyl-T     24
GlcNAc-T      69
Gal-T         47
GalNAc-T      54
Sulfo-T       78
Man-T         31
Total         343

GT Database Development: Glycosylation interface & GT molecule pages

Molecule Information Portal
• Automated Acquisition: Data from public databases, links to public resources
• Data from Cores: Interface to CFG resources and data
  • Glycan profiling of GT KO mice
  • Mouse phenotyping data
  • Transgenic mouse line
  • Gene expression (Year 6 goals)

Glycan Structures Database

Glycan Structure Search Interface

Glycan DB Search Results
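
Illustrative sketch: relational linkage in the ontology map

The ontology map on the Vision & Approach slide links glycans and proteins through a binding entity (CPBINDING carries both Carb_key and Protein_key). The sketch below shows roughly how such a linkage could be queried. It is only an illustration under stated assumptions: it uses Python's built-in sqlite3 module instead of the Oracle database named in the slides, it keeps only a few of the columns shown in the diagram, the PRIMARY KEY and REFERENCES constraints are assumed (the diagram text does not state them), and every row and the helper names build_demo_db and glycans_bound_by are hypothetical placeholders, not CFG data or CFG code.

```python
import sqlite3

# Column names and type names are taken from the ontology-map diagram.
# The PRIMARY KEY / REFERENCES constraints are assumptions; the diagram
# text does not say which columns are keys.
SCHEMA = """
CREATE TABLE CARBOHYDRATE (
    Carb_key        INTEGER PRIMARY KEY,
    Carb_ID         VARCHAR2(20),
    Carb_commonName VARCHAR2(20),
    Carb_MW         NUMBER
);
CREATE TABLE PROTEIN (
    Protein_key  INTEGER PRIMARY KEY,
    Protein_ID   VARCHAR2(20),
    Protein_name VARCHAR2(20)
);
CREATE TABLE CPBINDING (
    Cpb_key     INTEGER PRIMARY KEY,
    Cpb_ID      VARCHAR2(20),
    Carb_key    NUMBER REFERENCES CARBOHYDRATE(Carb_key),
    Protein_key NUMBER REFERENCES PROTEIN(Protein_key),
    Cpb_Keq     NUMBER
);
"""


def build_demo_db():
    """Create an in-memory database with hypothetical placeholder rows."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(SCHEMA)
    conn.executemany(
        "INSERT INTO CARBOHYDRATE VALUES (?, ?, ?, ?)",
        [
            (1, "CARB-DEMO-1", "demo glycan A", 100.0),
            (2, "CARB-DEMO-2", "demo glycan B", 200.0),
            (3, "CARB-DEMO-3", "demo glycan C", 300.0),
        ],
    )
    conn.executemany(
        "INSERT INTO PROTEIN VALUES (?, ?, ?)",
        [(1, "PROT-DEMO-1", "demo lectin"), (2, "PROT-DEMO-2", "demo enzyme")],
    )
    conn.executemany(
        "INSERT INTO CPBINDING VALUES (?, ?, ?, ?, ?)",
        [
            (1, "CPB-DEMO-1", 1, 1, 1.0),
            (2, "CPB-DEMO-2", 2, 1, 2.0),
            (3, "CPB-DEMO-3", 3, 2, 0.5),
        ],
    )
    conn.commit()
    return conn


def glycans_bound_by(conn, protein_name):
    """Return common names of glycans linked to a protein via CPBINDING."""
    cursor = conn.execute(
        """
        SELECT c.Carb_commonName
        FROM CPBINDING AS b
        JOIN CARBOHYDRATE AS c ON c.Carb_key = b.Carb_key
        JOIN PROTEIN AS p ON p.Protein_key = b.Protein_key
        WHERE p.Protein_name = ?
        ORDER BY c.Carb_key
        """,
        (protein_name,),
    )
    return [row[0] for row in cursor.fetchall()]


if __name__ == "__main__":
    demo = build_demo_db()
    print(glycans_bound_by(demo, "demo lectin"))
```

Keeping the binding record in its own table, as the diagram does, lets one glycan be associated with many proteins and vice versa without duplicating either record.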