CaRE Center Informatics
NHLBI CaRE Center Meeting
Bethesda, MD
July 25, 2006
Marcia Nizzari

CaRE Center Informatics
• Builds on existing Genetic Analysis Platform
  – Operational for 2+ years
  – Genotyping and Resequencing
  – Code base successfully reused
• CaRE Center enhancements:
  – Data sharing strategy
  – Phenotype/Trait thesaurus, meta thesaurus
  – Customizable analytic pipelines

User Experience – Production
Three “portals” or dashboards:
• Sample Management
  – Register and fingerprint samples, manage storage and aliquots for experiments
  – Record phenotypes for Individuals and Samples
• Project Management
  – Manage Groups, Projects, plan your experiments
  – Shunt filtered results into analysis pipelines
• Process/LIMS Management
  – Design and execute experiments per platform, curate results
    • Affy, Illumina, Sequenom or resequencing

High Level Workflow – for CaRE
[Diagram; labels only]
• Production: BSP/GAP + CaRE enhancements
  – Upload Samples, Peds, Individuals, Phenotypes
  – Create Experiments (Samples x Features)
  – Design and Execute Experiments
  – QC/Curate Results
  – Data Compile
• Data stores and services: BSP DB, Project DB, Feature DB, LIMS DBs, Data Vault, Web Services
• Analysis: Gene Pattern + CaRE analysis tools
  – Summarize/Filter
  – PLINK Association & Statistics
  – Viewers
  – Cohort’s Custom Algorithms, Viewers

Production Screenshots
Upload Phenotypes, Create Experiments, Curate Results, Filter by Phenotype for Analysis

Project Management dashboard
Showing Phenotype Upload
Anticipate significant enhancements to handle CaRE Center requirements.

Project Management dashboard
Showing Experiment Definition
Experiments flow through the Process Dashboard for execution; they provide the unit of logical reporting on progress.

Process Dashboard
Showing QC Report on Affy chemistry plates – Fingerprints to the right!
Lab techs and coordinators can view and curate plates; set up re-hyb and redo pipelines.
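
The workflow steps "Create Experiments (Samples x Features)", "Filter by Phenotype for Analysis" and the hand-off to PLINK can be illustrated with a small sketch. This is not the platform's code: the `Individual` class, the function names and the `ldl` trait are hypothetical and used only for illustration. The only external fact assumed is the standard PLINK .ped column order (family ID, individual ID, paternal ID, maternal ID, sex, phenotype, then allele pairs).

```python
from dataclasses import dataclass
from itertools import product


@dataclass
class Individual:
    """Hypothetical record for an individual with uploaded phenotypes."""
    family_id: str
    individual_id: str
    sex: int           # PLINK coding: 1 = male, 2 = female, 0 = unknown
    phenotypes: dict   # hypothetical trait name -> recorded value


def create_experiment(sample_ids, feature_ids):
    """Model an experiment as the cross product Samples x Features."""
    return list(product(sample_ids, feature_ids))


def filter_by_phenotype(individuals, trait, predicate):
    """Keep individuals whose recorded value for `trait` satisfies `predicate`."""
    return [
        ind for ind in individuals
        if trait in ind.phenotypes and predicate(ind.phenotypes[trait])
    ]


def ped_line(ind, affection, genotypes):
    """Format one PLINK .ped row; parents are written as 0 (unknown/founder)."""
    alleles = " ".join(f"{a} {b}" for a, b in genotypes)
    return f"{ind.family_id} {ind.individual_id} 0 0 {ind.sex} {affection} {alleles}"


if __name__ == "__main__":
    cohort = [
        Individual("F1", "I1", 1, {"ldl": 160.0}),
        Individual("F1", "I2", 2, {"ldl": 95.0}),
    ]
    # Select the subset that would be piped into the analysis pipeline.
    subset = filter_by_phenotype(cohort, "ldl", lambda value: value >= 130)
    for ind in subset:
        print(ped_line(ind, 2, [("A", "G")]))
```

In this sketch the filtered subset, not the full cohort, is what gets compiled into PLINK input, mirroring the deck's description of analysis on a derived, curated dataset.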
Project Management dashboard
Showing QC Statistics and Pheno Query
Production analysis workflow executed prior to exporting data for Gene Pattern pipeline association study analyses.

Project Management dashboard
Search phenotypes to slice and dice results for analysis
Resulting subset will be piped into Gene Pattern pipeline for analysis on derived, curated dataset.

User Experience – Analysis
• GenePattern framework
  – Provides “pluggable” backplane
  – Can string together tools in a pipeline
  – Tracks everything for ‘reproducible research’
• For CaRE Center
  – We create templates for our standard analysis methods
  – Cohort teams can customize
  – Streamlines publication!

Screenshots for Analysis
Gene Pattern framework with PLINK and custom reporting

High Level Workflow – for CaRE
[Diagram; labels only]
• Production: BSP/GAP + CaRE enhancements
  – Upload Samples, Peds, Individuals, Phenotypes
  – Create Experiments (Samples x Features)
  – Design and Execute Experiments
  – QC/Curate Results
  – Data Compile
• Data stores and services: BSP DB, Project DB, Feature DB, LIMS DBs, Data Vault, Web Services
• Analysis: Gene Pattern + CaRE analysis tools
  – Summarize/Filter
  – PLINK Association & Statistics
  – Viewers
  – Cohort’s Custom Algorithms, Viewers

Compiled Files for PLINK

QC Report (In browser)

Issues/Questions
• Scope of phenotype-related enhancements
• Group/Project structure for CaRE Center
• CaRE user visibility into Process Dashboard/LIMS
• Data release model decision
  – Data Enclave scenarios and security
• User training and documentation
  – Analysis methodology
  – System and security training

Security for Production & Analysis
[Diagram; labels only]
• Users in JAAS domain: BSP Lab Technician; CaRE Cohort Technician; Broad Lab Technician, Coordinator; CaRE Scientist
• Proj Mgt Security Context (Project)
  – Project Management: Groups, Projects, Grants, Panels, Feature Sets, Sample Sets
• Lab Security Context (X-Project)
  – Process/LIMS
• BSP Security Context (Sample Collection)
  – Biological Samples Platform
• CaRE Analysis Security Context (Scope based on rules of Data Enclave, could cover multiple Projects)
  – Analysis Pipelines
• Shareable Objects: Peds, Individuals, Phenotypes, Samples, Features
• LSIDs, PIPS DB, Feature DB

Firewalls
[Network diagram; labels only]
• Zones: The World, Internet “Cloud”, MIT, The Broad Institute
• Devices: Cisco Pix, Core Router, Cisco Pix
• Radius DB: Used for authentication for VPN access
• Hosts: Host A, Host B, On LIMS, Host on server …
• Access Rules for Subnets: Explicit allows, e.g., allow host on LIMS to talk to host on server
• Allow Rules: Explicit allows
  – http = 80 -> host
  – ssh = 22 -> host
  – https = 443 (SSL)
  – Must be in the list to permit access
• Unregistered: Open jack, 10.10 domain, Wireless

Acknowledgements
• Genetic Analysis Platform team
• Biological Sample Platform team
• GenePattern team
• Stacey Gabriel, David Altshuler, Mark Daly
• URLs:
  – GenePattern: http://www.broad.mit.edu/cancer/software/genepattern/
  – PLINK: http://pngu.mgh.harvard.edu/~purcell/plink/
  – Haploview: http://www.broad.mit.edu/mpg/haploview/
  – Center for Genotyping and Analysis: http://www.broad.mit.edu/gen_analysis/genotyping/
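
The firewall slide's "Explicit allows" (http = 80, ssh = 22, https = 443, and "Must be in the list to permit access") describe a default-deny policy: traffic passes only if a matching rule exists. A minimal sketch of that logic follows. It is not Cisco PIX configuration, and the host names `host-a` and `host-b` and the `AllowRule` type are assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AllowRule:
    """One explicit allow: traffic to `dest_host` on `port` is permitted."""
    dest_host: str
    port: int


# Hypothetical rule list mirroring the ports named on the slide.
ALLOW_RULES = frozenset({
    AllowRule("host-a", 80),    # http
    AllowRule("host-a", 22),    # ssh
    AllowRule("host-a", 443),   # https (SSL)
})


def is_permitted(dest_host, port, rules=ALLOW_RULES):
    """Default deny: permit only if an explicit allow rule matches exactly."""
    return AllowRule(dest_host, port) in rules


if __name__ == "__main__":
    for host, port in [("host-a", 443), ("host-a", 8080), ("host-b", 22)]:
        verdict = "allow" if is_permitted(host, port) else "deny"
        print(f"{host}:{port} -> {verdict}")
```

Anything not listed, whether an unlisted port on a known host or any port on an unlisted host, falls through to deny, which is the behavior the slide's "must be in the list" note implies.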