database of Genotype and Phenotype
http://www.ncbi.nlm.nih.gov/entrez/query.fcgi?db=gap
Kim Pruitt
(for Matt Mailman)
NCBI
National Center for Biotechnology Information
Overview
• Phenotype
• Genotype
• Genotype X Phenotype Association
Overview
• Phenotype
– Data tables
• Columns are phenotypes
• Rows are individuals
– Documents (e.g., protocols, data collection forms)
• Parts of documents linked to variables
– Data dictionary
• Genotype
• Genotype X Phenotype Association
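The phenotype layout above (one row per individual, one column per phenotype, with a data dictionary describing each column) can be sketched as follows. This is only an illustrative sketch: the column names, values, and helper functions are hypothetical and do not reflect dbGaP's actual file formats.

```python
from dataclasses import dataclass


@dataclass
class VariableInfo:
    """One data-dictionary entry describing a phenotype column."""
    description: str
    unit: str


# Hypothetical data dictionary: one entry per phenotype column.
data_dictionary = {
    "age": VariableInfo("Age at enrollment", "years"),
    "sbp": VariableInfo("Systolic blood pressure", "mmHg"),
}

# Hypothetical data table: rows are individuals, columns are phenotypes.
data_table = [
    {"subject_id": "S1", "age": 54, "sbp": 128},
    {"subject_id": "S2", "age": 61, "sbp": 141},
    {"subject_id": "S3", "age": 47, "sbp": 119},
]


def undocumented_columns(table, dictionary, id_column="subject_id"):
    """Return phenotype columns present in the table but missing from the dictionary."""
    columns = set()
    for row in table:
        columns.update(row.keys())
    columns.discard(id_column)
    return sorted(columns - set(dictionary))


def column_values(table, column):
    """Collect one phenotype's values across all individuals."""
    return [row[column] for row in table]
```

Checking the table against the dictionary before submission is one way to make sure every measured variable is documented.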
Overview
• Phenotype
• Genotype
– Genotype files directly from vendor
– Intensity files (e.g., .CEL)
• Genotype X Phenotype Association
Overview
• Phenotype
• Genotype
• Genotype X Phenotype Association
– Various statistical models and methods
– P-value or LOD score for each marker
– Filters by P-value, HWE, minor allele frequency
– Map phenotypes onto genomic sequence
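As a rough illustration of the filters listed above, the sketch below keeps markers that pass thresholds on association P-value, Hardy-Weinberg equilibrium (HWE), and minor allele frequency. The slides do not say which HWE test or thresholds dbGaP uses; the one-degree-of-freedom chi-square test, the threshold defaults, and the marker record layout here are all assumptions made for the example.

```python
import math


def minor_allele_frequency(n_aa, n_ab, n_bb):
    """Minor allele frequency from genotype counts (AA, AB, BB)."""
    total_alleles = 2 * (n_aa + n_ab + n_bb)
    freq_a = (2 * n_aa + n_ab) / total_alleles
    return min(freq_a, 1 - freq_a)


def hwe_chi_square_p(n_aa, n_ab, n_bb):
    """P-value of a 1-df chi-square test for Hardy-Weinberg equilibrium."""
    n = n_aa + n_ab + n_bb
    p = (2 * n_aa + n_ab) / (2 * n)
    q = 1 - p
    if p == 0 or q == 0:
        return 1.0  # monomorphic marker: no departure can be measured
    expected = (n * p * p, 2 * n * p * q, n * q * q)
    observed = (n_aa, n_ab, n_bb)
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    # Survival function of the chi-square distribution with 1 degree of freedom.
    return math.erfc(math.sqrt(chi2 / 2))


def passes_filters(marker, max_assoc_p=1e-5, min_hwe_p=1e-3, min_maf=0.05):
    """Apply the three filters; the threshold defaults are hypothetical."""
    counts = marker["counts"]
    return (
        marker["assoc_p"] <= max_assoc_p
        and hwe_chi_square_p(*counts) >= min_hwe_p
        and minor_allele_frequency(*counts) >= min_maf
    )


# Hypothetical association results: genotype counts are (AA, AB, BB).
markers = [
    {"name": "m1", "assoc_p": 1e-8, "counts": (25, 50, 25)},
    {"name": "m2", "assoc_p": 1e-9, "counts": (50, 0, 50)},   # far from HWE
    {"name": "m3", "assoc_p": 0.2, "counts": (25, 50, 25)},   # not significant
    {"name": "m4", "assoc_p": 1e-7, "counts": (98, 2, 0)},    # rare minor allele
]
kept = [m["name"] for m in markers if passes_filters(m)]
```

Filtering on HWE and minor allele frequency is commonly used to screen out markers with likely genotyping problems before interpreting the association P-values.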
Overview
• Phenotype
• Genotype
• Genotype X Phenotype Association
• Obvious expansion potential:
– More species; different types of association data (QTL)
• Critically important to archive all data:
– Submit primary data to appropriate public archive!
– Probe DB: primers, resequencing amplicons
– dbSTS: STS markers
– Maps: UniSTS; Map Viewer
– GenBank: ESTs
dbGaP Web Site
Two levels of access: open and controlled
• Open access to non-sensitive data
– Study summaries and documents
– Measured variables and data elements
– Analysis reports
– Genome browser
• Controlled access provides oversight and accountability for use of sensitive datasets involving personal information
– De-identified phenotypes and genotypes for individual subjects
– Pedigrees
Browse Studies
http://www.ncbi.nlm.nih.gov/entrez/query.fcgi?db=gap
Link back to dbGaP homepage
Instructions
Description of dbGaP
Link to study report
List of variables in study
List of documents in study
Automated query to PubMed for genome-wide association study articles
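An automated PubMed query of this kind could be expressed through NCBI's E-utilities ESearch service, as sketched below. The exact query that the dbGaP page issues is not given in the slides, so the search term, the `retmax` value, and the helper names here are assumptions. The code only builds the URL and does not contact the network.

```python
from urllib.parse import parse_qs, urlencode, urlparse

# Public E-utilities ESearch endpoint.
EUTILS_ESEARCH = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi"

# Assumed search term; the query actually used by dbGaP may differ.
GWAS_TERM = '"genome-wide association study"[Title/Abstract]'


def pubmed_search_url(term, retmax=20):
    """Build an ESearch URL that searches PubMed for the given term."""
    params = {"db": "pubmed", "term": term, "retmax": retmax}
    return f"{EUTILS_ESEARCH}?{urlencode(params)}"


def term_from_url(url):
    """Decode the search term back out of a URL, as a round-trip check."""
    return parse_qs(urlparse(url).query)["term"][0]


url = pubmed_search_url(GWAS_TERM)
```

Fetching `url` with any HTTP client would return the matching PubMed identifiers.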
Browse Studies by Disease
Link to study report
Expand/collapse
Link to Terms from MeSH vocabulary
Advanced Search
Fields to be searched
Add any number of search criteria
Study Report
Citeable unique stable identifier
Genotype x phenotype association or linkage analyses
History
Links back to submitter website
Publications
Attribution
Access Rules
Criteria for inclusion/exclusion
Search this study
Link to variable report
Variable Report
Citeable unique stable identifier
Documents containing a section that has been linked to this variable
Statistical summary of values for this variable
P-value is red if cases differ from controls
Variable Report (continued)
Document name
Link to document
Section of document that has been linked to this variable
Analysis Report
Link back to report for measured or derived variable that was analyzed
Genome browser of analysis results
Genome Browser of Analysis Results
Slider filters results less significant than threshold
2 Mb bins colored to represent the most significantly associated marker
Click on bin of interest to zoom in and see association in context with other objects mapped to the same genomic region
LINK
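The binning described on this slide can be sketched as follows: markers are grouped into 2 Mb bins, each bin keeps the smallest P-value among its markers, and a threshold (standing in for the slider) drops less significant results. The tuple layout and function name are assumptions for illustration, not the browser's actual implementation.

```python
BIN_SIZE = 2_000_000  # 2 Mb, the bin width named on the slide


def best_p_per_bin(results, bin_size=BIN_SIZE, max_p=1.0):
    """Map (chromosome, bin index) to the smallest P-value in that bin.

    results: iterable of (chromosome, position, p_value) tuples.
    max_p:   markers with a P-value above this threshold are dropped first.
    """
    best = {}
    for chrom, pos, p_value in results:
        if p_value > max_p:
            continue  # less significant than the threshold
        key = (chrom, pos // bin_size)
        if key not in best or p_value < best[key]:
            best[key] = p_value
    return best


# Hypothetical analysis results.
results = [
    ("1", 500_000, 0.01),
    ("1", 1_900_000, 1e-6),
    ("1", 2_100_000, 0.3),
    ("2", 100, 0.04),
]
all_bins = best_p_per_bin(results)
significant_bins = best_p_per_bin(results, max_p=0.05)
```

A display layer could then color each bin by its stored P-value, for example on a -log10 scale.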
Genome Browser – Higher Resolution
Collapse table
P-value of genotyped marker
Scroll via boxes above
Add maps
CFH gene has been associated with AMD in several studies
Coming Soon…
• Studies
– Early 2007
• Michael J. Fox Foundation Parkinson’s Disease Study (LEAPS)
• NINDS Stroke and ALS
– Spring 2007
• GAIN (Genetic Association Information Network)
• Framingham SHARe – first two generations
• NIDDK GoKinD and EDIC
– Summer 2007
• Framingham SHARe – third generation
– Late 2007 to early 2008
• GEI (Genes and Environment Initiative)
• Features
– Search analysis results by:
• Gene
• SNP or microsatellite marker
• Genomic region
– Filter analysis results by:
• P-value
• HWE
• Minor allele frequency
• Call rate?
– Download
• Public summaries
• Authorized access for individual-level data
Acknowledgements
• Phenotype
– Rinat Bagoutdinov
– Luning Hao
– Mas Kimura
– Jimmy Jin
– Natasha Popova
– Stephanie Pretels
– Karl Sirotkin
– Jack Wang
– Matt Mailman
• Genotype
– Mike Feolo
– Lon Phan
– David Shao
– Ming Ward
– Steve Sherry
• XML
• Authorized Access
– Steve Sherry
– Eugene Yaschenko
– Vladimir Soussov
– Misha Kimmelman
– Don Preuss
– Al Graeff
– Jim Ostell
– Kim Tryka
– Laura Kelly
– Jeff Beck
http://www.ncbi.nlm.nih.gov/entrez/query.fcgi?db=gap