Analyzing human variation with Galaxy
Belinda Giardine and Cathy Riemer
Feb 8, 2012

Outline
  Part 1: Filtering out SNPs found in genomes of healthy individuals
    Uploading files
    Using Galaxy libraries
    Basic filtering
  Part 2: Selecting known coding SNPs predicted to be damaging, then finding their genes and associated pathways
    PolyPhen2
    Gene-based analysis
  Part 3: Running new predictions for coding SNPs likely to be detrimental
    SIFT
    Workflows
  Part 4: Finding SNPs that fall in any given set of intervals
    Predicted regulatory regions, ENCODE functional data, phyloP conserved regions

Fake example dataset
  SNP calls from Complete Genomics GS12880
  5 known disease variants added for illustration
    Various genes and parts of the gene (coding, regulatory, splicing, …)
  Realistic background for search, but not a realistic SNP combination

Part 1: Filtering out SNPs found in genomes of healthy individuals
  Uploading a file
  Converting file format
  Shared data
  Importing datasets from library
  Filtering SNPs
  Filter results

Part 2: Selecting known coding SNPs predicted to be damaging, then finding their genes and associated pathways
  PolyPhen2
  Filtering PolyPhen2 results
  PolyPhen2 results
  Linking identifiers
  Identifier fields
  Join identifiers to result
  Comparative Toxicogenomics Database (CTD)

Part 3: Running new predictions for coding SNPs likely to be detrimental
  SIFT inputs
  Shared data
  Workflow
  Your workflows
  Running the workflow
  Running SIFT
  Filter SIFT results
  SIFT results

Part 4: Finding SNPs that fall in any given set of intervals
  Import predicted regulatory regions
  Filter with intersect tool
  PRPs results
  Using ENCODE data
  Again filter with intersect
  DNase HSS results
  Conservation
  Histogram of phyloP scores
  Filter on phyloP greater than or equal to 0.5
  phyloP results

What we covered
  Part 1: Filtering out SNPs found in genomes of healthy individuals
    Uploading files
    Using Galaxy libraries
    Basic filtering
  Part 2: Selecting known coding SNPs predicted to be damaging, then finding their genes and associated pathways
    PolyPhen2
    Gene-based analysis
  Part 3: Running new predictions for coding SNPs likely to be detrimental
    SIFT
    Workflows
  Part 4: Finding SNPs that fall in any given set of intervals
    Predicted regulatory regions, ENCODE functional data, phyloP conserved regions

Editing the dataset name and build
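
The Part 4 steps (keep regions scoring at least 0.5, then keep the SNPs that fall inside a region) can be illustrated outside Galaxy with a small sketch. This is not the Galaxy intersect tool itself. The `Snp` and `Region` types, the function names, the toy coordinates, and the 0-based half-open interval convention are all assumptions made for illustration; only the 0.5 threshold comes from the slides.

```python
from typing import List, NamedTuple


class Snp(NamedTuple):
    chrom: str
    pos: int  # 0-based position (assumed convention)


class Region(NamedTuple):
    chrom: str
    start: int    # 0-based, inclusive (assumed convention)
    end: int      # exclusive
    score: float  # hypothetical per-region score, e.g. conservation


def filter_by_score(regions: List[Region], min_score: float = 0.5) -> List[Region]:
    """Keep regions whose score is greater than or equal to min_score."""
    return [r for r in regions if r.score >= min_score]


def intersect(snps: List[Snp], regions: List[Region]) -> List[Snp]:
    """Keep SNPs that fall inside at least one region on the same chromosome."""
    return [
        s for s in snps
        if any(r.chrom == s.chrom and r.start <= s.pos < r.end for r in regions)
    ]


# Toy data, invented for this example only.
snps = [Snp("chr1", 100), Snp("chr1", 250), Snp("chr2", 100)]
regions = [
    Region("chr1", 90, 110, 0.8),
    Region("chr1", 240, 260, 0.2),
    Region("chr2", 500, 600, 0.9),
]

conserved = filter_by_score(regions, min_score=0.5)
hits = intersect(snps, conserved)
print(hits)
```

The linear scan over regions is adequate for a handful of intervals. For genome-scale interval sets, a sorted sweep or an interval tree would be the usual choice.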