Functional Genomics in Non-Model Organisms

What is Functional Genomics?
• Functional genomics refers to the development and application of global (genome-wide or system-wide) experimental approaches to assess gene function by making use of the information and reagents provided by structural genomics. It is characterized by high-throughput or large-scale experimental methodologies combined with statistical or computational analysis of the results (Hieter and Boguski 1997).
• Functional genomics as a means of assessing phenotype differs from more classical approaches primarily with respect to the scale and automation of biological investigations. A classical investigation of gene expression might examine how the expression of a single gene varies with the development of an organism in vivo. Modern functional genomics approaches, however, would examine how 1,000 to 10,000 genes are expressed as a function of development. (UC Davis Genome Center)

Functional Genomics, Hunt & Livesey (eds.)
• Subtracted cDNA Libraries
• Differential Display
• Representational Difference Analysis
• Suppression Subtractive Hybridization
• cDNA Microarrays
• Serial Analysis of Gene Expression
• 2-D Gel Electrophoresis

My View of Functional Genomics
• Differential Gene Expression
  – SAGE/MPSS
  – RDA/SSH
  – *Open systems*
• Identifying the Function of Genes
  – Functional Complementation
  – RNA interference/RNA silencing

Disclaimer
• Relevant primarily to eukaryotes
• Most common systems (literature/class)
• Personal experience with them
• I like them

Why We Need Functional Genomics

Organism      # genes     % of genes with inferred function   Completion date of genome
E. coli       4288        60                                   1997
yeast         6,600       40                                   1996
C. elegans    19,000      40                                   1998
Drosophila    12-14K      25                                   1999
Arabidopsis   25,000      40                                   2000
mouse         ~30,000?    10-20                                2002
human         ~30,000?    10-20                                2000

My Two Cents (as expressed by Hieter & Boguski 97)
• Functional genomics will not replace the time-honored use of genetics, biochemistry, cell biology and structural studies in gaining a detailed understanding of biological mechanisms.
• The extent to which any functional genomics approach actually defines the function of a particular protein (or set of proteins) will vary depending on the method and gene involved.

mRNA abundance classes (Okamuro & Goldberg)
• Superabundant
  – 15-90% of mRNA mass
  – <10 structural gene transcripts
  – >5000 molecules per cell per sequence
• Abundant
  – 50-75% of mRNA mass
  – ~200-1000 structural gene transcripts (5% of diversity)
  – 500-2500 molecules per cell per sequence
• Rare/complex
  – <25% of mRNA mass; individual seqs <0.01%
  – 95% of mRNA diversity
  – 1-10 molecules per cell per sequence

SAGE & MPSS
• Serial Analysis of Gene Expression
• Massively Parallel Signature Sequencing
• Start from mRNA (euks)
• Generate a short sequence tag (9-21 nt) for each mRNA ‘species’ in a cell

Generate cDNA primed with biotin-oligo(dT).

Restriction digest double-stranded cDNA with a 4-base cutter “anchoring enzyme”; bind to streptavidin-coated beads.

[Diagram: double-stranded cDNAs with poly(A)/oligo(dT) ends (AAAA/TTTT); after digestion, the bead-bound fragments carry a GTAC end.]

Divide pool in half and ligate to different linkers (1 or 2), both of which have a restriction site for the “tagging enzyme”.

[Diagram: linker 1 and linker 2, each joined at the CATG/GTAC site of a cDNA fragment ending in AAAA/TTTT.]

Restriction digest with a Type IIS restriction enzyme, which recognizes the linker sequences and cuts downstream in a sequence-independent fashion; fill in the 5’ overhang to blunt ends.
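The digest steps up to this point determine which bases become the tag: the stretch immediately 3' of the anchoring-enzyme site closest to the poly(A) tail. The sketch below is an in-silico illustration of that rule, not part of the bench protocol. It assumes the CATG site and the 10-base tag drawn in the slide diagrams; the function name `extract_sage_tag` and the example sequence are hypothetical.

```python
from typing import Optional

ANCHOR_SITE = "CATG"  # anchoring-enzyme site as drawn in the slide diagrams
TAG_LENGTH = 10       # assumed tag length; the slides quote 9-21 nt


def extract_sage_tag(cdna: str, anchor: str = ANCHOR_SITE,
                     tag_length: int = TAG_LENGTH) -> Optional[str]:
    """Return the tag 3' of the last anchor site in a sense-strand cDNA.

    Returns None when the sequence has no anchor site, or when the site
    lies too close to the 3' end to yield a full-length tag.
    """
    seq = cdna.upper()
    site = seq.rfind(anchor)  # the site nearest the poly(A) tail
    if site == -1:
        return None           # no restriction site, so no tag
    start = site + len(anchor)
    tag = seq[start:start + tag_length]
    return tag if len(tag) == tag_length else None


# Hypothetical transcript with two CATG sites; only the 3'-most one counts.
example = "GGCATGAAACCCGGGTTTACATGTTTGGGCCCAAATTTAAAAAAAA"
print(extract_sage_tag(example))   # TTTGGGCCCA
print(extract_sage_tag("AAAAGGGG"))  # None
```

Using the last site rather than the first mirrors the bead-capture step: only the fragment that still carries the biotinylated oligo(dT) end is retained.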
  1  GGATGCATGXXXXXXXXXX
     CCTACGTACXXXXXXXXXX

Blunt-end ligate pool 1 to pool 2, and PCR amplify with primers specific to linker sequences 1 and 2.

  2  GGATGCATGOOOOOOOOOO
     CCTACGTACOOOOOOOOOO

              Tag 1     Tag 2
  1  GGATGCATGXXXXXXXXXXOOOOOOOOOOCATGCATCC  2
     CCTACGTACXXXXXXXXXXOOOOOOOOOOGTACGTAGG
              |------ Ditag -----|

Restriction digest with the same anchoring enzyme (above); concatenate ditags and ligate to cloning/sequencing vector.

           |------ Ditag -----|    |------ Ditag -----|
  -----CATGXXXXXXXXXXOOOOOOOOOOCATGXXXXXXXXXXOOOOOOOOOOCATG-----
  -----GTACXXXXXXXXXXOOOOOOOOOOGTACXXXXXXXXXXOOOOOOOOOOGTAC-----
           Tag 1     Tag 2         Tag 3     Tag 4

SAGE
• Described by Velculescu et al. (1995)
• Originally 9 bp tags, now LongSAGE 21 bp
• 10-50 tags in a clone
• Only requires a sequencer (and some time)

MPSS
• Proprietary technology; published 2000
• Generates 17 nt “signature sequence”
• Collects >1,000,000 signatures per sample
• Requires 2 µg of mRNA and $$

What is significantly different?
Ruijter et al. 2002. Physiol. Genomics 11:37-44.
• Planning SAGE experiments: how many tags need to be sequenced?
• Comparing 2 libraries

MPSS - Alexandrium fundyense
39931 unique tags; 3172 different at p<0.001
[Figure: scatter plots of -N signatures (tpm) against -P signatures (tpm), full range (0-100,000 tpm) and zoomed (0-100 tpm).]

Not every tag is a unique sequence
Not every sequence has a unique tag
• Alternative splicing, >1 tag per gene
• No restriction site, no tags per gene
• Sequencing error (random, 0.7% for SAGE, Velculescu et al. 1995)
• Antisense transcripts

Tag Abundance Distribution
[Figure: number of tags (log scale) in the -P and -N libraries, and number of significant tags (p<0.001) in each, binned by abundance: >1%, >0.1%, >0.01%, >0.001%, <0.001%.]

Expression Ratio
[Figure: number of tags, and number of tags significant at p<0.001 (log scale), binned by expression ratio, from 1.5 > N:P > 1 up to N:P > 50 and N:P = 0, and likewise for P:N.]

RDA
• Initially used for DNA comparisons (Lisitsyn et al. 1993)
• Later modified for cDNA to reduce complexity (Hubank and Schatz 1994)
• May need >1 enzyme to cover all genes
• Should pick up transcripts present at <=0.005%
• Time-intensive + a LOT of manipulation

Success with RDA
• DNA markers in ginbuna (Murakami et al. 2002)
• mRNA induced under hypoxia in tiger salamander (McKean et al. 2002)
• Rice & date palm 2002; oak 2001; tobacco 2000; pea & maize 1998; earliest 1996
• No more recent refs

SSH
[Diagram: suppression subtractive hybridization scheme.]
• Starting material: tester cDNA with Adaptor 1, tester cDNA with Adaptor 2, driver cDNA (in excess)
• First hybridization (all components denatured): products a, b, c, d
• Second hybridization: mix, add freshly denatured driver; anneal: products a, b, c, d + e
• Fill in the ends
• Add primers; PCR amplify
  – a: no amplification
  – b: no amplification
  – c: linear amplification
  – d: no amplification
  – e: exponential amplification

Efficacy of SSH… Ji et al. 2002 BMC Genomics 3:12
• Diatchenko et al. 1996; could detect as little as 0.001% target
• Critical factor is relative concentration of target in tester and driver populations
• Effective enrichment when:
  – Target present at >= 0.01%
  – Concentration ratio >= 5-fold

What this looks like
208 signatures at >=0.01%, >= 5-fold induction
[Figure: scatter plots of -N signatures (tpm) against -P signatures (tpm), at 0-5,000 tpm and 0-1,000 tpm.]

Success with SSH
• Armbrust 1999, diatoms
• Lots of biomedical refs 2003
• Xylella, Aspergillus, Dunaliella

Post-transcriptional gene silencing

Group                    Organism                                                         Phenomenon            Trigger
Fungi                    Neurospora                                                       quelling              transgenes
Plants                   Petunia, Nicotiana, Arabidopsis, rice, tomato, potato, etc.      PTGS, co-suppression  transgenes, viruses
Animals: Invertebrates   C. elegans                                                       RNAi                  dsRNA
                         Drosophila                                                       RNAi                  dsRNA
                         Paramecium                                                       co-suppression        transgenes
                         Planaria                                                         RNAi                  dsRNA
                         Hydra                                                            RNAi                  dsRNA
                         T. brucei                                                        RNAi                  dsRNA
Animals: Vertebrates     Zebrafish                                                        RNAi                  dsRNA
                         mouse                                                            RNAi                  dsRNA

Kamath et al. 2003
16,757 strains = 86% of predicted ORFs
Looked for sterility or lethality (Nonv), slow growth (Gro) or defects (Vpep)
1,722 strains (10.3%) had such phenotypes
Genes involved in basic metabolism & cell maintenance are enriched for Nonv phenotype
Genes involved in more complex ‘metazoan’ processes (signal transduction, transcriptional regulation) are enriched for Vpep phenotype
Nonv phenotypes are highly underrepresented on the X chromosome
X chromosome is enriched for Vpep phenotypes
Basal functions of eukaryotes are shared:
- lethal (Nonv) genes tended to be of ancient origin
- ‘animal-specific’ genes tended to be non-lethal (Vpep)
- almost no ‘worm-specific’ genes were lethal
Genes producing a defective phenotype are clustered: Nonv genes cluster in the central regions of chromosomes, except on the X chromosome, which is depleted of Nonv phenotypes

Functional Complementation
• Often yeast, E. coli
• The goal of the SGDP (Saccharomyces Genome Deletion Project) is to generate as complete a set as possible of yeast deletion strains with the overall goal of assigning function to the ORFs through phenotypic analysis of the mutants.
• As of 01/03, 95% of the approx. 6200 ORFs have been deleted; more than 20,000 strains are available from Research Genetics, Open Biosystems and the ATCC.

Functional Complementation
• Intramembrane cleaving proteases: Drosophila rhomboid complements the aarA of Providencia stuartii and vice versa (Gallio et al. 2002)
• Cyclophilin-RNA interacting proteins in Paramecium, conserved from yeast to humans (Krzywicka et al. 2001)
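
Worked example: applying the SSH enrichment thresholds to signature counts. The thresholds quoted earlier for effective SSH enrichment (target present at >= 0.01%, concentration ratio >= 5-fold) can be applied to MPSS-style counts in transcripts per million (tpm), where 0.01% corresponds to 100 tpm. The sketch below is a minimal illustration under stated assumptions: the function name, the signature labels and the counts are hypothetical, and treating a signature absent from the driver as enriched is a choice made here, not something the slides specify.

```python
from typing import Dict, List

MIN_ABUNDANCE_TPM = 100.0  # 0.01% of one million transcripts
MIN_FOLD = 5.0             # minimum tester:driver concentration ratio


def enriched_signatures(tester: Dict[str, float],
                        driver: Dict[str, float],
                        min_tpm: float = MIN_ABUNDANCE_TPM,
                        min_fold: float = MIN_FOLD) -> List[str]:
    """List signatures that meet both thresholds in the tester library."""
    hits = []
    for signature, tester_tpm in tester.items():
        if tester_tpm < min_tpm:
            continue  # too rare in the tester to enrich reliably
        driver_tpm = driver.get(signature, 0.0)
        # Assumption: a signature missing from the driver counts as enriched.
        if driver_tpm == 0.0 or tester_tpm / driver_tpm >= min_fold:
            hits.append(signature)
    return sorted(hits)


# Hypothetical counts (tpm) for four signatures in two libraries.
tester_library = {"sigA": 500.0, "sigB": 120.0, "sigC": 90.0, "sigD": 1000.0}
driver_library = {"sigA": 50.0, "sigB": 100.0, "sigC": 1.0}

print(enriched_signatures(tester_library, driver_library))  # ['sigA', 'sigD']
```

Here sigB fails the fold-change threshold and sigC fails the abundance threshold, which is the same pair of cuts that left 208 signatures in the Alexandrium data set.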