Comparative Genomics
The Human Genome Project
Public: International Human Genome Sequencing Consortium (aka HUGO)
Private: Celera Genomics, Inc. (aka TIGR)

The HGP
First proposed in 1986
In addition to humans, the effort included E. coli, yeast, C. elegans, Drosophila, and mouse
Funded in 1988
Estimated cost: $3 billion
Got underway in 1990
Final cost: $2.6 billion
First genome sequenced in 1995 (TIGR)
Yeast sequenced in 1996
E. coli sequenced in 1997
C. elegans sequenced in 1998
Drosophila sequenced in 2000 (Celera)

The Human Sequence
Human draft sequence released in Jan. 2001 (HUGO & Celera)
The genome was sequenced about 4 times over
The draft contained errors and gaps
Gaps can exist:
  1) within unfinished sequence clones
  2) between sequenced BACs
  3) between mapped BACs
The finished sequence, released in April of 2003, was sequenced 8 times over, had 1 error in 10,000 bases, and did not contain significant gaps

The “Typical” Human Gene
Feature                       Value
Size of exons                 145 bp
# of exons                    8.8
Size of introns               3,365 bp
Size of 3’ UTR                770 bp
Size of 5’ UTR                300 bp
Coding sequence (CDS) size    1,340 bp (447 aa)
Genomic extent                27 kb

The Number of Human Genes
[Figure: bar chart of gene-number estimates, axis from 0 to 140,000. Bars: early estimates, later estimates, draft sequence, final sequence.]

# of Genes in Other Organisms
[Figure: bar chart of gene number, axis from 0 to 25,000. Bars: M. g, E. c, S. c, D. m, C. e, H. s, A. t.]

Orthologs of Human Proteins
Where did the prokaryotic orthologs come from?
One possibility is horizontal transfer
41 genes may have been transferred in this way
For example: MAOs, monoamine oxidases
These enzymes deactivate neurotransmitters
Another possibility is the loss of these genes over time so that most eukaryotes lack them

Functional Categories of Proteins

Families of Transcription Factors

Some surprises from the HGP
Not every gene has its own promoter
Not every gene encodes a protein
The number of genes in our genome

Promoters
A number of adjacent genes are transcribed simultaneously. These genes were shown to share a promoter, much as prokaryotes do to control gene expression.

Genes that do not encode proteins
tRNA
rRNA
snRNAs (small nuclear RNAs)
snoRNAs (small nucleolar RNAs)
ncRNAs (non-coding RNAs)
  These are untranslated genes such as the let-7 gene in C. elegans. It encodes a 21-base RNA that binds to another gene

How Can We Have So Few Genes?

Combinatorial Control
We are not just 1.5 times as complex as flies, even though we have about 1.5 times the number of genes.
If each gene has 2 states, on or off, then there are 2^13,600 different combinations in Drosophila but 2^21,000 different combinations in humans.

Alternative Splicing

Epigenetic Control
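The arithmetic behind the combinatorial-control point can be checked with a short Python sketch. It assumes the gene counts quoted in the slide (13,600 for Drosophila, 21,000 for humans) and the slide's simplification that each gene is either on or off. The names `log10_states`, `gene_ratio`, and `extra_genes` are illustrative and are not from the lecture.

```python
import math

# Gene counts as quoted in the slide (approximate lecture figures).
DROSOPHILA_GENES = 13_600
HUMAN_GENES = 21_000


def log10_states(n_genes: int) -> float:
    """Return log10 of 2**n_genes, the number of on/off combinations.

    Working in logarithms avoids printing numbers with thousands of digits.
    """
    return n_genes * math.log10(2)


# Ratio of gene counts: the "about 1.5 times" in the slide.
gene_ratio = HUMAN_GENES / DROSOPHILA_GENES

# Each extra gene doubles the number of combinations, so the ratio of
# state counts is 2**(difference in gene number), not 1.5.
extra_genes = HUMAN_GENES - DROSOPHILA_GENES

print(f"Gene count ratio (human / fly): {gene_ratio:.2f}")
print(f"Fly combinations:   about 10^{math.floor(log10_states(DROSOPHILA_GENES))}")
print(f"Human combinations: about 10^{math.floor(log10_states(HUMAN_GENES))}")
print(f"Human / fly combinations: 2^{extra_genes}")
```

Under this on/off model, the number of possible expression states grows exponentially with gene number, so a 1.5-fold increase in genes gives a 2^7,400-fold increase in combinations. Real genes have more than two expression levels, so this is a lower-bound illustration of the argument rather than a measure of biological complexity.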