Genome Organization and Evolution

Assignment For 2/24/04
Read: Lesk, Chapter 2
Exercises 2.1, 2.5, 2.7, p 110
Problem 2.2, p 112
Weblems 2.4, 2.7, pp 112-113

Assignment For 3/02/04
Pick any two bioinformatics projects or resources, such as those in the previous lecture. For each, write a brief survey (~1000 words), giving such information as: the history of the project; the participants; the funding; its purpose and scope. Sources: web site, mailing lists, FAQs, published papers.

Genes
● Definition: A gene is a segment of DNA that codes for a protein
  – Caveats:
    – DNA that codes for functional RNA?
    – Control regions?

Gene organization
● A gene may occur on either strand of DNA
● Genes are continuous stretches (almost always) in prokaryotes
● Genes are (often) discontinuous stretches (exons) in eukaryotes. The intervening regions are called introns
● Upstream of the gene is a binding site (the promoter)
● The location of regulatory regions is less predictable

The Central Dogma
● One gene, one protein
● Like most dogmas, not entirely true
● Alternative splicing permits the manufacture of many products from a single gene
● The full set of protein products is sometimes called the proteome
● With current technology, more gene information is available than protein information

Transmission of information
● The continuity of life is a reflection of the (nearly) faithful transmission of genetic information
● The adaptation of life (evolution) is a result of imperfect transmission of information, and natural selection

Genetic maps
● Variable number tandem repeats (VNTRs, or minisatellites), 10-100 bp, are a sort of genetic fingerprint
● Short tandem repeat polymorphisms (STRPs, or microsatellites), 2-5 bp, are another kind of marker
● A sequence tagged site (STS), 200-600 bp, is a known unique location in the genome

Identifying genes
● A long ORF (open reading frame) is probably a gene (but what about eukaryotes? Look for the GT and AG splice signals at intron boundaries)
● A gene promoter site has identifiable characteristics (TATA box)
● If it looks like a known gene, it's a gene

Prokaryote genomes
● Example: E. coli
● 89% coding
● 4,285 genes
● 122 structural RNA genes
● Prophage remains
● Insertion sequence elements
● Horizontal transfers

Eukaryotic genome
● Example: C. elegans
● 6 chromosomes (five autosomes and X)
● 19,099 genes
● Coding region: 27%
● Average of 5 introns/gene
● Both long and short duplications

Evolution of genomes
● Adaptation of species is coterminous with adaptation of genomes
● Where do genes come from? (Answer: from other genes)
● Homologs and paralogs
● Lateral transfer
● Molecular species each have their own family tree
● Genes are widely shared

Close relatives
● Yeast, fly, worm and human share at least 1308 groups of proteins
● Unique to vertebrates: immune proteins (for example)
● Unique molecules are adapted from ancient molecules of different purpose but similar design
● Most new proteins come from domain rearrangement
● Most new species come from control region variation
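
The "long ORF is probably a gene" heuristic from the Identifying genes slide can be illustrated with a small scan over all six reading frames (three per strand, since a gene may occur on either strand). This is only a minimal sketch and not a method given in the lecture. It assumes ATG as the only start codon, the standard stop codons TAA, TAG and TGA, and an uppercase input sequence; the function names and the `min_codons` threshold are hypothetical. It suits continuous (prokaryote-style) genes and would miss eukaryotic genes whose coding sequence is interrupted by introns.

```python
# Sketch of a six-frame ORF scan. Assumptions (not from the lecture):
# ATG is the only start codon, TAA/TAG/TGA are the stop codons, and the
# input is an uppercase string over A, C, G, T.

STOP_CODONS = {"TAA", "TAG", "TGA"}
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of an uppercase DNA string."""
    return seq.translate(COMPLEMENT)[::-1]


def find_orfs(seq, min_codons=100):
    """Return a list of (strand, frame, start, end, orf_sequence) tuples.

    An ORF runs from an ATG to the first in-frame stop codon, inclusive.
    min_codons counts the codons before the stop. start and end are
    0-based, end-exclusive positions on the strand that was scanned
    (for '-' that is the reverse complement, not the input string).
    """
    orfs = []
    for strand, s in (("+", seq), ("-", reverse_complement(seq))):
        for frame in range(3):
            start = None  # position of the currently open ATG, if any
            for i in range(frame, len(s) - 2, 3):
                codon = s[i:i + 3]
                if start is None:
                    if codon == "ATG":
                        start = i
                elif codon in STOP_CODONS:
                    if (i - start) // 3 >= min_codons:
                        orfs.append((strand, frame, start, i + 3, s[start:i + 3]))
                    start = None  # close this ORF and look for the next ATG
    return orfs


# Toy example: a four-codon ORF (ATG AAA TTT GGG) followed by a TAA stop.
print(find_orfs("CCATGAAATTTGGGTAACC", min_codons=4))
# [('+', 2, 2, 17, 'ATGAAATTTGGGTAA')]
```

In practice the threshold would be much larger than in the toy example (the default of 100 codons here is an arbitrary illustrative choice), because short ORFs occur frequently by chance.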