Comp 150-05: Computational Biology Challenges
Prof. Lenore Cowen, Tufts University, Fall 2016
Scribe: Rebecca Newman

Lecture 1: Biology Review

1 Metaphor: The Cell = Factory Floor

Proteins are the machines; DNA is the set of blueprints.

2 Metaphor: Proteins = Beads on a String

There are 20 types of "beads", called amino acids. Amino acids differ from each other in what "hangs off" the standard chemical backbone. We use letters to represent the alphabet of amino acids, one letter per amino acid.

3 The Central Dogma

There is a direct correspondence between the 4-letter DNA alphabet and the 20-letter protein alphabet: each triple of DNA letters (a codon) codes for one amino acid.

[Figure: DNA alphabet, DNA alphabet triple, protein alphabet]

4 Genes

Genes are the parts of the DNA that code for proteins. Different proteins can be produced from the same portion of DNA via splicing.

5 Available Data

[Figure: available data and inferred information. Labels: Linear Sequence on Chromosome; RNA: Transcription; Expression Levels, Abundance; Ribosomes: Translation; Protein Sequence, Structure; Protein Function; Amino Acid Chain; Phenotypes. Information Inferred: Variations, Annotations; Alt. Splicing, Non-coding RNA.]

There is a huge gap between data collection and understanding of the data. The data is publicly available; we just need to process and interpret it!

How is this data organized? Either by organism or across organisms.

Which model organisms do we want to look at? Yeast, worm, fly, mouse, rat, plants, human.

6 Yeast

Baker's yeast is a simple single-celled eukaryote: 12 million base pairs, 16 chromosomes (32 in a diploid cell), and about 6000 genes.

Yeast Genome Database: www.yeastgenome.org

ORF (Open Reading Frame): the exact codon letters for this gene are known.
Sequence section: tells you exactly where on the chromosome the gene is located.
Essential gene: if you inactivate the gene, the organism dies.
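The correspondence between DNA triples and amino acids from Section 3, and the idea of reading a gene's codon letters as an ORF, can be illustrated with a short Python sketch. This is not from the lecture; it is a minimal example under stated assumptions. It uses the standard genetic code, reads only the forward strand, ignores splicing, and translates directly from DNA letters rather than going through RNA. The names `CODON_TABLE`, `translate`, and `first_orf` are hypothetical and chosen only for illustration.

```python
from itertools import product

# Standard genetic code, written in the conventional T, C, A, G ordering.
# "*" marks a stop codon. (Assumption: the standard code applies.)
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"

# Map each of the 4^3 = 64 DNA triples (codons) to a one-letter amino acid.
CODON_TABLE = {
    "".join(triple): aa
    for triple, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(dna):
    """Translate DNA into protein letters, codon by codon, until a stop codon."""
    dna = dna.upper()
    protein = []
    for i in range(0, len(dna) - 2, 3):
        # Codons containing unknown letters are reported as "X".
        aa = CODON_TABLE.get(dna[i:i + 3], "X")
        if aa == "*":
            break
        protein.append(aa)
    return "".join(protein)


def first_orf(dna):
    """Return the first forward-strand stretch from ATG to an in-frame stop.

    Returns None if there is no start codon or no in-frame stop codon.
    """
    dna = dna.upper()
    start = dna.find("ATG")
    if start == -1:
        return None
    for i in range(start, len(dna) - 2, 3):
        if CODON_TABLE.get(dna[i:i + 3]) == "*":
            return dna[start:i + 3]
    return None


if __name__ == "__main__":
    orf = first_orf("CCATGGCCTAAGG")
    print(orf, translate(orf))
```

The table has 64 entries but only 20 distinct amino acids (plus the stop signal), so several codons map to the same amino acid. Real gene finding would also need to consider the reverse strand, all reading frames, and splicing, which this sketch deliberately leaves out.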