DNA STRUCTURE

DOUBLE HELIX
Antiparallel DNA strands (one 5’ to 3’, the other 3’ to 5’)
Hydrogen bonds between bases
Fig.1.8

HOW TO DEFINE A GENE? (there are many descriptions...)
- sequence of DNA essential for specific function
- codes for protein or structural RNA

[Diagram: double-stranded DNA with a “structural” gene running from ATG to TAA; transcription & RNA processing give an RNA (5’ to 3’) running from AUG to UAA. Gene + flanking regulatory sequences]

UTRs - untranslated regions which flank the coding sequence in an mRNA (so within the transcribed region)

Where is the translation initiation site?
Where is the transcription initiation site? The promoter?

Eukaryotic (but not prokaryotic) genes usually contain introns

[Diagram: DNA (ATG … TAA) with “Exon 1”, Intron 1, Exon 2, Intron 2, “Exon 3”; mRNA with 5’ UTR, coding region (Exon 1, Exon 2, Exon 3) and 3’ UTR]

Intron - non-coding sequences removed from pre-RNA (by splicing)
Exon - sequences that remain in mature RNA (mostly coding)

Nomenclature “problem”:
• Textbooks (& papers) often show only coding sequences as exons, but the first exon includes the 5’ UTR and the last exon includes the 3’ UTR
• Dilemma because the positions of RNA ends are often not known, or there are tissue-specific differences
• Introns can also occur within UTR regions

Example of human pax6 gene (Mercer Nat Rev Genet 10: 155, 2009)
Lines: introns
Bars: exons
Tall bars: coding exons
Short bars: non-coding exons
What does the bent arrow signify?
Where would the initiation and stop codons be?

1. Human genes:
Intron length: typically ~200 nt to > 10 kb
Number per gene: several to dozens…
Exon length: typically 100 - 200 nt
Extreme example: dystrophin gene (~2400 kb) with ~78 introns!!
Tennyson, Klamut & Worton (1995) “The human dystrophin gene requires 16 hours to be transcribed and is cotranscriptionally spliced” Nat Genet. 9:184-90

Genes-within-genes!
Other genes are sometimes located within long introns! … in same or opposite orientation (see Practice set #1, question 4)

2. Plant genes:
Intron density similar to animals, but shorter length: typically 100 - 300 nt

3. Yeast genes:
< 5% have introns (vs. mammals, where >95% of genes have introns)
- mostly in tRNA genes (intron length ~ 20-30 nt)
- …and in ribosomal protein genes (intron length ~ 100-500 nt)

Structure of NF2 (neurofibromatosis type II) gene in various animals
What features of this gene are different among these animals?
Golovnina et al. BMC Evol Biol 2005

Bacterial genes are often organized in operons with short intergenic spacers
- polycistronic mRNA, but each gene has its own start and stop codons
[Diagram: Gene A, Gene B, Gene C]

But neighbouring operons might be in opposite orientation in the genome
[Diagram: Gene 2 and Gene 1]

5’…ATAGGACAT gatcgctctataggaggtgc ATGCAATGG…3’
3’…TATCCTGTA ctagcgagatatcctccacg TACGTTACC…5’

Aside: My examples will often show unrealistically short sequences

What are the N-terminal sequences of the proteins encoded by genes 1 and 2? See also Practice question #2
Where would the promoter(s) for genes 1 and 2 be located?

Genes located close together but encoded on opposite strands are sometimes also seen in eukaryotic genomes
bidirectional promoter?
Adachi & Lieber Cell 109: 807, 2002

RNA structure
Features of RNA vs. DNA

RNA synthesis
[Diagram: “coding strand” (5’ to 3’) and template strand]
mRNA has the same sequence as the coding strand (except U instead of T)
RNA is synthesized in the 5’ to 3’ direction with the antiparallel DNA strand as template
Fig.1.11
Alberts Fig.6.4

RNA content of a cell
Fig.1.12
small regulatory RNAs
- snRNAs (small nuclear) - role in splicing
- snoRNA (small nucleolar) - role in methylation of rRNAs
- miRNA (microRNAs) & siRNA (short interfering RNAs) - role in regulation of expression of individual genes
- small non-coding (nc) regulatory RNAs are also present in bacteria (sRNAs)

RNA processing in eukaryotes
- presence of long introns (& short exons) can make finding genes in eukaryotic DNA sequences difficult
- there may be alternative splicing pathways, so more than one protein can be generated from one gene (Discussed later, Chapter 6)
Fig.1.13

Link between transcriptome & proteome
Mediated by tRNAs (codon-anticodon)

Genetic code
“standard code” - can deduce amino acid sequence of protein from nt coding sequence … using genetic code table
Fig.1.2
See Practice question #1
Fig.1.20

PROTEIN-CODING GENES

DNA divided into triplets (codons)

5’ …. ATG GGA TTG CCC GCC …. 3’ “coding strand”
3’ …. TAC CCT AAC GGG CGG …. 5’ “template strand”

mRNA 5’ …. AUG GGA UUG CCC GCC …. 3’

- in research papers DNA is usually shown as single-stranded, with the coding strand in 5’ to 3’ orientation (left to right) … so the genetic code table can be used directly

Amino acid one-letter abbreviation often used instead of 3-letters
[Genetic code table labels: Initiation codon; Translation termination codons]

Remember that although AUG is the standard initiation codon, there can also be AUG triplets within an ORF, … specifying internal Met residues in the protein
And when analyzing DNA data obtained in the lab, the initiation codon might be located outside the sequenced region
Alberts Fig. 6-50

Examples of deviation from the standard genetic code in mitochondria and microbes
Table 1.3

PROTEIN SEQUENCE & STRUCTURE
Fig.13.24
Fig.1.17

Different proteins can be generated from a single precursor polypeptide through post-translational events
…so can have a larger proteome (set of proteins) than predicted from the number of genes in the genome

Cis-acting element: DNA (or RNA) sequences near a gene that are important for its expression
Latin word “cis” means “on the same side as”
Trans-acting factor: protein (or RNA) that binds to a cis-element to control gene expression
[Diagram: double-stranded DNA with ATG … TAA]
Cis-elements can actually be quite far away from the genes they control, in intergenic spacers (ENCODE project) and within introns
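
Worked example (not from the original slides): the coding strand / template strand / mRNA relationship under PROTEIN-CODING GENES, and the two-gene sequence shown with the oppositely oriented operons, can be checked with a short Python sketch. The function names (reverse_complement, transcribe, translate) are made up for illustration, the codon table is the standard genetic code only, and the sketch assumes that gene 1 is read left to right on the top strand and gene 2 on the bottom strand, as the diagram labels suggest.

```python
# Standard genetic code, built from the usual T, C, A, G ordering of the
# codon table ("*" marks a translation termination codon).
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    b1 + b2 + b3: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, b1 in enumerate(BASES)
    for j, b2 in enumerate(BASES)
    for k, b3 in enumerate(BASES)
}

COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(dna):
    """Return the opposite strand, written 5' to 3'."""
    return dna.upper().translate(COMPLEMENT)[::-1]


def transcribe(coding_strand):
    """mRNA has the same sequence as the coding strand, with U instead of T."""
    return coding_strand.upper().replace("T", "U")


def translate(coding_strand):
    """Translate a 5'-to-3' coding strand, starting at its first base.

    Stops at the first termination codon; a trailing incomplete codon
    is ignored. Returns one-letter amino acid abbreviations.
    """
    coding_strand = coding_strand.upper()
    protein = []
    for pos in range(0, len(coding_strand) - 2, 3):
        amino_acid = CODON_TABLE[coding_strand[pos:pos + 3]]
        if amino_acid == "*":
            break
        protein.append(amino_acid)
    return "".join(protein)


# Coding strand from the PROTEIN-CODING GENES example.
coding = "ATGGGATTGCCCGCC"
print(transcribe(coding))          # AUGGGAUUGCCCGCC
print(reverse_complement(coding))  # template strand, 5' to 3': GGCGGGCAATCCCAT
print(translate(coding))           # MGLPA

# Two-gene example: top strand as printed, spacer in lower case.
top_strand = "ATAGGACAT" + "gatcgctctataggaggtgc" + "ATGCAATGG"
gene1 = top_strand[-9:]                     # assumed: read on the top strand
gene2 = reverse_complement(top_strand[:9])  # assumed: read on the bottom strand
print(gene1, translate(gene1))
print(gene2, translate(gene2))
```

Because translate takes the coding strand in 5’ to 3’ orientation, a gene on the bottom strand is reverse-complemented first; this mirrors the convention of showing DNA as a single coding strand so that the genetic code table can be used directly.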