The Nature of a GENE; Component Parts
Susquehanna MAGNET School for Medicine and Health Sciences
October 7, 2013
Professor Michael Chorney

Learning Objectives
• Explain the nature of codons and the information inherent in their base composition
• Explain what is meant by degeneracy
• Describe the nature of a gene, its component parts and hallmarks
• Discuss the nature of exons and what is meant by an open reading frame; in addition, define what is meant by exon splicing

DNA Scale, the numbers, Review
• We have a lot of DNA in each gamete (3.1647 x 10^9 basepairs); cf. a bacterium, which contains about 4.6 million basepairs
• 200 phonebooks the size of Manhattan’s (1,000 pages) would be required to tally the information; if you read the bases nonstop, it would take 9.5 years to complete
• The fertilized ovum contains twice as much DNA as each gamete (6.4 x 10^9 basepairs); half maternal, half paternal
• The body contains 10-100 trillion cells (up to 10^14)
• A cell contains 12 picograms of DNA = 12 x 10^-12 g = 6 x 10^9 basepairs (G, A, T, C)
• The body contains 10^14 cells x 12 x 10^-12 g = 1,200 g of DNA = 0.25% wt = 6.4 x 10^23 basepairs

DNA Scale, the numbers.2, Review
• The DNA in each somatic cell is arranged into chromosomes, i.e., linear strands of DNA of varying lengths
• The DNA is condensed by proteins of opposite charge, called histones, which provides a means for regulating base (information) access by other proteins
• Condensed DNA, during mitosis, can be easily stained, revealing the chromosomes’ size and banding variation (which reflects the variation in A/T and C/G content)

Cytogenetics
Giemsa-stained metaphase spread of human chromosomes from one cell, the most condensed form of DNA within the cell, seen at MITOSIS

Vocabulary
metacentric, sub-metacentric, acro(telo)centric, centromere, p arm, q arm, banding, heterochromatin, euchromatin, telomeres, autosome

Each chromosome is a linear strand of helical ds-DNA with capped ends called telomeres

Information flow
Sense strand: A GENE TRANSCRIPT
Figure 6-2
Molecular Biology of the Cell (© Garland Science 2008)
Anti-sense strand: like copying the leading strand
Figure 6-21 Molecular Biology of the Cell (© Garland Science 2008)
RNA polymerase substitutes U for T (why?) in RNA and ribose for the deoxy sugar
Deaminated C = U
Figure 6-4 Molecular Biology of the Cell (© Garland Science 2008)

DNA, the Puzzle, Review
• Only a small percentage of human DNA contains information that is ostensibly converted into proteins: these sequences are associated with genes.
• The proteins coded for by genes do biochemical work: they regulate cell division, generate energy, respond to the environment, provide immunity to invasive DNA sequences (infection), etc.

What (and where) is this information we keep hearing so much about?
• For starters, it resides in the bases: particular triplet base combinations, called codons, comprise the exons and provide the information
• You should have been exposed to the list of codons last week in the case study (next slide)
• There are 64 codons: 61 specify the twenty amino acids (a.a.’s), with multiple codons existing for most of the a.a.’s (called degeneracy), and three are termination codons; more later

CODONS, what do you notice?
Figure 6-50 Molecular Biology of the Cell (© Garland Science 2008)

The question for molecular biologists: What distinguishes a gene (1-2% of DNA) from the remaining DNA (98%)?
This has posed a problem for some time; now that it is being solved, the question becomes, what does the ‘gene’ do?
Figure 4-7 Molecular Biology of the Cell (© Garland Science 2008)

Can you see, at a quick glance, a gene in the sequence at the left? The yellow highlighted bases signify the beta globin gene!!!

Genes are subject to the following:
1. They must be recognized by a polymerase, that is, an RNA polymerase that will guide gene copying, called TRANSCRIPTION (compare DNA polymerase)
2. The collective DNA sequence that summons forth RNA polymerase is called a PROMOTER
3. The information copied into RNA immediately adjacent to the promoter must be readable (CODING SEQUENCE); i.e., no stop codons until the naturally determined end of translation
4. There has to be a place after the coding sequence that signals the end of transcription, which is different from the end of translation

The eukaryotic gene’s general features and processing characteristics
Diagram: 5’ | p | exon (ATG) | AGGT…A…AGG | exon | AGGT…A…AGG | exon | AGGT…A…AGG | exon (STOP) | 3’UTR (AATAAA) | 3’
• The gene is controlled by a promoter (p), which is not simple: there are generalized transcription factors and more gene-specific ones that may reside outside of the promoter proper, within the gene, within the 3’ end of the gene, or even far 5’ and/or 3’ of the gene itself; they open the DNA and expose sites
• The gene is structured in ‘staccato,’ with coding sequence (exons) interrupted by noncoding intervening sequences, called introns; the first exon begins with the ATG met codon, and the last exon ends with one of three translational termination codons (TAA, TAG, TGA)
• Termination of transcription occurs in the 3’ untranslated region (3’UTR), which possesses termination signals and an RNA domain that drives 3’ processing, the AATAAA polyadenylation signal
• Exon-intron borders possess sequences that aid in splicing, AG/GT……A……AG/G, along with small nuclear RNAs forming the spliceosome
Diagram: 5’UTR | exon 1 | exon 2 | exon 3 | exon 4 | 3’UTR

CpG Islands: clusters of the otherwise under-represented CpG dinucleotide, found at the 5’ end of eukaryotic genes
Diagram: 5’ | p | exon (AUG) | AGGT…A…AGG | exon | AGGT…A…AGG | exon | AGGT…A…AGG | exon (STOP) | 3’UTR (AATAAA); CH3ase acting on [CG]
• Keeping DNA euchromatic also rests upon factors that bind to C’s and G’s, which protect the CpG ‘islands’ from cytosine methylases, best known for their role in imprinting

Let’s try a poor analogy, constrained by the English language and a dearth of three-letter words, but here goes….
Find the three-letter (codon)-containing ‘exons’ that make a kind of sensible phrase (names included). This is comparable to an open reading frame.

Word DNA
…..Wlsjeutlsjimsatouttutyecmdsisladksltkald
Thedayforeeeuslkeiandseveeubhismomand
ttugosocunntewherebudtedandtueislsiecn
Tisnggotallsixeooaltaxlekqzztiellforthebigbadsum
rrrrrrrrrrrrteidas………

Answer:
jimsatoutthedayforhismomandbudtedandgotallsixforthebigbadsum……….
jim sat out the day for his mom and bud ted and got all six for the big bad sum……….

What happens if I delete the s in ‘his’?
Jim sat out the day for him oma ndb udt eda ndg ota lls ixf ort heb igb ads um……….
FRAMESHIFT: the OPEN READING FRAME IS GONE

RNA
Figure 6-51 Molecular Biology of the Cell (© Garland Science 2008)

CODING SEQUENCE IS CONSERVED SEQUENCE ACROSS SPECIES
LEPTIN GENE ALIGNMENT
Figure 4-76 Molecular Biology of the Cell (© Garland Science 2008)

THERE IS GREATER EVOLUTIONARY PRESSURE TO CONSERVE CODING SEQUENCE (EXONS) THAN INTRON SEQUENCES
Figure 4-78 Molecular Biology of the Cell (© Garland Science 2008)

DNA, the puzzle.2
• Humans have approximately 23,000 genes (down from the 80-140k prediction)
• Genes are dispersed along the chromosomes in what appears to be a random fashion, although many gene clusters exist which seem to aid coordinate expression: globin, histone, immunoglobulin, MHC, etc.
• Some chromosomes are richer in genes than others, although chromosome size roughly correlates with gene number
• A gene’s location is termed its locus, as we have touched upon
• Genes vary in size from beginning to end, and in their number of exons, whose tally following splicing must equal an open reading frame, or ORF
• Exons vary in size but average about 200 basepairs (based on my knowledge of the Ig superfamily members); their translated sequences often equate to ‘domains,’ units of primary amino acid sequence that perform a function
• The average protein is 45 kDa (about 410 amino acids, taking 110 Da as the mw of an average amino acid); the average size of a spliced gene (mRNA) is 1.5 kb; therefore, the amount of coding sequence in the human genome is roughly 1% (23,000 genes x 1.5 kb out of 3.1647 x 10^9 basepairs)
http://www.cshlp.org/ghg5_all/section/gene.shtml

BIG GENES
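The coding-sequence estimate in ‘DNA, the puzzle.2’ is simple arithmetic, so it can be checked with a short sketch. The Python below is only a back-of-envelope illustration: the constants are the figures quoted in the slides, while the function and variable names are made up for this example. With these inputs the fraction comes out on the order of 1%, and it shifts with whatever gene count and transcript length one assumes.

```python
# Back-of-envelope check of the coding-sequence fraction.
# All constants are the figures quoted in the slides, not measured values;
# the names below are hypothetical and exist only for this illustration.
AVG_PROTEIN_DA = 45_000        # average protein mass, 45 kDa
AVG_RESIDUE_DA = 110           # mass of an average amino acid
AVG_MRNA_BP = 1_500            # average spliced gene (mRNA), 1.5 kb
GENE_COUNT = 23_000            # approximate number of human genes
HAPLOID_GENOME_BP = 3.1647e9   # basepairs per gamete


def residues_per_protein(protein_da=AVG_PROTEIN_DA, residue_da=AVG_RESIDUE_DA):
    """Average number of amino acids in a protein of the given mass."""
    return protein_da / residue_da


def orf_length_bp(residues):
    """Coding basepairs needed for that many residues (three bases per codon)."""
    return 3 * residues


def genome_fraction(bp_per_gene, gene_count=GENE_COUNT, genome_bp=HAPLOID_GENOME_BP):
    """Fraction of the haploid genome taken up by gene_count genes of this size."""
    return bp_per_gene * gene_count / genome_bp


if __name__ == "__main__":
    residues = residues_per_protein()
    orf_bp = orf_length_bp(residues)
    print(f"residues per average protein: {residues:.0f}")
    print(f"ORF length: {orf_bp:.0f} bp")
    print(f"fraction using ORF length: {genome_fraction(orf_bp):.2%}")
    print(f"fraction using 1.5 kb mRNA: {genome_fraction(AVG_MRNA_BP):.2%}")
```

The ORF-based figure is slightly lower than the mRNA-based one because a 1.5 kb transcript also carries untranslated sequence at its ends.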