The Nature of a GENE; Genomic/Evolutionary Context
Gene Duplication
Susquehanna MAGNET School for Medicine and Health Sciences
October 21, 2013
Professor Michael Chorney

Learning Objectives
Explain the nature of unequal crossing-over and its consequences for gene family expansion.
Explain the ramifications of copying genes, both positive and negative.
Describe the nature of a pseudogene and the process by which it deteriorates.
Discuss the rate of base change in the genome, considering such concepts as purifying selection, synonymous versus nonsynonymous substitution (dS, dN), neutrality, etc. Formulate some personal thoughts, as there is no right or wrong here.
Relay what is meant by the THEORY of evolution.

Consider a gene, which is functional…
gene
It has a promoter, generates an open reading frame after splicing (with maintained consensus sequences for splicing, of course), has 5’ and 3’ untranslated exons, has a polyadenylation site at its 3’ end, etc.

Consider a gene, which is functional… but which succumbs to base changes. These result from a variety of enzymatic and chemical processes that may be consistent over millions of years (polymerase error, oxidation, radiation, etc.).
Some of the mutations are silent and can be tolerated; some are missense but conservative and tolerated; some are missense but nonconservative; some are nonsense; and some shift the reading frame. These would appear to be under Darwinian selection, in particular purifying selection, which serves as a gatekeeper by eliminating harmful mutations within important DNA sequences.

Some view the bulk of mutations and polymorphisms as generally neutral, in that one allele is as good, or as bad, as the next: neutrality (check it out).
Somewhere in between may be a more realistic view of the nature of genomic change.
Within critical genes, conservation is maintained…
Within unimportant sequences, can a truer rate of change be ascertained?
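The mutation categories listed above (silent, missense, nonsense, frameshift) can be made concrete with a short Python sketch. This is only an illustration, not part of the original lecture: the function names `translate` and `classify_substitution` and the toy sequence are hypothetical, the codon table is the standard genetic code, and the sketch does not try to distinguish conservative from nonconservative missense changes.

```python
from itertools import product

BASES = "TCAG"
# Standard genetic code; amino acids are listed in TCAG order for each
# codon position, with '*' marking the three termination codons.
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)}


def translate(dna):
    """Translate codon by codon from position 0, stopping at the first stop codon."""
    protein = []
    for i in range(0, len(dna) - 2, 3):
        aa = CODON_TABLE[dna[i:i + 3]]
        if aa == "*":
            break
        protein.append(aa)
    return "".join(protein)


def classify_substitution(dna, pos, new_base):
    """Classify a single-base substitution at index pos (0-based, in frame 0)."""
    start = pos - pos % 3                  # first base of the affected codon
    old_codon = dna[start:start + 3]
    offset = pos - start
    new_codon = old_codon[:offset] + new_base + old_codon[offset + 1:]
    old_aa, new_aa = CODON_TABLE[old_codon], CODON_TABLE[new_codon]
    if new_aa == old_aa:
        return "silent"                    # synonymous change
    if new_aa == "*":
        return "nonsense"                  # premature stop codon
    return "missense"                      # different amino acid


# Hypothetical toy gene: ATG GAA AAA TGG TAA
gene = "ATGGAAAAATGGTAA"
print(translate(gene))                         # MEKW
print(classify_substitution(gene, 5, "G"))     # silent   (GAA -> GAG, both Glu)
print(classify_substitution(gene, 3, "A"))     # missense (GAA -> AAA, Glu -> Lys)
print(classify_substitution(gene, 6, "T"))     # nonsense (AAA -> TAA)
print(translate(gene[:4] + gene[5:]))          # MENG: a one-base deletion shifts the frame
```

Third-position changes are the ones most often silent, because most amino acids are encoded by several codons that differ only at that position.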
A base change rate is called the molecular or evolutionary clock. It can tell the relative distance separating two organisms, but not an exact time; this is where the fossil record is important.
The best sort of clock may be based on the frequency of third base position changes, called the synonymous substitution rate…
It is my intention to have you think about DNA changes and the composition of the genome: it is ever changing, apparently slowly from our perspective of time.
One instance where evolution can speed up is through gene duplication, via UNEQUAL CROSSING-OVER.

[Diagram] Meiosis, 2X to 1X: homologues yield gamete 1 and gamete 2 or, via crossing-over, gamete 3 and gamete 4 (recombinants).

Let us magnify the chromosome and look at two homologous chromosomes containing alleles.
[Diagram labels: lost; selected; gamete progenitor; gamete 1; gamete 2]
One gene: under purifying selection.
The other gene: free to succumb to an accelerated rate of mutation, the first step being loss of CpG-binding (protective) proteins.
Result: creation of a new niche for the mutated ‘paralog’ (within a species; ‘homolog’ between species).

Gene conversion: DNA polymerase switches templates while copying DNA during gametogenesis, based on gene similarity. The sequence incorporated is small; the donor template is unaltered and the recipient becomes a composite; intrachromosomal.

Consider olfaction genes (400 functional genes, 600 pseudogenes), the transplantation antigens (Major Histocompatibility Complex = Human Leukocyte Antigen), and the globin genes.
See: http://en.wikipedia.org/wiki/List_of_gene_families

Cytogenetics
Giemsa-stained metaphase spread of human chromosomes from one cell, the most condensed form of DNA within the cell, seen at MITOSIS.
Vocabulary: metacentric, sub-metacentric, acro(telo)centric, centromere, p arm, q arm, banding, heterochromatin, euchromatin, telomeres, autosome
Each chromosome is a linear strand of helical ds-DNA with capped ends called telomeres.

Information flow
Sense strand
A GENE TRANSCRIPT
Figure 6-2 Molecular Biology of the Cell (© Garland Science 2008)
Anti-sense strand
Like copying the leading strand
Figure 6-21 Molecular Biology of the Cell (© Garland Science 2008)
RNA polymerase uses U in place of T (why?) in RNA, and ribose in place of the deoxy sugar.
Deaminated C = U
Figure 6-4 Molecular Biology of the Cell (© Garland Science 2008)

DNA, the Puzzle, Review
Only a small percentage of human DNA contains information that is ostensibly converted into proteins: these sequences are associated with genes.
The proteins coded for by genes do biochemical work: they regulate cell division, generate energy, respond to the environment, provide immunity to invasive DNA sequences (infection), etc.
What (and where) is this information we keep hearing so much about?
For starters, it resides in the bases: particular triplet base combinations, called codons, which comprise the exons and provide the information.
You should have been exposed to the list of codons last week in the case study (next slide).
There are 64 codons. Sixty-one of them specify the twenty amino acids (a.a.’s), with multiple codons existing for most of the a.a.’s, a property called degeneracy. The remaining three are termination codons; more later.

CODONS, what do you notice?
Figure 6-50 Molecular Biology of the Cell (© Garland Science 2008)

The question for molecular biologists: what distinguishes a gene (1-2% of DNA) from the remaining DNA (98%)? This has posed a problem for some time; now that it is being solved, the question becomes: what does the ‘gene’ do?
Figure 4-7 Molecular Biology of the Cell (© Garland Science 2008)
Can you see, at a quick glance, a gene in the sequence at the left? The yellow highlighted bases signify the beta globin gene!!!

Genes are subject to the following:
1. They must be recognized by a polymerase, that is, an RNA polymerase that will guide the gene copying called TRANSCRIPTION (compare DNA polymerase).
2. The collective DNA sequence that summons forth RNA polymerase is called a PROMOTER.
3. The information copied into RNA immediately adjacent to the promoter must be readable (CODING SEQUENCE); i.e., no stop codons until the naturally determined end of translation.
4. There has to be a place after the coding sequence that signals the end of transcription, which is different from the end of translation.

The eukaryotic gene’s general features and processing characteristics
[Diagram] 5’ p exon AGGT…A…AGG exon AGGT…A…AGG exon AGGT…A…AGG exon AATAAA 3’UTR 3’ (labels: ATG, STOP)
The gene is controlled by a promoter (p), which is not simple: there are generalized transcription factors and more gene-specific ones, which may bind outside of the promoter proper, within the gene, within the 3’ end of the gene, or even far 5’ and/or 3’ of the gene itself. They open the DNA and expose sites.
The gene is structured in ‘staccato,’ with coding sequence (exons) interrupted by noncoding intervening sequences, called introns; the first exon begins with the ATG met codon, and the last exon ends with one of three translational termination codons (TAA, TAG, TGA).
Termination of transcription occurs in the 3’ untranslated region (3’UTR), which possesses termination signals and an RNA domain that drives 3’ processing, the AATAAA polyadenylation signal.
Exon-intron borders possess sequences that aid in splicing, AG/GT……A……AG/G, along with small nuclear RNAs forming the spliceosome.
[Diagram] 5’UTR exon 1 exon 2 exon 3 exon 4 3’UTR

CpG Islands: clusters of the otherwise under-represented CpG dinucleotide, found at the 5’ end of eukaryotic genes
[Diagram] 5’ p exon AGGT…A…AGG exon AGGT…A…AGG exon AGGT…A…AGG exon 3’UTR (labels: AUG, STOP, AATAAA, CH3ase, [CG])
Keeping DNA euchromatic also rests upon factors that bind to C’s and G’s, which protect the CpG ‘islands’ from cytosine methylases, best known for their role in imprinting.

Let’s try a poor analogy, constrained by the English language and a dearth of three-letter words, but here goes…
Find the three-letter (codon)-containing ‘exons’ that make a kind of sensible phrase (names included). This is comparable to an open reading frame.
Word DNA:
…..Wlsjeutlsjimsatouttutyecmdsisladksltkald
Thedayforeeeuslkeiandseveeubhismomand
ttugosocunntewherebudtedandtueislsiecn
Tisnggotallsixeooaltaxlekqzztiellforthebigbadsum
rrrrrrrrrrrrteidas………
Answer: jimsatoutthedayforhismomandbudtedandgotallsixforthebigbadsum……….
jim sat out the day for his mom and bud ted and got all six for the big bad sum……….

What happens if I delete the s in ‘his’?
Jim sat out the day for him oma ndb udt eda ndg ota lls ixf ort heb igb ads um……….
FRAMESHIFT: the OPEN READING FRAME IS GONE

RNA
Figure 6-51 Molecular Biology of the Cell (© Garland Science 2008)

CODING SEQUENCE IS CONSERVED SEQUENCE ACROSS SPECIES
LEPTIN GENE ALIGNMENT
Figure 4-76 Molecular Biology of the Cell (© Garland Science 2008)

THERE IS GREATER EVOLUTIONARY PRESSURE TO CONSERVE CODING SEQUENCE (EXONS) THAN INTRON SEQUENCES
Figure 4-78 Molecular Biology of the Cell (© Garland Science 2008)

DNA, the puzzle.2
Humans have approximately 23,000 genes (down from the 80-140k prediction).
Genes are dispersed along the chromosomes in what appears to be a random fashion, although many gene clusters exist which seem to aid coordinate expression: globin, histone, immunoglobulin, MHC, etc.
Some chromosomes are richer in genes than others, although chromosome size roughly correlates with gene number.
A gene’s location is termed its locus, as we have touched upon.
Genes vary in size, from beginning to end, and in their number of exons, whose tally following splicing must equal an open reading frame, or ORF.
Exon size varies, but averages about 200 base pairs (based on my knowledge of the Ig superfamily members); their translated sequences often equate to ‘domains,’ units of primary amino acid sequence that perform a function.
The average protein is 45 kDa (taking 110 Da as the molecular weight of an average amino acid); the average size of a spliced gene (mRNA) is 1.5 kb; therefore, the amount of coding sequence in the human genome is roughly 1% (23,000 genes × 1.5 kb ≈ 35 Mb of a roughly 3,200 Mb genome).
http://www.cshlp.org/ghg5_all/section/gene.shtm

BIG GENES
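The coding-sequence estimate above can be checked with a few lines of arithmetic. The sketch below uses the slide's own figures (45 kDa average protein, 110 Da per residue, 23,000 genes, 1.5 kb average spliced mRNA). The haploid genome size of about 3.2 × 10^9 bp is an assumption supplied here, not a number from the slides, and the function names are hypothetical.

```python
# Figures taken from the slides
AVG_PROTEIN_DA = 45_000      # average protein mass, daltons
AVG_RESIDUE_DA = 110         # average amino acid residue mass, daltons
GENE_COUNT = 23_000          # approximate number of human genes
AVG_MRNA_BP = 1_500          # average spliced mRNA length, base pairs

# Assumption (not from the slides): haploid human genome size in base pairs
GENOME_BP = 3.2e9


def orf_length_bp(protein_da, residue_da):
    """Open reading frame length implied by a protein mass (3 bases per residue)."""
    return protein_da / residue_da * 3


def coding_fraction(gene_count, avg_bp, genome_bp):
    """Fraction of the genome occupied by gene_count sequences of avg_bp each."""
    return gene_count * avg_bp / genome_bp


print(f"ORF length: {orf_length_bp(AVG_PROTEIN_DA, AVG_RESIDUE_DA):.0f} bp")
print(f"Coding fraction: {coding_fraction(GENE_COUNT, AVG_MRNA_BP, GENOME_BP):.2%}")
```

Under these inputs the average open reading frame is a little over 1.2 kb, which fits inside a 1.5 kb mRNA once untranslated regions are included, and the total comes out near 1% of the genome, in line with the 1-2% figure quoted earlier in the deck.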