This is only an outline; the meaning of many of the keywords it contains must be expanded by the student using other sources.

A brief introduction to DNA and proteins
The chemistry and biology of DNA and proteins
Greg Gloor, Ph.D., Professor of Biochemistry
[email protected], 661-3526

Overall concepts
- Modern biotechnology is built on the products of manipulated DNA
- Novel biotechnologies involve DNA directly or indirectly (e.g. cloning, barcoding, genetic disease, DNA computing)

DNA is an information medium
- All information that flows within the cell, the body and between generations is ultimately contained in DNA
- The role of DNA is to carry information:
  - within a cell
  - between generations of cells
  - between generations of organisms
- It carries all the information required to build a new organism, including making a faithful copy of its own information

Information content
The double-helix structure of DNA (deoxyribonucleic acid) was discovered in 1953 by James Watson and Francis Crick. A DNA molecule is built from four different nucleotide bases: guanine, thymine, cytosine and adenine (G, T, C and A, respectively). The bases bind in pairs via hydrogen bonds, and the pairs form a long string shaped as a double helix. The only pairs that occur are guanine opposite cytosine (G-C) and thymine opposite adenine (T-A), as sketched below:

    T A C C G T A G G T C A . . .
    | | | | | | | | | | | |
    A T G G C A T C C A G T . . .

The string of base pairs forms a coded message in which the bases are the characters of the "alphabet." If one base of a pair is known, then the other is also known. This property is used during cell division, when the helix unwinds and each half is copied. This copying can be considered information transfer, but errors in the code may also occur.
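The pairing rule described above (G opposite C, T opposite A) can be written as a small lookup. The following Python sketch is illustrative only: the names `PAIRS`, `complement` and `count_mismatches` are hypothetical and not part of the original slides. It rebuilds the lower strand of the sketch from the upper one and counts positions where a copy breaks the rule.

```python
# Watson-Crick pairing rule from the text: G pairs with C, T pairs with A.
PAIRS = {"A": "T", "T": "A", "G": "C", "C": "G"}


def complement(strand: str) -> str:
    """Return the partner strand, base by base, in the same left-to-right order."""
    return "".join(PAIRS[base] for base in strand)


def count_mismatches(top: str, bottom: str) -> int:
    """Count positions where the two strands do not obey the pairing rule."""
    return sum(1 for a, b in zip(top, bottom) if PAIRS[a] != b)


top = "TACCGTAGGTCA"          # upper strand from the sketch
bottom = complement(top)      # lower strand implied by the pairing rule
print(bottom)                 # ATGGCATCCAGT

# A single copying error: the first base of the copy is wrong.
faulty_copy = "T" + bottom[1:]
print(count_mismatches(top, faulty_copy))  # 1
```

Because each base determines its partner, knowing one strand is enough to reconstruct the other; this is the redundancy that makes copying, and error detection, possible.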
If we consider a long string of, say, 100,000 bases, then the first "letter" may be G, T, C or A, that is, one of four possibilities. For all 100,000 characters we then have $4 \times 4 \times \dots \times 4 = 4^{100{,}000} = 2^{200{,}000}$ possible strings. If all strings are equally probable, then the probability of finding a specific string is $p = 2^{-200{,}000}$. By Shannon's formula, the information content of the code described by this molecule is therefore $I = -\log_2 p = -\log_2 2^{-200{,}000} = 200{,}000$ bits. A DNA molecule of 100,000 base pairs has a length of approximately 500,000 Å and is 20 Å thick (1 Å = $10^{-10}$ m), which is impressive compared to the amount of space required to store a code of 200,000 bits in a computer. A chromosome that contains on the order of $5 \times 10^{9}$ nucleotides may code for $10 \times 10^{9}$ bits. For the 23 chromosome pairs in the human genome, this would mean on the order of $5 \times 10^{11}$ bits (equivalent to about 60 gigabytes).
From: http://www.mihandbook.stanford.edu/handbook/home.htm

Information Flow
DNA -> RNA -> Protein

Information Flow - major processes

DNA is in chromosomes
- Base pairs are the bits/letters
- Genes are information units

Chromosomes and DNA
- Every cell of our body contains a copy of our complete genetic blueprint, made up of genes (also called the genetic heritage or genome)
- The genetic blueprint is divided, distributed and packed into the cells in the form of chromosomes, 46 in number, each made up of partial strings of those genes
- Both chromosomes and genes are made of DNA
- Chromosomes and genes are contained in the cell nucleus (not in red blood cells, which have no nucleus and therefore no genes or chromosomes) and, to a small extent, also in the mitochondria, which have an energy-related role in the cell
- Chromosomes contain a variable number of genes: some contain several thousand genes, others perhaps one or two thousand
- The criterion by which genes are distributed among the chromosomes is not yet fully understood
- Genes are sequences of three-character words drawn from the alphabet A, T, G, C
- A site that presents an elementary summary of basic genetics concepts is: http://www.genetics.com.au/pdf/factSheets/FS1.pdf

DNA is in chromosomes - chromosomes package DNA
- Chromosomes ensure even division of copies of DNA between daughter cells
- Packaging instructions are contained in centric heterochromatin
- Genes are contained in euchromatin

Information content of DNA
- The sequence of the bases along the DNA structure contains the information
- DNA has polarity and is always read 5’ to 3’. Thus 5’-GATA is different from 3’-GATA
- All information in DNA is redundant (AT bp and GC bp)

Information content of DNA - read in the lab by sequencing
- Bases are differentially labeled (here by a fluorescent dye)
- Each peak represents a different base along the DNA sequence
- As each base has one of 4 different fluorophores, the unique sequence of the DNA is determined

Information content of DNA - DNA replication
- DNA replication makes a copy of each original strand
- Base-pairing controls the specificity
- DNA polymerase adds the new DNA
- DNA helicase unwinds the DNA
- Primase makes the RNA primer
- DNA topoisomerase untwists the DNA
Copyright 1999 Access Excellence @ the National Health Museum http://www.accessexcellence.org

Information content of DNA - changes over time
- About 1 base pair is changed per genome per replication cycle
- $1.5 \times 10^{9}$ bases per genome
- $1 \times 10^{13}$ cells per body
- Reaching $10^{13}$ cells from a single cell takes about $\log_2 10^{13} \approx 43$ rounds of division, so each cell in the adult has ~43 random base differences from the original genome

Information content of DNA - two types of information
- DNA information can be cis-acting (intrinsic to the molecule) or trans-acting (extrinsic to the molecule)
- Cis-acting sequences control how the DNA molecule interacts with the cellular environment
- Cis-acting sequences have specific names (centromere, telomere)
- Trans-acting sequences ultimately produce the intracellular environment (proteins, carbohydrates, small molecules, etc.)
- Trans-acting sequences are usually called genes

Information content of DNA - cis-information
- Controls the behavior of the chromosome
- Centric heterochromatin controls how the cell divides chromosomes during cell division
- Telomeric heterochromatin caps the ends of the chromosome (which is a linear molecule of DNA)

Information content of DNA - is transcribed into RNA
- RNA is a copy of the information from one of the DNA strands
- The process is called transcription
- Primary point of control of gross levels of gene products
- Base-pairing controls the specificity
- RNA is unstable and transient compared to DNA
- RNA can be used directly or be an intermediate for protein production
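Transcription as described above, an RNA copy of one DNA strand made by base pairing, can be sketched in a few lines of Python. This is a minimal illustration rather than a model of the real enzyme. The names `RNA_PAIRS` and `transcribe` are hypothetical, and the sketch relies on one fact the slides do not state: RNA uses uracil (U) where DNA uses thymine (T).

```python
# Pairing of each DNA template base with the RNA base placed opposite it.
# Assumption not stated on the slides: RNA contains uracil (U) instead of thymine (T).
RNA_PAIRS = {"A": "U", "T": "A", "G": "C", "C": "G"}


def transcribe(template_3_to_5: str) -> str:
    """Build the RNA copy of a DNA template strand.

    The template is given 3'->5' (left to right), so the RNA that pairs with it
    comes out written 5'->3', the direction in which it is read.
    """
    return "".join(RNA_PAIRS[base] for base in template_3_to_5)


template = "TACCGTAGGTCA"      # upper strand of the earlier base-pair sketch
rna = transcribe(template)
print(rna)                     # AUGGCAUCCAGU
```

Apart from U replacing T, the RNA has the same sequence as the DNA strand that was not used as the template, which is why RNA is described as a copy of the information in one strand.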
Information content of DNA - a digression into proteins
- Proteins perform most of the actual work in the cell (structure, enzymes, etc.)
- Proteins are linear polymers of amino acids
- There are 20 naturally occurring amino acids commonly found in proteins
- The sequence of a protein corresponds to the sequence of the DNA/RNA that encodes it
- Proteins fold spontaneously into the proper 3-D structure

Information content of DNA - is usually translated into protein via the RNA intermediate
- Information in RNA is contained in triplets of bases (codons)
- Codons are non-overlapping and abutting
- Each codon contains the information required to include one and only one amino acid at a position in a protein
- The genetic code is universal (except for a few small differences in some very weird organisms)
- Knowing the DNA or RNA sequence of a chromosome then allows us to determine the sequences of all the proteins that could be made from the information on that chromosome

Terms
An enzyme is a biological catalyst, almost always composed of one or more proteins (although RNA is also capable of being an enzyme).
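The codon rules listed above (non-overlapping, abutting triplets, one amino acid per codon) translate directly into a lookup over successive groups of three bases. The Python sketch below is illustrative only. It builds the standard genetic code table compactly from the base order U, C, A, G, uses one-letter amino acid codes with `*` for stop codons, and the name `translate` is hypothetical.

```python
from itertools import product

# Standard genetic code, listed in codon order UUU, UUC, UUA, UUG, UCU, ...
# One-letter amino acid codes; "*" marks a stop codon.
BASES = "UCAG"
AMINO_ACIDS = (
    "FFLLSSSSYY**CC*W"   # codons starting with U
    "LLLLPPPPHHQQRRRR"   # codons starting with C
    "IIIMTTTTNNKKSSRR"   # codons starting with A
    "VVVVAAAADDEEGGGG"   # codons starting with G
)
CODON_TABLE = {
    "".join(codon): aa for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(rna: str) -> str:
    """Read abutting, non-overlapping codons and stop at the first stop codon."""
    protein = []
    for i in range(0, len(rna) - len(rna) % 3, 3):
        amino_acid = CODON_TABLE[rna[i:i + 3]]
        if amino_acid == "*":
            break
        protein.append(amino_acid)
    return "".join(protein)


print(translate("AUGGCAUCCAGU"))  # MASS (Met-Ala-Ser-Ser)
```

Since the table has 64 codons for 20 amino acids plus stop signals, several codons map to the same amino acid, but each codon still specifies exactly one.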
Chemical Reaction
- Converts one or more reactants (H2O2) into one or more products (H2O, O2)
- Requires energy to start the reaction
- Products have less energy than reactants
- Proceeds through a transition state, where the chemical bonds in the reactants are stretched to their limit without being broken
Source: http://explanation-guide.info/meaning/Activation-energy.html

Catalyst
- Reduces the activation energy
- Therefore the reaction proceeds more rapidly, with less input energy
- The catalyst is not consumed
- One catalyst can thus enable multiple instances of the same reaction
Source: http://explanation-guide.info/meaning/Activation-energy.html

Enzyme
- An enzyme is a biological catalyst
- Generally enzymes are proteins, but sometimes they are composed of RNA, or of RNA and protein
- Only a small portion of the enzyme (shown as balls or sticks) is involved in catalysis of the reactant (shown in red)
- The remainder of the protein (dots) holds the catalytic residues in the proper 3-D place
- Catalase converts hydrogen peroxide to water and oxygen

Review of Information Flow

Four levels of protein structure
- Primary - sequence
- Secondary - initial fold
- Tertiary - 3-D folding
- Quaternary - interactions between proteins

The peptide bond
- Peptide backbones are free to rotate around the bonds shown in blue
- This rotation is constrained by the R groups (the 20 possible amino acid side chains)
- Each peptide bond can assume one of 8 possible conformations (on average). A chain of n residues has n-1 peptide bonds, so the number of possible protein conformations is $8^{n-1}$, where n is the number of residues in the protein

Secondary structure
- Alpha helices and beta sheets predominate in secondary structures
- They are composed of peptide bonds that repeat their bond angles for several residues
- Ionic, hydrogen, van der Waals and hydrophobic interactions all help to stabilize these structures

Tertiary structure
- Alpha helices and beta sheets are the predominant elements of tertiary structures
- The protein assumes its characteristic 3-D shape by the secondary structural elements folding on themselves
- Ionic, hydrogen, van der Waals and hydrophobic interactions all help to stabilize these structures

Quaternary structure
- Proteins fit together once their secondary and tertiary structural elements have folded on themselves
- Ionic, hydrogen, van der Waals and hydrophobic interactions all help to stabilize these structures

What governs activity?

Chemical Equilibrium of a DNA binding protein
Kd = 1E-10 M

$$K_d = \frac{[\mathrm{Protein}]\,[\mathrm{Ligand}]}{[\mathrm{Protein{:}Ligand}]}$$

- Concentrations are in moles/liter (M)
- Example bacterial cell volume = 1.5E-15 L
- 1 molecule of DNA = 1/6.02E23 moles = 1.66E-24 moles
- 1 molecule per cell = 1.66E-24 moles / 1.5E-15 litres = 1.1E-9 M
- 5 molecules protein / cell = 5.5E-9 M
- 98% of the DNA sites are occupied under this condition

$$\frac{\text{Bound Ligand}}{\text{Total Ligand}} = \frac{[\mathrm{Protein}]}{K_d + [\mathrm{Protein}]} = \frac{5.5 \times 10^{-9}}{1 \times 10^{-10} + 5.5 \times 10^{-9}} = 0.982$$

(thanks to Dr. Brian Shilton)

This means …
- Reactions in the cell that depend on proteins are almost never fully on or fully off
- Most operations in the cell run as an analogue machine in intermediate stages of activation

Synthesis, Folding, Degradation
- Synthesis rates account for the relative amount of protein made at any one time
- Synthesis depends upon the amount of mRNA made from the gene and upon the rate at which that mRNA is translated
- Most proteins assume their final structure with the help of chaperones (proteins that aid folding)
- Chaperones increase the speed of folding and alert degradation pathways to proteins that cannot fold
- Pathways for degradation include lysosomes and the ubiquitin-dependent pathway
- Proteins are degraded to peptides and amino acids extracellularly (digestive tract, etc.) as well
- Some modifications, such as acetylation or glycosylation, increase the protein’s life span
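The binding-equilibrium arithmetic from the "Chemical Equilibrium of a DNA binding protein" slide can be checked with a short Python sketch. It is a sketch under stated assumptions, not the author's own calculation: the function names are hypothetical, Avogadro's number is taken as 6.022E23, and, as in the slide's arithmetic, the free protein concentration is approximated by the total protein concentration.

```python
AVOGADRO = 6.022e23  # molecules per mole


def molar_concentration(molecules: float, volume_litres: float) -> float:
    """Convert a molecule count in a given volume to mol/L."""
    return molecules / AVOGADRO / volume_litres


def fraction_bound(protein_molar: float, kd_molar: float) -> float:
    """Fraction of ligand (here DNA sites) bound: [P] / (Kd + [P]).

    Assumes free protein is roughly equal to total protein.
    """
    return protein_molar / (kd_molar + protein_molar)


cell_volume = 1.5e-15   # litres, example bacterial cell from the slide
kd = 1e-10              # mol/L, dissociation constant from the slide

one_molecule = molar_concentration(1, cell_volume)   # about 1.1e-9 M
protein = molar_concentration(5, cell_volume)        # about 5.5e-9 M
occupancy = fraction_bound(protein, kd)

print(f"1 molecule per cell  = {one_molecule:.2e} M")
print(f"5 molecules per cell = {protein:.2e} M")
print(f"fraction of DNA sites occupied = {occupancy:.3f}")

# Occupancy changes smoothly with protein copy number instead of switching on/off.
for copies in (1, 2, 5, 10, 50):
    print(copies, round(fraction_bound(molar_concentration(copies, cell_volume), kd), 3))
```

The loop at the end illustrates the slide's conclusion: occupancy rises gradually as the copy number grows and approaches, but does not reach, 100%.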