* Your assessment is very important for improving the workof artificial intelligence, which forms the content of this project
Download Notions of Biochemistry and Molecular Biology Manipulating DNA
Holliday junction wikipedia , lookup
Genome evolution wikipedia , lookup
Mitochondrial DNA wikipedia , lookup
Comparative genomic hybridization wikipedia , lookup
Human genome wikipedia , lookup
Nutriepigenomics wikipedia , lookup
Genetic engineering wikipedia , lookup
DNA profiling wikipedia , lookup
DNA sequencing wikipedia , lookup
Site-specific recombinase technology wikipedia , lookup
Metagenomics wikipedia , lookup
No-SCAR (Scarless Cas9 Assisted Recombineering) Genome Editing wikipedia , lookup
Point mutation wikipedia , lookup
Designer baby wikipedia , lookup
Cancer epigenetics wikipedia , lookup
DNA polymerase wikipedia , lookup
SNP genotyping wikipedia , lookup
DNA damage theory of aging wikipedia , lookup
Primary transcript wikipedia , lookup
DNA vaccination wikipedia , lookup
United Kingdom National DNA Database wikipedia , lookup
Genealogical DNA test wikipedia , lookup
Genomic library wikipedia , lookup
Vectors in gene therapy wikipedia , lookup
Bisulfite sequencing wikipedia , lookup
Microsatellite wikipedia , lookup
Therapeutic gene modulation wikipedia , lookup
Epigenomics wikipedia , lookup
Molecular cloning wikipedia , lookup
Microevolution wikipedia , lookup
Cell-free fetal DNA wikipedia , lookup
Non-coding DNA wikipedia , lookup
Gel electrophoresis of nucleic acids wikipedia , lookup
Nucleic acid analogue wikipedia , lookup
Cre-Lox recombination wikipedia , lookup
DNA supercoil wikipedia , lookup
Nucleic acid double helix wikipedia , lookup
Extrachromosomal DNA wikipedia , lookup
Helitron (biology) wikipedia , lookup
Artificial gene synthesis wikipedia , lookup
Special course in Computer Science: Molecular Computing Lecture 2-3: Notions of Biochemistry and Molecular Biology, manipulating DNA Vladimir Rogojin Department of IT, Åbo Akademi http://combio.abo.fi/teaching/special Fall 2013 06/11/2013 http://combio.abo.fi/teaching/special DNA ➢ DNA ➢ ➢ Base element of Molecular Computing Importance of DNA ➢ ➢ ➢ 06/11/2013 Responsible for inheritance: encodes all genetic information of an organism Responsible for life processes: encodes all “instructions” needed for the functioning of the organism The most fundamental mechanisms of DNA manipulation are the same in all living organisms http://combio.abo.fi/teaching/special wikipedia Today we will learn ➢ Biology of organic compounds and their functioning: ➢ ➢ ➢ ➢ Structure of DNA, RNA, proteins Central dogma: how instructions in DNA are “executed” Genes and chromosomes ➢ Sequencing, ➢ Testing and filtering ➢ Separating and fusing, ➢ Shortening/lengthening, ➢ Cutting/linking ➢ Cloning, ➢ Synthesizing Wet-lab and cellular operations with DNA: 06/11/2013 http://combio.abo.fi/teaching/special Genetics Children resemble their parents: Although they may look more like one than the other, most of them are a blend of the characteristics of both parents Useful traits may be accentuated by controlled mating: Speed in horses, strength in oxes, larger fruits, etc. were studied in centuries of breeding of domestic animals and plants 06/11/2013 Question: How are traits inherited from one generation to the other? Story begins with an Augustinian monk named Gregor Mendel, 1865 He proposed a “theoretical” notion of gene responsible for inheritance Proposed the three laws of inheritance Could not explain what genes are or where they physically located http://combio.abo.fi/teaching/special Chromosomes carry genes T.H.Morgan, 1904: studies on the physical basis of heredity: Organism of choice: fruit fly Established that genes are physically located on chromosomes E.g., the gene for eye color is fruit flies is located on the X chromosome 06/11/2013 http://combio.abo.fi/teaching/special Chromosomes carry genes Other early examples (for humans): Sex-linked genetic disorders: muscular dystrophy (1913), red-green color blindness (1914), hemophilia (1916) Disorderss related to recessive inheritance: alkaptonuria (1902), albinism (1903) Disorders related to dominant inheritance: brachydactyly (short fingers, 1905), congenital cataracts (1906), Huntington’s disease (1913) Eye color inheritance (brown dominant, blue recessive) – 1907 Modern view: more genes are involved 1941: one gene makes one protein 06/11/2013 http://combio.abo.fi/teaching/special DNA is the genetic material ➢ ➢ ➢ Question: Genes are somewhere on the chromosomes, but where are they really placed ? Chromosomes are complex structures consisting of DNA and many kind of proteins DNA was discovered about the same time Mendel and Darwin published their work: 06/11/2013 ➢ ➢ ➢ DNA was known to be a large molecule, but it seemed likely that its four chemical components were assembled in a monotonous pattern The hypothesis was that genes were made of chromosomal proteins and DNA was a structural polymer that was holding the chromosome together 1941-1952: DNA is indeed the genetic material http://combio.abo.fi/teaching/special DNA is the genetic material ➢ ➢ ➢ Question: Genes are somewhere on the chromosomes, but where are they really placed ? Chromosomes are complex structures consisting of DNA and many kind of proteins DNA was discovered about the same time Mendel and Darwin published their work: 06/11/2013 ➢ ➢ ➢ DNA was known to be a large molecule, but it seemed likely that its four chemical components were assembled in a monotonous pattern The hypothesis was that genes were made of chromosomal proteins and DNA was a structural polymer that was holding the chromosome together 1941-1952: DNA is indeed the genetic material http://combio.abo.fi/teaching/special Proving that DNA is the genetic material ➢ 1928 - the Pneumococcus bacterium causes pneumonia in mice ➢ ➢ 06/11/2013 Some strains of Pneumococcus are highly virulent (S-bacteria) and some are non-virulent (Rbacteria) ➢ ➢ Injected in mice, the coat protects the bacteria from attack by white blood cells, they multiply, infect, and eventually kill the mouse The presence of the coat is an inherited genetic trait A virulent bacteria has a polysaccharide coat, while the non-virulent one lacks the coat http://combio.abo.fi/teaching/special Proving that DNA is the genetic material ➢ Experiments in 1944 show that genes are made of DNA ➢ Living S-bacteria injected in mouse: mouse dies ➢ Living R-bacteria: mouse lives ➢ Heat-killed S-bacteria: mouse lives ➢ Heat-killed S-bacteria + living R-bacteria: mouse dies !!! ➢ ➢ 06/11/2013 Living S-bacteria isolated in the blood of the dead mouse These bacteria were fully functional, they could replicate and infect other mice http://combio.abo.fi/teaching/special The transforming principle ➢ ➢ ➢ Conclusion: some chemical process called transforming principle was responsible for conferring new genetic trait on non-virulent bacteria Further experiments: that chemical turned out to be DNA The concept of DNA as genetic material received confirmation in another series of experiments in 1952 06/11/2013 http://combio.abo.fi/teaching/special Structure of DNA DNA is composed of building blocks called nucleotides A nucleotide consists of a deoxiribose sugar, a phosphate group and one of the four possible bases The bases are: adenine (A), thymine (T), guanine (G), cytosine (C) Phosphates and sugars of adjacent nucleotides link (through strong phosphodiester bonds) to form a long polymer The ratio of A-to-T and G-to-C constant in all living organisms Final clue provided by X-ray crystallography: DNA is a double helix, shaped like a twisted ladder Sugar is a pentose (5 carbon atoms): the phosphate is attached to the 5’-carbon, the base to the 1’-carbon 06/11/2013 http://combio.abo.fi/teaching/special Structure of DNA 1953: race to describe the 3D structure won by James Watson and Francis Crick Alternate sugar and phosphate molecules form the twisted uprights of the DNA ladder The rugs of the ladder are formed by complementary pairs of bases: A always paired with T, G always paired with C One strands runs in the 5’-3’ direction, the other one in the 3’-5’ direction 06/11/2013 http://combio.abo.fi/teaching/special wikipedia 06/11/2013 http://combio.abo.fi/teaching/special DNA representation Single stranded DNA has polarity It has phosphate (attached to the 5’-carbon of the sugar) available for binding at one end of the strand and the 3’-carbon of the sugar available at the other end: the 5’-end and the 3’-end Four types of nucleotides (A,T,C,G) One single stranded DNA molecule may be represented as a string over the alphabet {A,T,C,G}, ALWAYS written in the 5’-to-3’ direction Example: 5’-ATGCTAC-3’ (most of the time omit 5’ and 3’) 06/11/2013 http://combio.abo.fi/teaching/special DNA representation Representing double stranded DNA molecules: double strings over the alphabet {A,T,C,G} Example of one such double string: 5’-ATGTAC-3’ 3’-TACATG-5’ Most often we omit the 5’ and 3’ Worth noting that the same molecule is represented also by its “inverse” 5’-GTACAT-3’ 3’-CATGTA-5’ 06/11/2013 http://combio.abo.fi/teaching/special DNA replication One half of the DNA ladder serves as a template for recreating the other half during DNA replication Responsible for this: the enzyme DNA polymerase that adds complementary nucleotides to the template provided by a single stranded DNA molecule This was conjectured already by Watson and Crick but the enzyme was actually discovered later 06/11/2013 http://combio.abo.fi/teaching/special DNA replication 06/11/2013 http://combio.abo.fi/teaching/special Central dogma DNA mostly found in the nucleus (in eukaryotes) Another type of nucleic acid commonly found in the cytoplasm: RNA RNA copies the DNA message in the nucleus and carries it out to the cytoplasm, where proteins are synthesized (in the rybosomes) DNA transcribed in the nucleus into mRNA mRNA translated in the cytoplasm (rybosomes) into amino acids – tRNA plays the role of “adaptor” 06/11/2013 http://combio.abo.fi/teaching/special Central dogma 06/11/2013 http://combio.abo.fi/teaching/special RNA protein From mRNA to protein: the universal genetic code Each triplet of nucleotides (codon) specifies one amino acid One codon specify beginning of translation (AUG) and 3 codons specify the end of it. 06/11/2013 http://combio.abo.fi/teaching/special The universal genetic code 06/11/2013 http://combio.abo.fi/teaching/special mRNA protein Translation mRNA → protein: implemented through some “adaptors” that recognize both codons and amino acids: transfer RNA (tRNA) On one part the tRNA holds an anticodon and on the other side it holds the corresponding amino acid Ordering the tRNA molecules on mRNA is complex: using the ribosome – “large protein synthesizing machine” 06/11/2013 http://combio.abo.fi/teaching/special Genes and chromosomes Each DNA molecule is packaged in a separate chromosome The total genetic information stored in the chromosomes: the genome Eukaryotes: (almost) every single cell contains a complete set of the genome Difference in functionality: different expression of the corresponding genes 06/11/2013 http://combio.abo.fi/teaching/special Coding and non-coding DNA Bacteria: virtually all DNA encodes proteins Eukaryotic DNA is composed of repeated sequences that do not encode proteins: non-coding sequences (junk DNA) They separate relatively infrequent “islands” of genes Many non-coding sequences (introns) are found also within the genes Less than 5% of the human genome encodes proteins Human genome: about 3 billion bp Amoeba: 200 times more DNA than humans 06/11/2013 http://combio.abo.fi/teaching/special Chromosomes Chromsomes are dynamic, changing structures DNA may “jump” from one position on a chromosome to another (B.McClintock, Nobel prize) Genes can be moved between species based on the universality of the genetic code Different bacteria may get antibioticresistant genes by exchanging plasmids (small chromosomes) Genes can be turned on and off Human insulin produced in E.coli bacteria (1980s) Regulatory genes produce inhibitors for other genes Genetic engineering – exploding field nowadays There are thus both genetic “plans” for protein production and genetic regulatory programs for expressing those plans 06/11/2013 http://combio.abo.fi/teaching/special Living things share common genes Compelling evidence for shared ancestry of all living things Evolution of higher life forms require development of new genes, while retaining the old ones Genes may be “stolen” from other organisms Mutations may accumulate – history of the evolutionary life of a gene 06/11/2013 http://combio.abo.fi/teaching/special Beyond DNA Passive role for DNA: encodes and transmits genetic info through time Active role for proteins encoded by DNA Post-genomic phase: what is the exact role of the proteins made by these specific genes? Animal of choice for test: mice They share about 99% of the genes with humans 06/11/2013 http://combio.abo.fi/teaching/special A classification Living organisms: Prokaryotes: single-cell organisms with no nucleus (e.g., simple bacteria) Eukaryotes: DNA protected in a nucleus, genes splitted in exons and introns (e.g., humans) 06/11/2013 http://combio.abo.fi/teaching/special Wet-lab methods to manipulate DNA We will overview the methods used in the lab and in the cell itself to manipulate DNA The availability of such tools made the rapid development of the field possible It is not a lab manual, only present a number of available tools and the ideas supporting them Essential to know about elementary lab tools also for mathematicians and computer scientists Topics: measuring DNA, testing for/isolating DNA molecules, separating and fusing DNA strands, lengthening, shortening, cutting, linking DNA, modifying nucleotides, multiplying, sequencing 06/11/2013 http://combio.abo.fi/teaching/special Measuring the length of DNA molecules Problem: measure the length of a molecule without sequencing it Length – definition: Single stranded DNA: the number of nucleotides (ex: 12 mer) Double stranded DNA: the number of base pairs (ex: 12 bp) 06/11/2013 http://combio.abo.fi/teaching/special Measuring the length of DNA Central fact: DNA molecules are negatively charged: in electric fields, they migrate towards + electrodes (anodes) The force needed to move DNA molecules – proportional to its length (due to friction) Idea: gel electrophoresis technique 06/11/2013 http://combio.abo.fi/teaching/special The gel electrophoresis technique Pour some gel in a rectangular container with electrodes on the sides In the cooling process: a comb is inserted on the negative side Gel cooled down, comb removed: row of small wells 06/11/2013 http://combio.abo.fi/teaching/special The gel electrophoresis technique The solution with the DNA to be measured brought in the wells Activate the electric field: DNA molecules travel towards the anode Gel acts as a molecular sieve – big friction small molecules move faster than big molecules Deactivate the field when the first molecules have reached the anode Small molecules – longer distance than bigger molecules Molecules of same length – same distance 06/11/2013 http://combio.abo.fi/teaching/special The gel electrophoresis technique Reading the resulting gel DNA molecules are colorless: mark them Two techniques for marking: Staining with ethridium bromide -> fluorescent mark under ultraviolet light Attaching radioactive markers to the ends of DNA molecules -> expose a film 06/11/2013 http://combio.abo.fi/teaching/special Computing the length Based on the distance traveled Using a calibrating solution in one of the wells -> compare the location of bands on the other paths with the calibrating path Digital printout of an agarose gel electrophoresis of cat-insert plasmid DNA - Wikipedia 06/11/2013 DNA gel electrophoresis - Wikipedia http://combio.abo.fi/teaching/special Gels of plasmid preparations usually show a major band of supercoiled DNA with other fainter bands in the same lane. Note that by convention DNA gel is displayed with smaller DNA fragments near the bottom of the gel. This is because historically DNA were run vertically and the smaller DNA fragments move downwards faster. - Wikipedia Testing for/isolating known molecules Problem: test if some known single stranded molecules s exist in a given DNA solution and isolate some of them if so Solutions: Attach the complementary strands (probes) to a filter and pour the solution through the filter Attach probes to tiny glass beads, place them in a glass column and pour the solution over them Attach probes to tiny magnetic beads, throw them in the solution and stir; attract them to one side using a magnet Result: s molecules bind to their complements to form double strands (annealing) and stay there 06/11/2013 http://combio.abo.fi/teaching/special DNA microarray Basic function: detect and quantify DNA subsequences in a sample collection of DNA molecules Basic principle: chip (either glass, plastic or silicon) contains thousands of DNA spots organized in an ordered uniform manner (array) Each DNA spot contains picomoles of copies of short sDNA sequences (probes) The probe DNA sequence is associated to a specific position on the chip 06/11/2013 Measurement: Prepare sample, label fluorescent target DNA sequences Let target sequences to hybridize with the complementary probes Wash-out weakly hybridized target DNAs Result: Only DNAs with the subsequences matching the probes will stay bound to the corresponding DNA spots on the chip http://combio.abo.fi/teaching/special DNA microarray Analysis: Fluorescently labeled DNA targets bound to probes produce light signal. More targets bound to a spot → higher the signal Scan the microarray image, basing on the position of a spot and its signal level – quantify occurance of DNA subsequences in the sample Issues: Denoising normalization 06/11/2013 Wikipedia http://combio.abo.fi/teaching/special DNA microarray Usage: Gene expression profiling Comparative genomic hybridization Gene identification – test for hallmarks of microorganisms in food and feed Detect protein-DNA binding sites SNP detection Alternative splicing and fusion genes detection 06/11/2013 http://combio.abo.fi/teaching/special Separating and fusing DNA strands Hydrogen bond between complementary bases is weaker than the covalent bond between consecutive nucleotides within one strand One can separate the two strands of a DNA molecule without breaking the single strands Separating the two strands: denaturation - heating Fusing two complementary strands: annealing/renaturation/hybridization - cooling 06/11/2013 http://combio.abo.fi/teaching/special Enzymes Enzymes: proteins that catalyze biochemical reactions Very specific: most catalyze just a single chemical reaction Highly efficient: speed up the chemical reaction by as much as a trillion times If no enzymes existed, bio-reactions would most likely be too slow to support life 06/11/2013 http://combio.abo.fi/teaching/special Lengthening DNA polymerases The enzymes: polymerases - add nucleotides to an existing DNA molecule Polymerases only extend the DNA molecules in the 5’-3’ direction They extend repeatedly the 3’ end, provided that the required nucleotides are available in the vicinity Given one strand, to produce the corresponding double stranded molecule, one needs the primer and the polymerases Polymerase in action 06/11/2013 http://combio.abo.fi/teaching/special Lengthening DNA terminal transferase does not need a template: extends double strands at both ends by some single stranded tails Engineering single strands is possible: a procedure leading to automation exists → “synthesizing robots” Short such single strands are called oligonucleotides (oligos in short); they are essential in PCR 06/11/2013 http://combio.abo.fi/teaching/special Shortening/cutting DNA The enzymes: DNA nucleases – degrade DNA Two types of enzymes: DNA exonucleases: cleave (remove) one nucleotide at a time from the ends of the DNA DNA endonucleases: destroy internal bonds in the DNA molecule (cutting DNA) Restriction endonucleases – cut on the specific sites 06/11/2013 endonuclease exonuclease http://combio.abo.fi/teaching/special Linking DNA Fuse together two pieces of DNA Add the necessary chemical links Examples Hydrogen bonding Ligation (enzyme: ligase) Hybridization: link two complementary single strands Blunt end ligation 06/11/2013 http://combio.abo.fi/teaching/special Multiplying DNA Major problem in biochemical research: obtain sufficient quantities of some substance Solution: cloning techniques, PCR Cloning in vivo: insert the DNA fragment in the DNA of some fast reproducing cell and the fragment will be replicated with the cell Cloning in vitro: Polymerase chain reaction – PCR 1985, Kary Mullis: Nobel prize Very efficient: produce in a short time billions of copies of even a single strand The PCR technique: Use viruses or hybridization This technique provides high quantities of DNA fragments It also gives a mean to preserve the fragments for long time (keeping alive the “host”) 06/11/2013 http://combio.abo.fi/teaching/special We want to amplify a DNA molecule alpha with known borders beta and gamma Process: repeat a cycle of denaturation, priming, extension Start: prepare a solution containing alpha (the target), oligonucleotides beta’ and gamma’ (complements of beta and gamma), polymerase enzymes, and many nucleotides (A, C, G, T) The PCR technique Denaturation: heat the solution to 85-95 C: alpha denatures into two single strands alpha1 and alpha2 Priming: Cool down the solution to 55C: the primers beta’ and gamma’ anneal to their complementary borders denaturation Extension: Heat the solution to 72C: polymerase extend the primers to produce two double stranded DNA molecules alpha 06/11/2013 http://combio.abo.fi/teaching/special priming extension The PCR technique Repeat the cycle n times: 2n copies (in principle): highly efficient bio-copymachine! A single cycle takes about 5 minutes: obtain billions of copies in several hours (days for cloning) The polymerase must be heat resistant – nature’s solution: thermophilic bacteria Important observation: one needs to know the borders of the DNA segment to be copied 06/11/2013 http://combio.abo.fi/teaching/special Sequencing Learning the exact sequence of nucleotides of a DNA molecule The most popular method of sequencing: the Sanger method 1940s: Sanger, insulin sequencing (Nobel prize) 1977: Sanger, Gilbert, DNA sequencing (Nobel prize) 1985: Mullis, PCR (Nobel prize) Main tool: nucleotide analogues – nucleotides chemically altered so that no other nucleotide can attach to their 3’ end: ddA, ddC, ddG, ddT (dideoxynucleotides) 06/11/2013 http://combio.abo.fi/teaching/special Sequencing – Sanger method Problem: sequence a single stranded molecule alpha If only nucleotides are used: produce full duplex Extend it to 3’ by gamma and add the primer gamma’ (labeled): let beta be the new molecule Solution: tube X contains a limited amount of nucleotide analogues ddX, for all X in the set {A,C,T,G} Prepare 4 tubes (A,C,G,T) containing: Beta molecule – ready to sequence beta molecules Primers gamma’ Nucleotides Incomplete molecules in tube A 06/11/2013 http://combio.abo.fi/teaching/special Example: alpha=3’-AGTACGTGACGC Tube A: Tube C: 5’- gamma’TCATGCACTGCG 5’- gamma’TCATGCACTGCG 5’- gamma’TCA 5’- gamma’TC 5’- gamma’TCATGCA 5’- gamma’TCATGC 5’- gamma’TCATGCAC 5’- gamma’TCATGCACTGC Tube G: Tube T: 5’-gamma’TCATGCACTGCG 5’- gamma’T 5’- gamma’TCATGCACTGCG 5’- gamma’TCATG 5’- gamma’TCATGCACTG 5’- gamma’TCAT 5’- gamma’ TCATGCACT 06/11/2013 http://combio.abo.fi/teaching/special Sequencing – Sanger method Gel electrophoresis using 4 wells - one for each tube Read the bands (the primers were marked) Obtain the sequence Sequencing ladder 06/11/2013 http://combio.abo.fi/teaching/special Sanger method, observations This was only a very simplified presentation Use polymerase not having the associated exonuclease activity: do not cut the ddX Limited amount of nucleotide analogues: too many make the polymerase before then end of the strand Impossible to sequence long molecules with this method Sequencing a longer molecule: divide it in overlapping short segments, sequence them and then reassemble the original sequence: computational (math) problem! More recently, Sanger sequencing has been supplanted by "Next-Gen" sequencing methods, especially for large-scale, automated genome analyses. However, the Sanger method remains in wide use, primarily for smaller-scale projects and for obtaining especially long contiguous DNA sequence reads (>500 nucleotides). 06/11/2013 http://combio.abo.fi/teaching/special NextGen sequencing The high demand for low-cost sequencing has driven the development of high-throughput sequencing NextGen sequencing produces thousands or millions of sequences concurrently. High-throughput sequencing technologies are intended to lower the cost of DNA sequencing beyond what is possible with standard dye-terminator methods. In ultra-high-throughput sequencing as many as 500,000 sequencing-by-synthesis operations may be run in parallel. e.g., Illumina (Solexa) sequencing e.g., SOLiD sequencing Faster >1000 times, cheaper <1000 times 06/11/2013 http://combio.abo.fi/teaching/special Acknowledgements Ion Petre's course “Introduction to Biocomputing, Fall 2004” 06/11/2013 http://combio.abo.fi/teaching/special