Sequencing
Course by G. BARTHOLE, ENS Cachan

Definition
§ DNA sequencing is the determination of the order of the nucleotides that make up a DNA molecule.
§ It is nowadays a routine technique in biology labs.
§ The technique relies on the knowledge of DNA replication mechanisms acquired over the past 30 years.

Nucleic acids
§ Nucleic acid (Friedrich MIESCHER, 1871) = polymer of nucleotides
§ Nucleotides and bases: Phoebus LEVENE, 1919

DNA and RNA
§ RNA: ribonucleic acid
§ DNA: deoxyribonucleic acid, more stable
§ Nucleotides are linked 3’–5’

Secondary structure of DNA
§ 1947-1950: Erwin CHARGAFF [Nature 165, 756 (1950)] discovers that in humans the proportions are:
– Adenine ≈ Thymine (≈30%)
– Cytosine ≈ Guanine (≈20%)
→ can be explained by A-T and C-G pairing
§ 1953: Rosalind FRANKLIN, James WATSON & Francis CRICK
– X-ray diffraction pattern (R. FRANKLIN) showing a cross characteristic of a helical structure [Pray, L., Nature Education 1, 100 (2008)]
[Figure labels: 5’ end and 3’ end (named after the position of the carbon in the pentose), base pairing, hydrogen bond, synthesis from 5’ to 3’. Pray, L., Nature Education 1, 100 (2008)]

Protein synthesis
§ Transcription: DNA → mRNA (in the nucleus)
§ Translation: mRNA → protein
[Figure labels: nucleus, amino acid, proteins, cell membrane]

1977, 1ST GENERATION SEQUENCING: SANGER METHOD

Sanger technique
§ Frederick SANGER (1918-2013), English biochemist who received 2 Nobel Prizes in Chemistry:
– 1958: structure of proteins (insulin)
– 1980: sequencing
§ DNA polymerases synthesize a complementary DNA strand starting from a template strand.
§ For sequencing, slightly modified nucleotides are used alongside the usual deoxyribonucleotide triphosphates (dNTP): the dideoxyribonucleotides (ddNTP).

dNTP vs ddNTP
§ The difference between a ddNTP and a dNTP is the absence of the OH group in the 3’ position.
§ So, when a DNA polymerase incorporates a ddNTP, it cannot add any further nucleotide: strand synthesis stops.
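The termination principle above can be sketched in a few lines of Python. This is an idealized illustration, not a lab protocol: it assumes that every possible termination event occurs exactly once, it ignores the primer, and it models one reaction per ddNTP type. The function names (`synthesized_strand`, `terminated_fragments`, `read_gel`) and the example template are hypothetical, chosen only for this sketch.

```python
# Idealized sketch of chain termination (assumption: each position of the
# new strand is terminated exactly once; the primer is ignored).
COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}


def synthesized_strand(template_3to5):
    """Strand built by the polymerase, written 5'->3', for a template read 3'->5'."""
    return "".join(COMPLEMENT[base] for base in template_3to5)


def terminated_fragments(template_3to5, dd_base):
    """Fragments that stop on the given ddNTP (one reaction per ddNTP type)."""
    full = synthesized_strand(template_3to5)
    return [full[: i + 1] for i, base in enumerate(full) if base == dd_base]


def read_gel(template_3to5):
    """Pool the four reactions, sort by length, read the last base of each fragment."""
    fragments = []
    for dd_base in "ACGT":
        fragments += terminated_fragments(template_3to5, dd_base)
    fragments.sort(key=len)  # shortest fragments migrate furthest on the gel
    return "".join(fragment[-1] for fragment in fragments)


if __name__ == "__main__":
    template = "CATCCGTA"  # hypothetical template, read 3'->5'
    print(terminated_fragments(template, "G"))  # fragments ending with ddGTP
    print(read_gel(template))                   # sequence of the new strand
```

Reading the terminal base of each fragment in order of increasing length gives the synthesized strand; the template sequence is then its complement.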
[Figure: sugar ring with 5’ CH2-phosphate and 1’ base. OH present at 3’: dNTP. H at 3’ (OH absent): ddNTP]

Protocol (Sanger)
§ 4 solutions are prepared, each containing:
– the fragment to be sequenced,
– a primer: a short DNA whose sequence is complementary to the 3’ end of the fragment to be sequenced,
– the 4 dNTPs (dCTP, dATP, dGTP, dTTP),
– DNA polymerase.
[Figure labels: DNA with a known sequence; unknown DNA, to be sequenced; primer complementary to a part of the known sequence; synthesis of the complementary DNA strand by a DNA polymerase]

Protocol
§ In each tube, a small quantity of one fluorescent or radioactive (32P) ddNTP is added.
http://wwwarpe.snv.jussieu.fr/coursvt/images_10/sangerprinc.gif
§ The random incorporation of a ddNTP stops the synthesis.
→ At the end of the reactions, a set of DNA strands of various lengths is obtained, depending on where a ddNTP was incorporated.

Protocol
§ Synthesis of the complementary strand
→ if it stops because of a ddGTP, there is a cytosine at that position in the original sequence.

Strand reading
§ DNA to be sequenced: 5’-ATGCCTAC-3’, attached to a known DNA; primer (known): 20 nt; synthesis from 5’ to 3’
§ Electrophoresis migration (4 wells), fragment length in nt and synthesized strand after the primer:
28 GTAGGCAT
27 GTAGGCA
26 GTAGGC
25 GTAGG
24 GTAG
23 GTA
22 GT
21 G
[Figure: example of autoradiography (32P labeling) of an electrophoresis gel]

Optical reading of the strands
§ Labeling of each ddNTP with a different fluorophore (spectrally separated) → chromatogram
§ Capillary electrophoresis (modern machines)
§ Advantage: sequencing in a single reaction instead of 4.

Automation
See also a « flash » animation at http://www.yourgenome.org/teachers/sequencing.shtml

Example of application
§ Search for genetic markers of cancers
§ Non-small-cell lung cancer: look for genetic markers related to the ERBB2 gene.
§ Normal (wild type)
§ Sick: presence of a duplicated region [GCATACGTGATG] of ERBB2, appearing in several sequences → association with the disease

Performances & limitations
Performances of modern Sanger sequencers
§ Several hundred samples simultaneously, and one sequencing run per hour.
§ Sequences of 300-1000 nucleotides at most
Limitations
§ If amplification (PCR) precedes sequencing: small parts of the amplification vector sequence show up in the Sanger sequence.
§ Errors at the beginning of the sequence: incorrect recognition of the primer.
§ Low resolution between fragments whose lengths differ by only 1 nt.

2nd generation sequencing: pyrosequencing

Pyrosequencing
§ Based on a « sequencing by synthesis » principle, as opposed to the sequencing by « termination » of the Sanger method.
§ Sequencing of a single-stranded DNA by synthesis of the complementary strand, base by base, detecting the polymerase activity at each step with another, chemiluminescent enzyme: luciferase.

Pyrosequencing: steps
§ Nucleotides (dNTP) are added sequentially (≠ Sanger sequencing).
§ If it is the right one: incorporation and release of a pyrophosphate (PPi).
§ PPi → ATP by the action of ATP-sulfurylase.
§ The ATP supplies the energy needed for the conversion of luciferin by luciferase. This reaction generates visible light whose intensity is proportional to the quantity of ATP.
§ An apyrase degrades the remaining nucleotides.
[Figure: polymerase on the template ACCTTGAATTCGTCCTAGGA----GGATCCT; dGTP incorporated → PPi → (ATP-sulfurylase) ATP + luciferin → (luciferase) light; apyrase: dNTP → dNMP]

Implementation: « 454 technology »
Pyrosequencing vs. Sanger
§ 100 times faster and cheaper
§ Sequenced fragments are shorter (but improving)
454 sequencing (Roche Diagnostics)
Integration of several high-tech methods:
– pyrosequencing,
– picotiter plates made of optical fibers (1.6 million wells),
– emulsion PCR (emPCR) in microreactors (300 000 PCR reactions in parallel),
– image analysis…

Emulsion-based clonal amplification
http://classes.soe.ucsc.edu/bme215/Spring09/PPT/BME%20215-5.pdf

Loading of the beads in a multi-well plate
§ 400 000 sequencing reactions in parallel

Example of a « 454 » machine
§ Roche « GS FLX System + »
§ Sequencing of 100 to 400 Mbases in 7 hours (per machine)
http://www.biopsci.com/2012/02/22/sequencage-de-ladn-la-revolution-est-de-nouveau-en-marche/

Whole genome sequencing

Sequencing of an entire genome
§ Sequenceable part of the human genome: 2.9 Gbp!
– Impossible to read in one go.
– In any case, biologists do not know how to handle such long DNA molecules.
– However, the new technologies make it possible to sequence « relatively fast ».
Ø Basic principles for sequencing a genome:
1. Random fragmentation into large pieces.
2. Sequencing of the ends of the pieces.
3. Reconstruction by using overlapping fragments.

Parallelisation
§ « Factory » making a bacterial artificial chromosome (BAC) library at the Whitehead Institute (MIT, USA) [Nature 409, 860 (2001)]
§ Sequencing « factory » (Sanger Institute, UK)

Whole genome shotgun
§ The genome is randomly broken up into many small segments.
§ The segments are sequenced using the chain termination method to obtain reads.
§ Reconstruction: many overlapping reads are needed
→ Some regions will be sequenced many times.
→ Despite this redundancy, gaps remain.
Ex: Haemophilus influenzae (1st sequenced bacterium)
• 1.8 Mb broken up mechanically to give a library of ~2000 bp fragments.
• 20 000 sequenced segments (starting from one or both ends)
• 24 000 reads kept, with an average length of 470 bp
• → 11.6 Mbp sequenced, i.e. 6.3 times the genome, but the coverage was not perfect!

Clone by clone sequencing [Nature 409, 860 (2001)]
1. Bacterial artificial chromosome (BAC): 100-200 kb clone
2. Physical map to sort the clones
3. Sanger sequencing of small fragments (100-1000 bp), assembled through their overlaps
Summary: http://www.snv.jussieu.fr/vie/dossiers/genomes/methodes_resume.htm

Chronology
1977  ϕX174, bacterial virus: 5386 nt, 11 genes, circular single-stranded DNA
1984  HIV, retrovirus
1995  Haemophilus influenzae, bacterium: 1.8 Mbp, 1740 genes
1997  Saccharomyces cerevisiae, yeast: 13 Mbp, 6275 genes
1997  Escherichia coli, bacterium: 4.6 Mbp
1998  Caenorhabditis elegans, animal: 97 Mbp
2000  Arabidopsis thaliana, plant: 157 Mbp
      + rice, food plant: 430 Mbp, cereal with the smallest genome
2002  mouse: 2.5 Gbp, 90% of genes in common with us
2007  zebrafish: 1.7 Gbp

Human genome project
1988  Human Genome Organisation
1992  1st map of the human genome
1999  sequencing of the 1st human chromosome (22)
2001  rough draft of the human genome completed
2005  Human metagenome project

Pacific Biosciences: ZERO-MODE WAVEGUIDES

Principle
§ Sequencing by detecting the incorporation of fluorescent nucleotides.
§ Detection at the single-molecule level.

Single molecule fluorescence microscopy: confocal microscopy
[Figure labels: scanner (x,y,z), objective, excitation laser, dichroic mirror, tube lens, photodetector]

Single molecule fluorescence microscopy: TIRFM
§ Total internal reflection fluorescence microscopy (TIRFM)
[Figure labels: evanescent wave, coverslip, fluorophores, oil, objective NA = 1.45, excitation beam, fluorescence signal]

Single molecule fluorescence microscopy: elementary excitation volumes
§ confocal
§ in TIRFM
Elementary excitation volume (NA = 1.45, $\lambda$ = 550 nm):
– TIRFM: lateral size $1.22\,\lambda/\mathrm{NA}$, depth ≈ 100 nm, $V_{elem}$ = 0.02 fL
– confocal: lateral size $1.22\,\lambda/\mathrm{NA}$, axial size $4n\lambda/\mathrm{NA}^2$, $V_{elem}$ = 0.16 fL

Consequence on the concentration of fluorescently labeled molecules
§ If more than one fluorescent molecule is in one elementary excitation volume, single molecules cannot be detected.
§ Max. reachable concentrations: 100 nM for TIRFM and 10 nM for confocal
§ Physiological concentrations [T. Traut, Mol. Cell. Biochem. 140, 1 (1994)]:
– dATP: 24±22 µM
– dGTP: 5.2±4.5 µM
– dCTP: 29±19 µM
– dTTP: 37±30 µM
→ The excitation volume must be smaller!
[Excerpt from Levene, M. J. et al., Science 299, 682–686 (2003), reproduced on the slide; the extracted text is only partly legible. Legible points:
– observation volumes as small as 10 zeptoliters, more than four orders of magnitude smaller than the diffraction limit, are possible;
– for the smallest waveguides it is possible to work at concentrations as high as 200 µM and still have less than one molecule per volume;
– arrays of zero-mode waveguides were manufactured as small holes in an 89-nm thick film of aluminum on fused silica coverslips, patterned by electron beam lithography followed by reactive ion etching;
– the diffusion coefficient and quantum yield of R110-dCTP were found to be $2.24 \times 10^{-6}$ cm² s⁻¹ and 77%, respectively;
– zero-mode waveguides increased the usable concentration range by well over three orders of magnitude compared with a diffraction-limited volume at 4 nM;
– fluorescence quantum yield: $Q(z) = \frac{k_r(z)}{k_r(z) + k_{nr}} \approx \frac{p(z)}{p(z) + C}$, where $C$ is a constant such that $Q(0)$ equals the quantum yield at the entrance of the waveguide;
– with a nonfluctuating background $B$: $G(0) = \frac{N}{(N + B)^2}$.]

Zero-mode waveguide
§ For $\lambda > \lambda_c = 1.7\,d$: no propagating TE11 mode
§ Evanescent wave in the hole: $I(z) = I_0\, e^{-z/\Lambda}$
[Figure: hole of diameter d in an aluminium film on fused silica, with the solution above; zero-mode waveguide (50 nm × 100 nm). From Levene, M. J. et al., Science 299, 682–686 (2003), Fig. 2: (A) three-dimensional finite-element time-domain simulation of the intensity […] diameter and 100 nm long; (B) S(z) curves for different waveguide diameters, d. Fig. 3: a fused silica coverslip with zero-mode waveguide arrays, with an electron microscope image of an individual waveguide.]

From nanostructures to confine the excitation volume … [Levene, M. J. et al., Science 299, 682–686 (2003)]
… to the sequencing of a single DNA

Summary
§ SMRT = single molecule real time
https://www.youtube.com/watch?v=WMZmG00uhwU&list=UU2y78sjVOumGc2da1tN629g

Sources
§ Thanks to François Treussart (ENS Cachan) for improving these slides.
§ Sanger sequencing animation: http://www.yourgenome.org/teachers/sequencing.shtml
§ Slides on Sanger sequencing: http://www.dil.univ-mrs.fr/~vancan/optionBio1/cours.html#htoc15 (course by Sophie Bleves)
§ Pyrosequencing: see http://gepv.univ-lille1.fr/
§ 454 technology: http://classes.soe.ucsc.edu/bme215/Spring09/PPT/BME%20215-5.pdf and http://www.biopsci.com/2012/02/22/sequencage-de-ladn-la-revolution-est-de-nouveau-en-marche/
§ Les technologies de laboratoire n°5, juillet-août 2007, « Evolution des techniques de séquençage »
§ T4: photo from http://en.wikipedia.org/wiki/Frederick_Sanger
§ T5: diagram from http://www.mun.ca/biology/scarr/iGen3_02-07.html
§ T7: diagram inspired by http://www.biology.arizona.edu/biochemistry/problem_sets/large_molecules/06t.html
§ T9 & T11: diagrams from https://facmed.univ-rennes1.fr/wkf//stock/RENNES20080328110058vdavidSEquencage.pdf
§ T12: http://dc202.4shared.com/doc/ICPFo0Ga/preview.html
§ Human genome: http://www.snv.jussieu.fr/vie/dossiers/genomes/methodes_intro.htm
§ T29-30: cf. http://www.universalis.fr/encyclopedie/sequencage-d-adn-reperes-chronologiques/
§ Zebrafish: http://www.zmescience.com/medicine/mind-and-brain/zebrafish-locomotion-human-evolution-942333/ and Ferris State University