PBG/MCB 620 DNA Fingerprinting

DNA Fingerprinting
A method for the detection of DNA variation
Image source: http://db2.photoresearchers.com/feature/infocus1

Applications of DNA fingerprinting
• Human genetics and disease
• Systematics and taxonomy
• Population, quantitative, and evolutionary genetics
• Plant and animal breeding and genetics
• Legal, forensic, and anthropological analysis
• Genome mapping and analysis

Important Timeline
• Discovery of DNA as the hereditary material in 1944
• DNA structure described in 1953
• Restriction endonucleases discovered in 1968-1969
• DNA sequencing described in 1977
• DNA fingerprinting first used in 1985
• Polymerase chain reaction (PCR) invented in 1983 and first published in 1985

DeoxyriboNucleic Acid (DNA) structure
"It has not escaped our notice that the specific pairing we have postulated immediately suggests a possible copying mechanism for the genetic material" (Watson and Crick 1953)

DNA Structure
• DNA is the hereditary material and contains all the information needed to build an organism.
• It is a polymeric molecule made from discrete units called nucleotides.
• Nucleotides link together at their 3' and 5' positions to form a DNA strand.
Each nucleotide consists of:
• Nitrogenous base
  – Purines: adenine and guanine
  – Pyrimidines: thymine and cytosine
• Sugar: 2-deoxyribose
• Phosphate group
[Figure labels: Nucleotide; Thymidine]

Pairing of Strands
2 strands of polynucleotides:
• Twisted around each other in a clockwise direction
• Antiparallel: complementary and inverse
• Linked by specific hydrogen bonds: G pairs with C, A pairs with T

Essential Features of DNA
The structure of DNA is identical in all eukaryotes; the genetic information therefore resides in the sequence of the bases.
A gene is a DNA segment with a sequence of bases that carries the information for a biological function.
Alternative forms of a gene are called alleles.

Location of DNA in Eukaryotic Cells
A small fraction is located in the organelles:
• Chloroplasts (cpDNA): 135 to 160 kb with a high density of genes
• Mitochondria (mtDNA): 370 to 490 kb. Only about 10% is genes
Most of it is in the nucleus:
• 63 Mb to 150 Gb in plants; 20 Mb to 130 Gb in animals
• The number of molecules (chromosomes) is highly variable: 2 to >500 in animals and 2 to >1000 in plants.
• Just a very small fraction of the genome is actual genes.
• Some tens of thousands of genes and gene clusters are scattered around in a vast majority of apparently non-functional DNA.
• DNA is associated with other components (mainly proteins) and forms a complex called chromatin.
[Figure labels: Nucleus; Mitochondria; Chloroplast. From Brooker et al. Genetics: Analysis & Principles. McGraw Hill. 2009]

DNA Organisation
Chromatin: the basic structure of chromatin is made of DNA and proteins (histones).
The structure of the chromatin changes throughout the cell cycle:
• Most of the time, when the cell is not undergoing mitosis, the chromatin is relatively uncondensed. However, there are more compacted zones (heterochromatin) and less compacted zones (euchromatin, which is the majority).
• When the cell is going to divide, the chromatin becomes more and more compacted, producing individualized structures called metaphase chromosomes.
From Brooker et al. Genetics: Analysis & Principles. McGraw Hill. 2009

DNA Replication
• DNA primase: catalyzes the synthesis of a short RNA primer complementary to a single-stranded DNA template
• Helicase: unwinds and separates the two strands of DNA
• Gyrase: facilitates the action of the helicase by relieving the tension of the coiled DNA
• Single-stranded DNA binding proteins (SSB): stabilize single-stranded DNA
• DNA polymerase: synthesizes a new DNA strand complementary to a template strand by adding nucleotides one at a time to a 3' end

Polymerase Chain Reaction - PCR
• Invented by K.B. Mullis in 1983
• Allows in vitro amplification of ANY DNA sequence in large numbers
• Requires the design of two single-stranded oligonucleotide primers complementary to motifs on the template DNA
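The primer requirement above can be illustrated with a small in-silico sketch. It is not part of the original slides: the function names, the example template, and the primer sequences are hypothetical, and the search uses exact string matching only (real primer design also considers melting temperature, mismatches, and secondary structure).

```python
# Translation table for Watson-Crick pairing: A<->T, G<->C.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence written 5'->3'."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def find_amplicon(template: str, forward: str, reverse: str):
    """Return the region of `template` bounded by the two primers, or None.

    `template` is the top strand written 5'->3'. The forward primer matches
    the top strand directly. The reverse primer anneals to the top strand,
    so its reverse complement is the motif that appears in `template`.
    """
    template = template.upper()
    start = template.find(forward.upper())
    if start == -1:
        return None
    reverse_site = reverse_complement(reverse)
    end = template.find(reverse_site, start + len(forward))
    if end == -1:
        return None
    return template[start:end + len(reverse_site)]


# Hypothetical example sequences, chosen only for illustration.
template = "TTGACGCTTAAGTACATACCTAGTACCACTATATAATGCC"
forward_primer = "ACGCTTAAG"
reverse_primer = "CATTATATAG"
amplicon = find_amplicon(template, forward_primer, reverse_primer)
```

Because the two primers bind opposite strands with their 3' ends facing each other, the amplified product is the stretch of template from the forward primer site through the reverse primer site.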
PCR Basic Principle
A polymerase extends the 3' end of the primer sequence using the DNA strand as a template.

PCR Principles
• The cycle can be repeated multiple times provided the 3' end of each primer faces the target amplicon. The reaction is typically run for 25-50 cycles.
• Repeated cycles generate exponentially increasing numbers of DNA fragments that are identical copies of the original DNA between the two primer binding sites.
• The PCR reaction consists of:
  – A buffer
  – DNA polymerase (thermostable)
  – Deoxyribonucleotide triphosphates (dNTPs)
  – Two primers (oligonucleotides)
  – Template DNA
• And has the following steps:
  – Denaturing: raising the temperature to 94 °C to make the DNA single stranded
  – Annealing: lowering the temperature to 35-65 °C so that the primers bind to the target sequences on the template DNA
  – Elongation: DNA polymerase extends the 3' ends of the primers. The temperature must be optimal for DNA polymerase activity.

PCR is Exponential
[Figure labels: 1st cycle; 2nd cycle]

Restriction Endonucleases
• Enzymes which recognize a specific sequence of bases within double-stranded DNA.
• Endonucleases make a double-stranded cut at the recognition site.
• Examples:
  EcoRI          HindIII        BamHI
  5'- G|AATTC    5'- A|AGCTT    5'- G|GATCC
  3'- CTTAA|G    3'- TTCGA|A    3'- CCTAG|G

Gel Electrophoresis
• A process used to separate DNA fragments
• An electric current passes through agarose or polyacrylamide gels
• The electric current forces molecules to migrate through the gel at different rates depending on their sizes
From Hartwell et al. Genetics. McGraw Hill. 2008

Decoding DNA – Sanger Sequencing
• Buffer
• DNA polymerase
• dNTPs
• Labeled primer
• Target DNA
[Figure labels: deoxynucleotide (dNTP); dideoxynucleotide (ddNTP); ddGTP; ddATP; ddCTP; ddTTP]
Link: http://www.wellcome.ac.uk/Education-resources/Education-and-learning/Resources/Animation/WTDV026689.htm

Reading Sanger Sequencing
Labeled fragments, grouped by lane (G, A, C, T):
G
*GCTTAAGTACATACCTAGTACCACTATATAATG
*GTACATACCTAGTACCACTATATAATG
*GTACCACTATATAATG
A
*ACGCTTAAGTACATACCTAGTACCACTATATAATG
*AAGTACATACCTAGTACCACTATATAATG
*AGTACATACCTAGTACCACTATATAATG
*ATACCTAGTACCACTATATAATG
*ACCTAGTACCACTATATAATG
*AGTACCACTATATAATG
C
*CGCTTAAGTACATACCTAGTACCACTATATAATG
*CATACCTAGTACCACTATATAATG
*CCTAGTACCACTATATAATG
*CTAGTACCACTATATAATG
T
*TTAAGTACATACCTAGTACCACTATATAATG
*TAAGTACATACCTAGTACCACTATATAATG
*TACATACCTAGTACCACTATATAATG
*TACCTAGTACCACTATATAATG
*TAGTACCACTATATAATG
[Figure labels: Separate gel lanes; Single gel lane]

Next Generation Sequencing - Illumina
http://technology.illumina.com/technology/next-generation-sequencing/sequencing-technology.html

Sequencing Options
For each method: read length, accuracy, reads per run, time per run, cost per 1 million bases (in US$), advantages, disadvantages.

Single-molecule real-time sequencing (Pacific Bio)
• Read length: 5,500 bp to 8,500 bp avg (10,000 bp); maximum read length >30,000 bases
• Accuracy: 99.999% consensus accuracy; 87% single-read accuracy
• Reads per run: 50,000 per SMRT cell, or ~400 megabases
• Time per run: 30 minutes to 2 hours
• Cost per 1 million bases: $0.33–$1.00
• Advantages: Longest read length. Fast. Detects 4mC, 5mC, 6mA.
• Disadvantages: Moderate throughput. Equipment can be very expensive.

Ion semiconductor (Ion Torrent sequencing)
• Read length: up to 400 bp
• Accuracy: 98%
• Reads per run: up to 80 million
• Time per run: 2 hours
• Cost per 1 million bases: $1
• Advantages: Less expensive equipment. Fast.
• Disadvantages: Homopolymer errors.

Pyrosequencing (454)
• Read length: 700 bp
• Accuracy: 99.9%
• Reads per run: 1 million
• Time per run: 24 hours
• Cost per 1 million bases: $10
• Advantages: Long read size. Fast.
• Disadvantages: Runs are expensive. Homopolymer errors.

Sequencing by synthesis (Illumina)
• Read length: 50 to 300 bp
• Accuracy: 98%
• Reads per run: up to 3 billion
• Time per run: 1 to 10 days, depending upon sequencer and specified read length
• Cost per 1 million bases: $0.05 to $0.15
• Advantages: Potential for high sequence yield, depending upon sequencer model and desired application.
• Disadvantages: Equipment can be very expensive. Requires high concentrations of DNA.

Sequencing by ligation (SOLiD sequencing)
• Read length: 50+35 or 50+50 bp
• Accuracy: 99.9%
• Reads per run: 1.2 to 1.4 billion
• Time per run: 1 to 2 weeks
• Cost per 1 million bases: $0.13
• Advantages: Low cost per base.
• Disadvantages: Slower than other methods. Has issues sequencing palindromic sequences.

Chain termination (Sanger sequencing)
• Read length: 400 to 900 bp
• Accuracy: 99.9%
• Reads per run: N/A
• Time per run: 20 minutes to 3 hours
• Cost per 1 million bases: $2400
• Advantages: Long individual reads. Useful for many applications.
• Disadvantages: More expensive and impractical for larger sequencing projects.

Some Plant Sequenced Genomes
Common name    Year   Chr (#)   Size (Mb)   Assembled (Mb)   Assembled (%)   Genes (#)   Repeat (%)
Arabidopsis    2000   5         125         115              92              25,498      14
Rice           2002   12        430         362              84              59,855      26
Sorghum        2009   10        818         739              90              34,496      62
Maize          2009   10        2,300       2,048            89              32,540      85
Soybean        2010   20        1,115       973              87              46,430      57
Brachypodium   2010   5         272         272              100             25,532      21
Barley         2012   7         5,100       4,980            98              30,400      84
Wheat          2012   21        17,000      3,800            22              94,000      80

Barley DNA Sequence
• Total sequence is 5,300,000,000 base pairs
  – 165% of the human genome
  – Enough characters for 11,000 large novels
• Expressed genes: 60,000,000 base pairs
  – Approx. 1% of the total sequence, as in humans
  – 125 large novels

Finding Sequences
http://www.ncbi.nlm.nih.gov/nuccore

Comparing Sequences
• You have an anonymous sequence from your plant, or a candidate sequence from another species
  – What gene does it come from?
• Compare it to existing sequences in databases
  – Basic Local Alignment Search Tool (BLAST)
    • Nucleotide query in nucleotide database: BLASTN
    • Protein query in protein database: BLASTP
    • Translated nucleotide query in protein database: BLASTX
    • Protein query in translated nucleotide database: TBLASTN
    • Translated nucleotide query in translated nucleotide database: TBLASTX
• Is this consistent with the character?
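The choice among the five BLAST programs listed above depends only on the query type, the database type, and whether the comparison is made after translation. The sketch below encodes that lookup. The function name and its parameters are hypothetical and exist only for illustration; it does not call BLAST itself.

```python
def choose_blast_program(query: str, database: str, translated: bool = False) -> str:
    """Pick a BLAST program name from the query and database types.

    `query` and `database` are "nucleotide" or "protein". `translated`
    only matters when both are nucleotide: it selects a translated
    comparison (TBLASTX) over a direct one (BLASTN). For mixed types,
    translation of the nucleotide side is implied, so the flag is ignored.
    """
    valid = {"nucleotide", "protein"}
    if query not in valid or database not in valid:
        raise ValueError("query and database must be 'nucleotide' or 'protein'")
    if query == "nucleotide" and database == "nucleotide":
        return "TBLASTX" if translated else "BLASTN"
    if query == "protein" and database == "protein":
        return "BLASTP"
    if query == "nucleotide":
        # Nucleotide query translated and searched against a protein database.
        return "BLASTX"
    # Protein query searched against a translated nucleotide database.
    return "TBLASTN"


# An anonymous DNA sequence compared against known proteins.
program = choose_blast_program("nucleotide", "protein")
```

For the anonymous plant sequence in the slide, a translated search against a protein database is one way to ask which gene it comes from, since protein sequences tend to be conserved across species more strongly than nucleotide sequences.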
Multiple Sequence Alignment - Input
www.ebi.ac.uk/Tools/msa/clustalo/

Sequence Alignment - Results
www.ebi.ac.uk/Tools/msa/clustalo/

Gene Expression
DNA → (transcription) → RNA → (translation) → Protein → Localisation → Function/Trait
• DNA: SNP/indels; genes; molecular markers; epigenetics
• RNA: microarrays; small RNAs; alternative splicing
• Protein: SILAC; biochemical markers
• Localisation: GFP localisation
• Function/Trait: phenotypic traits
Craig Simpson

Post-transcriptional processing
Barley RNA processing linked to abiotic stress responses
• Evidence that RNA binding proteins and splicing factors respond to abiotic stresses.
• Evidence that abiotic (and biotic) stress-linked genes are alternatively spliced.
Alternative splicing:
• Increases protein complexity.
• Results in non-productive isoforms that counteract transcription.
Evidence that, for most genes, changes in alternative splicing occur independently of transcription.
The mechanisms by which environmental conditions are translated into post-transcriptional change are essential to understanding phenotypic plasticity.
Craig Simpson

[Figure slides: Alternative Splicing Options; Alternative Splicing]

Barley RNA processing linked to abiotic stress responses
Arabidopsis high resolution RT-PCR alternative splicing panel
• RT-PCR with ~380 pairs of gene-specific primers, one of each pair 6-FAM labelled
• Separate products by size; quantify the ratio of the products.
• Monitor environmental responses: abiotic stress
[Figure: Standard Temp. vs. Heat Shock; At5g41700, ubiquitin-conjugating enzyme; At2g26150, heat shock responsive expression; products AS1 and AS2]
Craig Simpson
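The "quantify the ratio of the products" step of the RT-PCR panel can be sketched as follows. This is a minimal illustration, assuming that a peak area has already been measured for each size-separated product. The function name is hypothetical, the labels AS1 and AS2 are borrowed from the figure, and the peak areas are made-up numbers, not data from the slides.

```python
from typing import Dict


def isoform_proportions(peak_areas: Dict[str, float]) -> Dict[str, float]:
    """Convert per-product peak areas into proportions that sum to 1."""
    total = sum(peak_areas.values())
    if total <= 0:
        raise ValueError("total peak area must be positive")
    return {name: area / total for name, area in peak_areas.items()}


# Made-up peak areas for one gene under two conditions.
standard_temp = isoform_proportions({"AS1": 800.0, "AS2": 200.0})
heat_shock = isoform_proportions({"AS1": 300.0, "AS2": 700.0})

# Change in the share of each product between the two conditions.
shift = {name: heat_shock[name] - standard_temp[name] for name in standard_temp}
```

Working with proportions rather than raw peak areas separates a change in splicing from a change in overall transcript abundance, which is the distinction the slides draw between alternative splicing and transcription.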