MB206-Jan09 Project
Angelia 09

Samples: Plant (A)
Objective: Isolate 100 ESTs from Plant (A)
Workflow: RNA Extraction → cDNA Library Construction → EST Generation

RNA Extraction
• Extract RNA from sample (A). The method depends on the sample; check the previous note.
• Check the quality and quantity of the RNA.
• Isolate mRNA from the RNA (using kits).
• Check the quality and quantity of the mRNA.
• Then?

mRNA isolation
• Most eukaryotic mRNAs are polyadenylated at their 3' ends: 5' cap ... AAAAAAAAAA(n)
• Oligo(dT) can be bound to the poly(A) tail and used to recover the mRNA.

Check the mRNA integrity
Make sure that the mRNA is not degraded.
Method: analyse the mRNA by gel electrophoresis, using agarose or polyacrylamide gels.

Cloning particular mRNAs
This is useful especially when one is trying to clone a particular gene rather than make a complete cDNA library.
• Fractionation on a gel: performed on the basis of size; mRNAs of the sizes of interest are recovered from agarose gels.
• Enrichment: carried out by hybridization.
• Example: cloning hormone-induced mRNAs (subtracted cDNA library).

Types of Libraries
Genomic Library
• Whole genes with promoters and introns (eukaryotes), operons (bacteria), DNA regulatory elements…
cDNA Library
• mRNA transcript only, with 5' and 3' untranslated regions (UTRs), no introns, tissue specific.

cDNA Libraries
[Figure: cDNA library: genomic DNA → mRNA (polyA) → reverse transcribe → cDNA → clone in vector. Genomic DNA library: genomic DNA → digest → DNA fragments → clone in vector.]

Choosing a Vector
You usually select a vector (plasmid, λ, other) depending on how big you want your DNA fragments to be and on the capacity of the vector.

Lambda Library
[Figure: Lodish et al., Fig 7-12]

Plasmid Library
[Figure: Lodish et al., Fig 7-1]

Plasmid !!!
• mRNA isolation, purification
• Check the RNA integrity
• Synthesis of cDNA
• Treatment of cDNA ends
• Ligation to vector

cDNA libraries
1. cDNA libraries are not made from prokaryotic mRNA.
• Prokaryotic mRNA is very unstable.
• Genomic libraries of prokaryotes are easier to make and contain all the genome sequences.

cDNA libraries
2. cDNA libraries are very useful for eukaryotic gene analysis.
• They are condensed libraries of protein-coding genes, with far fewer junk sequences.
• cDNAs have no introns, so genes can be expressed in E. coli directly.
• They are very useful for identifying new genes.
• They are tissue or cell type specific (differential expression of genes).

Synthesis of cDNA
• First strand synthesis: requires reverse transcriptase, a primer (oligo(dT) or hexanucleotides), and dNTPs (Fig 1.1).
• Second strand synthesis: the best way of making full-length cDNA is to 'tail' the 3' end of the first strand and then use a complementary primer to make the second.

[Figure: cDNA synthesis. mRNA with poly(A) tail annealed to an oligo(dT) primer → reverse transcriptase + four dNTPs → mRNA/cDNA hybrid → terminal transferase + dCTP (adds a 3' oligo(dC) tail) → alkali (hydrolyses RNA), purify DNA → oligo(dG) primer + Klenow polymerase or reverse transcriptase + four dNTPs → duplex cDNA.]

[Fig 2.1 Second strand synthesis: duplex cDNA → single-strand-specific nuclease → Klenow polymerase → treat with EcoRI methylase → add EcoRI linkers (5'-CCGAATTCGG-3') using T4 DNA ligase → EcoRI digestion → ligate to vector and transform.]

Treatment of cDNA ends
Blunt-end ligation of large fragments is not efficient, so special nucleic acid linkers are used to create sticky ends for cloning.
The process:
• Remove protruding 3' ends (single-strand-specific nuclease).
• Fill in missing 3' nucleotides (Klenow fragment of DNA Pol I and the four dNTPs).
• Ligate the linkers to the blunt ends (T4 DNA ligase).
• Tailing with terminal transferase or using adaptor molecules.
• Restriction enzyme digestion (EcoRI).

Ligation to vector
Any vector with an EcoRI site would be suitable for cloning the cDNA.
The process:
• Dephosphorylate the vector with alkaline phosphatase.
• Ligate vector and cDNA with T4 DNA ligase (plasmid or λ phage vector).

Screening
Screening is the process of identifying one particular clone containing the gene of interest from among the very large number of others in the gene library.
• Plate the cDNA library on LB agar plates.
- This requires a host.
- For details, refer to any cDNA library construction kit.
• Pick colony.
• Culture in LB broth (antibiotic), 37 °C, overnight.
• Plasmid preparation (gel electrophoresis & spectrophotometer).
• Verification: restriction enzyme digestion, gel electrophoresis.
• Verification: PCR, gel electrophoresis.

Expressed Sequence Tag (EST)
Messenger RNA (mRNA) sequences in the cell represent copies from expressed genes. RNA cannot be cloned directly; it is first reverse transcribed to double-stranded cDNA. The resultant cDNA is cloned to make libraries representing a set of transcribed genes of the original cell, tissue or organism.

Characteristics of EST sequences
[Figure: Nagaraj, S. H. et al. Brief Bioinform 2007 8:6-21; doi:10.1093/bib/bbl015]

Sequencing
DNA sequencing by the Sanger method
The standard DNA sequencing technique is the Sanger method, named for its developer, Frederick Sanger, who shared the 1980 Nobel Prize in Chemistry. This method begins with the use of special enzymes to synthesize fragments of DNA that terminate when a selected base appears in the stretch of DNA being sequenced. These fragments are then sorted according to size by placing them in a slab of polymeric gel and applying an electric field, a technique called electrophoresis.
Because of DNA's negative charge, the fragments move across the gel toward the positive electrode. The shorter the fragment, the faster it moves. Typically, each of the terminating bases within the collection of fragments is tagged with a radioactive probe for identification.

DNA sequencing example
Problem statement: Consider the following DNA sequence (from firefly luciferase). Draw the sequencing gel pattern that forms as a result of sequencing the following template DNA with ddNTP as the capper.

atgaccatgattacg...

Solution:
Given DNA template:  5'-atgaccatgattacg...-3'
DNA synthesized:     3'-tactggtactaatgc...-5'

Gel pattern:
            +-----------------+
lane ddATP  |W |     |  ||    |
lane ddTTP  |W|  |  |  |  |   |
lane ddCTP  |W  |     |     | |
lane ddGTP  |W    ||       |  |
            +-----------------+
             Electric field → (+)   Decreasing size →

where "W" indicates the well position, and "|" denotes the DNA bands on the sequencing gel.

A sequencing gel
[Figure: autoradiograph of a sequencing gel. The darkness of the lines is proportional to the radioactivity from 32P-labeled adenosine in the synthesized DNA sample.]

Reading a sequencing gel
You begin at the right, where the smallest DNA fragments are. The sequence that you read will be in the 5'-3' direction. This sequence will be exactly the same as the RNA that would be generated to encode a protein, except that the T bases in DNA will be replaced by U residues. As an example, in the problem given, the smallest DNA fragment on the sequencing gel is in the C lane, so the first base is a C. The next largest band is in the G lane, so the DNA fragment of length 2 ends in G. Therefore the sequence of the first two bases is CG. The sequence of the first 30 or so bases of the DNA is:
CGTAATCATGGTCATATGAAGCTGGGCCGGGCCGTGC....
When this is made as RNA, its sequence would be:
CGUAAUCAUGGUCAUAUGAAGCUGGGCCGGGCCGUGC....
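The gel-reading procedure in this example can be sketched in a few lines of Python. This is only an illustration under simple assumptions: every possible termination product is present, and the function names (`synthesized_strand`, `sanger_lanes`, `read_gel`, `to_rna`) are hypothetical helpers, not part of any sequencing software.

```python
COMPLEMENT = {"A": "T", "T": "A", "C": "G", "G": "C"}

def synthesized_strand(template):
    """Return the strand copied from the template, written 5'->3'."""
    return "".join(COMPLEMENT[base] for base in reversed(template.upper()))

def sanger_lanes(template):
    """Map each ddNTP lane to the fragment lengths that end in that base."""
    lanes = {"A": [], "T": [], "C": [], "G": []}
    for length, base in enumerate(synthesized_strand(template), start=1):
        lanes[base].append(length)
    return lanes

def read_gel(lanes):
    """Read bands from the smallest fragment to the largest."""
    bands = sorted(
        (length, base) for base, lengths in lanes.items() for length in lengths
    )
    return "".join(base for _, base in bands)

def to_rna(dna):
    """Replace every T with U; the information content is unchanged."""
    return dna.upper().replace("T", "U")

lanes = sanger_lanes("atgaccatgattacg")
read = read_gel(lanes)
print(lanes)         # fragment lengths per lane
print(read)          # sequence read off the gel, 5'->3'
print(to_rna(read))  # the same sequence written as RNA
```

For the 15-base template in the problem, the shortest fragment falls in the C lane and the second shortest in the G lane, which matches the reading "CG" described in the text.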
Note that the information content is the same; only the T's have been replaced by U's.

The codon table
                    Middle base
5'-Base    U(=T)   C      A      G      3'-Base
U(=T)      Phe     Ser    Tyr    Cys    U(=T)
           Phe     Ser    Tyr    Cys    C
           Leu     Ser    Term   Term   A
           Leu     Ser    Term   Trp    G
C          Leu     Pro    His    Arg    U(=T)
           Leu     Pro    His    Arg    C
           Leu     Pro    Gln    Arg    A
           Leu     Pro    Gln    Arg    G
A          Ile     Thr    Asn    Ser    U(=T)
           Ile     Thr    Asn    Ser    C
           Ile     Thr    Lys    Arg    A
           Met     Thr    Lys    Arg    G
G          Val     Ala    Asp    Gly    U(=T)
           Val     Ala    Asp    Gly    C
           Val     Ala    Glu    Gly    A
           Val     Ala    Glu    Gly    G

Translating the DNA sequence
The order of amino acids in any protein is specified by the order of nucleotide bases in the DNA. Each amino acid is coded by a particular sequence of three bases.

To convert a DNA sequence, first find the starting codon. The starting codon is always the codon for the amino acid methionine. This codon is AUG in the RNA (or ATG in the DNA):

GCGCGGGUCCGGGCAUGAAGCUGGGCCGGGCCGUGC....
              Met

In this particular example the next codon is AAG. The first base (5' end) is A, so that selects the 3rd major row of the table. The second base (middle base) is A, so that selects the 3rd column of the table. The last base of the codon is G, selecting the last line in the block of four. This entry, AAG, is Lysine (Lys) in the table. Therefore the second amino acid is Lysine. The first few residues, and their codons, are as follows:

Met Lys Leu Gly Arg ...
AUG AAG CUG GGC CGG GCC GUG C..

This procedure is exactly what cells do when they synthesize proteins based on the mRNA sequence. The process of translation in cells occurs in a large complex called the ribosome.

Automated procedure for DNA sequencing
A computer read-out of the gel generates a "false color" image where each color corresponds to a base. Then the intensities are translated into peaks that represent the sequence.
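The codon-table lookup described under "Translating the DNA sequence" can be sketched in Python as follows. This is a minimal illustration, assuming the standard genetic code and translation from the first AUG found; the names `CODON_TABLE`, `THREE_LETTER` and `translate` are hypothetical and chosen only for this example.

```python
from itertools import product

BASES = "UCAG"
# Standard genetic code in one-letter form, for codons in the order
# UUU, UUC, UUA, UUG, UCU, ... GGG ("*" marks a termination codon).
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}

THREE_LETTER = {
    "A": "Ala", "R": "Arg", "N": "Asn", "D": "Asp", "C": "Cys",
    "Q": "Gln", "E": "Glu", "G": "Gly", "H": "His", "I": "Ile",
    "L": "Leu", "K": "Lys", "M": "Met", "F": "Phe", "P": "Pro",
    "S": "Ser", "T": "Thr", "W": "Trp", "Y": "Tyr", "V": "Val",
}

def translate(sequence):
    """Translate from the first AUG until a stop codon or the end of the data."""
    rna = sequence.upper().replace("T", "U")  # accept DNA or RNA input
    start = rna.find("AUG")
    if start == -1:
        return []
    residues = []
    for i in range(start, len(rna) - 2, 3):
        aa = CODON_TABLE[rna[i:i + 3]]
        if aa == "*":  # termination codon
            break
        residues.append(THREE_LETTER[aa])
    return residues

print(translate("GCGCGGGUCCGGGCAUGAAGCUGGGCCGGGCCGUGC"))
```

Running this on the example sequence reproduces the residues listed in the text (Met, Lys, Leu, Gly, Arg) and continues with the remaining complete codons GCC and GUG.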
High-throughput sequencing: Capillary electrophoresis
The Human Genome Project has spurred an effort to develop faster, higher-throughput, and less expensive technologies for DNA sequencing. Capillary electrophoresis (CE) separation has many advantages over slab gel separations. CE separations are faster and are capable of producing greater resolution. CE instruments can use tens and even hundreds of capillaries simultaneously. The figure shows a simple CE setup where the fluorescently labeled DNA is detected as it exits the capillary.
[Figure: CE setup with laser, focusing lens, sheath flow cuvette, sheath flow, beam block, collection lens, filter, PMT]

DNA sequencing.
Dideoxy analogs of normal nucleotide triphosphates (ddNTP) cause premature termination of a growing chain of nucleotides.

ACAGTCGATTG
ACAddG
ACAGTCddG
ACAGTCGATTddG

Fragments are separated according to their sizes in gel electrophoresis. The fragment lengths show the positions of "G" in the original DNA sequence.

Nucleotides and phosphodiester bond.
[Figure: nucleotides joined by a phosphodiester bond]

Genomic sequencing.
• Individual chromosomes are broken into random 100 kb fragments.
• This library of fragments is screened to find overlapping fragments (contigs).
• Unique overlapping clones are chosen for sequencing.
• Overlapping sequenced clones are put together using computer programs.

Sequencing cDNA libraries.
• mRNA is pooled from the tissues which express genes.
• cDNA libraries are prepared by copying mRNA with reverse transcriptase.
• Expressed Sequence Tags (ESTs): partial sequences of expressed genes.
• Comparing translated ESTs to annotated proteins: annotation of genes.

Gene prediction.
Gene: a DNA sequence encoding protein, rRNA, tRNA …
The gene concept is complicated:
- Introns/exons
- Alternative splicing
- Genes-in-genes
- Multisubunit proteins

Gene structure.
[Figure: promoter sequences (-35, -10) followed by the gene, running from ATG to TER]
ATG: start codon; TER (TAA, TAG, TGA): termination codons

Codon usage tables.
- Most amino acids can be encoded by several codons.
- Each organism has a characteristic pattern of codon usage.

Problems arising in gene prediction.
• Distinguishing pseudogenes (former genes that no longer work) from genes.
• Exon/intron structure in eukaryotes; exon flanking regions are not very well conserved.
• Exons can be combined in alternative ways (alternative splicing).
• Genes can overlap each other and occur on different strands of DNA.

Gene identification
Homology-based gene prediction
• Similarity searches (e.g. BLAST, BLAT)
• ESTs
Ab initio gene prediction
• Prokaryotes: ORF identification
• Eukaryotes: promoter prediction, polyA-signal prediction, splice site and start/stop-codon predictions

Prokaryotic genes: searching for ORFs.
- Small genomes have high gene density (Haemophilus influenzae: 85% genic)
- No introns
- Operons: one transcript, many genes
- Open reading frame (ORF): a contiguous set of codons that starts with a Met codon and ends with a stop codon.

Example of ORFs.
There are six possible reading frames in each sequence: three for each direction of transcription.

Confirming gene location using EST libraries.
Expressed Sequence Tags (ESTs) are sequenced short segments of cDNA. They are organized in the database "UniGene". If a region matches ESTs with high statistical significance, then it is a gene or pseudogene.
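The ORF search described for prokaryotic genes can be sketched in Python as a scan of all six reading frames. This is a simplified illustration, not a gene finder: it assumes an ORF is an ATG followed in frame by the first TAA, TAG or TGA, and the names `find_orfs` and `min_codons` are hypothetical.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")
STOP_CODONS = {"TAA", "TAG", "TGA"}

def reverse_complement(dna):
    """Return the opposite strand, written 5'->3'."""
    return dna.translate(COMPLEMENT)[::-1]

def find_orfs(dna, min_codons=2):
    """Scan the three frames of each strand for ATG...stop stretches.

    Returns tuples (strand, frame, start, orf); start is the index of the
    ATG on the strand being scanned, and orf includes the stop codon.
    """
    dna = dna.upper()
    orfs = []
    for strand, seq in (("+", dna), ("-", reverse_complement(dna))):
        for frame in range(3):
            start = None
            for i in range(frame, len(seq) - 2, 3):
                codon = seq[i:i + 3]
                if start is None and codon == "ATG":
                    start = i  # first Met codon opens a candidate ORF
                elif start is not None and codon in STOP_CODONS:
                    if (i + 3 - start) // 3 >= min_codons:
                        orfs.append((strand, frame, start, seq[start:i + 3]))
                    start = None  # look for the next ORF in this frame
    return orfs

print(find_orfs("CCATGAAGCTGTAGGG"))
```

A real analysis would add a realistic minimum length and then compare candidate ORFs against EST or protein databases, as outlined in the text.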