Genomics
• Structural, functional
• Genome, Transcriptome, Proteome, Metabolome, Interactome
www.the-scientist.com

Genomics or Genetics?
“What's the Difference? Well, as a rule, genetics is the study of single genes in isolation. Genomics is the study of all the genes in the genome and the interactions among them and their environment(s).
Analogy 1
If genomics is like a garden, genetics is like a single plant. If the plant isn’t flowering, you could study the plant itself (genetics) or look at the surroundings to see if it is too crowded or shady (genomics) – both approaches are probably needed to find out how to make your plant blossom.”
http://www.genomebc.ca/education/articles/genomics-vs-genetics/

Genomics and Molecular Markers
Structural genomics for plant breeders and applied geneticists = molecular markers
• How many genes determine important traits?
• Where are these genes located?
• How do the genes interact?
• What is the role of the environment in the phenotype?
• Molecular breeding: gene discovery, characterization, and selection using molecular tools
• Molecular markers are a key implement in the molecular breeding toolkit

What is a Molecular Marker?
Markers are based on polymorphisms
• Amplified fragment length polymorphism
• Restriction fragment length polymorphism
• Single nucleotide polymorphism
• The polymorphisms become the alleles at marker loci
• The marker locus is not necessarily a gene: the polymorphism may be in the dark matter, in a UTR, in an intron, or in an exon
• Non-coding regions may be more polymorphic

DNA Mutations & Polymorphisms
• Mutations are changes in the nucleotide sequence of genomic DNA that can be transmitted to the descendants.
• If these changes occur in the sequence of a gene, the result is called a mutant allele. The most frequent allele is called the wild type.
• A DNA sequence is polymorphic if there is variation among the individuals of the population.

Types of DNA Mutations (1)
Wildtype
5’ – AGCTGAACTCGACCTCGCGATCCGTAGTTAGACTAG – 3’
Substitution (transition: A → G)
5’ – AGCTGAACTCGGCCTCGCGATCCGTAGTTAGACTAG – 3’
Substitution (transversion: G → C)
5’ – AGCTCAACTCGACCTCGCGATCCGTAGTTAGACTAG – 3’
Deletion (single bp)
5’ – AGCTAACTCGACCTCGCGATCCGTAGTTAGACTAG – 3’
Deletion (DNA segment; the 10 bp GAACTCGACC are removed)
5’ – AGCTTCGCGATCCGTAGTTAGACTAG – 3’

Types of DNA Mutations (2)
Wildtype
5’ – AGCTGAACTCGACCTCGCGATCCGTAGTTAGACTAG – 3’
Insertion (single bp)
5’ – AGCTGAACTACGACCTCGCGATCCGTAGTTAGACTAG – 3’
Insertion (DNA segment)
5’ – AGCTGAACTAGTCTGCCCGACCTCGCGATCCGTAGTTAGACTAG – 3’
Inversion
5’ – AGCAGTTGACGACCTCGCGATCCGTAGTTAGACTAG – 3’
Transposition
5’ – AGCTCGACCTCGCGATCCGTAGTTATGAACGACTAG – 3’

Why Use Markers?
A way of dealing with the
• Large number of genes per genome
• Huge genome size
• Technical challenges and cost of whole genome sequencing
The search for DNA polymorphisms was not driven by a desire to complicate things, but rather by the low number of naked eye polymorphisms (NEPs)
Markers may be linked to target genes
Markers in target genes are perfect markers
What is a perfect marker for a gene deletion?

DNA Markers
• Polymorphisms can be visualized at the metabolome, proteome, or transcriptome level, but for a number of reasons (both technical and biological) DNA-level polymorphisms are currently the most targeted
• Regardless of whether it is a “perfect” or a “linked” DNA marker, there are two key considerations that need to be addressed in order for the researcher/user to visualize the underlying genetic polymorphism
• Applications in Mapping and Marker Assisted Breeding

Key steps for DNA Markers
1. Finding and understanding the genetic basis of the DNA-level polymorphism, which may be as small as a single nucleotide polymorphism (SNP) or as large as an insertion/deletion (INDEL) of thousands of nucleotides
2. Detecting the polymorphism via a specific assay or "platform".
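
The mutation classes listed above, and the SNP versus INDEL distinction in key step 1, can be made concrete with a short Python sketch. It is illustrative only: the function names `classify_substitution` and `describe_variant` are hypothetical, the sequences are the wildtype and variant examples from the mutation slides, and length comparison alone cannot recognise inversions or transpositions, which are reported only as multiple differing positions.

```python
# Illustrative sketch (not part of the original slides): label a variant
# relative to the wildtype sequence used in the mutation examples.
PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}


def classify_substitution(ref, alt):
    """Return 'transition' (purine<->purine or pyrimidine<->pyrimidine)
    or 'transversion' (purine<->pyrimidine) for a single-base change."""
    ref, alt = ref.upper(), alt.upper()
    if ref == alt:
        raise ValueError("ref and alt are identical: not a substitution")
    pair = {ref, alt}
    same_class = pair <= PURINES or pair <= PYRIMIDINES
    return "transition" if same_class else "transversion"


def describe_variant(wildtype, variant):
    """Coarse description based on length and position-wise comparison."""
    if len(variant) < len(wildtype):
        return f"deletion of {len(wildtype) - len(variant)} bp"
    if len(variant) > len(wildtype):
        return f"insertion of {len(variant) - len(wildtype)} bp"
    diffs = [(i, w, v) for i, (w, v) in enumerate(zip(wildtype, variant)) if w != v]
    if not diffs:
        return "identical"
    if len(diffs) == 1:
        i, w, v = diffs[0]
        # Positions are reported 1-based, counting from the 5' end.
        return f"{classify_substitution(w, v)} {w}>{v} at position {i + 1}"
    return f"{len(diffs)} differing positions (rearrangement or multiple changes)"


WILDTYPE = "AGCTGAACTCGACCTCGCGATCCGTAGTTAGACTAG"
VARIANTS = {
    "transition": "AGCTGAACTCGGCCTCGCGATCCGTAGTTAGACTAG",
    "transversion": "AGCTCAACTCGACCTCGCGATCCGTAGTTAGACTAG",
    "single-bp deletion": "AGCTAACTCGACCTCGCGATCCGTAGTTAGACTAG",
    "segment deletion": "AGCTTCGCGATCCGTAGTTAGACTAG",
}

if __name__ == "__main__":
    for label, seq in VARIANTS.items():
        print(f"{label}: {describe_variant(WILDTYPE, seq)}")
```

Comparing lengths first keeps the sketch simple; a real variant caller would align the sequences before classifying differences.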
The same DNA polymorphism may be amenable to different detection assays

Applications of Marker Maps
1. Establish evolutionary relations: homoeology, synteny and orthology
• Homoeology: Chromosomes, or chromosome segments, that are similar in terms of the order and function of the genetic loci.
  – Homoeologous chromosomes may occur within a single allopolyploid individual (e.g. the A, B, and D genomes in wheat)
  – They may also be found in related species (e.g. the 1A, 1B, 1D series of wheat and the 1H of barley)
• Orthology: Refers to genes in different species which are so similar in sequence that they are assumed to have originated from a single ancestral gene.
• Synteny: Classically refers to linked genes on the same chromosome
  – Also used to refer to conservation of gene order across species
2. Associations due to linkage or pleiotropy
• Identify markers that can be used in marker assisted selection
3. Locate genes for qualitative and quantitative traits
• Map-based cloning strategies

Polymorphism Detection Issues
Polymorphisms vs. assays
• An ever-increasing number of technology platforms have been, and are being, developed to deal with these two key considerations
• These platforms lead to a bewildering array of acronyms for different types of molecular markers. To add to the complexity, the same type of marker may be assayed on a variety of platforms
• The ideal marker is one that targets the causal polymorphism (perfect marker). It is not always available, though.

Restriction Fragment Length Polymorphism (RFLP)
• RFLPs (Botstein et al. 1980) are differences in restriction fragment lengths caused by a SNP or INDEL that creates or abolishes a restriction endonuclease recognition site.
• RFLP assays are based on hybridization of a labeled DNA probe to a Southern blot (Southern 1975) of DNA digested with a restriction endonuclease

Labeled probe: 3’ TGGCTAGCT 5’
Target 1: 5’-CCTAACCGATCGACTGAC-3’ (contains the sequence complementary to the probe; the probe pairs here)
Target 2: 5’-GGATTGGCTAGCTGACTG-3’ (no complementary sequence)

RFLP Steps
[Figure not reproduced]

Co-Dominant RFLP Polymorphism
Allele A (restriction site GAATTC present)
5’ ACATTGCGAATTCATGTACGCAT 3’
3’ TGTAACGCTTAAGTACATGCGTA 5’
Allele a (restriction site lost: GAAGTC)
5’ ACATTGCGAAGTCATGTACGCAT 3’
3’ TGTAACGCTTCAGTACATGCGTA 5’
[Gel figure: lanes Ind 1 to Ind 8, labelled A, a, a, A, a, a, A, Aa]

Dominant RFLP Polymorphisms
[Figure: same allele A / allele a sequences as above; gel lanes Ind 1 to Ind 8]

Features of RFLPs
• Co-dominant, unless probe contains restriction site
• Locus-specific
• Genes can be mapped directly
• Supply of probes and markers is unlimited
• Highly reproducible
• Requires no special instrumentation
• Radioactive detection

Amplified Fragment Length Polymorphism (AFLP)
• Fragment genomic DNA with frequent and rare cutters
• AFLPs (Vos et al. 1995) are differences in restriction fragment lengths caused by SNPs or INDELs that create or abolish restriction endonuclease recognition sites.
• AFLP assays are performed by selectively amplifying a pool of restriction fragments using PCR.
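
How a single nucleotide change creates a fragment length polymorphism can be sketched in a few lines of Python. This is an illustration under stated assumptions, not a laboratory protocol: the helper names `digest` and `expected_spacing` are hypothetical, only the top strand is modelled, the alleles are the two short sequences from the RFLP slide, and the spacing estimate assumes all four bases are equally frequent and independent. The recognition sites used are the standard ones for EcoRI (G^AATTC) and MseI (T^TAA).

```python
# Illustrative in-silico digest (assumption-laden sketch, top strand only).

def digest(seq, site, cut_offset):
    """Cut seq at every occurrence of site and return the fragment lengths.

    cut_offset is the number of bases of the recognition site that stay
    on the left-hand fragment (1 for EcoRI, which cuts G^AATTC).
    """
    cuts = []
    start = seq.find(site)
    while start != -1:
        cuts.append(start + cut_offset)
        start = seq.find(site, start + 1)
    bounds = [0] + cuts + [len(seq)]
    return [right - left for left, right in zip(bounds, bounds[1:])]


def expected_spacing(site):
    """Expected bp between sites, assuming equal, independent base frequencies."""
    return 4 ** len(site)


ECORI_SITE, ECORI_CUT = "GAATTC", 1   # rare (six-base) cutter
MSEI_SITE = "TTAA"                    # frequent (four-base) cutter

ALLELE_A = "ACATTGCGAATTCATGTACGCAT"  # restriction site present
ALLELE_a = "ACATTGCGAAGTCATGTACGCAT"  # T>G change abolishes the site

if __name__ == "__main__":
    print("Allele A fragments:", digest(ALLELE_A, ECORI_SITE, ECORI_CUT))
    print("Allele a fragments:", digest(ALLELE_a, ECORI_SITE, ECORI_CUT))
    print("Expected EcoRI spacing:", expected_spacing(ECORI_SITE), "bp")
    print("Expected MseI spacing:", expected_spacing(MSEI_SITE), "bp")
```

Allele A is cut into two fragments while allele a stays intact, which is the length difference an RFLP or AFLP assay detects. Real genomes deviate from equal base frequencies, so the spacing figures are rough expectations only.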

AFLP Protocol
• Digestion with 2 restriction enzymes: EcoRI (1/4096) and MseI (1/256)
• Restriction site adapter ligation
• Selective preamplification
• Amplification (selective primer extensions shown in the figure: CTT, ATG)

AFLP Polymorphisms
• Polymorphisms between genotypes may arise from:
  – Sequence variation in one or both restriction sites
  – Sequence variation in the region immediately adjacent to the restriction sites
  – Insertions or deletions within an amplified fragment
• Band Detection
  – Denaturing polyacrylamide gel electrophoresis & autoradiography or silver staining
  – Sequencing

Features of AFLPs
• Very high multiplex ratio
• Very high throughput
• Off-the-shelf technology
• Fairly reproducible
• Dominant and co-dominant
• Radioactive detection but less hazardous options available
• Can convert favourite marker to SCAR

Simple Sequence Repeats (SSR)
• SSRs or microsatellites (Nakamura et al. 1987) are tandemly repeated mono-, di-, tri-, tetra-, penta-, and hexa-nucleotide motifs
• SSR length polymorphisms are caused by differences in the number of repeats
• Assayed by PCR amplification using pairs of oligonucleotide primers specific to unique sequences flanking the SSR
• Detection by autoradiography, silver staining, sequencing…

SSR Repeats
Repeat Motifs
• AC repeats tend to be more abundant than other di-nucleotide repeat motifs in animals (Beckmann and Weber 1992)
• The most abundant di-nucleotide repeat motifs in plants, in descending order, are AT, AG, and AC (Lagercrantz et al. 1993; Morgante and Olivieri 1993)
• Typically, SSRs are developed for di-, tri-, and tetra-nucleotide repeat motifs
• CA and GA have been widely used in plants
• Tetra-nucleotide repeats have the potential to be very highly polymorphic; however, many are difficult to amplify

Simple sequence repeat in hazelnut
Note the difference in repeat length AND the consistent flanking sequence
[Figure not reproduced]

SSR Protocol
• Individual 1: (AC)x9, 51 bp
• Individual 2: (AC)x11, 55 bp
Chloroplast SSRs of pine: Powell et al. 1995. Proc Natl Acad Sci U S A. 92(17): 7759–7763.

Features of SSRs
• Highly polymorphic
• Highly abundant and randomly dispersed
• Co-dominant
• Locus-specific
• High throughput
• Can be automated

Diversity Arrays Technology - DArT
DArT Analysis
• 2,500 markers per sample
• 94 samples: ~$4,500
• ~2 cents per datapoint
http://www.diversityarrays.com/

Features of DArT
• Very high multiplex ratio
• Very high throughput
• Bi-allelic
• Dominant marker system
• Requires substantial investment
• Fairly reproducible
• DArT sequences now available

Single Nucleotide Polymorphisms (SNP)
• DNA sequence variations that occur when a single nucleotide (A, T, C, or G) in the genome sequence is altered
          …..ATGCTCTTACTGCTAGCGC……
          …..ATGCTCTTACTGCTAGCGC……
          …..ATGCTCTTCCTGCTAGCGC……
          …..ATGCTCTTACTGCAAGCGC……
Consensus …..ATGCTCTTNCTGCNAGCGC……
(N marks the positions where alleles differ)

Features of SNP
• Highly abundant (1 every 200 bp in barley; Rostoks et al., 2005)
• Locus-specific
• Co-dominant and bi-allelic
• Basis for high-throughput and massively parallel genotyping technologies
• Genic rather than anonymous marker
• Phenotype due to SNP can be mapped directly

SNPs in Allopolyploids
[Figure not reproduced] www.cerealsdb.uk.net

Varietal SNPs in Allopolyploids
[Figure not reproduced] www.cerealsdb.uk.net

SNP Detection Strategy
• Locus specific system
  – Many samples with few markers
    • Marker assisted selection in commercial breeding programmes for key target characters
    • Addition of characteristic major genes to e.g. mapping populations and association panels
    • KASP: buy master mix and synthesise own primers
• Genome wide system
  – Fewer samples with many markers
    • Germplasm characterization, academic and breeding
    • Genotyping panels for GWAS
    • Illumina or Affymetrix for higher density arrays, costs ↓
• What about bi-parental populations?
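
The consensus line in the SNP example above, with N at the variable positions, can be reproduced with a minimal Python sketch. The function names `snp_positions`, `alleles_at` and `consensus` are hypothetical, the four sequences are those shown on the SNP slide, and the sketch assumes the sequences are already aligned and of equal length, which real read data would not guarantee.

```python
# Illustrative sketch: find variable sites in pre-aligned sequences and
# build a consensus that masks them with an ambiguity symbol.

def snp_positions(seqs):
    """Return 0-based indices of columns where more than one base occurs."""
    length = len(seqs[0])
    if any(len(s) != length for s in seqs):
        raise ValueError("sequences must be aligned to the same length")
    return [i for i in range(length) if len({s[i] for s in seqs}) > 1]


def alleles_at(seqs, position):
    """Return the sorted set of bases observed at one column."""
    return sorted({s[position] for s in seqs})


def consensus(seqs, ambiguity="N"):
    """Keep invariant bases; replace polymorphic columns with ambiguity."""
    if len({len(s) for s in seqs}) != 1:
        raise ValueError("sequences must be aligned to the same length")
    return "".join(
        column[0] if len(set(column)) == 1 else ambiguity
        for column in zip(*seqs)
    )


# The four aligned sequences from the SNP slide.
SEQUENCES = [
    "ATGCTCTTACTGCTAGCGC",
    "ATGCTCTTACTGCTAGCGC",
    "ATGCTCTTCCTGCTAGCGC",
    "ATGCTCTTACTGCAAGCGC",
]

if __name__ == "__main__":
    print(consensus(SEQUENCES))
    for pos in snp_positions(SEQUENCES):
        # Report 1-based positions alongside the alleles observed.
        print(pos + 1, "/".join(alleles_at(SEQUENCES, pos)))
```

Both variable sites carry exactly two alleles here, which matches the description of SNPs as bi-allelic markers.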

Affymetrix Axiom Technology
• Two colour ligation based assay
• Utilises unique oligonucleotide complementary to flanking genomic sequence
• Automated parallel processing

Wheat SNP Arrays
[Figure not reproduced]

KASP™ Genotyping
More Information: http://www.lgcgroup.com/services/genotyping/#.VCMgyPldWJ0

Wheat SNP Resources
[Figure not reproduced] www.cerealsdb.uk.net

Wheat SNP Haplotypes
[Figure not reproduced] www.cerealsdb.uk.net

Sequencing Approaches
• RRL: Reduced Representation Library
• RAD-Seq: Restriction Site Associated DNA Sequencing
• GBS: Genotyping by Sequencing
• See Davey et al. (2011) Nature Reviews Genetics 12: 499-510

RADseq: Restriction-site Associated DNA markers
• Uses Illumina sequencing technology
• Based on digestion with restriction enzymes. An adapter is ligated at the restriction site, and fragments of up to 5 kb around the target site are sequenced.
• Bioinformatics analysis is used to find SNPs in the amplified regions

Genotyping by Sequencing
• Genomic DNA digestion (PstI, MseI)
• Ligation of barcode adaptor + common adaptor
• Pooling and cleanup
• PCR enrichment (P1, P2)
• Library size analysis
• Illumina sequencing

GP x Morex map
[Figure not reproduced: linkage maps listing map positions and GBS marker names (prefixes MR_, BK_, BW_; some names carry chromosome arm suffixes such as 1H, 2HL, 3HS, 3HL)]

SNPs vs GbS
• SNPs
  – Minimal input, don’t even have to isolate DNA
  – Rapid turn around and data is ready to use
  – Markers in known genes and generally mapped
  – More useful in GWAS
• GbS
  – Now quite cheap and potentially many markers
  – Rapid generation of sequence output but markers are anonymous
    • Find an expert bio-informatician to align your data and, if possible, align to reference sequence
  – More useful in bi-parental mapping studies

Marker to Candidate Gene
[Figure not reproduced: chromosome 4H; a/b allele calls for Derkado, B83-12/21/5 and lines DH_001 to DH_036 at SNP markers (e.g. 11_21508, 11_20392) and the loci sdw1, ari-eGP and mlo; trait labels include Head, GrainN, Yield, Ferm_Ext, Viscosity, SNR, HWE, Glucose, GT25Sv, TGW, Tot_Sugars, Grains, Gwidth, Ferment and MillEn]