Transcript
DNA BARCODING
IMBB 2016, BecA-ILRI Hub, Nairobi
May 9 – 20, 2016
Joyce Nzioki

DNA Barcoding: a new diagnostic tool for rapid species recognition, identification and discovery

DNA barcoding: towards an inventory of life
A DNA barcode is a short gene sequence taken from a standardized portion of the genome, used to identify species.

What is DNA Barcoding?
•  A way of identifying samples to species based on a short standardized gene region
•  Keywords:
   –  Identify
   –  Samples
   –  Species
   –  Gene
   –  Short
   –  Standardized

An Internal ID System for All Animals
DNA Barcoding with Cytochrome Oxidase subunit 1 (CO1)

The ideal gene to study
•  Present in all species
•  Variable, but not too variable
•  Standardized among scientists around the world

The CO1 Gene
✓  All eukaryotes contain mitochondria; CO1 encodes a mitochondrial protein needed for cells to make ATP.
✓  CO1 is almost identical within a species but varies between different species.
✓  Scientists agree that the CO1 gene is used for animal barcoding.

Non-CO1 regions for other taxa
•  Land plants:
   o  Chloroplast matK and rbcL
   o  70-75% resolving ability, higher in angiosperms
   o  Non-coding plastid and nuclear regions being explored
•  Fungi:
   o  Nuclear ITS region, including coding and non-coding regions
   o  72% effective at species level; supplementary regions used
•  Bacteria:
   o  16S ribosomal gene

Why we need barcoding
•  Identifying specimens – recognition of named (described) species
•  Discovering new species – speeding up discovery of the remaining biodiversity, as traditional (morphology-based) taxonomy is too slow.
Image credit: Barcoding Institute of Ontario
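The marker choices listed on the slides above (CO1 for animals, matK and rbcL for land plants, ITS for fungi, 16S for bacteria) can be written down as a small lookup table. This is only an illustrative sketch: the dictionary keys, the name `BARCODE_MARKERS` and the function `markers_for` are hypothetical and do not come from the slides or from any barcoding software.

```python
# Standard barcode markers per group, as listed on the slides.
# Key spellings and all names here are illustrative assumptions.
BARCODE_MARKERS = {
    "animals": ["CO1"],
    "land plants": ["matK", "rbcL"],
    "fungi": ["ITS"],
    "bacteria": ["16S"],
}


def markers_for(group: str) -> list[str]:
    """Return the marker(s) recorded for a group, ignoring case and outer spaces."""
    key = group.strip().lower()
    if key not in BARCODE_MARKERS:
        raise KeyError(f"no standard barcode marker recorded for {group!r}")
    return BARCODE_MARKERS[key]
```

For example, `markers_for("Fungi")` looks up the fungal marker, while an unlisted group raises a `KeyError` rather than guessing.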
The Vision
Credit: iBOL
Potential applications
a)  Controlling agricultural pests – identifying them at any life stage eases control before crop damage occurs.
b)  Identifying disease vectors – allows identification of disease-causing vectors in animals and humans.
c)  Sustaining natural resources – by monitoring illegal trade in products made from natural resources such as hardwood.
d)  Protecting endangered species – the primate population in Africa has been reduced by 90% due to bushmeat hunting.
e)  Monitoring water quality – by studying the organisms in lakes, rivers and streams, the health of these waters can be measured.
f)  Routine authentication of natural health products.
g)  Biosecurity
h)  Identifying plants from their leaves even when flowers and fruits are not available.

The Barcode of Life Data System (BOLD)
Navigating the system

Databases
•  Public data portal: a database of all public sequences on BOLD.
•  Barcode Index Numbers (BINs) database: BINs are an interim taxonomic system for animals.
•  Primer database: a database of barcode primers and primer statistics.
•  Publication database: a community-maintained database of barcode papers.

Taxonomy
A publicly available resource which displays images, distribution maps and other details for each taxon on BOLD.

Identification
The animal, plant and fungal identification engines are based on the CO1, matK/rbcL and ITS genes respectively.

Workbench
The workbench provides access to manage and contribute to DNA barcode projects, as well as the BOLD data analysis tools.

Resources
Technical documentation, user support and additional resources are available at this link.
Searching Public Data
•  Users can enter a combination of search terms to narrow their search, e.g. Lepidoptera Canada will return all of the Lepidoptera records collected in Canada.
•  Quotation marks must be used for exact-match retrieval of multi-word terms, e.g. “United States” Aves will return results for US birds.
•  A minus (-) operator will omit certain results from the search, e.g. “Biodiversity Institute of Ontario” Sesiidae -Manitoba will deliver results for the Sesiidae stored in the Biodiversity Institute of Ontario, but not collected in Manitoba.

BOLD Identification Engine
•  The BOLD ID engine accepts sequences from the 5’ region of the CO1 gene and returns species-level identification (when possible).
•  BOLD uses the BLAST algorithm to identify single-base indels before aligning the protein translation through a profile to a Hidden Markov Model of the CO1 protein.
•  In the BOLD ID engine, ITS is the default identification marker for fungal barcodes, and rbcL/matK for plant barcodes.

Description of the 6 types of identification databases on BOLD

Database Name               | Description                                                   | Database Size
All Barcode Records         | Every CO1 sequence on BOLD > 500 bp                           | 4,407,257 sequences
Species Barcode Records     | Every CO1 sequence > 500 bp with species-level identification | 2,573,278 sequences
Public Barcode Records      | Every public CO1 sequence > 500 bp                            | 980,022 sequences
Full Length Barcode Records | Every CO1 sequence on BOLD > 640 bp                           | 1,633,770 sequences
Fungal Records              | Every ITS sequence on BOLD > 100 bp                           | >15,000 sequences
Plant Records               | Every rbcL and matK sequence on BOLD > 500 bp                 | >95,000 and >70,000 sequences respectively

Taxonomy browser

Primer database

BARCODE Data Standards
•  A set of required elements for a reserved keyword (‘BARCODE’) in GenBank
•  Ensure data longevity by archiving in GenBank
•  Enable comparisons among records from approved BARCODE gene regions
•  Ensure minimum quality of sequences
•  Enable georeferencing
•  Provide traceability to the voucher specimen
•  Ensure access to raw sequencer data
•  Pave the way for regulatory and forensic use

BARCODE Data Standards
•  Include at least 500 contiguous unambiguous base pairs from bi-directional sequencing within the approved barcode region.
•  Include no more than 1% ambiguous sites for the entire submitted sequence.
•  Include the name of the gene region used.
•  Be associated with the trace file submitted to the NCBI Trace Archive or the Ensembl Trace Server.
•  Latitude and longitude, name of the identifier, name of the collector and date of collection are also recommended but not required.

How DNA Barcodes should not be used
“It is expected that DNA barcodes will contribute to the discovery and formal recognition of new species. However, DNA barcodes should not be used as the sole criterion for description of new species, which instead require analysis of diverse data, including morphology, ecology and behavior, as well as genetics.”

The End
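Two of the BARCODE data standards listed above are purely about the sequence string and can be checked mechanically: at least 500 contiguous unambiguous base pairs, and no more than 1% ambiguous sites over the whole sequence. The following is a minimal sketch of such a check, not an official validator. It assumes that any character other than A, C, G or T (for example N or another IUPAC ambiguity code) counts as ambiguous, and all function names are hypothetical. It does not check bi-directional sequencing, the gene region, or trace files.

```python
# Assumption: only A, C, G and T count as unambiguous bases.
UNAMBIGUOUS = set("ACGT")


def longest_unambiguous_run(seq: str) -> int:
    """Length of the longest stretch made only of A, C, G or T."""
    best = run = 0
    for base in seq.upper():
        run = run + 1 if base in UNAMBIGUOUS else 0
        best = max(best, run)
    return best


def ambiguous_fraction(seq: str) -> float:
    """Share of sites in the whole sequence that are not A, C, G or T."""
    if not seq:
        return 0.0
    ambiguous = sum(base not in UNAMBIGUOUS for base in seq.upper())
    return ambiguous / len(seq)


def meets_sequence_standards(
    seq: str, min_run: int = 500, max_ambiguous: float = 0.01
) -> bool:
    """True if both sequence-level thresholds from the slides are met."""
    return (
        longest_unambiguous_run(seq) >= min_run
        and ambiguous_fraction(seq) <= max_ambiguous
    )
```

A sequence with a single N in the middle of 998 clean bases would fail the first test (its longest clean run is 499) even though it easily passes the 1% limit, which is why the two thresholds are checked separately.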
Some slides were adapted from Mark Wamalwa and David E. Schindel