DNA Barcoding in Plants:
Biodiversity Identification and
Discovery
University of
Sao Paulo
December
2009
W. John Kress
Department of Botany
National Museum of Natural History
Smithsonian Institution
New Technologies
for Taxonomy
DNA Barcodes
UNITED STATES NATIONAL HERBARIUM
4.7 Million Specimens
NATIONAL MUSEUM OF NATURAL HISTORY
124 Million Specimens
DNA Barcodes
A short universal gene sequence taken
from a standardized portion of the
genome used to identify species
Uses of DNA Barcodes
1. Research tool for taxonomists:
• To aid identification of species
• To expand species diagnoses to all life history stages, including fruits, seeds, dimorphic sexes, damaged specimens, gut contents, scats
• To test consistency of species definitions with a DNA measure of variability
2. Applied tool for users of taxonomy:
• To identify regulated species, including invasives
• To test purity and identity of biological products
• To assist ecologists in field studies of poorly known organisms
3. Discovery tool:
• To flag potentially new species, especially undescribed and cryptic species
The Barcoding Process - 2 parts
1. Populate the barcode “library” with known species
• Collect tissue from voucher specimen
• Extract DNA
• PCR/Amplify/cycle sequence gene(s)
• Sequence
• Database
2. “BLAST” an unidentified specimen against the barcode library
• Sequence comparison
• New searching technologies
• Ultimately - handheld device ?
3. Put barcode sequences to work to answer compelling scientific questions
• Ecological forensics
• Community ecology and phylogenetics
Smithsonian's
National Museum of
Natural History
Caribbean Sponges
DNA Barcode
Pipeline
Select plant
material
DNA
Extraction
PCR
Data Editing
Robotic
Sequencing
Finished
“Barcode”
Library
The Primary Choice for Barcoding in
Animals: the Mitochondrial Genome
Figure: gene map of the animal mitochondrial genome (H-strand and L-strand), labeling D-Loop, small ribosomal RNA, large ribosomal RNA, ND1, ND2, COI, COII, ATPase subunit 8, ATPase subunit 6, COIII, ND3, ND4L, ND4, ND5, ND6, and Cyt b
What about Plants?
Why were plants behind?
• Finding the right gene
regions
• Mobilizing a consensus in
the botanical community
Finally….
• Consensus on gene
regions
• Moving ahead
Criteria for DNA Barcoding
• Contains sufficient variation to discriminate between species
• Conserved flanks for universal primers
All land plants
• Short, 300-800 bp
Limited by current sequencing technology, cost consideration (= 1 read length), and ability to use degraded samples
• Sequence Quality
Three Genomes of Plant Cells
for Barcode Candidates
Chloroplast
*High copy number
*Conserved structure
*Diversity of substitution rates across genes, introns, and intergenic spacers
Nuclear
*Contains the most variable loci
*Problems with multigene families
*Single-copy genes often technically difficult
Mitochondrial
*Locus of choice for animal barcoding is mitochondrial COI
*Limitations with plants
-Low divergence
-Rapid genome rearrangements
Atropa vs. Nicotiana Chloroplast Genomes
Complete (Schmitz-Linneweber et al. 2002)
Figure panels: 1% divergence; 2% divergence
Labeled intergenic spacers: trnL-F, trnV-atpE, atpB-rbcL, psbM-trnD, ycf6-psbM, trnC-ycf6, trnK-rps16, rpl36-rps8, trnH-psbA
Top Plant Barcode Candidate:
Intergenic Spacer trnH-psbA
CRITERIA FOR BARCODING
• Short, 300-800 bp
trnH-psbA = 450 bp
• Conserved flanks for universal
primers
trnH-psbA = 93-100% success
• Contains sufficient variation to
discriminate between species
trnH-psbA = 1.17%
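The variation criterion is stated as a percent divergence between sequences. A minimal sketch of one common way to compute such a figure (the uncorrected p-distance between two aligned sequences) is shown here. The function name, the gap-handling rule, and the toy sequences are assumptions for illustration; the slides do not say how the 1.17% value was calculated.

```python
def p_distance(seq_a: str, seq_b: str) -> float:
    """Proportion of differing sites between two aligned sequences.

    Sites where either sequence has a gap ('-') or an ambiguous base ('N')
    are skipped. This is one common convention, assumed here.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    compared = 0
    differing = 0
    for a, b in zip(seq_a.upper(), seq_b.upper()):
        if a in "-N" or b in "-N":
            continue
        compared += 1
        if a != b:
            differing += 1
    if compared == 0:
        raise ValueError("no comparable sites")
    return differing / compared


# Toy aligned fragments, invented for illustration (not real trnH-psbA data).
toy_a = "ACGTACGTAC"
toy_b = "ACGTACGTTC"
print(f"{100 * p_distance(toy_a, toy_b):.1f}% divergence")
```

Averaging this value over many species pairs gives a single per-locus figure that can be compared across candidate regions.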
A SINGLE-LOCUS PLANT BARCODE
Option #1: Best Candidate
Plastid Non-Coding
trnH-psbA
Many Other Regions Proposed:
accD, matK, ndhJ, rbcL, rpoC1, rpoB2,
trnL, YCF5, UPA, ITS, CO1
SAMPLING AND PCR SUCCESS:
39 Orders of Land Plants
A SINGLE-LOCUS PLANT BARCODE:
Comparative Results
A TWO-LOCUS PLANT BARCODE
Hierarchical and Complementary
rbcL = the “Anchor” (Plastid Coding Gene)
+
trnH-psbA = the “Identifier” (Plastid Noncoding Spacer)
INTERGENIC SPACERS – Indels,
Alignment, and Repeats:
Problems or Assets?
•Spacers for
Identification (and local-scale phylogenetics)
•Indels as added
characters for ID
•Partial sequences are
useful
•New Informatics Tools
for Searching the
Reference Database
•New technologies for
solving problems
Indel variation in segment of trnH-psbA
spacer among 57 species
Do we need a coding gene??
An Alternative Two-Locus Plant Barcode
CBOL Plant Working Group - 2009
Conclusion:
rbcL + matK
with trnH-psbA & other spacers as alternative barcodes
156 Cryptogams
81 Gymnosperms
170 Angiosperms
Figure labels: Universality; Discrimination
A THREE-LOCUS PLANT BARCODE
Hierarchical and Complementary
rbcL = the “Anchor” (Plastid Coding Gene)
+
trnH-psbA = the “Identifier” (Plastid Noncoding Spacer)
+
matK (Plastid Coding Gene)
Major Medicinal Plants of the World:
An Applied Test of DNA Barcoding
What is a medicinal plant?
We used a consensus of four
sources that list medicinal plants,
primarily:
World Economic Plants - A
Standard Reference
Major Medicinal Plants of the World:
An Applied Test of DNA Barcoding
• How we assembled our set:
– Selected ~1150 species
– Requested
• USDA germplasm
• USBG living collection
• Local gardens
• NMNH herbarium
– What we have:
• 768 species
• >168 Genera
• 113 Plant Families
• 4 accessions per species
Major Medicinal Plants of the World:
An Applied Test of DNA Barcoding
Two-locus approach:
create backbone of tree with rbcL as the Anchor; then separate individual species in smaller groups with trnH-psbA as the Identifier
Lamiales: Mentha
Results:
>94% success with rbcL/trnH-psbA
Figure labels: rbcL Anchor; trnH-psbA Identifier
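The hierarchical idea above (anchor first, identifier second) can be sketched as a two-step lookup. This is only an illustration under stated assumptions: the record layout, the `group` field, the simple identity score, and all names and sequences are hypothetical, and the slides build a tree rather than run this kind of lookup.

```python
def identity(a: str, b: str) -> float:
    """Fraction of matching positions, a crude stand-in for a real alignment score."""
    matches = sum(x == y for x, y in zip(a, b))
    return matches / max(len(a), len(b))


def two_tier_identify(query: dict, library: list) -> tuple:
    """Step 1: the rbcL anchor picks a group. Step 2: trnH-psbA picks a species in it."""
    anchor_hit = max(library, key=lambda rec: identity(query["rbcL"], rec["rbcL"]))
    group = anchor_hit["group"]
    candidates = [rec for rec in library if rec["group"] == group]
    best = max(candidates,
               key=lambda rec: identity(query["trnH_psbA"], rec["trnH_psbA"]))
    return group, best["species"]


# Invented toy library: two species share an rbcL sequence and differ only in the spacer.
toy_library = [
    {"species": "genus X sp. 1", "group": "genus X",
     "rbcL": "ACGTACGTAC", "trnH_psbA": "TTGACA"},
    {"species": "genus X sp. 2", "group": "genus X",
     "rbcL": "ACGTACGTAC", "trnH_psbA": "TTGTCA"},
    {"species": "genus Y sp. 1", "group": "genus Y",
     "rbcL": "ACGAACGTTC", "trnH_psbA": "TAGACA"},
]
toy_query = {"rbcL": "ACGTACGTAC", "trnH_psbA": "TTGTCA"}
print(two_tier_identify(toy_query, toy_library))
```

In the toy data the anchor alone cannot separate the two species of genus X; the spacer resolves them, which is the complementarity the slide describes.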
50-ha Forest Dynamics Plot on
Barro Colorado Island, Panama
Vital statistics of BCI
• Island in Panama Canal
– Premier Ecological Plot
• 296 tree species
– 1035 specimens (~3 accessions/species)
• 180 Genera
• 49 Families
• ~50% of genera have one species = easy test of barcoding
Why DNA Barcoding
on BCI?
Species identification
*forensic/ecological
Phylogenetic applications
*species/community
phylogenies
*functional trait mapping
50-ha Forest Dynamics Plots
Field Information Management
System
Collection Data Tab
Geographic Data Tab
Tissue Data Tab
50-ha Forest Dynamics Plot on
Barro Colorado Island, Panama
Barcode Success
Locus        PCR    Seq    ID Freq
trnH-psbA*   98%    95%    95%
matK         85%    69%    99%
rbcLa        94%    94%    75%
*Note: ~8% of sequences are partial
50-ha Forest Dynamics Plot on
Barro Colorado Island, Panama
Species Identification = BLAST
(Basic Local Alignment Search Tool)
• Designed to search for similarity among sequences
• Can quantify rates of resolution
• Use 281 barcode sequences as both library and query
RESULTS
• rbcLa + trnH-psbA + matK:
– 98% of all samples could be assigned to correct Species
– All ambiguity was in 4 genera: Psychotria, Ficus, Inga, Piper
– 100% of sequences were assigned to correct Genus
– Partial sequences were assigned correctly
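The library-as-query test described above can be sketched as follows. This is a simplified stand-in, not BLAST: it scores ungapped identity between toy sequences, and it counts a query as resolved only when no other species ties its top score. The function names, the scoring rule, and the data are assumptions for illustration.

```python
def identity(a: str, b: str) -> float:
    """Fraction of matching positions (a crude substitute for a BLAST score)."""
    matches = sum(x == y for x, y in zip(a, b))
    return matches / max(len(a), len(b))


def resolution_rate(library: dict) -> float:
    """Use every library sequence as a query against the whole library.

    A query counts as resolved when its own species is the only top-scoring hit.
    """
    resolved = 0
    for species, query in library.items():
        scores = {name: identity(query, seq) for name, seq in library.items()}
        top = max(scores.values())
        top_hits = [name for name, score in scores.items() if score == top]
        if top_hits == [species]:
            resolved += 1
    return resolved / len(library)


# Invented toy library: sp3 and sp4 share a sequence, so neither can be resolved.
toy_library = {
    "sp1": "ACGTACGT",
    "sp2": "ACGTTCGT",
    "sp3": "TTGTACGA",
    "sp4": "TTGTACGA",
}
print(f"{100 * resolution_rate(toy_library):.0f}% of queries resolved to species")
```

The unresolved cases here are exactly the ones where two species share a barcode, which mirrors the slide's observation that all ambiguity fell within a few genera.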
Barcodes and
Forensic Ecology
Barcode
Barcodes and Community
Ecology
The Components of
Biodiversity
Swenson 2009
Building a Community
Phylogeny with Phylomatic
Phylogenetically
clustered = High Plateau,
Low Plateau and Young
Habitats
Phylogenetically Overdispersed = Swamp and
Slope Habitats
Phylogenetically Random =
Stream and Mixed Habitats
Building a Community Phylogeny with
Barcodes: A Supermatrix of rbcL, matK, and
trnH-psbA
rbcLa
*aligns unambiguously
matK
*aligned with backtranslation (AA)
trnH-psbA
*aligned within ORDERS (Muscle), then orders placed within rbcLa alignment with “missing data” coded for other Orders (MacClade)
Trees
*constructed with Parsimony (PAUP) and ML (Garli: GTR+I+Γ)
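The supermatrix step above, in which a block aligned only within one order is padded with missing data for all other taxa, can be sketched in a few lines. This is a minimal illustration with hypothetical names and toy sequences; it does not reproduce the Muscle or MacClade workflow named on the slide.

```python
def build_supermatrix(blocks: list, taxa: list, missing: str = "?") -> dict:
    """Concatenate aligned blocks; taxa absent from a block get missing-data symbols.

    blocks: list of dicts mapping taxon -> aligned sequence (one dict per block).
    """
    rows = {taxon: [] for taxon in taxa}
    for block in blocks:
        lengths = {len(seq) for seq in block.values()}
        if len(lengths) != 1:
            raise ValueError("every sequence in a block must have the same length")
        width = lengths.pop()
        for taxon in taxa:
            rows[taxon].append(block.get(taxon, missing * width))
    return {taxon: "".join(parts) for taxon, parts in rows.items()}


# Invented toy data: rbcLa covers all taxa; each spacer block covers one order only.
taxa = ["A1", "A2", "B1"]
rbcla = {"A1": "ACGT", "A2": "ACGA", "B1": "TCGA"}
spacer_order_a = {"A1": "GG-TT", "A2": "GGATT"}
spacer_order_b = {"B1": "CCA"}

matrix = build_supermatrix([rbcla, spacer_order_a, spacer_order_b], taxa)
for taxon in taxa:
    print(taxon, matrix[taxon])
```

Every row ends up the same length, so the result can be written out in whatever alignment format a tree-building program expects.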
50-ha Forest
Dynamics Plot
on BCI, Panama
(281 species):
Community
Phylogeny
using a
Supermatrix
Approach with
rbcL/trnH-psbA/matK
A Comparison of Ordinal and Family
Relationships on BCI
Asterids
50-ha Forest
Dynamics Plot
on BCI, Panama
(281 species):
Community Phylogeny
of 23 Orders
using a
Supermatrix
Approach with
rbcL/trnH-psbA/matK
Barcodes
vs. Phylomatic
50-ha Forest
Dynamics Plot
on BCI, Panama
(281 species):
Community Phylogeny
of 23 Orders
using a
Supermatrix
Approach with
rbcL/trnH-psbA/matK
Overall
Rubiaceae
Tree:
< 50% resolution
vs
>97% resolution
Barcodes
vs. Phylomatic
50-ha Forest
Dynamics Plot
on BCI, Panama
(281 species):
Community
Phylogeny
using a
Supermatrix
Approach with
rbcL/trnH-psbA/matK
Net Relatedness Index (NRI)
Phylomatic Phylogeny:
Phylogenetically clustered = High Plateau, Low Plateau and Young Habitats
Phylogenetically Overdispersed = Swamp and Slope Habitats
Phylogenetically Random = Stream and Mixed Habitats
Barcode Phylogeny:
Phylogenetically clustered = Low Plateau and Slope Habitats
Phylogenetically Overdispersed = High Plateau, Mixed and Young Habitats
Phylogenetically Random = Stream and Swamp Habitats
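The Net Relatedness Index behind these habitat labels compares the mean pairwise phylogenetic distance (MPD) among co-occurring species with the MPD expected from random draws out of the species pool: NRI = -(MPD_obs - mean(MPD_null)) / sd(MPD_null). Positive values indicate clustering and negative values indicate overdispersion. The sketch below uses that standard definition with a simple random-draw null model; the null model choice, function names, and toy distances are assumptions and may differ from the analysis behind the slide.

```python
import itertools
import random
import statistics


def mean_pairwise_distance(species, dist):
    """Mean phylogenetic distance over all pairs of species in a community."""
    pairs = list(itertools.combinations(species, 2))
    return sum(dist[frozenset(pair)] for pair in pairs) / len(pairs)


def net_relatedness_index(community, pool, dist, n_null=999, seed=0):
    """NRI = -(observed MPD - mean of null MPDs) / sd of null MPDs."""
    rng = random.Random(seed)
    observed = mean_pairwise_distance(community, dist)
    null = [
        mean_pairwise_distance(rng.sample(pool, len(community)), dist)
        for _ in range(n_null)
    ]
    return -(observed - statistics.mean(null)) / statistics.stdev(null)


# Invented toy pool: two clades, short distances within a clade, long between clades.
clade = {"a1": "A", "a2": "A", "a3": "A", "b1": "B", "b2": "B", "b3": "B"}
pool = list(clade)
dist = {
    frozenset((x, y)): (2.0 if clade[x] == clade[y] else 10.0)
    for x, y in itertools.combinations(pool, 2)
}

clustered = net_relatedness_index(["a1", "a2", "a3"], pool, dist)
overdispersed = net_relatedness_index(["a1", "b1"], pool, dist)
print(f"clustered community NRI = {clustered:.2f}")
print(f"overdispersed community NRI = {overdispersed:.2f}")
```

Because the index depends on the distances taken from the tree, a better-resolved tree can change which habitats come out as clustered or overdispersed, as in the Phylomatic versus barcode comparison above.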
Functional Trait Analysis
50-ha Forest
Dynamics Plot
on BCI, Panama
(281 species):
Community
Phylogeny
using a
Supermatrix
Approach with
rbcL/trnH-psbA/matK
Phylogenies and
Community Ecology
Community
Assembly,
Productivity,
Stability,
Functional Trait
Evolution
Swenson 2009
Center for Tropical Forest Science
Smithsonian Institution Global Earth
Observatories (SIGEO)
22 Established Sites (Black)
12 Candidate Sites (Blue)
Barcoding Initiated (Red)
Smithsonian Tropical Research Institute
Center for Tropical Forest Science
Smithsonian Institution Global Earth Observatories (SIGEO)
A global program of long-term forest research: monitoring the impact of climate change
Purpose:
*Forest Dynamics
*Climate Change
*Conservation
Expanding the network!
DNA Barcoding in Plants:
Biodiversity Identification and
Discovery
Dave Erickson
Ken Wurdack
Liz Zimmer
Dan Janzen
Lee Weigt
Ling Zhang
Nate Swenson
Andy Jones
Oris Sanjur
Jamie Whitaker
Ida Lopez
Stuart Davies
Joe Wright
Biff Bermingham
Scott Miller
W. John Kress
Department of Botany
National Museum of Natural History
Smithsonian Institution
University of Sao Paulo
December 2009