Genetic identification of bacteria 1
Genetic identification of eleven aquatic bacteria using the 16S rDNA gene
M. Gabriela Block (a) and Anthony Ouellette (b)*
(a) Department of Chemistry, (b) Department of Biology and Marine Science, Jacksonville
University, Jacksonville, Florida
04 October 2010
Submitted for publication: 06 March 2012
*Corresponding author. Department of Biology and Marine science, Jacksonville University, Jacksonville.
Phone: (904) 256-7330. E-mail: [email protected]
ABSTRACT
Bacteria help maintain ecological balance by participating in the carbon, oxygen, and/or
nitrogen cycles. Because these cycles are essential for the survival of organisms, identifying
the bacteria involved is fundamental to determining how they function and interact in an
ecosystem. In this study, eleven bacteria previously isolated from upstate New York
waterbodies (the Oswego River, Lake Ontario, and Lake Neatahwanta) were identified by
sequencing a fragment of the 16S rDNA gene from each bacterium. Identification was supported
by metabolic tests. The sequences were compared against the nucleotide database in GenBank
(GenBank, 2009) and then aligned to construct a phylogenetic tree. The eleven bacteria were
grouped into eight genera: Acinetobacter, Aeromonas, Planococcus, Enterobacter,
Exiguobacterium, Pseudomonas, Plesiomonas, and Staphylococcus. The metabolic tests supported
the identification better for samples with longer sequences than for samples with shorter
sequences, which shows the importance of obtaining longer sequences to identify bacteria.
The identified genera are known to occur in aquatic environments, and
Plesiomonas shigelloides, one of the samples, is usually found in fish and other aquatic animals.
INTRODUCTION
The identification of bacteria is fundamental to understanding the biodiversity of an
ecosystem as well as its ecological processes. Bacteria are very important to the environment
because their metabolic activities affect other life on Earth. Nitrifying bacteria, for
instance, make nitrogen available to plants and animals; sulfate-reducing bacteria take part in
the cycling of sulfur compounds, which affects soil fertility; and bacteria also contribute to
the decomposition of organic matter, an important process in the carbon cycle (Atlas & Robert, 1995).
Workers such as R. H. MacArthur and G. E. Hutchinson have stressed the importance of biodiversity
as a measure of ecological processes like resource partitioning, competition, succession, and
community productivity, and as an indicator of community stability (Morris et al., 2002).
Bacteria can be identified by different methods, such as phenotypic, biochemical, or
nucleic acid and genetic assays. Phenotypic identification of microorganisms is a traditional
technique based on the size, motility, and morphology of the bacteria (Janda & Abbott, 2002).
Biochemical identification uses chemical and biochemical markers to differentiate the products
of bacterial metabolism. “Metabolic fingerprint” is the name of the technique
that identifies bacteria from the pattern of results across a panel of tests
(Sutton, n.d.). Both types of methods can allow identification of bacteria to the genus level, or
minimize the probability that they belong to other groups. With biochemical identification,
bacteria can be classified to species level, but the level of classification can vary between groups
(Baron, 1996). Nucleic acid assays include methods that determine the relative proportion of
guanine and cytosine; however, this approach does not depend on the linear arrangement of the
nucleotides, and therefore its accuracy is low. DNA and
RNA homology experiments identify bacteria by hybridization of DNA or RNA molecules
between species, but the procedure is expensive and time-consuming (Janda & Abbott, 2007).
The genetic methods include the molecular analysis of DNA, in which DNA sections such as genes,
or the entire genome, are used to identify bacteria. Genetic identification using the entire
genome is more accurate than using just one gene, such as the 16S rDNA; however, it is more
time-consuming and expensive.
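The limitation of base-composition assays can be illustrated with a minimal sketch. The two short sequences below are hypothetical, chosen only for illustration: they have the same G+C proportion although their nucleotide order differs, so a G+C measurement alone cannot tell them apart.

```python
def gc_fraction(seq):
    """Return the proportion of G and C bases in a DNA sequence."""
    seq = seq.upper()
    return (seq.count("G") + seq.count("C")) / len(seq)

# Hypothetical sequences (not from the study): same composition, different order.
seq_a = "GGCCAATT"
seq_b = "ATGCATGC"

print(gc_fraction(seq_a), gc_fraction(seq_b), seq_a == seq_b)
```

A sequence-based comparison, in contrast, uses the order of the bases, which is why the two sequences above are distinguishable by sequencing but not by composition.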
The 16S rDNA gene sequencing technique has several advantages over phenotypic and
biochemical identification, although none of these approaches is 100% accurate for bacterial
identification. One advantage of 16S rDNA gene analysis is that this gene is
present in all bacteria. The lack of extensive mutations in this gene is another advantage
(Clarridge, 2004). In addition, the 1550 bp gene is easy to work with
because sequencing smaller genes is cheaper and faster than sequencing larger ones (SSU & LSU, n.d.).
Comparing 16S rDNA gene sequencing with biochemical and phenotypic identification
techniques shows further advantages. Phenotypic characteristics, for instance, are not as
reliable as genotypic methods like 16S rDNA sequencing, because in bacteria these characteristics
can change with environmental conditions, growth substrate, temperature, and pH (Janda &
Abbott, 2002). The disadvantages of 16S rDNA gene sequencing analysis include cost
and technical considerations. However, its accuracy and the other advantages discussed above are
why some laboratories prefer to work with 16S rDNA gene sequence analysis.
To identify microorganisms by the 16S rDNA gene, PCR (polymerase chain reaction) is
used to amplify the gene. Many database libraries, such as Entrez Gene from RefSeq
(Reference Sequence), are used to compare gene sequences for identification and to study
phylogeny and taxonomy. For example, the Entrez nucleotide library, which works with BLAST
(Basic Local Alignment Search Tool), has 1,965,631 partial 16S rRNA genes available
for genetic identification. When a sequence from these databases and a sample to be identified
show >97% identity, they can be considered the same species (Lozupone & Knight, 2009).
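As an illustration of the identity criterion, the sketch below computes percent identity between two already-aligned sequences and applies the >97% cutoff. The function names, the example sequences, and the choice to skip gapped columns are assumptions made for this example; BLAST reports its own identity figure over its own local alignment, so its values may differ.

```python
def percent_identity(a, b):
    """Percent of matching positions between two aligned, equal-length sequences.

    Columns where either sequence has a gap ('-') are skipped (an assumption).
    """
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    columns = [(x, y) for x, y in zip(a.upper(), b.upper()) if x != "-" and y != "-"]
    if not columns:
        raise ValueError("no comparable positions")
    matches = sum(1 for x, y in columns if x == y)
    return 100.0 * matches / len(columns)

def same_species_by_threshold(a, b, threshold=97.0):
    """Apply the >97% identity rule of thumb cited in the text."""
    return percent_identity(a, b) > threshold

# Hypothetical 100-nt sequences differing at two positions.
reference = "ACGT" * 25
sample = "CA" + reference[2:]
print(percent_identity(reference, sample), same_species_by_threshold(reference, sample))
```

The comparison is strict (greater than, not greater than or equal to), matching the ">97%" wording.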
Because of the advantages of using the 16S rDNA gene for species identification, and for
analyzing uncultured bacteria to determine the bacterial diversity of a community, many projects
have included this technique in their investigations. In one study, PCR was
used to detect 16S rDNA and toxin gene fragments of cyanobacterial communities in Lake Erie,
and the authors found that toxigenic Microcystis was present in different sections of the lake
from 1999 to 2002
(Ouellette, Handy, & Wilhelm, 2005). Another investigation found that identification of
bacteria isolated in clinical laboratories is more accurate by 16S rDNA gene sequences than
by phenotypic testing (Clarridge, 2004). A phylogenetic analysis of 16S rDNA amplicons
from chloroethene-contaminated sites throughout North America and Europe demonstrated that
members of the Dehalococcoides group are widely distributed in nature and can be found in a
variety of geological formations and climatic zones (Hendrickson et al., 2002). Gram-positive
bacteria cultured from marine sediments were phylogenetically distinguished based on
16S rDNA gene sequences, with 65.6% of the bacteria belonging to the class Actinobacteria and
the other 34.4% to the class Bacilli (Gontang, Fenical, & Jensen, 2007).
As these studies show, 16S rDNA gene sequencing is widely used for bacterial
identification, whether for ecological or medical purposes, and the results are accurate.
However, sequence databases are not always reliable, because deposited sequences can come
from misidentified strains; hence a verification test is necessary to support
the genetic characterization of species. Metabolic tests are still in use and are among the
tests recommended for identifying bacteria, together with phenotypic tests, cellular fatty acid
analysis, and genotypic testing methods (Janda & Abbott, 2002).
The objective of this project was to genetically identify bacteria isolated from three
different ecosystems: the Oswego River, Lake Ontario, and Lake Neatahwanta in New York
State. The bacterial samples were taken during blooms on August 9, 2005 by Dr. Ouellette. The
16S rDNA gene was used to identify eleven bacteria, and the sequences obtained during
the project were used to construct a phylogenetic tree showing the similarity between them. To
support the results of the identification, metabolic tests such as acid and gas production
from D-glucose and D-mannitol were used, together with catalase, oxidase, and indole tests, H2S
production, motility, Gram staining, and bacterial morphology and size.
MATERIALS AND METHODS
The bacteria had already been isolated from the upstate New York waterbodies and stored in
glycerol at -70°C. A5B, A5F, and A5C were isolated from Lake Ontario (Oswego Harbor), C5B
was isolated from Lake Neatahwanta, and B5I, B5D, B5H, B5E, B5G, B5J, B7A, B7I, B7K,
B7G, B7C, and B7D were isolated from the Oswego River in Minetto by John F. Heagerty.
PCR amplification
The amplification was performed with whole cell samples (DNA was not isolated from
bacteria).
Bacterial samples A5B, A5F, A5C, B5I, B5D, B5H, B5E, B5G, B5J, B7A, B7I, B7K,
B7G, B7C, B7D, and C5B were amplified with a master mix containing nuclease-free water
(Promega), primers 27F (sequence 5’ AGAGTTTGATCMTGGCTCAG 3’) and 1518R
(sequence 5’ AAGGAAGGTGATCCANCCRCA 3’) (Urbach, Vergin, Young, & Morse, 2001)
at a final concentration of 400 nM each per reaction, 300 ng/µl of bovine serum albumin used
as a stabilizer (BSA, Sigma cat# A-7030), and 0.04 U/µl of bacterial Taq DNA polymerase stored
in Buffer B (Promega). For the amplification, the samples were diluted 1:100 in
TE (Tris-EDTA buffer). 40 µl of the master mix and 10 µl of each sample were added to each PCR
EasyStart tube (Molecular Bioproducts). The positive control reaction contained 0.5 pg/µl of
DNA. The negative control contained 10 µl of nuclease-free water in the 100 µl reaction. The
amplification was also performed on samples C5B and B7G (C5Bg and B7Gg) using
GoTaq Green Master Mix (Promega catalog# M7122) because some of the reactions did not
amplify with the previous master mix. The amplification of the DNA was performed in a
Peltier Thermal Cycler, model PTC-200 (MJ Research). An initial denaturation at 94°C for
3 minutes was followed by 30 cycles of denaturation at
94°C for 1 minute, annealing at 55°C for 1.5 minutes, and extension at 72°C for 3 minutes. The
final extension was performed at 72°C for 7 minutes.
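The two primers contain degenerate IUPAC bases (M in 27F; N and R in 1518R), so each primer is in practice a mixture of oligonucleotides. A minimal sketch, using the standard IUPAC nucleotide codes, lists the concrete sequences each primer represents. The function name `expand_primer` is hypothetical and is not part of the published protocol.

```python
from itertools import product

# Standard IUPAC nucleotide ambiguity codes.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT", "K": "GT", "M": "AC",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG", "N": "ACGT",
}

def expand_primer(primer):
    """Return every unambiguous sequence encoded by a degenerate primer."""
    choices = [IUPAC[base] for base in primer.upper()]
    return ["".join(combo) for combo in product(*choices)]

PRIMER_27F = "AGAGTTTGATCMTGGCTCAG"
PRIMER_1518R = "AAGGAAGGTGATCCANCCRCA"

for name, primer in (("27F", PRIMER_27F), ("1518R", PRIMER_1518R)):
    variants = expand_primer(primer)
    print(name, len(variants), "variants")
```

Under these codes, 27F expands to two variants (one two-fold position) and 1518R to eight (one four-fold and one two-fold position).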
Gel electrophoresis
The samples were loaded with blue/orange 6x dye and separated on a 1% agarose gel
in 1x TAE with ethidium bromide at a final concentration of 0.5 µg/ml (Sambrook & Russell, 2001). A
100bp DNA ladder (Promega) was used to estimate the length of the amplicons.
Electrophoresis was run at 120 volts for 40 minutes, and the gel was destained in 1x
TAE for 30 minutes on a shaker. The gel was analyzed with a BIO-RAD transilluminator
system. The contrast of the original gel figures was altered in order to visualize the DNA ladders
(Figure 1a, t; Figure 2a; Figure 3a, q; Figure 4a).
Purification of the PCR products
The DNA bands were cut from the gel with disposable blades, to avoid contamination, under a
Foto/UV 15 chamber (Fotodyne), and the gel samples were stored at -20°C until purification.
The PCR products were purified with the Wizard SV Gel and PCR Clean-Up
System Kit (Promega). The purified PCR products A5B, B5H, B5D, B7G, B7K, B7I, and B5E
were collected in 15 µl of nuclease-free water, while the PCR products B5G, B5J, B7C, B7D,
and C5B were collected in 20 µl. The purified DNA was stored at -20°C.
DNA quantification
The purified samples were quantified with the PicoGreen dsDNA Quantitation Reagent and
Kit (Molecular Probes). The DNA concentration was determined with a Fluorescence
Spectrometer Model LS 55 (Perkin Elmer). To obtain the DNA concentration of the
samples, a standard curve with 6 different concentrations (from 0.025 ng/mL to 100 ng/mL) of
dsDNA was used (Figure 2). The samples were prepared with 1 µl of the purified PCR product,
999 µl of TE, and 1000 µl of diluted PicoGreen Reagent. The standard curve and the samples
were analyzed using 480 nm excitation and 520 nm emission.
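The conversion from a fluorescence reading to a total DNA amount can be sketched as follows. The slope and intercept are the linear fit printed on the standard curve (y = 0.0023x + 0.4215, with x in pg/ml); everything else is an assumption for illustration: that the standard concentrations refer to the final 2000 µl assay mixture, that the dilution is therefore 1 µl in 2000 µl, and that the fluorescence reading used in the example is hypothetical.

```python
SLOPE = 0.0023      # rfu per (pg/ml), from the reported standard-curve fit
INTERCEPT = 0.4215  # rfu, from the reported standard-curve fit

def concentration_pg_per_ml(fluorescence_rfu):
    """Invert the linear standard curve: rfu -> pg/ml in the assay mixture."""
    return (fluorescence_rfu - INTERCEPT) / SLOPE

def total_dna_ng(fluorescence_rfu, sample_ul=1.0, assay_ul=2000.0, elution_ul=15.0):
    """Estimate the total DNA (ng) in the eluted PCR product.

    Assumes the sample was diluted sample_ul -> assay_ul before reading.
    """
    dilution = assay_ul / sample_ul
    stock_pg_per_ml = concentration_pg_per_ml(fluorescence_rfu) * dilution
    stock_ng_per_ul = stock_pg_per_ml / 1e6  # pg -> ng (1e3) and ml -> µl (1e3)
    return stock_ng_per_ul * elution_ul

# Hypothetical reading of 25 rfu for a product eluted in 15 µl.
print(round(total_dna_ng(25.0, elution_ul=15.0), 1), "ng")
```

For the samples eluted in 20 µl, `elution_ul=20.0` would be passed instead.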
DNA sequencing and phylogenetic analysis
The samples were sent to the University of Tennessee-Knoxville (UTK) Molecular
Biology Resource Facility (MBRF) in Knoxville, Tennessee for sequencing. The sample
sequences were analyzed using BLAST and the Molecular Evolutionary Genetics Analysis
(MEGA) program. The sequences were aligned, and a phylogenetic tree was built by
Neighbor-Joining analysis with complete deletion of gaps using the MEGA4 software package.
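Complete deletion of gaps means that any alignment column containing a gap in at least one sequence is discarded before distances are computed. The sketch below shows that preprocessing step and a simple proportion-of-differences (p) distance. It does not implement Neighbor-Joining itself, and the distance measure is an assumption: the text does not state which substitution model was used in MEGA4. The aligned sequences are hypothetical.

```python
def complete_deletion(aligned):
    """Remove every alignment column in which any sequence has a gap ('-')."""
    length = len(aligned[0])
    if any(len(seq) != length for seq in aligned):
        raise ValueError("all sequences must have the same aligned length")
    keep = [i for i in range(length) if all(seq[i] != "-" for seq in aligned)]
    return ["".join(seq[i] for i in keep) for seq in aligned]

def p_distance(a, b):
    """Proportion of positions at which two gap-free sequences differ."""
    differences = sum(1 for x, y in zip(a, b) if x != y)
    return differences / len(a)

# Hypothetical alignment of three short sequences.
alignment = ["ACGTA-", "ACGAAT", "A-GTAT"]
cleaned = complete_deletion(alignment)
matrix = [[p_distance(a, b) for b in cleaned] for a in cleaned]
print(cleaned)
print(matrix)
```

A pairwise distance matrix of this kind is the input that a Neighbor-Joining procedure works from. Because complete deletion drops a column for all sequences at once, one short or gappy sequence shortens the comparison for every sample.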
Support information
Metabolic tests were used to verify the data obtained from the genetic identification.
The metabolic tests were D-Glucose, D-Mannitol, H2S production, catalase, oxidase (not performed
on B5J), and production of indole. Motility and Gram staining were also
performed as morphological support information. Cell size was determined using a Swiftcam
Imaging II. All these tests were performed on samples A5B, B7I, B7K, B5D, B5H, B7G, C5B,
B5J, B7C, B5G, B5J. Additionally, the metabolic tests for samples B7K, B7G, and B5J were
performed using BD BBL™ Enterotubes™ II (see the appendixes for the results of the metabolic
tests and Gram staining). The data obtained from the tests were analyzed using
Bergey’s manual (Holt, Krieg, Sneath, Staley, & Williams, 1994; Krieg & Holt, 1984).
RESULTS
PCR
The amplification was performed with whole cell samples (DNA was not isolated from
bacteria) that were frozen after being isolated from Lake Ontario (Oswego Harbor), Lake
Neatahwanta, and Oswego River in Minetto.
Samples A5B, B5D, B5H, B5E, B5G, B5J, B7I, B7K, B7G, B7C, B7D, and C5B were
amplified (Figure 1: Gel Ac, e-p; Gel Bc, d, g-p; Gel Cc-l; Gel Dc-i). Samples B7A and B7C
(pure) were not amplified, as shown by the absence of bands (Figure 1Be, f, j).
Samples B7Gg and C5Bg carry the suffix “g” because they were amplified using the
GoTaq Green Master Mix; their bands are bright (B7Gg: Figure 1Ck, Cl, Df;
C5Bg: Figure 1Dg-i). The presence of a band for the positive control (+) confirms the
effectiveness of the PCR (Figure 1 Aq, Br, Cn, Dj); furthermore, the absence of a band for the
negative control (-) demonstrates the lack of contamination in the PCR amplification and gel
electrophoresis (Figure 1 Ar, Bs, Co, and Dk).
DNA quantification
DNA quantification was performed to determine the amount of DNA in each sample,
which is required for DNA sequencing, the next step in identifying the bacteria.
DNA was quantified three times. The first DNA quantification was performed on A5B,
B5H, B5D, B7G, B7K, B7I, and B5E. The second was performed on C5B, B5J, B7C, B5G, and
B7D. The third DNA quantification was performed on B7G, B7D, B5G, B7C, B7Gg, and C5Bg.
All the quantifications were performed using a standard curve (Figure 2). The total DNA amount
for each amplified sample in the DNA quantifications met the requirement for sequencing
(155 ng of DNA, based on 10 ng per 100 bp) except for samples A5B and B5E (Table 1).
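The 155 ng requirement follows from the 10 ng per 100 bp rule applied to a gene of about 1550 bp. A minimal sketch of that arithmetic, checked against the first-quantification amounts listed in Table 1, is given below; the function and variable names are hypothetical.

```python
GENE_LENGTH_BP = 1550  # approximate 16S rDNA length stated in the Introduction
NG_PER_100_BP = 10     # sequencing requirement quoted in the text

def required_ng(length_bp, ng_per_100bp=NG_PER_100_BP):
    """DNA amount (ng) required to sequence a fragment of the given length."""
    return length_bp * ng_per_100bp / 100

# Total DNA (ng) from the first quantification, as listed in Table 1.
first_quantification = {
    "A5B": 95, "B5E": 21, "B7I": 325, "B7K": 328,
    "B5D": 310, "B5H": 296, "B7G": 430,
}

threshold = required_ng(GENE_LENGTH_BP)
below_requirement = sorted(
    name for name, amount in first_quantification.items() if amount < threshold
)
print(threshold, below_requirement)
```

Only A5B and B5E fall under the threshold, in line with the exception noted in the text.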
16S rDNA gene sequences analysis
The 16S rDNA gene was used for bacterial identification. This gene encodes a component
of the small subunit of the ribosome.
All samples except B5E, which had too little DNA (21.37 ng), were sent for
sequencing of the 16S rDNA gene. In the first run, the gene was sequenced from samples A5B, B7I,
B7K, B5D, B5H, and B7G (Figure 3). All of these samples yielded sequences longer than 400 bp
(Table 2). A second set of samples, B5J, B7C, B5G, C5B, and B7D, was also sequenced;
however, none of the sequences had more than 244 bp (Table 2), and the lack of
nucleotides can also be observed in the sequence images (Figure 4). In a
third run, B7C, B5G, B7D, B7G, B7Gg, and C5Bg were sequenced (B7C, B5G, B7D, B7G, and
C5B were amplified with Taq polymerase, while B7Gg and C5Bg were amplified with GoTaq
Green polymerase); nevertheless, the peaks for B7C, B5G, and B7G (Figure 4) were not clear
enough to identify nucleotides in the sequences. B7D and B7Gg had fewer than 200 bp. C5Bg was
the only sample with more than 200 nucleotides (Table 2).
Phylogenetic analysis
Sequences were compared to other sequences from the nucleotide library by
using BLAST. Each sample sequence was shortened by eliminating
undefined nucleotides at its beginning and end, because the peaks in those
sections are not well defined (Table 3). The sequences were then analyzed in two phylogenetic
trees in order to better appreciate the relationship between them and the sequences found with
BLAST. Samples with more than 200 nucleotides were grouped in one tree (Figure 5), while
those with fewer than 200 nucleotides were grouped in a second tree (Figure 6). This analysis
revealed that the samples in both trees were grouped within a genus closely related to their
sequences. Samples B5D and B7C, which are grouped within the Aeromonas group, were
analyzed in a third tree (Figure 7a) to identify which of the samples was more closely related to
the Aeromonas genus. In this phylogenetic tree B5D, the sample with more nucleotides, is more
closely related to Aeromonas than B7C, which is even grouped outside the node. In the same way,
samples B5G and B5H were also analyzed in a tree (Figure 7b) to differentiate the two samples.
As in the B5D-B7C analysis, B5G, the sample with fewer nucleotides,
was grouped further from the Acinetobacter genus than B5H.
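The trimming of undefined nucleotides at the ends of each read can be sketched as below. This is only an approximation of the procedure described above: it treats any base call other than A, C, G, or T as undefined and trims each end until a run of `min_run` consecutive defined bases is reached. The `min_run` parameter and the example read are hypothetical; the text describes the trimming in terms of peak quality, not a fixed rule.

```python
def trim_undefined_ends(seq, min_run=5):
    """Trim both ends of a read up to the first run of min_run defined bases."""
    seq = seq.upper()

    def first_clean_index(s):
        # Index where the first run of min_run consecutive A/C/G/T bases starts.
        run = 0
        for i, base in enumerate(s):
            run = run + 1 if base in "ACGT" else 0
            if run == min_run:
                return i - min_run + 1
        return len(s)  # no clean run found

    start = first_clean_index(seq)
    end = len(seq) - first_clean_index(seq[::-1])
    return seq[start:end] if start < end else ""

# Hypothetical read with undefined calls (N) at both ends and in the middle.
read = "NNANNACGTACGTNNACGTANNTN"
trimmed = trim_undefined_ends(read)
print(trimmed, len(trimmed))
```

Undefined calls inside the retained region are kept; only the poorly resolved ends are removed, which is why the number of nucleotides used for BLAST is smaller than the raw read length.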
Metabolic tests as support information
Metabolic tests were used to verify the data obtained from the genetic identification.
D-Glucose, D-Mannitol, H2S production, catalase, oxidase, indole, motility, Gram
staining, and cell size were tested as morphologic and metabolic support
information. Enterotubes were also used, but not for all samples (see the appendixes for the
results of the metabolic tests and Gram staining).
Enterobacter sp. (B7K) had metabolic test results similar to those of the genera Enterobacter,
Pantoea, and Klebsiella. B7K was tested using Enterotubes. Of twenty-two tests, five were
inconsistent with Enterobacter cloacae, eight with Pantoea sp., two with Klebsiella sp., and two
with Pantoea agglomerans. However, the percentage of certainty for eight tests for the last
species was not high.
Enterobacter sp. (B7G) was also tested with Enterotubes because of its similarity to
Enterobacter. Seventeen tests support a close relationship of B7G to Enterobacter sp., while for
Escherichia coli and Escherichia hermannii there were four and five inconsistencies in the
metabolic tests, respectively.
The support information for Plesiomonas shigelloides (A5B) was consistent for nine tests
out of ten. Colony pigmentation was one of the inconsistencies: A5B showed colony
pigmentation after growing on TSA (Tryptic Soy Agar) and also formed acid from the
fermentation of D-Mannitol. Plesiomonas shigelloides usually shows no pigmentation
(growth medium unknown) and is usually negative for D-Mannitol (0-10% are positive).
For Aeromonas hydrophila, A. sobria, and A. veronii (B5D and B7C), the metabolic analysis was
consistent with the Aeromonas genus for all three species. A. hydrophila was not consistent for
the H2S test, while A. sobria and A. veronii were negative for H2S production, like B5D. However,
A. sobria was negative for D-Glucose gas production, unlike A. veronii. B7C was one of the
samples with fewer than 200 nucleotides, and in the BLAST comparison it was
grouped within the Aeromonas genus, like sample B5D. Nevertheless, B7C had four
inconsistencies out of eleven tests. B7C was negative for D-Glucose and D-Mannitol acid
production, gas production, and the indole test, which does not match any of the Aeromonas
species (only A. sobria, for D-Glucose gas production).
Acinetobacter sp. (B5H and B5G): both samples were consistent in all tests (Janda & Abbott,
2007) for the Acinetobacter genus. Four tests could not be confirmed because the literature
reviewed lacked the information.
Exiguobacterium sp. (B7I) was consistent for six tests out of seven. The oxidase test was
not consistent; it was negative. Information was lacking for four tests (acid from
D-Glucose and D-Mannitol, H2S, and indole), so they could not be used as support
information.
Planococcus sp. (C5B): two of eleven tests were not consistent. One is the
shape of the bacterium: C5B is a rod, while in the literature Planococcus has a
coccus shape. The second inconsistency was the lack of motility in C5B, unlike
Planococcus (pictures of the tests are in the appendixes).
Staphylococcus sp. (B5J) was consistent for all tests (Holt, Krieg, Sneath, Staley, &
Williams, 1994). B5J was tested using Enterotubes; however, there was not enough metabolic
information about the Staphylococcus genus to compare all the tests included in the Enterotubes.
The colony pigmentation, D-Mannitol, and lactose tests were marked as “d”, which means 26-75% of
bacteria are positive for these tests. For the urea test, the result for S. cohnii was
inconsistent (negative); however, the test is positive for S. cohnii subspecies urealyticus.
Pseudomonas sp. (B7D) was consistent for all the tests found for the Pseudomonas genus (four).
DISCUSSION
Amplification was attempted on sixteen samples, but only twelve amplified
successfully (A5B, B5D, B5H, B5E, B5G, B5J, B7I, B7K, B7G, B7C, B7D, and C5Bg) (Figure
1A, B, C, D). Samples A5F, A5C, B7A (Figure 1Be, f), and B5I did not amplify, possibly
because DNA was absent or at low concentration. A low DNA concentration can result from
DNA degradation; alternatively, the cells may not have lysed on freezing, or inhibitors may
have interfered with the binding of the primers to the 16S rDNA gene.
Primers can also cause PCR failure: if the primer sequence is not sufficiently similar
to the sample sequence, the primers will not anneal, and the 16S rDNA will not amplify
(Wand et al., 2009). Inhibitors might also explain the lack of amplification for one of the
B5J samples (Figure 1Bj), since the same sample amplified after being diluted 1:100 (Figure
1Bi).
The DNA amount obtained from B5D, B5H, B5G, B5J, B7I, B7K, B7G, B7C, B7D, and
C5Bg samples was over 155 ng (Table 1), the minimum amount required to sequence
the 16S rDNA gene. DNA from samples B5E and A5B was low (21.37 ng and 94.51 ng, respectively).
A low DNA concentration could result in a short sequence, which is consistent
with the short sequence obtained from A5B, 557 nucleotides (Table 2). Sample B5E was not
sequenced because its DNA amount was too low (determined by fluorescence). The low
DNA concentration of these samples could be due to human error when excising the DNA
fragment from the gel for purification.
The first set of DNA sequences is from samples A5B, B5D, B5H, B7I, B7K, and B7G
(Figure 1A). The second set is from samples B5G, B5J, B7C, B7D, and
C5B. The amount of DNA was over 155 ng for all of these samples; however, the sequence images
showed few or no nucleotides for most of the samples (Figure
7, Table 2). A third sequencing run was performed to test the viability of the Taq polymerase
used for the amplification. Samples B7G, B7C, B5G, and B7D were amplified using the same Taq
polymerase as in the first and second sets, whereas two samples, C5Bg and B7Gg, were
amplified using GoTaq polymerase. Samples B7C, B7D, and B7Gg had shorter sequences
than the samples sequenced in the first set and than C5Bg
(Table 2). The difference in sequence size could be due to the Taq polymerase. Both
polymerases had already expired (Taq polymerase in 2008 and GoTaq polymerase in 2009).
Another reason could be a long loss of power, which occurred in January 2010 and lasted for 8
hours. This blackout exposed the enzymes to temperatures higher than those recommended for
storage (-20°C for Taq polymerase and 4°C for GoTaq polymerase). Thus, one problem
that may have caused the amplification failures is priming at multiple sites (“National Center
for Biotechnology”, 1989), and the long blackout and/or the expired
reagents may have led the Taq polymerase to prime at multiple sites and produce the faulty results.
The sequence sizes differ between the DNA samples; however, all of them were
identified and grouped. The species with the highest percentage of identity to A5B is
Plesiomonas shigelloides, with 100% identity (Table 3). Phenotypic and metabolic tests are
consistent with the identification, except for two tests: colony pigmentation and
D-Mannitol. On TSA (Tryptic Soy Agar), A5B showed a yellowish pigment in the colony,
whereas Plesiomonas shigelloides does not show colony pigmentation (Holt, Krieg, Sneath,
Staley, & Williams, 1994). The difference in pigmentation could be the result of growth on
different media. A5B also produced acid from D-Mannitol, whereas Plesiomonas usually does
not, although 0-10% of them do
(Table 4). The other ten metabolic tests support the identification. Plesiomonas
usually occurs in aquatic animals, which also matches sample A5B,
since it was isolated from an aquatic environment.
Samples B5D and B7C are grouped within the Aeromonas genus. The metabolic tests for
B5D support a closer relationship with A. veronii; however, the phylogenetic tree
shows a closer relationship with A. hydrophila and A. sobria (Figure 7a). The inconsistency with
A. hydrophila is the lack of H2S production in B5D, and the inconsistency with A. sobria is the
production of gas from D-Glucose fermentation. The inconsistencies thus amount to just one test
for each species. A positive or negative reaction does not always hold for 100% of a species;
therefore, B5D could be within
the 0-10% of bacteria that react differently in a specific metabolic reaction (Table 4). B7C has
more metabolic tests that are inconsistent with identification under the Aeromonas genus
(Table 4). The phylogenetic tree also shows that B7C is more distantly related to Aeromonas sp.
than B5D is; B7C is even grouped outside the Aeromonas node (Figure 7a).
B5H and B5G (Acinetobacter sp.) are grouped together, and the metabolic tests are consistent for
both. Nevertheless, a closer analysis of both samples in a phylogenetic tree shows that B5H is
more closely related to Acinetobacter sp. than B5G (Figure 7b). The B5G sequence was short
compared to the B5H sequence, and a lack of nucleotides in a sequence can result in an inaccurate
identification. B5G might still be within the Acinetobacter genus but belong to a
different species from sample B5H.
The inconsistencies in metabolic tests for Planococcus sp. (C5B), Pseudomonas sp.
(B7D), Exiguobacterium sp. (B7I), Enterobacter sp. (B7K and B7G), and Staphylococcus sp.
(B5J) (Table 4) may be due to the inaccuracy of the tests. They could also mean that these
bacteria belong to a species or subspecies whose sequences are not in GenBank or have not been
sequenced at all.
The 16S rDNA gene and metabolic tests helped identify the bacteria isolated
from the upstate New York waterbodies. The phylogenetic tree analysis helped group the bacteria
into orders and classes. The two main classes are Gammaproteobacteria and Bacilli (Figures 5 and 6).
Each bacterium has different functions; for example, Pseudomonas (B7D) participates in nutrient
cycling and biodegradation (Maier & Pepper, 2000). Through their different roles in the
ecosystem, bacteria maintain an equilibrated ecosystem for the good of all organisms that live
in it.
FIGURES AND TABLES
[Figure 1 image: agarose gels A, B, C, and D with lettered lanes; 1500bp ladder band marked]
Figure 1. PCR amplicons. *=1:100, #=1:10; ∞=undiluted. Aa, Ba, Ca, Da. Altered contrast
from original DNA ladder image (b); At, Cq. Altered contrast from original DNA ladder image
(As and Cq). Aq, Br, Cn, Dj. Control (+); Ar, Bs, Co, Dk. Control (-); From gel A: c,d, and e.
B5E#; f, g, and h A5B#; i. B5H* ; j, k. B5D*; l. B7G*; m, n. B7K*; o, p. B7I*; From gel B: c,
and d. C5B*∞; e and f. B7A*∞; g, and h. B5J*∞; i, and j. B7C*∞; k, l, and m. B5G*∞ and mixture
of *∞; n, o, and p. B7D*∞ and mixture of ∞; q. B7D*; From gel C: c, d. B7G*; e, f. B7D*; g, h.
B5G*; i, j. B7C*; k, l. B7Gg*; m. B7C*; From gel D: c. B7C*; d. B5G*; e. B7G*; f. B7Gg*; g,
h, i. C5B*.
[Figure 2 plots: fluorescence (rfu) versus DNA concentration (pg/ml). Panel a: full range, 0 to
100000 pg/ml. Panel b: low range, 0 to 3000 pg/ml. Linear fit shown on both panels:
y = 0.0023x + 0.4215, R² = 0.9999]
Figure 2. Standard curve for DNA. a. Complete standard curve for the first DNA
quantification; b. First three data points of the standard curve.
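As a minimal sketch of how the fitted line in Figure 2 could be used, the snippet below inverts y = 0.0023x + 0.4215 to estimate a DNA concentration from a fluorescence reading. The function name and the example reading are illustrative assumptions and are not part of the original protocol.

```python
# Slope and intercept taken from the linear fit reported in Figure 2.
SLOPE = 0.0023      # rfu per (pg/ml)
INTERCEPT = 0.4215  # rfu


def concentration_from_fluorescence(rfu):
    """Invert y = SLOPE * x + INTERCEPT to estimate DNA concentration (pg/ml)."""
    return (rfu - INTERCEPT) / SLOPE


# Hypothetical fluorescence reading, chosen for illustration only.
example_rfu = 2.7215
print(concentration_from_fluorescence(example_rfu))
```

Readings outside the range covered by the standards (0 to 100000 pg/ml in panel a) would be extrapolations and should be treated with caution.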
FIRST SEQUENCING            SECOND SEQUENCING           THIRD SEQUENCING
Samples   Total DNA         Samples   Total DNA         Samples   Total DNA
          amount (ng)                 amount (ng)                 amount (ng)
A5B       95                B7D       163               B7D       547
B5E       21                C5B       163               C5Bg      952
B7I       325               B5G       169               B5G       555
B7K       328               B7C       164               B7C       485
B5D       310               B5J       167               -         -
B5H       296               -         -                 B7Gg      677
B7G       430               -         -                 B7G       492
Table 1. DNA quantification. Amount of DNA obtained for each sample. All samples have
more than 155 ng except for samples A5B and B5E.
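The statement in the Table 1 caption can be checked with a short sketch. The values below are transcribed from Table 1; the variable names are illustrative assumptions.

```python
# (sequencing run, sample, total DNA in ng), transcribed from Table 1.
dna_ng = [
    ("first", "A5B", 95), ("first", "B5E", 21), ("first", "B7I", 325),
    ("first", "B7K", 328), ("first", "B5D", 310), ("first", "B5H", 296),
    ("first", "B7G", 430),
    ("second", "B7D", 163), ("second", "C5B", 163), ("second", "B5G", 169),
    ("second", "B7C", 164), ("second", "B5J", 167),
    ("third", "B7D", 547), ("third", "C5Bg", 952), ("third", "B5G", 555),
    ("third", "B7C", 485), ("third", "B7Gg", 677), ("third", "B7G", 492),
]

THRESHOLD_NG = 155  # threshold quoted in the Table 1 caption

# Collect the samples whose DNA amount does not exceed the threshold.
below = sorted({sample for _, sample, ng in dna_ng if ng <= THRESHOLD_NG})
print(below)  # ['A5B', 'B5E']
```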
[Figure 3 image panels: A5B, B7I, B7K, B5D, B5H, B7G.]
Figure 3. DNA Sequencing. Sequence images of a section of the 16S rDNA gene
obtained from each sample in the first sequencing.
[Figure 4 image panels: B5J, B7C (a), B7D (a), B7C (b), B7D (b), C5B (a), C5Bg (b), B5G (a), B7G (a), B5G (b), B7Gg (b).]
Figure 4. DNA Sequencing. Sequence images obtained from a section of the 16S rDNA gene
from each sample. (a) Second sequencing; (b) Third sequencing.
Samples
# nucleotides amplified with Taq
polymerase
# nucleotides amplified with Taq
polymerase
# nucleotides amplified with GOTaq
A5B
557
B5H
1306
B7I
437
B7K
441
B5D
1281
B7G
958
B7C
244
B7D
0
C5B
0
B5G
84
B5J
144
-
-
-
-
-
0
102
0
-
70
-
716
0
-
-
Table 2. Number of base pairs sequenced for each sample. Samples were sequenced at different times.
Samples  Number of nucleotides  Identity (b)                  Accession   %ID (d)
         used for BLAST (a)                                   number (c)
B7K      384                    Enterobacter cloacae          AJ001245    99
                                Pantoea sp.                   GQ288411    98
                                Pantoea agglomerans           FJ603031    97
                                Klebsiella oxytoca            GU253335    97
                                Leclercia adecarboxylata      GU265700    97
B7G      826                    Enterobacter sp.              DQ855282    98
                                Escherichia hermannii         AB479110    98
                                Escherichia coli              EU026432    98
                                Escherichia senegalensis      AY217654    98
A5B      382                    Plesiomonas shigelloides      FJ375179    100
B5D      938                    Aeromonas veronii             FJ653620    99
                                Aeromonas hydrophila          AB473014    99
                                Aeromonas sobria              AB473004    99
B7C      197                    Aeromonas hydrophila          GU294303    99
                                Aeromonas veronii             FJ653620    98
                                Aeromonas sobria              AB473004    94
B5H      845                    Acinetobacter sp.             GU451179    100
                                Acinetobacter baylyi          FJ976561    100
                                Acinetobacter soli            FJ976568    100
B5G      68                     Acinetobacter sp.             DQ129723    100
B7I      332                    Exiguobacterium acetylicum    GQ284347    96
                                Exiguobacterium sp.           FM991853    96
                                Exiguobacterium indicum       GQ284484    96
C5B      687                    Planococcaceae bacterium      FM209428    99
                                Bacillus psychrotolerans      GQ342531    99
                                Planococcus sp.               DQ375559    98
                                Crocinobacterium jejui        AM295339    98
                                Bacillus psychrodurans        EF101552    98
                                Paenibacillus sp.             EU621913    98
B5J      65                     Staphylococcus sp.            GU451172    98
                                Staphylococcus saprophyticus  GU201857    98
                                Staphylococcus warneri        GU397393    98
                                Staphylococcus cohnii         GU339231    98
B7D      50                     Pseudomonas sp.               GQ243735    94
Table 3. BLAST results. a. Total number of base pairs after deleting undefined nucleotides at
the beginning and at the end of each sequence. b. Species with similar sequence found in
BLAST. c. Identification code of the species. d. Percent identity compared to the sample
sequences.
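The trimming step described in note a of Table 3 can be sketched as follows. This is a minimal illustration that assumes undefined nucleotides are reported as the character N; the function name and the example read are hypothetical, and the tool actually used for trimming in this study is not stated.

```python
def trim_undefined(sequence, undefined="N"):
    """Remove undefined base calls from both ends of a sequence read.

    Undefined calls inside the read are kept; only the ends are trimmed.
    """
    return sequence.upper().strip(undefined)


# Hypothetical read with undefined calls at both ends.
read = "NNNACGTTGNCAGTNN"
trimmed = trim_undefined(read)
print(trimmed, len(trimmed))
```

The length of the trimmed read corresponds to the "number of nucleotides used for BLAST" column.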
[Figure 5 image: neighbor-joining tree of samples B7K, B7G, A5B, B5D, B5H, B7I, and C5B with their BLAST matches. Clade labels: Class Gammaproteobacteria (Orders Enterobacteriales, Aeromonales, and Pseudomonadales) and Class Bacilli. Scale bar: 0.05.]
Figure 5. Neighbor-joining phylogenetic tree from 16S rDNA genes. From samples: B7K,
B7G, A5B, B5D, B5H, B7I, C5B (samples with more than 200 bp) and sequences obtained from
BLAST.
[Figure 6 image: neighbor-joining tree of samples B5J, B7C, B5G, and B7D with their BLAST matches (Staphylococcus, Aeromonas, Acinetobacter, and Pseudomonas sequences). Clade labels: Class Gammaproteobacteria and Class Bacilli. Scale bar: 0.05.]
Figure 6. Neighbor-joining phylogenetic tree from 16S rDNA genes. From samples: B5J,
B7C, B5G, and B7D (samples with fewer than 200 bp) and sequences obtained from BLAST.
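The split of samples between Figures 5 and 6 follows from the sequence lengths in Table 3. The sketch below reproduces that grouping with the 200 bp cutoff named in the captions; the variable names are illustrative assumptions.

```python
# Number of nucleotides used for BLAST, transcribed from Table 3.
blast_lengths = {
    "B7K": 384, "B7G": 826, "A5B": 382, "B5D": 938, "B7C": 197, "B5H": 845,
    "B5G": 68, "B7I": 332, "C5B": 687, "B5J": 65, "B7D": 50,
}

CUTOFF_BP = 200  # cutoff stated in the captions of Figures 5 and 6

# Samples analyzed in Figure 5 (more than 200 bp).
long_reads = [s for s, n in blast_lengths.items() if n > CUTOFF_BP]
# Samples analyzed in Figure 6 (fewer than 200 bp).
short_reads = [s for s, n in blast_lengths.items() if n < CUTOFF_BP]

print("Figure 5:", long_reads)
print("Figure 6:", short_reads)
```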
[Figure 7 image: panel a, Aeromonas veronii, Aeromonas sobria, Aeromonas hydrophila, B5D, and B7C; panel b, Acinetobacter baylyi, Acinetobacter soli, Acinetobacter sp., B5H, and B5G. Scale bar: 0.001.]
Figure 7. Neighbor-joining phylogenetic tree from 16S rDNA genes. a. Analysis between
samples B5D and B7C. b. Analysis between samples B5H and B5G.
0.6-6.0
0.3-1.0
+
+
+
+
+
[+]
[-]
+
NA
d
NA
+b
[+]
1.0-3.0
0.5-1.0
[+]
+
[-]
+
[-]
+
[-]
+
d
[-]
[+]
[-]
+
+
2.7±0.6
1.3±0.2
+
+
+
+
+
+
+
+
+
+
[+]c
1.2-3.0
0.6-1.0
+
+
+
+
NA
NA
+a
d
d
+
d
+
NA
NA
+
+
2.0-6.0
1.1-1.5
+
+
+
+
[+]
NA
NA
d
+
+
+
NA
NA
-
Staphylococcus saprophyticus
+
+
+
NA
NA
+
d
+
NA
NA
-
+
2.2±0.3
+
+
NA
+
-
Escherichia coli
Enterobacter sp.
B7G
Pantoea agglomerans
Klebsiella sp.
-
+
d
NA
NA
NA
d
NA
NA
NA
NA
NA
NA
d
NA
NA
NA
NA
+
NA
Staphylococcus cohnii
d
+
+
[-]
+
+
+
+
[-]
d
+
+
1.0-3.0
0.5-1.0
+
+
+
+
+
+
+/+
Staphylococcus warneri
1.2-3.0
0.6-1.0
+
+
+
+
B5J
+
2.6±0.6
1.8±0.4
+
+
+
+
+
+
+
+
+
+
+
-
Escherichia hermannii
Gram
Yellow pigment
Rods: length and diameter
(µm)
cocci (size)
Motility
D-Glucose
D-Glucose gas
D-Mannitol
H2S production
Indole
Catalase
Oxidase
Lysine
Ornithine
Adonitol
Lactose
Arabinose
Sorbitol
VP
Dulcitol
Phenylalanine
Urea
Citrate
Pantoea sp.
B7K
Enterobacter cloacae
+
d
0.5-1.5
NA
NA
NA
d
NA
NA
NA
NA
NA
NA
d
NA
NA
NA
NA
+
NA
+
d/NA
NA
NA
d
NA
NA
NA
NA
NA
NA
-e
NA
NA
NA
NA
-f
NA
motility
D-Glucose
(acid
production)
D-Glucose
(gas
production)
D-Mannitol
(acid
production)
H2S
production
-
+
+
-
-
-
-
+
2.4±1.0
+
1.8±0.4
1.5-2.5
+
3.8±0.9
+
-
2.1±0.3
±
1.5-5.0
+
3.6±1.0
+
3.7±0.8
1.4±0.4
0.8±0.3
0.9-1.6
1.4±0.3
-
1.2±0.3
0.5-1.0
1.5±0.5
1.0±0.3
-
-
-
-
1-1.2
-
-
-
-
-
-
-
-
+
±
+
+
+
±
-
-
+
±
+
-
-
NA
-
-
+
NA
-
-
NA
-
-
-
-
-
-
-
+
indole
-
-
catalase
+
+
+
Plesiomonas shigelloides
-
A5B
-
Exiguobacterium sp.
-
B7I
Aeromonas veronii
-
Aeromonas sobria
B7C
B5D
Pseudomonas sp.
B7D
Planococcus sp.
C5B
Acinetobacter sp.
-
Aeromonas hydrophila
Gram
Yellow
pigment
Rods: length
and diameter
(µm)
Cocci
diameter (µm)
B5G
B5H
+
+
-
-
1.0-3.0
+
2.4±0.4
+
1.4-3.2
+
2.8±1.0
3
0.3-1.0
2.1±0.8
1.1-1.2
1.2±0.4
0.8-1
-
-
-
-
-
-
-
+
+
+
+
+
+
+
-
+
+
+
+
+
+
+
+
-
+
-
+
-
NA
-
-
NA
+
-
+
d
+
+
NA
+
-
+
NA
-
+
+
-
-
-
NA
-
-
-
+
NA
+
-
+
+
+
-
NA
+
+
+
+
NA
+
+
+
+
+
+
+
+
+
oxidase
±
±
+
+
+
+
+
+
+
Table 4. -, 0-10% positive; +, 90-100% positive; [-], 11-25% positive; [+], 76-89% positive; d, 26-75% positive; a, E. agglomerans is negative; b, K. pneumoniae
subsp. rhinoscleromatis and some ozaenae are negative; c, E. agglomerans; e, positive in S. cohnii subsp. urealyticus; f, positive in S. cohnii subsp. urealyticus.
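The symbol key of Table 4 can be expressed as a small lookup. This is a sketch under the assumption that percentages are whole numbers, as the ranges in the legend suggest; fractional values that fall between two ranges are assigned to the higher range here, which is a choice made for illustration. The function name is hypothetical.

```python
def frequency_symbol(percent_positive):
    """Map the percentage of positive strains to the symbol used in Table 4."""
    if not 0 <= percent_positive <= 100:
        raise ValueError("percentage must be between 0 and 100")
    if percent_positive <= 10:
        return "-"      # 0-10% positive
    if percent_positive <= 25:
        return "[-]"    # 11-25% positive
    if percent_positive <= 75:
        return "d"      # 26-75% positive
    if percent_positive <= 89:
        return "[+]"    # 76-89% positive
    return "+"          # 90-100% positive


for value in (5, 20, 50, 80, 95):
    print(value, frequency_symbol(value))
```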
ACKNOWLEDGMENTS
I thank Dr. Lucinda Sonnenberg for her valuable advice, guidance, support, and help
during this project. I also thank Ayesha Patel for her advice, friendship and help.
Samples were sequenced at the UTK Molecular Biology Resource Facility (MBRF) in
Knoxville.
This study was supported by the Department of Biology and Marine Science and the Department
of Chemistry.
REFERENCES
Atlas R, Robert J, editor. 1995. Principles of Microbiology, Mosby. Lousville: Callanan. 1st
ed. 23- 24pp.
Baron Samuel. Medical microbiology. [Internet] Galveston (TX): Phenotypic Characteristics
Useful in Classification and Identification. C 1996 [cited 2010 March]. Available from:
http://www.ncbi.nlm.nih.gov/bookshelf/br.fcgi?book=mmed&part=A347
Clarridge J. 2004. Impact of 16S rRNA Gene Sequence Analysis for Identification of
Bacteria on Clinical Microbiology and Infectious Diseases. Clinical Microbiology. 17(4):
840–862
GenBank; August 2009. http://www.ncbi.nlm.nih.gov/Genbank/
Gontang E, Fenical W, and Jensen P. 2007. Phylogenetic Diversity of Gram-Positive
Bacteria Cultured from Marine Sediments. Appl Environ Microbiol. San Diego 73(10):
3272-3282.
Hendrickson E, Payne J, Young R, Starr M, Perry M, Fahnestock S, Ellis D, and Ebersole R.
2002. Molecular Analysis of Dehalococcoides 16S Ribosomal DNA from Chloroethene-Contaminated Sites throughout North America and Europe. Appl Environ Microbiol. 68(2):
485-495.
Holt G John, Krieg R. Noel, Sneath H. A. Peter, Staley T. James, and Williams T. Stanley.
1994. Bergey’s manual of determinative bacteriology. 9th ed. Maryland: LIPPINCOTT
WILLIAMS & WILKINS.
Janda JM, Abbott Sharon. 2007. 16S rRNA Gene Sequencing for Bacterial Identification in
the Diagnostic Laboratory: Pluses, Perils, and Pitfalls. Minireview. Clinical Microbiology.
45(9): 2761–2764.
Janda JM, Abbott Sharon. 2002. Bacterial Identification for Publication: When Is Enough
Enough? Clinical Microbiology. 40(6): 1887-1891.
Krieg R Noel, Holt G. John. 1984. Bergey’s Manual of Systematic bacteriology. Volume 1
and 2. Baltimore: WILLIAMS & WILKINS.
Lozupone C, Knight R. 2009. Species Divergence and the Measurement of Microbial
Diversity. Microbiol. 32(4): 557-78.
Morris C, Bardin M, Berge O, Frey-Klett P, Fromin N, Girardin H, Guinebretiere M,
Lebaron P, Thiery J, Troussellier M. 2002. Microbial Biodiversity: Approaches to
Experimental Design and Hypothesis Testing in Primary Scientific Literature from 1975 to
1999; Microbiol and Molecular Biol. 66(4): 592-616.
Maier R, Pepper I, Gerba C. 2000. Environmental Microbiology. 2nd ed. Florida: Academic
Press.
The NCBI handbook [Internet]. National Center for Biotechnology Information; Entrez. C
1989 [Cited 2010 April]. Available from: http://www.ncbi.nlm.nih.gov/sites/entrez
Ouellette A, Handy S, Wilhelm S. 2005. Toxic Microcystis is Widespread in Lake Erie: PCR
Detection of Toxin Genes and Molecular Characterization of Associated Cyanobacterial
Communities; Microbial Ecology 51(2); 154-1656.
Sambrook and Russell. 2001. Molecular cloning. A lab manual. Volume 1. New York.
SSU & LSU. Green genes. [Internet]. Berkeley(Ca): Greengenes. [Updated 2010 September
28; cited 2010 March]. Available from: http://greengenes.lbl.gov/cgi-bin/JD_Tutorial/nph16S.cgi.
Sutton S: How do you decide which microbial identification system is best? The
Microbiology Network [Internet]. Chili (NY): [Updated 2010; cited 2010 March]. Available
from: http://www.microbiol.org/white.papers/WP.which.ID.htm.
Urbach E, Vergin K, Young L, Morse A. 2001. Unusual bacterioplankton community
structure in ultra-oligotrophic Crater Lake. Limnol. Oceanogr. 46(3): 557-572.
Wang, Y., Qian, Pei-Yuan; Conservative fragments in bacterial 16S rRNA genes and primer
design for 16S ribosomal DNA amplicons in metagenomic studies; NERC Center for
Ecology and Hydrology, United Kingdom, 2009.
APPENDIXES
SUPPORT TESTS
Plesiomonas shigelloides (A5B)
Rods Gram -
D-Glucose +
Growth on TSA media
D-Mannitol +
SIM (- + +)
Acinetobacter sp. (B5H)
Rods Gram -
D-Glucose -
Growth on TSA media
D-Mannitol -
SIM (- - -)
Exiguobacterium sp. (B7I)
Rods Gram +
D-Glucose +
Growth on TSA media
D-Mannitol +
SIM (- - +)
Enterobacter sp. (B7K)
Rods Gram -
D-Glucose +
Growth on TSA media
D-Mannitol +
SIM (- + +)
Enterotube: Glu + Lys - Orn - H2S/Ind -/+ Ado + Lact + Ara + Sorb - VP - Dul/Phe +/- Urea + Citr -
Aeromonas sp. (B5D)
Rods Gram -
D-Glucose +
Growth on TSA media
D-Mannitol +
SIM (- + +)
Enterobacter sp. (B7G)
Rods Gram -
D-Glucose +
Growth on TSA media
D-Mannitol +
SIM (- - +)
Enterotube: Glu + Lys - Orn - H2S/Ind - Ado - Lact - Ara + Sorb + VP + Dul/Phe +/- Urea - Citr +
Aeromonas sp. (B7C)
Rods Gram -
D-Glucose -
Growth on TSA media
D-Mannitol -
SIM (+ - 0 )
Acinetobacter sp. (B5G)
Rods Gram -
D-Glucose +
Growth on TSA media
D-Mannitol -
SIM (- - -)
Staphylococcus sp. (B5J)
Coccus Gram +
Growth on TSA media
Enterotube: Glu + Lys + Orn + H2S/Ind Ado - Lact - Ara - Sorb - VP + Dul/Phe - Urea + Citr -
Pseudomonas sp. (B7D)
Rods Gram -
D-Glucose +
Growth on TSA media
D-Mannitol -
SIM (+ + +)
Planococcus sp. (C5B)
Rods Gram +
D-Glucose -
Growth on TSA media
D-Mannitol -
SIM (- - -)