* Your assessment is very important for improving the workof artificial intelligence, which forms the content of this project
Download Genetic identification of eleven aquatic bacteria using the 16S rDNA
Population genetics wikipedia , lookup
No-SCAR (Scarless Cas9 Assisted Recombineering) Genome Editing wikipedia , lookup
Pathogenomics wikipedia , lookup
Nucleic acid double helix wikipedia , lookup
Human genome wikipedia , lookup
DNA barcoding wikipedia , lookup
Human genetic variation wikipedia , lookup
Primary transcript wikipedia , lookup
Genome evolution wikipedia , lookup
DNA supercoil wikipedia , lookup
Genomic library wikipedia , lookup
Epigenomics wikipedia , lookup
Gel electrophoresis of nucleic acids wikipedia , lookup
Point mutation wikipedia , lookup
Molecular cloning wikipedia , lookup
United Kingdom National DNA Database wikipedia , lookup
Nutriepigenomics wikipedia , lookup
Public health genomics wikipedia , lookup
Genetic testing wikipedia , lookup
Nucleic acid analogue wikipedia , lookup
Cre-Lox recombination wikipedia , lookup
Genealogical DNA test wikipedia , lookup
Deoxyribozyme wikipedia , lookup
Vectors in gene therapy wikipedia , lookup
Genome (book) wikipedia , lookup
Bisulfite sequencing wikipedia , lookup
Site-specific recombinase technology wikipedia , lookup
Cell-free fetal DNA wikipedia , lookup
Therapeutic gene modulation wikipedia , lookup
Extrachromosomal DNA wikipedia , lookup
Genome editing wikipedia , lookup
Non-coding DNA wikipedia , lookup
Designer baby wikipedia , lookup
Microsatellite wikipedia , lookup
Helitron (biology) wikipedia , lookup
Genetic engineering wikipedia , lookup
Artificial gene synthesis wikipedia , lookup
Microevolution wikipedia , lookup
Genetic identification of bacteria 1 Genetic identification of eleven aquatic bacteria using the 16S rDNA gene M. Gabriela Blocka and Anthony Ouelletteb* a Department of Chemistry, bDepartment of Biology and Marine Science, Jacksonville University, Jacksonville, Florida 04 October 2010 Submitted for publication: 06 March 2012 *Corresponding author. Department of Biology and Marine science, Jacksonville University, Jacksonville. Phone: (904) 256-7330. E-mail: [email protected] Genetic identification of bacteria 2 ABSTRACT Bacteria help maintain ecological balance by participating in the carbon, oxygen, and/or nitrogen cycles. These cycles are important for the organism’s survival, that’s why their identification is fundamental in order to determine how they function and interact in an ecosystem. In this study, eleven previously isolated bacteria from the upstate New York waterbodies (the Oswego River, Lake Ontario, and Lake Neatahwanta) were identified by sequencing a fragment of the 16S rDNA gene from each bacteria. Identification was supported by metabolic tests. The sequences were compared in the nucleotide database in GenBank (GenBank, 2009) and then aligned to construct a phylogenetic tree. The eleven bacteria were grouped into eight genera: Acinetobacter, Aeromonas, Planococcus, Enterobacter, Exiguobacterium, Pseudomonas, Plesiomonas, and Staphylococcus. The metabolic tests better supported the identification for the samples with longer sequences than samples with shorter sequences, which shows the importance of getting longer sequences to better identify bacteria. The identified genera of the isolated bacteria were found to occur in aquatic environments and Plesiomona shigelloides, one of the samples, is usually found in fish and other aquatic animals. Genetic identification of bacteria 3 INTRODUCTION The identification of bacteria is fundamental to understanding the biodiversity in an ecosystem as well as the ecological processes. Bacteria are very important to the environment because they interact with life on Earth by their metabolic activities. Nitrifying bacteria, for instance, make nitrogen available to plants and animals, sulfate-reducing bacteria interact in cycling of sulfur compounds, which affects the fertility of soil, and bacteria also contribute in the organic matter decomposition, an important process in the carbon cycle (Atlas & Robert, 1995). Some workers like R. H. MacArthur and G. E. Hutchinson state the importance of biodiversity as a measure of ecological processes like resource portioning, competition, succession, community productivity and as a community stability indicator (Morris et al., 2002). Bacteria can be identified by using different methods: like phenotypic, biochemical, or nucleic acid and genetic assays. Phenotypic identification of microorganisms is a traditional technique based on the size, motility and the morphology of the bacteria (Janda & Abbott, 2002). Biochemical identification needs chemical and biochemical markers to differentiate the products obtained by the metabolism from bacteria. “Metabolic fingerprint” is the name of the technique used to identify bacteria by using a pattern of tests that permit perform a better identification (Sutton, n.d.). Both types of methods can allow identification of bacteria to the genus level, or minimize the probability that they belong to other groups. With biochemical identification, bacteria can be classified to species level, but the level of classification can vary between groups Genetic identification of bacteria 4 (Baron, 1996). Nucleic acids assays include methods for identification that consists on the determination of the relative proportion of guanine and cytosine, however, this method does not rely on the linear arrangement of the nucleotides, and therefore, its accuracy is low. DNA and RNA homology experiments identify bacteria by hybridization of DNA or RNA molecules between species, but the procedure is expensive and time-consuming (Janda & Abbott, 2007). The genetic methods include the molecular analysis of DNA where DNA sections like genes or the entire genome are used to identify bacteria. Genetic identification using the entire genome is more accurate than using just one gene, like the 16S rDNA, however it is more time-consuming and expensive. The 16S rDNA gene sequencing technique has several advantages over phenotypic and biochemical identification. Nevertheless, none of these approaches is 100% accurate for bacteria identification. One of the advantages of using the 16S rDNA gene analysis is that this gene is present in all bacteria. The lack of extensive mutations in this gene is another advantage of using this type of analysis (Clarridge, 2004). In addition, the 1550bp long gene is easier to work with because sequencing smaller genes is cheaper and faster than larger genes (SSU & LSU, n.d.). The comparison of 16S rDNA gene sequencing with biochemical and phenotypic identification technique shows some advantages as well. Phenotypic characteristics, for instance, are not as accurate as genotypic methods like 16S rDNA, because in bacteria these characteristics can change due to environmental conditions, growth substrate, temperature, and pH levels (Janda & Abbott, 2002). Some of the disadvantages for 16S rDNA gene sequencing analysis include cost and technical considerations. However, the accuracy and other advantages discussed above are why some laboratories prefer to work with 16S rDNA gene sequence analysis. Genetic identification of bacteria 5 To identify microorganisms by the 16S rDNA gene, PCR (polymerase chain reaction) is utilized to amplify the gene. Many database libraries, such as Entrez Gene from Refseq (Reference sequence) are used to compare gene sequences for identification and to study phylogeny and taxonomy. For example, Entrez nucleotide library which works with BLAST (Basic Local Alignment Search Tool), have 1,965,631 partial 16S rRNA partial genes available for genetic identification. When sequences from these databases and samples to be identified present a >97% identity, they can be considered the same species (Lozupone & Knight, 2009). Due to the advantages of using the 16S rDNA gene for species identification, and for uncultured bacterial analysis to determine the bacterial diversity in a community, many projects have included, and are including this technique for their investigations. In one study, PCR was used to detect toxin genes of associated cyanobacterial communities in Lake Erie. In this study, PCR was employed to detect 16S rDNA and toxin gene fragments, and they were able to identify that toxigenic Microcystis was present in different sections of the lake from 1999 to 2002 (Ouellette, Handy, & Wilhelm, 2005). Another investigation found that the identification of bacteria isolated from clinical laboratories by 16S rDNA gene sequences is more accurate than by phenotypic testing (Clarridge, 2004). A phylogenetic analysis of the 16S rDNA amplicons from chloroethene-contaminated sites throughout North America and Europe, demonstrated that members of the Dehalococcoides group are widely distributed in nature and can be found in a variety of geological formations and different climatic zones (Hendrickson et al., 2002). Grampositive bcteria cultured from marine sediments were phylogenetically distinguished based on 16S rDNA gene sequences, with 65.6% of the bacteria belonging to the class Actinobacteria, and the other 34.4% were in the Bacilli class (Gontang, Fenical, & Jensen, 2007). Genetic identification of bacteria 6 As we see, the 16S rDNA gene sequence technique is being widely used for bacterial identification whether for ecological or medical purposes, and the results show accuracy. However, sequence databases are not always accurate because the sequences deposited in the databases can result from strain misidentification, hence a verification test is necessary to support the genetic characterization of species. Metabolic tests are still being used, and is one of the recommended tests to identify bacteria together with phenotypic tests, cellular fatty acid analysis, and genotypic testing methods (Janda & Abbott, 2002). The objective of this project was to genetically identify bacteria isolated from three different ecosystems: the Oswego River, Lake Ontario and Lake Neatahwanta in New York State. The bacterial samples were taken during blooms on August, 9th 2005 by Dr. Ouellette. The 16S rDNA was used for the identification of eleven bacteria and the sequences obtained during the project were used to construct a phylogenetic tree to identify the similarity between them. To support the results obtained from the identification, metabolic tests like acid and gas production from D-glucose and D-mannitol were used as well as catalase, oxidase and indole tests, H2S production, motility, Gram staining, and bacteria morphology and size. Genetic identification of bacteria 7 MATERIALS AND METHODS The bacteria were already isolated from the upstate New York waterbodies and stored in glycerol at -70°C. A5B, A5F, and A5C were isolated from Lake Ontario (Oswego Harbor), C5B was isolated from Lake Neatahwanta, and B5I, B5D, B5H, B5E, B5G, B5J, B7A, B7I, B7K, B7G, B7C and B7D were isolated from Oswego River in Minetto by John F. Heagerty. PCR amplification The amplification was performed with whole cell samples (DNA was not isolated from bacteria). Bacterial samples A5B, A5F, A5C, B5I, B5D, B5H, B5E, B5G, B5J, B7A, B7I, B7K, B7G, B7C, B7D, C5B were amplified with a master mix containing nuclease free water (Promega), primers 27F (sequence 5’ AGAGTTTGATCMTGGCTCAG 3’) and 1518R (sequence 5’ AAGGAAGGTGATCCANCCRCA 3’) (Urbach, Vergin, Young, & Morse, 2001) with 400nM as the final concentration for each reaction, 300ng/µl of Bovine serum albumin used as a stabilizer (BSA, Sigma cat#A-7030), and 0.04U/µl of bacterial Taq DNA polymerase stored in Buffer B(Promega). In order to perform the amplification, the samples were diluted 1:100 in TE (Tris-EDTA buffer). 40ul of the mastermix and 10µl of each sample was added to each PCR Genetic identification of bacteria 8 EasyStart (Molecular Bioproducts), tube. The positive control reaction contained 0.5pg/µl of DNA. The negative control contained 10 µl of nuclease free water in the 100µl reaction. The amplification was also performed on samples C5B and B7G (C5Bg and B7Gg) by using GoTaq green mastermix (Promega catalog# M7122) because some of the reactions did not amplify by using the previous mastermix. The amplification of the DNA was performed in a Peltier Thermal Cycler, model PTC-200 (MJ Research). The first denaturation of the DNA was performed at 94°C for 3 minutes, and was followed by 30 cycles that included denaturation at 94°C for 1 minute, annealing at 55°C for 1.5minutes and extension at 72°C for 3 minutes. The final extension was performed at 72°C for 7 minutes. Gel electrophoresis The samples were loaded in blue/orange 6x dye and separated through a 1% agarose gel in 1x TAE with ethidium bromide final concentration 0.5µg/ml (Sambrook & Russell, 2001). A 100bp DNA ladder (Promega) was used to compare the length of the isolated amplicons. Electrophoresis was conducted at 120 volts for 40 minutes and the gel was destained with 1x TAE for 30 minutes on a shaker. The gel was analyzed with a transilluminator BIO-RAD system. The contrast of the original gel figures was altered in order to visualize the DNA ladders, (Figure 1a, t; Figure 2a; Figure 3a, q; Figure 4a). Purification of the PCR products The DNA gel bands were cut with disposable blades to avoid contamination under a Foto/UV 15by Fotodyne chamber, and the gel samples were stored at -20°C until purification. The purification of the PCR products was carried out with the Wizard SV Gel and PCR Clean- Genetic identification of bacteria 9 Up System Kit (Promega). The purified PCR products A5B, B5H, B5D, B7G, B7K, B7I, B5E were collected in 15µl of Nuclease-Free Water while the PCR products B5G, B5J, B7C, B7D, and C5B were collected in 20µl. The purified DNA was stored at -20°C. DNA quantification The purified samples were quantified by the PicoGreen dsDNA Quantitation Reagent and Kit (Molecular Probes). The DNA concentration was determined with a Fluorescence Spectrometer Model LS 55 (Perkin Elmer). In order to obtain the DNA concentration of the samples, a standard curve with 6 different concentrations (from 0.025ng/mL to 100ng/mL) of dsDNA was used (Table 3). The samples were prepared with 1µl of the purified PCR products, 999µl of TE, and 1000µl of diluted PicoGreen Reagent. The standard curve and the samples were analyzed using 480nm excitation and 520nm emission. DNA sequencing and phylogenetic analysis The samples were sent to the University of Tennessee-Knoxville (UTK) Molecular Biology Resource Facility (MBRF) in Knoxville, Tennessee for sequencing. The sample’s sequences were analyzed using BLAST and the Molecular Evolutionary Genetics Analysis (MEGA) program. The sequences were aligned to build a phylogenetic tree by using Neighbor Joining analysis with complete deletion of gaps using the Mega4 software package. Support information Genetic identification of bacteria 10 Metabolic tests were used to verify the data obtained from the genetic identification. Metabolic tests like D-Glucose, D-Mannitol, H2S production, catalase, oxidase (not performed on B5J), and production of indole. Other tests like motility and Gram staining were also performed for the morphologic support information. Cell size was determined using a Swiftcam Imaging II. All these tests were performed on samples A5B, B7I, B7K, B5D, B5H, B7G, C5B, B5J, B7C, B5G, B5J. Additionally, the metabolic tests for samples B7K, B7G, and B5J were performed using BD BBLTM EnterotubesTM II (for results obtained from metabolic tests and Gram staining look on appendixes). The data obtained from the tests were analyzed using Bergey’s manual (Holt, Krieg, Sneath, Staley, & Williams, 1994; Krieg & Holt, 1984). Genetic identification of bacteria 11 RESULTS PCR The amplification was performed with whole cell samples (DNA was not isolated from bacteria) that were frozen after being isolated from Lake Ontario (Oswego Harbor), Lake Neatahwanta, and Oswego River in Minetto. Samples A5B, B5D, B5H, B5E, B5G, B5J, B7I, B7K, B7G, B7C, B7D, C5B were amplified (Figure 1: Gel Ac, e-p; Gel Bc, d, g-p; Gel Cc-l; Gel Dc-i). Samples B7A, and B7C (pure) were not amplified, and it can be identified by the absence of bands (Figure 1Be, f, j). Samples B7Gg and C5Bg were designated with the “g” because they were amplified using the GOTaq green Mastermix. The bands are observed as bright bands (B7Gg Figure 1Ck, Cl, Df and C5Bg Dg-i). The presence of a band for the control (+) confirms the effectiveness of the PCR (Figure 1 Aq, Br, Cn, Dj), furthermore, the absence of a band for control (-) demonstrates the lack of contamination in the PCR amplification and gel electrophoresis (Figure 1 Ar, Bs, Co and Dk). DNA quantification Genetic identification of bacteria 12 DNA quantification was performed to determine the amount of DNA in each sample. This is a procedure needed for DNA sequencing, which is the next step for the bacteria identification. DNA was quantified three times. The first DNA quantification was performed on A5B, B5H, B5D, B7G, B7K, B7I and B5E. The second was performed on C5B, B5J, B7C, B5G, and B7D. The third DNA quantification was performed on B7G, B7D, B5G, B7C, B7Gg, and C5Bg. All the quantifications were performed using a standard curve (Figure 2). The total DNA amount for each amplified sample in both DNA quantifications is within the requirements for sequencing (155ng of DNA based on 10ng per 100bp) except for samples A5B and B5E (Table 1). 16S rDNA gene sequences analysis 16S rDNA gene was used for the bacteria identification. This gene encodes a component of the small subunit in the ribosome. All samples except for B5E, because of its low amount of DNA (21.37ng), were sent for sequencing the 16S rDNA gene. The first time, the gene was sequenced from samples A5B, B7I, B7K, B5D, B5H, and B7G (Figure 3). All the samples had sequences longer than 400bp (Table 2). A second set of samples B5J, B7C, B5G, C5B, and B7D were also subjected to sequencing reaction, however none of the sequences had more than 244bp (Table 2) and the lack of nucleotides can be also observed from the images obtained from sequencing (Figure 4). On a third time, B7C, B5G, B7D, B7G, B7Gg, and C5Bg were sequenced (B7C, B5G, B7D, B7G, C5B were amplified with Taq polymerase while B7Gg and C5Bg were amplified with GOTaq green polymerase), nevertheless the peaks for B7C, B5G, and B7G (Figure 4) were not clear Genetic identification of bacteria 13 enough to identify nucleotides in the sequences. B7D and B7Gg had less than 200bp. C5Bg was the only one sample with more than 200 nucleotides (Table 2). Phylogenetic analysis Sequences were compared to other sequences from the genetic nucleotide library by using BLAST. The 16S rDNA gene sequences size was modified for each sample by eliminating undefined nucleotides at the beginning and at the end of the sequences because the peaks on that section are not well defined (Table 3). The sequences were then analyzed in two phylogenetic trees in order to better appreciate the relationship between they and the sequences found in BLAST. Samples with more than 200 nucleotides were grouped in one tree (Figure 5) while the ones with less than 200 nucleotides were grouped in a second tree (Figure 6). This analysis revealed samples in both trees were grouped within a genus that was closely related to the sequences. Samples B5D and B7C which are grouped within the Aeromonas group were analyzed in a third tree (Figure 7a) to identify which of the samples was more closely related to the Aeromonas genus. In this phylogenetic tree B5D, the sample with more nucleotides, is closer related to Aeromonas compared to B7C which is grouped even out of the node. In the same way, samples B5G and B5H were also analyzed in a tree (Figure 7b) to differentiate both samples. Similarly to B5D-B7C analysis, B5G sample, which is the one with less number of nucleotides, was grouped further away from the Acinetobacter genus compared to B5H. Metabolic tests as support information Metabolic tests were used to verify the data obtained from the genetic identification. Tests like D-Glucose, D-Mannitol, H2S production, catalase, oxidase, indole, motility, Gram Genetic identification of bacteria 14 staining, and cell size were performed for the morphologic and metabolic tests as support information. Enterotubes were also used but not for all samples (for results obtained from metabolic tests and Gram staining look on appendixes). Enterobacter sp. (B7K) had similar metabolic tests with Enterobacter, Pantoea and Klebsiella genus. B7K was tested by using enterotubes. From twenty-two tests, five were inconsistent with Enterobacter cloacae, eight from Pantoea sp., two from Klebsiella sp. and two from Pantoea agglomerans. However, the percentage of certainty for eight tests for the last species was not high. Enterobacter sp. (B7G) was also tested on enterotubes due to its similarity with Enterobacter. Seventeen tests support B7G close relationship to Enterobacter sp., while for Escherichia coli and Escherichia hermanii there were four and five inconsistencies for metabolic tests. The support information for Plesiomonas shigelloides (A5B) was consistent for nine tests out of ten. The colony pigmentation was one of the inconsistencies. A5B presented colony pigmentation after growing on TSA (Tryptic Soy Agar) and also formed acid from the fermentation of D-Mannitol. Plesiomonas shigelloides usually does not present pigmentation (unknown media of growth used) and it is usually negative for D-Mannitol (0-10% is positive). For Aeromonas hydrophila, A. sobria, A. veronii. (B5D and B7C) metabolic analysis was consistent in all the three species for the Aeromonas genus. A. hydrophila was not consistent for the H2S test. While A. sobria and A veronii were negative for H2S production as B5D. However, A. sobria was negative for D-Glucose gas production contrary to A. veronii. B7C was one of the samples with less than 200 nucleotides, and from the DNA comparison in BLAST, it was grouped within the Aeromonas genus like sample B5D. Nevertheless, B7C had four Genetic identification of bacteria 15 inconsistencies from eleven tests. B7C was negative for D-Glucose and D-Mannitol acid production, gas production and indole test which do not match with any of the Aeromonas species (just with A. sobria for D-Glucose gas production). Acinetobacter sp. (B5H and B5G). Both samples were consistent in all tests (Janda & Abbott, 2007) for the Acinetobacter genus. Four tests were not confirmed because of the lack of information on literature review. Exiguobacterium sp. (B7I) was consistent for six tests out of seven. The oxidase test was not consistent, it was negative. There was lack of information for four tests (Acid from DGlucose and D-Mannitol, H2S and indole), therefore they could not be used as support information. Planococcus sp. (C5B) two tests from eleven tests were not consistent. One of them is the shape of the bacteria, which indicates C5B as a rod while in the literature Planococcus has a coccus shape. The second inconsistency was the lack of motility in C5B compared to Planococcus (Pictures of tests on appendixes). Staphylococcus sp. (B5J) was consistent for all tests (Holt, Krieg, Sneath, Staley, & Williams, 1994). B5J was tested using Enterotubes, however there was not enough metabolic information about Staphylococcus genus to compare all the tests included in the enterotubes. The colony pigmentation, D-Mannitol, and lactose test were marked as “d” which means 26-75% of bacteria are positive for these tests. For urea test on S. cohnii the test was inconsistent (negative) however it is positive for the S. cohnii subspecies urealyticus. Pseudomonas sp. (B7D) was consistent for all the tests found for Pseudomonas genus (four). Genetic identification of bacteria 16 DISCUSSION Sixteen samples were attempted to be amplified. However, twelve of them successfully amplified (A5B, B5D, B5H, B5E, B5G, B5J, B7I, B7K, B7G, B7C, B7D, and C5Bg) (Figure 1A, B, C, D). The samples A5F, A5C, B7A (Figure 1Be, f) and B5I did not amplify maybe due to lack or low DNA concentration. A low DNA concentration in samples can be the result of DNA degradation, another cause could be that the cells did not lyse by freezing, or maybe there were inhibitors that interfered with the binding between the primers and the 16S rDNA gene. Primers can also be a cause for PCR failure, because if the sequence of the primer is not similar to the sample sequence, then they will not match, and therefore the 16S rDNA will not amplify (Wand et al., 2009). Inhibitors might be also the cause for the lack of amplification for one of the B5J samples (Figure 1Bj), since the same sample was amplified after being diluted 1:100 (Figure 1Bi). The DNA amount obtained from B5D, B5H, B5G, B5J, B7I, B7K, B7G, B7C, B7D, and C5Bg samples was over 155ng (Table 1), which is the minimum amount required to sequence the 16S rDNA gene. DNA from samples B5E and A5B was low (21.37ng and 94.51ng). The Genetic identification of bacteria 17 consequence of having a low DNA concentration could be a short sequence. We can confirm this with the short sequence size obtained from A5B, 557nucleotides (Table 2). B5E sample was not sequenced because its DNA amount was too low (determined by fluorescence). The obtained low concentration of DNA for these samples could be due to human error when isolating the DNA fragment of the gel for the purification process. The first set of DNA sequences are from samples A5B, B5D, B5H, B7I, B7K, and B7G (Figure 1A). The second set of DNA sequences are from samples B5G, B5J, B7C, B7D, and C5B. The amount of DNA was over 155ng for all the samples, however the images from the sequences and the number of nucleotides were very low or null for most of the samples (Figure 7, Table 2). A third sequencing was performed to test the viability of the Taq polymerase used for the amplification. Samples B7G, B7C, B5G, and B7D were amplified using the same Taq polymerase used for the first and second set, however two samples, C5Bg, and B7Gg were sequenced using GoTaq polymerase. Two samples, B7C, B7D, and B7Gg had smaller sequences compared to the samples sequenced on the first set and compared to the sequence of C5Gg (Table 2). The difference in sequences size could be due to the Taq polymerase. Both polymerases had already expired (Taq polymerase in 2008 and GoTaq polymerase in 2009). Another reason could be a long loss of power, which occurred on January 2010 and lasted for 8 hours. This blackout exposed the enzymes to higher temperatures than the recommended for storage purposes (-20C for Taq polymerase and 4C for GoTaq polymerase). Thus, problems that may have caused DNA amplification failure are the multiple priming sites (“National Center for Biotechnology”, 1989) and maybe the long blackout and/or the expiration date of the reactants influenced the Taq to multiple prime and produce the faulty results. Genetic identification of bacteria 18 The sequences’ size differs between the DNA samples, however all of them were identified and grouped. The species with higher percentage of identity compared to A5B is Plesiomonas shigelloides with 100% of identity (Table 3). Phenotypic and metabolic tests are consistent with the identification. However two tests do not coincide: Colony pigmentation and D-Mannitol. On TSA (Tryptic Soy Agar), A5B presented a yellowish pigment in the colony. However, Plesiomonas shigelloides does not present colony pigmentation (Holt, Krieg, Sneath, Staley, & Williams, 1994). The difference of pigmentation could be the result of growth in different media. Production of acid in D-Mannitol was another test that did not coincide. A5B produced acid, but Plesiomonas usually do not produce acid, however 0-10% of them does (Table 4). The other ten metabolic tests are supportive with the identification. Plesiomonas usually occur in aquatic animals, and this is another characteristic that also matches with the A5B samples because it was actually isolated from an aquatic environment. Samples B5D and B7C are grouped within the Aeromonas genus. The metabolic tests for B5D support a closer relationship with A. veronii, however a closer look in the phylogenetic tree shows a closer relationship with A. hydrophila and A. sobria (Figure 7a). The inconsistency with A. hydrophila is the lack of H2S production in B5D, and in A. sobria is the production of gas on D-Glucose fermentation. As we see, the inconsistencies are just one test for each species. A positive or negative reaction is not always 100% for the species, therefore, B5D could be within the 0-10% of bacteria that react differently for a specific metabolic reaction (Table 4). B7C has more metabolic tests that are inconsistent with the identification under the Aeromonas genus (Table 4). The phylogenetic tree also shows a further relationship between Aeromonas sp. than to B5D, B7C is even grouped out of the Aeromonas node (Figure 7a). Genetic identification of bacteria 19 B5H and B5G (Acinetobacter sp.) are grouped together and the metabolic tests are consistent for both. Nevertheless, a closer analysis of both samples through a phylogenetic tree shows B5H is more closely related to Acinetobacter sp. than B5G (Figure 7b). B5G´s sequence was short compared to B5H´s sequence and the lack of nucleotides in a sequence results in an inaccurate identification. However, B5G might still be within the Acinetobacter genus, but the species is different from the B5H sample. The inconsistencies in metabolic tests for Planococcus sp. (C5B), Pseudomonas sp. (B7D), Exiguobacterium sp. (B7I), Enterobacter sp. (B7K and B7G), and Staphylococcus sp. (B5J) (Table 4) can be due to the inaccuracy of the tests. They could also mean these bacteria correspond to a species or subspecies which sequences are not in the GenBank or have not been sequenced at all. 16S rDNA gene and metabolic tests helped in the identification of the bacteria isolated from the New York upstate waterbodies. The phylogenetic tree analysis helped group bacteria in orders and classes. The two main classes are Gammaproteobacteria and Bacilli (Figure5 and 6). Every bacteria has different functions, for example Pseudomonas(B7D) participate in nutrient cycling and biodegradation (Maier & Pepper, 2000), and by their different participations in the ecosystem, bacteria maintain and equilibrated ecosystem for the good of all organisms that live in it. Genetic identification of bacteria 20 FIGURES AND TABLES Gel A a b c d e f g h i j k l m n o p q r s t 1500bp 1500bp Gel B a b c d e f g h i j Gel C a b c Gel D a b d c e f d e g h f k l m n o p q r i j g h k l i m j s n k o p l q Genetic identification of bacteria 21 Figure 1. PCR amplicons. *=1:100, #=1:10; ∞=undiluted. Aa, Ba, Ca, Da. Altered contrast from original DNA ladder image (b); At, Cq. Altered contrast from original DNA ladder image (As and Cq). Aq, Br, Cn, Dj. Control (+); Ar, Bs, Co, Dk. Control (-); From gel A: c,d, and e. B5E#; f, g, and h A5B#; i. B5H* ; j, k. B5D*; l. B7G*; m, n. B7K*; o, p. B7I*; From gel B: c, and d. C5B*∞; e and f. B7A*∞; g, and h. B5J*∞; i, and j. B7C*∞; k, l, and m. B5G*∞ and mixture of *∞; n, o, and p. B7D*∞ and mixture of ∞; q. B7D*; From gel C: c, d. B7G*; e, f. B7D*; g, h. B5G*; i, j. B7C*; k, l. B7Gg*; m. B7C*; From gel D: c. B7C*; d. B5G*; e. B7G*; f. B7Gg*; g, Fluorescence (rfu) h, i. C5B*. a 230.00 y = 0.0023x + 0.4215 R² = 0.9999 180.00 130.00 80.00 30.00 -20.00 0 20000 40000 60000 80000 100000 DNA concentration (pg/ml) Genetic identification of bacteria 22 10.00 y = 0.0023x + 0.4215 R² = 0.9999 Fluorescence (rfu) 8.00 6.00 4.00 2.00 0.00 -2.00 0 1000 2000 3000 DNA concentration (pg/ml) b Figure 2. Standard curve for DNA. a. Complete standard curve for the first DNA quantification; b. First three data for the standard curve. FIRST SEQUENCING SECOND SEQUENCING Samples Samples Total DNA amount (ng) Total DNA THIRD SEQUENCING Samples amount (ng) Total DNA amount (ng) A5B 95 B7D 163 B7D 547 B5E 21 C5B 163 C5Bg 952 B7I 325 B5G 169 B5G 555 Genetic identification of bacteria 23 B7K 328 B7C 164 B7C 485 B5D 310 B5J 167 - - B5H 296 - - B7Gg 677 B7G 430 - - B7G 492 Table 1. DNA quantification. Amount of DNA obtained for each sample. All samples have more than 155ng except for sample A5B and B5E. A5B B7I B7K Genetic identification of bacteria 24 B5D B5H B7G Figure 3. DNA Sequencing. Sequence images obtained from a section of the 16S rDNA gene obtained from each sample from first sequencing. B5J B7C(a) B7D (a) B7C(b) B7D (b) Genetic identification of bacteria 25 C5B(a) C5Bg(b) B5G(a) B7G(a) B5G(b) B7Gg(b) Figure 4. DNA Sequencing. Sequences images obtained from a section of the 16S rDNA gene from each sample. (a) Second sequencing; (b) Third sequencing. Genetic identification of bacteria 26 Samples # nucleotides amplified with Taq polymerase # nucleotides amplified with Taq polymerase # nucleotides amplified with GOTaq A5B 557 B5H 1306 B7I 437 B7K 441 B5D 1281 B7G 958 B7C 244 B7D 0 C5B 0 B5G 84 B5J 144 - - - - - 0 102 0 - 70 - 716 0 - - Table 2. Number of pair of bases sequenced for each sample. Samples were sequenced at different times. Genetic identification of bacteria 27 Samples B7K Number of nucleotides used for BLASTa 384 Identityb Accession numberc %IDd Enterobacter cloacae AJ001245 99 Pantoea sp. GQ288411 98 Pantoea agglomerans FJ603031 97 Klebsiella oxytoca GU253335 97 Leclercia adecarboxylata GU265700 97 B7G 826 Enterobacter sp. DQ855282 98 Escherichia hermannii AB479110 98 Escherichia coli EU026432 98 Escherichia senegalensis AY217654 98 A5B 382 Plesiomonas shigelloides FJ375179 100 B5D 938 Aeromonas veronii FJ653620 99 Aeromonas hydrophila AB473014 99 Aeromonas sobria AB473004 99 B7C 197 Aeromonas hydrophila GU294303 99 Aeromonas veronii FJ653620 98 Aeromonas sobria AB473004 94 B5H 845 Acinetobacter sp. GU451179 100 Acinetobacter baylyi FJ976561 100 Acinetobacter soli FJ976568 100 B5G 68 Acinetobacter sp. DQ129723 100 B7I 332 Exiguobacterium acetylicum GQ284347 96 Exiguobacterium sp FM991853 96 Exiguobacterium indicum GQ284484 96 C5B 687 Planococcaceae bacterium FM209428 99 Bacillus psychrotolerans GQ342531 99 Planococcus sp. DQ375559 98 Crocinobacterium jejui AM295339 98 Bacillus psychrodurans EF101552 98 Paenibacillus sp. EU621913 98 B5J 65 Staphylococcus sp. GU451172 98 Staphylococcus saprophyticus GU201857 98 Staphylococcus warneri GU397393 98 Staphylococcus cohnii GU339231 98 B7D 50 Pseudomonas sp. GQ243735 94 Table 3. BLAST results. a. Total number of pair of bases after deleting undefined nucleotides at the beginning and at the end of each sequence. b. Species with similar sequence found in BLAST. c. Identification code of the species. d. Percent identity compared to the sample sequences. Genetic identification of bacteria 28 B7K Klebsiella oxytoca Pantoea agglomerans Enterobacter cloacae Leclercia adecarboxylata B7G Enterobacter sp. Escherichia coli Enterobacter hermannii A5B Plesiomonas shigelloides Aeromonas veronii Aeromonas sobria B5H Acinetobacter sp. Acinetobacter baylyi Acinetobacter soli Order. Pseudomonadales Aeromonas hydrophila Order. Aeromonales B5D Class. Gammaproteobacteria Escherichia senegalensis Order. Enterobacteriales Pantoea sp. Exiguobacterium acetylicum Exiguobacterium indicum Exiguobacterium sp. B7I Bacillus psychrodurans Paenibacillus sp. Planococcus sp. C5B Class. Bacili Crocinobacterium jejui Bacillus psychrotolerans Planococcaceae bacterium 0.05 Figure 5. Neighbor-joining phylogenetic tree from 16S rDNA genes. From samples: B7K, B7G, AB, B5D, B5H, B7I, C5B (Samples with more than 200bp) and sequences obtained from BLAST. Genetic identification of bacteria 29 Staphylococcus warneri Staphylococcus sp. B5J Staphylococcus cohnii B7C Aeromonas veronii Aeromonas hydrophila Aeromonas sobria B5G Acinetobacter sp. B7D Pseudomonas sp. Class. Gammaproteobacteria Class. Bacili Staphylococcus saprophyticus 0.05 Figure 6. Neighbor-joining phylogenetic tree from 16S rDNA genes. From samples: B5J, B7C, B5G, and B7D (Samples with less than 200bp) and sequences obtained from BLAST. Aeromonas veronii Aeromonas sobria Aeromonas hydrophila B5D B7C a. 0.001 Acinetobacter baylyi Acinetobacter soli Acinetobacter sp. B5H b. B5G Figure 7. Neighbor-joining phylogenetic tree from 16S rDNA genes. a. Analysis between samples B5D and B7C. b. Analysis between samples B5H and B5G. 0.6-6.0 0.3-1.0 + + + + + [+] [-] + NA d NA +b [+] 1.0-3.0 0.5-1.0 [+] + [-] + [-] + [-] + d [-] [+] [-] + + 2.7±0.6 1.3±0.2 + + + + + + + + + + [+]c 1.2-3.0 0.6-1.0 + + + + NA NA +a d d + d + NA NA + + 2.0-6.0 1.1-1.5 + + + + [+] NA NA d + + + NA NA - Staphylococcus saprophyticus + + + NA NA + d + NA NA - + 2.2±0.3 + + NA + - Escherichia coli Enterobacter sp. B7G Pantoea agglomerans Klebsiella sp. - + d NA NA NA d NA NA NA NA NA NA d NA NA NA NA + NA Staphylococcus cohnii d + + [-] + + + + [-] d + + 1.0-3.0 0.5-1.0 + + + + + + +/+ Staphylococcus wameri 1.2-3.0 0.6-1.0 + + + + B5J + 2.6±0.6 1.8±0.4 + + + + + + + + + + + - Escherichia hermanii Gram Yellow pigment rods length and diameter (nm) cocci (size) Motility D-Glucose D-Glucose gas D-Mannitol HsS production Indole Catalase Oxidase Lysine Ornithine Adonitol Lactose Arabinose Sorbitol VP Dulcitol Phenylalanine Urea Citrate Pantoea sp. B7K Enterobacter cloacae Genetic identification of bacteria 30 + d 0.5-1.5 NA NA NA d NA NA NA NA NA NA d NA NA NA NA + NA + d/NA NA NA d NA NA NA NA NA NA -e NA NA NA NA -f NA motility D-Glucose (acid production) D-Glucose (gas production) D-Mannitol (acid production HsS production - + + - - - - + 2.4±1.0 + 1.8±0.4 1.5-2.5 + 3.8±0.9 + - 2.1±0.3 ± 1.5-5.0 + 3.6±1.0 + 3.7±0.8 1.4±0.4 0.8±0.3 0.9-1.6 1.4±0.3 - 1.2±0.3 0.5-1.0 1.5±0.5 1.0±0.3 - - - - 1-1.2 - - - - - - - - + ± + + + ± - - + ± + - - NA - - + NA - - NA - - - - - - - + indole - - catalase + + + Plesiomonas shigelloides - A5B - Exiguobacterium sp. - B7I Aeromonas veronii - Aeromonas sobria B7C B5D Pseudomonas sp. B7D Planococcus sp. C5B Acinetobacter sp. - Aeromonas hydrophila Gram Yellow pigment Rods: length and diameter (nm) Cocci diameter (nm) B5G B5H Genetic identification of bacteria 31 + + - - 1.0-3.0 + 2.4±0.4 + 1.4-3.2 + 2.8±1.0 3 0.3-1.0 2.1±0.8 1.1-1.2 1.2±0.4 0.8-1 - - - - - - - + + + + + + + - + + + + + + + + - + - + - NA - - NA + - + d + + NA + - + NA - + + - - - NA - - - + NA + - + + + - NA + + + + NA + + + + + + + + + oxidase ± ± + + + + + + + Table 4. -, 0-10% positive; +, 90-100% positive; [-], 11-25% positive; [+], 76-89% positive d, 26-75% positive; a, E. agglomerans is negative ; b, K. pneumonia subs. rhinoscleromatis and some ozenae are negative; c, E. agglomerans; e, positive in S. cohnii subsp urealyticus; f, positive in S. cohnii subsp urealyticus Genetic identification of bacteria 32 ACKNOWLEDGMENTS I thank Dr. Lucinda Sonnenberg for her valuable advice, guidance, support, and help during this project. I also thank Ayesha Patel for her advice, friendship and help. Samples were sequenced at the UTK Molecullar Biology Resource Facility (MBRF) in Knoxville. This study was supported by the Department of Biology and Marine Science and the Department of Chemistry. Genetic identification of bacteria 33 REFERENCES Atlas R, Robert J, editor. 1995. Principles of Microbiology, Mosby. Lousville: Callanan. 1st ed. 23- 24pp. Baron Samuel. Medical microbiology. [Internet] Galveston (TX): Phenotypic Characteristics Useful in Classification and Identification. C 1996 [cited 2010 March]. Available from: http://www.ncbi.nlm.nih.gov/bookshelf/br.fcgi?book=mmed&part=A347 Clarridge J. 2004. Impact of 16S rRNA Gene Sequence Analysis for Identification of Bacteria on Clinical Microbiology and Infectious Diseases. Clinical Microbiology. 17(4): 840–862 GenBank; August 2009. http://www.ncbi.nlm.nih.gov/Genbank/ Gontang E, Fenical W, and Jensen P. 2007. Phylogenetic Diversity of Gram-Positive Bacteria Cultured from Marine Sediments. Appl Environ Microbiol. San Diego 73(10): 3272-3282. Hendrickson E, Payne J, Young R, Starr M, Perry M, Fahnestock S, Ellis D, and Ebersole R. 2002. Molecular Analysis of Dehalococcoides 16S Ribosomal DNA from ChloroetheneContaminated Sites throughout North America and Europe. Appl Environ Microbiol. 68(2): 485-495. Holt G John, Krieg R. Noel, Sneath H. A. Peter, Staley T. James, and Williams T. Stanley. 1994. Bergey’s manual of determinative bacteriology. 9th ed. Maryland: LIPPINCOTT WILLIAMS & WILKINS. Janda JM, Abbott Sharon. 2007. 16S rRNA Gene Sequencing for Bacterial Identification in the Diagnostic Laboratory: Pluses, Perils, and Pitfalls Minireview. Clinical Microbiolog. 45(9): 2761–2764. Genetic identification of bacteria 34 Janda JM, Abbott Sharon. 2002. Bacterial Identification for Publication: When Is Enough Enough? Clinical Microbiology. 40(6): 1887-1891. Krieg R Noel, Holt G. John. 1984. Bergey’s Manual of Systematic bacteriology. Volume 1 and 2. Baltimore: WILLIAMS & WILKINS. Lozupone C, Knight R. 2009. Species Divergence and the Measurement of Microbial Diversity. Microbiol. 32(4): 557-78. Morris C, Bardin M, Berge O, Frey-Klett P, Fromin N, Girardin H, Guinebretiere M, Lebaron P, Thiery J, Troussellier M. 2002. Microbial Biodiversity: Approaches to Experimental Design and Hypothesis Testing in Primary Scientific Literature from 1975 to 1999; Microbiol and Molecular Biol. 66(4): 592-616. Maier R, Pepper I. Gerba C. 2000. Environmental Microbiology. 2nd ed. Florida: Academic Press. The NCBI handbook [Internet]. National Center for Biotechnology Information; Entrez. C 1989 [Cited 2010 April]. Available from: http://www.ncbi.nlm.nih.gov/sites/entrez Ouellette A, Handy S, Wilhelm S. 2005. Toxic Microcystis is Widespread in Lake Erie: PCR Detection of Toxin Genes and Molecular Characterization of Associated Cyanobacterial Communities; Microbial Ecology 51(2); 154-1656. Sambrook and Russell. 2001. Molecular cloning. A lab manual. Volume 1. New York. SSU & LSU. Green genes. [Internet]. Berkeley(Ca): Greengenes. [Updated 2010 September 28; cited 2010 March]. Available from: http://greengenes.lbl.gov/cgi-bin/JD_Tutorial/nph16S.cgi. Genetic identification of bacteria 35 Sutton S: How do you decide which microbial identification system is best? The Microbiology Network [Internet]. Chili (NY): [Updated 2010; cited 2010 March]. Available from: http://www.microbiol.org/white.papers/WP.which.ID.htm. Urbach E, Vergin K, Young L, Morse A. 2001. Unusual bacterioplankton community structure in ultra-oligotrophic Crater Lake. Limnol. Oceanogr. 46(3): 557-572. Wand, Y., Qian, Pei-Yuan; Conservative fragments in bacterial 16S rRNA genes and primer design for 16S ribosomal DNA amplicons in metagenomic studies; NERC Center for Ecology and Hydrology, United Kingdom, 2009. Genetic identification of bacteria 36 APPENDIXES SUPPORT TESTS Plesiomonas shigelloides (A5B) Rods Gram - D-Glucose + Growth on TSA media D-Mannitol + SIM (-++) Genetic identification of bacteria 37 Acinetobacter sp. (B5H) Rods Gram - D-Glucose - Growth on TSA media D-Mannitol - SIM (- - -) Genetic identification of bacteria 38 Exiguobacterium sp. (B7I) Rods Gram + D-Glucose + Growth on TSA media D-Mannitol + SIM (- - +) Genetic identification of bacteria 39 Enterobacter sp. (B7K) Rods Gram - D-Glucose + Glu + Lys - Growth on TSA media D-Mannitol + Orn - H2S/Ind-/+ Ado+ Lact+ SIM (- + +) Ara+ Sorb- VP- Dul/Phe+/- Urea+ Citr - Enterotube Genetic identification of bacteria 40 Aeromonas sp. (B5D) Rods Gram - D-Glucose + Growth on TSA media D-Mannitol + SIM (- + +) Genetic identification of bacteria 41 Enterobacter sp. (B7G) Rods Gram - D-Glucose + Growth on TSA media D-Mannitol + Glu + Lys - Orn - H2S/Ind- Ado - Lact - Ara+ Enterotube SIM (- - +) Sorb+ VP+ Dul/Phe+/– Urea- Citr + Genetic identification of bacteria 42 Aeromonas sp. (B7C) Rods Gram - D-Glucose - Growth on TSA media D-Mannitol - SIM (+ - 0 ) Genetic identification of bacteria 43 Acinetobacter sp. (B5G) Rods Gram - D-Glucose + Growth on TSA media D-Mannitol - SIM (- - -) Genetic identification of bacteria 44 Staphylococcus sp. (B5J) Coccus Gram + Growth on TSA media Glu + Lys + Orn + H2S/Ind Ado - Lact - Ara- Enterotube Sorb- VP+ Dul/Phe – Urea + Citr - Genetic identification of bacteria 45 Pseudomonas sp. (B7D) Rods Gram - D-Glucose + Growth on TSA media D-Mannitol - SIM (+ + +) Genetic identification of bacteria 46 Planococcus sp. (C5B) Rods Gram + D-Glucose- Growth on TSA media D-Mannitol - SIM (- - -)