Download Metabolomics based gene function annotation in Escherichia coli

Document related concepts

Paracrine signalling wikipedia , lookup

Biosynthesis wikipedia , lookup

Secreted frizzled-related protein 1 wikipedia , lookup

Genetic engineering wikipedia , lookup

Community fingerprinting wikipedia , lookup

Gene nomenclature wikipedia , lookup

RNA-Seq wikipedia , lookup

Two-hybrid screening wikipedia , lookup

Fatty acid synthesis wikipedia , lookup

Gene wikipedia , lookup

Gene therapy wikipedia , lookup

Expression vector wikipedia , lookup

Amino acid synthesis wikipedia , lookup

Vectors in gene therapy wikipedia , lookup

Fatty acid metabolism wikipedia , lookup

Pharmacometabolomics wikipedia , lookup

Biochemical cascade wikipedia , lookup

Specialized pro-resolving mediators wikipedia , lookup

Biochemistry wikipedia , lookup

Metabolism wikipedia , lookup

Silencer (genetics) wikipedia , lookup

Gene regulatory network wikipedia , lookup

Hepoxilin wikipedia , lookup

Point mutation wikipedia , lookup

Gene therapy of the human retina wikipedia , lookup

Gene expression profiling wikipedia , lookup

Endogenous retrovirus wikipedia , lookup

Artificial gene synthesis wikipedia , lookup

Metabolomics wikipedia , lookup

Transcript
Graduate Theses and Dissertations
Graduate College
2014
Metabolomics based gene function annotation in
Escherichia coli and Methanosarcina acetivorans
with genetic gain- or loss-of-function strains
Lucas John Showman
Iowa State University
Follow this and additional works at: http://lib.dr.iastate.edu/etd
Part of the Biochemistry Commons
Recommended Citation
Showman, Lucas John, "Metabolomics based gene function annotation in Escherichia coli and Methanosarcina acetivorans with
genetic gain- or loss-of-function strains" (2014). Graduate Theses and Dissertations. Paper 14013.
This Dissertation is brought to you for free and open access by the Graduate College at Digital Repository @ Iowa State University. It has been accepted
for inclusion in Graduate Theses and Dissertations by an authorized administrator of Digital Repository @ Iowa State University. For more
information, please contact [email protected].
Metabolomics based gene function annotation in Escherichia coli and
Methanosarcina acetivorans with genetic gain- or loss-of-function strains
by
Lucas Showman
A dissertation submitted to the graduate faculty
in partial fulfillment of the requirements for the degree of
DOCTOR OF PHILOSOPHY
Major: Biochemistry
Program of Study Committee:
Basil J. Nikolau, Major Professor
Eve S. Wurtele
Young-Jin Lee
Thomas A. Bobik
Alan A. Dispirito
Iowa State University
Ames, Iowa
2014
Copyright © Lucas Showman, 2014. All rights reserved
ii
TABLE OF CONTENTS
Page
ACKNOWLEDGEMENTS .......................................................................................
iv
ABSTRACT………………………………. ..............................................................
v
CHAPTER 1
1
GENERAL INTRODUCTION: THESIS FORMATTING ............
CHAPTER 2 METABOLITE AND GROWTH ANALYSIS OF Β-KETOACYL
-ACP SYNTHASE II AND III DELETION MUTANTS IN E. COLI .....................
6
Abstract…… ........................................................................................................
6
Introduction .........................................................................................................
7
Materials and methods .........................................................................................
10
Results…….. ........................................................................................................
16
Discussion… ........................................................................................................
22
References…. .......................................................................................................
25
CHAPTER 3 GENE 2 FUNCTION PROJECT IN METHANOSARCINA
ACETIVORANS: INVESTIGATING A NOVEL COMBINED NMR LIGAND
SCREENING AND METABOLOMIC ANALYSIS BASED APPROACH FOR
EFFICIENT AND ACCURATE GENE FUNCTIONAL ANNOTATION ..............
56
Introduction… ......................................................................................................
56
Materials and methods .........................................................................................
62
Results…….. ........................................................................................................
66
iii
Page
Discussion… ........................................................................................................
70
References…. .......................................................................................................
73
CHAPTER 4 THE BIOTIN NETWORK OF METHANOSARCINA
ACETIVORANS: GENE FUNCTION, REGULATION, AND METABOLIC
ROLE OF BIOTIN IN A METHANOGENIC ARCHAEON’S METABOLISM ....
93
Abstract…… ........................................................................................................
94
Introduction… ......................................................................................................
95
Materials and methods .........................................................................................
97
Results…….. ........................................................................................................ 102
Discussion… ........................................................................................................ 105
References…. ....................................................................................................... 109
CHAPTER 5
GENERAL CONCLUSIONS ......................................................... 129
APPENDIX
........................................................................................................... 130
iv
ACKNOWLEDGEMENTS
First of all, I would like to thank my major professor, Basil Nikolau, for his guidance
and support throughout the course of my research and graduate studies. I would also like to
express my gratitude to my committee members, Eve Wurtele, Young-Jin Lee, Thomas
Bobik, and Alan Dispirito for their assistance and guidance in my doctoral studies.
In addition, I would also like to thank my friends and colleagues in the Nikolau group
for making my time doing research at Iowa State University a wonderful experience.
Finally, thanks to my family for their encouragement: my parents Gail and Homer for
their love and support, my wife Leanna for her hours of patience, respect and love, and my
girls Nora and Ayla for their love and ridiculous shenanigans that distract me from the daily
grind of my research.
v
ABSTRACT
In this dissertation three studies are presented, all of which incorporated
metabolomics as part of their experimental methods. The first study is a loss of function
study focused on examining E. coli ΔfabH and ΔfabF single and double mutants.
Metabolomics studies of ΔfabH β-Ketoacyl-acyl carrier protein synthases (KAS) III (KASIII)
assisted in the development of the hypothesis that fabF (KASII) is involved in the survival of ΔfabH
E. coli cells. This hypothesis was tested by generating and evaluating ΔfabF/ ΔfabH double mutants.
Characterization of the ΔfabF/ ΔfabH double mutant revealed that the fabB gene (KASI) is able to
support E. coli cell growth as the sole KAS enzyme present in this strain. Additionally it was
determined that fabF is required for the hyper-accumulation of cis-vaccenic acid observed in ΔfabH
mutants.
The second study focused on a functional genomics effort to develop a gene-function
annotation method based on metabolomics and NMR ligand binding assays. This study had
mixed results. The metabolomics effort in this study was met with great technical hurdles.
Only in a single case (MA3250) was metabolomics useful in annotating a gene’s functions.
While the NMR substrate binding assays were able to provide useful data for more accurate
gene functional annotations in more than 10% of the genes examined.
The third study presented was an in depth study of the biotin network of Methanosarcina
acetivorans, where metabolic changes associated with the presence and absence of biotin in
Methanosarcina acetivorans cultures was investigated. In this study metabolomics studies
were able to detect metabolic changes that were associated with the presents or absence of
vi
biotin. Additionally it was revealed that M. acetivorans does not require biotin or any
detectable biotinylated protein to survive and maintain growth.
1
CHAPTER I. GENERAL INTRODUCTION
From time of the sequencing of the first complete genome in 1995 (Fleischmann et al.,
1995) to the present, sequencing efforts have yielded the sequences of approximately 22,800
genomes, from 16,500 organisms, representing 349 million genes and genomic features (CoGe,
2014). As the number of sequenced genomes has grown, the observation remains that a
significant proportion (as high as 70% in some cases) of prokaryotic, archaeal, and eukaryotic
genes have unknown functions. For example, among all archaea ~33% of the genes sequenced
were recently labeled as genes of unknown function (Yelton et al., 2011). The persistent lack of
annotations represents significant gap in our understanding of biology and biochemistry.
Metabolomics is a functional genomics tool that has emerged as a valuable method of
resolving gene functional annotations. Metabolomics is the science of identifying and measuring
the pool of metabolites that represents the chemical composition of a biological sample, termed
the metabolome. Although metabolomics is a relatively new field in biology (Fiehn et al., 2000;
Hall et al., 2002), metabolomic analysis has increasing examples of being successfully employed
to determine the metabolic functionality of genes (Hirai et al., 2005; Brauer et al., 2006; Liang et
al., 2006; Schauer and Fernie, 2006; Hegeman et al., 2007; Swanepoel and Loots, 2014). Our
main analytical approach utilized targeted and non-targeted metabolomic analysis of small
molecules (<600 Da) by gas chromatography–mass spectrometry (GC-MS) analysis. GC-MS
allows for compounds to be identified based on retention times in the chromatography, indexed
relative to a series of hydrocarbon standards, the m/z value of the molecular ion for each
metabolite, and matching the MS-fragmentation pattern of each to existing MS-databases. The
identification of metabolites is a significant hurdle in metabolomics (de Jong and Beecher,
2
2012). Fourier transform-ion cyclotron resonance mass spectrometry (FT-ICR MS) and other
high mass accuracy instruments are valuable for metabolite detection and identification. The
advantage of this technique is determining high mass resolution and accuracy, which allows for
empirical formula determination from molecular ions and is useful in determining the identity of
unknown compounds. Isotope labeling studies can even more accurately determine empirical
formulae, >85% of metabolites to a single empirical formula using 13C and 15N (Hegeman et al.,
2007; de Jong and Beecher, 2012). Metabolomics can be used for analyzing the functionality of
genes, specifically by applying the technology to gain of function and loss of function genetic
mutants. Additionally, metabolomics can be used to explore metabolic diversity of different
accessions/strains, tissues, and cell types of an organism. Once the metabolic diversity has been
described, the genetic-basis for the metabolic and/or phenotypic differences can be elucidated
using other omics technologies such genomics, transcriptomics, and proteomics. In both
strategies, the genetic and metabolomic data can be mapped to known and putative metabolic
pathways to identify the functionality of specific genes and alleles (Förster et al., 2002; Schauer
and Fernie, 2006; Swanepoel and Loots, 2014).
In this dissertation three studies are presented, all of which incorporated metabolomics as
part of their experimental methods. The first study is a loss of function study focused on
examining E. coli ΔfabH and ΔfabF single and double mutants. The second study is a functional
genomics effort to develop a gene-function annotation method. This method relied on
metabolomics and NMR ligand binding assays to assign function to genes from Methanosarcina
acetivorans. The third study presented is an in depth study of the biotin network of
Methanosarcina acetivorans, where metabolic changes associated with the presence and absence
of biotin in Methanosarcina acetivorans cultures were investigated.
3
Dissertation organization
This dissertation is composed of five chapters and one appendix. The first chapter is a
general introduction presenting brief introduction to metabolomics. The purpose of this chapter
is to describe the common approach used in the next three chapters.
The second chapter is a manuscript to be submitted to Journal of Bacteriology. In this chapter,
the growth and metabolic consequences of KASII and KASIII single and double mutants in E.
coli are presented. The third chapter describes a functional genomics effort to develop a gene
function annotation method. This method relied on metabolomics and NMR ligand binding
assays to assign function to genes from Methanosarcina acetivorans. The fourth chapter is a
manuscript to be submitted to Journal of Bacteriology. In this chapter, the growth and metabolic
consequences of the presence and absence of the vitamin biotin are described. The final chapter
is a brief general conclusion concerning the studies outlined in chapters two through four. The
last section is an appendix that contains the metabolomics results from the study in chapter three.
I was primarily responsible for conducting all of the experiments in chapter 2. Chapters 3 and 4
were part of collaboration between groups at Iowa State University and the University of
Maryland. I was responsible metabolite analysis and genetic complementation studies in chapter
3, while in chapter 4, I was primarily responsible for conducting the molecular biology and
biochemical characterization of M. acetivorans cultures in the presences and absents of biotin (I
was not responsible for NMR experiments).
REFERENCES
Brauer MJ, Yuan J, Bennett BD, Lu W, Kimball E, Botstein D, Rabinowitz JD (2006)
Conservation of the metabolomic response to starvation across two divergent microbes.
Proc Natl Acad Sci U S A 103: 19302-19307
4
Cha S, Song Z, Nikolau BJ, Yeung ES (2009) Direct profiling and imaging of epicuticular
waxes on Arabidopsis thaliana by laser desorption/ionization mass spectrometry using
silver colloid as a matrix. Anal Chem 81: 2991-3000
Cha S, Zhang H, Ilarslan HI, Wurtele ES, Brachova L, Nikolau BJ, Yeung ES (2008) Direct
profiling and imaging of plant metabolites in intact tissues by using colloidal graphiteassisted laser desorption ionization mass spectrometry. Plant J 55: 348-360
CoGe (2014) The Place to Compare Genomes. In. genomevolution.org
de Jong FA, Beecher C (2012) Addressing the current bottlenecks of metabolomics: Isotopic
Ratio Outlier Analysis™, an isotopic-labeling technique for accurate biochemical
profiling. Bioanalysis 4: 2303-2314
Fiehn O, Kopka J, Dörmann P, Altmann T, Trethewey RN, Willmitzer L (2000) Metabolite
profiling for plant functional genomics. Nat Biotechnol 18: 1157-1161
Fleischmann RD, Adams MD, White O, Clayton RA, Kirkness EF, Kerlavage AR, Bult CJ,
Tomb JF, Dougherty BA, Merrick JM (1995) Whole-genome random sequencing and
assembly of Haemophilus influenzae Rd. Science 269: 496-512
Förster J, Gombert AK, Nielsen J (2002) A functional genomics approach using metabolomics
and in silico pathway analysis. Biotechnol Bioeng 79: 703-712
Hall R, Beale M, Fiehn O, Hardy N, Sumner L, Bino R (2002) Plant metabolomics: the
missing link in functional genomics strategies. Plant Cell 14: 1437-1440
Hegeman AD, Schulte CF, Cui Q, Lewis IA, Huttlin EL, Eghbalnia H, Harms AC, Ulrich
EL, Markley JL, Sussman MR (2007) Stable isotope assisted assignment of elemental
compositions for metabolomics. Anal Chem 79: 6912-6921
Hirai MY, Klein M, Fujikawa Y, Yano M, Goodenowe DB, Yamazaki Y, Kanaya S,
Nakamura Y, Kitayama M, Suzuki H, Sakurai N, Shibata D, Tokuhisa J, Reichelt
M, Gershenzon J, Papenbrock J, Saito K (2005) Elucidation of gene-to-gene and
metabolite-to-gene networks in arabidopsis by integration of metabolomics and
transcriptomics. J Biol Chem 280: 25590-25595
Jun JH, Song Z, Liu Z, Nikolau BJ, Yeung ES, Lee YJ (2010) High-spatial and high-mass
resolution imaging of surface metabolites of Arabidopsis thaliana by laser desorptionionization mass spectrometry using colloidal silver. Anal Chem 82: 3255-3265
5
Lee YJ, Perdian DC, Song Z, Yeung ES, Nikolau BJ (2012) Use of mass spectrometry for
imaging metabolites in plants. Plant J 70: 81-95
Liang YS, Kim HK, Lefeber AW, Erkelens C, Choi YH, Verpoorte R (2006) Identification
of phenylpropanoids in methyl jasmonate treated Brassica rapa leaves using twodimensional nuclear magnetic resonance spectroscopy. J Chromatogr A 1112: 148-155
Marcinowska R, Trygg J, Wolf-Watz H, Mortiz T, Surowiec I (2011) Optimization of a
sample preparation method for the metabolomic analysis of clinically relevant bacteria. J
Microbiol Methods 87: 24-31
Schauer N, Fernie AR (2006) Plant metabolomics: towards biological function and mechanism.
Trends Plant Sci 11: 508-516
Swanepoel CC, Loots dT (2014) The use of functional genomics in conjunction with
metabolomics for Mycobacterium tuberculosis research. Dis Markers 2014: 124218
Takahashi H, Kai K, Shinbo Y, Tanaka K, Ohta D, Oshima T, Altaf-Ul-Amin M,
Kurokawa K, Ogasawara N, Kanaya S (2008) Metabolomics approach for determining
growth-specific metabolites based on Fourier transform ion cyclotron resonance mass
spectrometry. Anal Bioanal Chem 391: 2769-2782
Wu L, Rowe EW, Jeftinija K, Jeftinija S, Rizshsky L, Nikolau BJ, McKay J, Kohut M,
Wurtele ES (2010) Echinacea-induced cytosolic Ca2+ elevation in HEK293. BMC
Complement Altern Med 10: 72
Yelton AP, Thomas BC, Simmons SL, Wilmes P, Zemla A, Thelen MP, Justice N, Banfield
JF (2011) A semi-quantitative, synteny-based method to improve functional predictions
for hypothetical and poorly annotated bacterial and archaeal genes. PLoS Comput Biol 7:
e1002230
Zech H, Thole S, Schreiber K, Kalhöfer D, Voget S, Brinkhoff T, Simon M, Schomburg D,
Rabus R (2009) Growth phase-dependent global protein and metabolite profiles of
Phaeobacter gallaeciensis strain DSM 17395, a member of the marine Roseobacter-clade.
Proteomics 9: 3677-3697
6
CHAPTER II. METABOLITE AND GROWTH ANALYSIS OF ΒKETOACYL-ACP SYNTHASE II AND III DELETION
MUTANTS IN E. COLI
Lucas Showman1, Riley Brien1, Ann Perera2, and Basil J Nikolau1 2 3*
Author Affiliations:
1
Department of Biochemistry, Biophysics and Molecular Biology, Iowa State University, Ames
IA 50011, USA
2
W.M. Keck Metabolomics, Research Laboratory Iowa State University, Ames, IA 50011
3
Center for Biorenewable Chemicals, Iowa State University, Ames IA 50011, USA
ABSTRACT
3-Ketoacyl-acyl carrier protein synthase III catalyzes the first carbon-carbon bondforming reaction (Claisen condensation) of type II fatty acid synthesis systems in bacteria and
plant plastids. In E. coli, KASIII is encoded by the fabH gene. FabH deletion mutants were
originally thought to be lethal but more recent work has shown that ΔfabH mutants are viable.
Fatty acid profiling of ΔfabH mutants showed they display cis-vaccenic acid hyper-accumulation
chemotype. In this study we recovered a ΔfabF/ΔfabH double deletion mutant that was viable on
both complete and minimal mediums. We also demonstrated that fabF (KASII) is required for
the cis-vaccenic acid hyper-accumulation chemotype displayed by ΔfabH cells. Here we propose
that ΔfabH deletion mutant cells compensate for the loss of KASIII activity by fabB (or fabF)
initiating fatty acid synthesis in the absence of an acetoacetyl-ACP primer via a side reaction
involving malonyl-ACP decarboxylase activity and the elongation of the available substrate
acetyl-ACP to form the required acetoacetyl-ACP, there by forming a fabH bypass pathway.
Also, we demonstrated for the first time in E. coli, in vivo data showing that fabB is able to
support growth as the sole KAS enzyme present in E. coli.
7
INTRODUCTION
Three biochemically and genetically distinct β-Ketoacyl-acyl carrier protein (ACP)
synthases (KAS) exist in E. coli: KASI, KASII, and KASIII, that are encoded by the genes fabB,
fabF, and fabH, respectively. KASIII catalyzes the first elongation reaction (Claisen
condensation) of type II fatty acid synthesis systems in bacteria (Fig. 1) and plant plastids.
KASIII enzymes differ from KASI and KASII enzymes by using a coenzyme A (CoA) bound
acetyl group as a primer in place of the acyl-ACP molecules used by both KASI and KASII
enzymes in E. coli. The fabH gene, encoding the sole copy of KASIII found in E. coli, was
originally thought to be an essential gene and fabH deletion mutants made in the K-12 UB1005
(F- metB1 relA1 spoT1 gyrA216 λrλ-) derived strains were found not to be viable (Lai and
Cronan, 2003). Because KASIII enzymes are part of type II fatty acid synthesis systems found
in bacteria, they have been extensively studied as a potential antibacterial drug targets (Heath
and Rock, 2004; Li et al., 2011; Wang et al., 2012; Yang et al., 2012; Ramamoorthy et al., 2013;
Zhou et al., 2013). However, ΔfabH deletion mutants were generated in the K-12 BW25113 (FΔ(araD-araB)567 ΔlacZ4787(::rrnB-3) rph-1 Δ(rhaD-rhaB)568 hsdR514 λ-) background E. coli
and are viable and available in the Keio collection of mutants (Baba et al., 2006; Baba and Mori,
2008). FabH deletion mutants made in the UB1005 derived strains were reevaluated and it was
shown that the lethality found in these FabH mutants was caused by a background effect. The
UB1005 strain carries the spoT1 mutant allele and it was determined that ΔspoT and ΔfabH are
synthetically lethal (Battesti and Bouveret, 2006, 2009; Yao et al., 2012). SpoT protein interacts
directly with ACP molecules and initiates a stringent response during fatty acid starvation
(Battesti and Bouveret, 2006, 2009). A ΔfabH E. coli mutant in the BW25113 background was
investigated and it was revealed that the fabH gene is required for proper regulation of cell size.
8
ΔfabH cells not only were shorter in length but also were uniquely smaller in measured cell
width and had a remarkable 70% reduction in cell volume with respect to wild type cell width
(Yao et al., 2012).
3-Ketoacyl-acyl carrier II or KASII is another family of KAS genes found in E. coli.
KASII in E. coli is encoded by the fabF gene. The substrate specificity of fabF differs from that
of E. coli KASIII and KASI in that it is known to have a preference towards the elongation of
C16:1 acyl-ACP to C18:1 acyl-ACP (Fig. 1). FabF plays a role in the regulation of membrane
fluidity by altering the amount of the unsaturated fatty acid, cis-vaccenic acid (C18:1 Δ11) that is
synthesized. The cis-vaccenic acid becomes incorporated in membrane phospholipids, which
leads to increased membrane fluidity. E. coli express fabF in a temperature dependent manner,
with increased fabF activity observed at lower temperatures (Garwin et al., 1980; de Mendoza et
al., 1983; Ulrich et al., 1983; Oh-hashi and Okuyama, 1986). Although fabF is required to
maintain proper membrane fluidity, fabF is not an essential gene in E. coli (Garwin et al., 1980).
KASI differs from the KASIII and KASII enzymes by possessing the ability to accept a
broader array of substrates, ranging from short chain acyl-APCs to long chain acyl-ACPs (Fig.
1). KASI is encoded by the fabB gene in E. coli and is able to catalyze condensation reactions to
elongate acyl chains up to 18 carbons in length. FabB is an essential gene and is required for the
elongation of medium-chain unsaturated acyl-ACP (C10:1-C14:1)(Garwin et al., 1980;
Jackowski and Rock, 1987; Magnuson et al., 1993). In addition to catalyzing condensation
reactions, both KASI and KASII enzymes are known to have malonyl-ACP decarboxylase
activity that leads to the formation of acetyl-ACP (Alberts et al., 1972). FabB also has an
interesting role in biotin biosynthesis where the KASI enzyme catalyzes the elongation of 3-
9
ketoglutaryl-ACP-methyl-ester to pimeloyl-ACP-methyl-ester as part of the biotin biosynthesis
pathway (Lin et al., 2010; Cronan and Lin, 2011).
Studying fabH mutants was of particular interest, as fabH was originally thought to be an
essential gene and ΔfabH deletion mutants were surprisingly generated in E. coli and are
available in the Keio collection of mutants. These ΔfabH deletion mutants provide an attractive
opportunity to investigate the metabolic consequences of removing the KASIII from E. coli.
Understanding the metabolic effects of the ΔfabH deletion could elucidate how the cells
overcome the mutation as well as provide chemotypic evidence that may be related to observable
phenotypes. ΔfabH deletion mutants also provided a tool to study how the remaining KAS
enzymes function in the absence of a KASIII in E. coli.
10
MATERIALS AND METHODS
Development and confirmation of KAS mutant strains
The ΔfabH (JW1077) single mutant was obtained from the Keio mutant collection (Baba
et al., 2006). PCR was conducted with primers for detection of fabH genes to detect any full
length fabH genes sequences with GoTaq DNA Polymerase master mix (Promega, Madison,
WI). Primers fabF gene flanking primers used for the amplification of the GenR cassette were
based on the KanR cassette primers used to form the Keio mutant collection (Baba et al., 2006).
The KanR specific primer sequences were replaced with GenR specific sequences and the PCR
was amplified using GoTaq DNA Polymerase master mix (Promega, Madison, WI). The fabF
gene was replaced with the GenR gene via the Datsenko and Wanner method (Datsenko and
Wanner, 2000). Cells were recovered on TB plates containing 50µg/mL kanamycin and
20µg/mL gentamicin. The ΔfabF (JW1081) single mutant was obtained from the Keio mutant
collection (Baba et al., 2006). FabF specific antibodies and primers were used to confirm the
deletion of the fabF gene through western blot, and PCR gel electrophoresis analysis. Keio
collection mutants were generously provided by Thomas Bobik, Iowa State University.
Growth of analysis of KAS mutants
The four medias used to assess the growth characteristics of the KAS mutant cells were:
MOPS minimal media (Teknova, Hollister, CA), lysogeny broth (LB) (BERTANI, 1951), terrific
broth (TB) (Tartoff and Hobbs, 1987), and M9 minimal media (Sigma-Aldrich, St. Louis, MO).
Inoculum pre-cultures were grown in the presence of the appropriate antibiotics at the following
concentrations: kanamycin 50µg/mL, gentamycin 20µg/mL. Each sample was diluted to the
same starting OD600 (0.001) and media without antibiotics. Culture growth was monitored at an
11
absorbance of 600nm using a Bio-Tek (Winooski, VT) ELx808 microplate reader. Culture
growth data was analyzed using the microplate reader’s Gen5 software package. All growth was
analyzed at 37°C and cultured with constant shaking performed at 250RPM in flask cultures and
the fast shake setting in the microplate cultures. The OD600 readings where observed every 15
minutes until the incubation period of up to 48 hours was completed. Lag times were measured
at the point when the measured OD surpassed the threshold of OD600 0.07. Solid plates with LB
and MOPS media were made with 1% agar. Plates were incubated up to 48 hours at 37°C. The
OD600 to cell density was calculated based on the number of recovered colonies from diluted
cultures plated on LB plates.
Fatty acid analysis with gas chromatography–mass spectrometry
Methyl ester fatty acid derivatives were used to analyze medium, hydroxyl, and long
chain fatty acids. Cultures were grown in MOPS minimal media with 0.4% glucose at 37°C and
culture with constant shaking performed at 250RPM in flasks. Cells were harvest by
centrifugation at mid log phase. Cell pellets were lyophilized overnight to dryness. Dry masses
of 5-10mg were recorded. 50µg of the odd chain fatty acid C19:0 was added as internal
standards. The cell pellets were treated with 10% barium hydroxide and 1,4-dioxane and
incubated overnight at 110°C, yielding free fatty acids. The free fatty acids were recovered in
hexane. The extracted free fatty acids were converted to methyl esters by treatment with acidic
methanol. The methyl esters were recovered in hexane (Bonaventure et al., 2003). The methyl
ester solution was then dried, re-suspended in 0.5mL acetonitrile (Sigma-Aldrich, St. Louis,
MO), then silylated with 35µL 1% BSTFA in TCMS (Sigma-Aldrich, St. Louis, MO), and
incubated 30 minutes at 60°C. Finally the samples were dried under N2 (g) to near dryness, re-
12
suspended with 0.1-1mL hexane, and submitted to GC-MS (gas chromatography–mass
spectrometry) analysis.
Ethyl esters were used to view the whole range of fatty acids: short, medium, and long.
Cultures were grown in MOPS minimal media with 0.4% glucose at 37°C and culture with
constant shaking performed at 250RPM in flasks. Cells were harvest by centrifugation at 0.2,
0.8, 1.0 OD600 and at stationary phase. Cell pellets were lyophilized overnight to dryness. Dry
masses of 5-10mg were recorded. The odd chain fatty acids C7:0, C13:0, and C19:0 were added
as internal standards with 50µg of each fatty acid added. The cell pellets were treated with 10%
barium hydroxide and 1,4-dioxane and incubated overnight at 110°C, yielding free fatty acids.
The free fatty acids were recovered in hexane (Bonaventure et al., 2003). The extracted free fatty
acids were converted to ethyl esters by treatment with 49:1 ethanol:H2SO4, recovered again in
hexane, and submitted to GC-MS analysis (Unpublished protocol from Fuyuan Jing, Iowa State
University).
The following GC-MS method was used for the fatty acid ester quantifications. All GCMS data was captured using an Agilent Technologies Model 7890A Gas Chromatograph with an
Agilent 19091S-433: 1926.58224 HP-5MS 5% Phenyl Methyl Siloxane column -60 °C—325 °C
(325 °C): 30 m x 250 µm x 0.25 µm column coupled to a Model 5975C mass selective detector
with electrical ionization (EI). The GC-MS system was controlled by and the data collected with
ChemStation software (Agilent Technologies, Santa Clara, CA). This GC-MS system was made
available by the Office of Biotechnology W.M. Keck Metabolomics Research Laboratory, Iowa
State University. The injection port temperature was 280°C with a splitless mode. He was the
carrier gas at a flow rate of 2.0 mL/min. The GC oven temperature ramp procedures were as
follows: temperature set to 50 °C and held for 2 minutes, ramp #1 25 °C/min to 200°C, hold 1
13
minute, ramp #2 20°C/min to 320°C, hold 3 minute. MS conditions were as follows: ion source,
200°C; electron energy, 70eV; quadrupole temperature, 150°C; GC-MS interface zone, 280°C;
scan range, 40-1000 mass units, a 3.6 minute solvent delay was used for the ethyl esters, while a
6.5 minute solvent delay was used for the methyl esters. Data was analyzed using AMDIS
(Automated Mass Spectral Deconvolution and Identification System;
http://chemdata.nist.gov/mass-spc/amdis/) software with data interpretation from NIST08 library.
The NIST 08 MS Library and AMDIS were provided by National Institute of Standards and
Technology (www.nist.gov). Note that fatty acids will be abbreviated in Lipid numbers format,
as follows: first the total number carbon atoms (X), CX, next the total number of unsaturated
carbon-carbon double bounds (D) and their carbon number from the carboxyl group (ΔY),
(CX:D ΔY). Take palmitoleic acid as an example, it is abbreviated as C16:1 cis Δ9 (Rigaudy and
Klesney, 1979). Other possible substitutions include: hydroxyl groups (CXOH, C14OH is
myristic acid) and cyclopropane fatty acyl groups cyCX:0 or CX:cyc.
Lipid analysis with FT-MS
Cultures were grown in MOPS minimal media with 0.4% glucose at 37°C and culture
with constant shaking performed at 250RPM in flasks. Metabolite extraction with hot methanol
was carried out similar to Schmidt et al., 2011. This particular extraction results in two phases
for polar and nonpolar metabolites.
ESI-MS and MS/MS experiments were performed on a Bruker Daltonics SolariX FTICR
equipped with a 7.0 tesla shielded superconducting magnet. The ions were generated from an
external ESI source with mass range from m/z 100 up to 3,000. MS/MS profiles were generated
using CID with varying CE settings (0.5-35 eV) depending on the abundance and fragmentation
14
energies required for each molecule. The instrument was externally calibrated with Agilent ESI
calibration mix as well as using nonadecanoic acid and octadecylamine (Sigma-Aldrich, St.
Louis, MO) as internal quantification and mass lock standards.
Phospholipid analysis FT-MS and 13C labeling
Cultures were grown in MOPS minimal media with 0.4% 13C glucose (Cambridge
Isotope Laboratories, Tewksbury, MA) at 37°C and culture with constant shaking performed at
250RPM in flasks. Cultures were harvested at late log phase. Samples were analyzed by ESI-MS
and MS/MS on a Bruker Daltonics SolariX FTICR. 13C shifted peaks were identified using
ShiftedIonsFinder analysis software (Kazusa DNA Research Institute, Kisarazu, Chiba, Japan)
courtesy of Dr. Kota Kera and shared as part of a JST NSF collaboration.
Expression analysis of KASI and KASII
fabB and fabF genes were cloned from K-12 BW25113 E. coli and protein was expressed
and purified as described previously by (Edwards et al., 1997). Purified protein samples were
submitted to the Hybridoma Facility of the Iowa State University Office of Biotechnology for the
generation of polyclonal antibodies in mice. Sera specific to the fabB or fabF proteins were
generated. These antibodies were then used for the western blot detection of native fabB and
fabF. Protein was extracted from cell pellets of cells at mid-log growth stage. Western blot
analysis was performed on the Wt and ΔfabH protein extracts. Detection was achieved by goat
anti-mouse IgG (H+L)-HRP conjugate (Bio-Rad, Hercules, CA) using chemilumninescence with
ELC western blotting reagents (GE Healthcare Bio-science Corporation, Piscataway, NJ).
Western blots were imaged with a ChemiDoc XRS+ imaging system and images were analyzed
with Image Lab Software (Hercules, CA) to quantify abundances.
15
Profiling of short chain acyl-ACPs
Freshly grown cells at mid-log phase were centrifuged to collect and remove media.
20uL of 6 mg/mL purified HIS tagged E. coli ACP (Xin Guan, Publication in preparation) was
added. 1.3 mL 7.5% ice cold trichloroacetic acid (TCA) was added and the samples were
homogenized by vortexing. Centrifuge at 25000g for 10 minutes. We then removed the
supernatant and quickly repeated with two additional 1.3 mL 7.5% TCA washes. Next the pellet
was washed once more with 0.65mL 1% TCA; this wash supernatant was discarded. ACPs were
dissolved in 50mM Mops buffer (pH 7.6 , with KOH), elute and centrifuged to pellet 3 times
with 1.33 mL (4mL total). To remove high molecular weight contaminates, the dissolved ACPs
solution was passed through 4mL Amicon centrifugal filter units with 50kDa MWCO (Millipore,
Billerica, MA, USA) by centrifuging for 30 minutes at 3400g to collect the flow through. The
ACP extracts were then washed and concentrated in 3kDa MWCO 4mL Amicon centrifugal
filter units with 50kDa MWCO (Millipore, Billerica, MA, USA) by centrifuging for 30 minutes
at 3400g three times with 1.33 mL HPLC grade water (Sigma-Aldrich, St. Louis, MO). The
extracts were then centrifuged to concentrate until there was ~0.25mL of ACP solution
remaining above the filter membrane.
ESI-MS experiments were performed on a Bruker Daltonics SolariX FTICR equipped with a 7.0
tesla shielded superconducting magnet. The ions were generated from an external ESI source
with mass range from m/z 200 up to 10,000 and monitored in negative mode. The instrument
was externally calibrated with Agilent ESI calibration mix as well as using purified HIS tagged
E. coli ACP as an internal quantification and mass lock standard.
16
RESULTS
Development and confirmation of KAS mutant strains
To confirm the genotypes of the stock strains and the recovered ΔfabF/ΔfabH double
mutant, PCR and western blot based studies were utilized. The PCR with fabH specific primers
showed observable bands in the Wt and ΔfabF strains, while no products were visible in the
ΔfabH or the ΔF /ΔH after visualization by gel electrophoresis (Fig. 2 A). These findings
confirm that the Wt and ΔfabF strains have a copy of the fabH gene. Western blot detection of
protein extracts using anti-fabF antibodies revealed that Wt and ΔfabH strains have detectible
luminescence from fabFp bands, while ΔF /ΔH and ΔfabF strains had no detectable band for
fabFp (Fig. 2 B). This western blot data was consistent with the ΔfabF genotype and the newly
generated ΔF /ΔH cells. The PCR with primers specific for the GenR knockout cassette used to
replace the native fabF gene showed an observable band in the ΔF /ΔH cells, while no products
were visible in the Wt and ΔfabF, ΔfabH after visualization by gel electrophoresis (Fig. 2 C).
This PCR data confirms that the ΔF /ΔH cells have a copy of the GenR knockout cassette.
Growth of single and double KAS mutants
Growth of wild type and ΔfabH cells on MOPS minimal media and LB rich media agar
plates revealed that the ΔfabH cells had slow growth effect. This slower growth was most visible
after 24 hours, while after 48 hours the ΔfabH cells had developed larger colonies (Fig. 3).
Growth curves of wild type and ΔfabH cells in LB rich media (Fig. 4 A) and in MOPS minimal
media (Fig. 3 B) both were monitored for 20 hours at 600nm. Doubling times (Fig. 4 C) were
calculated at the point of maximum growth rate. The ΔfabH cells had statistically significantly
longer doubling times in both LB and MOPS minimal media (p-value <0.05) with the doubling
17
times increasing by over 0.3 hours in both conditions. In addition, the final cell density of the
ΔfabH cultures was reduced, by nearly 0.2 OD600 under both media conditions.
Because ΔfabH cells are smaller in size than Wt cells (Yao et al., 2012), there was the
prospect that the OD600 may not be an accurate measurement of cell numbers. To determine the
utility of OD600 in determining growth rate and cell densities, cells/OD600/mL values were
established (Fig. 4 D) based on the number of recovered colonies from diluted cultures plated on
LB agar plates. There was not a significant difference in the number of cells/mL/OD600 between
wild type and ΔfabH cells. Having equivalent numbers of cells/mL/OD600 indicates that OD600
values are an accurate measurement of cell number and can be used for the calculation of cell
division rates on other growth curve characteristics of the Wt and ΔfabH strains.
Phospholipid analysis FT-MS and 13C labeling
Stable isotope labeling is a powerful tool that can be used with high accuracy mass
spectrometers such as FT-ICR MS to assign the elemental compositions of metabolites
(Hegeman et al., 2007). 13C labeled glucose was used as the sole carbon source to grow Wt and
ΔfabH cultures with 13C labeled metabolites that were compared to reference cultures grown on
natural abundance glucose. The resulting cultures were subjected to lipid analysis using FT-ICR
MS and the resulting 13C and 12C spectrums were compared using shifted peak finding software,
ShiftedIonsFinder (Kazusa DNA Research Institute, Kisarazu, Chiba, Japan). These results were
used to confirm the identity of the known PE and PG lipids. Interestingly, two shifted peaks
787.7m/z (12C) and 801.7m/z (12C) with 43 and 44 incorporated 13C labels, respectively, were
detected in the lipid extracts. These masses match the predicted molecule weights of PG
18
C18:1/cyC19:0 and PG cyC19:0/cyC19:0. However these phospholipids are not known to exist
in E. coli and are not found in literature. In an effort to confirm the identity of these novel
phospholipids, MS-MS analysis was conducted on the 13C labeled samples (Fig. 5). As the more
abundant of the two new lipids, MS-MS was successful in confirming the PG C18:1/cyC19:0
(13C labeled 830.8m/z) peak as being appropriately identified based on the identification
techniques previously described (Oursel et al., 2007). To determine the positions of the acyl
groups as sn1 or sn2, we looked at the relative abundance of 515m/z and 533m/z fragments,
which are C18:1 specific, and 529m/z and 547m/z peak fragments that are cyC19 specific. The
presence of the four times more abundant 515m/z and 533m/z peaks correspond to fragments
without the sn2 R group and an aldehyde in its place, this fragment is observed with and without
water. This peaks (515m/z and 533m/z) indicated a C18:1 acyl chain at the sn1 R group position.
These key fragments indicate that C18:1 group is most common at the sn1 positions (farthest
away from PG group). In addition to being novel phospholipids PG C18:1/cyC19:0 and PG
cyC19:0/cyC19:0 are both statically significantly increased in the ΔfabH mutant cells with a 7
and 6 fold increase in abundance respectively (Fig. 7).
Fatty acid and lipid analysis with GC and FT-MS
Because the product of fabH, acetoacetyl-ACP contributes directly to de novo fatty acid
biosynthesis, it is imperative to understand the impact of the ΔfabH mutation on the fatty acid
pool and lipid composition of the mutant cells. Understanding the impact of removing the fabH
gene on fatty acid and lipid pools can aid in elucidating the function of fabH and add information
concerning how E. coli compensate for the lost KASIII activity. The fatty acid and lipid content
19
of ΔfabH deletion mutant cells were analyzed and compared to Wt cells. The total fatty acid
content (Fig. 6 A) of the ΔfabH cells had a statistically significant reduction compared to wild
type of 38 nmoles/g dry weight (p-value<0.5). The percent of unsaturated fatty acid remained
unchanged between the Wt and ΔfabH cells (Fig. 6 B). Figure 6 C displays the relative
abundance of each fatty acid as a percent of the total recovered. With the exception of C14:OH,
all of the fatty acid groups C16 and shorter are significantly higher in the ΔfabH mutant, while
any chains longer than C16 are statistically significantly increased compared to wild type (pvalue<0.5).
The phospholipids quantified in using HPLC and FT-ICR MS (Fourier transform ion
cyclotron resonance mass spectrometry), detected in both negative mode (Fig. 7) and positive
mode (Fig. 8) electrospray ionization (ESI), strongly reflect the fatty acid quantification data
(Fig. 6). In ΔfabH cells phospholipid molecules with C16:1 or C16:0 acyl groups are
significantly reduced in both types of the major membrane lipids found in E. coli,
phosphatidylethanolamine (PE) and phosphatidylglycerol (PG). While PG and PE molecules
with C18:1 or CyC19 without C16:0 or C16:1 acyl groups were significantly increased in
abundance (p-value<0.5) in the ΔfabH mutant (Fig. 7).
ACP profiling of KASIII mutant
ACP molecules and short chain acyl-ACPs were analyzed to view the direct effect the
fabH mutation has on the early intermediates of fatty acid biosynthesis. The acetyl, holo, and an
unknown acyl-ACP (of 8925m/z) (Fig. 9), all are statistically significantly elevated in the ΔfabH
mutant, with increases of 0.35 +/-0.08, 0.38+/-0.08, and 0.38+/ 0.06-fold respectively (Fig. 10
A). Interestingly, the acetoacetyl-ACP content was not statistically different between the Wt and
ΔfabH cells. When the APC molecules were compared as a fraction of the total small ACP pool,
20
the acetoacetyl-ACP content of the ΔfabH mutant was statistically significantly reduced 27%
compared to wild type (p-value <0.5) (Fig. 10 B), while the fractional content of the other
detected ACP molecules were unchanged between Wt and ΔfabH samples.
Quantification of fabF and fabB protein accumulation
Polyclonal antibodies specific to fabB and fabF proteins were produced to detect changes
in the protein accumulation of these two KAS enzymes. Western blot analysis of fabF and fabB
proteins showed the ΔfabH cells had increases in both fabF and fabB by 4- and 2-fold
respectively (Fig. 11 A). When observed as a ratio of fabF/fabB, the ΔfabH cells showed an
increase of over 50% in this ratio as compared to Wt (Fig. 11 B). These were statistically
significant differences (p-value <0.05) between the Wt and ΔfabH cells.
Growth analysis of KAS single and double mutants among varying media types
The growth characteristics of the ΔfabH and ΔfabF single mutants, as well as the
ΔfabF/ΔfabH double mutant, were monitored under four specific media conditions: LB, terrific
broth (TB), MOPS minimal media, and M9 minimal media. All the graphs Figures 12 to 17
represent the average of four replicates of each genotype under these four media conditions. Two
different starting inoculum treatments were tested; in Figures 12-14 the cultures were started
from cells grown to stationary phase in LB, while in Figures 15-17 the cultures were inoculated
with cells grown to stationary phase in same media type they were entering. Even with this
obvious slow growth phenotype, the final OD reached by all genotypes was nearly unchanged
among them (Fig. 12 A and Fig. 15 A). Lag time was observed under three inoculum
conditions: starting from overnight LB cultures, overnight MOPS cultures, and from mid-log
21
MOPS cultures (Fig. 18). The results were constant across the conditions with the ΔfabF/ΔfabH
double mutant having the longest lag time, followed by the ΔfabH single mutant.
Samples were acquired throughout the growth of cultures in minimal MOPS media: collections
were performed specifically at 4 hours after inoculation and at OD600 values of 0.2 (late lag/early
log phase), 0.8 (log phase), 1 (early stationary phase) and at stationary phase, as indicated in
Figure 19. Fatty acids were analyzed as ethyl ester derivates and are presented (Fig. 20 – Fig.
26) The total fatty acid content was calculated relative to the Wt samples collected at an OD600
of 0.2 (Fig. 19 and Fig. 22). The only cells which had significant increases in fatty acid were
ΔfabF/ΔfabH double mutant cells collected at OD600 0.2, which had a 40% increase in total fatty
acid content compared to the Wt control. This increase in total fatty acid amount was seen across
nearly all detected fatty acid compounds, with C18:0 making the largest increase in proportion to
the total, at 7%, from 13% in Wt to 20% in ΔfabF/ΔfabH cells (Fig. 21 and Fig. 22). Figure 23
focuses on the long chain fatty acids of the log phase samples. The ΔfabH samples showed C18:1
as more abundant than C16:0, while ΔfabF and Wt cells maintained C16:0 as the more abundant
fatty acid constituent. The ΔfabF/ΔfabH cells reverted back to Wt condition, where the C16:0
amount exceeded that of the C18:1 fatty acid (Fig. 24). Figures 25 and 26 show the medium
and short chain fatty acids profiles of the four genotypes analyzed. The profiles of these show
cultures had significant changes within each genotype, depending on the growth phase.
22
DISCUSSION
The ΔfabH deletion mutant showed a consistent slow growth phenotype with longer
doubling times and lag times. In addition to the slow growth, the ΔfabH cells also had a modest
reduction in total fatty acid content as well as a dramatic increase in cis-vaccenic acid (C18:1)
abundance. In fact C18:1 fatty acid replaced C16:0 as the most abundant fatty acid in ΔfabH
cultures. Because cis-vaccenic acid is formed by the elongation of pamitoleoyl-ACP by the
KASII enzyme fabF, we investigated the role of fabF in the survival of ΔfabH mutant cells.
Perhaps in E. coli, the KASII gene was able to compensate for the loss of fabH, as has been
shown previously in L. lactis (Morgan-Kiss and Cronan, 2008). We were able to recover
ΔfabF/ΔfabH double mutant E. coli cells and evaluate their growth characteristics. The double
mutant had a slightly more severe slow growth phenotype than the ΔfabH single mutant, with
longer doubling times and lag times, while the ΔfabF single mutant control did not have any
discernible growth phenotypes when compared to Wt cells in the media conditions tested at
37°C.
The accumulation of C18:1 fatty acid may be a result of the increase in fabF/fabB ratio
observed in ΔfabH mutant cells. Another possible mode of generating additional C18:1 would be
an up regulation of fabA, as the fabA/fabB ratios are responsible for regulating the amount of
unsaturated fatty acids (Xiao et al., 2013). However fabA is unlikely to be responsible for the
C18:1 accumulation, as there was not an observed change in total unsaturated fatty acid percent
in the ΔfabH mutant cells compared to wild type. The definitive evidence for fabFs role in the
accumulation of the C18:1 in the ΔfabH mutant cells was revealed in the ΔfabF/ΔfabH double
mutant cells. In late log phase (0.8 OD600) cells the ΔfabH cells had significantly more C18:1
23
than C16:0 while the ΔfabF/ΔfabH double mutant cells had returned to the wild type profile of
C16:0 greater than C18:1. This result demonstrated that fabF is required for the C18:1 hyperaccumulation chemotype displayed in ΔfabH cells. As expected, the ΔfabF mutants showed
increases in palmitoleic acid (C16:1). That is consistent with the loss of fabF function.
The altered phospholipid profile of the ΔfabH mutant was observed using high accuracy
FT-ICR MS with and without 13C labeling. Two novel phospholipids were detected. The two
new phospholipids were both phosphatidylglycerol molecules: 1-(11Z-octadecenoyl)-2-(11cyclopropyl-octadecanoyl)-sn-glycero-3-phosphatidylglycerol and 1-(11-cyclopropyloctadecanoyl)-2-(11-cyclopropane-octadecanoyl)-sn-glycero-3-phosphatidylglycerol. These
phospholipids are not specifically reported in E. coli phospholipid literature but were detectable
in the wild types cells at reduced levels. The combination of these lipids being hyperaccumulated in the ΔfabH mutant and the sensitivity of the FT-ICR MS allowed for their
detection and identification in this study.
The KASII and KASIII single and double mutants were grown and compared to the wild
type cells under varying media and pre-culture conditions. The summation of these results is that
regardless of the media condition and starting pre-culture inoculum treatment, the Wt and ΔfabF
cells grew very similarly, while the ΔfabF/ΔfabH double mutant and ΔfabH single mutant grew
more slowly. The ΔfabF/ΔfabH double mutant grew the slowest followed by the ΔfabH single
mutant. It is unclear which role of fabF, initiation or elongation or both, the double mutants cells
are deprived of, but their growth was modestly effected by the loss of fabF activity. Even with
this obvious slow growth phenotype, the final OD reached by all genotypes was nearly
unchanged among them. This result suggests that the growth rate is decreased but the growth
24
potential (final OD) of the cultures was not severely diminished by the loss of KASII and KAS II
enzymes.
Although the ΔfabF/ΔfabH double mutant grew more slowly than the ΔfabH single
mutant, it was, in fact, viable. This finding reveals that KASI (fabB) can support growth alone
without any other KAS genes present. In addition to catalyzing condensation reactions, both
KASI and KASII enzymes are known to have malonyl-ACP decarboxylase activity that leads to
the formation of acetyl-ACP (Alberts et al., 1972; Morgan-Kiss and Cronan, 2008). These
previously reported findings were supported by the presence of considerable higher amounts
acetyl-ACP in both the wild type and ΔfabH cells. This data, taken together with the previous
findings, suggests that FabB likely can initiate fatty acid synthesis in the absence of an
acetoacetyl-ACP primer via a side reaction involving malonyl-ACP decarboxylase activity and
the elongation of the available acetyl-ACP to form the missing acetoacetyl-ACP primer, there by
forming a fabH bypass pathway (Fig. 27) (Alberts et al., 1972; Garwin et al., 1980, 1980;
Jackowski and Rock, 1987; Magnuson et al., 1993; Morgan-Kiss and Cronan, 2008).
Additionally, it has also been shown that fabB is capable of catalyzing all the condensation
reactions up to C18 in E. coli; this trait could allow fabB to produce mature long chain fatty
acids without fabF or fabH present (Alberts et al., 1972; Garwin et al., 1980, 1980; Jackowski
and Rock, 1987; Magnuson et al., 1993; Morgan-Kiss and Cronan, 2008). Although it has been
predicted that fabB may be able to support growth as the only KAS enzyme present in E. coli,
the ΔfabF/ΔfabH double mutant data in this study is the first in vivo demonstration of fabB
supporting growth as the sole KAS in E. coli.
25
REFERENCES
Alberts AW, Bell RM, Vagelos PR (1972) Acyl carrier protein. XV. Studies of -ketoacyl-acyl
carrier protein synthetase. J Biol Chem 247: 3190-3198
Baba T, Ara T, Hasegawa M, Takai Y, Okumura Y, Baba M, Datsenko KA, Tomita M,
Wanner BL, Mori H (2006) Construction of Escherichia coli K-12 in-frame, single-gene
knockout mutants: the Keio collection. Mol Syst Biol 2: 2006.0008
Baba T, Mori H (2008) The construction of systematic in-frame, single-gene knockout mutant
collection in Escherichia coli K-12. Methods Mol Biol 416: 171-181
Battesti A, Bouveret E (2006) Acyl carrier protein/SpoT interaction, the switch linking SpoTdependent stress response to fatty acid metabolism. Mol Microbiol 62: 1048-1063
Battesti A, Bouveret E (2009) Bacteria possessing two RelA/SpoT-like proteins have evolved a
specific stringent response involving the acyl carrier protein-SpoT interaction. J Bacteriol
191: 616-624
BERTANI G (1951) Studies on lysogenesis. I. The mode of phage liberation by lysogenic
Escherichia coli. J Bacteriol 62: 293-300
Bonaventure G, Salas JJ, Pollard MR, Ohlrogge JB (2003) Disruption of the FATB gene in
Arabidopsis demonstrates an essential role of saturated fatty acids in plant growth. Plant
Cell 15: 1020-1033
Cronan J, Lin S (2011) Synthesis of the alpha,omega-dicarboxylic acid precursor of biotin by
the canonical fatty acid biosynthetic pathway. Current Opinion in Chemical Biology 15:
407-413
Datsenko KA, Wanner BL (2000) One-step inactivation of chromosomal genes in Escherichia
coli K-12 using PCR products. Proc Natl Acad Sci U S A 97: 6640-6645
de Mendoza D, Klages Ulrich A, Cronan JE (1983) Thermal regulation of membrane fluidity
in Escherichia coli. Effects of overproduction of beta-ketoacyl-acyl carrier protein
synthase I. J Biol Chem 258: 2098-2101
26
Edwards P, Nelsen JS, Metz JG, Dehesh K (1997) Cloning of the fabF gene in an expression
vector and in vitro characterization of recombinant fabF and fabB encoded enzymes from
Escherichia coli. FEBS Lett 402: 62-66
Garwin JL, Klages AL, Cronan JE (1980) Beta-ketoacyl-acyl carrier protein synthase II of
Escherichia coli. Evidence for function in the thermal regulation of fatty acid synthesis. J
Biol Chem 255: 3263-3265
Garwin JL, Klages AL, Cronan JE (1980) Structural, enzymatic, and genetic studies of betaketoacyl-acyl carrier protein synthases I and II of Escherichia coli. J Biol Chem 255:
11949-11956
Heath RJ, Rock CO (2004) Fatty acid biosynthesis as a target for novel antibacterials. Curr
Opin Investig Drugs 5: 146-153
Hegeman AD, Schulte CF, Cui Q, Lewis IA, Huttlin EL, Eghbalnia H, Harms AC, Ulrich
EL, Markley JL, Sussman MR (2007) Stable isotope assisted assignment of elemental
compositions for metabolomics. Anal Chem 79: 6912-6921
Jackowski S, Rock CO (1987) Altered molecular form of acyl carrier protein associated with
beta-ketoacyl-acyl carrier protein synthase II (fabF) mutants. J Bacteriol 169: 1469-1473
Lai CY, Cronan JE (2003) Beta-ketoacyl-acyl carrier protein synthase III (FabH) is essential
for bacterial fatty acid synthesis. J Biol Chem 278: 51494-51503
Li HQ, Luo Y, Zhu HL (2011) Discovery of vinylogous carbamates as a novel class of βketoacyl-acyl carrier protein synthase III (FabH) inhibitors. Bioorg Med Chem 19: 44544459
Lin S, Hanson RE, Cronan JE (2010) Biotin synthesis begins by hijacking the fatty acid
synthetic pathway. Nat Chem Biol 6: 682-688
Magnuson K, Jackowski S, Rock CO, Cronan JE (1993) Regulation of fatty acid biosynthesis
in Escherichia coli. Microbiol Rev 57: 522-542
Morgan-Kiss RM, Cronan JE (2008) The Lactococcus lactis FabF fatty acid synthetic enzyme
can functionally replace both the FabB and FabF proteins of Escherichia coli and the
FabH protein of Lactococcus lactis. Arch Microbiol 190: 427-437
Oh-hashi Y, Okuyama H (1986) Effects of the temperature range and the lack of beta-ketoacyl
acyl-carrier protein synthase II on fatty acid synthesis in Escherichia coli K12 after shifts
in temperature. Biochim Biophys Acta 876: 146-153
27
Oursel D, Loutelier-Bourhis C, Orange N, Chevalier S, Norris V, Lange CM (2007) Lipid
composition of membranes of Escherichia coli by liquid chromatography/tandem mass
spectrometry using negative electrospray ionization. Rapid Commun Mass Spectrom 21:
1721-1728
Ramamoorthy D, Turos E, Guida WC (2013) Identification of a new binding site in E. coli
FabH using Molecular dynamics simulations: validation by computational alanine
mutagenesis and docking studies. J Chem Inf Model 53: 1138-1156
Rigaudy J, Klesney SP (1979) Nomenclature of Organic Chemistry. Pergamon Press
Tartoff KD, Hobbs CA (1987) Improved media for growing plasmid and cosmid clones. In, Vol
9:12. Bethseda Research Laboratories Focus
Ulrich AK, de Mendoza D, Garwin JL, Cronan JE (1983) Genetic and biochemical analyses
of Escherichia coli mutants altered in the temperature-dependent regulation of membrane
lipid composition. J Bacteriol 154: 221-230
Wang XL, Zhang YB, Tang JF, Yang YS, Chen RQ, Zhang F, Zhu HL (2012) Design,
synthesis and antibacterial activities of vanillic acylhydrazone derivatives as potential βketoacyl-acyl carrier protein synthase III (FabH) inhibitors. Eur J Med Chem 57: 373-382
Xiao X, Yu X, Khosla C (2013) Metabolic flux between unsaturated and saturated fatty acids is
controlled by the FabA:FabB ratio in the fully reconstituted fatty acid biosynthetic
pathway of Escherichia coli. Biochemistry 52: 8304-8312
Yang YS, Zhang F, Gao C, Zhang YB, Wang XL, Tang JF, Sun J, Gong HB, Zhu HL
(2012) Discovery and modification of sulfur-containing heterocyclic pyrazoline
derivatives as potential novel class of β-ketoacyl-acyl carrier protein synthase III (FabH)
inhibitors. Bioorg Med Chem Lett 22: 4619-4624
Yao Z, Davis RM, Kishony R, Kahne D, Ruiz N (2012) Regulation of cell size in response to
nutrient availability by fatty acid biosynthesis in Escherichia coli. Proc Natl Acad Sci U S
A 109: E2561-2568
Zhou Y, Du QR, Sun J, Li JR, Fang F, Li DD, Qian Y, Gong HB, Zhao J, Zhu HL (2013)
Novel Schiff-base-derived FabH inhibitors with dioxygenated rings as antibiotic agents.
ChemMedChem 8: 433-441
28
Figure 1. E. coli de novo fatty acid biosynthesis pathway
This schematic shows the specific roles of the KAS enzymes in the elongation steps of the fatty
acid biosynthetic pathway.
FabB (KAS I), β-Ketoacyl-[acyl carrier protein] synthase I
FabF (KAS II), β-Ketoacyl-[acyl carrier protein] synthase II
FabH (KAS III), β-Ketoacyl-[acyl carrier protein] synthase III
(Adapted figure from EC FAS paper)
29
Figure 2. KAS single mutants and ΔfabF/ Δ fabH double mutant confirmation
Gel electrophoresis analysis of PCR products of gene specific primers for fabH (KASIII) used to
detect the presence of the fabH gene at the DNA level (A). Panel B shows the western blot
analysis of protein extracts from the Wt and KAS mutants with mouse anti-fabF antibodies. Gel
electrophoresis analysis of PCR products of primers specific for GenR knockout cassette used to
delete the native fabF gene (C).
30
Figure 3. Growth of wild type and ΔfabH E. coli on solid media
Growth of wild type and ΔfabH cells on MOPS minimal media plates agar at 24 (A) and 48
hours (B) after streaking and LB rich media agar plates are also shown at 24 (C) and 48 (D)
hours after streaking.
31
Figure 4. Growth of wild type and ΔfabH E. coli in liquid media
Growth curves of wild type and ΔfabH cells in LB rich media (A) and in MOPS minimal media
(B), both were monitored for 20 hours at 600nm. Doubling times (C) were calculated at the
point of maximum growth rate. The ΔfabH had statistically significantly longer doubling times
in both LB and MOPS minimal media. The OD600 to cell density (D) was calculated based on
the number of recovered colonies from diluted cultures plated on LB plates.
32
Figure 5. MS/MS spectrum of 830.8m/z in negative mode
The spectrum of the isolated 830.8 m/z ion from 13C labeled E. coli ΔfabH nonpolar extracts
detected in negative mode with CID fragmentation is presented. Red arrows indicated C18:1
containing fatty acid constituents while blue arrows indicate cyC19:0 containing fragments (A).
The bar graphs in figure B represent the relative abundance of the average of four biological
replicates for each phospholipid relative to an internal standard and the mass of the sample.
Molecules were detected using ESI in negative mode. The error bars represent +/- standard error.
The “*” represent statistically different (p-value <0.05) values between Wt and ΔfabH cells.
33
Figure 6. Fatty methyl ester quantification of ΔfabH and Wt cells using GC-MS
The total fatty acid content of the Wt and ΔfabH cells is presented (A). The total percent of
unsaturated fatty acids (acyl chains containing cyclopropyl or carbon-carbon double bounds) is
presented in B. Figure C displays the relative abundance of each fatty acid as a percent of the
total. Statistically significant differences are indicated by “*” (p-value<0.5). Error bars represent
+/- standard error.
34
Figure 7. HPLC FT-ICR MS quantification of phospholipids
The bar graphs represent the relative abundance of the average of four biological replicates for
each phospholipid relative to an internal standard and the mass of the sample. Molecules were
detected using ESI in negative mode. Both types of the major membrane lipids of E. coli
phosphatidylethanolamine (PE) and phosphatidylglycerol were detected (PG). The error bars
represent +/- standard error. The * represent statistically different (p-value <0.05) between Wt
and ΔfabH cells. Note for low abundance compounds: PG cyC19:0/cyC19:0 is elevated in
ΔfabH while PE C16:0/C16:0 reduced in ΔfabH.
35
Figure 8. HPLC FT-ICR MS quantification of phospholipids
The bar graphs represent the relative abundance of the average of four biological replicates for
each phospholipid relative to an internal standard and the mass of the sample. Molecules were
detected using ESI in positive mode. The error bars represent +/- standard error. The * represent
statistically different (p-value <0.05) between the Wt and ΔfabH cells.
36
Figure 9. FT-ICR MS spectrum of acyl-ACP extracts
Above are the mass spectrums of ACP containing extracts from Wt and ΔfabH cells analyzed
using FT-MS with ESI in negative mode. The arrows highlight specific compounds with a -8
charge.
37
Figure 10. FT-ICR MS quantification of acyl-ACP
The short chain acyl-ACPs are presented as bar graphs of the average of four biological
replicates made relative to the wild type average value (A). Figure B presents the ACPs as a
fraction of the total amount of the total of the four ACP derivatives, average of four biological
replicates is presented. The error bars represent +/- standard error. The “*” represent statistically
different values (p-value <0.05) between Wt and ΔfabH cells.
38
Figure 11. Quantification of fabF and fabB protein accumulation
Western blot analysis of fabF and fabB proteins is presented as bar graphs with the average of
four biological replicates made relative to Wt levels (A) and the calculated fabF/fabB ration (B).
The error bars represent +/- standard error. The * represent statistically different values (p-value
<0.05) between Wt and ΔfabH cells.
39
Figure 12. Growth curves of wild type and KAS mutant E. coli in liquid media starting
from overnight LB cultures
The graphs represent the average of four replicates each genotype under the respective
conditions.
Each graph contains all four genotypes grown in a specific media. LB (A), terrific broth (TB)
(B), MOPS minimal media (C), M9 minimal media (D). Each sample was diluted to the same
starting OD600 (0.001) and grown until stationary phase was reached and maintained. The error
bars represent +/- standard error.
40
Figure 13. Growth characteristics of wild type and KAS mutant E. coli in liquid media
The bars represent the average of four replicates each genotype under the respective conditions.
Each sample was started with inoculum grown in LB overnight and all samples were diluted to
an equal starting OD600 (0.001). Max OD600 represents the highest OD600 reached during the time
the cultures were observed (A). Figure B shows the time to reach the max OD observed. The
fastest doubling time is represented as max growth shown in figure C. The time to reach the
observed max growth was calculated (D). The error bars represent +/- standard error. The “*”
represent statistically different values (p-value <0.05) between Wt and other genotypes cells and
“**” represents statistically different values for both Wt and ΔfabH.
41
Figure 14. Lag times of wild type and KAS mutant E. coli in liquid media
The bars represent the average of four replicates each genotype under the respective conditions.
Each sample was started with inoculum grown in LB overnight and all samples were diluted to
an equal starting OD600 (0.001). The lag time is measured at the point when the measured OD
surpasses the threshold OD600 (0.07). The error bars represent +/- standard error.
42
Figure 15. Growth curves of wild type and KAS mutant E. coli in liquid media starting
from each respective media
The graphs represent the average of four replicates each genotype under the respective
conditions. Each graph contains all four genotypes grown in a specific media. LB (A), terrific
broth (TB) (B), MOPS minimal media (C), M9 minimal media (D). Each sample was diluted to
the same starting OD600 (0.001) and grown until stationary phase was reached and maintained.
The error bars represent +/- standard error.
43
Figure 16. Lag times of wild type and KAS mutant E. coli in liquid media
The bars represent the average of four replicates each genotype under the respective conditions.
Each sample was started with inoculum grown in the same media type and all samples were
diluted to an equal starting OD600 (0.001). Max OD600 represents the highest OD600 reached
during the time the cultures were observed (A). Figure B shows the time to reach the max OD
observed. The fastest doubling time is represented as max growth shown in figure C. The time
to reach the observed max growth was calculated (D). The error bars represent +/- standard error.
The “*” represent statistically different values (p-value <0.05) between Wt and other genotypes
cells and “**” represents statistically different values for both Wt and ΔfabH.
44
Figure 17. Lag times of wild type and KAS mutant E. coli in liquid media
The bars represent the average of four replicates each genotype under the respective conditions.
Each sample was started with inoculum grown in the same media type and all samples were
diluted to an equal starting OD600 (0.001). The lag time is measured at the point when the
measured OD surpasses the threshold OD600 (0.07). The error bars represent +/- standard error.
45
Figure 18. Lag times of wild type and KAS mutant E. coli in liquid media
The bars represent the average of four replicates each genotype under the respective conditions.
Each samples was diluted to the same starting OD600 (0.001) and the time is measured at the
point when the measured OD surpasses the threshold OD600 (0.07). The error bars represent +/standard error. “*” denotes that the peak is statistically significantly different from all the other
genotypes in that condition.
46
Figure 19. Collection of Wt and KAS mutant samples at specific growth points
The plots represent the average of three replicates each genotype grown in 250mL cultures in
flasks. Samples were diluted to the same starting OD600 (0.001). Each arrow indicates a specific
OD600 cut off where samples were taken for further analysis.
47
Figure 20. Quantification of the total fatty acid content of KAS mutants and Wt cells by
GC-MS analysis of ethyl esters
The total fatty acid content of KAS mutants and Wt cells at specific growth stages are displayed.
Each bar graph represents the average of four replicates calculated relative to the Wt OD600 0.2
average. Statistically significant changes compared to Wt OD600 0.2 (p-value<0.5) are marked
(*). Error bars represent +/- standard error. F= ΔfabF, H= ΔfabH , FH= ΔfabF/ΔfabH.
48
Figure 21. Fatty composition of KAS mutants and Wt cells by GC-MS analysis of ethyl
esters. The fatty acid content of KAS mutants and Wt cells at specific growth stages are
presented. Each colored section of the bar graph represents the average of four replicates
calculated percent contribution of each fatty acid to the total fatty acid pool. F= ΔfabF, H=
ΔfabH , FH= ΔfabF/ΔfabH.
49
Figure 22. Fatty composition and total content of KAS mutants and Wt cells by GC-MS
analysis of ethyl esters
The fatty acid content of KAS mutants and Wt cells at specific growth stages are displayed. Each
colored section of the bar graph represents the average of four replicates calculated moles/mg
contribution of each fatty acid to the total fatty acid pool. F= ΔfabF, H= ΔfabH , FH=
ΔfabF/ΔfabH.
50
Figure 23. Long chain fatty composition of KAS mutants and Wt cells by GC-MS analysis
of ethyl esters
The fatty acid content of KAS mutants and Wt cells at specific growth stages are presented. Each
colored bar graph represents the average of four replicates and its content in moles/mg. Only
fatty acids longer that 16 carbons are displayed. F= ΔfabF, H= ΔfabH , FH= ΔfabF/ΔfabH.
51
Figure 24. Long chain fatty composition of KAS mutants and Wt cells by GC-MS analysis
of ethyl-esters
The fatty acid content of KAS mutants and Wt cells at specific growth stages are presented. Each
colored bar graph represents the average of four replicates and its content in moles/mg. Only
fatty acids longer that 16 carbons are displayed. Statistically significant changes compared to Wt
(a), ΔfabH (b), ΔfabF (c), and ΔfabF/ΔfabH (d) (p-value<0.5). Error bars represent +/- standard
error. F= ΔfabF, H= ΔfabH, FH= ΔfabF/ΔfabH.
52
Figure 25. Medium chain fatty composition of KAS mutants and Wt cells by GC-MS
analysis of ethyl-esters
The fatty acid content of KAS mutants and Wt cells at specific growth stages are presented. Each
colored bar graph represents the average of four replicates and its content in moles/mg. Only
fatty acids between 10-14 carbons are displayed. F= ΔfabF, H= ΔfabH , FH= ΔfabF/ΔfabH.
53
Figure 26. Short chain fatty composition of KAS mutants and Wt cells by GC-MS analysis
of ethyl-esters
The fatty acid content of KAS mutants and Wt cells at specific growth stages are presented. Each
colored bar graph represents the average of four replicates and its content in moles/mg. Only
fatty acids with fewer than 10 carbons are displayed. F= ΔfabF, H= ΔfabH, FH= ΔfabF/ΔfabH.
54
Figure 27. E. coli de novo fatty acid biosynthesis fabH bypass pathway
This schematic shows the specific roles of the KAS enzymes in the elongation of steps of the
fatty acid biosynthetic pathway without a KASIII enzyme.
FabB (KAS I), β-Ketoacyl-[acyl carrier protein] synthase I
FabF (KAS II), β-Ketoacyl-[acyl carrier protein] synthase II
(Adapted figure from EC FAS paper)
55
Supplemental Table 1. Primer list
56
CHAPTER III. GENE 2 FUNCTION PROJECT IN
METHANOSARCINA ACETIVORANS: INVESTIGATING A
NOVEL COMBINED NMR LIGAND SCREENING AND
METABOLOMIC ANALYSIS BASED APPROACH FOR
EFFICIENT AND ACCURATE GENE FUNCTIONAL
ANNOTATION
INTRODUCTION
The following was a collaboration among three groups, the Sowers and Orban groups at
the University of Maryland and the Nikolau group here at Iowa State University. Note that as a
group, we have published portions of this work already (Chen et al., 2011).
The first organism to have its complete genome sequenced was the Haemophilus
influenza (Fleischmann et al., 1995). Of the 1740 genes in this genome approximately 25% had
no known function. The majority of these had sequence homologs in other organisms, but no
homology to any genes with determined functions. These genes were labeled as “conserved
hypotheticals”. The amount of genomic sequence data has become massive. Currently,
sequencing efforts have yielded the sequencing of approximately 22,800 genomes, from 16,500
organisms, representing 349 million genes and genomic features (CoGe, 2014). As the number
of sequenced genomes has grown, the observation remains that a significant proportion (as high
as 70% in some cases) of prokaryotic, archaeal, and eukaryotic genomes have unknown
functions. For example, among all archaea ~33% of the genes sequenced are currently labeled as
genes of unknown function (Yelton et al., 2011). This lack of functional annotation represents a
persistent and significant gap in our understanding of Archaeal biology and biochemistry.
Advantageously, conserved hypotheticals often have sequence homologs in hundreds of other
57
organisms and so functional assignment for one homolog would provide leverage in annotating a
large set of genes. With this potential cascading effect, these conserved hypothetical genes
represent strong candidates for functional studies.
Protein functions are currently annotated in genomic databases using automated
algorithms that search for sequence homology to a gene product with an established function.
These sequence-based annotations frequently have variability in their accuracy, especially if the
sequence identity to a protein of known function is low. Additionally, recent work has shown
that even proteins with very high sequence identity can have different structures and functions
(Parsons et al., 2003; Yeh et al., 2005; He et al., 2008; Luo and Yu, 2008; Tuinstra et al., 2008);
therefore caution is necessary when assigning functions simply by sequence homology in the
absence of experimental validation. Traditional experimental approaches to determine function,
such as enzyme assays, are slow, painstaking, and have not been able to keep up with the everincreasing large body of genome sequence data that contains many genes with unconfirmed and
undetermined function. There is an obvious need for more efficient methods for accurate,
experimentally based annotation and validation of protein function.
A novel combined metabolomic and NMR-based approach was developed to meet the
challenge of efficient and accurate experimental validation of the functions of putative enzymes
(Fig. 1). This method utilized a NMR-based ligand screening strategy (Dalvit et al., 2002). Our
preferred method was waterLOGSY (water-ligand observed via gradient spectroscopy) pulse
sequence (Dalvit et al., 2001), as consistent results were obtained in our hands using this
approach. The second method employed was metabolomics, which allowed for the determination
of a biochemical phenotype associated with the presence or absence of a particular gene (Förster
et al., 2002; Hall et al., 2002; Phelps et al., 2002; Bino et al., 2004; Fridman and Pichersky,
58
2005; Koek et al., 2006; Swanepoel and Loots, 2014). Knowledge of a gene’s biochemical
phenotype can be exploited as a method to identify potential functionality directly, or provide
rationale for choosing potential substrates for subsequently analysis via NMR-based ligand
screening strategy. Conversely, proteins which waterLOGSY has indicated bind a particular
ligand may be further investigated using a more directed metabolomics approach to ascertain
which metabolic pathways are affected by the presence or absence of that particular protein. In
order to independently confirm gene functions determined with waterLOGSY and or
metabolomics, genetic complementation and enzyme assays were implemented, when
appropriate. This combination of approaches has the potential to work synergistically towards
efficiently and accurately determining gene functions.
Choice of organism
We used the methanogenic archaeon, Methanosarcina acetivorans (MA) (SOWERS et
al., 1984; Galagan et al., 2002; Longstaff et al., 2007; Rother et al., 2007; Chen et al., 2011), as a
model organism to develop tools for efficient and reliable annotation and validation of protein
function. Methane-producing organisms are of particular interest because of their ability to
generate cost-effective biofuel. However, there was and is a significant gap in the biological
understanding of methanogens. The lack of accurate functional annotation for the large quantity
sequence data represents a major hurdle in the pursuit of biological knowledge of these
organisms. Information gleaned from the demonstration of our novel approach to experimentally
annotating gene functions is valuable in the pursuit of biological understanding of methanogens
in general, in addition to MA itself.
59
The waterLOGSY method was initially developed for use as a ligand screening tool for
drug targets and is useful as a rapid, relatively high through-put approach (Dalvit et al., 2001).
WaterLOGSY experiments observe the transfer of magnetization between ligand and water
molecules (Fig. 2 A.). When a ligand is in solution with a protein capable of binding it, there are
two competing flows of magnetization: 1) from water to the free ligand and 2) from the bound
water (via the protein) to the bound ligand (Chen et al., 2011). The two competing flows lead to
opposite signs of NOEs (nuclear Overhauser enhancements) (Chen et al., 2011). The stronger
magnetization flow establishes the sign of the waterLOGSY peak. Ligands that bind with the
protein will have positive peaks while non-binding compounds will have negative peaks (Fig. 2
B.) WaterLOGSY has shown promise as a potential efficient and accurate experimental
validation of function annotations (Chen et al., 2011). While WaterLOGSY is useful for
experimentally determining the functionality of proteins when there is knowledge of potential
substrate molecules, WaterLOGSY has limited ability to determine functionality when potential
substrate molecules are unknown.
Metabolomics is the science of identifying and measuring the pool of metabolites that
represents the chemical composition of a biological sample, termed the metabolome. Although
metabolomics is a relatively new field in biology (Fiehn et al., 2000; Hall et al., 2002),
metabolomic analysis has increasing examples of being successfully employed to determine the
metabolic functionality of genes (Hirai et al., 2005; Brauer et al., 2006; Liang et al., 2006;
Schauer and Fernie, 2006). Our main analytical approach utilized targeted and non-targeted
metabolomic analysis of small molecules (<600 Da) by gas chromatography–mass spectrometry
(GC-MS) analysis. Some ways GC-MS can identify compounds are based on retention times in
the chromatography, indexed relative to a series of standards, the m/z value of the molecular ion
60
for each metabolite, and matching the MS-fragmentation pattern of each to existing MSdatabases. Two commonly used strategies for analyzing the functionality of genes using
metabolomics are gain of function studies and loss of function studies. Loss of function studies
rely on the comparison of a control strain to a strain where a gene function has been removed.
Gain of function metabolomic studies rely on the comparison of a control strain to an
experimental strain where a new gene or function has been introduced. We originally attempted
to generate directed single gene mutants in MA for loss of function studies. However there were
significant technical difficulties that made this approach impractical. As a second choice we
utilized a gain of function approach, where selected MA genes were heterologously expressed in
E. coli and were compared to a control strain not containing an MA gene.
Many pathways in MA are incompletely annotated. One pathway with which we have
direct experience and interest is the biotin network. Figure 3 shows the biotin biosynthesis
pathway with current putative annotations for biotin synthase (MA0154 and/or MA01489),
biotin ligase (MA0676), and MA’s only known biotin dependent enzyme, a putative pyruvate
carboxylase with subunits: pycA(MA0675) and pycB (MA0674). Notably, the enzymes in the
pathway leading up to dethiobiotin (the immediate precursor of biotin) have not been annotated.
In addition, there are two putative biotin synthase genes; MA0154 and MA1489. MA0154 and
MA1489 have 43% and 39% similarity to E. coli biotin synthase and further experimental
evidence is required to determine which gene(s) has actual biotin synthase activity. Though the
biotin biosynthetic pathway appears to be incomplete, initial observations showed that MA does
not require exogenous biotin to sustain normal growth (SOWERS et al., 1984). This observation
implies that MA does indeed have a complete but not yet described biotin biosynthetic pathway
or a novel situation in which MA does not require biotin to survive. Considering that the biotin
61
network is poorly understood in MA and we have expertise in this pathway, the biotin network
was selected as our model pathway. In an effort to elucidate the complete biotin network in MA,
we targeted and searched for the missing enzymes in the biotin biosynthetic pathway of MA.
Examining the biotin network using our novel approach to gene annotation allowed us a proof of
concept opportunity that also tapped into our experience and interest in the biotin network.
The aim of this study was to demonstrate the effectiveness of this novel combination of
waterLOGSY NMR substrate binding assays and metabolomics as an accurate and efficient
approach to annotating gene functions. Within the broader objective of evaluation there is an
opportunity to discover gene functions, elucidate incomplete pathways, and uncover new
biochemical activities. Applying our experimental approaches to genes within our selected target
list will provide an excellent venue for examining the effectiveness and accuracy of our method.
Examining the biotin network of MA will allow for the opportunity to apply this combined
NMR-substrate binding and metabolomics approach to elucidate an incomplete pathway.
Furthermore an in-depth study of the biotin network of MA will pursue the elucidation of the
MA biotin biosynthetic pathway. Understanding MA’s biotin biosynthetic pathway will provide
significant insight towards determining whether MA has the capacity to synthesize biotin de
novo, or if MA has a novel situation where it does not require biotin to sustain growth.
62
MATERIALS AND METHODS
WaterLOGSY ligand screening
WaterLOGSY requires purified protein for analysis. Selected MA genes were
heterologously expressed in E. coli and soluble proteins were purified and folding is confirmed
via 1D 1H NMR spectroscopy. MA gene cloning, protein purification, and NMR waterLOGSY
ligand screening experiments were carried out as previously described (Chen et al., 2011). The
selected MA genes were cloned into pET-21a plasmids and these plasmids, carrying cloned MA
genes, were used for the metabolomics and genetic complementation assays, as well.
Gene complementation
Genetic complementation of E. coli mutants by the MA genes was performed with E. coli
deletion mutants generated in the BW25113 background (Baba et al., 2006). The specific mutant
strains used in this study were: JW1122 (icd), JW5807 (leuB), and JW1789 (yeaU), and JW2535
(glyA). Each strain was transformed with a non-recombinant control pET-21a plasmid (Novagen)
and a recombinant pET-21a plasmid carrying a cloned MA gene. Complementation of glyA
mutants was conducted with glycine as the sole carbon sources(Newman et al., 1976). Genetic
complementation experiments were carried out as previously described (Chen et al., 2011).
Total metabolite analysis
The selected MA genes were cloned previously into pET-21a plasmids and these
plasmids carrying cloned MA genes were used for the metabolomics assays (Chen et al., 2011).
The pET-21a plasmids containing the select MA genes were transformed into Rosetta 2.0 E. coli
cells (Novagen) along with an empty pET-21a control. Two colonies of each genotype were
63
picked and cultures started in lysogeny broth (LB) (BERTANI, 1951) with 50µg/mL each of
ampicillin and chloramphenicol. These cultures were then incubated 16 hours at 37°C with
constant shaking performed at 250RPM. Cells were then centrifuged down at 2000G and were
washed and re-suspended with MOPS minimal media (Teknova, Hollister, CA). Four 200mL
MOPS minimal media with 0.4% glucose cultures were started for each genotype from the
overnight inoculums. Cultures started at the same OD600, 0.03. Expression of the genes of interest
was induced with 1uL/mL 0.1mM IPTG when each culture reached 0.4 OD600. Four hours after
induction the cells were quenched 1:1 with -40°C cold methanol (Buchholz et al., 2001), with
each sample containing 25mL of culture. The cells were then centrifuged down at 5500G for 15
minutes at20°C and the supernatant was removed.
Cell pellets were lyophilized overnight and dry masses were recorded (4-15mg). All
reagents for metabolite analysis were from Sigma-Aldrich, St. Louis, MO, unless stated
differently. 25µL of ribitol was added as an internal standard (0.6 mg/mL in dry-pyridine) as
well as 25µL of C19 fatty acid as internal standard (0.6 mg/mL in chloroform) was added to the
cell pellets. 0.35mL hot methanol (Thermo Fisher Scientific, Fair Lawn, NJ) (60°C) was added,
mixed, and immediately incubated at 60°C for 10 minutes. Samples were then vortexed 10
seconds and put in a sonication water bath for 10 minutes. 0.35mL of chloroform was added and
mixed by vortexing for 2 minutes. 0.3mL HPLC grade water was added and vortexed 2 minutes.
Phases were separated by centrifuge, 5 minutes at 13000G. The upper layer (polar) and lower
layer (non-polar) were separated, derivatized, and analyzed separately. Derivatization was
adapted from (Koek et al., 2006). 200-400µl polar/non-polar extracts were then dried under
nitrogen gas. 50µl methoxyamine (20mg/mL in pyridine) was added and was incubated at 30°C
while shaking, 1.5 hours. 70µl of bis-trimethyl silyl trifluoroacetamide with 1%
64
trimethylchlorosilane (BSTFA+1% TMCS) was then added (TMS derivatization reagent) and
was incubated at 37°C for 30 minutes. Hydrocarbon standards were also spiked into a
representative sample for the calculation of retention indices. Samples were then submitted to
GC-MS (gas chromatography–mass spectrometry) analysis. The following GC-MS method was
used for the total metabolite GC-MS analysis. All GC-MS data was captured using an Agilent
Technologies Model 6890 Gas Chromatograph with an Agilent 19091S-433: 1926.58224 HP5MS 5% Phenyl Methyl Siloxane column -60 °C—325 °C (325 °C): 30 m x 250 µm x 0.25 µm
column coupled to a Model 5973 mass selective detector with electrical ionization (EI). The GCMS system was controlled by and the data collected with ChemStation software (Agilent
Technologies, Santa Clara, CA). This GC-MS system was made available by the Office of
Biotechnology W.M. Keck Metabolomics Research Laboratory, Iowa State University. The
injection port temperature was 280°C with a splitless mode. He was the carrier gas at a flow rate
of 2.0 mL/min. The GC oven temperature ramp procedures were as follows: temperature set to
70 °C and held for 4 minutes, ramp 5°C/min to 320°C, hold 12 minutes. MS conditions were as
follows: ion source, 200°C; electron energy, 70eV; quadrupole temperature, 150°C; GC-MS
interface zone, 280°C; scan range, 50-550 mass units, a 6.7 minute solvent delay was used. Data
was analyzed using AMDIS (Automated Mass Spectral Deconvolution and Identification
System; http://chemdata.nist.gov/mass-spc/amdis/) software, with data interpretation from
NIST08 library and quantification with MET-IDEA (Broeckling et al., 2006). The NIST 08 MS
Library and AMDIS were provided by National Institute of Standards and Technology
(www.nist.gov) and MET-IDEA software was obtained from the Samuel Roberts Noble
Foundation (Ardmore, OK).
65
Amino acid analysis
Cell pellets were lyophilized overnight and dry masses were recorded (4-15mg). Dried
cell pellets were treated with 0.6mL 10% trichloroacetic acid and mix by vortex for 10 minutes.
The mixture was centrifuged at 13,000G and the supernatant was collected. Amino acid
quantification was performed using EZ:faast Amino Acid Analysis Kit (Phenomenex, Torrance,
CA) using the manufacturer’s protocol for GC-MS/FID analysis. All GC-MS data was captured
using an Agilent Technologies Model 6890 Gas Chromatograph coupled to a Model 5973 mass
selective detector with electrical ionization (EI). The injection port temperature was 280°C with
a splitless mode. He was the carrier gas at a flow rate of 2.0 mL/min. The GC oven temperature
ramp procedures were as follows: temperature set to 70 °C and held for 4 minutes, ramp 5°C/min
to 320°C, hold 12 minutes. MS conditions were as follows: ion source, 200°C; electron energy,
70eV; quadrupole temperature, 150°C; GC-MS interface zone, 280°C; scan range, 50-550 mass
units, a 6.7 minute solvent delay was used. The GC-MS system was controlled by, and the data
was collected and analyzed with ChemStation software (Agilent Technologies, Santa Clara, CA).
This GC-MS system was made available by the Office of Biotechnology W.M. Keck
Metabolomics Research Laboratory, Iowa State University.
66
RESULTS
Target gene selection
MA is considered metabolically diverse among methanogenic archaea, possessing the
largest known archaeal genome at 4524 predicted genes (Galagan et al., 2002). Close to 50% of
these genes were annotated as having unknown or unconfirmed function. In preparation for
target gene selection, the functional annotations were updated for over 700 of the 4524 predicted
genes in the MA genome. This was performed by transferring many of the recently revised
manual annotations in the related species M. burtonii (Allen et al., 2009) to homologous genes in
MA. In addition, a thorough literature search was conducted for published data that
experimentally confirms the functionality of MA genes and closely related orthologs in other
species. Confidence levels were given (Fig. 4) to each re-annotation based on current literature
as follows - Level 1: An exact match in the literature with an experimentally defined function.
Level 2: Gene product contains all domains needed for enzymatic function with ≥35% sequence
identity to a gene product of known function. Level 3: Gene product contains all domains needed
for enzymatic function but ≤35% sequence identity to a gene product of known function. Level
4: Gene product has no experimental match but some domain similarities to a known function
are recognizable. Level 5: Has no experimental match or domain similarities – i.e. annotated as
hypothetical. This re-annotated list of more than 700 genes provided a platform from which gene
targets with varying confidence levels were selected for experimental validation using our
approach. Selection criteria were as follows: the two main selection criteria were 1) the protein
should have a putative enzymatic activity on a small molecule substrate and 2) the protein should
be nonmembranous based on amino acid sequence analysis. Additional characteristics that are
preferable but not absolutely required were that the gene product was expressed in vivo in MA
67
based on published reports (Rohlin and Gunsalus, 2010), the gene is conserved among other
species, and that an E. coli homolog exists for potential genetic complementation studies. 70
genes within these selected criteria were selected for further analysis. This was conducted as
previously described (Chen et al., 2011).
WaterLOGSY ligand screening of MA4265, a putative isocitrate/isopropylmalate
dehydrogenase
MA4265 is annotated as an Isocitrate/isopropylmalate dehydrogenase family protein.
Based on this functional annotation, substrates were selected. MA4265 was submitted to onedimensional waterLOGSY 1H NMR analysis with each of the following putative substrates (300
μM) as follows: citrate, cis-aconitate, isocitrate, and 2-ketoglutarate, each tested with 30 μM
MA4265. Of these four metabolites, only isocitrate bound to MA4265 under the conditions used
(Chen et al., 2011). Note that the negative peak at 3.4 ppm in all spectra is thought to be a
contaminant from the protein concentration process.
Genetic complementation results for MA4265
BLASTp results representing E. coli homologs to the MA4265 amino acid sequence
(Table 1.) were used to select the Keio (Baba et al., 2006) mutants of each of three top blast hits:
yeaU, leuB, and icd. All three genes were selected for use in genetic complementation studies
with MA4265. Genetic complementation experiments (Fig. 6) of yeaU, leuB, and icd mutants in
E. coli strains in the BW25113 (Wt) background revealed that MA4265 was only able to
complement the icd mutant cells. Complementation was observed in both liquid and solid media.
This complementation result correlates with the waterLOGSY data (Fig. 5). Icd is an isocitrate
dehydrogenase mutant and MA4265 bound to isocitrate, a substrate of isocitrate dehydrogenase.
68
Amino acid profile for MA3520 expressed in Rosetta 2.0 E. coli cells
Amino acid profiling of E.coli cells expressing MA3520 by GC-MS (Fig. 7) revealed that
four amino acids had statistically significant changes, alanine (3.1 fold decrease), glycine (1.7
fold decrease), asparagine (1.5 fold decrease), and β-aminoisobutyric acid (1.3 fold decrease).
Glycine and serine are two predicted substrates of this putative serine hydroxymethyltransferase.
Between the two amino acids, only glycine showed a change in accumulation in the MA3520
expressing cells (Fig. 8).
WaterLOGSY ligand screening of MA3520, a putative Serine hydroxymethyltransferase
The MA3520 is annotated as a serine hydroxymethyltransferase and also had changes in
the accumulation of amino acids including glycine when expressed in E. coli. Based on this
functional annotation, substrates were selected. MA3520 was submitted to one-dimensional
waterLOGSY 1H NMR analysis with each of the following putative substrates as follows:
glycine (gly), serine (ser), alanine (ala), histidine (his), and methionine (met). Of these five
amino acids, only glycine bound to MA3520 under the conditions used (Fig. 9) (Chen et al.,
2011).
Genetic complementation results for MA3520
Based on the amino acid profile of MA3520 expressing cells and substrate binding
profile of MA3520, the serine hydroxymethyltransferase (glyA) knockout mutant was selected
for use in genetic complementation studies with MA3520. The KEIO (Baba et al., 2006) mutant
glyA was obtained and genetic complementation studies performed (Fig. 10). MA3520 was able
to complement the glyA mutant cells when they were grown on M9 minimal media with glycine
69
as the sole carbon source. Complementation studies were attempted with serine as the sole
carbon source, but no complementation was observed and the Wt cells showed severe growth
inhibition as serine can be toxic near the levels required for it to be the sole carbon source (data
not shown) (Newman et al., 1976). This complementation result correlates with the
waterLOGSY data (Fig. 9) and amino acid analysis (Fig. 8) for MA3520.
Metabolite analysis of MA genes expressed in Rosetta 2.0 E. coli cells
Amino acid and total metabolite analysis was conducted on 21 MA genes. Examples of
the total metabolite analysis are in Figures 11 and 12. These plots both represent data from the
polar faction of the total metabolite analysis compared to the empty vector control. Figure 11
shows an ideal situation where there are only a few statistically significantly changed
metabolites, while Fig. 12 displays a plot where there is an overwhelming amount of statistically
significant changed metabolites. Unfortunately, the data generated from this gain of function
approach was very rich in metabolite changes. Amino acids were often affected by the
expression of the MA genes, on average 27% (SE +/-1%) of the amino acids were changed.
There were on average 987 (SE +/- 103) ion peaks (potential metabolites) detected using the total
metabolites profiling method. Of the 987 ion peaks, on average 176 (SE +/- 46) or 18% (SE+/5%) had statistically significantly changed compared to the empty vector control. These
metabolite changes were consistently multiple and unrelated pathways. Unknown metabolites
consisted of 69% (SE+/-11%) of all the ion peaks detected. Table 2. shows the annotation status
of the MA genes that were analyzed with metabolite analysis; only in the case of MA3520 was
metabolite profiling able to assist directly in the final annotation of a gene’s function. Without
consensus in the metabolomics data, the rest of the MA genes had inconclusive results in the
evaluation of their putative functions.
70
DISCUSSION
MA4265 displayed isocitrate dehydrogenase activity
MA4265 was originally annotated as isocitrate/isopropylmalate dehydrogenase family
protein. When analyzed by NMR-based ligand screening, MA4265 displayed selectivity for and
only bound isocitrate and not any other putative substrates of isocitrate/isopropylmalate
dehydrogenase family proteins under the conditions tested. Isocitrate is the substrate of isocitrate
dehydrogenase. Additionally MA4265 was only able to complement the icd (isocitrate
dehydrogenase) E. coli mutant not leuB (3-isopropylmalate dehydrogenase) of yeaU (D-malate
dehydrogenase). The NMR and complementation results correlate and together provided the
evidence to assign MA4265 a more precise annotation as an isocitrate dehydrogenase. This was
an example of the use of NMR-based ligand screening to gain more accurate functional
annotation that can be tested, further evaluated, and in this case, confirmed by conventional
studies such as genetic complementation.
MA3520 displayed serine hydroxymethyltransferase activity
Amino acid profiling of E.coli cells expressing MA3520 by GC-MS showed that the
MA3520 expressing cells had statistically significant changes in four amino acids, including
glycine. Glycine is a putative substrate of serine hydroxymethyltransferase. MA3520 was
analyzed with NMR-based waterLOGSY ligand screening and was presented with five amino
acids as potential substrates: glycine, serine, alanine, histidine, and methionine. Among these
substrates MA3520 only displayed positive binding to glycine. The glyA (serine
hydroxymethyltransferase) E. coli mutant was able to be complemented by MA3520, this result
was again consistent with the annotated serine hydroxymethyltransferase activity of MA3520.
71
This study of MA3520 was an example of using the combination of NMR substrate binding
assays with metabolomics to provide evidence for assigning a more accurate functional
annotation. Again, fortunately, we were able to confirm this putative result with a conventional
genetic complementation study.
Metabolite analysis of MA genes expressed in Rosetta 2.0 E. coli cells
Amino acid and total metabolite analysis was conducted on 21 MA genes. Of these 21
genes, only in the case of MA3520 was metabolite profiling able to assist directly in the final
annotation of a gene’s function. Without consensus in the metabolomics data, the rest of the MA
genes had inconclusive results in the evaluation of their putative functions. To determine a
potential function of an enzyme from metabolomics data, metabolites correlating to the potential
function of the enzyme need to be detected, identified, and have statically significant changes in
accumulation compared to controls. If one of these factors is lacking or uncertain it becomes
difficult to use metabolite data to assign function. The ability to zero in on the possible function
of a putative enzyme using metabolite data was limited due to the overwhelming amount of
changing metabolites that were detected (176 peaks changed on average/sample). Without
consensus in the data, generating a putative function for an enzyme from metabolomics data
becomes increasingly speculative and subjective. The majority of the metabolomics data we
generated in this study was significantly complicated by the large amount of statistically
significant changes in metabolite accumulation of the MA expressing cells compared to the
controls. This was particularly problematic because the majority of the metabolites (69%) were
not identified.
72
Trouble shooting
Ideally we would have used single gene knockouts in MA for our metabolomics studies
as originally planned. Expressing the gene in E. coli was not the ideal situation for several
reasons. A major drawback of using E. coli is that the metabolic pathway the MA gene normally
participates in may not be present in E. coli. This could lead to the lack of substrate molecules in
the E. coli host for the MA gene to act upon. An additional concern is that the E. coli could
maintain tight regulation of the molecules affected by the MA gene by regulating precursors and
degrading products. A potential problem and source of error was the use of the IPTG inducible
T7 polymerase to express the MA genes in E. coli. This expression system can produce large
quantities of protein even with a low level of induction (Studier, 2005) and this protein
accumulation could cause significant changes in growth rates (observed, data not shown) and in
metabolite profiles of the cells. The protein’s expression may possibly deplete amino acid pools
and cause potential toxic effects related to protein hyperaccumulation (Studier, 2005). The
knockout MA mutant strains would have elevated these concerns although we then would face
other limitations. One possible major limitation would be that we could only do metabolite
profiling on genes that are not essential under our growth conditions. Another possible issue was
with the timing of the sample collections. Our protocol was to induce for four hours and then
collect the samples. However this method does not take into account that the MA expressing
cells growth rates may change depending on the MA gene that was expressed in E. coli cells.
These potential changes in growth rates could even cause the cells to be in different growth
phases during collection. Collecting cells at the same growth stage, or OD, has been shown to be
more important for controlling metabolite variation (Takahashi et al., 2008; Zech et al., 2009;
Marcinowska et al., 2011).
73
Author contributions in this section
Lucas Showman performed genetic complementation and metabolomics studies. Riley
Brien assisted with metabolomics experiments and analysis. John Orban set up NMR
experiments and the small molecule library, interpreted NMR and ITC data, participated in the
manual re-annotation. Yihong Chen expressed and purified proteins, helped to acquire NMR
data, and performed the calorimetry experiments. Ethel Apolinario cloned MA genes and
participated in the manual re-annotation. Libuse Brachova assisted in setting up the
complementation and metabolomics studies. Zvi Kelman designed strategies for cloning MA
genes and assisted in the manual re-annotation. Zhuo Li cloned MA genes and assisted in the
manual re-annotation. Basil J Nikolau supervised the complementation and metabolomics
studies. Kevin Sowers coordinated cloning and manual re-annotation.
Acknowledgments for this section
This research is supported by the Office of Science (BER), U.S. Department of Energy,
Grant Number DE-FG02-07ER64502, and an equipment grant from the W. M. Keck Foundation.
We also wish to thank Dr. Dennis Maeder (SAIC-Frederick Inc., NCI) for providing spreadsheets lining up orthologs between M. acetivorans and M. burtonii.
REFERENCES
Allen MA, Lauro FM, Williams TJ, Burg D, Siddiqui KS, De Francisci D, Chong KW,
Pilak O, Chew HH, De Maere MZ, Ting L, Katrib M, Ng C, Sowers KR, Galperin
MY, Anderson IJ, Ivanova N, Dalin E, Martinez M, Lapidus A, Hauser L, Land M,
Thomas T, Cavicchioli R (2009) The genome sequence of the psychrophilic archaeon,
74
Methanococcoides burtonii: the role of genome evolution in cold adaptation. ISME J 3:
1012-1035
Baba T, Ara T, Hasegawa M, Takai Y, Okumura Y, Baba M, Datsenko KA, Tomita M,
Wanner BL, Mori H (2006) Construction of Escherichia coli K-12 in-frame, single-gene
knockout mutants: the Keio collection. Mol Syst Biol 2: 2006.0008
BERTANI G (1951) Studies on lysogenesis. I. The mode of phage liberation by lysogenic
Escherichia coli. J Bacteriol 62: 293-300
Bino RJ, Hall RD, Fiehn O, Kopka J, Saito K, Draper J, Nikolau BJ, Mendes P, RoessnerTunali U, Beale MH, Trethewey RN, Lange BM, Wurtele ES, Sumner LW (2004)
Potential of metabolomics as a functional genomics tool. Trends Plant Sci 9: 418-425
Brauer MJ, Yuan J, Bennett BD, Lu W, Kimball E, Botstein D, Rabinowitz JD (2006)
Conservation of the metabolomic response to starvation across two divergent microbes.
Proc Natl Acad Sci U S A 103: 19302-19307
Broeckling CD, Reddy IR, Duran AL, Zhao X, Sumner LW (2006) MET-IDEA: data
extraction tool for mass spectrometry-based metabolomics. Anal Chem 78: 4334-4341
Buchholz A, Takors R, Wandrey C (2001) Quantification of intracellular metabolites in
Escherichia coli K12 using liquid chromatographic-electrospray ionization tandem mass
spectrometric techniques. Anal Biochem 295: 129-137
Chen Y, Apolinario E, Brachova L, Kelman Z, Li Z, Nikolau BJ, Showman L, Sowers K,
Orban J (2011) A nuclear magnetic resonance based approach to accurate functional
annotation of putative enzymes in the methanogen Methanosarcina acetivorans. BMC
Genomics 12 Suppl 1: S7
CoGe (2014) The Place to Compare Genomes. In. genomevolution.org
75
Dalvit C, Flocco M, Veronesi M, Stockman BJ (2002) Fluorine-NMR competition binding
experiments for high-throughput screening of large compound mixtures. Comb Chem
High Throughput Screen 5: 605-611
Dalvit C, Fogliatto G, Stewart A, Veronesi M, Stockman B (2001) WaterLOGSY as a method
for primary NMR screening: practical aspects and range of applicability. J Biomol NMR
21: 349-359
Fiehn O, Kopka J, Dörmann P, Altmann T, Trethewey RN, Willmitzer L (2000) Metabolite
profiling for plant functional genomics. Nat Biotechnol 18: 1157-1161
Fleischmann RD, Adams MD, White O, Clayton RA, Kirkness EF, Kerlavage AR, Bult CJ,
Tomb JF, Dougherty BA, Merrick JM (1995) Whole-genome random sequencing and
assembly of Haemophilus influenzae Rd. Science 269: 496-512
Fridman E, Pichersky E (2005) Metabolomics, genomics, proteomics, and the identification of
enzymes and their substrates and products. Curr Opin Plant Biol 8: 242-248
Förster J, Gombert AK, Nielsen J (2002) A functional genomics approach using metabolomics
and in silico pathway analysis. Biotechnol Bioeng 79: 703-712
Galagan JE, Nusbaum C, Roy A, Endrizzi MG, Macdonald P, FitzHugh W, Calvo S,
Engels R, Smirnov S, Atnoor D, Brown A, Allen N, Naylor J, Stange-Thomann N,
DeArellano K, Johnson R, Linton L, McEwan P, McKernan K, Talamas J, Tirrell
A, Ye W, Zimmer A, Barber RD, Cann I, Graham DE, Grahame DA, Guss AM,
Hedderich R, Ingram-Smith C, Kuettner HC, Krzycki JA, Leigh JA, Li W, Liu J,
Mukhopadhyay B, Reeve JN, Smith K, Springer TA, Umayam LA, White O, White
RH, Conway de Macario E, Ferry JG, Jarrell KF, Jing H, Macario AJ, Paulsen I,
Pritchett M, Sowers KR, Swanson RV, Zinder SH, Lander E, Metcalf WW, Birren
76
B (2002) The genome of M. acetivorans reveals extensive metabolic and physiological
diversity. Genome Res 12: 532-542
Hall R, Beale M, Fiehn O, Hardy N, Sumner L, Bino R (2002) Plant metabolomics: the
missing link in functional genomics strategies. Plant Cell 14: 1437-1440
He Y, Chen Y, Alexander P, Bryan PN, Orban J (2008) NMR structures of two designed
proteins with high sequence identity but different fold and function. Proc Natl Acad Sci
U S A 105: 14412-14417
Hirai MY, Klein M, Fujikawa Y, Yano M, Goodenowe DB, Yamazaki Y, Kanaya S,
Nakamura Y, Kitayama M, Suzuki H, Sakurai N, Shibata D, Tokuhisa J, Reichelt
M, Gershenzon J, Papenbrock J, Saito K (2005) Elucidation of gene-to-gene and
metabolite-to-gene networks in arabidopsis by integration of metabolomics and
transcriptomics. J Biol Chem 280: 25590-25595
Koek MM, Muilwijk B, van der Werf MJ, Hankemeier T (2006) Microbial metabolomics
with gas chromatography/mass spectrometry. Anal Chem 78: 1272-1281
Liang YS, Kim HK, Lefeber AW, Erkelens C, Choi YH, Verpoorte R (2006) Identification
of phenylpropanoids in methyl jasmonate treated Brassica rapa leaves using twodimensional nuclear magnetic resonance spectroscopy. J Chromatogr A 1112: 148-155
Longstaff DG, Larue RC, Faust JE, Mahapatra A, Zhang L, Green-Church KB, Krzycki
JA (2007) A natural genetic code expansion cassette enables transmissible biosynthesis
and genetic encoding of pyrrolysine. Proc Natl Acad Sci U S A 104: 1021-1026
Luo X, Yu H (2008) Protein metamorphosis: the two-state behavior of Mad2. Structure 16:
1616-1625
77
Newman EB, Batist G, Fraser J, Isenberg S, Weyman P, Kapoor V (1976) The use of
glycine as nitrogen source by Escherichia coli K12. Biochim Biophys Acta 421: 97-105
Parsons L, Bonander N, Eisenstein E, Gilson M, Kairys V, Orban J (2003) Solution structure
and functional ligand screening of HI0719, a highly conserved protein from bacteria to
humans in the YjgF/YER057c/UK114 family. Biochemistry 42: 80-89
Phelps TJ, Palumbo AV, Beliaev AS (2002) Metabolomics and microarrays for improved
understanding of phenotypic characteristics controlled by both genomics and
environmental constraints. Curr Opin Biotechnol 13: 20-24
Rohlin L, Gunsalus RP (2010) Carbon-dependent control of electron transfer and central
carbon pathway genes for methane biosynthesis in the Archaean, Methanosarcina
acetivorans strain C2A. BMC Microbiol 10: 62
Rother M, Oelgeschlager E, Metcalf W (2007) Genetic and proteomic analyses of CO
utilization by Methanosarcina acetivorans. Archives of Microbiology 188: 463-472
Schauer N, Fernie AR (2006) Plant metabolomics: towards biological function and mechanism.
Trends Plant Sci 11: 508-516
SOWERS K, BARON S, FERRY J (1984) METHANOSARCINA-ACETIVORANS SP-NOV,
AN ACETOTROPHIC METHANE-PRODUCING BACTERIUM ISOLATED FROM
MARINE-SEDIMENTS. Applied and Environmental Microbiology 47: 971-978
Studier FW (2005) Protein production by auto-induction in high density shaking cultures.
Protein Expr Purif 41: 207-234
Swanepoel CC, Loots dT (2014) The use of functional genomics in conjunction with
metabolomics for Mycobacterium tuberculosis research. Dis Markers 2014: 124218
78
Tuinstra RL, Peterson FC, Kutlesa S, Elgin ES, Kron MA, Volkman BF (2008)
Interconversion between two unrelated protein folds in the lymphotactin native state.
Proc Natl Acad Sci U S A 105: 5057-5062
Yeh DC, Parsons LM, Parsons JF, Liu F, Eisenstein E, Orban J (2005) NMR structure of
HI0004, a putative essential gene product from Haemophilus influenzae, and comparison
with the X-ray structure of an Aquifex aeolicus homolog. Protein Sci 14: 424-430
Yelton AP, Thomas BC, Simmons SL, Wilmes P, Zemla A, Thelen MP, Justice N, Banfield
JF (2011) A semi-quantitative, synteny-based method to improve functional predictions
for hypothetical and poorly annotated bacterial and archaeal genes. PLoS Comput Biol 7:
e1002230
79
Target MA genes
Select MA genes
Clone selected MA genes from MA strain C2A into
pET vectors (IPTG inducible)
Express in E. coli
Complementation studies
Metabolite analysis
Purify protein
In vitro enzyme assays
Submit soluble protein to
ligand screening via NMR
Function
Figure 1. Flow diagram of our experimental strategy
Arrows represent the work flow as well as the information flow for determining conditions of further
experiments and designating new gene functional annotations.
80
Figure 2. The NMR waterLOGSY Principle. Figure A shows when a ligand is in solution with
a protein capable of binding it, there are two competing flows of magnetization: 1) from water to
the free ligand and 2) from the bound water (via the protein) to the bound ligand. The two
competing flows lead to opposite signs of NOEs, with the stronger flow determining the sign.
1
Example of one dimensional waterLOGSY H NMR spectrum. Positive peaks are indicative of
binding while negative peaks represent unbound ligand (B).
81
Figure 3. The MA biotin network. The MA biotin biosynthesis pathway with current putative
annotations for biotin synthase (MA0154 and/or MA01489), biotin ligase (MA0676), and MA’S
only known biotin dependent enzyme, a putative pyruvate carboxylase with subunits:
pycA(MA0675) and pycB (MA0674). Notably, the enzymes in the pathway leading up to
dethiobiotin (the immediate precursor of biotin) have not been annotated. In addition, there are
two putative biotin synthase genes MA0154 and MA1489. MA0154 and MA1489 have 43% and
39% similarity to E. coli biotin synthase. (Figure from John Orban)
82
Figure 4. Distributions of gene re-annotated results and confidence levels for the selected
gene pool. Figure A) is a graph of the distribution of changes in functional annotation among
selected MA genes. Graph B) displays the distribution of confidence levels of annotation within
selected MA gene. Level 1: An exact match in the literature with an experimentally defined
function. Level 2: Gene product contains all domains needed for enzymatic function with ≥35%
sequence identity to a gene product of known function. Level 3: Gene product contains all
domains needed for enzymatic function but ≤35% sequence identity to a gene product of known
function. Level 4: Gene product has no experimental match but some domain similarities to a
known function are recognizable. Level 5: Has no experimental match or domain similarities –
i.e. annotated as hypothetical. (Figure from Kevin Sowers (Chen et al., 2011))
83
Figure 5. NMR-based ligand screening of MA4265, a putative isocitrate/isopropylmalate
dehydrogenase. Part of the metabolic pathway involving isocitrate dehydrogenase
functionality is shown at the top of the figure. The relevant section of the one-dimensional
waterLOGSY 1H NMR spectrum is shown for each metabolite (300 μM) in the presence of
MA4265 (30 μM) as follows: (a) citrate; (b) cis-aconitate; (c) isocitrate; (d) 2-ketoglutarate;
(e) control spectrum of MA4265 alone. Peaks due to each compound are labeled with
asterisks. Metabolite interaction with MA4265 is represented as positive peaks whereas
negative peaks indicate no binding to the protein. Only isocitrate binds to MA4265 under
the conditions used. (Chen et al., 2011).
84
Table 1. E. coli homologs of MA4265. Below are the BLASTp results representing E. coli
homologs to the MA4265 amino acid sequence. The Keio mutants of each of these three genes;
yeaU, leuB, and icd were selected for use in genetic complementation studies with MA4265.
85
Figure 6. Genetic complementation results for MA4265. E. coli strains in the BW25113
(WT) background carrying deletion mutant alleles icd (a), leuB (b), and yeaU (c), were
transformed with an empty control vector (EV=pET-21a) or a plasmid expressing MA4265. All
strains were grown on M9 minimal media with glucose (or malate in plate c) as the carbon
source, and contained 100 μg/mL ampicillin. Plates were incubated at 37°C for 48 hours. For
panel (c), D-malate was substituted for glucose as the sole carbon source to examine the growth
phenotype of yeaU, which is masked with glucose (Reed et al., 2006). d) The icd mutant strain
carrying either the empty control vector ( icd +EV) or the MA4265-expressing vector ( icd
+MA4265) and the parental strain (WT) empty control vector (WT+EV) or the MA4265expressing vector (WT+MA4265) were grown in M9 liquid media with glucose as the carbon
source and 100 μg/mL ampicillin. Cultures were incubated at 37°C and OD600 was monitored.
Data from the average of three replicates (± standard error) are presented (Chen et al., 2011).
86
Figure 7. Amino acid profile for MA3520 expressed in Rosetta 2.0 E. coli cells. MA3520
was expressed in MOPS minimal liquid media with glucose as the sole carbon source with 100
μg/mL ampicillin and induction for 4 hours with 0.1mM IPTG. The log2 ratio of these cells is
compared to the content of an empty vector control. All amino acid amounts were determined by
GC-MS. The average of 6 replicates (± standard error) are presented. Statistically significant
changes are labelled with orange markers (p-value<0.5).
87
Figure 8. MA3520 Effects on E. coli Glycine and Serine Accumulation. MA3520 was
expressed in MOPS minimal liquid media with glucose as the sole carbon source with100 μg/mL
ampicillin and induction for 4 hours with 0.1mM IPTG. The control cells were grown under the
same conditions with an empty vector control. Amino acid content determined by GC-MS. The
bar graphs represent the average of 6 replicates (± standard error) are presented. The observed
change in glycine was statistically significant (p-value<0.5).
88
Figure 9. NMR-based ligand screening of MA3520, a putative serine
hydroxymethyltransferase. The relevant section of the one-dimensional waterLOGSY 1H NMR
spectrum is shown for each metabolite in the presence of MA3520 as follows: glycine (gly),
serine (ser), alanine (ala), histidine (his), and methionine(met). Metabolite interaction with
MA3520 is represented as positive peaks whereas negative peaks indicate no binding to the
protein. Only glyince was positive for binding.
89
Figure 10. Genetic complementation results for MA3520. Growth of isogenic E. coli strains
in the BW25113 (WT) background carrying a deletion mutant allele for glyA .The glyA mutant
strain carrying either the empty control vector ( icd +EV) or the MA3520-expressing vector (
glyA+MA3520) and the parental strain (WT) empty control vector (WT+EV) were grown in M9
liquid media with glycine as the sole carbon source with100 μg/mL ampicillin and induction with
0.1mM IPTG. Cultures were incubated at 37°C and OD600 was monitored. Data from the average
of three replicates (± standard error) are presented.
90
Figure 11. Example of the total polar metabolites data of MA3520 vs EV as a volcano plot.
This volcano plot represents the polar total metabolite profile for Z052 (MA3520) expressed in
Rosetta 2.0 E. coli cells. Cell carrying the MA3520 were grown in MOPS minimal liquid media
with glucose as the sole carbon source with100 μg/mL ampicillin and induction for 4 hours with
0.1mM IPTG. The log2 ratio of these cells is compared to the content of an empty vector control
and displayed as the x-coordinate, while the p-value between the MA gene and the control cells
is presented in a -Log10 format as the y-coordinate. All metabolite amounts were determined by
GC-MS. The data from average of 6 replicates of each MA gene and the EV control are
presented. Statistically significant changes are located above the orange shaded bar (pvalue<0.5).
91
Figure 12. Example of the total polar metabolites data of MA1272 vs EV as a volcano plot.
This volcano plot represents the polar total metabolite profile for Z052 (MA1272) expressed in
Rosetta 2.0 E. coli cells. Cells carrying the MA1272 were grown in MOPS minimal liquid media
with glucose as the sole carbon source with100 μg/mL ampicillin and induction for 4 hours with
0.1mM IPTG. The log2 ratio of these cells is compared to the content of an empty vector control
and displayed as the x-coordinate, while the p-value between the MA gene and the control cells
is presented in a -Log10 format as the y-coordinate. All metabolite amounts were determined by
GC-MS. The data from average of 6 replicates of each MA gene and the EV control are
presented. Statistically significant changes are located above the orange shaded bar (pvalue<0.5).
92
Table 2. Annotation status of selected and analyzed MA genes
93
CHAPTER IV. THE BIOTIN NETWORK OF METHANOSARCINA
ACETIVORANS: GENE FUNCTION, REGULATION, AND METABOLIC
ROLE OF BIOTIN IN A METHANOGENIC ARCHAEON’S
METABOLISM
Lucas Showman1, Yihong Chen2, Ethel Apolinario3, Libuse Brachova1, Zvi Kelman2 4,
Zhuo Li2, John Orban2 5*, Kevin Sowers3, and Basil J Nikolau1 6*
Author Affiliations:
1
Department of Biochemistry, Biophysics and Molecular Biology, Iowa State University, Ames
IA 50011, USA
2
Institute for Bioscience and Biotechnology Research, University of Maryland, 9600 Gudelsky
Drive, Rockville MD 20850, USA
3
Department of Marine Biotechnology, University of Maryland Baltimore County, Baltimore
MD, USA
4
Department of Cell Biology and Molecular Genetics, University of Maryland College Park,
USA
5
Department of Chemistry and Biochemistry, University of Maryland College Park, USA
6
Center for Biorenewable Chemicals, Iowa State University, Ames IA 50011, USA
* Corresponding author: Basil Nikolau [email protected]
94
ABSTRACT
In this study we showed that Methanosarcina acetivorans was able to produce
biotinylated pyruvate carboxylase from the biotin vitamers: 7,8-diaminopelargonic acid,
dethiobiotin, and biotin. Furthermore the addition of exogenous biotin led to the
hypeaccumulation of amino acids, including oxaloacetate derived aspartic acid. The pyruvate
carboxylase and birA genes were found to be contained in a single putative birA biotin
dependant regulated operon. Methanosarcina acetivorans cultures with no biotin vitamers
available were unable to produce biotinylated pyruvate carboxylase, however the growth of these
cells was not adversely affected. This demonstrated that neither biotin nor biotinylated proteins
are required for the growth of Methanosarcina acetivorans.
95
INTRODUCTION
Biotin (vitamin H) is a water-soluble vitamin that was originally characterized at the
beginning of the 20th century (Wildiers, 1901; Kogl and Tonnis, 1936; Du Vigneaud and
Melville, 1942). Biotin is covalently bound to enzymes and acts as a catalytic prosthetic group in
carboxylation, decarboxylation, transcarboxylation reactions (KNOWLES, 1989; Nikolau et al.,
2003; Smith et al., 2007; Ding et al., 2012). Biotin is present as an essential enzyme cofactor of
enzymes important in several metabolic processes including: anaplerosis, lipogenesis, and amino
acid metabolism. Appropriately categorized as a vitamin, biotin is utilized by all organisms,
though not all species have the ability to synthesize biotin de novo (Bender, 1992).
All higher plants, most bacteria, and some fungi can synthesize biotin, while animals
must obtain biotin from their diets. The biotin biosynthetic in genes from model organisms such
as E. coli (Eisenberg et al., 1987; Otsuka et al., 1988), B. subtilis (GLOECKLER et al., 1990;
FLORENTIN et al., 1994; Bower et al., 1996), S. cerevisiae (Phalip et al., 1999), and
Arabidopsis (Schneider et al., 1989; SHELLHAMMER and MEINKE, 1990; BALDET et al.,
1993; Weaver et al., 1996; Alban et al., 2000; Picciocchi et al., 2001; Pinon et al., 2005; Muralla
et al., 2008; Cobessi et al., 2012) are well defined, while the biotin biosynthetic pathways in the
domain of Archaea are poorly understood. Most commonly used media used to culture Archaeal
species contain small amounts of biotin for proper growth. Interestingly, many species of
Archaea lack genes with significant homology to the well elucidated biotin biosynthetic genes
(Streit and Entcheva, 2003). However the methanogenic Archaeon M. jannaschii (MJ) possesses
a biotin biosynthetic operon that is highly similar to the one found in B. subtilis (Rodionov et al.,
96
2002; Streit and Entcheva, 2003). A complete pathway, the biosynthetic gene homologs found in
MJ are sufficient to derive biotin from the conserved precursor pimelic acid.
Methanosarcina acetivorans has six putative biotin network genes (Methanosarcina
Sequencing Project. Broad Institute of MIT and Harvard (www.broad.mit.edu), (Galagan et al.,
2002)). Fig 1. shows a consensus biotin biosynthesis pathway of known microbial biotin
biosynthetic genes (Streit and Entcheva, 2003), along with putative annotations for biotin
synthase (MA0154 and/or MA01489) and bioY, an ABC type biotin transporter (MA4340)
(Rodionov et al., 2002; Guillén-Navarro et al., 2005; Hebbeln et al., 2007; Finkenwirth et al.,
2013). The other biotin networks genes are biotin ligase (MA0676), and MA’s only known biotin
dependent enzyme, a putative pyruvate carboxylase with subunits: pycA (MA0675) and pycB
(MA0674). Notably, the enzymes required for producing dethiobiotin (the penultimate precursor
of biotin) have not been annotated. In addition, there are two putative biotin synthase genes
MA0154 and MA1489. MA0154 and MA1489 have 43% and 39% similarity to E. coli biotin
synthase and further experimental evidence was required to determine which gene(s) has actual
biotin synthase activity. Though the biotin biosynthetic pathway appears to be incomplete, initial
observations showed that MA does not require exogenous biotin to sustain normal growth
(SOWERS et al., 1984). This observation implies that MA does indeed have a complete but not
yet described biotin biosynthetic pathway or a novel situation in which MA does not require
biotin to survive.
97
MATERIALS AND METHODS
Identification of biotin network gene homologs in MA
In preparation for target gene selection for aconcurrent study, the functional annotations
were manually updated for over 700 of the 4524 predicted genes in the MA genome. This was
performed by transferring many of the recently revised manual annotations in the closely related
species M. burtonii (Allen et al., 2009) to homologous genes in MA. Biotin biosynthetic genes
were targeted among the 700 selected (Chen et al., 2011) (see chapter III). Also, a thorough
literature search was conducted for published data that experimentally confirms the functionality
of MA genes and closely related orthologs in other species (Chen et al., 2011). In addition,
BLASTp searches (National Center for Biotechnology Information BLASTp
blast.ncbi.nlm.nih.gov/Blast.cgi, (Altschul et al., 1990; Johnson et al., 2008) were conducted
searching the MA genome database (taxid:2214) using known biotin network genes as query
sequences. The known biotin biosynthetic genes sequences were from the model organisms: E.
coli (Eisenberg et al., 1987; Otsuka et al., 1988), B. subtilis (GLOECKLER et al., 1990;
FLORENTIN et al., 1994; Bower et al., 1996), S. cerevisiae (Phalip et al., 1999), and
Arabidopsis (Schneider et al., 1989; SHELLHAMMER and MEINKE, 1990; BALDET et al.,
1993; Weaver et al., 1996; Alban et al., 2000; Picciocchi et al., 2001; Pinon et al., 2005; Muralla
et al., 2008; Cobessi et al., 2012) (data not shown)
MA growth conditions
MA cells were grown in Marine media (cite with 80:20 nitrogen gas: carbon dioxide)
containing 0.25g/L Na2S, 100mM trimethylamine, trace minerals and vitamins (WOLIN et al.,
1963), 50mM proline, either 8.0pM biotin (Sigma-Aldrich, St. Louis, MO), or 1.0mg/L avidin
98
from egg white (Sigma-Aldrich, St. Louis, MO), or 20pM KAPA (Cayman, Ann Arbor, MI),
DAPA, or Pimelic acid (Sigma-Aldrich, St. Louis, MO), or dethiobiotin (Sigma-Aldrich, St.
Louis, MO). The growth of 10mL cultures was monitored at 550nm with 3 to 12 three replicates
for each time point.
Vitamer dependent biotinylation of pyruvate carboxylase subunit B
MA cultures were incubated for 48 hours in the presents of biotin a singal biotin
vitamers. The biotinylation status of pyruvate carboxylase subunit B was detected using
streptavidin-HRP, avidin with no biotin vitamers (-), pimelic acid (PA), 7-keto-8aminopelargonic acid (KAPA), 7,8-diaminopelargonic acid (DAPA), dethiobiotin (BTB) and
biotin. and subsequently stained with Coomassie Brilliant Blue or subjected to western blot
analysis. Western blots were incubated with 0.2µg/mL of streptavidin-HRP (GE Health UK
Limited, Buckinghampshire,UK) and detection achieved by chemilumninescence with ELC
western blotting reagents (GE Healthcare Bio-science Corporation, Piscataway, NJ). Western
blots were imaged with a ChemiDoc XRS+ imaging system and images were analyzed with
Image Lab Software (Hercules, CA) to quantify abundances.
One dimensional waterLOGSY 1H NMR spectrum candidate BirA MA0676
WaterLOGSY (Dalvit et al., 2001; Chen et al., 2011) experiments were conducted with
purified MA0676, purification as previously described (Chen et al., 2011). 1 µM up to 500 µM
of the potential substrates: biotin, ATP, lysine and biotinyl-lysine were used The waterLOGSY
experiments were conducted as describes previously (Chen et al., 2011).
99
Western blot analysis
Antibodies were created using purified proteins isolated as previously described (Chen et
al., 2011). Purified protein samples were submitted to the Hybridoma Facility of the Iowa State
University Office of Biotechnology for the generation of polyclonal antibodies in mice. Sera
specific to the birA, pycA, and pycB proteins were generated. These antibodies were used for the
western blot detection of native birA, pycA, and pycB. Protein was extracted from cell pellets of
cells at mid-log growth stage. Western blot analysis was performed on the biotin - and biotin +
protein extracts. Detection was achieved by goat anti-mouse IgG (H+L)-HRP conjugate (BioRad, Hercules, CA) using chemilumninescence with ELC western blotting reagents (GE
Healthcare Bio-science Corporation, Piscataway, NJ). Western blots were imaged with a
ChemiDoc XRS+ imaging system and images were analyzed with Image Lab Software
(Hercules, CA) to quantify abundances.
Real-time RT-PCR transcriptional analysis
RNA was extracted using the Aurum Total RNA Mini Kit (Bio-Rad, Hercules, CA), and
then quantified using NanoDrop ND-1000 (Thermo Fisher Scientific, Waltham, MA) and the
quality was observed via agarose gel electrophoresis. RNA was then treated with DNase I
(Invitrogen, Carlsbad, CA) to remove genomic DNA contamination and Superscript First-Strand
synthesis system (III) (Invitrogen, Carlsbad, CA) was used to produce cDNA. The probes and
primers used in the real-time experiments were obtained from Integrated DNA Technology.
TaqMan Fast Advanced Master Mix (Applied Biosystems, Foster City, CA) was used for all of
the real-time experiments using the standard conditions provided in the user manual. Real-time
PCR experiments were performed using a StepOnePlus system (Applied Biosystems, Foster
100
City, CA) at Iowa State University DNA Facility and results were analyzed using the
instrument’s supporting software.
Metabolite analysis
Cell pellets were lyophilized overnight and dry masses were recorded (4-10mg). All
reagents for metabolite analysis were from Sigma-Aldrich, St. Louis, MO unless stated
differently. 25µL of ribitol was added as an internal standard (0.6 mg/mL in dry-pyridine) as
well as 25µL of C19 fatty acid as internal standard (0.6 mg/mL in chloroform) was added to the
cell pellets. 0.35mL hot methanol (Thermo Fisher Scientific, Fair Lawn, NJ) (60°C) was added,
mixed, and immediately incubated at 60°C for 10 minutes. Samples were then vortexed 10 sec
and put in sonicating water bath for 10 minutes. 0.35mL of chloroform was added and mixed by
vortexing for 2 minutes. 0.3mL HPLC grade water was added and vortexed 2 minutes. Phases
were separated by centrifuge, 5 minutes at 13000G. The upper layer (polar) and lower layer
(non-polar) were separated then derivatized and analyzed separately. Derivatization was adapted
from (Koek et al., 2006). 200-400µl polar/non-polar extracts were dried under nitrogen gas. 50µl
methoxyamine (20mg/mL in pyridine) was added and was incubated at 30°C while shaking, 1.5
hours. 70µl of bis-trimethyl silyl trifluoroacetamide with 1% trimethylchlorosilane (BSTFA+1%
TMCS) was then added (TMS derivatization reagent) and incubated at 37°C for 30 minuites.
Samples were then submitted to GC-MS (gas chromatography–mass spectrometry) analysis. The
following GC-MS method was used for the total metabolite GC-MS analysis. All GC-MS data
was captured using an Agilent Technologies Model 7890A Gas Chromatograph with an Agilent
19091S-433: 1926.58224 HP-5MS 5% Phenyl Methyl Siloxane column -60 °C—325 °C (325
°C): 30 m x 250 µm x 0.25 µm column coupled to a Model 5975C mass selective detector with
electrical ionization (EI). The GC-MS system was control by and the data collected with
101
ChemStation software (Agilent Technologies, Santa Clara, CA). This GC-MS system was made
available by the Office of Biotechnology W.M. Keck Metabolomics Research Laboratory, Iowa
State University. The injection port temperature was 280°C with a splitless mode. He was the
carrier gas at a flow rate of 1.5 mL/min. The GC oven temperature ramp procedures were as
follows: temperature set to 80 °C and held for 2 minutes, ramp #1 8°C/min to 260°C, hold 5
minutes, ramp #2 10°C/min to 320°C, hold 20 minutes. MS conditions were as follows: ion
source, 200°C; electron energy, 70eV; quadrupole temperature, 150°C; GC-MS interface zone,
280°C; scan range, 50-550 mass units, a 3.7 minute solvent delay was used. Data was analyzed
using AMDIS (Automated Mass Spectral Deconvolution and Identification System;
http://chemdata.nist.gov/mass-spc/amdis/) software, with data interpretation from NIST08 library
and quantification with MET-IDEA (Broeckling et al., 2006). The NIST 08 MS Library and
AMDIS were provided by National Institute of Standards and Technology (www.nist.gov) and
MET-IDEA software was obtained from the Samuel Roberts Noble Foundation (Ardmore, OK).
Computation identification of the MA biotin network
We searched the MA genome for homologs of known E. coli, Arabidopsis, S. cerevisiae,
and B. subtilis biotin biosynthetic genes (BLASTp). MA biotin network genes were also
identified as previously described (Chen et al., 2011). Sequences from Methanosarcina barkeri
and Methanosarcina mazei were used as query sequences to identify putative BirA binding DNA
sequences sites (Rodionov et al., 2002).
102
RESULTS
Vitamer dependent biotinylation of pyruvate carboxylase subunit B
In order to understand the biotin biosynthetic capabilities of MA and the specific biotin
precursors MA is capable of utilizing, an exogenous feeding study was conducted. The
biotinylation status of pyruvate carboxylase subunit B was monitored (Fig. 2) to indicate ability
of MA to form biotin from its known precursors. The negative controls were cultures grown
containing the biotin scavenging protein avidin, with no added biotin vitamers. These conditions
did not lead to any detectable biotinylation. The addition pimelic acid or 7-keto-8aminopelargonic acid (KAPA), also did not facilitate biotinylation,while 7,8-diaminopelargonic
acid (DAPA), dethiobiotin (BTB) and biotin all led to observable biotinylation.
One dimensional waterLOGSY 1H NMR spectrum candidate BirA MA0676
WaterLOGSY (water-Ligand Observed via Gradient Spectroscopy), relies on competing
flows of magnetization which lead to binding manifested as positive peaks and non-binding
manifested as negative peaks (Dalvit et al., 2001; Chen et al., 2011). WaterLOGSY 1H NMR
experiments were used to test candidate substrates of birA MA0676. Purified recombinant
MA0676 shows positive signal indicating binding of biotin (Fig. 3). Addition of ATP causes a
notable change in the signal, most likely due to the formation of biotinyl-AMP. Moreover, the
addition of a substrate mimic, lysine, leads to negative biotin signals, consistent with the
formation of a dissociated biotinyl-lysine product. Note that control experiments showed that
lysine alone does not displace biotin, which could be an alternative explanation of the binding
data. Also, a biotinyl-lysine (biocytin) standard did not bind to MA0676 (data not shown).
103
Growth of MA with and without biotin
The growth curves in Figure 4 show the growth of MA cultures with and without biotin.
After 24 hours of incubation, the growth of the biotin containing cultures began to slow and
lagged behind the biotin depleted cultures, leading to a reduction of 0.25 OD550 by 48 hours
incubation; this reduction was maintained into stationary phase. Samples from the cultures,
whose growth was represented in Figure 4, were collected at 24, 48, and 72 hours and were used
for subsequent qRT-PCR, western, and metabolomics analysis.
Expression of MA pyruvate carboxylase subunits and birA with and without biotin
Quantitative RT-PCR was used to determine the expression levels of pyruvate
carboxylase subunits and birA at three different growth time points: 24, 48, and 72 hours of
incubation (Fig. 5A). The MA0674 (pycB), MA0675 (pycA), and MA0676 (birA) which are
predicted to be members of a single operon, followed a very similar expression pattern. At 24
hours of growth, the expression level of all three transcripts are elevated in the biotin depleted
cultures 12 to 20 fold, later on at 48 and 72 hours the transcripts levels returned to back to near
the biotin containing cultures’ levels. Western blot comparison of pyruvate carboxylase subunits
and birA of MA with and without biotin showed (Fig. 5B) that all three proteins were
hyperaccumulated under biotin depleted conditions (increases estimate at ~10 fold, data not
shown). The increased pyruvate carboxylase subunit accumulation at the protein level echoes the
qRT-PCR results. Although there was significantly more pycB protein in the biotin depleted
cultures there was no detectable biotinylated pycB, in contrast, inthe biotin containing cultures
the pycB was clearly biotinylated.
104
BirA binding sequence homology
The putative birA DNA binding sites in the MA genome were identified by their
homology to Methanosarcina barkeri (MBA) and Methanosarcina mazei (MMZ) identified
previously by their homology to other known birA sites (Rodionov et al., 2002). Two birA DNA
binding sites were found in the MA genome. One site is located in the operator region of the
pycA–pycB–birA operon and the other is located in the putative operator region of the putative
ABC transporter bioY–cbiO–cbiO–cbiQ operon (Table 1). The MA birA DNA binding sites are
identical to those found in MMZ.
Non-targeted metabolite profiling
Total metabolite profiling revealed that MA cells grown in media containing biotin
accumulated higher amounts of 85 detected metabolites (indicated by p-values <0.5), in contrast
only 8 detected metabolites were found to be reduced in the biotin containing cultures (Fig. 7).
Several of the hyperaccumulated metabolites were identified as amino acids. These amino acids
included aspartic acid which exhibited a 5 fold increase and a 7 fold increase in alanine along
with increases in serine, glycine, glutamine, and β-alanine (Fig. 7). While an amino acid,
asparagines, was among the 8 metabolites reduced in the biotin containing cells.
105
DISCUSSION
We were able to establish that MA has the capacity to synthesize biotin from the vitamers
DAPA and dethiobiotin through an exogenous feeding study. This result was notable because
MA lacks genes with significant homology to known dethiobiotin synthetase genes found in
other organisms. While the other biotin vitamers KAPA and pimelic acids failed to be converted
into biotin by MA under the conditions tested. This negative result may not be indicative of MA
lacking the proper enzymatic activity to process these vitamers to biotin, as the MA cells may
lack uptake mechanisms for KAPA and pimelic acid and/or the proper enzymes may not be
expressed under the conditions tested. There were initially two biotin synthase homologs found
in MA, MA0154 and MA01489. MA0154 is now known to be involved in pyrrolysine
biosynthesis and is located in pyrrolysine biosynthesis operon region in MA’s genome
(Longstaff et al., 2007). Additionally MA0154 did not bind dethiobiotin or biotin in
waterLOGSY experiments(Chen et al., 2011) (Sp Fig. 3),while MA1489 displayed binding of
dethiobiotin in waterLOGSY experiments (Sp Fig. 2). As a result MA1489 is the leading
candidate for the biotin synthase activity in MA. MA3250 is the only gene with significant
sequence homology to known dethiobiotin synthetases that catalyze the formation of DAPA to
dethiobiotin. MA3250 is annotated as a cobyric acid synthase gene (Methanosarcina Sequencing
Project. Broad Institute of MIT and Harvard (www.broad.mit.edu), (Galagan et al., 2002)).
Cobyric aicd sythases are amidotransferases that catalyze the formation of amide groups on four
different locations of a signal substrate. Cobyric acid synthase genes are known to have a Nterminal domain that is homologous to known dethiobiotin synthetases and in
phosphotransacetylases. The active sites among these enzymes are thought to be remarkably
similar (Galperin and Grishin, 2000; Endrizzi et al., 2004). MA3250 NMR waterLOGSY
106
substrate binding experiments indicated the positive binding of DAPA to MA3250 protein (Sp
Fig. 1). This substrate binding and the presence of the N-terminal domain similar to the domain
found in dethiobiotin synthetases may indicate the MA3250 may be a dual functional enzyme or
the binding could be nonspecific and the domain homology may be indicative of a common
mechanism or mode of binding that cobyric acid synthases and dethiobiotin synthase share rather
than actual dual enzyme function.
The birA homolog MA0676, contains both a predicted biotin ligase domain as well as a
putative DNA binding domain (Rodionov et al., 2002). Not all BirA orthologs possess the Nterminal DNA-binding domain with the HTH motif (D-b-BirA). The presence of D-b-BirA in a
genome coincides with the presence of potential BirA DNA binding sites existing upstream of
biotin-related genes (Rodionov et al., 2002). When MA0676 was submitted to waterLOGSY
NMR experiments, MA0676 showed binding with biotin and the binding NMR profile was
shifted by the second substrate, ATP, possibly indicating the formation of biotin-‘5-AMP
intermediate. When in the substrate analog, lysine was presented in combination with biotin and
ATP, the bind signal was lost. Perhaps the MA0676 was able to catalyze the formation of
biocytin, causing the release of this product and the loss of binding signals. When taken together,
the substrate binding profile of MA0676 was indicative of biotin ligase activity. The pycB,
pycA, and birA which were predicted to be members of a single operon (Mukhopadhyay et al.,
2001), followed a very similar expression pattern. At 24 hours of growth the expression level of
all three transcripts were elevated in the biotin depleted cells and protein levels of pycB, pycA,
and birA were all three hyperaccumulated under biotin depleted conditions. The pycB subunit
was only biotinylated in the biotin containing cells. Although there was significantly more pycB
protein in the biotin depleted cultures, there was no detectable biotinylated pycB. The expression
107
of these genes matches the regulation of birA of biotin synthetic genes in E. coli. The birA
protein monitors the biotinylation status of the biotin carboxyl carrier protein (BCCP) an acetylcoA carboxylase subunit and reduces the expression of biotin biosynthetic genes once the BCCP
protein is biotinylated and no longer a sink for biotin-5’-AMP (Beckett, 2007). In the case of the
MA birA regulated operons, the biotin carrier protein is pycB. Once pycB is biotinylated it
would no longer act as a sink for birA bond biotin-5’-AMP (Fig. 6). The concentration of birA
bond to biotin-5’-AMP would then build up and this induces the formation homo-dimer
complexes of birA bond to biotin-5’-AMP. The homo-dimers of birA bond to biotin-5’-AMP are
the regulatory active form of birA. These dimers bind to the birA DNA binding sites (bioO)
thereby inhibiting transcription of the genes contained in these operons (Beckett, 2007). The key
difference in MA is that the birA gene itself appears to be under its own regulation as part of an
operon that has a bioO, birA DNA binding site. In fact we located two birA DNA binding sites in
MA, one with the pycA, pycB, birA genes (Fig. 6) and the other in the operator region of the
biotin uptake gene, bioY. This places biotin uptake, pyruvate carboxylase, and biotin ligase
activities under the regulation of birA.
The appearance of a measurable growth defect in MA cultures with the addition of biotin
was unexpected. The biological relevance of this result in MA’s natural environments is unclear,
as this study was only conducted with synthetic marine media. The cause of this growth defect is
likely a result of unbalanced growth caused by the hyperaccumulation pyruvate carboxylase
produced OAA derived and secondarily derived metabolites (Fig. 8). OAA regenerates the
carbon backbone of the anabolic biosynthesis focused TCA cycle in MA. This allows the
potential for secondary pathways to have access to an increased amount in TCA intermediates.
This is supported by the significant number of metabolites (85 polar) that were hyper
108
accumulated in the biotin containing cultures. One major direct derivative of OAA is aspartic
acid, produced via aspartate aminotransferase (Tomita et al., 2014). In fact aspartic acid was
increased to 5-fold of the biotin depleted culture level. Aspartic acid, in addition to being a
proteinogenic amino acid, is also an intermediate and precursor in several other pathways
(Mukhopadhyay et al., 2001; Tomita et al., 2014). A direct product of aspartic acid is amino acid
β-alanine, produced via aspartate-decarboxylase, and it was also increased by 6-fold in the biotin
containing cultures. Potentially the biotinylated and active pyruvate carboxylase may lead to the
funneling of carbon from anaerobic respiration and methane metabolism, and other anabolic
pathways, to OAA mediated anabolic biosynthesis pathways and their intermediates and
products. This metabolic shift could have led to unbalanced growth and the growth phenotype
observed with the cultures incubated in the synthetic marine media with biotin.
The question arises that if biotin is a vitamin that has been shown to be required without
exception in previously studied organisms (Schneider et al., 1989; Bender, 1992; Cobessi et al.,
2012), how is MA able to thrive without it? The answer is twofold. First, because MA uses
iosprenoid based membrane lipids, not fatty acids, and consequently there is no biotin dependent
acetyl-CoA carboxylase homolog found in MA. Acetyl-CoA carboxylase is normally required
for fatty acid production in other organisms (Bailey et al., 1995; Ke et al., 2000; Rodríguez et al.,
2001; Kode et al., 2005; Kurth et al., 2009). Second, there is likely a biotin independent pyruvate
carboxylase bypass pathway in MA (Fig. 8). The combination of phosphoenolpyruvate synthase
and phosphoenolpyruvate carboxylase enzymes can form OAA from pyruvate with
phosphoenolpyruvate as an intermediate (Galagan et al., 2002; Ettema et al., 2004).
109
REFERENCES
Alban C, Job D, Douce R (2000) Biotin metabolism in plants. Annual Review of Plant
Physiology and Plant Molecular Biology 51: 17-47
Allen MA, Lauro FM, Williams TJ, Burg D, Siddiqui KS, De Francisci D, Chong KW,
Pilak O, Chew HH, De Maere MZ, Ting L, Katrib M, Ng C, Sowers KR, Galperin
MY, Anderson IJ, Ivanova N, Dalin E, Martinez M, Lapidus A, Hauser L, Land M,
Thomas T, Cavicchioli R (2009) The genome sequence of the psychrophilic archaeon,
Methanococcoides burtonii: the role of genome evolution in cold adaptation. ISME J 3:
1012-1035
Altschul SF, Gish W, Miller W, Myers EW, Lipman DJ (1990) Basic local alignment search
tool. J Mol Biol 215: 403-410
Bailey A, Keon J, Owen J, Hargreaves J (1995) The ACC1 gene, encoding acetyl-CoA
carboxylase, is essential for growth in Ustilago maydis. Mol Gen Genet 249: 191-201
BALDET P, GERBLING H, AXIOTIS S, DOUCE R (1993) BIOTIN BIOSYNTHESIS IN
HIGHER-PLANT CELLS - IDENTIFICATION OF INTERMEDIATES. European
Journal of Biochemistry 217: 479-485
Beckett D (2007) Biotin sensing: Universal influence of biotin status on transcription. Annual
Review of Genetics 41: 443-464
Bender DA (1992) Nutrition and Biochemistry of the Vitamins. Cambridge University Press,
New York, NY, USA
Bower S, Perkins JB, Yocum RR, Howitt CL, Rahaim P, Pero J (1996) Cloning, sequencing,
and characterization of the Bacillus subtilis biotin biosynthetic operon. J Bacteriol 178:
4122-4130
110
Broeckling CD, Reddy IR, Duran AL, Zhao X, Sumner LW (2006) MET-IDEA: data
extraction tool for mass spectrometry-based metabolomics. Anal Chem 78: 4334-4341
Chen Y, Apolinario E, Brachova L, Kelman Z, Li Z, Nikolau BJ, Showman L, Sowers K,
Orban J (2011) A nuclear magnetic resonance based approach to accurate functional
annotation of putative enzymes in the methanogen Methanosarcina acetivorans. BMC
Genomics 12 Suppl 1: S7
Cobessi D, Dumas R, Pautre V, Meinguet C, Ferrer JL, Alban C (2012) Biochemical and
structural characterization of the Arabidopsis bifunctional enzyme dethiobiotin
synthetase-diaminopelargonic acid aminotransferase: evidence for substrate channeling in
biotin synthesis. Plant Cell 24: 1608-1625
Dalvit C, Fogliatto G, Stewart A, Veronesi M, Stockman B (2001) WaterLOGSY as a method
for primary NMR screening: practical aspects and range of applicability. J Biomol NMR
21: 349-359
Ding G, Che P, Ilarslan H, Wurtele ES, Nikolau BJ (2012) Genetic dissection of
methylcrotonyl CoA carboxylase indicates a complex role for mitochondrial leucine
catabolism during seed development and germination. Plant J 70: 562-577
Du Vigneaud V, Melville D (1942) The structure of biotin: A study of desthiobiotin. Journal of
Biological Chemistry 146: 0475-0485
Eisenberg M, Neidhardt F, Ingraham J, Low K, Magasanik B, Schaechter M, Umbarger M
(1987) Biosynthesis of biotin and lipoic acid in Escherichia coli and Salmonella
thyphimurium. Cellular and Molecular Biology: American Society of Microbiology, New
York: 544-550
111
Endrizzi JA, Kim H, Anderson PM, Baldwin EP (2004) Crystal structure of Escherichia coli
cytidine triphosphate synthetase, a nucleotide-regulated glutamine
amidotransferase/ATP-dependent amidoligase fusion protein and homologue of
anticancer and antiparasitic drug targets. Biochemistry 43: 6447-6463
Ettema TJ, Makarova KS, Jellema GL, Gierman HJ, Koonin EV, Huynen MA, de Vos
WM, van der Oost J (2004) Identification and functional verification of archaeal-type
phosphoenolpyruvate carboxylase, a missing link in archaeal central carbohydrate
metabolism. J Bacteriol 186: 7754-7762
Finkenwirth F, Kirsch F, Eitinger T (2013) Solitary BioY proteins mediate biotin transport
into recombinant Escherichia coli. J Bacteriol 195: 4105-4111
FLORENTIN D, BUI B, MARQUET A, OHSHIRO T, IZUMI Y (1994) ON THE
MECHANISM OF BIOTIN SYNTHASE OF BACILLUS-SPHAERICUS. Comptes
Rendus De L Academie Des Sciences Serie Iii-Sciences De La Vie-Life Sciences 317:
485-488
Galagan JE, Nusbaum C, Roy A, Endrizzi MG, Macdonald P, FitzHugh W, Calvo S,
Engels R, Smirnov S, Atnoor D, Brown A, Allen N, Naylor J, Stange-Thomann N,
DeArellano K, Johnson R, Linton L, McEwan P, McKernan K, Talamas J, Tirrell
A, Ye W, Zimmer A, Barber RD, Cann I, Graham DE, Grahame DA, Guss AM,
Hedderich R, Ingram-Smith C, Kuettner HC, Krzycki JA, Leigh JA, Li W, Liu J,
Mukhopadhyay B, Reeve JN, Smith K, Springer TA, Umayam LA, White O, White
RH, Conway de Macario E, Ferry JG, Jarrell KF, Jing H, Macario AJ, Paulsen I,
Pritchett M, Sowers KR, Swanson RV, Zinder SH, Lander E, Metcalf WW, Birren
112
B (2002) The genome of M. acetivorans reveals extensive metabolic and physiological
diversity. Genome Res 12: 532-542
Galperin MY, Grishin NV (2000) The synthetase domains of cobalamin biosynthesis
amidotransferases cobB and cobQ belong to a new family of ATP-dependent
amidoligases, related to dethiobiotin synthetase. Proteins 41: 238-247
GLOECKLER R, OHSAWA I, SPECK D, LEDOUX C, BERNARD S, ZINSIUS M,
VILLEVAL D, KISOU T, KAMOGAWA K, LEMOINE Y (1990) CLONING AND
CHARACTERIZATION OF THE BACILLUS-SPHAERICUS GENESCONTROLLING THE BIOCONVERSION OF PIMELATE INTO DETHIOBIOTIN.
Gene 87: 63-70
Guillén-Navarro K, Araíza G, García-de los Santos A, Mora Y, Dunn MF (2005) The
Rhizobium etli bioMNY operon is involved in biotin transport. FEMS Microbiol Lett
250: 209-219
Hebbeln P, Rodionov DA, Alfandega A, Eitinger T (2007) Biotin uptake in prokaryotes by
solute transporters with an optional ATP-binding cassette-containing module. Proc Natl
Acad Sci U S A 104: 2909-2914
Johnson M, Zaretskaya I, Raytselis Y, Merezhuk Y, McGinnis S, Madden TL (2008) NCBI
BLAST: a better web interface. Nucleic Acids Res 36: W5-9
Ke J, Wen TN, Nikolau BJ, Wurtele ES (2000) Coordinate regulation of the nuclear and
plastidic genes coding for the subunits of the heteromeric acetyl-coenzyme A
carboxylase. Plant Physiol 122: 1057-1071
KNOWLES J (1989) THE MECHANISM OF BIOTIN-DEPENDENT ENZYMES. Annual
Review of Biochemistry 58: 195-221
113
Kode V, Mudd EA, Iamtham S, Day A (2005) The tobacco plastid accD gene is essential and
is required for leaf development. Plant J 44: 237-244
Koek MM, Muilwijk B, van der Werf MJ, Hankemeier T (2006) Microbial metabolomics
with gas chromatography/mass spectrometry. Anal Chem 78: 1272-1281
Kogl F, Tonnis B (1936) Concerning Bio-problems. Representations of crystallized Biotin in
egg yolk. 20. Communication concerning plant developments materials. Hoppe-Seylers
Zeitschrift Fur Physiologische Chemie 242: 43-73
Kurth DG, Gago GM, de la Iglesia A, Bazet Lyonnet B, Lin TW, Morbidoni HR, Tsai SC,
Gramajo H (2009) ACCase 6 is the essential acetyl-CoA carboxylase involved in fatty
acid and mycolic acid biosynthesis in mycobacteria. Microbiology 155: 2664-2675
Longstaff DG, Larue RC, Faust JE, Mahapatra A, Zhang L, Green-Church KB, Krzycki
JA (2007) A natural genetic code expansion cassette enables transmissible biosynthesis
and genetic encoding of pyrrolysine. Proc Natl Acad Sci U S A 104: 1021-1026
Mukhopadhyay B, Purwantini E, Kreder CL, Wolfe RS (2001) Oxaloacetate synthesis in the
methanarchaeon Methanosarcina barkeri: pyruvate carboxylase genes and a putative
Escherichia coli-type bifunctional biotin protein ligase gene (bpl/birA) exhibit a unique
organization. J Bacteriol 183: 3804-3810
Muralla R, Chen E, Sweeney C, Gray J, Dickerman A, Nikolau B, Meinke D (2008) A
bifunctional locus (BIO3-BIO1) required for biotin biosynthesis in Arabidopsis. Plant
Physiology 146: 60-73
Nikolau BJ, Ohlrogge JB, Wurtele ES (2003) Plant biotin-containing carboxylases. Arch
Biochem Biophys 414: 211-222
114
Otsuka AJ, Buoncristiani MR, Howard PK, Flamm J, Johnson C, Yamamoto R, Uchida K,
Cook C, Ruppert J, Matsuzaki J (1988) The Escherichia coli biotin biosynthetic
enzyme sequences predicted from the nucleotide sequence of the bio operon. J Biol Chem
263: 19577-19585
Phalip V, Kuhn I, Lemoine Y, Jeltsch J (1999) Characterization of the biotin biosynthesis
pathway in Saccharomyces cerevisiae and evidence for a cluster containing BI05, a novel
gene involved in vitamer uptake. Gene 232: 43-51
Picciocchi A, Douce R, Alban C (2001) Biochemical characterization of the Arabidopsis biotin
synthase reaction. The importance of mitochondria in biotin synthesis. Plant Physiology
127: 1224-1233
Pinon V, Ravanel S, Douce R, Alban C (2005) Biotin synthesis in plants. The first committed
step of the pathway is catalyzed by a cytosolic 7-keto-8-aminopelargonic acid synthase.
Plant Physiology 139: 1666-1676
Rodionov D, Mironov A, Gelfand M (2002) Conservation of the biotin regulon and the BirA
regulatory signal in eubacteria and archaea. Genome Research 12: 1507-1516
Rodionov DA, Mironov AA, Gelfand MS (2002) Conservation of the biotin regulon and the
BirA regulatory signal in Eubacteria and Archaea. Genome Res 12: 1507-1516
Rodríguez E, Banchio C, Diacovich L, Bibb MJ, Gramajo H (2001) Role of an essential acyl
coenzyme A carboxylase in the primary and secondary metabolism of Streptomyces
coelicolor A3(2). Appl Environ Microbiol 67: 4166-4176
Schneider T, Dinkins R, Robinson K, Shellhammer J, Meinke DW (1989) An embryo-lethal
mutant of Arabidopsis thaliana is a biotin auxotroph. Dev Biol 131: 161-167
115
SHELLHAMMER J, MEINKE D (1990) ARRESTED EMBRYOS FROM THE BIO1
AUXOTROPH OF ARABIDOPSIS-THALIANA CONTAIN REDUCED LEVELS OF
BIOTIN. Plant Physiology 93: 1162-1167
Smith AG, Croft MT, Moulin M, Webb ME (2007) Plants need their vitamins too. Curr Opin
Plant Biol 10: 266-275
SOWERS K, BARON S, FERRY J (1984) METHANOSARCINA-ACETIVORANS SP-NOV,
AN ACETOTROPHIC METHANE-PRODUCING BACTERIUM ISOLATED FROM
MARINE-SEDIMENTS. Applied and Environmental Microbiology 47: 971-978
Streit WR, Entcheva P (2003) Biotin in microbes, the genes involved in its biosynthesis, its
biochemical role and perspectives for biotechnological production. Appl Microbiol
Biotechnol 61: 21-31
Tomita H, Yokooji Y, Ishibashi T, Imanaka T, Atomi H (2014) An archaeal glutamate
decarboxylase homolog functions as an aspartate decarboxylase and is involved in βalanine and coenzyme A biosynthesis. J Bacteriol 196: 1222-1230
Weaver LM, Yu F, Wurtele ES, Nikolau BJ (1996) Characterization of the cDNA and gene
coding for the biotin synthase of Arabidopsis thaliana. Plant Physiol 110: 1021-1028
Wildiers E (1901) Nouvelle substance indispensable au developpement de la levure. La Cellule
18: 313-316
WOLIN EA, WOLIN MJ, WOLFE RS (1963) FORMATION OF METHANE BY
BACTERIAL EXTRACTS. J Biol Chem 238: 2882-2886
116
Figure 1. Biotin biosynthetic pathway
Displayed is the combined gram-positive and gram-negative bacterial biotin biosynthetic
pathway, with MA genes label in red. Biotin biosynthetic pathways from plants, fungi, and
Achaea are also thought to contain the last five compounds in the pathway. (Figure adapted from
W. R. Streit and P. Entcheva, 2003)
117
Figure 2. Vitamer Dependant Biotinylation of Pyruvate Carboxylase Subunit B.
The biotinylation status of pyruvate carboxylase subunit B. detected using streptavidin-HRP,
avidin with no biotin vitamers (-), pimelic acid (PA), 7-keto-8-aminopelargonic acid (KAPA),
7,8-diaminopelargonic acid (DAPA) , dethiobiotin (DTB), and biotin.
118
Figure 3. One Dimensional waterLOGSY 1H NMR Spectrum Candidate BirA MA0676.
One dimensional waterLOGSY 1H NMR spectrum was used to test candidate substrates of BirA
MA0676. When biotin is added to purified recombinant MA0676 a positive signal is observed
indicating binding. The reference spectrum is a biotin spectrum that is inverted to show a
positive binding control spectrum (a). The following conditions were tested: biotin with
MA0676 protein (b), biotin and ATP with MA0676 protein (c), and biotin, ATP, lysine with
MA0676 (d) (Figure from John Orban).
119
Figure 4. Growth of MA cultures in +/- biotin conditions
MA cells were grown in Marine synthetic media containing either 8.0pM biotin, or 1.0mg/L
avidin. The growth of 10mL cultures was monitored at 550nm absorbance for 72 hours.
120
Table 1. BirA Binding Sequence Homology
Shown above are the known E.coli birA binding site and operon along with the homologous
putative birA binding sites found in and the associated genes forming candidate operons found in
Methanosarcina . Methanosarcina barkeri (MBA) and Methanosarcina mazei (MMZ). $ denotes
the location birA binding sites. The number inside the parenthesis (#) indicates the number of
base pairs in-between the two sections of the binding sites.
121
Figure 6. Diagram of Novel birA Self Regulation
Proposed MA birA mediated biotin ligation and gene regulation pathway based on the
homologous pathway found in E. coli (Adapted from Beckett, 2007). In the presence of biotin,
if apoPYCB is present in significant quantities, the unrepressed bioO state is preferred, while
once the apoPYCB is depleted, the repressed state is favored.
122
Figure 7. Volcano Plot of Polar Metabolites
This plot distributes the detected metabolites by relative fold changes and by statistical
significance. Metabolites above the orange shaded box are considered to have statistically
significantly changed (p-value<0.05) between the biotin + and biotin - conditions. Data points
and the right side of the y-axis are elevated in the biotin + condition, while molecules on the left
side of the y-axis are reduced in the biotin + condition.
123
Figure 8. Putative Metabolic Map Showing Affects of Pyruvate Carboxylase Activity
Red arrows indicate metabolite changes in the biotin containing MA cultures.
Abv.
Enzyme name, MA gene number
pyC:
pyruvate carboxylase MA0674, MA0675
ACDS: acetyl-CoA decarbonylase/synthase MA1011, MA1012, MA1014 MA3862, MA3864,
MA3865
pyOR:
pyruvate ferredoxin oxidoreductase MA0031, MA0032, Ma0033, MA0034, MA2407
pps:
phosphoenolpyruvate synthase MA2458, MA2667, MA3408
AAT:
aspartate aminotransferase MA0636, MA1385, MA1819
asdA:
L-aspartate 4-carboxy-lyase ?
ADC:
aspartate-decarboxylase MA1949
124
(Figure 8. Continued)
Alt:
Alanine transaminase ?
CO2:
maybe from bicarbonate
H2N-R maybe on glutamate and donated to alpha keto gluterate
PEPC
Phosphoenolpyruvate carboxylase MA2690
Gene annotations assisted by KEGG (Kanehisa Laboratories, JP: Kanehisa, M.; "Post-genome
Informatics", Oxford University Press (2000).) http://www.genome.jp/keggbin/show_pathway?mac00680
125
Supplemental Table 1. Primers and probes used in this study
126
Supplemental Figure 1. One Dimensional waterLOGSY 1H NMR Spectrum Candidate
DTB synthase MA3250
One dimensional waterLOGSY 1H NMR spectrum was used to test candidate substrates of the
candidate DTB synthase MA3250. A positive signal indicates binding. MA3250 protein was
evaluated under the following conditions: MA3250 only with no biotin vitamers, pimelic acid, 7keto-8-aminopelargonic acid (KAPA), 7,8-diaminopelargonic acid (DAPA), dethiobiotin, and
biotin. Only DAPA had a spectrum indicating binding. (Figure from John Orban)
127
Supplemental Figure 2. One dimensional waterLOGSY 1H NMR spectrum candidate
biotin synthase MA1489
One dimensional waterLOGSY 1H NMR spectrum was used to test candidate substrates of the
candidate biotin synthase MA1489. A positive signal indicates binding. MA1489 protein was
evaluated under the following conditions: MA1489 only with no biotin vitamers, pimelic acid
(pimelate), 7-keto-8-aminopelargonic acid (KAPA), 7,8-diaminopelargonic acid (DAPA),
dethiobiotin, and biotin . Biotin and dethiobiotin have spectra indicating binding. (Figure from
John Orban)
128
Supplemental Figure 3. One dimensional waterLOGSY 1H NMR spectrum candidate
biotin synthase MA0154
One dimensional waterLOGSY 1H NMR spectrum was used to test candidate substrates of the
candidate biotin synthase MA0154. A positive signal indicates binding. MA0154 protein was
evaluated under the following conditions: MA0154 with dethiobiotin, or MA0154 with biotin.
Biotin and dethiobiotin had spectra indicating no detectable binding under these conditions
(Figure from John Orban).
129
CHAPTER V. GENERAL CONCLUSIONS
Metabolomics studies of ΔfabH (KASII) assisted in the development of the hypothesis that fabF
(KASII) was involved in the survival of ΔfabH E. coli cells. This hypothesis was tested by generating and
evaluating ΔfabF/ ΔfabH double mutants. Through the studies of the ΔfabF/ ΔfabH double mutant, it
was revealed that the KASI, fabB gene was able to support E. coli cell growth as the sole KAS enzyme
present. Additionally it was determined that fabF is responsible for the hyper-accumulation of cisvaccenic acid observed in ΔfabH mutants.
The second study focused on a functional genomics effort to develop a gene-function
annotation method based on metabolomics and NMR ligand binding assays. This study had
mixed results. The metabolomics effort in this study was met with great technical hurdles. Only
in a single case (MA3250) was metabolomics useful in annotating a gene’s functions. While the
NMR substrate binding assays were able to provide useful data for more accurate gene functional
annotations in more than 10% of the genes examined.
The third study presented was an in depth study of the biotin network of Methanosarcina
acetivorans, where metabolic changes associated with the presence and absence of biotin in
Methanosarcina acetivorans cultures were investigated. In this study metabolomics studies were
able to detect metabolic changes that were associated with the presents or absence of biotin. In
this study it was also revealed that MA does not require biotin or any detectable biotinylated
protein to survive.
130
APPENDIX
This appendix presents the individual figures from the metabolomics data described in
chapter 3. Both total metabolite and amino acid data are presented. Methods are on pages 68-71
and discussion on pages 77.
131
Figure 1. The total metabolite data of MA0589 vs EV as volcano plots.
These volcano plots represent the non-polar (A) and polar (B) total metabolite profile for the
specified MA gene expressed in Rosetta 2.0 E. coli cells. Cells carrying the specified MA gene
were grown in MOPS minimal liquid media with glucose as the sole carbon source with100
μg/mL ampicillin and induction for 4 hours with 0.1mM IPTG. The log2 ratio of these cells is
compared to the content of an empty vector control and displayed as the x-coordinate, while the
p-value between the MA gene and the control cells is presented in a -Log10 format as the ycoordinate. All metabolite amounts were determined by GC-MS. The data from average of 6
replicates of each MA gene and the EV (empty vector) control are presented. Statistically
significant changes are located above the orange shaded bar (p-value<0.5).
132
Figure 2. The total metabolite data of MA2489 vs EV as volcano plots.
These volcano plots represent the non-polar (A) and polar (B) total metabolite profile for the
specified MA gene expressed in Rosetta 2.0 E. coli cells. Cells carrying the specified MA gene
were grown in MOPS minimal liquid media with glucose as the sole carbon source with100
μg/mL ampicillin and induction for 4 hours with 0.1mM IPTG. The log2 ratio of these cells is
compared to the content of an empty vector control and displayed as the x-coordinate, while the
p-value between the MA gene and the control cells is presented in a -Log10 format as the ycoordinate. All metabolite amounts were determined by GC-MS. The data from average of 6
replicates of each MA gene and the EV (empty vector) control are presented. Statistically
significant changes are located above the orange shaded bar (p-value<0.5).
133
Figure 3. The total metabolite data of MA2715 vs EV as volcano plots.
These volcano plots represent the non-polar (A) and polar (B) total metabolite profile for the
specified MA gene expressed in Rosetta 2.0 E. coli cells. Cells carrying the specified MA gene
were grown in MOPS minimal liquid media with glucose as the sole carbon source with100
μg/mL ampicillin and induction for 4 hours with 0.1mM IPTG. The log2 ratio of these cells is
compared to the content of an empty vector control and displayed as the x-coordinate, while the
p-value between the MA gene and the control cells is presented in a -Log10 format as the ycoordinate. All metabolite amounts were determined by GC-MS. The data from average of 6
replicates of each MA gene and the EV (empty vector) control are presented. Statistically
significant changes are located above the orange shaded bar (p-value<0.5).
134
Figure 4. The total metabolite data of MA3994 vs EV as volcano plots.
These volcano plots represent the non-polar (A) and polar (B) total metabolite profile for the
specified MA gene expressed in Rosetta 2.0 E. coli cells. Cells carrying the specified MA gene
were grown in MOPS minimal liquid media with glucose as the sole carbon source with100
μg/mL ampicillin and induction for 4 hours with 0.1mM IPTG. The log2 ratio of these cells is
compared to the content of an empty vector control and displayed as the x-coordinate, while the
p-value between the MA gene and the control cells is presented in a -Log10 format as the ycoordinate. All metabolite amounts were determined by GC-MS. The data from average of 6
replicates of each MA gene and the EV (empty vector) control are presented. Statistically
significant changes are located above the orange shaded bar (p-value<0.5).
135
Figure 5. The total metabolite data of MA2859 vs EV as volcano plots.
These volcano plots represent the non-polar (A) and polar (B) total metabolite profile for the
specified MA gene expressed in Rosetta 2.0 E. coli cells. Cells carrying the specified MA gene
were grown in MOPS minimal liquid media with glucose as the sole carbon source with100
μg/mL ampicillin and induction for 4 hours with 0.1mM IPTG. The log2 ratio of these cells is
compared to the content of an empty vector control and displayed as the x-coordinate, while the
p-value between the MA gene and the control cells is presented in a -Log10 format as the ycoordinate. All metabolite amounts were determined by GC-MS. The data from average of 6
replicates of each MA gene and the EV (empty vector) control are presented. Statistically
significant changes are located above the orange shaded bar (p-value<0.5).
136
Figure 6. The total metabolite data of MA3520 vs EV as volcano plots.
These volcano plots represent the non-polar (A) and polar (B) total metabolite profile for the
specified MA gene expressed in Rosetta 2.0 E. coli cells. Cells carrying the specified MA gene
were grown in MOPS minimal liquid media with glucose as the sole carbon source with100
μg/mL ampicillin and induction for 4 hours with 0.1mM IPTG. The log2 ratio of these cells is
compared to the content of an empty vector control and displayed as the x-coordinate, while the
p-value between the MA gene and the control cells is presented in a -Log10 format as the ycoordinate. All metabolite amounts were determined by GC-MS. The data from average of 6
replicates of each MA gene and the EV (empty vector) control are presented. Statistically
significant changes are located above the orange shaded bar (p-value<0.5).
137
Figure 7. The total metabolite data of MA0589 vs EV as volcano plots.
These volcano plots represent the non-polar (A) and polar (B) total metabolite profile for the
specified MA gene expressed in Rosetta 2.0 E. coli cells. Cells carrying the specified MA gene
were grown in MOPS minimal liquid media with glucose as the sole carbon source with100
μg/mL ampicillin and induction for 4 hours with 0.1mM IPTG. The log2 ratio of these cells is
compared to the content of an empty vector control and displayed as the x-coordinate, while the
p-value between the MA gene and the control cells is presented in a -Log10 format as the ycoordinate. All metabolite amounts were determined by GC-MS. The data from average of 6
replicates of each MA gene and the EV (empty vector) control are presented. Statistically
significant changes are located above the orange shaded bar (p-value<0.5).
138
Figure 8. The total metabolite data of MA0674 vs EV as volcano plots.
These volcano plots represent the non-polar (A) and polar (B) total metabolite profile for the
specified MA gene expressed in Rosetta 2.0 E. coli cells. Cells carrying the specified MA gene
were grown in MOPS minimal liquid media with glucose as the sole carbon source with100
μg/mL ampicillin and induction for 4 hours with 0.1mM IPTG. The log2 ratio of these cells is
compared to the content of an empty vector control and displayed as the x-coordinate, while the
p-value between the MA gene and the control cells is presented in a -Log10 format as the ycoordinate. All metabolite amounts were determined by GC-MS. The data from average of 6
replicates of each MA gene and the EV (empty vector) control are presented. Statistically
significant changes are located above the orange shaded bar (p-value<0.5).
139
Figure 9. The total metabolite data of MA4460 vs EV as volcano plots.
These volcano plots represent the non-polar (A) and polar (B) total metabolite profile for the
specified MA gene expressed in Rosetta 2.0 E. coli cells. Cells carrying the specified MA gene
were grown in MOPS minimal liquid media with glucose as the sole carbon source with100
μg/mL ampicillin and induction for 4 hours with 0.1mM IPTG. The log2 ratio of these cells is
compared to the content of an empty vector control and displayed as the x-coordinate, while the
p-value between the MA gene and the control cells is presented in a -Log10 format as the ycoordinate. All metabolite amounts were determined by GC-MS. The data from average of 6
replicates of each MA gene and the EV (empty vector) control are presented. Statistically
significant changes are located above the orange shaded bar (p-value<0.5).
140
Figure 11. The total metabolite data of MA3751 vs EV as volcano plots.
These volcano plots represent the non-polar (A) and polar (B) total metabolite profile for the
specified MA gene expressed in Rosetta 2.0 E. coli cells. Cells carrying the specified MA gene
were grown in MOPS minimal liquid media with glucose as the sole carbon source with100
μg/mL ampicillin and induction for 4 hours with 0.1mM IPTG. The log2 ratio of these cells is
compared to the content of an empty vector control and displayed as the x-coordinate, while the
p-value between the MA gene and the control cells is presented in a -Log10 format as the ycoordinate. All metabolite amounts were determined by GC-MS. The data from average of 6
replicates of each MA gene and the EV (empty vector) control are presented. Statistically
significant changes are located above the orange shaded bar (p-value<0.5).
141
Figure 12. The total metabolite data of MA589 vs EV as volcano plots.
These volcano plots represent the non-polar (A) and polar (B) total metabolite profile for the
specified MA gene expressed in Rosetta 2.0 E. coli cells. Cells carrying the specified MA gene
were grown in MOPS minimal liquid media with glucose as the sole carbon source with100
μg/mL ampicillin and induction for 4 hours with 0.1mM IPTG. The log2 ratio of these cells is
compared to the content of an empty vector control and displayed as the x-coordinate, while the
p-value between the MA gene and the control cells is presented in a -Log10 format as the ycoordinate. All metabolite amounts were determined by GC-MS. The data from average of 6
replicates of each MA gene and the EV (empty vector) control are presented. Statistically
significant changes are located above the orange shaded bar (p-value<0.5).
142
Figure 13. The total metabolite data of MA4459 vs EV as volcano plots.
These volcano plots represent the non-polar (A) and polar (B) total metabolite profile for the
specified MA gene expressed in Rosetta 2.0 E. coli cells. Cells carrying the specified MA gene
were grown in MOPS minimal liquid media with glucose as the sole carbon source with100
μg/mL ampicillin and induction for 4 hours with 0.1mM IPTG. The log2 ratio of these cells is
compared to the content of an empty vector control and displayed as the x-coordinate, while the
p-value between the MA gene and the control cells is presented in a -Log10 format as the ycoordinate. All metabolite amounts were determined by GC-MS. The data from average of 6
replicates of each MA gene and the EV (empty vector) control are presented. Statistically
significant changes are located above the orange shaded bar (p-value<0.5).
143
Figure 14. The total metabolite data of MA3734 vs EV as volcano plots.
These volcano plots represent the non-polar (A) and polar (B) total metabolite profile for the
specified MA gene expressed in Rosetta 2.0 E. coli cells. Cells carrying the specified MA gene
were grown in MOPS minimal liquid media with glucose as the sole carbon source with100
μg/mL ampicillin and induction for 4 hours with 0.1mM IPTG. The log2 ratio of these cells is
compared to the content of an empty vector control and displayed as the x-coordinate, while the
p-value between the MA gene and the control cells is presented in a -Log10 format as the ycoordinate. All metabolite amounts were determined by GC-MS. The data from average of 6
replicates of each MA gene and the EV (empty vector) control are presented. Statistically
significant changes are located above the orange shaded bar (p-value<0.5).
144
Figure 15. The total metabolite data of MA1491 vs EV as volcano plots.
These volcano plots represent the non-polar (A) and polar (B) total metabolite profile for the
specified MA gene expressed in Rosetta 2.0 E. coli cells. Cells carrying the specified MA gene
were grown in MOPS minimal liquid media with glucose as the sole carbon source with100
μg/mL ampicillin and induction for 4 hours with 0.1mM IPTG. The log2 ratio of these cells is
compared to the content of an empty vector control and displayed as the x-coordinate, while the
p-value between the MA gene and the control cells is presented in a -Log10 format as the ycoordinate. All metabolite amounts were determined by GC-MS. The data from average of 6
replicates of each MA gene and the EV (empty vector) control are presented. Statistically
significant changes are located above the orange shaded bar (p-value<0.5).
145
Figure 16. The total metabolite data of MA3342 vs EV as volcano plots.
These volcano plots represent the non-polar (A) and polar (B) total metabolite profile for the
specified MA gene expressed in Rosetta 2.0 E. coli cells. Cells carrying the specified MA gene
were grown in MOPS minimal liquid media with glucose as the sole carbon source with100
μg/mL ampicillin and induction for 4 hours with 0.1mM IPTG. The log2 ratio of these cells is
compared to the content of an empty vector control and displayed as the x-coordinate, while the
p-value between the MA gene and the control cells is presented in a -Log10 format as the ycoordinate. All metabolite amounts were determined by GC-MS. The data from average of 6
replicates of each MA gene and the EV (empty vector) control are presented. Statistically
significant changes are located above the orange shaded bar (p-value<0.5).
146
Figure 17. The total metabolite data of MA3994 vs EV as volcano plots.
These volcano plots represent the non-polar (A) and polar (B) total metabolite profile for the
specified MA gene expressed in Rosetta 2.0 E. coli cells. Cells carrying the specified MA gene
were grown in MOPS minimal liquid media with glucose as the sole carbon source with100
μg/mL ampicillin and induction for 4 hours with 0.1mM IPTG. The log2 ratio of these cells is
compared to the content of an empty vector control and displayed as the x-coordinate, while the
p-value between the MA gene and the control cells is presented in a -Log10 format as the ycoordinate. All metabolite amounts were determined by GC-MS. The data from average of 6
replicates of each MA gene and the EV (empty vector) control are presented. Statistically
significant changes are located above the orange shaded bar (p-value<0.5).
147
Figure 18. The total metabolite data of MA0202 vs EV as volcano plots.
These volcano plots represent the non-polar (A) and polar (B) total metabolite profile for the
specified MA gene expressed in Rosetta 2.0 E. coli cells. Cells carrying the specified MA gene
were grown in MOPS minimal liquid media with glucose as the sole carbon source with100
μg/mL ampicillin and induction for 4 hours with 0.1mM IPTG. The log2 ratio of these cells is
compared to the content of an empty vector control and displayed as the x-coordinate, while the
p-value between the MA gene and the control cells is presented in a -Log10 format as the ycoordinate. All metabolite amounts were determined by GC-MS. The data from average of 6
replicates of each MA gene and the EV (empty vector) control are presented. Statistically
significant changes are located above the orange shaded bar (p-value<0.5).
148
Figure 19. The total metabolite data of MA4460 vs EV as volcano plots.
These volcano plots represent the non-polar (A) and polar (B) total metabolite profile for the
specified MA gene expressed in Rosetta 2.0 E. coli cells. Cells carrying the specified MA gene
were grown in MOPS minimal liquid media with glucose as the sole carbon source with100
μg/mL ampicillin and induction for 4 hours with 0.1mM IPTG. The log2 ratio of these cells is
compared to the content of an empty vector control and displayed as the x-coordinate, while the
p-value between the MA gene and the control cells is presented in a -Log10 format as the ycoordinate. All metabolite amounts were determined by GC-MS. The data from average of 6
replicates of each MA gene and the EV (empty vector) control are presented. Statistically
significant changes are located above the orange shaded bar (p-value<0.5).
149
Figure 20. The total metabolite data of MA0677 vs EV as volcano plots.
These volcano plots represent the non-polar (A) and polar (B) total metabolite profile for the
specified MA gene expressed in Rosetta 2.0 E. coli cells. Cells carrying the specified MA gene
were grown in MOPS minimal liquid media with glucose as the sole carbon source with100
μg/mL ampicillin and induction for 4 hours with 0.1mM IPTG. The log2 ratio of these cells is
compared to the content of an empty vector control and displayed as the x-coordinate, while the
p-value between the MA gene and the control cells is presented in a -Log10 format as the ycoordinate. All metabolite amounts were determined by GC-MS. The data from average of 6
replicates of each MA gene and the EV (empty vector) control are presented. Statistically
significant changes are located above the orange shaded bar (p-value<0.5).
150
Figure 21. The total metabolite data of MA1272 vs EV as volcano plots.
These volcano plots represent the non-polar (A) and polar (B) total metabolite profile for the
specified MA gene expressed in Rosetta 2.0 E. coli cells. Cells carrying the specified MA gene
were grown in MOPS minimal liquid media with glucose as the sole carbon source with100
μg/mL ampicillin and induction for 4 hours with 0.1mM IPTG. The log2 ratio of these cells is
compared to the content of an empty vector control and displayed as the x-coordinate, while the
p-value between the MA gene and the control cells is presented in a -Log10 format as the ycoordinate. All metabolite amounts were determined by GC-MS. The data from average of 6
replicates of each MA gene and the EV (empty vector) control are presented. Statistically
significant changes are located above the orange shaded bar (p-value<0.5).
151
Figure 22. The total metabolite data of MA#### vs EV as volcano plots.
These volcano plots represent the non-polar (A) and polar (B) total metabolite profile for the
specified MA gene expressed in Rosetta 2.0 E. coli cells. Cells carrying the specified MA gene
were grown in MOPS minimal liquid media with glucose as the sole carbon source with100
μg/mL ampicillin and induction for 4 hours with 0.1mM IPTG. The log2 ratio of these cells is
compared to the content of an empty vector control and displayed as the x-coordinate, while the
p-value between the MA gene and the control cells is presented in a -Log10 format as the ycoordinate. All metabolite amounts were determined by GC-MS. The data from average of 6
replicates of each MA gene and the EV (empty vector) control are presented. Statistically
significant changes are located above the orange shaded bar (p-value<0.5).
152
Figure 23. The total metabolite data of MA0106 vs EV as volcano plots.
These volcano plots represent the non-polar (A) and polar (B) total metabolite profile for the specified
MA gene expressed in Rosetta 2.0 E. coli cells. Cells carrying the specified MA gene were grown in
MOPS minimal liquid media with glucose as the sole carbon source with100 μg/mL ampicillin and
induction for 4 hours with 0.1mM IPTG. The log2 ratio of these cells is compared to the content of an
empty vector control and displayed as the x-coordinate, while the p-value between the MA gene and the
control cells is presented in a -Log10 format as the y-coordinate. All metabolite amounts were
determined by GC-MS. The data from average of 6 replicates of each MA gene and the EV (empty
vector) control are presented. Statistically significant changes are located above the orange shaded bar.
153
Table 1. Amino acid abbreviations
Compound Name
α-aminobutyric acid
Allo-Isoleucine
L-Alanine
Alpha-aminopimelic acid_P
L-Asparagine
L-Aspartic acid
Beta-Aminoisobutyric acid
L-Cystine
Cystathionine
Glutamine_P
L-Glutamic acid
Glycine
Glycine-Proline
L-Histidine
Hydroxylysine
4-Hydroxyproline
L-Isoleucine
L-Leucine
L-Lysine
L-Methionine
L-Ornithine
L-Phenylalanine
Proline-hydroxyproline
L-Proline
Thioproline
Sarcosine
L-Serine
L-Threonine
L-Tryptophan
L-Tyrosine
L-alpha-Aminoadipic acid
L-valine
Abbreviation
AABA
aILE
ALA
APA
ASN
ASP
BAIB
C-C
CHT
GLN
GLU
GLY
GPR
HIS
HYL
HYP
ILE
LEU
LYS
MET
ORN
PHE
PHP
PRO
PRS
SAR
SER
THR
TRP
TYR
UN1
VAL
154
Figure 24. The amino acid quantification data of MA0202, MA2489, MA4264 vs EV as
volcano plots.
This volcano plot represents the amino acid profiles for the specified MA genes expressed in
Rosetta 2.0 E. coli cells. Cells carrying the specified MA genes were grown in MOPS minimal
liquid media with glucose as the sole carbon source with100 μg/mL ampicillin and induction for
4 hours with 0.1mM IPTG. The log2 ratio of these cells is compared to the content of an empty
vector control and displayed as the x-coordinate, while the p-value between the MA gene and the
control cells is presented in a -Log10 format as the y-coordinate. All metabolite amounts were
determined by GC-MS. The data from average of 6 replicates of each MA gene and the EV
(empty vector) control are presented. Statistically significant changes are located above the
orange shaded bar (p-value<0.5).
155
Figure 25. The amino acid quantification data of MA3520, MA3994, MA0246, and
MA4460 vs EV as volcano plots.
This volcano plot represents the amino acid profiles for the specified MA genes expressed in
Rosetta 2.0 E. coli cells. Cells carrying the specified MA genes were grown in MOPS minimal
liquid media with glucose as the sole carbon source with100 μg/mL ampicillin and induction for
4 hours with 0.1mM IPTG. The log2 ratio of these cells is compared to the content of an empty
vector control and displayed as the x-coordinate, while the p-value between the MA gene and the
control cells is presented in a -Log10 format as the y-coordinate. All metabolite amounts were
determined by GC-MS. The data from average of 6 replicates of each MA gene and the EV
(empty vector) control are presented. Statistically significant changes are located above the
orange shaded bar (p-value<0.5).
156
Figure 26. The amino acid quantification data of MA3751, MA4459, and MA2715 vs EV as
volcano plots.
This volcano plot represents the amino acid profiles for the specified MA genes expressed in
Rosetta 2.0 E. coli cells. Cells carrying the specified MA genes were grown in MOPS minimal
liquid media with glucose as the sole carbon source with100 μg/mL ampicillin and induction for
4 hours with 0.1mM IPTG. The log2 ratio of these cells is compared to the content of an empty
vector control and displayed as the x-coordinate, while the p-value between the MA gene and the
control cells is presented in a -Log10 format as the y-coordinate. All metabolite amounts were
determined by GC-MS. The data from average of 6 replicates of each MA gene and the EV
(empty vector) control are presented. Statistically significant changes are located above the
orange shaded bar (p-value<0.5).
157
Figure 27. The amino acid quantification data of MA2489, MA3734, and MA1491, and
MA0589 vs EV as volcano plots.
This volcano plot represents the amino acid profiles for the specified MA genes expressed in
Rosetta 2.0 E. coli cells. Cells carrying the specified MA genes were grown in MOPS minimal
liquid media with glucose as the sole carbon source with100 μg/mL ampicillin and induction for
4 hours with 0.1mM IPTG. The log2 ratio of these cells is compared to the content of an empty
vector control and displayed as the x-coordinate, while the p-value between the MA gene and the
control cells is presented in a -Log10 format as the y-coordinate. All metabolite amounts were
determined by GC-MS. The data from average of 6 replicates of each MA gene and the EV
(empty vector) control are presented. Statistically significant changes are located above the
orange shaded bar (p-value<0.5).
158
Figure 28. The amino acid quantification data of MA3242, MA1272, and MA3342, and
MA0202 vs EV as volcano plots.
This volcano plot represents the amino acid profiles for the specified MA genes expressed in
Rosetta 2.0 E. coli cells. Cells carrying the specified MA genes were grown in MOPS minimal
liquid media with glucose as the sole carbon source with100 μg/mL ampicillin and induction for
4 hours with 0.1mM IPTG. The log2 ratio of these cells is compared to the content of an empty
vector control and displayed as the x-coordinate, while the p-value between the MA gene and the
control cells is presented in a -Log10 format as the y-coordinate. All metabolite amounts were
determined by GC-MS. The data from average of 6 replicates of each MA gene and the EV
(empty vector) control are presented. Statistically significant changes are located above the
orange shaded bar (p-value<0.5).
159
Figure 29. The amino acid quantification data of MA0677, MA4460, and MA1615, and
MA0106 vs EV as volcano plots.
This volcano plot represents the amino acid profiles for the specified MA genes expressed in
Rosetta 2.0 E. coli cells. Cells carrying the specified MA genes were grown in MOPS minimal
liquid media with glucose as the sole carbon source with100 μg/mL ampicillin and induction for
4 hours with 0.1mM IPTG. The log2 ratio of these cells is compared to the content of an empty
vector control and displayed as the x-coordinate, while the p-value between the MA gene and the
control cells is presented in a -Log10 format as the y-coordinate. All metabolite amounts were
determined by GC-MS. The data from average of 6 replicates of each MA gene and the EV
(empty vector) control are presented. Statistically significant changes are located above the
orange shaded bar (p-value<0.5).