Download Baker - International School of Crystallography
New Drug Targets from Mycobacterium tuberculosis: Strategies, Progress and Pitfalls from a Structural Genomics Enterprise

Ted Baker
School of Biological Sciences
University of Auckland
New Zealand
On behalf of TB Structural Genomics Consortium

The challenge posed by complete genome sequences

The Mycobacterium tuberculosis genome
- Approx. 3900 open reading frames (ORFs)
- ~60% of gene products have an inferred function (mostly by homology)
- ~25% are “conserved hypotheticals”
- ~15% are “unknowns”
- ~30% can be related to proteins of known 3D structure
- but only ~25 TB protein structures
- Many metabolic pathways appear incomplete

Function from structure?
Relationships that are hidden at the sequence level
- SpeB: virulence factor from S. pyogenes
- Actinidin: plant cysteine protease
- < 10% sequence identity

Structural Genomics
The use of genomic information to guide protein structure discovery
- and its inverse
The use of protein structure analysis to add value to genomic sequence data, to deduce function
- Reversal of the ‘traditional’ direction of structural analysis
- Many targets: whole genomes, pathways, functional classes, folds

Beginnings... ~1998
A pilot pilot programme: Pyrobaculum aerophilum
Using laboratory-scale approaches
- PCR cloning
- Expression in E. coli, cleavable affinity tags
- Variation of expression temperature
- Purification by affinity chromatography and gel filtration
Genomic approach: most tractable first

Results: P. aerophilum

Stage          Count
Cloned         25 (274)
Expressed      20 (168)
Soluble        12 (80)
Purified       12 (43)
Crystallized   6 (24)
Structures     4 (11)

Main bottlenecks
- solubility
- crystallization

Pa_989 (TB homologue)
HisF (imidazoleglycerol phosphate synthase)
Banfield et al. Acta Cryst. D (2001)

Pa_2307 (unknown)
- ‘Ancient conserved domain’ found in bacteria and archaea
- No functional annotation
- Reproducible crystals with Li2SO4, but twinned
- Two crystals grown from PEG/phosphate
- 1.5 Å native data from one, SAD data from Pt(NO2)4 derivative of the other (used gel shift)
- Structure solved: SAD/Solve/Resolve/ARP

Pa_2307

The next phase: larger enterprises
Publicly funded
- NIH Protein Structure Initiative (USA)
- Initiatives in Japan, Germany, UK, France, Canada
Biotech companies
- Structural Genomix, Syrrx

NIH Protein Structure Initiative
10 groups (consortia) funded
Aim to develop methods and tools for “high throughput” structure determination
Goals primarily structural
- representative structures for all protein sequence families
- discover novel folds (cover “fold space”)
- estimate 10,000 structures needed
But evolving

Mycobacterium tuberculosis
Causative agent of TB
One-third of world’s population affected
- approximately 3 million deaths annually
Five front-line drugs (isoniazid, pyrazinamide, ethambutol, rifampin, streptomycin) but...
- effective only against actively-growing bacteria
- very long treatment regime (6-9 months)
- resistance rising
- need for new drugs

Peculiarities of the organism
Very slow-growing Gram-positive organism
Complex waxy cell wall: outer layer rich in unusual lipids, glycolipids, polysaccharides
Novel biosynthetic pathways
Complex lifestyle
- persistence
- enters dormant state within active macrophages
- survives through switches in metabolism
- can be reactivated years later

Led in United States by:
- Tom Terwilliger (Los Alamos NL)
- David Eisenberg (UCLA)
- Jim Sacchettini (Texas A&M)
- Bill Jacobs (Albert Einstein Coll. of Med.)
- Tom Alber (UC Berkeley) ...
and many others

Aims are focused on function:
- understanding TB biology
- discovery and structural analysis of novel drug targets
http://www.doe-mbi.ucla.edu/TB/

Philosophy and policies
Open participation
- to all with an interest in TB
Operates as a wider consortium of >30 participating labs in 13 countries worldwide
Collaboration between structural biologists, TB biologists, chemists ...
Commitment to common policies
- collaboration and cooperation
- shared database for logging progress
- sharing of data and materials
- structures to be placed in public domain

Operational aspects
Central facilities for
- bioinformatic analysis and data storage
- protein expression and evolution
- crystallization
- synchrotron data collection
- gene knockouts
Technologies and facilities available to all
Individuals choose their own targets according to their own interests, and assign priorities
Targeting scores determine priorities of facilities
Parallel efforts in individual labs

Progress to date
Most of the structural results to date come from efforts in individual labs
But
- availability of high-throughput facilities gives flexible options for individual labs and for efforts in the facilities
Within facilities: 688 genes cloned (out of 720 targeted to date)
First phase: concentrate on soluble proteins
Next phase: the insoluble proteins

Dealing with insoluble proteins
GFP fusions as reporter of solubility (G. Waldo)

Folding Reporter - GFP
• Function of R (GFP) depends on solubility of X-L-R.
• Solubility of X-L-R depends on X.
Express fusion protein X-L-R
[Figure: X-L-R fusion, labelled N, C, L, X. Insoluble: non-functional R. Soluble: detect function of R.]
[Figure: X-L-GFP fusion fluorescence (cell colonies; in vitro transcription + translation) shown with SDS-PAGE of X (non-fusion), soluble fraction and pellet fraction.]

Using GFP-fusions to engineer proteins for solubility
Insoluble protein
- Mutate gene
- FORWARD EVOLUTION: recombine optima, clone, select
- BACKCROSSING: recombine optima and wild type, clone, select
Soluble protein
G. Waldo

Solubilisation by evolution
Rv2002 (Se Won Suh)
- Putative ketoacyl ACP reductase
- Rendered soluble by 3 random mutations
- I6T and T69K mutations are on the molecular surface
- V47M mutation enhances a semi-exposed hydrophobic contact

Potential new TB drug targets
Early results from the TB Structural Genomics Consortium

Target ORF Selection in Mycobacterium tuberculosis
Selection of ORFs: (a) potential drug targets and (b) to understand TB biology
- Biosynthetic enzymes for essential amino acids, cofactors, lipids, polysaccharides
- Secreted proteins
- Proteins implicated in antibiotic resistance or response
- Proteins implicated in persistence

1. Cell wall biosynthesis - mycolic acids (Sacchettini lab)
- Long chain branched lipids that form the dense waxy outer layer of the mycobacterial cell wall
- Contribute to its impenetrability
- Implicated in both virulence and persistence
- Either covalently attached to cell wall or released as trehalose dimycolate (“cord factor”)
- Modification of mycolic acids, e.g. cyclopropanation, varies between pathogenic and non-pathogenic species

Cyclopropanation of mycolic acid chains
- Cyclopropane groups introduced by methylation
- Three cyclopropane synthases (C. Smith, J. Sacchettini, Texas A&M): CmaA1, CmaA2, PcaA

2. Secreted proteins (Eisenberg lab)
Secreted proteins are attractive drug targets for M. tuberculosis because:
- Often determinants of virulence or persistence
  - involved in cell wall modification
  - role in survival in macrophages
- M. tuberculosis secretes a large number of proteins
- Cell wall is impermeable to many antibacterial agents

Secreted proteins (C. Goulding, D. Anderson, H. Gill, D. Eisenberg, UCLA)
- Rv2220: Glutamine synthetase. Synthesis of poly-(L-Glu-L-Gln) for cell wall
- Rv1926c: Unknown, resembles cell surface binding proteins (invasin, adaptin, arrestin)
- Rv1886c: Antigen 85B, mycolyl transferase

3. Targets against persistence (Sacchettini lab)
- Persistence within activated macrophages facilitated by switch in metabolism
- Glycolysis downregulated; instead the glyoxylate shunt allows use of C2 substrates generated by β-oxidation of fatty acids
- Enzymes isocitrate lyase and malate synthase are drug targets for persistent bacteria

Glyoxylate shunt enzymes (V. Sharma, J. Sacchettini, Texas A&M)
- Rv0867: Isocitrate lyase
- Rv1837c: Malate synthase

4. Antibiotic resistance - Isoniazid response genes
- DNA microarray analysis of TB ORFs upregulated by exposure to isoniazid
- Some code for proteins of known function (cell wall biosynthesis)
- Others represent ‘unknowns’
- The proteins encoded by these ORFs may represent the bacterial response to the toxic effects of the antibiotic
Wilson et al., PNAS 96:12833-12838 (1999)

Putative INH response operon
- Four ORFs appear to make up part of a putative operon in the TB genome: Rv0340, Rv0341, Rv0342, Rv0343
- [Operon diagram: Rv0340, Rv0341, Rv0342, Rv0343]
- None of the four ORFs have detectable sequence homologues in other organisms
- Rv0340 and Rv0341 are paralogues, as are Rv0342 and Rv0343
- Same genes also upregulated by ethambutol
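
The putative operon above is suggested by four co-regulated ORFs with consecutive locus numbers. As a rough illustration only (this is not the method of Wilson et al. or of the consortium), the Python sketch below groups a list of upregulated locus tags into runs of adjacent genes. It assumes tags of the form Rv#### with an optional trailing "c" for the complementary strand; the names parse_locus and adjacent_runs are hypothetical. Locus-number adjacency is only a crude proxy: intergenic distance and expression data are ignored here.

```python
import re
from typing import List, Tuple

# Assumed tag format: "Rv", four digits, optional "c" (complementary strand).
LOCUS = re.compile(r"^Rv(\d{4})(c?)$")


def parse_locus(tag: str) -> Tuple[int, bool]:
    """Return (locus number, is_complementary_strand) for a tag like 'Rv1837c'."""
    match = LOCUS.match(tag)
    if match is None:
        raise ValueError(f"unrecognised locus tag: {tag!r}")
    return int(match.group(1)), match.group(2) == "c"


def adjacent_runs(tags: List[str], min_len: int = 2) -> List[List[str]]:
    """Group tags into runs of consecutive locus numbers on the same strand."""
    # Deduplicate and sort by locus number.
    parsed = sorted({parse_locus(t) + (t,) for t in tags})
    runs: List[List[str]] = []
    current: List[Tuple[int, bool, str]] = []
    for number, comp, tag in parsed:
        if current and number == current[-1][0] + 1 and comp == current[-1][1]:
            current.append((number, comp, tag))  # extends the current run
        else:
            if len(current) >= min_len:
                runs.append([t for _, _, t in current])
            current = [(number, comp, tag)]  # start a new run
    if len(current) >= min_len:
        runs.append([t for _, _, t in current])
    return runs


# The four ORFs named on the slide form a single run of adjacent loci.
print(adjacent_runs(["Rv0342", "Rv0340", "Rv0343", "Rv0341"]))
# → [['Rv0340', 'Rv0341', 'Rv0342', 'Rv0343']]
```

A run found this way is only a candidate for closer inspection.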

Isoniazid response: Rv0340
Moyra Komen, Vic Arcus, Shaun Lott
- Crystallization attempts: oil, spherulites
- NMR: shows only partially folded
- Limited proteolysis: gives N-terminal fragment with excellent NMR spectrum

NMR spectrum: Rv0340 (residues 1-131)
- Indicates helical bundle with flexible tail
- Possible homology with acyl carrier protein
- Gives putative role in cell wall biosynthesis

Problems of partial or incorrect functional annotation
Rv1347c
- Widespread in bacteria, but not eukaryotes
- No clearly indicated function
- Closest sequence homologs:
  - malonyl CoA decarboxylase
  - siderophore biosynthesis
  - aminoglycoside acetyltransferase
- No structure prediction

Rv1347c structure (Graeme Card)
[Figure: Rv1347c shown with Acetyl-CoA dependent aminoglycoside acetyltransferase (11% identity)]

Rv1347c
- Aminoglycoside N-acetyl transferase (GCN5 family)
- ~11% sequence identity

Problem of partial or incorrect functional annotations
Rv3853 - “menG”
- Putative SAM-dependent methyltransferase catalysing final step in menaquinone biosynthesis
- Potential drug target: menaquinone pathway is essential and is not present in humans
- Genome also includes ubiE (Rv0558), which catalyses this step in both menaquinone and ubiquinone biosynthesis (menG is specific for menaquinone)
- Expressed, refolded, crystallized, solved to 1.9 Å by SIRAS

Common methyltransferase fold

MenG structure (Jodie Johnston)
- Structure does not look like a methyltransferase
- Resembles a phosphate transfer domain?
- Incorrect annotation

Challenges for the future
- Membrane proteins
- Solubility of expressed proteins
- Hetero-oligomeric proteins
- Protein-protein interactions
- Assignment of function to “unknowns”
- Cellular pathways
  - metabolic pathways
  - signalling pathways

Conclusions
Structural biology is being transformed by new technologies, some driven by genomics
Less effort in solving initial structures, more emphasis on “downstream” studies
TB structural genomics consortium: a different model for large scale structure determination
- access to centralised facilities
- international effort on a common goal
- collaboration rather than competition
- opportunities for smaller labs

Thanks
Mycobacterium tuberculosis structural genomics consortium
Members of Auckland Structural Biology Laboratory: Vic Arcus, Kristina Backbro, Mark Banfield, Heather Baker, Graeme Card, Jodie Johnston, Rainer Knijff, Moyra Komen, Shaun Lott, Andrew McCarthy, Clyde Smith
Marsden Fund
Health Research Council
New Economy Research Fund