Download Bioinformatics - Oxford Academic

Book reviews

research scientist wishing to become acquainted with the field.

Simon Dear
Director of Bioinformatics Engineering
GlaxoSmithKline
Gunnels Wood Road
Stevenage, UK

References
1. Letovsky, S. (Ed.) (1999), 'Bioinformatics: Databases and Systems', Kluwer Academic, Dordrecht.

Bioinformatics: A Practical Guide to the Analysis of Genes and Proteins (Methods of Biochemical Analysis, 43)
Andreas D. Baxevanis and B. F. Francis Ouellette (Editors)
2nd Edn; John Wiley & Sons, New York; 2001; ISBN: 0 471 38390 2; 470pp.
US$69.95 (pbk) US$164.95 (hbk)

As yet another indication that bioinformatics has come of age, this is the first ever second edition of a textbook in the field. And it is an excellent general bioinformatics text and reference, perhaps even the best currently available. A check with Amazon revealed that there have been many sales of the book in Maryland, particularly Bethesda, MD. This sales blip is not so surprising given that it would be only slightly unfair to subtitle the book 'a guide to resources at the NCBI'. After an introduction to bioinformatics and the Internet, the next three chapters cover the NCBI data model, the NCBI's GenBank sequence database and Sequin as a method for submitting new data to GenBank at the NCBI. A whole chapter is given over to NCBI's Entrez database interrogation software with specific examples, a table of Boolean syntax, and many screen-shots. On the other hand, the existence of EMBL and the EBI, and the DNA Data Bank of Japan are acknowledged but only in the most peripheral manner. SRS, which many would argue does everything that Entrez does but does it more comprehensively, is not mentioned at all. Do not let all this deter you from appreciating the book, however; it is just a mildly chauvinistic Not Invented Here idiosyncrasy.
Far more important than such details as how to drive the Entrez data-mining backhoe is the fact that this book covers all the topics that, most of us would agree, make up bioinformatics in the twenty-first century. If your favourite method or program is not dealt with here, then a suitable, perhaps even better, equivalent will be. There is coverage of pre-genomic era approaches, including a chapter on two-sequence alignment, substitution matrices, dotplots and homology searching. The treatment is more than mere mechanics and makes helpful suggestions about how to separate meaningful hits from false positives and statistical artefacts. This is followed by an exposition of protein multiple sequence alignments by Geoff Barton. It is a bit locked into software written by Geoff Barton but nevertheless points out some useful general approaches and identifies some common potential pitfalls. These issues are extended and complemented by a chapter on phylogenetic analysis by Brinkman and Leipe. This latter chapter, in contrast with the rest of the book which is commendably up to date, cites no work more recent than 1997 and I cannot believe that nothing new has happened in this field in the last four years. Nevertheless, it is a good introduction to phylogeny, substitution models and tree evaluation and includes a gallop through available software. Three-dimensional structure, prediction, databases and visualisation software are also well dealt with in two separate chapters.

It is a tribute to the modernity of the book that a large proportion of it is devoted to big sequence and genomic era problems, approaches and solutions. Indeed, two chapters and 50 pages deal specifically with comparative and large-scale genome analysis.

© HENRY STEWART PUBLICATIONS 1467-5463. BRIEFINGS IN BIOINFORMATICS. VOL 2. NO 4. 405–410. DECEMBER 2001
The first of these chapters deals with organism-specific databases and shows how clusters of orthologous genes (COGs) and other resources can be used to elucidate metabolic pathways. The large-scale genome analysis chapter deals more with issues of expression level, primarily serial analysis of gene expression (SAGE) methods. For those contemplating a large-scale sequencing project, there is a short but intense chapter on sequence assembly using Staden's Gap4 software. Maps and mapping databases have their own chapter, with a specific section called 'Complexities and pitfalls'. Baxevanis considers the multiplicity of programs to parse genes out of genomic sequence and recommends a protocol for integrating them into an effective strategy. Wolfsberg and Landsman have written a nice exposition of expressed sequence tags, their clustering, and their relevance to gene prediction, gene expression and genetic variability. This chapter has a particularly large problem set, which gives a good flavour of the kind of questions that are possible and productive. Indeed, almost all the chapters end with a problem set, the answers to which reside on the publisher's web site, which also hosts hotlinks to all the WWW resources cited in each chapter and most, if not all, the figures.

The most surprising element of the book was the final chapter, which offers a primer on Perl programming as a means to solving a genomics problem typical of what bioinformatics now comprehends. Instead of using a package such as GCG or Genejockey to analyse one or a few genes, molecular biologists may well want to abstract specific information about, say, all 19,000 genes from Caenorhabditis elegans. The data deluge has resulted in a paradigm shift. There is little agreement about what sorts of questions or analyses are appropriate for a recently sequenced genome.
The data cowboys are riding out in all directions across the information prairie, whooping and sorting, corralling ideas and branding new ways of looking at the biological world. There are no standards, no packages out there. Perl is here offered as the appropriate tool for empowering open-ended user-driven curiosity. Coincidentally, Gibas and Jambeck's book makes the same judgement. I have extensive experience of computing as a foreign language: I've taught myself Fortran and Basic and suffered formal courses in PL/1, Pascal and C. I think that Perl is a far more accessible option than any of those languages for a general programming language. I suspect, however, that my computer-anxious but thoughtful and curious Head of Department will wonder why 'they' cannot write a plain text biological interrogation language: more like Cobol than Corba (or Perl or even BioPerl) please.

So it's congratulations to the authors, editors and publisher for producing a weighty, authoritative, readable and attractive book. The colour plate idea, which must add to production costs, is held over from the previous edition but is now largely redundant because the book is so well integrated with the web. It's so good that it will sell many copies, even if graduate students have to go without food to afford it.

Andrew Lloyd
INCBI, the Irish EMBnet Node

Post-genome Informatics
Minoru Kanehisa
Oxford University Press, Oxford; 2000; ISBN 0 19 850327 X (hbk), 0 19 850326 1 (pbk); 148pp; US$35.00 £19.95 (pbk)

This book can be defined as a treatise on computational biology, while most books currently available on this subject can be considered as handbooks. The existence of a treatise on bioinformatics suggests that, owing to genome sequencing projects and all the computational aspects involved in
