The Microsoft Biology Foundation
and its Applications
Simon Mercer
Director for Health & Wellbeing
Microsoft External Research
Ontology Add-in for Word
Services: Ontology
download web service
• John Wilbanks
Intent: Term recognition
& disambiguation
• Phil Bourne
• Lynn Fink
Ontology browser
Source code and binary:
Binary and source code:
3D Molecule Viewer
•PDB File Viewer
•Written in C# using WPF
Binary and source code:
The Trident Scientific Workflow Workbench
A visual workflow environment that allows researchers to better manage, evaluate
and interact with even the most complex scientific datasets
Built on top of Windows
Workflow Foundation
Write once, deploy and
run anywhere…
Visually program
Libraries of activities and
Automatic provenance
Available at:
Origins of a Platform
Previous bioinformatics project outputs
Jaroslav Pillardy, Computational Biology Service Unit, Cornell University
BioHPC: Suite of 28 applications modified and adapted for efficient use in a
Windows HPC environment with ASP.NET interface
Currently supports the areas of DNA sequence analysis, protein structure
prediction, population genetics and phylogenetics
Jim Hogan, SilverMap: Queensland University of Technology
MQUTer supports research into bioinformatics, sensor networks, visualization
and parallelism on the Microsoft platform
Six new tools; the latest, under development using MBF and Silverlight 3,
visualizes DNA sequence similarity and is integrated into MBF (and will shortly
be available as an Excel plug-in)
Robin Gutell, Center for Computational Biology and Bioinf., UT Austin
Suite of tools to explore evolutionary relationships and predict function of RNA
Available as a website – also a complementary open-source suite of Windows-based tools, under development using MBF (H1 FY11)
+ Cancer Bioinformatics in ER
Marty Humphrey, Department of Computer Science, University of Virginia
The caBIG platform connects consumers, the care delivery system, and the research
community. Close to 60 NCI-designated Cancer Centers are deploying caBIG®
infrastructure and tools, as are 16 Community Cancer Centers that in the aggregate
touch 20 million lives.
This project pilots caBIG clients on Windows, leveraging and extending MBF, and
tutorials demonstrating the value of Microsoft technologies to the caBIG developer and
user community.
Fighting HIV and AIDS
• Four-year collaboration between Bruce Walker
at Harvard and David Heckerman’s team
(Microsoft Research)
• Discovered three key insights to fight HIV:
– Immune system is led astray by decoy
epitopes (Nature Medicine, 2006)
– Frameshift epitopes exist (JEM, 2010)
– Natural killer cells directly attack HIV (Nature
Medicine, in review)
• 40+ publications, including Nature and Science
• Walker has obtained $110M+ subsequent
• PhyloD.Net, a tool for inferring HIV evolution in
an individual, is used by 100+ HIV researchers
and is now part of Microsoft Biology Foundation
• Numerous press stories including Business Week
and NPR
Convergence on a Strategic Platform for
Microsoft Biology
• Beta 1: Nov 5, 2009 (MS Connect)
• Beta 2: Feb 10, 2010 (CodePlex)
• V1 release: July 2010
• Early adopters from industry and
Azure engagement through XCG
(Azure BLAST, PhyloD services)
Product engagement and
prototyping use by TC, HSG
• Bio-IT Alliance partner
• Leveraging Microsoft assets: Pivot,
NodeXL, TRIDENT, IronPython, etc.
• Showcasing Microsoft products:
Excel/Office, Visual Studio 2010, .NET
4.0, WPF, Silverlight
• V1 launch June 2010
• Keynote presentations
• Training course in prep
• Community ownership
• Foundation of future MSR
genomics projects
• Foundation of all future ER
genomics engagements with
What is The Microsoft Biology Foundation?
An open-source library of reusable bioinformatics
algorithms, services and functions built on the .NET
• Easy to parallelize algorithms
• Easy to distribute computations and workflows
• Easy to visualize massive data sets
• Ability to leverage greater strength from existing use of
other MS technologies
• Provides transition from local to cloud-based computation
and data storage
Architecture: Namespaces
• Sequences
• Alphabets
• Alignments
• Genomic Intervals
• Phylogeny
• GenBank
• Translation
• Alignment
• Sequence Assembly
• ClustalW
• BioHPC
• Modular by design
• Commonly used features
• Exceptionally well-documented
• Extensible
• Interoperable
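As an illustration of how the sequence, alphabet and translation pieces listed above fit together, here is a minimal sketch in plain Python. It does not use or reproduce the MBF API: the names DNA_ALPHABET, CODON_TABLE, validate and translate are hypothetical, and the sketch assumes the standard genetic code and translates DNA directly to protein.

```python
from itertools import product

# Hypothetical stand-in for an "alphabet": the set of symbols a DNA sequence may contain.
DNA_ALPHABET = frozenset("ACGT")

# Standard genetic code, built from the conventional TCAG ordering of the 64 codons.
_BASES = "TCAG"
_AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): amino_acid
    for codon, amino_acid in zip(product(_BASES, repeat=3), _AMINO_ACIDS)
}


def validate(sequence, alphabet=DNA_ALPHABET):
    """Return the upper-cased sequence, or raise ValueError on symbols outside the alphabet."""
    upper = sequence.upper()
    unknown = set(upper) - alphabet
    if unknown:
        raise ValueError("symbols not in alphabet: " + ", ".join(sorted(unknown)))
    return upper


def translate(dna):
    """Translate a DNA string codon by codon; '*' marks a stop codon.

    Trailing bases that do not form a complete codon are ignored.
    """
    dna = validate(dna)
    usable = len(dna) - len(dna) % 3
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, usable, 3))
```

For example, translate("ATGGCCTAA") returns "MA*", while translate("ATGX") raises ValueError because X is not in the DNA alphabet.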
Initial Areas of Focus
• Genomics
– Sequencing
– Analysis and Annotation
• Advanced Research
– Phylogenetics
– Genome Wide Association
– Haplotype reconstruction
• Next Targets
– Visualization
– Large data sets
• Open Source
Available free of charge for commercial and noncommercial use and modification under the MS-PL
license
• Community-Developed
Moved to CodePlex; creating an advisory board and
building a community
• Community-Curated
Modify code, find bugs, contribute new features
• V1 Release
Late June 2010
Different Styles of Usage
• Build executables
– Visual Studio
• Office add-in
– BioExcel
• Command-line scripting access
– IronPython, PowerShell
• Workflow Activities
– Trident, WF
• Services on the Cloud
– Azure
Selecting Restriction Endonucleases: DNA PReDuST
(Aditi Technologies)
Fragment Size Distribution Graph
Restriction Map [Circular DNA]
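The calculation behind a fragment size distribution for circular DNA can be sketched as follows. This is an illustrative sketch in plain Python, not the DNA PReDuST implementation or the MBF API: the function names are hypothetical, only one strand is searched, and the enzyme is described by an exact recognition site plus a cut offset within that site.

```python
def cut_positions(sequence, site, cut_offset=0):
    """Return sorted 0-based cut positions for a recognition site on a circular sequence."""
    n = len(sequence)
    # Append the start of the sequence so sites spanning the origin are also found.
    extended = sequence + sequence[:len(site) - 1]
    positions = set()
    start = extended.find(site)
    while start != -1:
        positions.add((start + cut_offset) % n)
        start = extended.find(site, start + 1)
    return sorted(positions)


def circular_fragment_sizes(sequence, site, cut_offset=0):
    """Return sorted fragment lengths after digesting a circular sequence.

    An uncut circle yields an empty list; a single cut yields one linear
    fragment of the full length.
    """
    n = len(sequence)
    cuts = cut_positions(sequence, site, cut_offset)
    if not cuts:
        return []
    # Distance from each cut to the next one, wrapping around the origin.
    return sorted(
        ((cuts[(i + 1) % len(cuts)] - cuts[i]) % n) or n
        for i in range(len(cuts))
    )
```

For a 30-base circular sequence "GAATTC" + "A" * 10 + "GAATTC" + "C" * 8 and the site GAATTC cut after the first base (cut_offset=1), the cuts fall at positions 1 and 17, giving fragments of 14 and 16 bases. The fragment sizes always sum to the length of the circle, which is a useful sanity check before plotting a distribution.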
Computational Biology
Service Unit
Computational Biology Applications Suite
for High Performance Computing (BioHPC)
MBF Team
Microsoft Research
Vivek Kumar
Illumina Corporation
Robin Gutell
Aditi Technologies
Jim Hogan
University of Texas at Austin
Jarek Pillardy
Queensland University of Technology
David Heckerman, Bob Davidson, Carl Kadie, Yogesh Simmhan,
Jennifer Listgarten, Jonathan Carlson
Cornell University
Mike Zyskowski, Chris Wu
Scott Kahn
Johnson & Johnson Pharmaceutical Research Division LLC.
Dimitris Agrafiotis, Victor Lobanov, Jeremy Kolpak
© 2008 Microsoft Corporation. All rights reserved.
Microsoft, Windows, Windows Vista and other product names are or may be registered trademarks and/or trademarks in the U.S. and/or other countries.
The information herein is for informational purposes only and represents the current view of Microsoft Corporation as of the date of this presentation. Because Microsoft must respond to
changing market conditions, it should not be interpreted to be a commitment on the part of Microsoft, and Microsoft cannot guarantee the accuracy of any information provided after the date
of this presentation.