Taverna
SCAPE
Taverna and myExperiment
Tools for creating and sharing workflows
Alexandra Nenadic, David Withers
University of Manchester
Practical Tools for Digital Preservation: A Hack-a-thon
York, 28th September 2011
What is a workflow?
• Connecting a set of tools/services to create automated and repeatable processing/analysis
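A minimal sketch of this idea, assuming three hypothetical tools written as plain Python functions (none of them are Taverna or SCAPE components). The output of one step feeds the next steps, and running the same workflow on the same input gives the same result.

```python
# Hypothetical stand-ins for real tools or services (assumptions for illustration).
def identify_format(path: str) -> str:
    """Guess a file format from the file extension."""
    return path.rsplit(".", 1)[-1].lower() if "." in path else "unknown"


def needs_migration(fmt: str) -> bool:
    """Flag formats that this made-up policy wants migrated."""
    return fmt in {"tif", "tiff"}


def make_report(path: str, fmt: str, migrate: bool) -> dict:
    """Collect the results of the earlier steps into one record."""
    return {"file": path, "format": fmt, "migrate": migrate}


def workflow(path: str) -> dict:
    """Connect the tools: each step consumes the output of an earlier one."""
    fmt = identify_format(path)
    migrate = needs_migration(fmt)
    return make_report(path, fmt, migrate)


# The same workflow can be rerun unchanged over any number of inputs.
results = [workflow(p) for p in ["scan01.TIF", "notes.txt"]]
```

Because the steps are connected once and then reused, the analysis is automated (no manual copying between tools) and repeatable (the same input always takes the same path).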
Design and run workflows
Taverna
Taverna Features - Overview
• Access to remote, distributed or local services and resources
• Enables service interoperability and integration
• Automates data flow between services
• Implicit iteration over data sets, list handling and control links to determine order of service invocation
• Extensible
• Large external developer base
• Various third party plugins available
• Data and provenance collection
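Implicit iteration can be illustrated with a small sketch: when a service expecting single values receives lists, it is called once per combination of items. The helper name `implicit_iteration` and the `concat` service are assumptions made for this example, and the sketch returns a flat list for simplicity, which may differ from how Taverna nests its output lists.

```python
from itertools import product


def implicit_iteration(service, *inputs):
    """Call `service` once per combination of items when lists arrive
    on ports that expect single values (a cross product of the inputs)."""
    any_list = any(isinstance(value, list) for value in inputs)
    # Wrap single values so every input can be treated as a list.
    lists = [value if isinstance(value, list) else [value] for value in inputs]
    outputs = [service(*combo) for combo in product(*lists)]
    # With no list inputs there is exactly one call, so return its bare result.
    return outputs if any_list else outputs[0]


def concat(prefix: str, name: str) -> str:
    """Hypothetical service that works on single values only."""
    return prefix + name


single = implicit_iteration(concat, "chr", "1")
iterated = implicit_iteration(concat, "chr", ["1", "2", "X"])
crossed = implicit_iteration(concat, ["a", "b"], ["1", "2"])
```

The workflow author never writes the loop; the mismatch between "one value expected" and "a list supplied" is enough to trigger the iteration.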
Taverna Workbench
• Graphical desktop tool
• Drag-and-drop services into diagram
• Connect services, run, reconnect, rerun
• Integrates a diverse set of tools
Workflow Design
[Screenshot: workflow design view, labelled with: available services; tree view of the workflow structure; workflow diagram]
Taverna Workflows – Features in Detail
• A set of (local and remote) services to analyze or manage data
• Data links connect services
– i.e. output from service A is input to service B and C
– Describes the desired dataflow instead of process coordination
• Nested workflows are also services
• Automatic iterations
• Parallelization
• Can customize list handling and control links
• Fault tolerance
– Retry (with delay and back off)
– Failover (alternate services)
[Figure: two example workflow diagrams. Workflow inputs include start_position, chromosome_name, end_position, regex and gene_ids; services include mmusculus_gene_ensembl, genes_in_qtl, Get_pathways, get_pathways_by_genes1 and create_report; workflow outputs include gene_descriptions, genes_pathways, merged_pathways, pathway_descriptions, pathway_ids, pathway_genes and kegg_external_gene_reference.]
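The two fault tolerance options can be sketched as follows. This is an illustration under stated assumptions, not Taverna's implementation: the function names, the default retry count and the delays are all hypothetical, and the "services" are plain Python callables.

```python
import time


def call_with_retry(service, arg, retries=3, delay=0.01, backoff=2.0):
    """Call service(arg); after a failure wait, grow the delay, and try again."""
    wait = delay
    for attempt in range(retries + 1):
        try:
            return service(arg)
        except Exception:
            if attempt == retries:
                raise  # retries exhausted, let the caller decide what to do
            time.sleep(wait)
            wait *= backoff  # back off: each wait is longer than the last


def call_with_failover(services, arg, **retry_options):
    """Try each alternate service in order until one of them succeeds."""
    last_error = None
    for service in services:
        try:
            return call_with_retry(service, arg, **retry_options)
        except Exception as error:
            last_error = error
    raise RuntimeError("all services failed") from last_error


# Hypothetical services used only to exercise the sketch.
attempts = {"count": 0}


def flaky(text):
    """Fails twice, then succeeds."""
    attempts["count"] += 1
    if attempts["count"] < 3:
        raise ConnectionError("temporarily unavailable")
    return text.upper()


def always_down(text):
    raise ConnectionError("service is down")


def mirror(text):
    return text.upper()


retried = call_with_retry(flaky, "gene", delay=0.001)
failed_over = call_with_failover([always_down, mirror], "gene", retries=1, delay=0.001)
```

Retry suits transient problems with one service, while failover suits the case where an equivalent service is available elsewhere.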
Supported Services
• SOAP/WSDL Web services
• REST Web services
• SoapLab Web services
• R statistical services
• Inline Beanshell scripts
• External tools and scripts (via ssh or localhost)
• Spreadsheet import
• XPath and text manipulation services
• SADI semantic Web services
• Nested workflows (workflow within workflow)
• BioMoby
• BioMart
• … your tool (write your own Taverna plugin)
Workflow Results
[Screenshot: workflow results view, labelled with: progress report; previous runs; input data and results per port]
Workflow Provenance
• Information about a workflow run
– What happened?
– And when?
• Lineage tracing
– Which input produced which output
• Intermediate data
– Inputs and outputs for each workflow step
• Useful for debugging
• Saved in standard format (such as OPM)
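A rough sketch of provenance collection and lineage tracing, assuming a simple in-memory list of records. The record layout and the helper names `run_step` and `lineage` are hypothetical; this is not the OPM format and not Taverna's provenance store. Lineage is matched naively by comparing values.

```python
from datetime import datetime, timezone

provenance = []  # one record per executed workflow step


def run_step(name, func, **inputs):
    """Run one step and record what happened, with what data, and when."""
    started = datetime.now(timezone.utc)
    output = func(**inputs)
    provenance.append({
        "step": name,
        "inputs": dict(inputs),   # intermediate data going in
        "output": output,         # intermediate data coming out
        "started": started.isoformat(),
        "finished": datetime.now(timezone.utc).isoformat(),
    })
    return output


def lineage(value, before=None):
    """List the steps that led to `value`, most recent first."""
    limit = len(provenance) if before is None else before
    for index in range(limit - 1, -1, -1):
        record = provenance[index]
        if record["output"] == value:
            trail = [record["step"]]
            # Only look at earlier records, so the search always terminates.
            for upstream in record["inputs"].values():
                trail.extend(lineage(upstream, before=index))
            return trail
    return []


# Hypothetical three-step run.
ids = run_step("split_ids", lambda text: text.split(","), text="g1,g2,g1")
unique = run_step("remove_duplicates", lambda items: sorted(set(items)), items=ids)
report = run_step("create_report", lambda items: f"{len(items)} genes", items=unique)
trail = lineage(report)
```

Keeping the inputs and outputs of every step is what makes debugging practical: a wrong final value can be followed back step by step to the point where it first went wrong.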
Taverna is Domain-Independent
• Bioinformatics
• Biomedicine
• Chemistry
Examples:
– Pharmacogenomics: association study of Nevirapine-induced skin rash in Thai population
– Systems Biology for Crop research, BioDiversity
– HIV and TB research in South Africa
– Sleeping Sickness in African Cattle
Taverna is Domain-Independent
• Astronomy
• Data and text mining
• Digital content preservation (IMPACT)
• Social simulations
Examples:
– Observing Systems Simulation Experiments (JPL, NASA)
– Astronomy & HelioPhysics
– Library Document Preservation (British Library)
Share, discover and reuse workflows
myExperiment
• http://www.myexperiment.org
• Social networking for people to share workflows and collaborate
• Makes it easy for people to contribute to a pool of workflows, build communities and form relationships
• Enables people to share, describe, reuse and repurpose workflows, reduce time-to-production, share expertise and avoid reinvention
Workflow Sharing, Ownership and Attribution
• myExperiment can provide a central location for workflows from one community/group
• myExperiment allows you to say
– Who can look at your workflow
– Who can download your workflow
– Who can modify your workflow
– Who can run your workflow
• Workflow ownership and attribution
– Users do not need to start from scratch – reuse or modify existing workflows
– Attribute/credit original author
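The sharing and attribution rules above could be modelled roughly like this. The class, field and user names are invented for illustration; this is not myExperiment's data model or API.

```python
from dataclasses import dataclass, field

ACTIONS = ("view", "download", "modify", "run")


@dataclass
class SharedWorkflow:
    """Hypothetical record for a shared workflow with per-action permissions."""
    title: str
    owner: str
    credits: list = field(default_factory=list)      # original authors to credit
    permissions: dict = field(default_factory=dict)  # action -> set of user names

    def allow(self, action: str, user: str) -> None:
        if action not in ACTIONS:
            raise ValueError(f"unknown action: {action}")
        self.permissions.setdefault(action, set()).add(user)

    def can(self, action: str, user: str) -> bool:
        # The owner may always do everything with their own workflow.
        return user == self.owner or user in self.permissions.get(action, set())

    def derive(self, new_title: str, new_owner: str) -> "SharedWorkflow":
        """Reuse an existing workflow instead of starting from scratch,
        crediting the original author in the derived copy."""
        if not self.can("download", new_owner):
            raise PermissionError(f"{new_owner} may not download {self.title}")
        return SharedWorkflow(new_title, new_owner, credits=self.credits + [self.owner])


original = SharedWorkflow("Pathway lookup", owner="alice")
original.allow("view", "bob")
original.allow("download", "bob")
derived = original.derive("Pathway lookup for mouse", new_owner="bob")
```

Separating the four actions lets an author publish a workflow for inspection without also granting the right to change or run it.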
Use myExperiment from Taverna
Training
• Tutorials and Training
• 58+ tutorials delivered to more than 900 people
• More than 20 universities, institutes and networks
• Major conferences
• Summer schools
• Developer and User Days
• Annotation Jamborees
• Undergraduate and Postgraduate Bioinformatics in more than 30 universities
Taverna and SCAPE
• SCAPE preservation components/actions as services in Taverna workflows
• Use Taverna Workbench to create and test SCAPE preservation workflows on local data
• Then scale up and run the workflows on a parallelized platform using Hadoop MapReduce
• Share Taverna SCAPE workflows on myExperiment
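The "test locally, then scale up" idea can be sketched with the map/reduce pattern in plain Python. This does not use Hadoop: a thread pool stands in for the parallel platform, and the `characterise` step and file names are hypothetical.

```python
from collections import Counter
from concurrent.futures import ThreadPoolExecutor


def characterise(path: str) -> tuple:
    """Map step: run the per-file workflow and emit a (key, value) pair."""
    fmt = path.rsplit(".", 1)[-1].lower() if "." in path else "unknown"
    return (fmt, 1)


def reduce_counts(pairs) -> dict:
    """Reduce step: combine all values that share the same key."""
    totals = Counter()
    for key, count in pairs:
        totals[key] += count
    return dict(totals)


files = ["a.tif", "b.TIF", "c.pdf", "d"]

# Local test run on a small sample, executed sequentially.
small_run = reduce_counts(map(characterise, files[:2]))

# Same map and reduce functions, now with the map step run in parallel.
with ThreadPoolExecutor(max_workers=4) as pool:
    full_run = reduce_counts(pool.map(characterise, files))
```

The point of the pattern is that the per-file logic is written and tested once; only the way it is distributed changes when moving to a larger platform.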