Taverna SCAPE

Taverna and myExperiment
Tools for creating and sharing workflows
Alexandra Nenadic, David Withers
University of Manchester
Practical Tools for Digital Preservation: A Hack-a-thon
York, 28th September 2011

What is a workflow?
• Connecting a set of tools/services to create automated and repeatable processing/analysis

Design and run workflows
Taverna

Taverna Features - Overview
• Access to remote, distributed or local services and resources
• Enables service interoperability and integration
• Automates data flow between services
• Implicit iteration over data sets, list handling and control links to determine order of service invocation
• Extensible
• Large external developer base
• Various third-party plugins available
• Data and provenance collection

Taverna Workbench
• Graphical desktop tool
• Drag-and-drop services into diagram
• Connect services, run, reconnect, rerun
• Integrates diverse set of tools

Workflow Design
[Screenshot labels: Available services; Tree view of the workflow structure; Workflow diagram]

Taverna Workflows – Features in Detail
• A set of (local and remote) services to analyze or manage data
• Data links connect services
  – i.e. output from service A is input to service B and C
• Describes the desired dataflow instead of process coordination
• Nested workflows are also services
• Automatic iterations
• Parallelization
• Can customize list handling and control links
• Fault tolerance
  – Retry (with delay and back-off)
  – Failover (alternate services)
[Figure: example workflow diagram mapping workflow inputs (start_position, chromosome_name, end_position) through gene and KEGG pathway lookup steps to workflow outputs (gene_descriptions, genes_pathways, merged_pathways, pathway_descriptions, pathway_ids, kegg_external_gene_reference)]

Supported Services
• SOAP/WSDL Web services
• REST Web services
• SoapLab Web services
• R statistical services
• Inline Beanshell scripts
• External tools and scripts (via ssh or localhost)
• Spreadsheet import
• XPath and text manipulation services
• SADI semantic Web services
• Nested workflows (workflow within workflow)
• BioMoby
• BioMart
• … your tool (write your own Taverna plugin)

Workflow Results
[Screenshot labels: Progress report; Previous runs; Input data and results per port]

Workflow Provenance
• Information about a workflow run
  – What happened?
  – And when?
• Lineage tracing
  – Which input produced which output
• Intermediate data
  – Inputs and outputs for each workflow step
  – Useful for debugging
• Saved in standard format (such as OPM)

Taverna is Domain-Independent
• Bioinformatics
• Biomedicine
• Chemistry
[Example labels: Pharmacogenomics: association study of Nevirapine-induced skin rash in Thai population; Systems biology for crop research, biodiversity; HIV and TB research in South Africa; Sleeping sickness in African cattle]

Taverna is Domain-Independent
• Astronomy
• Data and text mining
• Digital content preservation (IMPACT)
• Social simulations
[Example labels: Observing Systems Simulation Experiments (JPL, NASA); Astronomy & HelioPhysics; Library document preservation (British Library)]

Share, discover and reuse workflows
myExperiment

myExperiment
• http://www.myexperiment.org
• Social networking for people to share workflows and collaborate
• Makes it easy for people to contribute to a pool of workflows, build communities and form relationships
• Enables people to share, describe, reuse and repurpose workflows, reduce time-to-production, share expertise and avoid reinvention

Workflow Sharing, Ownership and Attribution
• myExperiment can provide a central location for workflows from one community/group
• myExperiment allows you to say
  – Who can look at your workflow
  – Who can download your workflow
  – Who can modify your workflow
  – Who can run your workflow
• Workflow ownership and attribution
• Users do not need to start from scratch: they can reuse or modify existing workflows
• Attribute/credit original author

Use myExperiment from Taverna

Training
• Tutorials and Training
• 58+ tutorials to >900 people
• >20 universities, institutes and networks
• Major conferences
• Summer schools
• Developer and User Days
• Annotation Jamborees
• Undergraduate and Postgraduate Bioinformatics in >30 universities

Taverna and SCAPE
• SCAPE preservation components/actions as services in Taverna workflows
• Use Taverna Workbench to create and test SCAPE preservation workflows on local data
• Then scale up and run the workflows on a parallelized platform using Hadoop MapReduce
• Share Taverna SCAPE workflows on myExperiment
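
Illustrative sketch (not part of the original slides)
The dataflow ideas from the "Features in Detail" slide (output of service A feeding services B and C, implicit iteration over lists, retry with delay and back-off, failover to alternate services) can be pictured with a small Python sketch. This is not Taverna's API: every name below (with_retry, with_failover, iterate, the toy services) is hypothetical and chosen only for illustration, and plain Python functions stand in for remote services.

```python
import time
from typing import Any, Callable, Dict, List

Service = Callable[[Any], Any]  # hypothetical stand-in for a workflow service


def with_retry(service: Service, retries: int = 3,
               delay: float = 0.0, backoff: float = 2.0) -> Service:
    """Retry a failing service, multiplying the wait by `backoff` each time."""
    def wrapped(value: Any) -> Any:
        wait = delay
        for attempt in range(retries + 1):
            try:
                return service(value)
            except Exception:
                if attempt == retries:
                    raise  # out of retries: surface the last error
                time.sleep(wait)
                wait *= backoff
    return wrapped


def with_failover(primary: Service, *alternates: Service) -> Service:
    """Try the primary service first, then each alternate in order."""
    def wrapped(value: Any) -> Any:
        last_error = None
        for service in (primary, *alternates):
            try:
                return service(value)
            except Exception as error:
                last_error = error
        raise last_error
    return wrapped


def iterate(service: Service) -> Callable[[List[Any]], List[Any]]:
    """Implicit iteration: apply a single-item service to every list element."""
    def wrapped(values: List[Any]) -> List[Any]:
        return [service(value) for value in values]
    return wrapped


# Toy services (assumptions for the example, not real Taverna services).
calls = {"flaky": 0}


def service_a(text: str) -> List[str]:
    return text.split(",")  # service A: split a string into a list of IDs


def flaky_lookup(item: str) -> str:
    calls["flaky"] += 1
    if calls["flaky"] < 3:  # fails on the first two calls, then recovers
        raise ConnectionError("temporary failure")
    return item.upper()


def broken_service(item: str) -> int:
    raise ConnectionError("primary service is down")


def backup_service(item: str) -> int:
    return len(item)


def run_workflow(text: str) -> Dict[str, List[Any]]:
    ids = service_a(text)                                         # A
    b = iterate(with_retry(flaky_lookup, retries=3))(ids)         # A -> B
    c = iterate(with_failover(broken_service, backup_service))(ids)  # A -> C
    return {"b": b, "c": c}


result = run_workflow("abc,de")
print(result)  # {'b': ['ABC', 'DE'], 'c': [3, 2]}
```

The workflow itself only states which output feeds which input; iteration, retry and failover are layered on as wrappers, which loosely mirrors describing the desired dataflow rather than coordinating each call by hand.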