Java API - seqware

Design Principles
Separation of components into a modular system
Independent standalone modules that are also runnable programs
– Collaborator wants to run srf2FastQ at home, without a MetaDB
– Researcher tries custom parameters, but still tracks his run in the MetaDB
XML workflows that define jobs and data dependencies
– Parameterized to reuse workflows on different
Application Wrapper Interface
Application conforms to a standard interface
Developers and users do not have to understand the rest of the pipeline
Force users to adhere to best practices
Syntax, --help option
Required test harness
Verifications of input, output, parameters
Wrapped applications must be runnable both
Java API:
public interface WrapperInterface {
    int init(); // Optional
    int get_syntax();
    int do_test();
    int do_verify_input();
    int do_verify_parameters();
    int do_run();
    int do_verify_output();
    int clean_up(); // Optional
}
Local Execution:
$ java SeqWareRunner bpostprocess --help
→ Reports get_syntax()
$ java SeqWareRunner bpostprocess input
→ Run bpostprocess on the command line
$ java SeqWareRunner bpostprocess --db input
→ Same as above, but without MetaDB feedback
$ java SeqWareRunner bpostprocess --db input --config=config.txt
$ java SeqWareRunner bpostprocess --db input -A 0 -n 8
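The slide does not say how SeqWareRunner drives the interface. The following is a minimal sketch, assuming that the run steps are called in declaration order, that 0 means success, that the first non-zero return code stops the run, and that clean_up() is always called afterwards. RecordingWrapper and LifecycleSketch are hypothetical names used only for illustration, and the interface is redeclared without `public` so the example fits in one file.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.IntSupplier;

// Redeclared from the slide (non-public so the sketch is a single file).
interface WrapperInterface {
    int init();
    int get_syntax();
    int do_test();
    int do_verify_input();
    int do_verify_parameters();
    int do_run();
    int do_verify_output();
    int clean_up();
}

// Hypothetical wrapper that records each call and can fail at one named step.
class RecordingWrapper implements WrapperInterface {
    final List<String> calls = new ArrayList<>();
    private final String failingStep; // step that returns 1, or null for none

    RecordingWrapper(String failingStep) { this.failingStep = failingStep; }

    private int step(String name) {
        calls.add(name);
        return name.equals(failingStep) ? 1 : 0;
    }

    public int init() { return step("init"); }
    public int get_syntax() { return step("get_syntax"); }
    public int do_test() { return step("do_test"); }
    public int do_verify_input() { return step("do_verify_input"); }
    public int do_verify_parameters() { return step("do_verify_parameters"); }
    public int do_run() { return step("do_run"); }
    public int do_verify_output() { return step("do_verify_output"); }
    public int clean_up() { return step("clean_up"); }
}

class LifecycleSketch {
    // Assumed convention: 0 = success; stop at the first failing step,
    // then always give the wrapper a chance to clean up.
    static int runLifecycle(WrapperInterface w) {
        List<IntSupplier> steps = List.of(
            w::init, w::do_verify_input, w::do_verify_parameters,
            w::do_run, w::do_verify_output);
        int status = 0;
        for (IntSupplier s : steps) {
            status = s.getAsInt();
            if (status != 0) break;
        }
        int cleanup = w.clean_up();
        return status != 0 ? status : cleanup;
    }

    public static void main(String[] args) {
        RecordingWrapper w = new RecordingWrapper("do_verify_parameters");
        System.out.println(runLifecycle(w) + " " + w.calls);
    }
}
```

With the failing step set to do_verify_parameters, do_run is never reached, which is the point of verifying parameters and input before running.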
XML Workflow

Follows the DAX standard, which is the input format for Pegasus
Defines jobs, arguments, configuration, and data dependencies
Defines dependencies between jobs
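A workflow engine has to turn such job dependencies into an execution order. The following is a minimal sketch of that idea using a topological sort (Kahn's algorithm); it is not how Pegasus itself is implemented. DependencySketch, executionOrder, and the map layout (each job ID mapped to the IDs of the jobs it must wait for) are assumptions made for illustration.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

class DependencySketch {
    // parentsOf maps a job ID to the job IDs it depends on.
    static List<String> executionOrder(Map<String, List<String>> parentsOf) {
        Map<String, Integer> remaining = new TreeMap<>();      // unmet parents per job
        Map<String, List<String>> childrenOf = new HashMap<>();
        for (Map.Entry<String, List<String>> e : parentsOf.entrySet()) {
            remaining.put(e.getKey(), e.getValue().size());
            for (String parent : e.getValue()) {
                childrenOf.computeIfAbsent(parent, k -> new ArrayList<>()).add(e.getKey());
                remaining.putIfAbsent(parent, 0);              // parents without an entry of their own
            }
        }
        // Jobs with no unmet parents can start immediately.
        Deque<String> ready = new ArrayDeque<>();
        for (Map.Entry<String, Integer> e : remaining.entrySet()) {
            if (e.getValue() == 0) ready.add(e.getKey());
        }
        List<String> order = new ArrayList<>();
        while (!ready.isEmpty()) {
            String job = ready.remove();
            order.add(job);
            for (String child : childrenOf.getOrDefault(job, List.of())) {
                // A child becomes ready once its last parent has finished.
                if (remaining.merge(child, -1, Integer::sum) == 0) ready.add(child);
            }
        }
        if (order.size() != remaining.size()) {
            throw new IllegalStateException("Dependency cycle detected");
        }
        return order;
    }

    public static void main(String[] args) {
        Map<String, List<String>> parentsOf = new HashMap<>();
        parentsOf.put("ID0000002", List.of("ID0000001"));
        parentsOf.put("ID0000003", List.of("ID0000001", "ID0000002"));
        System.out.println(executionOrder(parentsOf));
    }
}
```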
Use Java Freemarker to populate the XML template for each experiment
<?xml version="1.0" encoding="UTF-8"?>
<adag xmlns="http://pegasus.isi.edu/schema/DAX"
      xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
      xsi:schemaLocation="http://pegasus.isi.edu/schema/DAX http://pegasus.isi.edu/schema/dax-2.1.xsd"
      version="2.1" count="1" index="0"
      name="bfast" jobCount="3" fileCount="0" childCount="2">
  <!-- jobs -->
  <job id="ID0000001" namespace="seqware" name="runner" version="0.0.1">
    <argument>bfast matches %{reference_file} %{experiment}.fastq...</argument>
    <profile namespace="globus" key="max_memory">24576</profile>
    <profile namespace="globus" key="count">8</profile>
    <uses file="%{experiment}.fastq" link="input"/>
    <uses file="%{experiment}.bmf" link="output" transfer="false" register="false"/>
  </job>
  <job id="ID0000002" namespace="seqware" name="runner" version="0.0.1">
    <argument>bfast localalign ...</argument>
    <uses file="%{experiment}.bmf" link="input"/>
    <uses file="%{experiment}.baf" link="output" transfer="false" register="false"/>
  </job>
  <job id="ID0000003" namespace="seqware" name="runner" version="0.0.1">
    <argument>bfast postprocess ...</argument>
    <uses file="%{experiment}.bmf" link="input"/>
    <uses file="%{experiment}.bam" link="output" transfer="true" register="true"/>
  </job>
  .....
  <!-- Dependencies -->
  <child ref="ID0000002">
    <parent ref="ID0000001"/>
  </child>
  <child ref="ID0000003">
    <parent ref="ID0000001"/>
    <parent ref="ID0000002"/>
  </child>
</adag>
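To show what populating the template for one experiment amounts to, here is a minimal sketch that uses plain Java regex substitution as a stand-in for Freemarker. It is not the Freemarker API. The %{name} placeholder form is taken from the slide; TemplateSketch, fill, and the value "sample01" are hypothetical.

```java
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class TemplateSketch {
    // Matches placeholders of the form %{name}, as written on the slide.
    private static final Pattern PLACEHOLDER = Pattern.compile("%\\{(\\w+)\\}");

    // Replaces every %{name} with its value; fails loudly on a missing value
    // so a half-filled workflow is never submitted.
    static String fill(String template, Map<String, String> values) {
        Matcher m = PLACEHOLDER.matcher(template);
        StringBuffer out = new StringBuffer();
        while (m.find()) {
            String value = values.get(m.group(1));
            if (value == null) {
                throw new IllegalArgumentException("No value for placeholder: " + m.group(1));
            }
            // quoteReplacement keeps '$' and '\' in values from being interpreted.
            m.appendReplacement(out, Matcher.quoteReplacement(value));
        }
        m.appendTail(out);
        return out.toString();
    }

    public static void main(String[] args) {
        String line = "<uses file=\"%{experiment}.fastq\" link=\"input\"/>";
        // "sample01" is a made-up experiment name.
        System.out.println(fill(line, Map.of("experiment", "sample01")));
    }
}
```

Running the same template once per experiment, each time with a different value map, yields one concrete workflow file per experiment.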
Pegasus
Each task is a standalone application, independently runnable
– Scientist asks, 'How do I run Bfast?'
– Collaborator wants to run srf2FastQ at home, but does not have a pipeline or Metadata DB
– Researcher wants to try some custom parameters, but we still want to track his run in the Metadata DB
Each application conforms to a standard, well-defined interface
The interface is abstract enough for users to wrap their applications without knowing