snakemake

2.5 MIT/X Consortium License    
A Python-based language and execution environment for make-like workflows

Build systems like make are frequently used to create complicated workflows, e.g. in bioinformatics. snakemake aims to reduce the complexity of creating workflows by providing a clean and modern domain-specific language (DSL) in Python style, together with a fast and comfortable execution environment.

Installation

- On Ubuntu 12.04, you can install the Debian package python3-snakemake, available in our Launchpad repository.
- On other systems, you need a working installation of Python >= 3.2. Depending on your system, you can then install snakemake by issuing either easy_install snakemake or easy_install3 snakemake on the command line. If you don't have administrator privileges, have a look at the --user argument of easy_install.
- Finally, snakemake can be installed manually by downloading the source code archive from PyPI.

Usage

Snakemake offers a simple DSL to describe workflows that create files in several subsequent steps:

samples = ["01", "02"]

# Optionally define a directory where the work should be done.
workdir: "path/to/workdir"

# Similar to make, define dummy rules that act as build targets.
rule all:
    input: "diffexpr.tsv", ...

rule summarize:
    input: "{sample}.mapped.bam".format(sample = s) for s in samples
    output: "diffexpr.tsv"
    run:
        # ... provide some python code to produce the output from the input files,
        # e.g. access input files by index
        input[1]
        # access wildcard values
        wildcards.sample
        # easily run shell commands automatically using your default shell while
        # having direct access to all local and global variables via the format
        # minilanguage
        threads = 6
        shell("somecommand --threads {threads} {input[0]} {output[0]}")

rule map_reads:
    # Assign names for input and output files.
    input: reads = "{sample}.fastq", hg19 = "hg19.fasta"
    # Mark output files to be write-protected after creation.
    output: mapped = protected("{sample}.mapped.sai")
    # Optionally define messages that are displayed instead of the generic rule
    # description on execution of the rule:
    message: "Mapping reads to {input.hg19}"
    threads: 8
    shell:
        # Directly provide shell commands (in a multi or single line string) if
        # python syntax is not needed. Again, global and local variables can be
        # accessed via the format minilanguage.
        # Further, the number of threads used by the rule can be specified. The
        # snakemake scheduler ensures that the rule is run with the specified
        # number of threads if enough cores are made available via the -j
        # command line option.
        """
        bwa aln -t {threads} {input.hg19} {input.reads} > {output.mapped}
        some --other --command
        """

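The "format minilanguage" mentioned in the comments above is the replacement-field syntax of Python's str.format. The following plain-Python sketch shows how the placeholders in the two shell commands would be filled in. It is an illustration only, not snakemake's own code: the variables inputs, outputs, named_input and named_output are stand-ins invented here for the objects snakemake provides inside a rule, and the sample "01" is picked arbitrarily.

```python
from types import SimpleNamespace

samples = ["01", "02"]

# The input expression of rule summarize, materialised as a list.
inputs = ["{sample}.mapped.bam".format(sample=s) for s in samples]
outputs = ["diffexpr.tsv"]
threads = 6

# Replacement fields may index into their argument, e.g. {input[0]}.
command = "somecommand --threads {threads} {input[0]} {output[0]}".format(
    threads=threads, input=inputs, output=outputs
)
print(command)

# Named files as in rule map_reads: attribute access such as {input.hg19}.
# SimpleNamespace is only a stand-in for snakemake's named input/output objects.
named_input = SimpleNamespace(reads="01.fastq", hg19="hg19.fasta")
named_output = SimpleNamespace(mapped="01.mapped.sai")
map_command = (
    "bwa aln -t {threads} {input.hg19} {input.reads} > {output.mapped}".format(
        threads=8, input=named_input, output=named_output
    )
)
print(map_command)
```

Here command evaluates to "somecommand --threads 6 01.mapped.bam diffexpr.tsv" and map_command to "bwa aln -t 8 hg19.fasta 01.fastq > 01.mapped.sai".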

Given a "Snakefile" written in this syntax, the workflow can be executed (e.g. using up to 6 parallel processes) by issuing:

 snakemake -j6 -s Snakefile
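To run a workflow, a make-like tool has to decide which rule can produce a requested file and which wildcard values that implies. The sketch below illustrates the idea in plain Python for the output pattern of rule map_reads. It assumes each wildcard name occurs once per pattern and that a wildcard matches any non-empty text; pattern_to_regex and match_wildcards are hypothetical helpers written for this illustration, and snakemake's actual matching rules may differ.

```python
import re


def pattern_to_regex(pattern):
    """Translate a pattern such as '{sample}.mapped.sai' into a regex (hypothetical helper)."""
    # Splitting on a single capture group alternates literal text (even
    # positions) with wildcard names (odd positions).
    parts = re.split(r"\{(\w+)\}", pattern)
    regex = ""
    for i, part in enumerate(parts):
        if i % 2 == 0:
            regex += re.escape(part)            # literal text, matched verbatim
        else:
            regex += "(?P<{}>.+)".format(part)  # wildcard: any non-empty text
    return re.compile(regex)


def match_wildcards(pattern, filename):
    """Return the wildcard values if filename fits pattern, else None."""
    match = pattern_to_regex(pattern).fullmatch(filename)
    return match.groupdict() if match else None


wildcards = match_wildcards("{sample}.mapped.sai", "01.mapped.sai")
print(wildcards)

# With the wildcard values known, the rule's input pattern can be filled in.
reads = "{sample}.fastq".format(**wildcards)
print(reads)

# A file that does not fit the pattern yields no match.
print(match_wildcards("{sample}.mapped.sai", "diffexpr.tsv"))
```

In this sketch, requesting 01.mapped.sai binds sample to "01", so the rule would read 01.fastq, while diffexpr.tsv does not fit the pattern and would have to come from another rule.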

For more details please see the Tutorial.
Last updated on July 26th, 2012
