snakemake For Linux


Last updated: Jul 26, 2012 MIT/X Consortium License


A Python-based language and execution environment for make-like workflows.


Build systems like make are frequently used to create complicated workflows, e.g. in bioinformatics. snakemake aims to reduce the complexity of creating workflows by providing a clean, modern domain-specific language (DSL) in Python style, together with a fast and comfortable execution environment.

- On Ubuntu 12.04, you can install the Debian package python3-snakemake, available in our Launchpad repository.
- On other systems, you need a working installation of Python >= 3.2. Depending on your system, you can then install snakemake by running either easy_install snakemake or easy_install3 snakemake on the command line. If you don't have administrator privileges, have a look at the --user argument of easy_install.
- Finally, snakemake can be installed manually by downloading the source code archive from PyPI.

Snakemake offers a simple DSL to describe workflows that create files in several subsequent steps:

    samples = ["01", "02"]

    # optionally define a directory where the work should be done.
    workdir: "path/to/workdir"

    # similar to make, define dummy rules that act as build targets.
    rule all:
        input: "diffexpr.tsv", ...

    rule summarize:
        input: "{sample}.mapped.bam".format(sample = s) for s in samples
        output: "diffexpr.tsv"
        run:
            # ... provide some python code to produce the output from the input files
            # e.g. access input files by index
            input[1]
            # access wildcard values
            wildcards.sample
            # easily run shell commands automatically using your default shell while having direct access
            # to all local and global variables via the format minilanguage
            threads = 6
            shell("somecommand --threads {threads} {input[0]} {output[0]}")

    rule map_reads:
        # assign names for input and output files
        input:
            reads = "{sample}.fastq",
            hg19 = "hg19.fasta"
        # mark output files to be write-protected after creation
        output: mapped = protected("{sample}.mapped.sai")
        # optionally define messages that are displayed instead of the generic rule description on execution of the rule:
        message: "Mapping reads to {input.hg19}"
        threads: 8
        shell:
            # directly provide shell commands (in a multi or single line string) if python syntax is not needed.
            # again, global and local variables can be accessed via the format minilanguage.
            # Further, the number of threads used by the rule can be specified. The snakemake scheduler ensures
            # that the rule is run with the specified number of threads if enough cores are made available
            # via the -j command line option.
            """
            bwa aln -t {threads} {input.hg19} {input.reads} > {output.mapped}
            some --other --command
            """
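The "format minilanguage" mentioned in the comments is Python's standard str.format syntax. The following plain-Python sketch shows how the placeholders in the rules above could resolve. It is an illustration only, not snakemake's implementation: the names rule_input and rule_output are hypothetical stand-ins for the objects snakemake provides, and the concrete file names assume the sample "01".

```python
from types import SimpleNamespace

samples = ["01", "02"]

# Expand the input pattern of "summarize" once per sample.
summarize_inputs = ["{sample}.mapped.bam".format(sample=s) for s in samples]

# Hypothetical stand-ins for the named input/output files of "map_reads"
# (assumption: sample "01"); snakemake supplies its own objects here.
rule_input = SimpleNamespace(reads="01.fastq", hg19="hg19.fasta")
rule_output = SimpleNamespace(mapped="01.mapped.sai")
threads = 8

# str.format resolves attribute access such as {input.hg19} inside the template.
command = "bwa aln -t {threads} {input.hg19} {input.reads} > {output.mapped}".format(
    input=rule_input, output=rule_output, threads=threads
)

print(summarize_inputs)  # ['01.mapped.bam', '02.mapped.bam']
print(command)           # bwa aln -t 8 hg19.fasta 01.fastq > 01.mapped.sai
```

The same syntax also supports indexing, which is why {input[0]} and {output[0]} work in the summarize rule.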

Given a "Snakefile" with such a syntax, the workflow can be executed (e.g. using up to 6 parallel processes) by issuing:

snakemake -j6 -s Snakefile
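As with make, a requested file has to be traced back to a rule that can produce it. The sketch below shows one way an output pattern such as "{sample}.mapped.sai" could be matched against a target file name to recover the wildcard values and derive the input files. The helper match_output is hypothetical and written for illustration; it is not snakemake's actual matching code, and it assumes that each wildcard matches one or more arbitrary characters.

```python
import re

def match_output(pattern, filename):
    """Return a dict of wildcard values if filename matches pattern, else None."""
    regex = ""
    pos = 0
    # Turn each {name} into a named group and escape the literal text around it.
    for m in re.finditer(r"\{(\w+)\}", pattern):
        regex += re.escape(pattern[pos:m.start()])
        regex += "(?P<{}>.+)".format(m.group(1))
        pos = m.end()
    regex += re.escape(pattern[pos:])
    found = re.fullmatch(regex, filename)
    return found.groupdict() if found else None

# Input patterns of the map_reads rule from the example above.
input_patterns = {"reads": "{sample}.fastq", "hg19": "hg19.fasta"}

wildcards = match_output("{sample}.mapped.sai", "01.mapped.sai")
# Fill the recovered wildcard values into every input pattern.
inputs = {name: p.format(**wildcards) for name, p in input_patterns.items()}

print(wildcards)  # {'sample': '01'}
print(inputs)     # {'reads': '01.fastq', 'hg19': 'hg19.fasta'}
```

A target that no output pattern matches, such as "diffexpr.tsv" against "{sample}.mapped.sai", yields None, so a different rule would have to be considered.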

For more details please see the Tutorial.