Enrich is a Python tool for analysing high-throughput sequencing data from a pair of unselected and selected libraries to estimate the fitness of each protein or DNA variant in those libraries.
Enrich (Protein Functional Analysis by Enrichment and Depletion of Variants) is an analysis pipeline that uses high-throughput sequencing data to assess protein sequence-function relationships. Enrich takes FASTQ files as input, translates the reads, identifies the unique protein sequences, and calculates an enrichment ratio between libraries for each sequence. Enrich can be run from the command line or in an interactive mode, and can use paired-end read data. Each step of the pipeline can be run separately, or the entire sequence of steps can be run consecutively. Enrich employs DRMAA (Distributed Resource Management Application API) to parallelize time-consuming tasks. For a description of Enrich input, function, and output, please see the following publications:
High-resolution mapping of protein sequence-function relationships. Fowler DM, Araya CL, Fleishman SJ, Kellogg EH, Stephany JJ, Baker D, Fields S. Nat Methods. 2010 Sep;7(9):741-6. Epub 2010 Aug 15.
Deep mutational scanning: assessing protein function on a massive scale. Araya CL, Fowler DM. Trends Biotechnol. 2011 May 9. [Epub ahead of print]
Protein Functional Analysis by Enrichment and Depletion of Variants (Enrich). Fowler DM, Araya CL, Fields S. manuscript in preparation (email D. Fowler for a copy).
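The translate, count, and ratio steps described above can be illustrated with a small Python sketch. This is not Enrich's own code or API: the function names (read_fastq_sequences, translate, count_variants, enrichment_ratios, make_fastq) are hypothetical, and the sketch assumes that reads are already in frame, that the enrichment ratio of a variant is its frequency in the selected library divided by its frequency in the unselected library, and that variants missing from the unselected library are skipped.

```python
from collections import Counter

# Standard genetic code, built from the canonical T, C, A, G codon ordering.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def read_fastq_sequences(lines):
    """Yield the sequence line (second of every four) from FASTQ lines."""
    for index, line in enumerate(lines):
        if index % 4 == 1:
            yield line.strip().upper()


def translate(dna):
    """Translate an in-frame DNA read; codons with unknown bases become 'X'."""
    usable = len(dna) - len(dna) % 3  # ignore a trailing partial codon
    codons = (dna[i:i + 3] for i in range(0, usable, 3))
    return "".join(CODON_TABLE.get(codon, "X") for codon in codons)


def count_variants(fastq_lines):
    """Count how often each unique protein sequence occurs in a library."""
    return Counter(translate(seq) for seq in read_fastq_sequences(fastq_lines))


def enrichment_ratios(unselected, selected):
    """Assumed definition: selected frequency / unselected frequency."""
    total_unselected = sum(unselected.values())
    total_selected = sum(selected.values())
    ratios = {}
    for protein, selected_count in selected.items():
        unselected_count = unselected.get(protein, 0)
        if unselected_count == 0:
            continue  # ratio undefined without an unselected count
        selected_freq = selected_count / total_selected
        unselected_freq = unselected_count / total_unselected
        ratios[protein] = selected_freq / unselected_freq
    return ratios


def make_fastq(reads):
    """Build toy FASTQ lines (made-up read names and qualities) for the demo."""
    lines = []
    for number, read in enumerate(reads):
        lines += ["@read%d" % number, read, "+", "I" * len(read)]
    return lines


# Toy libraries: two variants, MA (ATGGCT) and MD (ATGGAT).
unselected_counts = count_variants(
    make_fastq(["ATGGCT", "ATGGCT", "ATGGAT", "ATGGAT"])
)
selected_counts = count_variants(
    make_fastq(["ATGGCT", "ATGGCT", "ATGGCT", "ATGGAT"])
)
ratios = enrichment_ratios(unselected_counts, selected_counts)
```

In this toy example the variant MA rises from 2 of 4 reads to 3 of 4 reads, giving a ratio of 1.5 (enriched), while MD falls from 2 of 4 to 1 of 4, giving a ratio of 0.5 (depleted). Please see the publications above for how Enrich itself defines and computes these quantities.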
Read the documentation
Enrich is installed using the easy_install module, which is part of setuptools. Please see the documentation for instructions on how to use easy_install to install Enrich.