AREM 1.0.1

Aligning Reads by Expectation-Maximization
AREM is based on MACS (Model-based Analysis of ChIP-Seq).

High-throughput sequencing coupled to chromatin immunoprecipitation (ChIP-Seq) is widely used in characterizing genome-wide binding patterns of transcription factors, cofactors, chromatin modifiers, and other DNA binding proteins. A key step in ChIP-Seq data analysis is to map short reads from high-throughput sequencing to a reference genome and identify peak regions enriched with short reads.

Although several methods have been proposed for ChIP-Seq analysis, most existing methods only consider reads that can be uniquely placed in the reference genome, and therefore have low power for detecting peaks located within repeat sequences. Here we introduce a probabilistic approach for ChIP-Seq data analysis which utilizes all reads, providing a truly genome-wide view of binding patterns.

Reads are modeled using a mixture model corresponding to K enriched regions and a null genomic background. We use maximum likelihood to estimate the locations of the enriched regions, and implement an expectation-maximization (E-M) algorithm, called AREM, to update the alignment probabilities of each read to different genomic locations.
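The idea of iteratively updating alignment probabilities can be illustrated with a small, simplified sketch. This is not AREM's actual code or model: it assumes the candidate regions are already known, treats each read as a list of hypothetical (component, quality) candidate alignments, and only re-estimates the mixture weights. The function name em_alignment_probs and all parameters are made up for illustration.

```python
def em_alignment_probs(reads, n_regions, n_iter=50):
    """Toy E-M over read alignments (illustrative only, not AREM itself).

    reads: one list per read of (component, quality) candidate alignments.
           Components 0..n_regions-1 stand for enriched regions and
           component n_regions stands for the genomic background.
           Every read needs at least one candidate with quality > 0.
    Returns (weights, probs): mixture weights per component and, for each
    read, the alignment probability of each of its candidates.
    """
    n_comp = n_regions + 1
    weights = [1.0 / n_comp] * n_comp  # start from uniform mixture weights
    probs = []
    for _ in range(n_iter):
        # E-step: a read's probability of each candidate alignment is
        # proportional to (mixture weight of the component) * (quality).
        probs = []
        for candidates in reads:
            scores = [weights[comp] * qual for comp, qual in candidates]
            total = sum(scores)
            probs.append([s / total for s in scores])
        # M-step: each component's weight becomes its share of the
        # fractional read counts assigned in the E-step.
        counts = [0.0] * n_comp
        for candidates, read_probs in zip(reads, probs):
            for (comp, _), p in zip(candidates, read_probs):
                counts[comp] += p
        weights = [c / len(reads) for c in counts]
    return weights, probs


# Three uniquely mapped reads in region 0, plus one read that aligns
# equally well to region 0 and region 1 (a repeat-derived multiread).
example_reads = [
    [(0, 1.0)],
    [(0, 1.0)],
    [(0, 1.0)],
    [(0, 1.0), (1, 1.0)],
]
example_weights, example_probs = em_alignment_probs(example_reads, n_regions=2)
```

In this toy setting the ambiguous read is pulled toward the region that already has support from uniquely mapped reads, which is the intuition behind using all reads rather than discarding those that map to repeats.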

For additional information, see our paper in RECOMB 2011 or visit our website:

AREM is based on the popular MACS peak caller, as described below:

With improvements in sequencing techniques, chromatin immunoprecipitation followed by high-throughput sequencing (ChIP-Seq) is becoming a popular way to study genome-wide protein-DNA interactions. To address the lack of powerful ChIP-Seq analysis methods, we present a novel algorithm, named Model-based Analysis of ChIP-Seq (MACS), for identifying transcription factor binding sites.

MACS captures the influence of genome complexity to evaluate the significance of enriched ChIP regions, and it improves the spatial resolution of binding sites by combining information on both sequencing tag position and orientation. MACS can easily be used on ChIP-Seq data alone, or with a control sample to increase specificity.

Last updated on: May 18th, 2011, 23:37 GMT
License type: Creative Commons Attribution
Developed by: Jake Biesinger, Daniel Newkirk, Alvin ...