tapir 1.0

Tally Approximations of Phylogenetic Informativeness Rapidly (TAPIR)

BSD License 
Brant Faircloth, Jonathan Chang and Mi...
tapir is a Python tool that contains programs to estimate and plot phylogenetic informativeness for large datasets.

Citing tapir

When using tapir, please cite:

- Faircloth BC, Chang J, Alfaro ME: tapir enables high throughput analysis of phylogenetic informativeness.
- Townsend JP: Profiling phylogenetic informativeness. Systematic Biol. 2007, 56:222-231.
- Pond SLK, Frost SDW, Muse SV: HyPhy: hypothesis testing using phylogenies. Bioinformatics 2005, 21:676-679.


For the moment, the easiest way to install the program is:

git clone git://github.com/faircloth-lab/tapir.git /path/to/tapir

To run tests:

cd /path/to/tapir/
python test/test_townsend_code.py


The estimate_p_i.py code calls a HyPhy batch file that lives in templates/. This file must stay in the same location relative to estimate_p_i.py, wherever you put it. If you install things as above, you'll be fine for the moment.

To run:

cd /path/to/tapir/

python tapir_compute.py Input_Folder_of_Nexus_Files/ Input.tree \
 --output Output_Directory \
 --epochs=32-42,88-98,95-105,164-174 \
 --times=37,93,100,170 \
 --multiprocessing

--multiprocessing is optional; without it, each locus is run consecutively.
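
The --epochs and --times values above are comma-separated lists: each epoch is a start-end range and each time is a single point. The sketch below shows one way to read those strings when preparing or checking a run. It is not tapir's own parser; the helper names parse_epochs and parse_times are hypothetical, and it assumes integer values.

```python
def parse_epochs(spec):
    """Split a string like '32-42,88-98' into [(32, 42), (88, 98)].

    Hypothetical helper, not part of tapir; assumes integer bounds.
    """
    epochs = []
    for chunk in spec.split(","):
        start, end = chunk.split("-")
        epochs.append((int(start), int(end)))
    return epochs


def parse_times(spec):
    """Split a string like '37,93' into [37, 93] (assumes integers)."""
    return [int(t) for t in spec.split(",")]


# Values copied from the example command above.
epochs = parse_epochs("32-42,88-98,95-105,164-174")
times = parse_times("37,93,100,170")

for start, end in epochs:
    print("epoch %d-%d" % (start, end))
for t in times:
    print("time %d" % t)
```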

If you have already run the above and saved results to your output folder (see below), you can use the pre-existing site-rate records rather than estimating those again with:

python tapir_compute.py Input_Folder_of_Site_Rate_JSON_Files/ Input.tree \
 --output Output_Directory \
 --epochs=32-42,88-98,95-105,164-174 \
 --times=37,93,100,170 \
 --multiprocessing \


tapir writes results to a sqlite database in the output directory of your choosing. This directory also holds site rate files in JSON format for each locus passed through tapir_compute.py.

You can access the results in the database as follows. For more examples, including plotting, see the documentation.

- crank up sqlite:

 sqlite3 Output_Directory/phylogenetic-informativeness.sqlite

- get integral data for all epochs:

 select locus, interval, pi from loci, interval where loci.id = interval.id;

- get integral data for a specific epoch:

 select locus, interval, pi from loci, interval
 where interval = '95-105' and loci.id = interval.id;

- get the count of loci having max(PI) at different epochs:

 create temporary table max as select id, max(pi) as max from interval group by id;

 create temporary table t as select interval.id, interval, max from interval, max
 where interval.id = max.id and interval.pi = max.max;

 select interval, count(*) from t group by interval;
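
The same queries can be run from Python with the standard sqlite3 module. This is a minimal sketch, not code shipped with tapir: the function names are hypothetical, and the tiny in-memory database is a stand-in whose schema (tables loci and interval, with only the columns used in the queries above) and values are assumed purely for illustration.

```python
import sqlite3


def pi_by_epoch(conn, epoch=None):
    """Return (locus, interval, pi) rows, optionally for a single epoch."""
    query = (
        "select locus, interval, pi from loci, interval "
        "where loci.id = interval.id"
    )
    params = ()
    if epoch is not None:
        query += " and interval = ?"
        params = (epoch,)
    query += " order by locus, interval"
    return conn.execute(query, params).fetchall()


def count_loci_by_peak_epoch(conn):
    """Count loci whose maximum PI falls in each epoch."""
    query = (
        "select i.interval, count(*) from interval as i "
        "join (select id, max(pi) as max_pi from interval group by id) as m "
        "on i.id = m.id and i.pi = m.max_pi "
        "group by i.interval order by i.interval"
    )
    return conn.execute(query).fetchall()


# Stand-in database with made-up values (assumed schema, not tapir output).
conn = sqlite3.connect(":memory:")
conn.execute("create table loci (id integer, locus text)")
conn.execute("create table interval (id integer, interval text, pi real)")
conn.executemany("insert into loci values (?, ?)",
                 [(1, "locus1"), (2, "locus2")])
conn.executemany("insert into interval values (?, ?, ?)",
                 [(1, "32-42", 0.5), (1, "95-105", 0.9),
                  (2, "32-42", 0.7), (2, "95-105", 0.2)])

print(pi_by_epoch(conn, "95-105"))
print(count_loci_by_peak_epoch(conn))
```

To query real results, replace the in-memory connection with sqlite3.connect("Output_Directory/phylogenetic-informativeness.sqlite") and skip the create and insert statements.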


We thank Francesc Lopez-Giraldez and Jeffrey Townsend for providing us with a copy of their web-application source code. BCF thanks S Hubbell and P Gowaty.

Last updated on November 6th, 2011

