tapir 1.0

Tally Approximations of Phylogenetic Informativeness Rapidly (TAPIR)
tapir is a Python tool that contains programs to estimate and plot phylogenetic informativeness for large datasets.

Citing tapir

When using tapir, please cite:

- Faircloth BC, Chang J, Alfaro ME: tapir enables high throughput analysis of phylogenetic informativeness.
- Townsend JP: Profiling phylogenetic informativeness. Systematic Biol. 2007, 56:222-231.
- Pond SLK, Frost SDW, Muse SV: HyPhy: hypothesis testing using phylogenies. Bioinformatics 2005, 21:676-679.

Installation

For the moment, the easiest way to install the program is:

git clone git://github.com/faircloth-lab/tapir.git /path/to/tapir

To run tests:

cd /path/to/tapir/
python test/test_townsend_code.py


Use

The estimate_p_i.py code calls a batch file for hyphy that is in templates/. This file needs to stay in the same position relative to wherever you put estimate_p_i.py. If you install things as above, you'll be fine, for the moment.

To run:

cd /path/to/tapir/

python tapir_compute.py Input_Folder_of_Nexus_Files/ Input.tree \
 --output Output_Directory \
 --epochs=32-42,88-98,95-105,164-174 \
 --times=37,93,100,170 \
 --multiprocessing


--multiprocessing is optional; without it, each locus is run consecutively.
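
The --epochs and --times values in the command above are comma-separated lists: each epoch is written as start-end, and each time is a single number. As a rough illustration of that format only, the sketch below splits such strings into Python values. It is not tapir's own argument parser; the function names are hypothetical, and it assumes whole-number values as in the example.

```python
def parse_epochs(text):
    """Turn '32-42,88-98' into [(32, 42), (88, 98)].

    Hypothetical helper: assumes non-negative integer bounds
    joined by a single '-'.
    """
    epochs = []
    for item in text.split(","):
        start, end = item.strip().split("-")
        epochs.append((int(start), int(end)))
    return epochs


def parse_times(text):
    """Turn '37,93,100' into [37, 93, 100] (hypothetical helper)."""
    return [int(item) for item in text.split(",")]


if __name__ == "__main__":
    print(parse_epochs("32-42,88-98,95-105,164-174"))
    print(parse_times("37,93,100,170"))
```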

If you have already run the above and saved results to your output folder (see below), you can use the pre-existing site-rate records rather than estimating those again with:

python tapir_compute.py Input_Folder_of_Site_Rate_JSON_Files/ Input.tree \
 --output Output_Directory \
 --epochs=32-42,88-98,95-105,164-174 \
 --times=37,93,100,170 \
 --multiprocessing \
 --site-rates


Results

tapir writes results to a sqlite database in the output directory of your choosing. This directory also holds site rate files in JSON format for each locus passed through tapir_compute.py.

You can access the results in the database as follows. For more examples, including plotting, see the documentation.

- crank up sqlite:

 sqlite3 Output_Directory/phylogenetic-informativeness.sqlite

- get integral data for all epochs:

 select locus, interval, pi from loci, interval where loci.id = interval.id;

- get integral data for a specific epoch:

 select locus, interval, pi from loci, interval
 where interval = '95-105' and loci.id = interval.id;


- get the count of loci having max(PI) at different epochs:

 create temporary table max as select id, max(pi) as max from interval group by id;

 create temporary table t as select interval.id, interval, max from interval, max
 where interval.pi = max.max and interval.id = max.id;

 select interval, count(*) from t group by interval;
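
The same queries can also be run from Python with the standard-library sqlite3 module. The sketch below is a minimal illustration, not part of tapir. It assumes only the tables and columns that appear in the queries above (loci.id, loci.locus, interval.id, interval.interval, interval.pi); the real database may hold more. The function names and the toy rows in build_demo_db are made up for the example.

```python
import sqlite3


def integrals_for_epoch(conn, epoch):
    """Return (locus, interval, pi) rows for one epoch label, e.g. '95-105'."""
    query = (
        "select locus, interval, pi from loci, interval "
        "where interval = ? and loci.id = interval.id"
    )
    return conn.execute(query, (epoch,)).fetchall()


def count_loci_by_peak_epoch(conn):
    """Count the loci whose largest pi falls in each epoch."""
    query = (
        "select i.interval, count(*) "
        "from interval as i "
        "join (select id, max(pi) as max_pi from interval group by id) as m "
        "on i.id = m.id and i.pi = m.max_pi "
        "group by i.interval"
    )
    return dict(conn.execute(query).fetchall())


def build_demo_db():
    """Build an in-memory stand-in with made-up values (assumed schema)."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        create table loci (id integer primary key, locus text);
        create table interval (id integer, interval text, pi real);
        """
    )
    conn.executemany(
        "insert into loci values (?, ?)",
        [(1, "locus-a"), (2, "locus-b")],
    )
    conn.executemany(
        "insert into interval values (?, ?, ?)",
        [
            (1, "32-42", 0.5), (1, "95-105", 1.5),
            (2, "32-42", 2.0), (2, "95-105", 1.0),
        ],
    )
    return conn


if __name__ == "__main__":
    demo = build_demo_db()
    print(integrals_for_epoch(demo, "95-105"))
    print(count_loci_by_peak_epoch(demo))
```

To use this against real output, replace build_demo_db() with sqlite3.connect("Output_Directory/phylogenetic-informativeness.sqlite").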


Acknowledgements

We thank Francesc Lopez-Giraldez and Jeffrey Townsend for providing us with a copy of their web-application source code. BCF thanks S Hubbell and P Gowaty.

last updated on:
November 6th, 2011, 11:37 GMT
developed by:
Brant Faircloth, Jonathan Chang and Mi...
license type:
BSD License 