CorpusSearch 2.002.71

A tool that finds syntactic structures in a corpus

What's new in CorpusSearch 2.002.68:

  • Added extend_span to revision software.
  • More cleaning up of "collapse".
GPL (GNU General Public License) 
Beth Randall
CorpusSearch is a tool that finds syntactic structures in a corpus of annotated sentence trees. It can be used as a research tool on a corpus, or as a development tool for building the corpus.

CorpusSearch 2 is a Java program that supports research in corpus linguistics. It is useful both for the construction of syntactically annotated (parsed) corpora and for searching them.

Both the input and output files of CorpusSearch are ordinary text files, with syntactic annotations in the Penn Treebank format.
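To give a feel for what a bracketed, Penn Treebank-style annotation looks like as plain text, here is a minimal sketch in Java that reads one tree and lists its words. It is not CorpusSearch code and does not use any CorpusSearch API; the class name `BracketedTree`, its methods, and the sample sentence are all made up for illustration, and real corpus files carry details (IDs, traces, comments) that this sketch ignores.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration only: a tiny reader for one bracketed tree.
public class BracketedTree {
    public final String label;                         // node label ("NP") or a word ("dog")
    public final List<BracketedTree> children = new ArrayList<>();

    BracketedTree(String label) { this.label = label; }

    // Parse a single tree such as "(S (NP (D the) (N dog)) (VP (V barked)))".
    public static BracketedTree parse(String text) {
        List<String> tokens = new ArrayList<>();
        // Pad the brackets with spaces so they become separate tokens.
        for (String t : text.replace("(", " ( ").replace(")", " ) ").trim().split("\\s+")) {
            if (!t.isEmpty()) tokens.add(t);
        }
        int[] pos = {0};
        BracketedTree tree = parseNode(tokens, pos);
        if (pos[0] != tokens.size()) {
            throw new IllegalArgumentException("unexpected text after the tree");
        }
        return tree;
    }

    private static BracketedTree parseNode(List<String> tokens, int[] pos) {
        if (pos[0] >= tokens.size()) throw new IllegalArgumentException("unexpected end of input");
        String tok = tokens.get(pos[0]++);
        if (tok.equals(")")) throw new IllegalArgumentException("unexpected ')'");
        if (!tok.equals("(")) return new BracketedTree(tok);   // a bare word
        if (pos[0] >= tokens.size()) throw new IllegalArgumentException("missing label");
        String label = "";
        // A "(" directly after "(" means an unlabeled wrapper node.
        if (!tokens.get(pos[0]).equals("(")) {
            label = tokens.get(pos[0]++);
            if (label.equals(")")) throw new IllegalArgumentException("empty brackets");
        }
        BracketedTree node = new BracketedTree(label);
        while (pos[0] < tokens.size() && !tokens.get(pos[0]).equals(")")) {
            node.children.add(parseNode(tokens, pos));
        }
        if (pos[0] >= tokens.size()) throw new IllegalArgumentException("missing ')'");
        pos[0]++;                                               // consume ")"
        return node;
    }

    // Collect the childless nodes (the words) from left to right.
    public List<String> words() {
        List<String> out = new ArrayList<>();
        collect(this, out);
        return out;
    }

    private static void collect(BracketedTree n, List<String> out) {
        if (n.children.isEmpty()) out.add(n.label);
        else for (BracketedTree c : n.children) collect(c, out);
    }

    public static void main(String[] args) {
        BracketedTree t = parse("(S (NP (D the) (N dog)) (VP (V barked)))");
        System.out.println(t.label + " " + t.words());   // prints: S [the, dog, barked]
    }
}
```

The point of the sketch is only that such files are ordinary text: each pair of parentheses encloses a label followed by either a word or further bracketed constituents.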


1. Download CS.jar.
2. Put the file in a convenient place.
3. Open a terminal.
4. Assuming that you have put CS.jar into the folder FOO, the following line will start CorpusSearch in any flavor of Unix that has Java installed (including Mac OS X):

% java -classpath /FOO/CS.jar csearch/CorpusSearch

Don't type the '%'; it stands for the terminal prompt. Note that we are assuming Unix path syntax and that FOO is a top-level directory. The classpath must give the full path to CS.jar, written in the path syntax of your system.
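Since the classpath has to be a full path, one way to avoid typing it wrong is to let Java resolve it. The following is a rough sketch, not part of CorpusSearch: the class `LaunchCommand` and its `build` method are hypothetical helpers that turn a location of CS.jar into the launch line shown above. Only the `java -classpath ... csearch/CorpusSearch` form comes from the instructions; everything else is an assumption made for illustration.

```java
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.List;

// Hypothetical helper: builds the launch command for a given CS.jar location.
public class LaunchCommand {

    // Resolve the jar location to a full (absolute, normalized) path and
    // assemble the same command that the instructions give for the terminal.
    public static List<String> build(String jarLocation) {
        Path jar = Paths.get(jarLocation).toAbsolutePath().normalize();
        return List.of("java", "-classpath", jar.toString(), "csearch/CorpusSearch");
    }

    public static void main(String[] args) {
        // Default to the example location from the instructions (assumed Unix path).
        String location = args.length > 0 ? args[0] : "/FOO/CS.jar";
        // Print the command so it can be copied into a terminal.
        System.out.println(String.join(" ", build(location)));
    }
}
```

Run with a relative location such as `CS.jar`, the sketch prints the command with the path expanded relative to the current directory, which is the form the classpath needs.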

Last updated on February 18th, 2010
