dbacl is a digramic Bayesian text classifier.
It can be used to sort incoming email into arbitrary categories such as spam, work, and play, or simply to distinguish English text from French text.
It fully supports international character sets, and uses sophisticated statistical models based on the Maximum Entropy Principle.
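To make the idea of digram-based Bayesian classification concrete, here is a minimal Python sketch. It is not dbacl's implementation: dbacl is written in C and builds maximum entropy models, whereas this toy uses a plain naive Bayes score with add-one smoothing and a uniform prior over categories. The class name DigramNaiveBayes, its methods, and the two sample sentences are all hypothetical and exist only for illustration.

```python
import math
from collections import Counter


def digrams(text):
    """Return the adjacent word pairs (digrams) found in text."""
    words = text.lower().split()
    return list(zip(words, words[1:]))


class DigramNaiveBayes:
    """Toy classifier over word digrams (illustrative, not dbacl's model)."""

    def __init__(self):
        self.counts = {}    # category -> Counter of digram frequencies
        self.totals = {}    # category -> total number of digrams seen
        self.vocab = set()  # every digram seen in any category

    def learn(self, category, text):
        grams = digrams(text)
        self.counts.setdefault(category, Counter()).update(grams)
        self.totals[category] = self.totals.get(category, 0) + len(grams)
        self.vocab.update(grams)

    def score(self, category, text):
        """Log-likelihood of text under category, with add-one smoothing."""
        seen = self.counts[category]
        # The extra +1 reserves probability mass for unseen digrams.
        denom = self.totals[category] + len(self.vocab) + 1
        return sum(math.log((seen[g] + 1) / denom) for g in digrams(text))

    def classify(self, text):
        """Return the category with the highest score (uniform prior)."""
        return max(self.counts, key=lambda cat: self.score(cat, text))


clf = DigramNaiveBayes()
clf.learn("english", "the cat sat on the mat and the dog sat on the rug")
clf.learn("french", "le chat est sur le tapis et le chien est sur le lit")
print(clf.classify("the dog sat on the mat"))
```

Each category is learned from sample text first, and a new document is then assigned to whichever category gives it the highest score. Real training corpora would be far larger than these one-line samples.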
The dbacl project includes a tutorial or two and a mathematical design paper (.ps.gz). Online manual pages are also available for dbacl, bayesol, mailcross, mailtoe, mailfoot, and mailinspect.
I have found two uses for dbacl so far:
* As an automated Bayesian email classification tool, it can recognize spam, and more generally sort incoming email into any number of categories such as work, play, etc.
* As a noise filter, it is useful during the indexing of personal document collections.
Both dbacl and its companion programs are written in C and run on UNIX/POSIX systems.
What's New in This Release:
* This is a hodge-podge of fixes and improvements.
* A new hypex command, the TREC 2005 options files, and an essay on chess are now in the tarball.
* Several improvements to the parsing engine were made, including a new -e char option and bugfixes.
* Compilation problems on various architectures were fixed, and libslang2 support was added.