    AI::Categorizer 0.09

    Downloads: 394
    User Rating: Very Good (4.0/5)
    Rated by: 2 user(s)
    Developer: Ken Williams
    License / Price: GPL / FREE
    Last Updated: March 3rd, 2008, 11:38 GMT
    Category: ROOT / Programming / Perl Modules


    AI::Categorizer description

    AI::Categorizer is a Perl module for automatic text categorization.

    SYNOPSIS

    use AI::Categorizer;
    my $c = AI::Categorizer->new(...parameters...);

    # Run a complete experiment - training on a corpus, testing on a test
    # set, printing a summary of results to STDOUT
    $c->run_experiment;

    # Or, run the parts of $c->run_experiment separately
    $c->scan_features;
    $c->read_training_set;
    $c->train;
    $c->evaluate_test_set;
    print $c->stats_table;

    # After training, use the Learner for categorization
    my $l = $c->learner;
    while (...) {
        my $d = ...create a document...
        my $hypothesis = $l->categorize($d); # An AI::Categorizer::Hypothesis object
        print "Assigned categories: ", join(', ', $hypothesis->categories), "\n";
        print "Best category: ", $hypothesis->best_category, "\n";
    }

    AI::Categorizer is a framework for automatic text categorization. It consists of a collection of Perl modules that implement common categorization tasks, and a set of defined relationships among those modules. The various details are flexible - for example, you can choose what categorization algorithm to use, what features (words or otherwise) of the documents should be used (or how to automatically choose these features), what format the documents are in, and so on.

    The basic process of using this module will typically involve obtaining a collection of pre-categorized documents, creating a "knowledge set" representation of those documents, training a categorizer on that knowledge set, and saving the trained categorizer for later use. There are several ways to carry out this process. The top-level AI::Categorizer module provides an umbrella class for high-level operations, or you may use the interfaces of the individual classes in the framework.
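    The process described above can be sketched with the individual classes rather than the umbrella class. This is a minimal sketch, not the author's reference workflow: it assumes AI::Categorizer and Algorithm::NaiveBayes are installed from CPAN, and that make_document, train, save_state and categorize behave as described in the module documentation. The four-document toy corpus, the document names, and the temporary save path are hypothetical.

```perl
use strict;
use warnings;
use File::Temp qw(tempdir);
use AI::Categorizer::KnowledgeSet;
use AI::Categorizer::Document;
use AI::Categorizer::Learner::NaiveBayes;

# Hypothetical pre-categorized corpus: [ name, category names, raw text ].
my @training = (
    [ doc1 => ['farming'], 'Sheep are very valuable in farming.' ],
    [ doc2 => ['farming'], 'Farming requires many kinds of animals.' ],
    [ doc3 => ['vampire'], 'Vampires drink blood and may be staked.' ],
    [ doc4 => ['vampire'], 'Vampires cannot see their images in mirrors.' ],
);

# 1. Create a "knowledge set" representation of the documents.
my $k = AI::Categorizer::KnowledgeSet->new;
for my $doc (@training) {
    my ($name, $categories, $content) = @$doc;
    $k->make_document(
        name       => $name,
        categories => $categories,
        content    => $content,
    );
}

# 2. Train a categorizer on the knowledge set (Naive Bayes chosen
#    here only as an example; other Learner subclasses exist).
my $learner = AI::Categorizer::Learner::NaiveBayes->new;
$learner->train( knowledge_set => $k );

# 3. Save the trained categorizer for later use (path is an assumption).
my $state_path = tempdir( CLEANUP => 1 ) . '/learner_state';
$learner->save_state($state_path);

# 4. Categorize a previously unseen document.
my $unseen = AI::Categorizer::Document->new(
    name    => 'unseen',
    content => 'I would like to begin farming sheep.',
);
my $hypothesis = $learner->categorize($unseen);
print "Best category: ", $hypothesis->best_category, "\n";
```

    With so few training documents the result only shows that the pipeline runs end to end; it says nothing about accuracy on a real corpus.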

    A simple sample script that reads a training corpus, trains a categorizer, and tests it on a test corpus is distributed as eg/demo.pl.

    Disclaimer: the results of any of the machine learning algorithms are far from infallible (close to fallible?). Categorization of documents is often a difficult task even for humans well-trained in the particular domain of knowledge, and there are many things a human would consider that none of these algorithms consider. These are only statistical tests - at best they are neat tricks or helpful assistants, and at worst they are totally unreliable. If you plan to use this module for anything really important, human supervision is essential, both of the categorization process and the final results.

    For the usage details, please see the documentation of each individual module.

    Requirements:

    · Perl

      


    TAGS:

    text categorization | automatic categorization | Perl module | AI | text | categorization


