    Text::Ngrams 2.002

    Developer: Simon Cozens
    License / Price: Perl Artistic License / FREE
    Last Updated: August 23rd, 2007, 02:05 GMT
    Category: ROOT / Programming / Perl Modules


    Text::Ngrams description

    Text::Ngrams provides flexible n-gram analysis of characters, words, and other tokens.

    SYNOPSIS

    For default character n-gram analysis of a string:

    use Text::Ngrams;
    my $ng3 = Text::Ngrams->new;
    $ng3->process_text('abcdefg1235678hijklmnop');
    print $ng3->to_string;
    my @ngramsarray = $ng3->get_ngrams;

    One can also feed tokens manually:

    use Text::Ngrams;
    my $ng3 = Text::Ngrams->new;
    $ng3->feed_tokens('a');
    $ng3->feed_tokens('b');
    $ng3->feed_tokens('c');
    $ng3->feed_tokens('d');
    $ng3->feed_tokens('e');
    $ng3->feed_tokens('f');
    $ng3->feed_tokens('g');
    $ng3->feed_tokens('h');

    We can choose n-grams of various sizes, e.g.:

    my $ng = Text::Ngrams->new( windowsize => 6 );

    or different types of n-grams, e.g.:

    # Alternatives (use one); the type name is quoted so it is valid under "use strict".
    my $ng = Text::Ngrams->new( type => 'byte' );
    my $ng = Text::Ngrams->new( type => 'word' );
    my $ng = Text::Ngrams->new( type => 'utf8' );

    To process a list of files:

    $ng->process_files('somefile.txt', 'otherfile.txt');


    This module implements text n-gram analysis and supports several types of analysis, including character and word n-grams.

    Text::Ngrams is very flexible. A user can manually feed it a sequence of arbitrary tokens. It handles several token types (character, word), and both the automatic recognition and feeding of tokens and the way tokens are combined into n-grams can be adjusted. It counts the frequencies of all n-grams up to the specified maximum length. The output format is meant to be human-readable while also being loadable by the module.
    The module can be used from the command line through the script ngrams.pl provided with the package.
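
    To make "counting all n-gram frequencies up to a maximum length" concrete, here is a minimal pure-Perl sketch of the idea. It does not use Text::Ngrams and is not the module's implementation; the helper name count_ngrams, its arguments, and the sample strings are assumptions made up for illustration, and the module's real tokenization and counting rules may differ.

```perl
use strict;
use warnings;

# Hypothetical helper for illustration only; not part of Text::Ngrams.
# Counts every n-gram of length 1..$max_n over a list of tokens.
# Returns a hash ref of the form { n => { "joined n-gram" => count } }.
sub count_ngrams {
    my ($tokens, $max_n, $sep) = @_;
    $sep = '' unless defined $sep;
    my %counts;
    for my $n (1 .. $max_n) {
        # Slide a window of $n tokens across the list.
        for my $i (0 .. @$tokens - $n) {
            my $gram = join $sep, @{$tokens}[$i .. $i + $n - 1];
            $counts{$n}{$gram}++;
        }
    }
    return \%counts;
}

# Character n-grams: every character is one token.
my $char_counts = count_ngrams([ split //, 'abcabc' ], 3);

# Word n-grams: whitespace-separated words are the tokens.
my $word_counts = count_ngrams([ split ' ', 'to be or not to be' ], 2, ' ');

# Print the character trigrams, most frequent first (ties in alphabetical order).
for my $gram (sort { $char_counts->{3}{$b} <=> $char_counts->{3}{$a} || $a cmp $b }
              keys %{ $char_counts->{3} }) {
    print "$gram\t$char_counts->{3}{$gram}\n";
}
```

    The same routine serves both the character and the word case because only the token list changes, which mirrors the description's point that any sequence of tokens can be fed in. In this sketch the trigram "abc" is counted twice for the input 'abcabc', and the word bigram "to be" is counted twice for 'to be or not to be'.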



    Requirements:

    · Perl

      


