Softpedia
 



    Text::Affixes 0.07

    Downloads: 514
    User Rating: Excellent (5.0/5)
    Rated by: 1 user(s)
    Developer: Jose Castro
    License / Price: Perl Artistic License / FREE
    Last Updated: August 23rd, 2007, 10:05 GMT
    Category: ROOT / Programming / Perl Modules


    Text::Affixes description

    Text::Affixes - prefixes and suffixes analysis of text.

    SYNOPSIS

    use Text::Affixes;
    my $text = "Hello, world. Hello, big world.";
    my $prefixes = get_prefixes($text);

    # $prefixes now holds
    # {
    #   3 => {
    #     'Hel' => 2,
    #     'wor' => 2,
    #   }
    # }

    # or

    $prefixes = get_prefixes({min => 1, max => 2},$text);

    # $prefixes now holds
    # {
    #   1 => {
    #     'H' => 2,
    #     'w' => 2,
    #     'b' => 1,
    #   },
    #   2 => {
    #     'He' => 2,
    #     'wo' => 2,
    #     'bi' => 1,
    #   }
    # }


    # get_suffixes is used in the same way

    METHODS

    get_prefixes

    Extracts prefixes from text. You can specify the minimum and maximum prefix length, in characters.

    Returns a reference to a hash whose keys are the prefix lengths within the specified limits. Each value is itself a hash that maps every prefix of that length found in the text to the number of times it was found.
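
As an illustration of that return shape, here is a minimal sketch that walks a nested hash of the kind shown in the synopsis. It does not call the module: the data is a literal copy of the documented result for `{min => 1, max => 2}`, and `report_lines` is a hypothetical helper name introduced only for this example.

```perl
use strict;
use warnings;

# Literal copy of the structure documented in the synopsis for
# get_prefixes({min => 1, max => 2}, $text); no module call is made here.
my $prefixes = {
    1 => { 'H'  => 2, 'w'  => 2, 'b'  => 1 },
    2 => { 'He' => 2, 'wo' => 2, 'bi' => 1 },
};

# Hypothetical helper: one "length affix count" line per entry,
# sorted by length and then by affix so the order is stable.
sub report_lines {
    my ($affixes) = @_;
    my @lines;
    for my $length (sort { $a <=> $b } keys %$affixes) {
        for my $affix (sort keys %{ $affixes->{$length} }) {
            push @lines, "$length $affix $affixes->{$length}{$affix}";
        }
    }
    return @lines;
}

print "$_\n" for report_lines($prefixes);
```

The outer keys are lengths and the inner keys are the affixes themselves, so two nested loops are enough to visit every count.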

    By default, both the minimum and maximum limits are 3. If the minimum limit is greater than the maximum one, an empty hash is returned.

    A prefix is considered to be a sequence of word characters (\w) at the beginning of a word (that is, after a word boundary) that does not reach the end of the word (in regular expression terms, a prefix is the $1 of /\b(\w+)\w/).

    # extracting prefixes of size 3
    $prefixes = get_prefixes( $text );

    # extracting prefixes of sizes 2 and 3
    $prefixes = get_prefixes( {min => 2}, $text );

    # extracting prefixes of sizes 3 and 4
    $prefixes = get_prefixes( {max => 4}, $text );

    # extracting prefixes of sizes 2, 3 and 4
    $prefixes = get_prefixes( {min => 2, max => 4}, $text );
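
The prefix rule described above can be sketched in a few lines of plain Perl. This is not the module's implementation: `sketch_prefixes` is a hypothetical helper that takes the limits as positional arguments, ignores the configuration options, and treats digits as word characters because `\w` matches them.

```perl
use strict;
use warnings;

# Hypothetical helper, NOT the module's code: counts prefixes of every
# length from $min to $max using the documented prefix definition.
sub sketch_prefixes {
    my ($text, $min, $max) = @_;
    $min = 3 unless defined $min;     # documented default
    $max = 3 unless defined $max;     # documented default
    my %found;
    return \%found if $min > $max;    # documented: empty hash

    for my $len ($min .. $max) {
        # \b anchors at the start of a word; the trailing \w ensures
        # the prefix does not reach the end of the word.
        while ($text =~ /\b(\w{$len})\w/g) {
            $found{$len}{$1}++;
        }
    }
    return \%found;
}

my $text    = "Hello, world. Hello, big world.";
my $default = sketch_prefixes($text);          # length 3 only
my $ranged  = sketch_prefixes($text, 1, 2);    # lengths 1 and 2

# "big" contributes 'b' and 'bi' but no length-3 prefix, because a
# prefix must stop before the end of the word.
printf "%d distinct prefixes of length 3\n", scalar keys %{ $default->{3} };
```

For the synopsis text, this sketch produces the same counts as the documented results.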

    get_suffixes


    The get_suffixes function works like get_prefixes. Read the documentation for that function first, then come back to this point.

    A suffix is considered to be a sequence of word characters (\w) at the end of a word (that is, before a word boundary) that does not start at the beginning of the word (in regular expression terms, a suffix is the $1 of /\w(\w+)\b/).

    # extracting suffixes of size 3
    $suffixes = get_suffixes( $text );

    # extracting suffixes of sizes 2 and 3
    $suffixes = get_suffixes( {min => 2}, $text );

    # extracting suffixes of sizes 3 and 4
    $suffixes = get_suffixes( {max => 4}, $text );

    # extracting suffixes of sizes 2, 3 and 4
    $suffixes = get_suffixes( {min => 2, max => 4}, $text );
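
The suffix rule mirrors the prefix one, with the extra word character placed in front of the capture instead of behind it. The sketch below assumes nothing about the module's internals: `sketch_suffixes` is a hypothetical helper with positional limits, and it counts digits as word characters because `\w` matches them.

```perl
use strict;
use warnings;

# Hypothetical helper, NOT the module's code: counts suffixes of every
# length from $min to $max using the documented suffix definition.
sub sketch_suffixes {
    my ($text, $min, $max) = @_;
    $min = 3 unless defined $min;     # documented default
    $max = 3 unless defined $max;     # documented default
    my %found;
    return \%found if $min > $max;    # documented: empty hash

    for my $len ($min .. $max) {
        # The leading \w ensures the suffix does not start at the
        # beginning of the word; \b anchors at the end of the word.
        while ($text =~ /\w(\w{$len})\b/g) {
            $found{$len}{$1}++;
        }
    }
    return \%found;
}

my $suffixes = sketch_suffixes("Hello, world. Hello, big world.");
# $suffixes now holds { 3 => { 'llo' => 2, 'rld' => 2 } }
```

As with prefixes, the three-letter word "big" yields no length-3 suffix, since the suffix would cover the whole word.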


    OPTIONS

    Apart from deciding on a minimum and maximum size for prefixes or suffixes, you can also decide on some configuration options.

    exclude_numbers

    Set to 0 if you consider numbers as part of words. Default value is 1.

    # this
    get_suffixes( {min => 1, max => 1, exclude_numbers => 0}, "Hello, but w8" );

    # returns this:
    {
      1 => {
        'o' => 1,
        't' => 1,
        '8' => 1
      }
    }
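
One way to picture the effect of exclude_numbers is as a switch between two character classes. The sketch below is an assumption about how such a switch could work, not a description of the module's code: `sketch_suffixes_opt` is a hypothetical helper, and with digits excluded it treats a digit as a character that ends a word. The module's actual behaviour with exclude_numbers set to 1 may differ.

```perl
use strict;
use warnings;

# Hypothetical helper illustrating ONE reading of exclude_numbers;
# the module's actual handling of digits may differ.
sub sketch_suffixes_opt {
    my ($text, $len, $exclude_numbers) = @_;

    # Word characters, optionally without digits.
    my $class = $exclude_numbers ? qr/[^\W\d]/ : qr/\w/;

    my %count;
    # A suffix here is $len class characters, preceded by one class
    # character and not followed by another one.
    while ($text =~ /$class((?:$class){$len})(?!$class)/g) {
        $count{$1}++;
    }
    return \%count;
}

my $with_digits    = sketch_suffixes_opt("Hello, but w8", 1, 0);
my $without_digits = sketch_suffixes_opt("Hello, but w8", 1, 1);

# With digits counted as word characters, "w8" contributes the suffix '8',
# matching the documented example. In this sketch, excluding digits leaves
# "w8" with a single letter, so it contributes no suffix at all.
print join(',', sort keys %$with_digits), "\n";
print join(',', sort keys %$without_digits), "\n";
```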

    lowercase

    Set to 1 to extract all prefixes in lowercase mode. Default value is 0.

    ATTENTION: This does not mean that prefixes with uppercased characters won't be extracted. It means they will be extracted after being lowercased.

    # this...
    get_prefixes( {min => 2, max => 2, lowercase => 1}, "Hello, hello");

    # returns this:
    {
      2 => {
        'he' => 2
      }
    }
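
The effect of the lowercase option can be reproduced by lowercasing the text before counting. The sketch below assumes that approach; `sketch_prefixes_lc` is a hypothetical helper for a single prefix length and is not taken from the module.

```perl
use strict;
use warnings;

# Hypothetical helper, NOT the module's code: counts prefixes of one
# length, optionally folding the text to lowercase first.
sub sketch_prefixes_lc {
    my ($text, $len, $lowercase) = @_;
    $text = lc $text if $lowercase;    # assumption: fold before matching

    my %count;
    while ($text =~ /\b(\w{$len})\w/g) {
        $count{$1}++;
    }
    return \%count;
}

my $folded   = sketch_prefixes_lc("Hello, hello", 2, 1);
my $unfolded = sketch_prefixes_lc("Hello, hello", 2, 0);

# Folded:   'He' and 'he' are merged into a single 'he' entry.
# Unfolded: 'He' and 'he' are counted separately.
printf "%d key(s) folded, %d key(s) unfolded\n",
    scalar keys %$folded, scalar keys %$unfolded;
```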




    Requirements:

    · Perl

      


    TAGS:

    text analyzer | text affixes | Perl module | suffixes | prefixes | affixes
