    NoAho 0.9.02

    Developer: Jeff Donner
    License / Price: MIT/X Consortium Lic... / FREE
    Last Updated: March 21st, 2012, 14:33 GMT
    Category: ROOT / Text Editing&Processing / Others


    NoAho description

    Non-Overlapping Aho-Corasick Trie

    NoAho provides fast, non-overlapping simultaneous multiple keyword search.

    Features:
    - 'short' and 'long' (longest matching key) searches, both one-off and iteration over all non-overlapping keyword matches in some text.
    - Works with both unicode and str in Python 2, and unicode in Python 3 (it's all UCS4 under the hood).
    - Allows you to associate an arbitrary Python object payload with each keyword, and supports dict operations len(), [], and 'in' for the keywords (though no del or traversal).
    - Does the 'compilation' (generation of Aho-Corasick failure links) of the trie on-demand; you can mix adding keywords and searching text freely.
    - Can be used commercially; it is under the minimal MIT license.
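To make the 'short' versus 'long' distinction and the payload mapping concrete, here is a minimal pure-Python sketch. It is not NoAho's code or API: the names `KeywordTrie`, `add`, `find`, `findall` and the `longest` flag are hypothetical, and the sketch restarts a plain trie walk at every text position instead of following Aho-Corasick failure links, so it shows the matching semantics rather than the speed.

```python
class KeywordTrie:
    """Toy keyword trie. Hypothetical names, not NoAho's actual API."""

    _MISSING = object()  # sentinel meaning "no keyword ends at this node"

    def __init__(self):
        self._root = {"children": {}, "payload": self._MISSING}
        self._count = 0

    def add(self, key, payload=None):
        if not key:
            raise ValueError("empty keyword")
        node = self._root
        for ch in key:
            node = node["children"].setdefault(
                ch, {"children": {}, "payload": self._MISSING})
        if node["payload"] is self._MISSING:
            self._count += 1  # count each distinct keyword once
        node["payload"] = payload

    def _lookup(self, key):
        node = self._root
        for ch in key:
            node = node["children"].get(ch)
            if node is None:
                return self._MISSING
        return node["payload"]

    # dict-like operations: len(), 'in', and [] (no deletion or iteration)
    def __len__(self):
        return self._count

    def __contains__(self, key):
        return self._lookup(key) is not self._MISSING

    def __getitem__(self, key):
        payload = self._lookup(key)
        if payload is self._MISSING:
            raise KeyError(key)
        return payload

    def _match_at(self, text, start, longest):
        """Return (end, payload) for a keyword beginning at start, or None."""
        node, best = self._root, None
        for i in range(start, len(text)):
            node = node["children"].get(text[i])
            if node is None:
                break
            if node["payload"] is not self._MISSING:
                best = (i + 1, node["payload"])
                if not longest:
                    break  # 'short' search stops at the first keyword end
        return best

    def find(self, text, start=0, longest=True):
        """Leftmost match at or after start, as (begin, end, payload)."""
        for begin in range(start, len(text)):
            hit = self._match_at(text, begin, longest)
            if hit is not None:
                return (begin, hit[0], hit[1])
        return None

    def findall(self, text, longest=True):
        """Yield non-overlapping matches, resuming after each match's end."""
        pos = 0
        while True:
            hit = self.find(text, pos, longest)
            if hit is None:
                return
            yield hit
            pos = hit[1]


trie = KeywordTrie()
trie.add("ms windows")
trie.add("ms windows 2000", "an arbitrary payload")
trie.add("windows", [1, 2, 3])

text = "windows 2000, ms windows 2000"
print(list(trie.findall(text)))                 # longest matching keys
print(list(trie.findall(text, longest=False)))  # shortest matching keys
```

Keywords can be added between searches here because the sketch has no compiled state at all; a real Aho-Corasick implementation would instead rebuild its failure links when a search follows an insertion.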

    Anti-Features:
    - Will not find overlapped keywords (e.g. given keywords "abcde" and "defgh", it will not find "defgh" in "abcdefgh", though it would find both in "abcdedefgh") unless you move along the string manually, one character at a time, which would defeat the purpose. The 'Acora' package is an alternative for this use.
    - Lacking overlap, find[all]_short is kind of useless.
    - Lacks key iteration and deletion from the mapping (dict) protocol.
    - Memory leaks are untested (should be OK, but ...).
    - No test case for unicode in Python 2 (it was tested manually, however).
    - Unicode characters are represented as UCS4, and each character has its own hash table, so it is relatively memory-heavy.
    - Requires a C++ compiler.
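The overlap limitation can be reproduced with a few lines of plain Python. This is only an illustrative sketch that uses `str.startswith` rather than a trie, and the function names `findall_nonoverlapping` and `findall_overlapping` are made up for this example; they are not part of NoAho.

```python
def findall_nonoverlapping(keywords, text):
    """Leftmost match, longest key first, then resume after the match."""
    by_length = sorted((k for k in keywords if k), key=len, reverse=True)
    matches, pos = [], 0
    while pos < len(text):
        hit = next((k for k in by_length if text.startswith(k, pos)), None)
        if hit is None:
            pos += 1
        else:
            matches.append((pos, hit))
            pos += len(hit)  # skipping past the match is what hides overlaps
    return matches


def findall_overlapping(keywords, text):
    """Restart at every character: finds overlaps, at the cost of speed."""
    return [(pos, k)
            for pos in range(len(text))
            for k in keywords
            if k and text.startswith(k, pos)]


keys = ["abcde", "defgh"]
print(findall_nonoverlapping(keys, "abcdefgh"))    # [(0, 'abcde')]
print(findall_nonoverlapping(keys, "abcdedefgh"))  # [(0, 'abcde'), (5, 'defgh')]
print(findall_overlapping(keys, "abcdefgh"))       # [(0, 'abcde'), (3, 'defgh')]
```

In "abcdefgh" the second keyword starts inside the first match, so the non-overlapping scan never sees it; in "abcdedefgh" it starts after the first match ends and is reported.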

    Bug reports and patches welcome of course!


    Requirements:

    · Python

      

