Softpedia
 



    guess-language 0.2

    Developer: Kent Johnson
    License / Price: LGPL / FREE
    Last Updated: August 2nd, 2010, 23:06 GMT
    Category: ROOT / Text Editing&Processing / Others


    guess-language description

    Guess the natural language of a text

    guess-language attempts to determine the natural language of a selection of Unicode (UTF-8) text.

    It is based on guesslanguage.cpp by Jacob R Rideout for KDE, which is itself based on Language::Guess by Maciej Ceglowski.

    Detects over 60 languages - all languages listed in the trigrams directory plus Japanese, Chinese, Korean and Greek.
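
    The four languages named apart from the trigrams directory are presumably recognized from the script they are written in; the description does not spell this out, so treat that as an assumption. A minimal sketch of script detection using Python's standard unicodedata module follows. The prefix table and the function name dominant_script are illustrative and are not taken from guess-language.

```python
import unicodedata
from collections import Counter

# Unicode character-name prefixes mapped to a script label.
# This table is an illustrative choice, not guess-language's own data.
SCRIPT_PREFIXES = {
    "GREEK": "Greek",
    "HANGUL": "Hangul",
    "HIRAGANA": "Kana",
    "KATAKANA": "Kana",
    "CJK": "Han",
}


def dominant_script(text):
    """Return the most common script label among the letters of text, or None."""
    counts = Counter()
    for ch in text:
        if not ch.isalpha():
            continue  # skip digits, punctuation and whitespace
        name = unicodedata.name(ch, "")
        for prefix, script in SCRIPT_PREFIXES.items():
            if name.startswith(prefix):
                counts[script] += 1
                break
        else:
            counts["Other"] += 1
    return counts.most_common(1)[0][0] if counts else None
```

    Script alone cannot separate Chinese from Japanese, since both use Han characters; the presence of Kana is the usual hint for Japanese.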

    guess_language uses heuristics based on the character set and trigrams in a sample text to detect the language. It works better with longer samples and will be confused if the sample text includes markup such as HTML tags.
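
    To illustrate the trigram idea, here is a minimal sketch of a rank-based trigram comparison in the style commonly used for language guessing. It is not guess-language's actual code: the function names, the penalty value, and the tiny training texts are all invented for this example, and real models are built from much larger corpora.

```python
from collections import Counter


def trigram_profile(text, top_n=300):
    """Rank the character trigrams of text, most frequent first."""
    cleaned = " ".join(text.lower().split())
    padded = " " + cleaned + " "
    counts = Counter(padded[i:i + 3] for i in range(len(padded) - 2))
    # Sort by descending count, then alphabetically so ties are deterministic.
    ranked = sorted(counts, key=lambda t: (-counts[t], t))
    return ranked[:top_n]


def out_of_place_distance(sample, model, max_penalty=300):
    """Sum rank differences; a trigram missing from the model costs max_penalty."""
    model_rank = {t: i for i, t in enumerate(model)}
    return sum(
        abs(i - model_rank[t]) if t in model_rank else max_penalty
        for i, t in enumerate(sample)
    )


def guess(text, models):
    """Return the key of the model whose profile is closest to the text."""
    sample = trigram_profile(text)
    return min(models, key=lambda lang: out_of_place_distance(sample, models[lang]))


# Toy training texts, invented for this example only.
TRAINING = {
    "en": ("the quick brown fox jumps over the lazy dog and then "
           "the dog runs after the fox through the field"),
    "fr": ("le renard brun saute par dessus le chien paresseux et puis "
           "le chien court après le renard dans le champ"),
}
MODELS = {lang: trigram_profile(txt) for lang, txt in TRAINING.items()}

print(guess("the dog and the fox", MODELS))
print(guess("le chien et le renard", MODELS))
```

    A short sample yields few trigrams, so a handful of unusual words can dominate the distance. That is consistent with the note above that longer samples work better.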

    Usage

    The main entry points all take a single string as input and return a language identifier. The string must be Unicode or UTF-8 text. The language identifier can be the language name in English, the two- or three-letter IANA language code, a language ID, or a tuple containing all three.
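
    Because the input must be Unicode or UTF-8 text, and markup such as HTML tags confuses the detector, it can help to clean a sample before passing it in. The helper below is a sketch using only the Python standard library; prepare_sample is a hypothetical name and is not part of guess-language.

```python
from html.parser import HTMLParser


class _TextExtractor(HTMLParser):
    """Collect only the text nodes of an HTML fragment."""

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.parts = []

    def handle_data(self, data):
        self.parts.append(data)


def prepare_sample(raw, encoding="utf-8"):
    """Decode bytes if needed, drop HTML tags, and collapse whitespace."""
    text = raw.decode(encoding) if isinstance(raw, bytes) else raw
    extractor = _TextExtractor()
    extractor.feed(text)
    extractor.close()  # flush any text still buffered by the parser
    return " ".join("".join(extractor.parts).split())
```

    This sketch keeps the contents of script and style elements, which the parser reports as ordinary text; pages containing them would need extra filtering.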

    The primary entry points, and the return values, are as follows:

    guessLanguage(txt) - IANA language code
    guessLanguageTag(txt) - IANA language code (same as guessLanguage)
    guessLanguageName(txt) - Language name in English
    guessLanguageId(txt) - language ID
    guessLanguageInfo(txt) - tuple of (IANA code, id, name)
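
    The sketch below shows how a caller might unpack the (IANA code, id, name) tuple listed above. So that it runs without the library installed, it uses a hypothetical stand-in, stub_guess_language_info, that always answers French; the id value 0 is a placeholder, not a real guess-language ID, and the namedtuple is only a convenience for this example.

```python
from collections import namedtuple

# Same field order as the tuple described above: (IANA code, id, name).
LanguageInfo = namedtuple("LanguageInfo", ["tag", "id", "name"])


def stub_guess_language_info(txt):
    """Hypothetical stand-in for guessLanguageInfo; always answers French."""
    # The id 0 is a placeholder, not a real guess-language ID.
    return LanguageInfo(tag="fr", id=0, name="French")


def describe(txt, info_fn=stub_guess_language_info):
    """Format the result of an info-style entry point for display."""
    tag, lang_id, name = info_fn(txt)
    return "{} ({}, id {})".format(name, tag, lang_id)


print(describe("Ces eaux regorgent de renégats et de voleurs."))
```

    With the library installed, the real guessLanguageInfo could be passed as info_fn instead, provided it returns the three values in the order listed above.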




    Requirements:

    · Python

      

