Softpedia
 



    lang-detect 0.0.1

    Downloads: 92
    User Rating: NOT RATED (0 users)
    Developer: Mingli Yuan
    License / Price: BSD License / FREE
    Last Updated: August 20th, 2011, 01:06 GMT
    Category: ROOT / Utilities


    lang-detect description

    A tool for detecting the language of a small piece of Unicode text, with no dependencies on other libraries

    lang-detect is a Python tool for language detection. It detects the language of a small piece of Unicode text without depending on any other library.

    It currently supports de, en, es, fr, it, ja, nl, pl, ru, zh-hans, zh-hant, and zh-yue.

    Simple testing suggests that results are better for longer sentences.

    Method

    We focus on the Basic Multilingual Plane of Unicode. The set of supported languages can be extended.

    Each language is represented by a normalized n-gram vector. These vectors can be found in the data folder.

    To detect the language of a text, we generate the normalized n-gram vector for that text and compute the cosine of the angle between the text vector and each language vector. The closer the cosine is to 1, the better the match.

    To build the language vectors, we use featured articles on Wikipedia as the corpus.
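
    The method described above can be sketched roughly as follows. This is a minimal illustration, not the project's actual code: the function names (ngram_vector, cosine, detect), the choice of character bigrams (n=2), and the toy corpora are all assumptions made for the example, and "uniformed" is read here as "scaled to unit length".

```python
import math
from collections import Counter


def ngram_vector(text, n=2):
    """Return a unit-length character n-gram frequency vector as a dict.

    Hypothetical helper: n=2 is an assumed n-gram size, not the project's.
    """
    # Keep only characters in the Basic Multilingual Plane (up to U+FFFF).
    chars = [ch for ch in text.lower() if ord(ch) <= 0xFFFF]
    counts = Counter("".join(chars[i:i + n]) for i in range(len(chars) - n + 1))
    norm = math.sqrt(sum(c * c for c in counts.values()))
    if norm == 0:
        return {}
    return {gram: c / norm for gram, c in counts.items()}


def cosine(u, v):
    """Cosine of the angle between two unit-length sparse vectors."""
    if len(u) > len(v):
        u, v = v, u  # iterate over the smaller vector
    return sum(w * v.get(gram, 0.0) for gram, w in u.items())


def detect(text, language_vectors, n=2):
    """Return the language code whose vector is closest to the text's vector."""
    text_vec = ngram_vector(text, n)
    return max(language_vectors,
               key=lambda lang: cosine(text_vec, language_vectors[lang]))


# Toy corpora standing in for the Wikipedia-derived data; purely illustrative.
toy_corpora = {
    "en": "the quick brown fox jumps over the lazy dog and the cat",
    "de": "der schnelle braune fuchs springt über den faulen hund und die katze",
}
language_vectors = {lang: ngram_vector(text) for lang, text in toy_corpora.items()}
print(detect("the dog and the fox", language_vectors))
```

    Because both vectors are scaled to unit length, the cosine reduces to a plain dot product, which keeps the comparison cheap. A very short input yields few n-grams, which is consistent with the observation that longer sentences give better results.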

    Usage

    cd to the project root, then run:

    bin/langdetect YOUR_SENTENCE_HERE



    Requirements:

    · Python

      




