Softpedia

    Unihandecode 0.31

    Developer: Hiroshi Miura
    License / Price: GPL v3 / FREE
    Last Updated: February 19th, 2011, 22:30 GMT
    Category: ROOT / Programming / Libraries


    Unihandecode description

    US-ASCII transliterations of Unicode text

    Unihandecode is a fork of unidecode that transliterates Unicode text into US-ASCII according to how the text is read in each native language. It is written for Python.

    The documentation of the original unidecode (http://search.cpan.org/~sburke/Text-Unidecode-0.04/lib/Text/Unidecode.pm) explains the motivation:

    "It often happens that you have non-Roman text data in Unicode, but you can't display it -- usually because you're trying to show it to a user via an application that doesn't support Unicode, or because the fonts you need aren't accessible. You could represent the Unicode characters as "???????" or "\15BA\15A0\1610...", but that's nearly useless to the user who actually wants to read what the text says."

    What unihandecode provides is a decode(...) function that takes Unicode data and tries to represent it in US-ASCII characters. There is a simple but serious problem with Chinese, Japanese and Korean characters. For historical reasons, similar CJK characters share the same code points in Unicode (Han unification), even though their shapes, pronunciations and meanings differ between languages.
    This is why I wanted to add a feature to unidecode that recognizes the user's preferred language and transliterates the text according to that language's readings.
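
    The idea of choosing readings by language can be sketched in a few lines of Python. This is a toy illustration only, not unihandecode's implementation or API: the name toy_decode and the READINGS table are hypothetical, and the table holds just two hand-picked characters with one sample reading per language.

```python
# Hypothetical sample data: one reading per character and language.
# A real transliterator needs far larger tables and, for Japanese,
# context-aware word readings rather than per-character lookup.
READINGS = {
    "zh": {"山": "Shan", "川": "Chuan"},
    "ja": {"山": "Yama", "川": "Kawa"},
    "ko": {"山": "San", "川": "Cheon"},
}


def toy_decode(text, lang):
    """Transliterate text to ASCII using the readings of the given language."""
    table = READINGS[lang]
    pieces = []
    for ch in text:
        if ord(ch) < 0x80:
            # ASCII characters pass through unchanged.
            pieces.append(ch)
        else:
            # Unknown characters become "?"; each reading is followed by a space.
            pieces.append(table.get(ch, "?") + " ")
    return "".join(pieces).strip()


for lang in ("zh", "ja", "ko"):
    print(lang, toy_decode("山川", lang))
```

    The same two code points produce three different ASCII strings, which is the behaviour a single language-neutral table cannot provide.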

    Sean M. Burke, the original author of unidecode, said:

    "Unidecode, in other words, is quick and dirty. Sometimes the output is not so dirty at all... But sometimes the output is very dirty: Unidecode does quite badly on Japanese and Thai."

    I am Japanese, and I am dissatisfied with unidecode's output because of the limitations Sean describes.
    Unihandecode builds on the unidecode code base and provides good results even for Japanese, Korean, Thai and more.

    Only Python bindings are available for now. Unihandecode is based on the Python port of unidecode (http://pypi.python.org/pypi/Unidecode).

    The first target application is 'calibre' (http://calibre-ebook.com), which uses unidecode to generate filenames from an ebook's title and author.
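
    A rough sketch of that filename use case follows. It is not calibre's code: ascii_filename and fallback_decode are hypothetical names, the "title - author.ext" pattern is an assumption, and the default transliterator is a crude standard-library stand-in that only strips accents. A language-aware decoder could be passed in through the decode parameter instead.

```python
import re
import unicodedata


def fallback_decode(text):
    """Stand-in transliterator: drop accents, replace other non-ASCII with '_'."""
    decomposed = unicodedata.normalize("NFKD", text)
    return "".join(
        ch if ord(ch) < 0x80 else "_"
        for ch in decomposed
        if not unicodedata.combining(ch)
    )


def ascii_filename(title, author, decode=fallback_decode, ext="epub"):
    """Build an ASCII-only filename from a title and an author."""
    stem = decode("%s - %s" % (title, author))
    # Replace characters that are unsafe in filenames on common platforms.
    stem = re.sub(r'[\\/:*?"<>|]+', "_", stem)
    # Collapse runs of whitespace left over from transliteration.
    stem = re.sub(r"\s+", " ", stem).strip()
    return "%s.%s" % (stem, ext)


print(ascii_filename("Les Misérables", "Victor Hugo"))
```

    With the stand-in decoder, CJK titles collapse to underscores, which is exactly the kind of unreadable output a reading-based transliterator is meant to avoid.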



    Requirements:

    · Python

      

