
    translitcodec 0.3

    Developer: Jason Kirtland
    License / Price: MIT/X Consortium Lic... / FREE
    Last Updated: February 21st, 2012, 01:49 GMT
    Category: ROOT / Programming / Libraries


    translitcodec description

    Unicode to 8-bit charset transliteration codec

    translitcodec is a Python library that contains codecs for transliterating ISO 10646 texts into best-effort representations using smaller coded character sets (ASCII, ISO 8859, etc.). The translation tables used by the codecs are from the ``transtab`` collection by Markus Kuhn.

    Three types of transliterating codecs are provided:

    "long", using as many characters as needed to make a natural replacement. For example, \u00e4 LATIN SMALL LETTER A WITH DIAERESIS ``ä`` will be replaced with ``ae``.

    "short", using the minimum number of characters to make a replacement. For example, \u00e4 LATIN SMALL LETTER A WITH DIAERESIS ``ä`` will be replaced with ``a``.

    "one", only performing single-character replacements. Characters that cannot be transliterated with a single character are passed through unchanged. For example, \u2639 WHITE FROWNING FACE ``☹`` will be passed through unchanged.

    Using the codecs is simple:

    >>> import translitcodec
    >>> u'fácil € ☺'.encode('translit/long')
    u'facil EUR :-)'
    >>> u'fácil € ☺'.encode('translit/short')
    u'facil E :-)'
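
The difference between the "long", "short" and "one" codecs can be illustrated without the library. The following is a minimal sketch, not translitcodec's implementation: the three tables are tiny hypothetical stand-ins for the transtab data, and the rule used to build the "one" table is an assumption made for this example.

```python
# Toy tables for illustration only (hypothetical); the real codecs use
# the much larger tables derived from Markus Kuhn's transtab collection.
LONG = {"ä": "ae", "á": "a", "€": "EUR", "☺": ":-)"}
SHORT = {"ä": "a", "á": "a", "€": "E", "☺": ":-)"}
# Assumption for this sketch: "one" keeps only the single-character
# replacements, so anything else is left untouched.
ONE = {src: dst for src, dst in SHORT.items() if len(dst) == 1}


def transliterate(text, table):
    """Replace each character found in the table; pass the rest through."""
    return "".join(table.get(ch, ch) for ch in text)


sample = "fácil € ☺"
long_result = transliterate(sample, LONG)
short_result = transliterate(sample, SHORT)
one_result = transliterate(sample, ONE)
```

With these toy tables, the "one" result still contains ``☺``, because its only replacement is three characters long.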


    The codecs return Unicode by default. To receive a bytestring back, either chain the output of encode() to another codec, or append the name of the desired byte encoding to the codec name:

    >>> u'fácil € ☺'.encode('translit/one').encode('ascii', 'replace')
    'facil E ?'
    >>> u'fácil € ☺'.encode('translit/one/ascii', 'replace')
    'facil E ?'
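
The two-step chain above (transliterate to Unicode, then encode to bytes) can be sketched with the standard library alone. This is an illustrative example under stated assumptions: ``ONE`` is a hypothetical two-entry table standing in for the 'translit/one' data, and ``to_ascii_bytes`` is a helper invented here, not part of translitcodec.

```python
# Hypothetical single-character table standing in for the 'translit/one' data.
ONE = {"á": "a", "€": "E"}


def to_ascii_bytes(text, table=ONE):
    """Transliterate to Unicode first, then encode to bytes in a second step."""
    unicode_result = "".join(table.get(ch, ch) for ch in text)
    # 'replace' turns anything still outside ASCII into '?'.
    return unicode_result.encode("ascii", "replace")


ascii_bytes = to_ascii_bytes("fácil € ☺")
```

The smiley has no single-character replacement in the toy table, so it survives the first step and is replaced by ``?`` in the second.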


    The package also supplies a 'transliterate' codec, an alias for 'translit/long'.

    Requirements:

    · Python

    What's New in This Release:

    · Fixes to the transtab table rebuilding tool.
    · Added translitcodec.__version__

      

