
    Tesseract OCR 3.0

    Developer: Ray Smith and Tom
    License / Price: The Apache License 2.0 / FREE
    Last Updated: October 4th, 2010, 14:04 GMT
    Category: ROOT / Multimedia / Graphics
    Downloads: 4,604
    User Rating: Fair (2.5/5), rated by 17 user(s)


    Tesseract OCR description

    Tesseract OCR is a commercial-quality OCR engine originally developed at HP between 1985 and 1995. In 1995 it was among the top three engines evaluated by UNLV, and HP and UNLV open-sourced it in 2005. The engine reads a binary, grey, or color image and outputs text. A built-in TIFF reader handles uncompressed TIFF images; libtiff can be added to read compressed ones.
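As a rough sketch of that image-in, text-out flow, the snippet below drives the `tesseract` command-line program from Python. It assumes the binary is installed and on the PATH and that it is invoked as `tesseract <image> <outputbase> [-l <lang>]`, writing the text to `<outputbase>.txt`. The helper names `build_command` and `ocr_image`, and any file names passed to them, are hypothetical and chosen only for illustration.

```python
import subprocess
from pathlib import Path


def build_command(image_path, output_base, lang=None):
    """Build the argument list: tesseract <image> <outputbase> [-l <lang>]."""
    cmd = ["tesseract", str(image_path), str(output_base)]
    if lang:
        cmd += ["-l", lang]
    return cmd


def ocr_image(image_path, output_base, lang=None):
    """Run tesseract and return the recognized text.

    Assumes the tesseract binary is on the PATH. The program writes its
    result to <output_base>.txt, which is read back here.
    """
    subprocess.run(build_command(image_path, output_base, lang), check=True)
    return Path(f"{output_base}.txt").read_text(encoding="utf-8")
```

Calling `ocr_image("page.tif", "page_out", lang="eng")` on a hypothetical uncompressed TIFF would run the engine once and return the contents of `page_out.txt`. Keeping the command construction in its own function makes it easy to inspect the exact arguments without running the engine.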

    Supported Platforms

    The developers are regularly testing on the following platforms:

    · Ubuntu 6.06 (x86/32, x86/64)
    · Ubuntu 6.10 (x86/32, x86/64)
    · Windows (x86/32)

    Additionally, we believe the code should run on these other platforms, but we don't have the resources to test on them regularly:

    · Recent Linux distributions (x86/32, x86/64)
    · Mac OS X (x86, PPC)

    If you're interested in supporting other platforms or languages, please get in touch with Ray Smith.



    What's New in This Release:

    · Preparations for thread safety:
        · Changed TessBaseAPI methods to be non-static.
        · Created a class hierarchy for the directories to hold instance data, and began moving code into the classes.
        · Moved thresholding code to a separate class.
    · Added major new page layout analysis module.
    · Added HOCR output.
    · Added Leptonica as main image I/O and handling. Currently optional, but in future releases linking with Leptonica will be mandatory.
    · Ambiguity table rewritten to allow definite replacements in place of fix_quotes.
    · Added TessdataManager to combine data files into a single file.
    · Some dead code deleted.
    · VC++6 no longer supported. It can't cope with the use of templates.
    · Many more languages added.
    · Doxygenation of most of the function header comments.
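To give a feel for what the new HOCR output makes possible, here is a minimal sketch that pulls words and their bounding boxes out of an hOCR document using only Python's standard library. The `SAMPLE` string is hand-written to follow general hOCR conventions (word spans with class `ocr_word` and a `bbox x0 y0 x1 y1` entry in the `title` attribute); it is not captured from Tesseract, whose exact markup may differ between releases. The class name `HocrWordParser` is hypothetical.

```python
from html.parser import HTMLParser


class HocrWordParser(HTMLParser):
    """Collect (text, bbox) pairs from spans whose class is 'ocr_word'."""

    def __init__(self):
        super().__init__()
        self.words = []
        self._bbox = None  # bbox of the word span currently open, if any

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag != "span" or attrs.get("class") != "ocr_word":
            return
        # hOCR keeps properties in the title attribute, separated by ';',
        # for example "bbox 10 20 50 40".
        for prop in (attrs.get("title") or "").split(";"):
            parts = prop.split()
            if len(parts) == 5 and parts[0] == "bbox":
                self._bbox = tuple(int(v) for v in parts[1:])

    def handle_data(self, data):
        # The first non-blank text after a word span opens is taken as the word.
        if self._bbox is not None and data.strip():
            self.words.append((data.strip(), self._bbox))
            self._bbox = None


# Hand-written sample following hOCR conventions (an assumption, see above).
SAMPLE = (
    "<div class='ocr_page'><span class='ocr_line' title='bbox 10 20 120 40'>"
    "<span class='ocr_word' title='bbox 10 20 50 40'>Hello</span> "
    "<span class='ocr_word' title='bbox 60 20 120 40'>world</span>"
    "</span></div>"
)

parser = HocrWordParser()
parser.feed(SAMPLE)
parser.close()
print(parser.words)
# prints [('Hello', (10, 20, 50, 40)), ('world', (60, 20, 120, 40))]
```

Because hOCR is ordinary HTML, layout information such as word positions can be recovered with a generic parser rather than a format-specific library.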

      

