Tesseract OCR Changelog

What's new in Tesseract OCR 3.0

Oct 4, 2010
  • Preparations for thread safety:
    • Changed TessBaseAPI methods to be non-static.
    • Created a class hierarchy for the directories to hold instance data, and began moving code into the classes.
    • Moved thresholding code to a separate class.
  • Added major new page layout analysis module.
  • Added hOCR output.
  • Added Leptonica as the main library for image I/O and handling. It is currently optional, but future releases will require linking with Leptonica.
  • Ambiguity table rewritten to allow definite replacements in place of fix_quotes.
  • Added TessdataManager to combine data files into a single file.
  • Some dead code deleted.
  • VC++6 is no longer supported because it cannot cope with the use of templates.
  • Many more languages added.
  • Converted most of the function header comments to Doxygen format.