Gamera 3.2.6

A document-recognition programming framework.
The Gamera project is a framework for the creation of structured document analysis applications by domain experts. Domain experts are individuals who have a strong knowledge of the documents in a collection, but may not have a formal technical background.

The goal is to create a tool that leverages their knowledge of the target documents to create custom applications rather than attempting to meet diverse requirements with a monolithic application.

This paper gives an overview of the architecture and design principles of Gamera.

Developing recognition systems for difficult historical documents requires experimentation since the solution is often non-obvious. Therefore, Gamera's primary goal is to support an efficient test-and-refine development cycle.

Virtually every implementation detail is driven by this goal. For instance, Python [Rossum2002] was chosen as the core language because of its introspection capabilities, dynamic typing and ease of use. It has been used as a first programming language with considerable success [Berehzny2001].

C++ is used to write plugins where runtime performance is a priority, but even in that case, the Gamera plugin system is designed to make writing extensions as easy as possible. Gamera includes a full-fledged graphical user interface that provides a number of shortcuts for training, as well as inspection of the results of algorithms at every step.

By improving the ease of experimentation, we hope to put the power to develop recognition systems in the hands of those who understand the documents best. We expect at least two kinds of developers to work with the system: those with a technical background adding algorithms to the system, and those working on the higher-level aggregation of those pieces. It is important to note this distinction, since the two groups have different skill sets and requirements.

In addition to its support of test-and-refine development, Gamera also has several other advantages that are important to large-scale digitization projects in general. These are:

· Open source code and standards-compliance so that the software can interact well with other parts of a digitization framework
· Platform independence, running on a variety of operating systems including Linux, Microsoft Windows and Mac OS X
· A workflow system to combine high-level tasks
· Batch processing
· A unit-testing framework to ensure correctness and avoid regression
· User interface components for development and classifier training
· Recognition confidence output so that collection managers can easily target documents that need correction or different recognition strategies
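
As an illustration of how recognition confidence output might be used to target documents for correction, here is a minimal sketch in plain Python. It does not use Gamera itself: the `triage` function, the 0.8 threshold and the page identifiers are all assumptions made up for this example.

```python
def triage(results, threshold=0.8):
    """Split (document_id, confidence) pairs into accepted and flagged ids.

    Hypothetical helper, not part of Gamera. Flagged documents are
    returned with the least confident first, so they can be reviewed
    in order of urgency.
    """
    accepted = [doc for doc, conf in results if conf >= threshold]
    flagged = sorted(
        (pair for pair in results if pair[1] < threshold),
        key=lambda pair: pair[1],
    )
    return accepted, [doc for doc, _ in flagged]


# Made-up confidence values for four pages of a collection.
results = [
    ("page_001", 0.97),
    ("page_002", 0.41),
    ("page_003", 0.78),
    ("page_004", 0.88),
]
accepted, needs_review = triage(results)
```

With these sample values, `page_002` and `page_003` fall below the threshold and would be queued for manual correction or a different recognition strategy.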

Gamera has a modular plugin architecture. These modules typically perform one of five document recognition tasks:

1. Pre-processing
2. Document segmentation and analysis
3. Symbol segmentation and classification
4. Syntactical or structural analysis
5. Output

Each of these tasks can be arbitrarily complex, involve multiple strategies or modules, or be removed entirely depending on the specific recognition problem at hand. The actual steps that make up a complete recognition system are completely controlled by the user.
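
The idea of a user-controlled sequence of stages can be sketched in plain Python as follows. This is not Gamera's plugin API: `run_pipeline` and the five toy stage functions are hypothetical stand-ins that operate on text rather than images, chosen only to show how stages can be combined, dropped, and inspected step by step.

```python
def run_pipeline(data, stages):
    """Apply each (name, function) stage in order.

    Returns the final result plus a history of every intermediate
    result, so each step can be inspected afterwards.
    """
    history = [("input", data)]
    for name, stage in stages:
        data = stage(data)
        history.append((name, data))
    return data, history


# Toy stand-ins for the five task categories (assumptions, not Gamera code).
def preprocess(text):
    return text.strip().lower()

def segment_document(text):
    return text.split("\n")

def segment_symbols(lines):
    return [line.split() for line in lines]

def analyse_structure(token_lines):
    return [{"line": i, "tokens": t} for i, t in enumerate(token_lines)]

def write_output(records):
    return "; ".join(
        "%d:%s" % (r["line"], " ".join(r["tokens"])) for r in records
    )


stages = [
    ("pre-processing", preprocess),
    ("document segmentation", segment_document),
    ("symbol segmentation", segment_symbols),
    ("structural analysis", analyse_structure),
    ("output", write_output),
]

page = "  Hello World\nFOO bar  "
result, history = run_pipeline(page, stages)

# Running only the first three stages stops after symbol segmentation.
tokens, _ = run_pipeline(page, stages[:3])
```

Because the stages are just an ordered list, leaving one out or swapping in an alternative strategy is a matter of editing that list, provided the neighbouring stages still agree on the data they exchange.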

Pre-processing involves standard image-processing operations such as noise removal, blurring, de-skewing, contrast adjustment, sharpening, binarization, and morphology. Close attention to and refinement of these steps is particularly important when working with degraded historical documents.
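
Two of the operations named above, binarization and noise removal, can be sketched in a few lines of plain Python. This is a simplified illustration on nested lists, not Gamera's implementation: the fixed threshold of 128, the function names, and the rule of deleting ink pixels with no ink neighbours are assumptions for this example.

```python
def binarize(image, threshold=128):
    """Global-threshold binarization: 1 for dark (ink) pixels, 0 otherwise."""
    return [[1 if pixel < threshold else 0 for pixel in row] for row in image]


def despeckle(binary):
    """Remove ink pixels that have no ink among their 8 neighbours."""
    height, width = len(binary), len(binary[0])
    cleaned = [row[:] for row in binary]
    for y in range(height):
        for x in range(width):
            if binary[y][x] == 0:
                continue
            # Count ink pixels in the surrounding 3x3 window, clipped
            # at the image border and excluding the pixel itself.
            neighbours = sum(
                binary[ny][nx]
                for ny in range(max(0, y - 1), min(height, y + 2))
                for nx in range(max(0, x - 1), min(width, x + 2))
                if (ny, nx) != (y, x)
            )
            if neighbours == 0:
                cleaned[y][x] = 0
    return cleaned


# A tiny greyscale image: a three-pixel stroke and one isolated speck.
grey = [
    [250, 250, 250, 250, 250],
    [250,  20,  30, 250, 250],
    [250,  25, 250, 250, 250],
    [250, 250, 250, 250,  10],
]
onebit = binarize(grey)
clean = despeckle(onebit)
```

The stroke survives because its pixels touch each other, while the lone dark pixel in the bottom-right corner is treated as noise. Degraded historical documents usually need more careful choices, such as an adaptive threshold, which is where the test-and-refine cycle comes in.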

Last updated on: June 24th, 2010, 8:21 GMT
License type: GPL (GNU General Public License)
Developed by: Michael Droettboom

What's New in version 3.2.0
  • plugins to_numpy and from_numpy added for NumPy support; the deprecated Numeric and numarray modules have been replaced with NumPy
  • highlight also works with GREYSCALE and ONEBIT images
  • corrected resize function in VIGRA
  • the knn classifier can now return different confidence measures for the main id that are selectable by the user. See the classifier API documentation for details.
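
To illustrate what a confidence measure for a kNN result can look like, here is a minimal sketch in plain Python. It is not Gamera's classifier and does not reproduce any of its selectable measures: the `knn_classify` function and the choice of "fraction of neighbours that voted for the winning id" as the confidence are assumptions made for this example.

```python
import math
from collections import Counter


def euclidean(p, q):
    """Euclidean distance between two equal-length feature vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))


def knn_classify(sample, training, k=3):
    """Return (label, confidence) for sample.

    Hypothetical sketch: confidence is the share of the k nearest
    training items that carry the winning label.
    """
    ranked = sorted(training, key=lambda item: euclidean(item[0], sample))
    neighbours = ranked[:k]
    votes = Counter(label for _, label in neighbours)
    label, count = votes.most_common(1)[0]
    return label, count / len(neighbours)


# Made-up two-dimensional feature vectors for two symbol classes.
training = [
    ((1.0, 1.0), "a"),
    ((1.2, 0.9), "a"),
    ((0.9, 1.3), "a"),
    ((5.0, 5.0), "b"),
    ((5.1, 4.8), "b"),
]

sure = knn_classify((1.1, 1.0), training)
unsure = knn_classify((4.0, 4.0), training)
```

The first sample sits inside the "a" cluster, so all three neighbours agree. The second lies between the clusters, so one of its three neighbours disagrees and the confidence drops accordingly.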