    OCRfeeder 0.7.7

    Developer: Joaquim Rocha
    License / Price: GPL v3 / FREE
    Last Updated: December 13th, 2011, 02:52 GMT
    Category: ROOT / Text Editing&Processing / Others


    OCRfeeder description

    A Complete OCR Suite

    OCRFeeder is a document layout analysis and optical character recognition system.

    Given a set of images, it automatically outlines their contents, distinguishes graphics from text, and performs OCR on the text. It can generate several output formats, the main one being ODT.

    OCRFeeder features a complete GTK graphical user interface that allows users to correct any unrecognized characters, define or correct bounding boxes, set paragraph styles, clean the input images, import PDFs, save and load projects, export everything to multiple formats, etc.

    Installation on Ubuntu:

    The only packages that need to be installed on Ubuntu 8.10 are PyGoocanvas and Unpaper; the rest of the dependencies are already present in a fresh install of this version of Ubuntu. The Ocrad OCR engine is installed as well.

    To install PyGoocanvas, Ocrad and Unpaper, the following command should be executed as superuser:

        apt-get install python-pygoocanvas ocrad unpaper
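
    Before moving on to the OCRFeeder setup, it can help to confirm that the command line tools are actually reachable. The following is a minimal sketch in Python 3, not part of OCRFeeder itself. It assumes the ocrad and unpaper packages install executables with those same names; the helper name missing_tools is hypothetical.

```python
import shutil


def missing_tools(tools=("ocrad", "unpaper"), which=shutil.which):
    """Return the names of the tools that cannot be found on PATH.

    The default tool names are an assumption: they mirror the package
    names used in the apt-get command above.
    """
    return [tool for tool in tools if which(tool) is None]


if __name__ == "__main__":
    missing = missing_tools()
    if missing:
        print("Not found on PATH: " + ", ".join(missing))
    else:
        print("All external tools were found.")
```

    The which parameter exists only so the lookup can be replaced in a test; in normal use the default shutil.which is enough.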

    After all of the packages have finished installing, OCRFeeder is ready to be
    installed. To install it, all that is needed is to run the setup.py script as
    superuser:

        python setup.py install

    OCRFeeder can now be run from a desktop menu or by running the *ocrfeeder* command. On the GNOME desktop, if the menu entry does not show OCRFeeder's icon, update the icon cache with the following command (as superuser):

        gtk-update-icon-cache -f -t /usr/share/icons/hicolor

    Command Line Usage:

    This section explains how to use OCRFeeder from the command line.

    The command line interface part of OCRFeeder is aimed at users who want to perform quick, unattended conversions of document images to editable formats. It also makes the project usable from other applications.

    Two parameters are mandatory:

        1) the path to each document image to be processed is given after the parameter
            --images;
        2) the name of the document to be generated is given after the parameter
            --o.

    For example:

        ocrfeeder-cli --images ~/image1.png ~/image2.jpeg \
            --o converted_document


    The pages of the generated documents honor the order of the given paths.
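
    Because page order follows the order of the paths, scanned pages named with numbers are worth sorting before they are passed to --images: a plain alphabetical sort places scan10.png before scan2.png. The sketch below shows one way to do this in Python 3; it is an illustration, not something OCRFeeder provides, and the names natural_key and order_pages are hypothetical.

```python
import re


def natural_key(path):
    """Split a file name into text and integer chunks for sorting.

    re.split with a capturing group alternates text and digit runs,
    so chunks at the same position always have the same type.
    """
    return [int(chunk) if chunk.isdigit() else chunk.lower()
            for chunk in re.split(r"(\d+)", str(path))]


def order_pages(paths):
    """Return the paths sorted so that numbered pages are in page order."""
    return sorted(paths, key=natural_key)


if __name__ == "__main__":
    for page in order_pages(["scan10.png", "scan2.png", "scan1.png"]):
        print(page)
```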

    It is also possible to specify the format of the document to be generated
    (HTML or ODT) with the option --format. In case no format is specified,
    the images will be exported to ODT. Continuing with the example above:

        ocrfeeder-cli --images ~/image1.png ~/image2.jpeg --format HTML \
          --o converted_document
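
    Since the command line interface is meant to be usable from other applications, the calls above can be wrapped in a small helper. This is a minimal sketch in Python 3.8 or later, under the assumption that ocrfeeder-cli is on PATH; it uses only the options described in this section (--images, --format, --o). The function names build_cli_command and convert are hypothetical.

```python
import shlex
import subprocess
from pathlib import Path


def build_cli_command(images, output, fmt=None):
    """Return the ocrfeeder-cli argument list for the given images."""
    if not images:
        raise ValueError("at least one image path is required")
    # No shell is involved, so "~" is expanded here rather than by the shell.
    command = ["ocrfeeder-cli", "--images"]
    command += [str(Path(image).expanduser()) for image in images]
    if fmt is not None:
        # Only the two formats named in the text are accepted by this sketch.
        if fmt not in ("HTML", "ODT"):
            raise ValueError(f"unsupported format: {fmt}")
        command += ["--format", fmt]
    command += ["--o", str(output)]
    return command


def convert(images, output, fmt=None):
    """Run the conversion; raises CalledProcessError on a non-zero exit."""
    subprocess.run(build_cli_command(images, output, fmt), check=True)


if __name__ == "__main__":
    # Print the command instead of running it, so the sketch works anywhere.
    command = build_cli_command(["page1.png", "page2.jpeg"],
                                "converted_document", "HTML")
    print(shlex.join(command))
```

    Building the arguments as a list avoids quoting problems with file names that contain spaces, and keeping the builder separate from convert makes the command easy to inspect before anything is executed.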


    OCRFeeder Studio (the graphical user interface part) can also be launched
    from the command line. Two options load images as soon as the program
    starts: --images adds the images given as its arguments, and --dir adds
    all the images under a given directory path. The options can be used
    individually or combined, for example:

        ocrfeeder --images ~/image1.png ~/image2.jpeg \
            --dir ~/Desktop


    For any usage, the options and parameters can be given in any order.



    Requirements:

    · Python
    · PyGTK
    · PIL
    · PyGooCanvas
    · AFPL Ghostscript
    · Unpaper

    What's New in This Release:

    · Now the content boxes can be dragged by their limits to extend their bounds
    · Add "sane" missing dependency
    · Change some mnemonics in the menu to avoid clashes (fixes gb#645983) (thanks to Łukasz Jernaś)
    · Resets the favorite engine when it does not exist
    · Prevent errors when adding nonexistent images
    · Focus box's editor text area automatically (gb#635308)
    · Clarify the help output about the --images option

    New and Updated Translations:
    · Marek Černocký [cs]
    · Joe Hansen [da]
    · Mario Blättermann [de]
    · Daniel Mustieles [es]
    · Claude Paroz [fr]
    · Gianvito Cavasoli [it]
    · Łukasz Jernaś [pl]
    · Djavan Fagundes [pt_BR]
    · Matej Urbančič [sl]
    · Aron Xu [zh_CN]

      




