Softpedia
 



    hadoopy 0.6.0

    Developer: Brandyn A. White
    License / Price: GPL v3 / FREE
    Last Updated: January 12th, 2012, 23:42 GMT
    Category: ROOT / Programming / Libraries


    hadoopy description

    hadoopy is a Python MapReduce library written in Cython.

    Visit us in #hadoopy on freenode. See the links below for documentation and tutorials.

    Source https://github.com/bwhite/hadoopy/
    Issues https://github.com/bwhite/hadoopy/issues
    Docs http://bwhite.github.com/hadoopy/

    Used in

    - A Case for Query by Image and Text Content: Searching Computer Help using Screenshots and Keywords (to appear in WWW'11)
    - Web-Scale Computer Vision using MapReduce for Multimedia Data Mining (at KDD'10)
    - Vitrieve: Visual Search engine
    - Picarus: Hadoop computer vision toolbox

    Ubuntu install (other distributions are similar); run from the root of the source tree:
    sudo apt-get install python-dev build-essential
    sudo python setup.py install



    Here are some key features of "hadoopy":

    · Oozie support
    · Automated job parallelization ('auto-oozie') available in the hadoopy_flow project (maintained out of branch)
    · TypedBytes support (very fast)
    · Local execution of unmodified MapReduce jobs with launch_local
    · Read/write sequence files of TypedBytes directly to HDFS from Python (readtb, writetb)
    · Works on OS X
    · Allows printing to stdout and stderr in Hadoop tasks without causing problems (uses the 'pipe hopping' technique; both are available in the task's stderr)
    · Critical path is in Cython
    · Works on clusters without any extra installation, not even Python or Python libraries (uses PyInstaller, which is included in this source tree)
    · Simple HDFS access (readtb and ls) inside Python, even inside running jobs
    · Unit test interface
    · Reporting using status and counters (and print statements; there is no need to be scared of them in Hadoopy)
    · Supports design patterns in the Lin/Dyer book (http://www.umiacs.umd.edu/~jimmylin/book.html)
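
    To give a feel for the kind of job the features above apply to, here is a minimal word-count sketch in plain Python. It assumes that a map function receives one key/value pair and yields output pairs, and that a reduce function receives a key together with all values grouped under it; check the hadoopy docs for the exact conventions. The run_local helper is hypothetical: it only imitates the map, group-by-key, and reduce steps in memory so the sketch runs without Hadoop or hadoopy installed, and it is not hadoopy's launch_local.

```python
from collections import defaultdict


def mapper(key, value):
    """Emit (word, 1) for every word in one line of input text."""
    for word in value.split():
        yield word, 1


def reducer(key, values):
    """Sum the counts collected for a single word."""
    yield key, sum(values)


def run_local(kvs, mapper, reducer):
    """Hypothetical stand-in for the framework: map, group by key, reduce."""
    groups = defaultdict(list)
    # Map phase: feed every input pair to the mapper and group its output by key.
    for key, value in kvs:
        for out_key, out_value in mapper(key, value):
            groups[out_key].append(out_value)
    # Reduce phase: visit keys in sorted order, as a MapReduce shuffle would.
    for out_key in sorted(groups):
        for result in reducer(out_key, groups[out_key]):
            yield result


if __name__ == "__main__":
    lines = [(0, "a b a"), (1, "b c")]
    print(dict(run_local(lines, mapper, reducer)))
```

    Keeping mapper and reducer as ordinary generator functions means they can be exercised directly in a unit test, without a cluster.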

    Requirements:

    · Python
    · Cython

      

