Softpedia
 



    InfoCrawler 1.1

    Downloads: 488
    User Rating: Good (3.0/5)
    Rated by: 9 user(s)
    Developer: Rafik Kaddouri
    License / Price: GPL v3 / FREE
    Last Updated: April 23rd, 2008, 09:19 GMT
    Category: ROOT / Internet / HTTP (WWW)


    InfoCrawler description

    InfoCrawler is an open source knowledge management solution that lets you crawl, index, and query many types of documents, drawing data from a range of sources: intranets, newsgroups, FTP sites, public web sites, and local or remote file systems.

    Here are some key features of "InfoCrawler":

    · Distributed architecture: InfoCrawler was designed from the ground up for a distributed architecture. It is a 100% Java service that can run permanently on one or more machines. Its components (the administration, the spider, and the indexing engine) communicate using XML and can be installed on different machines.

    · Intuitive administration: Using its own web-based administration interface, you can administer and monitor the different collections in a user-friendly manner. Its simplicity and flexibility limit the total cost of ownership.

    · Optimized crawling: Thanks to its multi-threaded architecture, InfoCrawler can spider many collections in parallel, and can have many threads per collection.

    · Powerful indexing: Using the Lucene indexing engine, InfoCrawler can index various file types: HTML files, Microsoft Office documents, PDF, XML, and more.

    · Open technology: InfoCrawler does not use any proprietary technology. URLs are maintained in a MySQL database, the indexing engine is Lucene (an open source indexer), the web administration runs on Apache Tomcat and JSP, the administration and the spider communicate using XML, and the spider itself is 100% Java.

    · Flexible: Being compatible with standards like HTML, XML, JSP, Java, and JDBC, InfoCrawler can be integrated easily in large projects.

    · Unique features: InfoCrawler has some unique features, such as a JavaScript interpreter and intelligent URL management.
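
The "Optimized crawling" feature above (several collections spidered in parallel, each with its own threads) can be illustrated with a minimal sketch. This is not InfoCrawler's actual code: the class name `ParallelCrawlSketch`, the `crawlAll` method, and the stubbed `fetch` helper are all hypothetical, and the sketch assumes Java 8 or later for lambdas, which is newer than the JRE 5 minimum listed in the requirements.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

public class ParallelCrawlSketch {

    // Hypothetical stand-in for a real download: no network access,
    // it simply returns the length of the URL string.
    static int fetch(String url) {
        return url.length();
    }

    // Crawls all collections concurrently. Each collection gets its own
    // fixed-size thread pool (an assumption made for this sketch).
    // Returns the number of URLs fetched per collection.
    static Map<String, Integer> crawlAll(Map<String, List<String>> collections,
                                         int threadsPerCollection) {
        Map<String, ExecutorService> pools = new LinkedHashMap<>();
        Map<String, List<Future<Integer>>> pending = new LinkedHashMap<>();

        // Submit every URL before waiting on any result, so that all
        // collections make progress at the same time.
        for (Map.Entry<String, List<String>> entry : collections.entrySet()) {
            ExecutorService pool = Executors.newFixedThreadPool(threadsPerCollection);
            pools.put(entry.getKey(), pool);
            List<Future<Integer>> futures = new ArrayList<>();
            for (String url : entry.getValue()) {
                Callable<Integer> task = () -> fetch(url);
                futures.add(pool.submit(task));
            }
            pending.put(entry.getKey(), futures);
        }

        Map<String, Integer> fetchedPerCollection = new LinkedHashMap<>();
        try {
            for (Map.Entry<String, List<Future<Integer>>> entry : pending.entrySet()) {
                int count = 0;
                for (Future<Integer> future : entry.getValue()) {
                    future.get(); // blocks until this fetch has finished
                    count++;
                }
                fetchedPerCollection.put(entry.getKey(), count);
            }
        } catch (InterruptedException | ExecutionException e) {
            throw new IllegalStateException("crawl failed", e);
        } finally {
            // Always release the worker threads.
            for (ExecutorService pool : pools.values()) {
                pool.shutdown();
            }
        }
        return fetchedPerCollection;
    }

    public static void main(String[] args) {
        Map<String, List<String>> collections = new LinkedHashMap<>();
        collections.put("intranet", Arrays.asList("http://intranet/a", "http://intranet/b"));
        collections.put("ftp", Arrays.asList("ftp://files/readme.txt"));
        System.out.println(crawlAll(collections, 2));
    }
}
```

Giving each collection its own pool keeps a slow collection from starving the others; a real crawler would also need politeness delays, retries, and URL deduplication, none of which are shown here.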

    Requirements:

    · Java Runtime Environment (JRE) 5 or higher

    What's New in This Release:

    · Major feature enhancements.




      


    TAGS:

    knowledge management | information management | web crawler | knowledge | information | content


