libarc 0.2.1

Libarc is a C library for accessing the contents of GZIP-compressed ARC files generated by the Internet Archive's Heritrix web crawler.

Main features:

  • Opening and scanning the contents of GZIP-compressed ARC files. The library does not currently read CDX index files, though this feature is planned for a future release.
  • You can get an iterator to walk over the contents of the ARC file member by member. You can specify a media type to limit the types of members you see.
  • You can access the information in the member's URL record and the response headers from the HTTP server.
  • You can access the member's data in a single API call.
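
To make the URL record and media type filtering more concrete, here is a minimal sketch in plain C. It does not use libarc and does not show libarc's API: the struct and function names (arc_url_record, parse_url_record, media_type_matches) are hypothetical and exist only for illustration. It assumes the ARC version 1 URL record layout of five space-separated fields (URL, IP address, archive date, content type, length) and assumes the line has already been decompressed.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical record type for illustration only; not a libarc type. */
struct arc_url_record {
    char url[2048];
    char ip[64];
    char date[32];        /* archive date, e.g. YYYYMMDDhhmmss */
    char media_type[128]; /* content type reported for the member */
    long length;          /* length in bytes of the data that follows */
};

/*
 * Parse one ARC v1 URL record line of the assumed form:
 *   <url> <ip-address> <archive-date> <content-type> <length>
 * Returns 0 on success, -1 if the line does not have five valid fields.
 */
int parse_url_record(const char *line, struct arc_url_record *rec)
{
    int n = sscanf(line, "%2047s %63s %31s %127s %ld",
                   rec->url, rec->ip, rec->date,
                   rec->media_type, &rec->length);
    return (n == 5 && rec->length >= 0) ? 0 : -1;
}

/*
 * Return 1 if the record has exactly the wanted media type, or if
 * wanted is NULL (no filter); return 0 otherwise.
 */
int media_type_matches(const struct arc_url_record *rec, const char *wanted)
{
    return wanted == NULL || strcmp(rec->media_type, wanted) == 0;
}
```

A caller would pass each decompressed header line to parse_url_record and skip members for which media_type_matches returns 0. The comparison here is an exact, case-sensitive string match, which is a simplification chosen to keep the sketch short.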

last updated on:
January 16th, 2008, 9:36 GMT
price:
FREE!
developed by:
Tom Emerson
homepage:
libarc.sourceforge.net
license type:
GPL (GNU General Public License) 