HTTP::GetImages is a spider to recover and store images from web pages.
SYNOPSIS
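    use HTTP::GetImages;

    # Start from the 'todo' URIs, skip the 'dont' URIs, store images in 'dir'.
    my $spider = HTTP::GetImages->new(
        dir  => '.',
        todo => ['http://www.google.com/'],
        dont => ['http://www.somewhere/ignorethis.html', 'http://and.this.html'],
        chat => 1,
    );

    # Print the reports collected during the run.
    $spider->print_imgs;
    $spider->print_done;
    $spider->print_failed;
    $spider->print_ignored;

    # The same image data is also available as a hash reference.
    my $hash = $spider->imgs_as_hash;
    foreach my $key (keys %{$hash}) {
        warn "$key = ", $hash->{$key}, "\n";
    }
    exit;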
This module allows you to automate the searching, recovery and local storage of images from the web, including those linked by anchor (A), image (IMG) and image map (AREA) elements.
Supply a URI or list of URIs to process, and HTTP::GetImages will recurse over every link it finds, searching for images.
By supplying lists of URIs, you can restrict the search to certain web servers and directories, or exclude certain web servers and directories from it.
You can also reject images that are too small or too large.
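The filtering described above can be illustrated with a minimal, self-contained sketch. It is not taken from HTTP::GetImages and does not use its API: the helper names `should_visit` and `size_ok`, the prefix-matching rule, and the example.com URIs are all assumptions made for illustration, and the module may apply different rules internally.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of HTTP::GetImages): decide whether a URI
# should be visited, given array refs of allowed and excluded URI prefixes.
sub should_visit {
    my ($uri, $allow, $deny) = @_;
    # An excluded prefix always wins.
    return 0 if grep { index($uri, $_) == 0 } @$deny;
    # An empty allow list means "no restriction".
    return 1 unless @$allow;
    return scalar(grep { index($uri, $_) == 0 } @$allow) ? 1 : 0;
}

# Hypothetical helper: accept an image only if its size in bytes lies
# within the optional bounds (undef means "no bound").
sub size_ok {
    my ($bytes, $min, $max) = @_;
    return 0 if defined $min && $bytes < $min;
    return 0 if defined $max && $bytes > $max;
    return 1;
}

# Assumed example data.
my @allow = ('http://www.example.com/gallery/');
my @deny  = ('http://www.example.com/gallery/private/');

for my $uri (
    'http://www.example.com/gallery/a.html',
    'http://www.example.com/gallery/private/b.html',
    'http://other.example.org/c.html',
) {
    my $verdict = should_visit($uri, \@allow, \@deny) ? 'visit' : 'skip';
    printf "%-50s %s\n", $uri, $verdict;
}
```

In this sketch the exclusion list takes precedence over the inclusion list, so a private subdirectory can be carved out of an otherwise allowed directory. That ordering is a design choice of the example, not a documented behaviour of the module.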
Requirements:
· Perl