PyGalleryCrawler is a web crawler for online image galleries.
tar -xzf pygallerycrawler.tar.gz
Extra Python modules
psyco @ http://psyco.sourceforge.net
Python Imaging Library aka PIL @ http://www.pythonware.com/products/pil/
- thumbnails generation
- size verification
feedparser @ http://feedparser.org
- feed parsing
chmod a+x pygallerycrawler.py
If you change config.py, your changes will be overwritten at the next update. Instead, create a personal configuration and use it with the --config (or -c) switch.
cp config.py ~/pgc_config.py
./pygallerycrawler.py -c ~/pgc_config.py the_url_you_want_to_crawl
· Downloaded pictures are not checked for duplicates. Some galleries have a presentation link which points to one of the pictures, so that image will be downloaded twice.
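Until such a check exists, duplicates can be found after a crawl by comparing file contents. The sketch below is not part of PyGalleryCrawler; the function names file_digest and find_duplicates are hypothetical, and it assumes all downloaded images sit in one directory. It only detects byte-identical files, so a resized or re-encoded copy of the same picture will not be matched.

```python
import hashlib
from pathlib import Path


def file_digest(path, chunk_size=65536):
    """Return the SHA-256 hex digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def find_duplicates(directory):
    """Group the files in `directory` by content.

    Returns a list of groups; each group is a list of paths whose
    contents are byte-identical. Files with no duplicate are omitted.
    """
    by_digest = {}
    for path in sorted(Path(directory).iterdir()):
        if path.is_file():
            by_digest.setdefault(file_digest(path), []).append(path)
    return [paths for paths in by_digest.values() if len(paths) > 1]
```

Within each returned group, the first path can be kept and the others removed or ignored.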
What's New in This Release:
· A check for the image size of both pictures and thumbnails was added.
· Regexp support was improved.
· An internal algorithm was cleaned up.
· A simple feed was added, which can be tried if there is a direct link to the gallery.