scraperwiki_local is a Python library that aims to be a drop-in replacement for the Python scraperwiki library for use locally. That is, functions will work the same way, and data will go into a local SQLite database; a targeted bombing of ScraperWiki's servers will not stop this local library from working, unless you happen to be running it on one of ScraperWiki's servers.
Installing
This will soon be in PyPI, but for now you can just install from the git repository.
Documentation
Read the standard ScraperWiki Python library's documentation, then look below for some quirks about the local version.
Quirks
The local library aims to be a drop-in replacement. In reality, the local version sometimes works better, though not all of the features have been implemented.
Differences
Datastore differences
The local scraperwiki.sqlite is powered by DumpTruck, so some things work a bit differently.
Data are stored in a local SQLite database named scraperwiki.db.
Bizarre table and column names are supported.
Dates and datetimes are stored in a different standard format.
scraperwiki.sqlite.execute returns lists of dictionaries.
scraperwiki.sqlite.attach downloads the whole datastore from ScraperWiki, so you might not want to use it too often on large databases.
scraperwiki.sqlite.get_var and scraperwiki.sqlite.save_var store their data in the table _dumptruckvars, and they use a slightly different format.
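Because the data end up in an ordinary SQLite file, you can also poke at them with Python's built-in sqlite3 module. The following is only a rough sketch of what "lists of dictionaries" and "bizarre table and column names" look like in practice. It does not use scraperwiki or DumpTruck at all. The helper rows_as_dicts and the table "odd table!" are made up for illustration, and an in-memory database stands in for scraperwiki.db.

```python
import sqlite3


def rows_as_dicts(conn, sql, params=()):
    """Run a query and return every row as a plain dict keyed by column name."""
    cursor = conn.execute(sql, params)
    columns = [description[0] for description in cursor.description]
    return [dict(zip(columns, row)) for row in cursor.fetchall()]


# An in-memory database keeps the sketch self-contained; passing
# "scraperwiki.db" to sqlite3.connect would open the local datastore instead.
conn = sqlite3.connect(":memory:")

# Hypothetical table and column names containing spaces and punctuation.
# Double-quoting the identifiers is what lets SQLite accept them.
conn.execute('CREATE TABLE "odd table!" ("first name" TEXT, "age (years)" INTEGER)')
conn.execute('INSERT INTO "odd table!" VALUES (?, ?)', ("Ada", 36))
conn.commit()

rows = rows_as_dicts(conn, 'SELECT * FROM "odd table!"')
print(rows)  # [{'first name': 'Ada', 'age (years)': 36}]
```

Each row comes back as its own dictionary, so the column names travel with the values rather than depending on position.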
Other Differences
Status of implementation
In general, features that have not been implemented raise a NotImplementedError.
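If a scraper has to run both on ScraperWiki and locally, one way to cope with missing features is to catch the NotImplementedError and carry on. Here is a minimal sketch of that pattern. The wrapper call_if_implemented and the stand-in function unimplemented_feature are hypothetical names invented for this example, not part of the library.

```python
def call_if_implemented(func, *args, default=None, **kwargs):
    """Call func, returning default if it raises NotImplementedError."""
    try:
        return func(*args, **kwargs)
    except NotImplementedError:
        return default


def unimplemented_feature():
    # Hypothetical stand-in for a library function that the local
    # version has not implemented yet.
    raise NotImplementedError("not available in the local library")


# The missing feature is skipped instead of crashing the scraper.
result = call_if_implemented(unimplemented_feature, default="skipped")
print(result)  # skipped
```

Catching only NotImplementedError means that genuine bugs in the scraper still surface as usual.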
Datastore
scraperwiki.sqlite is missing the following features.
- Data argument to scraperwiki.sqlite.select
- All of the verbose keyword arguments (these control what is printed in the ScraperWiki code editor)
- scraperwiki.sqlite.show_tables only works for the main database.
Geo
The UK geocoding helpers have not been implemented.
- scraperwiki.geo
Utils
scraperwiki.utils is implemented, as well as the following functions.
- scraperwiki.log
- scraperwiki.scrape
- scraperwiki.pdftoxml
- scraperwiki.swimport
Deprecated
These submodules are deprecated and thus will not be implemented.
- scraperwiki.apiwrapper
- scraperwiki.datastore
- scraperwiki.jsqlite
- scraperwiki.metadata
- scraperwiki.newsql
Development
Run tests with ./runtests; this small wrapper cleans up after itself.
Requirements:
- Python