urlclean 0.5.3

Utilities to clean up URLs
urlclean provides functions to:

- follow an HTTP redirect,
- follow an HTML META redirect,
- remove Urchin and Facebook tracker URL parameters,
- combine all of these to unshorten and resolve various URLs.

Try it out from the command line:

 python -m urlclean <some url>

Documentation for the Code

urlcleaner: a module that resolves redirected URLs and removes tracking URL parameters.

urlclean.weedparams(url)

 Removes Urchin Tracker and Facebook surveillance parameters from URLs.

 Args:

 url (str): The URL to scrub of ugly parameters

 Returns:

 (str): The cleaned URL
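
As an illustration of the idea behind weedparams, here is a minimal sketch in Python 3 using only the standard library. It is not the package's own code: the function name `strip_tracking` and the exact lists of parameter names are assumptions made for this example.

```python
from urllib.parse import urlsplit, urlunsplit, parse_qsl, urlencode

# Assumed lists for illustration; the real package may match other names.
TRACKING_PREFIXES = ("utm_",)  # Urchin / campaign parameters
TRACKING_NAMES = {"fb_source", "fb_ref", "fb_action_ids", "fb_action_types"}


def strip_tracking(url):
    """Return url with the tracking query parameters removed (hypothetical helper)."""
    parts = urlsplit(url)
    kept = [
        (key, value)
        for key, value in parse_qsl(parts.query, keep_blank_values=True)
        if not key.lower().startswith(TRACKING_PREFIXES)
        and key.lower() not in TRACKING_NAMES
    ]
    # urlencode re-encodes the surviving parameters, so escaping may be normalised.
    return urlunsplit(parts._replace(query=urlencode(kept)))


if __name__ == "__main__":
    print(strip_tracking("http://example.com/a?id=7&utm_source=feed&fb_ref=x"))
```

Parameters that are not on either list pass through unchanged, and a URL whose query held only tracking parameters comes back without a query string.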

urlclean.httpresolve(url, ua=None, proxyhost='localhost', proxyport=8118)

 Resolves one redirection of an HTTP request.

 Args:

 url (str): The URL for which to follow one redirect

 ua (fn): A function returning a User-Agent string (optional)

 proxyhost (str): HTTP proxy server (optional)

 proxyport (int): HTTP proxy server port (optional)

 Returns:

 (str, httplib.response): The resolved URL and the response from the HTTP query
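
Following a single redirect could look roughly like the sketch below, written for Python 3, where `httplib` is named `http.client`. The names `next_location` and `resolve_once` are hypothetical, the User-Agent and proxy options are left out, and the use of a HEAD request is an assumption; none of this is confirmed as the package's behaviour.

```python
import http.client
from urllib.parse import urlsplit, urljoin

REDIRECT_CODES = {301, 302, 303, 307, 308}


def next_location(url, status, location):
    """Return the redirect target for one response, or url if there is none."""
    if status in REDIRECT_CODES and location:
        # Location may be relative, so resolve it against the requested URL.
        return urljoin(url, location)
    return url


def resolve_once(url, timeout=10):
    """Issue one HEAD request and return (resolved_url, response)."""
    parts = urlsplit(url)
    if parts.scheme == "https":
        conn = http.client.HTTPSConnection(parts.netloc, timeout=timeout)
    else:
        conn = http.client.HTTPConnection(parts.netloc, timeout=timeout)
    path = parts.path or "/"
    if parts.query:
        path += "?" + parts.query
    conn.request("HEAD", path)
    res = conn.getresponse()
    return next_location(url, res.status, res.getheader("Location")), res
```

Keeping the status and header logic in `next_location` means it can be checked without any network access.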

urlclean.unmeta(url, res)

 Finds any meta redirects in an httplib.response object that has text/html as its content-type.

 Args:

 url (str): The URL for which to follow one redirect

 res (httplib.response): An httplib.response object

 Returns:

 (str): The resolved URL
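
A META redirect is an HTML tag such as `<meta http-equiv="refresh" content="0; url=/landing">`. The sketch below shows one way to detect it with the Python 3 standard library. It takes the page body as a string instead of a response object, and the name `meta_redirect` is hypothetical, so treat it as an illustration and not as the package's implementation.

```python
import re
from html.parser import HTMLParser
from urllib.parse import urljoin


class _MetaRefreshParser(HTMLParser):
    """Records the target of the first <meta http-equiv="refresh"> tag."""

    def __init__(self):
        super().__init__()
        self.target = None

    def handle_starttag(self, tag, attrs):
        if tag != "meta" or self.target is not None:
            return
        attrs = dict(attrs)
        if (attrs.get("http-equiv") or "").lower() != "refresh":
            return
        # content looks like "5; url=http://example.com/"; quotes are optional.
        match = re.search(r"url\s*=\s*['\"]?([^'\";]+)",
                          attrs.get("content") or "", re.I)
        if match:
            self.target = match.group(1).strip()


def meta_redirect(url, html):
    """Return the META refresh target resolved against url, or url itself."""
    parser = _MetaRefreshParser()
    parser.feed(html)
    return urljoin(url, parser.target) if parser.target else url
```

The regular expression stops at the first semicolon or quote, so a target URL containing a semicolon would be cut short; a fuller parser would need to handle that case.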

urlclean.unshorten(url, cache=None, ua=None, …)
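
The signature and description of unshorten are cut off in this listing, so the following is only a sketch of how repeated single-step resolution with a cache might be combined. The names `follow_all`, `step`, and `max_hops` are hypothetical, and `step` stands in for any function that resolves one redirect and returns the next URL.

```python
def follow_all(url, step, cache=None, max_hops=10):
    """Apply step repeatedly until the URL stops changing (hypothetical helper).

    step:  callable taking a URL and returning the URL after one redirect
    cache: optional dict mapping already-resolved URLs to their final target
    """
    cache = {} if cache is None else cache
    if url in cache:
        return cache[url]
    seen = {url}
    current = url
    for _ in range(max_hops):
        nxt = step(current)
        if nxt == current or nxt in seen:
            break  # no further redirect, or a redirect loop
        seen.add(nxt)
        current = nxt
    cache[url] = current
    return current


if __name__ == "__main__":
    hops = {"http://s.example/a": "http://s.example/b",
            "http://s.example/b": "http://example.com/final"}
    print(follow_all("http://s.example/a", lambda u: hops.get(u, u)))
```

The hop limit and the set of visited URLs keep the loop from running forever when two short links point at each other.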

Last updated on: February 19th, 2012, 22:06 GMT
Price: Free
Developed by: Stefan Marsiske
License type: AGPL v3
Category: ROOT \ Internet \ HTTP (WWW)

What's New in version 0.5.1
  • Install/doc fixes