lxml Changelog

What's new in lxml 3.4.1

Jan 3, 2015

Features added:
New htmlfile HTML generator to accompany the incremental xmlfile serialisation API. Patch by Burak Arslan.
Bugs fixed:
lxml.sax.ElementTreeContentHandler did not initialise its superclass.
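
The htmlfile generator mentioned above can be sketched as follows. This is a minimal illustration, assuming htmlfile is used the same way as the xmlfile context manager; the in-memory buffer and the variable names are chosen for the example only.

```python
import io

from lxml import etree

# Collect the generated markup in memory; a real application would
# more likely pass an open binary file or a socket-like object.
buffer = io.BytesIO()

with etree.htmlfile(buffer) as hf:
    with hf.element("html"):
        with hf.element("body"):
            with hf.element("p"):
                hf.write("first line")
                # Elements written in HTML mode are serialised as HTML,
                # so a void element such as <br> gets no closing tag.
                hf.write(etree.Element("br"))
                hf.write("second line")

html_text = buffer.getvalue().decode("utf-8")
print(html_text)
```

Nesting the element() context managers mirrors the structure of the output, and nothing has to be held in memory as a complete tree before it is written.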

New in lxml 3.3.1 (Feb 12, 2014)

New in lxml 3.0 Alpha 2 (Aug 27, 2012)

New in lxml 2.3.5 (Aug 1, 2012)

New in lxml 2.3.4 (Mar 27, 2012)

New in lxml 2.3.2 (Nov 14, 2011)

New in lxml 2.3.1 (Sep 26, 2011)

Features added:
New option kill_tags in lxml.html.clean to remove specific tags and their content (i.e. their whole subtree).
New pi.get() method and pi.attrib property on processing instructions that parse pseudo-attributes from the text content of the processing instruction.
lxml.get_include() returns a list of include paths that can be used to compile external C code against lxml.etree. This is specifically required for statically linked lxml builds when code needs to compile against the exact same header file versions as lxml itself.
Resolver.resolve_file() takes an additional option close_file that controls whether the file(-like) object is closed after reading. By default, the file is closed, as the user is not expected to keep a reference to it.
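
Two of the features above, pseudo-attribute access on processing instructions and lxml.get_include(), can be tried out with a short sketch like the one below. The my-app processing instruction and its pseudo-attributes are made up for the example.

```python
import lxml
from lxml import etree

# A document containing a (hypothetical) application-specific
# processing instruction with two pseudo-attributes.
doc = etree.fromstring('<root><?my-app mode="fast" level="3"?></root>')
pi = doc[0]

# pi.get() looks up a single pseudo-attribute by name.
mode = pi.get("mode")

# pi.attrib exposes all pseudo-attributes parsed from the PI text.
pseudo_attrs = dict(pi.attrib)

# Include directories for compiling external C code against lxml.etree.
include_dirs = lxml.get_include()

print(pi.target, mode, pseudo_attrs)
print(include_dirs)
```

The list returned by get_include() would typically be passed as include_dirs to the extension build configuration, which is left out here.
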
Bugs fixed:
HTML cleaning didn't remove 'data:' links.
The html5lib parser integration now uses the 'official' implementation in html5lib itself, which makes it work with newer releases of the library.
In lxml.sax, endElementNS() could incorrectly reject a plain tag name when the corresponding start event inferred the same plain tag name to be in the default namespace.
When an open file-like object is passed into parse() or iterparse(), the parser will no longer close it after use. This reverts a change in lxml 2.3 where all files would be closed. It is the user's responsibility to properly close the file(-like) object, including in error cases.
Assertion error in lxml.html.clean when discarding top-level elements.
In lxml.cssselect, use the XPath 'A//B' (short for 'A/descendant-or-self::node()/B') instead of 'A/descendant::B' for the CSS descendant selector ('A B'). This makes a few edge cases consistent with the selector behavior in WebKit and Firefox, and makes more CSS expressions valid location paths (for use in xsl:template match).
In lxml.html, non-selected tags no longer show up in the collected form values.
Adding/removing values to/from a multiple select form field now properly selects and unselects them.
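
Since parse() no longer closes file-like objects that are passed in, the caller has to do it. A minimal sketch of one way to handle this, using a with block so the object is also closed if parsing fails (the in-memory source stands in for a real file):

```python
import io

from lxml import etree

# The with block closes the file-like object on exit,
# even if etree.parse() raises an exception.
with io.BytesIO(b"<root><child/></root>") as source:
    tree = etree.parse(source)
    # The parser has finished reading but leaves the object open.
    open_after_parse = not source.closed

closed_after_block = source.closed
print(tree.getroot().tag, open_after_parse, closed_after_block)
```
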
Other changes:
Static builds can specify the download directory with the --download-dir option.

New in lxml 2.3 (Feb 7, 2011)

New in lxml 2.2.8 / 2.3 Beta 1 (Sep 7, 2010)

New in lxml 2.2.8 (Sep 2, 2010)

New in lxml 2.2.7 (Jul 25, 2010)

New in lxml 2.2.6 (Mar 2, 2010)

New in lxml 2.2.5 (Feb 28, 2010)

New in lxml 2.2.2 (Jun 22, 2009)

New in lxml 2.2.1 (Jun 3, 2009)

New in lxml 2.2 (Mar 21, 2009)

New in lxml 2.2 Beta 4 (Feb 27, 2009)

New in lxml 2.2 Beta 3 (Feb 18, 2009)

New in lxml 2.2 Beta 2 (Jan 26, 2009)

New in lxml 2.1.5 (Jan 6, 2009)

New in lxml 2.2 Alpha 1 (Nov 24, 2008)

New in lxml 2.1.3 (Nov 18, 2008)