March 28th, 2013
· New section detection via CSS class names.
· If your HTML marks up headings not with the h1-h6 tags but with CSS-styled p, div, or similar tags, the new section detection can help you create the proper section structure in DocBook.
· List detection via CSS class names.
· Sometimes HTML contains "lists" that are really just specially formatted paragraphs.
· The new list detection can help you reconstruct proper lists in DocBook.
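
The idea behind class-based section and list detection can be illustrated with a minimal Python sketch. This is not the converter's actual implementation or configuration format: the class names (`Heading1`, `Heading2`, `Bullet`), the `to_docbook` function, and the input shape (a list of `(css_class, text)` pairs already extracted from the HTML) are all assumptions made for illustration.

```python
from xml.sax.saxutils import escape

# Hypothetical class-name mappings; real documents will use their own names.
HEADING_CLASSES = {"Heading1": 1, "Heading2": 2}
LIST_ITEM_CLASSES = {"Bullet"}


def to_docbook(paragraphs):
    """Turn (css_class, text) pairs into nested DocBook sections and lists."""
    out = []
    open_levels = []  # stack of section levels that are currently open
    in_list = False

    def close_list():
        nonlocal in_list
        if in_list:
            out.append("</itemizedlist>")
            in_list = False

    for css_class, text in paragraphs:
        text = escape(text)
        if css_class in HEADING_CLASSES:
            close_list()
            level = HEADING_CLASSES[css_class]
            # Close sections at the same or a deeper level first.
            while open_levels and open_levels[-1] >= level:
                out.append("</section>")
                open_levels.pop()
            out.append(f"<section><title>{text}</title>")
            open_levels.append(level)
        elif css_class in LIST_ITEM_CLASSES:
            # Consecutive list-like paragraphs share one itemizedlist.
            if not in_list:
                out.append("<itemizedlist>")
                in_list = True
            out.append(f"<listitem><para>{text}</para></listitem>")
        else:
            close_list()
            out.append(f"<para>{text}</para>")

    close_list()
    out.extend("</section>" for _ in open_levels)
    return "\n".join(out)


sample = [
    ("Heading1", "Intro"),
    ("Body", "Hello"),
    ("Bullet", "one"),
    ("Bullet", "two"),
    ("Heading2", "Details"),
    ("Body", "More"),
    ("Heading1", "Next"),
]
result = to_docbook(sample)
print(result)
```

In this sketch a paragraph whose class maps to a heading level opens a new section (closing any open sections at the same or a deeper level), and runs of consecutive list-like paragraphs are grouped into a single itemizedlist.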
November 5th, 2012
· Fixed usage of invalid values for the align attribute.
· Fixed wrong normalization of literal environments.
October 25th, 2011
· Improved customization possibilities via Groovy scripts.
· Fixed an "invalid escape sequence" error in the parser.
March 10th, 2010
· Support for conversion of HTML fragments. Previously, you got an ugly NullPointerException when trying to convert an incomplete HTML document.
· Improved detection of the document title from the HTML code. The XPath expressions /html/head/title, //h(1-6)[1], and //text()[1] are now used to find a hopefully reasonable title.
· The new HTML parser did not handle HTML comments inside text correctly. These comments are now located and removed from the DocBook XML.
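
The title fallback described above can be approximated with a short Python sketch. It uses the standard library's `html.parser` rather than XPath, assumes the three sources are tried in the order listed, and reads the heading step as "the first h1-h6 element in document order"; the `TitleFinder` class and `guess_title` function are hypothetical names, not part of the converter.

```python
from html.parser import HTMLParser

HEADINGS = {"h1", "h2", "h3", "h4", "h5", "h6"}


class TitleFinder(HTMLParser):
    """Collects the <title>, the first heading, and the first text seen."""

    def __init__(self):
        super().__init__()
        self.current = None      # "title", "heading", or None
        self.title = None
        self.heading = None
        self.first_text = None
        self._buf = []

    def handle_starttag(self, tag, attrs):
        if self.current is not None:
            return  # already collecting; ignore nested tags
        if tag == "title" and self.title is None:
            self.current, self._buf = "title", []
        elif tag in HEADINGS and self.heading is None:
            self.current, self._buf = "heading", []

    def handle_endtag(self, tag):
        if self.current == "title" and tag == "title":
            self.title = "".join(self._buf).strip() or None
            self.current = None
        elif self.current == "heading" and tag in HEADINGS:
            self.heading = "".join(self._buf).strip() or None
            self.current = None

    def handle_data(self, data):
        if self.current is not None:
            self._buf.append(data)
        # Remember the first non-blank text outside <title> as a last resort.
        if self.current != "title" and self.first_text is None and data.strip():
            self.first_text = data.strip()


def guess_title(html):
    """Return the <title>, else the first heading, else the first text."""
    finder = TitleFinder()
    finder.feed(html)
    finder.close()
    return finder.title or finder.heading or finder.first_text


print(guess_title("<p>intro</p><h2>Head <!-- note -->line</h2>"))
```

Because `html.parser` reports comments through a separate `handle_comment` callback, which does nothing by default, comments inside text do not leak into the collected title in this sketch. It also accepts fragments without an html or head element.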