New in version 6.1.0 (March 28th, 2013)

  • New section detection via CSS class names. If your HTML marks up headings not with h1-h6 tags but with CSS-formatted p, div, or similar tags, the new section detection can help you create the proper section structure in DocBook.
  • List detection via CSS class names. Sometimes HTML contains "lists" that are really just specially formatted paragraphs. The new list detection can help you reconstruct proper lists in DocBook.
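
To illustrate the idea behind class-based section detection, here is a minimal Python sketch. It is not the converter's implementation or configuration format: the class names "chapter-title" and "section-title", the CLASS_TO_LEVEL mapping, and the find_headings helper are all hypothetical. It only shows how elements that look like ordinary p or div tags can be mapped to heading levels by their CSS class.

```python
from html.parser import HTMLParser

# Hypothetical mapping from CSS class names to section levels (an assumption
# for illustration; not the converter's actual configuration format).
CLASS_TO_LEVEL = {"chapter-title": 1, "section-title": 2}


class HeadingFinder(HTMLParser):
    """Collects (level, text) pairs for elements whose class marks a heading."""

    def __init__(self):
        super().__init__()
        self.headings = []
        self._level = None  # level of the heading currently being read
        self._tag = None    # tag name that opened the current heading
        self._depth = 0     # nesting depth of same-named tags inside it
        self._text = []

    def handle_starttag(self, tag, attrs):
        if self._level is not None:
            # Already inside a heading: only track nesting of the same tag.
            if tag == self._tag:
                self._depth += 1
            return
        classes = (dict(attrs).get("class") or "").split()
        for name in classes:
            if name in CLASS_TO_LEVEL:
                self._level = CLASS_TO_LEVEL[name]
                self._tag = tag
                self._depth = 0
                self._text = []
                break

    def handle_data(self, data):
        if self._level is not None:
            self._text.append(data)

    def handle_endtag(self, tag):
        if self._level is None or tag != self._tag:
            return
        if self._depth > 0:
            self._depth -= 1
            return
        self.headings.append((self._level, "".join(self._text).strip()))
        self._level = None


def find_headings(html):
    """Return the class-marked headings of an HTML string in document order."""
    finder = HeadingFinder()
    finder.feed(html)
    finder.close()
    return finder.headings
```

A converter could then open a DocBook section of the matching depth for each pair returned. Paragraphs that act as list items could presumably be recognized with the same kind of class lookup.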

New in version 6.0.2 (November 5th, 2012)

  • Fixed usage of invalid values for align attribute.
  • Fixed wrong normalization of literal environments.

New in version 5.4.0 (October 25th, 2011)

  • Improved customization possibilities via Groovy scripts.
  • A fix for "invalid escape sequence error" in the parser.

New in version 5.2.4 (March 10th, 2010)

  • Support for conversion of HTML fragments. Until now you got an ugly NullPointerException when trying to convert an incomplete HTML document.
  • Improved detection of the document title from the HTML code. Now the XPath expressions /html/head/title, //h(1-6)[1] and //text()[1] are used to find a hopefully reasonable title.
  • The new HTML parser didn't handle HTML comments inside text correctly. Now the comments are located and removed from the DocBook XML.
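
The title fallback chain described above can be sketched roughly as follows. This is an illustration in Python, not the converter's code: guess_title is a hypothetical helper, the input is assumed to be well-formed XHTML without namespaces, and "//h(1-6)[1]" is read here as "the first h1 to h6 heading in document order", which is an interpretation of the shorthand.

```python
import xml.etree.ElementTree as ET

HEADING_TAGS = {"h1", "h2", "h3", "h4", "h5", "h6"}


def guess_title(xhtml):
    """Return a plausible document title, or None if no text is found.

    Assumes well-formed XHTML with an <html> root and no XML namespaces.
    """
    root = ET.fromstring(xhtml)

    # 1. Prefer /html/head/title.
    title = root.findtext("head/title")
    if title and title.strip():
        return title.strip()

    # 2. Otherwise take the first non-empty heading in document order.
    for element in root.iter():
        if element.tag in HEADING_TAGS:
            text = "".join(element.itertext()).strip()
            if text:
                return text

    # 3. As a last resort, use the first non-empty text node.
    for text in root.itertext():
        if text.strip():
            return text.strip()

    return None
```

Each step is tried only when the previous one yields nothing, so a document with no title element and no headings still gets a title from its first piece of text.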