Arabica is a C++ XML parser toolkit. It provides a full implementation of SAX2 (the Simple API for XML), including the optional interfaces and helper classes, and it also implements the W3C DOM (Document Object Model) Level 2 Core together with XPath 1.0.
It delivers data as UTF-8 encoded std::string or UCS-2 std::wstring values, and can also accommodate custom string types and alternative encodings. Arabica provides uniform SAX2 wrappers for the expat parser, Xerces, libxml, and the Microsoft XML parser COM component.
What's New in This Release:
· Taggle, a port of John Cowan's TagSoup package, was added.
· Taggle parses HTML as found in the wild, generating a well-formed stream of SAX events.
· The Taggle interface is identical to that of Arabica's XML parser, allowing HTML to be processed with XML tools.