NodeTree

Most XML libraries fit into one of two categories. They either parse XML streams with callbacks for each event encountered but leave it to the user to store and organize these events (such as expat or SAX), or they parse the entire XML document into memory in one batch and return a handle to the document's root element only after parsing has finished (DOM and ElementTree).
While the latter is much easier to work with, it requires that the entire XML stream be available before any of it can be processed, and it loads the entire stream into memory even when only a piece of it needs to be evaluated at a time.
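To make the contrast concrete, here is a small sketch using only the Python standard library (xml.sax and xml.etree.ElementTree, not NodeTree). The sample document and the EntryCollector class are made up for illustration. The callback version has to track its own state and store its own results, while the tree version gets a finished document to query but only after everything has been parsed.

```python
import xml.sax
import xml.etree.ElementTree as ET

# Made-up sample document, used only for this illustration.
XML = "<log><entry id='1'>start</entry><entry id='2'>stop</entry></log>"


class EntryCollector(xml.sax.ContentHandler):
    """Callback style: the handler tracks state and stores results itself."""

    def __init__(self):
        super().__init__()
        self.entries = []
        self._current_id = None
        self._text = []

    def startElement(self, name, attrs):
        if name == "entry":
            self._current_id = attrs.get("id")
            self._text = []

    def characters(self, content):
        # Text may arrive in several chunks, so collect the pieces.
        if self._current_id is not None:
            self._text.append(content)

    def endElement(self, name):
        if name == "entry":
            self.entries.append((self._current_id, "".join(self._text)))
            self._current_id = None


handler = EntryCollector()
xml.sax.parseString(XML.encode("utf-8"), handler)
sax_entries = handler.entries

# Tree style: the whole document is parsed first, then queried.
root = ET.fromstring(XML)
tree_entries = [(e.get("id"), e.text) for e in root.findall("entry")]
```

Both approaches produce the same list of (id, text) pairs; the difference is how much bookkeeping the caller does and when the data becomes available.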
With NodeTree we seek a hybrid of these two techniques. Callbacks can be set for virtually every stage of processing, but what is returned is the (possibly incomplete) object being processed. Nodes which have been fully processed can be removed from the tree during processing to save memory, and the user can even specify an alternative class to use when creating the child nodes of an element. The goal is a clean, Pythonic API usable for everything from the most basic to the most advanced XML processing.
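The general idea of handling nodes as they complete and then discarding them can be approximated with the standard library's ElementTree.iterparse. The sketch below is not NodeTree's API; it only shows the pattern under that assumption, and the sample document and the iter_entries helper are made up for illustration.

```python
import io
import xml.etree.ElementTree as ET

# Made-up sample document, used only for this illustration.
XML = b"<log><entry id='1'>start</entry><entry id='2'>stop</entry></log>"


def iter_entries(stream):
    """Yield (id, text) for each <entry> as soon as it has been fully parsed."""
    context = ET.iterparse(stream, events=("start", "end"))
    _, root = next(context)  # the first event is the start of the root element
    for event, elem in context:
        if event == "end" and elem.tag == "entry":
            yield elem.get("id"), elem.text
            # Drop finished children so memory use stays roughly constant.
            root.clear()


entries = list(iter_entries(io.BytesIO(XML)))
```

Because iter_entries is a generator, a caller can start working on the first entry before the rest of the stream has been read.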
NodeTree is similar to the familiar ElementTree API with a few notable differences:
- Element.tag has been renamed to Element.name
- Element attributes are a dictionary at Element.attributes
- Elements are sequences of their children
- Text inside an element is a child node, not an Element.text property, so the order of text and child elements is preserved and available. Text nodes are simply strings, so you can just call Element.append('text').
- Nodes work by duck typing and can be freely mixed with nodes from other XML libraries, including (with very little work) ElementTree or DOM
- All nodes can be converted to XML strings with their __str__ method
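The data model described in the list above can be pictured with a small stand-in class. SketchElement is hypothetical and is not NodeTree's real class or constructor signature; it only mirrors the listed properties (a name, an attributes dictionary, children held as a sequence, text as plain string children, and __str__ producing XML).

```python
from xml.sax.saxutils import escape, quoteattr


class SketchElement(list):
    """Hypothetical stand-in for illustration, not NodeTree's actual class."""

    def __init__(self, name, attributes=None, children=()):
        super().__init__(children)  # the element itself is the child sequence
        self.name = name
        self.attributes = dict(attributes or {})

    def __str__(self):
        attrs = "".join(
            f" {key}={quoteattr(value)}" for key, value in self.attributes.items()
        )
        if not self:
            return f"<{self.name}{attrs}/>"
        # Text children are plain strings; anything else serializes itself.
        body = "".join(
            escape(child) if isinstance(child, str) else str(child)
            for child in self
        )
        return f"<{self.name}{attrs}>{body}</{self.name}>"


p = SketchElement("p", {"class": "note"})
p.append("Hello ")                                 # text is just a string child
p.append(SketchElement("b", children=["world"]))
p.append("!")
xml_text = str(p)
```

Since text and elements share one child sequence, mixed content such as "Hello ", a b element, and "!" keeps its document order, which is the point of the text-as-child-node design.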