
treebuilder.TreeBuilder Class Reference


Detailed Description

TreeBuilder Sep 12, 2004.

This class supports the construction of a lightweight in-memory tree from an XML file. Like the W3C DOM Document object, the structure of this tree reflects the structure of the XML.

The tree that is constructed is designed for rapid traversal and in-memory modification. It also has the advantage of using less memory than the java.sun.com DOM Document implementation.

This code demonstrates how little source code is required to parse XML using the XmlPullParser.

A few notes about attributes and namespaces. In general I think that the XmlPullParser rocks. The API is well designed and the calls mostly make sense. But the XmlPullParser does not treat all attributes the same way. In particular, it does not treat namespace definitions like other attributes. This can be seen in XML designed to be processed via a schema. For example:

          <ex:EXPRESSION 
              xmlns:ex="http://www.bearcave.com/expression" 
              xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" 
              xsi:schemaLocation="http://www.bearcave.com/expression xmlexpr/expression.xsd">

 

Here there are two namespace definitions: one defining the namespace associated with the "ex" prefix and one associated with XML schemas (for the schema location attribute).

The getAttributeCount() method will return 1 when processing the EXPRESSION tag. This is because the namespace definitions (the attributes with the xmlns prefix) are not treated as normal attributes. Sometimes this is good, because these attributes are not necessarily of interest. In this case, however, the intent is to exactly mirror the XML in an in-memory tree, so that if the tree is serialized the original XML will be recovered (with the exception of white space TEXT, since this is not included).
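The split between attributes and namespace definitions can be observed on a bare JDK. The sketch below is not the author's code: it swaps in the JDK's StAX XMLStreamReader in place of the XmlPullParser (an assumption made so the example needs no extra library), because its getAttributeCount() also leaves namespace declarations out of the count. The class name AttrCountSketch and the self-closing form of the EXPRESSION tag are illustrative.

```java
import java.io.StringReader;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamReader;

class AttrCountSketch {
    // The EXPRESSION tag from the text, closed so that it is a complete document.
    static final String XML =
        "<ex:EXPRESSION"
        + " xmlns:ex=\"http://www.bearcave.com/expression\""
        + " xmlns:xsi=\"http://www.w3.org/2001/XMLSchema-instance\""
        + " xsi:schemaLocation=\"http://www.bearcave.com/expression xmlexpr/expression.xsd\"/>";

    /** Returns {attribute count, namespace declaration count} for the first start tag. */
    static int[] countsForRoot(String xml) {
        try {
            XMLInputFactory factory = XMLInputFactory.newInstance();
            // Namespace awareness is what separates xmlns declarations from attributes.
            factory.setProperty(XMLInputFactory.IS_NAMESPACE_AWARE, Boolean.TRUE);
            XMLStreamReader reader = factory.createXMLStreamReader(new StringReader(xml));
            while (reader.hasNext()) {
                if (reader.next() == XMLStreamConstants.START_ELEMENT) {
                    return new int[] { reader.getAttributeCount(), reader.getNamespaceCount() };
                }
            }
            throw new IllegalStateException("no start tag found");
        } catch (XMLStreamException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        int[] counts = countsForRoot(XML);
        System.out.println("attributes=" + counts[0] + " namespaces=" + counts[1]);
    }
}
```

Only xsi:schemaLocation is counted as an attribute here; the two xmlns declarations are reported through the separate namespace count.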

In this code the namespace definitions are picked up when the END_TAG event for the document tag is processed. The getDepth() method reports the current XML nesting depth, so it can be determined that an END_TAG is the document end tag. The namespace definitions are then fetched and prepended to the attribute list. Sort of awkward. This is the one place where I would differ in the design of the XmlPullParser. I'd treat all attributes the same way, including those with the "xmlns" prefix. Then the user can simply ignore attributes with the namespace prefix.

Author:
Ian Kaplan, www.bearcave.com, iank@bearcave.com


Public Member Functions

TreeNode parseXML (FileReader reader) throws XmlPullParserException, IOException
 This is the public entry point for the TreeBuilder.

Private Member Functions

Attribute buildAttr (int index)
Attribute buildNS (int index) throws XmlPullParserException
void addNamespaces (int depth) throws XmlPullParserException
 This method is called for the end tag of the root document tag.
TagNode buildTagNode ()
 Build a tag node.
TreeNode buildTree () throws XmlPullParserException, IOException
 Recursively parse an XML file into an in-memory tree data structure.
XmlPullParser getParser () throws XmlPullParserException
 Allocate and initialize an XmlPullParser.

Private Attributes

XmlPullParser mParser = null
TreeNode mDocumentTag = null


Member Function Documentation

void treebuilder.TreeBuilder.addNamespaces (int depth) throws XmlPullParserException [private]
 

This method is called for the end tag of the root document tag.

If there are namespaces, they would have been defined in this tag. Insert them at the front of the attribute list.

Exceptions:
XmlPullParserException 
    {
        TagNode tag = (TagNode)mDocumentTag;
        AttributeList attrList = tag.getAttrList();

        int nsStart = mParser.getNamespaceCount( depth-1 );
        int nsEnd = mParser.getNamespaceCount( depth );
        for (int i = nsEnd-1; i >= nsStart; i--) {
            Attribute attr = buildNS( i );
            attrList.insert( attr );
        }
    } // addNamespaces
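The backward loop matters: each namespace is inserted at the front of the list, so walking from the last index down to the first leaves the namespaces in declaration order. The sketch below illustrates only that index arithmetic. It assumes a plain array as a stand-in for the parser's namespace stack and a java.util.List in place of AttributeList; the class and parameter names are hypothetical.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class NamespacePrependSketch {
    /**
     * prefixes: flat namespace stack in declaration order.
     * countAtDepth[d]: number of stack entries visible at depth d,
     * a stand-in for getNamespaceCount(d).
     */
    static List<String> prependNamespaces(List<String> attrs, String[] prefixes,
                                          int[] countAtDepth, int depth) {
        List<String> result = new ArrayList<>(attrs);
        int nsStart = countAtDepth[depth - 1];
        int nsEnd = countAtDepth[depth];
        // Walk backward so that inserting at the front keeps declaration order.
        for (int i = nsEnd - 1; i >= nsStart; i--) {
            result.add(0, "xmlns:" + prefixes[i]);
        }
        return result;
    }

    public static void main(String[] args) {
        String[] prefixes = { "ex", "xsi" };
        int[] countAtDepth = { 0, 2 };   // nothing at depth 0, two declarations at depth 1
        List<String> attrs = Arrays.asList("xsi:schemaLocation");
        System.out.println(prependNamespaces(attrs, prefixes, countAtDepth, 1));
        // prints [xmlns:ex, xmlns:xsi, xsi:schemaLocation]
    }
}
```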

Attribute treebuilder.TreeBuilder.buildAttr (int index) [private]
 

    {
        String name = mParser.getAttributeName( index );
        String prefix = mParser.getAttributePrefix( index );
        String namespace = mParser.getAttributeNamespace( index );

        Attribute attr = new Attribute( name, prefix, namespace );

        String attrType = mParser.getAttributeType( index );
        String attrVal = mParser.getAttributeValue( index );

        attr.setAttrType( attrType );
        attr.setValue( attrVal );
        return attr;
    } // buildAttr

Attribute treebuilder.TreeBuilder.buildNS (int index) throws XmlPullParserException [private]
 

    {
        String nsName = mParser.getNamespacePrefix( index );
        String uri = mParser.getNamespaceUri( index );
        Attribute attr = new Attribute( nsName, "xmlns", uri );
        attr.setValue( uri );
        return attr;
    }

TagNode treebuilder.TreeBuilder.buildTagNode () [private]
 

Build a tag node.

Note that a tag node always has an AttributeList object, even if the tag has no attributes. This wastes some memory, but in theory should make the tree processing more regular, since it can always be assumed that this object exists.

    {
        String name = mParser.getName();
        String prefix = mParser.getPrefix();
        String namespace = mParser.getNamespace();
        TagNode tag = new TagNode( name, prefix, namespace );

        AttributeList attrList = new AttributeList();
        int numAttr = mParser.getAttributeCount();
        for (int i = 0; i < numAttr; i++) {
            Attribute attr = buildAttr( i );
            attrList.append( attr );
        } // for
        tag.setAttrList( attrList );
        return tag;
    } // buildTagNode
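The payoff of always allocating the attribute list is that code walking the tree never needs a null check. A minimal sketch of that idea follows. The Tag class is a hypothetical stand-in for TagNode, and a java.util.List of strings stands in for AttributeList; neither reflects the real classes' APIs.

```java
import java.util.ArrayList;
import java.util.List;

class EmptyAttrListSketch {
    /** Hypothetical stand-in for TagNode: the attribute list is never null. */
    static final class Tag {
        final String name;
        final List<String> attrs = new ArrayList<>();
        Tag(String name) { this.name = name; }
    }

    /** Serializes a start tag; no null check is needed because attrs always exists. */
    static String startTag(Tag tag) {
        StringBuilder sb = new StringBuilder("<").append(tag.name);
        for (String attr : tag.attrs) {
            sb.append(' ').append(attr);
        }
        return sb.append('>').toString();
    }

    public static void main(String[] args) {
        Tag bare = new Tag("a");
        Tag withAttr = new Tag("b");
        withAttr.attrs.add("id=\"1\"");
        System.out.println(startTag(bare));      // prints <a>
        System.out.println(startTag(withAttr));  // prints <b id="1">
    }
}
```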

TreeNode treebuilder.TreeBuilder.buildTree () throws XmlPullParserException, IOException [private]
 

Recursively parse an XML file into an in-memory tree data structure.

Currently this code only builds tree nodes for the COMMENT, TEXT and START_TAG events returned by the XmlPullParser. The END_TAG and END_DOCUMENT events end the recursion, and all other events are ignored.

Exceptions:
XmlPullParserException 
IOException 
    {
        TreeNode root = null;
        TreeNode child = null;
        TreeNode curSib = null;
        boolean done = false;
        do {
            int event = mParser.nextToken();
            if (event == XmlPullParser.START_TAG) {
                root = buildTagNode();
                if (mParser.getDepth() == 1) {
                    mDocumentTag = root;
                }
                // collect the children until the recursion returns an EndTag marker
                for (TreeNode t = buildTree(); !(t instanceof EndTag); t = buildTree()) {
                    if (child == null) {
                        child = t;
                        curSib = child;
                    }
                    else {
                        curSib.setSibling( t );
                        curSib = t;
                    }
                } // for
                root.setChild( child );
                done = true;
            }
            else if (event == XmlPullParser.COMMENT) {
                String comment = mParser.getText();
                root = new TextNode( TreeNodeType.COMMENT, comment );
                done = true;
            }
            else if (event == XmlPullParser.TEXT) {
                String text = mParser.getText();
                root = new TextNode( text );
                done = true;
            }
            else if (event == XmlPullParser.END_TAG) {
                String name = mParser.getName();
                root = new EndTag( name );
                int depth = mParser.getDepth();
                if (depth == 1) {
                    // add the namespace attributes to the document tag, if they exist
                    addNamespaces( depth );
                }
                done = true;
            }
            else if (event == XmlPullParser.END_DOCUMENT) {
                root = new EndTag( "END DOCUMENT" );
                done = true;
            }
        } while (! done);
        return root;
    } // buildTree
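The tree built here is a first-child, next-sibling structure: each node points at its first child, and the children are chained through their sibling links. The sketch below shows that shape and a depth-first walk over it. The Node class is a hypothetical stand-in for TreeNode (only setChild and setSibling are known from the code above), and the element labels are invented for illustration.

```java
class SiblingTreeSketch {
    /** Hypothetical stand-in for TreeNode: first child plus next sibling. */
    static final class Node {
        final String label;
        Node child;
        Node sibling;
        Node(String label) { this.label = label; }
    }

    /** Chains the children in order, mirroring the child/curSib bookkeeping in buildTree. */
    static Node withChildren(String label, Node... kids) {
        Node parent = new Node(label);
        Node curSib = null;
        for (Node k : kids) {
            if (parent.child == null) {
                parent.child = k;       // first child hangs off the parent
            } else {
                curSib.sibling = k;     // later children hang off the previous one
            }
            curSib = k;
        }
        return parent;
    }

    /** Depth-first walk: visit a node, then its children, then follow the sibling link. */
    static void walk(Node n, int depth, StringBuilder out) {
        for (Node cur = n; cur != null; cur = cur.sibling) {
            for (int i = 0; i < depth; i++) {
                out.append("  ");
            }
            out.append(cur.label).append('\n');
            walk(cur.child, depth + 1, out);
        }
    }

    public static void main(String[] args) {
        Node root = withChildren("EXPRESSION",
                withChildren("ADD", new Node("1"), new Node("2")),
                new Node("comment"));
        StringBuilder out = new StringBuilder();
        walk(root, 0, out);
        System.out.print(out);
    }
}
```

Traversal needs only two pointers per node and no per-node child collection, which fits the stated goal of a small, quickly traversed tree.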

XmlPullParser treebuilder.TreeBuilder.getParser () throws XmlPullParserException [private]
 

Allocate and initialize an XmlPullParser.

At the time this code was written the XmlPullParser did not support validation, so the call to setValidating() is passed "false".

    {
        XmlPullParserFactory factory;
        factory = XmlPullParserFactory.newInstance();
        factory.setNamespaceAware( true );
        factory.setValidating( false );
        XmlPullParser parser = factory.newPullParser();
        return parser;
    } // getParser

TreeNode treebuilder.TreeBuilder.parseXML (FileReader reader) throws XmlPullParserException, IOException
 

This is the public entry point for the TreeBuilder.

It is passed a FileReader, which has been opened for an XML file.

    {
        TreeNode root = null;
        mParser = getParser();
        if (mParser != null) {
            mParser.setInput( reader );
            root = buildTree();
        }
        return root;
    } // parseXML


Member Data Documentation

TreeNode treebuilder.TreeBuilder.mDocumentTag = null [private]
 

XmlPullParser treebuilder.TreeBuilder.mParser = null [private]
 


Generated on Tue Sep 21 22:08:43 2004 for Building an in-memory tree using the XmlPullParser by doxygen 1.3.8