The Design of Software (CLOSED)

A public forum for discussing the design of software, from the user interface to the code architecture. Now closed.

The "Design of Software" discussion group has been merged with the main Joel on Software discussion group.

The archives will remain online indefinitely.

XML parsing and design

I have a very primitive question about writing, or I should say designing, an XML parser. I am trying to use the Java DOM API to parse a large and very complex XML file. The difficulty is not so much the amount of data crammed into the file as the number of distinct elements (500+).

I could write one huge class that encapsulates all the parsing, walks each node tree (namespace), and copies the required information into the appropriate "model" objects. What I want to know is: is there a design pattern that would help structure my parser class?

I would appreciate any comments.
Aria Kockochka
Sunday, May 15, 2005
Perhaps you should consider SAX instead if the file is large, or XPath if you prefer. Otherwise your program will turn into one vast switch statement or something similar.

Sunday, May 15, 2005
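To illustrate the XPath suggestion above, here is a minimal sketch using the `javax.xml.xpath` API that ships with the JDK. The sample document, its element names, and the `XPathSketch` class are hypothetical and only stand in for the poster's real file, which is not shown in the thread.

```java
import java.io.StringReader;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathExpressionException;
import javax.xml.xpath.XPathFactory;
import org.xml.sax.InputSource;

class XPathSketch {
    // Hypothetical sample document; the element names are illustrative only.
    static final String XML =
        "<order><customer><name>Ann</name></customer>"
      + "<item sku=\"A1\"/><item sku=\"B2\"/></order>";

    // Evaluates one XPath expression against the XML text and returns
    // the result converted to a string.
    static String evaluate(String expression, String xml) {
        XPath xpath = XPathFactory.newInstance().newXPath();
        try {
            return xpath.evaluate(expression, new InputSource(new StringReader(xml)));
        } catch (XPathExpressionException e) {
            throw new IllegalArgumentException("Bad expression: " + expression, e);
        }
    }

    public static void main(String[] args) {
        // Each value is addressed directly, with no hand-written tree walk.
        System.out.println(evaluate("/order/customer/name", XML));
        System.out.println(evaluate("/order/item[2]/@sku", XML));
    }
}
```

Each model field then maps to one expression instead of one branch in a switch. This sketch reparses the text on every call, so for a large file it would probably be better to parse once into a DOM `Document` and evaluate the expressions against that.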
Indeed, with DOM the code would be cluttered with countless if/else statements. But besides the performance benefits, how does SAX alleviate the situation?
Aria Kockochka
Monday, May 16, 2005
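One possible answer to the question above, sketched under assumptions: SAX by itself does not remove the branching, but its callback model makes it easy to replace the if/else chain with a lookup table keyed by element name. The `DispatchingHandler` and `SaxDispatchSketch` classes and the element names `name` and `city` are hypothetical; only the SAX calls come from the JDK.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Consumer;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical handler: looks up an action by element name instead of branching.
class DispatchingHandler extends DefaultHandler {
    private final Map<String, Consumer<String>> actions = new HashMap<>();
    private final StringBuilder text = new StringBuilder();

    void register(String elementName, Consumer<String> action) {
        actions.put(elementName, action);
    }

    @Override
    public void startElement(String uri, String localName, String qName, Attributes atts) {
        text.setLength(0); // start collecting text for the new element
    }

    @Override
    public void characters(char[] ch, int start, int length) {
        text.append(ch, start, length); // may be called several times per element
    }

    @Override
    public void endElement(String uri, String localName, String qName) {
        Consumer<String> action = actions.get(qName);
        if (action != null) {
            action.accept(text.toString().trim());
        }
        text.setLength(0);
    }
}

class SaxDispatchSketch {
    // Parses the XML text and records the values of the registered elements.
    static List<String> parse(String xml) {
        List<String> seen = new ArrayList<>();
        DispatchingHandler handler = new DispatchingHandler();
        handler.register("name", value -> seen.add("name=" + value));
        handler.register("city", value -> seen.add("city=" + value));
        try {
            SAXParserFactory.newInstance().newSAXParser()
                .parse(new InputSource(new StringReader(xml)), handler);
        } catch (Exception e) {
            throw new IllegalStateException("Parsing failed", e);
        }
        return seen;
    }
}
```

With 500+ element types the table grows, but each entry stays small and independent, and unregistered elements are simply skipped. The sketch uses the parser's default non-namespace-aware mode, so `qName` is the plain element name; a namespaced document would need `setNamespaceAware(true)` and a key built from `uri` and `localName`.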
I would suggest modeling the syntax of your XML document in XML Schema and then using a tool like Castor to generate classes that serialize and deserialize your XML.
Monday, May 16, 2005
Take a look at Apache's Digester.
Monday, May 16, 2005
Or JiBX, which does a similar thing but modifies your class files at compile time to add the appropriate serializer/deserializer code. Very fast and neat. It has a tool that can generate most of your binding file for you, assuming that you name your object model classes in line with your XML elements.

Apart from that there are a million and one other Java to XML binding frameworks that could help.
Simon Collins
Monday, May 16, 2005
Actually, let me make a suggestion...

Take the big XML object, pick off individual nodes, and apply XSL stylesheets to them to create individual XML objects in a format that works for you.

Once you have that format, make a 1:1 relationship between your classes and these structures.

This will probably be the most flexible approach, and it keeps changes to individual objects/structures from cascading through the entire class structure.
KC
Monday, May 16, 2005
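KC's suggestion can be sketched with the JDK's `javax.xml.transform` API: pick one node out of the big document, copy it into a document of its own, and run a stylesheet over it to get a simpler structure that maps 1:1 onto a class. The stylesheet, the element names `cust` and `nm`, and the `XslNormalizeSketch` class are hypothetical and serve only as an illustration.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;

class XslNormalizeSketch {
    // Hypothetical stylesheet: turns <cust><nm>..</nm></cust> into <customer name=".."/>.
    static final String XSL =
        "<xsl:stylesheet version=\"1.0\" xmlns:xsl=\"http://www.w3.org/1999/XSL/Transform\">"
      + "<xsl:template match=\"/cust\"><customer name=\"{nm}\"/></xsl:template>"
      + "</xsl:stylesheet>";

    private static DocumentBuilderFactory factory() {
        DocumentBuilderFactory dbf = DocumentBuilderFactory.newInstance();
        dbf.setNamespaceAware(true); // XSLT expects a namespace-aware DOM
        return dbf;
    }

    // Returns the first element with the given tag name from the XML text.
    static Node firstElement(String xml, String tagName) {
        try {
            Document doc = factory().newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)));
            return doc.getElementsByTagName(tagName).item(0);
        } catch (Exception e) {
            throw new IllegalStateException("Parsing failed", e);
        }
    }

    // Copies the node into its own document and applies the stylesheet to it.
    static String normalize(Node node) {
        try {
            Document single = factory().newDocumentBuilder().newDocument();
            single.appendChild(single.importNode(node, true));
            Transformer transformer = TransformerFactory.newInstance()
                .newTransformer(new StreamSource(new StringReader(XSL)));
            transformer.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");
            StringWriter out = new StringWriter();
            transformer.transform(new DOMSource(single), new StreamResult(out));
            return out.toString();
        } catch (Exception e) {
            throw new IllegalStateException("Transform failed", e);
        }
    }
}
```

For example, `normalize(firstElement(xml, "cust"))` yields a small `customer` element that a single class can read. If the source format later changes, only the stylesheet for that node needs updating, which is the isolation KC describes.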
I was not aware of any of these "binding" tools and frameworks, which simplify the job tremendously. I've skimmed over Digester, Castor, and JiBX, and they all seem very helpful, not just for parsing and generating XML but also for their generic compile-time and run-time capabilities. Thanks.

KC, breaking the XML file down into its namespaces or major elements and mapping each to a class was basically my first approach, but I wasn't sure whether there was a better-defined design that I had overlooked. Then again, with such a 1:1 binding, you are only delegating those nasty if/else statements to simpler objects.
Aria Kockochka
Monday, May 16, 2005

This topic is archived. No further replies will be accepted.
