The Design of Software (CLOSED)

A public forum for discussing the design of software, from the user interface to the code architecture. Now closed.

The "Design of Software" discussion group has been merged with the main Joel on Software discussion group.

The archives will remain online indefinitely.

How to manipulate XML the right way?


I have an issue troubling me for some time now, and I want to see what other developers think.

My team's main job is to edit XML. Yes, we do WCF and other neat stuff, but at the end of the day we need to manipulate XML.

Until now we have loaded the document into an XmlDocument and made the changes we needed.

But I find that code ugly. Only the programmer who wrote it knows what he intended, and three developers used three different approaches with XmlDocument.

Because we documented it you can read it, but it takes a lot of effort.

I heard about xsd.exe and found it to be very similar to the DOM model.

There's also LINQ.

Has anyone modified XML without feeling like he was writing a script?

Is there a nice, correctly designed, and readable way to edit XML?
Wednesday, March 19, 2008
I'm probably misinterpreting what "editing XML" means in this context, but...

You aren't supposed to edit XML.  It's a machine format, albeit a bulky one that is clumsy to process.  Most substantial XML "documents" would make your eyes bleed.
So tired
Wednesday, March 19, 2008
You mean some language to take an XML document and transform it into another format, something like XSL Transformations (XSLT)?
Martin
Wednesday, March 19, 2008
XmlDocument means editing *programmatically*, i.e. processing and modifying XML in a program. I never liked the DOM because it involves too many objects, which leads to complex code; not only is DOM code difficult to read, it is difficult to know how to write it efficiently. And you are passing around a lot of objects that are pointers into a document, so what does that mean when the document changes underneath them? As far as I know, LINQ is meant to reduce the amount of code, theoretically making it more readable, and to add IntelliSense, but it also adds another layer to understanding what is efficient, i.e. knowing what is going on behind the scenes. Personally I think it is a step in the wrong direction. I developed a simple XML API in a single C++ class called CMarkup (see, or free on
Ben Bryant
Wednesday, March 19, 2008
LINQ. Better than sliced bread for XML. If you're using Microsoft technologies.
Developer #13
Wednesday, March 19, 2008
VB 9.0's inline XML with expression holes is the best way I've seen to edit XML so far.  Basically anything other than the DOM is good for manipulating XML.
Wednesday, March 19, 2008
Martin - thanks, but we really hate XSLT. We use it with the BizTalk mapper and it's still horrible; it's like programming in XML.

I guess I need to pick up a good LINQ book :)
Thursday, March 20, 2008
I haven't tried LINQ so I can't comment on that.  I've found XSLT to be OK for simple things but mind-numbingly convoluted for complex transformations.  I'd suggest refactoring your code to extract common routines for your typical transformation steps - I'm guessing that whatever your business domain, there are some common transformation patterns that you do again and again.  This is basically what Ben suggested - developing your own API.  Your code should then be much more readable - each comment becomes a method call.
Mike Stockdale
Thursday, March 20, 2008
If you are not changing the schema then you will want to use a tool (such as xsd.exe) to take your schema and create classes to represent the XML.  You then just need to deserialize the XML into your newly created classes, make the changes on the objects, and serialize them back into XML.  I was a difficult convert to XML at first, but once I found the XmlSerializer class in .NET I was sold!
Jamin
Thursday, March 20, 2008
I agree XSLT is horrible ;-)

I had to produce a reporting tool for an app - the data is all XML anyway, so I figured: write XML, read an XSLT template, throw in some CSS pretty colors -> output HTML.
The client can even write their own template to do custom reports.
After a month I could just about generate basic HTML, and that was after reading the Wrox XSLT 2.0 book and discovering that MS only supports 1.0.
Martin
Thursday, March 20, 2008
I think XSL can be very elegant once you get used to it, though the syntax is admittedly needlessly verbose. When you run into a tricky problem, consider using two XSLs in a row, or using some DOM pre-processing before the XSL. XQuery is another good alternative if you want something a bit more procedural.
K9
Friday, March 21, 2008
PHP's SimpleXML. :)
Berislav Lopac
Friday, March 21, 2008
Write a data structure for manipulating that data.

Write something that loads it from XML and can print it back out as XML.

Having XML coming in and going out doesn't mean that XmlDocument is an acceptable format for manipulating data.
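
As a rough illustration of that suggestion (not code from this thread), here is a minimal sketch in Python using only the standard library. The Order and OrderLine classes and the order/line XML shape are hypothetical, invented for the example; a .NET version would follow the same outline with its own XML classes.

```python
from dataclasses import dataclass, field
import xml.etree.ElementTree as ET


@dataclass
class OrderLine:
    sku: str
    quantity: int


@dataclass
class Order:
    customer: str
    lines: list = field(default_factory=list)

    @classmethod
    def from_xml(cls, text):
        """Load an Order from XML; this is the only place that parses."""
        root = ET.fromstring(text)
        lines = [OrderLine(el.get("sku"), int(el.get("qty")))
                 for el in root.findall("line")]
        return cls(root.get("customer"), lines)

    def to_xml(self):
        """Print the Order back out as an XML string."""
        root = ET.Element("order", customer=self.customer)
        for line in self.lines:
            ET.SubElement(root, "line", sku=line.sku, qty=str(line.quantity))
        return ET.tostring(root, encoding="unicode")


# The business logic works on plain objects, not on DOM nodes.
order = Order.from_xml('<order customer="Ann"><line sku="A1" qty="2"/></order>')
order.lines[0].quantity += 1
order.lines.append(OrderLine("B7", 1))
result = order.to_xml()
```

The point of the split is that the XML shape is known in exactly two methods, so the code that changes the data reads like ordinary code.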
Lance Hampton
Friday, March 21, 2008
XSLT would be a brilliant language if it had decent string manipulation and better support for native types.  As it is, it's merely very good.

In .NET, you can accomplish a lot of what XSLT does by writing transformation code that reads from an XmlReader and writes to an XmlWriter.  It's not quite as elegant, but it's familiar.
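
A minimal sketch of that streaming style, written in Python rather than .NET so it can be run with the standard library alone: SAX events stand in for XmlReader and XMLGenerator stands in for XmlWriter. This is an analogue, not the .NET API, and RenamingWriter and rename_elements are made-up names for the illustration.

```python
import io
import xml.sax
from xml.sax.saxutils import XMLGenerator


class RenamingWriter(XMLGenerator):
    """Writes each parse event straight to the output, renaming elements."""

    def __init__(self, out, renames):
        super().__init__(out, encoding="utf-8")
        self._renames = renames  # hypothetical mapping: old name -> new name

    def startElement(self, name, attrs):
        super().startElement(self._renames.get(name, name), attrs)

    def endElement(self, name):
        super().endElement(self._renames.get(name, name))


def rename_elements(xml_text, renames):
    """Transform a document in one pass without building a tree."""
    out = io.StringIO()
    xml.sax.parseString(xml_text.encode("utf-8"), RenamingWriter(out, renames))
    return out.getvalue()


transformed = rename_elements("<order><cust>Ann</cust></order>",
                              {"cust": "customer"})
```

Everything the handler does not override (text, attributes) is copied through unchanged, which is roughly what an identity transform gives you in XSLT.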

I've seen an awful lot of XML-manipulating code that seems to be written by developers who are too afraid of doing something wrong to apply the DRY principle.  So you see code like this:

  XmlElement e1 = doc.CreateElement("e1");
  e1.InnerText = "foo";
  XmlElement e2 = doc.CreateElement("e2");
  e2.InnerText = "bar";
  XmlElement e3 = doc.CreateElement("e3");
  e3.InnerText = "baz";

instead of:

  AddElement(e, "e1", "foo");
  AddElement(e, "e2", "bar");
  AddElement(e, "e3", "baz");

or even:

  AddElements(e, "e1=foo, e2=bar, e3=baz");
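
For what it's worth, helpers like these are only a few lines long. The sketch below is in Python with xml.etree.ElementTree rather than XmlDocument, and the names add_element and add_elements simply mirror the calls above; the "name=text" spec format is an assumption about what such a helper would accept.

```python
import xml.etree.ElementTree as ET


def add_element(parent, name, text):
    """Append <name>text</name> to parent and return the new child."""
    child = ET.SubElement(parent, name)
    child.text = text
    return child


def add_elements(parent, spec):
    """Append one child per "name=text" pair in a comma-separated spec."""
    for pair in spec.split(","):
        name, text = pair.strip().split("=", 1)
        add_element(parent, name.strip(), text.strip())


e = ET.Element("root")
add_elements(e, "e1=foo, e2=bar, e3=baz")
print(ET.tostring(e, encoding="unicode"))
# prints <root><e1>foo</e1><e2>bar</e2><e3>baz</e3></root>
```

The string form is compact but breaks on values that contain commas, so the three-argument version is the safer default.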
Robert Rossney
Friday, March 21, 2008
Convert the XML to a relational structure in an RDB, manipulate the data using SQL, then output it back in XML.
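
A small sketch of that round trip, assuming Python with an in-memory SQLite database as the RDB. The products/product document, the table layout, and the reprice function are all invented for the example.

```python
import sqlite3
import xml.etree.ElementTree as ET

SOURCE = """<products>
  <product sku="A1" price="10.00"/>
  <product sku="B7" price="4.50"/>
</products>"""


def reprice(xml_text, factor):
    """Load products into SQLite, scale prices with SQL, emit XML again."""
    db = sqlite3.connect(":memory:")
    db.execute("CREATE TABLE product (sku TEXT PRIMARY KEY, price REAL)")
    rows = [(p.get("sku"), float(p.get("price")))
            for p in ET.fromstring(xml_text).findall("product")]
    db.executemany("INSERT INTO product VALUES (?, ?)", rows)

    # The actual manipulation is one set-based SQL statement.
    db.execute("UPDATE product SET price = ROUND(price * ?, 2)", (factor,))

    root = ET.Element("products")
    for sku, price in db.execute("SELECT sku, price FROM product ORDER BY sku"):
        ET.SubElement(root, "product", sku=sku, price=f"{price:.2f}")
    db.close()
    return ET.tostring(root, encoding="unicode")


result = reprice(SOURCE, 2)
```

This works well for flat, record-like XML. Document order, mixed content, and deep nesting do not map onto tables as easily, so it is not a fit for every document.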
Steve Hirsch
Tuesday, March 25, 2008
IMO the problem with XSLT is that it's overextended. It works fine as part of a solution but people seem to expect to use it as a general purpose programming language. There are constructs in the language that should never have been added to it. But I digress.

If you have a document in one XML format and want to rearrange it into another XML format then by all means apply an XSL transform to it. All the prep work should be done in the host language though.

That means get your data ready using PHP's XML processing library (for example), then load the XSLT and apply it to the XML document. If something is convoluted in XSLT, move it to the code before or after the XSLT is applied. Specifically, anything where you have to iterate instead of matching nodes, or where you feel there should be a branch, is a candidate for the host language.

Rob Russell
Wednesday, March 26, 2008

This topic is archived. No further replies will be accepted.
