The Design of Software (CLOSED)

A public forum for discussing the design of software, from the user interface to the code architecture. Now closed.

The "Design of Software" discussion group has been merged with the main Joel on Software discussion group.

The archives will remain online indefinitely.

data storage - where does XML fit in?

Hello. If I'm not mistaken, XML is great for exchanging data over HTTP, but where does a SQL database with XML support fit into the overall software design? Do you store the data generated by your app in XML and then import it into a SQL database, or store the data in a SQL database and then export it as XML?

Sorry if my question is a bit vague; I'm not sure I'm asking the right question.

Wednesday, July 12, 2006
XML is useful when:

The creator and consumers of the data aren't tightly tied and need different parts of it (it doesn't matter whether this is over HTTP or not).

File formats where the data may be extended with new entries in new versions.

Data that is heavily object- or tree-based, e.g. some scientific data that doesn't easily fit into a table structure.

You just need a format that everyone can read.

Generally XML databases are useful when you have some XML and need to store it, not as a general replacement for SQL + Tables.
Martin
Wednesday, July 12, 2006
XML makes a terrible database, for several reasons (I've gone into them in other posts). My recommendation would be to convert it to relational format ASAP.
Steve Hirsch
Wednesday, July 12, 2006
There was a big push for "native XML" databases a few years back, but none of them really panned out.  Most people figured out that it's often much easier to store things in a regular database and then create the xml on the fly... especially when different versions of the XML might be necessary.
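
A minimal sketch of the "store it relationally, create the XML on the fly" approach, using Python's standard `sqlite3` and `xml.etree.ElementTree` modules. The `product` table, the two format versions, and the `rows_to_xml` helper are all made up for illustration; they are not from any particular product.

```python
import sqlite3
import xml.etree.ElementTree as ET


def rows_to_xml(conn, version=1):
    """Build a <products> document from the relational rows on demand."""
    root = ET.Element("products", version=str(version))
    query = "SELECT id, name, price FROM product ORDER BY id"
    for pid, name, price in conn.execute(query):
        item = ET.SubElement(root, "product", id=str(pid))
        if version == 1:
            # Hypothetical version 1 of the format: values as child elements.
            ET.SubElement(item, "name").text = name
            ET.SubElement(item, "price").text = f"{price:.2f}"
        else:
            # Hypothetical version 2: the same data carried as attributes.
            item.set("name", name)
            item.set("price", f"{price:.2f}")
    return ET.tostring(root, encoding="unicode")


# An in-memory database stands in for the "regular database".
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE product (id INTEGER PRIMARY KEY, name TEXT, price REAL)")
conn.executemany(
    "INSERT INTO product (name, price) VALUES (?, ?)",
    [("widget", 9.5), ("gadget", 12.0)],
)

xml_v1 = rows_to_xml(conn, version=1)
xml_v2 = rows_to_xml(conn, version=2)
print(xml_v1)
print(xml_v2)
```

Both documents come from the same rows, so supporting another version of the XML means adding a branch to the generator rather than migrating stored documents.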
KC
Wednesday, July 12, 2006
Regarding native databases, there are applications where this makes sense.  Let's say you want to store and index a set of XML documents.  Storing them as XML prevents the loss of structural information in the document.  You can perform queries on the documents using XPath.  If you flatten out the document, you can index keywords (this is what a search engine like Lucene does), but you lose the ability to query for things like "all child nodes of element <x>."
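
To illustrate the difference, here is a small sketch using Python's standard `xml.etree.ElementTree`, which supports only a subset of XPath (a native XML database would offer much more). The sample document and its element names are invented for the example.

```python
import xml.etree.ElementTree as ET

doc = ET.fromstring("""
<library>
  <x>
    <title>First</title>
    <note>alpha</note>
  </x>
  <y>
    <title>Second</title>
  </y>
</library>
""")

# Structural query: the direct children of every <x> element.
children_of_x = [child.tag for child in doc.findall(".//x/*")]

# Flattened view: only the words survive, roughly what a keyword index sees.
keywords = " ".join(doc.itertext()).split()

print(children_of_x)
print(keywords)
```

The flattened word list can still answer "which documents mention alpha?", but it can no longer tell which element a word came from.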

XML databases allow you to store and retrieve "loosely structured" objects.  This is useful, since not all information in the real world can be easily mapped to a set of relational tables.
NPR
Wednesday, July 12, 2006
I may be wrong, but doesn't Oracle's XDB actually model the structure of the XML document by using tables and relations?
John Topley
Wednesday, July 12, 2006
It probably does, since ultimately that's what the Oracle DB engine handles.
But it doesn't really matter - it's like saying you should use binary flat files because ultimately Oracle stores everything as bytes in disk blocks.

XML is just one way to model data; relational databases are another. Which one fits depends on what your data looks like and what relationships it has.
Martin
Wednesday, July 12, 2006
There are only a couple circumstances in which I'll store data persistently in XML.  I'll do it when I need some sort of low-volume unstructured bag of data (like a configuration file, or a serialization of an application's objects) that needs to be read once at runtime and is updated infrequently.    I'll do it when I want the data to be an input to XSLT. 

Other than that, I think of XML as a way to represent data in motion.  If I need to ship information from one machine (or process) to another, I'll generally use XML unless performance is extremely critical.  But when the data stops moving, either I'm displaying it on someone's screen, storing it in a database, or throwing it away.
Robert Rossney
Wednesday, July 12, 2006
Here's a good case, which I have found in a project I'm working on:

The user creates some rules or configuration through a UI.  These need to be stored in their profile.

While it is possible to create database tables for this, XML is a more natural way to express this data due to its hierarchical organization.

I have also found that the tools for binding runtime data with a persistent storage format in XML are very useful.  I'm using XML Beans to generate classes from an XML Schema.  These classes make it easy to programmatically manipulate objects with very few lines of code, and then persist them as XML.  I only have to change one thing -- the XML Schema -- in order to change the structure of the data and regenerate the classes for manipulating the data.

I haven't thought much about backwards compatibility with previous versions.  Databases are probably better about maintaining backwards compatibility of data (using ALTER TABLE).
NPR
Wednesday, July 12, 2006
I do think XML has a place as an application file format.
1, It's all in one file, instead of a data.csv and a config.ini
2, Anything is better than a serialized object dump.
3, It's verbose but compresses well, just use gzopen.
4, Your data will last longer than your app, so make it self-describing.
5, It's much easier to handle backward/forward compatibility than in a serialized-object type of file.
6, It's easy to make a report generator, format converter based on XSLT.
7, Storing an image as an encoded section is no worse than a blob in a table.
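
As a rough illustration of point 3, the sketch below uses Python's `gzip` module as a stand-in for zlib's `gzopen`. The `<readings>` document is generated on the spot purely as sample data, and how well a real file compresses will depend on its content.

```python
import gzip
import xml.etree.ElementTree as ET

# Build a deliberately repetitive (verbose) sample document.
root = ET.Element("readings")
for i in range(1000):
    ET.SubElement(root, "reading", index=str(i)).text = str(i % 10)
raw = ET.tostring(root, encoding="utf-8")

# Compress for storage, then decompress and parse to confirm a clean round trip.
packed = gzip.compress(raw)
restored = ET.fromstring(gzip.decompress(packed))

print(len(raw), len(packed))
```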

But if your data fits naturally into tables, look at using SQLite, and use the single-file database it writes as the application data file.
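
A minimal sketch of that idea with Python's standard `sqlite3` module. The file name, the `setting` and `sample` tables, and their contents are hypothetical; the point is only that settings and data can live together in one file.

```python
import os
import sqlite3
import tempfile

# Hypothetical application document path.
path = os.path.join(tempfile.mkdtemp(), "document.appdata")

# Save: settings and data go into the same single-file database.
conn = sqlite3.connect(path)
conn.execute("CREATE TABLE setting (key TEXT PRIMARY KEY, value TEXT)")
conn.execute("CREATE TABLE sample (t REAL, reading REAL)")
conn.execute("INSERT INTO setting VALUES ('units', 'metric')")
conn.executemany("INSERT INTO sample VALUES (?, ?)", [(0.0, 1.5), (1.0, 1.7)])
conn.commit()
conn.close()

# Load: reopen the same file later and query it.
conn = sqlite3.connect(path)
units = conn.execute("SELECT value FROM setting WHERE key = 'units'").fetchone()[0]
sample_count = conn.execute("SELECT COUNT(*) FROM sample").fetchone()[0]
conn.close()

print(units, sample_count)
```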
martin
Wednesday, July 12, 2006
I suspect XML is more commonly used as a serialization format for transactions and resultsets than for internal representation at the DBMS.  The DBMSs that are claimed to be more XML-enabled probably translate more directly between XML and their internal data representations than older DBMSs, which probably go from internal/disk structures to some native dataset format and then optionally translate THAT to/from XML.
j. freling
Wednesday, July 12, 2006
First of all, XML is only a metalanguage. It all depends on what DTD or Schema you produce. To call XML a language is like eating a cookie cutter.

That said, markup languages are good at, and designed for, marking up text.
Steve Hirsch
Wednesday, July 12, 2006
It may be that
* it is relatively costly to produce the XML.
* you need to retain copies of any sent materials.
* you need a staging area for preparation.

For instance, you don't go to Oracle, DB2 and Notes every time you want to display the web page with your product detail.
You pull the relational data from Oracle, join it to the scanner images in DB2 and the descriptive waffle from Notes and then... store it for the website to cache quickly.

Wednesday, July 12, 2006
XML is for communication, usually between two otherwise unrelated processes.  It can also be used for 'portable' configuration files.

Its benefit is that it is almost self-documenting.

Its drawbacks are:
1.  It is quite verbose.
2.  Nesting structures can be very hard for a human to read. 
3.  Did I mention it was quite verbose?
4.  I don't think it can be 'searched' quickly, without loading the whole thing into memory.  I could be wrong about this one.
5.  It shares with Perl that "There Is More Than One Way To Do It" when it comes to storing Token-Value pairs.  This ambiguity means you have to live with the choices the person who created the transformation schema made.
6.  It is really verbose.

Now, I'm harping on that 'verbose' thing, because that's the usual 'wart' that prevents its use as 'local' data storage, or use in a database, or use to transfer large amounts of data (like satellite telemetry files of 10 gigabytes or so). YMMV.
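
On point 4: a streaming parser can at least avoid holding the whole document in memory, though it still reads the file from the start, so it is a scan rather than an indexed lookup. Below is a minimal sketch with Python's `xml.etree.ElementTree.iterparse`; the `<log>`/`<entry>` document and the `find_first` helper are made up for illustration.

```python
import io
import xml.etree.ElementTree as ET


def find_first(stream, tag, wanted_text):
    """Scan an XML stream; return the attributes of the first matching element."""
    for _event, elem in ET.iterparse(stream, events=("end",)):
        if elem.tag == tag:
            if elem.text == wanted_text:
                return dict(elem.attrib)
            # Drop the content of records already examined to keep memory low.
            elem.clear()
    return None


# A small in-memory stream stands in for a large file on disk.
data = io.BytesIO(
    b"<log>"
    b"<entry id='1'>ok</entry>"
    b"<entry id='2'>fail</entry>"
    b"<entry id='3'>ok</entry>"
    b"</log>"
)

match = find_first(data, "entry", "fail")
print(match)
```

The search stops at the first hit, so entries after it are never parsed.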
Monday, July 17, 2006

This topic is archived. No further replies will be accepted.
