Emin asked:

> can anyone give me an idea about how to store XML documents on a
> relational database (i.e. ORACLE)

That depends on how flexible you want to be. You could just create a
large field that contains the whole document, or you could store the
document in a file system and place a reference to it in the DB. Both
solutions make searching a bit tricky, and I would regard them as quite
'static', since obtaining a smaller part of a document is awkward.

We use a simple technique where we have a table for attributes and a
table for elements. Each row in the attribute table references the
element it belongs to, whilst each row in the element table references
its parent element in the same table. This allows us to store an
object-like tree structure, and so generate XML documents from any
point in the tree.
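
As a rough illustration of the two-table idea, here is a minimal sketch in
Python using SQLite rather than Oracle. The table names, column names, the
`position` column for sibling order, and the helpers `build_node` and `to_xml`
are all assumptions made for the example, not the schema described above. It
also ignores mixed content (text that follows a child element).

```python
import sqlite3
import xml.etree.ElementTree as ET

# Hypothetical schema: every table and column name here is illustrative.
SCHEMA = """
CREATE TABLE element (
    id        INTEGER PRIMARY KEY,
    parent_id INTEGER REFERENCES element(id),  -- NULL for a root element
    position  INTEGER NOT NULL,                -- order among siblings
    name      TEXT NOT NULL,
    text      TEXT
);
CREATE TABLE attribute (
    element_id INTEGER NOT NULL REFERENCES element(id),
    name       TEXT NOT NULL,
    value      TEXT NOT NULL
);
"""


def build_node(db, element_id):
    """Rebuild the subtree rooted at element_id as an ElementTree node."""
    name, text = db.execute(
        "SELECT name, text FROM element WHERE id = ?", (element_id,)
    ).fetchone()
    node = ET.Element(name)
    node.text = text
    # Attach the attributes that point at this element.
    for attr_name, value in db.execute(
        "SELECT name, value FROM attribute WHERE element_id = ? ORDER BY name",
        (element_id,),
    ):
        node.set(attr_name, value)
    # Follow the self-reference downwards: children are rows whose parent is us.
    children = db.execute(
        "SELECT id FROM element WHERE parent_id = ? ORDER BY position",
        (element_id,),
    ).fetchall()
    for (child_id,) in children:
        node.append(build_node(db, child_id))
    return node


def to_xml(db, element_id):
    """Serialise any node of the stored tree as an XML string."""
    return ET.tostring(build_node(db, element_id), encoding="unicode")


db = sqlite3.connect(":memory:")
db.executescript(SCHEMA)
db.executemany(
    "INSERT INTO element (id, parent_id, position, name, text) "
    "VALUES (?, ?, ?, ?, ?)",
    [
        (1, None, 0, "issue", None),
        (2, 1, 0, "article", None),
        (3, 2, 0, "title", "Example headline"),
        (4, 2, 1, "company", "Example Co"),
    ],
)
db.execute("INSERT INTO attribute VALUES (1, 'number', '12')")

print(to_xml(db, 1))  # the whole issue
print(to_xml(db, 3))  # just one title
```

The same function serves a whole issue or a single title because the starting
point is just a row id.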

We are currently nearing completion on a site for a magazine that is
implemented completely with XML, using this technique. The consequence
is that we can pull out quotes and references from an article, a whole
article, or even a whole issue, if we wanted. As a result of this
flexibility, we list all countries, CEOs and companies referred to in an
article down the side of the article, but just as easily, we list 'all
companies in this issue' on the contents page. We provide a number of
default 'styles' to deliver the XML - for example one for normal
viewing, one for clearer printing, and one as a view to email to
someone. This is all supplemented by an entry point that gets
'documents' out as raw XML.

Our experience is that generating XML documents on the fly, whilst not
without problems, is a direction worth pursuing because it allows you
to distribute any node in the tree. We are now expanding the
XML distribution aspect to allow for XSL-style queries, so eventually
the database becomes just one massive XML document, with nodes accessed
through queries. (Not all XSL statements are easy to implement so we
only have simple filters as yet.)
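
To give a feel for what a 'simple filter' over such a tree might look like,
here is a sketch in Python with SQLite that finds every descendant of a node
with a given element name, roughly in the spirit of 'all companies in this
issue'. The schema, the sample rows, and the function `descendants_named` are
assumptions for illustration only; how the real filters are implemented is not
described here. SQLite's recursive common table expression does the tree walk;
on Oracle a hierarchical CONNECT BY query would be the likely equivalent.

```python
import sqlite3

db = sqlite3.connect(":memory:")
# Hypothetical element table with a self-reference to the parent element.
db.executescript("""
CREATE TABLE element (
    id        INTEGER PRIMARY KEY,
    parent_id INTEGER REFERENCES element(id),
    name      TEXT NOT NULL,
    text      TEXT
);
INSERT INTO element VALUES
    (1, NULL, 'issue',   NULL),
    (2, 1,    'article', NULL),
    (3, 2,    'company', 'Example Co'),
    (4, 1,    'article', NULL),
    (5, 4,    'company', 'Another Co'),
    (6, 4,    'country', 'France');
""")


def descendants_named(db, root_id, name):
    """Return the text of every element called `name` below root_id."""
    rows = db.execute(
        """
        WITH RECURSIVE subtree(id) AS (
            -- start with the direct children of the chosen node
            SELECT id FROM element WHERE parent_id = ?
            UNION ALL
            -- then keep following parent_id links downwards
            SELECT e.id FROM element e JOIN subtree s ON e.parent_id = s.id
        )
        SELECT e.text
        FROM element e JOIN subtree s ON e.id = s.id
        WHERE e.name = ?
        ORDER BY e.id
        """,
        (root_id, name),
    )
    return [text for (text,) in rows]


print(descendants_named(db, 1, "company"))  # companies in the whole issue
print(descendants_named(db, 4, "company"))  # companies in one article
```

Changing the starting node narrows or widens the scope of the filter without
changing the query itself.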

Some of the discussions on the site so far make me sure that there will
be disagreements with this approach; the recent request for examples of
'pure XML' applications is one sign of that. The view seems to be that
all the functionality should go on the client, but in our situation, to
achieve the cross-referencing between articles that we wanted, we would
have had to pass down every article from every issue. Our approach
keeps the quantity of data passed to a minimum. Say, for example, that a
portal site carried the titles of some of the lead articles from our
site, with a paragraph summary and a link to us. We can now distribute
all that data as straight XML to the portal, who then format it as they
wish. The moment we put a new issue online, the portal site changes.
That to me is a useful application of XML, even though when you look at
the pages of the magazine online you will only see HTML because the
style has already been merged on our server.

So, to sum up my ramblings, my manifesto, for what it's worth, is:
don't think document ... think database.

Mark Birbeck
Managing Director
Intra Extra Digital Ltd.
39 Whitfield Street
London
W1P 5RE
w: http://www.iedigital.net/
t: 0171 681 4135