This is largely a "religious" issue. Database die-hards vs. XML die-hards.
Some of the arguments for one over the other don't really stand up against
each other; they come down to personal preference or what one's familiar
with.

You did a good job pointing out where the RDBMS excels (concurrent
transactions, speed, etc). The Oracles of the world have figured out how to
build robust, scalable, efficient systems based on the relational data
model. XML database systems are only beginning to show up on the scene and
a lot of these issues are far from being solved. Instead of assuming they
can't be (or assuming they can), I'd like to focus on what I see as a real
advantage of an XML data store over an RDBMS.

You handily dismiss the suggestion that an XML repository is more flexible
than a database. I would suggest that the relational "straightjacket" is
more than "mythical". Unlike databases, XML documents do not require a
pre-defined schema. Thus, an XML data store/information retrieval system
could receive new documents containing new elements without breaking the
system. Yes, you could simply add a new table in a database, but that's a
manual task of changing the *schema*. I agree that XML is not inherently
more flexible when it comes to designing schemas. What I'm saying is that
schemas are inherently inflexible. This is what the semi-structured data
crowd has been shouting from the rooftops. XML describes its own structure.
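
A rough sketch of the point, not any particular product's implementation: the
Python below indexes documents by element path using only the standard
library. The function name index_document, the path-based index, and the
sample movie records are all made up for illustration. The second document
carries a <producer> element the first never had, and nothing has to be
declared in advance for it to be accepted.

```python
import xml.etree.ElementTree as ET
from collections import defaultdict


def index_document(index, doc_id, xml_text):
    """Record every (element path -> (doc_id, text)) pair found in one document."""
    root = ET.fromstring(xml_text)

    def walk(elem, path):
        here = f"{path}/{elem.tag}"
        text = (elem.text or "").strip()
        if text:
            index[here].append((doc_id, text))
        for child in elem:
            walk(child, here)

    walk(root, "")


# Hypothetical sample data; no schema is declared anywhere.
index = defaultdict(list)
index_document(
    index, 1,
    "<movie><title>First Film</title><director>A. Director</director></movie>",
)
# A later document introduces <producer>, which no earlier document had.
index_document(
    index, 2,
    "<movie><title>Second Film</title><producer>B. Producer</producer></movie>",
)

paths = sorted(index)
print(paths)
```

The trade-off is the one discussed above: nothing breaks on ingest, but
nothing checks the new element either.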
XML is clearly becoming a communication standard for B2B, e-commerce, and
so on. If your backend system is based on an RDBMS, you will always have to
worry about mappings between two different data models. The XML and
relational data models are distinct. I know there are a number of efforts
out there, but there is certainly no consensus on how these two models
should be mapped. And I highly doubt that there's a general way to do it
that makes sense for every case.
You could, of course, flip this around and say that's why it makes no sense
to convert your database into an XML repository. Unless, that is, you
believe that the advantages of semi-structured data will outweigh the
disadvantages. And, if all of your data is already in XML, then a solution
that took advantage of that semi-structured data and did not require you to
map it to a database would be pretty compelling.
XML gives documents and data a common format. In many cases, an information
retrieval system would be required for both data and documents, which may or
may not contain structure or metadata (eg. RDF). Having one format (XML)
would allow you to seamlessly integrate those knowledge stores into one
information retrieval system, as well as happily integrate new document
types and "schemas" without ever having to formally define them.
Now for the shameless plug. At my place of employment, XYZFind Corp., we've
begun to leverage the self-describing nature of XML in order to provide
keyword search over structured data across multiple schemas. Unlike
relational databases, where users must know the structure of the data before
being able to enter a query, users of XYZFind need not know anything about
the schemas or their structure. Yet they are still able to execute precise
queries, given a parametric form dynamically generated from the schema(s)
matching their original keyword entry. This interactive search experience
automatically adapts to changes made in the structure of the underlying XML
documents, as well as to the addition of new schemas, in such a way that
would be impossible with an RDBMS. XML is thus allowing us to bridge the
gap between open-ended, full-text keyword search and precise, parametric
search.

If you've got XML lying around and XML coming in and out and all over the
place, which is how some view the future of the world, then it makes sense
to natively use this model for storage (not necessarily the physical storage
of text documents, but the logical storage of data while being physically
stored in, say, an index).
Some people won't be convinced whatever you say. But I suspect that as
new solutions to these problems are found and as the benefits of XML are
realized, XML database proponents will start to sound less out-on-a-limb.
XYZFind, the search engine *designed* for XML
Download our free beta software: http://www.xyzfind.com/beta
From: General discussion of Extensible Markup Language
On Behalf Of Eliot, Topher
Sent: Friday, September 22, 2000 9:47 AM
Subject: Re: XML as Database vs MYSQL etc
> > Hi,
> > I would like to know the Pros and cons of using XML as a database vs
> > MYSQL, Oracle etc!
> There is no simple answer to this. However I have taken the liberty of
> pasting below a couple of relevant passages from "XML Unleashed"
> Increasingly XML is being used as a data store in its own right. Consider
> using XML as a primary storage for information when meeting any of the
> following criteria:
> - The information is in a complex form. Some forms of record do not
>   scale well to databases. A doctor's patient record may contain a few
>   office visits or a hundred or more. How should a database be structured
>   to cope with this variation. XML can handle this situation with ease.
> - The individual fields are large and complex. Again records such as
>   medical records where an office note can be anywhere from a few words
>   to several pages do not store well in databases. In a database each
>   field must be of the same size, so a traditional database can be very
>   wasteful of
I think the author of these points needs to read an introductory database
textbook. Maybe his knowledge of DBMS is old. Many databases (including
Oracle) offer a LOB (Large OBject) or BLOB (Binary LOB) or CLOB (Character
LOB) data type that can hold a field whose size is not predetermined.
As to not being able to store one or a hundred patient visit records,
this is a very simple matter of putting the patient visit records into
another table, indexed by e.g. patient number.
Neither of these is a good reason to reject using DBMSs in general.
Maybe some simpler DBMSs don't offer LOBs, but that doesn't make (mis-)
using XML as a database the right choice.
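
A minimal sketch of both points, using SQLite from Python's standard library
as a stand-in (the table and column names are invented here, and SQLite's
TEXT type is playing the role that a CLOB would play in Oracle). One patient
has a single visit, another has a hundred, and the notes range from two
words to several thousand characters.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE patient (patient_no INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE visit (
        visit_id   INTEGER PRIMARY KEY,
        patient_no INTEGER NOT NULL REFERENCES patient(patient_no),
        note       TEXT    -- variable length: a few words or several pages
    );
    CREATE INDEX visit_by_patient ON visit(patient_no);
""")

# Hypothetical sample rows.
conn.execute("INSERT INTO patient VALUES (1, 'Patient A')")
conn.execute("INSERT INTO patient VALUES (2, 'Patient B')")
conn.execute("INSERT INTO visit (patient_no, note) VALUES (1, 'Routine check.')")
conn.executemany(
    "INSERT INTO visit (patient_no, note) VALUES (2, ?)",
    [("Long note. " * 500,) for _ in range(100)],
)

# Visits per patient: the second table absorbs the variation.
counts = conn.execute(
    "SELECT patient_no, COUNT(*) FROM visit GROUP BY patient_no ORDER BY patient_no"
).fetchall()
longest = conn.execute("SELECT MAX(LENGTH(note)) FROM visit").fetchone()[0]
print(counts, longest)
```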
> - Speed of searching is not an issue. Database engines are optimized for
>   speed. Although the DOM enables us to carry out any search we want on
>   an XML document it is still a slow process compared with a data base
>   search. At peak periods it is said that the ESPN server carries out
>   12,000 searches a second! There is no way (at present) that an XML
>   search can cope with numbers like these.
> - Data typing is not important. Everything in XML is stored as a string.
>   Although schemas will allow data descriptions, the information is still
>   stored as a string. Databases on the other hand can store data types in
>   their native format making it easy to pass data to programs that need
>   to manipulate the data. (e.g. an Astronomy program with the
>   co-ordinates of stars). Use a traditional data base to store
>   information of this kind.
Both valid reasons not to use XML as a database.
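
To make the data-typing point concrete, here is a small sketch in Python
(standard library only; the star record, the element names and the column
names are invented for the example, and SQLite stands in for a traditional
database). The same coordinate comes back as a string from the XML document
and as a native float from the database.

```python
import sqlite3
import xml.etree.ElementTree as ET

# Hypothetical star record; the values are illustrative only.
star = ET.fromstring(
    "<star><name>S1</name><ra_deg>101.25</ra_deg><dec_deg>-16.5</dec_deg></star>"
)
ra_from_xml = star.findtext("ra_deg")
print(type(ra_from_xml).__name__)   # element text is always a string

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE star (name TEXT, ra_deg REAL, dec_deg REAL)")
conn.execute("INSERT INTO star VALUES (?, ?, ?)", ("S1", 101.25, -16.5))
ra_from_db = conn.execute("SELECT ra_deg FROM star").fetchone()[0]
print(type(ra_from_db).__name__)    # the REAL column comes back as a float

# The XML value needs an explicit conversion before any arithmetic.
ra_converted = float(ra_from_xml)
```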
> - The database is small but the need for scalability is important. XML
>   is ideal for small data stores such as will meet the needs of a start
>   up company. Because it is so easy to pass the information to a DBMS if
>   necessary, XML datastores are extremely scaleable.
An interesting point, but cheap, simple DBMSs like MS Access (shudder)
would also serve for a small company, and would not require
first learning XML, XSLT, DOM or SAX etc and then re-writing the
apps to use SQL.
. . .
> (d) When to use XML as a data store.
> Use XML for a data store for all complex documents with structure that
> does not easily fit into the straightjacket of records and fields. Our
> file "movies.xml" is a great example of such a file. It allows a flexible
> number of 'fields' for writers, producers, directors, and actors.
Again, this mythical "straightjacket" that doesn't allow variable
numbers of values. True only if you insist on putting everything
into one table, comparable to putting everything into one element.
All of this completely misses the point of shared access to the DB,
locking, and transactions, critical to any multi-user environment
(which of course means it's important to any environment that may
be multi-user in the future). Most DBMS (but not e.g. MS Access)
provide this. XML doesn't.
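
As a small illustration of what a transaction buys you, here is a sketch
using SQLite from Python's standard library (the account table, the transfer
function and the amounts are hypothetical, and SQLite is only standing in for
a multi-user DBMS). The second transfer fails halfway through, and the
half-applied update is rolled back rather than left in the data.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE account ("
    "id INTEGER PRIMARY KEY, balance INTEGER CHECK (balance >= 0))"
)
conn.executemany("INSERT INTO account VALUES (?, ?)", [(1, 100), (2, 0)])
conn.commit()


def transfer(conn, src, dst, amount):
    """Move amount from src to dst as one all-or-nothing unit."""
    try:
        with conn:  # commits on success, rolls back if an exception escapes
            conn.execute(
                "UPDATE account SET balance = balance + ? WHERE id = ?",
                (amount, dst),
            )
            conn.execute(
                "UPDATE account SET balance = balance - ? WHERE id = ?",
                (amount, src),
            )
        return True
    except sqlite3.IntegrityError:
        # The CHECK constraint rejected an overdraft; the credit is undone too.
        return False


ok = transfer(conn, 1, 2, 60)      # succeeds
failed = transfer(conn, 1, 2, 60)  # would overdraw account 1
balances = conn.execute("SELECT balance FROM account ORDER BY id").fetchall()
print(ok, failed, balances)
```

Getting the same guarantee over a shared XML file would mean writing the
locking and rollback logic yourself.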