Wow! Thanks for all the information, Ron. I'm still having a problem with
terminology, but I did mean that I use the DTD to validate my XML document.
This information is very useful in trying to understand the best way for us
to develop flexible applications.
Thank you very much!
From: Ronald Bourret
Sent: Tuesday, May 14, 2002 5:29 PM
Subject: Re: Question regarding DTDs
Chris Proctor wrote:
> I'm fairly new to XML, but I have written some applications to build XML
> documents. I've used DTDs to validate that my document is well-formed, but
> know that DTDs can do a lot for me.
Minor nitpick: DTDs are used to validate a document; they are not used
to check well-formedness. Validation is the process of checking that a
document conforms to the rules in a given DTD. Well-formedness has a
number of rules, but largely boils down to (a) every start tag has a
matching end tag, and (b) start and end tags are properly nested.
Well-formedness is required -- if a document isn't well-formed, it is,
by definition, not an XML document. Validity is optional -- a document
can be invalid with respect to a particular DTD and still be a legal XML
document. (The more common case is that there simply isn't a DTD that
applies -- a good example of this is XSLT stylesheets.)
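To make the distinction concrete, here is a minimal sketch of a well-formedness check using Python's standard-library parser. The function name is_well_formed and the two sample strings are only illustrations. Note that this parser does not validate against a DTD; validity checking needs a validating parser, which is not shown here.

```python
import xml.etree.ElementTree as ET

def is_well_formed(text):
    """Return True if text parses as XML, False otherwise.

    This checks well-formedness only; no DTD is consulted.
    """
    try:
        ET.fromstring(text)
        return True
    except ET.ParseError:
        return False

# Start and end tags match and are properly nested.
good = "<People><Person><FirstName>Chris</FirstName></Person></People>"
# Person is closed before FirstName, so the tags are improperly nested.
bad = "<People><Person><FirstName>Chris</Person></FirstName></People>"

print(is_well_formed(good))
print(is_well_formed(bad))
```

A document that passes this check may still be invalid with respect to a particular DTD, for example if Person contained an element the DTD does not declare.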
> Can a DTD be used to actually build an
> XML document? In other words, can I say "here's a database file or a
> tab-delimited file" and "here's the associated DTD" and use the DTD to
> build the XML document? Forgive me if this is a stupid question, but we have so
> many vendors wanting data in a various number of formats. It would be
> if something like this is possible. It would take a lot of the pressure
> off of us. I'd appreciate any input.
It's definitely not a stupid question.
This is definitely possible, but generally not as automatic as one might
hope. For example, suppose you have a comma-delimited file and a DTD and
want to get the data out of the comma-delimited file and into a document
that matches the DTD. To do this, you will need to define how the data
maps from one to the other.
For example, suppose you have the following comma-delimited file:
Chris,Proctor,Manager of Systems Integration,Gart Sports
And the following DTD:
<!ELEMENT People (Person)>
<!ELEMENT Person (Company, FirstName, Title, LastName)>
<!ELEMENT Company (#PCDATA)>
<!ELEMENT FirstName (#PCDATA)>
<!ELEMENT Title (#PCDATA)>
<!ELEMENT LastName (#PCDATA)>
There is no way a program can tell which fields in the DTD match which
fields in the file. Worse yet would be a DTD like:
<!ELEMENT A (B)>
<!ELEMENT B (#PCDATA)>
How would you map the above file to this?
The solution in both cases is some sort of user-specified mapping
information from the file to the XML document. Given this mapping
information, a program can automatically transfer data from a non-XML
format (comma-delimited file, database, Word document, LDAP, etc.) to an
XML document.
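As a rough sketch of what such user-specified mapping information might look like, the Python below maps the comma-delimited line from the example to the People/Person DTD. The names COLUMN_TO_ELEMENT, PERSON_CHILD_ORDER, and csv_to_people are hypothetical, and the column-to-element assignment is exactly the part a person has to supply, because nothing in the file or the DTD states it.

```python
import csv
import io
import xml.etree.ElementTree as ET

# Hypothetical user-specified mapping: column position -> element name.
# A program cannot infer this from the file or the DTD.
COLUMN_TO_ELEMENT = {0: "FirstName", 1: "LastName", 2: "Title", 3: "Company"}

# Child order copied by hand from the Person content model in the DTD.
PERSON_CHILD_ORDER = ["Company", "FirstName", "Title", "LastName"]

def csv_to_people(csv_text):
    """Build a People document from comma-delimited text.

    The example DTD allows exactly one Person, so input with more
    than one row would produce a document that is not valid.
    """
    root = ET.Element("People")
    for row in csv.reader(io.StringIO(csv_text)):
        if not row:  # skip blank lines
            continue
        values = {COLUMN_TO_ELEMENT[i]: field for i, field in enumerate(row)}
        person = ET.SubElement(root, "Person")
        for name in PERSON_CHILD_ORDER:
            ET.SubElement(person, name).text = values[name]
    return ET.tostring(root, encoding="unicode")

xml_text = csv_to_people(
    "Chris,Proctor,Manager of Systems Integration,Gart Sports\n"
)
print(xml_text)
```

The DTD is never read by this code. It only guided the person who wrote the two mapping tables.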
There are several things to note here:
1) You will note that the DTD doesn't really get involved here. Instead,
it serves as a guideline to you when you are writing your mapping. That
is, it tells you the structure of the XML document that you want to map to.
2) If the non-XML format has metadata (a header line in a
comma-delimited file, database metadata, etc.), it is generally possible
to generate the mapping automatically. However (and this is important)
that mapping will follow a fixed structure that is hard-coded into the
generation program. For example, if the document above had a header line:
FirstName, LastName, Title, Company
you would get an XML document that used these names in the order shown,
This might not be the structure you want. In this case, you will
generally use XSLT to transform the resulting document to the structure
you want.
3) Even if you hand-code the mapping, it is a good bet that the tool
you use can only use a fixed structure. For example, a tool for
comma-delimited documents will only generate documents of the form:
Again, you will need to use XSLT to transform this to match your DTD.
4) As a general rule, I have seen no tools that can take metadata for a
non-XML format and a DTD and generate a mapping between the two -- it's
simply too difficult. Instead, you either do a hand mapping between the
two or you start with one (e.g. metadata for a non-XML format) and
generate the other (e.g. a DTD) and the mapping.
5) You will need a different tool for each type of non-XML source (text
file, Word document, database, etc.) that you have. Here are some Web
sites you can use to find tools:
6) The tool you want may not exist.
7) If you're interested in the mappings between databases and XML