For starters, it sounds like you and I are in agreement that mixed content
is necessary. This is in contrast to other things I have read, however. At
least one widely-read (and otherwise excellent, in my opinion) book on XML
states that mixed content should be used only as a crutch when converting
old text data to XML. I found this view surprising, and a matter of concern
if it is indicative of future directions in development of XML tools and
In my specific case, I've found ways to use mixed content when needed, but
as a result, our DTD is more permissive than I'd like it to be in a number
of places. For example, we have an element called <filter> that can be used
to tag content as specific to a given product, market, or media, as follows:
<p>This sentence appears in both help and the book.
For more information, see chapter xref...
The filtered content can be a word or sentence within a paragraph (as
above), or multiple paragraphs or other blocks of content. Ideally I'd like
the DTD to allow only one media, product, or market tag, to eliminate the
possibility of conflicting tags:
<!ELEMENT filter (product?, market?, media?, (#PCDATA | p | graphic |
However, because #PCDATA is included, the only declaration that's allowed is:
<!ELEMENT filter (#PCDATA | product | market | media | p | graphic |
This makes it possible for writers to get themselves in trouble by including
multiple -- and conflicting -- product, market, or media tags within a given
filter (which is unlikely in the brief example above, but more likely in cases
where larger chunks of content are filtered).
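To make the trade-off concrete, here is a rough sketch of one possible
workaround, not something we have adopted. It assumes a hypothetical wrapper
element, here called <content>, that holds the text so that <filter> itself
has element content only. Only p and graphic are shown in the wrapper; the
rest of the real list is left out, and product, market, media, p, and graphic
are assumed to be declared elsewhere in the DTD.

```xml
<!-- Sketch only. "content" is a hypothetical wrapper element, not part of
     the DTD described above. -->

<!-- filter has element content here, so order and occurrence can be
     constrained: at most one each of product, market, and media,
     followed by exactly one wrapper. -->
<!ELEMENT filter (product?, market?, media?, content)>

<!-- The wrapper carries the mixed content in the only form XML allows:
     #PCDATA first, alternatives separated by "|", and the whole group
     followed by "*". -->
<!ELEMENT content (#PCDATA | p | graphic)*>
```

The cost is an extra level of markup around every filtered word or sentence,
which may or may not be acceptable to writers working inside a paragraph.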
Incidentally, we're working on a strategy where the same elements that
define links in help would become cross-references in the book, which would
make the above example a moot point. But that's a separate issue.
From: Wendell Piez [mailto:[log in to unmask]]
Sent: Thursday, May 17, 2001 6:42 AM
To: [log in to unmask]
Subject: Re: XML DTD syntax
At 05:02 PM 5/17/01, you wrote:
>I've been grappling with this issue, and have been frustrated by the
>restrictions placed on #PCDATA.
Could you be more specific?
> I understand that mixed content is messy and
>problematic, and avoiding it would make life simpler. Unfortunately it also
>seems necessary if you want to do things like tag reserved or important
>terms wherever they appear in body text, which is hardly an unusual
>practice. So it seems to me this is a case where the standard is at odds
>with the real-world requirements many of us face. Am I missing something?
I wouldn't agree, entirely. Avoiding mixed content might make life simpler,
but so would eating nothing but sardines. Mixed content is very necessary
in real-world applications, at any rate those dealing with textual data.
On the other hand, having grappled with SGML's greater permissiveness, I am
on the whole happy with XML's restrictions, which prevent many problems
from even starting. Generally, I find that XML-friendly mixed-content, plus
a judicious use of helpful "wrapper" elements, makes life a lot easier. But
without knowing what, specifically, you want to do that XML can't do, I
can't speak to your case.
Wendell Piez mailto:[log in to unmask]
Mulberry Technologies, Inc. http://www.mulberrytech.com
17 West Jefferson Street Direct Phone: 301/315-9635
Suite 207 Phone: 301/315-9631
Rockville, MD 20850 Fax: 301/315-8285
Mulberry Technologies: A Consultancy Specializing in SGML and XML