"John E. Simpson" wrote:
> I *think* what Vinod was getting at was not about using mixed content models.
> As I read his question, he's got a content model that contains at least one
> #PCDATA element. And he's discovered that this element validates OK even if
> it is empty. For example:
> <!DOCTYPE para [
> <!ELEMENT para (#PCDATA) >
> ]>
> <para></para>
> This doesn't produce any validity errors or even any warning messages.
> (Caveat: I haven't checked it against more than a couple of validating
> parsers. Possibly
> it's parser-dependent, although I don't think so.)
No, this is correct behaviour.
> If this *is* what Vinod was asking, then no, as far as I know there's no
> way to require that the para element (in this example) be non-empty.
No, not in element content.
> As for why this is allowed, I don't know. I imagine it simplifies a
> parser's processing -- it's kind of the equivalent of a DBMS that doesn't
> distinguish between empty strings ("") and true null values. It's certainly
> convenient when you're first developing a DTD, and know that the presence
> of a given element is required but don't yet need to actually populate it
> with test data.
There are very fundamental reasons why. Validation is based on markup
structure, not data content or structure. This is the reason why the
database folks are so interested in XML Schemas: schemas allow for not
only markup structure validation, but data validation based on typing.
If you want to validate the actual content of your document and not the
structure of its markup, you should investigate XML Schemas, although
it might be a while before a completely compliant parser exists (there's
only one completely compliant validating XML parser I'm aware of: Sun's).
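As a rough illustration of the kind of data validation meant here, a schema could require that para be non-empty with something like the sketch below. This assumes the syntax and namespace of the later W3C XML Schema 1.0 Recommendation, which may differ from the drafts, so treat it as a sketch rather than something known to work with today's tools.

```xml
<?xml version="1.0"?>
<!-- Sketch only: assumes the XML Schema 1.0 Recommendation namespace -->
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <!-- para holds character data only, as in (#PCDATA) -->
  <xs:element name="para">
    <xs:simpleType>
      <xs:restriction base="xs:string">
        <!-- require at least one character of content -->
        <xs:minLength value="1"/>
      </xs:restriction>
    </xs:simpleType>
  </xs:element>
</xs:schema>
```

Against this schema, <para></para> should be reported as invalid, while <para>x</para> passes. Note that a para containing only whitespace would still pass, since xs:string preserves whitespace.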
Murray Altheim, SGML Grease Monkey <mailto:email@example.com>
Member of Technical Staff, Tools Development & Support
Sun Microsystems, 901 San Antonio Rd., UMPK17-102, Palo Alto, CA 94303-4900
the honey bee is sad and cross and wicked as a weasel
and when she perches on you boss she leaves a little measle -- archy