XML-L Archives
Subject: Re: single and double quotes
From: Peter Flynn
Reply-To: General discussion of Extensible Markup Language
Date: Thu, 25 Aug 2005 19:26:44 +0100


Avraham Shapiro wrote:
> ** Low Priority **
> Peter,
> Thanks for your input.  I believe the validator is having trouble with the single quotes
> because it fails on the xml line with the error about the UTF-8 encoding, if the
> encoding value is 'UTF-8' and it does not fail at all if the value is "UTF-8".  

Ah, OK. In that case the parser is definitely broken.
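
As a quick sanity check, here is a minimal sketch showing that a general-purpose XML parser treats the two quoting styles of the declaration identically. It uses Python's standard xml.etree.ElementTree; the sample document and the parse_title helper are made up for illustration and have nothing to do with the validator under discussion.

```python
import xml.etree.ElementTree as ET

# The same (hypothetical) document, differing only in the quote style
# used inside the XML declaration.
single_quoted = b"<?xml version='1.0' encoding='UTF-8'?><book><title>Test</title></book>"
double_quoted = b'<?xml version="1.0" encoding="UTF-8"?><book><title>Test</title></book>'


def parse_title(data: bytes) -> str:
    """Parse the document and return the text of its <title> child."""
    root = ET.fromstring(data)
    return root.findtext("title")


# Both variants should parse without error and yield the same content.
titles = [parse_title(single_quoted), parse_title(double_quoted)]
print(titles)
```

If both variants parse here but only one is accepted by the validator, that points at the validator's own handling of the declaration rather than at the document.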

> The reason we use this validator (although we sometimes still use generic XML
> parsers) is that in addition to validating against a DTD, it checks a great deal
> of semantic information in a Z39-86 compliant digital book, information that a
> more general XML parser would not check.

It sounds as if you might want to move the whole thing to a W3C XML Schema
or RELAX NG schema, where you have more control over semantic information
at the parsing level.

What you might be able to do in the meantime is run osgmlnorm, which is
part of the same package (SP) that onsgmls comes from. This normalises the
document, making all single-quote enclosures into double-quote (except
where they *contain* double quotes, in which case single quotes are used
as the container, and vice versa). It also re-expresses the whole document
in accordance with the DTD.
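
The quoting rule described above can be sketched as follows. This is only an illustration of the rule, not osgmlnorm's actual implementation: the function name quote_value is hypothetical, the handling of a value containing both kinds of quote is an assumption, and the value is assumed to have any & and < already escaped.

```python
def quote_value(value: str) -> str:
    """Wrap an attribute value in quotes, preferring double quotes.

    Hypothetical helper illustrating the rule only; it assumes that
    '&' and '<' in the value have already been escaped.
    """
    if '"' not in value:
        # Normal case: double quotes are safe as the container.
        return f'"{value}"'
    if "'" not in value:
        # Value contains double quotes only: use single quotes instead.
        return f"'{value}'"
    # Assumption: if both kinds occur, escape the double quotes
    # and keep double quotes as the container.
    return '"' + value.replace('"', "&quot;") + '"'


for sample in ["UTF-8", 'say "hi"', "it's"]:
    print(quote_value(sample))
```

Under this rule, a value that was originally written with single quotes comes out double-quoted unless it contains a double quote itself.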

It was intended for use with SGML, where there was much more 
abbreviation and avoidance of quotes than is allowed with XML, and I 
haven't actually tried it with XML, but with the -wxml switch it should
work:

$ osgmlnorm -wxml xml.dec filename.xml >output.xml

Then use your Z39-86 parser on the result.

