
HTML-WG Archives


Revised language on: ISO/IEC 10646 as Document Character Set


[log in to unmask] (Dan Connolly)




Fri, 5 May 95 21:45:02 EDT






Martin J. Duerst writes:
 > >>
 > >> |HTML Lexical Syntax
 > >> |
 > >> | ... A minimally conforming HTML user agent must support the SGML
 > >> | declaration in section SGML Declaration for HTML, which specifies ISO
 > >> | Latin 1 (@@full name) as the document character set; it may support
 > >> | other SGML declarations, in particular, SGML declarations with other
 > >> | document character sets.
 > Why not write it like this (another compromise):
 > "in particular, SGML declarations with ISO10646 as the document
 > character set."

Right. Try the latest version on for size:

Blech. Lemme try again.... OK. That's better.

I moved this discussion into the conformance section (it took me a
while to find it where it used to be: under "Lexical syntax"). That
way, the "Character Content" and "Document representation" parts don't
have to change if/when we revise the whole thing or excerpt parts for
other documents.

I actually make ISO10646 a binding constraint without putting it
in the public text (the SGML declaration). See what you think:
|A document is a conforming HTML document only if:
|
|Its document character set includes ISO-8859-1 and agrees with
|ISO10646; that is, each code position listed in section The ISO-8859-1
|Coded Character Set is included, and each code position in the
|document character set is mapped to the same character as ISO10646
|designates for that code position. (1)
|
|The document character set is somewhat independent of the character
|encoding scheme used to represent a document. For example, the
|ISO-2022-JP character encoding scheme can be used for HTML documents,
|since its repertoire is a subset of the ISO10646 repertoire. The
|critical distinction is that numeric character references agree with
|ISO10646 regardless of how the document is encoded.
|
|User Agents
|
|An HTML user agent conforms to this specification if:
|
|It supports the ISO-8859-1 character encoding scheme, and processes
|each character in the ISO Latin Alphabet Nr. 1 as specified in section
|The ISO Latin 1 Character Repertoire. (3)
|
|To support non-western writing systems, HTML user agents should
|support the Unicode-1-1-UTF-8 and Unicode-1-1-UCS-2 encodings and as
|much of the character repertoire of ISO10646 as is possible as well.

How's that for a compromise?

(note that the text and postscript versions are a bit out of date
right now...)
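
The distinction the draft text draws between the document character set
and the character encoding scheme can be sketched roughly as follows.
This is only an illustration, not anything from the spec: the function
names parse and resolve_numeric_refs are made up, Python's "iso2022_jp"
and "utf-8" codecs stand in for the encoding schemes, and only decimal
references of the form &#NNN; are handled.

```python
import re


def resolve_numeric_refs(text: str) -> str:
    """Replace decimal numeric character references (&#NNN;) with the
    character that ISO 10646 assigns to that code position.

    Hypothetical helper for illustration only.
    """
    return re.sub(r"&#([0-9]+);", lambda m: chr(int(m.group(1))), text)


def parse(raw: bytes, encoding: str) -> str:
    """Hypothetical two-step reading of a document."""
    # Step 1: the character encoding scheme only governs how bytes
    # become characters.
    text = raw.decode(encoding)
    # Step 2: numeric character references are resolved against
    # ISO 10646 code positions, never against the encoding.
    return resolve_numeric_refs(text)


# The same document: two literal kanji, the same two kanji written as
# numeric references, and a Latin-1 letter written as a reference.
source = "\u65e5\u672c &#26085;&#26412; &#233;"

as_jis = parse(source.encode("iso2022_jp"), "iso2022_jp")
as_utf8 = parse(source.encode("utf-8"), "utf-8")
```

Under these assumptions as_jis and as_utf8 come out identical, which is
the point of the draft wording: &#26085; names the same character
whether the bytes on the wire were ISO-2022-JP or UTF-8.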

