HTML-WG Archives
Subject: Re: HTML Character Representation/Transmission Model
From: Glenn Adams
Date: Tue, 11 Apr 95 09:00:37 EDT
Content-Type: text/plain




Lou,

The key to this proposal is that specifying 10646 as a universal
HTML document character set is, in general, simply an editorial
change w.r.t. current practice (as Chris Lilley points out).

First, this change does not require 10646 be used in any storage
object; that is, the current practice of using 8859-1 could continue
without change.

Second, all numeric character references in the range 0 - 255 would
refer to both 10646 characters in this range and also 8859-1 characters
in this range since the first 256 code positions of 10646 *are* precisely
the same as 8859-1 in terms of their character assignments (and formal
identities).

Finally, if a numeric character reference specifies a character number
greater than 255, then such a character could be interpreted in some
default fashion or not interpreted as a character at all on existing
8859-1 systems.
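
The numeric character reference behavior described above can be
illustrated with a minimal Python sketch. It is not taken from any HTML
implementation: the names latin1_matches_10646 and resolve_ncr, the
legacy_8859_1 flag, and the "?" fallback are all assumptions made for
illustration, and Python's own code point range stands in for 10646.

```python
# Illustrative sketch only; function names, the flag, and the fallback
# character are hypothetical and not part of any HTML specification.

def latin1_matches_10646() -> bool:
    """Check that every 8859-1 byte value decodes to the code point
    with the same number (the first 256 positions coincide)."""
    return all(bytes([n]).decode("latin-1") == chr(n) for n in range(256))


def resolve_ncr(number: int, legacy_8859_1: bool = False,
                fallback: str = "?") -> str:
    """Resolve the number in a reference such as &#233; to a character.

    Numbers 0-255 mean the same thing under 8859-1 and 10646.
    Larger numbers resolve to the 10646 character, unless the system
    is a legacy 8859-1 one, in which case a default is substituted
    (one of the two options the message mentions).
    """
    if number < 0 or number > 0x10FFFF:  # limit imposed by Python's chr()
        raise ValueError("character number out of range")
    if number <= 255 or not legacy_8859_1:
        return chr(number)
    return fallback


if __name__ == "__main__":
    print(latin1_matches_10646())
    print(resolve_ncr(233))
    print(resolve_ncr(0x3042, legacy_8859_1=True))
```

A legacy system could equally choose to drop the reference instead of
substituting a default; the sketch shows only the substitution option.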

The key to successfully making this change would be to limit the
"significant SGML characters" in the default SGML declaration to
only those characters of 10646 which are also found in 8859-1. The
remaining characters of 10646 would thus be treated as "dedicated
data characters" (of class DATACHAR).
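
As a rough sketch of that partition, the following Python fragment
splits characters into those eligible to be significant SGML characters
(the ones also found in 8859-1) and those treated purely as data. Only
the DATACHAR class name comes from the text above; the function name
sgml_class and the label "8859-1" are hypothetical, and a real SGML
declaration would express this quite differently.

```python
# Illustrative sketch only; sgml_class and the "8859-1" label are
# hypothetical. DATACHAR is the class name used in the message.

def sgml_class(ch: str) -> str:
    """Classify one character under the proposed default declaration.

    Characters in the first 256 code positions (shared with 8859-1)
    remain candidates for significant SGML characters; everything
    else is a dedicated data character.
    """
    if len(ch) != 1:
        raise ValueError("expected a single character")
    return "8859-1" if ord(ch) <= 0xFF else "DATACHAR"


if __name__ == "__main__":
    # A tag delimiter, a Latin-1 letter, and a kanji.
    for ch in "<\u00e9\u65e5":
        print(f"U+{ord(ch):04X}", sgml_class(ch))
```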

Gavin's attempts to use an SGML declaration that admits non-8859-1
characters in markup would have to be deferred to a later date.

The real effects of this change would be to:

(1) rationalize the use of numeric character references in a universal
fashion (at least normatively speaking)

(2) provide a significant growth path to HTML applications that wish
to begin exploiting non-Western European (8859-1) language capabilities,
and do so in a standard fashion

(3) facilitate the use of DSSSL Lite and DSSSL (ISO/IEC 10179), which
require that all characters be expressible in terms of ISO/IEC 10646

(4) provide more consistency with newly developed national standards;
e.g.,

    JIS X 0221  = Japanese National Standard based on ISO/IEC 10646
    GB 13000    = Chinese National Standard based on ISO/IEC 10646
    ECMA ?      = upcoming ECMA standard based on ISO/IEC 10646
    etc.

(5) finally, this change would *not* necessarily change current
behavior or practice

Regards,
Glenn




