Jon Smirl wrote:
>In a message dated 96-02-05 12:06:18 EST, [log in to unmask] (Martin J
>>Thus specifying SYMBOL as a font should not be illegal (other than the
>>font tag itself should be phased out), but it should have a clearly defined
>>meaning as above. Otherwise, a Thai user cannot specify a preferred
>>font, even in a stylesheet, and even when strictly adhering to the i18n
>I still think using SYMBOL as a font should be illegal. The Thai author would
>have marked his document as being in the Thai character set. At this point
>Thai fonts are legal but the standard Windows ANSI fonts are not legal. To
>mix languages in a document you need to use the LANG attribute to change the
>set of legal fonts.
>SYMBOL is never a legal charset. The only way to access symbol or wingdings
>is through entities or Unicode.
You are mixing three things up, or assuming stronger connections between
them than actually exist; they are in fact pretty unrelated. These are:
- Character encoding
- Language markup
- Fonts for display
A Thai document does not need a Thai-specific encoding (which is what I
assume you mean by "character set"). You could use any Unicode-related
encoding, or the standard HTTP/HTML encoding ISO-8859-1 (or anything
else) together with numeric character references. All these variants are
equivalent, provided that your user agent understands them and follows
the i18n draft.
With respect to language markup, it is definitely not the case that
everything in Thai has to be marked as Thai. Language markup is not
compulsory, not least because requiring it would make most existing
documents illegal. Also, there is no default language (such as English).
If you think that the i18n draft can be misunderstood to mean that
anything not language-marked must be English (or whatever), then please
tell us so that we can correct it. If a UA sees some Thai characters, it
is very reasonable for it to assume that the text is indeed Thai, and to
apply the corresponding line-breaking algorithms if they are available,
unless that text is marked as something else. This is completely
independent of how these characters were transmitted over the wire,
i.e. of which encoding ("charset" parameter) was used.
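The behaviour described above could be sketched roughly as follows. This
is only an illustration of the heuristic, not something the draft
prescribes: the function name effective_language() is hypothetical, and
it operates on already-decoded characters, so the encoding used on the
wire never enters into it.

```python
from typing import Optional

# The Unicode Thai block, U+0E00 through U+0E7F.
THAI_BLOCK = range(0x0E00, 0x0E80)


def effective_language(text: str, lang: Optional[str] = None) -> Optional[str]:
    """Return the language a UA might assume, e.g. for line breaking."""
    if lang:
        # Explicit language markup always wins over the heuristic.
        return lang
    if any(ord(ch) in THAI_BLOCK for ch in text):
        # Unmarked text containing Thai characters: assume Thai.
        return "th"
    # No markup and no hint: there is no default language.
    return None


assert effective_language("\u0e44\u0e17\u0e22") == "th"
assert effective_language("hello") is None
assert effective_language("\u0e44\u0e17\u0e22", lang="pi") == "pi"
```

The last case stands for text in Thai script that is explicitly marked
as another language (Pali, say), where the markup overrides the guess.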
Next, fonts are also pretty unrelated. Some fonts contain only the
glyphs needed for Thai, others cover Thai and other languages, and in
some cases a font does not cover all the characters (potentially) used
in a language, as with East Asian ideographs. Also, text in many
languages quite often includes foreign characters, especially Latin and
Greek but also others, and these are seen less as being in another
language than as a natural extension of the current language.
Restricting the rendering resources for text marked as Thai to Thai
fonts only will therefore not work.
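As a rough sketch of why per-character font selection is needed, assume
two entirely hypothetical fonts with the coverage given below (the font
names and ranges are made up for illustration). A UA would then pick,
for each character, the first preferred font that has a glyph for it:

```python
from typing import List, Optional

# Hypothetical coverage tables: font name -> covered code point ranges.
FONTS = {
    "HypotheticalThai": [range(0x0E00, 0x0E80)],
    "HypotheticalLatinGreek": [range(0x0020, 0x0250), range(0x0370, 0x0400)],
}


def pick_font(ch: str, preferred: List[str]) -> Optional[str]:
    """Return the first preferred font covering ch, or None if none does."""
    for name in preferred:
        if any(ord(ch) in r for r in FONTS[name]):
            return name
    return None


preferred = ["HypotheticalThai", "HypotheticalLatinGreek"]

assert pick_font("\u0e17", preferred) == "HypotheticalThai"        # Thai
assert pick_font("A", preferred) == "HypotheticalLatinGreek"       # Latin
assert pick_font("\u03b1", preferred) == "HypotheticalLatinGreek"  # Greek
assert pick_font("\u4e2d", preferred) is None                      # ideograph
```

With only the Thai font allowed, the Latin and Greek characters in an
otherwise Thai text could not be rendered at all.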
So all in all, a Thai document might in general be transmitted in a
Thai character encoding, marked as Thai with the LANG attribute, and
rendered using a Thai font, but this is not the only possible
combination. We should not restrict the use of certain fonts; we should
only make clear the semantics of specifying such a font, as I have tried
to do above.