>note - meta is used for a bunch of stuff. I presume that by "<META>
>madness" you are referring exclusively to the use of the HTTP-EQUIV
>attribute to specify a content type with a parameter to specify the
>character encoding. Please correct me if you are in fact objecting to
>the whole META entity.
No, just the HTTP-EQUIV stuff in general, but especially HTTP-EQUIV when
used to specify the encoding.
>1) the document author probably knows what character encoding was
>used, because they wrote it. Thus, they can set up a program to parse
>the document - and extract any metadata - that makes use of that knowledge.
I think most people would know if they wrote in SJIS or EUC, but I
would not vouch for authors in general being able to write parsers,
nor would I vouch for them knowing the coded character set(s) being
>2) the server: big grey area unspecified by any web standard. Whether
The server cannot parse the document reliably without the encoding
being specified external to the file.
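To illustrate what "external to the file" could look like: a minimal sketch, assuming the label travels as a charset parameter on a Content-Type header. The function name and the sample header values are hypothetical; only the Python standard library is used.

```python
from email.message import Message


def charset_from_content_type(header_value, default=None):
    """Return the charset parameter of a Content-Type value, lowercased.

    Hypothetical helper: the encoding label is read from outside the
    document, so no byte of the document has to be parsed to find it.
    """
    msg = Message()
    msg["Content-Type"] = header_value
    return msg.get_content_charset(default)


# Labelled externally: the encoding is known before parsing starts.
labelled = charset_from_content_type("text/html; charset=Shift_JIS")

# Unlabelled: the caller has to guess or fall back to a default.
unlabelled = charset_from_content_type("text/html")
```

When the parameter is missing, the function returns the default, which is exactly the case where whoever receives the document has nothing reliable to go on.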
>3) the client - well, by then the stuff has been labelled anyway.
One would hope so, but this is currently not the case, and this "META
madness" is partially aimed at helping in cases where it is not
Note that in all 3 cases, or indeed, in any situation where parsing is
required, one must know the encoding before one can reliably parse the
document. In (1) the author or his tool can obviously handle the
encoding, but in the other two cases, it is far from guaranteed.
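A small sketch of why the encoding has to be known first, using Python's built-in codecs. The sample strings are my own choice, not taken from any real document. It shows two things: bytes written in one encoding do not decode correctly under another, and a Shift_JIS character can contain a byte that looks like an ASCII character, which is what trips up a parser that scans bytes without knowing the encoding.

```python
def try_decode(data: bytes, encoding: str):
    """Return the decoded text, or None if the bytes are invalid in that encoding."""
    try:
        return data.decode(encoding)
    except UnicodeDecodeError:
        return None


text = "日本語"
sjis_bytes = text.encode("shift_jis")
euc_bytes = text.encode("euc_jp")

# Decoding with the encoding the author actually used recovers the text.
right = try_decode(sjis_bytes, "shift_jis")

# Decoding the same bytes under the wrong assumption either fails
# outright or yields different text; it never yields the original.
wrong = try_decode(sjis_bytes, "euc_jp")

# The second byte of this character in Shift_JIS is 0x5C, the ASCII
# backslash, so a byte-level scan sees a character that is not there.
# In EUC-JP both bytes have the high bit set, so no such clash occurs.
backslash_in_sjis = b"\\" in "表".encode("shift_jis")
backslash_in_euc = b"\\" in "表".encode("euc_jp")
```

The author's tool in case (1) knows to pick the first decode; a server or client given only the bytes cannot tell which of the two applies.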