In message <[log in to unmask]> 23 May 1996 13:17:13, [log in to unmask] wrote:
> From: Olle Jarnefors <[log in to unmask]>, Peter Svanberg <[log in to unmask]>
> | Daniel Glazman <[log in to unmask]> wrote in
> | <[log in to unmask]>, quoting an earlier
> | message from us:
> | > Hyphenation data are part of the rendering and should not be included
> | > in HTML instances. External references to hyphenation tables + use
> | > of CSS (for the table case) are, from my point of view, a better
> | > solution. Of course not the only solution...
> | Please try your solution. It might be workable for English and
> | French. In other languages no hyphenation table will be big
> | enough because of the promiscuous tendencies of nouns to form
> | never before anticipated compound words.
> | And there are cases of homonyms that cease to be homonyms when
> | correctly hyphenated. Take the Swedish word "mattransport"
> | - It can mean "transport of food".
> | Then possible hyphenations are "mat-trans-port".
> | - Or it can mean "transport of carpets".
> | Then possible hyphenations are "matt-trans-port".
> I find the reference to CSS a little curious - I don't see how
> hyphenation data would fit into CSS at all. I also generally disagree
> with the notion that hyphenation is presentation data; rather, I would
> say it was essential to the meaning of the word.
First, hyphenation algorithms can depend on the element you are dealing
with. I guess one will need stronger constraints on the rendering of table
cells than on the rendering of H1, because a cell's width is only a
percentage of the total width of the document.
Hyphenation is not essential to the meaning of the word: in current HTML
renderings, there is no hyphenation at all and everybody reads and
understands the Web quite correctly.
> On the other hand, the HYPH markup certainly seems to be painful and to
> make the document harder to read in source form. An alternative might
> be a two-stage approach:
> 1. add a HYPHDICT element to HEAD (and add HYPHDICT as a defined
> REL attribute to LINK, to allow external dictionaries), whose
> content would be a sequence of words with HYPH markup; a UA
> doing hyphenation would be expected to use the given hyphenation
> in preference to its own, for given words [I would take
> advisement on whether the content would need to be a series of
> HYPHDICTITEM elements or a text string with whitespace separated
> words - the former allows for phrases with embedded whitespace
> to be hyphenated specially, but I don't know whether that matters]
> 2. use HYPH markup in the BODY *only* where different homonyms
> or contexts make the dictionary approach unacceptable
> This approach would allow minimal intrusion into the source form, is
> scalable for author requirements (an author could accept the browser's
> guesses, or could refer to an external dictionary specific to her
> language, or could provide exceptions), but retains the full capability
> of the HYPH proposal.
Yes, I really prefer this approach. The "dictionary" could then be
a public resource, cached in a proxy cache or a local cache, re-usable
by other documents, and so on. It would also greatly reduce the number
of HYPH or HY (or whatever it becomes) elements in the instance itself.
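
To make the lookup order concrete, here is a minimal sketch in Python of how
a UA might resolve the break points of a word under the two-stage approach:
inline markup first, then the document's dictionary, then the UA's own
guess. Everything in it is an assumption made for illustration: the
whitespace-separated dictionary format with "-" marking break points, the
names parse_hyphdict and break_points, and the trivial fallback. None of it
is part of any proposal.

```python
from typing import Callable, Dict, List, Optional


def parse_hyphdict(text: str) -> Dict[str, List[str]]:
    """Parse whitespace-separated entries such as 'trans-port'.

    Hypothetical format: '-' marks each permitted break point.
    Returns a table mapping the unhyphenated, lower-cased word to its parts.
    """
    table: Dict[str, List[str]] = {}
    for entry in text.split():
        parts = entry.split("-")
        table["".join(parts).lower()] = parts
    return table


def break_points(
    word: str,
    hyphdict: Dict[str, List[str]],
    inline: Optional[str] = None,
    fallback: Callable[[str], List[str]] = lambda w: [w],
) -> List[str]:
    """Return the parts of `word` between permitted break points.

    Priority (an assumption of this sketch):
      1. inline markup in the BODY, for homonyms and special contexts
      2. the dictionary supplied by the document
      3. the UA's own algorithm (here a stub that never breaks the word)
    """
    if inline is not None:
        return inline.split("-")
    parts = hyphdict.get(word.lower())
    if parts is not None:
        return parts
    return fallback(word)


# The dictionary covers the unambiguous word; the Swedish homonym
# "mattransport" needs inline markup to pick the intended reading.
shared = parse_hyphdict("trans-port")
food = break_points("mattransport", shared, inline="mat-trans-port")
carpets = break_points("mattransport", shared, inline="matt-trans-port")
```

Because the table is keyed on the spelling alone, it can hold only one
hyphenation per spelling, which is why the homonym case still needs markup
in the instance itself.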
Could we perhaps have some input from the SGML gurus here? Is HTML the
first SGML application to think about hyphenation? If a solution already
exists and is widely implemented in that community, I guess we could
use it. [I am sending a carbon copy of this message to C. Espert
for information and, I hope, a reply.]