Mark writes:
> Perhaps I should have said - revolution for the web. SGML has been round
> a long time, sure, but no-one ever told me about it! All the existing

Alas not even for the Web. Panorama has been available for four or
five years at least, maybe more.

> systems I have seen or read about seem to relate to large-scale document
> storage and retrieval - at great expense. A pretty exclusive club,

This is one of the great pities of SGML. Yes, it is complex, and yes,
writing software for it does take a lot of time and skill, both of
which cost money unless you are developing something in your own time
for free distribution. One of the major reasons for the traditionally
high cost was Uncle Sam's early adoption of SGML: the same price-tag
was applied to it as to the apocryphal $500 hammers and $100 nuts and
bolts. When I first went looking for a decent graphical SGML editor in
the early years of this decade, the one I liked was $6,000 per copy,
no academic discount, and a salesperson so rude that it was clear my
business was unwelcome unless I was ordering in bulk.

But as has been pointed out repeatedly in books, articles, and
mailing lists/newsgroups, it is possible to do very significant
amounts of SGML work using entirely free software. Both my own book
and Bob DuCharme's (and several others) include lots of free SGML
software on the CDROM. But as the title of Chet Ensign's book points
out, it's the Billion Dollar Secret.

> really. XML allows for the idea that the web is one great big database,
> and that data can be moved around, processed or formatted very easily. I
> know it's obvious, but the separation of data and format on the web is
> 'a revolution' of sorts.

When it occurs, it will be :-)

> Also, I emphasise my use of the word 'consistent', since adoption is
> half the battle.

This is the key.

> Depends what you mean by inefficient. Of course it's inefficient in
> terms of storage and inefficient in terms of bandwidth. But is it not
> immensely efficient to be able to edit, store and transmit my GDP
> figures with much the same software and techniques that are used for
> Hamlet? Is it not very efficient that you could devise a web page that

Yep, good point. Once we get the bandwidth and processing speeds up to
scratch everywhere on the net, uniformly, we'll start to get a bit
closer to something like the all-pervading Gibsonesque net.presence.

> - A table that defines the attributes, e.g. is the attribute an integer
> or string, is there a list of possible values, etc. At this level we've
> also done some things that I don't think can be expressed in a DTD (but

The integer vs string distinction (NUMBER vs CDATA) certainly can be
expressed in a DTD; so can a list of possible values. The things
people look for that SGML does not provide are principally
floating-point values and range validation.
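
As a rough illustration of that gap (a sketch only: the element and
attribute names `gdp`, `year` and `growth`, and the RULES table, are
made up for the example and not taken from anyone's system), the
floating-point and range checks that a DTD cannot express usually end
up in application code run after the parser, something like this in
Python:

```python
import xml.etree.ElementTree as ET

# Hypothetical per-attribute rules layered on top of DTD validation.
# A DTD can already declare NUMBER or an enumerated list of values;
# it cannot say "floating-point" or "between -100 and 100", so those
# constraints are assumed to live here instead.
RULES = {
    "growth": {"type": float, "min": -100.0, "max": 100.0},
}


def check_ranges(xml_text, rules=RULES):
    """Return a list of error strings for attributes that break a rule."""
    errors = []
    for elem in ET.fromstring(xml_text).iter():
        for name, rule in rules.items():
            raw = elem.get(name)
            if raw is None:
                continue  # absent attributes are the DTD's business
            try:
                value = rule["type"](raw)
            except ValueError:
                errors.append(
                    f"{elem.tag}/@{name}: {raw!r} is not a "
                    f"{rule['type'].__name__}"
                )
                continue
            if not rule["min"] <= value <= rule["max"]:
                errors.append(
                    f"{elem.tag}/@{name}: {value} outside "
                    f"[{rule['min']}, {rule['max']}]"
                )
    return errors
```

Calling check_ranges() on a document whose growth attribute is "250"
or "abc" returns one error string per offending attribute; a document
that satisfies every rule returns an empty list.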

> (XLink/XPointer/XSchema), like defining an attribute that points to
> another node in the tree.

That one certainly is a major feature.

> We have also implemented some attribute types
> that make for greater efficiency, such as an attribute type that when
> rendered as XML will contain the database value of a named element in
> its parent object. (This is important if a node is passed out that needs
> to contain information from its parent, but to pass the whole parent as
> well would be wasteful.)

This sounds a bit like the TEI's KEY attribute. Very useful.

> > No, but I often have to get into people's files with a
> > plaintext editor
> > to fix up the things that more foolish software has corrupted.
>
> That is not the point though.

But it is. What we are calling "human-readable" is just a short way of
saying we want sequential, unencoded, inline markup in Unicode
files. No proprietary tricks, no hidden characters, private escape
sequences, special control characters, out-of-line markup, or masonic
handshakes; everything above board and explicit, readable (at a pinch)
in a plain editor.

(Note I am not suggesting that _I_ should expect to be able to read
Unicode Cantonese in my unmodified Notepad, just that a Chinese user
with a regular Cantonese Unicode editor would be able to).

> Very. As it happens, I don't think it was discovered yesterday. I DO
> think that the people who have devised these systems have probably
> missed out on a major opportunity.

Actually a lot of them have spent years trying to get database people
even vaguely interested, but without much success (with a few obvious
and notable exceptions like Texcel, of course). Maybe they were asking
the wrong questions or offering the wrong solutions.

> direction, and I apologise for being so excited. But my guess is that
> the DB and systems people will make the running on this in the next few

I would like to think so.

///Peter