Real confusion runs rampant ... to use XML is not some isolated activity.
It requires agreement within a field of activity as to what's called what ...
there's no magic in XML. So now a grad student not only has to figure out
how to use XML, but also has to create or find or mangle some agreed
nomenclature ... and find some faculty member or other university staff to
validate it ...
As for pdf ... well, the FDA is using it now to accept parts of regulatory
submissions. One of the main reasons, IMHO, is to be able to handle the
complicated tables (word processing, not html) that can be cobbled together
in Word or WordPerfect. These tables don't convert well, if at all,
between word processors, or to html, and are often more creative soarings
than essential. Personally, I think pdf is worth maybe another ten years
... good old ASCII, dull as it is, does most jobs (except for things like
One would hope that the information in theses is reasonably important, but
personally I think that good abstracts, keywords, etc., are quite
useful and require only a small fraction of the work needed for something like XML.
Murray Altheim <[log in to unmask]> on 10/05/98 01:46:37 PM
Please respond to Murray Altheim <[log in to unmask]>
To: [log in to unmask]
cc: (bcc: David Hopp/CRL/Cato)
Subject: Re: Word/WordPerfect and XML
Elliotte Rusty Harold <[log in to unmask]> writes:
> PDF is a documented format. It's also more portable and reliable than raw
> I can't believe you're even considering XML/SGML for student theses at
> this moment in time. I suspect doing this would lead to a rebellion among grad
> students and faculty.
> PDF can handle anything that can be written and printed on paper from a
> computer. XML/SGML can't, at least not yet.
This represents one of the most narrow-minded views of higher education
I've ever heard. There's a greater purpose to the publication of theses
than simply evidence of research in order to obtain a dissertation, and
use of PDF would severely restrict the re-use of valuable research. Given
that candidates author in neither PDF nor XML but either could be an
output format for storage, choosing a proprietary, non-searchable, soon-
to-be-obsolete format means that all research published in PDF will become
invisible to remote researchers, and hence in great part to history. (If
a student's goal is simply to get through school, then maybe their work
being lost to inquiry is not such a bad thing...)
Publishing theses in PDF is the most brain-dead idea I've heard in a long
time, even worse than publishing in MS Word. I'd much rather see ASCII
text in all of its unmitigated boringness than lose access to research
because of an administration's shortsightedness, or to exclude researchers
simply because they don't have the proper tools.
Since adequate tools aren't quite yet available, look at XML for now as an
output format. You won't be able to 'emulate' PDF until XSL reaches some
maturity. I know that in research I've been doing for years, I'm
frequently referring to documents published decades ago (indeed, back to
about 200 BCE). If these documents had been published in some
proprietary, extinct format (say, MacWrite from 1986, rather than simply
on paper), I'd have no means of reading them.
If you are attempting to create an electronic archive in order to allow
remote access (there is much research I'll never get to read), you've
got to think like a librarian: _long_term_ access to historical documents
is still extremely important. PDF is directly inimical to that goal. You're
better off looking at Project Gutenberg* as a model than Adobe's PDF.
* http://www.promo.net/pg/ (particularly the History & Philosophy)
Murray Altheim, SGML Grease Monkey <mailto:firstname.lastname@example.org>
Member of Technical Staff, Tools Development & Support
Sun Microsystems, 901 San Antonio Rd., UMPK17-102, Palo Alto, CA 94303-4900
Ernst Martin comments in 1949, "A certain degree of noise in
writing is required for confidence. Without such noise, the
writer would not know whether the type was actually printing
or not, so he would lose control."