XML-L Archives


Subject: Re: Search Engines
From: Pim van der Eijk <[log in to unmask]>
Reply-To: General discussion of Extensible Markup Language <[log in to unmask]>
Date: Wed, 9 Dec 1998 14:04:19 +0100
Content-Type: text/plain


>>> "Daneker, Vincent" <[log in to unmask]> 12/09 11:21 am >>>
>There's also the Dublin Core Initiative http://purl.oclc.org/dc/ . We've made
>use of some of the elements to mark up official letters, in the <HEAD>
>element of an HTML document, and used a search engine that restricted its
>search to those meta-tags. The principle should apply to XML documents as
>well.
>

It's not implausible that Altavista might offer element-context-sensitive searching
as an advanced search option at some point in the future. In fact, the indexing engine
of Altavista was derived from an earlier project (the Systems Research Center "Hector"
project), which involved indexing and searching an SGML corpus for lexicographers,
so the underlying software originally supported this.
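
To make "element-context-sensitive searching" concrete, here is a minimal sketch in
Python. It is only an illustration and says nothing about how Altavista or the Hector
software actually work: the function names build_index and search are hypothetical,
and the index simply records which element each word occurred in.

```python
import xml.etree.ElementTree as ET
from collections import defaultdict


def build_index(docs):
    """Map (element name, lowercased word) to the ids of documents containing it.

    `docs` is a dict of document id -> XML source text (an assumed input shape).
    Only text directly inside an element, before its first child, is indexed;
    that is enough for a sketch but not for mixed content.
    """
    index = defaultdict(set)
    for doc_id, xml_text in docs.items():
        root = ET.fromstring(xml_text)
        for elem in root.iter():
            for word in (elem.text or "").split():
                index[(elem.tag, word.lower())].add(doc_id)
    return index


def search(index, element, word):
    """Return the sorted ids of documents where `word` occurs inside `element`."""
    return sorted(index.get((element, word.lower()), set()))
```

With such an index, a query for "jones" restricted to <author> returns only documents
written by Jones, not those that merely mention the name in the body.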


>
>We do have a big advantage: the search engine is for searches of our
>internal site only, so we know what tags are being used and who will be
>doing the searches. This is difficult to achieve on a web-wide basis. While
>it might be possible to dynamically generate a list of tags in use, I really
>wouldn't want to wade through the results: <author>, <auth>, <autuer>,
><escritor>, <writer>, etc. All of which are legitimate ways to express the

The original XML Data proposal introduced some ways to express equivalences
between tag names. RDF can probably be used to express this kind of
relation as well. A search engine (or its indexing engine) might use this to map
all those variants to a single name.
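
As a rough sketch of that mapping step (in Python, with a hand-written table standing
in for equivalences that would really be declared in XML Data or RDF), an indexer
could normalize tag names before storing anything. The table EQUIVALENTS and the
functions canonical_name and index_fields are hypothetical names for illustration;
the tag variants are the ones from the quoted message.

```python
# Hypothetical equivalence table: variant tag name -> canonical tag name.
# In practice this would be derived from declared equivalences, not hard-coded.
EQUIVALENTS = {
    "auth": "author",
    "autuer": "author",
    "escritor": "author",
    "writer": "author",
}


def canonical_name(tag):
    """Return the canonical element name for `tag`, or the tag itself if unmapped."""
    tag = tag.lower()
    return EQUIVALENTS.get(tag, tag)


def index_fields(fields):
    """Group (tag, value) pairs under their canonical element names."""
    index = {}
    for tag, value in fields:
        index.setdefault(canonical_name(tag), []).append(value)
    return index
```

A search restricted to "author" would then also find values that were originally
tagged <escritor> or <writer>.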

Personally, I think endorsing a practical subset of metadata tags like Dublin Core
makes much more sense than standardizing an overly abstract and poorly defined
general metadata language such as RDF. In practice, the "Summary Info"
fields that Office 2000 will insert as XML islands in Office D-HTML files will produce lots
of metadata-carrying data on the Web; these fields might even become the de facto
standard in this area.

>concept. It may be that a series of domain- and even language-specific
>dialects develop because the participants in that area of knowledge agree,
>formally or otherwise, that is how to proceed.

This is also the point of view of some librarians I talked to at a recent XML seminar.
Even within a (scientific) discipline, specific subdomains have very different needs wrt
metadata. This reduces the relevance of metadata schemas like DC or MS Office
Summaries that attempt to appeal to the widest possible audience.

>Vincent Daneker
>Information Management
>[log in to unmask]

Pim van der Eijk.
