>>> "Daneker, Vincent" <[log in to unmask]> 12/09 11:21 am >>>
>There also the Dublin Core Initiative http://purl.oclc.org/dc/ . We've made
>use of some of the elements to mark-up official letters, in the <HEAD>
>element of an HTML document, and used a search engine that restricted its
>search to those meta-tags. The principle should apply to XML documents as
It's not implausible that Altavista might offer element-context-sensitive searching
as an advanced search option at some point in the future. In fact, the indexing engine
of Altavista was derived from an earlier project (the Systems Research Center "Hector"
project) that involved indexing and searching an SGML corpus for lexicographers,
so the underlying software originally supported this.
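
To make "element-context-sensitive searching" concrete, here is a minimal Python sketch.
It is only an illustration and has nothing to do with how Altavista or the Hector software
actually works: the sample documents, the element names and the helper functions
build_index and search are all made up for this example.

```python
import xml.etree.ElementTree as ET
from collections import defaultdict


def build_index(docs):
    """Map (element name, lowercased word) -> set of document ids."""
    index = defaultdict(set)
    for doc_id, xml_text in docs.items():
        root = ET.fromstring(xml_text)
        for elem in root.iter():
            # Index only the text that sits directly inside this element.
            for word in (elem.text or "").lower().split():
                index[(elem.tag, word)].add(doc_id)
    return index


def search(index, element, word):
    """Return the ids of documents where `word` occurs inside `element`."""
    return sorted(index.get((element, word.lower()), set()))


# Hypothetical sample documents.
docs = {
    "letter1": "<letter><author>Smith</author><body>Reply to Jones</body></letter>",
    "letter2": "<letter><author>Jones</author><body>Regards from Smith</body></letter>",
}
index = build_index(docs)

by_author = search(index, "author", "smith")  # only letter1
in_body = search(index, "body", "smith")      # only letter2
```

The same word gives different hits depending on the element it is searched in, which is
all that an element-restricted query needs.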
>We do have a big advantage: the search engine is for searches of our
>internal site only, so we know what tags are being used and who will be
>doing the searches. This is difficult to achieve on a web wide basis. While
>it might be possible to dynamically generate a list of tags in use, I really
>wouldn't want to wade through the results: <author>, <auth>, <autuer>,
><escritor>, <writer>, etc. All of which are legitimate ways to express the
The original XML Data proposal introduced some ways to express equivalences
between tag names. RDF can probably be used to express this kind of
relation as well. A search engine (or its indexing engine) might use this to map
all those variants to a single name.
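
As a rough sketch of that mapping step, assuming the equivalences are already available
as a simple lookup table: the table below is written by hand from the tag names in the
quoted list, whereas in practice it would have to be derived from XML Data or RDF
declarations, which is not shown here. EQUIVALENT_TAGS, canonical and extract_fields
are hypothetical names.

```python
import xml.etree.ElementTree as ET

# Hypothetical, hand-written equivalence table (variant -> canonical name).
EQUIVALENT_TAGS = {
    "auth": "author",
    "autuer": "author",
    "escritor": "author",
    "writer": "author",
}


def canonical(tag):
    """Return the canonical name for a tag, or the lowercased tag itself."""
    tag = tag.lower()
    return EQUIVALENT_TAGS.get(tag, tag)


def extract_fields(xml_text):
    """Collect element text under canonical tag names, ready for indexing."""
    fields = {}
    for elem in ET.fromstring(xml_text).iter():
        text = (elem.text or "").strip()
        if text:
            fields.setdefault(canonical(elem.tag), []).append(text)
    return fields


spanish = extract_fields("<doc><escritor>Borges</escritor></doc>")
english = extract_fields("<doc><writer>Woolf</writer></doc>")
# Both documents now expose their author under the single name "author".
```

A query on "author" would then find both documents, whichever variant the original
markup used.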
Personally, I think endorsing a practical subset of metadata tags like Dublin Core
makes much more sense than standardizing an overly abstract and poorly defined
general metadata language such as RDF. In practice, the "Summary Info" fields
that Office 2000 will insert as XML islands in Office D-HTML files will produce lots
of metadata-carrying data on the Web; these fields might even become the de facto
standard in this area.
>concept. It may be that a series of domain and even language specific
>dialects develop because the participants in that area of knowledge agree,
>formally or otherwise, that is how to proceed.
This is also the point of view of some librarians I talked to at a recent XML seminar.
Even within a single (scientific) discipline, specific subdomains have very different
metadata needs. This reduces the relevance of metadata schemas like DC or MS Office
Summaries that attempt to appeal to the widest possible audience.
Pim van der Eijk.