steve,

        This stuff is looking really cool. I'm not sure I completely understand the
IFS though... Is this a "file system" which is stored within Oracle 8i? Or
are external files stored on the host filesystem *indexed* in Oracle?
Depending on the sizes of files and frequency of access, various options
would make sense. For example, suppose you have a million XML documents, each
1 KB in size; that's one thing. Alternatively, you may have a million
documents, each 1 GB in size and only accessed infrequently. Would IFS work
with, say, a hierarchical storage manager that automagically swaps documents
to tape or DVD jukeboxes if not recently accessed?

Jonathan Borden
JABR Technology Corp.
http://jabr.ne.mediaone.net



|
| - The 'relational world' is useful for efficient storage of connected
| data. However, I would just treat it like a file-system. Build a layer
| on top of it and then forget about it. The Holy Grail is objects.

That's the point of view we offer with the Internet File System in
Oracle8i. It lets you define how your XML file gets stored, and then
lets you forget about the database part -- except when you want the
benefits of using SQL and Full-Text searching to search that sucker. :-)

Our example:

<DamageReport>
A massive <Cause>Fire</Cause> ravaged the building and
<Casualties>12</Casualties> people were killed. Early
FBI reports indicate that <Motive>arson</Motive> is
suspected.
</DamageReport>


is a simplification of an actual XML document which came to us from
one of our customers in the insurance industry. I don't understand why
you want to pack duplicate information into element attributes. We let
you search the marked-up mixed content, using the tags to pinpoint
where a term must occur, so the whitepaper illustrates how you can do
a search like:

WHERE CONTAINS('Fire WITHIN Cause') > 0
AND CONTAINS('Arson WITHIN Motive') > 0

as part of a larger SQL statement. This lets you find the needle in a
haystack in a data warehouse of insurance claims.
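
To make the semantics of the two predicates concrete, here is a minimal sketch in Python that emulates "term WITHIN tag" on the sample document. It is not how Oracle implements the search: the helper names `contains_within` and `matches` are hypothetical, the match is a plain case-insensitive substring test rather than a lookup in a text index, and nothing here touches a database.

```python
import xml.etree.ElementTree as ET

REPORT = """<DamageReport>
A massive <Cause>Fire</Cause> ravaged the building and
<Casualties>12</Casualties> people were killed. Early
FBI reports indicate that <Motive>arson</Motive> is
suspected.
</DamageReport>"""


def contains_within(xml_text: str, term: str, tag: str) -> bool:
    """Return True if `term` occurs, ignoring case, inside any <tag> element."""
    root = ET.fromstring(xml_text)
    needle = term.lower()
    # root.iter(tag) visits the root and every descendant with that tag name;
    # itertext() gathers all text nested inside the element.
    return any(needle in "".join(el.itertext()).lower() for el in root.iter(tag))


def matches(xml_text: str) -> bool:
    # Mirrors: 'Fire WITHIN Cause' AND 'Arson WITHIN Motive'
    return (contains_within(xml_text, "Fire", "Cause")
            and contains_within(xml_text, "Arson", "Motive"))
```

The point of the tag restriction is that a report mentioning "fire" only in its Motive element would not satisfy the first predicate, which is what makes the search a pinpoint one.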

We're working on even better support for native structural queries for
the future, but one approach people can adopt in the meantime is the
"persist the DOM tree to a table of nodes" approach that you've
pointed out with decent results. That works well for storage of highly
structured information, but not so great for searching mixed-text and
markup.
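
As a rough illustration of the "table of nodes" idea, the sketch below stores one row per element in an in-memory SQLite database. The table layout (`node` with `id`, `parent_id`, `tag`, `text`, `tail`) and the function `persist_nodes` are assumptions made up for this example, not a schema from Oracle or from the earlier post, and attributes are ignored for brevity.

```python
import sqlite3
import xml.etree.ElementTree as ET


def persist_nodes(conn, xml_text):
    """Store each element as one row and return the row id of the root."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS node ("
        "id INTEGER PRIMARY KEY, parent_id INTEGER, tag TEXT, text TEXT, tail TEXT)"
    )

    def walk(el, parent_id):
        # text = characters before the first child; tail = characters after
        # this element's end tag, up to the next sibling or the parent's end.
        cur = conn.execute(
            "INSERT INTO node (parent_id, tag, text, tail) VALUES (?, ?, ?, ?)",
            (parent_id, el.tag, el.text, el.tail),
        )
        node_id = cur.lastrowid
        for child in el:
            walk(child, node_id)
        return node_id

    return walk(ET.fromstring(xml_text), None)


DOC = ("<DamageReport>A massive <Cause>Fire</Cause> ravaged the building and "
       "<Casualties>12</Casualties> people were killed.</DamageReport>")

conn = sqlite3.connect(":memory:")
root_id = persist_nodes(conn, DOC)
rows = conn.execute(
    "SELECT id, parent_id, tag, text, tail FROM node ORDER BY id"
).fetchall()
```

Looking up structured values is easy (`SELECT text FROM node WHERE tag = 'Cause'`), but the running sentence ends up scattered across the `text` and `tail` columns of several rows, which is one way to see why this layout is awkward for searching mixed text and markup.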

Our Internet File System approach lets you more easily map things to a
table-column structure so existing tools can still work against the
data once it's in the database. Of course, it also supports
selectively mapping documents and document fragments to indexed
character blobs.