
Re: Fwd: Re: XML and JNDI



On Tue, 10 Aug 1999, you wrote:
> Falko Braeutigam wrote:
> > On Tue, 10 Aug 1999, Zvi Avraham wrote:
> > > Falko Braeutigam wrote:
> > > > From: Steve Tinney <stinney@sas.upenn.edu>
> > > > Fwiw, I would say just go ahead and store the whole DOM, no cutting.
> > > > Those with huge datasets will probably have huge machines, no?
> > > Sorry Steve, I can't agree with you.
> > > For me, two of the reasons for databases to exist are:
> > > 1. Persistent storage
> > > 2. Ability to work with data (tables/documents/whatever) that can't fit in RAM.
> > >
> > > I don't want only the ability to store a whole DOM tree in Ozone and retrieve
> > > it again.  I want to query and update this document while it resides in Ozone,
> > > and such a query should return only specific Nodes of the DOM, not _all_ of
> > > the DOM tree.
> > I think this is not what Steve was saying. Of course, the DOM nodes will be
> > stored independently as OzoneObjects. The problem may be the overhead of the
> > database when storing small objects (like DOM nodes). To avoid this I was
> > thinking about clustering of nodes. Actually, Steve does not want such tricky
> > things. He wants each node to be stored as an independent database object. So,
> > you are of the same opinion. Right?
> 
> Yes, Falko, that's what I meant.  By whatever means, the entire DOM would be
> on disk whether it be in the database or layered over it.  The storage
> overhead might be large, but it would just be disk space, and the low cost of
> that is only likely to go down further, and it's already pretty low (in the
> U.S. we are at <$700 for an 18G U2W scsi drive).
> 
> In the GMD-IPSI XQL+PDOM, you just query the dataset as though it were an
> in-memory DOM, even though the whole thing is on disk.
This should also be possible with ozone. In fact, this is the way to
communicate with the database - just calling methods on proxy objects that act
like the real objects in the database.
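
To make the proxy idea concrete, here is a minimal sketch in plain Java. It
does not use the ozone API: StoredNode, StoredNodeImpl and proxyFor are
hypothetical names chosen for illustration, and the "database" is just an
ordinary object held by the handler. The caller only sees the interface, and
every method call is forwarded to the real object behind it.

```java
import java.lang.reflect.InvocationHandler;
import java.lang.reflect.Proxy;
import java.util.List;

public class ProxySketch {
    // Hypothetical interface standing in for a persistent node type.
    public interface StoredNode {
        String getName();
        void setName(String name);
    }

    // The "real" object; in a database this would live on the server side.
    public static class StoredNodeImpl implements StoredNode {
        private String name;
        public StoredNodeImpl(String name) { this.name = name; }
        public String getName() { return name; }
        public void setName(String name) { this.name = name; }
    }

    // Returns a proxy that records each call in 'log' and then forwards it
    // to 'target'. A real database proxy would ship the call to the server
    // here instead of invoking a local object.
    public static StoredNode proxyFor(StoredNode target, List<String> log) {
        InvocationHandler handler = (proxy, method, args) -> {
            log.add(method.getName());
            return method.invoke(target, args);
        };
        return (StoredNode) Proxy.newProxyInstance(
                StoredNode.class.getClassLoader(),
                new Class<?>[] { StoredNode.class },
                handler);
    }
}
```

The client code holds only a StoredNode reference, so it cannot tell whether
the object is local or stored elsewhere.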

> The XQL result set is returned as an in-memory object.  I don't know if the
> PDOM implementation caches any nodes in memory; one could imagine that being
> a useful optimization, but unnecessary for an initial release or two.
Caching is already there in ozone.

> The PDOM uses indexing to speed disk access.
Unfortunately, DOM does not provide data access paths other than the
NamedNodeMap. So if indexing or other things are needed, they should go in the
Common DOM Query API (CDQA) layer on top of DOM. I do not know exactly what's
going on in this layer, but Vincent has nearly finished his work on CDQA. He did
a lot of research on querying semi-structured data. But CDQA is based on pure
DOM. This seems to be a good architecture. (?)
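
As a rough illustration of the access-path point, the sketch below uses the
standard org.w3c.dom interfaces. DomAccessSketch and its helper methods are
hypothetical names, not part of ozone or CDQA. Attributes can be fetched by
name through a NamedNodeMap, but finding a child element by an attribute
value means walking the siblings one by one, which is the kind of lookup an
index in a layer above DOM could speed up.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NamedNodeMap;
import org.w3c.dom.Node;

public class DomAccessSketch {
    // Parses an XML string into a DOM Document (helper for the example).
    public static Document parse(String xml) {
        try {
            return DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
        } catch (Exception e) {
            throw new RuntimeException(e);
        }
    }

    // Keyed access: an attribute is looked up by name in the NamedNodeMap.
    public static String attribute(Element e, String name) {
        NamedNodeMap attrs = e.getAttributes();
        Node attr = attrs.getNamedItem(name);
        return attr == null ? null : attr.getNodeValue();
    }

    // No keyed access here: the children are scanned linearly until one
    // carries the wanted attribute value.
    public static Element firstChildWithAttribute(Element parent, String attrName, String value) {
        for (Node n = parent.getFirstChild(); n != null; n = n.getNextSibling()) {
            if (n instanceof Element && value.equals(attribute((Element) n, attrName))) {
                return (Element) n;
            }
        }
        return null;
    }
}
```

With one database object per node, each step of that scan could touch a
separate stored object, so the cost of the missing access path would grow
with the number of children.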

> I can see that if there is significant database overhead clustering would be
> a gain, but again one could do it later, no?
I think so.


Falko
-- 
______________________________________________________________________
Falko Braeutigam                         mailto:falko@softwarebuero.de
softwarebuero m&b (SMB)                    http://www.softwarebuero.de