Re: S.O.D.A. database Query API - call for comments
Date: Mon, 14 May 2001 17:40:24 +0200
Message-ID: <9doua1$ci5$01$1_at_news.t-online.com>
Philip Lijnzaad wrote:
> Your courage is indeed much admired :-)
Thanks!
> Carl> "Existing links" are defined by the class schema of the application.
The approaches definitely come from two totally different directions:
- SQL *is* the language of sets and mass updates
- Object orientation is the exact opposite. You define single object
behaviour.
The two extremes both need the functionality of the other side:
- Relational databases are headed towards "object-relational". Inheritance
for tables is introduced. Akmal Chaudhri has recently posted an example on
how Informix handles this to comp.databases.object. A side-note: Serializing
objects to columns has nothing to do with "object-relational". It merely is
"object-struct-reuse".
Let's hope we can all learn from eachother to create the best of both
worlds.
> Carl> I have said this before earlier in this thread: If you need a rigid
My compliments in return for being a fair and enjoyable discussion partner.
> >>
> >> which is a bit rigid ... you can't do tricks like finding the gaps in a
> >> series of entries (see other thread) by doing a self join.
>
> Carl> Again hack against theory.
>
> why ? The theory is so general that it allows clever (and fast) 'hacks',
> because it is based on operating on sets of things rigth from start. I
> believe this is what makes it so suitable to large-scale data management.
- Object databases also need declarative functionality for queries and
updates.
> Carl> system for day-to-day data management to maintain rather flat
> Carl> databases are probably the best choice.
>
> I object to the 'rigid' and 'flat'; the relational systems I have worked
with
> are anything but rigid or flat.
Give a finger and you've lost a hand. :-)
I meant "rigid" in a positive sense, as in "stable-in-every-day-use". Sorry
if I choose the wrong words sometimes, since English is not my mother
language. Replace "rigid" with "stable".
Forget "flat" also, if you wish. "Flat" and "heterogeneous" are a matter of
standpoint.
> Why ? Using relational calculus, you 'massage' the relevant (parts of)
> entities into a set of data that is of interest to the business logic, in
the
> most optimal way. The massaged (parts of) entities need not correspond to
any
> proper object type known to the system, but that is irrelevant.
This is where I see one of the largest advantages of relational databases: You can always set up a two-dimensional table of your result, which is instantaniously readable by a human.
> The end
> result of a query or transaction will be something that is then expressed
(or
> expressible) as objects. Relations expose every nook and cranny of the
data,
> in the most general way; sometimes this is simply needed.
> Carl> How many tables would you use to represent an inheritance hierarchy
of the
> Carl> following?
> Carl> - person
> Carl> - employee
> Carl> - manager
>
> Depends, but usually three. This can be 'horizontal' (only tables for
> leaf-classes; query super-classes through a UNION of a restriction of
them)
> or vertical (each class its own table, with sub-classes having just the
bits
> that are 'new', + a foreign key that is also a primary key; query
sub-classes
> through a join of super- and sub-table).
This is a fair and honest answer. "Depends" shows that you have a problem on
a decision, depending on application and performance issues. The need to
take a decision already is the sign of a mismatch.
Thinking OO, the above is simple and straigth forward:
manager extends employee extends person.
Using your second suggestion, the necessary queries for manager names looks
like this:
I bet, I will be faster:
- Relational:
select person.name from person, employee, manager
where employee.person = person.id
and manager.employee = employee.id
- Object-oriented:
database.get(manager);
- customer extends person
- sponsor extends person
- VC extends sponsor
- self_employed extends employee
database.get(person)
> That's not the point: the tables and primary/foreign keys guarantee you
that
> you can query the most bizarre combinations of parts of entities,
producing
> things that don't have to be any particular object. This generality is
> lacking in OODBMSs. So the 'separation' aspect is that RDMBSs do one thing
> (storing and querying), and do it well (i.e. in a very general way).
I fully agree.
Accordingly we would like to extend the query functionality for object databases.
> Carl> Maintainability is a matter of the implementation. Schema evolution
is
> Carl> not a problem inherent to object datases in principle.
>
> I'm not so sure, because objects don't have the closure property. I.e.,
> products, projection, set operations and restrictions of relations are
again
> relations. This symmetry is not available for objects.
To some extent, you may be right:
A table may exist on it's own. You can transfer it anywhere independantly.
If you transfer a second table, the joins will be there again, simply
because they only consist of numbers.
Object databases would transfer objects independantly. A complete restore of a large object network is only possible with access to OIDs. Ideally these would be unique worldwide (GUIDs).
Now what is the usecase?
Replication!
This is where GUIDs also become worthwhile using on relational systems.
> Carl> No: Reuse your object structures with a new logic if you wish.
>
> So what happens if a particular relationship changes from a one-to-many
> relationship into a many-to-many relationship ? You will have to update
your
> object structure, not matter what (both in the case of an OODBMS as well
as
> in the RDBMS + OR layer).
True.
I am just questioning the efficiency, performance, stability and resource
consumption of an additional layer.
> Carl> I don't understand your
OR layers can save you very much work. It might get your implementation work
very close to using an object database.
> Carl> round-trip-problem. This is dependant on the implementation.
>
> in theory yes (if you are clever enough to do query transformations), but
in
> practice no. In practice, complicated queries end up being done by
explicit
> pointer chasing, because that's the only thing available.
Query transformation and optimization is quite simple:
The principle will be the same for relational and for object databases.
Of course query evaluation can get more complex if object databases provide
added language binding functionality like support for Arrays or Collections
> Apart from the
- think in sets
- identify reocurring sets
- evaluate one set at a time
- always use the smallest set first
- narrow larger sets by smaller sets
> inability to do convenient aggregations over the data, you loose the
ability
> to work with sets of things ('things' not necessarily being
> objects). Compared to that, the round-trip problem is indeed minor.
"Sets of things" should be available on object databases by providing indices.
> Carl> Agreed, if an OR layer is used.
> Carl> OR layers produce lots of memory and performance overhead.
>
> no, not necessarily. In fact, frequently the OR layer is used to add some
> caching, improving the speed relative to the RDBMS. I will conceed that an
> OODBMS (which usually also does sophisticated chaching, it seems) has an
> advantage here.
Thanks!
Yes, OR layers can improve the speed by caching objects, but this only works for usecases, where a small and defined amounts of data are accessed. For typical High-End-Usecases you can assume that caches are always full.
I will conceed that most current object databases including our own perform badly, if query-evaluation hits cache limits. The future will see further development and better systems.
> Carl> As far as I am informed, views still are very slow.
>
> Where did you get this information from ?
- Empirical research, two years ago.
- Continous contact to my former company.
> A view is as fast as the query that
> defines it. So stating that a view is slower than a table may be true, but
> you're comparing apples and oranges.
I fully agree.
A view is *only* as fast as the query that defines it.
> Carl> Relational databases have only just started to allow indices on
views.
>
> Views can (and do) use the same indexes that the tables they consist of
> have, so I don't know what you're trying to say here.
A little more than a year ago MSSQL has added indexed views. The desired effect: Views can run faster than the queries that define them, because joins are pre-evaluated.
Kind regards,
Carl
--- Carl Rosenberger db4o - database for objects - http://www.db4o.comReceived on Mon May 14 2001 - 17:40:24 CEST