Re: S.O.D.A. database Query API - call for comments

From: Carl Rosenberger <carl_at_db4o.com>
Date: Mon, 14 May 2001 12:21:11 +0200
Message-ID: <9dobje$t4p$03$1_at_news.t-online.com>


Lee Fesperman wrote:
> > - Why should an internal ID get lost? There is more safety here than in a
> > relational system.
>
> I was just guessing about lost IDs; hierarchical and network databases suffer from
> broken links. In relational, these kinds of links are modelled with referential
> integrity. Can you prove that your scheme is safer, as you claim?

How do you *know* that hierarchical and network databases suffer from broken links? Have you used one? I can't help other vendors' bad implementations. A schema to model links between objects or tables is always as safe as the database engine. There is no difference between relational and object databases here. If corruption occurs, joins or links are lost on both systems.

> > - Primary keys do not model anything in the real world.
>
> Primary keys often model the real world. They are used in the real world for identifying
> entities. They may be existing identifiers - social security numbers, policy id's, ...
> or values assigned by the system and used externally -- order #, employee id, ... . OIDs
> are not good for this. They are too long (too many digits) and, besides, are not
> 'exposed'.

From what I have learned about relational systems, choosing a primary key that models something in the real world is bad practice, since you run into terrible trouble if the real-world scheme changes (e.g. a changed employee ID pattern).

OIDs are typically exposed by most object database vendors but in my opinion this is bad practice.

> You are not removing the necessity for keys. How do you access top-level objects? Would
> your GUI require me to enter an OID in order to access the information for an individual
> employee?

How would you identify an individual employee within your company? You would use his first name and his last name. In very large companies you might also have an employee ID.

> > I can't follow your argumentation. For me it sounds like:
> > "If you walk without crutches, walking becomes more complex."
>
> You have two methods of access (by link, by key), but, some objects can only be accessed
> with one method and not the other. This is added complexity.

Again:
We don't expose access by key at all.
You can either
- use a declarative query to get a result set that contains the desired objects
or
- navigate within the programming language from parent to child. This is not part of the query system. The object database only helps here, since it regenerates the links within an object network. Both methods are available for all objects.
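
A minimal sketch of these two access paths in plain Java. This is not the S.O.D.A. API: the in-memory list, the `query` helper and the `Employee`/`Address` classes are hypothetical stand-ins, chosen only to show that the query returns a result set and that the step from parent to child afterwards is ordinary member access.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Predicate;

public class AccessPaths {
    // Hypothetical application classes; the names are illustrative only.
    static class Address {
        final String city;
        Address(String city) { this.city = city; }
    }

    static class Employee {
        final String lastName;
        final Address address; // plain Java reference: the "link" to the child
        Employee(String lastName, Address address) {
            this.lastName = lastName;
            this.address = address;
        }
    }

    // Stand-in for a declarative query: the caller states a condition
    // and gets a result set back. No key or OID is involved.
    static List<Employee> query(List<Employee> stored, Predicate<Employee> constraint) {
        List<Employee> result = new ArrayList<>();
        for (Employee e : stored) {
            if (constraint.test(e)) {
                result.add(e);
            }
        }
        return result;
    }

    static String cityOfFirstMatch(List<Employee> stored, String lastName) {
        List<Employee> result = query(stored, e -> e.lastName.equals(lastName));
        // Navigation from parent to child is ordinary member access in the language.
        return result.get(0).address.city;
    }

    public static void main(String[] args) {
        List<Employee> stored = new ArrayList<>();
        stored.add(new Employee("Smith", new Address("Munich")));
        stored.add(new Employee("Jones", new Address("Hamburg")));
        System.out.println(cityOfFirstMatch(stored, "Jones"));
    }
}
```

Both steps work for any stored class in this sketch; neither needs a key defined on `Employee` or `Address`.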

Again:
Navigation within the programming language is part of the programming language's capabilities. We do not add complexity here.

> The complexity is --- First and foremost, the user must have knowledge of existing links
> (and their direction) in order to access data.

"Existing links" are defined by the class schema of the application.

A programmer for a relational system also has to be aware of the class schema of his application. *Additionally*, he has to know the schema of the database. This is added complexity.

> Secondly, this brings more complications:
> + access to a sub-object must be through a 'parent' object, a special complication in
> some cases.

Navigation between objects and members is always from object to member. The same goes for object-oriented applications that use relational databases.

> + deeper sub-objects require more knowledge and more complicated navigation.

The same is true for relational databases: The more tables you join, the more complex a query gets.

> + as the database becomes more complex, determining and understanding the required
> navigation becomes difficult -- "I know the entity I wish to access, but I'm unclear
> where it sits in the hierarchy."

Accessing an entity from another entity always requires you to know how they are linked together.

You are trying to use the deficiencies of relational databases as an argument for them:
- Relational databases need multiple tables to represent inheritance hierarchies. Queries tend to accumulate more and more joins, and the access pattern becomes very difficult to understand.
- With object databases you can query against a single object class, no matter how deep the inheritance hierarchy is.
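
To illustrate the second point, here is a small sketch in plain Java, assuming an in-memory list as a stand-in for the object store. The `queryByClass` helper and the `Person`/`Employee`/`Manager` hierarchy are hypothetical and not part of any real query API; the point is only that one class is named in the query and instances of all its subclasses come back.

```java
import java.util.ArrayList;
import java.util.List;

public class ClassExtentQuery {
    // Hypothetical three-level inheritance hierarchy.
    static class Person {
        final String name;
        Person(String name) { this.name = name; }
    }

    static class Employee extends Person {
        Employee(String name) { super(name); }
    }

    static class Manager extends Employee {
        Manager(String name) { super(name); }
    }

    // Stand-in for a query against one class: returns every stored object
    // that is an instance of the class, including instances of subclasses.
    static <T> List<T> queryByClass(List<Object> stored, Class<T> type) {
        List<T> result = new ArrayList<>();
        for (Object o : stored) {
            if (type.isInstance(o)) {
                result.add(type.cast(o));
            }
        }
        return result;
    }

    public static void main(String[] args) {
        List<Object> stored = new ArrayList<>();
        stored.add(new Person("Ann"));
        stored.add(new Employee("Bob"));
        stored.add(new Manager("Cid"));

        // One class named in the query, no matter how deep the hierarchy is.
        System.out.println(queryByClass(stored, Person.class).size());
        System.out.println(queryByClass(stored, Employee.class).size());
    }
}
```

A relational mapping of the same hierarchy with one table per class would need a join or union per level to answer the same question.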

> > Under "meta-data" I would understand data that describes the class model.
>
> Relations between objects is meta-data. Relational doesn't store links between tables in
> the tables themselves, because it is meta-data. What is the real world counterpart to
> links? Paper clips, staples?

What are foreign keys but relations between objects?

> > No, sorry, I do not suggest exposed object IDs. Object IDs are handled
> > internally.
>
> You left out the next quoted line of yours that I was responding to:
>
> "> Navigation by links is *very* specific. We also want to provide the ability
> > to specify comparison by object-identity in our query API."
>
> I read 'comparison by object-identity' to mean you were exposing OIDs. What did you
> mean?

When you use object variables within your programming language, they represent real-world entities. Let's say I have an employee object with the name "George Bush" that I have retrieved in a previous query. If I now want to specify this object as a constraint for another query, there are multiple ways of comparison. If I wanted to retrieve all salary payments that match this specific "George Bush", I would use the constraint evaluation "compare-by-object-identity". This is possible without exposing the OID.
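
As a rough sketch of what comparison by object identity means here, assuming plain Java references in place of the database's internal IDs: the `paymentsFor` helper and the classes below are hypothetical and do not show the actual query API. Two employees carry the same name, yet only the payments linked to the one object used as the constraint are returned, and no ID ever appears in application code.

```java
import java.util.ArrayList;
import java.util.List;

public class IdentityConstraint {
    static class Employee {
        final String name;
        Employee(String name) { this.name = name; }
    }

    static class SalaryPayment {
        final Employee employee;
        final int amount;
        SalaryPayment(Employee employee, int amount) {
            this.employee = employee;
            this.amount = amount;
        }
    }

    // Constraint evaluated by identity: '==' compares references,
    // so an employee with an equal name does not match.
    static List<SalaryPayment> paymentsFor(List<SalaryPayment> stored, Employee constraint) {
        List<SalaryPayment> result = new ArrayList<>();
        for (SalaryPayment p : stored) {
            if (p.employee == constraint) {
                result.add(p);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        Employee first = new Employee("George Bush");
        Employee second = new Employee("George Bush"); // same name, different entity

        List<SalaryPayment> stored = new ArrayList<>();
        stored.add(new SalaryPayment(first, 1000));
        stored.add(new SalaryPayment(second, 2000));
        stored.add(new SalaryPayment(first, 1500));

        System.out.println(paymentsFor(stored, first).size());
    }
}
```

A comparison by name would have matched all three payments in this sketch, which is why the kind of comparison has to be part of the constraint.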

> > I only wanted to point out, that the user can do anything he wishes in
> > modeling his objects. If he feels that he needs a "primary key" for object
> > types, he can of course add one.
>
> And will the user be able to use a foreign key and get key-type referential integrity?

No, sorry, it just does not make sense to misuse an object database with a relational approach.

> Note: by 'misplaced data', I mean, for instance, that a member field is declared in the
> wrong object (class).
>
> You said your database was not normalized. How do you prevent redundant, misplaced data?
> Tell me exactly how you prevent it, otherwise your database is likely to have it.

<caricature>
This is wrong. Tell me exactly which paradigm stands in contrast to relational databases. How is redundant, misplaced data prevented? Tell me exactly how you prevent it in relational databases, otherwise your database is likely to have it.
</caricature>

Provided that you work with modern programming languages, relational databases require you to work with two programming paradigms:
- object-oriented in your application class model
- relational in your database scheme

Using two paradigms makes an application more complex. This will result in

- more errors
- more implementation work = higher cost
- worse performance

...and of course more misplaced data.

> Normalizing the database greatly improves
> the ease of extension.

No.
You have to extend two models:
- the application class scheme
- the database table system

You also have to correct all queries and mappings by working through all applications by hand.

> I'm sure that people will flock to a DBMS that is:
> + hard to use (multiple access techniques whose availability depends on the specific
> entity being accessed.)
> + untrustworthy (ignoring normalization will lead to redundant, misplaced data, which
> leads to data corruption.)
> + extensible primarily through rewriting rather than evolving (because it is not
> normalized.)

I will not dig down to your level of subjective negative argumentation. Your argumentation chain lacks any substance.

> > If you ask me personally, yes we do specialize on a niche: We are targeting
> > mobile Java devices with limited memory and resources.
>
> You figure those constraints eliminate the possibility of competition from a relational
> DBMS?

We have run benchmarks, memory tests and database file size tests against the following:

- Oracle Lite
- IBM DB2 Everyplace
- Cloudscape
- PointBase
- Sybase SQL Anywhere
- InstantDB
- Mckoi

Our engine definitely is superior for many use cases, especially on low-resource operating systems like EPOC, where you cannot operate with memory-consuming object-relational wrapper libraries. Database storage implementation work is drastically reduced, since you simply store objects as they come. We are absolutely sure that our engine is by far the best choice for storing objects on mobile systems. Slowly but steadily, our mailing list grows with fans and supporters.

Kind regards,
Carl

---
Carl Rosenberger
db4o - database for objects - http://www.db4o.com
Received on Mon May 14 2001 - 12:21:11 CEST
