Re: Unknown SQL

From: Bob Badour <bbadour_at_golden.net>
Date: Sat, 21 Jul 2001 23:27:50 GMT
Message-ID: <9avR6.741$H77.178313287_at_radon.golden.net>

Carl Rosenberger wrote in message <9f42pe$2le$04$1_at_news.t-online.com>...
>Bob Badour wrote:
>> >> Object languages expose physical implementation details to programmers.
>> >> Subsequently, when object languages differ, databases based on them must
>> >> differ as well. Your 'ideal' was long ago defined out of existence.
>> >
>> >Most object languages are very similar. You have classes, simple datatypes,
>> >arrays and inheritance. Lower level API classes that would typically be
>> >persisted like collections are also similar.
>> >
>> >The database API would be exactly the same.
>>
>> Go back and read my original email on this topic for an abbreviated list of
>> the ways that different OODBMSs differ in subtle and confounding ways.
>
>After reading all your postings in this thread I can not find a list of the
>differences.

That's odd. You must have missed this one:

From: Bob Badour (bbadour_at_golden.net)
Subject: Re: Unknown SQL
Newsgroups: comp.databases, comp.databases.object, comp.databases.theory, comp.lang.java.databases, comp.lang.java.programmer
Date: 2001-05-12 04:14:37 PST

>And in fact, from a pragmatical point of view, the fact that RDBMSs are
>largely standardized, it is much easier to switch to different languages or
>even RDBMS vendors. Same can not be said for OODBMSs. Some argue that this
>is not a strength of the relational model, but rather a failure (since what?
>10 years?) on the part of the OODBMS vendors to get their act together on the
>standardization front. Others will argue that it _is_ easier to standardize
>implementations of well understood technology such as relational theory, and
>that object oriented persistence is difficult simply because there is no
>overarching formal theory for objects.

Philip,

It is not an issue of theory, per se, that makes it easy to switch between RDBMSs and hard to switch between network model OODBMSs. An RDBMS "represents all data as values in columns in relations". The value 1 does not change significantly among different representations. Whether represented with a character string, packed decimal, binary integer, IEEE floating point number or any other representation, the value 1 remains the value 1.

Moving from one vendor's RDBMS to another vendor's RDBMS, the value 1 likewise remains the value 1. Similarly, the value 346987563 remains the value 346987563.
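As a rough illustration of that point, the sketch below decodes the same two values from four physical representations and checks that the logical value never changes. The decoder names and the byte layouts (a sign nibble of "c" or "d" for packed decimal, big-endian byte order) are assumptions chosen for the example, not any particular vendor's storage format.

```python
import struct

def from_char_string(raw: bytes) -> int:
    # Value stored as ASCII digits.
    return int(raw.decode("ascii"))

def from_packed_decimal(raw: bytes) -> int:
    # Assumed layout: one decimal digit per nibble, last nibble is the sign
    # ("c" = positive, "d" = negative).
    digits = raw.hex()
    sign = -1 if digits[-1] == "d" else 1
    return sign * int(digits[:-1])

def from_binary_integer(raw: bytes) -> int:
    # Big-endian two's-complement integer.
    return int.from_bytes(raw, "big", signed=True)

def from_ieee_double(raw: bytes) -> float:
    # Big-endian IEEE 754 double.
    return struct.unpack(">d", raw)[0]

one = [
    from_char_string(b"1"),
    from_packed_decimal(bytes.fromhex("1c")),
    from_binary_integer(bytes.fromhex("00000001")),
    from_ieee_double(bytes.fromhex("3ff0000000000000")),
]
big = [
    from_char_string(b"346987563"),
    from_packed_decimal(bytes.fromhex("346987563c")),
    from_binary_integer((346987563).to_bytes(4, "big", signed=True)),
]

all_one = all(v == 1 for v in one)
all_big = all(v == 346987563 for v in big)
print(all_one, all_big)
```

Whatever the bytes look like, a comparison at the level of values gives the same answer, which is the only level a relational user works at.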

Network model databases, however, represent data with pointers and other physical constructs that vary greatly among different vendors. Further issues such as shallow copies vs. deep copies, pointer swizzling, implicit back-pointers etc. vary as well. Every implementation of a network model database changes the meaning of the data implied by those physical constructs by introducing often subtle differences in interpretation.

Regards,
Bob

>> >There simply is no (zero, nothing, rien, kein) administration work or
>> >maintenance work necessary to persist objects. The database engine analyzes
>> >classes automatically.
>>
>> Are you for real? Of course there will be tons of administration work. How
>> does the DBMS handle the fact that the Employee Object used by HR has subtle
>> differences from the Employee Object used by Payroll and that Hiring an
>> employee means different things to these different departments? How does it
>> handle the fact that what it means to these departments this year is subtly
>> different from what it meant last year and what it will mean next year?
>
>The work to reflect changes is to be done in the programming code, am I
>right?

Nope. Many applications need not change at all because their specific needs might not change from one year to the next. Logical independence, delivered by views, provides all that's necessary. Of course, the subset of applications that does need to change from one year to the next will have to change in the code. Presumably, those applications might use new views on the database as well.
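A minimal sketch of logical independence through a view, assuming Python's standard sqlite3 module. The table employee, the view employee_v1 and the function legacy_report are hypothetical names invented for the illustration; nothing like them appears in the thread.

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.executescript("""
    -- This year's base table: the name is now split into two columns.
    CREATE TABLE employee (
        emp_id      INTEGER PRIMARY KEY,
        given_name  TEXT NOT NULL,
        family_name TEXT NOT NULL
    );
    INSERT INTO employee VALUES (1, 'Bob', 'Badour');

    -- Last year's applications expected a single full_name column,
    -- so the old shape is preserved as a view over the new table.
    CREATE VIEW employee_v1 AS
        SELECT emp_id, given_name || ' ' || family_name AS full_name
        FROM employee;
""")

def legacy_report(connection):
    """Unchanged 'old' application code: it only knows about employee_v1."""
    return connection.execute(
        "SELECT emp_id, full_name FROM employee_v1 ORDER BY emp_id"
    ).fetchall()

rows = legacy_report(con)
print(rows)
```

The old application keeps running against employee_v1 while new applications read the restructured base table directly.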

>There are object databases that automatically handle schema changes. Still
>there is no maintenance work necessary to adjust the database schema. It
>will continuously be analysed by the database engine.

Right. And if pigs had wings they would fly. I suppose I just imagine the schema change I want, the database reads my mind and takes care of all the rest?

>If you think you need proxy objects to provide a different view at employee
>objects from HR and from Payroll, why not create the classes for them?

What exactly is a different view at an employee object? Is it a different view at the data? Is it different behaviour? Why would I want to create different classes for them at all? It's much simpler to create relational views in the database and handle application issues in the applications.

>If you want versioning, fine.

Versioning is an inferior attempt at addressing physical/logical independence. When I want versioning, I use PVCS or SourceSafe -- where it's appropriate. In the database, I will use relational views.

>Why not work with different versioned classes derived from a base employee
>class for the next year?

How do I ensure that all of my users exercise the correct code? If they run last year's version, my company might not be in full compliance with federal statutes.

Relationally, when I create the views to support last year's applications, they will still have to conform to all of today's constraints as declared on the database.
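The following sketch shows one way this can look in practice, assuming Python's sqlite3 module and a made-up wage rule (hourly_wage >= 7.0) standing in for a statutory constraint. SQLite views are read-only, so an INSTEAD OF trigger is assumed here to make the old view writable; other SQL products handle updatable views differently.

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.executescript("""
    -- Today's constraint is declared once, on the base table.
    CREATE TABLE employee (
        emp_id      INTEGER PRIMARY KEY,
        name        TEXT NOT NULL,
        hourly_wage REAL NOT NULL CHECK (hourly_wage >= 7.0)
    );

    -- View kept for last year's applications.
    CREATE VIEW employee_last_year AS
        SELECT emp_id, name, hourly_wage FROM employee;

    -- Route inserts on the view to the base table.
    CREATE TRIGGER employee_last_year_insert
    INSTEAD OF INSERT ON employee_last_year
    BEGIN
        INSERT INTO employee (emp_id, name, hourly_wage)
        VALUES (NEW.emp_id, NEW.name, NEW.hourly_wage);
    END;
""")

def insert_via_old_view(connection, emp_id, name, wage):
    """Return True if the insert through the old view was accepted."""
    try:
        with connection:  # commits on success, rolls back on error
            connection.execute(
                "INSERT INTO employee_last_year (emp_id, name, hourly_wage) "
                "VALUES (?, ?, ?)",
                (emp_id, name, wage),
            )
        return True
    except sqlite3.IntegrityError:
        return False

accepted = insert_via_old_view(con, 1, "Ann", 12.5)
rejected = insert_via_old_view(con, 2, "Ben", 3.0)
stored = con.execute("SELECT COUNT(*) FROM employee").fetchone()[0]
print(accepted, rejected, stored)
```

An application still running last year's code path cannot bypass the constraint, because the constraint lives in the database and not in any one version of the application.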

>> I have already seen it pointed out to you several times that thirty years
>> ago companies went down the path of basing their databases on specific
>> applications and it did not work out. Database management must support use
>> of the data for all applications.
>
>The fact that companies go up and down is not directly associated with the
>technical quality of the products.

This is irrelevant to the point I made. The rise and fall of companies has nothing to do with the data management principle involved.

>Just yesterday I have received some
>private information on the decline of O2, in spite of the fact that it was
>technically the best object database.

It does not surprise me that the best object database would decline. Being based on regressive, network model technology, they should decline.

>IT does change, and it changes quickly.

Ahhh, but sound fundamental principles change very slowly. It is precisely my knowledge of sound fundamental principles that has allowed me to adapt to the rapid changes in my industry.

>Object-oriented languages were not common 10 years ago.
>Today they are.

Well, I have been writing object based and object oriented programmes since 1987. Let's see.... that's 14 years. I guess that makes me an early adopter.

>It is time for object databases.

Bullshit.

>> >There is no need for normalization work, creating and maintaining tables,
>> >thinking about keys, no strings within code, no mismatch between inheritance
>> >hierarchies and tables.
>>
>> Huh? Of course there is -- either you are ignorant of basic facts or you are
>> ignoring them intentionally to try to make some kind of unsupportable case.
>> Someone still has to design the data model.
>
>No. Someone designs the class model.

Please explain the difference between a class model and a data model. A rose by any other name...

Unfortunately, as I pointed out previously, the person doing the data model will often perform that vital task in complete ignorance of sound data management principles and without the best tools that science has to offer. All because marketing assholes like you go around spouting horseshit like "There is no need for normalization work".

>Storage is taken care of by the object database.

Here, once again, you demonstrate your ignorance of the difference between logical (data model) and physical (storage).

>> >The more complex your object model is, the higher the performance advantage
>> >on inserts and navigation will be.
>>
>> Performance (physical) is completely orthogonal to data model (logical).
>
>O.K. if you think so.

I know so. Unlike you, I understand and articulate the difference between physical issues and logical issues.

>Compare the following:

It's not hard for someone as ignorant of database principles as yourself to build a straw man by comparing two horrible designs.

>What is your favourite flavour?

They both suck. What's your point?

>Version 1. will result in a monster table, totally unhandy and very
>unperformant.

Version 1 is a very poor design -- just look at all the booleans.

Of course, the performance is totally independent of the data model. You can verify this for yourself by creating the design in a variety of database brands and comparing the different performance even though the designs are identical at a logical level.

>Version 2. will get you ugly unperformant queries, ugly subselects and ugly
>outer joins. Would you like to provide a solution for "<all employees>, <and
>their possible managed department, if they are managers> WHERE <the name is
>'Badour'> <but I don't want managers of the department 'flames'>"?

Version 2 is a very poor design. If employees already have unique id's, why on earth would anyone create a separate primary key column?

I disagree with all of your conclusions regarding what I will get. I don't consider outer joins relational; I think that NULLs are Codd's big mistake.

>This, I am afraid, is the object-relational-mismatch.

Huh? You didn't mention anything about objects in the above.

The object-relational mismatch exists because relational languages are so much higher-level than current object languages.

>Now where does performance come in?

At the physical level. If I frequently access all of the employee tuples that report to a given manager, I should instruct the DBMS to physically cluster the employee data with the manager data. If my queries frequently result in point lookups of managers of employees, I should instruct the DBMS to physically store a pointer to the appropriate manager data with the employee data. Or I should instruct the DBMS to use a hashed index on the manager primary key. Or I should, at least, instruct the DBMS to use some index on the manager primary key.

Lots of physical possibilities exist for any given logical design -- even a bad one such as you gave above.
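A small sketch of that separation, assuming Python's sqlite3 module and a hypothetical employee table with a manager_id column. The query text and its result stay the same; only the access path reported by EXPLAIN QUERY PLAN changes once an index is added. The exact wording of the plan varies between SQLite versions, so the code only checks whether the index name appears in it.

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.execute(
    "CREATE TABLE employee ("
    " emp_id INTEGER PRIMARY KEY, name TEXT NOT NULL, manager_id INTEGER)"
)
con.executemany(
    "INSERT INTO employee VALUES (?, ?, ?)",
    [(i, f"emp{i}", i % 10) for i in range(1, 1001)],
)

QUERY = "SELECT emp_id FROM employee WHERE manager_id = ? ORDER BY emp_id"

def reports_of(connection, manager_id):
    """Logical result: the employees reporting to one manager."""
    return [row[0] for row in connection.execute(QUERY, (manager_id,))]

def plan(connection, manager_id):
    """Physical access path, as described by SQLite (last column is the detail text)."""
    rows = connection.execute("EXPLAIN QUERY PLAN " + QUERY, (manager_id,))
    return " | ".join(row[-1] for row in rows)

before_rows, before_plan = reports_of(con, 3), plan(con, 3)

# A purely physical change: no table, column, or query is altered.
con.execute("CREATE INDEX employee_by_manager ON employee (manager_id)")

after_rows, after_plan = reports_of(con, 3), plan(con, 3)

print(before_rows == after_rows)
print(before_plan)
print(after_plan)
```

Clustering, hashing and stored pointers are further physical options that SQLite does not expose in this form; the index is simply the one that is easy to demonstrate here.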

>In the choice of an adequate table model.
>There is no one-size-fits-all table model to map objects to relational
>databases.

Actually, there is: Object Classes = Domains.
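One way to read "Object Classes = Domains" is that a class supplies the type of a column, not the shape of a table. The sketch below approximates this with the adapter and converter hooks of Python's sqlite3 module. The Money class, the "money" declared type and the text encoding are all assumptions for the example. SQLite has no real user-defined types, so this only maps a representation and does not give the DBMS the class's operators.

```python
import sqlite3
from dataclasses import dataclass

@dataclass(frozen=True)
class Money:
    """Hypothetical application class used as the domain of a column."""
    cents: int
    currency: str

def adapt_money(value: Money) -> str:
    # Class instance -> stored representation.
    return f"{value.cents};{value.currency}"

def convert_money(raw: bytes) -> Money:
    # Stored representation -> class instance (converters receive bytes).
    cents, currency = raw.decode("ascii").split(";")
    return Money(int(cents), currency)

sqlite3.register_adapter(Money, adapt_money)
sqlite3.register_converter("money", convert_money)

# PARSE_DECLTYPES makes sqlite3 pick the converter from the declared column type.
con = sqlite3.connect(":memory:", detect_types=sqlite3.PARSE_DECLTYPES)
con.execute(
    "CREATE TABLE payroll (emp_id INTEGER PRIMARY KEY, salary money NOT NULL)"
)
con.execute("INSERT INTO payroll VALUES (?, ?)", (1, Money(500000, "CAD")))

salary = con.execute(
    "SELECT salary FROM payroll WHERE emp_id = 1"
).fetchone()[0]
print(salary)
```

The table remains an ordinary relation of employee facts; the class appears only as the type of the salary attribute.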

>They just don't match.

Well, maybe if you bothered to educate yourself a little, you would see how well they do match.

>> As
>> I mentioned in an earlier message, no reason exists why a relational
>> database cannot have equivalent performance characteristics to an OO
>> database.
>
>One reason among others:

Either you are totally ignorant of the distinction between logical and physical or you intentionally ignore that difference in an attempt to mislead others.

>[Value 10 discussion skipped]
>Object databases store the value 10 and deliver back the stored value 10.
>...hopefully even for Excel spreadsheets...
>Where is the problem?

An OID is not a value. Information stored in containment relationships is not a value either.

>> >Why care about internal linkage, if it does not provide any needed
>> >information?
>>
>> Why have OID's at all? They provide no information whatsoever beyond the
>> values stored in the database. You are confusing physical (internal linkage
>> and pointers) with logical (uniquely identifying attributes). OID's,
>> collections, arrays etc. expose the internal linkage to users, whereas
>> RDBMS's do not expose the internal pointers and linkages at all.
>
>Wrong.
>OIDs are (hopefully) not exposed to users.

Wrong. OIDs are exposed to users whether they see the actual representation or not. When the OODBMS forces one to access a contained collection through the containing object class, it exposes the linkage to the user.

>RDBMS expose primary keys and foreign keys, don't they?

RDBMSs expose attribute values. Some of those attribute values are useful for identifying the data. For instance, the values of my SSN and my SIN both serve to identify me and can identify data about me. Likewise for the value of my driver's license number.

Foreign keys are constraints. Are you saying that some kind of database can enforce constraints without exposing them to the user?!?

>You even need to take care of them in your queries.

Not necessarily -- a DBA can take care of them for me in a view. You frequently make statements that are simply not true.

>> >Why care about the value 10?
>>
>> Because RDBMSs locate information by uniquely identifying VALUES, and not by
>> any kind of implementation-dependent pointer like every single OODBMS does!
>
>Internal pointers are not visible to the user, so why do you care?

Because they are visible to the user. If I have to navigate to my data, then the pointer is exposed regardless of whether I ever look at any representation of that pointer directly.

You are assuming a don't ask don't tell policy that is simply fallacious.
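To make the navigation point concrete, here is a rough plain-Python sketch. The Employee class, the sample phone numbers and the helper functions are hypothetical. In the first form, the containment link from employee to phones is part of what the user must know in order to ask a question. In the second, every fact is a tuple of values and no access path is privileged.

```python
from dataclasses import dataclass, field

# Network style: a phone number is reachable only through its containing object.
@dataclass
class Employee:
    name: str
    phones: list = field(default_factory=list)

staff = [
    Employee("Badour", ["555-0100"]),
    Employee("Rosenberger", ["555-0199"]),
]

def owner_by_navigation(employees, phone):
    # The caller must know that phones hang off employees and walk that link,
    # even though no pointer value is ever displayed.
    for employee in employees:
        if phone in employee.phones:
            return employee.name
    return None

# Relational style: the same facts as a set of (name, phone) tuples.
employee_phone = {("Badour", "555-0100"), ("Rosenberger", "555-0199")}

def owner_by_value(relation, phone):
    # A restriction on one attribute value.
    return next((name for name, p in relation if p == phone), None)

def phones_by_value(relation, name):
    # The "other direction" is the same kind of restriction.
    return sorted(p for n, p in relation if n == name)

print(owner_by_navigation(staff, "555-0199"))
print(owner_by_value(employee_phone, "555-0199"))
print(phones_by_value(employee_phone, "Badour"))
```

Changing which object contains which would break owner_by_navigation, while the value-based functions depend only on the attributes of the relation.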

>Do the users of relational databases bother about the memory handling the
>engine uses to allocate memory to evaluate queries?

Nope. DBA's do. I have never known a user to know or care about such issues.

This is as it should be. The requirement for highly specialized expertise should be limited and, whenever possible and practical, should be automated by the DBMS.

>No doubt, this is true.

Of course, it's true! Everything I have said is true. Relational databases have fundamental advantages over network model databases including network model databases with fresh new names like OODBMS.

RDBMSs have these advantages because they adhere to sound data management principles. Even though SQL database vendors frequently stray from these principles, they adhere to enough of them that they will always have advantages over OODBMSs -- until they fall prey to the crap about Objects = Tables.

>> Since the situation under discussion implies that the implementation
>> specific work is done by two or more different database vendors, and since
>> the OODBMS exposes those implementation differences to the user through
>> OID's and other means, the differences will be anything but transparent to
>> the user when they attempt to change brands.
>
>The difference will indeed be awful, as soon as users are using proprietary
>APIs. However OIDs are not the problem, since they are handled transparently
>by most vendors.

So every vendor transparently handles every other vendor's OIDs? Get real!

>OID-dependent code is a bad habit, but possible, if vendors
>expose them.

By extension, any code relying on a reference to an object variable is bad code.

>> Since you have already stated that the design will flow directly from the
>> application based on the performance needs of a single application, the
>> differences among different OODBMS vendors' implementation choices will
>> drive a need for completely different application designs.
>
>Not necessarily, depending on the API features used.

First you make statement A to support point B. Then you change things to say not necessarily A, but you neglect to realise or admit that this invalidates B. You do this repeatedly.

>> An application that relies on a specific type of pointer swizzling and
>> caching can break under a different DBMS implementation.
>
>The dinosaur that uses pointer swizzling is dead.

Oh, really? So my C++ or Java code has to deal directly with OID's because the language's native pointer or reference type is not supported by your product? Persisted objects are not first-class objects in the supported language?

>> If the DBMS vendor supports a limited number of programming languages (just
>> C++ or C++ and java, for instance) the customer who expands through
>> acquisition and inherits a few million lines of mission-critical COBOL and
>> Fortran code will have one hell of a time doing the integration. Even if
>> they assume ownership of a website written using an unsupported language,
>> ASP or Perl for instance, they will have one hell of a time. Even a major
>> smalltalk application will stop them in their tracks.
>
>No doubt, if the object database has no support for the respective
>programming language, you are lost. Using object databases, you also take
>the risk that the specific language will be abandoned by the vendor. This
>has happened.
>
>The situation among object databases is not as beautiful as it could be.
>This will change.

Yet another bald statement with no basis in fact.

>> If you honestly believe that any OODBMS can transparently support C++, Java,
>> VB and Perl (not to mention COBOL and Fortran) and can do so through several
>> revisions to the applications without any kind of administration or
>> maintenance and without any kind of data normalization or intentional
>> design, I've got a bridge in lower Manhattan you might be interested in
>> purchasing....
>
>No doubt, support for SQL is currently spread over more platforms than any
>object database standard. Give us some time to catch up.

Some of these products have been around for a decade. How much time do they need?

Received on Sun Jul 22 2001 - 01:27:50 CEST
