Re: Clean Object Class Design -- What is it?

From: Bob Badour <bbadour_at_golden.net>
Date: Thu, 19 Jul 2001 21:04:32 -0400
Message-ID: <uAL57.176$eL2.48097336_at_radon.golden.net>


>>>It is not necessarily SQL that I would blame in this particular case. Today,
>>>check constraints are allowed to depend upon single row only, because it is
>>>pretty much clear that touching a row requires revalidating just the check
>>>constraints on that one row (okay, tuple, if you insist).
>>
>>You start by claiming that SQL is not necessarily to blame and then you
>>introduce specific SQL syntax and limitations?!?
>
>Not at all. The ease of checking if a tuple attributes meet a predicate
>condition has nothing to do with sql.

"Check" constraints are SQL syntax. The limitation that they apply to a single tuple is an SQL limitation.

If a constraint refers only to a single tuple, then the DBMS need consider no other data. If a constraint refers to multiple tuples, then the DBMS needs to consider more than a single tuple.
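
To make the distinction concrete, here is a minimal Python sketch. The employee and department data and both function names are invented for the illustration; they do not come from any particular DBMS.

```python
# Hypothetical rows of an "employee" relation: (emp_id, dept_id, salary).
employees = {
    (1, 10, 50000),
    (2, 10, 62000),
    (3, 20, 58000),
}
departments = {10, 20}  # hypothetical set of valid dept_id values


def single_tuple_ok(row):
    """Constraint on one tuple: salary must be positive. Needs no other data."""
    _emp_id, _dept_id, salary = row
    return salary > 0


def multi_tuple_ok(rows, new_row, valid_depts):
    """Constraints spanning tuples: emp_id is unique and dept_id must exist.

    Checking these requires looking beyond the new tuple itself.
    """
    emp_id, dept_id, _salary = new_row
    unique = all(existing[0] != emp_id for existing in rows)
    referenced = dept_id in valid_depts
    return unique and referenced


candidate = (4, 30, 45000)
print(single_tuple_ok(candidate))                         # checked in isolation
print(multi_tuple_ok(employees, candidate, departments))  # needs the other data
```

The first check looks only at the candidate tuple; the second cannot be answered without consulting other tuples.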

>>>Foreign key constraints are more complicated, but again, there is more economic
>>>solution than just a full table scan per each update transaction. Now, do you
>>>know the solution in general case, that is easy to implement, and not expensive
>>>at runtime?
>>
>>First, using functional dependencies, one determines the minimal set of
>>constraints one must check to ensure coverage of all constraints.
>
>Are functional dependencies a part of schema definition?

No, but if you express your constraints as relational expressions, the DBMS can derive the functional dependencies.

>Are you aware that it is expensive to
>infer dependencies from the set of the data?

Why would the DBMS even attempt this? It would infer incorrect (incidental) dependencies simply from the state of the data.
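
A small Python sketch of how an incidental dependency can appear in a snapshot of the data. The `fd_holds` helper and the sample rows are hypothetical, written only for this illustration.

```python
def fd_holds(rows, lhs, rhs):
    """Return True if the functional dependency lhs -> rhs holds in `rows`.

    `rows` is a list of dicts; `lhs` and `rhs` are tuples of attribute names.
    This only tests the current state of the data, not the declared rules.
    """
    seen = {}
    for row in rows:
        key = tuple(row[a] for a in lhs)
        value = tuple(row[a] for a in rhs)
        if seen.setdefault(key, value) != value:
            return False
    return True


# Hypothetical sample: today every employee in a city happens to share a dept.
staff = [
    {"emp": "ann", "city": "Toronto", "dept": "sales"},
    {"emp": "raj", "city": "Toronto", "dept": "sales"},
    {"emp": "lee", "city": "Ottawa", "dept": "audit"},
]

incidental = fd_holds(staff, ("city",), ("dept",))  # holds, but only by accident

# One perfectly legal new row shows it was never a rule.
staff.append({"emp": "kim", "city": "Ottawa", "dept": "sales"})
still_holds = fd_holds(staff, ("city",), ("dept",))  # no longer holds
```

An optimizer that had relied on city -> dept after the first check would have been wrong as soon as the fourth row arrived.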

>>Second, one constructs a relational expression that fully describes the
>>update to perform including all constraint checks.
>
>Relational expressions describe the static view of the world. How would you
>define the "update" expression: before and after conditions?

By extending the meaning of relational expression to include dynamism.

>>Third, one applies known mathematical identities to transform the expression
>>into a well-defined canonical form.
>
>Is there a well known canonical form -- the DNF of relational calculus? If there
>is, please, make sure that the algorithms that transform arbitrary expression
>into it have desired performance.

Why wouldn't they? It doesn't take long for a computer to rearrange a few symbols. Some might argue that this is what computers do best.

>>Fourth, one applies known mathematical identities to transform the
>>expression into logical equivalents that might improve performance. (For
>>instance, pushing restrictions through join criteria onto as many pre-join
>>relations as possible or simply permuting the order of sub-expressions.)
>
>I don't understand. Away from canonical form? Isn't there a combinatorial
>explosion that awaits us just around the corner?

See my earlier comment regarding dynamic programming. Does your revision control system die an explosive death of combinations? Or does it do an adequate job of finding the differences?
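
As a rough illustration of why dynamic programming tames the search, here is a minimal Python sketch of choosing a left-deep join order. The cardinalities, selectivities, and cost model (sum of estimated intermediate result sizes) are assumptions made up for the example, not any product's optimizer.

```python
from itertools import combinations

# Hypothetical cardinalities and pairwise join selectivities.
cardinality = {"A": 1000, "B": 100, "C": 10}
# A pair missing from this table is treated as a cross product (selectivity 1).
selectivity = {frozenset("AB"): 0.01, frozenset("BC"): 0.1}


def joined_size(left_set, left_size, right):
    """Estimated rows after joining the relations in left_set with `right`."""
    size = left_size * cardinality[right]
    for name in left_set:
        size *= selectivity.get(frozenset((name, right)), 1.0)
    return size


def best_left_deep_plan(relations):
    """Dynamic programming over subsets: keep only the cheapest plan per subset."""
    # best[subset] = (total cost, estimated size, join order)
    best = {frozenset([r]): (0.0, float(cardinality[r]), (r,)) for r in relations}
    for k in range(2, len(relations) + 1):
        for subset in map(frozenset, combinations(relations, k)):
            for right in subset:
                rest = subset - {right}
                cost, size, order = best[rest]
                out = joined_size(rest, size, right)
                plan = (cost + out, out, order + (right,))
                if subset not in best or plan[0] < best[subset][0]:
                    best[subset] = plan
    return best[frozenset(relations)]


cost, size, order = best_left_deep_plan(["A", "B", "C"])
print(order, cost)
```

Because only the cheapest plan per subset is kept, the table has one entry per subset of relations rather than one per permutation.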

>>Fifth, one examines other declared integrity constraints looking for any
>>that might aid optimization and uses them to further transform the permuted
>>form.
>>
>>Sixth, one examines the available access paths that evaluate the given
>>expressions estimating cost looking for a minimal cost.
>>
>>The fourth and sixth steps are just an exercise in dynamic programming
>>optimization augmented by the fifth step -- many computer science texts
>>cover the topic of dynamic programming in varying levels of detail.
>
>Overall, you seem to suggest that data modification might become as complex as
>querying if we implement general constraints.

And as simple. The other option is to simply ignore integrity altogether. Is that your solution?

>Now, if you happen to lock some
>data in the process, your transactional throughput will suffer.

Assuming you use locking for your concurrency model.

>You might object that this must necessarily happen because this is logical model
>and there is no way, to overcome this semantics.

Not at all. Concurrency is an entirely independent issue from the logical model.

>Skipping constraint checking
>implies that this concurrency logic must be maintained by application. I don't
>really know how valid this speculative argument is.

I agree: your argument is entirely speculative and entirely invalid. Concurrency without integrity is worthless.

>I hope I'm not provoking "don't put words in my mouth" reaction.

You haven't. Why would it?

>>If the expense at runtime exceeds the constraints of the required solution,
>>use statically compiled access paths optimized at compile-time.
>>
>>As for blaming SQL, in five of the six steps above the solution depends on
>>known mathematical identities many of which apply for relations but do not
>>apply for SQL tables primarily due to duplicate rows, NULL etc.
>
>Strawman. Tables themselves almost never contain duplicates.

Not at all. If the DBMS does not require the tables to be relations, it cannot use identities that apply to relations but not necessarily to tables. Doing so would require an explosion in the complexity of the optimizer to determine when it can use which identity.
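
One well-known example of such an identity is the distributive law of intersection over union. The Python sketch below models relations as sets and SQL-style tables as multisets via `collections.Counter`; mapping `+` to UNION ALL and `&` to INTERSECT ALL is an assumption of the sketch.

```python
from collections import Counter

# Relations as sets: no duplicates by definition.
a, b, c = {"x"}, {"x"}, {"x"}
set_lhs = a & (b | c)
set_rhs = (a & b) | (a & c)

# Tables as bags (multisets), with UNION ALL / INTERSECT ALL semantics:
# Counter addition sums multiplicities, Counter & takes the minimum.
bag_a, bag_b, bag_c = Counter(a), Counter(b), Counter(c)
bag_lhs = bag_a & (bag_b + bag_c)
bag_rhs = (bag_a & bag_b) + (bag_a & bag_c)

print(set_lhs == set_rhs)  # the distributive law holds for sets
print(bag_lhs == bag_rhs)  # and fails once duplicates are allowed
```

On the bag side the left-hand expression yields one copy of "x" and the right-hand expression yields two, so an optimizer cannot apply the rewrite blindly.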

>What do you think
>primary keys are for?

Why do you think the relational model requires at least one candidate key? How do you think SQL's omission of this requirement affects its ability to optimize? It cannot assume that all "tables" are relations. Therefore, it cannot assume that all mathematical identities involving relations hold true.

Some identities apply to both tables and relations, which is why SQL databases can do some optimization even if not as effectively.
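
Pushing a restriction through a join onto one of its inputs appears to be one identity that survives duplicates. A minimal Python sketch follows, with a made-up bag join in which multiplicities multiply; the `join` and `restrict` helpers and the sample rows are hypothetical.

```python
from collections import Counter


def join(left, right):
    """Join two bags of (key, value) rows on key; multiplicities multiply."""
    out = Counter()
    for (k1, v1), n1 in left.items():
        for (k2, v2), n2 in right.items():
            if k1 == k2:
                out[(k1, v1, v2)] += n1 * n2
    return out


def restrict(bag, predicate):
    """Keep only the rows (with their multiplicities) that satisfy predicate."""
    return Counter({row: n for row, n in bag.items() if predicate(row)})


def key_is_one(row):
    return row[0] == 1


# Hypothetical tables; the (1, "pen") row appears twice.
orders = Counter({(1, "pen"): 2, (2, "ink"): 1})
customers = Counter({(1, "ann"): 1, (2, "raj"): 1})

restrict_after_join = restrict(join(orders, customers), key_is_one)
restrict_before_join = join(restrict(orders, key_is_one), customers)

print(restrict_after_join == restrict_before_join)
```

Both forms return the same rows with the same multiplicities here, but the second feeds fewer rows into the join.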

>>>>>>Why should relational DBMS vendors provide adequate support for domains when
>>>>>>the markets that demand them most scoff at the idea of using a relational
>>>>>>database?
>>>>>
>>>>>Is implementing user defined domains straightforward?
>>>>
>>>>I don't know. Is implementing user defined object classes straightforward?
>>>
>>>I'm not challenging your assault against OODBMS here.
>>
>>Okay, we now know what you are not doing. What are you doing?
>
>Coloring your black-and-white position.

Or trying to confuse issues. I will reiterate the question:

Why should DBMS vendors provide adequate support for domains when the markets that demand them most scoff at the idea of using a relational database?

>>>>>Why then relational
>>>>>vendors come up with kludgy object/relational implementations instead of just
>>>>>implementing java type based domains?
>>>>
>>>>Why would you limit the DBMS to Java? Why would you start by assuming that
>>>>the DBMS must dynamically allocate storage for all object values no matter
>>>>how simple?
>>>
>>>Name other popular object language. When object language defined in 3rd
>>>manifesto would be released? Until then, I would have to think how to make my
>>>RDBMS domains to natively support huge codebase of existing java classes.
>>
>>Why not the even huger codebase of C++ classes? Or SmallTalk classes? Or...?
>
>In the bookstore where I went last time, java literature weighed the same as
>the others combined.

Your point? If you go by the weight of the books in the bookstore, Access and VB applications account for the majority of the entire codebase. Don't you think that is absurd?

You are trying to measure the size of a codebase by measuring the perceived size of the market for books on a programming language. Non sequitur.

Unfortunately, when it comes right down to it, COBOL and Fortran probably still account for a larger codebase than all of the object oriented languages combined.

>>>>>Why today java in RDBMS is just calling
>>>>>static methods?
>>>>
>>>>I doubt that your statement is true for all current RDBMS implementations.
>>>>In fact, didn't Carl criticize Lee's product for actually overcoming this
>>>>limitation?
>>>
>>>If we are talking about SQL databases, then the first limitation to overcome is
>>>to make at least all primitive datatypes to be the same. And given that
>>>java.lang.String and java.sql.Timestamp are not primitive, user defined
>>>datatypes as well.
>>
>>Why make any distinction among primitive and non-primitive data types at
>>all?
>
>Because it is much easier to describe relational operators over primitive
>datatypes, as they could be certainly treated as values. It requires some effort
>to treat user-defined datatypes as values as well.

I disagree. On what do you base such an extraordinary claim?

>>>>> And in order to get your query to
>>>>>perform you have to write it a certain way.
>>>>
>>>>If your product fails to deliver adequate physical independence, complain to
>>>>your vendor -- don't spread the myth that the logical data model is the
>>>>cause.
>>>
>>>That's easier said than done.
>>
>>Huh? Say it to the vendor and it is done.
>
>Say it to the vendor and be ignored.

At least that causes no harm, which makes it infinitely preferable to incorrectly blaming the data model and being heard.

>>>Every user demand something from vendor, and
>>>vendor has no way to please them all. If vendor were aware of fundamental
>>>importance of logical model, they would never prioritize corba or xml
>>>ventures.
>>
>>I assume that vendors are aware of the fundamental importance of the logical
>>data model. I also assume vendors are aware how ignorant their customers are
>>of same.
>
>Really? Last time I spoke to their architect he was convincing me that
>adding/dropping an index must affect query result set. Never mind difference
>between key constraints and indexes.

If you can demonstrate that my assumption is wrong and that vendors are ignorant, I will accept that. I prefer to give them the benefit of the doubt, but it would explain a lot of things...

>>If the customers do not demand it -- no matter how important it is -- the
>>vendor will not deliver it. If ignorant customers are actively repulsed by
>>it, the vendor will take extraordinary measures to deprive them of it.
>
>I don't think closed customer/vendor model is healthy anyway. Look at phone
>companies, energy utilities. It needs an external flow of inventions - like
>mobile phones.

To achieve an open customer/vendor model requires interoperability. To achieve interoperability, you will need some commonality in the logical interface. As far as the DBMS goes, I suggest we start by assuming the relational data model -- it has already demonstrated huge advantages for interoperability.

>>>>If your vendor implements a redundant language such that different logically
>>>>identical queries have extremely varying performance characteristics, either
>>>>request they get rid of the redundancies of the language or request that
>>>>they optimize all equivalents equally well
>>>
>>>Can say hardly anything new here: the problem is that minority of users may be
>>>aware of and demand this, while the rest is looking for some "killer" features.
>>>Consistent optimizer is not a hot topic nowadays.
>>
>>..due to market ignorance, prejudice and misconception. Is your "solution"
>>to reinforce the ignorance, prejudice and misconception? Does that seem a
>>worthy goal to you?
>
>Am I selling short relational? Just adding some color to the picture.

Actually, you simply repeated my earlier point that vendors respond to market demands and not to market needs. I added the colour. ;-)

Received on Fri Jul 20 2001 - 03:04:32 CEST
