Re: Something new for the New Year (2008).

From: Rob <rmpsfdbs_at_gmail.com>
Date: Thu, 3 Jan 2008 10:52:14 -0800 (PST)
Message-ID: <b85e6758-a76d-4d43-abd1-d77013f7ae7a_at_d4g2000prg.googlegroups.com>


On Jan 2, 8:19 pm, Marshall <marshall.spi..._at_gmail.com> wrote:
> On Jan 1, 2:45 pm, Rob <rmpsf..._at_gmail.com> wrote:
>
> > If you have a chance, please take a look on my website at this page:
>
> >http://www.sfdbs.com/toplevel/fasttrack/fasttrack.shtml
>
> > You will see a completely new way to use foreign keys
> > to represent relationships in relational databases.
>
> What is wrong with the existing way?
>
When it comes to information technology, "right" and "wrong" are dangerously loaded terms. (Unless one technology fails to produce the "right" result.) A more appropriate axis for comparison is productivity, sometimes measured in net revenue per employee. The downside of "the existing way(s)" is that knowledge workers and application programmers of ordinary computer skills cannot seem to master database design, querying and response interpretation without expensive SQL/database experts. If we could make database usage second nature to these individuals, the total cost of database ownership would be lower, and therefore, productivity higher. This representation is a first step in that direction, but the whole SQL database paradigm needs to be rethought if we are going to both preserve investments in current databases and database practices and open the benefits of database usage directly to a wider, less trained (and therefore less expensive) workforce.
>
>
> Looking at your webpage, I didn't see any problem statement.
> What problem is this technique intended to solve? What makes
> it better than existing techniques?
>
For now, just think of it as a new, interesting technology. I've worked with it since 1997: I could provide a long list here of its advantages, and I will add material to the website as fast as I can. cdt is a powerful group. It was (is) my intent to present the basic technology here and see where cdt people may choose to take it. I don't want to impose my high-level interpretations on it without giving people the chance to see it and think about it in its most basic form.
>
> > Working with this new representation has forever changed the way I
> > look at and  think about relational databases.
>
> That's a dramatic statement, but not a very motivating one.
> Is the new way you look and think about relational databases
> better or worse than the way you used to look at them?
> How can we tell?
>
> Marshall
>
I did not want to go that far in the initial presentation, but since you asked, here's my abbreviated answer.

Many fields that possess a mathematical underpinning exhibit the principle of duality. (See http://en.wikipedia.org/wiki/Duality .) For example, in graph theory, every theorem about vertices has a corresponding and equivalent dual theorem about edges. In Linear Programming, for every minimization problem, there is an equivalent dual maximization problem in which the min of the first equals the max of the dual. (The minimization problem is the dual of the maximization problem: They are duals of one another.)

For 35+ years, the conversation about relational database technology seems to have been exclusively about "things" or "entities" or "objects": Relationships and relationship representation are virtually invisible. (For example, casting Junction Tables as "association entities".) Is there a dual, relationship-oriented approach?

What the "new representation" does is simplify (and unify) relationship representation, making it possible to partition every relational database into two constituent parts, a data part and a disjoint "structure" part. (See
http://www.sfdbs.com/toplevel/fasttrack/fasttrack.shtml .) This makes it possible to "think about" database design and querying in terms of data AND structure. (This is very similar to graph theory duality.)

Here's a real simple example of the difference. In the FROM clause of a conventional SQL SELECT, the list of relations specifies a search space consisting of a product of relations. That product space is restricted by predicates in the JOIN clauses and/or the WHERE clause. In the dual, relationship approach, the FROM clause equivalent would specify a list of relationships: Each relationship implies the parent and child relations plus the attribute-data-free structure relations. Implicitly, these relationships "compose" to define a search space that accounts for the join predicates. The WHERE clause equivalent contains no joins because the join operations are implicit in the relationship structures and their composition. The search space is smaller. Without joins and (if you follow the notion of structure independence) with no foreign keys in entity relations, relational modeling and query formulation are conceptually simpler. Perhaps simple enough to lower the bar to entry!
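
To make the contrast concrete, here is a rough sketch in Python using the standard sqlite3 module. It is only an illustration under stated assumptions, not the representation described on the website: the table names, the RELATIONSHIPS descriptor, and the select_via helper are all hypothetical, and the "structure" relation is assumed to be a table holding nothing but key pairs. Note that the joins still run underneath; the sketch only shows how they could be derived from a named relationship so that the query writer's WHERE clause contains no join predicates.

```python
import sqlite3

# Hypothetical descriptor: one named relationship and the relations it implies.
RELATIONSHIPS = {
    "works_in": {
        "child": "emp_data", "child_key": "emp_id",
        "parent": "dept_data", "parent_key": "dept_id",
        "structure": "works_in",
    },
}

def build_db():
    con = sqlite3.connect(":memory:")
    con.executescript("""
        -- Conventional design: the child relation carries the foreign key.
        CREATE TABLE dept (dept_id INTEGER PRIMARY KEY, name TEXT);
        CREATE TABLE emp  (emp_id INTEGER PRIMARY KEY, name TEXT,
                           dept_id INTEGER REFERENCES dept(dept_id));
        -- Assumed data + structure design: entity relations hold no foreign
        -- keys; a separate structure relation holds only key pairs.
        CREATE TABLE dept_data (dept_id INTEGER PRIMARY KEY, name TEXT);
        CREATE TABLE emp_data  (emp_id INTEGER PRIMARY KEY, name TEXT);
        CREATE TABLE works_in  (emp_id INTEGER PRIMARY KEY, dept_id INTEGER);
    """)
    depts = [(1, "Sales"), (2, "Ops")]
    emps = [(10, "Ann", 1), (11, "Bob", 2), (12, "Cy", 1)]
    con.executemany("INSERT INTO dept VALUES (?, ?)", depts)
    con.executemany("INSERT INTO emp VALUES (?, ?, ?)", emps)
    con.executemany("INSERT INTO dept_data VALUES (?, ?)", depts)
    con.executemany("INSERT INTO emp_data VALUES (?, ?)", [e[:2] for e in emps])
    con.executemany("INSERT INTO works_in VALUES (?, ?)", [(e[0], e[2]) for e in emps])
    return con

def conventional_query(con, dept_name):
    # The query writer lists relations and spells out the join predicate.
    return con.execute(
        "SELECT e.name, d.name FROM emp e JOIN dept d ON e.dept_id = d.dept_id "
        "WHERE d.name = ? ORDER BY e.name", (dept_name,)).fetchall()

def select_via(con, rel_name, columns, where="1=1", params=()):
    # The query writer names a relationship; the join predicates are
    # generated from the descriptor (c = child relation, p = parent relation).
    r = RELATIONSHIPS[rel_name]
    sql = (
        f"SELECT {columns} FROM {r['structure']} AS s "
        f"JOIN {r['child']} AS c ON c.{r['child_key']} = s.{r['child_key']} "
        f"JOIN {r['parent']} AS p ON p.{r['parent_key']} = s.{r['parent_key']} "
        f"WHERE {where}"
    )
    return con.execute(sql, params).fetchall()

con = build_db()
print(conventional_query(con, "Sales"))
print(sorted(select_via(con, "works_in", "c.name, p.name", "p.name = ?", ("Sales",))))
```

Both calls should return the same employee/department pairs. The only difference is who writes the join: the person forming the query in the first case, the relationship descriptor in the second.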

Some (like JOG) interpret the data structures (relations) of a relational database as sets of "true" logical statements. I personally don't see how that benefits the database designer/user, but if they prefer that approach, they certainly should use it. However, I do not see an obvious "dual" in this interpretation that corresponds to the "structure" dual in the data + structure formulation.

Rob

P.S. I don't want to start a religious war here, but NULL values for foreign keys (where allowed) are unambiguous. If you automatically reject any approach that utilizes NULL values, at least take the time to differentiate the interpretation of NULL values in the structure representation and NULL values for non-key attributes.

Received on Thu Jan 03 2008 - 19:52:14 CET
