Re: Something new for the New Year (2008).

From: Marshall <marshall.spight_at_gmail.com>
Date: Thu, 3 Jan 2008 17:28:00 -0800 (PST)
Message-ID: <85e10fe4-7bac-4212-b06e-1b4c7be5da46_at_l6g2000prm.googlegroups.com>


On Jan 3, 10:52 am, Rob <rmpsf..._at_gmail.com> wrote:
> On Jan 2, 8:19 pm, Marshall <marshall.spi..._at_gmail.com> wrote:
> > On Jan 1, 2:45 pm, Rob <rmpsf..._at_gmail.com> wrote:
>
> > > If you have a chance, please take a look on my website at this page:
>
> > > http://www.sfdbs.com/toplevel/fasttrack/fasttrack.shtml
>
> > > You will see a completely new way to use foreign keys
> > > to represent relationships in relational databases.
>
> > What is wrong with the existing way?
>
> When it comes to information technology, "right" and "wrong" are
> dangerously loaded terms. (Unless one technology fails to produce the
> "right" result.) A more appropriate axis for comparison is
> productivity, sometimes measured in net revenue per employee.

Wow. Okay. What about the existing way does not produce as much revenue per employee as your new way? Of course, by raising this issue, you now need to include measured revenue-per-employee numbers in your response.

> The
> downside of "the existing way(s)" is that knowledge workers and
> application programmers of ordinary computer skills cannot seem to
> master database design, querying and response interpretation without
> expensive SQL/database experts.

An interesting assertion. Can you back this up with any evidence? It is not what I have observed. If anything, SQL seems to be easier to master than, say, C++ or concurrent programming, and hence available to a larger class of people. For a long time I had a lot of SQL questions in my standard interview, and lots of people did quite well on them.

> If we could make database usage
> second nature to these individuals, the total cost of database
> ownership would be lower, and therefore, productivity higher. This
> representation is a first step in that direction, but the whole SQL
> database paradigm needs to be rethought if we are going to both
> preserve investments in current databases and database practices, and,
> open the benefits of database usage directly to a wider, less trained
> (and therefore less expensive) workforce.
>
> > Looking at your webpage, I didn't see any problem statement.
> > What problem is this technique intended to solve? What makes
> > it better than existing techniques?
>
> For now, just think of it as a new, interesting technology. I've
> worked with it since 1997: I could provide a long list here of its
> advantages, and I will add material to the website as fast as I can.
> cdt is a powerful group. It was (is) my intent to present the basic
> technology here and see where cdt people may choose to take it. I
> don't want to impose my high-level interpretations on it without
> giving people the chance to see it and think about it in its most
> basic form.

That's nice and all, but having been around for as long as I have, I tend to distrust claims for new technologies that come without any description of benefits. There's a lot of stuff out there competing for my attention; I necessarily can only look closely at a few things. So I'm interested in seeing the bottom line as quickly as possible, and the bottom line is: how does this benefit me?

> > > Working with this new representation has forever changed the way I
> > > look at and think about relational databases.
>
> > That's a dramatic statement, but not a very motivating one.
> > Is the new way you look and think about relational databases
> > better or worse than the way you used to look at them?
> > How can we tell?
>
> I did not want to go that far in the initial presentation, but since
> you asked, here's my abbreviated answer.
>
> Many fields that possess a mathematical underpinning exhibit the
> principle of duality. (See http://en.wikipedia.org/wiki/Duality.) For
> example, in graph theory, every theorem about vertices has a
> corresponding and equivalent dual theorem about edges. In Linear
> Programming, for every minimization problem, there is an equivalent
> dual maximization problem in which the min of the first equals the max
> of the dual. (The minimization problem is the dual of the maximization
> problem: They are duals of one another.)

I'm immensely attracted to the duality principle on aesthetic grounds. But I try to be realistic. Gratzer's "General Lattice Theory" says this on page 3:

"It is hard to imagine that anything as trivial as the Duality Principle
could yield anything profound, and it does not; but it can save a lot of work."

> For 35+ years, the conversation about relational database technology
> seems to have been exclusively about "things" or "entities" or
> "objects": Relationships and relationship representation are virtually
> invisible. (For example, casting Junction Tables as "association
> entities".) Is there a dual, relationship-oriented approach?

The relational model treats relationships in a first class way. I suppose what you are saying above applies to some E/R or other modeling approach? I haven't myself studied any modeling disciplines per se, but as far as the RM goes, it models our ideas about real-world entities and our ideas about real-world relationships in exactly the same way: as mathematical relations. I don't see anything about a many-to-many table that is any more "invisible" than a customers table.
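
To make the point above concrete, here is a minimal sketch using Python's standard sqlite3 module. The table names (customers, products, purchases) and the sample rows are invented for illustration and do not come from the linked page. It only shows that a many-to-many relationship is declared, catalogued and queried in the same way as an entity table.

```python
import sqlite3

# Hypothetical schema: two "entity" tables and one many-to-many table.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE customers (customer_id INTEGER PRIMARY KEY, name  TEXT NOT NULL);
    CREATE TABLE products  (product_id  INTEGER PRIMARY KEY, title TEXT NOT NULL);
    CREATE TABLE purchases (
        customer_id INTEGER NOT NULL REFERENCES customers(customer_id),
        product_id  INTEGER NOT NULL REFERENCES products(product_id),
        PRIMARY KEY (customer_id, product_id)
    );
    INSERT INTO customers VALUES (1, 'Ann'), (2, 'Bob');
    INSERT INTO products  VALUES (10, 'Widget'), (20, 'Gadget');
    INSERT INTO purchases VALUES (1, 10), (1, 20), (2, 10);
""")

# The catalog records all three the same way: each one is a table.
catalog = conn.execute(
    "SELECT name, type FROM sqlite_master "
    "WHERE name IN ('customers', 'products', 'purchases') ORDER BY name"
).fetchall()

# The relationship is queried with the same SELECT used for an entity table.
purchase_rows = conn.execute(
    "SELECT customer_id, product_id FROM purchases "
    "ORDER BY customer_id, product_id"
).fetchall()

print(catalog)
print(purchase_rows)
```

Nothing in the declaration or the query marks purchases as a different kind of object from customers; both are simply relations.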

> What the "new representation" does is simplify (and unify)
> relationship representation, making it possible to partition every
> relational database into two constituent parts, a data part and a
> disjoint "structure" part. (See http://www.sfdbs.com/toplevel/fasttrack/fasttrack.shtml.) This makes
> it possible to "think about" database design and querying in terms of
> data AND structure. (This is very similar to graph theory duality.)

Hmmm. You say "simplify", but it seems to me the schemas in your figures 4 and 5 are more complicated than they need to be.

> Here's a real simple example of the difference. In the FROM clause in
> a conventional SQL SELECT, the list of relations specifies a search
> space consisting of a product of relations. That product space is
> restricted by predicates in the JOIN clauses and/or the WHERE clause.
> In the dual, relationship approach, the FROM clause equivalent would
> specify a list of relationships: Each relationship implies the parent-
> and child relations plus the attribute-data-free structure relations.
> Implicitly, these relationships "compose" to define a search space
> that accounts for the join predicates. The WHERE clause equivalent
> contains no joins because the join operations are implicit in the
> relationship structures and their composition. The search space is
> smaller. Without joins and (if you follow the notion of structure
> independence) with no foreign keys in entity relations, relational
> modeling and query formulation is conceptually simpler. Perhaps simple
> enough to lower the bar to entry!

I really don't see how this is simpler. I *can* see some ways in which it is more complicated, however. Your schema is more complicated, for one thing. I am skeptical that your approach will yield the simpler queries you claim. Would you care to write some queries both ways so we could compare them?
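
As a starting point for such a comparison, here is a rough sketch in Python with the standard sqlite3 module. Everything in it is an assumption made for illustration: the dept/emp tables, the sample rows, and in particular the key-only emp_dept table, which is only a guess at what a separate "structure" relation might look like in plain SQL. It is not taken from the linked page and does not use any relationship-oriented query syntax, since none has been shown.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE dept (dept_id INTEGER PRIMARY KEY, dept_name TEXT NOT NULL);

    -- Conventional form: the foreign key lives in the child relation.
    CREATE TABLE emp_fk (
        emp_id   INTEGER PRIMARY KEY,
        emp_name TEXT NOT NULL,
        dept_id  INTEGER NOT NULL REFERENCES dept(dept_id)
    );

    -- Assumed alternative: the child holds data only, and a key-only
    -- relation (hypothetical name emp_dept) holds the relationship.
    CREATE TABLE emp (emp_id INTEGER PRIMARY KEY, emp_name TEXT NOT NULL);
    CREATE TABLE emp_dept (
        emp_id  INTEGER PRIMARY KEY REFERENCES emp(emp_id),
        dept_id INTEGER NOT NULL REFERENCES dept(dept_id)
    );

    INSERT INTO dept     VALUES (1, 'Sales'), (2, 'Research');
    INSERT INTO emp_fk   VALUES (1, 'Ann', 1), (2, 'Bob', 2), (3, 'Cy', 1);
    INSERT INTO emp      VALUES (1, 'Ann'), (2, 'Bob'), (3, 'Cy');
    INSERT INTO emp_dept VALUES (1, 1), (2, 2), (3, 1);
""")

# "Who works in Sales?" with the foreign key in the child: one join.
rows_fk = conn.execute("""
    SELECT e.emp_name
    FROM emp_fk e JOIN dept d ON d.dept_id = e.dept_id
    WHERE d.dept_name = 'Sales'
    ORDER BY e.emp_name
""").fetchall()

# The same question with the relationship factored out: two joins.
rows_struct = conn.execute("""
    SELECT e.emp_name
    FROM emp e
    JOIN emp_dept s ON s.emp_id  = e.emp_id
    JOIN dept d     ON d.dept_id = s.dept_id
    WHERE d.dept_name = 'Sales'
    ORDER BY e.emp_name
""").fetchall()

print(rows_fk)
print(rows_struct)
```

Both queries return the same rows, but under this reading the factored-out schema has one more table and the query one more join. Whether a relationship-oriented FROM clause would hide that extra join is exactly what a side-by-side comparison would have to show.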

Marshall

Received on Fri Jan 04 2008 - 02:28:00 CET
