Re: sql views for denomalizing

From: dawn <dawnwolthuis_at_gmail.com>
Date: 2 Aug 2005 08:22:48 -0700
Message-ID: <1122996168.609599.96180_at_g44g2000cwa.googlegroups.com>


Marshall Spight wrote:
> dawn wrote:
> > Marshall Spight wrote:
> >
> > > pick better than SQL:
> > > nested relations. This seems big to me
> >
> > yes, also the 2VL
>
> Good point! I also count this as big.
>
>
> > > pick worse than SQL:
> > > no static typing (but SQL's type system is fairly weak anyway)
> >
> > while I think there are significant benefits to strong typing, the
> > typical SQL type system provides the stick without the carrot, it
> > seems, so weak typing in the dbms is better. I'm still pretty
> > wishy-washy on typing, I realize.
>
> I agree that there are problems with the current situation, in which
> much of the value of static (not "strong") typing is thrown away
> because we generate the sql in strings at runtime, instead of
> checking them statically at compile time. But this is not sql's
> fault, per se; it's a problem with the "embedded data language"
> approach. Also I don't see how this issue means that the dbms
> should not have a powerful static type system.
>
> To my way of thinking, the type system is the most important
> component. It is *the* central component of the system, and its
> design will have a huge impact on every other part of the system.

On the one hand, I agree, but on the other hand I don't always like the implementation of such. Java is ok for a type system, but I haven't landed on one that really sings for me yet.
>
> > > only one key
> > It doesn't really turn out that way. The key may be multiple parts,
> > each of which can have a separate name, but together are tossed into
> > the ID for the record as well (under the default name for the key in
> > the given implementation)
> >
> > And one can access data through alternate keys, of course.
>
> If you can have multiple keys, that's great.

I'm not sure if I said that right. Yes, you can have both what are called multi-part keys and alternate keys, but one "candidate key" must still be designated as the primary key.

> I still don't understand
> the apparent significance of the primary key in pick. This may be a
> complete non-issue; simply a matter of cultural emphasis.

Pick uses functions (a specific flavor of relations, as has been discussed before) and functions require what is effectively a candidate key that is designated as the primary key which is then used for hashing. I wouldn't doubt if some of the newer grid or simply multiprocessors can outperform pick, but the algorithms for storage and retrieval are zippy quick on a single processor. That is about as technical as I can get on the implementation details, however What the user gets is automatic great performance when accessing anything by its primary key.

>
> > If there is some reason that a data repository requires an API that is
> > used by more than one top level organization, then that API needs more
> > constraint handling than a pick CRUD API has. If all applications using
> > the dbms api for crud are owned by the same top level company, then the
> > company has the ability to manage that, with more flexibility than if
> > it were a sql-dbms. That flexibility has trade-offs.
>
> I don't believe it is sufficient to say "within a single company."
> It seems limited by the size of the dev organization, and it strikes
> me that above, I dunno, 5 or 10 or so, you can't manage integrity
> effectively in applications.

All you need is a services-style architecture with an API that provides the crud services with the integrety contraints applied. If that is the only access path for a developer (and especially if it is the easiest and most reliable), then that is what they will use just as they will go through a dbms api if that is what is provided. If a company is unable to do this, then they are unlikely to be able to require that everyone use the same implementation of a dbms either.

> Not if integrity is really important.

I disagree, although I agree that there are types of integrity that you are more likely to get with one structure and other types you are more likely to get with the other.

> If it's not so important, then go ahead. Seriously; I've seen
> applications where they are happy with most-rows-correct.

There are many different ways to look at "correct" data. I have listed before some of the ways in which a SQL-DBMS ends up with poor metadata + data quality compared to pick. I will grant, however, that it is possible for a developer who is permitted to directly use the dbms api (instead of a crud service) to remove a record and not change a reference to it.

>
> > > I'm still not clear on how pick handles many-to-many.
> >
> > multivalued lists of foreign keys in each file.
>
> But this leads to duplication of data, does it not?

Yes, I thought I said that, but at least intended to. These return links are not required, but enhance performance and ease. An index duplicates data too. If a crud api handles this behind the scenes this then why should I care?

> Unless there
> is some virtual-field-trick in here somehow. If it actually
> does require the data to be duplicated,

no such requirement. The fact is that you can have multivalued lists and these are exploited to make things easier for the user of the dbms.

> then I'd count this
> as a significant advantage to SQL. SQL's many-to-many handling
> is ... just ... beautiful.

Beauty is in the eye of the beholder. Think of a web page with a list of students in a class. Click on a student and you see all of their classes with the attributes of each class. Click on the class you came from and you are back to that first list. There is no need for an intermediate web page to cross references these two. They can store all of their references without having helper tables to do so. From a logical modeling perspective, this is much more elegant, don't you think?

> Better than anything OOP has, for
> example.

OK, maybe. cheers! --dawn

>
> Marshall
Received on Tue Aug 02 2005 - 17:22:48 CEST

Original text of this message