Re: One Ring to Bind Them

From: Marshall Spight <mspight_at_dnai.com>
Date: Sat, 10 Jul 2004 15:44:06 GMT
Message-ID: <99UHc.45677$JR4.38565_at_attbi_s54>


"Bill H" <wphaskett_at_THISISMUNGEDatt.net> wrote in message news:v%6Hc.28150$JR4.13132_at_attbi_s54...
> "Marshall Spight" <mspight_at_dnai.com> wrote...
>
> One of the primary impediments to communication is a different use of words
> and definitions.

Yes. I'm trying to learn different terminology.

> From a non-RD model perspective a relationship exists when
> the properties of two pieces of data can be defined as having an aspect or
> quality that connects them as being or belonging or working together or as
> being of the same kind <the relation of time and space>.

In other words, a relation is anything we say it is. This works for me.

> This seems obvious
> to me but I do not use the RD model, or mathematical, definition.

Actually, the mathematical definition (as best I understand) is pretty much the same thing: a "relation" is a set of pair. How do we decide what the set is? It's anything we care to say it is.

> So, there exists a relationship between vendors and invoices. Containment
> has nothing to do with that relationship, except the relationship is
> contained within the database.

I dunno. If every invoice has exactly one vendor, I think "containment" is a pretty good term to describe that relationship. (And a popular one as well.) Do you have a preferred term for "every x has an associated y?"

> What exactly is this relationship and how is it stored? I can store the
> invoice#s within the vendor in the vendor table.

I want to make sure I understand: when you say "invoice#***s***" (I especially note the "s") you mean to say that the vendors table/file/collection has an attribute/field that is a **list** of invoice numbers? Or is it a list of invoices?

> This defines a
> relationship in the MV model (although there are a number of other ways to
> do so).

As an aside: can you enumerate the different ways?

> How is this relationship going to be exposed? An example would be
> to create a virtual field definition in the vendor table so that when asked,
> will deliver the list of invoices associated with this vendor and any data
> contained within the invoice table.

The term "virtual" here; what does it mean? Is there an online reference you like that I could use to read about this?

> The phrase "...not true but not exactly false..." was intended to reflect my
> desire to avoid being argumentative or didactic. My apologies for being
> obtuse and misleading. :-)

'Tis nothing. Thank you for having the conversation with me. I hope I did not come off as impatient.

> > > One of the interesting aspects of the
> > > MV model is the data and relationship is stored in the database.
> >
> > Uh, same with RM.
>
> I'm sorry to say this is another of those "yes but not really" observations.

:-)

> The relationship is stored in the relational database but not really like it
> is stored in the MV database. This is true because the MV model treats
> everything like regular data; unlike the RD model. As such, everything is
> stored in the database tables right along with all the other data; names,
> addresses, relationships, metadata, functions, constraints, stored
> procedures, application code, compiled code, etc. All MV tools, like RD
> tools, are available for these additionally defined data; it's just stored
> with all other data in the exact same formats.

The distinction you're drawing is that in addition to all the stuff that both model store, (names, addresses, relationships, metadata, functions, constraints, stored procedures) the MV model additionally stores application code, compiled code, MV tools, etc. Is that right?

Again, it's something I'd like to try out. Can you recommend a free solution; the mysql of MV?

> What makes this different is only that these tables are usually part of the
> database structure of the production data. So, you'd have tables for
> constraints, stored procedures, relationships, metadata, application code,
> and data all within a single database structure built for an application.
> The RD model would normally keep this kind of data separate from the
> production data within its own special system tables.

This separation you describe is not much of a separation.

> However, once again, the stuff stored is the same. It's just where its
> stored in relation to the normal everyday production data that's different.

Okay.

> > I disagree. There are specific well-documented and *fundamental*
> > disadvantages to managing integrity in applications instead of centrally.
> > This is independent of MV vs. RM vs. whatever.
>
> Don't forget, these "well documented" disadvantages revolve around the RD
> model, as its structure requires a different dance. If a dbms stores
> integrity constraints in the dbms, and the application is stored and runs in
> the dbms, then it makes little difference whether the integrity constraint
> is in or out of the application, as the application is located centrally in
> the dbms. I would point out that from this perspective it is wise to
> modularize the application so other applications can utilize the defined
> constraints.

I am hesitant here. On the one hand, that which makes application-enforced constraints not a good choice would seem to apply whether one kept the applications in the DB or on the filesystem. But having the applications stored centrally makes them central as well.

What if you have two comparatively unrelated applications that work against the same schema; both applications are in the dbms; one application enforces a constraint and one doesn't (for whatever reason: a bug, or the programmer just forgot about it.) Wouldn't that be a pathway for data corruption to enter the system?

> > > Let's reconcile a bank account. We need a primary account table and a
> > > transaction table in both models. However, in the MV model we don't
> need
> > > anything more than this. We will define the keys of the transactions to
> > > include the account#, so the account table can (and probably will)
> contain
> > > the ref# of the transactions. The transaction key would look like:
> > > Account# and transaction#.
> >
> > You're going to reuse transaction numbers in different accounts?
> > And you're also going to include the transaction number in
> > the account table? That kind of redundancy leads directly
> > to data corruption.
>
> Ah, excuse me? One reuses check#s in different accounts all the time. One
> reuses invoice#s for different vendors all the time too. To include the
> transaction# in the account table is to do nothing different than needs to
> be done anyway to define a relation; A > B and B < A.

I agree up until the last sentence. You don't need both A > B and B < A to define a relation; you only need one or the other. Likewise, you don't need a list of invoice numbers in the accounts table *and* an account number in the invoices table; that's a denormalization that will lead to corruption. You need one or the other, but both is bad, (unless they are just different views on the same data. Are they? Or are they stored separately, and able to become out of sync.)

> Redundancy? Storing
> the transaction#s in the account saves having to store the "transaction to
> account" relationship, as it is already defined by the transaction key. So
> this reduces redundancy.

Uh, no. I mean, it's less redundancy that storing it three times, but it's more than just storing it once.

> Data corruption? No different than anywhere else.
> Synchronization code performs the same task in all dbms products, although
> sometimes differently.

If you don't store the same information more than once, then the entire concept of "synchronization code" (first time I've heard the term) is unnecessary.

> I'm not taking a stand here claiming the RD model is bad. Nor am I stating
> that other models are necessarily better. I'm merely pointing out there are
> other methods and tools and dbms models that work.

Sure; yes. My interest in these conversations is to understand what works well in each of various approaches, and also what doesn't.

> My point is the nomenclature, syntax, and concepts within the MV model are
> specifically modeled after those of business.

Hmmm. Data management is something that is very useful to business, but it is not business-oriented in and of itself. Same with adding up columns of numbers.

> > Okay, how do you map a relational table like this into MV:
> >
> > create table Tri
> > (
> > a int,
> > b int,
> > c int,
> > unique(a,b),
> > unique(b,c),
> > unique(a,c)
> > );
>
> A good example of the point I was trying to make, and have made before,
> about deconstruction/reconstruction. To a business person this is complete
> nonsense.

You mean because they don't understand SQL? I don't get why we're talking about business people here; we're discussing data management.

> So, for instance, it is important that all invoice#s for a particular vendor
> are unique (we certainly wouldn't want to pay the same invoice twice). In
> the MV model the key is _not_ part of the data set but is part of the key (I
> would read a dataset using the key as the unique identifier). Thus both
> models do the same thing but a little differently. Other fields can be
> constrained. However, they're not constrained in the syntax of the table
> creation statement. They're done differently. So I can say any invoice
> must have a unique invoice# and a unique creation-stamp.

Hmmm. You didn't really answer my question.

I don't really care whether the constraints are part of the table declaration statement or not; I care about whether they are declarative, automatically enforced, and at least flexible enough to model that each of these three pairs must be unique: {a,b}, {a,c}, {b,c}

Is there a way to do that? I'm guessing not.

> I can set a trigger to enforce integrity within the bank account table so if
> a bank transaction is cleared, the uncleared reference to it within the bank
> account table is removed. So, right here I've set both a trigger and
> constraint on a relation at the same time. I know the RD model accomplishes
> the same task but differently.

Do you have to set these up manually every place a bank transaction clear is invoked, or do you just do it once?

> I cannot emphasize this enough; the MV model is located centrally! The
> application server and dbms server reside within the same environment, on
> the same machine. Therefore, all constraints are enforced centrally. The
> centralized application APIs can be called from outside the application.
> Additional constraints can be developed to provide service to more than one
> application and to meet ever-changing requirements.

Where can I read more about this intriguing concept?

> > It sounds like you don't understand Turing completeness. Also note
> > that I didn't say "everything." I said "anything that can be computed."
> > And I stand by my statement that BASIC can compute anything that
> > can be computed; it is a Turing complete language. This is not
> > the same thing as saying that it is a good language, though.
>
> There is a lot in the universe I don't understand. I understand the word
> tautology, though. :-)

:-) back.

But it's not actually a tautology. There are languages that can compute a lot of things but not everything. SQL is one such language.

> > I appreciate you're trying to help me understand, but I'm
> > having trouble following your posts. It seems like you
> > quote me, then respond, but the response, while interesting
> > isn't a response per se but you talking about something else.
> > I get lost.
>
> And here I was thinking I was answering your queries directly, albeit in a
> slightly different perspective. Perhaps my writing skills, and clarity of
> thought, will improve with time. :-)

Actually, I found this most recent message quite comprehensible. Also, I want to thank you for hanging in with my questions as my frustration grew. You are a gentleman, sir, and the world and this newsgroup needs more gentlemen. (And ladies, of course.)

Marshall Received on Sat Jul 10 2004 - 17:44:06 CEST

Original text of this message