Re: One Ring to Bind Them

From: Bill H <wphaskett_at_THISISMUNGEDatt.net>
Date: Mon, 05 Jul 2004 02:23:11 GMT
Message-ID: <jY2Gc.24273$%_6.8192_at_attbi_s01>


Marshall:

My comments are embedded.

> "Bill H" <wphaskett_at_THISISMUNGEDatt.net> wrote in message ...
> > "Marshall Spight" <mspight_at_dnai.com> wrote in message
> > news:GipDc.159689$3x.28156_at_attbi_s54...
> > >
> > > Perhaps I misunderstand, but MV has only the one kind of
> > > relationship it is capable of understanding: containment.
> >
> > I'm not sure why it is so difficult to express this concept. An MV
> > environment is both a data store and an application server. It is _NOT_
an
> > RD model. To discuss its attributes strictly from a datastore
perspective
> > is neither fair nor accurate. To understand its methodologies for
directly
> > solving business problems requires the willingness to work with its two
> > functions: storage and application properties/methods/rules/etc.
>
> I read that paragraph a bunch of times, but it didn't seem to
> address my statement that MV has only one kind of relationship
> it is capable of understanding. Does it have relationships besides
> containment that it can understand? An example of a non-containment
> relationship would be cool, if the example does not require hand-written
> application code to work.
>
> I think the MV and the RM world divide things up very differently.
> I will note first that "storage" is not a first-tier property of RM,
> but it is a useful, second tier function that most products support
> and that most applications take advantage of. It is perfectly reasonable,
> and useful, to have an RDBMS that does not persist its relations.
> We could still call this an RDBMS, but we couldn't call it a "datastore."

An MV relationship isn't an RM relationship; at least it isn't stored as such. It is an expression of a relationship, the containment of which resides in the database. e.g. a relationship exists between a vendor and invoices, between a check and invoices, between a bank transaction and a check, and between a reconciliation and checks.

So, your statement that the only relationship an MV dbms can understand is containment is not true; though not exactly false either because it does fundamentally understand that. The MV model understands defined relationships, which are stored (or contained) within the database. These defined relationships are then understood by the MV model.

I can set a relationship between a vendor and invoices by simply storing the data required for the relationship then defining the relationship. So, I can say:

:select vendors invoices
:sort invoices with no pddate by pddate

I can then define the above as a stored procedure named "List-Unpaid-Invoices", then execute:

:List-Unpaid-Invoices '12345' which will list all unpaid invoices for vend# '12345'.

Now, was this only containment? I think it was much more than that. However, it is what it is and the RM model will do the same thing; just differently. Notice that the relationship and the relationship data has to be stored somewhere in both models. One of the interesting aspects of the MV model is the data and relationship is stored in the database. The application will usually initiate the creation of the relationship data but it can be done via table triggers or relationship triggers separate from the application (as long as it's defined that way).

There's nothing tricky about this. All dbms models have to do the same things to accomplish the same tasks. The MV model doesn't do some miraculous mumbo-jumbo and neither does the RM model. Both store data, both store relationships, and both store constraints. In the MV model all this is stored in the database!

> Another example is managing data integrity in procedural application
> code. In RM this is considered a "stupid database trick" to quote from
> another thread. There are significant disadvantages to application-managed
> integrity rules, to the point where I do not consider it an approach
> worth discussing (and yes, I've used that approach in the real world.)
> However, it may be that this approach has lower overhead in situations
> where you have small development teams and single-application databases.

The models we use create "stupid" tricks. It's the models that create the constraints to make some tricks stupid and others smart. An RM "stupid" trick may be an MV smart move; and visa-versa. However, most design and development are constrained by the base delivery model: server vs client/server.

> > Secondly, solving business problems requires a great deal of
flexibility. A
> > non-relational model can, in a number of instances, provide additional
> > flexibility over and above whan the RD model can.
>
> Please be specific. I am very interested in specific examples of specific
> operations or structures that you feel are hard to solve with RM or SQL
> and easy to solve with MV. I do believe there are some, but I want
> to know what they are better. As it stands I have a hard time evaluating
> the claims of the MV people, even the smart/nice ones such as you
> and Dawn. I'm not saying I believe, and I'm not saying I disbelieve.
> I just want to hear more specifics.

Let's reconcile a bank account. We need a primary account table and a transaction table in both models. However, in the MV model we don't need anything more than this. We will define the keys of the transactions to include the account#, so the account table can (and probably will) contain the ref# of the transactions. The transaction key would look like: Account# and transaction#. The account row would include all of the uncleared transaction#s and the key of each transaction would include the account key. We have a defined relation in a format other than as defined in RD model. Don't let this fools us, a relation is a relation and has to be defined and stored somewhere. In the MV model it is simply stored in the database. My goodness, we've just defined a many to many relationship (please note: this description if fundamentally viewed from an MV model perspective).

Now we get a simple download file from our financial institution which usually includes the fed route#, the account#, the transaction#, the date cleared, and the amount of the cleared transaction. It can be in any format, we don't care as long as it's consistent. :-)

Our transaction# is encoded on the financial instrument (the check or the deposit) so the bank sends it back to us as their transaction#. Part of the transaction# returned by the bank is our account#, since it was part of our transaction key!

Now, what good is this? As Mr Youngman points out it only takes one disk read to get the account and its relationships to the uncleared transactions. This is an almost instantaneous response to our web clients. So there's an upside.

> > Not because the RD model
> > is incapable, but because the RD model declares for itself particular
> > limitations and methods of operation. This structure doesn't always work
> > ideally. Do I understand RD proponents to declare that it does in all
> > circumstances?
>
> I don't know how to measure the idealness of a solution, so I have
> no particular claims about whether the RM is ideal or not.

I sit in a corporate VP meeting and discuss this with them and they with me. We all see the same thing. I'm not the odd man out here. In addition, I can almost directly translate their vast knowledger to the dbms design and relationship definition. I think this is good!

> > For years HP calculators used RPN (reverse polish notation) instead of
> > standard algebraic entry mode (AEM). Nowadays they offer both. Does this
> > mean RPN is worthless or worse than AEM? No. Many people prefer RPN. Is
> > AEM more often used? Of course, but that doesn't say anything about RPN
or
> > those who prefer to use it. The same can be said about the RD model. Not
> > everyone prefers it.
>
> The problem with this analogy is that there is a simple one-to-one mapping
> between AEM and RPN. It is easy to show that the two methods are
> equivalent. I do not believe the RM and MV have such a mapping,
> nor do they support the same operations nor structures.

They really do. They do primarily the same things. We're not talking nuclear reactors and cigarettes. :-)

> > > Yes, it understands the meaning of this, so it knows what
> > > a one-to-many relationship means. Does it have any other
> > > facilities for understanding meaning? It can do
> > > ON DELETE CASCADE but can it do ON DELETE
> > > RESTRICT? Can it handle and understand many-to-many
> > > relationships? Can it understand and enforce arbitrary
> > > constraints a la SQL's CHECK? (These are actual questions,
> > > not rhetorical ones; I'm not that familiar with MV.)
> >
> > Of course it does, and can. Remember it is both a datastore and an
> > application environment wrapped into one. So, whatever needs to be done
can
> > be done.
>
> If by this you mean that you can implement these features by
> hand-writing application code, then I don't consider that any
> achievement. I can say the same thing about some Java code and
> a hashtable, but it's not a good solution.

Yes and no. You can always hand write code in an applicaton. You can also store the code in the dbms as triggers, constraints, relations, etc. The difference is that the RD model does some things one way and the MV model does some things the other way. The MV model is much more application-centric. This is only bad when working with the RD model, where this is defined as bad (or "stupid"). Most things are done the same though. :-)

It's nice to have a model implement some features for us. It saves us time, and I realize, and appreciate, this. Most of my experience working with RD models is: you give me data and I'll give you data.

> For example, in another thread someone said (over and over
> if I remember correctly :-) that if you delete an invoice, all
> the line items go with it, automatically. Okay, this is the same
> thing as ON DELETE CASCADE. But sometimes you want
> ON DELETE RESTRICT. (In other words, if you want to
> delete a container but it is still containing something, you
> have to dispose of the contained things first; you can't just
> throw them away.) Can you do this declaratively in MV? How
> is it done?

Let me point out that the MV model communicates with the database, with respect to data maintenance, via an application language. Where a RD model might say:

INSERT ... the MV model would need to:

OPEN My file
READ and Lock New record (make sure noone else is)

      or
READ and Lock Item to change (make sure noone else is) CHANGE data
WRITE data TO My file

Lock contention is a part of the dbms. There is no such thing as "optimistic" locking (unless one is an idiot). :-) But this is an MV perspective, not an RD perspective.

> Can it *automatically* enforce declared integrity constraints?
> Can you have an integer attribute and declare that it must
> always be divisible by 4? Is that enforced by auditing your
> application code and manually inserting a check at each
> place the attribute is updated, or is it enforced by declaring
> the constraint centrally? Does the constraint have a hole in
> it if you add a new place the attribute can change and forget
> to put the %4 check in?

Remember, a constraint is defined and stored somewhere. The only value with storing outside the application is if some other application is using it. This is not a usual requirement but an MV model can simply enforce this via via a trigger. We're much more inclined to place this in the application because all MV application are server-centric and run in the dbms.

> > Like I said before, this is not to say that the MV model does
> > everything...as nothing can.
>
> I don't think I agree. For example, Java, C++, and BASIC are
> all able to compute anything that can be computed. They do
> everything that can be done; no programming language of the
> future can ever do anything more. (Which is not the same thing
> as saying there is no room for improvement:
> FORTRAN < C < C++ < Java < {OCaml, Haskell}, IMHO. But these are
> usability and expressivity issues, not computability issues; we
> need to be clear on the distinction.)

Rule one in life: never say never. Rule two in life: never say I can do everything. :-)

> > But it provides an interesting confluence of
> > tools and capabilities that render the model very useful in solving
business
> > problems for many people and businesses.
>
> This is not so much what is under discussion in this newsgroup.
> I will readily acknowledge that many people use MV to do useful
> work, and that they solve business problems, and that they
> enjoy themselves doing so. They on-topic question is the theoretical
> basis for the tools. Are they complete? Are they correct? Are they
> self-consistent?

I thoroughly agree. That's what keeps us all here...the amount of knowledge and interesting thought-provoking ideas elucidated.

> SQL is relationally complete, over its lame type system. It could
> really use a better type system. This will make it more usable but
> it won't make it any more complete. SQL is already really good
> at automatically enforcing integrity; it's a real strong point. OTOH,
> it's not so good at ease-of-use, and could really stand to improve.
> I suspect MV is much better at ease of use and worse at enforcing
> integrity. Understanding why and where one is better and one is
> worse will help us better use our own systems, evaluate others,
> and also to build the next generation. In this respect, I think Dawn
> and I are engaged in exactly the same exploration, although we
> come at it from different backgrounds.

:-)

Bill Received on Mon Jul 05 2004 - 04:23:11 CEST

Original text of this message