Re: How to work with m:n relationships?

From: Mark Johnson <102334.12_at_compuserve.com>
Date: Sun, 26 Feb 2006 15:56:46 -0800
Message-ID: <t1g402ta55fadqg1evtv27at4d5svmgl8q_at_4ax.com>


"vldm10" <vldm10_at_yahoo.com> wrote:

>Subscriber ( ssn, name, address, phone, email, login, password,
>datefrom, dateto, other-details)
>Product ( product_code, name, price, other_details)
>SubscriberProduct ( ssn, product_code, type_of_subsription, date_from,
>date_to )
>Keys are ssn, product_code and (ssn, product_code) respectively.

Alternative:

>SubscriberSt (Key1, ssn, name, address, phone, email, login, password,
>datefrom, dateto, other_details)
>ProductSt (Key2, product_code, name, price, date_from, date_to,
>other_details)
>SubscriberProductSt ( Key3, Key1, Key2, type_of_subsription, date_from,
>date_to )
>Here every Key is one column generated key.

If the "datefrom,dateto" meant to suggest that a separate subscription is required before subscribing to any particular product? that the password is only good from, to, in other words?

The date_to in Sub-Prod would suggest the end of their subscription to that Product, which perhaps is stored for good reason as a date rather than the number of days.

And the Product's date_from and to likely represent the active life of that product, with support options requiring a different range of dates.

. . .

>We will assume that the entities and the
>relationships can change their own states.

By state do you mean that, once set, the date_from and date_to may be changed in any of these records?

>1.
>Let Subscriber and SubsriberSt now get values:
>Subscriber1 = ( s1, n1, a1, p1, e1, l1, ps1, df1, dt1)
>SubscriberSt1 = (k11, s1, n1, a1, p1, e1, l1, ps1, df1, dt1)

>Let now Subscriber1 gets a new phone number. We will need new key for
>Subscriber.
>Subscriber2 = ( s1, n1, a1, p2, e1, l1, ps1, df2, dt2) and the new key
>is (ssn, df)

Because S2 is a unique record, but the pk is no longer unique. You have duplicate ssn in your first scheme, but in the second, where you now also have a duplicate, a unique artificial pk was generated, specifically here, k12. It's a bit of a hypothetical, because in practice if you had defined the k_n to be the pk, the system would probably disallow any duplicates in that table. And this is one of the complaints, I believe, that people have with generated keys. The records may otherwise be duplicates, if you removed the generated key.

>SubscriberSt2 = (k12, s1, n1, a1, p1, e2, l1, ps1, df2, dt2)
>(St-solution can support this change)

>Now we can ask ourselves, what will be if any other attribute (for
>example email) change its value at same df2 time and has dateto not =
>dt2. Subscriber2 will get the duplicate key, while SubscriberSt2 will
>work normally.

Yes, because instead of creating a pk of ssn, df, and dt, it was limited to ssn and df. St2 just gets another record, and another unique generated key.

>First solution uses time, as a key’s attribute, compound key etc. Is
>there an explanation which will say when to use compound key or when to
>use surrogate key. What are the meanings of these keys, etc?
>Can we say that time is property of an entity?

In the case of subscriptions, I would think so. In the case of planned obsolescence and product support, I would think so.

>And related with this,
>we can set another question. What does it mean that two entities are
>identical?

Is it the same subscriber, you mean? In this case, the 'entity' also directly corresponds to a single 'tuple' or row. Is it the same person, even though everything from their name, to email, to phone number - heck, even ssn if the go into witness relocation, or something - and particular time interval, can change and furthermore be logged? I think as with the ssn, in the first place, that the machine would be set to treat name, or that ssn, as invariant, to correspond to a notion of unchanging identity. If these attributes ultimately are all deemed merely properties of the 'entity', then you are correct in that the person is reduced to an arbitrarily selected number. His 'file' would not be alphabetically, but numerically, sorted.

>Is it means that they are one same thing or they are the two
>things perfectly similar? In the second case their entity’s
>properties are same but their system’s properties can be different
>(like time/space/position properties).

If your 'entity' is a subscriber, then their subscription interval would seem a necessary attribute. If more likely the person, then whether or not they otherwise are reduced to a number, I might think that the intervals, and data of interest, would be held in another table(s), each hierarchically bound to a unique person, many to one. I've not had to use a temporal database, myself. But I believe this is what you are getting to.

>regarding construction of a key. Can we use principle: if for n
>attribute’s values, two rows are same, then I can always find n+1
>attributes so that rows are different?

Almost by definition, since something would have to be different or they could not possibly be unique.

>Can I say that for any n
>attributes, I can always set two rows to be with same values?

Yes.

>2.
>Now let St-solution be:
>SubscriberSt1 = (k11, s1, n1, a1, p1, e1, l1, ps1, df1, dt1)
>SubscriberSt2 = (k12, s1, n1, a1, p1, e2, l1, ps1, df2, dt2)
>ProductSt1 = (k21, p_c1, name1, price1, df3, dt3)
>SubscriberProductSt1 = (k31, k11, k21, type_of_sub1, df4, dt4)
>Here as addition to the referential integrity, it should be added that
>PK-FKs should correspond to each other by their states.

Integrity meaning a valid non-hanging reference, also a reference to the proper record. So if there were a k13, you simply don't want what should be k31, k11, k21 to be, instead, k31, k13, k21. If the constraint must exist in SPSt1 in order to be applied to the fk, then I assume, again, that the state refers to either/or df4, dt4, in this case? Either it's a subscriber record matching both, matching one or the other, or falling with the range of df4-dt4.

>3.
>Microsoft ADO.NET has possibility to work with the disconnected
>objects. We can use these objects to connect to a database, now we can
>get a data from the database, set them in the disconnected objects and
>then we can disconnect from the database. When we are disconnected we
>can update these data, for example we can update a key. Now we can
>connect to the database again and update it. Problems arise when there
>are the two users which are working in disconnected mode with the same
>row and one user updates the key, but the other one needs this key.
>There is Microsoft’s solution for these kinds of problems but it can
>be very complicated and there are many combinations. The St-solution
>which is shown above (in part 1 and 2) can handle these problems with
>ADO.NET easy.

Particularly since you suggest what even constitutes the key for variation 1 necessarily varies with changes to the data/attributes.

>In fact St-solution doesn’t have updates at all. The
>difference is that ADO.NET solution is based on programming theory.
>St-solution is based on DB theory.
>ADO.NET is great step, but maybe DB theory can have more appropriate
>solutions related to programming languages.

>4.
>In his book “Writings 1994-1997” on page 554, C.Date writes about
>duplicate rows as well as E.F.Code’s opinion about this. C. Date
>writes that G.W.Leibniz’s The Principle of Identity of
>Indiscernibles is the basic why relations don’t contain duplicate
>members. But it seems that results from Quantum Mechanics show that
>Leibniz’s principle is false.
>
>As an important part of his philosophical theory G.W.Leibniz defined
>the law which is about relationship between entities and properties. In
>the notation of symbolic logic Leibniz’s law can be written as:
> x=y ? ?F(Fx ? Fy)
>where x, y are objects and Fx is property of an object x.
>(For those who are interested in Leibniz’s Law, there is for example
>the Princeton University web site, where some ideas related to
>Leibniz’s Law are shown with the great imagination.
>
>http://www.princeton.edu/~jimpryor/courses/intro/notes/timetravel.html
>
>http://www.princeton.edu/~jimpryor/courses/intro/notes/personal-id.html
> )
>
>
>
>To not be misunderstood, I believe in Relational DB, but maybe some
>things can be “updated”, especially the things which are related to
>the new technologies. For example the things related to xml, web
>services, knowledge representation etc.

As far as that goes, I'd agree. And it's a matter of balance and degree. If some believe that an inflexible theory alone ought to inform the practice of many brilliant practitioners, inventing new variations on things all the time, perhaps not always with regard to some fully fleshed-out theory, but still using set constructs to conceive the problem, that is, using some proven theoretical tools, then one might say the theorists lived in an ivory tower or were chair-bound, etc. Unreasonable, that is to say. But the theorist might simply counter that tradecraft may create a host of problems, in the moment, and in the future as the project 'scales' or is used in other ways, with other datasets, and so. Only a well-considered, 'robust' theory should inform any particular implementation. And just give us time to do it. And so it goes. I think that's why the theories proposed by Codd, so concisely and clearly stated as was his apparent style, continue to be so appealing. He seemed to have one foot in the theory, the other in the practice.

>
>Vladimir Odrljin
Received on Mon Feb 27 2006 - 00:56:46 CET

Original text of this message