Re: Towards a definition of atomic

From: Jan Hidders <hidders_at_gmail.com>
Date: Sun, 3 Feb 2008 06:41:04 -0800 (PST)
Message-ID: <be187d41-8720-441c-891a-ed5a7e110175_at_l32g2000hse.googlegroups.com>


On 2 feb, 02:45, David BL <davi..._at_iinet.net.au> wrote:
> On Feb 2, 3:42 am, Jan Hidders <hidd..._at_gmail.com> wrote:
>
>
>
> > On 1 feb, 14:30, David BL <davi..._at_iinet.net.au> wrote:
>
> > > AFAIK the conventional wisdom is that no absolute definition of atomic
> > > exists for domain types.
>
> > On the contrary, we have too many of them. :-) But I don't think it is
> > wise to go looking for a definition if it is not clear to you want you
> > want to do with that definition. If you don't know where you are
> > going, any road will take you there.
> > >  Throwing caution to the wind, in this post I
> > > wish to conjecture a definition of atomic that, perhaps with some more
> > > effort at its formalisation, can provide some absolute meaning for a
> > > given attribute within a given RDB schema.
>
> > In view of such blatant carelessness I can only respond by not heeding
> > my own warning. :-)
>
> > The usual intuition is that something is not atomic if it is
> > decomposable into smaller parts. In this case that would mean parts
> > with less information. For the case of decomposition into two
> > components this leads to something like:
>
> > DEFINITION: A domain D is said to be decomposable if there are domains
> > D1 and D2 and functions f1 : D -> D1 and f2 : D -> D2 such that (1) f1
> > and f2 are not injective and (2) <f1,f2> is injective where <f1,f2> :
> > D -> (D1 x D2) is defined such that <f1,f2>(x) = (f1(x), f2(x)).
>
> > Note that (1) says that each individual decomposition function loses
> > information and (2) says that together they don't. However, we can
> > then make the following observation:
>
> > THEOREM: A domain is decomposable iff it contains more than 2 values.
>
> I like your approach to defining a *non-trivial* decomposition, by
> requiring that f1,f2 not be injective.
>
> However, I don't think your definition captures the right idea.  For
> instance, my example 2 decomposes a domain expressed as a set of
> person names to a single name.  In that case what is D1,D2?

The definition is aimed at only establishing that something is decomposable, not exactly how you should decompose it. For a set of names it clearly holds that it is decomposable and you can pick many different D1s and D2s that show this. For example, you can split the set into a set of strings before "Laura" and a set after "Laura". Whether that makes sense or not, is another question because you are looking for an absolute definition. It might, and that is enough. If you want to say more about how you want to decompose, you could define something like the following:

DEFINITION: A domain D is said to be set-decomposable if there is a domain D' and an injective function f : D -> P(D'), where P is the power-set operator, such that the number of values in D' is smaller than the number of values of D minus 1.

It think you will understand why D' has to be smaller than D. The "minus 1" may be a bit surprising, but otherwise the notion becomes trivial in the sense that every domain will be set-decomposable: you can map the domain {v_1, v_2, ... } to P({v_2, ...}) by mapping v_1 to the empty set and for i > 1 mapping v_i to { v_i }.

Guess what?

THEOREM A domain is set-decomposable iff it contains more than 3 values.

So set-decomposability implies decomposability, but not the other way around.

> Since finite domains with more than two values are quite useful, my
> attempted definition is clearly a waste of time.

I don't think the exercise is completely pointless, or we wouldn't be having this conversation. :-) It's important to see what happens if you try to come up with an absolute definition of atomicity. But you have to keep in mind that as far as the relational model is concerned atomicity is not an absolute concept but a relative concept. It's relative to the Universe of Discourse that is being modeled. From different points of view the same thing might be considered atomic or not. This is related to how the data is going to be used in the database. If in your UoD something is considered to be composed then chances are that you may want separate indexes on these components, constraints that refer to these components, joins on only some of the components, etc., and so flattening the relation is probably a good idea.

Apologies if I'm kicking in an open door here. It's just to offset a few remarks about atomicity not being relevant in the context of the relational model. IMO such remarks show a deep misunderstanding of what the relational model is essentially about.

  • Jan Hidders
Received on Sun Feb 03 2008 - 15:41:04 CET

Original text of this message