Re: On view updating

From: mAsterdam <mAsterdam_at_vrijdag.org>
Date: Tue, 28 Sep 2004 01:09:43 +0200
Message-ID: <41589db8$0$65124$e4fe514c_at_news.xs4all.nl>


Thank you for elaborating.

Paul wrote:

> The relational model organises our data by grouping together
> propositions that fit the same predicate "template" into a table.
>
> We're constrained to only enter data in terms of a proposition that fits
> to those predicates.
>
> For example our predicate might be:
>
> Employee [ID] is called [Firstname] [Surname], works in department
> [dept] and has a salary of [sal]
>
> This corresponds to a table, the square brackets denote columns in our
> table, and placeholders in our predicate.
>
> Suppose none of the columns are allowed to have an "unknown" value,
> either in terms of a global NULL or a "special value" in the type
> definitions. We can only enter a row in the table if we know every one
> of the column values for a given employee.
>
> Now suppose we don't know exactly what the firstname is, but we know it
> is definitely one of two options. We want to insert to our database the
> proposition:
>
> Employee 123 is called (John or Bill) Smith, works in department 10
> and has a salary of 10,000.

So far, I get it.

> But, as it doesn't fit our template, we can't.

Indeed. Why doesn't it fit? (John or Bill) is not a [Firstname].

Let's say the type of [Firstname] and [Surname] is both {name} (inventing syntax for this case only as we go along, no special considerations here)

'(John or Bill)' is *not* of type {name}. It is an expression that gives
alternative possible values for a {name}

So it could never render a true proposition under the predicate:

    Employee [ID] is called [Firstname] [Surname], works in     department [dept] and has a salary of [sal].

Rephrasing the predicate including the attributetypes

    Employee [ID:EID] is called [Firstname:name] [Surname:name],     works in department [dept:name] and has a salary of [sal:amount].

But (assuming) we *do* want to store what *do* know, we could invent a new type for this, let's say {nameRX}, (RX for some regular expression syntax)
and construct a predicate like this:

    Employee [ID:EmpID] is called one of [Firstnamepattern:nameRX],     one of [Surnamepattern:nameRX], works in one of the     departments [deptpattern:nameRX] and has one of the salaries     of [salpattern:amountRX].

> So does this mean that
> the relational model is lacking in terms of being able to store any
> expression of first-order logic?

No, it just means that the table can't hold data reflecting propositions beyond its predicate. This is a feature.

> Or does it just mean that the constraints we have implicitly chosen on
> allowable propositions (via our table design) won't allow that
> proposition to be stored?

Explicitly, I'ld say. A database that holds simple facts is simpler than a database that hold facts of the complexity of what you *did* know about Smith.
Imagine telling somebody else about what you know about Smith. You 'll take more words, more complicated - and rarer - grammatical constructs in natural langague to convey your facts to another person than when you simply know that
John Smith works at accounting.

> For any expression like the one above, is it possible to redesign our
> tables to allow the expression to be stored?
>
> We could rewrite our predicate like this:
>
> (Employee [ID] has first name [firstname])
> AND (Employee [ID] has last name [lastname])
> AND (Employee [ID] works in department [dept])
> AND (Employee [ID] has a salary of [sal])
>
> So we could split all our tables down into binary relations (assuming a
> single-column primary key).

 > Then we could store propositions of the form:
>
> Employee [Id] has first name ([firstname1] OR [firstname2] OR ...) and
> last name ([lastname1] OR [lastname2] OR ...) and ...

This strongly reminds me of prolog.

> But we're still just restricting ourselves to stored a limited subset of
> all possible logical propositions about employees.
>

>> Are you are investigating uncertainty?
>> What I do see is that you want to store values like:
>> "It's definitely either Bill or John".

>
> Yes, I don't see that as uncertainty though :)
> Unless you define uncertainty as any logical statement that contains an
> "OR".

Ok. Uncertainty is not the correct term for this. You are positive his name is either Bill or John. But it is not simple facts either, because of the 'OR'. Is there a good word for this kind of information?

> I'm thinking of those puzzle books you used to get (maybe you still do)
> called "Logic Problems" where you had a grid of possible truths and a
> few statements or clues, usually written as some complex boolean
> expression (in natural language though). You had to tick or cross the
> squares in the grid and work out what the truths were in a standard form.

I am familiar with this kind of puzzle.

...
> Is it even possible to make the database engine aware of this
> interpretation? Maybe if each data type could specify to the DBMS a
> value that meant "data unknown"?

That is *not* easy. Maybe you'ld want to read what Hugh Darwen writes about missing information without NULLS: http://www.hughdarwen.freeola.com/TheThirdManifesto.web/Missing-info-without-nulls.pdf

> Then from my row constraint above the
> database can know that currently the name is unknown for sure, but I do
> know for definite that it is either Bill or John.

Thanks again for taking the time to expose the problem, very interesting :-) Received on Tue Sep 28 2004 - 01:09:43 CEST

Original text of this message