Re: On view updating

From: Paul <paul_at_test.com>
Date: Mon, 27 Sep 2004 22:48:25 +0100
Message-ID: <41588aa3$0$42237$ed2e19e4_at_ptn-nntp-reader04.plus.net>


mAsterdam wrote:
> I don't see a database. I see no facts. I don't see
> what you are getting at. I would really need some more context.
> I would appreciate if you would give that,
> because maybe you *are* on to something interesting.
> You just did not give enough to go on (for me, that is,
> maybe others can chime in).

I'm really just reiterating what Costin said but just thinking of some of the issues around it and giving some more concrete examples.

The relational model organises our data by grouping together propositions that fit the same predicate "template" into a table.

We're constrained to only enter data in terms of a proposition that fits to those predicates.

For example our predicate might be:

   Employee [ID] is called [Firstname] [Surname], works in department    [dept] and has a salary of [sal]

This corresponds to a table, the square brackets denote columns in our table, and placeholders in our predicate.

Suppose none of the columns are allowed to have an "unknown" value, either in terms of a global NULL or a "special value" in the type definitions. We can only enter a row in the table if we know every one of the column values for a given employee.

Now suppose we don't know exactly what the firstname is, but we know it is definitely one of two options. We want to insert to our database the proposition:

   Employee 123 is called (John or Bill) Smith, works in department 10    and has a salary of 10,000.

But, as it doesn't fit our template, we can't. So does this mean that the relational model is lacking in terms of being able to store any expression of first-order logic?

Or does it just mean that the constraints we have implicitly chosen on allowable propositions (via our table design) won't allow that proposition to be stored?

For any expression like the one above, is it possible to redesign our tables to allow the expression to be stored?

We could rewrite our predicate like this:

   (Employee [ID] has first name [firstname])

    AND (Employee [ID] has last name [lastname])
    AND (Employee [ID] works in department [dept])
    AND (Employee [ID] has a salary of [sal])

So we could split all our tables down into binary relations (assuming a single-column primary key). Then we could store propositions of the form:

   Employee [Id] has first name ([firstname1] OR [firstname2] OR ...) and last name ([lastname1] OR [lastname2] OR ...) and ...

But we're still just restricting ourselves to stored a limited subset of all possible logical propositions about employees.

> Are you are investigating uncertainty?
> What I do see is that you want to store values like:
> "It's definitely either Bill or John".

Yes, I don't see that as uncertainty though :) Unless you define uncertainty as any logical statement that contains an "OR".

I'm thinking of those puzzle books you used to get (maybe you still do) called "Logic Problems" where you had a grid of possible truths and a few statements or clues, usually written as some complex boolean expression (in natural language though). You had to tick or cross the squares in the grid and work out what the truths were in a standard form.

  >> ... Suppose we create a row (with empid=234 say) in the emp table for

>> this Bill/John, but leave the name part
>>  blank (empty string). Now we can create a row
>>  constraint that says:
>>
>>   empid!=234 OR (empid=234 AND (name='Bill' OR name='John' OR name=''))
>>
>> The interpretation of a empty string can be that the name is not 
>> precisely known.
>>
>> But still I've stored information in the database to say that this 
>> employee is either called Bill or John. It's just not very easy to 
>> extract this information via a query without going to the system 
>> catalog and parsing some complex boolean expression.
>>
>> If I ran a query to tell me how many employees I have that are either 
>> called John or Bill, it would miss this one. The crux of the problem 
>> is that the database can't understand the interpretation of the empty 
>> string in the context of the "employee names" type.

>
> Not so strange, because you just imagined that interpretation,
> but did not tell the hypothetical database engine.

Is it even possible to make the database engine aware of this interpretation? Maybe if each data type could specify to the DBMS a value that meant "data unknown"? Then from my row constraint above the database can know that currently the name is unknown for sure, but I do know for definite that it is either Bill or John.

Paul. Received on Mon Sep 27 2004 - 23:48:25 CEST

Original text of this message