Re: Nearest Common Ancestor Report (XDb1's $1000 Challenge)

From: Hugo Kornelis <hugo_at_pe_NO_rFact.in_SPAM_fo>
Date: Tue, 18 May 2004 11:27:12 +0200
Message-ID: <miija0pta0n6sc8d4l4t7b1pocpgc2m4st_at_4ax.com>


On 17 May 2004 17:52:49 -0700, Neo wrote:

>> Furthermore, you seem to desire the possibility to enter untyped data,
>> which is of course impossible in a strong-typed language. I do present
>> "sort of" a way to do this in a relational database, but I'd never use a
>> kludge even remotely like this for real. Just as I consider XDb1 to be
>> completely worthless for any real problem, for exactly this same reason.
>> Remove types, and nothing prevents your user from entering "banana" as
>> John's age.
>
>All things in XDb1 are typed/classified.
>In XDb1, thing is the most general class.
>Person's class is thing (person isa thing).
>John's class is person (john isa person).
>Mary's class is person (mary isa person).
>Color's class is thing (color isa thing).
>Red's class is color (red isa color).
>Dog's class is thing (dog isa thing).
>Fido's class is dog (fido isa dog). Etc...
>Except for thing (which is the root),
>can you name any thing in an XDb1 database that isn't classified?

Hi Neo,

Probably not. I expect you to know XDb1 a lot better than I do, so I'll take your word for it. But I didn't use the word "classified", I used the word "untyped". You changed that to "typed/classified" at the beginning of your reply, then conveniently stripped off the "typed" part in the rhetorical question at the end of this quote.

Your explanation does raise another question. It looks as if the same syntax is used to specify both intension and extension of the model, thereby eliminating the distinction between schema and population. My guess is that XDb1 would accept the following without complaining:

 person isa thing.
 john isa person.
 mary isa person.

 neo isa john.

XDb1 will probably gladly store it - but what does it mean?

>In RM, classification can be accomplished similarly. By adding a row
>in a table named T_Person, one effectively classifies that row as a
>person. I think you may be confusing or limiting typing to hardware
>types (ie bit, byte and integer, etc). XDb1's data model doesn't
>require hardware to have bit, byte or integer and is implemented as
>such.

No, I'm not confusing or limiting anything. I didn't mean hardware types (you should know that the RM is hardware independent). I did mean datatypes. There might be situations where an untyped database can have its use (I admit that my original choice of words was too harsh), but outside of those niches, the schema should be strictly separate from the data and the datatype of the data should be known.

>Any thing in XDb1 can have multiple classifications. For example, we
>can further classify John as a doctor in addition to being a person
>(john isa doctor). Also if user provides 35, XDb1/application can
>classify it as both an integer and age. If user provides 35.1,
>XDb1/application can classify it as both a decimal and age. If user
>provides 35 & 1/3, XDb1/application can classify it as both a fraction
>and age. If user provides thirty-five, XDb1/application can classify
>it as both a word and age. If user provides "over-the-hill",
>XDb1/application can classify it as both an expression and age.

And this is exactly the reason why I'd never use XDb1 for serious work, unless I encounter a problem area where the advantages of allowing untyped data outweigh the disadvantages.

In 99.9% of all applications that store a person's age, comparisons have to be made: who is younger than 45 years? Who is older, John, Mary or Fido? How can XDb1 (or any other type-less database) answer the last question if the user has provided the following input:

 over-the-hill isa age.
 very-young isa age.
 7 isa age.
 john is over-the-hill.
 mary is very-young.
 fido is 7.

Most computers I have used will classify very-young as greater than over-the-hill and will refuse to compare either of these to 7.
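
To make the comparison problem concrete, here is a minimal sketch in Python (not XDb1; the dictionary `ages` and the helper `older` are hypothetical names made up for illustration). It stores the three ages exactly as entered above and then tries to answer "who is older?".

```python
# Ages stored exactly as the user entered them: two words and one number.
ages = {"john": "over-the-hill", "mary": "very-young", "fido": 7}

def older(a, b):
    """Return the name whose stored age compares greater,
    or None when the two stored values cannot be compared at all."""
    try:
        return a if ages[a] > ages[b] else b
    except TypeError:
        # Python refuses to order a string against an integer.
        return None

john_vs_mary = older("john", "mary")  # strings are compared character by character
john_vs_fido = older("john", "fido")  # string versus integer: no ordering defined
```

Here `john_vs_mary` comes out as "mary", purely because "v" sorts after "o", and `john_vs_fido` comes out as None because the comparison is rejected.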

Like I said - there may be specific situations where a product such as XDb1 has its use. But it's not (to quote the web site) "the future of databases" - not even remotely!

>You are correct in that XDb1 does not automatically validate "basic"
>classes such as bit, byte and integer. In XDb1, bits, bytes, integer
>are currently classifications whose rules need to be implemented by
>the user. In the future they (along with other common classifications
>such as color and person) might be provided.
>
>Suppose in the future, user creates type X. X has nothing to do with
>hardware bits, bytes or integers and is not built into the db. What
>will validate that x1 is a X in RM? It will be application logic just
>as it is in XDb1 (even for bit, byte and integer since XDb1 doesn't
>require them).

(from another message)
>Now consider the type/class color. It is not a built in type. How can
>RM implementations prevent the user from entering "banana" (the fruit)
>for color?

My comment is not about hardware types. Nor is it about domain checking (which is what your "color banana" example illustrates). It is about data types. Defining a column Color as character [varying] does not ensure that all data entered will be colors, but it does ensure that they are in the character domain. In the same way, defining the Age column as integer ensures that all values will be in the numerical domain, even though it is still possible (if I don't take steps to prevent it) to enter -41 or 3,765,987 in the Age column.
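
The difference between a data type check and a domain check can be sketched in a few lines of Python. This is only an illustration, not how any particular DBMS implements it; the function names and the 0 to 150 bounds for a plausible age are assumptions.

```python
def check_type(value):
    """Data type check: is the value an integer at all?
    (bool is excluded because Python treats it as a kind of int.)"""
    return isinstance(value, int) and not isinstance(value, bool)

def check_domain(value, low=0, high=150):
    """Domain check: an integer that is also a plausible age.
    The bounds are assumed values chosen for the example."""
    return check_type(value) and low <= value <= high

for candidate in (35, -41, 3765987, "banana"):
    print(candidate, check_type(candidate), check_domain(candidate))
```

The values -41 and 3765987 pass the type check and fail only the domain check, while "banana" is already rejected by the type check.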

Restricting the data type is not enough to ensure that values adhere to the required domain. But it does ensure that I can perform operations on the data and rely on the outcome. Expressions like ValueA > ValueB, ValueA + ValueB and ValueA - ValueB are defined for values in the numeric domain. The first two are also defined for values in the character domain, though the definition differs from the numeric version; the third expression type is not defined for the character domain and would result in an error. I know all this, and I can rely on it - if and only if the database ensures that only numeric data will be accepted in the Age column and only character data in the Color column.
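
The three expressions can be tried out in Python, whose integers and strings happen to behave the same way for these operators. A small sketch (the helper `try_ops` is a hypothetical name; this shows Python's behaviour, not that of a specific SQL product):

```python
import operator

def try_ops(a, b):
    """Apply >, + and - to a pair of values and record each result,
    or "undefined" when the operation is rejected for that type."""
    results = {}
    for name, op in (("gt", operator.gt), ("add", operator.add), ("sub", operator.sub)):
        try:
            results[name] = op(a, b)
        except TypeError:
            results[name] = "undefined"
    return results

numeric = try_ops(45, 7)            # all three operations are defined
character = try_ops("red", "blue")  # > and + are defined, - is not
```

For the numeric pair all three results are numbers or booleans. For the character pair, > compares alphabetically, + concatenates, and - is rejected.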

(from yet another message)
>The above statements create the RM equivalent of the following:
>(Note: ->XYZ represents appropriate ID)
(snip)
>T_Name (This "table" pre exists in XDb1, but is not pre-populated)
>ID Symbo1 Sym2 Sym3 Sym4 .........
>-- ------ ---- ---- ----
>. ->J ->o ->h ->n
>. ->M ->a ->r ->y
>. ->8 ->0

I would not call this "table" relational. It violates 1NF. If I did have the desire (which I don't have) to break down names and values into individual letters and digits, I'd at least use a design without repeating groups.
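
For illustration only, a design without repeating groups would store one symbol per row, keyed by name and position. The sketch below uses Python tuples in place of tables; the IDs 1 to 3 and the helper name `to_1nf` are assumptions, since the quoted table shows only "." for the IDs.

```python
# The quoted layout: one row per name, one column per symbol (a repeating group).
wide_rows = [
    (1, "J", "o", "h", "n"),
    (2, "M", "a", "r", "y"),
    (3, "8", "0", None, None),  # shorter values leave columns empty
]

def to_1nf(rows):
    """Turn (id, sym1, sym2, ...) rows into (id, position, symbol) rows."""
    return [
        (row_id, pos, sym)
        for row_id, *symbols in rows
        for pos, sym in enumerate(symbols, start=1)
        if sym is not None
    ]

narrow_rows = to_1nf(wide_rows)
```

Each row of `narrow_rows` now holds a single symbol, and a name of any length needs no extra columns.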

Groetjes, Hugo

-- 

Sorry, no more funny sig lines today.
Received on Tue May 18 2004 - 11:27:12 CEST
