Oracle FAQ Your Portal to the Oracle Knowledge Grid

Home -> Community -> Usenet -> comp.databases.theory -> Re: So what's null then if it's not nothing?

Re: So what's null then if it's not nothing?

From: David Cressey <>
Date: Wed, 23 Nov 2005 04:39:03 GMT
Message-ID: <HnSgf.4450$>

"DonR" <> wrote in message
> Hello Hugo,
> Since I'm a "Pickie", I'll throw my 2 cents in here. Caution, a long
> post.

> Here's a general description of how Pick systems store data where ^
> represents an attribute mark, hex FE, and / represents a value mark,
> hex FD.
> RecordKey attribute1^attribute2^attribute3 value1/attribute3
> value2/attribute3 value3^attribute4...
> RecordKey must be unique and allows direct access to a record (aka
> row).
> Attributes may contain multiple values and values may contain multiple
> sub-values, which I don't show in my example. All the Pick systems
> conform to this format and the data is stored in plain ascii. Of course
> they all differ at the physical file level in how they hash keys, etc
> Records are stored in groups and a calculation is performed to
> determine which group a record is stored in. In other words, if the
> file has 100 groups and 1,000 records, on average, each group will have
> 10 records in it. This is how Pick can locate a record in a file with a
> million records with as few as ONE read.

This is a good summary of the data format, and I think I can infer the model from it.
Here's my attempt to paraphrase what I think you wrote, and several questions that arise,
from my point of view.

I looks like a Pick file is a virtual array of records, indexed by RecordKey, with each array
containing a tree of values. (By "array" I just mean a direct access structure, what ever that means behind the scenes)

 It looks like the elements of the tree are juxtaposed, rather than being adressed via pointers,
as a tree would be in lisp. It looks like the FD and FE chars make the tree structure explicit.
Since the reader can use the FD and FE markers to detect the presence of a multivalue, it's unnecessary to model
the difference between an attribute that can take on a single value and one that can take on a multivalue in advance, and it's possible to recast a prior single value as a multivalue at a later point in time, provided one is willing to maintain any code that would break if there's more than one value.

For instance, I can create a model that says that each person has only one telephone number, write my file, but at a later time, after cell phones become widespread, I can change my mind, and start including multiple phones in a single person record, without reloading any of the existing data.

Am I reading this right so far? If I am, then it may explain why Pickies are so enthusiastic about the flexibility of the model. In order to get a parallel result in an SQL schema, I would have to add a second column for a second phone, and a third column for a third phone, (shudder), or decompose the table into two tables, one with (person_id, phone_number). This decomposition is basically the same decomposition explained in introductory treatments of first normal form. This could easily result in more ripple effect than you might get with a small, simple Pick application.

I haven't seen how you store sub-values yet. Does the scheme for sub-values require yet another marker like FE and FD to separate sub-values? If so, how to you represent sub-sub-values, and so on?

Also, I have some questions about what to do when you have more than one record type. Let's say you have CUSTOMERS, ORDERS, and PRODUCTS. Do you store them all in the same Pick file, coming up with some kind of composite key that concatenates an indicator of key type (as in "C", "O", or "P") with an indicator of key instance? or do you store separate record types in separate Pick files?

In either event, how do you manage the many-to-many relationship between ORDERS and PRODUCTS? It would seem difficult, at first glance, to manage a single body of data that could be used to answer PRODUCT centered questions, (like find all the PRODUCTS whose sales volume is rising) and also to answer ORDER centered questions (find all the ORDERS that have been billed for at least 30 days, but have not been paid). How is this done?

Also, while original "Pick" seems to have been a file manager and not a DBMS as such, does UniVerse supply DBMS like services (such as managing transaction concurrency)? Or is UniVerse simply a file manager? Received on Tue Nov 22 2005 - 22:39:03 CST

Original text of this message