Re: What is Pick anyway?

From: B Faux <bdfaux_at_prodigy.net>
Date: Fri, 13 Jan 2006 23:11:15 GMT
Message-ID: <nsWxf.470$NS6.53_at_newssvr30.news.prodigy.com>


"DonR" wrote (in a different sub-thread) [snip]
>
> As an example, I'll use PEOPLE and POSTAL files. In the dictionary for
> the PEOPLE file, I'll create records called ID, FIRSTNAME, LASTNAME,
> POSTALCODE, CITY and STATE. In these records, I'll define ID as field
> 0, the key, FIRSTNAME as field 1, LASTNAME as field 2, POSTALCODE as
> field 3, CITY and STATE as a lookups (join) into the POSTAL file.
>
> I'll define the POSTAL file with POSTALID as field 0, the key, CITY as
> field 1 and STATE as field 2.
>
> Once data is added to the files, I can use commands like these to list
> data from the files,
> LIST PEOPLE ID FIRSTNAME LASTNAME POSTALCODE
>
> ID FIRSTNAME LASTNAME POSTALCODE
> ------- ---------------- ---------------- ---------------------
> 1 Don Robinson 45678
>
>
> LIST POSTAL POSTALID CITY STATE
>
> POSTALID CITY STATE
> ----------------- ---------------- ----------
> 45678 Cincinnati Ohio
>
>
> Then I can list the combined contents of the files,
> LIST PEOPLE ID FIRSTNAME LASTNAME POSTALCODE CITY STATE
>
> ID FIRSTNAME LASTNAME POSTALCODE CITY STATE
> ----- ---------------- ------------------ -------------------- --------------
> ---------
> 1 Don Robinson 45678
> Cincinnati Ohio
>
> Here's how it works.
> The LIST command gets the id, first name, last name and postal code
> from the data in the PEOPLE file. Then using the postal code, it reads
> the POSTAL file using 45678 as the key to get the city and state. The
> CITY and STATE records define what file to read, what value to use as
> the key and what field to retrieve.
>
> In addition to defining the field or calculating a value, the
> dictionary records contain a justification and length. The LIST command
> uses the justification, L or R, for display of the data and for sorting
> as alpha or numeric. The length is used to control the display width of
> a column but has no effect on the amount of data stored in the record.
>
>
>

[snip]
(in this sub-thread)
>
> One of the features of the file system and BASIC is multiple values in
> an attribute and multiple sub-values in a value and in some versions,
> multiple sub-sub-values. All versions use CHAR(254) as the delimiter
> for attributes, CHAR(253) for values, CHAR(252) for sub-values and if
> supported, CHAR(251) for sub-sub-values.

[snip]

Don,

All of that is a great short primer, but you didn't get the chance to put them together. For the 'Non-Pick' aware crowd, consider the challenge of allowing an unlimited number of values in a particular field (attribute, cell).

In a simple SQL table design, you would need to reserve a certain number of columns for 'Home Phone', 'Work Phone', 'Mobile Phone' etc. In many cases, a record (row) will not have values for all of these possibilities, so a forced 'blank' (sometimes filled with spaces or zeros) will sit in the database indefinitely just taking up space (pardon the pun.) And if you need to accommodate a fifth phone type (or sixth, or seventh) then you get to enjoy the happiness of a table re-build exercise (multiple tables...) Now if your version of SQL supports it, you could extend a primary cell to a companion table (pivot table?) which houses the contact numbers in a separate table and then 'join' them every time you need to use them. The secondary table itself might also extend to a separate companion table and it could go on indefinitely which is why many SQL implementations either do not support this, or limit the maximum number of joins allowed. Or you might have repeating data in the ID, FIRSTNAME, LASTNAME, and POSTALCODE columns for as many additional rows as you need to accommodate the multiple data elements, any way you slice it, you end up with wasted space.

Contrast this to Multi-Value (MV, Pick) databases. ANY attribute (cell, field...) can contain an unlimited number of 'values', which can in turn contain an unlimited number of 'sub-values' (and in some implementations sub-sub-value, sub-sub-sub-value, etc.) So now we can model the example in a MV database with the multiple phone numbers, but I'll further complicate it by calling it 'contact data' allowing for pagers, email addresses, etc. In this case, we would benefit by creating two designated attributes in the data record we'll call 'Contact_Type' and 'Contact_Data'. We will further establish these two fields as 'related' fields or 'controlling-dependent' related attributes.

Now we can enter a set of values into the 'Contact_Type' attribute, delimited by value-marks which describe a particular type of allowed contact data (and a validation mask if desired) and a corresponding set of dependent values in the 'Contact_Data' attribute. Such that 'Contact_Type' value 1 corresponds to 'Contact_Data' value 1, and Contact_Type' value 2 corresponds to 'Contact_Data' value 2, etc. Any number of contact data elements, in any order can now be supported.

So extending the example above, the actual data as stored in the database looks something like this (please forgive formatting - these should align):

Field (attribute) - Description

0   -   ID
1   -   FIRSTNAME
2   -   LASTNAME
3   -   POSTALCODE
4   -   CONTACT_TYPE
5   -   CONTACT_DATA

Notice no reserved space for 'CITY' or 'STATE' because these values are contained in a different file, the 'POSTAL file. We might benefit from creating a control file called 'CONTACT', but the data likely to be held there could also be in the dictionary item itself, if there are only a few types allowed. So we'll use the following: 'HP' for 'Home Phone', 'BP' for

'Business Phone', 'MP' for 'Mobile Phone' and 'EM' for 'Email.'  In our 
'CONTACT' file we could store four records under the id's of 'HP', 'BP', 
'MP', and 'EM.'  Where the first attribute (field) is a descriptive name and 
the second attribute (field) is an input-mask test of some kind, and attribute 3 is an 'access method' of some type, (to avoid trying to connect the telephone headset to a fax machine, for example.)

Now our example on file will look like this (dictionary names in parenthesis - not data):

Field - Data Value (Dictionary Name)

0 - 1
(ID)
1 - Don
(FIRSTNAME)
2 - Robinson
(LASTNAME)
3 - 45678
(POSTALCODE)
4 - HP]EM]MP
(CONTACT_TYPE)
5 - 2123456789]drobinson_at_email.com]2125551212 (CONTACT_DATA)

Notice that the contact number is NOT FORMATTED in the file (except for the email) because output masking is handled in the dictionary, a mask of 'MR(###) ###-####' will result in the first Contact_Data value being Masked Right on output as (212) 345-6789, a test to return the 'EM' type as unformatted is easily added to the dictionary item as well. Note: 'MR' here means 'Mask Right'

The following access sentence is entered:

LIST PEOPLE ID FIRSTNAME LASTNAME CONTACT_TYPE CONTACT_DATA Would return the following:

ID FIRSTNAME LASTNAME CONTACT_TYPE CONTACT_DATA

----  ------------------    ----------------    -------------------------  ---------------------------
   1  Don                   Robinson       Home Phone          (212) 
345-6789
                                                        Email 
drobinson_at_email.com
                                                        Mobile Phone 
(212) 555-1212

You might amend the access sentence above to say:

LIST PEOPLE FIRSTNAME LASTNAME CONTACT_TYPE CONTACT_DATA IF CONTACT_TYPE = "MP"

If the record above is the only one containing a value of "MP" in the Contact_Type field (any value position) then the output would not change. But I hope you can see how easy it is to handle 'real-world' situations with very little effort, compared to many other DB approaches, SQL in particular.

And triggers (on read, on write, on change) are available in many MV implementations today to add functionality much like SQL column rules to disallow writing garbage, or allowing change or delete without authorization, etc. These triggers are usually held in the dictionary as well. And multiple dictionary representations can be used for the same field (attribute, cell) to allow for different formatting requirements for different situations, such as capturing data to pass to a dialer, in which case the parenthesis and the dash characters are not needed, (could even cause problems.)

Much too long, but it's an important point IMHO.

BFaux - Received on Sat Jan 14 2006 - 00:11:15 CET

Original text of this message