Re: Newbie question on table design.

From: -CELKO- <jcelko212_at_earthlink.net>
Date: 29 Apr 2007 05:41:17 -0700
Message-ID: <1177850477.793972.30990_at_y5g2000hsa.googlegroups.com>


>> There are numerous record management systems that allow for concurrent access by multiple processes. <<

The record systems I know used a job queue and locks. They could not even share read-only data very well. We had all kinds of schemes for handling the queues, but the model was that you had to wait until the other user was done with his changes. The early SQL systems were also built on a queues and locking model because that is what we had.

There was no concept of optimistic concurrency control or parallelism via logical partitioning when changing the data.

>> "row" and "record" is more a matter of terminology than concept. <<

I disagree. Rows are not records. A record is defined in the application program which reads it; a row is defined in the database schema and not by a program at all. The name of the field is in the READ or INPUT statements of the application; a row is named in the database schema. Likewise, the PHYSICAL order of the field names in the READ statement is vital (READ a,b,c is not the same as READ c, a, b; but SELECT a,b,c is the same data as SELECT c, a, b.

All empty files look alike; they are a directory entry in the operating system with a name and a length of zero bytes of storage. Empty tables still have columns, constraints, security privileges and other structures, even tho they have no rows.

This is in keeping with the set theoretical model, in which the empty set is a perfectly good set. The difference between SQL's set model and standard mathematical set theory is that set theory has only one empty set, but in SQL each table has a different structure, so they cannot be used in places where non-empty versions of themselves could not be used.

Another characteristic of rows in a table is that they are all alike in structure and they are all the "same kind of thing" in the model. In a file system, records can vary in size, data types and structure by having flags in the data stream that tell the program reading the data how to interpret it. The most common examples are Pascal's variant record, C's struct syntax and Cobol's OCCURS clause.

The OCCURS keyword in Cobol and the Variant records in Pascal have a number which tells the program how many time a record structure is to be repeated in the current record.

Unions in 'C' are not variant records, but variant mappings for the same physical memory. For example:

union x {int ival; char j[4];} myStuff;

defines myStuff to be either an integer (which are 4 bytes on most modern C compilers, but this code is non-portable) or an array of 4 bytes, depending on whether you say myStuff.ival or myStuff.j[0];

But even more than that, files often contained records which were summaries of subsets of the other records -- so called control break reports. There is no requirement that the records in a file be related in any way -- they are literally a stream of binary data whose meaning is assigned by the program reading them.

Columns versus Fields

A field within a record is defined by the application program that reads it. A column in a row in a table is defined by the database schema. The datatypes in a column are always scalar.

The order of the application program variables in the READ or INPUT statements is important because the values are read into the program variables in that order. In SQL, columns are referenced only by their names. Yes, there are shorthands like the SELECT * clause and INSERT INTO <table name> statements which expand into a list of column names in the physical order in which the column names appear within their table declaration, but these are shorthands which resolve to named lists.

The use of NULLs in SQL is also unique to the language. Fields do not support a missing data marker as part of the field, record or file itself. Nor do fields have constraints which can be added to them in the record, like the DEFAULT and CHECK() clauses in SQL.

Relationships among tables within a database

Files are pretty passive creatures and will take whatever an application program throws at them without much objection. Files are also independent of each other simply because they are connected to one application program at a time and therefore have no idea what other files looks like.

A database actively seeks to maintain the correctness of all its data. The methods used are triggers, constraints and declarative referential integrity.

Declarative referential integrity (DRI) says, in effect, that data in one table has a particular relationship with data in a second (possibly the same) table. It is also possible to have the database change itself via referential actions associated with the DRI.

For example, a business rule might be that we do not sell products which are not in inventory. This rule would be enforce by a REFERENCES clause on the Orders table which references the Inventory table and a referential action of ON DELETE CASCADE. Received on Sun Apr 29 2007 - 14:41:17 CEST

Original text of this message