c.d.theory glossary 0.0.4

From: mAsterdam <mAsterdam_at_vrijdag.org>
Date: Wed, 16 Jun 2004 22:16:38 +0200
Message-ID: <40d0aaa5$0$48959$e4fe514c_at_news.xs4all.nl>


Glossary 0.0.4, June 16, 2004

  "PUNTER: Good morning.
   RECEPTIONIST: Good morning, sir. Can I help you?
   PUNTER: Well, I'd like to have an argument, please."
    ARGUMENT SKIT - Graham Chapman & John Cleese

Preamble:



This glossary seeks to limit lengthy misunderstandings in comp.database.theory. This newsgroup uses terms from database modeling, design, implementation, operations, change management, cost sharing, productivity research, and/or basic database research.

People tend to assume that words mean what they are accustomed to, and take for granted that the other posters have about the same connotations. They don't always.

The glossary consists of signposts: watch out! You may think the OP means A but she might mean B. Alternative names and views of the same concept are introduced when the danger of mutual misunderstandings is apparent. When context matters, it is provided. The glossary is a highly biased list of problematic concepts.

Some words are particularly suspect:
data (!), database, object, normalisation. Some just cause minor annoyances; the misunderstanding is cleared up and the discussion goes on:
domain, type, transaction.

We don't know well-accepted, formal or comprehensive definitions for everything. If you do have a useful reference, please provide it.
If an informal description is all we have, so be it.

What the glossary is not:



The glossary is not a dictionary or encyclopedia, such as FOLDOC, Wikipedia (http://www.wikipedia.org), and the Web Dictionary of Cybernetics and Systems.
Specific links to serve the glossary's purpose are welcome, of course. Also, it does not try to be a FAQ for "all things database".

Credits & contributions:



The glossary is built from contributions. Contributions from within this group are not credited, quotes from elsewhere are. If you want your name stated please say so.

If you want to contribute, please do so
in a copy&pastable way (don't let me do
all the editing :-)
Please also check spelling and grammar mistaeks.


[Address]

A value, used to identify a location.
What is to be found there is up to the rest of the system.

An address is a value used to locate ... A reference is a value used to refer ...

The difference between *locate* and *refer* is crucial here.

[Change management]

Many organizations have a CM process in place in order to make their evolution more manageable. The organization of data within a database can and will change with these changing circumstances. A DBMS should provide facilities to support this. Changing the underlying structure should be possible without affecting what is already stored. For example, you can add a column to a table without losing what is already there.

Related adjectives: maintainable, agile, flexible, adaptive.
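
The "add a column without losing what is already there" example can be tried out with a small sketch. It uses Python's sqlite3 module as a stand-in for a DBMS; the table and column names (customer, email) are made up for illustration, and other products may handle structural changes differently.

```python
import sqlite3

# In-memory database; table and column names are hypothetical.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE customer (id INTEGER PRIMARY KEY, name TEXT)")
conn.execute("INSERT INTO customer (id, name) VALUES (1, 'Ada')")

# Structural change: add a column to a table that already holds data.
conn.execute("ALTER TABLE customer ADD COLUMN email TEXT")

# The existing row is still there; the new column is simply NULL for it.
rows_after_change = conn.execute(
    "SELECT id, name, email FROM customer"
).fetchall()
print(rows_after_change)
```

The row stored before the change survives, and the new column carries no value for it until someone supplies one.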

[Class]

A class is what provides a name and a place for the abstract behavior of a set of objects said to belong to the class. (Larry Wall, Apocalypse 12)

note:
Other definitions welcome, this goes for the rest as well, of course.

Some use 'class' as having exposed data. Please be explicit about this if you do so.

[Data]

"Known facts that can be recorded and have implicit meaning." -- Fundamentals of Database Systems, Elmasri & Navathe.

When people discuss data in the context of database, they are usually talking of something with meaning. There are people who think that data doesn't need to mean anything. http://en.wikipedia.org/wiki/Data (currently) says "data on its own has no meaning". Somehow this "data has no meaning" idea has caught on.

1a. facts
1b. a record on a medium of some fact in the real world.
2. encoded information
3. a combination of sign and meaning

[Database]

  "A logically coherent collection of related real-world data   assembled for a specific purpose." -- rephrased from "Fundamentals of Database Systems", Elmasri & Navathe.

  1. Deluxe filesystem
  2. Shared databank (E. Codd)

[Data model]

Data models are artificial constructs and may not completely represent the true nature of information and categorization. These categories already exist, to some degree, in the way information is handled outside the database.
Databases don't exist in vacuo; they're fed (and consulted) by users who would have some system of mental categorization even if they were shuffling everything around with paper and pencil.

Source ?? (link was dead when I checked)

[Domain]

1. Given a relation R, a domain is a set Sn such that for each tuple (A1, A2, ...An, ...Am) in R, An is an element of Sn.

2. A domain is a set of values: for example

"integers between 0 and 255",
"character strings less than 10 characters long",
"dates".

Sometimes used synonymously with type.
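
Both senses of domain can be sketched in a few lines of Python. This is only an illustration: the names byte_domain, is_short_string, components and the sample relation R are made up, and a real DBMS would declare domains rather than test them like this.

```python
# Sense 2: a domain is a set of values.
byte_domain = frozenset(range(256))          # "integers between 0 and 255"

def is_short_string(value):
    """Membership test for "character strings less than 10 characters long"."""
    return isinstance(value, str) and len(value) < 10

# Sense 1: for a relation R, the n-th domain contains the n-th
# component of every tuple in R. Hypothetical (name, age) data.
R = {("ann", 31), ("bob", 47)}

def components(relation, n):
    """Return the set of n-th components (0-based) of a relation's tuples."""
    return {t[n] for t in relation}

# Every age in R is an element of byte_domain.
ages_fit = components(R, 1) <= byte_domain
# Every name in R is a short string.
names_fit = all(is_short_string(name) for name in components(R, 0))
print(ages_fit, names_fit)
```

An infinite or very large domain is more practically written as a membership test (as for the strings) than as an enumerated set (as for the integers).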

[Entity]

Thing of interest. (ISO)

[Fact]

1. A piece of information about circumstances that exist or events that have occurred

2. A concept whose truth can be proved.
3. A statement or assertion of verified information.
4. An event known to have happened or something known to have existed.


[Function]

For now we have to live with different meanings of _function_ when talking about databases: "The function of this function is to get the tuples from B that are functionally dependent on A."

Three different contexts, but just about the same meaning:

General

     A purpose or use.
Math

     A binary mathematical relation with at most
     one b for each a in (a,b).
Software
     A subroutine, procedure, or method.

notes:
     every operator is a function
     every function is a relation

Please be specific.
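
The Math sense, and the note that every function is a relation, can be checked mechanically. A minimal sketch, assuming a binary relation is given as a set of (a, b) pairs; is_function, squares and divides are names made up for the example.

```python
def is_function(relation):
    """True if the binary relation has at most one b for each a."""
    seen = {}
    for a, b in relation:
        if a in seen and seen[a] != b:
            return False          # the same a is paired with two different b
        seen[a] = b
    return True

# Each a appears with exactly one b: a function (and also a relation).
squares = {(1, 1), (2, 4), (-2, 4)}

# 2 is paired with both 4 and 6: a relation, but not a function.
divides = {(2, 4), (2, 6), (3, 6)}

print(is_function(squares), is_function(divides))
```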

[Information]

0. data in context, data with meaning.
(This implies a definition of data as being without context, without meaning - see data)
1. new data to the receptor.
2. available data, relevant to some decision or action.

[Information principle] (RM)

Date/Codd:
Chris Date in "EDGAR F. CODD 08/23/1923 – 04/18/2003 A TRIBUTE":

          The entire information content of a relational database
          is represented in one and only one way: namely, as
          attribute values within tuples within relations.


[Key]

A value, used to identify something.
See also TODO: primary key, foreign key.

[Meaning]

See [[Issues]]: meaning and use.

[MultiValue, MV]

1. One name for the industry surrounding the Nelson-Pick data model. In this context:

   FILE: a real-world collective noun.
   RECORD: a real-world object.
   FIELD: a real-world adjective.

2. A data field (or attribute) defined to permit a variable number of values as a list (array).

[NULL]

The insanity bit. No! The humility marker. mu: The absence of an answer to a question which requires an answer.

/adj./
1. Attributes to something the absence of values.

         Ex: "The *null* set is the empty set, often represented by {}."

/n. colloq./
1. A noted appearance of the absence of values.

         Ex: "This table contains *nulls*."

Common usage:

  • Confusion arises when people use terms like "null value", a paradox to some, a contradictio in terminis to others.
  • Confusion arises due to the fact that nullness (the absence of value) is often represented on computers by the number 0. (Obviously, 0 is not null.)
  • In some contexts, 'null' and 'nil' mean the same thing; in others, they do not.

In databases NULL is traditionally both used and opposed. If you want to go into this, please first search for mu NIL void NULL undef, 2VL 3VL.
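
Two of the points above (0 is not null, and the 3VL behaviour of comparisons) can be observed in SQL itself. A small sketch using Python's sqlite3 module; it shows how SQLite behaves and is not a statement about every SQL product.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# One row, three expressions:
#   0 IS NULL     -> is the number zero null?
#   NULL = NULL   -> under three-valued logic the answer is unknown
#   NULL IS NULL  -> the test that actually detects nullness
zero_is_null, null_eq_null, null_is_null = conn.execute(
    "SELECT 0 IS NULL, NULL = NULL, NULL IS NULL"
).fetchone()

# SQLite reports false as 0, true as 1, and unknown as NULL,
# which Python's sqlite3 module hands back as None.
print(zero_is_null, null_eq_null, null_is_null)
```

The comparison NULL = NULL comes back as neither true nor false, which is where much of the 2VL versus 3VL argument starts.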

"It isn't the things we don't know that give us trouble. It's the things we know that ain't so." - Will Rogers

[Object]

1. Model of an entity, characterised by behaviour and state. (ISO)
2. Something intelligible or perceptible by the mind.

[Table/Row/Column] (SQL-DBMS)

Table: A collection of columns (the table header) and rows (the body).
Row: A collection of values, conforming to the table header columns.

One table may contain data about one entity, about several entities, about one or several relationships or any combination.
A column can be seen as the attribute of the entity/one of the entities/relationships about which the table is concerned.

[Type]

" TYPES are sets of things we can talk about;

   RELATIONS are (true) statements about those things." -- Chris Date, feb 2004

  1. Set of possible values (i.e. IT equivalent of math 'domain').
  2. Set of possible values plus all possible operators defined on them. (i.e. synonymous to Class if 'class' is meant to include a possible set of values).

This is a highly misunderstanding-prone area, so please take some care to be specific.

[Type - 3rdM]

In The Third Manifesto a type is:

  • a pattern (possible representation)
  • a domain for some operators (THE_xxx operators)
  • a codomain for some operators (the "constructors")

There is a requirement for the 'domain' and the 'codomain' to be the same set.

[Pointer]

See address(*).

[Reference]

A reference is a value, used to refer to something. A program can get the current value of that something (without ever knowing where it resides) by dereferencing, even if that something has been relocated between the time of first reference and the dereferencing.

[References, pointers, keys]

While references may be implemented as pointers, the programmer prefers not to know (if he prefers to know he should have used pointers).

In some programming languages one can declare variables of a pointer type - these variables can have pointer values.
m.m. (mutatis mutandis) reference.

Two operations are supported:
referencing and dereferencing.
On references only these operations are possible. On pointers other operations are possible.

The dereferencing operation takes a pointer *value* and returns the *variable* (of the type the pointer refers to) that the pointer points at.
The referencing operation is the inverse operation. It takes a *variable* and returns a pointer *value*. m.m. reference.
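
The difference between locating and referring can be made concrete with a toy model. This is only a sketch under stated assumptions: Store, its handle table and all method names are invented for the illustration and do not describe how any particular language or DBMS implements references.

```python
class Store:
    """Toy memory: a list of cells plus a table that resolves references."""

    def __init__(self):
        self.cells = []      # a list index plays the role of an address
        self.handles = {}    # reference (opaque handle) -> current address

    def allocate(self, value):
        """Store a value; return both a reference and a pointer to it."""
        self.cells.append(value)
        address = len(self.cells) - 1
        handle = object()    # opaque: no arithmetic, only dereferencing
        self.handles[handle] = address
        return handle, address

    def deref_pointer(self, address):
        """Return whatever currently sits at that location."""
        return self.cells[address]

    def deref_reference(self, handle):
        """Return the thing referred to, wherever it now resides."""
        return self.cells[self.handles[handle]]

    def relocate(self, handle):
        """Move the referenced value to a new cell; the old cell goes stale."""
        old = self.handles[handle]
        self.cells.append(self.cells[old])
        self.cells[old] = None
        self.handles[handle] = len(self.cells) - 1


store = Store()
ref, ptr = store.allocate("payload")
store.relocate(ref)

via_reference = store.deref_reference(ref)   # still finds the value
via_pointer = store.deref_pointer(ptr)       # finds the stale location
print(via_reference, via_pointer)
```

The pointer is an ordinary integer, so other operations (such as adding 1 to reach the next cell) are possible on it; the reference supports dereferencing only.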

In Java the term pointer was avoided
because pointer is often used to mean
physical memory addresses.

Relational keys are not pointers.

[Relation]

1. A relation is a subset of the set of ordered tuples (A1, A2, ... Am) formed by the Cartesian cross-product of sets S1 x ... x Sm where each An is an element of Sn.

Note: A set, Sx, is not restricted from participating as a member of a relation more than once. Distinction between identical sets in math is possible through ordinal numbering such that given sets Sx and Sy, x <> y AND Sx is a subset of Sy and Sy is a subset of Sx; in relational theory, in contrast, it is by attribute name.
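
Definition 1 can be spelled out with small finite sets. A minimal sketch in Python; S1, S2 and the sample tuples are made up for the example.

```python
from itertools import product

S1 = {"ann", "bob"}
S2 = {1, 2, 3}

# The Cartesian product S1 x S2: every possible ordered pair.
cartesian = set(product(S1, S2))

# A relation is any subset of that product.
R = {("ann", 1), ("bob", 3)}
is_relation = R <= cartesian

# The same set may participate more than once: a relation over S2 x S2.
# Mathematically the two roles are told apart by position (first, second);
# relational theory would give them distinct attribute names instead.
successor = {(1, 2), (2, 3)}
is_relation_over_s2_twice = successor <= set(product(S2, S2))

print(len(cartesian), is_relation, is_relation_over_s2_twice)
```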

2. ...

[Transaction]

A set of database operations constituting a logical unit of work. Most DBMS include the ability to rollback complete transactions when an error is detected.
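
A rollback of a complete logical unit of work might look like the sketch below. It uses Python's sqlite3 module; the account table, the amounts and the CHECK constraint are made up for the illustration, and other DBMSs expose transactions through different interfaces.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE account ("
    "id INTEGER PRIMARY KEY, "
    "balance INTEGER CHECK (balance >= 0))"
)
conn.executemany("INSERT INTO account VALUES (?, ?)", [(1, 100), (2, 0)])
conn.commit()

# Logical unit of work: move 150 from account 1 to account 2.
try:
    with conn:  # commits on success, rolls back if an exception escapes
        conn.execute("UPDATE account SET balance = balance + 150 WHERE id = 2")
        # This violates the CHECK constraint (100 - 150 < 0) ...
        conn.execute("UPDATE account SET balance = balance - 150 WHERE id = 1")
except sqlite3.IntegrityError:
    pass        # ... so the whole transaction, including the first UPDATE, is undone

balances = conn.execute(
    "SELECT id, balance FROM account ORDER BY id"
).fetchall()
print(balances)
```

Neither update is visible afterwards: the credit to account 2 is rolled back together with the failed debit.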


[[Issues]]

RELATIONs vs. RELATIONSHIPs

     Can namespaces help to make some distance? In this case:
     RM.RELATION vs. ER.RELATIONSHIP

represented vs. described

RELATION(SHIP)s vs RELATION(SHIP)s SET

fact vs. thing (ENTITY).

First Order Logic vs. Higher Order Logic.

What, if any, is the equivalent of an ENTITY(SET) in the RM ?

Does it make sense to talk about attributes of a fact ? How are those different from ATTRIBUTES of an ENTITY ?

     Traditionally there can be Multivalued ATTRIBUTES
     in ER, RM has atomic ATTRIBUTES.
     So: RM.ATTRIBUTE and ER.ATTRIBUTE ?

In ER modeling, a RELATIONSHIP is defined over ENTITIES: "A relationship is an association between several entities." In RM, a RELATIONSHIP is defined over VALUEs. What is the difference between ENTITIES and VALUEs ?

[meaning vs use]

Say we currently have a validated statement about the exchange rate of some stock at some recent time.

  1. It does not matter to the meaning where/how this statement is represented. We have it.
  2. To the use of it it is important where/how it is represented, and available to relevant actors.
  3. Twenty years later the meaning of this statement is still the same.
  4. Twenty years later most of its usefulness will probably have gone.

In some instances it may not be appropriate to make this distinction. The meaning of data is always contextual. The same bit of data means different things to different structured viewpoints within the organization, for example, and at different times (epochs). One grain of sand does not form a beach. One bit of data itself has little meaning. It is rather the collective of all data that possesses a greater notion of meaning.


[[ToDo]]:

Application
Attribute
Dynamic vs static
Foreign key
Normalize
Location
Partitioning
Persistence
Primary key
Operator
Orthogonal
Scalar

Feel free to post suggestions to add or remove.

[[In preparation]]

[logical pointer]

logical pointers as in navigational information from a foreign key in one relation to a primary key in another (effectively a mapping).

[...]

How lossless is lossless decomposition?

What does it take for a pizza to be a pizza ?



Thank you for contributing.

Milestones? For the glossary I prefer inch-pebbles.
