Relational Database Systems

From Oracle FAQ
Jump to: navigation, search

History[edit]

Dr. E.F. Codd, an IBM researcher, first developed the relational data model in 1970. Over time it has proved to be flexible, extensible, and robust.

A RDBMS (Relational Database Management System) is a type of DBMS in which the database consists of relations and is manipulated by operations based on relational algebra. Database products like the Oracle Database, SQL Server, DB2, Microsoft Access, etc. are based on SQL. The design of SQL was strongly influenced by the idea of Codd's relational model but the SQL model differs from the relational one in some important ways. Despite the differences, understanding the relational model is key to understanding how to use SQL-based systems effectively.

Codd's 12 rules[edit]

In 1985 Codd published a list of 13 rules, numbered 0-12 and generally known as "Codd's 12 Rules", that defined how a true RDBMS should be evaluated. Understanding these rules will greatly improve your ability to understand RDBMS's in general, including Oracle. The "rules" make certain observations about the potential properties of database systems but in practice most systems don't try to conform to all the rules. Some of them, especially rules 1,2,3 and 6, are still controversial and are difficult to justify scientifically because Codd didn't give precise definitions.

Ten years after Codd the Third Manifesto defined relational systems by using 33 numbered prescriptions and 14 proscriptions. It uses more formal terms and contradicts Codd's original rules in several ways.

Here are Codd's rules:

Rule 0: The Foundation Rule

The system must manage databases entirely through its relational capabilities.

Rule 1: The Information Rule

Information is represented logically in tables.

All data should be represented to the user logically in tables (rows and columns).

Rule 2: Guaranteed Access Rule

Data must be logically accessible by table, primary key, and column.

All data should be accessible without ambiguity. This can be accomplished through a combination of the table name, primary key, and column name.

Rule 3: Systematic Treatment of NULL Values

NULL values must be uniformly treated as "missing information," not as empty strings, blanks, or zeros.

A field should be allowed to remain empty. This involves the support of a NULL value, which is distinct from an empty string or a number with a value of zero. Of course, this can't apply to primary keys. In addition, most database implementations support the concept of a nun-null field constraint that prevents null values in a specific table column.

Rule 4: Dynamic On-Line Catalog Based on the Relational Model

Metadata (data about the database) must be stored in the database just as regular data is.

A relational database must provide access to its structure through the same tools that are used to access the data. This is usually accomplished by storing the structure definition within special system tables.

Rule 5: Comprehensive Data Sublanguage Rule

A single language must be able to define data, views, integrity constraints, authorization, transactions, and data manipulation. [Typically this is SQL.]

The database must support at least one clearly defined language that includes functionality for data definition, data manipulation, data integrity, and database transaction control. All commercial relational databases use forms of the standard SQL (Structured Query Language) as their supported comprehensive language.

Rule 6: View Updating Rule

Views must show the updates of their base tables and vice versa.

Data can be presented to the user in different logical combinations, called views. Each view should support the same full range of data manipulation that direct-access to a table has available. In practice, providing update and delete access to logical views is difficult and is not fully supported by any current database.

Rule 7: High-level Insert, Update, and Delete

A single operation must be able to retrieve, insert, update, or delete data.

Data can be retrieved from a relational database in sets constructed of data from multiple rows and/or multiple tables. This rule states that insert, update, and delete operations should be supported for any retrievable set rather than just for a single row in a single table.

Rule 8: Physical Data Independence

Batch and end-user operations are logically separate from physical storage and access methods.

The user is isolated from the physical method of storing and retrieving information from the database. Changes can be made to the underlying architecture ( hardware, disk storage methods ) without affecting how the user accesses it.

Rule 9: Logical Data Independence

Batch and end-user operations can change the database schema without having to recreate it or the applications built upon it.

How a user views data should not change when the logical structure (tables structure) of the database changes. This rule is particularly difficult to satisfy. Most databases rely on strong ties between the user view of the data and the actual structure of the underlying tables.

Rule 10: Integrity Independence

Integrity constraints must be available and stored in the relational database metadata, not in an application program.

The database language (like SQL) should support constraints on user input that maintain database integrity. This rule is not fully implemented by most major vendors. At a minimum, all databases do preserve two constraints through SQL.

No component of a primary key can have a null value. (see rule 3) If a foreign key is defined in one table, any value in it must exist as a primary key in another table.

Rule 11: Distribution Independence

The data manipulation language of the relational system should not care where or how the physical data is centralized or distributed.

A user should be totally unaware of whether or not the database is distributed (whether parts of the database exist in multiple locations). A variety of reasons make this rule difficult to implement; I will spend time addressing these reasons when we discuss distributed databases.

Rule 12: Nonsubversion Rule

Any row processing done in the system must obey the same integrity rules and constraints that set-processing operations do.

There should be no way to modify the database structure other than through the multiple row database language (like SQL). Most databases today support administrative tools that allow some direct manipulation of the datastructure.


Relational database example[edit]

Here's a simple example of a relational database:

Your company needs a better way of keeping track of customers, products, and orders because your paper-based system just ain't cutting it anymore. One way of setting this up using the relational model is to create three tables: Customers, Products and Orders.

You can see that the Customer table doesn't care about orders or products, this keeps it focused on its objective - customers. Likewise, the Product table cares only about itself. The Order table uses the CustomerID and the ProductID to relate a product to a customer based on an order.

If you were to keep all this info in one spreadsheet you'd have difficulty trying to see subsets of information. How many widgits has Sally M. ordered over the last week; month; year? What is our most popular product? What points during the year are sales the highest?

Components[edit]

The relational model has three major aspects.

  • Structures - Structures are well-defined objects (such as tables, views, indexes, and so on) that store or access the data of a database. Structures and the data contained within them can be manipulated by operations.
  • Operations - Operations are clearly defined actions that allow users to manipulate the data and structures of a database. The operations on a database must adhere to a predefined set of integrity rules.
  • Integrity Rules - Integrity rules are the laws that govern which operations are allowed on the data and structures of a database. Integrity rules protect the data and the structure of a database.

Benefits of the relational model[edit]

A relational database management system offers benefits such as:

  • Independence of physical data storage and logical database structure
  • Variable and easy access to all data
  • Complete flexibility in database design
  • Reduce data storage and redundancy