Re: How to 'normalise' this scenario

From: Frank Millman <>
Date: Wed, 18 May 2011 01:33:02 -0700 (PDT)
Message-ID: <>

On May 17, 5:16 pm, Erwin <> wrote:
> On 17 mei, 13:16, Frank Millman <> wrote:

> The contents of file systems are not the best example for discussing
> management of graph data, because the "identity" (the means for
> identification) of a file typically is a combination (concatenation)
> of the "identity" of its containing directory and the file's own
> distinguishing name _within that directory_.  This is different
> compared to, say, bill-of-material structures or genealogical
> relationships, where the "nodes" (parts, people, ...) have a property
> "of their own" that uniquely identifies them "within the entire
> universe".  This is not the case for file systems.  *IX systems may
> have multiple directories each containing a /bin "file", and all
> those /bin "files" are distinct things.  Windows systems have multiple
> directories each containing an "Application data" folder, and those
> are all distinct.

Agreed - a file system is not a perfect analogy because, as you point out, files in different directories can have the same name, so you need a combination of directory name and file name to uniquely identify a file. In the scenario that I am trying to describe, that does not apply.

> I don't fully see what you mean by "creating an entry in the
> directories table that points to a file".  Relational database designs
> do not include "pointers".  And you don't say on the directory level
> which files it contains, instead you say on the file level to which
> directory each file belongs.

I don't know the correct terminology, but take your example of a bill- -materials structure. Each element in the structure 'is', 'represents', 'points to', 'references', an item in a 'products' table, which will contain all the attributes of the product in question.

This is also not a perfect analogy for the scenario I am trying to describe, because in my scenario only the 'leaf' nodes represent items in a separate table.

How about a 'menu' system as an analogy? A typical cell phone has hundreds of functions available to the user. Instead of presenting the options in a long list, they are grouped in the form of menus, with sub-menus, of arbitrary depth, ultimately arriving at a function.

Only the 'leaf' nodes represent something 'tangible' - an executable function. All the other nodes are purely descriptive, and their only purpose is to assist the user in arriving at the function they are looking for.

Assume we have a table of 'functions', with columns such as function name, description, and a pointer to the executable code.

If we create a separate table representing the 'menu' system, how do we ensure that all 'leaf' nodes represent functions?

Maybe you gave a clue earlier, when you said -

> you don't say on the directory level which files it contains,
> instead you say on the file level to which directory each file belongs.

Substituting, are you saying that "you don't say on the menu level which function it contains, instead you say on the function level to which menu each function belongs."?

Are you suggesting the following?

  (node_id INT PRIMARY KEY,

  ..., ..., ...)

CREATE TABLE functions
  (function_id INT PRIMARY KEY,
  menu_id INT REFERENCES menus)

I can see the benefit of this approach, but I can see some issues.

  1. How do you ensure that functions.menu_id only references 'leaf' nodes?
  2. How do you ensure that every 'leaf' node is referenced by a functions.menu_id?
  3. How do you prevent a user from expanding a 'leaf' node and giving it children?

All of these issues can be handled at the application level. Are there any SQL constraints that could be used?

All comments welcome.

Frank Received on Wed May 18 2011 - 10:33:02 CEST

Original text of this message