Re: How to 'normalise' this scenario

From: Bob Badour <>
Date: Wed, 18 May 2011 07:46:39 -0700
Message-ID: <>

Fred. wrote:

> On May 18, 7:14 am, Erwin <> wrote:

>>On 17 mei, 21:50, "Fred." <> wrote:
>>>On May 17, 11:16 am, Erwin <> wrote:
>>>>On 17 mei, 13:16, Frank Millman <> wrote:
>>>>>On May 17, 12:08 pm, Erwin <> wrote:
>>>>>>On 16 mei, 08:50, Frank Millman <> wrote:
>>>>>Hi Erwin
>>>>>Thanks for your comments - you have got me thinking.
>>>>>I have stripped most of the comments, because what I have left is the
>>>>>crux of the matter.
>>>>>>You have not stated what "the requirement" is.
>>>>>Let's take an example we should all be familiar with - a file system.
>>>>>Assume that we have a lot of files, and we have a table where each row
>>>>>represents one file, with columns such as file name, disk address,
>>>>>date created, date last modified, etc.
>>>>>You can sort the table by name, creation date, etc. However, it is
>>>>>getting difficult to manage, so a bright spark comes up with the idea
>>>>>of directories/folders.
>>>>>The user can create directories on demand, place directories within
>>>>>directories, assign files to directories, move files between
>>>>>directories, etc.
>>>>>The requirement is to create a database structure to represent the
>>>>>My guess is that you could use an adjacency list or a nested set for
>>>>>this purpose. However, where I am getting stuck is, how does one link
>>>>>an entry in the 'files' table to an entry in the 'directories' table,
>>>>>to indicate that a particular file resides in a particular directory.
>>>>>I have been trying to create an entry in the 'directories' table that
>>>>>represents, or points to, a 'file'. From your comments, it would
>>>>>appear that this is the wrong approach.
>>>>>I could carry on and speculate further, but I think this is a good
>>>>>time to pause and ask if this is a good example, and ask if there is a
>>>>>preferred solution.
>>>>That you're now thinking is a good thing :-)
>>>>The contents of file systems are not the best example for discussing
>>>>management of graph data, because the "identity" (the means for
>>>>identification) of a file typically is a combination (concatenation)
>>>>of the "identity" of its containing directory and the file's own
>>>>distinguishing name _within that directory_. This is different
>>>>compared to, say, bill-of-material structures or genealogical
>>>>relationships, where the "nodes" (parts, people, ...) have a property
>>>>"of their own" that uniquely identifies them "within the entire
>>>>universe". This is not the case for file systems. *IX systems may
>>>>have multiple directories each containing a /bin "file", and all
>>>>those /bin "files" are distinct things. Windows systems have multiple
>>>>directories each containing an "Application data" folder, and those
>>>>are all distinct.
>>>>So if file systems are what's in your mind, then what you would
>>>>typically need (for representing that in a database) is a single table
>>>>where the _fully qualified_ filename is a key/identifier. And as soon
>>>>as you have that, there typically is no longer a need for explicitly
>>>>recording the "identity" of the parent directory/folder, as that is
>>>>already implied by the full name of the file itself.
>>>>I don't fully see what you mean by "creating an entry in the
>>>>directories table that points to a file". Relational database designs
>>>>do not include "pointers". And you don't say on the directory level
>>>>which files it contains, instead you say on the file level to which
>>>>directory each file belongs.- Hide quoted text -
>>>>- Show quoted text -
>>>I think Frank's analogy is on all fours.
>>>The nested sets model is consrained to those graphs which represent
>>>hierarchies. While each node will have a identity, users in Frank's
>>>application will be working with descriptors, which is one level
>>>removed from whatever identifier Frank sets up, and two levels removed
>>>from the physucal structure of the database. Leaf nodes in the
>>>structure will correspond 1-1 to entries in a separte product table,
>>>from which users will be trying to retrieve information.
>>>The situation with a file system is very similar. The directory
>>>structure is a hierachical set of nodes. While each node has an
>>>identity in the logical disk structures. user of the file system work
>>>with the descriptive names associated with each node, which are on the
>>>order of 2 levels removed from the physical structure of the disk.
>>>Leaf nodes in the structure correspond 1-1 to files which have a
>>>different structure on the disk.
>>>While the hierarchy represents a conceptual structure, that conceptual
>>>structure does not exist in isolation. Elements of it are already
>>>implemetned in the product table.
>>>I agree that there are differences as well as similarities. That's
>>>why I used the word "similar" and Frank used the word "like".
>>>Fred.- Tekst uit oorspronkelijk bericht niet weergeven -
>>>- Tekst uit oorspronkelijk bericht weergeven -
>>I have no clue what point you're trying to make with your first
>>paragraphs, but I just want to note that those very same "differences
>>and similarities" are probably also the reason why I wrote "NOT THE
>>BEST example [for dealing with graph data]".- Hide quoted text -
>>- Show quoted text -
> I'm starting to think some of the people on this topic can't read very
> well.  Frank wasn't trying to create an example for dealing with graph
> data.  He was trying to solve a business problem using nested sets to
> support hierachical lookups in his product data and was asking a
> specific question about how to organize the hierarch to do this.  If
> you wish to complain that he was off topic for a theory group,  I
> can't argue to the contrary.  Still, I can't see that it hurts theory
> too much to apply it once in a while.
> Fred.

Remind me: What theory are we applying? Received on Wed May 18 2011 - 16:46:39 CEST

Original text of this message