Re: How to 'normalise' this scenario

From: Erwin <e.smout_at_myonline.be>
Date: Tue, 17 May 2011 08:16:27 -0700 (PDT)
Message-ID: <eafe2924-2bd9-4b23-b8e8-be2631c3f417_at_y19g2000yqk.googlegroups.com>


On 17 mei, 13:16, Frank Millman <fr..._at_chagford.com> wrote:
> On May 17, 12:08 pm, Erwin <e.sm..._at_myonline.be> wrote:
>
> > On 16 mei, 08:50, Frank Millman <fr..._at_chagford.com> wrote:
>
> Hi Erwin
>
> Thanks for your comments - you have got me thinking.
>
> I have stripped most of the comments, because what I have left is the
> crux of the matter.
>
> > You have not stated what "the requirement" is.
>
> Let's take an example we should all be familiar with - a file system.
>
> Assume that we have a lot of files, and we have a table where each row
> represents one file, with columns such as file name, disk address,
> date created, date last modified, etc.
>
> You can sort the table by name, creation date, etc. However, it is
> getting difficult to manage, so a bright spark comes up with the idea
> of directories/folders.
>
> The user can create directories on demand, place directories within
> directories, assign files to directories, move files between
> directories, etc.
>
> The requirement is to create a database structure to represent the
> directories.
>
> My guess is that you could use an adjacency list or a nested set for
> this purpose. However, where I am getting stuck is, how does one link
> an entry in the 'files' table to an entry in the 'directories' table,
> to indicate that a particular file resides in a particular directory.
>
> I have been trying to create an entry in the 'directories' table that
> represents, or points to, a 'file'. From your comments, it would
> appear that this is the wrong approach.
>
> I could carry on and speculate further, but I think this is a good
> time to pause and ask if this is a good example, and ask if there is a
> preferred solution.
>
> TIA
>
> Frank

That you're now thinking is a good thing :-)

The contents of file systems are not the best example for discussing management of graph data, because the "identity" (the means for identification) of a file typically is a combination (concatenation) of the "identity" of its containing directory and the file's own distinguishing name _within that directory_. This is different compared to, say, bill-of-material structures or genealogical relationships, where the "nodes" (parts, people, ...) have a property "of their own" that uniquely identifies them "within the entire universe". This is not the case for file systems. *IX systems may have multiple directories each containing a /bin "file", and all those /bin "files" are distinct things. Windows systems have multiple directories each containing an "Application data" folder, and those are all distinct.

So if file systems are what's in your mind, then what you would typically need (for representing that in a database) is a single table where the _fully qualified_ filename is a key/identifier. And as soon as you have that, there typically is no longer a need for explicitly recording the "identity" of the parent directory/folder, as that is already implied by the full name of the file itself.

I don't fully see what you mean by "creating an entry in the directories table that points to a file". Relational database designs do not include "pointers". And you don't say on the directory level which files it contains, instead you say on the file level to which directory each file belongs. Received on Tue May 17 2011 - 17:16:27 CEST

Original text of this message