Re: How to 'normalise' this scenario

From: Erwin <e.smout_at_myonline.be>
Date: Tue, 17 May 2011 08:40:13 -0700 (PDT)
Message-ID: <9f5ee79e-7c69-40cd-a747-4004e6eb7fdc_at_z13g2000yqg.googlegroups.com>


On 17 mei, 17:16, Erwin <e.sm..._at_myonline.be> wrote:
>
> That you're now thinking is a good thing :-)
>
> The contents of file systems are not the best example for discussing
> management of graph data, because the "identity" (the means for
> identification) of a file typically is a combination (concatenation)
> of the "identity" of its containing directory and the file's own
> distinguishing name _within that directory_.  This is different
> compared to, say, bill-of-material structures or genealogical
> relationships, where the "nodes" (parts, people, ...) have a property
> "of their own" that uniquely identifies them "within the entire
> universe".  This is not the case for file systems.  *IX systems may
> have multiple directories each containing a /bin "file", and all
> those /bin "files" are distinct things.  Windows systems have multiple
> directories each containing an "Application data" folder, and those
> are all distinct.
>
> So if file systems are what's in your mind, then what you would
> typically need (for representing that in a database) is a single table
> where the _fully qualified_ filename is a key/identifier.  And as soon
> as you have that, there typically is no longer a need for explicitly
> recording the "identity" of the parent directory/folder, as that is
> already implied by the full name of the file itself.
>
> I don't fully see what you mean by "creating an entry in the
> directories table that points to a file".  Relational database designs
> do not include "pointers".  And you don't say on the directory level
> which files it contains, instead you say on the file level to which
> directory each file belongs.- Tekst uit oorspronkelijk bericht niet weergeven -
>
> - Tekst uit oorspronkelijk bericht weergeven -

A small example of a single-table database representing the contents of a file system :

(not valid SQL, but I hope you'll see the idea)

TABLE FILES {FILENAME:VARCHAR(...) DIRNAME:VARCHAR(...) other attributes (creationdate, ...) here} KEY {FILENAME} FOREIGN KEY FILES{DIRNAME} REFERENCES FILES{FILENAME} CHECK CONSTRAINT LENGTH(DIRNAME) < LENGTH(FILENAME) && SUBSTR(FILENAME, 0,LENGTH(DIRNAME) == DIRNAME) &&
SUBSTR(FILENAME,LENGTH(DIRNAME),LENGTH(DIRNAME)+1) == '/' sample data :

"",""   /*the root point */
"\PF",""
"\UD",""
"\PF\IBM","\PF"
"\PF\IBM\DB2","\PF\IBM"

Note that this way of dealing with the "root" entry is just a hack and at any rate it doesn't satisfy the check constraint as stated. But that's not the main point, so I gloss over it.

The thing is, while this is indeed a file containment graph represented in the form of an adjacency list, you don't really need the adjacency list to answer queries such as "list me all files contained directly or indirectly in the \PF directory". Owing to the fact that the DIRNAME value is always derivable from the FILENAME. Received on Tue May 17 2011 - 17:40:13 CEST

Original text of this message