Re: How to 'normalise' this scenario

From: Erwin <e.smout_at_myonline.be>
Date: Wed, 18 May 2011 06:53:52 -0700 (PDT)
Message-ID: <365d59e6-ff3c-4dce-b858-a2626c56c168_at_c1g2000yqe.googlegroups.com>


On 18 mei, 15:27, "Fred." <ghrno-goo..._at_yahoo.com> wrote:
> On May 18, 7:14 am, Erwin <e.sm..._at_myonline.be> wrote:
>
>
>
>
>
> > On 17 mei, 21:50, "Fred." <ghrno-goo..._at_yahoo.com> wrote:
>
> > > On May 17, 11:16 am, Erwin <e.sm..._at_myonline.be> wrote:
>
> > > > On 17 mei, 13:16, Frank Millman <fr..._at_chagford.com> wrote:
>
> > > > > On May 17, 12:08 pm, Erwin <e.sm..._at_myonline.be> wrote:
>
> > > > > > On 16 mei, 08:50, Frank Millman <fr..._at_chagford.com> wrote:
>
> > > > > Hi Erwin
>
> > > > > Thanks for your comments - you have got me thinking.
>
> > > > > I have stripped most of the comments, because what I have left is the
> > > > > crux of the matter.
>
> > > > > > You have not stated what "the requirement" is.
>
> > > > > Let's take an example we should all be familiar with - a file system.
>
> > > > > Assume that we have a lot of files, and we have a table where each row
> > > > > represents one file, with columns such as file name, disk address,
> > > > > date created, date last modified, etc.
>
> > > > > You can sort the table by name, creation date, etc. However, it is
> > > > > getting difficult to manage, so a bright spark comes up with the idea
> > > > > of directories/folders.
>
> > > > > The user can create directories on demand, place directories within
> > > > > directories, assign files to directories, move files between
> > > > > directories, etc.
>
> > > > > The requirement is to create a database structure to represent the
> > > > > directories.
>
> > > > > My guess is that you could use an adjacency list or a nested set for
> > > > > this purpose. However, where I am getting stuck is, how does one link
> > > > > an entry in the 'files' table to an entry in the 'directories' table,
> > > > > to indicate that a particular file resides in a particular directory.
>
> > > > > I have been trying to create an entry in the 'directories' table that
> > > > > represents, or points to, a 'file'. From your comments, it would
> > > > > appear that this is the wrong approach.
>
> > > > > I could carry on and speculate further, but I think this is a good
> > > > > time to pause and ask if this is a good example, and ask if there is a
> > > > > preferred solution.
>
> > > > > TIA
>
> > > > > Frank
>
> > > > That you're now thinking is a good thing :-)
>
> > > > The contents of file systems are not the best example for discussing
> > > > management of graph data, because the "identity" (the means for
> > > > identification) of a file typically is a combination (concatenation)
> > > > of the "identity" of its containing directory and the file's own
> > > > distinguishing name _within that directory_.  This is different
> > > > compared to, say, bill-of-material structures or genealogical
> > > > relationships, where the "nodes" (parts, people, ...) have a property
> > > > "of their own" that uniquely identifies them "within the entire
> > > > universe".  This is not the case for file systems.  *IX systems may
> > > > have multiple directories each containing a /bin "file", and all
> > > > those /bin "files" are distinct things.  Windows systems have multiple
> > > > directories each containing an "Application data" folder, and those
> > > > are all distinct.
>
> > > > So if file systems are what's in your mind, then what you would
> > > > typically need (for representing that in a database) is a single table
> > > > where the _fully qualified_ filename is a key/identifier.  And as soon
> > > > as you have that, there typically is no longer a need for explicitly
> > > > recording the "identity" of the parent directory/folder, as that is
> > > > already implied by the full name of the file itself.
>
> > > > I don't fully see what you mean by "creating an entry in the
> > > > directories table that points to a file".  Relational database designs
> > > > do not include "pointers".  And you don't say on the directory level
> > > > which files it contains, instead you say on the file level to which
> > > > directory each file belongs.- Hide quoted text -
>
> > > > - Show quoted text -
>
> > > I think Frank's analogy is on all fours.
>
> > > The nested sets model is consrained to those graphs which represent
> > > hierarchies.  While each node will have a identity, users in Frank's
> > > application will be working with descriptors, which is one level
> > > removed from whatever identifier Frank sets up, and two levels removed
> > > from the physucal structure of the database.  Leaf nodes in the
> > > structure will correspond 1-1 to entries in a separte product table,
> > > from which users will be trying to retrieve information.
>
> > > The situation with a file system is very similar.  The directory
> > > structure is a hierachical set of nodes.  While each node has an
> > > identity in the logical disk structures. user of the file system work
> > > with the descriptive names associated with each node, which are on the
> > > order of 2 levels removed from the physical structure of the disk.
> > > Leaf nodes in the structure correspond 1-1 to files which have a
> > > different structure on the disk.
>
> > > While the hierarchy represents a conceptual structure, that conceptual
> > > structure does not exist in isolation.  Elements of it are already
> > > implemetned in the product table.
>
> > > I agree that there are differences as well as similarities.  That's
> > > why I used the word "similar" and Frank used the word "like".
>
> > > Fred.- Tekst uit oorspronkelijk bericht niet weergeven -
>
> > > - Tekst uit oorspronkelijk bericht weergeven -
>
> > I have no clue what point you're trying to make with your first
> > paragraphs, but I just want to note that those very same "differences
> > and similarities" are probably also the reason why I wrote "NOT THE
> > BEST example [for dealing with graph data]".- Hide quoted text -
>
> > - Show quoted text -
>
> I'm starting to think some of the people on this topic can't read very
> well.  Frank wasn't trying to create an example for dealing with graph
> data.  He was trying to solve a business problem using nested sets to
> support hierachical lookups in his product data and was asking a
> specific question about how to organize the hierarch to do this.  If
> you wish to complain that he was off topic for a theory group,  I
> can't argue to the contrary.  Still, I can't see that it hurts theory
> too much to apply it once in a while.
>
> Fred.- Tekst uit oorspronkelijk bericht niet weergeven -
>
> - Tekst uit oorspronkelijk bericht weergeven -

Well, first, I do not "complain" that he poses a question that can indeed legitimately be considered as "off-topic for the theory group". Had my intent been to do that kind of complaining, I simply would have said so (and left it at that afterwards).

Second, it does remain a fact that we are indeed in the theory group, and what I have tried is to provide the answers that theory has to give regarding the problem area that the OP is dealing with. Or rather, what I think theory's answers to this problem are. Others may still be around that can speak with more authority than I, but I know they will be much less inclined to respond, precisely because that offtopicness  that I am not so bothered with.

Third, applying theory "once in a while" obviously requires that one actually _knows_ that theory. I invite you to google around the whole internet for a definition of a mathematical theory of graphs that does _NOT_ define, somehow, a graph as being (or at least consisting of) a "set of edges". And if relational theory has it that a relation is (or at least consists of) a "set of tuples", I invite you to explain how one can reasonably justify that the "most natural" way of "dealing with graphs and hierarchies relationally" would somehow NOT be to just simply have a "set of tuples" where each tuple represents an edge of the graph. That is, an adjacency list.

To conclude : if the OP was indeed "asking a specific question about how to organize the hierarchy", then isn't the information I gave him about the adjacency list approach a most relevant and pertinent answer to precisely the question that he asked ? The only way imo to answer this with a "No it's not" is to observe that the OP had already chosen the solution (nested sets), and was now looking for a problem. I for one do not consider that a very smart way of "applying theory once in a while". Received on Wed May 18 2011 - 15:53:52 CEST

Original text of this message