Re: How to 'normalise' this scenario
From: Bob Badour <bob_at_badour.net>
Date: Wed, 18 May 2011 07:46:39 -0700
Message-ID: <dbydnYMJ8eBPRk7QnZ2dnUVZ5oudnZ2d_at_giganews.com>
>>On 17 mei, 21:50, "Fred." <ghrno-goo..._at_yahoo.com> wrote:
>>
>>
>>
>>
>>
>>
>>>On May 17, 11:16 am, Erwin <e.sm..._at_myonline.be> wrote:
>>
>>>>On 17 mei, 13:16, Frank Millman <fr..._at_chagford.com> wrote:
>>
>>>>>On May 17, 12:08 pm, Erwin <e.sm..._at_myonline.be> wrote:
>>
>>>>>>On 16 mei, 08:50, Frank Millman <fr..._at_chagford.com> wrote:
>>
>>>>>Hi Erwin
>>
>>>>>Thanks for your comments - you have got me thinking.
>>
>>>>>I have stripped most of the comments, because what I have left is the
>>>>>crux of the matter.
>>
>>>>>>You have not stated what "the requirement" is.
>>
>>>>>Let's take an example we should all be familiar with - a file system.
>>
>>>>>Assume that we have a lot of files, and we have a table where each row
>>>>>represents one file, with columns such as file name, disk address,
>>>>>date created, date last modified, etc.
>>
>>>>>You can sort the table by name, creation date, etc. However, it is
>>>>>getting difficult to manage, so a bright spark comes up with the idea
>>>>>of directories/folders.
>>
>>>>>The user can create directories on demand, place directories within
>>>>>directories, assign files to directories, move files between
>>>>>directories, etc.
>>
>>>>>The requirement is to create a database structure to represent the
>>>>>directories.
>>
>>>>>My guess is that you could use an adjacency list or a nested set for
>>>>>this purpose. However, where I am getting stuck is, how does one link
>>>>>an entry in the 'files' table to an entry in the 'directories' table,
>>>>>to indicate that a particular file resides in a particular directory.
>>
>>>>>I have been trying to create an entry in the 'directories' table that
>>>>>represents, or points to, a 'file'. From your comments, it would
>>>>>appear that this is the wrong approach.
>>
>>>>>I could carry on and speculate further, but I think this is a good
>>>>>time to pause and ask if this is a good example, and ask if there is a
>>>>>preferred solution.
>>
>>>>>TIA
>>
>>>>>Frank
>>
>>>>That you're now thinking is a good thing :-)
>>
>>>>The contents of file systems are not the best example for discussing
>>>>management of graph data, because the "identity" (the means for
>>>>identification) of a file typically is a combination (concatenation)
>>>>of the "identity" of its containing directory and the file's own
>>>>distinguishing name _within that directory_. This is different
>>>>compared to, say, bill-of-material structures or genealogical
>>>>relationships, where the "nodes" (parts, people, ...) have a property
>>>>"of their own" that uniquely identifies them "within the entire
>>>>universe". This is not the case for file systems. *IX systems may
>>>>have multiple directories each containing a /bin "file", and all
>>>>those /bin "files" are distinct things. Windows systems have multiple
>>>>directories each containing an "Application data" folder, and those
>>>>are all distinct.
>>
>>>>So if file systems are what's in your mind, then what you would
>>>>typically need (for representing that in a database) is a single table
>>>>where the _fully qualified_ filename is a key/identifier. And as soon
>>>>as you have that, there typically is no longer a need for explicitly
>>>>recording the "identity" of the parent directory/folder, as that is
>>>>already implied by the full name of the file itself.
>>
>>>>I don't fully see what you mean by "creating an entry in the
>>>>directories table that points to a file". Relational database designs
>>>>do not include "pointers". And you don't say on the directory level
>>>>which files it contains, instead you say on the file level to which
>>>>directory each file belongs.- Hide quoted text -
>>
>>>>- Show quoted text -
>>
>>>I think Frank's analogy is on all fours.
>>
>>>The nested sets model is consrained to those graphs which represent
>>>hierarchies. While each node will have a identity, users in Frank's
>>>application will be working with descriptors, which is one level
>>>removed from whatever identifier Frank sets up, and two levels removed
>>>from the physucal structure of the database. Leaf nodes in the
>>>structure will correspond 1-1 to entries in a separte product table,
>>>from which users will be trying to retrieve information.
>>
>>>The situation with a file system is very similar. The directory
>>>structure is a hierachical set of nodes. While each node has an
>>>identity in the logical disk structures. user of the file system work
>>>with the descriptive names associated with each node, which are on the
>>>order of 2 levels removed from the physical structure of the disk.
>>>Leaf nodes in the structure correspond 1-1 to files which have a
>>>different structure on the disk.
>>
>>>While the hierarchy represents a conceptual structure, that conceptual
>>>structure does not exist in isolation. Elements of it are already
>>>implemetned in the product table.
>>
>>>I agree that there are differences as well as similarities. That's
>>>why I used the word "similar" and Frank used the word "like".
>>
>>>Fred.- Tekst uit oorspronkelijk bericht niet weergeven -
>>
>>>- Tekst uit oorspronkelijk bericht weergeven -
>>
>>I have no clue what point you're trying to make with your first
>>paragraphs, but I just want to note that those very same "differences
>>and similarities" are probably also the reason why I wrote "NOT THE
>>BEST example [for dealing with graph data]".- Hide quoted text -
>>
>>- Show quoted text -
Date: Wed, 18 May 2011 07:46:39 -0700
Message-ID: <dbydnYMJ8eBPRk7QnZ2dnUVZ5oudnZ2d_at_giganews.com>
Fred. wrote:
> On May 18, 7:14 am, Erwin <e.sm..._at_myonline.be> wrote: >
>>On 17 mei, 21:50, "Fred." <ghrno-goo..._at_yahoo.com> wrote:
>>
>>
>>
>>
>>
>>
>>>On May 17, 11:16 am, Erwin <e.sm..._at_myonline.be> wrote:
>>
>>>>On 17 mei, 13:16, Frank Millman <fr..._at_chagford.com> wrote:
>>
>>>>>On May 17, 12:08 pm, Erwin <e.sm..._at_myonline.be> wrote:
>>
>>>>>>On 16 mei, 08:50, Frank Millman <fr..._at_chagford.com> wrote:
>>
>>>>>Hi Erwin
>>
>>>>>Thanks for your comments - you have got me thinking.
>>
>>>>>I have stripped most of the comments, because what I have left is the
>>>>>crux of the matter.
>>
>>>>>>You have not stated what "the requirement" is.
>>
>>>>>Let's take an example we should all be familiar with - a file system.
>>
>>>>>Assume that we have a lot of files, and we have a table where each row
>>>>>represents one file, with columns such as file name, disk address,
>>>>>date created, date last modified, etc.
>>
>>>>>You can sort the table by name, creation date, etc. However, it is
>>>>>getting difficult to manage, so a bright spark comes up with the idea
>>>>>of directories/folders.
>>
>>>>>The user can create directories on demand, place directories within
>>>>>directories, assign files to directories, move files between
>>>>>directories, etc.
>>
>>>>>The requirement is to create a database structure to represent the
>>>>>directories.
>>
>>>>>My guess is that you could use an adjacency list or a nested set for
>>>>>this purpose. However, where I am getting stuck is, how does one link
>>>>>an entry in the 'files' table to an entry in the 'directories' table,
>>>>>to indicate that a particular file resides in a particular directory.
>>
>>>>>I have been trying to create an entry in the 'directories' table that
>>>>>represents, or points to, a 'file'. From your comments, it would
>>>>>appear that this is the wrong approach.
>>
>>>>>I could carry on and speculate further, but I think this is a good
>>>>>time to pause and ask if this is a good example, and ask if there is a
>>>>>preferred solution.
>>
>>>>>TIA
>>
>>>>>Frank
>>
>>>>That you're now thinking is a good thing :-)
>>
>>>>The contents of file systems are not the best example for discussing
>>>>management of graph data, because the "identity" (the means for
>>>>identification) of a file typically is a combination (concatenation)
>>>>of the "identity" of its containing directory and the file's own
>>>>distinguishing name _within that directory_. This is different
>>>>compared to, say, bill-of-material structures or genealogical
>>>>relationships, where the "nodes" (parts, people, ...) have a property
>>>>"of their own" that uniquely identifies them "within the entire
>>>>universe". This is not the case for file systems. *IX systems may
>>>>have multiple directories each containing a /bin "file", and all
>>>>those /bin "files" are distinct things. Windows systems have multiple
>>>>directories each containing an "Application data" folder, and those
>>>>are all distinct.
>>
>>>>So if file systems are what's in your mind, then what you would
>>>>typically need (for representing that in a database) is a single table
>>>>where the _fully qualified_ filename is a key/identifier. And as soon
>>>>as you have that, there typically is no longer a need for explicitly
>>>>recording the "identity" of the parent directory/folder, as that is
>>>>already implied by the full name of the file itself.
>>
>>>>I don't fully see what you mean by "creating an entry in the
>>>>directories table that points to a file". Relational database designs
>>>>do not include "pointers". And you don't say on the directory level
>>>>which files it contains, instead you say on the file level to which
>>>>directory each file belongs.- Hide quoted text -
>>
>>>>- Show quoted text -
>>
>>>I think Frank's analogy is on all fours.
>>
>>>The nested sets model is consrained to those graphs which represent
>>>hierarchies. While each node will have a identity, users in Frank's
>>>application will be working with descriptors, which is one level
>>>removed from whatever identifier Frank sets up, and two levels removed
>>>from the physucal structure of the database. Leaf nodes in the
>>>structure will correspond 1-1 to entries in a separte product table,
>>>from which users will be trying to retrieve information.
>>
>>>The situation with a file system is very similar. The directory
>>>structure is a hierachical set of nodes. While each node has an
>>>identity in the logical disk structures. user of the file system work
>>>with the descriptive names associated with each node, which are on the
>>>order of 2 levels removed from the physical structure of the disk.
>>>Leaf nodes in the structure correspond 1-1 to files which have a
>>>different structure on the disk.
>>
>>>While the hierarchy represents a conceptual structure, that conceptual
>>>structure does not exist in isolation. Elements of it are already
>>>implemetned in the product table.
>>
>>>I agree that there are differences as well as similarities. That's
>>>why I used the word "similar" and Frank used the word "like".
>>
>>>Fred.- Tekst uit oorspronkelijk bericht niet weergeven -
>>
>>>- Tekst uit oorspronkelijk bericht weergeven -
>>
>>I have no clue what point you're trying to make with your first
>>paragraphs, but I just want to note that those very same "differences
>>and similarities" are probably also the reason why I wrote "NOT THE
>>BEST example [for dealing with graph data]".- Hide quoted text -
>>
>>- Show quoted text -
> > > I'm starting to think some of the people on this topic can't read very > well. Frank wasn't trying to create an example for dealing with graph > data. He was trying to solve a business problem using nested sets to > support hierachical lookups in his product data and was asking a > specific question about how to organize the hierarch to do this. If > you wish to complain that he was off topic for a theory group, I > can't argue to the contrary. Still, I can't see that it hurts theory > too much to apply it once in a while. > > Fred.
Remind me: What theory are we applying? Received on Wed May 18 2011 - 16:46:39 CEST