Re: Counting links between to records on the same table (Clustering?)

From: Alexandr Savinov <savinov_at_host.com>
Date: Mon, 27 Jun 2005 18:05:24 +0200
Message-ID: <42c023d0$1_at_news.fhg.de>


Dale Walker schrieb:
> I'm trying to develop a little method that groups DJs together into
> similar tastes in music so that people can work out if they might like
> a certain night out.
>
> Now in the DJ world, no-one can agree on definitions of musics styles,
> some have many music styles, some have styles with no names yet, etc.,
> so whilst having a separate field containing a one-many link to a list
> of DJstyles might be useful some of the time, it's not going to work
> every time.
>
> So I thought that using my quite extensive database of previous
> activities of DJs who play within the London area I could perhaps use
> some method of totalling up the times when two DJs appeared together
> on the same night and therefore be able to work out some sort of
> similarity that way.
>
> My first attempt at this left me with a problem.
>
> I've got a table of DJs names with various other bits and bobs about
> them. I tried to create a second table with a couple of fields, the
> first being that ID of a DJ, the second being the ID of another DJ on
> the same night. A third field would simply be a counter to add up each
> time the two people played together. The problem was, there was no way
> of automatically making sure that if the two DJs were entered in
> reverse the totals would match.
>
> How do I make sure that regardless of which order the two DJs names
> are entered, the total number of times they played together will be
> correct and match?
>
> I had the thought that I could somehow create a list of all the
> possible pairings and then use the id from that to link to each Name
> on the master DJ Name list but I'm unsure of how to do that. I'm
> trying to avoid as much coding as possible but realise that sometimes,
> coding is the only way out.

If you want to minimize coding then add two records for each event with the opposite order of DJs. This will reflect the symmetric character of the relationship. For example, in accounting you will add two records for one event: one to debit the first account and another to credit the second account.

So you will have a long table EVENTS(DJ1, DJ2). For clustering or other analysis purposes you might add more fields for characterizing the events such as the number of people. Probably there is a difference between an event with 500 people and 10000 people.

If you want to have three and more DJs associated with one event then you should add one record for each pair (EventID, DJID).

-- 
http://conceptoriented.com
Received on Mon Jun 27 2005 - 18:05:24 CEST

Original text of this message