Counting links between to records on the same table (Clustering?)

From: Dale Walker <dale_at_sorted.org>
Date: Mon, 27 Jun 2005 15:46:14 +0100
Message-ID: <2110c1d3puece199eq9hea8hh34l91op5u_at_4ax.com>


I'm trying to develop a little method that groups DJs together into similar tastes in music so that people can work out if they might like a certain night out.

Now in the DJ world, no-one can agree on definitions of musics styles, some have many music styles, some have styles with no names yet, etc., so whilst having a separate field containing a one-many link to a list of DJstyles might be useful some of the time, it's not going to work every time.

So I thought that using my quite extensive database of previous activities of DJs who play within the London area I could perhaps use some method of totalling up the times when two DJs appeared together on the same night and therefore be able to work out some sort of similarity that way.

My first attempt at this left me with a problem.

I've got a table of DJs names with various other bits and bobs about them. I tried to create a second table with a couple of fields, the first being that ID of a DJ, the second being the ID of another DJ on the same night. A third field would simply be a counter to add up each time the two people played together. The problem was, there was no way of automatically making sure that if the two DJs were entered in reverse the totals would match.

How do I make sure that regardless of which order the two DJs names are entered, the total number of times they played together will be correct and match?

I had the thought that I could somehow create a list of all the possible pairings and then use the id from that to link to each Name on the master DJ Name list but I'm unsure of how to do that. I'm trying to avoid as much coding as possible but realise that sometimes, coding is the only way out.

If I'm barking up the wrong tree, please let me know.

I also kind of get the feeling I'm getting into the area of clustering algorithms. Are there any resources on the net that deal with database implementations of that?

---
Dale Walker
London Techno Events
dale_at_sorted,org
http://london.sorted.org
Received on Mon Jun 27 2005 - 16:46:14 CEST

Original text of this message