Re: one big table or many smaller tables

From: GoranG <no_at_spam.net>
Date: Fri, 19 Jul 2002 11:48:13 +0200
Message-ID: <61mfjuc6tngqjplql5aed6vrevnpg07fff_at_4ax.com>


On Tue, 16 Jul 2002 12:15:26 -0700, "news0.alterdial.uu.net" <DCraig_at_abc-clio.com> wrote:

>Database theorists:
>I have a fairly complex data structure with about 110 tables, 11 of
>the tables have 'notes' related to them and each row can have multiple
>notes associated with them.
>
>We're considering two ways to model this, the simplest
>structure is to have a notes table associated with each table that will have
>notes, so you have
>table1_notes, table2_notes etc, each with the primary key of the
>related table in each row in notes.
>
>In this simple model to query the notes table it would be something like:
>"SELECT note_text FROM table1_notes with id=currentID"
>
>The other way is to have one large notes table (it would have about..
>I think it was 60,000 records in it, not enough that performance would
>be an issue), and have the primary key of each related table + the
>table name in the notes table, so you would specify the table name
>when you query the notes table, like:
>"SELECT note_text FROM notes with table_name='table1'
>AND table_id=currentID"
>
>I'm advocating the second method because in the interface design there
>will be several places where we need to pull together notes from
>multiple related tables, for instance if table1 is students and table2
>is classes and table3 is instructors and I'm at a student record and
>want to see the notes for the student and the classes and the
>instructor it seems to me it would be reasonable to modify the query
>pulling in the student notes:
>"SELECT note_text FROM notes, classes_students
>WITH notes.table_id = current_student_id
>AND notes.table_name = 'students' "
>
>and add notes for the classes, let's see:
>AND classes_students.student_id=current_student_id
>AND with notes.table_id=classes_students.classes_id
>AND with notes.table_name='classes'

Which can be done on the 220 tables with UNION as well.

SELECT 'students notes' AS notes_origin, note_text FROM students_notes WHERE notes.students = current_student_id UNION
SELECT 'classes notes' AS notes_origin, note_text FROM classes_notes, classes_students
WITH classes_students.student_id = current_student_id AND classes_notes.classes_id = classes_students.classes_id

Please note that with some naming conventions the complete SQL string can be built, given only the 2 tables' names 'students' and 'classes' and of course, the current_student_id.

I think it is easier to keep the ref control this way... and the data model is slightly more obvious, at the expense of the size of it :)

This way model also allows greater flexibility in design... some notes might have additional properties.

Still I think that decission should be based on the question: are notes related in meaning or not, do they refer to the same area of interest and comment on same entities.
If they do then you might expect specification request for searches through all notes for some text... (student name, etc..) and that would (i think) make one table approach more interesting (or would lead to some sort of denormalization to approach acceptable performance).
If only pairs or small groups of notes entities are related then divided approach might make more sense.

( GoranG79 AT hotmail.com ) Received on Fri Jul 19 2002 - 11:48:13 CEST

Original text of this message