Re: best way to store checkbox + text-entry fields

From: Jonathan Leffler <jleffler_at_earthlink.net>
Date: Sun, 13 Oct 2002 19:41:42 GMT
Message-ID: <3DA9CC6F.9030203_at_earthlink.net>


Eep² wrote:

> I'm designing a PHP-MySQL computer game comparison database and I'm still not sure how to store checkboxes and text-entry fields as I have them laid out in the game submission form at http://tnlc.com/eep/compare/gameform.html . I would VERY MUCH appreciate any help in how best to design the tables for this database. Here is what I've come up with so far:
>
> 1. a single cross-reference table that holds the IDs of most other tables in a heirarchial structure (see http://tnlc.com/eep/compare/relations.jpg)
> 2. multiple games_whatever (genre, publisher, developer, engine, etc) cross-reference tables (see http://tnlc.com/eep/compare/relations3.gif - disregard the duplicate websiteID fields)
> 3. same as 2 except adding the text-entry fields in the cross-reference tables (see http://tnlc.com/eep/compare/relations4.gif - only "genres" table has this)
> 4. same as 2 except the cross-reference tables consist of id, gameID, and each column of the table's contents instead of just listing the contents in a single column as in the previous examples) - for example:
>
> table: games
> fields: id, name
> data:
> 1, Tomb Raider
> 2, Tomb Raider II
> ...
>
> table: games_genres
> fields: id, gameID, action, adventure, role-playing, simulation, strategy, other
> data:
> 1, 1, 1, 1, null, null, null, null
> 2, 2, 1, 1, null, null, null, null
> ...

This is wasteful if most games have one or two genres and the list of genres grows to any significant size. Doubly so if users are going to be able to add new genres on the fly (see below). If you add a new genre, you have to add a column to this table, and populate it with appropriate values (presumably null given the sample data above).

> This allows only adding the appropriate genres to each game without having to duplicate gameID keys in "normal" cross-reference tables:

How big is the genre field? Integer constrained to values 1 and null?   Why not 1 and 0, anyway. It does not take long to waste more in the unneeded column. Let's see: gameID = integer = 4 bytes; genreID = integer = 4 bytes; id = integer = 4 bytes. The design above is using 32 bytes per row, for 64 bytes. A design recording just the id, gameID and genreID would use 12 bytes per row and 4 rows, meaning just 48 bytes -- 32 if you dropped the id field. So you're already wasting space and the wastage is going to get worse as you add more genres. Plus it is going to be harder and harder to write the queries against this table as you add more genres.

> table: games_genres
> fields: id, gameID, genreID
> data:
> 1, 1, 1
> 2, 1, 2
> 3, 2, 1
> 4, 2, 2
> ...

I'm not clear why you need the 'id' column in this table; the combination of gameID and genreID is a candidate key and unless you are going to be storing auxilliary information about the combination of gameID and genreID in other tables (that need to cross-reference this one), then I would expect a net loss using the id column.  

> Is #4 the best, most efficient way to do this or must I use the wasteful duplicate-key design?

I don't see where the waste is in the 'duplicate key design', but then I'm not sure whether I see the 'duplicate key design' either. Give or take the id column, this is the most obvious way to model what I've seen of your requirements.

 > I want to allow game submitters to enter new fields into the
 > various tables via text-entry fields (which I validate) so I need
 > some  way of storing them (and turning them into checkboxes) but
 > I'm not sure how to design this either.  Also, not all
 > text-entry fields will be added as new checkboxes (particularly
 > if the feature is VERY unique/specific to the game).

New fields in general probably mean table reorganizations - and is probably not a sensible goal. However, if you mean you want users to be able to add new genres, then this #4 is way superior to #3. You can simply define a new genreID in the table defining the meanings of the valid genres (create table genre(id ... primary key, description ...)), and people can add the new genre as a new row in here, and you can use it in the games_genres table. As for checkboxes, the genre table tells you which genre codes exist, so you can use the contents of the table to control the construction of your checkboxes. Additional attributes in the table can tell you whether it needs to be displayed (for your unique feature problem), and the order in which the values should be displayed.

[...all the above written without looking at the pictures on the web site...]

The stuff in relations.jpg appears to be trying to capture different peoples views of the same game; the stuff in relations3.gif and relatsions4.gif appears to be trying to present a single (authoritative) view of the game. You need to decide which is the way you want to progress - both designs are plausible, but they are very different.

-- 
Jonathan Leffler                   #include <disclaimer.h>
Email: jleffler_at_earthlink.net, jleffler_at_us.ibm.com
Guardian of DBD::Informix 1.00.PC2 -- http://dbi.perl.org/
Received on Sun Oct 13 2002 - 21:41:42 CEST

Original text of this message