cimode_at_hotmail.com wrote:
> [Snipped]
>> Oh what the heck. I promise! :)
> Be careful what you wish for...
>
> You just swallowed the red pill in Matrix...
OK, here's the sketch, and it is only a sketch of the "best" approach. I
am going to describe it in terms of SQL only because that is what you
are going to end up using. I hope you will end up thinking my sketch
is simplistic or trivial. (Looking at a database design and asking
"but where's the magic?" is probably a sign that you are on the right
track. There shouldn't be anything magic or mysterious in the database
that can be discovered/disclosed/decoded only by the application at
run-time.) I am sure others here will elaborate on my suggestion if
they feel I've been excessively colloquial/sloppy/terse/SQL-ish, but
this is a sketch for a newbie in a hurry and I am rushing this out
before the boss catches me. :-)
Before I start I should clarify the terminology a bit. To me,
a questionnaire is a set of questions. A single questionnaire may be
administered to any number of interviewees. Each interviewee will
provide a response which is their set of answers to the questions on
the questionnaire, hence each questionnaire will have multiple
responses--one from each interviewee.
First, note that each row in an SQL table is expected to be
understood as an assertion of fact. There is expected to be a
"template" of a sentence associated with the table, into which the
values from each row can be plugged to form a meaningful statement.
For instance a simple template for the responses to one questionnaire
might be "on day <date> the interviewee called <name> answered question
1 <answer1> and question 2 <answer2> and question 3 <answer3>...". We
don't store the template anywhere--maybe we should but we don't.
However it should be written down somewhere because everything that you
do subsequently in your database design and your application code has
to make sense with respect to that template.
I know nothing about your business process (and I don't want to--too
busy),but it seems to me that you probably need just one response table
per questionnaire to record the responses. If you have thousands of
questionnaires then you'll need thousands of response tables. No
worry; there is no ration on tables, and multiple smaller tables will
make querying fast.
I expect different questionnaires to have different kinds of reponses.
One might be a set of yes/no answers. Another might be a set scores
from best to worst or whatever. Another might be set of observations
(e.g. date, time, temperature, windchill, etc.). And yet another could
be a mix. So for each questionnaire you create a response table with
columns of the appropriate type for that questionnaire. Nothing
mysterious so far. But note that even if all your questionnaires have
the exact same kind of responses they should still each have their own
response table. Having the same "shape" is an irrelevant coincidence;
don't be tempted to confuse or conflate them because that obscures
what is really going on.
When you create your response table in your database, the SQL
engine will automatically create entries in the database catalogue
tables to describe your table. It *has* to do this, both for the
obvious reasons, and because the relational model (which SQL pretends
to respect) insists that the meta-data be represented explicitly in the
database in a form capable of being queried by anyone. I suspect this
one point is where 99.99% of practitioners go astray. They don't know
the meta-data is there and that it is REQUIRED to be there and it is
absolutely legitimate to assume it will always be there and always be
public! The meta-data tells both you and the SQL engine what columns
exist in the table and what each is called and what its data-type is
and a whole lot of other vital stuff.
This is where your moderately elaborate application comes in.
You need to write a generic questionnaire-processing application
which takes a response table name as input, and then interrogates the
system catalogues for the meta-data to discover the number and type of
each column in the response table. Once it has that information it can
insert/update/delete/select using dynamically constructed queries.
Just what your program has to look like will depend crucially on your
programming language and your DBMS. If you are using any kind of API
you are probably already constructing queries dynamically whether you
know it or not. If you are using something more old-school like Cobol
or C with embedded SQL you will have to learn to use dynamic SQL, but
it's actually quite easy.
I haven't talked about a table for the questionnaire itself (i.e. the
questions). I don't see any need for such a table in the database. It
is just documentation. Obviously if it is convenient to have
the questions in the database for the purpose of decorating reports then
by all means have one.
The reason this approach deserves the label "best" is because it
exploits SQL instead of working against it or in spite of it. You
therefore end up with an intuitively simple table design (and an
intuitive approach to future table design), and a single application
that can (in principle) handle all future questionnaires without
modification. Having written the application once, you will be able to
implement new questionnaires in a few minutes (just create a new
table), and you have the full power of your SQL engine to query it
directly and intuitively.
I hesitate to bring this up at all, but you're going to hear about it
sooner or later: there is another, hugely popular approach that tries
to do the same thing as what I have described. It gets aggrandized
with an "official-sounding" name. It is called the
entity-attribute-value model, or EAV for short. You can google the
history of this group for explanations of why it is hideously
wrong-headed. Don't be taken in by the erudite-sounding name or the
mountains of erudite-looking papers and articles you will find. They
are ignorant tosh. Maybe it made some kind of sense in the early
1970s, but its day is long past.
--
Roy
Received on Thu Apr 23 2009 - 10:43:31 CEST