Re: Complicated query

From: Jerry Stuckle <jstucklex_at_attglobal.net>
Date: Mon, 15 Dec 2014 08:16:59 -0500
Message-ID: <m6mmvo$h8p$1_at_dont-email.me>


On 12/15/2014 1:10 AM, Denis McMahon wrote:

> On Sat, 13 Dec 2014 23:29:10 -0500, Jerry Stuckle wrote:
> 
>> On 12/13/2014 11:16 PM, Norman Peelman wrote:
>>> On 12/13/2014 08:52 PM, Jerry Stuckle wrote:

>>>> On 12/13/2014 7:11 PM, Denis McMahon wrote:
>>>>> On Sat, 13 Dec 2014 16:13:24 -0500, Jerry Stuckle wrote:
>>>>>
>>>>>> Actually, I think richard's concept is closer to what the OP had
>>>>>> than yours. And the OP thought that was a better design for his
>>>>>> installation.
>>>>>
>>>>> Nah
>>>>>
>>>>> The clues are here:
>>>>>
>>>>>
>>>>> Table Questions has Q entries. (Lets say just three for now.) What
>>>>> I want is to pull where each record contains:
>>>>>
>>>>> Table Instances has N entries.
>>>>>
>>>>> Table Answers has Q x N entries.
>>>>>
>>>>> I want to pull N records where each record has:
>>>>> column1, column2, Answer1, Answer2, Answer3 <<<<
>>>>>
>>>>> Note 3 questions, in the question table, 3 answers in the result
>>>>> record.
>>>>> I think those are separate answers to each of the three questions,
>>>>> rather than 3 possible answers to the same question.
>>>>>
>>>>> ie Answer1 is the ans to q1 for this "instance" (whatever an instance
>>>>> is), Answer2 is the ans to q2 ditto, and Answer3 ditto again to q3.
>>>>>
>>>>> Scenario: A polling company. You have a list of people who have
>>>>> demographic info attached. You have polls or surveys that you ask
>>>>> people to complete. You provide sanitised answers to your clients
>>>>> with demographic but not personal data and the poll answers.
>>>>>
>>>>> So if instances were sets of demographic data (age group, gender,
>>>>> ethnicity, relationship status) relating to people who had completed
>>>>> a poll, and questions was a list of questions in a poll, then the
>>>>> output records to your polling client would each combine the
>>>>> demographic information from instance and the relevant answers given
>>>>> for each question in the poll by the person whose demographic data
>>>>> you attached the answers to.
>>>>>
>>>>> ie the output from a poll would be the recordset:
>>>>>
>>>>> for each person x who took part in the poll the record:
>>>>>
>>>>> the effective join of <demographic data of person x> and <for each
>>>>> question in the poll, the answer given by person x>
>>>>>
>>>>> hence giving a scenario where multiple answers appear in a single
>>>>> record without violating first normal form, because the answers are
>>>>> answers to different questions, not the same question.
>>>>>
>>>>> In fact, generating the output as:
>>>>>
>>>>> instance data, question id, answer
>>>>>
>>>>> would involve unneeded duplications of instance data, because each
>>>>> set of instance data is associated with one specific set of answers.
>>>>>
>>>>> I still think that the OP failed to understand what richard was
>>>>> presenting, whereas those of us with more familiarity with his crap
>>>>> did understand it, and that's how you and the OP got into a row about
>>>>> whether the design would violate first normal form.
>>>>>
>>>>>
>>>> I would argue that just because they are answers to different
>>>> questions is immaterial - they are still in the answers domain.
>>>>
>>>> There is a good reason for this, also. What happens if the pollster
>>>> wants to add another question to the poll, after the database is
>>>> designed? It would require a redesign of the table as you have it.
>>>> Additionally, some polls will have different questions based on
>>>> demographics, i.e. male vs. female, age < 30 vs. age > 60, etc.
>>>>
>>>> And even if the above were not true, you don't need to duplicate
>>>> instance data. You have an instance table with an id and the
>>>> appropriate demographic information. A question table and an answer
>>>> table, with the appropriate info and ids. The link table would be
>>>> instance_id, question_id and answer_id. No duplication of instance
>>>> data.
>>>>
>>>> Additionally, such a design would allow summation of all questions and
>>>> answers by any of the demographics, i.e. sex, age, race, etc.
>>>>
>>>> This would be properly normalized to third normal form.
>>>>
>>>>
>>>   And one more table to hold the link to the correct answer per
>>>   question
>>> per instance?
>>>
>>>
>>>
>> It's a poll.  There are no "incorrect" answers.
>>
>> However, in the case of a quiz, the correct answer id could be in the
>> question table.  However, as some questions may have multiple answers, I
>> would suggest a separate table.
> 
> Of course, if you're going to have a single answers table, you need to 
> coerce all the answers to the same datatype ....
>

No, you can have different data types. For instance, you can have a column which indicates the datatype of the answer, and different columns with different datatypes. It can cause retrieval to be a bear, however, so most cases they have a column for the real datatype then cast everything to a string.

> If in the OPs case the answers are actually a mix of strings, dates, 
> numbers and booleans, then I guess this coercion has already been done, 
> given that the Op was presenting us with a single table of answers, but 
> again, there may be cases where having all the answers to a single set of 
> questions in a single record might not violate normal forms, which is 
> where I'm going with this.
> 

I don't see where this does not violate first normal form. You would have multiple columns in the same domain.

> It would be nice if the OP would clarify what was actually meant by 
> answers, questions and instances. It would also be nice if richard could 
> refrain from posting crap, because I'm pretty sure it's his crap that 
> caused the confusion in the first place.
> 

That is true, also.

-- 
==================
Remove the "x" from my email address
Jerry Stuckle
jstucklex_at_attglobal.net
==================
Received on Mon Dec 15 2014 - 14:16:59 CET

Original text of this message