Re: Complicated query

From: The Natural Philosopher <tnp_at_invalid.invalid>
Date: Sat, 06 Dec 2014 16:47:56 +0000
Message-ID: <m5vbvs$jbq$5_at_news.albasani.net>


On 06/12/14 15:05, Shelly wrote:

> On 12/6/2014 9:45 AM, The Natural Philosopher wrote:
>> On 06/12/14 14:16, Shelly wrote:
>>> On 12/6/2014 8:28 AM, Shelly wrote:

>>>> I have finally come across a desired query that I have no idea on
>>>> how to
>>>> do -- or if it is even possible. Here is the scenario:
>>>>
>>>> Table Instances
>>>> ===============
>>>> id
>>>> column1
>>>> column2
>>>> (and more)
>>>>
>>>> Table Questions
>>>> ===============
>>>> Id
>>>> Question Text
>>>>
>>>> Table Answers
>>>> =============
>>>> Id
>>>> QuestionId (Id in Questions)
>>>> InstanceId (Id in Instances)
>>>>
>>>> Table Questions has Q entries. (Lets say just three for now.) What I
>>>> want is to pull where each record contains:
>>>>
>>>> Table Instances has N entries.
>>>>
>>>> Table Answers has Q x N entries.
>>>>
>>>> I want to pull N records where each record has:
>>>> column1, column2, Answer1, Answer2, Answer3
>>>>
>>>> In other words, I want all the answers (Q of them) for a given instance
>>>> to be in a single returned record, not one answer in each of Q records
>>>> returned.
>>>>
>>>
>>> After some research I found a good solution on the net.  (Note: I
>>> inadvertently left out that the Answers table has a column "Answer" when
>>> I posed the original question).
>>>
>>> This is for Oracle.  The solution is:
>>>
>>> with data
>>>   as
>>>   (
>>>    select I.*, A.answer, row_number() over (partition by
>>> observation_card_id order by answer) rn,
>>>      count(*) over (partition by A.InstanceId) cnt
>>>    from Answers A
>>>    join Instances I on (I.ID = A.InstanceId)
>>>    order by A.QUESTION_ID
>>>   )
>>>   select D.*, ltrim(sys_connect_by_path(D.Answer,','),',') Answers
>>>    from data D
>>>    where rn = cnt
>>>    start with rn = 1
>>>    connect by prior D.Id = D.Id and prior rn = rn-1
>>>    order by D.Id
>>>
>>> The answers are all in one field, Answeers, and it is a comma separated
>>> string.
>>>
>>> Thank you to Joe Stefanelli who answered someone else about this on
>>> stackoverflow.com.
>>>
>> This is the sort of query where I have found that doing it all in
>> application software using multiple queries is quicker, for sub zillion
>> records, than expecting mysql to do it in hours..
>>
>> For < zillion records, buy oracle and a very powerful computer...
>>
>> Don't get me wrong. I love mysql. But it is not best when optimising and
>> executing complex queries. And its not nearly as good at it as a custom
>> program written in (e.g.) C, is.
>
> Perhaps.
>
> I thought of how I could do it with brute force if I couldn't get a
> query.  What I would have done would have been to first query the
> instances table and make an array where the index was the instance id
> and the fields being the instance fields.  I would note the lowest and
> highest index and then do a query pulling all answers where the index
> was between them.  For each of those records I would add fields to the
> appropriate element in the array based upon the instance id for each of
> the answers.
>
>

That's what I did with a massive database I was building out of bits and scrappy data.

Do the subquery and that have a 'for each row in sub query, do the following enquiry'

It was around 100 times faster..

-- 
Everything you read in newspapers is absolutely true, except for the 
rare story of which you happen to have first-hand knowledge. – Erwin Knoll
Received on Sat Dec 06 2014 - 17:47:56 CET

Original text of this message