Re: [bug?] Re: Adding an index caused problems!

From: J.O. Aho <user_at_example.net>
Date: Thu, 23 May 2019 21:46:43 +0200
Message-ID: <gkobl4Fnnf4U1_at_mid.individual.net>


[Quoted] On 23/05/2019 20.56, Axel Schwenke wrote:

>> About the index/optimezer/bug ?
>>
>> MariaDB [test]>   select left(state,10), (dollars) from people where
>> dollars<150 order by state  limit 10;

>
> Now that is a completely different query. Honestly I just looked at the
> opening post of the thread and there the query is:
>
> SELECT
> p.state,
> avg(p.dollars)
> FROM
> people p
> GROUP BY
> p.state
>
> This query has to look at all rows, or at least to all tuples (state,
> dollars) from the table. An optimal plan would require an index on (state,
> dollars) to read the rows in GROUP BY order (hence no sorting required) from
> the index only (which is called a /covering index read/)
>
> An index on (state) alone could be used to read rows in GROUP BY order, but
> then it had to read the `dollars` field from the row, accessing all (!) rows
> in index order; which would be random storage order. And this would almost
> certainly be slower than reading rows in table order (= storage order) and
> doing the summing in a temporary table.
>
> Exactly for that reason there used to be a limit of IIRC 30%. If a query has
> to read more rows than that percentage of the table, a table scan is
> preferred over an index access.
>

[Quoted] The OP may gain if he created a joint index for state and dollars ex: CREATE INDEX inx_state_dollars ON people (state,dollars); He should in that case drop the index for state and the one for dollars.

[Quoted] For me with 262000+ rows, I drop from 1.23 sec (no index) to 0.24 sec with the joint index. Sure I don't have big variation of states nor dollar amounts.

-- 

  //Aho
Received on Thu May 23 2019 - 21:46:43 CEST

Original text of this message