Re: Adding an index caused problems!

From: J.O. Aho <user_at_example.net>
Date: Wed, 22 May 2019 22:25:25 +0200
Message-ID: <gklphmF6jauU1_at_mid.individual.net>


On 22/05/2019 17.50, rich_at_turftechno.com wrote:
> Here's the DDL:
>
> CREATE TABLE `people` (
> `id` INT(11) NULL DEFAULT NULL,
> `last_name` VARCHAR(255) NULL DEFAULT NULL,
> `first_name` VARCHAR(255) NULL DEFAULT NULL,
> `address` VARCHAR(255) NULL DEFAULT NULL,
> `city` VARCHAR(255) NULL DEFAULT NULL,
> `state` VARCHAR(255) NULL DEFAULT NULL,
> `zip` CHAR(10) NULL DEFAULT NULL,
> `join_date` DATE NULL DEFAULT NULL,
> `last_sale_date` DATE NULL DEFAULT NULL,
> `dollars` DECIMAL(10,2) NULL DEFAULT NULL,
> INDEX `inx_join_date` (`join_date`),
> INDEX `inx_full_name` (`last_name`, `first_name`),
> INDEX `inx_dollars` (`dollars`)
> )
>
> Volume: 1,000,000 rows
>
> This query execute in 1.9 sec:
>
> SELECT
> p.state,
> avg(p.dollars)
> FROM
> people p
> GROUP BY
> p.state
>
> Added state as an index. Now the same query hangs. Explain shows that state is now being used as an index. What's going on under the covers?
>

I made a small experiment on my MariaDB, all tables in the test had ~910000 rows. First table setup as yours, second table with the state index, the third with a state index but not allowing null value and last I made the state table to be an int and of course with index

+------+-------------+-------+------+---------------+------+---------+------+--------+---------------------------------+
| id   | select_type | table | type | possible_keys | key  | key_len | 
ref  | rows   | Extra                           |
+------+-------------+-------+------+---------------+------+---------+------+--------+---------------------------------+
|    1 | SIMPLE      | p     | ALL  | NULL          | NULL | NULL    | 
NULL | 917910 | Using temporary; Using filesort |
+------+-------------+-------+------+---------------+------+---------+------+--------+---------------------------------+
SELECT p.state, avg(p.dollars) from p1 p group by p.state;
+-------+----------------+

| state | avg(p.dollars) |
+-------+----------------+
| NULL  |      10.000000 |
| as    |      10.000000 |
| fs    |      10.000000 |
+-------+----------------+

3 rows in set (4.82 sec)

+------+-------------+-------+-------+---------------+------------+---------+------+--------+-------+ | id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+------+-------------+-------+-------+---------------+------------+---------+------+--------+-------+

|    1 | SIMPLE      | p     | index | NULL          | inx_state2 | 768 
    | NULL | 917910 |       |

+------+-------------+-------+-------+---------------+------------+---------+------+--------+-------+ SELECT p.state, avg(p.dollars) from p2 p group by p.state;
+-------+----------------+

| state | avg(p.dollars) |
+-------+----------------+
| NULL  |      10.000000 |
| as    |      10.000000 |
| fs    |      10.000000 |
+-------+----------------+

3 rows in set (13.52 sec)

+------+-------------+-------+-------+---------------+------------+---------+------+--------+-------+ | id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+------+-------------+-------+-------+---------------+------------+---------+------+--------+-------+

|    1 | SIMPLE      | p     | index | NULL          | inx_state3 | 767 
    | NULL | 917910 |       |

+------+-------------+-------+-------+---------------+------------+---------+------+--------+-------+ SELECT p.state, avg(p.dollars) from p3 p group by p.state;
+-------+----------------+

| state | avg(p.dollars) |
+-------+----------------+
| as    |      10.000000 |
| fs    |      10.000000 |
+-------+----------------+

2 rows in set (5.79 sec)

+------+-------------+-------+-------+---------------+------------+---------+------+--------+-------+ | id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+------+-------------+-------+-------+---------------+------------+---------+------+--------+-------+

|    1 | SIMPLE      | p     | index | NULL          | inx_state4 | 4 
    | NULL | 917910 |       |

+------+-------------+-------+-------+---------------+------------+---------+------+--------+-------+ SELECT p.state, avg(p.dollars) from p4 p group by p.state;
+-------+----------------+

| state | avg(p.dollars) |
+-------+----------------+
|     1 |      10.000000 |
|     2 |      10.000000 |
+-------+----------------+

2 rows in set (7.93 sec)

I think the issue with the indexed state may most likely be that you have no real gain as you still have to make a full table scan, while the use of both filesort and temp table improves the grouping.

Not sure what you need the index for the dollar column, you improve insert time if you remove indexes you really don't need and also your database will take less space on disk.

-- 

  //Aho
Received on Wed May 22 2019 - 22:25:25 CEST

Original text of this message