Index Hints: How to Force Query Plans
Contents
The optimizer is largely cost-based and will try to choose the optimal plan for any query. However in some cases it does not have enough information to choose a perfect plan and in these cases you may have to provide hints to force the optimizer to use another plan.
You can examine the query plan for a SELECT by writing
EXPLAIN before the statement. SHOW EXPLAIN shows the output of a running query. In some cases, its output can be closer to reality than EXPLAIN
.
For the following queries, we will use the world database for the examples.
Setting up the World Example Database
Download it from ftp://ftp.askmonty.org/public/world.sql.gz
Install it with:
mariadb-admin create world zcat world.sql.gz | ../client/mysql world
or
mariadb-admin create world gunzip world.sql.gz ../client/mysql world < world.sql
Forcing Join Order
You can force the join order by using STRAIGHT_JOIN either in the SELECT or JOIN part.
The simplest way to force the join order is to put the tables in the correct
order in the FROM
clause and use SELECT STRAIGHT_JOIN
like so:
SELECT STRAIGHT_JOIN SUM(City.Population) FROM Country,City WHERE City.CountryCode=Country.Code AND Country.HeadOfState="Volodymyr Zelenskyy";
If you only want to force the join order for a few tables, use
STRAIGHT_JOIN
in the FROM
clause. When this is done, only tables
connected with STRAIGHT_JOIN
will have their order forced. For example:
SELECT SUM(City.Population) FROM Country STRAIGHT_JOIN City WHERE City.CountryCode=Country.Code AND Country.HeadOfState="Volodymyr Zelenskyy";
In both of the above cases Country
will be scanned first and for each
matching country (one in this case) all rows in City
will be checked for a
match. As there is only one matching country this will be faster than the
original query.
The output of EXPLAIN for the above cases is:
id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
---|---|---|---|---|---|---|---|---|---|
1 | SIMPLE | Country | ALL | PRIMARY | NULL | NULL | NULL | 239 | Using where |
1 | SIMPLE | City | ALL | NULL | NULL | NULL | NULL | 4079 | Using where; Using join buffer (flat, BNL join) |
This is one of the few cases where ALL
is ok, as the scan of the
Country
table will only find one matching row.
Forcing Usage of a Specific Index for the WHERE Clause
In some cases the optimizer may choose a non-optimal index or it may choose to not use an index at all, even if some index could theoretically be used.
In these cases you have the option to either tell the optimizer to only use a limited set of indexes, ignore one or more indexes, or force the usage of some particular index.
USE INDEX: Use a Limited Set of Indexes
You can limit which indexes are considered with the USE INDEX option.
USE INDEX [{FOR {JOIN|ORDER BY|GROUP BY}] ([index_list])
The default is 'FOR JOIN
', which means that the hint only affects how the
WHERE
clause is optimized.
USE INDEX
is used after the table name in the FROM
clause.
Example:
CREATE INDEX Name ON City (Name); CREATE INDEX CountryCode ON City (Countrycode); EXPLAIN SELECT Name FROM City USE INDEX (CountryCode) WHERE name="Helsingborg" AND countrycode="SWE";
This produces:
id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
---|---|---|---|---|---|---|---|---|---|
1 | SIMPLE | City | ref | CountryCode | CountryCode | 3 | const | 14 | Using where |
If we had not used USE INDEX, the Name
index would have been in
possible keys
.
IGNORE INDEX: Don't Use a Particular Index
You can tell the optimizer to not consider some particular index with the IGNORE INDEX option.
IGNORE INDEX [{FOR {JOIN|ORDER BY|GROUP BY}] ([index_list])
This is used after the table name in the FROM
clause:
CREATE INDEX Name ON City (Name); CREATE INDEX CountryCode ON City (Countrycode); EXPLAIN SELECT Name FROM City IGNORE INDEX (Name) WHERE name="Helsingborg" AND countrycode="SWE";
This produces:
id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
---|---|---|---|---|---|---|---|---|---|
1 | SIMPLE | City | ref | CountryCode | CountryCode | 3 | const | 14 | Using where |
The benefit of using IGNORE_INDEX
instead of USE_INDEX
is that it will
not disable a new index which you may add later.
Also see Ignored Indexes for an option to specify in the index definition that indexes should be ignored.
FORCE INDEX: Forcing an Index
Forcing an index to be used is mostly useful when the optimizer decides to do a table scan even if you know that using an index would be better. (The optimizer could decide to do a table scan even if there is an available index when it believes that most or all rows will match and it can avoid the overhead of using the index).
CREATE INDEX Name ON City (Name); EXPLAIN SELECT Name,CountryCode FROM City FORCE INDEX (Name) WHERE name>="A" and CountryCode >="A";
This produces:
id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
---|---|---|---|---|---|---|---|---|---|
1 | SIMPLE | City | range | Name | Name | 35 | NULL | 4079 | Using where |
FORCE_INDEX
works by only considering the given indexes (like with
USE_INDEX
) but in addition it tells the optimizer to regard a table scan as
something very expensive. However if none of the 'forced' indexes can be used,
then a table scan will be used anyway.
Index Prefixes
When using index hints (USE, FORCE or IGNORE INDEX), the index name value can also be an unambiguous prefix of an index name.
Forcing an Index to be Used for ORDER BY or GROUP BY
The optimizer will try to use indexes to resolve ORDER BY and GROUP BY.
You can use USE INDEX, IGNORE INDEX and
FORCE INDEX as in the WHERE
clause above
to ensure that some specific index used:
USE INDEX [{FOR {JOIN|ORDER BY|GROUP BY}] ([index_list])
This is used after the table name in the FROM
clause.
Example:
CREATE INDEX Name ON City (Name); EXPLAIN SELECT Name,Count(*) FROM City FORCE INDEX FOR GROUP BY (Name) WHERE population >= 10000000 GROUP BY Name;
This produces:
id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
---|---|---|---|---|---|---|---|---|---|
1 | SIMPLE | City | index | NULL | Name | 35 | NULL | 4079 | Using where |
Without the FORCE INDEX option we would have
'Using where; Using temporary; Using filesort
' in the
'Extra' column, which means that the optimizer would created a temporary
table and sort it.
Help the Optimizer Optimize GROUP BY and ORDER BY
The optimizer uses several strategies to optimize GROUP BY and ORDER BY:
- Resolve with an index:
- Filesort:
- Scan the table to be sorted and collect the sort keys in a temporary file.
- Sort the keys + reference to row (with filesort)
- Scan the table in sorted order
- Use a temporary table for ORDER BY:
A temporary table will always be used if the fields which will be sorted are not from the first table in the JOIN order.
- Use a temporary table for GROUP BY:
- Create a temporary table to hold the GROUP BY result with an index that matches the GROUP BY fields.
- Produce a result row
- If a row with the GROUP BY key exists in the temporary table, add the new result row to it. If not, create a new row.
- Before sending the results to the user, sort the rows with filesort to get the results in the GROUP BY order.
Forcing/Disallowing TemporaryTables to be Used for GROUP BY:
Using an in-memory table (as described above) is usually the fastest option for
GROUP BY if the result set is small. It is not optimal if
the result set is very big. You can tell the optimizer this by using
SELECT SQL_SMALL_RESULT
or SELECT SQL_BIG_RESULT
.
For example:
EXPLAIN SELECT SQL_SMALL_RESULT Name,Count(*) AS Cities FROM City GROUP BY Name HAVING Cities > 2;
produces:
id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
---|---|---|---|---|---|---|---|---|---|
1 | SIMPLE | City | ALL | NULL | NULL | NULL | NULL | 4079 | Using temporary; Using filesort |
while:
EXPLAIN SELECT SQL_BIG_RESULT Name,Count(*) AS Cities FROM City GROUP BY Name HAVING Cities > 2;
produces:
id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
---|---|---|---|---|---|---|---|---|---|
1 | SIMPLE | City | ALL | NULL | NULL | NULL | NULL | 4079 | Using filesort |
The difference is that with SQL_SMALL_RESULT
a
temporary table is used.
Forcing Usage of Temporary Tables
In some cases you may want to force the use of a temporary table for the result to free up the table/row locks for the used tables as quickly as possible.
You can do this with the SQL_BUFFER_RESULT
option:
CREATE INDEX Name ON City (Name); EXPLAIN SELECT SQL_BUFFER_RESULT Name,Count(*) AS Cities FROM City GROUP BY Name HAVING Cities > 2;
This produces:
id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
---|---|---|---|---|---|---|---|---|---|
1 | SIMPLE | City | index | NULL | Name | 35 | NULL | 4079 | Using index; Using temporary |
Without SQL_BUFFER_RESULT
, the above query would not use a
temporary table for the result set.
Optimizer Switch
In MariaDB 5.3 we added an optimizer switch which allows you to specify which algorithms will be considered when optimizing a query.
See the optimizer section for more information about the different algorithms which are used.