FirstMatch Strategy
Contents
FirstMatch
is an execution strategy for Semi-join subqueries.
The idea
It is very similar to how IN/EXISTS
subqueries were executed in MySQL 5.x.
Let's take the usual example of a search for countries with big cities:
select * from Country where Country.code IN (select City.Country from City where City.Population > 1*1000*1000) and Country.continent='Europe'
Suppose, our execution plan is to find countries in Europe, and then, for each found country, check if it has any big cities. Regular inner join execution will look as follows:
Since Germany has two big cities (in this diagram), it will be put into the query output twice. This is not correct, SELECT ... FROM Country
should not produce the same country record twice. The FirstMatch
strategy avoids the production of duplicates by short-cutting execution as soon as the first genuine match is found:
Note that the short-cutting has to take place after "Using where" has been applied. It would have been wrong to short-cut after we found Trier.
FirstMatch in action
The EXPLAIN
for the above query will look as follows:
MariaDB [world]> explain select * from Country where Country.code IN (select City.Country from City where City.Population > 1*1000*1000) and Country.continent='Europe'; +----+-------------+---------+------+--------------------+-----------+---------+--------------------+------+----------------------------------+ | id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra | +----+-------------+---------+------+--------------------+-----------+---------+--------------------+------+----------------------------------+ | 1 | PRIMARY | Country | ref | PRIMARY,continent | continent | 17 | const | 60 | Using index condition | | 1 | PRIMARY | City | ref | Population,Country | Country | 3 | world.Country.Code | 18 | Using where; FirstMatch(Country) | +----+-------------+---------+------+--------------------+-----------+---------+--------------------+------+----------------------------------+ 2 rows in set (0.00 sec)
FirstMatch(Country)
in the Extra column means that as soon as we have produced one matching record combination, short-cut the execution and jump back to the Country table.
FirstMatch
's query plan is very similar to one you would get in MySQL:
MySQL [world]> explain select * from Country where Country.code IN (select City.Country from City where City.Population > 1*1000*1000) and Country.continent='Europe'; +----+--------------------+---------+----------------+--------------------+-----------+---------+-------+------+------------------------------------+ | id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra | +----+--------------------+---------+----------------+--------------------+-----------+---------+-------+------+------------------------------------+ | 1 | PRIMARY | Country | ref | continent | continent | 17 | const | 60 | Using index condition; Using where | | 2 | DEPENDENT SUBQUERY | City | index_subquery | Population,Country | Country | 3 | func | 18 | Using where | +----+--------------------+---------+----------------+--------------------+-----------+---------+-------+------+------------------------------------+ 2 rows in set (0.01 sec)
and these two particular query plans will execute in the same time.
Difference between FirstMatch and IN->EXISTS
The general idea behind the FirstMatch
strategy is the same as the one behind the IN->EXISTS
transformation, however, FirstMatch
has several advantages:
- Equality propagation works across semi-join bounds, but not subquery bounds. Therefore, converting a subquery to semi-join and using
FirstMatch
can still give a better execution plan. (TODO example)
- There is only one way to apply the
IN->EXISTS
strategy and MySQL will do it unconditionally. WithFirstMatch
, the optimizer can make a choice between whether it should run theFirstMatch
strategy as soon as all tables used in the subquery are in the join prefix, or at some later point in time. (TODO: example)
FirstMatch factsheet
- The
FirstMatch
strategy works by executing the subquery and short-cutting its execution as soon as the first match is found. - This means, subquery tables must be after all of the parent select's tables that are referred from the subquery predicate.
EXPLAIN
showsFirstMatch
as "FirstMatch(tableN)
".- The strategy can handle correlated subqueries.
- But it cannot be applied if the subquery has meaningful
GROUP BY
and/or aggregate functions. - Use of the
FirstMatch
strategy is controlled with thefirstmatch=on|off
flag in the optimizer_switch variable.
See Also
In-depth material: