This is a read-only copy of the MariaDB Knowledgebase generated on 2025-01-30. For the latest, interactive version please visit https://mariadb.com/kb/.

Virtual Column Support in the Optimizer

MariaDB starting with 11.8

Starting from MariaDB 11.8, the optimizer can recognize use of indexed virtual column expressions in the WHERE clause and use them to construct range and ref(const) accesses.

Motivating Example

Suppose one has a table with data in JSON:

create table t1 (json_data JSON);
insert into t1 values('{"column1": 1234}'); 
insert into t1 ...

In order to do efficient queries over data in JSON, one can add a virtual column and an index:

alter table t1
  add column vcol1 int as (cast(json_value(json_data, '$.column1') as integer)),
  add index(vcol1);

Before MariaDB 11.8, one had to use vcol1 in the WHERE clause. Now, one can use the virtual column expression, too. A query like this:

explain select * 
from t1 
where 
  cast(json_value(json_data, '$.column1') as integer)=100;

will be resolved through the index on vcol1:

+------+-------------+-------+------+---------------+-------+---------+-------+------+-------+
| id   | select_type | table | type | possible_keys | key   | key_len | ref   | rows | Extra |
+------+-------------+-------+------+---------------+-------+---------+-------+------+-------+
|    1 | SIMPLE      | t1    | ref  | vcol1         | vcol1 | 5       | const | 1    |       |
+------+-------------+-------+------+---------------+-------+---------+-------+------+-------+

General Considerations

  • In MariaDB, one has to create a virtual column and then create an index over it. Other databases allow to create an index directly over expression: create index on t1((col1+col2)). This is not yet supported.
  • The WHERE clause must use the exact same expression as in the virtual column definition.
  • In the optimizer trace, the rewrites are shown like so:
            "virtual_column_substitution": {
              "condition": "WHERE",
              "resulting_condition": "t1.vcol1 = 100"
            }

Considerations for Accessing JSON fields

SQL is strongly-typed language while JSON is weakly-typed. This means one must specify the desired datatypes when accessing JSON data from SQL. In the above example, we declared vcol1 as INT and then used (CAST ... AS INTEGER) (both in the ALTER TABLE and in the query):

alter table t1
  add column vcol1 int as (cast(json_value(json_data, '$.column1') as integer)) ...

When handling string values, CAST is not necessary, as JSON_VALUE returns strings. However, one must take into account collations. If you declare a column as json_data JSON, its collation will be utf8mb4_bin, and the collation of JSON_VALUE(json_data, ...) will also be utf8mb4_bin. Most use cases require a more commonly-used collation. It is possible to achieve that using the COLLATE clause:

alter table t1
  add col1 varchar(100) collate utf8mb4_uca1400_ai_ci as
  (json_value(js1, '$.string_column') collate utf8mb4_uca1400_ai_ci),
  add index(col1);
...
select  ... 
where
  json_value(js1, '$.string_column') collate utf8mb4_uca1400_ai_ci='string-value';

References

  • MDEV-35616: Add basic optimizer support for virtual columns
Content reproduced on this site is the property of its respective owners, and this content is not reviewed in advance by MariaDB. The views, information and opinions expressed by this content do not necessarily represent those of MariaDB or any other party.