Which Index Configuration is Optimal when Working with Range Queries Involving High and Low Cardinality Columns?
In this scenario, the table 'files' has a primary key on ('did', 'filename') and two secondary indexes: 'fe' on ('filetime', 'ext') and 'ef' on ('ext', 'filetime'). The query filters on both columns: an equality condition on 'ext' (the low cardinality column) and a range condition on 'filetime' (the high cardinality column).
Let's explore which index configuration is more efficient for this query.
Evaluating Index Options
To determine the optimal index, we can analyze the potential index usage and cost estimates using EXPLAIN:
Forcing fe (range column first):
EXPLAIN SELECT COUNT(*), AVG(fsize) FROM files FORCE INDEX(fe) WHERE ext = 'gif' AND filetime >= '2015-01-01' AND filetime < '2015-01-01' + INTERVAL 1 MONTH;
Forcing ef (low cardinality column first):
EXPLAIN SELECT COUNT(*), AVG(fsize) FROM files FORCE INDEX(ef) WHERE ext = 'gif' AND filetime >= '2015-01-01' AND filetime < '2015-01-01' + INTERVAL 1 MONTH;
Analysis
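The EXPLAIN output suggests that 'ef' (low cardinality column first) yields a more efficient execution plan than 'fe'. With 'ef', the equality on 'ext' and the range on 'filetime' together bound the index range scan, so both columns of the index narrow the rows examined and the estimated cost is lower. With 'fe', only the 'filetime' range bounds the scan, and 'ext' has to be checked for each index entry inside that range.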
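The difference can be illustrated with a small model outside the database. The sketch below is not MySQL's implementation: it treats each composite index as a sorted list of key tuples and counts how many entries a range scan would have to read. The sample data, the list of extensions, and all variable names are hypothetical.

```python
from bisect import bisect_left
from datetime import date, timedelta
import random

# Hypothetical sample data: (ext, filetime) pairs standing in for rows of `files`.
random.seed(0)
EXTS = ["gif", "jpg", "png", "txt", "pdf"]
rows = [
    (random.choice(EXTS), date(2014, 1, 1) + timedelta(days=random.randrange(730)))
    for _ in range(10_000)
]

# Model each composite index as a list of key tuples sorted in index order.
ef = sorted((ext, ft) for ext, ft in rows)  # ef: (ext, filetime)
fe = sorted((ft, ext) for ext, ft in rows)  # fe: (filetime, ext)

lo, hi = date(2015, 1, 1), date(2015, 2, 1)  # filetime >= lo AND filetime < hi

# ef: equality on the leading column plus a range on the second column
# maps to one contiguous slice, so every entry read is a match.
ef_start = bisect_left(ef, ("gif", lo))
ef_end = bisect_left(ef, ("gif", hi))
ef_examined = ef_end - ef_start
ef_matches = ef_examined

# fe: only the leading filetime range bounds the slice; ext has to be
# checked entry by entry inside it.
fe_start = bisect_left(fe, (lo,))
fe_end = bisect_left(fe, (hi,))
fe_examined = fe_end - fe_start
fe_matches = sum(1 for ft, ext in fe[fe_start:fe_end] if ext == "gif")

print(f"ef: examined {ef_examined}, matched {ef_matches}")
print(f"fe: examined {fe_examined}, matched {fe_matches}")
```

Both scans find the same rows, but in this model the 'fe' scan reads every entry in the date range regardless of extension, while the 'ef' scan reads only the matching entries.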
Optimizer Trace
The Optimizer trace provides additional insights into the index evaluation process:
"potential_range_indices": [ { "index": "fe", "usable": true }, { "index": "ef", "usable": true } ], "analyzing_range_alternatives": { "range_scan_alternatives": [ { "index": "fe", "ranges": [...], "index_only": false, "rows": 16684, "cost": 20022 }, { "index": "ef", "ranges": [...], "index_only": false, "rows": 538, "cost": 646.61 } ] }, "attached_conditions_computation": [ { "access_type_changed": { "table": "`files`", "index": "ef", "old_type": "ref", "new_type": "range", "cause": "uses_more_keyparts" } } ]
Conclusions
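The Optimizer trace confirms that both 'fe' and 'ef' are usable for a range scan; that 'fe' is estimated to examine 16684 rows at a cost of 20022, while 'ef' is estimated to examine 538 rows at a cost of 646.61; and that for 'ef' the access type was changed from ref to range because the range access uses more key parts (both 'ext' and 'filetime').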
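To put the trace estimates side by side, the short sketch below copies the rows and cost figures of the two range alternatives from the trace and compares them. The elided "ranges" entries are left out, and the variable names are illustrative only.

```python
# Range scan alternatives as reported in the optimizer trace
# (the elided "ranges" lists are omitted here).
alternatives = [
    {"index": "fe", "rows": 16684, "cost": 20022},
    {"index": "ef", "rows": 538, "cost": 646.61},
]

# Pick the alternative with the lowest estimated cost.
best = min(alternatives, key=lambda alt: alt["cost"])

fe_alt, ef_alt = alternatives
row_ratio = fe_alt["rows"] / ef_alt["rows"]
cost_ratio = fe_alt["cost"] / ef_alt["cost"]

print(f"cheapest index: {best['index']}")
print(f"fe/ef estimated rows: {row_ratio:.1f}x, estimated cost: {cost_ratio:.1f}x")
```

By these estimates, 'fe' would examine roughly 31 times as many rows as 'ef', at roughly 31 times the cost.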
Therefore, considering both the EXPLAIN output and the Optimizer trace, the optimal index configuration for this query is ef (ext, filetime). Putting the low cardinality column first works here because 'ext' is tested with an equality condition: with the equality column leading and the range column last, the optimizer can use both key parts of the index, which results in a more efficient execution plan.