Performance Gap Between Subquery and Explicit Value "IN" Queries
Why is a MySQL "IN" query significantly slower when utilizing a subquery than when using explicit values?
Consider the following query:
<code class="sql">SELECT COUNT(DISTINCT subscriberid) FROM em_link_data WHERE linkid IN (SELECT l.id FROM em_link l WHERE l.campaignid = '2900' AND l.link != 'open')</code>
This query takes approximately 18 seconds to execute, even though the subquery on its own completes in under 1 ms.
However, when the subquery is replaced with explicit values:
<code class="sql">SELECT COUNT(DISTINCT subscriberid) FROM em_link_data WHERE linkid IN (24899,24900,24901,24902);</code>
The query completes in less than 1 millisecond.
Explanation
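The performance discrepancy arises from the way older MySQL versions (typically those before 5.6) optimize IN subqueries. Although the subquery does not depend on the outer query, the optimizer rewrites it into a correlated (dependent) subquery and re-evaluates it for every row of the outer table. In the first query, MySQL is therefore effectively executing about seven million subqueries, one for each row in the "em_link_data" table. With explicit values there is no subquery at all: the list is a set of constants, which MySQL can resolve directly, presumably through an index on linkid given the sub-millisecond timing.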
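The following is a minimal sketch of how this behavior could be confirmed and avoided. It assumes that em_link.id is the primary key of em_link and that em_link_data.linkid is indexed; neither is stated in the question. On affected versions, EXPLAIN typically reports the inner SELECT with a select_type of DEPENDENT SUBQUERY.

<code class="sql">-- Inspect the plan of the slow query. A select_type of DEPENDENT SUBQUERY
-- on the inner SELECT indicates that it is re-evaluated per outer row.
EXPLAIN
SELECT COUNT(DISTINCT subscriberid)
FROM em_link_data
WHERE linkid IN (SELECT l.id FROM em_link l
                 WHERE l.campaignid = '2900' AND l.link != 'open');

-- JOIN form of the same query: the filter on em_link is applied first,
-- then matching em_link_data rows are looked up by linkid.
SELECT COUNT(DISTINCT d.subscriberid)
FROM em_link_data AS d
INNER JOIN em_link AS l ON l.id = d.linkid
WHERE l.campaignid = '2900'
  AND l.link != 'open';</code>

Because the outer query counts distinct subscriberid values, the JOIN form returns the same count even if more than one em_link row were to match the same em_link_data row.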
Workaround
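If rewriting the query using a JOIN is not an option, you can consider using the query cache to improve performance on repeated runs. The query cache stores the result of a SELECT and returns it when a byte-identical query is issued again, provided the tables involved have not changed in the meantime. It does not speed up the first execution, and any write to "em_link_data" or "em_link" invalidates the cached result. Note that the query cache was deprecated in MySQL 5.7.20 and removed in MySQL 8.0, so this workaround applies only to older servers.
To enable the query cache, add the following line to the [mysqld] section of your MySQL configuration file, and make sure query_cache_size is greater than 0, otherwise nothing is cached: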
query_cache_type = 1
Restart MySQL for the changes to take effect.
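After the restart, the setting can be checked from any client session. This is a small sketch for servers that still include the query cache (MySQL 5.7 and earlier); the exact values shown will depend on your configuration.

<code class="sql">-- Confirm the cache is switched on and has memory assigned
-- (query_cache_type should be ON, query_cache_size greater than 0).
SHOW VARIABLES LIKE 'query_cache%';

-- Run the slow SELECT twice, then check the counters:
-- Qcache_hits should increase on the second run.
SHOW STATUS LIKE 'Qcache%';</code>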