Home > Database > Mysql Tutorial > Why is My PostgreSQL SELECT DISTINCT Query with a Composite Primary Key So Slow?

Why is My PostgreSQL SELECT DISTINCT Query with a Composite Primary Key So Slow?

Linda Hamilton
Release: 2025-01-07 18:23:40
Original
538 people have browsed it

Why is My PostgreSQL SELECT DISTINCT Query with a Composite Primary Key So Slow?

PostgreSQL SELECT DISTINCT Performance Issues with Composite Keys

Using SELECT DISTINCT on a PostgreSQL table with a composite primary key (e.g., (product_id, trade_id)) can be surprisingly slow. The query planner often chooses a sequential scan instead of efficiently utilizing the index.

Why is it Slow?

  • Sequential Scan Preference: The planner may prefer a full table scan, even with an index, leading to significantly longer query times.
  • Missing Index Skip Scan: PostgreSQL doesn't natively support index skip scans, an optimization that efficiently retrieves unique values by skipping duplicates within the index.

The Solution: Simulating Index Skip Scan with a CTE

While a true index skip scan isn't available, we can effectively mimic its behavior using a recursive Common Table Expression (CTE):

<code class="language-sql">WITH RECURSIVE cte AS (
   (   -- parentheses are crucial
   SELECT product_id
   FROM   tickers
   ORDER  BY 1
   LIMIT  1
   )
   UNION ALL
   SELECT l.*
   FROM   cte c
   CROSS  JOIN LATERAL (
      SELECT product_id
      FROM   tickers t
      WHERE  t.product_id > c.product_id
      ORDER  BY 1
      LIMIT  1
      ) l
   )
SELECT * FROM cte;</code>
Copy after login

This CTE iterates through unique product_id values in sorted order, leveraging the index on (product_id) for efficiency.

Advantages of this Approach

  • Speed Improvement: This method drastically reduces query execution time. Tests show a reduction to 0.75 milliseconds on a 2.25 million-row table.
  • Index Utilization: It effectively utilizes both the composite primary key index (product_id, trade_id) and the index on (product_id).
  • Data Distribution Agnostic: Performance remains consistent regardless of data distribution within the table.

The above is the detailed content of Why is My PostgreSQL SELECT DISTINCT Query with a Composite Primary Key So Slow?. For more information, please follow other related articles on the PHP Chinese website!

source:php.cn
Statement of this Website
The content of this article is voluntarily contributed by netizens, and the copyright belongs to the original author. This site does not assume corresponding legal responsibility. If you find any content suspected of plagiarism or infringement, please contact admin@php.cn
Latest Articles by Author
Popular Tutorials
More>
Latest Downloads
More>
Web Effects
Website Source Code
Website Materials
Front End Template