Why is my PostgreSQL SELECT DISTINCT query so slow, and how can I improve its performance?

Barbara Streisand

Release: 2025-01-07 18:17:40

PostgreSQL SELECT DISTINCT Performance Bottleneck and Optimization Strategies

A SELECT DISTINCT query on a PostgreSQL table with nearly two million rows takes 500-600 ms, slower than expected. The query planner chooses a sequential scan rather than the available index, and forcing an index scan does not significantly improve execution time. Either way, DISTINCT still has to read every row (or every index entry) to collect the unique values, so the cost grows with the size of the table rather than with the number of distinct values.
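
The planner's choice can be inspected with EXPLAIN. The following is a minimal sketch, assuming the table is named tickers and has a product_id column, as in the query used later in this article. The index name tickers_product_id_idx is hypothetical.

<code class="language-sql">-- Hypothetical index name; skip this if an index on product_id already exists.
CREATE INDEX IF NOT EXISTS tickers_product_id_idx ON tickers (product_id);

-- Show the chosen plan with actual timings and buffer usage.
EXPLAIN (ANALYZE, BUFFERS)
SELECT DISTINCT product_id FROM tickers;

-- For comparison only: discourage sequential scans in this session,
-- re-run the query, then restore the default.
SET enable_seqscan = off;
EXPLAIN (ANALYZE, BUFFERS)
SELECT DISTINCT product_id FROM tickers;
RESET enable_seqscan;</code>

If both plans report similar execution times, the scan type is not the bottleneck; the number of rows visited is.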

Emulating Index Skip Scan in PostgreSQL

PostgreSQL has no native index skip scan for DISTINCT queries, but a recursive common table expression (CTE) can emulate one. The anchor query fetches the smallest product_id, and each recursive step fetches the next larger value with a single index lookup. An index on product_id is therefore probed once per distinct value instead of being read in full:

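<code class="language-sql">WITH RECURSIVE cte AS (
    (
        -- Anchor: the smallest product_id (cheap with an index on product_id)
        SELECT product_id
        FROM tickers
        ORDER BY 1
        LIMIT 1
    )
    UNION ALL
    -- Recursive step: for the value found last, fetch the next larger one
    SELECT l.*
    FROM cte c
    CROSS JOIN LATERAL (
        SELECT product_id
        FROM tickers t
        WHERE t.product_id > c.product_id
        ORDER BY 1
        LIMIT 1
    ) l  -- when no larger value exists, no row is produced and recursion ends
)
SELECT * FROM cte;</code>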

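When there are few distinct product IDs relative to the number of rows, this approach is substantially faster than a full table scan, because the number of index lookups depends on the count of distinct values rather than on the table size.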
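
The same idea can also be written with a correlated subquery instead of a LATERAL join. The following is a sketch of that variant, assuming the same tickers table and an index on product_id. Each min() call can typically be answered from the index, and the NULL checks stop the recursion once no larger value exists.

<code class="language-sql">WITH RECURSIVE cte AS (
    -- Anchor: smallest product_id (NULL if the table is empty)
    SELECT min(product_id) AS product_id
    FROM tickers
    UNION ALL
    -- Recursive step: next larger product_id, or NULL when there is none
    SELECT (SELECT min(t.product_id)
            FROM tickers t
            WHERE t.product_id > c.product_id)
    FROM cte c
    WHERE c.product_id IS NOT NULL  -- stop after the terminating NULL
)
SELECT product_id
FROM cte
WHERE product_id IS NOT NULL;  -- drop the terminating NULL row</code>

Like the LATERAL version, this form does not return NULL as a distinct value; if product_id is nullable and NULL matters, it needs to be handled separately.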

Alternative Approaches: DISTINCT and DISTINCT ON

The emulated skip scan pays off only when each product ID has many rows. For tables with few rows per distinct product ID, the standard DISTINCT or DISTINCT ON keywords are likely to be faster, because the recursive CTE performs one index lookup per distinct value. Which approach wins depends heavily on the data distribution.
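
As a rough illustration, the two standard forms and a quick check of the distribution might look like the sketch below. It assumes the same tickers table and that planner statistics are up to date (for example after running ANALYZE).

<code class="language-sql">-- Plain DISTINCT: returns each product_id once.
SELECT DISTINCT product_id FROM tickers;

-- DISTINCT ON: returns one whole row per product_id
-- (the first row in ORDER BY order for each value).
SELECT DISTINCT ON (product_id) *
FROM tickers
ORDER BY product_id;

-- Planner's estimate of the number of distinct values.
-- n_distinct > 0 is an estimated count of distinct values;
-- n_distinct < 0 is minus the fraction of rows that are distinct.
SELECT n_distinct
FROM pg_stats
WHERE tablename = 'tickers' AND attname = 'product_id';</code>

A small positive n_distinct on a table with millions of rows suggests the recursive CTE is worth trying; a value near -1 suggests plain DISTINCT is the better fit.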

Future Improvements: Native Index Skip Scan

Work on native index skip scan support is ongoing in PostgreSQL development. Once the planner can use it for SELECT DISTINCT queries, workarounds such as the recursive CTE should no longer be needed.
