How to optimize a SQL query with over 2 million rows
P粉127901279
P粉127901279 2023-09-01 18:09:31
0
2
622
<p>I have a Sql database with over 2 million rows and it is growing rapidly. There aren't many columns, just <code>code, price, date, and stationID</code>. </p> <p>The purpose is to get the latest price by code and stationID. The query works great but takes over 10 seconds. </p> <p>Is there any way to optimize the query? </p> <pre class="brush:php;toolbar:false;">$statement = $this->pdo->prepare( 'WITH cte AS ( SELECT stationID AS ind, code, CAST(price AS DOUBLE ) AS price, date ,ROW_NUMBER() OVER( PARTITION BY code, stationID ORDER BY date DESC ) AS latest FROM price ) SELECT * FROM cte WHERE latest = 1 ' ); $statement->execute(); $results = $statement->fetchAll(PDO::FETCH_GROUP | PDO::FETCH_ASSOC);</pre> <p>Edit: The first column has an index called "id". I don't know if this helps. </p> <p>The database (InnoDB) looks like this: </p> <pre class="brush:php;toolbar:false;">id primary - int stationID - int code - int price-decimal(10,5) date - datetime</pre> <p>Edit 2:</p> <p>The results need to be grouped by stationID, and multiple rows need to be displayed for each stationID. One line for each code with the latest date.像这样:</p> <pre class="brush:php;toolbar:false;">22456: code: 1 price: 3 date: 2023-06-21 code: 2 price: 2 date: 2023-06-21 code: 3 price: 5 date: 2023-06-21 22457: code: 1 price: 10 date: 2023-06-21 code: 2 price: 1 date: 2023-06-21 code: 3 price: 33 date: 2023-06-21</pre> <p>The json output should be 像这样:</p> <pre class="brush:php;toolbar:false;">{"1000001":[{"code":1,"price":1.661,"date":"2023-06-06 12:46:32","latest":1},{"code":2,"price":1.867,"date":"2023-06-06 12:46:32","latest":1},{"code":3,"price":1.05,"date":"2023-06-06 12:46:32","latest":1},{"code":5,"price":1.818,"date":"2023-06-06 12:46:32","latest":1},{"code":6,"price":1.879,"date":"2023-06-06 12:46:32","latest":1}],"1000002":[{"code":1,"price":1.65,"date":"2023-06-03 08:53:26","latest":1},{"code":2,"price":1.868,"date":"2023-06-03 08:53:26","latest":1},{"code":6,"price":1.889,"date":"2023-06-03 08:53:27","latest":1}],…</pre></p>
P粉127901279
P粉127901279

reply all(2)
P粉141455512

I guess you need the following indexes for the query to execute well (you only need to do it once as part of the database design).

CREATE INDEX IX ON price
  (code, stationID, date DESC, price)

The first two columns can be arranged in any order.

P粉297434909

As long as there can't be two rows of the same code, stationID pair with the same datetime, using window functions is a bit like using a sledgehammer to crack a nut.

select p.stationID, p.code, p.price, p.date
from (
    select code, stationID, max(date) as max_date
    from price
    group by code, stationID
) max
join price p
    on max.code = p.code
   and max.stationID = p.stationID
   and max.max_date = p.date;

It requires the following indexes:

alter table price add index (code, stationID, date desc);

This query should take less than 1 millisecond because the derived table can be built from the index and then it reads only the required rows from the table.

Alternatively, you can significantly reduce the effort if you know that each code, stationID pair will receive an updated price within a specific time period (1 hour, 1 day, 1 week) The window function needs to add a where clause:

with cte as 
(
    select stationID as ind, code, price, date, row_number() over(partition by code, stationID order by date desc) as latest
    from price
    where date >= now() - interval 1 week
)
select * from cte where latest  = 1;
Latest Downloads
More>
Web Effects
Website Source Code
Website Materials
Front End Template