How to retrieve rows in column with highest value using SQL [duplicate]-PHP Chinese Network Q&A

How to retrieve rows in column with highest value using SQL [duplicate]

P粉211273535 2023-09-20 13:44:50

I have a table of documents (here is a simplified version):

id	rev	content
1	1	...
2	1	...
1	2	...
1	3	...

How do I select one row per id, keeping only the row with the largest rev?

Based on the data above, the result should contain two rows: [1, 3, ...] and [2, 1, ...]. I'm using MySQL.

Currently, I use a check in a while loop to detect and overwrite the old rev in the result set. But is this the only way to get the result? Is there no SQL solution?

All replies (2)

P粉714780768 2023-09-21 12:21:06 (reply #2)

I prefer to use as little code as possible...

You can do this with IN. Try this:

SELECT *
FROM t1
WHERE (id, rev) IN (
    SELECT id, MAX(rev)
    FROM t1
    GROUP BY id
)

In my opinion, this is simpler... easier to read and maintain.
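
To check this against the sample data, here is a minimal sketch that runs the same query from Python. It assumes an in-memory SQLite database as a stand-in for MySQL (row-value IN needs SQLite 3.15 or later, and some other RDBMSs do not accept this syntax at all). The content values 'a' to 'd' are made-up placeholders for the elided content.

```python
import sqlite3

# Assumption: in-memory SQLite stands in for the asker's MySQL database.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t1 (id INTEGER, rev INTEGER, content TEXT)")
conn.executemany(
    "INSERT INTO t1 VALUES (?, ?, ?)",
    # Placeholder content values; the question elides the real ones.
    [(1, 1, "a"), (2, 1, "b"), (1, 2, "c"), (1, 3, "d")],
)

# Keep only rows whose (id, rev) pair is the per-id maximum.
rows = conn.execute(
    """
    SELECT *
    FROM t1
    WHERE (id, rev) IN (SELECT id, MAX(rev) FROM t1 GROUP BY id)
    ORDER BY id
    """
).fetchall()

print(rows)  # [(1, 3, 'd'), (2, 1, 'b')]
```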

P粉336536706 2023-09-21 09:25:48 (reply #1)

At first glance...

All you need is a GROUP BY clause with the MAX aggregate function:

SELECT id, MAX(rev)
FROM YourTable
GROUP BY id

Things are never that simple, right?

I just noticed that you also need the content column.
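
A quick sketch of why the plain aggregate falls short: it returns the right (id, rev) pairs, but each result row has only two columns. This assumes an in-memory SQLite database in place of MySQL, and the content values are made-up placeholders.

```python
import sqlite3

# Assumption: in-memory SQLite stands in for MySQL.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE YourTable (id INTEGER, rev INTEGER, content TEXT)")
conn.executemany(
    "INSERT INTO YourTable VALUES (?, ?, ?)",
    # Placeholder content values; the question elides the real ones.
    [(1, 1, "a"), (2, 1, "b"), (1, 2, "c"), (1, 3, "d")],
)

# The aggregate alone finds the largest rev per id...
rows = conn.execute(
    "SELECT id, MAX(rev) FROM YourTable GROUP BY id ORDER BY id"
).fetchall()

# ...but the content column is not part of the result.
print(rows)  # [(1, 3), (2, 1)]
```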

This is a very common problem in SQL: for each group identifier, find the entire row that holds the maximum value of some column. I've heard this question a lot in my career. In fact, it was one of the questions I answered during a technical interview for my current job.

This question is actually so common that the Stack Overflow community created a tag specifically for this type of question: greatest-n-per-group.

Basically, you have two ways to solve this problem:

Joining with a simple group-identifier, max-value-in-group subquery

In this approach, you first find the group-identifier, max-value-in-group pairs (already solved above) in a subquery. You then join your table to the subquery with an equijoin on both group-identifier and max-value-in-group:

SELECT a.id, a.rev, a.content
FROM YourTable a
INNER JOIN (
    SELECT id, MAX(rev) rev
    FROM YourTable
    GROUP BY id
) b ON a.id = b.id AND a.rev = b.rev
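
As a rough check, the sketch below runs the subquery join on the question's sample rows. It assumes an in-memory SQLite database in place of MySQL, uses the column name content from the question's table, and fills in made-up placeholder values for the elided content.

```python
import sqlite3

# Assumption: in-memory SQLite stands in for MySQL.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE YourTable (id INTEGER, rev INTEGER, content TEXT)")
conn.executemany(
    "INSERT INTO YourTable VALUES (?, ?, ?)",
    # Placeholder content values; the question elides the real ones.
    [(1, 1, "a"), (2, 1, "b"), (1, 2, "c"), (1, 3, "d")],
)

# Subquery b holds one (id, max rev) pair per id; the equijoin on both
# columns pulls the matching full row out of the table.
rows = conn.execute(
    """
    SELECT a.id, a.rev, a.content
    FROM YourTable a
    INNER JOIN (
        SELECT id, MAX(rev) AS rev
        FROM YourTable
        GROUP BY id
    ) b ON a.id = b.id AND a.rev = b.rev
    ORDER BY a.id
    """
).fetchall()

print(rows)  # [(1, 3, 'd'), (2, 1, 'b')]
```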

Left joining the table with itself, tweaking the join conditions and filters

In this approach, you left join the table with itself. The equality condition goes on group-identifier. Then, there are two clever steps:

1. The second join condition requires the left-side value to be less than the right-side value.
2. After step 1, the rows that actually hold the largest value have no match, so they get NULL on the right side (remember this is a LEFT JOIN). We then filter the joined result to keep only the rows with NULL on the right.

So, you end up with:

SELECT a.*
FROM YourTable a
LEFT OUTER JOIN YourTable b
    ON a.id = b.id AND a.rev < b.rev
WHERE b.id IS NULL;
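
To make the two steps visible, this sketch first prints the unfiltered join and then applies the NULL filter. It assumes an in-memory SQLite database in place of MySQL, with made-up placeholder content values.

```python
import sqlite3

# Assumption: in-memory SQLite stands in for MySQL.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE YourTable (id INTEGER, rev INTEGER, content TEXT)")
conn.executemany(
    "INSERT INTO YourTable VALUES (?, ?, ?)",
    # Placeholder content values; the question elides the real ones.
    [(1, 1, "a"), (2, 1, "b"), (1, 2, "c"), (1, 3, "d")],
)

# Step 1: pair every row with each same-id row that has a larger rev.
# Rows that already hold the largest rev find no partner and get NULL.
joined = conn.execute(
    """
    SELECT a.id, a.rev, b.rev
    FROM YourTable a
    LEFT OUTER JOIN YourTable b
        ON a.id = b.id AND a.rev < b.rev
    ORDER BY a.id, a.rev, b.rev
    """
).fetchall()
print(joined)
# [(1, 1, 2), (1, 1, 3), (1, 2, 3), (1, 3, None), (2, 1, None)]

# Step 2: keep only the rows with NULL on the right side.
winners = conn.execute(
    """
    SELECT a.*
    FROM YourTable a
    LEFT OUTER JOIN YourTable b
        ON a.id = b.id AND a.rev < b.rev
    WHERE b.id IS NULL
    ORDER BY a.id
    """
).fetchall()
print(winners)  # [(1, 3, 'd'), (2, 1, 'b')]
```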

Conclusion

Both methods will give exactly the same results.

If two rows share the max-value-in-group for the same group-identifier, both rows will appear in the result with either method.
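
The tie behavior can be sketched as follows. The data is hypothetical: id 1 gets two rows at rev 3 with placeholder content 'd' and 'e'. As before, an in-memory SQLite database is assumed in place of MySQL.

```python
import sqlite3

# Assumption: in-memory SQLite stands in for MySQL.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE YourTable (id INTEGER, rev INTEGER, content TEXT)")
conn.executemany(
    "INSERT INTO YourTable VALUES (?, ?, ?)",
    # Hypothetical data: id 1 has two rows tied at the largest rev (3).
    [(1, 1, "a"), (1, 3, "d"), (1, 3, "e"), (2, 1, "b")],
)

subquery_join = conn.execute(
    """
    SELECT a.id, a.rev, a.content
    FROM YourTable a
    INNER JOIN (
        SELECT id, MAX(rev) AS rev FROM YourTable GROUP BY id
    ) b ON a.id = b.id AND a.rev = b.rev
    ORDER BY a.id, a.content
    """
).fetchall()

self_join = conn.execute(
    """
    SELECT a.id, a.rev, a.content
    FROM YourTable a
    LEFT OUTER JOIN YourTable b
        ON a.id = b.id AND a.rev < b.rev
    WHERE b.id IS NULL
    ORDER BY a.id, a.content
    """
).fetchall()

# Both tied rows for id 1 survive in both results.
print(subquery_join)  # [(1, 3, 'd'), (1, 3, 'e'), (2, 1, 'b')]
print(self_join == subquery_join)  # True
```

If you need exactly one row per id, you have to add a tie-breaking rule of your own.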

Both methods are ANSI SQL compatible, so they will work with your favorite RDBMS, whatever its "flavor".

Both methods are also performance friendly, but your mileage may vary (RDBMS, database structure, indexes, etc.). So benchmark before you pick one, and choose the method that makes the most sense for you.
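
One way to run such a benchmark is sketched below with Python's timeit. Everything about the setup is an assumption: an in-memory SQLite database rather than your real RDBMS, 100 synthetic ids with 10 revisions each, and no indexes. Timings from this toy setup say nothing about MySQL; swap in your own connection and data before drawing conclusions.

```python
import sqlite3
import timeit

# Assumption: in-memory SQLite with synthetic data, not a real workload.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE YourTable (id INTEGER, rev INTEGER, content TEXT)")
conn.executemany(
    "INSERT INTO YourTable VALUES (?, ?, ?)",
    [(i, r, f"doc {i} rev {r}") for i in range(100) for r in range(1, 11)],
)

QUERIES = {
    "subquery join": """
        SELECT a.id, a.rev, a.content
        FROM YourTable a
        INNER JOIN (
            SELECT id, MAX(rev) AS rev FROM YourTable GROUP BY id
        ) b ON a.id = b.id AND a.rev = b.rev
    """,
    "left self-join": """
        SELECT a.id, a.rev, a.content
        FROM YourTable a
        LEFT OUTER JOIN YourTable b
            ON a.id = b.id AND a.rev < b.rev
        WHERE b.id IS NULL
    """,
}


def run(sql):
    """Execute a query and fetch every row, so the full cost is measured."""
    return conn.execute(sql).fetchall()


# Time each query over a few repetitions.
timings = {
    name: timeit.timeit(lambda sql=sql: run(sql), number=5)
    for name, sql in QUERIES.items()
}
for name, seconds in timings.items():
    print(f"{name}: {seconds:.4f}s for 5 runs")
```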
