Edit 3 - MySQL version is 8.0.33.
Edit 2 - See bottom for final working code. Thanks @Akina!
I have a score sheet for a sporting event. The table has three related fields that I want to select -
scoreID
as primary key valueclassifierID
Maps to the primary key of another table that contains details about a specific course layoutcalculatedPercent
is the result of a specific eventThe table has three other fields that I use in the WHERE clause, but these are incidental.
I need to generate a query that selects the four best values forcalculatedPercent
and specifies thatclassifierID
cannot be repeated. I need to be able to capture thescoreID
for use in later stages of the process.
This is my first query:
SELECT `masterScores`.`scoreID`, `masterScores`.`classifierID`, `masterScores`.`calculatedPercent` FROM `masterScores` WHERE `masterScores`.`memberID` = 3516 AND `masterScores`.`eventDivision` = "O" AND `masterScores`.`scoreUnusable` != "TRUE" ORDER BY `masterScores`.`calculatedPercent` DESC LIMIT 4
Initially I thought this was great because it does select the row with the highestcalculatedPercent
value for the given memberscoreID
value. Then I noticed that several members had the first and second highest marks in the same course, which violated the requirement thatclassifierID
values not be repeated.
I gave SELECT DISTINCT a try but eventually realized what I really needed was GROUP BY, so I did some research and found that I was getting errors related to only_full_group_by when executing queries in MySql, but that didn't completely solve it for me The problem.
What I tried next:
SELECT `masterScores`.`scoreID`, `masterScores`.`classifierID`, MAX(`masterScores`.`calculatedPercent`) AS bestPercent FROM `masterScores` WHERE `masterScores`.`memberID` = 3516 AND `masterScores`.`eventDivision` = "O" AND `masterScores`.`scoreUnusable` != "TRUE" GROUP BY `masterScores`.`classifierID` ORDER BY bestPercent DESC LIMIT 4
This is the following error message:
#1055 - Expression #1 of the ORDER BY clause is not in the GROUP BY clause and contains the non-aggregated column '.masterScores.calculatedPercent', which is not functionally dependent on the columns in the GROUP BY clause; This is incompatible with sql_mode=only_full_group_by
I considered using MIN and MAX for themasterScores
.scoreID
column, but it doesn't work as expected;scoreID
the primary key value is not always selected# The value of ##calculatedPercent. I read somewhere that since
scoreIDis the primary key, I can fix this by using ANY_VALUE aggregation. I tried this:
SELECT ANY_VALUE(`masterScores`.`scoreID`), `masterScores`.`classifierID`, MAX(`masterScores`.`calculatedPercent`) AS bestPercent FROM `masterScores` WHERE `masterScores`.`memberID` = 3516 AND `masterScores`.`eventDivision` = "O" AND `masterScores`.`scoreUnusable` != "TRUE" GROUP BY `masterScores`.`classifierID` ORDER BY bestPercent DESC LIMIT 4At first glance, this does seem to work, but it does not always return a
scoreIDvalue that matches the bestPercent value.
calculatedPercentand 1
scoreIDvalue are selected for each
classifierID. Without grouping by
classifierID, each
classifierIDcould have anywhere from 0 to 400 rows satisfying the WHERE clause, so I think GROUP BY would be appropriate here.
calculatedPercentselected for each grouping's
classifierIDis the highest value of all options
calculatedPercentvalue selected.
calculatedPercentvalue.
scoreIDvalue for each selected row actually represents the same row as the selected
calculatedPercent(currently, this is the point at which the percentage is calculated) mine Query failed).
Score ID | Classifier ID | Best Percentage |
---|---|---|
58007 | 42 | 66.60 |
63882 | 42 | 64.69 |
64685 | 54 | 64.31 |
58533 | 32 | 63.20 |
55867 | 42 | 62.28 |
66649 | 7 | 56.79 |
55392 | 12 | 50.28 |
58226 | 1 | 49.52 |
55349 | 7 | 41.10 |
This is the desired output when I run the query:
Score ID | Classifier ID | Best Percentage |
---|---|---|
58007 | 42 | 66.60 |
64685 | 54 | 64.31 |
58533 | 32 | 63.20 |
66649 | 7 | 56.79 |
This is the actual output when I run the query:
Score ID | Classifier ID | Best Percentage |
---|---|---|
55867 | 42 | 66.60 |
64685 | 54 | 64.31 |
58533 | 32 | 63.20 |
55349 | 7 | 56.79 |
As shown in the figure, thescoreID
values of the first and fourth rows of the actual output are incorrect.
Currently, I welcome any suggestions.
Edit 2 - Final working solution
WITH cte AS ( SELECT scoreID, classifierID, calculatedPercent AS bestPercent, ROW_NUMBER() OVER (PARTITION BY classifierID ORDER BY calculatedPercent DESC, scoreID DESC) AS rn FROM masterScores WHERE memberID = 3516 AND eventDivision = "O" AND scoreUnusable != "TRUE" ) SELECT scoreID, classifierID, bestPercent FROM cte WHERE rn = 1 ORDER BY bestPercent DESC LIMIT 4
I was able to test this against six cases where the problem occurred, and the solution resolved every issue. Thanks again @Akina!
will mark this issue as resolved.
If
PS. If your MySQL version is 8, you must use(classifierID,calculatedPercent)
is not unique, you may receive multiple rows for eachclassifierID
. In this case you needROW_NUMBER()
in the CTE instead of a subquery.