I have a gaming table with the following description:
+---------------+-------------+------+-----+---------+----------------+ | Field | Type | Null | Key | Default | Extra | +---------------+-------------+------+-----+---------+----------------+ | id | int(11) | NO | PRI | NULL | auto_increment | | date | date | NO | | NULL | | | time | time | NO | | NULL | | | hometeam_id | int(11) | NO | MUL | NULL | | | awayteam_id | int(11) | NO | MUL | NULL | | | locationcity | varchar(30) | NO | | NULL | | | locationstate | varchar(20) | NO | | NULL | | +---------------+-------------+------+-----+---------+----------------+
But every game has a duplicate entry somewhere in the table because every game is on both teams' schedules. Is there a SQL statement I can use to see and remove all duplicates based on the same date, time, hometeam_id, awayteam_id, locationcity and locationstate fields?
You can try a query like this:
This will keep only one example of each game instance with the smallest ID in the database.
You should be able to perform a correlated subquery to remove the data. Find all duplicate rows and delete all but the row with the smallest id. For MYSQL, you need to use inner joins (equivalent to the functionality of EXISTS), as follows:
delete games from games inner join (select min(id) minid, date, time, hometeam_id, awayteam_id, locationcity, locationstate from games group by date, time, hometeam_id, awayteam_id, locationcity, locationstate having count(1) > 1) as duplicates on (duplicates.date = games.date and duplicates.time = games.time and duplicates.hometeam_id = games.hometeam_id and duplicates.awayteam_id = games.awayteam_id and duplicates.locationcity = games.locationcity and duplicates.locationstate = games.locationstate and duplicates.minid <> games.id)To test, replace
remove game from gamewithselect from game*. Don't just run delete on the database :-)