Deleting Duplicate Rows from a MySQL Table
When working with large datasets, it's essential to identify and remove duplicate records. This guide will provide a comprehensive solution to delete duplicate rows from a MySQL table named "employee" that consists of three fields: empid, empname, and empssn.
Identifying Duplicate Records
To identify duplicate records, you can use the following query:
SELECT COUNT(empssn), empssn FROM employee GROUP BY empssn HAVING COUNT(empssn) > 1
This query returns the count of occurrences for each unique empssn. The HAVING clause filters out rows with a count greater than 1, effectively identifying duplicate empssn values.
Deleting Duplicate Records
To delete the duplicate records, we can use the following query:
DELETE FROM employee WHERE (empid, empssn) NOT IN ( SELECT MIN(empid), empssn FROM employee GROUP BY empssn );
However, MySQL throws an error if the target table ("employee") is specified in both the DELETE and the subquery's FROM clause. To overcome this, we can wrap the subquery in a derived table, as follows:
DELETE FROM employee WHERE (empid, empssn) NOT IN ( SELECT empid, empssn FROM ( SELECT MIN(empid) AS empid, empssn FROM employee GROUP BY empssn ) X );
This modified query effectively deletes duplicate records while avoiding the target table conflict. As a result, the database retains only the records with the minimum empid for each unique empssn value.
The above is the detailed content of How to Delete Duplicate Rows from a MySQL Table?. For more information, please follow other related articles on the PHP Chinese website!