Scenario: Duplicate rows in MySQL tables can arise from various sources, including data entry errors. Removing these duplicates is vital for maintaining data integrity and optimal database performance.
Problem: The challenge lies in deleting all but one instance of each duplicate row.
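Before deleting anything, it helps to see which values are actually duplicated. A minimal sketch, assuming the names table with id and name columns used in the examples in this article:
<code class="language-sql">-- List each name that occurs more than once, with its number of copies.
SELECT name, COUNT(*) AS copies
FROM names
GROUP BY name
HAVING COUNT(*) > 1;</code>
An empty result means there are no duplicates on name and nothing needs to be deleted.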
Solution & Precautions:
Critical Note: Always back up your table before executing any DELETE queries. Incorrectly implemented DELETE statements can lead to irreversible data loss.
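One simple way to take such a backup inside MySQL is to copy the table. This is a sketch that assumes the table is called names, as in the examples in this article; names_backup is a hypothetical name chosen for illustration:
<code class="language-sql">-- Create an empty copy with the same columns and indexes, then copy the rows.
CREATE TABLE names_backup LIKE names;
INSERT INTO names_backup SELECT * FROM names;</code>
A dump taken with mysqldump serves the same purpose if you prefer a backup stored outside the database.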
Method 1: Preserving the Row with the Lowest ID
This approach treats rows with the same name as duplicates and retains the one with the smallest id in each group:
<code class="language-sql">DELETE n1 FROM names n1, names n2 WHERE n1.id > n2.id AND n1.name = n2.name;</code>
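To check which rows this statement would remove before running it, the same self-join can be written as a SELECT. A sketch, assuming the same names table with id as a unique key:
<code class="language-sql">-- Rows returned here are the rows the DELETE above would remove:
-- every row for which another row with the same name has a smaller id.
SELECT DISTINCT n1.*
FROM names n1
JOIN names n2 ON n1.name = n2.name
WHERE n1.id > n2.id;</code>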
Method 2: Preserving the Row with the Highest ID
This alternative keeps the row with the largest id value instead:
<code class="language-sql">DELETE n1 FROM names n1, names n2 WHERE n1.id < n2.id AND n1.name = n2.name;</code>
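The effect is easiest to see on a small data set. The following sketch uses a hypothetical scratch table, names_demo, with made-up rows so that it does not touch real data:
<code class="language-sql">-- Hypothetical sample table and data, for illustration only.
CREATE TABLE names_demo (
  id INT PRIMARY KEY,
  name VARCHAR(50) NOT NULL
);
INSERT INTO names_demo (id, name) VALUES
  (1, 'alice'), (2, 'bob'), (3, 'alice'), (4, 'alice'), (5, 'bob');

-- Keep only the highest id per name.
DELETE n1 FROM names_demo n1, names_demo n2
WHERE n1.id < n2.id AND n1.name = n2.name;

SELECT id, name FROM names_demo ORDER BY id;
-- Remaining rows: (4, 'alice') and (5, 'bob')</code>
Reversing the comparison to n1.id > n2.id on the same data would leave (1, 'alice') and (2, 'bob') instead.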
Enhanced Efficiency for Large Tables:
For very large tables, copying the unique rows into a temporary table with CREATE TEMPORARY TABLE ... SELECT DISTINCT and then reloading the original table can be more efficient:
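<code class="language-sql">-- Copy one instance of each distinct row into a temporary table.
CREATE TEMPORARY TABLE tempTableName AS
SELECT DISTINCT cellId, attributeId, entityRowId, value FROM tableName;

-- Empty the original table, then reload it with the unique rows.
TRUNCATE TABLE tableName;
INSERT INTO tableName SELECT * FROM tempTableName;

DROP TEMPORARY TABLE tempTableName;</code>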
This creates a temporary table containing only unique rows, truncates the original table, and then repopulates it with the unique rows from the temporary table. For large datasets this is generally faster than using DELETE. Note that TRUNCATE TABLE cannot be rolled back and that the temporary table exists only for the current session, so take the backup first. Remember to replace tableName, cellId, attributeId, entityRowId, and value with your actual table and column names.
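Because TRUNCATE TABLE cannot be rolled back, a variant of the same idea builds the deduplicated copy in a regular table and swaps it in with RENAME TABLE, leaving the original rows intact under another name. This is a swapped-in alternative, not the method shown above. It assumes tableName contains only the four listed columns, and tableName_dedup and tableName_old are hypothetical names:
<code class="language-sql">-- Build an empty table with the same structure, then fill it with unique rows.
CREATE TABLE tableName_dedup LIKE tableName;
INSERT INTO tableName_dedup
SELECT DISTINCT cellId, attributeId, entityRowId, value FROM tableName;

-- Swap the tables in one statement; the original stays available as tableName_old.
RENAME TABLE tableName TO tableName_old, tableName_dedup TO tableName;

-- After verifying the result, the old copy can be dropped:
-- DROP TABLE tableName_old;</code>
CREATE TABLE ... LIKE copies column definitions and indexes but not foreign key definitions, so this sketch fits best when the table has none.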