Identifying and Removing Duplicate Rows in SQLite Database
SQLite databases often encounter duplicate rows, where multiple entries share identical data. In situations where maintaining unique rows is crucial, removing duplicates becomes essential. This article explores an efficient method to delete duplicate rows from a large SQLite table without relying on a primary key column.
Consider the following table with two columns – 'hash' and 'd'. Some rows within this table have both 'hash' and 'd' values that match. However, identical 'd' values do not necessarily indicate matching 'hash' values. To resolve this, we aim to retain only one instance of each duplicate row, while discarding the others.
To distinguish rows in the absence of a primary key, SQLite provides the 'rowid' pseudo-column. Each row possesses a unique 'rowid' value, allowing us to differentiate them. The following query effectively deletes duplicate rows by preserving the lowest 'rowid' per unique combination of 'hash' and 'd':
DELETE FROM YourTable WHERE rowid NOT IN ( SELECT MIN(rowid) FROM YourTable GROUP BY hash, d );
By utilizing the 'rowid' column alongside 'hash' and 'd' grouping, this query ensures that each unique combination of 'hash' and 'd' retains the row with the lowest 'rowid'. Consequently, duplicate rows are effectively removed from the table.
The above is the detailed content of How to Efficiently Remove Duplicate Rows from a SQLite Table Without a Primary Key?. For more information, please follow other related articles on the PHP Chinese website!