Handling duplicate data in MySQL
A MySQL table may contain duplicate records. Sometimes duplicates are acceptable; at other times they must be prevented or removed.
This chapter shows how to prevent duplicate data from entering a table, how to count and filter duplicates, and how to delete them.
Prevent duplicate data from appearing in the table
You can set the specified field in the MySQL data table as a PRIMARY KEY (primary key) or UNIQUE (unique) index to ensure the uniqueness of the data.
Consider an example: the following table has no index and no primary key, so it allows multiple duplicate records.
CREATE TABLE person_tbl ( first_name CHAR(20), last_name CHAR(20), sex CHAR(10) );
To make each combination of first_name and last_name unique, define a composite primary key on the two columns. Primary key columns cannot contain NULL, so declare both columns NOT NULL, as shown below:
CREATE TABLE person_tbl ( first_name CHAR(20) NOT NULL, last_name CHAR(20) NOT NULL, sex CHAR(10), PRIMARY KEY (last_name, first_name) );
Once a PRIMARY KEY or UNIQUE index is in place, an INSERT that would create a duplicate key fails with a duplicate-entry error (MySQL error 1062), and no row is inserted.
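A minimal sketch of that failure, assuming person_tbl was created with PRIMARY KEY (last_name, first_name) as above. The name values are hypothetical and chosen only for illustration:

```sql
-- Assumes person_tbl has PRIMARY KEY (last_name, first_name).
-- 'Smith' / 'Anna' are illustrative values, not part of the original example.
INSERT INTO person_tbl (last_name, first_name) VALUES ('Smith', 'Anna');

-- The same key a second time violates the primary key, so MySQL rejects
-- the statement with a duplicate-entry error (error 1062) and inserts nothing.
INSERT INTO person_tbl (last_name, first_name) VALUES ('Smith', 'Anna');

-- Remove the illustrative row again so the table is unchanged.
DELETE FROM person_tbl WHERE last_name = 'Smith' AND first_name = 'Anna';
```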
INSERT IGNORE INTO differs from INSERT INTO in how it treats rows that would duplicate an existing key: instead of raising an error, it skips them. Rows that do not conflict are inserted as usual, so existing data is left untouched and only the missing rows are added.
The following example uses INSERT IGNORE INTO. No error is raised, and the duplicate row is not inserted:
mysql> INSERT IGNORE INTO person_tbl (last_name, first_name)
    -> VALUES ('Jay', 'Thomas');
Query OK, 1 row affected (0.00 sec)
mysql> INSERT IGNORE INTO person_tbl (last_name, first_name)
    -> VALUES ('Jay', 'Thomas');
Query OK, 0 rows affected (0.00 sec)
With a PRIMARY KEY or UNIQUE index in place, INSERT IGNORE INTO does not return an error when a duplicate row is inserted; it only issues a warning. REPLACE INTO behaves differently: if a row with the same primary or unique key already exists, that row is deleted first and the new row is then inserted.
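The contrast between the two statements can be sketched as follows. This assumes person_tbl has PRIMARY KEY (last_name, first_name) and already contains the row ('Jay', 'Thomas') from the example above. The value 'M' for sex is hypothetical:

```sql
-- Assumes the row ('Jay', 'Thomas') already exists in person_tbl.

-- INSERT IGNORE skips the conflicting row; the existing row is untouched.
INSERT IGNORE INTO person_tbl (last_name, first_name, sex)
VALUES ('Jay', 'Thomas', 'M');

-- Shows the warning (if any) that replaced the duplicate-key error.
SHOW WARNINGS;

-- REPLACE deletes the existing row with that key, then inserts the new one.
-- 'M' is an illustrative value.
REPLACE INTO person_tbl (last_name, first_name, sex)
VALUES ('Jay', 'Thomas', 'M');

-- The stored row now carries the values given to REPLACE.
SELECT last_name, first_name, sex
FROM person_tbl
WHERE last_name = 'Jay' AND first_name = 'Thomas';
```

Because REPLACE INTO deletes and reinserts the row, any column left out of the statement is reset to its default value rather than keeping its old value. Prefer INSERT IGNORE when the existing row should win.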
Another way to set the uniqueness of data is to add a UNIQUE index, as shown below:
CREATE TABLE person_tbl ( first_name CHAR(20) NOT NULL, last_name CHAR(20) NOT NULL, sex CHAR(10), UNIQUE (last_name, first_name) );
Count duplicate data
The following query counts the duplicate records in the table, grouped by last_name and first_name:
mysql> SELECT COUNT(*) AS repetitions, last_name, first_name
    -> FROM person_tbl
    -> GROUP BY last_name, first_name
    -> HAVING repetitions > 1;
The query above returns each last_name and first_name combination that occurs more than once in person_tbl, together with the number of times it occurs. In general, to find duplicate values, follow these steps:
Determine which columns contain the values that may be duplicated.
List those columns in the column selection list, along with COUNT(*).
List the same columns in the GROUP BY clause.
Add a HAVING clause that requires the group count to be greater than 1, which eliminates the unique values.
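The steps above yield one summary row per duplicated key. To inspect the duplicated rows themselves, one possible approach is to join the table back to that grouped result. This is a sketch, not part of the original tutorial, and the aliases p and dup are illustrative:

```sql
-- Lists every row of person_tbl whose (last_name, first_name) pair
-- occurs more than once. Aliases p and dup are illustrative names.
SELECT p.last_name, p.first_name, p.sex
FROM person_tbl AS p
JOIN (
    -- Same grouping as the counting query: keep only keys seen more than once.
    SELECT last_name, first_name
    FROM person_tbl
    GROUP BY last_name, first_name
    HAVING COUNT(*) > 1
) AS dup
  ON dup.last_name = p.last_name
 AND dup.first_name = p.first_name
ORDER BY p.last_name, p.first_name;
```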
Filter duplicate data
To read only unique rows, use the DISTINCT keyword in the SELECT statement to filter out duplicates.
mysql> SELECT DISTINCT last_name, first_name
    -> FROM person_tbl
    -> ORDER BY last_name;
You can also use GROUP BY to read non-duplicate data in the data table:
mysql> SELECT last_name, first_name
    -> FROM person_tbl
    -> GROUP BY last_name, first_name;
Delete duplicate data
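To delete duplicate data from the table, you can copy one row per last_name and first_name combination into a new table and then swap it in for the original. The sex column is wrapped in MIN() so that the query is also accepted when the ONLY_FULL_GROUP_BY SQL mode is enabled; if duplicates disagree on sex, the smallest value is kept. Note that CREATE TABLE ... SELECT does not copy indexes or keys from the original table.
mysql> CREATE TABLE tmp SELECT last_name, first_name, MIN(sex) AS sex
    -> FROM person_tbl
    -> GROUP BY last_name, first_name;
mysql> DROP TABLE person_tbl;
mysql> ALTER TABLE tmp RENAME TO person_tbl;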
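On older MySQL versions you can also remove duplicate records simply by adding an INDEX or PRIMARY KEY to the table with ALTER IGNORE TABLE, which discards the rows that would violate the new key. The IGNORE clause of ALTER TABLE was removed in MySQL 5.7, so this statement works only on earlier versions:
mysql> ALTER IGNORE TABLE person_tbl
    -> ADD PRIMARY KEY (last_name, first_name);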
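On versions where ALTER IGNORE TABLE is not available, a similar effect can be sketched by combining a keyed copy of the table with INSERT IGNORE. This is one possible approach, not the tutorial's own method. It assumes person_tbl has no primary key yet, and the table names person_tbl_dedup and person_tbl_old are hypothetical:

```sql
-- Assumes person_tbl currently has no PRIMARY KEY.
-- person_tbl_dedup and person_tbl_old are hypothetical names.

-- 1. Create an empty copy with the same columns.
CREATE TABLE person_tbl_dedup LIKE person_tbl;

-- 2. Add the key that defines what counts as a duplicate.
ALTER TABLE person_tbl_dedup
    ADD PRIMARY KEY (last_name, first_name);

-- 3. Copy the rows; INSERT IGNORE skips any row whose key is already present.
INSERT IGNORE INTO person_tbl_dedup
SELECT * FROM person_tbl;

-- 4. Swap the tables in one statement, keeping the original as a backup.
RENAME TABLE person_tbl TO person_tbl_old,
             person_tbl_dedup TO person_tbl;

-- 5. After verifying the result, the backup can be dropped:
-- DROP TABLE person_tbl_old;
```

Which row of each duplicate group survives depends on the order in which rows are read, so check the result before dropping the backup.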