MySQL latest manual tutorial
author:php.cn  update time:2022-04-15 14:04:12

MySQL handles duplicate data


Some MySQL tables may contain duplicate records. In some cases duplicates are acceptable, but in others they need to be removed.

This chapter shows how to prevent duplicate data from appearing in a table and how to delete duplicate data that is already there.

Prevent duplicate data from appearing in the table

You can define a PRIMARY KEY (primary key) or a UNIQUE (unique) index on one or more columns of a MySQL table to guarantee that the values in those columns are unique.

Let us try an example: the following table has no indexes and no primary key, so it allows duplicate records.

CREATE TABLE person_tbl
(
    first_name CHAR(20),
    last_name CHAR(20),
    sex CHAR(10)
);
To make sure that the combination of first_name and last_name is never repeated, you can define a composite primary key on both columns. Columns that are part of a primary key cannot contain NULL, so they should be declared NOT NULL, as shown below:
CREATE TABLE person_tbl
(
   first_name CHAR(20) NOT NULL,
   last_name CHAR(20) NOT NULL,
   sex CHAR(10),
   PRIMARY KEY (last_name, first_name)
);
Once a primary key or unique index is defined, an INSERT statement that would create a duplicate key value fails and MySQL returns an error.
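As a minimal sketch of that failure, assuming person_tbl was created with the composite primary key shown above, the second statement below is rejected with a duplicate-key error (MySQL error code 1062). The row values are illustrative only.

```sql
-- First insert succeeds: no row with this (last_name, first_name) exists yet.
INSERT INTO person_tbl (last_name, first_name, sex)
VALUES ('Smith', 'Anna', 'F');

-- Second insert repeats the same key values, so MySQL rejects it with
-- error 1062 (duplicate entry) and the table still holds a single 'Smith' row.
INSERT INTO person_tbl (last_name, first_name, sex)
VALUES ('Smith', 'Anna', 'F');
```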

The difference between INSERT IGNORE INTO and INSERT INTO is how they treat rows that already exist. INSERT IGNORE INTO inserts a row if no row with the same primary key or unique index value exists, and silently skips it otherwise. Existing data is therefore left untouched, and only the missing rows are added.

The following example uses INSERT IGNORE INTO, which will not cause errors after execution, and will not insert duplicate data into the data table:

mysql> INSERT IGNORE INTO person_tbl (last_name, first_name)
    -> VALUES( 'Jay', 'Thomas');
Query OK, 1 row affected (0.00 sec)
mysql> INSERT IGNORE INTO person_tbl (last_name, first_name)
    -> VALUES( 'Jay', 'Thomas');
Query OK, 0 rows affected (0.00 sec)
Once uniqueness is enforced by a primary key or unique index, INSERT IGNORE INTO returns only a warning, not an error, when a duplicate row is inserted. REPLACE INTO behaves differently: if a row with the same primary key or unique index value exists, it is deleted first and the new row is then inserted.
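The contrast between the two statements can be sketched as follows, assuming person_tbl has the composite primary key defined earlier and already contains the ('Jay', 'Thomas') row inserted above with sex left as NULL. The value 'M' is illustrative.

```sql
-- The row ('Jay', 'Thomas') already exists, so this insert is skipped.
INSERT IGNORE INTO person_tbl (last_name, first_name)
VALUES ('Jay', 'Thomas');

-- Display the warning that took the place of the duplicate-key error.
SHOW WARNINGS;

-- REPLACE deletes the existing ('Jay', 'Thomas') row and inserts this one,
-- so the stored value of sex changes from NULL to 'M'.
REPLACE INTO person_tbl (last_name, first_name, sex)
VALUES ('Jay', 'Thomas', 'M');

-- Check the row that is now stored.
SELECT last_name, first_name, sex
FROM person_tbl
WHERE last_name = 'Jay' AND first_name = 'Thomas';
```

Because REPLACE is a delete followed by an insert, any column not listed in the statement is reset to its default value rather than keeping what the old row held.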

Another way to set the uniqueness of your data is to add a UNIQUE index, as shown below:

CREATE TABLE person_tbl
(
   first_name CHAR(20) NOT NULL,
   last_name CHAR(20) NOT NULL,
   sex CHAR(10),
   UNIQUE (last_name, first_name)
);
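If the table already exists without a key, as in the first example of this chapter, a unique index can also be added afterwards. The following is a sketch under that assumption; uq_person_name is a hypothetical index name chosen for illustration.

```sql
-- Add a named unique index over the two name columns of an existing table.
-- uq_person_name is a hypothetical name, not one used elsewhere in this chapter.
ALTER TABLE person_tbl
  ADD UNIQUE INDEX uq_person_name (last_name, first_name);

-- List the indexes on the table to confirm the new one is present.
SHOW INDEX FROM person_tbl;
```

The ALTER TABLE statement fails with a duplicate-key error if the table already contains repeated (last_name, first_name) pairs, so existing duplicates have to be removed first.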

Count duplicate data

Below we will count the number of duplicate records for first_name and last_name in the table:

mysql> SELECT COUNT(*) as repetitions, last_name, first_name
    -> FROM person_tbl
    -> GROUP BY last_name, first_name
    -> HAVING repetitions > 1;
The above query returns every (last_name, first_name) combination that occurs more than once in the person_tbl table, together with the number of times it occurs. In general, to query for duplicate values, do the following:
  • Determine which columns may contain duplicated values.

  • List those columns in the column selection list, along with COUNT(*).

  • List the same columns in the GROUP BY clause.

  • Add a HAVING clause that keeps only the groups whose count is greater than 1.
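Building on the steps above, the per-group counts can be summarized to see how many rows would have to be removed. This is a sketch only; the aliases dup, duplicated_keys, and surplus_rows are names introduced here for illustration.

```sql
-- Inner query: one row per duplicated (last_name, first_name) pair.
-- Outer query: how many pairs are duplicated, and how many rows are
-- surplus (every occurrence beyond the first in each group).
SELECT COUNT(*) AS duplicated_keys,
       SUM(repetitions - 1) AS surplus_rows
FROM (
    SELECT COUNT(*) AS repetitions
    FROM person_tbl
    GROUP BY last_name, first_name
    HAVING COUNT(*) > 1
) AS dup;
```

When the table has no duplicates, duplicated_keys is 0 and surplus_rows is NULL, because SUM over an empty set returns NULL.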

Filter duplicate data

If you need to read non-duplicate data, you can use the DISTINCT keyword in the SELECT statement to filter duplicate data.

mysql> SELECT DISTINCT last_name, first_name
    -> FROM person_tbl
    -> ORDER BY last_name;
You can also use GROUP BY to read unique data in the data table:
mysql> SELECT last_name, first_name
    -> FROM person_tbl
    -> GROUP BY last_name, first_name;
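To check quickly whether a table contains duplicates at all, the total row count can be compared with the number of distinct name pairs. This is a small sketch; total_rows and distinct_names are illustrative aliases.

```sql
-- If the two numbers differ, at least one (last_name, first_name) pair
-- is stored more than once.
SELECT COUNT(*) AS total_rows,
       COUNT(DISTINCT last_name, first_name) AS distinct_names
FROM person_tbl;
```

COUNT(DISTINCT ...) skips rows in which any of the listed columns is NULL, so the comparison is only exact when both columns are free of NULL values.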

Delete duplicate data

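If you want to delete duplicate data from a table, you can copy one row per (last_name, first_name) combination into a new table and then put that table in place of the original. Because sex is not part of the GROUP BY clause, it is wrapped in MIN() so that the query also runs under the ONLY_FULL_GROUP_BY SQL mode, which recent MySQL versions enable by default:

mysql> CREATE TABLE tmp SELECT last_name, first_name, MIN(sex) AS sex
    ->                  FROM person_tbl
    ->                  GROUP BY last_name, first_name;
mysql> DROP TABLE person_tbl;
mysql> ALTER TABLE tmp RENAME TO person_tbl;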
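Dropping the original table before renaming the copy leaves a short window in which person_tbl does not exist. One possible variation, sketched here, swaps both names in a single RENAME TABLE statement and keeps the old table as a backup until the result has been checked. person_tbl_old is a hypothetical backup name.

```sql
-- Build the deduplicated copy first (tmp is the table name used above).
-- MIN(sex) picks one value per group so the query is valid under
-- the ONLY_FULL_GROUP_BY SQL mode.
CREATE TABLE tmp
SELECT last_name, first_name, MIN(sex) AS sex
FROM person_tbl
GROUP BY last_name, first_name;

-- Swap both names in one atomic statement.
-- person_tbl_old is a hypothetical name for the backup.
RENAME TABLE person_tbl TO person_tbl_old,
             tmp TO person_tbl;

-- After checking the new table, remove the backup.
DROP TABLE person_tbl_old;
```

CREATE TABLE ... SELECT does not create any indexes on the new table, so a primary key or unique index has to be added again afterwards if one is wanted.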
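You can also delete duplicate records by adding a UNIQUE index or a PRIMARY KEY (primary key) to the table with ALTER IGNORE TABLE, which discards the rows that would violate the new key. Note that the IGNORE clause of ALTER TABLE was removed in MySQL 5.7.4, so this method only works on older servers. The method is as follows: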
mysql> ALTER IGNORE TABLE person_tbl 
   -> ADD PRIMARY KEY (last_name, first_name);
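On servers where ALTER IGNORE TABLE is not available, a similar result can be reached by copying the rows into a new table that already has the key, using INSERT IGNORE so that repeated rows are skipped. This is a sketch under that assumption; person_tbl_dedup, before_rows, and after_rows are hypothetical names.

```sql
-- New table with the same columns plus the composite primary key.
CREATE TABLE person_tbl_dedup (
    first_name CHAR(20) NOT NULL,
    last_name  CHAR(20) NOT NULL,
    sex        CHAR(10),
    PRIMARY KEY (last_name, first_name)
);

-- Copy the rows; any row whose (last_name, first_name) pair is already
-- present in person_tbl_dedup is skipped with a warning instead of an error.
INSERT IGNORE INTO person_tbl_dedup (first_name, last_name, sex)
SELECT first_name, last_name, sex
FROM person_tbl;

-- Compare row counts before and after deduplication.
SELECT (SELECT COUNT(*) FROM person_tbl)       AS before_rows,
       (SELECT COUNT(*) FROM person_tbl_dedup) AS after_rows;
```

Which of several duplicate rows is kept depends on the order in which the SELECT returns them, so this approach fits best when the duplicates are identical in every column. Once the counts look right, the new table can be renamed into place.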
