How to remove duplicate queries in sql

coldplay.xixi
2020-10-10 11:37:29

SQL deduplication method: duplicate records are identified by a single field, peopleId, and removed with a delete statement. The key filter is [where peopleId in (select peopleId from people group by peopleId having count(peopleId) > 1)].

SQL deduplication query method:

Removing duplicate records in single-table and multi-table queries:

Single table: distinct

Multiple tables: group by

group by must be placed before order by and limit; otherwise an error is reported.
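As a minimal sketch of the two approaches and of the clause-order rule, the following uses Python's built-in sqlite3 module with an in-memory database. The sample rows and the name column are assumptions made up for illustration; other database engines may word the error differently.

```python
import sqlite3

# Hypothetical sample data; table contents are illustrative only.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE people (peopleId INTEGER, name TEXT)")
conn.executemany(
    "INSERT INTO people VALUES (?, ?)",
    [(1, "Ann"), (1, "Ann"), (2, "Bob"), (3, "Cid"), (3, "Cid")],
)

# DISTINCT collapses rows that are identical in every selected column.
distinct_rows = conn.execute(
    "SELECT DISTINCT peopleId, name FROM people ORDER BY peopleId"
).fetchall()

# GROUP BY is written before ORDER BY and LIMIT.
grouped_rows = conn.execute(
    "SELECT peopleId, COUNT(*) FROM people "
    "GROUP BY peopleId ORDER BY peopleId LIMIT 2"
).fetchall()

# Writing GROUP BY after ORDER BY is rejected as a syntax error.
try:
    conn.execute(
        "SELECT peopleId FROM people ORDER BY peopleId GROUP BY peopleId"
    )
    misplaced_error = None
except sqlite3.OperationalError as exc:
    misplaced_error = str(exc)
```

Here distinct_rows holds one row per distinct (peopleId, name) pair, grouped_rows holds the first two groups with their counts, and misplaced_error captures the message raised for the wrongly ordered clauses.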

1. Find redundant duplicate records in the table. Duplicate records are judged based on a single field (peopleId)

select * from people
where peopleId in (select  peopleId  from  people  group  by  peopleId  having  count(peopleId) > 1)

2. Delete redundant duplicate records in the table. Duplicate records are judged based on a single field (peopleId), leaving only the record with the smallest rowid

delete from people
where peopleId  in (select  peopleId  from people  group  by  peopleId   having  count(peopleId) > 1)
and rowid not in (select min(rowid) from  people  group by peopleId  having count(peopleId )>1)
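Steps 1 and 2 can be tried end to end with a small sketch. It assumes SQLite, which happens to expose an implicit rowid much like the one used in the statements above, driven from Python's sqlite3 module; the sample rows and the name column are invented for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE people (peopleId INTEGER, name TEXT)")
# Rows receive rowid 1..6 in insertion order.
conn.executemany(
    "INSERT INTO people VALUES (?, ?)",
    [(1, "Ann"), (1, "Ann2"), (2, "Bob"),
     (3, "Cid"), (3, "Cid2"), (3, "Cid3")],
)

# Subquery shared by both steps: peopleId values that occur more than once.
dup_ids = (
    "SELECT peopleId FROM people "
    "GROUP BY peopleId HAVING COUNT(peopleId) > 1"
)

# Step 1: every row whose peopleId is duplicated.
duplicates = conn.execute(
    f"SELECT rowid, peopleId FROM people "
    f"WHERE peopleId IN ({dup_ids}) ORDER BY rowid"
).fetchall()

# Step 2: delete the duplicates, keeping the smallest rowid of each group.
deleted = conn.execute(
    f"DELETE FROM people WHERE peopleId IN ({dup_ids}) "
    f"AND rowid NOT IN (SELECT MIN(rowid) FROM people "
    f"GROUP BY peopleId HAVING COUNT(peopleId) > 1)"
).rowcount

remaining = conn.execute(
    "SELECT rowid, peopleId, name FROM people ORDER BY rowid"
).fetchall()
```

After the delete, each peopleId appears once and the surviving row of every group is the one that was inserted first.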

3. Find redundant duplicate records (multiple fields) in the table

select * from vitae a
where (a.peopleId,a.seq) in  (select peopleId,seq from vitae group by peopleId,seq  having count(*) > 1)

4. Delete redundant duplicate records (multiple fields) in the table, leaving only the record with the smallest rowid

delete from vitae a
where (a.peopleId,a.seq) in  (select peopleId,seq from vitae group by peopleId,seq having count(*) > 1)
and rowid not in (select min(rowid) from vitae group by peopleId,seq having count(*)>1)

5. Find redundant duplicate records (multiple fields) in the table, excluding the record with the smallest rowid

select * from vitae a
where (a.peopleId,a.seq) in  (select peopleId,seq from vitae group by peopleId,seq having count(*) > 1)
and rowid not in (select min(rowid) from vitae group by peopleId,seq having count(*)>1)
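The multi-field statements in steps 3 to 5 can be sketched the same way. This example assumes an SQLite build that accepts row-value comparisons (version 3.15 or later) and uses invented sample rows with a hypothetical note column; the table alias is dropped because it is not needed here.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE vitae (peopleId INTEGER, seq INTEGER, note TEXT)")
# Rows receive rowid 1..6 in insertion order.
conn.executemany(
    "INSERT INTO vitae VALUES (?, ?, ?)",
    [(1, 1, "a"), (1, 1, "b"), (1, 2, "c"),
     (2, 1, "d"), (2, 1, "e"), (2, 1, "f")],
)

# (peopleId, seq) pairs that occur more than once.
dup_groups = (
    "SELECT peopleId, seq FROM vitae "
    "GROUP BY peopleId, seq HAVING COUNT(*) > 1"
)
# The row to keep from each duplicated group: the smallest rowid.
keepers = (
    "SELECT MIN(rowid) FROM vitae "
    "GROUP BY peopleId, seq HAVING COUNT(*) > 1"
)

# Step 5: the surplus rows, excluding the smallest rowid of each group.
extras = conn.execute(
    f"SELECT rowid, peopleId, seq, note FROM vitae "
    f"WHERE (peopleId, seq) IN ({dup_groups}) "
    f"AND rowid NOT IN ({keepers}) ORDER BY rowid"
).fetchall()

# Step 4: delete exactly those surplus rows.
conn.execute(
    f"DELETE FROM vitae WHERE (peopleId, seq) IN ({dup_groups}) "
    f"AND rowid NOT IN ({keepers})"
)
remaining = conn.execute(
    "SELECT peopleId, seq, note FROM vitae ORDER BY rowid"
).fetchall()
```

Running the select from step 5 before the delete from step 4 is a convenient way to preview which rows are about to be removed.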

(2)

For example, table A has a field "name", and different records may hold the same "name" value.

To find the "name" values that are duplicated across the records in the table:

Select Name,Count(*) From A Group By Name Having Count(*) > 1

To count records as duplicates only when the gender is also the same:

Select Name,sex,Count(*) From A Group By Name,sex Having Count(*) > 1
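The difference between grouping on one field and on two can be seen with a few rows. This is a sketch using Python's sqlite3 module; the sample names are invented, and an order by clause is added only to make the output order predictable.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE A (Name TEXT, sex TEXT)")
conn.executemany(
    "INSERT INTO A VALUES (?, ?)",
    [("Tom", "M"), ("Tom", "M"), ("Tom", "F"),
     ("Amy", "F"), ("Amy", "F"), ("Bob", "M")],
)

# Names that appear more than once, whatever the gender.
by_name = conn.execute(
    "SELECT Name, COUNT(*) FROM A "
    "GROUP BY Name HAVING COUNT(*) > 1 ORDER BY Name"
).fetchall()

# Name and gender pairs that appear more than once.
by_name_and_sex = conn.execute(
    "SELECT Name, sex, COUNT(*) FROM A "
    "GROUP BY Name, sex HAVING COUNT(*) > 1 ORDER BY Name, sex"
).fetchall()
```

Tom is counted three times by name alone, but only the two male Tom rows form a duplicate group once gender is part of the key.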

(3)

Method 1

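declare @max integer,@id integer
-- one row per duplicated key value, together with its occurrence count
declare cur_rows cursor local for select keyField,count(*) from tableName group by keyField having count(*) > 1
open cur_rows
fetch cur_rows into @id,@max
while @@fetch_status=0
begin
select @max = @max -1
-- limit the next delete to count - 1 rows so that one row survives
set rowcount @max
delete from tableName where keyField = @id
fetch cur_rows into @id,@max
end
close cur_rows
deallocate cur_rows
set rowcount 0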
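The idea behind Method 1 is to visit each duplicated key and delete all but one of its rows. The sketch below expresses that idea with Python's sqlite3 module rather than a server-side cursor. SQLite has no set rowcount, so a limited subquery on rowid stands in for it. That substitution, the items table, and its columns are assumptions for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE items (keyField INTEGER, payload TEXT)")
conn.executemany(
    "INSERT INTO items VALUES (?, ?)",
    [(1, "a"), (1, "b"), (2, "c"), (3, "d"), (3, "e"), (3, "f")],
)

# Equivalent of the cursor query: each duplicated key with its row count.
dup_keys = conn.execute(
    "SELECT keyField, COUNT(*) FROM items "
    "GROUP BY keyField HAVING COUNT(*) > 1"
).fetchall()

# For each duplicated key, delete count - 1 rows so exactly one remains.
for key, count in dup_keys:
    conn.execute(
        "DELETE FROM items WHERE rowid IN "
        "(SELECT rowid FROM items WHERE keyField = ? LIMIT ?)",
        (key, count - 1),
    )

counts = conn.execute(
    "SELECT keyField, COUNT(*) FROM items "
    "GROUP BY keyField ORDER BY keyField"
).fetchall()
```

As with set rowcount, this does not control which row of a group survives, so it suits cases where the duplicates are interchangeable.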

Method 2

"Duplicate records" can mean two things: fully duplicated records, in which every field is repeated, and records in which only some key fields are repeated, such as the Name field, while the other fields may differ or can be ignored.

1. The first kind of duplication is the easier one to solve. Use

select distinct * from tableName

to get a result set without duplicate records.

To delete the duplicate records from the table itself (keeping one copy of each), proceed as follows

select distinct * into #Tmp from tableName
drop table tableName
select * into tableName from #Tmp
drop table #Tmp
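The same rebuild can be sketched outside SQL Server. The version below assumes SQLite through Python's sqlite3 module, where create temp table ... as select plays the role of select ... into #Tmp. Instead of dropping the original table it empties and refills it, which is a deliberate variation that keeps the table definition in place. The sample data is invented.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE tableName (Name TEXT, Address TEXT)")
conn.executemany(
    "INSERT INTO tableName VALUES (?, ?)",
    [("Ann", "X"), ("Ann", "X"), ("Bob", "Y"), ("Bob", "Y"), ("Bob", "Z")],
)

# Copy one instance of every fully identical row into a temporary table.
conn.execute("CREATE TEMP TABLE Tmp AS SELECT DISTINCT * FROM tableName")
# Empty the original table and refill it from the deduplicated copy.
conn.execute("DELETE FROM tableName")
conn.execute("INSERT INTO tableName SELECT * FROM Tmp")
conn.execute("DROP TABLE Tmp")
conn.commit()

deduplicated = conn.execute(
    "SELECT Name, Address FROM tableName ORDER BY Name, Address"
).fetchall()
```

Only rows that match in every column are merged, so ("Bob", "Y") and ("Bob", "Z") both remain.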

This kind of duplication is caused by poor table design; adding a unique index column solves it.

2. The second kind of duplication usually requires keeping the first record among the duplicates. The procedure is as follows

Assume that the duplicated fields are Name and Address, and that the goal is a result set in which these two fields are unique

select identity(int,1,1) as autoID, * into #Tmp from tableName
select min(autoID) as autoID into #Tmp2 from #Tmp group by Name,Address
select * from #Tmp where autoID in(select autoID from #Tmp2)

The last select returns a result set with no duplicates in Name and Address. It also contains the extra autoID field; to leave that column out, list the wanted columns explicitly in the select clause.
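A minimal sketch of this numbering technique, assuming SQLite through Python's sqlite3 module: an autoincrement primary key takes the place of identity(int,1,1), and the Phone column and sample rows are hypothetical. The final select lists its columns explicitly so that autoID is left out.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE tableName (Name TEXT, Address TEXT, Phone TEXT)")
conn.executemany(
    "INSERT INTO tableName VALUES (?, ?, ?)",
    [("Ann", "X", "111"), ("Ann", "X", "222"), ("Bob", "Y", "333"),
     ("Ann", "Z", "444"), ("Bob", "Y", "555")],
)

# Number the rows: autoID stands in for identity(int,1,1).
conn.execute(
    "CREATE TEMP TABLE Tmp ("
    "autoID INTEGER PRIMARY KEY AUTOINCREMENT, "
    "Name TEXT, Address TEXT, Phone TEXT)"
)
conn.execute(
    "INSERT INTO Tmp (Name, Address, Phone) "
    "SELECT Name, Address, Phone FROM tableName ORDER BY rowid"
)

# Keep the row with the smallest autoID of each (Name, Address) group.
first_per_group = conn.execute(
    "SELECT Name, Address, Phone FROM Tmp "
    "WHERE autoID IN (SELECT MIN(autoID) FROM Tmp GROUP BY Name, Address) "
    "ORDER BY autoID"
).fetchall()
```

Each (Name, Address) pair appears once in first_per_group, carrying the Phone value of its earliest row.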

(4)

Duplicate query

select * from tablename where id in (select id from tablename
group by id
having count(id) > 1
)

3. Find redundant duplicate records (multiple fields) in the table

select * from vitae a
where (a.peopleId,a.seq) in (select peopleId,seq from vitae group by peopleId,seq having count(*) > 1)

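This statement fails on databases that do not accept a multi-column (row-value) comparison with in, such as SQL Server: a predicate written as where (a.peopleId,a.seq) in (...) is rejected.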
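One common way around a multi-column in predicate is to join against the grouped subquery instead, comparing each column separately. The sketch below shows that rewrite running on SQLite through Python's sqlite3 module; it is a suggested alternative rather than something from the original statements, and the sample rows and note column are invented.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE vitae (peopleId INTEGER, seq INTEGER, note TEXT)")
conn.executemany(
    "INSERT INTO vitae VALUES (?, ?, ?)",
    [(1, 1, "a"), (1, 1, "b"), (1, 2, "c"), (2, 1, "d")],
)

# Join each row to the duplicated (peopleId, seq) groups column by column,
# which avoids the (a.peopleId, a.seq) in (...) form entirely.
duplicates = conn.execute(
    "SELECT a.peopleId, a.seq, a.note "
    "FROM vitae a "
    "JOIN (SELECT peopleId, seq FROM vitae "
    "      GROUP BY peopleId, seq HAVING COUNT(*) > 1) d "
    "  ON a.peopleId = d.peopleId AND a.seq = d.seq "
    "ORDER BY a.rowid"
).fetchall()
```

Because the join uses only single-column equality tests, the same query shape is accepted by engines that lack row-value support.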
