Home  >  Article  >  Backend Development  >  Python implements removing duplicates from a sequence while keeping the order of elements unchanged

Python implements removing duplicates from a sequence while keeping the order of elements unchanged

不言
不言forward
2018-10-15 14:15:332438browse

What this article brings to you is about Python's implementation of removing duplicates from a sequence and keeping the order of elements unchanged. It has certain reference value. Friends in need can refer to it. I hope it will be helpful to you. help.

1. Requirements

We want to remove duplicate elements that appear in the sequence, but still keep the order of the remaining elements unchanged.

If you just want to remove duplicates, then usually a simple enough method is to build a set:

a=[1,5,4,36,7,8,2,3,5,7]
#结果为:{1, 2, 3, 4, 5, 36, 7, 8}
print(set(a))

2. Solution

If the value in the sequence is hashable, then this problem can be easily solved by using collections and generators.

If an object is hashable, then it must be immutable during its lifetime, and it needs to have a __hash__() method. Integers, floating point numbers, strings, and elements are all immutable.
def dedupe(items):
    seen=set()
    for item in items:
        if item not in seen:
            yield item
            seen.add(item)

a=[1,2,3,1,9,1,5,10]
print(list(dedupe(a)))

Run result:

[1, 2, 3, 9, 5, 10]

This can only be done when the elements in the sequence are hashable. If you want to remove duplicates from a non-hashable object sequence, you need to modify the above code slightly:

def dedupe(items,key=None):
    seen=set()
    for item in items:
        value=item if key is None else key(item)
        if value not in seen:
            yield item
            seen.add(value)

a=[
    {'x':1,'y':2},
    {'x':1,'y':3},
    {'x':1,'y':4},
    {'x':1,'y':2},
    {'x':1,'y':3},
    {'x':1,'y':1},

]
print(list(dedupe(a,key=lambda d:(d['x'],d['y']))))

print(list(dedupe(a,key=lambda d:d['y'])))

Running results:

[{'x': 1, 'y': 2}, {'x': 1, 'y': 3}, {'x': 1, 'y': 4}, {'x': 1, 'y': 1}]
[{'x': 1, 'y': 2}, {'x': 1, 'y': 3}, {'x': 1, 'y': 4}, {'x': 1, 'y': 1}]

The function of the parameter key here is to specify a function to use Convert the elements in the sequence to a hashable type in order to detect duplicates.

The above is the detailed content of Python implements removing duplicates from a sequence while keeping the order of elements unchanged. For more information, please follow other related articles on the PHP Chinese website!

Statement:
This article is reproduced at:segmentfault.com. If there is any infringement, please contact admin@php.cn delete