How to implement java list deduplication operation

高洛峰
2017-01-22 15:53:04

A List in Java can contain duplicate elements, where two elements count as duplicates if they are equal according to equals (and therefore have the same hash code). There are two ways to deduplicate a List.
Option 1: Use a HashSet. The code is as follows:

class Student {
    private String id;
    private String name;

    public Student(String id, String name) {
        super();
        this.id = id;
        this.name = name;
    }

    @Override
    public String toString() {
        return "Student [id=" + id + ", name=" + name + "]";
    }

    @Override
    public int hashCode() {
        final int prime = 31;
        int result = 1;
        result = prime * result + ((id == null) ? 0 : id.hashCode());
        result = prime * result + ((name == null) ? 0 : name.hashCode());
        return result;
    }

    @Override
    public boolean equals(Object obj) {
        if (this == obj) {
            return true;
        }
        if (obj == null) {
            return false;
        }
        if (getClass() != obj.getClass()) {
            return false;
        }
        Student other = (Student) obj;
        if (id == null) {
            if (other.id != null) {
                return false;
            }
        } else if (!id.equals(other.id)) {
            return false;
        }
        if (name == null) {
            if (other.name != null) {
                return false;
            }
        } else if (!name.equals(other.name)) {
            return false;
        }
        return true;
    }
}

Student must override both hashCode and equals. We will see why in a moment.
The deduplication code is as follows:

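// Requires java.util.ArrayList, java.util.Arrays, java.util.HashSet, java.util.List, java.util.Set
private static void removeListDuplicateObject() {
    // Build a list of 10 equal Student objects.
    List<Student> list = new ArrayList<Student>();
    for (int i = 0; i < 10; i++) {
        Student student = new Student("id", "name");
        list.add(student);
    }
    System.out.println(Arrays.toString(list.toArray()));
    // Adding to a HashSet drops any element equal to one already present.
    Set<Student> set = new HashSet<Student>();
    set.addAll(list);
    System.out.println(Arrays.toString(set.toArray()));
    // Empty both collections and print them again.
    list.clear();
    set.clear();
    System.out.println(Arrays.toString(list.toArray()));
    System.out.println(Arrays.toString(set.toArray()));
}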

Calling code:

public static void main(String[] args) {
    removeListDuplicateObject();
}
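The HashSet approach can also be condensed into a small generic helper. The sketch below is an illustration rather than part of the original code: the class name DedupSketch and the method dedupe are hypothetical, and it uses String elements so that it runs on its own. It swaps HashSet for LinkedHashSet, because HashSet does not guarantee iteration order while LinkedHashSet keeps the order in which elements were first inserted.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashSet;
import java.util.List;

// Hypothetical helper class; the names here are not from the article.
class DedupSketch {

    // Returns a new list holding the first occurrence of each element,
    // in the original order. Elements are compared with equals/hashCode,
    // so the element type must override both methods consistently.
    static <T> List<T> dedupe(List<T> input) {
        return new ArrayList<T>(new LinkedHashSet<T>(input));
    }

    public static void main(String[] args) {
        List<String> names = Arrays.asList("b", "a", "b", "c", "a");
        System.out.println(dedupe(names)); // prints [b, a, c]
    }
}
```

The same helper would work for a list of Student objects, provided Student overrides hashCode and equals as shown above.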

Why must hashCode and equals be overridden when HashSet is used for deduplication?
Let's look at the source code of HashSet's add method:

public boolean add(E e) {
    return map.put(e, PRESENT)==null;
}

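HashSet delegates to an internal HashMap. Let's look at HashMap's put method (this listing is from an older JDK; newer versions are structured differently but apply the same equality test):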

public V put(K key, V value) {
    if (key == null)
        return putForNullKey(value);
    int hash = hash(key.hashCode());
    int i = indexFor(hash, table.length);
    for (Entry<K,V> e = table[i]; e != null; e = e.next) {
        Object k;
        if (e.hash == hash && ((k = e.key) == key || key.equals(k))) {
            V oldValue = e.value;
            e.value = value;
            e.recordAccess(this);
            return oldValue;
        }
    }
    modCount++;
    addEntry(hash, key, value, i);
    return null;
}

The line to note is:

if (e.hash == hash && ((k = e.key) == key || key.equals(k))) {
    ......
}

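That is, an existing entry is treated as the same key only when the hash codes are equal and the keys are either the same reference (==) or equal according to equals. If Student did not override both methods, it would inherit the identity-based versions from Object, and two Student objects with the same id and name would both stay in the set.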
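The effect of that test can be seen in a minimal sketch. The classes below (EqualityDemo, PlainStudent, KeyedStudent) are hypothetical stand-ins invented for this illustration and are not part of the original code: one inherits equals and hashCode from Object, the other overrides both based on id.

```java
import java.util.HashSet;
import java.util.Objects;
import java.util.Set;

// Hypothetical demo classes; none of these names come from the article.
class EqualityDemo {

    // Inherits Object's identity-based equals and hashCode.
    static class PlainStudent {
        final String id;
        PlainStudent(String id) { this.id = id; }
    }

    // Overrides both methods so that equality depends only on id.
    static class KeyedStudent {
        final String id;
        KeyedStudent(String id) { this.id = id; }

        @Override
        public boolean equals(Object obj) {
            return obj instanceof KeyedStudent
                    && Objects.equals(id, ((KeyedStudent) obj).id);
        }

        @Override
        public int hashCode() {
            return Objects.hashCode(id);
        }
    }

    // Two distinct objects with the same id: both are kept.
    static int plainSetSize() {
        Set<PlainStudent> set = new HashSet<PlainStudent>();
        set.add(new PlainStudent("id"));
        set.add(new PlainStudent("id"));
        return set.size();
    }

    // Two equal objects: the second add is rejected as a duplicate.
    static int keyedSetSize() {
        Set<KeyedStudent> set = new HashSet<KeyedStudent>();
        set.add(new KeyedStudent("id"));
        set.add(new KeyedStudent("id"));
        return set.size();
    }

    public static void main(String[] args) {
        System.out.println(plainSetSize()); // prints 2
        System.out.println(keyedSetSize()); // prints 1
    }
}
```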
Complexity: the list is traversed only once, and each HashSet insertion takes constant time on average, so the whole operation is O(n).
Option 2: Traverse the List directly and build a new list using contains and add.
The code is as follows:

// Requires java.util.ArrayList, java.util.Arrays, java.util.List
private static void removeListDuplicateObjectByList() {
    // Build a list of 10 equal Student objects.
    List<Student> list = new ArrayList<Student>();
    for (int i = 0; i < 10; i++) {
        Student student = new Student("id", "name");
        list.add(student);
    }
    System.out.println(Arrays.toString(list.toArray()));
    // Copy each element into listUniq only if an equal one is not already there.
    List<Student> listUniq = new ArrayList<Student>();
    for (Student student : list) {
        if (!listUniq.contains(student)) {
            listUniq.add(student);
        }
    }
    System.out.println(Arrays.toString(listUniq.toArray()));
    // Empty both lists and print them again.
    list.clear();
    listUniq.clear();
    System.out.println(Arrays.toString(list.toArray()));
    System.out.println(Arrays.toString(listUniq.toArray()));
}

The Student class and the calling code are the same as in Option 1.
Complexity:
Every iteration of the loop also calls contains. The ArrayList source code is as follows:

public boolean contains(Object o) {
    return indexOf(o) >= 0;
}

public int indexOf(Object o) {
    if (o == null) {
        for (int i = 0; i < size; i++)
            if (elementData[i]==null)
                return i;
    } else {
        for (int i = 0; i < size; i++)
            if (o.equals(elementData[i]))
                return i;
    }
    return -1;
}

You can see that contains performs another linear traversal of the new list. When every element is distinct, the comparisons add up to roughly 1+2+...+n, so the complexity is O(n*n).
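The difference can be made concrete by counting equals calls. The following is a rough sketch under stated assumptions, not a benchmark: ComparisonCountSketch, Item, and the static counter are hypothetical names introduced here, and the input is the worst case of n distinct elements with distinct hash codes.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical sketch; the names are not from the article.
class ComparisonCountSketch {

    // Counts every call to Item.equals.
    static long equalsCalls = 0;

    static class Item {
        final int id;
        Item(int id) { this.id = id; }

        @Override
        public boolean equals(Object obj) {
            equalsCalls++;
            return obj instanceof Item && ((Item) obj).id == id;
        }

        @Override
        public int hashCode() { return id; }
    }

    // Builds n items that are all different from each other.
    static List<Item> distinctItems(int n) {
        List<Item> items = new ArrayList<Item>();
        for (int i = 0; i < n; i++) {
            items.add(new Item(i));
        }
        return items;
    }

    // Option 2: contains scans the whole unique list for each element.
    static long countWithList(int n) {
        List<Item> source = distinctItems(n);
        equalsCalls = 0;
        List<Item> unique = new ArrayList<Item>();
        for (Item item : source) {
            if (!unique.contains(item)) {
                unique.add(item);
            }
        }
        return equalsCalls;
    }

    // Option 1: the set checks hash codes first and calls equals
    // only for entries whose hash matches.
    static long countWithSet(int n) {
        List<Item> source = distinctItems(n);
        equalsCalls = 0;
        Set<Item> unique = new HashSet<Item>(source);
        return equalsCalls;
    }

    public static void main(String[] args) {
        System.out.println(countWithList(100)); // prints 4950, i.e. 0+1+...+99
        System.out.println(countWithSet(100));  // far fewer equals calls
    }
}
```

For 100 distinct items the list version makes 0+1+...+99 = 4950 comparisons, which grows quadratically with n, while the set version relies on hash codes to avoid most of them.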
Conclusion:
The first option, deduplicating with a HashSet, is the more efficient of the two.
