1 Set and storage order
#The elements added toSet
mustDefine theequals()
method to ensure theuniquenessof the object.
hashCode()
Only when this class is placed inHashSet
orLinkedHashSet
is what is necessary. But as a matter of good programming style, you should always override the hashCode() method when overriding the equals() method.
If an object is used in any kind ofsorted container, such asSortedSet
(TreeSet
is its only implementation), then it must implement theComparable
interface.
Note thatSortedSet
means "according to thecomparison functionof the object Sorting" rather than "the order in which elements are inserted".Insertion orderUseLinkedHashSet
to save.
For concurrent applications, the only two implementations of Queue in Java SE5 areLinkedList
andPriorityQueue
, they only havesorting behaviordifferences, there is no difference in performance.
The order of priority queuePriorityQueue
is also controlled by implementingComparable
.
Map table(also calledassociative arrayAssociative Array).
HashMap
uses a special value, calledhash code(hash code), to Replaces slow searches for keys. A hash code is a"relatively unique"int
value used to represent an object. It is performed by combiningcertain information about the object. Generated by converting.
hashCode() is a method in the root class Object, so all objects can generate hash codes.
The requirements for keys used in a Map are the same as for the elements in a Set:
Any key must have anequals( )
Method;
If the key is used to hash the Map, it must also have the appropriatehashCode()
Method;
If the key is used forTreeMap
, then it must implementComparable
.
HashMap usesequals()
to determine whether the current key is consistent with the table The same keys exist.
The default Object.equals() just compares the address of the object.If you want to use your own class as the key of HashMap, you mustoverride bothhashCode()
andequals()
.
The correctequals()
method must meet the following 5 conditions:
Reflexivity.
symmetry.
Transitivity.
consistency.
For anyx
that is not null,x.equals(null)
must returnfalse
.
The purpose of using hashing is:You want to use one object to find another object.
The implementation class of Map uses hashing toincrease the query speed.
The value of hashing is speed:Hashing allows queries to be performed quickly. Since the bottleneck isquery speed, one solution is to keep the keys sorted and then use
Collections.binarySearch()
to query.Hashing goes one step further and saves the key somewhere so that it can be foundquickly. The fastest data structure to store a set of elements is an array, so use it to represent the key information (note carefully that I mean the key information, not the key itself). But because the array cannot adjust its capacity, there is a problem: we want to save an uncertain number of values in the Map, but what if the number of keys is limited by the capacity of the array?
The answer is:The array does not save the key itself. Instead, a number is generated from the key object and used as thesubscriptof the array. This number is thehash code, determined by the
hashCode()
method defined in Object and possibly overridden by your class. (called ahash functionin computer science terms) is generated.To solve the problem of fixed array capacity, different keys can produce the same subscript. That is, there may beconflicts, i.e. thehash codes do not have to be unique. Therefore, it doesn't matter how big the array is, any key will always find its place in the array.
To sum up,Hashis to generate a number from an object and save it (as the bottom of the array) standard), and then just find the number directly when searching for this object, so thepurposeof hashing is to improve the search speed, and themeansis to combine the number generated by an object with its Associated and saved (through an array, called ahash table). The generated number is thehash code. The method of generating this hash code is calledhash function(hashCode()
).
Therefore, the process of querying akey
inHashMap
is:
First calculate the hash code
Then use the hash code to query the array (the hash code is used as the variable array subscript)
If there is no conflict, that is, there is only one object that generates this hash code, then the position of the array subscript corresponding to the hash code is the element to be found
If there is a conflict, then scatter The array element where the subscript corresponding to the column code is located saves alist
, and then use theequals()
method to perform a linear query on the values inlist
.
list,
quickly jump to a certain position in the array, only forvery few The elements are compared. This is whyHashMapis so
fast.
slot(slot) in the hash table is usually called thebucket bit(bucket)
prime numbers(It is a prime number in JDK5, and it is already an integer of 2 in JDK7 Powered).
2 raised to the integer power . Division and remainder are the slowest operations on modern processors. Using a hash table with an integer power of 2 length, maskcan be used instead of division. Because get() is the most commonly used operation, the % operation of finding the remainder is the most expensive part, and using the integer power of 2 can eliminate this overhead (it may also have some impact on hashCode()).
Same location.
package net.mrliuli.containers; import java.util.*;public class SimpleHashMapextends AbstractMap { // Choose a prime number for the hash table size, to achieve a uniform distribution: static final int SIZE = 997; // You can't have a physical array of generics, but you can upcast to one: @SuppressWarnings("unchecked") LinkedList >[] buckets = new LinkedList[SIZE]; @Override public V put(K key, V value){ int index = Math.abs(key.hashCode()) % SIZE; if(buckets[index] == null){ buckets[index] = new LinkedList >(); } LinkedList > bucket = buckets[index]; MapEntry pair = new MapEntry (key, value); boolean found = false; V oldValue = null; ListIterator > it = bucket.listIterator(); while(it.hasNext()){ MapEntry iPair = it.next(); if(iPair.equals(key)){ oldValue = iPair.getValue(); it.set(pair); // Replace old with new found = true; break; } } if(!found){ buckets[index].add(pair); } return oldValue; } @Override public V get(Object key){ int index = Math.abs(key.hashCode()) % SIZE; if(buckets[index] == null) return null; for(MapEntry iPair : buckets[index]){ if(iPair.getKey().equals(key)){ return iPair.getValue(); } } return null; } @Override public Set > entrySet(){ Set > set = new HashSet >(); for(LinkedList > bucket : buckets){ if(bucket == null) continue; for(MapEntry mpair : bucket){ set.add(mpair); } } return set; } public static void main(String[] args){ SimpleHashMap m = new SimpleHashMap (); for(String s : "to be or not to be is a question".split(" ")){ m.put(s, s); System.out.println(m); } System.out.println(m); System.out.println(m.get("be")); System.out.println(m.entrySet()); } }
Whenever, calling hashCode() on the same phase object should produce thesame value.
meaningfulidentification information within the object should be used. That is, it must generate a hash code based on thecontentof the object.
"Effective Java™ Programming Language Guide (Addison-Wesley, 2001)" gives a basic guide on how to write a decent hashCode():
intvariable
result, such as
17
为对象内每个有意义的域f
(即每个可以做equals()
操作的域)计算出一个int
散列码c
:
域类型 | 计算 |
---|---|
boolean | c=(f?0:1) |
byte、char、short或int | c=(int)f |
long | c=(int)(f^(f>>>32)) |
float | c=Float.floatToIntBits(f); |
double | long l = Double.doubleToLongBits(f); |
Object,其equals()调用这个域的equals() | c=f.hashCode() |
数组 | 对每个元素应用上述规则 |
3. 合并计算散列码:result = 37 * result + c;
4. 返回result。
5. 检查hashCode()
最后生成的结果,确保相同的对象有相同的散列码。
已证明
0.0
是包含在Math.random()
的输出中的,按照数学术语,即其范围是[0,1)
。
HashMap中的一些术语:
容量(Capacity):表中的桶位数(The number of buckets in the table)。
初始容量(Initial capacity):表在创建时所拥有的桶位数。HashMap
和HashSet
都具有允许你指定初始容量的构造器。
尺寸(Size):表中当前存储的项数。
负载因子(Loadfactor):尺寸/容量。空表的负载因子是0
,而半满表的负载因子是0.5
,依此类推。负载轻的表产生冲突的可能性小,因此对于插入和查找都是最理想的(但是会减慢使用迭代器进行遍历的过程)。HashMap
和HashSet
都具有允许你指定负载因子的构造器,表示当负载情况达到该负载的水平时,容器将自动增加其容量(桶位数),实现方式是使容量大致加倍,并重新将现有对象分布到新的桶位集中(这被称为再散列)。
HashMap
使用的默认负载因子是0.75
(只有当表达到四分之三满时,才进行再散列),这个因子在时间和空间代价之间达到了平衡。更高的负载因子可以降低表所需的空间,但会增加查找代价,这很重要,因为查找是我们在大多数时间里所做的操作(包括get()
和put()
)。
Collections类有办法能够自动同步整个容器。其语法与“不可修改的”方法相似:
package net.mrliuli.containers; import java.util.*;public class Synchronization { public static void main(String[] args){ Collectionc = Collections.synchronizedCollection(new ArrayList ()); List list = Collections.synchronizedList(new ArrayList ()); Set s = Collections.synchronizedSet(new HashSet ()); Set ss = Collections.synchronizedSortedSet(new TreeSet ()); Map m = Collections.synchronizedMap(new HashMap ()); Map sm = Collections.synchronizedSortedMap(new TreeMap ()); } }
Java容器有一种保护机制能够防止多个进行同时修改同一个容器的内容。Java容器类类库采用快速报错(fail-fast)机制。它会探查容器上的任何除了你的进程所进行的操作以外的所有变化,一旦它发现其他进程修改了容器,就会立刻抛出ConcurrentModificationException
异常。这就是“快速报错”的意思——即,不是使用复杂的算法在事后来检查问题。
package net.mrliuli.containers; import java.util.*;public class FailFast { public static void main(String[] args){ Collectionc = new ArrayList<>(); Iterator it = c.iterator(); c.add("An Object"); try{ String s = it.next(); }catch(ConcurrentModificationException e){ System.out.println(e); } } }
相关文章:
The above is the detailed content of Java Programming Thoughts Learning Class (4) Chapter 17 - In-depth Discussion of Containers. For more information, please follow other related articles on the PHP Chinese website!