What is the expansion mechanism of HashMap?

The expansion mechanism of HashMap is to recalculate the capacity and replace the original array with a new, larger one. Every entry in the original array is re-indexed and inserted into the new array, and the table reference is then pointed at the new array. If the array had already reached its maximum capacity before expansion, the threshold is simply set to the maximum int value and the method returns without expanding.

Operating environment for this tutorial: Windows 7, Java 8, Dell G3 computer.

What is resize?

Capacity expansion (resize) means recalculating the capacity. As elements are continually added to a HashMap object, the array inside it eventually cannot hold any more, and the object must enlarge the array so that more elements can be accommodated. Arrays in Java cannot grow automatically, so the approach is to replace the existing small array with a new, larger one, just as we switch to a larger bucket when a small bucket cannot hold more water.

When will the capacity be expanded?

When an element is added, the map checks the number of elements it currently holds. If that number is greater than or equal to the threshold, that is, the length of the current array multiplied by the load factor, the map expands automatically.
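
As a minimal sketch of the trigger described above, the threshold can be computed from the capacity and the load factor. The class and method names here (ThresholdDemo, threshold, shouldExpand) are illustrative assumptions and are not taken from the JDK source.

```java
// Illustrative sketch only: ThresholdDemo and its methods are hypothetical names.
class ThresholdDemo {
    // threshold = array length * load factor, truncated to an int
    static int threshold(int capacity, float loadFactor) {
        return (int) (capacity * loadFactor);
    }

    // true when the element count has reached the threshold
    static boolean shouldExpand(int size, int capacity, float loadFactor) {
        return size >= threshold(capacity, loadFactor);
    }

    public static void main(String[] args) {
        System.out.println(threshold(16, 0.75f));        // 12
        System.out.println(shouldExpand(11, 16, 0.75f)); // false
        System.out.println(shouldExpand(12, 16, 0.75f)); // true
    }
}
```

With the default capacity of 16 and load factor of 0.75, the threshold is 12, so expansion is considered once the map already holds 12 elements.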

HashMap expansion principle

HashMap expansion means recalculating the capacity: as elements keep being added and the HashMap can no longer hold new ones, the object enlarges its array so that more elements can be accommodated.

Characteristics of HashMap expansion: the larger the load factor, the higher the space utilization and the more elements are stored before an expansion is triggered, so put operations are interrupted by fewer expansions. However, the linked lists tend to grow long, hash collisions become more likely, and get operations become slower. The smaller the load factor, the shorter the linked lists and the lower the collision probability, so get operations are faster. However, space utilization is low, and putting many elements causes frequent expansion, which hurts performance.
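
The trade-off above can be illustrated with a rough simulation that counts how many times the array would double while elements are inserted. This is a simplified model under stated assumptions: it only applies the "size >= capacity * load factor" rule and ignores everything else the real HashMap does. LoadFactorDemo and countResizes are hypothetical names.

```java
// Simplified model, not JDK code: counts doublings under the threshold rule only.
class LoadFactorDemo {
    static int countResizes(int elements, int initialCapacity, float loadFactor) {
        int capacity = initialCapacity;
        int resizes = 0;
        for (int size = 0; size < elements; size++) {
            // Before inserting the next element, check the threshold.
            if (size >= (int) (capacity * loadFactor)) {
                capacity *= 2; // expansion doubles the array length
                resizes++;
            }
        }
        return resizes;
    }

    public static void main(String[] args) {
        System.out.println(countResizes(1000, 16, 0.25f)); // 8
        System.out.println(countResizes(1000, 16, 0.75f)); // 7
        System.out.println(countResizes(1000, 16, 1.0f));  // 6
    }
}
```

In this model a smaller load factor leads to more expansions and a larger final array for the same 1000 elements, while a larger load factor expands less often at the cost of fuller buckets.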

The capacity expansion principle of HashMap: the original array is replaced with a new array. All the data in the original array is re-indexed and inserted into the new array, and the table reference is then pointed at the new array. If the array had already reached its maximum capacity before expansion, the threshold is set directly to the maximum int value and the method returns.

The expansion process

The following uses source code, pictures, and text descriptions to introduce the expansion process of HashMap. The methods shown (addEntry, createEntry, resize, transfer) come from the JDK 1.7 implementation.

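/**
 * Adds a node to the HashMap.
 *
 * @param hash        hash code generated from the current key
 * @param key         key to add to the HashMap
 * @param value       value to add to the HashMap
 * @param bucketIndex bucket, i.e. the array index that this entry maps to
 */
void addEntry(int hash, K key, V value, int bucketIndex) {
    // Conditions for expanding the array:
    //   1. the number of existing key-value mappings is greater than or equal to the threshold
    //   2. the slot at bucketIndex in the underlying array is not null
    if ((size >= threshold) && (null != table[bucketIndex])) {
        resize(2 * table.length); // the array length changes after expansion
        hash = (null != key) ? hash(key) : 0; // why is the hash computed again here?
        bucketIndex = indexFor(hash, table.length); // the index depends on the array length, which has changed, so it must be recomputed
    }
    createEntry(hash, key, value, bucketIndex);
}

/**
 * This is where linked lists appear. There are two cases:
 * 1. The bucket at bucketIndex was empty, so no linked list is formed.
 * 2. The bucket already held an entry. Per the Entry constructor, the new key-value mapping
 *    is placed in the array slot and the previous entry is attached to the new entry's next field.
 */
void createEntry(int hash, K key, V value, int bucketIndex) {
    HashMap.Entry<K, V> e = table[bucketIndex];
    table[bucketIndex] = new HashMap.Entry<>(hash, key, value, e);
    size++;
}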

In the addEntry method above, expansion (resize) is performed only if size (the number of elements currently in the container) is greater than or equal to threshold (the array length multiplied by the load factor) and the slot at bucketIndex in the underlying array is not null. Otherwise, no expansion occurs.

The following will focus on the expansion process:

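void resize(int newCapacity) {   // the new capacity is passed in
    Entry[] oldTable = table;    // reference to the Entry array before expansion
    int oldCapacity = oldTable.length;
    if (oldCapacity == MAXIMUM_CAPACITY) {  // the array is already at its maximum size (2^30)
        threshold = Integer.MAX_VALUE; // set the threshold to the maximum int (2^31-1) so no further expansion happens
        return;
    }

    Entry[] newTable = new Entry[newCapacity];  // allocate a new Entry array
    transfer(newTable);                         // !! move the data into the new Entry array (this line is incomplete; see the correction below)
    table = newTable;                           // the table field now references the new Entry array
    threshold = (int) (newCapacity * loadFactor); // update the threshold (this line is incomplete; see the correction below)
}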

Correction from blogger wenni328: transfer(newTable); ==> transfer(newTable, initHashSeedAsNeeded(newCapacity));
threshold = (int) (newCapacity * loadFactor); ==> threshold = (int) Math.min(newCapacity * loadFactor, MAXIMUM_CAPACITY + 1);

Before expanding, the method first stores a reference to the current array in the oldTable variable, and then checks whether the length of that array has already reached MAXIMUM_CAPACITY (2^30). If so, the threshold is set to Integer.MAX_VALUE and the expansion is abandoned, because the array is already at its maximum capacity and cannot grow further.
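
The maximum-capacity check and the corrected threshold formula quoted above can be sketched in isolation as follows. This is only a sketch: ResizeThresholdDemo and nextThreshold are hypothetical names, and the method merely mirrors the two branches of resize() without touching any table.

```java
// Sketch of the threshold update in resize(); names are illustrative assumptions.
class ResizeThresholdDemo {
    static final int MAXIMUM_CAPACITY = 1 << 30; // 2^30, as in the listing above

    static int nextThreshold(int oldCapacity, int newCapacity, float loadFactor) {
        if (oldCapacity == MAXIMUM_CAPACITY) {
            // Already at the maximum size: make the threshold unreachable.
            return Integer.MAX_VALUE;
        }
        // Corrected formula: the threshold is capped near MAXIMUM_CAPACITY.
        return (int) Math.min(newCapacity * loadFactor, MAXIMUM_CAPACITY + 1);
    }

    public static void main(String[] args) {
        System.out.println(nextThreshold(16, 32, 0.75f));                // 24
        System.out.println(nextThreshold(1 << 30, 1 << 30, 0.75f));      // 2147483647
    }
}
```

The cap matters mainly for load factors greater than 1, where newCapacity * loadFactor could otherwise exceed the range the table can ever reach.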

The picture below shows the state after the program executes the Entry[] newTable = new Entry[newCapacity]; code:

[Figure]

Here a larger array is used to replace the existing small array. The transfer() method copies the elements of the original Entry array into the new Entry array.

void transfer(Entry[] newTable) {
    Entry[] src = table;                   // src references the old Entry array
    int newCapacity = newTable.length;
    for (int j = 0; j < src.length; j++) { // iterate over the old Entry array
        Entry<K, V> e = src[j];            // take each element of the old Entry array
        if (e != null) {
            src[j] = null; // release the old array's reference (after the for loop, the old Entry array references no objects)
            do {
                Entry<K, V> next = e.next;
                int i = indexFor(e.hash, newCapacity); // !! recompute each element's position in the array
                e.next = newTable[i]; // mark [1]
                newTable[i] = e;      // place the element in the array slot
                e = next;             // move to the next element on the Entry chain
            } while (e != null);
        }
    }
}

static int indexFor(int h, int length) {
    return h & (length - 1);
}
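
To make the index recalculation concrete, the following small sketch applies the same indexFor formula to a few sample hash values before and after the array length doubles from 4 to 8. The class name IndexForDemo and the sample hashes are assumptions chosen for illustration.

```java
// Illustration of indexFor with sample values; IndexForDemo is a hypothetical name.
class IndexForDemo {
    // Same formula as in the listing above; length must be a power of two.
    static int indexFor(int h, int length) {
        return h & (length - 1);
    }

    public static void main(String[] args) {
        int oldCapacity = 4;
        int newCapacity = 8;
        for (int hash : new int[] {3, 7, 11, 15}) {
            System.out.println("hash " + hash
                    + ": old index " + indexFor(hash, oldCapacity)
                    + ", new index " + indexFor(hash, newCapacity));
        }
        // hash 3:  old index 3, new index 3
        // hash 7:  old index 3, new index 7
        // hash 11: old index 3, new index 3
        // hash 15: old index 3, new index 7
    }
}
```

All four hashes share index 3 in the old array, but after doubling each one lands either at the same index or at the old index plus the old capacity, depending on one extra bit of the hash.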

The reference held in newTable[i] is assigned to e.next; that is, head insertion into a singly linked list is used, so a new element at a given position is always placed at the head of the chain. As a result, an element that was placed at an index first ends up at the tail of the Entry chain (if hash collisions occur). Elements that shared one Entry chain in the old array may be placed at different positions in the new array once their index is recalculated.
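
The effect of head insertion can be reproduced with a stripped-down sketch. It assumes a hypothetical Node class that holds only a hash and a next pointer (no keys or values), and a transfer method that follows the same loop as the listing above but returns the new table instead of mutating a field.

```java
import java.util.ArrayList;
import java.util.List;

// Stripped-down sketch; TransferDemo and Node are hypothetical, not JDK classes.
class TransferDemo {
    static final class Node {
        final int hash;
        Node next;
        Node(int hash, Node next) { this.hash = hash; this.next = next; }
    }

    static int indexFor(int h, int length) { return h & (length - 1); }

    // Walk every old chain and head-insert each node into the new table.
    static Node[] transfer(Node[] src, int newCapacity) {
        Node[] newTable = new Node[newCapacity];
        for (int j = 0; j < src.length; j++) {
            Node e = src[j];
            src[j] = null;                 // drop the old array's reference
            while (e != null) {
                Node next = e.next;        // remember the rest of the old chain
                int i = indexFor(e.hash, newCapacity);
                e.next = newTable[i];      // head insertion
                newTable[i] = e;
                e = next;
            }
        }
        return newTable;
    }

    // Collects the hashes of one chain, head first.
    static List<Integer> chain(Node head) {
        List<Integer> hashes = new ArrayList<>();
        for (Node n = head; n != null; n = n.next) hashes.add(n.hash);
        return hashes;
    }

    // A 2-slot table whose bucket 1 holds the chain 3 -> 5 -> 7.
    static Node[] sampleTable() {
        Node[] table = new Node[2];
        table[1] = new Node(3, new Node(5, new Node(7, null)));
        return table;
    }

    public static void main(String[] args) {
        Node[] newTable = transfer(sampleTable(), 4);
        System.out.println(chain(newTable[1])); // [5]
        System.out.println(chain(newTable[3])); // [7, 3]
    }
}
```

The old chain 3 -> 5 -> 7 is split across two buckets, and the nodes that stay together (3 and 7) appear in reversed order because each one is inserted at the head.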

The transfer process is demonstrated with pictures below. Red text in each picture marks what has changed relative to the previous picture; this applies to all the pictures that follow and is not repeated in their descriptions.

The picture below shows the state after the program executes the src[j] = null; code (this is the state during the first loop):

[Figure]

First, the reference to the table[] array is assigned to src[].

Then, Entry e = src[j]; hands the linked list at position src[j] to the variable e. Since e now holds that linked list, src[j] can safely be set to null, leaving the old array free to be garbage collected.

The picture below shows the state after the program executes the Entry next = e.next; code (this is the state during the first loop):

[Figure]

The value of e.next is first saved in the next variable, because the code that follows changes where e.next points.

The picture below shows the state after the program executes the e.next = newTable[i]; code (this is the state during the first loop):

[Figure]

Since the value of newTable[3] is null, e.next is null, as shown in the figure above.

The picture below shows the state after the program executes the newTable[i] = e; code (this is the state during the first loop):

[Figure]

The picture below shows the state after the program executes the e = next; code (this is the state during the first loop):

[Figure]

As shown above, the Entry1 node has been successfully inserted into newTable. At the end of each iteration the loop checks e != null, so the process above repeats until every node has been moved to newTable.

Summary

Expansion is a particularly performance-intensive operation, so when using HashMap, programmers should estimate the size of the map and pass a rough value at initialization to avoid frequent expansion.
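
One common way to follow this advice is to derive the initial capacity from the expected number of elements and the load factor. The sketch below is an assumption about how one might do it, not a JDK utility: PresizeDemo and initialCapacityFor are hypothetical names, while new HashMap<>(int initialCapacity) is the standard constructor.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical helper for choosing an initial capacity; not part of the JDK.
class PresizeDemo {
    // Capacity large enough that expectedSize elements stay below the threshold.
    static int initialCapacityFor(int expectedSize, float loadFactor) {
        return (int) (expectedSize / loadFactor) + 1;
    }

    public static void main(String[] args) {
        int expected = 1000;
        Map<Integer, String> map = new HashMap<>(initialCapacityFor(expected, 0.75f));
        for (int i = 0; i < expected; i++) {
            map.put(i, "value-" + i);
        }
        System.out.println(map.size()); // 1000
    }
}
```

For 1000 expected elements and a load factor of 0.75 the helper returns 1334, which HashMap rounds up to the next power of two, so all 1000 elements fit without any expansion.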

The load factor can be changed, and may even be greater than 1, but changing it is not recommended unless the situation is very special.
HashMap is not thread-safe. Do not operate on a HashMap from multiple threads at the same time; ConcurrentHashMap is recommended instead.
JDK 1.8 introduces red-black trees, which greatly improve the performance of HashMap.
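
As a small illustration of the ConcurrentHashMap recommendation, the sketch below lets several threads increment one counter through merge, which ConcurrentHashMap performs atomically. ConcurrentCountDemo and countWithThreads are hypothetical names, and the key "hits" is arbitrary.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Illustrative sketch; ConcurrentCountDemo is a hypothetical class.
class ConcurrentCountDemo {
    static int countWithThreads(int threads, int incrementsPerThread) {
        Map<String, Integer> counts = new ConcurrentHashMap<>();
        Thread[] workers = new Thread[threads];
        for (int t = 0; t < threads; t++) {
            workers[t] = new Thread(() -> {
                for (int i = 0; i < incrementsPerThread; i++) {
                    counts.merge("hits", 1, Integer::sum); // atomic update per key
                }
            });
            workers[t].start();
        }
        try {
            for (Thread w : workers) {
                w.join(); // wait for every worker to finish
            }
        } catch (InterruptedException ex) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting", ex);
        }
        return counts.getOrDefault("hits", 0);
    }

    public static void main(String[] args) {
        System.out.println(countWithThreads(4, 1000)); // 4000
    }
}
```

The same loop written against a plain HashMap would not be safe, since concurrent puts and expansions can corrupt its internal state or lose updates.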