Home > Article > Java > It’s enough to read this article about Java collection framework

It’s enough to read this article about Java collection framework

Java学习指南forward: 2023-07-26 17:09:381172browse

Without further ado, let’s go directly to the picture:

It’s enough to read this article about Java collection framework

Java collections, also called containers, Mainly derived from two major interfaces (Interface):
Collection and Map

As the name suggests, containers are used to store data.

The difference between these two interfaces is:

Collection stores a single element;
Map stores key- value key-value pair.

That is, singles are placed in the Collection, and couples are placed in the Map. (So where do you belong?

Learning these collection frameworks, I think there are 4 goals:

Clear the corresponding relationship between each interface and class;
Be familiar with commonly used APIs for each interface and class;
Be able to choose appropriate data structures and analyze the advantages and disadvantages for different scenarios;
Learn source code design and be able to answer interviews.

Regarding Map, the previous HashMap article has been very thorough and detailed. So I won’t go into details in this article. If you haven’t read that article yet, please go to the official account and reply “HashMap” to read the article~

Collection

Let’s first look at the top-level Collection.

Collection also defines many methods, which will also be inherited into each sub-interface and implementation class. The use of these APIs is also a common test in daily work and interviews, so Let’s look at these methods first.

The collection of operations is nothing more than the four categories of "Add, Delete, Modify and Check", also called CRUD:

Create, Read, Update, and Delete.

Then I also divide these APIs into these four categories:

Function	Method
Add	add()/addAll()
Delete	remove()/ removeAll( )
Change	There is no
in Collection Interface#check	contains()/ containsAll()
Others	isEmpty()/size()/toArray()

Let’s look at it in detail below:

Added:

boolean add(E e);

add() The data type passed in by the method It must be Object, so when writing basic data types, auto-boxing and unboxing will be performed.

There is another method addAll(), which can add elements from another collection to this collection.

boolean addAll(Collection<? extends E> c);

Delete:

boolean remove(Object o);

remove() is the specified element to be deleted.

That corresponds to addAll(),
naturally has removeAll(), which is to delete all elements in set B.

boolean removeAll(Collection<?> c);

Change:

There is no direct operation to change elements in Collection Interface. Anyway, deletion and addition can complete the change!

查：

查下集合中有没有某个特定的元素：

boolean contains(Object o);

查集合 A 是否包含了集合 B：

boolean containsAll(Collection<?> c);

还有一些对集合整体的操作：

判断集合是否为空：

boolean isEmpty();

集合的大小：

int size();

把集合转成数组：

Object[] toArray();

以上就是 Collection 中常用的 API 了。

在接口里都定义好了，子类不要也得要。

当然子类也会做一些自己的实现，这样就有了不同的数据结构。

那我们一个个来看。

List

List 最大的特点就是：有序，可重复。

看官网说的：

An ordered collection (also known as a sequence).

Unlike sets, lists typically allow duplicate elements.

This time I also mentioned the characteristics of Set. It is completely opposite to List. Set is unordered and does not repeat.

List can be implemented in two ways: LinkedList and ArrayList. The most common question asked during interviews is how to choose between these two data structures.

For this type of selection problem:
The first is to consider whether the data structure can complete the required functions;
If both can be completed, the second is to consider which is more Efficient.

(Everything is like this.

Let’s look specifically at the APIs of these two classes and their time complexity:

Function	Method	ArrayList	LinkedList
increase	add(E e)	O(1)	O(1)
Add	add(int index, E e)	O(n)	O(n)
Delete	remove(int index)	O(n)	O(n)
Delete	remove(E e)	O(n)	O(n)
Change	set(int index, E e)	O(1)	O(n)
Check	get(int index)	O(1)	O(n)

A few explanations:

add(E e) is to add elements to the tail. Although the ArrayList may expand, the amortized time complexity ) is still O(1).

add(int index, E e) is to add an element at a specific position. LinkedList needs to find this position first and then add this element. Although it is a simple "add" action It's O(1), but finding this position is still O(n). (Some people think this is O(1). Just explain it clearly to the interviewer and refuse to take the blame.

remove(int index) is the element on the index of remove. So

The process of finding this element in ArrayList is O(1), but after remove, subsequent elements must be moved forward by one position, so the amortized complexity is O(n);
LinkedList also needs to find the index first. This process is O(n), so the overall process is also O(n).

remove(E e) is the first element seen by remove, then

ArrayList must first find this element. This process is O(n), and then move After division, you have to move one position forward, which is O(n), and the total is still O(n);
LinkedList also needs to be searched first, and this process is O(n) ), and then removed, this process is O(1), and the total is O(n).

What is the reason for the difference in time complexity?

Answer:

Because ArrayList is implemented using arrays.
The biggest difference between arrays and linked lists is that arrays can be accessed randomly (random access).

This feature causes You can get the number at any position in O(1) time through subscripting, but you can't do that with a linked list. You can only traverse it one by one from the beginning.

That is to say, in the two functions of "Change and Check" Because the array can be accessed randomly, ArrayList is highly efficient.

What about "addition and deletion"?

If you do not consider the time to find this element,

Because of the physical continuity of the array, when you want to add or delete elements, it is fine at the tail, but other places will cause subsequent elements to be deleted. Movement, so the efficiency is lower; while a linked list can easily disconnect from the next element, directly insert new elements or remove old elements.

But, in fact, you have to consider the time to find the element. . . And if it is operated at the tail, ArrayList will be faster when the amount of data is large.

So:

Change the search and select ArrayList;
Choose ArrayList for additions and deletions at the end;
In other cases, if the time complexity is the same, it is recommended to choose ArrayList because the overhead is smaller, or the memory Use more efficiently.

Vector

As the last knowledge point about List, let’s talk about Vector. This is also an age-revealing post, used by all the big guys.

The Vector, like the ArrayList, also inherits from java.util.AbstractList, and the bottom layer is also implemented using an array.

But it has been deprecated now because...it adds too many synchronized!

Every benefit comes at a cost. The cost of thread safety is low efficiency, which can easily become a bottleneck in some systems. Therefore, now we no longer add synchronized at the data structure level, but instead Transfer to our programmers==

So the common interview question: What is the difference between Vector and ArrayList? Just answering this is not very comprehensive.

Let’s take a look at the high-voted answers on stack overflow:

The first is the thread safety issue just mentioned;
The second is the difference in how much expansion is required during expansion.

You have to look at the source code for this:

This is the expansion implementation of ArrayList. This arithmetic right shift operation is to The binary number of this number is moved one bit to the right, and the leftmost complements the sign bit , but because the capacity does not have a negative number, it still adds 0.

The effect of shifting one bit to the right is to divide by 2 , then the new capacity defined is 1.5 times of the original capacity.

Let’s look at Vector again:

Because usually we do not define capacityIncrement, so by default it is expanded twice.

If you answer these two points, there will definitely be no problem.

Queue & Deque

Queue is a linear data structure with one end in and the other out; and Deque is both ends. All can be entered and exited.

Queue

The Queue interface in Java is a little bit confusing. Generally speaking, the semantics of queues They are all first in, first out (FIFO).

But there is an exception here, that is, PriorityQueue, also called heap, does not come out in the order of the time it enters, but goes out according to the specified priority, and its operation is not O(1), time The calculation of complexity is a bit complicated, and we will discuss it in a separate article later.

The Queue methodOfficial website^[1]It’s all summarized. It has two sets of APIs, and the basic functions are the same, but:

One group will throw an exception;
The other group will return a special value.

##Deleteremove()poll()Lookelement()peek()

Why does it throw an exception?

For example, if the queue is empty, remove() will throw an exception, but poll() will return null; element() will throw an exception, and peek() will Just return null.

How come add(e) throws an exception?

Some Queues have capacity limits, such as BlockingQueue. If it has reached its maximum capacity and will not be expanded, an exception will be thrown; but if the offer (e), it will return false.

So how to choose? ：

First of all, if you want to use it, use the same set of API, and the front and back must be unified;
Secondly, according to needs. If you need it to throw exceptions, use the one that throws exceptions; but it is basically not used when doing algorithm problems, so just choose the one that returns special values.

Deque

Deque can be entered and exited from both ends, so it is naturally targeted at the First end. Operations and operations on the Last side, each side has two groups, one group throws an exception, and the other group returns a special value:

Function	Throw exception	Return value
Add	add(e)	offer(e)

FunctionThrows an exception Return value##increaseDeleteLook

The same applies when used. If you want to use it, use the same group.

These APIs of Queue and Deque have a time complexity of O(1), to be precise, it is an amortized time complexity.

Implementation classes

Their implementation classes have these three:

So,

If you want to implement the semantics of "ordinary queue - first in, first out", use LinkedList or ArrayDeque to achieve it;
If If you want to realize the semantics of "priority queue", use PriorityQueue;
If you want to realize the semantics of "stack", use ArrayDeque.

Let’s look at them one by one.

When implementing a common queue, How to choose between LinkedList or ArrayDeque?

Let’s take a look at the top-voted answer on StackOverflow^[2]:

Summary It is recommended to use ArrayDeque because of its high efficiency, while LinkedList will have other additional overhead.

What are the differences between ArrayDeque and LinkedList?

Still with the same question just now, this is the best summary I think:

ArrayDeque It is an expandable array, and LinkedList is a linked list structure;
ArrayDeque cannot store null values, but LinkedList can;
ArrayDeque in It is more efficient to operate the addition and deletion operations at the head and tail, but the removal of LinkedList is O(1) only when an element in the middle is to be removed and the element has been found;
ArrayDeque is more efficient in terms of memory usage.

So, as long as you don’t have to store null values, choose ArrayDeque!

So if a very senior interviewer asks you, under what circumstances should you choose to use LinkedList?

答：Java 6 以前。。。因为 ArrayDeque 在 Java 6 之后才有的。。

为了版本兼容的问题，实际工作中我们不得不做一些妥协。。

那最后一个问题，就是关于 Stack 了。

Stack

Stack 在语义上是 后进先出（LIFO） 的线性数据结构。

有很多高频面试题都是要用到栈的，比如接水问题，虽然最优解是用双指针，但是用栈是最直观的解法也是需要了解的，之后有机会再专门写吧。

那在 Java 中是怎么实现栈的呢？

虽然 Java 中有 Stack 这个类，但是呢，官方文档都说不让用了！

原因也很简单，因为 Vector 已经过被弃用了，而 Stack 是继承 Vector 的。

那么想实现 Stack 的语义，就用 ArrayDeque 吧：

Deque<Integer> stack = new ArrayDeque<>();

Set

最后一个 Set，刚才已经说过了 Set 的特定是无序，不重复的。

就和数学里学的「集合」的概念一致。

Set 的常用实现类有三个：

HashSet: 采用 Hashmap 的 key 来储存元素，主要特点是无序的，基本操作都是 O(1) 的时间复杂度，很快。

LinkedHashSet: 这个是一个 HashSet + LinkedList 的结构，特点就是既拥有了 O(1) 的时间复杂度，又能够保留插入的顺序。

TreeSet: 采用红黑树结构，特点是可以有序，可以用自然排序或者自定义比较器来排序；缺点就是查询速度没有 HashSet 快。

那每个 Set 的底层实现其实就是对应的 Map：

The value is placed on the key in the map, and a PRESENT is placed on the value. It is a static Object, equivalent to a place holder, and each key points to this object.

So the specific implementation principle, add, delete, modify, check four operations, and hash conflict, hashCode( )/equals() and other issues have been discussed in the HashMap article, so I won’t go into details here. Friends who haven’t seen it can reply “HashMap” in the background of the official account to get the article~

Summary

Go back to the picture at the beginning, is it clearer? Woolen cloth?

There is actually a lot of content under each data structure, such as PriorityQueue. This article does not go into details, because it will take a long time for this guy to talk about it. .

If you think the article is good, the likes at the end of the article are back again, Remember to give me "like" and "reading"~

Finally, there are many Readers have been asking me about the communication group, but I haven’t done it because I have a jet lag and it’s difficult to manage it.

But now I have found a professional administrator to manage it with me, so "Sister Qi's Secret Base" is under preparation, and I will invite some big guys at home and abroad to come and bring it to everyone A different perspective.

The first phase of the exchange group is planned to be opened in mid-to-early July. I will send invitations in Moments then, so stay tuned!


addFirst(e)/ addLast(e)	offerFirst( e)/ offerLast(e)
removeFirst()/ removeLast()	pollFirst()/ pollLast()
getFirst()/ getLast()	peekFirst()/ peekLast()

The above is the detailed content of It’s enough to read this article about Java collection framework. For more information, please follow other related articles on the PHP Chinese website!

Statement：

This article is reproduced at:Java学习指南. If there is any infringement, please contact admin@php.cn delete

Previous article：Netty Journey: All the NIO knowledge points you want are available here!Next article：Netty Journey: All the NIO knowledge points you want are available here!

See more