This article will take you to understand the memory management and garbage collection algorithm of the V8 engine. I hope it will be helpful to you!
As we all know, JS automatically manages garbage collection, and developers do not need to care about memory allocation and recycling. Moreover, the garbage collection mechanism is also a commonly tested part in front-end interviews. This article mainly explains the generational garbage collection algorithm of V8. I hope that after reading this article, friends can have a painful
about the V8
garbage collection mechanism (haha, it is painful
!!!), the article mainly covers the following content:
V8
Memory limitations and solutionsScavenge
Algorithmreachability analysis algorithm
Logic and optimization methods for marking surviving objectsScavenge
The depth/breadth first difference of the algorithmGC
STW
Causes and optimization strategiesV8 initially Designed for browsers, there are fewer scenarios where large memory is used. There are restrictions on memory usage by default in the design, and only part of the memory is allowed to be used. The 64-bit system allows about 1.4g of memory, and the 32-bit system allows about 0.7g of memory. As shown in the following code, check the memory limit method of the dependent V8 engine in Node:
process.memoryUsage(); // 返回内存的使用量,单位字节 { rss: 22953984, // 申请的总的堆内存 heapTotal: 9682944, // 已使用的堆内存 heapUsed: 5290344, external: 9388 }
V8
There is another way to limit the memory usage size The important reason is that when the heap memory is too large, V8
takes a long time to perform garbage collection (1.5g
takes 50ms
), and it takes longer to do non-incremental garbage collection. Long time (1.5g
to 1s
). After explaining the garbage collection mechanism of V8
later, I believe everyone will be able to relate to it more.
Although the V8
engine has limits on memory usage, the method to modify the memory limit is also exposed, which is to add relevant parameters when starting the V8
engine. The following code is demonstrated in Modify the dependent V8
engine memory limit in Node
:
# 更改老生代的内存限制,单位mb node --max-old-space-size=2048 index.js # 更改新生代的内存限制,单位mb node --max-semi-space-size=1024=64 index.js
It should be noted here that the syntax of the changed new generation memory has been changed to the above writing method, and The unit has also changed from kb
to mb
. The old way of writing is node --max-new-space-size
. You can query the current ## through the following command. #NodeEnvironment to modify the syntax of the new generation memory:
node --v8-options | grep max
divide memory garbage into generations based on the survival time of objects. Generational garbage collection algorithms implement different methods for different types of memory garbage. recycling algorithm.
V8 Divide the memory into two types: New generation
and Old generation
:
), the old generation memory is stored in the old generation memory space (oldspace
), as shown in the following figure:
Old generation memory uses Mark-Compact
algorithms
!
algorithm is used. The specific implementation of Scavenge
is Cheney
Algorithm. Cheney
The algorithm is to divide the new generation memory space into two, one space is in use (FromSpace
), and the other space is in idle state (called ToSpace
) .
在内存开始分配时,首先在FromSpace
中进行分配,垃圾回收机制执行时会检查FromSpace
中的存活对象,存活对象会被会被复制到ToSpace
,非存活对象所占用的空间将被释放,复制完成后FromSpace
和ToSpace
的角色将翻转。当一个对象多次复制后依然处于存活状态,则认为其是长期存活对象,此时将发生晋升,然后该对象被移动到老生代空间oldSpace
中,采用新的算法进行管理。
Scavenge
算法其实就是在两个空间内来回复制存活对象,是典型的空间换时间做法,所以非常适合新生代内存,因为仅复制存活的对象且新生代内存中存活对象是占少数的。但是有如下几个重要问题需要考虑:
假设存在三个对象temp1、temp2、temp3
,其中temp2、temp3
都引用了temp1
,js代码示例如下:
var temp2 = { ref: temp1, } var temp3 = { ref: temp1, } var temp1 = {}
从FromSpace
中拷贝temp2
到ToSpace
中时,发现引用了temp1
,便把temp1
也拷贝到ToSpace
,是一个递归的过程。但是在拷贝temp3
时发现也引用了temp1
,此时再把temp1
拷贝过去则重复了。
要避免重复拷贝,做法是拷贝时给对象添加一个标记visited
表示该节点已被访问过,后续通过visited
属性判断是否拷贝对象。
还是上述引用关系,由于temp1
不需要重复拷贝,temp3
被拷贝到ToSpace
之后不知道temp1
对象在ToSpace
中的内存地址。
做法是temp1
被拷贝过去后该对象节点上会生成新的field
属性指向新的内存空间地址,同时更新到旧内存对象的forwarding
属性上,因此temp3
就可以通过旧temp1
的forwarding
属性找到在ToSpace
中的引用地址了。
内存对象同时存在于新生代和老生代之后,也带来了问题:
const temp1 = {} const temp2 = { ref: temp1, }
比如上述代码中的两个对象temp1
和temp2
都存在于新生代,其中temp2
引用了temp1
。假设在经过GC
之后temp2
晋升到了老生代,那么在下次GC
的标记阶段,如何判断temp1
是否是存活对象呢?
在基于可达性分析算法中要知道temp1
是否存活,就必须要知道是否有根对象引用
引用了temp1
对象。如此的话,年轻代的GC
就要遍历所有的老生代对象判断是否有根引用对象引用了temp1
对象,如此的话分代算法就没有意义了。
解决版本就是维护一个记录所有的跨代引用的记录集,它是写缓冲区
的一个列表。只要有老生代中的内存对象指向了新生代内存对象时,就将老生代中该对象的内存引用记录到记录集中。由于这种情况一般发生在对象写的操作,顾称此为写屏障,还一种可能的情况就是发生在晋升时。记录集的维护只要关心对象的写操作和晋升操作即可。此是又带来了另一个问题:
优化的手段是在一些Crankshaft
操作中是不需要写屏障的,还有就是栈上内存对象的写操作是不需要写屏障的。还有一些,更多的手段就不在这里过多讨论。
Scavenge
算法内存利用率不高问题新生代内存中存活对象占比是相对较小的,因此可以在分配空间时,ToSpace
可以分配的小一些。做法是将ToSpace
空间分成S0
和S1
两部分,S0
用作于ToSpace
,S1
与原FromSpace
合并当成FromSpace
。
垃圾回收算法中,识别内存对象是否是垃圾的机制一般有两种:引用计数和基于可达性分析。
Based on reachability analysis, it is to find all root references (such as global variables, etc.), traverse all root references, all references on the recursive root reference, everything that is traversed is Live objects and mark them. At this time, other memory objects in the space are dead objects, thus constructing a directed graph.
Taking into account the limitations of recursion, recursive logic generally adopts non-recursive implementation. Common ones include breadth-first and depth-first algorithms. The difference between the two is:
ToSpace
, making objects with reference relationships closer. The reason is that after copying itself, the object referenced by itself is copied directly, so the related objects are closer in ToSpace
Because of the CPU's caching strategy, when reading a memory object, there is a high probability that the objects behind it will be read together, in order to hit the cache faster. Because a very common scenario during code development is obj1.obj2.obj3
. At this time, when the CPU reads obj1
, if the following obj2
, If obj3
is read together, it will be very helpful to hit the cache.
So the depth-first algorithm is more conducive to business logic hitting the cache, but its implementation requires additional stack-assisted implementation of the algorithm, which consumes memory space. Breadth first, on the contrary, cannot improve cache hits, but its implementation can use pointers to cleverly avoid space consumption, and the algorithm execution efficiency is high.
If memory objects in the new generation want to be promoted to the old generation, they need to meet the following conditions:
Scavenge
RecyclingToSpace
The memory usage ratio cannot exceed the limitJudge whether it has been experienced# The logic of ##Scavenge's GC is to give the
age attribute of the surviving object
1 every time
GC, and when
GC## is done again Just judge the age
attribute when #. The basic promotion diagram is as follows:
There are many long-term surviving objects in the old generation memory, and the reason why the
Scavenge algorithm cannot be recycled is:
(Mark-Sweep
) and Mark-Sweep
(Mark -Compact
) combination method. Marking and clearing is divided into two parts:
As shown in the figure above, one problem with mark clearing is that after clearing, there is a discontinuous space that cannot be used anymore, so memory cleaning of the old generation memory space is required. A solution combined with tagging. This solution is to move the living objects to one side during the marking process, and then clean up and remove all non-living objects outside the boundary after the movement is completed.
Full pause of garbage collection", which is often called Stop The World, referred to as STW
. Garbage collection of young generation memory has little impact on application execution, but because there are many surviving objects in old generation memory, the impact of full pauses caused by garbage collection of old generation memory is very large.
In order to optimize the full pause time of GC, V8 also introduces
incremental markers, concurrency markers
,parallel Mark
, Incremental cleaning
, Parallel cleaning
, Delayed cleaning
and other methods.
. The impact of STW is unacceptable, so V8 has also adopted many optimization methods. <ul><li>Parallel GC</li></ul>
<p>The GC process needs to do a lot of things, which will lead to the STW phenomenon on the main thread. The method of parallel GC is to open multiple auxiliary threads to share the GC work. This approach still cannot avoid the STW phenomenon, but it can reduce the total time of STW, depending on the number of auxiliary threads enabled. </p>
<p><img src="https://img.php.cn/upload/image/795/713/176/165106317370481Lets%20talk%20about%20V8s%20memory%20management%20and%20garbage%20collection%20algorithm" title="165106317370481Lets talk about V8s memory management and garbage collection algorithm" alt="1Lets talk about V8s memory management and garbage collection algorithm"></p>
<ul><li>Incremental<code>GC
Incremental GC splits the GC work and executes it on the main thread Intermittent step-by-step execution. This approach will not reduce the GC time. On the contrary, it will cost a little, but it will also reduce the total STW time of the GC.
GC
Concurrent GC means that GC runs in the background and no longer runs on the main thread . This approach will avoid the STW phenomenon.
GC
The rendering of animations in Chrome
is approx. ##60 frames (each frame is about
16ms), if the current rendering time reaches
16.6ms, then there will be free time to do other things, such as part
GCTask.
Node. Otherwise, storing a large number of user session objects in the memory will cause the memory of the old generation to surge, affecting cleanup performance and thus affecting application execution performance and memory overflow. Improved ways to use redis, etc. Benefits of moving the cache externally:
nodejs tutorial!
The above is the detailed content of Let's talk about V8's memory management and garbage collection algorithm. For more information, please follow other related articles on the PHP Chinese website!