Let’s talk about the GC (garbage collection) mechanism in Node.js
How does GC (garbage collection) work in Node.js? This article walks you through it.
GC stands for Garbage Collection. In programming, it generally refers to an automatic memory reclamation mechanism that periodically frees data that is no longer needed.
Node.js runs on the V8 engine. V8 is a high-performance JavaScript engine open sourced by Google and written in C++.
The memory of Node.js is mainly divided into three parts:
Code space: where code segments are stored;
Stack: holds the temporary variables created as functions are called. In the usual simplified picture, these are primitive values, such as numbers, strings, and Booleans, plus object references (only the address is stored, not the object itself);
Heap: stores objects and other data.
Because Node.js is built on V8, the rest of this article explains V8's memory reclamation mechanism.
First of all, all objects in JS are stored in heap memory. When the process is created, heap memory of an initial size is allocated, and our objects are placed in it.
As more and more objects are created, the heap runs short of space and expands dynamically. If it reaches the maximum limit (often about 4 GB on current machines), a heap overflow error occurs and the Node.js process is terminated.
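To see the limit that applies on your own machine, the built-in v8 module reports heap statistics. The following is a minimal sketch; the helper name toMB is only for illustration and is not part of any API.

```javascript
const v8 = require('v8');

// Illustrative helper (not a Node.js API): bytes -> "x.xx MB".
const toMB = (bytes) => (bytes / 1024 / 1024).toFixed(2) + ' MB';

const stats = v8.getHeapStatistics();

// heap_size_limit: the ceiling the V8 heap may grow to before
// the "heap out of memory" error is raised.
console.log('heap_size_limit:', toMB(stats.heap_size_limit));
// total_heap_size: what V8 has reserved for the heap so far.
console.log('total_heap_size:', toMB(stats.total_heap_size));
// used_heap_size: how much of the reserved heap is currently occupied.
console.log('used_heap_size: ', toMB(stats.used_heap_size));
```

The printed limit is what the process would have to exceed before the overflow described above occurs.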
V8 first divides the heap into two parts, or two generations:
Young generation (also called the new generation): holds short-lived objects;
Old generation: holds long-lived or permanently resident objects.
The new generation is small and holds short-lived objects, such as temporary objects created during function calls. It is collected frequently.
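The size difference between the two generations can be observed at runtime. The sketch below uses v8.getHeapSpaceStatistics(); the space names new_space and old_space are what current Node.js versions report, so treat them as an assumption if you run a very different build.

```javascript
const v8 = require('v8');

// Each entry describes one V8 heap space; all sizes are in bytes.
const spaces = v8.getHeapSpaceStatistics();

// Pick out the two generations (names assumed, see the note above).
const newSpace = spaces.find((s) => s.space_name === 'new_space');
const oldSpace = spaces.find((s) => s.space_name === 'old_space');

// Skip a space if this build does not report it.
for (const s of [newSpace, oldSpace].filter(Boolean)) {
  const sizeMB = (s.space_size / 1024 / 1024).toFixed(2);
  const usedMB = (s.space_used_size / 1024 / 1024).toFixed(2);
  console.log(`${s.space_name}: size ${sizeMB} MB, used ${usedMB} MB`);
}
```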
The size of the new generation can be changed with the following flag, where SIZE is in MB:
node --max-semi-space-size=SIZE index.js
The maximum size of the old generation is set in the same way, with:
--max-old-space-size=SIZE
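The effect of the old-generation flag can be checked by starting child processes with different values and asking each one for its heap limit. This is a sketch only: heapLimitWith and probe are illustrative names, and the values 256 and 1024 are arbitrary.

```javascript
const { execFileSync } = require('child_process');

// Script executed in the child: print V8's heap limit in bytes.
const probe = 'console.log(require("v8").getHeapStatistics().heap_size_limit)';

// Illustrative helper: launch a child Node.js process with the given
// old-generation cap (in MB) and return the heap limit it reports.
function heapLimitWith(maxOldSpaceMB) {
  const out = execFileSync(
    process.execPath,
    [`--max-old-space-size=${maxOldSpaceMB}`, '-e', probe],
    { encoding: 'utf8' }
  );
  return Number(out.trim());
}

const small = heapLimitWith(256);
const large = heapLimitWith(1024);

console.log('limit with 256 MB cap: ', (small / 1024 / 1024).toFixed(0), 'MB');
console.log('limit with 1024 MB cap:', (large / 1024 / 1024).toFixed(0), 'MB');
```

The reported limit is usually somewhat above the value passed in, because it also accounts for space outside the old generation.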
The new generation uses the Scavenge algorithm, which is a copying algorithm.
The new generation is divided into two spaces, each called a semispace. They are:
From space: newly declared objects are placed here;
To space: the space objects are relocated to.
Newly declared objects are placed in the From space. Objects there are packed tightly: allocation simply advances a pointer, so each object sits right after the previous one. Because the memory is contiguous, there is no need to worry about fragmentation.
Memory fragmentation means that free space is scattered unevenly, leaving many small free blocks, none of which is large enough to hold a large object.
When the From space is almost full, V8 traverses it to find the live objects and copies them to the To space. At that point the From space is effectively empty, so the two spaces swap roles: To becomes the new From, and From becomes the new To.
Objects that have been copied several times are considered long-lived and are moved to the old generation.
The advantage of this copying algorithm is that it handles memory fragmentation very well. The disadvantage is that it wastes some space, since one semispace is always reserved as the destination for copying. In addition, because copying is time-consuming, the algorithm is not suited to large memory spaces; it serves more as an auxiliary GC.
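The copy-and-swap cycle described above can be sketched as a toy model. This is not V8's implementation: objects are plain JavaScript records, the promotion threshold PROMOTE_AGE is an assumed value, and the toy walks every reachable object, whereas a real collector avoids scanning the whole old generation.

```javascript
// Assumption for illustration: promote after surviving 2 scavenges.
const PROMOTE_AGE = 2;

// Toy "heap object": a name, outgoing references, and a survival count.
const makeObj = (name) => ({ name, refs: [], age: 0 });

// Copy live objects out of fromSpace; return the new From space
// (the former To space). Unreached objects are simply left behind.
function scavenge(fromSpace, roots, oldGeneration) {
  const inFrom = new Set(fromSpace);
  const toSpace = [];
  const visited = new Set();
  const queue = [...roots];

  while (queue.length > 0) {
    const obj = queue.shift();
    if (visited.has(obj)) continue;
    visited.add(obj);
    queue.push(...obj.refs);

    if (!inFrom.has(obj)) continue; // already lives in the old generation
    obj.age += 1;
    if (obj.age >= PROMOTE_AGE) {
      oldGeneration.push(obj); // survived long enough: promote
    } else {
      toSpace.push(obj); // copy to the To space
    }
  }
  return toSpace; // roles swap: To becomes the new From
}

const a = makeObj('a');
const b = makeObj('b');
a.refs.push(b);
const roots = [a];
const oldGen = [];

// First cycle: "temp" is unreachable and is dropped; a and b are copied.
const afterFirst = scavenge([a, b, makeObj('temp')], roots, oldGen);
console.log(afterFirst.map((o) => o.name), oldGen.map((o) => o.name));
// prints [ 'a', 'b' ] []

// Second cycle: a and b survive again and are promoted.
const afterSecond = scavenge([...afterFirst, makeObj('temp2')], roots, oldGen);
console.log(afterSecond.map((o) => o.name), oldGen.map((o) => o.name));
// prints [] [ 'a', 'b' ]
```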
The old generation is much larger than the new generation and holds long-lived objects. It uses the Mark-Sweep algorithm.
The first phase is marking. Starting from the root set (the execution stack and global objects), V8 finds every reachable object and marks it as live.
The second phase is sweeping. Unmarked objects are cleared, which in practice means their memory addresses are marked as free.
This approach leads to fragmentation of the free memory. When we later create a large contiguous object, there may be no single free block big enough to hold it. In that case Mark-Compact must be used to consolidate the scattered live objects.
Mark-Compact moves all live objects toward one end of the space, so that everything on the other side of the boundary is one contiguous block of free memory.
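The three steps (mark, sweep, compact) can be illustrated with a toy heap modeled as an array of slots. This is a sketch under simplifying assumptions, not how V8 lays out memory: objects are plain records, and because they refer to each other by JavaScript reference, the pointer-rewriting step that a real compactor needs is skipped.

```javascript
// Mark: collect every object reachable from the roots.
function mark(roots) {
  const marked = new Set();
  const stack = [...roots];
  while (stack.length > 0) {
    const obj = stack.pop();
    if (marked.has(obj)) continue;
    marked.add(obj);
    stack.push(...obj.refs);
  }
  return marked;
}

// Sweep: slots holding unmarked objects become free (null).
function sweep(heap, marked) {
  return heap.map((cell) => (cell !== null && marked.has(cell) ? cell : null));
}

// Compact: slide live objects to one end; free slots end up contiguous.
function compact(heap) {
  const live = heap.filter((cell) => cell !== null);
  const free = new Array(heap.length - live.length).fill(null);
  return live.concat(free);
}

// Render the heap, using "." for a free slot.
const show = (heap) => heap.map((c) => (c ? c.name : '.')).join(' ');

const [a, b, c, d, e] = ['a', 'b', 'c', 'd', 'e'].map((name) => ({ name, refs: [] }));
a.refs.push(c); // a -> c -> e are reachable; b and d are garbage
c.refs.push(e);

const heap = [a, b, c, d, e, null];
const swept = sweep(heap, mark([a]));
const compacted = compact(swept);

console.log(show(heap));      // prints "a b c d e ."
console.log(show(swept));     // prints "a . c . e ."
console.log(show(compacted)); // prints "a c e . . ."
```

After sweeping, the three free slots are scattered, so no two-slot object would fit between the live ones; after compaction the free slots form one contiguous run.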
Because Mark-Sweep and Mark-Compact take a long time and block the JavaScript thread, V8 usually does not run them in one go. Instead it uses incremental marking: marking proceeds in small steps, with garbage collection and application logic taking turns.
In addition, V8 performs marking and sweeping in parallel to improve efficiency.
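The idea of marking in small steps can be sketched with a worklist and a per-step budget. This is a toy model with illustrative names (createIncrementalMarker, budget); it keeps the object graph fixed between steps, whereas a real engine also needs write barriers to track references that the application changes while marking is paused.

```javascript
// Toy incremental marker: marks at most `budget` objects per step.
function createIncrementalMarker(roots) {
  const marked = new Set();
  const worklist = [...roots];

  return {
    marked,
    // Do a bounded slice of marking work; return true once finished.
    step(budget) {
      let processed = 0;
      while (worklist.length > 0 && processed < budget) {
        const obj = worklist.pop();
        if (marked.has(obj)) continue;
        marked.add(obj);
        worklist.push(...obj.refs);
        processed += 1;
      }
      return worklist.length === 0;
    },
  };
}

// Build a chain of 10 objects: n0 -> n1 -> ... -> n9.
const nodes = Array.from({ length: 10 }, (_, i) => ({ name: `n${i}`, refs: [] }));
for (let i = 0; i < nodes.length - 1; i++) {
  nodes[i].refs.push(nodes[i + 1]);
}

const marker = createIncrementalMarker([nodes[0]]);
let steps = 0;
let done = false;
while (!done) {
  done = marker.step(3); // a short marking pause
  steps += 1;
  // Application logic would run here, between marking steps.
}

console.log(`marked ${marker.marked.size} objects in ${steps} steps`);
// prints "marked 10 objects in 4 steps"
```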
We can get some memory-related information through the process.memoryUsage method.
process.memoryUsage();
The output content is:
{ rss: 35454976, heapTotal: 7127040, heapUsed: 5287088, external: 958852, arrayBuffers: 11314 }
Field descriptions:
rss: resident set size, which includes the code segments, heap memory, stack, and other parts.
heapTotal: Total heap memory size of V8;
heapUsed: Heap memory occupied;
external: Memory used outside the V8 heap, meaning memory occupied by C++ objects bound to JavaScript objects, such as Buffer data.
arrayBuffers: Memory allocated for ArrayBuffer and SharedArrayBuffer instances, which is included in external.
The units of the above numbers are bytes.
Let's write a script that uses a timer to make an array grow continuously and prints the heap memory usage until the memory overflows.
// Convert a byte count to a string in MB.
const format = function (bytes) {
  return (bytes / 1024 / 1024).toFixed(2) + " MB";
};

// Print the current V8 heap totals.
const printMemoryUsage = function () {
  const memoryUsage = process.memoryUsage();
  console.log(
    `heapTotal: ${format(memoryUsage.heapTotal)}, heapUsed: ${format(
      memoryUsage.heapUsed
    )}`
  );
};

// Keep every array reachable so that none of them can be collected.
const bigArray = [];
setInterval(function () {
  bigArray.push(new Array(20 * 1024 * 1024));
  printMemoryUsage();
}, 500);
Take special care not to use Buffer for this test.
Buffer is a binary object specific to Node.js. It is not implemented in V8 but separately by Node.js in C++. Its memory is not allocated through the V8 heap, so it counts as off-heap memory.
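The difference shows up directly in process.memoryUsage(). The sketch below allocates a 50 MB Buffer (an arbitrary size chosen for illustration) and compares the change in heapUsed with the change in arrayBuffers; toMB is an illustrative helper.

```javascript
const SIZE = 50 * 1024 * 1024; // 50 MB, arbitrary size for the demo

// Illustrative helper: bytes -> "x.xx MB".
const toMB = (bytes) => (bytes / 1024 / 1024).toFixed(2) + ' MB';

const before = process.memoryUsage();
const buf = Buffer.alloc(SIZE); // zero-filled; backing memory is off-heap
const after = process.memoryUsage();

// The V8 heap barely moves, while arrayBuffers grows by about SIZE.
console.log('heapUsed delta:    ', toMB(after.heapUsed - before.heapUsed));
console.log('arrayBuffers delta:', toMB(after.arrayBuffers - before.arrayBuffers));
console.log('buffer length:     ', toMB(buf.length));
```

This is why a Buffer-based test would not push heapUsed toward the heap limit.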
The computer I used is a MacBook Pro with an M1 Pro chip. The Node.js version is v16.17.0, and the V8 version is 9.4.146.26-node.22 (obtained via process.versions.v8).
The output result is (some redundant information is omitted):
heapTotal: 164.81 MB, heapUsed: 163.93 MB
heapTotal: 325.83 MB, heapUsed: 323.79 MB
heapTotal: 488.59 MB, heapUsed: 483.84 MB
...
heapTotal: 4036.44 MB, heapUsed: 4003.37 MB
heapTotal: 4196.45 MB, heapUsed: 4163.29 MB

<--- Last few GCs --->

[28033:0x140008000] 17968 ms: Mark-sweep 4003.2 (4036.4) -> 4003.1 (4036.4) MB, 2233.8 / 0.0 ms (average mu = 0.565, current mu = 0.310) allocation failure scavenge might not succeed
[28033:0x140008000] 19815 ms: Mark-sweep 4163.3 (4196.5) -> 4163.1 (4196.5) MB, 1780.3 / 0.0 ms (average mu = 0.413, current mu = 0.036) allocation failure scavenge might not succeed

<--- JS stacktrace --->

FATAL ERROR: Reached heap limit Allocation failed - JavaScript heap out of memory
...
As the output shows, the heap limit is exceeded once usage passes roughly 4000 MB; a heap overflow occurs and the process exits. On my machine, then, the default maximum heap size is about 4 GB.
The actual maximum depends on the machine the process runs on. If your machine has 2 GB of memory, the maximum will be set to 1.5 GB.