This article first briefly introduces the common methods of garbage collection, then analyzes the collection principle of the G1 collector, its advantages compared with other garbage collectors, and finally gives some tuning practices.
First of all, before understanding G1, we need to clearly know what garbage collection is? Simply put, garbage collection is to recycle objects that are no longer used in memoryObjects.
Basic steps of garbage collection
There are 2 steps in recycling:
Find objects that are no longer used in memory
Release the memory occupied by these objects
1. Find objects that are no longer used in the memory
Then the question is, how to determine which objects are no longer used? ? We also have 2 methods:
#ReferenceCounting method
##1. Mark-Copy
It divides the available memory capacity into two equal-sized blocks, and only uses one of them at a time. When this block is used up, the remaining one will be #. ##Copy the object
to another block, and then clear the used memory space at once. Its advantages are simple implementation, high efficiency, and no memory fragmentation. The disadvantage is that it requires 2 times the memory to manage. . The marking and clearing algorithm is divided into two stages: "marking" and "clearing": first, mark the objects that need to be recycled, and then clear the objects after the marking is completed. The advantage is high efficiency, but the disadvantage is that it is prone to memory fragmentation. 3. Marking-organizingThe marking operation is consistent with the "marking-cleaning" algorithm, and subsequent operations do not just clean up the object directly, but After the cleaning of useless objects is completed, all surviving objects are moved to one end and the pointers that refer to their objects are updated. Because the objects are moved, it is less efficient than "mark-clean", but it does not produce memory fragmentation. . Based on generational assumptionsBecause the survival time of objects can be long or short, for objects with long survival times, reducing the number of times they are gc can avoid unnecessary overhead. The memory is divided into the new generation and the old generation. The new generation stores newly created objects and objects with a relatively short survival time, and the old generation stores objects with a relatively long survival time. In this way, only the young generation is cleaned every time, and the old generation is only cleaned when necessary, which can greatly improve GC efficiency and save GC time. The history of java garbage collectorThe first stage, Serial (serial) collectorBefore jdk1.3.1, the java virtual machine could only use the Serial collector . The Serial collector is a single-threaded collector, but its "single-threaded" meaning does not only mean that it will only use one CPU or one collection thread to complete garbage collection work, but more importantly, it will use only one CPU or one collection thread to complete garbage collection work. , all other worker threads must be paused until its collection is complete. PS: How to turn on the Serial collector-XX:+UseSerialGCSecond stage, Parallel (parallel) collectorParallel collector Also known as the throughput collector, compared to the Serial collector, the main advantage of Parallel is that it uses multiple threads to complete garbage cleaning work, which can make full use of the characteristics of multi-core and greatly reduce the gc time. PS: How to open the Parallel collector-XX:+UseParallelGC -XX:+UseParallelOldGCThe third stage, CMS (concurrent) collectorCMS collector will suspend all application threads during Minor GC and perform garbage collection in a multi-threaded manner. During Full GC, the application thread is no longer suspended. Instead, several background threads are used to scan the old generation space regularly and recycle objects no longer used in it in a timely manner. PS: How to open the CMS collector-XX:+UseParNewGC -XX:+UseConcMarkSweepGCThe fourth stage, G1 (concurrent) collectorThe G1 collector (or garbage-first collector) is designed to minimize pauses when processing very large heaps (greater than 4GB). The advantage over CMS is that the generation rate of memory fragments is greatly reduced. PS: How to open the G1 collector-XX:+UseG1GCThe first paper (Appendix 1) of G1 was published in 2004 and was only available in jdk1.7u4 in 2012. Oracle officially plans to turn G1 into the default garbage collector in jdk9 to replace CMS. Why does Oracle strongly recommend G1? What are the advantages of G1?
First of all, G1’s design principle is simple and feasible performance tuning
Developers only need to declare the following parameters:
-XX: +UseG1GC -Xmx32g -XX:MaxGCPauseMillis=200
Among them -XX:+UseG1GC turns on the G1 garbage collector, -Xmx32g designs the maximum heap memory to 32G, -XX:MaxGCPauseMillis=200 sets the maximum GC The pause time is 200ms. If we need to tune, when the memory size is certain, we only need to modify the maximum pause time.
Secondly, G1 cancels the physical space division between the new generation and the old generation.
In this way, we no longer need to set up each generation in a separate space, and we do not need to worry about whether the memory of each generation is enough.
Instead, the G1 algorithm divides the heap into several regions (Regions), which still belong to the generational collector. However, part of these areas includes the new generation. Garbage collection in the new generation still uses the method of pausing all application threads and copying surviving objects to the old generation or Survivor space. The old generation is also divided into many areas, and the G1 collector completes the cleanup work by copying objects from one area to another. This means that during normal processing, G1 completes the compression of the heap (at least part of the heap), so there will be no cms memory fragmentation problem.
In G1, there is also a special area called the Humongous area. If an object occupies more than 50% of the partition capacity, the G1 collector considers it a giant object. These giant objects will be allocated directly in the old generation by default, but if it is a short-lived giant object, it will have a negative impact on the garbage collector. In order to solve this problem, G1 divides a Humongous area, which is used to store giant objects. If a huge object cannot fit in one H partition, G1 will look for a continuous H partition to store it. In order to find the continuous H area, Full GC sometimes has to be started.
PS: In Java 8, the persistent generation has also been moved to the ordinary heap memory space and changed to metaspace.
Object Allocation Strategy
Speaking of the allocation of large objects, we have to talk about the object allocation strategy. It is divided into 3 stages:
TLAB (Thread Local Allocation Buffer) thread local allocation buffer
Allocation in the Eden area
Humongous area allocation
TLAB allocates buffers locally for threads. Its purpose is to allocate objects as quickly as possible. If objects are allocated in a shared space, we need to use some synchronization mechanism to manage free space pointers in these spaces. In the Eden space, each thread has a fixed partition for allocating objects, that is, a TLAB. When allocating objects, no synchronization is required between threads.
For objects that cannot be allocated in the TLAB space, the JVM will try to allocate them in the Eden space. If the Eden space cannot accommodate the object, space can only be allocated in the old generation.
Finally, G1 provides two GC modes, Young GC and Mixed GC, both of which are Stop The World (STW). Below we will introduce these two modes respectively.
Young GC mainly performs GC on the Eden area. It will be triggered when the Eden space is exhausted. In this case, the data in the Eden space is moved to the Survivor space. If the Survivor space is not enough, part of the data in the Eden space will be directly promoted to the old generation space. The data in the Survivor area is moved to the new Survivor area, and some data is also promoted to the old generation space. Finally, the data in the Eden space is empty, the GC stops working, and the application thread continues to execute.
At this time, we need to consider a question. If we only GC the new generation objects, how do we find all the root objects? Are all objects in the old generation roots? Scanning like this will take a lot of time. Therefore, G1 introduced the concept of RSet. Its full name is Remembered Set, and its function is to track object references pointing to a certain heap area.
In CMS, there is also the concept of RSet. There is an area in the old generation to record references to the new generation. This is a point-out. When performing Young GC, when scanning the root, only this area needs to be scanned, and there is no need to scan the entire old generation.
But in G1, point-out is not used. This is because one partition is too small and there are too many partitions. If point-out is used, it will cause a lot of scanning waste, and some partitions do not require GC at all. References were also scanned. So point-in is used in G1 to solve it. Point-in means which partitions refer to objects in the current partition. This way, scanning these objects only as roots avoids invalid scans. Since there are multiple young generations, do we need to record references between new generations? This is unnecessary because all new generations will be scanned every time GC occurs, so only references from the old generation to the new generation need to be recorded.
It should be noted that if there are many referenced objects, the assignor needs to process each reference, and the assignor overhead will be very large. In order to solve the problem of assignor overhead, another one is introduced in G1 Concept, Card Table. A Card Table logically divides a partition into continuous areas of fixed size, and each area is called a card. Cards are usually smaller, between 128 and 512 bytes. Card Table is usually a byte array, and the space address of each partition is identified by the index of Card (i.e., array subscript). By default, each card is unreferenced. When an address space is referenced, the value of the array index corresponding to this address space is marked as "0", that is, marked as dirty and referenced. In addition, RSet also records the array subscript. Under normal circumstances, this RSet is actually a Hash Table, the Key is the starting address of another Region, the Value is a set, and the elements inside are the Index of the Card Table.
Young GC Phases:
Phase 1: Root Scan
Static and local objects are scanned
Phase 2 : Update RS
Process dirty card queue update RS
Phase 3: Process RS
Detect objects pointing from the young generation to the old generation
Phase 4: Object copy
Copy the surviving object to the survivor/old area
Phase 5: Process the reference queue
Soft reference, weak reference, virtual reference Processing
At this stage, G1 GC marks the root. This phase is closely related to regular (STW) young generation garbage collection.
G1 GC scans references to the old generation in the initially marked survival area and marks the referenced objects. This phase runs concurrently with the application (non-STW), and only after this phase is completed can the next STW young generation garbage collection begin.
G1 GC searches the entire heap for accessible (live) objects. This phase runs concurrently with the application and can be interrupted by STW young generation garbage collection
This phase is STW collection that helps complete the marking cycle. G1 GC clears the SATB buffer, keeps track of live objects that have not been accessed, and performs reference handling.
In this final phase, G1 GC performs STW operations of statistics and RSet purification. During the accounting period, G1 GC identifies areas that are completely free and areas that are available for mixed garbage collection. The cleanup phase is partially concurrent as it resets the empty area and returns it to the free list.
After traversing all reachable objects, all reachable objects turn black. Unreachable objects are white and need to be cleaned up.
This looks great, but if the application is running during the marking process, the object pointer may change. In this case, we will encounter a problem: object loss problem
Let’s look at the following situation, when the garbage collector scans the following situation:
At this time, the application performed the following operations:
A.c=C
B.c=null
In this way, the object's state diagram becomes the following situation:
At this time, when the garbage collector marks the scan again, it will look like this:
Obviously, C is white at this time , is considered to be garbage and needs to be cleaned up, which is obviously unreasonable. So how do we ensure that the objects marked by GC are not lost when the application is running? There are 2 possible ways as follows:
Record the object when inserting
Record the object when deleting
This happens to correspond to the two different implementation methods of CMS and G1:
In CMS, incremental update (Incremental update) is used, as long as it is found in the write barrier (write barrier) that there is If a reference to a white object is assigned to a field of a black object, the white object will be turned gray. That is, it is recorded when inserted.
In G1, the STAB (snapshot-at-the-beginning) method is used to record all objects when deleting. It has three steps:
1, at the beginning When marking, a snapshot graph is generated to mark the surviving objects
2. During concurrent marking, all changed objects are enqueued (in the write barrier, all objects pointed to by old references become non-white )
3, there may be free garbage, which will be collected next time
In this way, G1 can now know which old partitions can recycle the most garbage. When the global concurrent marking is completed, at a certain moment, Mix GC starts. These garbage collections are called "hybrid" because they not only perform normal young generation garbage collection, but also collect some partitions marked by the background scan thread. Mixed garbage collection is as shown below:
Hybrid GC also uses a copy cleaning strategy. When the GC is completed, the space will be released again.
At this point, the hybrid GC has come to an end. In the next section we will get into tuning practice.
MaxGCPauseMillis tuning
The most basic parameters for using GC were introduced earlier:
- XX:+UseG1GC -Xmx32g -XX:MaxGCPauseMillis=200
The first two parameters are easy to understand. How to configure the latter MaxGCPauseMillis parameter? This parameter literally means the maximum pause time allowed for GC. G1 tries to ensure that each GC pause time is within the set MaxGCPauseMillis range. So how does G1 achieve the maximum pause time? This involves another concept, CSet (collection set). It means the set of areas that are collected in one garbage collector.
Young GC: Select all regions in the new generation. Control the cost of young GC by controlling the number of regions in the new generation.
Mixed GC: Select all regions in the new generation, plus several old generation regions with high collection income based on global concurrent marking statistics. Select the old generation region with high revenue as much as possible within the user-specified cost target.
After understanding this, it will be easier for us to set the maximum pause time. First, there is a limit to the maximum pause time we can tolerate, and we need to set it within this limit. But what value should be set? We need to make a balance between throughput and MaxGCPauseMillis. If MaxGCPauseMillis is set too small, GC will be frequent and throughput will decrease. If MaxGCPauseMillis is set too large, the application pause time will become longer. The default pause time of G1 is 200 milliseconds. We can start from here and adjust the appropriate time.
Other tuning parameters
-XX:G1HeapRegionSize=n
The size of the G1 region set. Values are powers of 2 and range from 1 MB to 32 MB. The goal is to divide around 2048 regions based on the minimum Java heap size.
-XX:ParallelGCThreads=n
Set the value of the number of STW worker threads. Set the value of n to the number of logical processors. The value of n is the same as the number of logical processors, up to a maximum of 8.
If there are more than eight logical processors, set the value of n to approximately 5/8 of the number of logical processors. This works in most cases except for larger SPARC systems, where the value of n can be around 5/16 of the number of logical processors.
-XX:ConcGCThreads=n
Set the number of threads marked in parallel. Set n to approximately 1/4 of the number of parallel garbage collection threads (ParallelGCThreads).
-XX:InitiatingHeapOccupancyPercent=45
Set the Java heap occupancy threshold that triggers the marking cycle. The default occupancy is 45% of the entire Java heap.
Avoid using the following parameters:
Avoid explicitly setting the young generation size using the -Xmn option or other related options such as -XX:NewRatio. Fixed young generation size overrides pause time goals.
Trigger Full GC
In some cases, G1 triggers Full GC. At this time, G1 will degenerate and use the Serial collector to complete the garbage cleanup work. It only uses a single thread to complete it. GC is working, and the GC pause time will reach the second level. The entire application is in a state of suspended animation and cannot handle any requests. Of course, our program does not want to see this. So what are the situations where Full GC occurs?
Concurrent mode failure
G1 starts the marking cycle, but the old generation is filled before Mix GC, and G1 will give up at this time Mark cycle. In this case, you need to increase the heap size or adjust the cycle (such as increasing the number of threads -XX:ConcGCThreads, etc.).
Promotion failure or evacuation failure
G1 does not have enough memory for surviving objects or promotion objects to use when performing GC, which triggers Full GC. You can see (to-space exhausted) or (to-space overflow) in the log. The way to solve this problem is:
a. Increase the value of the -XX:G1ReservePercent option (and increase the total heap size accordingly) to increase the amount of reserved memory for the "target space".
b, Start the marking cycle in advance by reducing -XX:InitiatingHeapOccupancyPercent.
c, you can also increase the number of parallel marking threads by increasing the value of the -XX:ConcGCThreads option.
Failed to allocate giant objects
When a giant object cannot find suitable space for allocation, Full GC will be started to release the space . In this case, you should avoid allocating a large number of giant objects, increase memory, or increase -XX:G1HeapRegionSize so that the giant object is no longer a giant object.
Due to limited space, there are many tuning practices for G1, so I won’t list them one by one here. You can explore them slowly in daily practice. Finally, I look forward to the official release of Java 9. Will the performance of Java, which uses G1 as the garbage collector by default, improve again?
The above is the detailed content of Parsing the Java G1 Garbage Collector. For more information, please follow other related articles on the PHP Chinese website!