Research on recycling mechanism in Java virtual machine

一个新手 (Original)
2017-09-07 15:38:41


1. Overview

Garbage collection (GC) is closely associated with Java: programmers do not have to manage dynamic memory allocation and deallocation by hand. As the name suggests, garbage collection releases the space occupied by garbage, and the JVM handles all of it. This article answers three questions:

1. Which memory needs to be recycled? (Which objects can be regarded as "garbage")
2. How to recycle? (Commonly used garbage collection algorithms)
3. What tools are used for recycling? (Garbage collector)

2. JVM garbage determination algorithm

The two commonly used garbage determination algorithms are reference counting and reachability analysis.

1. Reference counting algorithm

Java programs operate on objects through references. In reference counting, each object carries a reference counter: whenever a new reference to the object is created, the counter is incremented by 1; when a reference goes away, the counter is decremented by 1. An object whose counter is 0 can no longer be used and can be treated as "garbage" to be reclaimed.

Reference counting is simple to implement and decides quickly whether an object is garbage. However, it cannot handle circular references (object A refers to object B and object B refers to object A, but nothing else refers to either of them), and every counter increment and decrement adds overhead. For these reasons mainstream JVMs do not use it. The following code creates such a cycle:

public class Main {
    public static void main(String[] args) {
        MyTest test1 = new MyTest();
        MyTest test2 = new MyTest();

        test1.obj = test2;
        test2.obj = test1; // test1 and test2 now reference each other

        test1 = null;
        test2 = null;

        System.gc(); // request a collection
    }
}

class MyTest {
    public Object obj = null;
}

Although test1 and test2 are finally set to null, so the two objects can no longer be accessed, the objects still refer to each other. Under reference counting their counters would never drop to 0, and the collector would never reclaim them. Yet memory analysis after running the program shows that both objects are in fact reclaimed, which shows that current mainstream JVMs do not use reference counting to determine garbage.
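
To make the circular-reference problem concrete, here is a minimal sketch of a hand-written reference counter. It is a toy model, not how any JVM manages memory; the names RefCounted, retain() and release() are assumptions made up for this illustration.

```java
// A toy model of reference counting; this is NOT how HotSpot manages memory.
// RefCounted, retain() and release() are hypothetical names for illustration.
class RefCountCycleDemo {

    static class RefCounted {
        int count = 0;          // number of references currently pointing here
        RefCounted field;       // the single outgoing reference this object holds

        void retain()  { count++; }
        void release() { count--; }
    }

    // Builds a -> b -> a, drops the two external references,
    // and returns the counter values that remain.
    static int[] run() {
        RefCounted a = new RefCounted();
        RefCounted b = new RefCounted();
        a.retain();                 // external reference, like "test1"
        b.retain();                 // external reference, like "test2"

        a.field = b; b.retain();    // like test1.obj = test2
        b.field = a; a.retain();    // like test2.obj = test1

        a.release();                // like test1 = null
        b.release();                // like test2 = null

        return new int[] { a.count, b.count };
    }

    public static void main(String[] args) {
        int[] counts = run();
        // Both counters stay at 1, so a pure reference counter never frees the pair.
        System.out.println("a=" + counts[0] + ", b=" + counts[1]);
    }
}
```

Both counters end at 1 even though no outside reference remains, which is exactly the situation a pure reference counter cannot resolve.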

2. Reachability analysis algorithm (root search algorithm)

The root search algorithm takes a set of "GC Roots" objects as starting points and searches downward from them. The paths traversed are called reference chains (Reference Chain). When no reference chain connects an object to any GC Root, the object is unavailable.

GC Roots objects include:
a) Objects referenced from the virtual machine stack (the local variable table in a stack frame).
b) Objects referenced by static class fields in the method area.
c) Objects referenced by constants in the method area.
d) Objects referenced by JNI (that is, native methods) in the native method stack.
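
The search from GC Roots can be sketched as a plain graph traversal. The Node class and the single root below are assumptions for illustration; a real JVM walks its own internal object layout, not a Java collection.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Collections;
import java.util.Deque;
import java.util.IdentityHashMap;
import java.util.List;
import java.util.Set;

class ReachabilityDemo {

    // Hypothetical stand-in for a heap object with outgoing references.
    static class Node {
        final String name;
        final List<Node> refs = new ArrayList<>();
        Node(String name) { this.name = name; }
    }

    // Returns every node reachable from the given roots (breadth-first search).
    static Set<Node> reachable(List<Node> roots) {
        Set<Node> seen = Collections.newSetFromMap(new IdentityHashMap<Node, Boolean>());
        Deque<Node> queue = new ArrayDeque<>(roots);
        seen.addAll(roots);
        while (!queue.isEmpty()) {
            Node n = queue.poll();
            for (Node next : n.refs) {
                if (seen.add(next)) {   // add() is false if already visited
                    queue.add(next);
                }
            }
        }
        return seen;
    }

    // Graph: root -> a -> b, plus an isolated cycle c <-> d.
    static boolean[] run() {
        Node root = new Node("root");
        Node a = new Node("a");
        Node b = new Node("b");
        Node c = new Node("c");
        Node d = new Node("d");
        root.refs.add(a);
        a.refs.add(b);
        c.refs.add(d);   // c and d reference each other ...
        d.refs.add(c);   // ... but nothing reachable from the root points at them

        Set<Node> live = reachable(Collections.singletonList(root));
        return new boolean[] { live.contains(b), live.contains(c) };
    }

    public static void main(String[] args) {
        boolean[] r = run();
        System.out.println("b reachable: " + r[0] + ", c reachable: " + r[1]);
    }
}
```

Unlike reference counting, the traversal never visits the c/d cycle, so both objects are treated as unreachable no matter how they reference each other.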

In reachability analysis, an unreachable object is not necessarily doomed; it is only on "probation". Declaring an object dead takes at least two marking passes. If reachability analysis finds no reference chain connecting the object to GC Roots, the object is marked for the first time and filtered. The filter condition is whether the object needs to run its finalize() method. If the object does not override finalize(), or the virtual machine has already called its finalize(), the virtual machine treats it as "no need to execute". Note that the system automatically runs the finalize() method of any object at most once.

If the object does need to run finalize(), it is placed in a queue called F-Queue, and a low-priority Finalizer thread that the virtual machine creates automatically executes it later. "Execute" here means that the virtual machine triggers the method but does not promise to wait for it to finish. The reason is that if finalize() runs slowly, or loops forever, other objects in F-Queue could wait permanently, and the whole memory reclamation system could even crash. Therefore, a call to finalize() does not guarantee that the code in the method runs to completion.

The finalize() method is the object's last chance to escape death. Later, the GC performs a second, small-scale marking of the objects in F-Queue. To rescue itself in finalize(), the object only needs to re-associate itself with any object on a reference chain, for example by assigning itself (the this keyword) to a class variable or to a field of another object. It is then removed from the "about to be reclaimed" set during the second marking. If the object has not escaped by then, it is essentially certain to be reclaimed. The following code shows that an object can have its finalize() executed and still survive.

/**
 * This code demonstrates two points:
 * 1. An object can rescue itself when it is about to be collected.
 * 2. It gets this chance only once, because the system automatically calls
 *    an object's finalize() method at most once.
 */
public class FinalizeEscapeGC {

    public static FinalizeEscapeGC SAVE_HOOK = null;

    public void isAlive() {
        System.out.println("yes, i am still alive :)");
    }

    @Override
    protected void finalize() throws Throwable {
        super.finalize();
        System.out.println("finalize method executed!");
        FinalizeEscapeGC.SAVE_HOOK = this;
    }

    public static void main(String[] args) throws Throwable {
        SAVE_HOOK = new FinalizeEscapeGC();

        // The object rescues itself successfully the first time
        SAVE_HOOK = null;
        System.gc();
        // The Finalizer thread has low priority, so pause 0.5 seconds to wait for it
        Thread.sleep(500);
        if (SAVE_HOOK != null) {
            SAVE_HOOK.isAlive();
        } else {
            System.out.println("no, i am dead :(");
        }

        // This block is identical to the one above, but this time the rescue fails
        SAVE_HOOK = null;
        System.gc();
        // The Finalizer thread has low priority, so pause 0.5 seconds to wait for it
        Thread.sleep(500);
        if (SAVE_HOOK != null) {
            SAVE_HOOK.isAlive();
        } else {
            System.out.println("no, i am dead :(");
        }
    }
}

Running results:

finalize method executed!
yes, i am still alive :)
no, i am dead :(

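The output shows that the GC did trigger the finalize() method of the SAVE_HOOK object, and that the object escaped before it was collected.
Also note that the code contains two identical fragments, yet the first escape succeeds and the second fails. The system automatically calls an object's finalize() method only once; when the object faces collection again, finalize() is not run again, so the second rescue attempt fails.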

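3. JVM garbage collection algorithms

Commonly used garbage collection algorithms include mark-sweep, copying, mark-compact, and generational collection.

1. Mark-sweep algorithm (Mark-Sweep) (the algorithm used by DVM)

The mark-sweep algorithm has two phases: "mark" and "sweep". The mark phase identifies and marks all objects to be reclaimed. The sweep phase follows immediately and clears the objects that the mark phase found unusable. Mark-sweep is the basic collection algorithm. Neither phase is very efficient, and sweeping leaves behind many non-contiguous free regions, so when the program later needs to allocate a large object it may fail to find enough contiguous space.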
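
The fragmentation effect can be illustrated with a toy heap of fixed-size slots. This is a minimal sketch under assumed names (occupied, garbage); it models only the sweep phase and takes the result of marking as given.

```java
class MarkSweepDemo {

    // Sweeps a toy heap of equal-sized slots. garbage[i] is the result of the
    // mark phase: true means slot i was marked for reclamation.
    // Returns {total free slots, largest contiguous run of free slots}.
    static int[] sweep(boolean[] occupied, boolean[] garbage) {
        int free = 0;
        int run = 0;
        int largest = 0;
        for (int i = 0; i < occupied.length; i++) {
            if (occupied[i] && garbage[i]) {
                occupied[i] = false;            // sweep phase: reclaim the slot in place
            }
            if (!occupied[i]) {
                free++;
                run++;
                largest = Math.max(largest, run);
            } else {
                run = 0;                        // a surviving object interrupts the free run
            }
        }
        return new int[] { free, largest };
    }

    public static void main(String[] args) {
        boolean[] occupied = { true, true, true, true, true, true, true, true };
        boolean[] garbage  = { false, true, false, true, false, true, false, true };
        int[] r = sweep(occupied, garbage);
        System.out.println("free slots: " + r[0] + ", largest contiguous run: " + r[1]);
    }
}
```

With every second slot reclaimed, four slots are free but no two of them are adjacent, so a request for two contiguous slots would fail in this model.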

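2. Copying algorithm (Copying)

The copying algorithm divides memory into two blocks of equal size and uses only one at a time. During garbage collection it copies the surviving objects to the other block and then clears the used block in one step. Copying is simple to implement and runs efficiently, but because only half of the memory is usable at any time, memory utilization is low. Current JVMs use copying to collect the young generation. Since most young-generation objects (98%) die shortly after they are created, the two areas are not split 1:1 (the ratio is roughly 8:1).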
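
A minimal sketch of one copying collection over two equal-sized blocks follows. The arrays from, to and live are assumptions for illustration; liveness is supplied by the caller rather than computed.

```java
import java.util.Arrays;

class CopyingDemo {

    // Copies live objects from 'from' into 'to' contiguously, then clears
    // 'from' as a whole. Returns the number of objects copied.
    static int collect(String[] from, boolean[] live, String[] to) {
        int next = 0;                           // bump pointer into the to-space
        for (int i = 0; i < from.length; i++) {
            if (from[i] != null && live[i]) {
                to[next++] = from[i];
            }
        }
        Arrays.fill(from, null);                // the whole from-space is reclaimed in one step
        return next;
    }

    public static void main(String[] args) {
        String[] from = { "a", "b", "c", "d" };
        boolean[] live = { true, false, false, true };
        String[] to = new String[from.length];
        int copied = collect(from, live, to);
        System.out.println(copied + " survivors: " + Arrays.toString(to));
    }
}
```

The survivors end up packed at the start of the second block, so the space after them is one contiguous free area, at the price of keeping a whole block in reserve.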

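3. Mark-compact algorithm (Mark-Compact)

The mark-compact algorithm marks objects the same way mark-sweep does. However, instead of copying surviving objects to another block of memory, it moves them toward one end of memory and then reclaims everything beyond the boundary directly. Mark-compact improves memory utilization and suits the old generation, where objects live longer.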
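
The move-to-one-end step can be sketched on a single array. As in the earlier sketches, the heap and live arrays are assumed names, and the marking result is given as input.

```java
import java.util.Arrays;

class MarkCompactDemo {

    // Slides live objects toward index 0 and returns the new boundary:
    // every slot at or beyond the returned index is free afterwards.
    static int compact(String[] heap, boolean[] live) {
        int boundary = 0;
        for (int i = 0; i < heap.length; i++) {
            if (heap[i] != null && live[i]) {
                heap[boundary++] = heap[i];     // boundary <= i, so nothing live is overwritten
            }
        }
        for (int i = boundary; i < heap.length; i++) {
            heap[i] = null;                     // reclaim everything past the boundary
        }
        return boundary;
    }

    public static void main(String[] args) {
        String[] heap = { "a", "b", "c", "d", "e" };
        boolean[] live = { true, false, true, false, true };
        int boundary = compact(heap, live);
        System.out.println("boundary=" + boundary + " heap=" + Arrays.toString(heap));
    }
}
```

No second block is needed, and the free space after the boundary is contiguous; the cost is that surviving objects have to be moved.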

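4. Generational collection (Generational Collection)

Generational collection divides memory into several blocks according to object lifetimes. The Java heap is generally split into a young generation and an old generation, so that each can use the collection algorithm best suited to its characteristics. In the young generation, every collection finds that most objects have died and only a few survive, so the copying algorithm is used; the collection costs only the copying of the few survivors. In the old generation, objects have a high survival rate and there is no extra space to guarantee their allocation, so "mark-sweep" or "mark-compact" must be used.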
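
How objects move from the young generation to the old generation can be sketched with an age counter. Everything here is a simplified assumption for illustration: the Obj class, the PROMOTE_AGE threshold of 2, and the idea that liveness is a fixed flag.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;

class GenerationalDemo {

    // Hypothetical heap object: 'live' says whether it is still reachable.
    static class Obj {
        final String name;
        final boolean live;
        int age = 0;                            // number of young collections survived
        Obj(String name, boolean live) { this.name = name; this.live = live; }
    }

    static final int PROMOTE_AGE = 2;           // assumed threshold, chosen for the example

    // One young-generation collection: drop dead objects, age the survivors,
    // and move those that reach PROMOTE_AGE into the old generation.
    static void minorGc(List<Obj> young, List<Obj> old) {
        Iterator<Obj> it = young.iterator();
        while (it.hasNext()) {
            Obj o = it.next();
            if (!o.live) {
                it.remove();
            } else if (++o.age >= PROMOTE_AGE) {
                it.remove();
                old.add(o);
            }
        }
    }

    // Returns {young size, old size} after two young collections.
    static int[] run() {
        List<Obj> young = new ArrayList<>();
        List<Obj> old = new ArrayList<>();
        young.add(new Obj("temp1", false));
        young.add(new Obj("temp2", false));
        young.add(new Obj("cache", true));
        minorGc(young, old);                    // only "cache" survives, now age 1
        young.add(new Obj("temp3", false));
        minorGc(young, old);                    // "cache" reaches age 2 and is promoted
        return new int[] { young.size(), old.size() };
    }

    public static void main(String[] args) {
        int[] r = run();
        System.out.println("young=" + r[0] + ", old=" + r[1]);
    }
}
```

Short-lived objects never leave the young generation, so each young collection only pays for the few survivors, while the long-lived object ends up in the old generation.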

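4. Garbage collectors

If collection algorithms are the methodology of memory reclamation, garbage collectors are its concrete implementation. Virtual machines on different platforms manage memory differently, so the collectors discussed in this section are those of the HotSpot virtual machine after JDK 1.7 Update 14. The collectors included in this virtual machine are shown in the figure:

[Figure: the collectors included in the HotSpot virtual machine]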
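
To see which collectors a running JVM actually uses, the standard java.lang.management API can list them. This is a small sketch; the names printed depend on the JVM version and on the collector selected at startup, so no particular output is assumed.

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;
import java.util.ArrayList;
import java.util.List;

class ListCollectors {

    // Returns the names of the garbage collectors registered in this JVM.
    static List<String> collectorNames() {
        List<String> names = new ArrayList<>();
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            names.add(gc.getName());
        }
        return names;
    }

    public static void main(String[] args) {
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            // getCollectionCount() and getCollectionTime() may return -1 if undefined
            System.out.println(gc.getName()
                    + ": collections=" + gc.getCollectionCount()
                    + ", time(ms)=" + gc.getCollectionTime());
        }
    }
}
```

Running the program with different collector flags is a quick way to confirm which young-generation and old-generation collectors are paired.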

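1. Serial collector

The Serial collector is the most basic collector and the one with the longest history. It was once (before JDK 1.3.1) the only choice for young-generation collection in the virtual machine. As the name suggests, it is single-threaded. "Single-threaded" means not only that it uses one CPU or one collection thread to do the work, but more importantly that it must pause all other worker threads while it collects, until the collection finishes.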

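2. ParNew collector

The ParNew collector is essentially the multi-threaded version of the Serial collector. Apart from using multiple threads for garbage collection, its behavior is the same as the Serial collector's, including all the control parameters available to Serial (for example -XX:SurvivorRatio, -XX:PretenureSizeThreshold, and -XX:HandlePromotionFailure), the collection algorithm, Stop The World, object allocation rules, and reclamation strategy. The two collectors also share a considerable amount of code in their implementation.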

3. Parallel Scavenge collector

The Parallel Scavenge collector differs from other collectors in its focus: its goal is to achieve a controllable throughput. Throughput is the ratio of the time the CPU spends running user code to the total CPU time consumed, that is, throughput = time running user code / (time running user code + garbage collection time). Because of this close relationship with throughput, the Parallel Scavenge collector is often called a "throughput-first" collector.
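
The throughput formula is easy to check with a small calculation. The numbers below (99 time units of user code, 1 time unit of garbage collection) are made up for the example.

```java
class ThroughputDemo {

    // throughput = user time / (user time + garbage collection time)
    static double throughput(double userTime, double gcTime) {
        return userTime / (userTime + gcTime);
    }

    public static void main(String[] args) {
        // Assumed figures: 99 units running user code, 1 unit collecting garbage.
        double t = throughput(99, 1);
        System.out.println("throughput = " + t);   // 99 / 100 = 0.99
    }
}
```

With these figures, 99% of the CPU time goes to user code; halving the collection time to 0.5 units would raise throughput to about 99.5%.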

4. Serial Old collector

Serial Old is the old-generation version of the Serial collector. It is also single-threaded and uses the "mark-compact" algorithm. Its main purpose is likewise to serve virtual machines in Client mode. In Server mode it has two main uses: to pair with the Parallel Scavenge collector in JDK 1.5 and earlier [1], and to act as the fallback for the CMS collector when a Concurrent Mode Failure occurs during concurrent collection.

5. Parallel Old collector

Parallel Old is the old-generation version of the Parallel Scavenge collector. It uses multiple threads and the "mark-compact" algorithm.
This collector was first provided in JDK 1.6.

6. CMS collector

The CMS (Concurrent Mark Sweep) collector aims for the shortest possible collection pause. Many Java applications today run on the servers of Internet sites or B/S systems. Such applications care greatly about response time and want pauses to be as short as possible to give users a better experience.
Its operation is divided into 4 steps:
a) Initial mark (CMS initial mark)
b) Concurrent mark (CMS concurrent mark)
c) Remark (CMS remark)
d) Concurrent sweep (CMS concurrent sweep)

The CMS collector has three disadvantages:
1. It is sensitive to CPU resources. Concurrently executing programs are generally sensitive to the number of CPUs.
2. It cannot handle floating garbage. User threads keep running during the concurrent sweep phase, and the garbage they produce in that time cannot be cleaned up in the current collection.
3. Because it is based on the mark-sweep algorithm, it produces a large amount of space fragmentation.

7. G1 collector

G1 is a garbage collector for server-side applications.
The operation of the G1 collector can be roughly divided into the following steps:

a) Initial marking (Initial Marking)
b) Concurrent marking (Concurrent Marking)
c) Final marking (Final Marking)
d) Screening and recycling (Live Data Counting and Evacuation)

