Tomcat performance optimization

1. JVM optimization

(1) Memory optimization

(2) Garbage collection strategy optimization

2. Connector optimization in server.xml (a connector is the container that handles HTTP requests; the three containers are initialized in the order Server -> Service -> Connector)

(1) Use the NIO model to accept HTTP requests

(2) Optimize the HTTP connector and specify the number of processing threads

(3) Thread pool

(4) Other common settings

3. Setting the session expiration time

4. Improving Tomcat performance with the APR plug-in

5. Cluster

6. Locating problems

1. JVM Optimization

On Linux, edit TOMCAT_HOME/bin/catalina.sh and add the following at the top:

JAVA_OPTS="-XX:PermSize=64M -XX:MaxPermSize=128m -Xms512m -Xmx1024m -Duser.timezone=Asia/Shanghai"

On Windows, edit TOMCAT_HOME/bin/catalina.bat and add the following at the top:

set JAVA_OPTS=-XX:PermSize=64M -XX:MaxPermSize=128m -Xms512m -Xmx1024m

1. Memory tuning
Memory settings live in catalina.sh. Just adjust the JAVA_OPTS variable, because the startup commands that follow pass JAVA_OPTS to the JVM as its startup parameters.

The specific settings are as follows:
JAVA_OPTS="$JAVA_OPTS -Xmx3550m -Xms3550m -Xss128k -XX:NewRatio=4 -XX:SurvivorRatio=4"

The parameters are as follows:
-Xmx3550m: Set the maximum heap size of the JVM to 3550M. The maximum heap size should not exceed 80% of the available physical memory.
-Xms3550m: Set the initial heap size of the JVM to 3550m. This value can be set the same as -Xmx to avoid the JVM reallocating memory after each garbage collection.
-Xmn2g: Set the young generation size to 2G. The heap is the young generation plus the old generation (the persistent generation, generally a fixed 64m, is sized separately), so with a fixed heap size, increasing the young generation reduces the old generation. This value has a great impact on system performance. Sun officially recommends configuring it to 3/8 of the entire heap.
-Xss128k: Set the stack size of each thread. Since JDK 5.0 the default stack size of each thread is 1M; before that it was 256K. Adjust it according to the memory the application's threads need. With the same physical memory, reducing this value allows more threads to be created. However, the operating system still limits the number of threads in a process, so threads cannot be created without bound. An empirical value is around 3000~5000.
-XX:NewRatio=4: Set the ratio of the young generation (including Eden and the two Survivor areas) to the old generation (excluding the persistent generation). Set to 4, the ratio of the young generation to the old generation is 1:4, and the young generation accounts for 1/5 of the entire heap.
-XX:SurvivorRatio=4: Set the size ratio of the Eden area to a Survivor area in the young generation. Set to 4, the ratio of the two Survivor areas to the one Eden area is 2:4, and one Survivor area accounts for 1/6 of the entire young generation.
-XX:MaxPermSize=16m: Set the maximum persistent generation size to 16m. (The persistent generation exists only up to JDK 7; JDK 8 and later replaced it with Metaspace and ignore this flag.)
-XX:MaxTenuringThreshold=0: Set the maximum tenuring age of objects. If set to 0, young generation objects enter the old generation directly without passing through the Survivor areas. For applications with many long-lived objects, this can improve efficiency. If set to a larger value, young generation objects are copied between the Survivor areas several times, which lengthens their stay in the young generation and raises the chance that they are collected there.
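
The ratios above are easier to check with numbers. The following is a minimal sketch of the arithmetic only, using the 3550m heap, NewRatio=4, SurvivorRatio=4 and 128k stacks from the example settings; the function names are made up for illustration, and a real JVM aligns each region, so the actual sizes will differ slightly.

```python
def generation_sizes(heap_mb, new_ratio, survivor_ratio):
    """Split a fixed heap into young/old/Eden/Survivor sizes (in MB).

    Idealized arithmetic only: a real JVM rounds these regions to
    alignment boundaries, so actual sizes differ slightly.
    """
    # NewRatio = old / young, so young is 1 part out of (new_ratio + 1).
    young = heap_mb / (new_ratio + 1)
    old = heap_mb - young
    # SurvivorRatio = Eden / one Survivor, and there are two Survivor areas.
    survivor = young / (survivor_ratio + 2)
    eden = young - 2 * survivor
    return {"young": young, "old": old, "eden": eden, "survivor": survivor}


def stack_memory_mb(threads, xss_kb):
    """Upper bound on memory reserved for thread stacks at a given -Xss."""
    return threads * xss_kb / 1024


sizes = generation_sizes(heap_mb=3550, new_ratio=4, survivor_ratio=4)
for name, mb in sizes.items():
    print(f"{name:>8}: {mb:7.1f} MB")
print(f"3000 threads at 128k: {stack_memory_mb(3000, 128):.0f} MB")
print(f"3000 threads at 1M:   {stack_memory_mb(3000, 1024):.0f} MB")
```

With these inputs the young generation comes to 710 MB (1/5 of the heap) and each Survivor area to about 118 MB (1/6 of the young generation), which matches the fractions stated for NewRatio and SurvivorRatio.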

2. Garbage collection strategy tuning
Garbage collection is also configured in catalina.sh, by adjusting the JAVA_OPTS variable.
The specific settings are as follows:
JAVA_OPTS="$JAVA_OPTS -Xmx3550m -Xms3550m -Xss128k -XX:+UseParallelGC -XX:MaxGCPauseMillis=100"
The available garbage collection strategies and their parameters are as follows:

Serial collector (the main collection method before JDK 1.5)
-XX:+UseSerialGC: Use the serial collector
Parallel collector (throughput priority)
Example: java -Xmx3550m -Xms3550m -Xmn2g -Xss128k -XX:+UseParallelGC -XX:MaxGCPauseMillis=100

-XX:+UseParallelGC: Select the parallel collector as the garbage collector. This setting applies only to the young generation. That is, under the above configuration, the young generation uses parallel collection, while the old generation still uses serial collection.
-XX:ParallelGCThreads=20: Configure the number of threads of the parallel collector, that is, how many threads perform garbage collection at the same time. This value is best set equal to the number of processors.
-XX:+UseParallelOldGC: Use parallel collection for the old generation. JDK 6.0 supports parallel collection of the old generation.
-XX:MaxGCPauseMillis=100: Set the maximum time for each young generation garbage collection. If this time cannot be met, the JVM automatically adjusts the young generation size to meet it.
-XX:+UseAdaptiveSizePolicy: With this option set, the parallel collector automatically chooses the young generation size and the corresponding Survivor ratio to reach the minimum response time or collection frequency targeted for the system. It is recommended to keep this option on whenever the parallel collector is used.

Concurrent collector (response time priority)
Example: java -Xmx3550m -Xms3550m -Xmn2g -Xss128k -XX:+UseConcMarkSweepGC
-XX:+UseConcMarkSweepGC: Use concurrent collection for the old generation. In testing, once this was configured, -XX:NewRatio=4 no longer took effect, for unknown reasons. It is therefore best to set the young generation size with -Xmn in this case.
-XX:+UseParNewGC: Use parallel collection for the young generation. Can be used together with CMS collection. From JDK 5.0 on, the JVM sets this by itself according to the system configuration, so there is no need to set it.
-XX:CMSFullGCsBeforeCompaction: Because the concurrent collector does not compact the memory space, fragmentation builds up after it has run for a while, which reduces efficiency. This value sets how many full GCs run before the memory space is compacted.
-XX:+UseCMSCompactAtFullCollection: Turn on compaction of the old generation. It may affect performance, but it eliminates fragmentation.

3. Summary
Memory settings involve some trade-offs.
1) Under normal circumstances, the larger the memory, the higher the processing efficiency, but garbage collection also takes longer, and processing efficiency inevitably suffers while it runs.
2) Most online articles recommend setting Xmx and Xms to the same value, saying that this avoids frequent collection. No obvious effect was seen in testing: memory usage was basically sawtooth-shaped either way, so this should be decided according to the actual situation.

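2. Connector optimization in server.xml

Some ways to improve Tomcat's concurrency

1. Combine Apache and Tomcat: Apache serves static pages and Tomcat serves dynamic pages. At the same time, reduce connectionTimeout to cope with cases where concurrency is high and threads cannot be reclaimed fast enough.
2. When the load is too high, use load balancing. A single Tomcat cannot carry that many threads in any case, and when the JVM is too large its memory management cost rises significantly. With 2G of memory, running 3-4 Tomcat instances (512 RAM * 4) is more reasonable.
3. Database connection pool: many people recommend C3P0, which can improve the concurrent performance of database access several times over. (Some blog posts say that Tomcat's own jdbc-pool is better; I have not tried it.)
4. A Tomcat cluster gets the most out of the server. You can deploy several Tomcats on one high-spec server, or deploy Tomcat on several servers; Apache and Tomcat are still integrated through JK. Testing showed that, for responsiveness under a large number of users, Apache + 3-Tomcat cluster > Apache + 2-Tomcat cluster > Apache integrated with one Tomcat > a single Tomcat. With Apache plus a multi-Tomcat cluster, the system also stays usable if one Tomcat goes down. So when the hardware is powerful enough, adding Tomcats to the cluster is a way to make the software use it fully.
5. Enable KeepAlive support
KeepAlive on, KeepAliveTimeout 15 MaxKeepAliveRequests 1000
In practice, improving system performance with Apache and a Tomcat cluster is very effective. This approach makes the most of the hardware resources, spreading the load of a single Tomcat across several Tomcats.
The maximum number of connections a web server allows is also limited by the operating system's kernel parameters, typically about 2000 on Windows and about 1000 on Linux.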

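1. Use the NIO model to accept HTTP requests
protocol="org.apache.coyote.http11.Http11NioProtocol" specifies the NIO model for accepting HTTP requests. The default is blocking I/O, configured as protocol="HTTP/1.1".
acceptorThreadCount="2" sets the number of acceptor threads when the NIO model is used.

2. Specify the number of processing threads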

<Connector port="80" protocol="HTTP/1.1" maxThreads="600" minSpareThreads="100" maxSpareThreads="500" acceptCount="700"
connectionTimeout="20000" redirectPort="8443" />

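maxThreads="600" // maximum number of threads
minSpareThreads="100" // number of threads created at initialization
maxSpareThreads="500" // once the number of created threads exceeds this value, Tomcat closes socket threads that are no longer needed
acceptCount="700" // number of requests that can wait in the processing queue when all request-processing threads are in use; requests beyond this number are not processed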
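
To see how maxThreads and acceptCount interact, here is a minimal sketch of the admission rule described above. It is a deliberately simplified model, not Tomcat's actual implementation: it assumes all requests arrive at the same instant, and the function name admit is made up for illustration.

```python
def admit(requests, max_threads, accept_count):
    """Classify simultaneous requests under a simplified connector model.

    Assumption: up to max_threads requests are served at once, up to
    accept_count more wait in the queue, and the rest are refused.
    """
    in_service = min(requests, max_threads)
    queued = min(requests - in_service, accept_count)
    refused = requests - in_service - queued
    return in_service, queued, refused


# Values taken from the Connector example: maxThreads=600, acceptCount=700.
for load in (500, 1000, 1500):
    in_service, queued, refused = admit(load, max_threads=600, accept_count=700)
    print(f"{load} requests: {in_service} served, {queued} queued, {refused} refused")
```

Under this model the connector absorbs at most maxThreads + acceptCount = 1300 simultaneous requests, so a burst of 1500 leaves 200 refused.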

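The above optimizes the HTTP connector. If Apache and Tomcat are used for cluster load balancing, and the AJP protocol is used to forward requests between Apache and Tomcat, the AJP connector needs to be optimized as well.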

<Connector port="8009" protocol="AJP/1.3" maxThreads="600" minSpareThreads="100" maxSpareThreads="500" acceptCount="700"
connectionTimeout="20000" redirectPort="8443" />

3. Thread pool

Because Tomcat has multiple connectors, its thread configuration also lets several connectors share one thread pool.

First, open /conf/server.xml and add:

<Executor name="tomcatThreadPool" namePrefix="catalina-exec-" maxThreads="500" minSpareThreads="20" maxIdleTime="60000" />

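This sets a maximum of 500 threads (enough for a typical server), a minimum of 20 spare threads, and a maximum thread idle time of 60 seconds.

Then modify the Connector node, adding an executor attribute set to the name of the thread pool: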

<Connector executor="tomcatThreadPool" port="80" protocol="HTTP/1.1"  connectionTimeout="60000" keepAliveTimeout="15000" maxKeepAliveRequests="1"  redirectPort="443" />

Several connectors can share one thread pool, so the AJP connector can likewise be set to use the tomcatThreadPool thread pool.
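
The idea of several connectors drawing on one bounded pool can be sketched outside Tomcat. The following is only an analogy using Python's standard ThreadPoolExecutor, not Tomcat's Executor; the pool size of 4, the handle function, and the "http"/"ajp" labels are assumptions chosen for illustration.

```python
import threading
from concurrent.futures import ThreadPoolExecutor

# Hypothetical stand-in for <Executor name="tomcatThreadPool" .../>:
# one bounded pool whose worker names carry a common prefix.
shared_pool = ThreadPoolExecutor(max_workers=4, thread_name_prefix="catalina-exec")


def handle(connector, request_id):
    """Pretend to process a request and report which worker ran it."""
    return connector, request_id, threading.current_thread().name


# Two "connectors" (HTTP and AJP) submit work to the same pool.
futures = [shared_pool.submit(handle, "http", i) for i in range(3)]
futures += [shared_pool.submit(handle, "ajp", i) for i in range(3)]
results = [f.result() for f in futures]
shared_pool.shutdown()

for connector, request_id, worker in results:
    print(f"{connector:>4} request {request_id} ran on {worker}")
```

Requests from both sources run on workers from the same named pool, and the total number of workers never exceeds the pool's limit, which is the point of sharing one Executor between the HTTP and AJP connectors.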

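4. Other common settings
maxHttpHeaderSize="8192" is the maximum length of the HTTP request header; anything beyond this length is not processed. Generally 8K.
URIEncoding="UTF-8" specifies the URL encoding used by the Tomcat container. Do not leave out URIEncoding="GBK", which ensures that Chinese parameters passed in page URLs are handled correctly.
disableUploadTimeout="true" controls whether a timeout mechanism is used during uploads.
enableLookups="false" controls whether reverse DNS lookups are performed; the default is true. Set it to false to improve processing capacity.
compression="on" turns on compression. Compression adds load to Tomcat, so it is better to use Nginx + Tomcat or Apache + Tomcat and leave compression to Nginx/Apache.
compressionMinSize="10240" is the output size at which compression kicks in; the default is 2KB.
noCompressionUserAgents="gozilla, traviata" disables compression for the listed browsers.
compressableMimeType="text/html,text/xml,text/javascript,text/css,text/plain" lists the resource types to compress.
5. Summary
Tomcat's NIO and thread pool themselves add processing complexity, so how much they improve efficiency needs to be verified in practice.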

3. Setting the session expiration time

Specified through a parameter in conf\web.xml:

    <session-config>   
        <session-timeout>180</session-timeout>     
    </session-config> 
The unit is minutes.

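4. Improving Tomcat performance with the APR plug-in

Tomcat can use APR to provide superior scalability and performance, and better integration with native server technologies.

APR (Apache Portable Runtime) is a highly portable library that is the core of Apache HTTP Server 2.x. APR has many uses, including access to advanced I/O functionality (such as sendfile, epoll and OpenSSL), OS-level functionality (random number generation, system status and so on), and native process handling (shared memory, NT pipes and UNIX sockets). These features let Tomcat serve as a general-purpose front-end web server, integrate better with other native web technologies, and overall make Java more viable as a high-performance web server platform rather than simply a back-end container.

In production, especially when Tomcat is used directly as the web server, Tomcat Native should be used to improve its performance.

  The best way to measure what APR brings to Tomcat is to test over a slow network (simulating the Internet), raise Tomcat's thread count to 300 or more, and then simulate a large number of concurrent requests.
  Without APR, the 300 threads are used up very quickly and later requests have to wait. With APR, the number of concurrent threads drops markedly, possibly from 300 to only a few dozen right away, and new requests come in without blocking.
  In a LAN test, even 400 concurrent requests are processed and transferred in an instant. In a real Internet environment, however, page processing accounts for less than 0.1% of the time and most of the time is spent transferring the page. Without APR, one thread can serve only one user at a time, which inevitably causes blocking. So using APR in production is very necessary.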

  (1) Install APR and tomcat-native
    apr-1.3.8.tar.gz, installed in /usr/local/apr
    #tar zxvf apr-1.3.8.tar.gz
    #cd apr-1.3.8
    #./configure;make;make install
    
    apr-util-1.3.9.tar.gz, installed in /usr/local/apr/lib
    #tar zxvf apr-util-1.3.9.tar.gz
    #cd apr-util-1.3.9  
    #./configure --with-apr=/usr/local/apr --with-java-home=JDK;make;make install
    
    #cd apache-tomcat-6.0.20/bin  
    #tar zxvf tomcat-native.tar.gz  
    #cd tomcat-native/jni/native  
    #./configure --with-apr=/usr/local/apr;make;make install
    
  (2) Configure Tomcat to use APR
    Edit Tomcat's startup shell script (startup.sh) and add the startup parameter:
      CATALINA_OPTS="$CATALINA_OPTS -Djava.library.path=/usr/local/apr/lib"
 
  (3) Verify the installation:
    If the following line appears in the startup log, the installation succeeded.
      2007-4-26 15:34:32 org.apache.coyote.http11.Http11AprProtocol init

5. Cluster solution

The processing capacity of a single Tomcat is limited. When concurrency is high, multiple instances need to be deployed for load balancing.

The key points of a cluster are the following:
1. Introduce a load balancer
Software load balancing can be done with nginx or Apache, which mainly act as request distributors.
Reference:
(nginx load)
(apache load)

2. Shared session handling
The current approaches are as follows:
1). Use Tomcat's own session replication
Reference (Configuration of Session replication)
The advantage of this approach is that configuration is simple. The disadvantage is that when the cluster is large, session replication takes longer and slows down responses.
2). Use a third party to store shared sessions
Currently, memcached is used to store shared sessions, with memcached-session-manager managing Tomcat's sessions.
Reference (using MSM Manage Tomcat cluster session)
3). Use a sticky session strategy
Where session requirements are not too strict (no billing involved, a failed request may simply be retried, and so on), nginx or Apache can hand all requests belonging to the same user's session to the same Tomcat. This is the so-called sticky session strategy, which many applications currently use.
Reference: (tomcat session sticky)
nginx does not include a sticky session module by default and needs to be recompiled (I don't know how to recompile it under Windows).
The advantage is much higher processing efficiency; the disadvantage is that it is not suitable where session requirements are strict.
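
The core of a sticky strategy is a stable mapping from a session to one backend. The sketch below shows one common way to get such a mapping, hashing the session ID; it is not how nginx or Apache implement stickiness, and the backend names and the pick_backend function are hypothetical.

```python
import hashlib

# Hypothetical backend names; a real setup would list Tomcat host:port pairs.
BACKENDS = ["tomcat-a", "tomcat-b", "tomcat-c"]


def pick_backend(session_id, backends=BACKENDS):
    """Map a session ID to one backend, stably across calls and processes."""
    digest = hashlib.sha256(session_id.encode("utf-8")).digest()
    # Use the first 4 bytes of the digest as an integer, then wrap around.
    index = int.from_bytes(digest[:4], "big") % len(backends)
    return backends[index]


for sid in ("A1B2C3", "D4E5F6", "A1B2C3"):
    print(sid, "->", pick_backend(sid))
```

The same session ID always lands on the same Tomcat, so no session data has to be shared. The weakness noted above also shows up here: if a backend is removed from the list, the sessions that lived on it are lost, and the modulo moves many other sessions as well.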

3. One instance per site: start multiple Tomcats

Do not use Tomcat virtual hosts; run one instance per site, that is, start multiple Tomcats.

This is also a common mistake made by operations staff with a PHP background. The PHP approach is to place multiple virtual hosts under one web server instead of starting a web server for each host. Tomcat is multi-threaded and shares memory, so if any application in a virtual host crashes, all applications are affected. Although multiple instances cost more resources, they ensure application isolation and security.

4. Summary
The above are the key points of implementing a cluster. Points 1 and 2 can be combined; which to choose depends on the specific scenario.

6. Locating problems

Long Tomcat processing times are mainly caused by the concurrency level, the number of sessions, memory usage and garbage collection at the time. When a problem occurs, it has to be analyzed.

1. The number of Tomcat sessions
This can be viewed directly in Tomcat's web management interface.
It can also be viewed with the third-party tool Lambda Probe, which offers slightly more than Tomcat's built-in management, though not much.

2. Monitor Tomcat's memory usage
The jconsole tool that ships with the JDK gives a clear view of memory usage, thread status, the total number of currently loaded classes, and so on.
The jvisualvm tool that ships with the JDK can download plug-ins (such as GC) to show richer information. When analyzing a local Tomcat, it can also sample memory to check the usage of each class.

3. Print class loading and object collection status
This information can be printed (to the screen, which by default also goes to catalina.log, or to a file) by configuring JVM startup parameters. The specific parameters are as follows:
-XX:+PrintGC: Output format: [GC 118250K->113543K(130112K), 0.0094143 secs] [Full GC 121376K->10414K(130112K), 0.0650971 secs]
-XX:+PrintGCDetails: Output format: [GC [DefNew: 8614K->781K(9088K), 0.0123035 secs] 118250K->113543K(130112K), 0.0124633 secs] [GC [DefNew: 8614K->8614K(9088K), 0.0000665 secs][Tenured: 112761K->10414K(121024K), 0.0433488 secs] 121376K->10414K(130112K), 0.0436268 secs]
-XX:+PrintGCTimeStamps -XX:+PrintGC: PrintGCTimeStamps can be combined with the two above. Output format: 11.851: [GC 98328K->93620K(130112K), 0.0082960 secs]
-XX:+PrintGCApplicationConcurrentTime: Print how long the program ran uninterrupted before each garbage collection. Can be combined with the above. Output format: Application time: 0.5291524 seconds
-XX:+PrintGCApplicationStoppedTime: Print how long the program was paused during garbage collection. Can be combined with the above. Output format: Total time for which application threads were stopped: 0.0468229 seconds
-XX:+PrintHeapAtGC: Print detailed heap information before and after GC
-Xloggc:filename: Use together with the above to write the log information to a file for analysis

-verbose:class: monitor class loading
-verbose:gc: print information to the output device whenever the virtual machine reclaims memory
-verbose:jni: print information about native method calls, generally used to diagnose JNI call errors
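
Once GC output is written to a file, it can be summarized with a short script. The following is a minimal sketch that parses only the basic -XX:+PrintGC line format quoted above (with or without a leading timestamp); it does not handle -XX:+PrintGCDetails output, and the function and field names are made up for illustration.

```python
import re

# Matches the basic -XX:+PrintGC line. Optional whitespace after "->"
# tolerates logs that were reflowed when copied.
GC_LINE = re.compile(
    r"\[(Full GC|GC) (\d+)K->\s*(\d+)K\((\d+)K\), ([\d.]+) secs\]"
)


def parse_gc_log(text):
    """Return one dict per collection found in the text."""
    events = []
    for kind, before, after, total, secs in GC_LINE.findall(text):
        events.append({
            "kind": kind,
            "freed_kb": int(before) - int(after),
            "heap_kb": int(total),
            "pause_s": float(secs),
        })
    return events


sample = (
    "[GC 118250K->113543K(130112K), 0.0094143 secs]\n"
    "[Full GC 121376K->10414K(130112K), 0.0650971 secs]\n"
)
events = parse_gc_log(sample)
for event in events:
    print(event)
```

For the two sample lines, the young collection frees 4707K and the full collection frees 110962K, which makes the much longer pause of the Full GC easy to put in context.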

4. Add JMX remote monitoring
For a Tomcat deployed on another machine in the LAN, you can open a JMX monitoring port so that other machines in the LAN can view commonly used metrics through it (some more complex features are not supported). This is also configured in the JVM startup parameters, as follows:
-Dcom.sun.management.jmxremote.ssl=false -Dcom.sun.management.jmxremote.authenticate=false
-Djava.rmi.server.hostname=192.168.71.38: set the IP address for JVM JMX monitoring, mainly to keep it from wrongly listening on the local address 127.0.0.1
-Dcom.sun.management.jmxremote.port=1090: set the port for JVM JMX monitoring
-Dcom.sun.management.jmxremote.ssl=false: JVM JMX monitoring does not use SSL
-Dcom.sun.management.jmxremote.authenticate=false: JVM JMX monitoring does not require authentication

5. Professional analysis tools
These include IBM ISA, JProfiler and others; search online for the specific monitoring and analysis methods.