How to solve high concurrency in Java

Methods for handling high concurrency in Java include: 1. Optimize code; 2. Static HTML; 3. Separate image servers; 4. Caching; 5. Database clustering; 6. Load balancing; 7. CDN acceleration.

High concurrency in Java has always been a problem we have to deal with, so how do we solve it? Here are some methods for reference.

Methods for handling high concurrency in Java:

1. Start with the basics: optimize the code we write and reduce unnecessary waste of resources.

a. Avoid creating new objects too frequently. For classes that need only one instance in the entire application, we can use the singleton pattern. For string concatenation, use StringBuffer or StringBuilder. Utility classes can be accessed through static methods.

b. Avoid inefficient usage, and try not to use instanceof for conditional checks. Use efficient classes in Java, such as ArrayList, which generally performs better than Vector because Vector synchronizes every method call.
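As a rough illustration of points a and b, the sketch below combines a singleton with StringBuilder-based concatenation. The class name NameJoiner and its join method are hypothetical and exist only for this example.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical utility class: one shared instance for the whole application.
final class NameJoiner {
    // Eagerly created singleton; JVM class initialization makes this thread-safe.
    private static final NameJoiner INSTANCE = new NameJoiner();

    private NameJoiner() { }

    static NameJoiner getInstance() {
        return INSTANCE;
    }

    // Builds the result in a single StringBuilder instead of creating
    // a new String object on every loop iteration.
    String join(List<String> parts, String separator) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < parts.size(); i++) {
            if (i > 0) {
                sb.append(separator);
            }
            sb.append(parts.get(i));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.println(getInstance().join(Arrays.asList("a", "b", "c"), ","));
    }
}
```

StringBuilder is not synchronized, so it suits a local variable like the one above. StringBuffer is the synchronized alternative when the buffer itself is shared between threads.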

2. Static HTML pages

We access a page through a link address. The corresponding module on the server processes the request, forwards it to the corresponding JSP page, and finally generates the data we want. However, if there are tens of millions of requests, the high concurrency increases the pressure on the server and, in the worst case, brings the server down. How can we avoid this? If we save the result of the initial request for test.do as an HTML file and have users access that HTML file from then on, the server no longer needs to process the request dynamically each time, which reduces the pressure on it.

How can a static page be generated automatically? When a user visits the page, test.html is generated automatically and then displayed to the user.
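One way to picture "generate on first visit, then serve the file" is the sketch below. It is a simplified assumption, not a full servlet setup: StaticPageCache and its getPage method are hypothetical names, and the renderer argument stands in for whatever dynamic processing test.do would do.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.function.Supplier;

// Hypothetical helper: renders a page once, then serves the saved HTML file.
final class StaticPageCache {
    private final Path directory;

    StaticPageCache(Path directory) {
        this.directory = directory;
    }

    // Returns the HTML for pageName. The (expensive) renderer runs only
    // when no static file has been written for that page yet.
    synchronized String getPage(String pageName, Supplier<String> renderer) throws IOException {
        Path file = directory.resolve(pageName + ".html");
        if (Files.exists(file)) {
            return new String(Files.readAllBytes(file), StandardCharsets.UTF_8);
        }
        String html = renderer.get();
        Files.write(file, html.getBytes(StandardCharsets.UTF_8));
        return html;
    }
}
```

In a real deployment the generated file would more likely be served directly by the web server, and it would have to be regenerated or deleted whenever the underlying data changes. This sketch leaves both of those concerns out.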

3. Separate image servers

For web servers, images consume the most resources, so it is necessary to separate images from pages. We put the images on a dedicated image server. Such an architecture reduces the pressure on the servers that handle page requests and ensures that the system will not crash because of image problems. We can also tune the configuration of the image server separately.

4. Caching

The caching mechanism I have worked with most is Hibernate's caching mechanism. To avoid fetching data from the database every time, we keep the data that users access frequently in memory. When the cache grows very large, we can also move cached data from memory to disk. There are also distributed cache systems, which can increase the system's capacity to withstand load.
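The idea of keeping frequently accessed data in memory can be sketched with a small least-recently-used (LRU) cache built on the standard LinkedHashMap. This is only an illustration of the principle, not how Hibernate implements its cache; the class name LruCache is an assumption of this example.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Minimal in-memory LRU cache. Not thread-safe on its own.
final class LruCache<K, V> extends LinkedHashMap<K, V> {
    private final int capacity;

    LruCache(int capacity) {
        // accessOrder = true: iteration order runs from least to most recently accessed.
        super(16, 0.75f, true);
        this.capacity = capacity;
    }

    // Called by LinkedHashMap after each insertion; returning true
    // evicts the least recently accessed entry.
    @Override
    protected boolean removeEldestEntry(Map.Entry<K, V> eldest) {
        return size() > capacity;
    }
}
```

Under concurrent access, an instance would need external synchronization, for example by wrapping it with Collections.synchronizedMap.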

5. Batch transmission

On one project, too many parameters were being sent at once. The database limited the number of parameters in a single transmission to 30,000, and there were 50,000 records at the time, so how could they be transmitted? In the end, they were sent in batches. It is like an elevator: if it cannot fit everyone at once, it reports that it is overloaded, so people are sent up in batches.
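Splitting the records into batches that respect the limit can be sketched as follows. The helper name Batches.split is hypothetical, and the actual sending of each batch to the database is left out.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper that splits a list into consecutive batches.
final class Batches {
    private Batches() { }

    static <T> List<List<T>> split(List<T> items, int batchSize) {
        if (batchSize <= 0) {
            throw new IllegalArgumentException("batchSize must be positive");
        }
        List<List<T>> batches = new ArrayList<>();
        for (int start = 0; start < items.size(); start += batchSize) {
            int end = Math.min(start + batchSize, items.size());
            // Copy the slice so each batch is independent of the source list.
            batches.add(new ArrayList<>(items.subList(start, end)));
        }
        return batches;
    }
}
```

With 50,000 records and a limit of 30,000, this yields two batches of 30,000 and 20,000 records.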

Another case was an examination system. If many examinees submitted to the database at the same time, the pressure on the database increased and it sometimes went down. The method used at the time was asynchronous transmission with Ajax, without waiting for a response. When the candidate clicks the submit button, the candidate's answers are submitted automatically. This also prevents previously completed questions from being lost in a sudden power outage.

6. Database cluster

When faced with complex applications and a large number of users, a single database will soon be unable to meet demand, so we need to use a database cluster or hash the data across databases and tables.

We separate the data by business, application, or functional module within the application. Different modules correspond to different databases or tables, and then the data for a given page or function is hashed into smaller databases or tables according to a certain strategy.
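Hashing data across tables can be sketched with a simple modulo router. Everything specific here is an assumption for illustration: the ShardRouter class, the choice of a numeric user ID as the key, and the user_N table naming are not taken from the text.

```java
// Hypothetical router that maps a numeric key to one of N shards.
final class ShardRouter {
    private final int shardCount;

    ShardRouter(int shardCount) {
        if (shardCount <= 0) {
            throw new IllegalArgumentException("shardCount must be positive");
        }
        this.shardCount = shardCount;
    }

    // floorMod keeps the result in [0, shardCount) even for negative ids.
    int shardFor(long userId) {
        return (int) Math.floorMod(userId, (long) shardCount);
    }

    // Illustrative table naming only: user_0, user_1, ...
    String tableFor(long userId) {
        return "user_" + shardFor(userId);
    }
}
```

A plain modulo scheme is easy to reason about, but changing the number of shards later moves most keys to a different table, so the shard count is worth choosing carefully up front.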

7. DB optimization

a. When designing the database, we must consider later maintenance. The three normal forms are the principles we should follow when designing the database.

b. Index creation: indexes must be created appropriately. If a table is queried often and rarely inserted into or modified, we can create an index on it. For tables with frequent inserts, updates, and deletes, the cost of maintaining the index can greatly exceed the efficiency the index brings.

c. Choose the types and lengths of table fields appropriately, based on the data actually stored. Fields should not be longer than necessary, otherwise efficiency suffers.

d. Foreign keys should be used with caution. A primary key identifies one table, whereas a foreign key associates a group of tables, so those associations must be handled whenever we delete or modify data.

e. In database operations:

Prefer PreparedStatement over Statement, because a PreparedStatement is precompiled.

Set the Connection to readOnly where appropriate. A Connection is a connection to the database and is heavyweight, so reuse it instead of creating new ones repeatedly.

Use a connection pool. We can also adjust the default number of connections in the database.
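The reason a connection pool helps is that a fixed set of heavyweight objects is created once and then borrowed and returned. The sketch below shows that mechanism with a generic resource type standing in for java.sql.Connection, so it runs without a database. SimplePool, borrow, and giveBack are hypothetical names, and a production system would normally rely on an established pooling library rather than code like this.

```java
import java.util.Collection;
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.TimeUnit;

// Minimal fixed-size pool; T stands in for a heavyweight resource
// such as a database connection.
final class SimplePool<T> {
    private final BlockingQueue<T> idle;

    // All resources are created up front by the caller and handed to the pool.
    SimplePool(Collection<T> resources) {
        this.idle = new ArrayBlockingQueue<>(resources.size(), false, resources);
    }

    // Waits up to timeoutMillis for a free resource; returns null on timeout.
    T borrow(long timeoutMillis) throws InterruptedException {
        return idle.poll(timeoutMillis, TimeUnit.MILLISECONDS);
    }

    // Returns a previously borrowed resource to the pool.
    void giveBack(T resource) {
        idle.offer(resource);
    }

    int available() {
        return idle.size();
    }
}
```

Because the queue has a fixed capacity, the pool also caps how many resources are in use at once, which is what protects the database from too many simultaneous connections.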

8. Load balancing

Load balancing is a high-end solution that large websites use to handle high-load access and a large number of concurrent requests.

Load balancing technology has been developed for many years, and there are many professional service providers and products to choose from. I have personally come across several solutions, two of which can serve as a reference.

(1) Hardware Layer 4 switching

Layer 4 switching uses the header information of Layer 3 and Layer 4 packets to identify traffic flows by application port range, and assigns the traffic of an entire range to the appropriate application servers for processing.

Layer 4 switching works like a virtual IP that points to the physical servers. The traffic it carries follows a variety of protocols, including HTTP, FTP, NFS, Telnet, and others. This traffic requires complex load balancing algorithms across the physical servers. In the IP world, the service type is determined by the destination TCP or UDP port. In Layer 4 switching, the application range is determined by the source and destination IP addresses together with the TCP and UDP ports.

Among hardware Layer 4 switching products, there are some well-known choices, such as Alteon and F5. These products are expensive, but they are worth the money and provide excellent performance and very flexible management capabilities. "Yahoo China" originally had close to 2,000 servers and handled them with only three or four Alteons.

(2) Software Layer 4 switching

Once the principle of the hardware Layer 4 switch was understood, software Layer 4 switching based on the OSI model emerged. The principle of such a solution is the same, but the performance is slightly worse. Even so, it can easily handle a certain amount of load. Some people say that the software approach is actually more flexible, and that its processing power depends entirely on how well you know the configuration.

For software Layer 4 switching, we can use LVS (Linux Virtual Server), which is commonly used on Linux. It provides heartbeat-based real-time failover, which improves the robustness of the system, and it also provides flexible virtual IP (VIP) configuration and management, so it can meet multiple application requirements at the same time. This is essential for distributed systems.

A typical load balancing strategy is to build a Squid cluster on top of software or hardware Layer 4 switching. This idea is adopted by many large websites, including search engines. The architecture is low-cost, high-performance, and highly scalable, and it is very easy to add or remove nodes at any time.

For large websites, all of the methods mentioned above may be used at the same time. The introduction here is relatively brief, and many details of a specific implementation still have to be learned gradually. Sometimes a single small Squid or Apache parameter setting can have a great impact on system performance.
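The core job of any load balancer described in this section, picking a back-end server for each incoming request, can be sketched with a round-robin selector. Round-robin is just one simple policy chosen for illustration, not necessarily the algorithm any of the products above use. The RoundRobinBalancer class and the server names in the usage note are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.atomic.AtomicInteger;

// Minimal round-robin selector over a fixed list of back-end servers.
final class RoundRobinBalancer {
    private final List<String> servers;
    private final AtomicInteger counter = new AtomicInteger();

    RoundRobinBalancer(List<String> servers) {
        if (servers.isEmpty()) {
            throw new IllegalArgumentException("at least one server is required");
        }
        this.servers = new ArrayList<>(servers);
    }

    // Thread-safe: each call atomically claims the next position.
    // floorMod keeps the index valid even after the counter overflows.
    String next() {
        int index = Math.floorMod(counter.getAndIncrement(), servers.size());
        return servers.get(index);
    }
}
```

For example, a balancer built over three servers named app1, app2, and app3 hands them out in that order and then starts again from app1. Real balancers add health checks and weighting on top of a selection policy like this.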

9. Mirroring

Mirroring is a method often used by large websites to improve performance and data security. Mirroring can resolve differences in user access speed caused by different network access providers and regions. For example, the difference between ChinaNet and EduNet has prompted many websites to build mirror sites inside the education network, with the data updated on a schedule or in real time. I will not go into the detailed technology of mirroring here. There are many professional off-the-shelf architectures and products to choose from, and there are also inexpensive software approaches, such as rsync and other tools on Linux.

10. Latest: CDN acceleration technology

What is CDN?

The full name of CDN is content distribution network. Its purpose is to add a new layer of network architecture on top of the existing Internet and publish the website's content to the network "edge" closest to the user, so that users can obtain the content they need from a nearby location, which improves the response speed of access to the website.

A CDN differs from mirroring in that it is more intelligent. As a rough comparison: CDN = more intelligent mirroring + caching + traffic diversion. A CDN can therefore significantly improve the efficiency of information flow on the Internet. Technically, it comprehensively addresses the problems caused by limited network bandwidth, large numbers of user visits, and unevenly distributed points of presence, and improves the response speed of users' access to the website.

Types of CDN and their characteristics:

CDN implementations fall into three categories: mirroring, caching, and dedicated lines.

Mirror sites are the most common. They allow content to be published directly and suit static and quasi-dynamic data synchronization. However, the cost of purchasing and maintaining new servers is relatively high, mirror servers must be set up in various regions, and professional technicians must be assigned to manage and maintain them. For large websites, the bandwidth cost of updates also increases significantly.

Caching is low-cost and suits static content. Internet statistics show that more than 80% of users frequently access the same 20% of a website's content. Under this rule, the cache servers can handle most of the customers' static requests, while the origin server only needs to handle the roughly 20% of requests that are not cached or are dynamic. This greatly speeds up the response to client requests and reduces the load on the origin server.

CDN services generally place cache servers at key nodes across the country.

Dedicated lines allow users to access the data source directly and achieve dynamic synchronization of data.
