PS: Nginx, LVS, and HAProxy are currently the three most widely used software load balancers. I have deployed all of them in multiple projects; this summary draws on reference material and on my own experience.
In general, load balancing technology is chosen according to the stage a website has reached as it grows, and the specific application requirements need to be analyzed case by case. For a small or medium-sized web application, for example one with fewer than 10 million daily page views (PV), Nginx is perfectly adequate. If there are many machines, DNS round robin is an option, since LVS itself takes up quite a few machines. For large websites or important services with plenty of servers, LVS is worth considering.
Broadly, there are two ways to implement load balancing. One is hardware: commercial load balancers such as the relatively expensive F5 and Array. Their advantage is that a professional team maintains them; their disadvantage is the high cost, so smaller network services do not need them for the time being. The other is free, open-source, Linux-based load balancing software such as Nginx, LVS, and HAProxy. These are implemented entirely in software, so the cost is very low.
A reasonable and popular website architecture today looks like this: the web front end uses Nginx/HAProxy+Keepalived as the load balancer, and the back end uses a MySQL database with one master, multiple slaves, and read/write splitting, load balanced with LVS+Keepalived. Of course, the actual plan must be based on the specific needs of the project.
Below are their respective characteristics and the scenarios each one suits.
The advantages of Nginx are:
1. Nginx works at layer 7 of the network stack, so it can apply routing strategies to HTTP traffic based on, for example, domain names and directory structure. Its regular-expression rules are more powerful and flexible than HAProxy's, which is one of the main reasons it is so widely used. On this basis alone, Nginx fits far more situations than LVS.
2. Nginx depends very little on network stability. In theory, it can balance load to any backend it can ping, which is one of its advantages. LVS, by contrast, depends heavily on network stability, as I know well from experience.
3. Nginx is relatively simple to install and configure and convenient to test, since it prints most errors to its logs. Configuring and testing LVS takes considerably longer, and LVS relies heavily on the network.
4. It withstands high load and is stable. On decent hardware it can generally support tens of thousands of concurrent connections, though its load capacity is lower than that of LVS.
5. Nginx can detect backend server failures through the port, for example from the status code the server returns or from a timeout, and it resubmits a failed request to another node. The drawback is that it does not support URL-based detection. For example, if a user is uploading a file and the node handling the upload fails partway through, Nginx switches the upload to another server for reprocessing, whereas LVS simply drops the connection. If the file is large or very important, the user may be unhappy.
6. Nginx is not only an excellent load balancer and reverse proxy; it is also a powerful web server. LNMP has been a very popular web stack in recent years, and it is very stable in high-traffic environments.
7. Nginx is increasingly mature as a caching reverse proxy and is faster than the traditional Squid server, so it is worth considering as a reverse proxy accelerator.
8. Nginx can serve as a mid-tier reverse proxy, and at this tier it has essentially no rivals. The only one that comes close is lighttpd, but lighttpd does not yet offer Nginx's full feature set, its configuration is not as clear and readable, and its community is far less active.
9. Nginx can also serve static pages and images, where its performance is unmatched. The Nginx community is very active and there are many third-party modules.
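The failover behavior described in the list above, where a request that fails on one node is resubmitted to another, can be sketched roughly as follows. This is only an illustration in Python, not how Nginx is implemented: the names proxy_with_failover, UpstreamError, broken_node, and healthy_node are hypothetical, and the backends are plain functions rather than real servers.

```python
from typing import Callable, Optional, Sequence


class UpstreamError(Exception):
    """Hypothetical error raised when a backend fails or times out."""


Backend = Callable[[str], str]


def proxy_with_failover(request: str, backends: Sequence[Backend]) -> str:
    """Try each backend in order and return the first successful response."""
    last_error: Optional[Exception] = None
    for backend in backends:
        try:
            return backend(request)
        except UpstreamError as exc:
            # Remember the failure and resubmit the request to the next node.
            last_error = exc
    raise UpstreamError(f"all backends failed: {last_error}")


def broken_node(request: str) -> str:
    """Simulated node that always times out."""
    raise UpstreamError("connection timed out")


def healthy_node(request: str) -> str:
    """Simulated node that always answers."""
    return f"200 OK for {request}"


if __name__ == "__main__":
    print(proxy_with_failover("GET /upload", [broken_node, healthy_node]))
```

A layer-4 balancer cannot do this, because it never sees the request as a unit that could be replayed; it only forwards packets for an established connection.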
The disadvantages of Nginx are:
1. Nginx supports only the HTTP, HTTPS, and email protocols, so its scope of application is narrower. (Newer releases add generic TCP and UDP proxying through the stream module, which narrows this gap.)
2. Health checks of backend servers work only at the port level; detection through a URL is not supported. Session persistence is not supported directly, although ip_hash works around this.
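To show why ip_hash helps with sessions, here is a minimal Python sketch of the idea: the same client address always maps to the same backend. It assumes, as the Nginx documentation describes for ip_hash, that the first three octets of an IPv4 address form the key. The function name pick_backend_by_ip is hypothetical, and CRC32 stands in for Nginx's own hash function, so the actual server chosen will differ from what Nginx would pick.

```python
import ipaddress
import zlib
from typing import Sequence


def pick_backend_by_ip(client_ip: str, backends: Sequence[str]) -> str:
    """Map a client address to one backend, stable across requests."""
    addr = ipaddress.ip_address(client_ip)
    if addr.version == 4:
        # Use only the first three octets, so a whole /24 shares a backend.
        key = addr.packed[:3]
    else:
        # Use the full address for IPv6.
        key = addr.packed
    # CRC32 is an assumption for illustration, not Nginx's real hash.
    return backends[zlib.crc32(key) % len(backends)]


if __name__ == "__main__":
    pool = ["web1", "web2", "web3"]
    for ip in ("192.168.1.10", "192.168.1.200", "10.0.0.1"):
        print(ip, "->", pick_backend_by_ip(ip, pool))
```

The trade-off is that clients behind one NAT or proxy all land on the same backend, and adding or removing a backend reshuffles most of the mapping.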
LVS: uses Linux kernel clustering to implement a high-performance, high-availability load balancing server with good scalability, reliability, and manageability.
The advantages of LVS are:
1. Strong load capacity. LVS works at layer 4 of the network stack and only distributes requests, generating no traffic of its own. This is what gives it the strongest performance among software load balancers, with relatively low memory and CPU consumption.
2. Low configurability, which is both a weakness and a strength: because there is little to configure, it rarely needs to be touched, which greatly reduces the chance of human error.
3. Stable operation, thanks to its strong load capacity and a complete hot-standby solution such as LVS+Keepalived. The combination we use most in projects is LVS/DR+Keepalived.
4. No traffic passes out through it. LVS only distributes requests, and the traffic does not flow out through the balancer itself, so heavy traffic does not affect the balancer's I/O performance.
5. Wide range of application. Because LVS works at layer 4, it can load balance almost any application, including HTTP, databases, online chat rooms, and so on.
The disadvantages of LVS are:
1. The software itself does not support regular-expression processing and cannot separate dynamic from static content, which many websites now strongly need. This is where Nginx/HAProxy+Keepalived has the advantage.
2. For a relatively large website, LVS/DR+Keepalived is more complicated to implement, especially when Windows Server machines sit behind it; implementation, configuration, and maintenance all become more involved. Nginx/HAProxy+Keepalived is much simpler by comparison.
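The dynamic/static separation that a layer-7 balancer can do, and a layer-4 balancer cannot, comes down to inspecting the request path. A minimal Python sketch follows; the pool addresses, the file extension list, and the function name choose_pool are all assumptions made for illustration.

```python
import re
from typing import List

# Hypothetical backend pools.
STATIC_POOL: List[str] = ["static1:80", "static2:80"]
DYNAMIC_POOL: List[str] = ["app1:8080", "app2:8080"]

# Requests whose path ends in one of these extensions count as static.
STATIC_PATTERN = re.compile(r"\.(?:css|js|png|jpg|gif|ico)$", re.IGNORECASE)


def choose_pool(url: str) -> List[str]:
    """Pick a backend pool by matching a regular expression on the path."""
    path = url.split("?", 1)[0]  # ignore the query string
    if STATIC_PATTERN.search(path):
        return STATIC_POOL
    return DYNAMIC_POOL


if __name__ == "__main__":
    for url in ("/img/logo.PNG", "/index.php?id=7", "/css/site.css"):
        print(url, "->", choose_pool(url))
```

LVS sees only addresses and ports, so it has no URL to match against; this kind of rule has to live in Nginx or HAProxy.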
The features of HAProxy are:
1. HAProxy also supports virtual hosts.
2. HAProxy makes up for some of Nginx's shortcomings: it supports session persistence and cookie-based routing, and it can check the status of a backend server by fetching a specified URL.
3. Like LVS, HAProxy is purely a load balancer. In terms of raw efficiency, HAProxy balances load faster than Nginx and also handles concurrency better.
4. HAProxy supports load balancing and forwarding of TCP traffic. It can load balance MySQL reads, health-checking and balancing the backend MySQL nodes. LVS+Keepalived can likewise be used to load balance MySQL masters and slaves.
5. HAProxy offers many load balancing strategies. Its algorithms currently include the following eight:
① roundrobin: simple round robin, which practically every load balancer offers;
② static-rr: based on weight; worth noting;
③ leastconn: the server with the fewest connections is served first; worth noting;
④ source: based on the request's source IP, similar to Nginx's ip_hash mechanism; we use it as one way to solve session problems; worth noting;
⑤ uri: based on the requested URI;
⑥ url_param: based on a parameter in the requested URL; 'balance url_param' requires a URL parameter name;
⑦ hdr(name): pins each HTTP request based on the named HTTP request header;
⑧ rdp_cookie(name): pins and hashes each TCP request based on cookie(name).
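To make the differences among these strategies concrete, here is a rough Python sketch of three families: weighted rotation (in the spirit of roundrobin and static-rr), leastconn, and hashing on a key such as the source IP or the URI. It is not HAProxy's implementation. The class Server and the functions weighted_round_robin, least_conn, and hash_pick are hypothetical names, the weighted rotation is the naive expanded-list form rather than a smooth interleaving, and CRC32 is an assumed hash.

```python
import itertools
import zlib
from typing import Iterator, List


class Server:
    """Hypothetical backend record."""

    def __init__(self, name: str, weight: int = 1) -> None:
        self.name = name
        self.weight = weight
        self.active = 0  # number of currently open connections


def weighted_round_robin(servers: List[Server]) -> Iterator[Server]:
    """Cycle through servers, repeating each one according to its weight."""
    expanded = [s for s in servers for _ in range(s.weight)]
    return itertools.cycle(expanded)


def least_conn(servers: List[Server]) -> Server:
    """Pick the server with the fewest active connections."""
    return min(servers, key=lambda s: s.active)


def hash_pick(key: str, servers: List[Server]) -> Server:
    """Pick a server by hashing a key such as a source IP or a URI."""
    return servers[zlib.crc32(key.encode("utf-8")) % len(servers)]


if __name__ == "__main__":
    pool = [Server("a", weight=2), Server("b", weight=1)]
    rotation = weighted_round_robin(pool)
    print([next(rotation).name for _ in range(6)])
    pool[0].active = 3
    pool[1].active = 1
    print(least_conn(pool).name)
    print(hash_pick("10.0.0.1", pool).name)
```

Rotation spreads load evenly without looking at the request, leastconn suits long-lived connections, and the hash-based choices trade even distribution for stickiness.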
Summary of the comparison between Nginx and LVS:
1. Nginx works at layer 7, so it can apply routing strategies to the HTTP application itself, such as by domain name or directory structure. LVS has no such capability, so on this basis alone Nginx fits far more situations than LVS. However, these useful features also make Nginx more tunable than LVS, so it gets touched more often, and the more it is touched, the more likely mistakes become.
2. Nginx relies less on network stability. In theory, as long as ping succeeds and web pages are reachable, Nginx can connect. This is a major advantage. Nginx can also distinguish between internal and external networks; a node attached to both effectively has a backup line on a single machine. LVS depends more on the network environment. At present our servers sit in the same network segment and LVS distributes traffic in direct routing mode, which gives more reliable results. Also note that LVS needs at least one extra IP from the hosting provider to serve as the virtual IP (VIP); it apparently cannot use its own IP as the VIP. Being a good LVS administrator really does require learning a lot about network communication, which is no longer as simple as HTTP.
3. Nginx is relatively simple to install and configure and very convenient to test, because it prints most errors to its logs. Installing, configuring, and testing LVS takes considerably longer. LVS relies heavily on the network: in many cases a failed setup is due to network problems rather than configuration problems, and such problems are much more troublesome to resolve.
4. Nginx can also withstand high load and is stable, but its load capacity and stability are several levels below those of LVS. Nginx handles all the traffic itself, so it is limited by the machine's I/O and configuration, and its own bugs are also hard to avoid.
5. Nginx can detect backend server failures, for example from the status code the server returns or from a timeout, and it resubmits a failed request to another node. ldirectord in LVS can now also monitor the internal state of a server, but the way LVS works prevents it from resending requests. For example, if a user is uploading a file and the node handling the upload fails partway through, Nginx switches the upload to another server for reprocessing, whereas LVS simply drops the connection. If the file is large or very important, the user may be annoyed.
6. Nginx's asynchronous request handling helps reduce the load on backend nodes. If Apache serves clients directly, many slow (narrowband) connections tie up a large amount of Apache memory that cannot be released. With Nginx in front as a proxy for Apache, those slow connections are absorbed by Nginx and requests do not pile up on Apache, which saves considerable resources. Squid has the same effect here: even when configured not to cache, it still helps Apache a great deal.
7. Nginx supports HTTP, HTTPS, and email (the email function is less commonly used), and LVS supports more applications than Nginx in this respect. In practice, the front end should generally use LVS; that is, DNS should point to the LVS balancer, and the strengths of LVS make it well suited to this task. Important IP addresses are best managed by LVS, such as database IPs and web service server IPs. Over time these addresses come to be used more and more widely, and changing them invites failures, so it is safest to let LVS host them. The only drawback is that more VIPs are needed. Nginx can run on the LVS node machines, which takes advantage of both its features and its performance. Squid could also be used directly at this tier, but its features are much weaker than Nginx's and its performance is also inferior. Nginx can also act as a mid-tier proxy, where it has essentially no rivals. The only one that could challenge it is lighttpd, but lighttpd does not yet offer Nginx's full functionality, and its configuration is not as clear and readable.
In addition, the IP of the mid-tier proxy is also important, so the ideal solution is for the mid-tier proxy to have its own VIP behind LVS as well. The specific application needs to be analyzed case by case. For a relatively small website (fewer than 10 million daily PV), Nginx is perfectly adequate. If there are many machines, DNS round robin is an option, since LVS itself takes up quite a few machines. For large websites or important services, when machines are not a concern, LVS should be considered.
Today, network load balancing uses different technologies at different stages as a website grows:
First stage: use Nginx or HAProxy for single-point load balancing. At this stage the service has just moved beyond the single-server, single-database model and needs some load balancing, but it is still small: there is no professional team to maintain it and no need for a large-scale deployment. Nginx or HAProxy is the first choice here, because both are quick to pick up and easy to configure, and they work over HTTP at layer 7.
Second stage: as network services expand further, single-point Nginx is no longer sufficient, and LVS or a commercial Array appliance becomes the first choice, with Nginx now serving as a node behind LVS or Array. The choice between LVS and Array depends on company size and budget. Array's application delivery features are very powerful; I have used it in a project, and its price/performance is much better than F5's, which makes it my first choice among commercial products. Generally speaking, though, staffing at this stage cannot keep up with the growth of the business, so purchasing a commercial load balancer becomes the only practical route.
Third stage: network services have now become the company's mainstream product. As the company's profile grows, so do the skills and number of its relevant staff. At this point, whether for building customizations suited to its own products or for reducing costs, open-source LVS becomes the first choice, and LVS becomes the mainstream.
The final ideal basic architecture is: Array/LVS -> Nginx/HAProxy -> Squid/Varnish -> AppServer.
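As a toy model of this layering, the Python sketch below chains a layer-4 balancer, a layer-7 balancer, two cache nodes, and one application server, and shows that repeated requests are absorbed by the cache tier. All class names (AppServer, CacheTier, BalancerTier) are hypothetical stand-ins; the real products do far more than this.

```python
from typing import Dict, List


class AppServer:
    """Stands in for the application server tier."""

    def __init__(self) -> None:
        self.hits = 0  # how many requests actually reached the application

    def handle(self, path: str) -> str:
        self.hits += 1
        return f"page for {path}"


class CacheTier:
    """Stands in for Squid/Varnish: repeated paths are served from memory."""

    def __init__(self, upstream: AppServer) -> None:
        self.upstream = upstream
        self.store: Dict[str, str] = {}

    def handle(self, path: str) -> str:
        if path not in self.store:
            self.store[path] = self.upstream.handle(path)
        return self.store[path]


class BalancerTier:
    """Stands in for Array/LVS or Nginx/HAProxy: plain round robin."""

    def __init__(self, upstreams: List) -> None:
        self.upstreams = upstreams
        self.next_index = 0

    def handle(self, path: str) -> str:
        upstream = self.upstreams[self.next_index % len(self.upstreams)]
        self.next_index += 1
        return upstream.handle(path)


if __name__ == "__main__":
    app = AppServer()
    layer7 = BalancerTier([CacheTier(app), CacheTier(app)])
    layer4 = BalancerTier([layer7])
    for _ in range(4):
        layer4.handle("/index")
    print("requests reaching the app server:", app.hits)
```

Each cache node misses once and then serves from memory, so only a fraction of the traffic entering at the top tier reaches the application.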
This blog post is reproduced from http://www.ha97.com/5646.html