
Threads vs. processes in Python: differences, advantages, and disadvantages

乌拉乌拉~ (Original)
2018-08-23 17:42:40

In this article, let's look at what Python threads and processes are, how they differ, and what the advantages and disadvantages of each are. We have already introduced multi-processing and multi-threading, the two most common ways to achieve multitasking. Now let's discuss the pros and cons of both approaches.

First of all, to achieve multitasking we usually use a Master-Worker design. The Master is responsible for allocating tasks and the Workers are responsible for executing them. A multitasking environment therefore usually has one Master and multiple Workers.

If you use multiple processes to implement Master-Worker, the main process is Master, and other processes are Workers.

If you use multi-threading to implement Master-Worker, the main thread is the Master and other threads are the Workers.
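As a rough illustration of the multi-threaded variant, the sketch below uses the standard `queue` and `threading` modules. The names `worker` and `run_master`, the squaring "task", and the `None` stop signal are all assumptions made for this example, not part of any fixed API.

```python
import queue
import threading

def worker(tasks: queue.Queue, results: queue.Queue) -> None:
    """Worker: keep taking tasks until the Master sends the stop signal."""
    while True:
        n = tasks.get()
        if n is None:              # sentinel: no more work for this Worker
            break
        results.put((n, n * n))    # the "task" here is just squaring a number

def run_master(numbers: list, worker_count: int = 3) -> dict:
    """Master: start the Workers, hand out tasks, then collect the results."""
    tasks, results = queue.Queue(), queue.Queue()
    workers = [threading.Thread(target=worker, args=(tasks, results))
               for _ in range(worker_count)]
    for t in workers:
        t.start()
    for n in numbers:
        tasks.put(n)
    for _ in workers:
        tasks.put(None)            # one stop signal per Worker
    for t in workers:
        t.join()
    return dict(results.get() for _ in numbers)

squares = run_master([1, 2, 3, 4, 5])
print(squares)
```

The Master never does the work itself; it only fills the task queue and gathers results, which is the division of labor described above. The order in which results arrive depends on thread scheduling.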

The biggest advantage of the multi-process mode is its high stability: if a child process crashes, it does not affect the main process or the other child processes. (Of course, if the Master process crashes, all processes go down with it, but the Master only allocates tasks, so the probability of it crashing is low.) The well-known Apache server originally adopted the multi-process mode.
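The isolation between processes can be seen in a small experiment. This is only a sketch: it launches child Python interpreters with the standard `subprocess` module rather than `multiprocessing`, and the two one-line worker scripts (`GOOD_WORKER`, `BAD_WORKER`) are hypothetical stand-ins for real workers.

```python
import subprocess
import sys

# Hypothetical worker scripts: one finishes normally, one dies abruptly.
GOOD_WORKER = "print(6 * 7)"
BAD_WORKER = "import os; os._exit(3)"   # simulates a crash with exit code 3

def run_worker(source: str) -> subprocess.CompletedProcess:
    """Run a worker in its own process and capture its output and exit code."""
    return subprocess.run([sys.executable, "-c", source],
                          capture_output=True, text=True)

bad = run_worker(BAD_WORKER)     # this child dies...
good = run_worker(GOOD_WORKER)   # ...but the parent and later children carry on

print("bad worker exit code:", bad.returncode)
print("good worker output:", good.stdout.strip())
```

The parent only sees a non-zero exit code from the failed child; its own memory and its other children are untouched.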

The disadvantage of the multi-process mode is that creating a process is expensive. On Unix/Linux, calling fork is acceptable, but creating processes on Windows is costly. In addition, the number of processes the operating system can run at the same time is limited. Given the limits of memory and CPU, if thousands of processes run at once, the operating system will struggle even to schedule them.

Multi-threaded mode is usually a little faster than multi-process mode, but not by much. Its fatal disadvantage is that a crash in any one thread may bring down the entire process, because all threads share the process's memory. On Windows, when the code executed by a thread goes wrong, you often see this prompt: "The program has performed an illegal operation and will be closed soon." Often only one thread has a problem, but the operating system forcibly ends the entire process.
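The shared memory mentioned above is easy to observe. In this minimal sketch (the `counter` dictionary and `add_many` function are made up for the example), four threads update the same object, and a `threading.Lock` keeps their updates from interfering with each other.

```python
import threading

counter = {"value": 0}      # lives in the process's memory, visible to every thread
lock = threading.Lock()

def add_many(times: int) -> None:
    for _ in range(times):
        with lock:          # without the lock, concurrent updates could be lost
            counter["value"] += 1

threads = [threading.Thread(target=add_many, args=(10_000,)) for _ in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()

print(counter["value"])     # 40000
```

Because every thread can touch the same data, a thread that corrupts it affects all the others. Note that an ordinary uncaught Python exception in a worker thread ends only that thread; the whole-process crash described above comes from lower-level faults, such as a segmentation fault in native code.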

On Windows, multi-threading is more efficient than multi-processing, so Microsoft's IIS server uses multi-threaded mode by default. Because of the stability issues of multi-threading, IIS is less stable than Apache. To alleviate this problem, both IIS and Apache now offer a mixed multi-process and multi-thread mode, which makes the matter more and more complicated.

Thread switching

Whether you use multiple processes or multiple threads, once their number grows large, efficiency stops improving. Why?

Let’s use an analogy. Suppose you are unfortunately preparing for the high school entrance examination and need to do homework in five subjects: Chinese, mathematics, English, physics, and chemistry every night. Each homework takes 1 hour.

If you spend 1 hour on Chinese homework, then 1 hour on math homework, and so on, finishing each subject in turn for a total of 5 hours, this is called the single-task model, or the batch-processing model.

Suppose you switch to a multitasking model instead: do Chinese for 1 minute, switch to math for 1 minute, then switch to English, and so on. As long as you switch fast enough, this is the same as a single-core CPU performing multitasking. From the perspective of a kindergarten child, you are doing homework for five subjects at the same time.

However, switching homework has a price. For example, when switching from Chinese to mathematics, you must first clear the Chinese books and pens off the table (this is called saving the context), then open the mathematics textbook and find the compass and ruler (this is called preparing the new environment) before you can start on the math homework. The operating system does the same when it switches processes or threads. It first saves the current execution environment (CPU register state, memory pages, and so on), and then prepares the execution environment for the new task (restoring its last register state, switching memory pages, and so on) before execution can begin. Each switch is fast, but it still takes time. If thousands of tasks run at the same time, the operating system may spend most of its time switching tasks and have little time left to actually run them. The typical symptoms are a hard disk that thrashes loudly, windows that do not respond to clicks, and a system that appears frozen.

Therefore, once the number of concurrent tasks reaches a certain limit, it consumes all of the system's resources. Efficiency then drops sharply and none of the tasks gets done well.
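The homework analogy can be turned into a small back-of-the-envelope model. The function name `total_minutes` and the switch cost of 0.5 minutes are assumptions chosen only to make the arithmetic concrete; real context-switch costs are far smaller, but so are real time slices.

```python
import math

def total_minutes(tasks: int, minutes_per_task: float,
                  slice_minutes: float, switch_cost: float) -> float:
    """Total time on one 'core' when work is done in slices with a cost per switch."""
    work = tasks * minutes_per_task
    slices = math.ceil(work / slice_minutes)
    switches = max(slices - 1, 0)        # no switch is needed before the first slice
    return work + switches * switch_cost

# Five subjects, 60 minutes each, 0.5 minutes lost on every switch (assumed).
batch = total_minutes(5, 60, slice_minutes=60, switch_cost=0.5)
interleaved = total_minutes(5, 60, slice_minutes=1, switch_cost=0.5)

print(batch)        # 302.0
print(interleaved)  # 449.5
```

The useful work is 300 minutes in both cases; the only difference is how many times the context is switched, which is why overhead grows with the number of tasks being juggled.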

Computing-intensive vs. IO-intensive

The second consideration in whether to use multitasking is the type of task. We can divide tasks into computing-intensive and IO-intensive.

Computing-intensive tasks involve a large amount of calculation that consumes CPU resources, such as calculating pi or decoding high-definition video, and they rely entirely on the computing power of the CPU. Such tasks can also be run with multitasking, but the more tasks there are, the more time is spent on task switching and the less efficiently the CPU executes them. Therefore, to make the most efficient use of the CPU, the number of computing-intensive tasks running at the same time should equal the number of CPU cores.
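One way to apply this rule is to size the worker pool from `os.cpu_count()` and split the work into that many chunks. The sketch below only shows the sizing and partitioning; the helper names are hypothetical, and the chunks are summed one after another here, whereas in a real program each chunk would be handed to its own worker process, for example through `concurrent.futures.ProcessPoolExecutor(max_workers=workers)`.

```python
import os

def worker_count_for_cpu_bound() -> int:
    """One worker per CPU core; os.cpu_count() may return None, so fall back to 1."""
    return os.cpu_count() or 1

def split_range(n: int, parts: int) -> list:
    """Split range(n) into `parts` contiguous chunks of near-equal size."""
    step, extra = divmod(n, parts)
    chunks, start = [], 0
    for i in range(parts):
        end = start + step + (1 if i < extra else 0)
        chunks.append(range(start, end))
        start = end
    return chunks

workers = worker_count_for_cpu_bound()
chunks = split_range(1_000_000, workers)

# Stand-in for the parallel step: each chunk is an independent unit of CPU work.
partial_sums = [sum(i * i for i in chunk) for chunk in chunks]
total = sum(partial_sums)
print(workers, total)
```

Creating more chunks than cores would not make the calculation finish sooner; it would only add switching overhead.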

Computing-intensive tasks mainly consume CPU resources, so the running efficiency of the code is crucial. Scripting languages like Python run much more slowly than compiled languages and are poorly suited to computing-intensive tasks. For such tasks, it is better to write the code in C.

The second type of task is IO-intensive. Tasks involving network or disk IO are all IO-intensive. Such tasks consume very little CPU and spend most of their time waiting for IO operations to complete (because IO is much slower than the CPU and memory). For IO-intensive tasks, the more tasks there are, the higher the CPU utilization, although there is a limit. Most common tasks, such as web applications, are IO-intensive.
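A minimal sketch of why more tasks help in the IO-intensive case: the hypothetical `fake_io` function below simulates an IO wait with `time.sleep`, and ten such waits overlap when run on a thread pool instead of adding up.

```python
import time
from concurrent.futures import ThreadPoolExecutor

def fake_io(task_id: int) -> int:
    """Stand-in for a network or disk request: the CPU is idle while we wait."""
    time.sleep(0.2)
    return task_id

start = time.perf_counter()
with ThreadPoolExecutor(max_workers=10) as pool:
    results = list(pool.map(fake_io, range(10)))
elapsed = time.perf_counter() - start

# Run one after another, the ten waits would take about 2 seconds in total.
print(results, round(elapsed, 2))
```

The threads spend almost all of their time blocked, so the CPU has room to serve many of them; the limit mentioned above is reached when scheduling overhead or the IO device itself becomes the bottleneck.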

While an IO-intensive task runs, 99% of the time is spent on IO and very little on the CPU. Replacing a slow scripting language such as Python with the extremely fast C language therefore brings almost no improvement in running efficiency. For IO-intensive tasks, the most suitable language is the one with the highest development efficiency (the least code). A scripting language is the first choice, and C is the worst.

Asynchronous IO

Given the huge speed difference between the CPU and IO, a task spends most of its execution time waiting for IO operations. In a single-process, single-thread model, this means other tasks cannot run in the meantime, so we need a multi-process or multi-thread model to support the concurrent execution of multiple tasks.

Modern operating systems have made huge improvements to IO operations, the most important being support for asynchronous IO. If you make full use of the asynchronous IO support provided by the operating system, you can run multiple tasks with a single-process, single-thread model. This model is called the event-driven model. Nginx is a web server that supports asynchronous IO: on a single-core CPU, a single process is enough to support multitasking efficiently. On a multi-core CPU, you can run multiple processes (as many as there are CPU cores) to take full advantage of the cores. Because the total number of processes in the system stays very small, operating system scheduling remains very efficient. Using the asynchronous IO programming model to implement multitasking is a major trend.

In Python, the single-threaded asynchronous programming model is called the coroutine. With coroutine support, efficient multitasking programs can be written on an event-driven basis.
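A minimal sketch of this model using the standard `asyncio` module is shown below. The coroutine `fetch` and its delays are hypothetical stand-ins for real IO; `asyncio.sleep` simulates the wait.

```python
import asyncio
import time

async def fetch(name: str, delay: float) -> str:
    """Simulated IO: awaiting hands control back to the event loop while waiting."""
    await asyncio.sleep(delay)
    return f"{name} done"

async def main() -> list:
    # All three coroutines wait at the same time inside one thread.
    return await asyncio.gather(fetch("a", 0.2), fetch("b", 0.2), fetch("c", 0.2))

start = time.perf_counter()
results = asyncio.run(main())
elapsed = time.perf_counter() - start

print(results)              # ['a done', 'b done', 'c done']
print(round(elapsed, 2))    # close to 0.2 rather than 0.6
```

No extra processes or threads are created here; the event loop switches between coroutines only at `await` points, so there is very little switching overhead.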

That is all for this article, which introduced the differences, advantages, and disadvantages of threads and processes in Python. I hope it helps you understand the topic and makes learning Python easier.

