Home > Article > Web Front-end > Let's talk about asynchronous implementation and event driving in Node
This article will take you through the asynchronous implementation and event drive in Node, I hope it will be helpful to you!
Some tasks in computers can generally be divided into two categories, one category is called IO Intensive, one is called computing-intensive; for computing-intensive tasks, the performance of the CPU can only be continuously drained, but for IO-intensive tasks, ideally this is not needed, and only the IO device needs to be notified for processing. , just come back and get the data after a while. [Related tutorial recommendations: nodejs video tutorial, Programming video]
For some scenarios, there are some unrelated tasks that need to be completed. The current mainstream There are two methods:
node
gave its solution before the two: use single thread to stay away from multi-thread deadlock, state synchronization and other problems; use asynchronous IO, keeping single threads away from blocking to better use the CPU
I just talked about
node
's multi-tasking solution, but it is not easy to implement it internally innode
. Here are some concepts of the operating system, so that everyone can better understand it in the future. Let’s talk about asynchronous implementation and node’s event loop mechanism later:
Everything in the operating system is a file, and input and output devices are also abstracted. Files, when the kernel performs IO operations, manage them through file descriptors
Some problems with non-blocking IO: Although it allows the CPU The utilization rate has increased, but since a file descriptor is returned immediately, we do not know when the IO operation is completed. In order to confirm the status change, we can only perform polling operations
read
: The most primitive and lowest performance method, completes the acquisition of complete data byrepeatedly checking the IO status
select
: Judging by the event status on the file descriptor, relatively speaking, it consumes less; the disadvantage is that it uses a 1024-length array for storage status, so it can check up to 1024 file descriptors at the same time poll
: Due to the limitation of select
, poll
is improved to the storage of linked list The other methods are basically the same; but when there are many file descriptors, its performance is still very low eopll
: This solution is under linux
The most efficient IO event notification mechanism. If no IO event is checked when entering polling, it will sleep until an event occurs to wake it upkqueue
: and epoll
Similar, but only exists under FreeBSD systemsAlthough epoll
uses events to reduce CPU consumption, the CPU is almost idle during sleep; we The expected asynchronous IO should be a non-blocking call initiated by the application. There is no need to poll through traversal or event wake-up. The next task can be processed directly. It only needs to pass the data to the application through a signal or callback after the IO is completed.
There is also an AIO method under Linux that transmits data through signals or callbacks, but it is only available in Linux, and there are restrictions that cannot use the system cache
Let’s talk about the conclusion first. node
The implementation of asynchronous IO is implemented through multi-threading. What may be confusing is that although node
is internally multi-threaded, the JavaScript
code developed by our programmers only runs on a single thread.
node
Use some threads to perform blocking IO or non-blocking IO plus polling technology to complete data acquisition, let one thread perform calculation and processing, and transfer the data obtained from IO through communication between threads. , which easily realizes the simulation of asynchronous IO.
In addition to asynchronous IO, other resources in the computer are also applicable, because everything in Linux is a file, and almost all computer resources such as disks, hardware, sockets, etc. are abstracted. file, the next introduction to calls to computer resources takes IO as an example.
When the process starts, node
will create a loop similar to while(true)
, each time the loop body is executed, we become Tick
;
Below is the event loop flow chart in node
:
A very simple picture, a brief explanation: every time the execution completion event is obtained from the IO observer (it is a request object, a simple understanding is that it contains some data generated in the request), and then there is no callback function, continue to take out the next event (request object), and execute the callback function if there is a callback
Note: Different platforms have different implementation details. This picture hides the relevant platform compatibility details. For example, using
PostQueuedCompletionStatus()
in IOCP under windows to submit the execution status throughGetQueuedCompletionStatus
Obtain the completed request, and the details of the thread pool are implemented internally in IOCP, while platforms such as Linux implement this process througheopll
, and self-implement the thread pool underlibuv
setTimtout
and setInterval
In addition to IO and other computer resources that require asynchronous calls, node
There are some other asynchronous APIs that have nothing to do with asynchronous IO:
Their implementation principles are similar to asynchronous IO,just do not require the participation of the IO thread pool
:
setInterval
The created timer will be inserted into a red-black tree inside the timer observer
Every time
If it does, push the event (request object) into the event queue and execute the callback function in the event loop##Have you considered this issue? Well, why does the timer not require the participation of the thread pool? If you understand the implementation principles of asynchronous IO in the previous chapters, I believe you should be able to explain it. Here is a brief explanation of the reasons to deepen your memory:
The IO thread pool in
node
is a way to call IO and wait for data to return (see the specific implementation). It enablesJavaScript
single threads to be asynchronous Call IO, and do not need to wait for the IO execution to complete (because the IO thread pool does it), and can obtain the final data (through the observer mode: the IO observer obtains the execution completion event from the thread pool, and the event loop mechanism executes the subsequent callback function)The above paragraph may be a bit brief. If you still don’t understand, you can look at the previous pictures~
process .nextTick
andsetImmediate
Both functions represent the immediate asynchronous execution of a function, so why not use
setTimeout(() => { . .. }, 0)
to complete it?
- The timer is not accurate enough
- The timer uses a red-black tree to create timer objects and iterative operations, which wastes performance
- That is,
process.nextTick
More lightweightLightweight specifically: every time we call
process.nextTick
, we will only put the callback function into the queue, and in the next round Take out and execute whenTick
. When using the red-black tree method in the timernextTick , issetImmediate process.nextTick## What is the difference between # and ? After all, they all execute the callback function asynchronously immediately
process.nextTick
's callback execution priority is higher than
setImmediatebelongs to the
- ## The callback function of #process.nextTick
is stored in an array, and is executed in each round of the event loop. The result of
setImmediateis saved in a linked list, and the first callback is executed in sequence in each round of the cycle.
- Note: The reason why the callback execution priority of
process.nextTickprocess.nextTick
is higher than that ofsetImmediate
is because the event loop checks the observer in order.idle
nodeobserver,
setImmediatebelongs to the
checkobserver.
iedlObserver> IO Observer> check ObserverHigh performance server
Processing of network sockets ,
is also applied to asynchronous IO. The requests listened to on the network socket will form events and be handed over to the IO observer. The event loop will continuously process these network IO events. If we are in JavaScrpt Corresponding callback functions are passed in at the level, and these callback functions will be executed in the event loop (processing these network requests)
Common server models:
Per process-->Per requestSynchronous
Per thread-->Per request
- And
node- The event-driven approach is used to handle these requests. There is no need to create additional corresponding threads for each request. The overhead of creating and destroying threads can be omitted. At the same time, the operating system has fewer scheduling tasks because there are fewer threads (only
node- Some threads implemented internally) The cost of context switching is very low.
Classic problem--Avalanche problemSolution:
Problem description: When the server is just started, there is no data in the cache. If the number of visits is huge, the same
SQL
will be sent to the database for repeated queries, affecting performance.Solution:
const proxy = new events.EventEmitter(); let status = "ready"; // 状态锁,避免反复查询 const select = function(callback) { proxy.once("selected", callback); // 绑定一个只执行一次名为selected的事件 if(status === "ready") { status = "pending"; // sql db.select("SQL", (res) => { proxy.emit("selected", res); // 触发事件,返回查询数据 status = "ready"; }) } }Use
once
to push all requested callbacks into the event queue, and use it to remove the monitor after executing it only once Features ensure that each callback function will only be executed once. For the same SQL statement, it is guaranteed to be executed only once from the beginning to the end of the same query. New arrivals of the same call only need to wait in the queue for the data to be ready. Once the results are queried, the results can be used by these calls.For more programming-related knowledge, please visit: Programming Teaching! !
The above is the detailed content of Let's talk about asynchronous implementation and event driving in Node. For more information, please follow other related articles on the PHP Chinese website!