1. Opening analysis
Stream is an abstract interface implemented by many objects in Node. For example, a request to an HTTP server is a stream, and stdout is also a stream. Streams are readable, writable, or both.
Streams go back to the early days of Unix, and decades of practice have shown that the stream idea makes it easy to build large systems out of small, composable parts.
In Unix, streams are connected with the pipe operator "|". In Node they are provided by the built-in stream module, which many core modules and third-party modules build on.
As in Unix, the main stream operation in Node is .pipe(), and a backpressure mechanism lets users keep reading and writing in balance.
Through the abstract Stream interface, streams give developers a unified, reusable interface and a way to control the read/write balance between streams.
A TCP connection is both a readable stream and a writable stream, while HTTP connections are different: an http request object is a readable stream, and an http response object is a writable stream.
By default, data travels through a stream as Buffer objects unless you set another encoding. Here is an example:
Running it produces garbled characters, because no character set such as "utf-8" was specified.
Just modify it:
With the encoding set, the output prints the text correctly.
Why use Stream
I/O in Node is asynchronous, so reading from and writing to disk or the network requires callback functions. Below is an example of a file-download server:
The code above does the job, but the server must buffer the entire file in memory before sending a single byte. If "data.txt" is very large and concurrency is high, a lot of memory is wasted; worse, users receive nothing until the whole file has been cached, which makes for a poor experience. Fortunately both parameters (req, res) are streams, so we can use fs.createReadStream() instead of fs.readFile():
The .pipe() method listens for the 'data' and 'end' events of fs.createReadStream(), so "data.txt" no longer needs to be cached in its entirety: a data chunk can be sent to the client as soon as the connection is established. Another benefit of .pipe() is that it handles the read/write imbalance caused by clients with very high latency.
There are five basic kinds of streams: readable, writable, transform, duplex, and "classic". (See the API documentation for specific usage.)
2. Introduction of examples
When the data to be processed cannot fit into memory all at once, or when it is more efficient to read and process at the same time, we need data streams. NodeJS provides stream operations through its various Stream classes.
Taking a large-file copy program as an example, we can create a read-only data stream for the data source:
The 'data' event in this code fires continuously, regardless of whether doSomething can keep up. We can modify the code as follows to solve the problem:
In addition, we can also create a write-only data stream for the data target, as follows:
With doSomething replaced by writing into a write-only data stream, the code above looks like a file copy program. But it still has the problem just described: if the writing speed cannot keep up with the reading speed, the cache inside the write-only stream will overflow. We can use the return value of .write() to determine whether the incoming data was written to the target or only placed in the cache, and use the 'drain' event to learn when the write-only stream has flushed its cache to the target so that the next chunk can be passed in. The code becomes:
This finally moves data from the read-only stream to the write-only stream, with overflow control included. Because this pattern is needed so often, as in the large-file copy above, NodeJS provides the .pipe() method to do exactly this, and its internal implementation is similar to the code above.
Here is a more complete file-copy program:
You can save the code above as "copy.js" and try it out. We added a recursive setTimeout (setInterval would also work) as an observer: every 500 ms it checks the progress, writes the completed size, the percentage, and the copy speed to the console, and when the copy finishes it computes the total time spent.
3. Summary
(1) Understand the Stream concept.
(2) Become proficient with the relevant Stream APIs.
(3) Mind the details, for example copying large files in chunks ("chunk data").
(4) Know how to use pipe.
(5) To stress the point once more: a TCP connection is both a readable stream and a writable stream, while HTTP is different: an http request object is a readable stream, and an http response object is a writable stream.