
Detailed introduction to the module implementation process in Node (with examples)

不言
2019-03-08 17:34:27


CommonJS defines the module, exports and require parts of the module specification. To implement this simple standard, Node.js goes through a series of complex steps, from the underlying C/C++ built-in modules to the JavaScript core modules, and from path analysis and file location to compilation and execution. A basic understanding of how Node modules work helps us better understand frameworks built on Node.

1. CommonJS module specification

The CommonJS specification (or standard) is, put simply, a theoretical proposal. It wants JavaScript to be able to run across host environments, so that it can be used not only for client applications but also for server applications, command-line tools, desktop GUI applications, and so on.

The CommonJS specification defines a module in three parts:

Module definition

A module object exists inside every module and represents the module itself. The module context provides an exports property; a method or property is exported by attaching it to the exports object, for example:

// math.js
exports.add = function () {
  // ...
}

Module reference

A module provides the require() method, which brings the API of an external module into the current context:

var math = require('math')

Module identification

The module identifier is the argument passed to require(). It can be a camelCase string or a file path.
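The three parts can be seen together in a small, self-contained sketch. It writes a throwaway math.js into a temporary directory and then loads it by file path. The directory name, the body of add and the sum variable are illustrative assumptions, not part of the original example:

```javascript
const fs = require('fs');
const os = require('os');
const path = require('path');

// Scratch directory so the example does not touch real project files.
const demoDir = fs.mkdtempSync(path.join(os.tmpdir(), 'cjs-demo-'));

// Module definition: anything attached to `exports` becomes public.
fs.writeFileSync(
  path.join(demoDir, 'math.js'),
  'exports.add = function (a, b) { return a + b; };\n'
);

// Module reference and identifier: here the identifier is a file path
// (without the .js extension).
const math = require(path.join(demoDir, 'math'));
const sum = math.add(1, 2);
console.log(sum); // 3
```

A bare identifier such as require('math') would only work if a module of that name could be found through the lookup rules described in section 2.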

Node.js draws on the design of CommonJS specifications, especially CommonJS’s Modules specification, to implement a module system. At the same time, NPM implements CommonJS’s Packages specification. Modules and packages form the basis for Node application development.

2. Node module loading principle

The module specification above looks very simple, with only module, exports and require, but how does Node implement it?

Loading a module takes three steps: path analysis (working out the module's full path), file location (resolving the file extension or directory), and compilation and execution.

2.1 Path Analysis

Recall that require() takes a module identifier as its argument. Node performs path analysis based on this identifier, and different kinds of identifiers are analyzed differently. They fall into the following categories:

The core modules provided by Node, such as http, fs, path

The core modules are compiled into the binary executable when the Node source code is built, and are loaded directly into memory when Node starts. They take priority during path analysis, so they load very quickly and skip the subsequent file location and compilation steps.

To load a custom module with the same name as a core module, such as a custom http module, you must either choose a different identifier or refer to it by path.
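As a rough illustration of that rule, the sketch below creates a custom http.js in a temporary directory. The file contents and the variable names coreHttp and customHttp are made up for the example:

```javascript
const fs = require('fs');
const os = require('os');
const path = require('path');

const demoDir = fs.mkdtempSync(path.join(os.tmpdir(), 'core-demo-'));
fs.writeFileSync(
  path.join(demoDir, 'http.js'),
  'module.exports = { custom: true };\n'
);

// The bare identifier resolves to the core module.
const coreHttp = require('http');
// A path identifier is needed to reach the custom file of the same name.
const customHttp = require(path.join(demoDir, 'http.js'));

console.log(typeof coreHttp.createServer); // 'function'
console.log(customHttp.custom);            // true
```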

File modules in path form: relative-path modules starting with . or .., and absolute-path modules starting with /

Identifiers starting with ., .. or / are treated as file modules. Node converts the path passed to require() into a real path, uses that real path as the index, and then compiles and executes the file.

Because a file module states the file location explicitly, path analysis takes less time, and its loading speed is second only to that of core modules.
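One way to observe the result of path analysis is require.resolve(), which returns the resolved identifier without loading the module. This is only a sketch: the temporary directory and util.js are placeholders, and fs.realpathSync() is used because the temporary directory may itself be a symlink:

```javascript
const fs = require('fs');
const os = require('os');
const path = require('path');

const demoDir = fs.realpathSync(
  fs.mkdtempSync(path.join(os.tmpdir(), 'resolve-demo-'))
);
const file = path.join(demoDir, 'util.js');
fs.writeFileSync(file, 'module.exports = {};\n');

// A core module resolves to its own name; no file path is involved.
const coreId = require.resolve('fs');
// A path identifier resolves to the real, absolute file path.
const fileId = require.resolve(path.join(demoDir, 'util'));

console.log(coreId);          // 'fs'
console.log(fileId === file); // true
```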

Custom modules, that is, file modules in non-path form

A custom module is neither a core module nor a file module in path form. It is a special kind of file module: to find it, Node searches the paths in the module path array one level at a time.

An example of the module path search strategy:

// paths.js
console.log(module.paths)

// Terminal
$ node paths.js
[ '/Users/tong/WebstormProjects/testNode/node_modules',
  '/Users/tong/WebstormProjects/node_modules',
  '/Users/tong/node_modules',
  '/Users/node_modules',
  '/node_modules' ]

As the module path array printed above shows, Node searches the node_modules directory level by level, starting from the current directory and moving up through each parent until it reaches the root directory, much like the JavaScript prototype chain or scope chain. The deeper the current file's path, the longer the search takes, so custom modules load the slowest.
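The way such an array is built can be approximated in a few lines. The helper below is a hypothetical re-implementation for illustration, not Node's internal code, and it assumes POSIX-style paths:

```javascript
const path = require('path');

// Hypothetical helper: lists the node_modules directories that would be
// visited for a module located in `fromDir`.
function nodeModulePaths(fromDir) {
  const paths = [];
  let dir = path.posix.resolve(fromDir);
  for (;;) {
    // Avoid producing ".../node_modules/node_modules".
    if (path.posix.basename(dir) !== 'node_modules') {
      paths.push(path.posix.join(dir, 'node_modules'));
    }
    const parent = path.posix.dirname(dir);
    if (parent === dir) break; // reached the root directory
    dir = parent;
  }
  return paths;
}

const searchPaths = nodeModulePaths('/Users/tong/WebstormProjects/testNode');
console.log(searchPaths);
```

For the directory used in the example above, this yields the same five entries, from the project directory up to /node_modules.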

Cache-first mechanism: Node caches modules that have been imported in order to improve performance. Unlike a browser, which caches files, Node caches the object produced by compiling and executing the module, so a second require() of the same module is served from the cache. This cache lookup has the highest priority, higher even than that of core modules!
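The effect of the cache is easy to check. In this sketch a temporary module bumps a global counter each time its body actually runs; the file name counter.js and the global loadCount are assumptions made for the demonstration:

```javascript
const fs = require('fs');
const os = require('os');
const path = require('path');

const demoDir = fs.mkdtempSync(path.join(os.tmpdir(), 'cache-demo-'));
const file = path.join(demoDir, 'counter.js');
fs.writeFileSync(
  file,
  'globalThis.loadCount = (globalThis.loadCount || 0) + 1;\n' +
  'module.exports = { loadedAt: Date.now() };\n'
);

const first = require(file);
const second = require(file);

console.log(first === second);     // true: the same exports object is returned
console.log(globalThis.loadCount); // 1: the module body ran only once
console.log(require.resolve(file) in require.cache); // true
```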

2.2 File positioning

Once path analysis is complete, Node locates the file, which mainly involves file extension analysis and directory and package handling. For clarity, file location is divided into four steps:

step1: Supplement the extension

Usually the identifier passed to require() does not include a file extension. In that case Node tries the extensions .js, .json and .node, in that order.

Trying each extension requires a synchronous, blocking call to the fs module to check whether the file exists, so a small performance tip is to include the extension when passing .json and .node files to require(), which speeds things up slightly.

step2: Handle directories and look for package.json

If no matching file is found after trying the extensions but a directory is found instead, Node treats the directory as a package. Following its implementation of the CommonJS package specification, Node looks for package.json (the package description file) in the directory, parses it into a package description object with JSON.parse(), and reads the file name given by the main property.

step3: Fall back to the default index file

If there is no package.json, or the file name given by the main property is wrong, Node uses index as the default file name and looks for index.js, index.json and index.node in turn.

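step4: Move on to the next module path

If the directory analysis above fails to locate a file, a custom module moves up to the next node_modules directory according to the module path search strategy. If no file has been located after the whole module path array has been traversed, a lookup-failure exception is thrown.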
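The four steps can be sketched as a single function. This is a simplified approximation written for this article, not Node's actual resolver: locate, tryFile and the fixture files below are hypothetical names, and details such as symlinks or the exports field of package.json are ignored.

```javascript
const fs = require('fs');
const os = require('os');
const path = require('path');

const EXTENSIONS = ['.js', '.json', '.node'];

function isFile(p) {
  try { return fs.statSync(p).isFile(); } catch (err) { return false; }
}

function isDirectory(p) {
  try { return fs.statSync(p).isDirectory(); } catch (err) { return false; }
}

// step1: accept the path as given, otherwise try each extension in order.
function tryFile(base) {
  if (isFile(base)) return base;
  for (const ext of EXTENSIONS) {
    if (isFile(base + ext)) return base + ext;
  }
  return null;
}

function locate(base) {
  const asFile = tryFile(base);
  if (asFile) return asFile;
  // step4: nothing here, so the caller would try the next module path.
  if (!isDirectory(base)) return null;

  // step2: treat the directory as a package and honour "main".
  const pkgPath = path.join(base, 'package.json');
  if (isFile(pkgPath)) {
    const pkg = JSON.parse(fs.readFileSync(pkgPath, 'utf8'));
    if (typeof pkg.main === 'string') {
      const viaMain = tryFile(path.join(base, pkg.main));
      if (viaMain) return viaMain;
    }
  }
  // step3: fall back to index.js, index.json, index.node.
  return tryFile(path.join(base, 'index'));
}

// Build a small fixture tree to exercise each branch.
const root = fs.mkdtempSync(path.join(os.tmpdir(), 'locate-demo-'));
fs.writeFileSync(path.join(root, 'plain.js'), '');
fs.mkdirSync(path.join(root, 'pkgA', 'lib'), { recursive: true });
fs.writeFileSync(path.join(root, 'pkgA', 'package.json'), '{"main": "lib/entry.js"}');
fs.writeFileSync(path.join(root, 'pkgA', 'lib', 'entry.js'), '');
fs.mkdirSync(path.join(root, 'pkgB'));
fs.writeFileSync(path.join(root, 'pkgB', 'index.json'), '{}');

const plainHit = locate(path.join(root, 'plain')); // found by extension
const pkgAHit = locate(path.join(root, 'pkgA'));   // found through "main"
const pkgBHit = locate(path.join(root, 'pkgB'));   // found through index.json
const miss = locate(path.join(root, 'missing'));   // null
console.log(plainHit, pkgAHit, pkgBHit, miss);
```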

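The cache-first optimization means that a second import skips path analysis, file location, and compilation and execution altogether, and core modules do not need file location either. This greatly improves the efficiency of loading a module again.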

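2.3 Compilation and execution

Every module in Node is an object. Once the file has been located, Node creates a new module object, then loads and compiles the file according to its path. The loading method depends on the file extension:

.js files: read synchronously through the fs module, then compiled and executed
.json files: read synchronously through the fs module, then parsed with JSON.parse(), and the result is returned
.node files: extension files written in C/C++ and already compiled; the compiled file is loaded through process.dlopen()
Other extensions: all loaded as .js files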
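For the .json case, the loader's behaviour can be compared with doing the same work by hand. A minimal sketch, assuming a throwaway config.json whose contents are invented for the example:

```javascript
const fs = require('fs');
const os = require('os');
const path = require('path');

const demoDir = fs.mkdtempSync(path.join(os.tmpdir(), 'ext-demo-'));
const jsonFile = path.join(demoDir, 'config.json');
fs.writeFileSync(jsonFile, '{"name": "demo", "port": 8080}');

// What the .json loader amounts to, written by hand.
const manual = JSON.parse(fs.readFileSync(jsonFile, 'utf8'));
// What require() returns for the same file.
const viaRequire = require(jsonFile);

console.log(viaRequire.port); // 8080
console.log(JSON.stringify(manual) === JSON.stringify(viaRequire)); // true
```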

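After loading succeeds, Node calls the compilation method for that file type, executes the file, and returns the result to the caller. Compiling a .json file is the simplest: the object obtained from JSON.parse() is assigned directly to the module object's exports. A .node file is already compiled from C/C++, so Node simply calls process.dlopen() to load and execute it. The rest of this section focuses on how .js files are compiled:

The CommonJS module specification has three variables, module, exports and require, and the Node API documentation adds two more for every module, __filename and __dirname. None of these variables is defined in the module file itself, so where do they come from?

In fact, during compilation Node wraps every JS file. For example, a JS file is wrapped like this:

(function (exports, require, module, __filename, __dirname) {
  var math = require('math')
  exports.add = function () {
    // ...
  }
})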

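First, this gives every module file its own isolated scope. The wrapped code is run through the runInThisContext() method of the native vm module (similar to eval), which returns a concrete function object. Finally, the current module object's exports property, the require() method, the module object itself (module), the full file path obtained during file location (__filename) and the file's directory (__dirname) are passed as arguments to this function, which is then executed. Any method or property on the module's exports can be called from outside; everything else cannot.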
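The wrapping step can be imitated with the public vm API. The function compileSketch below is a hypothetical stand-in for what Node does internally: it reuses the host require instead of creating a per-module one, and the path /virtual/math.js is only a label, not a real file.

```javascript
const vm = require('vm');
const path = require('path');

// Hypothetical helper: wrap the source, turn it into a function, then call
// it with the five module-level arguments.
function compileSketch(source, filename) {
  const wrapped =
    '(function (exports, require, module, __filename, __dirname) {' +
    source +
    '\n})';
  const fn = vm.runInThisContext(wrapped, { filename });
  const module = { exports: {} };
  fn.call(module.exports, module.exports, require, module, filename, path.dirname(filename));
  return module.exports;
}

const math = compileSketch(
  'var secret = 42; exports.add = function (a, b) { return a + b; };',
  '/virtual/math.js'
);

console.log(math.add(2, 3)); // 5
console.log(typeof secret);  // 'undefined': the variable stayed inside the wrapper
```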

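That completes the walkthrough of module, exports and require.

I used to wonder why module.exports is still needed when every module can already use exports.

The reason is that exports is passed in as a function parameter during compilation. Assigning directly to the exports parameter only changes what the parameter refers to; it cannot change the value outside the function's scope. For example:

let change = function (exports) {
  exports = 100
  console.log(exports)
}
var exports = 2
change(exports) // 100
console.log(exports) // 2

Assigning to module.exports instead sets a property on the module object itself, so it does not depend on the parameter's reference, and the new value is what require() returns.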
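The difference shows up as soon as two real modules are loaded. In this sketch reassign.js and replace.js are throwaway files created only for the comparison:

```javascript
const fs = require('fs');
const os = require('os');
const path = require('path');

const demoDir = fs.mkdtempSync(path.join(os.tmpdir(), 'exports-demo-'));

// Rebinding the `exports` parameter: invisible to the caller.
fs.writeFileSync(
  path.join(demoDir, 'reassign.js'),
  'exports = function hello() {};\n'
);
// Replacing the property on the module object: this is what require() returns.
fs.writeFileSync(
  path.join(demoDir, 'replace.js'),
  'module.exports = function hello() {};\n'
);

const reassigned = require(path.join(demoDir, 'reassign.js'));
const replaced = require(path.join(demoDir, 'replace.js'));

console.log(typeof reassigned); // 'object' (still the original empty exports)
console.log(typeof replaced);   // 'function'
```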

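A successfully compiled module is cached on the Module._cache object, indexed by its file path. Path analysis checks the cache first, which improves performance for a second import.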

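3. Node core modules

In summary, Node modules are divided into core modules provided by Node and file modules written by users. File modules are loaded dynamically at run time and go through the full path analysis, file location, and compilation and execution process described above. Core modules are stored in binary form when the Node source is compiled into an executable and are loaded directly into memory, so they need no file location or compilation.

Core modules come in two parts, those written in C/C++ and those written in JavaScript. Before any of the C/C++ files are compiled, the build program has to convert all JavaScript core modules into C/C++ code. A core module that compiles successfully is placed on the NativeModule._cache object, which is clearly a different cache location from Module._cache for file modules.

Among the core modules, some are built-in modules written purely in C/C++; they mainly provide APIs to the JavaScript core modules and usually cannot be called directly by users. Others have their core implemented in C/C++ while JavaScript wraps it and exports it, such as buffer, fs and os.

So there is a dependency hierarchy among Node's module types: built-in modules (C/C++) -> core modules (JavaScript) -> file modules.

require() is very convenient to use, but the journey from JavaScript down to C/C++ is quite complex. In summary, it involves the definition of built-in modules at the C/C++ level, the definition and import of (JavaScript) core modules, and the import of (JavaScript) file modules.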

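4. Front-end module specifications

Comparing front-end and back-end JavaScript: browser-side JavaScript is distributed from one server to many clients for execution and its code is loaded over the network, so the bottleneck is bandwidth. Server-side JavaScript runs the same code many times and loads it from disk, so the bottleneck is CPU and memory. The responsibilities of JavaScript at the two ends of HTTP are therefore completely different.

Module loading in Node is almost entirely synchronous. If front-end modules were loaded synchronously, scripts would take far too long to load, so the specification CommonJS defines for back-end JavaScript does not suit the front end. AMD and CMD later emerged for front-end use.

4.1 AMD specification

AMD stands for Asynchronous Module Definition. A module is defined as: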

define(id?, dependencies?, factory);

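An AMD module must be defined explicitly with define. The module id and the dependencies are optional, and factory contains the actual code. For example, to declare some dependencies for a module:

define(['dep1', 'dep2'], function () {
  // module code
});

require.js implements the AMD specification; see the require.js documentation if you are interested.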
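To make the optional arguments of define concrete, here is a toy version that only mimics the signature. It is not require.js and is not how a real AMD loader works: real loaders fetch dependencies asynchronously, whereas this sketch is synchronous and assumes every dependency was registered earlier. The registry Map and the module names are invented for the example.

```javascript
// Toy registry of already-defined modules (an assumption of this sketch).
const registry = new Map();

function define(id, dependencies, factory) {
  // Normalise the optional leading arguments: define(id?, dependencies?, factory).
  if (typeof id !== 'string') {
    factory = dependencies;
    dependencies = id;
    id = undefined;
  }
  if (!Array.isArray(dependencies)) {
    factory = dependencies;
    dependencies = [];
  }
  // Look up each dependency and pass the results to the factory in order.
  const args = dependencies.map((dep) => registry.get(dep));
  const exported = typeof factory === 'function' ? factory(...args) : factory;
  if (id !== undefined) registry.set(id, exported);
  return exported;
}

define('dep1', function () { return { name: 'one' }; });
define('dep2', [], function () { return { name: 'two' }; });

const result = define(['dep1', 'dep2'], function (dep1, dep2) {
  return dep1.name + '+' + dep2.name;
});
console.log(result); // 'one+two'
```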

4.2 CMD specification

A CMD module definition is even simpler:

define(factory);

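Like a Node module, the defined module is implicitly wrapped, and dependencies can be pulled in dynamically, for example:

define(function (require, exports, module) {
  // module code
});

require, exports and module are passed to the module as parameters, and a dependency is brought in by calling require() directly where it is needed.

sea.js implements the CMD specification; see the sea.js documentation if you are interested.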


Statement:
This article is reproduced from segmentfault.com.