A brief discussion on Webpack persistent caching practice

This article walks through the practice of persistent caching with webpack and is shared here as a reference.

Preface

I have recently been looking at how to do persistent caching with webpack and found that it still has a few pitfalls. Since I have some time, I am sorting them out and summarizing them here. After reading this article you should have a rough understanding of:

What persistent caching is and why we need it.
How webpack does persistent caching.
Some points to note when caching with webpack.

Persistent caching

First, what is persistent caching? Now that applications with separated front ends and back ends are the norm, front-end html, css, and js usually live on the server as static resource files, and dynamic content is rendered from data fetched through interfaces. That raises the question of how a company deploys its front-end code, and in particular how it deploys updates: should the page be deployed first, or the resources?

Deploy the page first, then the resources: if a user visits during the interval between the two deployments, the new page structure loads the old resources and caches that old version as if it were the new one. The user sees a page with broken styles, and unless they refresh manually, the page stays broken until the resource cache expires.

Deploy the resources first, then the page: during the interval, a user who has the old resources cached locally still gets the old page, whose resource references have not changed, so the browser uses its local cache and everything works. A user with no local cache, or an expired one, gets the old page loading the new resources, which causes page execution errors.

So we need a deployment strategy to ensure that when we update our online code, online users can transition smoothly and open our website correctly.

It is recommended to read this answer first: How to develop and deploy front-end code in a large company?

After reading that answer you will see that the more mature persistent caching solution today is to append a hash value to the name of each static resource. Because the hash value differs every time the file content changes, files can be published incrementally: a new release never overwrites the previous files, so online users are not broken by it.

As long as the names of the static resources (css, js, img) published in each release are unique, we can do the following:

For html files: do not enable caching. Keep the html on your own server with caching turned off; that server only serves html files and data interfaces.
For static js, css, images, and similar files: enable the CDN and caching by uploading static resources to a CDN provider. Long-term caching is safe because every resource has a unique path, so no resource is ever overwritten, which keeps access stable for online users.
For every release: upload the static resources (js, css, img) to the CDN first, then upload the html file. Existing users can keep using the site normally and new users see the new page.
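The incremental publishing idea can be sketched in a few lines of Node.js. This is only an illustration under stated assumptions: hashedName and release are hypothetical helpers, a Map stands in for the CDN, and sha256 is used as a stand-in for whatever hash function the build tool really applies.

```javascript
const crypto = require('crypto');

// Hypothetical helper: derive "name.<8 hex chars>.ext" from the file content.
function hashedName(name, ext, content) {
  const digest = crypto.createHash('sha256').update(content).digest('hex');
  return `${name}.${digest.slice(0, 8)}.${ext}`;
}

// A Map stands in for the CDN: a release only ever adds files.
function release(cdn, assets) {
  const manifest = {};
  for (const { name, ext, content } of assets) {
    const file = hashedName(name, ext, content);
    cdn.set(file, content); // different content gives a different name, so nothing is overwritten
    manifest[name] = file;  // the html of this release would reference this name
  }
  return manifest;
}

const cdn = new Map();
const firstRelease = release(cdn, [{ name: 'app', ext: 'js', content: 'console.log(1)' }]);
const secondRelease = release(cdn, [{ name: 'app', ext: 'js', content: 'console.log(2)' }]);

// Both versions are still on the "CDN", so an old html keeps working.
console.log(cdn.has(firstRelease.app), cdn.has(secondRelease.app));
```

Because the old file is still present after the second release, a user holding the old html can keep loading the assets it references until the new html reaches them.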

The above briefly introduces the mainstream front-end persistent caching solution, so why do we need to do persistent caching?

When a user first visits our site in a browser, the page pulls in a variety of static resources. With persistent caching in place, we can set the Cache-Control or Expires field in the http response header, and the browser will cache those resources locally.

If the user requests the same static resource on a later visit and it has not expired, the browser reads it straight from the local cache instead of requesting it over the network.
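As a rough sketch of that split between html and hashed assets, the function below picks a Cache-Control value from the request path. The function name, the file-name pattern, and the one-year max-age are assumptions made for illustration, not values taken from the article.

```javascript
// Assumption: file names that carry an 8-character hex hash never change content.
const HASHED_ASSET = /\.[0-9a-f]{8}\.(js|css|png|jpg|svg)$/;

// Hypothetical helper: choose response headers for a request path.
function cacheHeadersFor(urlPath) {
  if (HASHED_ASSET.test(urlPath)) {
    // Safe to cache for a long time: a new version gets a new file name.
    return { 'Cache-Control': 'public, max-age=31536000, immutable' };
  }
  // html and anything unhashed: make the browser revalidate on every request.
  return { 'Cache-Control': 'no-cache' };
}

console.log(cacheHeadersFor('/js/pageA.d178426d.js'));
console.log(cacheHeadersFor('/index.html'));
```

In a real deployment these headers would usually be set in the CDN or web server configuration rather than in application code.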

How webpack does persistent caching

With persistent caching introduced, we come to the main topic: how should we do persistent caching in webpack? We need to do two things:

Ensure the uniqueness of the hash value: generate a unique hash value for each packaged resource, so that whenever the packaged content differs, the hash value differs.
Ensure the stability of the hash value: when a module is modified, only the hash values of the affected packaged files change, and the hash values of packaged files unrelated to that module remain constant.

Hashed file names are the first step toward persistent caching. webpack currently offers two ways to calculate the hash: [hash] and [chunkhash].

hash is a single value that webpack generates for each compilation. It is recalculated whenever any file in the project changes.
chunkhash is calculated per chunk from the modules that chunk contains, so a change to one file only affects the hash of the chunk it belongs to and not the others.

So if you simply package everything into one file, hash is enough. If your project splits its bundles, loads modules on demand, and so on, you need chunkhash so that only the relevant files get new hash values after each update.
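The difference between the two can be modelled with a small Node.js sketch. It is a simplification under stated assumptions: buildHash and chunkHashes are hypothetical functions, chunks are plain strings, and sha256 stands in for webpack's own hashing, which also takes other inputs into account.

```javascript
const crypto = require('crypto');

const digest = (text) =>
  crypto.createHash('sha256').update(text).digest('hex').slice(0, 8);

// [hash]-like: one value derived from every chunk in the build.
function buildHash(chunks) {
  return digest(Object.keys(chunks).sort().map((k) => k + chunks[k]).join('\n'));
}

// [chunkhash]-like: one value per chunk, derived from that chunk only.
function chunkHashes(chunks) {
  const out = {};
  for (const name of Object.keys(chunks)) {
    out[name] = digest(chunks[name]);
  }
  return out;
}

const beforeEdit = { pageA: 'a-v1', pageB: 'b-v1' };
const afterEdit = { pageA: 'a-v2', pageB: 'b-v1' }; // only pageA was edited

// The build-wide value changes; pageB's per-chunk value does not.
console.log(buildHash(beforeEdit) !== buildHash(afterEdit));
console.log(chunkHashes(beforeEdit).pageB === chunkHashes(afterEdit).pageB);
```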

So our webpack configuration with persistent cache should look like this:

    module.exports = {
      entry: __dirname + '/src/index.js',
      output: {
        path: __dirname + '/dist',
        filename: '[name].[chunkhash:8].js',
      }
    }

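The code above uses index.js as the entry point and packages all of the code into a single file in the dist directory. Because the entry is given as a plain string, the chunk takes webpack's default name main, so the file is named main.xxxxxxxx.js, where xxxxxxxx is the first 8 characters of the chunkhash. A changed chunk now gets a new file name on every build.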

If you are dealing with simple scenarios, this is enough, but in large multi-page applications, we often need to optimize the performance of the page:

Separate business code from third-party code: business code changes frequently while third-party code (libraries, frameworks) is updated slowly, so splitting the third-party code out lets the browser cache it across releases.
On-demand loading: with React-Router, for example, the component for a route is loaded only when the user visits that route, so the user does not have to download every route component up front.
Extract common modules: in multi-page applications we can usually extract shared modules such as the header and footer, so that when the user navigates between pages these modules are already in the cache and load without a network request.

Splitting bundles and loading modules on demand relies on webpack's built-in plug-in CommonsChunkPlugin. Below I use an example to explain how to configure webpack.

The code of this article is placed on my Github. If you are interested, you can download it and take a look:

    git clone https://github.com/happylindz/blog.git
    cd blog/code/multiple-page-webpack-demo
    npm install

Before reading on, I strongly recommend my previous article: In-depth understanding of the webpack file packaging mechanism. Understanding how webpack packages files will help you implement persistent caching.

The example consists of two pages, pageA and pageB:

    // src/pageA.js
    import componentA from './common/componentA';

    // Uses the third-party library jquery; extract it so the business bundle does not grow too large
    import $ from 'jquery';

    // Load css files; part is shared styles and part is page-specific, so extract them
    import './css/common.css'
    import './css/pageA.css';

    console.log(componentA);
    console.log($.trim(' do something '));

    // src/pageB.js
    // Pages A and B both use the common module componentA; extract it to avoid loading it twice
    import componentA from './common/componentA';
    import componentB from './common/componentB';
    import './css/common.css'
    import './css/pageB.css';

    console.log(componentA);
    console.log(componentB);

    // Uses the asynchronously loaded module asyncComponent; extract it to speed up the first screen
    document.getElementById('xxxxx').addEventListener('click', () => {
      import( /* webpackChunkName: "async" */ './common/asyncComponent.js').then((mod) => {
        // import() resolves to the module namespace object; the exported value is on .default
        console.log(mod.default);
      })
    })

    // The common modules basically look like this
    export default "component X";

The pages above cover the three ways we split modules: extracting third-party libraries, loading on demand, and extracting common modules. The next step is to configure webpack:

    const path = require('path');
    const webpack = require('webpack');
    const ExtractTextPlugin = require('extract-text-webpack-plugin');

    module.exports = {
      entry: {
        pageA: [path.resolve(__dirname, './src/pageA.js')],
        pageB: path.resolve(__dirname, './src/pageB.js'),
      },
      output: {
        path: path.resolve(__dirname, './dist'),
        filename: 'js/[name].[chunkhash:8].js',
        chunkFilename: 'js/[name].[chunkhash:8].js'
      },
      module: {
        rules: [
          {
            // Regular expression matching the CSS files this loader should transform
            test: /\.css$/,
            use: ExtractTextPlugin.extract({
              fallback: "style-loader",
              use: ["css-loader"]
            })
          }
        ]
      },
      plugins: [
        new webpack.optimize.CommonsChunkPlugin({
          name: 'common',
          minChunks: 2,
        }),
        new webpack.optimize.CommonsChunkPlugin({
          name: 'vendor',
          minChunks: ({ resource }) => (
            resource &&
            resource.indexOf('node_modules') >= 0 &&
            resource.match(/\.js$/)
          )
        }),
        new ExtractTextPlugin({
          filename: `css/[name].[chunkhash:8].css`,
        }),
      ]
    }

The first CommonsChunkPlugin extracts common modules. It is like telling webpack: if you see a module loaded by two or more chunks, please move it into the common chunk. minChunks is 2 here, which is the finest granularity; choose how many chunks must use a module before it is extracted according to your own situation.

The second CommonsChunkPlugin extracts third-party code by checking whether a resource comes from node_modules. It is like telling webpack: if you see a module that comes from the node_modules directory and whose name ends in .js, please move it into the vendor chunk, and create the vendor chunk if it does not exist yet.

What is the benefit of this configuration? As the business grows we are likely to depend on more and more third-party libraries. If we instead configured a dedicated entry that lists the third-party code, our webpack.config.js would become:
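To make the vendor rule easier to reason about, the same check can be pulled out into a named function and tried on a few paths. isVendorModule is a hypothetical name, and the sample paths are made up for illustration; the logic mirrors the minChunks callback in the configuration.

```javascript
// Mirrors the minChunks callback: a module belongs in the vendor chunk when it
// has a resource path inside node_modules that ends in .js.
function isVendorModule(module) {
  const { resource } = module;
  return Boolean(
    resource &&
    resource.indexOf('node_modules') >= 0 &&
    /\.js$/.test(resource)
  );
}

console.log(isVendorModule({ resource: '/project/node_modules/jquery/dist/jquery.js' }));
console.log(isVendorModule({ resource: '/project/src/pageA.js' }));
console.log(isVendorModule({ resource: '/project/node_modules/some-lib/style.css' }));
```

Modules without a resource path fail the check as well, which is why the callback tests resource first.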

    // Hard to extend
    module.exports = {
      entry: {
        app: './src/main.js',
        vendor: [
          'vue',
          'axios',
          'vue-router',
          'vuex',
          // more
        ],
      },
    }

The third plug-in, ExtractTextPlugin, extracts the css from the packaged js files into independent css files. Imagine that you only change a style and leave the page logic untouched: you certainly do not want the hash value of your js file to change. You want css and js kept separate so that they do not affect each other.

After running webpack, you can see the effect of packaging:

    ├── css
    │   ├── common.2beb7387.css
    │   ├── pageA.d178426d.css
    │   └── pageB.33931188.css
    └── js
        ├── async.03f28faf.js
        ├── common.2beb7387.js
        ├── pageA.d178426d.js
        ├── pageB.33931188.js
        └── vendor.22a1d956.js

As you can see, css and js have been separated and the modules have been split into chunks with unique names. Whenever the code is updated, a different hash value is generated.

With uniqueness taken care of, we need to ensure the stability of the hash values. You certainly do not want a change to one part of the code (a module, some css) to change the hash values of every file, which would clearly be unwise. So how do we keep hash changes to a minimum?

In other words, we need to find the factors that invalidate the cache during webpack compilation and find ways to remove or reduce them.

The change in the chunkhash value is mainly caused by the following four parts:

The source code of the modules the chunk contains
The runtime code webpack uses to bootstrap at startup
The module ids generated by webpack (a module's own id and the ids of the dependencies it references)
The chunk id

If any one of these four parts changes, the generated chunk file is different and the cache is invalidated. The four parts are covered one by one below:

1. Source code changes:

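Not much needs saying here: the cache has to be refreshed, otherwise things break.

2. The runtime code webpack runs at startup:

If you have read my previous article, In-depth understanding of the webpack file packaging mechanism, you will know that webpack has to execute some bootstrap code when it starts.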

    (function(modules) {
      window["webpackJsonp"] = function webpackJsonpCallback(chunkIds, moreModules) {
        // ...
      };
      function __webpack_require__(moduleId) {
        // ...
      }
      __webpack_require__.e = function requireEnsure(chunkId, callback) {
        // ...
        script.src = __webpack_require__.p + "" + chunkId + "." + ({"0":"pageA","1":"pageB","3":"vendor"}[chunkId]||chunkId) + "." + {"0":"e72ce7d4","1":"69f6bbe3","2":"9adbbaa0","3":"53fa02a7"}[chunkId] + ".js";
      };
    })([]);

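It looks roughly like the above: webpack's bootstrap code, a set of functions that tell the browser how to load the modules webpack defines.

One line in it changes on every update, because the bootstrap code has to know exactly which chunkhash belongs to which chunk id so that it can build the correct path of an async js file when loading it.

So which file does this code end up in? With the configuration above, the common chunk is the last one generated, so the runtime code is embedded directly in it. As a result, every time we update business code (pageA, pageB, or a module), the chunkhash of the common chunk changes. That is clearly not what we intended: the common chunk is only meant to hold common modules (here, componentA), and if componentA has not been touched, there is no reason for its chunkhash to change.

So we need to extract this runtime code into a separate file.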

    module.exports = {
      // ...
      plugins: [
        // ...
        // Place this after the other CommonsChunkPlugin instances
        new webpack.optimize.CommonsChunkPlugin({
          name: 'runtime',
          minChunks: Infinity,
        }),
      ]
    }

This tells webpack to pull the runtime code out into its own file.

    ├── css
    │   ├── common.4cc08e4d.css
    │   ├── pageA.d178426d.css
    │   └── pageB.33931188.css
    └── js
        ├── async.03f28faf.js
        ├── common.4cc08e4d.js
        ├── pageA.d178426d.js
        ├── pageB.33931188.js
        ├── runtime.8c79fdcd.js
        └── vendor.cef44292.js

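An extra runtime.xxxx.js is generated. From now on, changing business code no longer changes the hash value of the common chunk; the hash value of the runtime chunk changes instead. Since this code changes on every release anyway, you can inline it into the html with chunk-manifest-webpack-plugin and save one network request.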
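Why the runtime text changes whenever any chunk changes can be seen from a toy model. runtimeSource below is a hypothetical function that generates a tiny lookup, loosely shaped like the script.src line in the bootstrap code; the real runtime is larger and its exact output depends on the webpack version.

```javascript
// Hypothetical model: the runtime embeds a table from chunk id to chunk hash,
// so its own source text depends on the hash of every chunk.
function runtimeSource(chunkNames, chunkHashes) {
  return [
    'function chunkUrl(chunkId) {',
    `  return "js/" + (${JSON.stringify(chunkNames)}[chunkId] || chunkId)`,
    `    + "." + ${JSON.stringify(chunkHashes)}[chunkId] + ".js";`,
    '}',
  ].join('\n');
}

const chunkNames = { 0: 'pageA', 1: 'pageB' };
const hashesV1 = { 0: 'e72ce7d4', 1: '69f6bbe3' };
const hashesV2 = { 0: '0a1b2c3d', 1: '69f6bbe3' }; // only pageA was rebuilt

// The generated source differs even though pageB and the lookup logic did not change.
console.log(runtimeSource(chunkNames, hashesV1) !== runtimeSource(chunkNames, hashesV2));
```

Keeping this table in its own small file (or inline in the html) means the large common and vendor files keep their names when only a page chunk changes.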

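3. The module ids generated by webpack

webpack 2 loads the OccurrenceOrderPlugin by default. It sorts modules by how often they are referenced, so the more often a module is referenced, the smaller its module id. This is still unstable: as your code base grows, frequently referenced modules get smaller ids that are less likely to change, but the outcome is never certain.

By default, a module's id is its index in the module array. OccurrenceOrderPlugin puts frequently referenced modules first so that the order is the same on every compilation, but if you add or remove modules while changing code, the ids of all modules may be affected.

The best practice is the HashedModuleIdsPlugin, which generates a string only four characters long from the module's relative path and uses it as the module id. This both hides the module's path and keeps the module id short.

The only thing that can then change a module id is a change of file path. As long as the path stays the same, the four-character string stays the same and so does the hash value. Adding or removing business modules has no effect on the module ids of the others.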

    module.exports = {
      plugins: [
        new webpack.HashedModuleIdsPlugin(), // put it first
        // ...
      ]
    }
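The contrast between index-based and path-based ids can be sketched as follows. indexIds and hashedIds are hypothetical helpers, and the choice of sha256, base64, and a four-character prefix is an assumption for this sketch rather than a description of the plug-in's internals.

```javascript
const crypto = require('crypto');

// Index-based ids: a module's id is its position in the module list.
function indexIds(paths) {
  const ids = {};
  paths.forEach((p, i) => { ids[p] = i; });
  return ids;
}

// Path-based ids: hash the relative path and keep a short prefix.
function hashedIds(paths) {
  const ids = {};
  for (const p of paths) {
    ids[p] = crypto.createHash('sha256').update(p).digest('base64').slice(0, 4);
  }
  return ids;
}

const modulesBefore = ['./src/a.js', './src/c.js'];
const modulesAfter = ['./src/a.js', './src/b.js', './src/c.js']; // b.js was added

// With index ids, c.js moves from 1 to 2; with path ids it keeps the same id.
console.log(indexIds(modulesBefore)['./src/c.js'], indexIds(modulesAfter)['./src/c.js']);
console.log(hashedIds(modulesBefore)['./src/c.js'] === hashedIds(modulesAfter)['./src/c.js']);
```

Since a chunk's content includes the ids of the modules it references, a shifted index is enough to change the chunkhash of files that were not edited at all.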

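4. chunkID

In practice the number and order of chunks is mostly fixed across compilations and rarely changes.

This article only covers fairly basic module splitting and leaves out some other cases. For example, common modules used inside asynchronously loaded components can be extracted again into an async common chunk. If you want to go deeper, see this article: Webpack 大法之 Code Splitting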

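Points to note when caching with webpack

The CSS file hash problem
It is not recommended to use the DllPlugin plug-in for online publishing

The CSS file hash problem:

ExtractTextPlugin has a fairly serious problem: the [chunkhash] it uses in file names is taken directly from the js chunk that references the css. In other words, if I only change the css and leave the js untouched, the generated css file name still does not change.

So we need to change chunkhash to contenthash in ExtractTextPlugin. As the name suggests, contenthash is the hash of the extracted text content, that is, of the style file alone. The compiled js and css files then have independent hash values.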

    module.exports = {
      plugins: [
        // ...
        new ExtractTextPlugin({
          filename: `css/[name].[contenthash:8].css`,
        }),
      ]
    }

If you are using webpack 2 or webpack 3, congratulations, this is enough: changes to js files and css files no longer affect each other's hash values. If you are using webpack 1, there is still a problem.

Specifically, webpack 1 and webpack 2 compute chunkhash differently:

webpack 1 was not designed with plug-ins like ExtractTextPlugin, which pull module content out of a chunk, in mind. It computes chunkhash from the module content before packaging, so the css content is included in the calculation and only afterwards extracted into a separate file.

The result is that if you change only a css file and not the js file that references it, the hash value of the emitted js file changes as well.

webpack 2 improved on this. It calculates the hash value from the packaged file content, which is after ExtractTextPlugin has extracted the css code, so the problem above does not occur. If you are unfortunately still on webpack 1, it is recommended to use the md5-hash-webpack-plugin plug-in to change webpack's hash calculation strategy.
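The two behaviours described here can be compared in a small model. The function names are hypothetical, a chunk is reduced to a js string and a css string, and sha256 is a stand-in hash; the sketch follows the article's description and is not taken from webpack's source.

```javascript
const crypto = require('crypto');

const digest8 = (text) =>
  crypto.createHash('sha256').update(text).digest('hex').slice(0, 8);

// webpack 1 as described: the js hash is computed before the css is extracted,
// so the css content is part of the input.
function namesHashedBeforeExtraction(chunk) {
  return {
    js: `pageA.${digest8(chunk.js + chunk.css)}.js`,
    css: `pageA.${digest8(chunk.css)}.css`, // contenthash of the css alone
  };
}

// webpack 2 as described: the js hash is computed after extraction.
function namesHashedAfterExtraction(chunk) {
  return {
    js: `pageA.${digest8(chunk.js)}.js`,
    css: `pageA.${digest8(chunk.css)}.css`, // contenthash of the css alone
  };
}

const original = { js: 'console.log("A")', css: 'body { color: red }' };
const cssEdited = { js: 'console.log("A")', css: 'body { color: blue }' };

// Only the css changed: the js name moves in the first model and stays put in the second.
console.log(namesHashedBeforeExtraction(original).js !== namesHashedBeforeExtraction(cssEdited).js);
console.log(namesHashedAfterExtraction(original).js === namesHashedAfterExtraction(cssEdited).js);
```

In both models the css name follows the css content, which is the effect of switching the ExtractTextPlugin file name to contenthash.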

It is not recommended to use the DllPlugin plug-in for online publishing

Why? Because a friend recently asked me why their team lead does not allow DllPlugin to be used in production.

DllPlugin itself has several shortcomings:

It needs an additional webpack configuration, which adds work.
If one page uses a large third-party library that the other pages do not need at all, packaging it into dll.js is not worthwhile: every page has to load that unused code each time it is opened, and webpack 2's Code Splitting cannot be used for it.
The dll file has to be downloaded on the first visit. Because many libraries are bundled together, the dll file is very large and the first page load is slow.

You can package everything into a dll file and let the browser read it from the cache so that it is not requested again. But if, for example, you use only one lodash function, the dll still brings in all of lodash, so you load a lot of useless code, which hurts first-screen rendering time.

I think the correct approach is:

Cohesive libraries such as React and Vue can go into a vendor bundle for caching: a site's core technology stack is usually fixed and shared across the whole site, so generating a vendor bundle for it makes sense.
Utility and component libraries such as antd and lodash can have their unused code removed by tree shaking so that only the code actually used is kept. Never put them straight into the vendor bundle, otherwise you will ship and execute a lot of useless code.

Conclusion

Well, I feel I have rambled on again. I have learned a lot from reading about webpack recently, and I hope you gain something from this article too. I also recommend once more my earlier article, which will help you understand the file caching mechanism better: In-depth understanding of the webpack file packaging mechanism
