CSI is a browser-side solution for integrating dynamic and static content. After my previous article was published, a friend asked me whether CSI simply means loading data through ajax. My answer at the time was that this understanding is a bit one-sided. So what exactly is CSI? To define it properly, we need to look at it from the perspective of how dynamic and static resources are integrated.
After a page has been separated into dynamic and static parts, CSI splits page loading into two steps: first the static resources are loaded, and once they are in place the dynamic resources are loaded. This definition is still incomplete, though, because it does not stress the purpose of separating dynamic and static resources in the first place: we split them so that the static resources can be cached effectively. Those static resources may sit on a static web container, on a CDN, or in the browser itself; however they are cached, the goal is to make them load faster. If we do not actually make static loading more efficient, then even a page designed in the CSI style gains none of CSI's advantages and instead inadvertently picks up its shortcomings. So what are the disadvantages of CSI? They are as follows:
Disadvantage 1 of CSI: CSI is not good for the page's SEO (search engine optimization). A search engine's web crawler generally fetches a page by its URL, strips out things it does not need such as css styles and js scripts, and then analyzes the remaining text. If part of the page content is loaded asynchronously, that loading is driven by JavaScript, so the asynchronous content never appears in the page the crawler fetches (I have heard that some advanced crawlers can execute the asynchronous operations and capture that content, but even if such technology exists, most mainstream crawlers still ignore JavaScript and asynchronously loaded content). As a result, some of the page's information is missing from what the crawler indexes, which is why CSI is not very SEO-friendly. If we examine this drawback carefully, though, it may not be so serious. We discussed many dynamic/static separation strategies earlier; if the separation is done well, the dynamic resources are essentially content that cannot be cached and changes frequently, and such frequently changing content does not really need to be crawled. Even if it is crawled, the search result will point to a page on which the keyword can no longer be found when the user clicks through; I believe many readers have run into this when using search engines. That said, if developers do not use CSI correctly they may not handle this issue particularly well, so this drawback is still easy to introduce.
Disadvantage 2 of CSI: we go to all this trouble to make the website static in order to make pages load faster, but if all we have done is split page loading into two steps, is it really faster? Not necessarily. Dynamic/static separation is similar to the database read/write splitting I discussed earlier in this series: we split reads from writes to break the coupling between them and relieve the read bottleneck, and we split dynamic from static resources on a web page because the static resources are the part that can easily be optimized. If we separate them but never actually optimize the static resources, we have obviously skipped the step that speeds up page loading, and it is hard to say whether the page loads any faster. Worse, asynchronous loading is driven by JavaScript, and while static resources are loading it is easy for the JavaScript to be blocked; if the blocked script happens to be the part that performs the asynchronous loading, the result will only be slower than before.
This shows why the SSI and ESI technologies I covered earlier are so important if we want to enjoy CSI's advantages in the browser: SSI and ESI make the loading of the separated static resources more efficient, which in turn makes the first step of CSI efficient. Once that first step is handled, we only need to control the impact of script blocking on the page's asynchronous loading, and we can improve the loading efficiency of the whole page. As for SEO, I think the claim that CSI seriously hurts it is a false proposition: if using CSI leads to poor SEO results, it means our CSI design was not done properly.
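One simple way to keep script downloads from blocking the rest of the page (just one example of the idea, assuming the scripts do not rely on document.write; the file names below are placeholders I made up) is to mark them with defer or async:

<!-- defer: download in parallel, execute in document order after parsing finishes -->
<script src="/js/page-init.js" defer></script>
<!-- async: download in parallel, execute as soon as it arrives, order not guaranteed -->
<script src="/js/stats.js" async></script>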
Some people consider the following a third shortcoming of CSI, but I do not think it is a shortcoming so much as a design question, and whether it is good or bad depends on personal habits. The complaint is this: with CSI the page shell loads quickly, but the dynamic parts may briefly show a loading indicator, which makes the page feel less friendly to the user. In practice, this mix of synchronous and asynchronous loading is extremely common: nearly all large portals and e-commerce sites, and countless other websites, combine the two. If they did not, their home pages, which carry a great deal of content and a heavy load of images, would load painfully slowly; that is why they mix synchronous and asynchronous loading and even load static resources such as images and flash asynchronously. Some readers may still be unconvinced: they simply do not want to see a loading indicator while the page loads, yet they also want the benefits CSI brings. How do we resolve that? It is actually easy. If you are willing to use CSI at all, it means you accept asynchronous loading in general; disliking the loading indicator just means the initial synchronous load should not have asynchronous operations mixed into it. Even though ajax is now very popular, there is one load it cannot replace: the synchronous request made when we type the site's URL into the browser's address bar. So to satisfy the requirement above, we only need to make sure that this one synchronous request is served as a purely synchronous page with no asynchronous loading mixed in. That is easy to implement, so I will not go into the details here.
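To make the two-step loading concrete, here is a minimal browser-side sketch of CSI. The placeholder element, the /ajax/recommend URL and the returned HTML fragment are all invented for illustration; this is only one possible way to write it, not the code of any project mentioned here.

<div id="recommend">Loading...</div>
<script>
// Step 1 has already happened: this static shell was loaded, possibly from
// the browser cache or a CDN. Step 2: fetch the frequently changing fragment
// asynchronously and inject it into the placeholder when it arrives.
var xhr = new XMLHttpRequest();
xhr.open('GET', '/ajax/recommend', true);
xhr.onreadystatechange = function () {
  if (xhr.readyState === 4 && xhr.status === 200) {
    document.getElementById('recommend').innerHTML = xhr.responseText;
  }
};
xhr.send();
</script>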
After dynamic/static separation we cache the static resources. The previous articles covered caching static resources on the server side at length; now that CSI has brought us to the browser, it is time to talk about caching in the browser. Page caching is controlled through HTTP's expires and cache-control. Let's first look at the following markup:
<meta http-equiv="pragma" content="no-cache">
<meta http-equiv="cache-control" content="no-cache">
<meta http-equiv="expires" content="0">
These are the meta tags used in both the jsp and vm templates of the java web project I am currently working on; their purpose is to prevent the page from being cached by the browser. However, if we use CSI and the dynamic/static separation is done well, we no longer need to put these in the page header. We can let the browser cache the page for a reasonable length of time, and once the page is cached, subsequent visits to it will load much more efficiently.
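For example, instead of forbidding caching, the server could send response headers along these lines (the ten-minute lifetime and the date are made up for illustration; when both are present, Cache-Control takes precedence over Expires in HTTP/1.1):

Cache-Control: max-age=600
Expires: Thu, 01 Jan 2026 08:00:00 GMT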
There is another issue here. Among Yahoo's recommendations for optimizing websites, in order to take full advantage of the browser's parallel loading, we often place images, external js and css files on separate static web containers or on a CDN, and these files can usually be cached by the browser. How do we configure things so the browser knows to cache them? Take apache as an example: for static resources to be cached by the browser, apache needs the mod_expires module enabled, and then a configuration like the following added to the apache configuration file:
ExpiresActive On
<FilesMatch "\.(gif|jpg|png|js|css)$">
    ExpiresDefault "access plus 10 years"
</FilesMatch>
After that, when the browser accesses static resources on this apache server, the images, js and css files it downloads will be cached by the browser.
When the http response code is 304, the browser reads the resource from its cache. Some readers may wonder why an http request is sent at all for a resource that is already cached. To understand this we need to understand how the cache works. A cache is a temporary copy of something, and a temporary copy needs a validity period. Since we define the caching rules through http, it makes sense that checking whether the cache has expired is also done through http: each time the cached copy is to be used, a request is sent to the server, and the server checks whether the resource has expired. If it has not, the server returns a 304 response; a 304 response carries no http message body, so the data returned is tiny and the request is still very efficient. If the server finds that the resource has changed, it returns the new resource to the browser. This request that checks whether a resource is still fresh has a proper name: a conditional GET request. As for how the server performs the check, this series will cover it in detail when we get to web front-end optimization, so I will not go into it here.

Seeing this, some readers may have another question: why can't the expiration check be done on the browser side? Mainly because the browser's check is unreliable: the user's computer clock may be wrong or out of sync with the server's, and time zones make it even messier, so it is best to let the server decide whether the cache has expired; that way the accuracy of the validity period can be guaranteed. The arrival of html5 has greatly strengthened the browser's caching capabilities, but I have not studied html5 caching in depth, so I will not describe it here; interested readers can look into it themselves.
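A conditional GET exchange looks roughly like this (the path, host, date and ETag value are invented for illustration):

GET /static/js/app.js HTTP/1.1
Host: static.example.com
If-Modified-Since: Mon, 01 Sep 2025 08:00:00 GMT
If-None-Match: "5f2a8c1b"

HTTP/1.1 304 Not Modified
ETag: "5f2a8c1b"

The server compares these validators with the current version of the file; since nothing has changed it answers 304 with no body, and the browser keeps using its cached copy.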
Okay, that wraps up the CSI topic. Having arrived at CSI and the browser, we can move on to another important subject in this series: the separation of the front end and the back end, which will be the topic of my next article. I have written about front-end/back-end separation many times on my blog; this time it will be a summary of everything I have learned from studying it for so long.