Web optimization with ETags. Example with WordPress
This post was initially published in 2014 in 2019 in Web Optimization with ETags. Example with WordPress
I haven't written about optimization in a while. Those of you who know me already know why. However, I can't let so-called WPO peddlers keep me from writing about something I like. So here is a new post.
I'm sure it's happened to you. You arrive at your workplace one day, turn on your computer, open your email and, after taking a look at it, you open a terminal and type: git pull. The terminal's response is immediate: Already up-to-date.
Have you ever thought about what happens behind that git pull? I have. If I had to guess, I would say that git pull transparently sends the server the date of the last change you have. The repository then checks that date against the date of its own last change: if they match, it tells you that you are up to date; if its date is more recent, it sends you the changes made since then.
This procedure, which seemed the most logical to me, is not the real one. The real one is similar, but not identical. Every time a push is done, the repository associates a token (an alphanumeric identifying code, something like ae3d9735f280381d0d97d3cdb26066eb16f765a5) with the last commit. When you do a git pull, the last token you have is compared with the list of tokens the repository has. If your token is an old one, it sends you the changes made since then with their corresponding tokens. If your token is the latest one, it tells you that you are up to date.
At this point, you will tell me: Manuel, but wasn't this post about optimizing websites with WordPress? Certainly, and it still is. Both the first case presented (the date) and the second (the token) are ways in which the HTTP protocol works. Let's see how.
Imagine that your browser sends a request to my server to download the favicon of my website. The response from my server to your browser will include the string (or HTTP header): Last-Modified: Thu, 29 Dec 2016 11:55:29 GMT. With it, my server is telling your browser when the favicon was last modified. So the browser, once the image is downloaded, will save it in its cache with the metadata "Last-Modified" and the value Thu, 29 Dec 2016 11:55:29 GMT.
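If after a few seconds, a few days or a few months you decide to visit my website again, your browser will need the favicon of my site again. However, remember that it also has a copy of the image in its cache. How does it know whether its cached favicon is the latest one or whether it needs to download it again? Simple: by doing a "git pull". That is, the browser requests the favicon from my server again, but this time it states (in the If-Modified-Since header) that it already has a version of the image from a certain date. There are two possible answers from my server. If the favicon has not changed since that date, it replies with a 304 (Not Modified) status and no image, so the browser uses its cached copy. If the favicon has changed, it sends the new image along with its new Last-Modified date.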
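The date-based exchange can be sketched in a few lines of plain PHP. This is only an illustration, not how any particular web server implements it: the function name respond_to_conditional_request and the shape of the returned array are hypothetical, and the timestamp is simply the favicon date used in the example above.

```php
<?php

// Hypothetical helper: decides how to answer a request for a file,
// given the file's modification time and the date the browser sent (if any).
function respond_to_conditional_request(int $file_mtime, ?string $if_modified_since): array
{
    // HTTP dates are always expressed in GMT.
    $last_modified = gmdate('D, d M Y H:i:s', $file_mtime) . ' GMT';

    // strtotime() returns false when the date cannot be parsed.
    $cached_time = $if_modified_since === null ? false : strtotime($if_modified_since);

    if ($cached_time !== false && $file_mtime <= $cached_time) {
        // Nothing changed since the browser's copy: no body is sent.
        return ['status' => 304, 'headers' => ['Last-Modified' => $last_modified], 'send_body' => false];
    }

    // First visit, or the file changed: send the file and its date.
    return ['status' => 200, 'headers' => ['Last-Modified' => $last_modified], 'send_body' => true];
}

// Thu, 29 Dec 2016 11:55:29 GMT, the favicon date from the example.
$mtime = gmmktime(11, 55, 29, 12, 29, 2016);

$first = respond_to_conditional_request($mtime, null);
$again = respond_to_conditional_request($mtime, $first['headers']['Last-Modified']);
$after_change = respond_to_conditional_request($mtime + 60, $first['headers']['Last-Modified']);
```

Here $first stands for the first visit, $again for a later visit with an unchanged favicon, and $after_change for a visit after the file was modified a minute later.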
If you remember, at the beginning of the post I told you that Git works with tokens to determine when changes were made. HTTP, in addition to the last modification date, allows you to work with tokens called ETags (Entity Tags). An ETag is an alphanumeric code (such as 5864f9b1-47e) with no prescribed format (that is, the HTTP standard says almost nothing about what format the token should have). It is the owner of a site who decides what its format will be.
By default, web servers like Apache create the ETag of each file based on its modification date (and sometimes also the file size). This is redundant (the Last-Modified HTTP header is based on the same criteria) and suboptimal (because it adds information to the requests that is of no use). In this case it is advisable to configure your web server so that it does not use ETags for files. For example, to disable file ETags (FileETag) in Apache, add the following directive to your .htaccess: FileETag None
I'm sure you're wondering: if the browser/server dialogue with an ETag is the same as the one we have seen for the last modification date, and using both is redundant and suboptimal, why use ETags at all?
The modification date is sufficient for HTTP requests to files, but for HTTP requests to web pages (HTML) it falls short. A web page depends on many interrelated elements (content, comments, HTML structure, ...) and not on a single file. Therefore, it would be very difficult to find a single last modified date for all those elements. I know this is hard to follow, so let me explain it another way:
Imagine that I use the modification date of this post's text as the modification date of its web page (HTML). On your first visit, your browser would cache this page along with the last modification date of the post. If you came back a minute later, since the post has not changed (and, therefore, neither has its modification date), your browser would go back to using the cached version. If someone wrote a comment and you came back again, you wouldn't see the comment. After all, the text of the post has not changed, so the last modification date has not changed either, and your browser would show you the version from its cache again. The same thing would happen if I changed the HTML and added new CSS. The content of the post has not changed, neither has the date, and your browser would continue to show the cached version.
Suppose that, instead of working with the post's last modification date, we assign the post's web page an ETag with the following format: {post_content_modification_date}_{post_last_comment_date}_{WP_theme_version_number}
When your browser visits the post for the first time, it will cache the web page (HTML) with its associated ETag as metadata. If any of the token's criteria changed (post modification date, last comment date, or current WP theme version), the ETag associated with the web page would be different. So if you visit the post again, my server will see that your browser's ETag is not the latest and will send you the entire web page again, along with the new ETag.
If nothing had changed, the token/ETag would be the same (both in the browser and on the server), so when you visit the page with your browser, my server would send you a 304 letting you know that nothing has changed (in WPO terms, the cached copy is said to be still fresh) and that your browser should therefore use the version from its cache.
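To make the difference with a single date concrete, here is a minimal sketch in plain PHP (no WordPress involved). The function names build_page_etag and status_for, the timestamps and the theme version are all hypothetical, and wrapping the hash in double quotes is an assumption based on the HTTP syntax for entity tags.

```php
<?php

// Hypothetical helper: builds a token from the three criteria described above.
function build_page_etag(int $post_modified, int $last_comment, string $theme_version): string
{
    // The double quotes are part of the tag, as in HTTP's entity-tag syntax.
    return '"' . md5("{$post_modified}_{$last_comment}_{$theme_version}") . '"';
}

// Hypothetical helper: 304 when the browser's token is still current, 200 otherwise.
function status_for(string $etag_from_browser, string $current_etag): int
{
    return $etag_from_browser === $current_etag ? 304 : 200;
}

// Illustrative values only: post edited once, no comments yet, theme 1.0.0.
$etag_on_first_visit = build_page_etag(1483012529, 0, '1.0.0');

// Nothing changed: the server computes the very same token.
$etag_unchanged = build_page_etag(1483012529, 0, '1.0.0');

// Someone left a comment: the post date is the same, but the token differs.
$etag_after_comment = build_page_etag(1483012529, 1483099999, '1.0.0');

$status_unchanged = status_for($etag_on_first_visit, $etag_unchanged);
$status_after_comment = status_for($etag_on_first_visit, $etag_after_comment);
```

With the post's date alone, the second and third cases would be indistinguishable. With the composite token, the new comment is enough to invalidate the cached page.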
I haven't yet mentioned the benefits of ETags. Here are some of them:
Everything we have seen so far is at a high level, so let's now look at a small plugin that uses ETags for WordPress pages/posts.
# etags.php

```php
<?php

namespace trasweb\webperf\ETags;

/*
 * Plugin Name: ETags en posts
 * Plugin URI: https://trasweb.net/webperf/optimizacion-web-con-etags
 * Description: Uses the browser cache for your posts.
 * Version: 0.0.1
 * Author: Manuel Canga / Trasweb
 * Author URI: https://trasweb.net
 * License: GPL
 */

add_action('wp', function () {
    if (is_admin() || ! is_singular()) {
        return;
    }

    $etag_from_navigator = wp_unslash($_SERVER['HTTP_IF_NONE_MATCH'] ?? ''); // WP adds slashes to $_SERVER
    $current_ETag = get_current_ETag();

    if ($etag_from_navigator === $current_ETag) {
        status_header(304); // Not Modified: the browser reuses its cached copy
        exit;
    }

    header('ETag: ' . $current_ETag);
});

function get_current_ETag()
{
    $last_modified_time_of_content = (int)get_post_modified_time('U', true); // last edit, as a GMT timestamp
    $date_of_last_comment = get_date_of_last_comment();
    $theme_version = wp_get_theme()['Version'] ?? '0.0.0';

    return '"' . md5("{$last_modified_time_of_content}_{$date_of_last_comment}_{$theme_version}") . '"'; // HTTP expects the tag in double quotes
}

function get_date_of_last_comment()
{
    $query = [
        'post_id' => get_the_ID() ?: 0,
        'orderby' => ['comment_date_gmt'],
        'status'  => 'approve',
        'order'   => 'DESC',
        'number'  => 1,
    ];

    // Newest approved comment of the current post, or null if there is none.
    $last_comment = get_comments($query)[0] ?? null;

    return $last_comment->comment_date_gmt ?? 0;
}
```
First of all, note that this plugin is for learning purposes only. As with any web optimization technique, such as minifying/combining CSS/JS resources or using server-side caching, a study of the site is required first.
As you can see, it couldn't be simpler. First, the browser's ETag is obtained, if there is one (line 20). Second, the ETag associated with the current post/page is obtained (line 21).
If both are the same, a 304 code is sent to the browser (line 24, this is the case shown in the main image of this post) and execution ends. The browser will receive the 304 code and will know that it has to use the version of the page in its cache.
If the ETags are different (either because the browser is visiting for the first time or because the token has changed), the ETag is sent to the browser and WP is allowed to continue its course (which sends the content of the current post/page).
The ETag is generated in the get_current_ETag function (lines 31 to 38) based on the last time the post/page was modified, the date of the last comment on the post and the version of the current theme. If any of these parameters changes, the token will change, forcing the browser to download the new version of the page.
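The plugin compares the two strings with ===, which is enough when the browser echoes back exactly one tag. HTTP also allows the If-None-Match header to carry several comma-separated tags, a W/ prefix for weak tags, or a lone *. A more tolerant comparison could look roughly like the sketch below. The function names are hypothetical, and the sketch assumes tags that contain no commas themselves, which holds for MD5 hashes.

```php
<?php

// Hypothetical helper: trims whitespace and drops an optional weak prefix (W/).
function normalize_etag(string $tag): string
{
    $tag = trim($tag);

    return strncmp($tag, 'W/', 2) === 0 ? substr($tag, 2) : $tag;
}

// Hypothetical helper: true when any tag sent by the browser matches the current one.
function if_none_match_matches(string $header, string $current_etag): bool
{
    // "*" matches whatever the server currently has.
    if (trim($header) === '*') {
        return true;
    }

    $current = normalize_etag($current_etag);

    // Assumption: tags contain no commas, so a plain split is safe.
    foreach (explode(',', $header) as $candidate) {
        if (normalize_etag($candidate) === $current) {
            return true;
        }
    }

    return false;
}
```

In the plugin, a function like this would take the place of the === comparison, receiving the unslashed header value and the current ETag.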
This is all. I hope you liked it and it helps you make your website faster.