The most common mistake website designers make is to assume that their code is correct because the page displays normally in IE. It is common to see people complain that their site ranks poorly, yet a quick look at the site shows HTML that is full of errors. However much effort goes into optimizing a site built on such code, the results may be wasted.
In fact, IE is very tolerant of faulty HTML. (As a side note, whatever reasons we may have for criticizing Microsoft, its product work is rigorous, and its efforts on accessibility and usability cannot be dismissed.) That a Web page displays normally in IE does not mean its HTML is free of problems. More generally, a page that displays normally in several browsers is not necessarily fully legal and valid HTML. Every browser has to provide basic fault tolerance; otherwise a page could fail to display merely because something was lost or corrupted in network transmission, which still happens often while bandwidth remains tight.
What is legal and valid HTML code?
Simply put, a Web page is composed of HTML (Hypertext Markup Language) elements. Even for dynamic pages such as ASP and PHP, the server renders the ASP or PHP statements into the corresponding HTML elements before they are downloaded to the client; JavaScript and the like run on the client, where they can generate or modify HTML.
Like other languages, HTML has its own syntax rules. Both browsers and search engine spiders analyze the content of a page's code according to these rules. Even skilled people inevitably make some HTML errors when building pages, to say nothing of the redundant, bloated HTML that most WYSIWYG editors produce.
If nothing in a page violates the syntax rules of the HTML standard, its code can be called legal and valid HTML.
The importance of legal and valid HTML code to SEO
Website optimization and promotion only make sense once search engines have indexed our pages, and that requires the search engine's spider to be able to read our Web files. The spider reads pages according to the HTML specification: by analyzing the HTML code it determines the content of the page, and from that it judges the page's relevance to particular keywords.
It should be made clear that a search engine spider differs from a browser in that its fault tolerance is much lower. If the page contains HTML that the spider cannot interpret, it may stop reading the page. It may even stop crawling the rest of our site, and more serious errors may cause it to discard information it has already collected about other pages on the site.
The major search engines are working hard to improve their spiders' fault tolerance so that ordinary HTML errors do not prevent content from being collected. Even so, it still happens that a single missing closing tag causes the content of an entire page to be ignored.
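As a rough illustration of the missing-closing-tag problem, here is a minimal sketch in Python that uses the standard library's html.parser module to report tags that are opened but never closed. The names TagBalanceChecker and find_unclosed_tags are hypothetical helpers invented for this example, and the check is deliberately simplistic: it treats every non-void element as needing a closing tag, which is stricter than the HTML standard (elements such as p and li may legally omit theirs). It is not how any search engine spider works, and it is no substitute for a real validator.

```python
from html.parser import HTMLParser

# Elements that never take a closing tag in HTML.
VOID_ELEMENTS = {"area", "base", "br", "col", "embed", "hr", "img", "input",
                 "link", "meta", "source", "track", "wbr"}


class TagBalanceChecker(HTMLParser):
    """Hypothetical helper: tracks open tags and records mismatches."""

    def __init__(self):
        super().__init__()
        self.open_tags = []   # stack of (tag, line) still waiting for a close
        self.problems = []    # human-readable findings

    def handle_starttag(self, tag, attrs):
        if tag not in VOID_ELEMENTS:
            self.open_tags.append((tag, self.getpos()[0]))

    def handle_endtag(self, tag):
        if tag in VOID_ELEMENTS:
            return  # e.g. <br/> is reported as a start tag plus an end tag
        if tag not in [name for name, _ in self.open_tags]:
            self.problems.append(f"line {self.getpos()[0]}: stray </{tag}>")
            return
        # Pop back to the matching opener; anything above it was never closed.
        while self.open_tags:
            name, line = self.open_tags.pop()
            if name == tag:
                break
            self.problems.append(f"line {line}: <{name}> is never closed")


def find_unclosed_tags(html):
    """Return a list of findings for the given HTML source string."""
    checker = TagBalanceChecker()
    checker.feed(html)
    checker.close()
    # Whatever is still on the stack at the end of the document was left open.
    for name, line in reversed(checker.open_tags):
        checker.problems.append(f"line {line}: <{name}> is never closed")
    return checker.problems
```

For example, find_unclosed_tags("<div><p>Hello <b>world</p></div>") reports that the <b> on line 1 is never closed, even though most browsers would display that fragment without complaint.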
In addition, legal and valid HTML helps ensure that Web pages are interpreted correctly by multiple browsers, avoiding the situation where the same page displays normally in IE but is badly distorted in Mozilla (although this cannot be avoided completely). This is also of great benefit to the usability of the website.
How to verify the legality and validity of HTML code?
Many free services on the Internet can help us check whether a page's code is legal and valid. The best known is the W3C HTML Validator, a free service run by the W3C (World Wide Web Consortium) itself. On its page you only need to enter the URL of the page to be checked, or upload an HTML file from your local machine. It quickly returns the results: whether the code is valid, what the errors are, and how to fix them.
The W3C also offers a validation service for CSS files, the W3C CSS Validation Service, which is separate from the HTML validator.
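Validation can also be scripted rather than done through the Web form. The following is a minimal sketch, assuming the JSON interface of the W3C's Nu Html Checker at validator.w3.org/nu/ (the article itself only describes the Web page, so the endpoint, the User-Agent string, and the helper names build_request, extract_errors, and validate are assumptions of this example). It posts the page source as the request body and keeps only the messages the checker classifies as errors.

```python
import json
import urllib.request

# Assumption: the public W3C Nu Html Checker endpoint with JSON output.
CHECKER_URL = "https://validator.w3.org/nu/?out=json"


def build_request(html):
    """Prepare a POST that sends the page source as the request body."""
    return urllib.request.Request(
        CHECKER_URL,
        data=html.encode("utf-8"),
        headers={
            "Content-Type": "text/html; charset=utf-8",
            "User-Agent": "html-check-sketch/0.1",  # hypothetical client name
        },
        method="POST",
    )


def extract_errors(report):
    """Keep only the messages the checker classifies as errors."""
    return [
        f"line {m.get('lastLine', '?')}: {m['message']}"
        for m in report.get("messages", [])
        if m.get("type") == "error"
    ]


def validate(html, timeout=30):
    """Send the page to the checker and return its error messages."""
    with urllib.request.urlopen(build_request(html), timeout=timeout) as resp:
        return extract_errors(json.load(resp))
```

Calling validate(page_source) needs network access and is subject to whatever usage limits the service applies, so for anything beyond occasional checks it is more considerate to run a local copy of the checker.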
Does a page have to pass W3C validation?
There is no absolute answer to this question.
In theory, legal HTML makes it easier for search engine spiders to collect the content of a site's pages. On the other hand, not every HTML error affects a spider's crawling. In other words, a spider can tolerate a small number of HTML errors. So must a page really pass W3C validation?
Mark Daoust's test even hints (without reaching a definite conclusion) that pages with a small number of HTML errors may have an advantage in Google's rankings. That is highly controversial, of course, but it at least suggests that a few HTML errors do not harm a page's position in the search engine results pages (SERPs).
In my personal view, if you are relatively unfamiliar with HTML there is no need to insist on passing W3C validation 100%. It matters more to put your time and energy where it really counts, such as creating content and building links, as long as you make sure the HTML contains no serious errors. Of course, if you are proficient in HTML, why not put in a little extra effort to get it completely right? So our pages do not necessarily have to pass W3C validation, but we must at least ensure that they display normally in various browsers and that search engine spiders can parse them normally.
Improve the efficiency of HTML code
Earlier we mentioned the redundant, bloated HTML produced by many WYSIWYG editors. This is quite common on many Chinese websites. WYSIWYG editors such as FrontPage and Dreamweaver often generate a lot of unnecessary code, especially when they are used to modify an existing page. When a page's HTML file contains a large amount of redundant code, the file becomes bloated. This not only slows down page loading and hurts the page's efficiency, but can also seriously affect the page's search engine ranking.
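One rough way to gauge how bloated a page is, sketched here in Python, is to compare the amount of readable text with the total size of the HTML source. The helper names TextCollector and text_ratio are hypothetical, and the ratio is only an informal indicator chosen for this illustration; nothing in this article implies that search engines compute it.

```python
from html.parser import HTMLParser


class TextCollector(HTMLParser):
    """Collects visible text, skipping <script> and <style> bodies."""

    def __init__(self):
        super().__init__()
        self.chunks = []
        self._skip_depth = 0  # > 0 while inside <script> or <style>

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in ("script", "style") and self._skip_depth:
            self._skip_depth -= 1

    def handle_data(self, data):
        if not self._skip_depth:
            self.chunks.append(data)


def text_ratio(html):
    """Fraction of the page's characters that are readable text."""
    collector = TextCollector()
    collector.feed(html)
    collector.close()
    text = "".join(collector.chunks).strip()
    return len(text) / len(html) if html else 0.0
```

A paragraph wrapped in editor-generated markup such as <p><font face="Arial" size="2" color="#000000"><b>Hello</b></font></p> scores a lower ratio than the same text written as <p class="note">Hello</p> with the styling moved into a stylesheet.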
Rather than spending energy on passing W3C validation, I personally think it is better to spend it on streamlining the code, for example by moving presentation into CSS, so that the HTML is clean and concise. The effect of that kind of optimization will be more noticeable.