The function of detecting whether a string is utf8 encoded in php

怪我咯
Release: 2023-03-13 09:04:01
Original
1938 people have browsed it

Given a string, how to determine what encoding it is? PHP has a function: mb_detect_encoding. However, this thing requires the mb_string library, which is not available everywhere.

 function is_utf8($string) { 
     return preg_match('%^(?: 
             [\x09\x0A\x0D\x20-\x7E]                 # ASCII 
         | [\xC2-\xDF][\x80-\xBF]                 # non-overlong 2-byte 
         |     \xE0[\xA0-\xBF][\x80-\xBF]             # excluding overlongs 
         | [\xE1-\xEC\xEE\xEF][\x80-\xBF]{2}     # straight 3-byte 
         |     \xED[\x80-\x9F][\x80-\xBF]             # excluding surrogates 
         |     \xF0[\x90-\xBF][\x80-\xBF]{2}     # planes 1-3 
         | [\xF1-\xF3][\x80-\xBF]{3}             # planes 4-15 
         |     \xF4[\x80-\x8F][\x80-\xBF]{2}     # plane 16 
     )*$%xs', $string);      
}
Copy after login

The accuracy is basically the same as mb_detect_encoding, both correct and wrong.
Encoding detection cannot be 100% accurate. This thing can basically meet the requirements.

The above is the detailed content of The function of detecting whether a string is utf8 encoded in php. For more information, please follow other related articles on the PHP Chinese website!

Related labels:
source:php.cn
Statement of this Website
The content of this article is voluntarily contributed by netizens, and the copyright belongs to the original author. This site does not assume corresponding legal responsibility. If you find any content suspected of plagiarism or infringement, please contact admin@php.cn
Popular Tutorials
More>
Latest Downloads
More>
Web Effects
Website Source Code
Website Materials
Front End Template