Common method for PHP to detect whether a string is UTF8 encoded, utf8 encoding
The example in this article summarizes the common methods for PHP to detect whether a string is UTF8 encoded. Share it with everyone for your reference. The specific implementation method is as follows:
There are many ways to detect string encoding, such as using ord to get the base of the character and then entering the judgment, or using the mb_detect_encoding function to process it. Here are four common methods for your reference.
Example 1
Copy code The code is as follows:
/**
* Check whether the string is UTF8 encoded
* @param string $str The detected string
* @return boolean
*/
function is_utf8($str){
$len = strlen($str);
for($i = 0; $i < $len; $i++){
$c = ord($str[$i]);
if ($c > 128) {
if (($c > 247)) return false;
elseif ($c > 239) $bytes = 4;
elseif ($c > 223) $bytes = 3;
elseif ($c > 191) $bytes = 2;
else return false;
if (($i + $bytes) > $len) return false;
while ($bytes > 1) {
$i++;
$b = ord($str[$i]);
if ($b < 128 || $b > 191) return false;
$bytes--;
}
}
}
return true;
}
Example 2
Copy code The code is as follows:
function is_utf8($string) {
Return preg_match('%^(?:
[x09x0Ax0Dx20-x7E]
| [xC2-xDF][x80-xBF] | [xC2-xDF][x80-xBF] # non-overlong 2-byte
| [xE1-xECxEExEF][x80-xBF]{2} # straight 3-byte
)*$%xs', $string);
}
The accuracy is basically the same as mb_detect_encoding(), both correct and wrong.
Coding detection cannot be 100% accurate, and this thing can basically meet the requirements.
Example 3
Copy code
The code is as follows:
function mb_is_utf8($string)
{
Return mb_detect_encoding($string, 'UTF-8') === 'UTF-8';//New discovery
}
Example 4
Copy code
The code is as follows:
// Returns true if $string is valid UTF-8 and false otherwise.
function is_utf8($word)
{
if (preg_match("/^([".chr(228)."-".chr(233)."]{1}[".chr(128)."-".chr(191)."]{ 1}[".chr(128)."-".chr(191)."]{1}){1}/",$word) == true || preg_match("/([".chr(228 )."-".chr(233)."]{1}[".chr(128)."-".chr(191)."]{1}[".chr(128)."-". chr(191)."]{1}){1}$/",$word) == true || preg_match("/([".chr(228)."-".chr(233)."] {1}[".chr(128)."-".chr(191)."]{1}[".chr(128)."-".chr(191)."]{1}){2 ,}/",$word) == true)
{
return true;
}
else
{
return false;
}
} // function is_utf8
I hope this article will be helpful to everyone’s PHP programming design.
http://www.bkjia.com/PHPjc/915431.htmlwww.bkjia.comtruehttp: //www.bkjia.com/PHPjc/915431.htmlTechArticleCommon methods for PHP to detect whether a string is UTF8 encoded. UTF8 encoding This article summarizes the example of PHP detecting whether a string is Common methods of UTF8 encoding. Share it with everyone for your reference. Specific actual...