PHP: truncating and measuring UTF-8 or GBK encoded Chinese and English strings
Release: 2016-07-25 08:53:51
// Usage example: measure the weighted length of a mixed string
$a = "s@＠你好";
var_dump(strlen_weibo($a, 'utf-8'));
The output is int(8): the letter s counts as 1, the half-width @ counts as 1, the full-width ＠ counts as 2, and the two Chinese characters count as 2 each (4 in total). The source code is as follows:
// Weighted string length: 1 per single-byte character, 2 per multi-byte character
function strlen_weibo($string, $charset = 'utf-8')
{
    $n = $count = 0;
    $length = strlen($string); // length in bytes
    if (strtolower($charset) == 'utf-8')
    {
        while ($n < $length)
        {
            $currentByte = ord($string[$n]);
            if ($currentByte == 9 ||
                $currentByte == 10 ||
                (32 <= $currentByte && $currentByte <= 126))
            {
                // Tab, line feed, or printable ASCII: 1 byte, counts as 1
                $n++;
                $count++;
            } elseif (194 <= $currentByte && $currentByte <= 223)
            {
                // Lead byte of a 2-byte sequence
                $n += 2;
                $count += 2;
            } elseif (224 <= $currentByte && $currentByte <= 239)
            {
                // Lead byte of a 3-byte sequence (most Chinese characters)
                $n += 3;
                $count += 2;
            } elseif (240 <= $currentByte && $currentByte <= 247)
            {
                // Lead byte of a 4-byte sequence
                $n += 4;
                $count += 2;
            } elseif (248 <= $currentByte && $currentByte <= 251)
            {
                // 5-byte form: not valid in current UTF-8, kept for legacy input
                $n += 5;
                $count += 2;
            } elseif ($currentByte == 252 || $currentByte == 253)
            {
                // 6-byte form: not valid in current UTF-8, kept for legacy input
                $n += 6;
                $count += 2;
            } else
            {
                // Any other byte (control characters, stray continuation bytes): counts as 1
                $n++;
                $count++;
            }
        }
        return $count;
    } else
    {
        // GBK and other double-byte encodings: a byte above 127 starts a 2-byte character
        for ($i = 0; $i < $length; $i++)
        {
            if (ord($string[$i]) > 127)
            {
                $i++;     // skip the trail byte
                $count++; // extra unit so the character counts as 2
            }
            $count++;
        }
        return $count;
    }
}
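
The function above only measures a string. To actually cut a UTF-8 string at a weighted length without splitting a character, the same rule (1 per single-byte character, 2 per multi-byte character) can be applied character by character. The following is a minimal sketch, not part of the original code: the names weibo_chars, weibo_width and weibo_truncate are hypothetical, it relies on PCRE's /u modifier rather than on byte ranges, and treating invalid UTF-8 as an empty string is an assumption made here for simplicity.

```php
<?php
// Hypothetical helpers for illustration; not part of the original article.

// Split a UTF-8 string into characters.
// Assumption: invalid UTF-8 (preg_split returns false) is treated as empty.
function weibo_chars(string $string): array
{
    $chars = preg_split('//u', $string, -1, PREG_SPLIT_NO_EMPTY);
    return $chars === false ? [] : $chars;
}

// Weighted length: 1 per single-byte character, 2 per multi-byte character.
function weibo_width(string $string): int
{
    $width = 0;
    foreach (weibo_chars($string) as $char) {
        $width += strlen($char) > 1 ? 2 : 1;
    }
    return $width;
}

// Keep as many leading characters as fit into $maxWidth weighted units.
function weibo_truncate(string $string, int $maxWidth): string
{
    $width = 0;
    $result = '';
    foreach (weibo_chars($string) as $char) {
        $charWidth = strlen($char) > 1 ? 2 : 1;
        if ($width + $charWidth > $maxWidth) {
            break; // the next character would exceed the limit
        }
        $width += $charWidth;
        $result .= $char;
    }
    return $result;
}

var_dump(weibo_width("s@＠你好"));       // int(8)
var_dump(weibo_truncate("s@＠你好", 5)); // string(5) "s@＠"
```

With a limit of 5, the first Chinese character would bring the total to 6, so the result stops after the full-width ＠. Unlike strlen_weibo, this sketch does not handle GBK input; a GBK string would have to be converted to UTF-8 first.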