PHP functions for calculating the length of Chinese strings and extracting Chinese substrings


WBOY

Released: 2016-07-21 15:25:19


PHP provides the mb_substr and mb_strlen functions for taking substrings of Chinese text and measuring its length in characters. However, these functions belong to the mbstring extension rather than the PHP core, so they may not be enabled. On your own server you only need to enable the extension in php.ini. On shared (virtual) hosting where the extension is not enabled, you have to write replacement functions yourself.
The following functions are easy to use, but note that they only work correctly with UTF-8 encoded strings.

The code is as follows:

 
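<?php
header('Content-type: text/html;charset=utf-8');
/**
 * Counts the characters in a UTF-8 string that may contain Chinese.
 * @param string $str  The string to measure
 * @param int    $type 0 (default): every character counts as one;
 *                     1: every non-ASCII character (such as a Chinese character) counts as two
 * @return int
 */
function abslength($str, $type = 0)
{
    // Compare against '' explicitly: empty() would also treat the string "0" as empty.
    if ($str === null || $str === '') {
        return 0;
    }
    if (function_exists('mb_strlen')) {
        $len = mb_strlen($str, 'utf-8');
    } else {
        // The /u modifier makes "." match one UTF-8 character; /s lets it match newlines too.
        preg_match_all('/./us', $str, $ar);
        $len = count($ar[0]);
    }
    if ($type == 1) {
        // Add one more for each non-ASCII character so that it counts as two.
        $len += (int) preg_match_all('/[^\x00-\x7f]/u', $str, $ar);
    }
    return $len;
}
$str = '我们都是中国人啊，ye！';
$len = abslength($str);
var_dump($len); // int(12)
$len = abslength($str, 1);
echo '<br />' . $len; // 22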
/* 
Intercept the Chinese string under utf-8 encoding. The parameters can refer to the substr function
@param $str The string to be intercepted
@param $start The starting position to intercept, negative number means reverse interception 
@param $end The length to intercept 
*/ 
function utf8_substr($str,$start=0) { 
if(empty($str)){ 
return false; 
} 
if (function_exists('mb_substr')){ 
if(func_num_args() >= 3) { 
$end = func_get_arg(2); 
return mb_substr($str,$start,$end,'utf-8'); 
} 
else { 
mb_internal_encoding("UTF-8 "); 
return mb_substr($str,$start); 
} 
} 
else { 
$null = ""; 
preg_match_all("/./u", $str, $ar); 
if(func_num_args() >= 3) { 
$end = func_get_arg(2); 
return join($null, array_slice($ar[0],$ start,$end)); 
} 
else { 
return join($null, array_slice($ar[0],$start)); 
} 
} 
} 
$str2 = 'wo wants to intercept zhongwen'; 
echo '
'; 
echo utf8_substr($str2,0,-4); //return wo wants to intercept zhon 

A Chinese substring function that supports gb2312, gbk, utf-8, and big5

The code is as follows:

 
/**
 * Chinese substring function; supports gb2312, gbk, utf-8, and big5.
 *
 * @param string $str     The source string
 * @param int    $start   Start position
 * @param int    $length  Number of characters to take
 * @param string $charset Encoding: utf-8|gb2312|gbk|big5
 * @param bool   $suffix  Whether to append "…" to the result
 * @return string
 */
// Written as a plain function; add "public" back when using it as a class method.
function csubstr($str, $start, $length, $charset = "utf-8", $suffix = true)
{
    if (function_exists("mb_substr")) {
        // A string that already fits within $length is returned unchanged ($start is ignored).
        if (mb_strlen($str, $charset) <= $length) return $str;
        $slice = mb_substr($str, $start, $length, $charset);
    } else {
        // Byte patterns that match one character in each encoding.
        $re['utf-8'] = '/[\x01-\x7f]|[\xc2-\xdf][\x80-\xbf]|[\xe0-\xef][\x80-\xbf]{2}|[\xf0-\xff][\x80-\xbf]{3}/';
        $re['gb2312'] = '/[\x01-\x7f]|[\xb0-\xf7][\xa0-\xfe]/';
        $re['gbk'] = '/[\x01-\x7f]|[\x81-\xfe][\x40-\xfe]/';
        $re['big5'] = '/[\x01-\x7f]|[\x81-\xfe]([\x40-\x7e]|[\xa1-\xfe])/';
        preg_match_all($re[$charset], $str, $match);
        if (count($match[0]) <= $length) return $str;
        $slice = join("", array_slice($match[0], $start, $length));
    }
    if ($suffix) return $slice . "…";
    return $slice;
}