php > $s="你好";
php > echo mb_strlen($s,"utf8");
With utf8 this returns 2, which I understand.
php > echo mb_strlen($s,"gb2312") ;
This returns 4, which I also understand.
php > echo mb_strlen($s,"gbk");
This returns 3, and that is the part I don't understand. Why?
$s is UTF-8 encoded, but you are measuring its length as GBK without first converting it to GBK.
Read as GBK, the UTF-8 bytes of $s decode to three unrelated two-byte characters, so the length is 3.
This is what you should do:
$a = mb_strlen(iconv('utf-8', 'gbk', $s), 'gbk');
$b = mb_strlen(iconv('utf-8', 'gb2312', $s), 'gb2312');
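The same idea as a small reusable sketch. The helper name lengthIn and the sample string (two Chinese characters, written with \u{...} escapes so the bytes are UTF-8 whatever the file encoding) are assumptions for illustration, not something from the question. It assumes the iconv and mbstring extensions are loaded.

```php
<?php
// Hypothetical helper: convert UTF-8 bytes to $target first, then count
// characters in that same encoding, so bytes and decoder always agree.
function lengthIn(string $utf8, string $target): int
{
    $converted = iconv('UTF-8', $target, $utf8);
    if ($converted === false) {
        // iconv returns false when a character has no mapping in $target
        throw new RuntimeException("cannot convert to $target");
    }
    return mb_strlen($converted, $target);
}

// Assumed sample: U+4F60 U+597D, two characters
$s = "\u{4F60}\u{597D}";

echo lengthIn($s, 'GBK'), "\n";
echo lengthIn($s, 'GB2312'), "\n";
```

Both calls count two characters, because each one decodes bytes that really are in the encoding named.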
In other words, the GB2312 result is wrong too.
mb_strlen returns the number of characters, so 2 is the only correct answer. How did you come to understand the 4 and the 3?
$s stores a UTF-8 encoded string (the encoding comes from your source file). If you decode those bytes as GBK or GB2312 you can get garbled text, so 4 and 3 are most likely the lengths of that garbled text.
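To see the garbled text directly, one option is to look at the raw bytes and then deliberately decode them as GBK. This is only a sketch: the sample string is an assumed two-character Chinese string, and the variable names are made up for the example.

```php
<?php
// Assumed sample: U+4F60 U+597D, always stored as UTF-8 bytes
$s = "\u{4F60}\u{597D}";

$hex       = bin2hex($s);            // the raw bytes, two hex digits each
$byteCount = strlen($s);             // strlen counts bytes, not characters
$charCount = mb_strlen($s, 'UTF-8'); // decoded with the correct encoding

// Pretend the same bytes are GBK and convert them to UTF-8 for display.
// The result is the garbled text that mb_strlen($s, 'gbk') was counting.
$garbled      = mb_convert_encoding($s, 'UTF-8', 'GBK');
$garbledCount = mb_strlen($garbled, 'UTF-8');

echo "$hex\n$byteCount bytes, $charCount characters\n";
echo "$garbled ($garbledCount characters when misread as GBK)\n";
```

The six bytes are the same in every call; only the decoder changes, which is why the count changes with the encoding argument.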