Notes on the relationship between Chinese and English byte length and encoding in PHP and Java


WBOY

Released: 2016-07-29 08:56:49



1.PHP

PHP strings behave like C strings: a string is a sequence of bytes, and strlen() counts bytes rather than characters. Under GBK encoding, an English letter occupies 1 byte and a Chinese character occupies 2 bytes. Under UTF-8 encoding, an English letter still occupies 1 byte, but a Chinese character occupies 3 bytes in most cases (4 bytes for the rarer characters outside the Basic Multilingual Plane). This causes trouble when you need the character count of a string or want to truncate a string at a character boundary. For example:

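<?php
$str = "我爱你Iloveyou";
// strlen() counts bytes: 17 under UTF-8, 14 under GBK.
// But what if you need the number of characters in $str, or want to show
// only the first 6 characters followed by an ellipsis?
echo strlen($str);
?>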


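Answers to these questions are easy to find online. The simplest approach is to use the mbstring extension: mb_strlen returns the number of characters, and mb_substr truncates by characters instead of bytes, provided the correct encoding is passed or configured.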

2.Java

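A char in Java is 2 bytes. Java strings use Unicode, stored as UTF-16, so each char is one 16-bit code unit. A common Chinese character or an English letter occupies one char (2 bytes) in this representation, while characters outside the Basic Multilingual Plane need two chars (a surrogate pair). Once a string is converted to bytes in another encoding, the number of bytes per character is different. For example: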

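public class Test {
    public static void main(String[] args) {
        String str = "我们aaaaa";
        // getBytes() with no argument encodes using the platform default charset
        int byteLen = str.getBytes().length;
        // length() counts UTF-16 code units (char values), not bytes
        int len = str.length();
        System.out.println("Byte length: " + byteLen);
        System.out.println("Character length: " + len);
    }
}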


When the default charset is GBK, the example above prints 9 and 7. When it is UTF-8, it prints 11 and 7. In other words, the byte length depends on the encoding, but the value returned by str.length() is the same regardless of encoding. This method returns the number of char values (UTF-16 code units) in the string, so a common Chinese character and an English letter each count as one. The only exception is a character outside the Basic Multilingual Plane, which counts as two.
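
Because getBytes() with no argument depends on the platform default charset, the same program can print different byte lengths on different machines. The following is a minimal sketch of one way to make the comparison reproducible by naming the charset explicitly. The class name EncodingLengths and the helpers byteLength and codePoints are illustrative names that do not appear in the original article, and the sketch assumes the running JDK provides the GBK charset. The Chinese characters are written as Unicode escapes so that the encoding of the source file itself cannot change the result.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

public class EncodingLengths {
    // Number of bytes the string occupies once encoded with the given charset.
    static int byteLength(String s, Charset cs) {
        return s.getBytes(cs).length;
    }

    // Number of Unicode code points. This is smaller than s.length() when the
    // string contains characters outside the Basic Multilingual Plane.
    static int codePoints(String s) {
        return s.codePointCount(0, s.length());
    }

    public static void main(String[] args) {
        // Same text as the example above: two Chinese characters plus "aaaaa".
        String str = "\u6211\u4eec" + "aaaaa";

        // Assumption: the GBK charset is available in this JDK.
        Charset gbk = Charset.forName("GBK");

        System.out.println("GBK bytes:    " + byteLength(str, gbk));                     // 9
        System.out.println("UTF-8 bytes:  " + byteLength(str, StandardCharsets.UTF_8));  // 11
        System.out.println("UTF-16 units: " + str.length());                             // 7
        System.out.println("Code points:  " + codePoints(str));                          // 7
    }
}
```

For text made only of common Chinese characters and English letters, length() and the code point count agree. They diverge only for supplementary characters such as emoji, where one visible character takes two char values.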
