How many bytes does a Chinese character occupy in Java?
In Java, a char always occupies 2 bytes. A digit, an English letter, and a commonly used Chinese character each fit in a single char:
char c1 = '中'; char c2 = 'A'; char c3 = '1';
To be precise, "Unicode encoding" is a general term for a family of encodings rather than one specific encoding. UTF-8 is one member of that family and is not the same thing as Unicode.
Take UTF-8 as an example. UTF-8 is a variable-length encoding that represents a character in 1 to 4 bytes: ASCII characters occupy 1 byte, and most Chinese characters occupy 3 bytes.
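The UTF-8 widths can be checked by encoding a few sample characters and counting the resulting bytes. This is a minimal sketch: the class name Utf8Width and the helper utf8Bytes are made up for illustration, and the samples are written as Unicode escapes so the result does not depend on the source file's encoding.

```java
import java.nio.charset.StandardCharsets;

public class Utf8Width {
    // Number of bytes the given text occupies once encoded as UTF-8.
    // (Hypothetical helper, named for this example only.)
    static int utf8Bytes(String text) {
        return text.getBytes(StandardCharsets.UTF_8).length;
    }

    public static void main(String[] args) {
        String[] samples = {
            "A",                                    // U+0041, an ASCII letter
            "\u00E9",                               // U+00E9, Latin e with acute accent
            "\u4E2D",                               // U+4E2D, a common Chinese character
            new String(Character.toChars(0x20000))  // U+20000, a rare CJK character
        };
        for (String s : samples) {
            System.out.println(s + " -> " + utf8Bytes(s) + " byte(s) in UTF-8");
        }
    }
}
```

The four samples should report 1, 2, 3, and 4 bytes respectively, which covers the whole 1 to 4 byte range.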
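A Java char is 2 bytes because Java represents text internally as UTF-16 code units, and each char holds one 16-bit code unit. Nearly all commonly used Chinese characters lie in the Basic Multilingual Plane and take one char (2 bytes). Rarer characters outside that plane take two chars (a surrogate pair, 4 bytes), so Unicode as a whole is not a fixed-length 2-byte encoding.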
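The difference between a char and a full Unicode character can be seen by comparing String.length(), which counts char values, with String.codePointCount(), which counts code points. The sketch below uses a hypothetical class CharWidth with two small helpers. U+20000 is chosen only as an example of a Chinese character outside the Basic Multilingual Plane.

```java
public class CharWidth {
    // Number of 16-bit char values (UTF-16 code units) in the text.
    static int codeUnits(String text) {
        return text.length();
    }

    // Number of Unicode code points in the text.
    static int codePoints(String text) {
        return text.codePointCount(0, text.length());
    }

    public static void main(String[] args) {
        String common = "\u4E2D";                              // U+4E2D, inside the BMP
        String rare = new String(Character.toChars(0x20000));  // U+20000, outside the BMP

        System.out.println("bytes per char: " + Character.BYTES);
        System.out.println("U+4E2D:  " + codeUnits(common) + " char(s), "
                + codePoints(common) + " code point(s)");
        System.out.println("U+20000: " + codeUnits(rare) + " char(s), "
                + codePoints(rare) + " code point(s)");
    }
}
```

Code that must handle rare characters correctly would normally iterate over code points (for example with String.codePoints()) rather than over individual char values.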
Extended knowledge:
Unicode is a character encoding standard created to overcome the limitations of traditional character encoding schemes. It assigns a unified, unique code to every character in every language, which meets the requirements of cross-language and cross-platform text conversion and processing. Unicode has three main encoding forms: UTF-8, UTF-16, and UTF-32. UTF-8 uses one to four bytes per character, UTF-16 uses two or four bytes, and UTF-32 always uses four bytes. Unicode is now widely used for information exchange around the world.
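As a rough illustration of the three encoding forms, the following sketch encodes the same character in each and compares the byte counts. The class EncodingWidths and its helpers are hypothetical names. It uses the big-endian variants so that no byte order mark is added to the output, and it assumes the runtime provides a "UTF-32BE" charset, which common JDKs do although it is not among the charsets every Java platform is required to support.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

public class EncodingWidths {
    // Assumption: "UTF-32BE" is available on this JVM (looked up lazily in each call).
    static int utf32Bytes(String text) {
        return text.getBytes(Charset.forName("UTF-32BE")).length;
    }

    // UTF-16BE is used instead of UTF-16 so no byte order mark is counted.
    static int utf16Bytes(String text) {
        return text.getBytes(StandardCharsets.UTF_16BE).length;
    }

    static int utf8Bytes(String text) {
        return text.getBytes(StandardCharsets.UTF_8).length;
    }

    static void report(String label, String text) {
        System.out.println(label + ": UTF-8=" + utf8Bytes(text)
                + ", UTF-16=" + utf16Bytes(text)
                + ", UTF-32=" + utf32Bytes(text));
    }

    public static void main(String[] args) {
        report("U+0041", "A");
        report("U+4E2D", "\u4E2D");
        report("U+20000", new String(Character.toChars(0x20000)));
    }
}
```

Encoding with the plain "UTF-16" charset name instead would prepend a 2-byte byte order mark, so a single common Chinese character would then come out as 4 bytes rather than 2.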