
What is the difference between Unicode and UTF-8

青灯夜游
2018-11-22 10:53:47

This article introduces Unicode and UTF-8 and explains the difference between them.

What is Unicode?

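Unicode is a character set that assigns a unique number, called a code point, to each character and symbol, regardless of platform, program, or language. Code points range from 0 to 1,114,111 (U+0000 to U+10FFFF). The original design was 16-bit, with values from 0 to 65,535 (2^16 - 1), which is why Unicode is often described, incorrectly, as using two bytes per character. Unicode itself does not say how many bytes a character occupies; that is the job of an encoding.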
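As a small illustration of the numbering range, the Java sketch below reads the code point of a character that lies above 65,535. The class name `CodePointRange` and the helper `firstCodePoint` are made up for this example; the calls themselves (`String.codePointAt`, `Character.charCount`, `Character.MAX_CODE_POINT`) are standard JDK APIs.

```java
public class CodePointRange {
    // Hypothetical helper: returns the Unicode code point of the first character in s.
    static int firstCodePoint(String s) {
        return s.codePointAt(0);
    }

    public static void main(String[] args) {
        // U+1F600 (GRINNING FACE), written as a surrogate pair so the source stays ASCII.
        String emoji = "\uD83D\uDE00";
        int cp = firstCodePoint(emoji);

        System.out.printf("U+%X%n", cp);                       // U+1F600
        System.out.println(cp > 0xFFFF);                       // true: does not fit in 16 bits
        System.out.println(Character.charCount(cp));           // 2: needs two Java char values
        System.out.printf("U+%X%n", Character.MAX_CODE_POINT); // U+10FFFF
    }
}
```

Because a Java `char` is 16 bits wide, code points above U+FFFF are held as two `char` values, which is why `codePointAt` is used here rather than `charAt`.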

What is UTF-8?

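UTF-8 is an encoding format: a standard mechanism for converting Unicode code points into a byte stream. Each code point is encoded in 1 to 4 bytes. (The original design allowed sequences of up to 6 bytes, but RFC 3629 limits UTF-8 to 4 bytes, which is enough for every code point up to U+10FFFF.)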
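The variable length is easy to observe from Java. The following is a minimal sketch, with `Utf8Lengths` and `utf8Length` as illustrative names, that counts the bytes UTF-8 produces for characters from different parts of the Unicode range.

```java
import java.nio.charset.StandardCharsets;

public class Utf8Lengths {
    // Hypothetical helper: number of bytes UTF-8 needs for the given text.
    static int utf8Length(String s) {
        return s.getBytes(StandardCharsets.UTF_8).length;
    }

    public static void main(String[] args) {
        System.out.println(utf8Length("A"));            // 1 byte  (U+0041)
        System.out.println(utf8Length("\u00E9"));       // 2 bytes (U+00E9, e with acute accent)
        System.out.println(utf8Length("\u4E2D"));       // 3 bytes (U+4E2D, a CJK ideograph)
        System.out.println(utf8Length("\uD83D\uDE00")); // 4 bytes (U+1F600, an emoji)
    }
}
```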

The difference between Unicode and UTF-8

Unicode is a character set, while UTF-8 is an encoding rule.

A character set is a list of uniquely numbered characters (these numbers are called "code points"). Put simply, each character is assigned a unique ID. For example, in the Unicode character set, the letter A is 41 in hexadecimal (65 in decimal), usually written U+0041.
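The mapping between a character and its number can be checked directly. In this sketch the class `CodePointOfA` and the formatter `uPlus` are names chosen for the example; `Character.getName` and `Character.toChars` are standard JDK methods.

```java
public class CodePointOfA {
    // Hypothetical helper: formats a code point in the conventional U+XXXX notation.
    static String uPlus(int codePoint) {
        return String.format("U+%04X", codePoint);
    }

    public static void main(String[] args) {
        int cp = "A".codePointAt(0);

        System.out.println(cp);                    // 65 (decimal)
        System.out.println(uPlus(cp));             // U+0041 (hexadecimal)
        System.out.println(Character.getName(cp)); // LATIN CAPITAL LETTER A

        // The reverse direction: from the number back to the character.
        System.out.println(new String(Character.toChars(0x41))); // A
    }
}
```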

Encoding rules: a rule for converting code points into byte sequences; decoding is the reverse process. (This is not encryption: nothing is kept secret, and anyone who knows the rule can reverse it.) It is an algorithm for converting a list of numbers into binary so that it can be stored on disk.
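To make the idea of an encoding rule concrete, here is a hand-written sketch of the UTF-8 bit layout for a single code point. It is for illustration only, and `ManualUtf8.encode` is a hypothetical helper; real code would normally rely on `String.getBytes(StandardCharsets.UTF_8)`, which `main` uses to cross-check the result.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

public class ManualUtf8 {
    // Hypothetical helper: encodes one code point into its UTF-8 byte sequence.
    static byte[] encode(int cp) {
        if (cp < 0 || cp > 0x10FFFF || (cp >= 0xD800 && cp <= 0xDFFF)) {
            throw new IllegalArgumentException("not an encodable code point: " + cp);
        }
        if (cp < 0x80) {                       // 0xxxxxxx
            return new byte[] { (byte) cp };
        } else if (cp < 0x800) {               // 110xxxxx 10xxxxxx
            return new byte[] {
                (byte) (0xC0 | (cp >> 6)),
                (byte) (0x80 | (cp & 0x3F)) };
        } else if (cp < 0x10000) {             // 1110xxxx 10xxxxxx 10xxxxxx
            return new byte[] {
                (byte) (0xE0 | (cp >> 12)),
                (byte) (0x80 | ((cp >> 6) & 0x3F)),
                (byte) (0x80 | (cp & 0x3F)) };
        } else {                               // 11110xxx 10xxxxxx 10xxxxxx 10xxxxxx
            return new byte[] {
                (byte) (0xF0 | (cp >> 18)),
                (byte) (0x80 | ((cp >> 12) & 0x3F)),
                (byte) (0x80 | ((cp >> 6) & 0x3F)),
                (byte) (0x80 | (cp & 0x3F)) };
        }
    }

    public static void main(String[] args) {
        int cp = 0x4E2D; // a CJK ideograph that needs three bytes
        byte[] manual = encode(cp);
        byte[] library = new String(Character.toChars(cp)).getBytes(StandardCharsets.UTF_8);
        System.out.println(Arrays.equals(manual, library)); // true
    }
}
```

The leading bits of the first byte tell a decoder how many bytes belong to the character, and every following byte starts with `10`, so the byte stream can be split back into code points without any extra length information.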

For example, UTF-8 would translate the sequence of code points 1, 2, 3, 4 into one byte each, because code points below 128 fit in a single byte:

00000001 00000010 00000011 00000100

Our data is now translated to binary, and the file can be saved to disk.
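The byte pattern above can be reproduced in a few lines of Java. This is a sketch under the assumption that the numbers 1, 2, 3, 4 are meant as code points; `BinaryDump` and `utf8Bits` are names invented for the example.

```java
import java.nio.charset.StandardCharsets;

public class BinaryDump {
    // Hypothetical helper: encodes the code points as UTF-8 and
    // renders each resulting byte as 8 binary digits.
    static String utf8Bits(int... codePoints) {
        String text = new String(codePoints, 0, codePoints.length);
        StringBuilder sb = new StringBuilder();
        for (byte b : text.getBytes(StandardCharsets.UTF_8)) {
            if (sb.length() > 0) {
                sb.append(' ');
            }
            String bits = Integer.toBinaryString(b & 0xFF);
            // Left-pad with zeros to a fixed width of 8 digits.
            sb.append(String.format("%8s", bits).replace(' ', '0'));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.println(utf8Bits(1, 2, 3, 4)); // 00000001 00000010 00000011 00000100
        System.out.println(utf8Bits(0xE9));       // 11000011 10101001
    }
}
```

The second call shows that the one-number-one-byte pattern only holds for small values: the single code point U+00E9 already takes two bytes.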

[Figure: Unicode and UTF-8 relationship diagram]

Conclusion

UTF-8 is the encoding that converts between binary data and numbers (code points); Unicode is the character set that maps those numbers to characters.
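Both steps of that summary can be seen in one short decoding sketch: bytes are first turned into numbers by UTF-8, and the numbers are then looked up as Unicode characters. The class `DecodeDemo` and the helper `decodeToCodePoints` are illustrative names, and the byte values are sample input chosen for this example.

```java
import java.nio.charset.StandardCharsets;

public class DecodeDemo {
    // Hypothetical helper: decodes UTF-8 bytes and returns the code points they represent.
    static int[] decodeToCodePoints(byte[] utf8) {
        return new String(utf8, StandardCharsets.UTF_8).codePoints().toArray();
    }

    public static void main(String[] args) {
        // Four bytes on disk: a three-byte sequence followed by a one-byte sequence.
        byte[] bytes = { (byte) 0xE4, (byte) 0xB8, (byte) 0xAD, 0x41 };

        for (int cp : decodeToCodePoints(bytes)) {
            // Step 1 (UTF-8): bytes -> number. Step 2 (Unicode): number -> character name.
            System.out.printf("U+%04X %s%n", cp, Character.getName(cp));
        }
    }
}
```

The character names are printed instead of the characters themselves so that the output does not depend on the console's own encoding.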
