In computer and telecommunications technology, a character is a unit of information that roughly corresponds to a glyph, a glyph-like unit, or a symbol.
Characters include the letters, digits, ideographs, and symbols used in computers, for example: 1, 2, 3, A, B, C, ~, !, ·, #, ¥, %, …, *, (, ), and so on.
In ASCII encoding, one English alphabetic character requires 1 byte to store.
In GB 2312 or GBK encoding, one Chinese character requires 2 bytes to store.
In UTF-8 encoding, one English alphabetic character requires 1 byte to store, and one Chinese character requires 3 bytes (or 4 bytes for characters outside the Basic Multilingual Plane, such as those in the CJK extension blocks).
In UTF-16 encoding, one English alphabetic character or one common Chinese character requires 2 bytes to store; Chinese characters in the Unicode extension blocks outside the Basic Multilingual Plane require 4 bytes, stored as a surrogate pair.
In UTF-32 encoding, every Unicode character requires 4 bytes to store.
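The byte counts above can be checked with a short script. This is only a minimal sketch in Python (the article itself names no language); the helper name byte_lengths and the three sample characters are chosen here for illustration. The "-le" codec variants are used so that Python does not prepend a byte order mark, which would otherwise add 2 or 4 bytes to the count.

```python
# Count how many bytes one character occupies in each encoding.
# ENCODINGS and byte_lengths are illustrative names, not part of any library.
ENCODINGS = ["ascii", "gb2312", "gbk", "utf-8", "utf-16-le", "utf-32-le"]


def byte_lengths(ch):
    """Return {encoding: byte count}; None where the character cannot be encoded."""
    result = {}
    for name in ENCODINGS:
        try:
            result[name] = len(ch.encode(name))
        except UnicodeEncodeError:
            # The encoding has no representation for this character.
            result[name] = None
    return result


# A Latin letter, a common Chinese character, and a CJK Extension B character.
for ch in ["A", "中", "\U00020000"]:
    print(repr(ch), byte_lengths(ch))
```

For "中" this reports 2 bytes in GB 2312 and GBK, 3 in UTF-8, 2 in UTF-16, and 4 in UTF-32, while ASCII cannot represent it at all. The Extension B character U+20000 takes 4 bytes in all three Unicode encodings and is not available in GB 2312 or GBK.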
Characters are abstract entities that can be represented using many different character encoding schemes or code pages.
For example, Unicode UTF-16 encoding represents characters as a sequence of 16-bit integers, while Unicode UTF-8 encoding represents the same characters as a sequence of 8-bit bytes. Microsoft's Common Language Runtime uses Unicode UTF-16 (Unicode Transformation Format, a 16-bit encoding) to represent characters.
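To make the difference between the two representations concrete, the following sketch prints the code units of the same text in both encodings. It is written in Python as an illustration only; utf8_units and utf16_units are hypothetical helper names, and little-endian byte order without a byte order mark is an assumption made to keep the output simple.

```python
import struct


def utf8_units(text):
    """Return the UTF-8 code units of text as a list of 8-bit integers."""
    return list(text.encode("utf-8"))


def utf16_units(text):
    """Return the UTF-16 code units of text as a list of 16-bit integers."""
    raw = text.encode("utf-16-le")  # little-endian, no byte order mark
    count = len(raw) // 2           # two bytes per 16-bit code unit
    return list(struct.unpack("<%dH" % count, raw))


text = "A中"
print([hex(u) for u in utf8_units(text)])   # ['0x41', '0xe4', '0xb8', '0xad']
print([hex(u) for u in utf16_units(text)])  # ['0x41', '0x4e2d']
```

The same two abstract characters become four 8-bit units in UTF-8 but two 16-bit units in UTF-16. A character outside the Basic Multilingual Plane, such as U+20000, would appear as two 16-bit units (a surrogate pair) in the UTF-16 output.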