How many bytes does an ASCII code occupy?

百草
Release: 2023-09-07 16:03:25

One ASCII code occupies one byte. ASCII is a character encoding standard that uses 7-bit binary numbers to represent 128 different characters, including letters, digits, punctuation marks, and control characters. A byte is the basic unit of computer storage and consists of 8 binary bits, each of which is 0 or 1. One byte can represent 256 different values, so it can hold any character in the ASCII set.

Environment for this tutorial: Windows 10, DELL G3 computer.

ASCII (American Standard Code for Information Interchange) is an encoding standard used to represent characters. It uses 7-bit binary numbers to represent 128 different characters, including letters, digits, punctuation marks, and control characters. In computers, an ASCII code is usually stored as an 8-bit binary number, so one ASCII code occupies one byte (8 bits) of storage space.

Whether an ASCII character is written as a 7-bit code or padded to 8 bits, it is stored in a single byte.

A byte is the basic unit of computer storage. It consists of 8 binary bits, each of which is 0 or 1. One byte can represent 256 (2^8) different values, so it can hold any of the 128 (2^7) ASCII characters.
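As a quick check of the one-byte claim, the following Python sketch encodes each of the 128 ASCII code points and confirms that each fits in a single byte. The helper name `ascii_byte_length` is illustrative only and does not come from the original article.

```python
def ascii_byte_length(ch: str) -> int:
    """Return the number of bytes needed to store one character as ASCII."""
    return len(ch.encode("ascii"))


# All 128 ASCII code points (0 through 127).
all_ascii = [chr(code) for code in range(128)]

# Every ASCII character occupies exactly one byte.
assert all(ascii_byte_length(ch) == 1 for ch in all_ascii)

# 7 bits give 2**7 = 128 values; a full byte gives 2**8 = 256.
print(2 ** 7, 2 ** 8)  # prints: 128 256

# Stored in a byte, an ASCII code always has its highest bit set to 0.
print(format(ord("A"), "08b"))  # prints: 01000001
```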

Note that with the development of computer technology and the demand for internationalization, ASCII has gradually been superseded by more universal standards such as Unicode. Unicode assigns each character a code point in the range U+0000 to U+10FFFF and covers the characters and symbols of many languages; its first 128 code points are identical to ASCII.

When text is encoded as Unicode, one character may occupy multiple bytes of storage space. The exact number depends on the encoding scheme used, such as UTF-8, UTF-16, or UTF-32. UTF-8, a common scheme, is variable-length: a character takes 1 to 4 bytes, and ASCII characters still take exactly 1 byte.
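To make the dependence on the encoding scheme concrete, here is a minimal Python sketch that measures how many bytes a few sample characters take in UTF-8, UTF-16, and UTF-32. The sample characters and the helper name `encoded_sizes` are chosen here for illustration; the little-endian codec variants are used so that no byte order mark is counted.

```python
def encoded_sizes(ch: str) -> dict:
    """Return the byte length of one character under three Unicode encodings."""
    return {
        "UTF-8": len(ch.encode("utf-8")),
        "UTF-16": len(ch.encode("utf-16-le")),  # "-le" avoids a byte order mark
        "UTF-32": len(ch.encode("utf-32-le")),
    }


samples = [
    "A",           # U+0041, an ASCII letter
    "\u00e9",      # U+00E9, Latin small letter e with acute
    "\u4e2d",      # U+4E2D, a CJK ideograph
    "\U0001F600",  # U+1F600, an emoji outside the Basic Multilingual Plane
]

for ch in samples:
    print(f"U+{ord(ch):04X}", encoded_sizes(ch))
```

With these samples, the UTF-8 length grows from 1 to 4 bytes, UTF-16 uses 2 bytes except for the last character (4 bytes), and UTF-32 always uses 4 bytes.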

To summarize, an ASCII code usually occupies one byte (8 bits) of storage space. However, with the widespread application of Unicode encoding, a character may occupy multiple bytes of storage space. The specific number of bytes occupied depends on the Unicode encoding scheme used.

ASCII code can be divided into standard ASCII code and extended ASCII code.

Standard ASCII, also called basic ASCII, uses 7 binary digits (the remaining bit is 0) to represent all uppercase and lowercase letters, the digits 0 to 9, punctuation marks, and the special control characters used in American English. Among them:

  • 0~31 and 127 (33 in total) are control characters or special communication characters (the rest are displayable characters)

    Examples of control characters: LF (line feed), CR (carriage return), FF (form feed), DEL (delete), BS (backspace), BEL (bell), etc.

    Examples of communication characters: SOH (start of heading), EOT (end of transmission), ACK (acknowledge), etc.

    ASCII values 8, 9, 10, and 13 correspond to the backspace, tab, line feed, and carriage return characters respectively. They have no graphic representation, but they affect how text is displayed in ways that depend on the application.

  • 32~126 (95 in total) are printable characters (32 is a space), of which 48~57 are the ten Arabic numerals 0 to 9.

  • 65~90 are the 26 uppercase English letters, 97~122 are the 26 lowercase English letters, and the rest are punctuation marks, arithmetic symbols, etc.
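The ranges listed above can be sketched as a small classifier. This is only an illustration of the numeric boundaries given in the list; the function name `classify` and the category labels are assumptions made for this example.

```python
from collections import Counter


def classify(code: int) -> str:
    """Classify a standard ASCII code (0-127) using the ranges listed above."""
    if not 0 <= code <= 127:
        raise ValueError("not a standard ASCII code")
    if code <= 31 or code == 127:
        return "control"
    if code == 32:
        return "space"
    if 48 <= code <= 57:
        return "digit"
    if 65 <= code <= 90:
        return "uppercase"
    if 97 <= code <= 122:
        return "lowercase"
    return "punctuation/symbol"


# Count how many codes fall into each category.
counts = Counter(classify(code) for code in range(128))
print(counts["control"])        # prints: 33
print(128 - counts["control"])  # prints: 95 (the displayable characters)
```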

Also note that when standard ASCII is transmitted, the highest bit (b7) can be used as a parity bit. A parity check is a method for detecting errors that occur during code transmission, and it comes in two forms: odd parity and even parity. Odd parity rule: the number of 1s in a correctly coded byte must be odd; if it is not, the highest bit b7 is set to 1. Even parity rule: the number of 1s in a correctly coded byte must be even; if it is not, the highest bit b7 is set to 1.
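The parity rules can be sketched in a few lines of Python. This is a minimal illustration, assuming the 7-bit code sits in bits b0 to b6 and the parity bit is b7 (value 0x80); the function names `add_parity` and `check_parity` are hypothetical.

```python
def add_parity(code: int, odd: bool = False) -> int:
    """Return the 8-bit value for a 7-bit ASCII code with parity bit b7 set as needed."""
    if not 0 <= code <= 127:
        raise ValueError("expected a 7-bit ASCII code")
    ones = bin(code).count("1")
    # Even parity: set b7 when the count of 1s is odd.
    # Odd parity: set b7 when the count of 1s is even.
    needs_bit = (ones % 2 == 0) if odd else (ones % 2 == 1)
    return (code | 0x80) if needs_bit else code


def check_parity(byte: int, odd: bool = False) -> bool:
    """Return True if the total number of 1s in the byte matches the chosen parity."""
    ones = bin(byte & 0xFF).count("1")
    return ones % 2 == (1 if odd else 0)


# "A" is 0100 0001 (two 1s); "C" is 0100 0011 (three 1s).
print(format(add_parity(ord("A")), "08b"))            # prints: 01000001
print(format(add_parity(ord("C")), "08b"))            # prints: 11000011
print(format(add_parity(ord("A"), odd=True), "08b"))  # prints: 11000001
```

A receiver would call `check_parity` on each incoming byte; a single flipped bit changes the count of 1s and makes the check fail.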

The upper 128 codes (128~255) are called extended ASCII. Many x86-based systems support extended (or "high") ASCII. Extended ASCII uses the 8th bit of each byte to define an additional 128 characters: special symbols, foreign letters, and graphic symbols.
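Because the upper 128 values are not part of standard ASCII, their meaning depends on which extended character set is in use. The sketch below decodes the same byte under two such sets available as Python codecs, CP437 and Latin-1; these two code pages are picked here purely as examples and are not named in the original article.

```python
raw = bytes([0xE9])  # a single byte with the 8th bit set (decimal 233)

# Standard 7-bit ASCII rejects any byte above 127.
try:
    raw.decode("ascii")
    is_plain_ascii = True
except UnicodeDecodeError:
    is_plain_ascii = False

# The same byte maps to different characters in different extended sets.
as_latin1 = raw.decode("latin-1")
as_cp437 = raw.decode("cp437")

print(is_plain_ascii)         # prints: False
print(len(raw))               # prints: 1 (still one byte of storage)
print(as_latin1 == as_cp437)  # prints: False
```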

The ASCII code standard table is as follows

ASCII table (Bin = binary, Oct = octal, Dec = decimal, Hex = hexadecimal)

Bin        Oct   Dec  Hex   Abbreviation/Character       Explanation
0000 0000  00    0    0x00  NUL (null)                   Null character
0000 0001  01    1    0x01  SOH (start of heading)       Start of heading
0000 0010  02    2    0x02  STX (start of text)          Start of text
0000 0011  03    3    0x03  ETX (end of text)            End of text
0000 0100  04    4    0x04  EOT (end of transmission)    End of transmission
0000 0101  05    5    0x05  ENQ (enquiry)                Enquiry
0000 0110  06    6    0x06  ACK (acknowledge)            Acknowledge
0000 0111  07    7    0x07  BEL (bell)                   Bell
0000 1000  010   8    0x08  BS (backspace)               Backspace
0000 1001  011   9    0x09  HT (horizontal tab)          Horizontal tab
0000 1010  012   10   0x0A  LF (NL line feed, new line)  Line feed
0000 1011  013   11   0x0B  VT (vertical tab)            Vertical tab
0000 1100  014   12   0x0C  FF (NP form feed, new page)  Form feed
0000 1101  015   13   0x0D  CR (carriage return)         Carriage return
0000 1110  016   14   0x0E  SO (shift out)               Shift out
0000 1111  017   15   0x0F  SI (shift in)                Shift in
0001 0000  020   16   0x10  DLE (data link escape)       Data link escape
0001 0001  021   17   0x11  DC1 (device control 1)       Device control 1
0001 0010  022   18   0x12  DC2 (device control 2)       Device control 2
0001 0011  023   19   0x13  DC3 (device control 3)       Device control 3
0001 0100  024   20   0x14  DC4 (device control 4)       Device control 4
0001 0101  025   21   0x15  NAK (negative acknowledge)   Negative acknowledge
0001 0110  026   22   0x16  SYN (synchronous idle)       Synchronous idle
0001 0111  027   23   0x17  ETB (end of trans. block)    End of transmission block
0001 1000  030   24   0x18  CAN (cancel)                 Cancel
0001 1001  031   25   0x19  EM (end of medium)           End of medium
0001 1010  032   26   0x1A  SUB (substitute)             Substitute
0001 1011  033   27   0x1B  ESC (escape)                 Escape
0001 1100  034   28   0x1C  FS (file separator)          File separator
0001 1101  035   29   0x1D  GS (group separator)         Group separator
0001 1110  036   30   0x1E  RS (record separator)        Record separator
0001 1111  037   31   0x1F  US (unit separator)          Unit separator
0010 0000  040   32   0x20  (space)                      Space
0010 0001  041   33   0x21  !                            Exclamation mark
0010 0010  042   34   0x22  "                            Double quote
0010 0011  043   35   0x23  #                            Number sign
0010 0100  044   36   0x24  $                            Dollar sign
0010 0101  045   37   0x25  %                            Percent sign
0010 0110  046   38   0x26  &                            Ampersand
0010 0111  047   39   0x27  '                            Apostrophe (closing single quote)
0010 1000  050   40   0x28  (                            Opening parenthesis
0010 1001  051   41   0x29  )                            Closing parenthesis
0010 1010  052   42   0x2A  *                            Asterisk
0010 1011  053   43   0x2B  +                            Plus sign
0010 1100  054   44   0x2C  ,                            Comma
0010 1101  055   45   0x2D  -                            Minus sign / hyphen
0010 1110  056   46   0x2E  .                            Period
0010 1111  057   47   0x2F  /                            Slash
0011 0000  060   48   0x30  0                            Digit 0
0011 0001  061   49   0x31  1                            Digit 1
0011 0010  062   50   0x32  2                            Digit 2
0011 0011  063   51   0x33  3                            Digit 3
0011 0100  064   52   0x34  4                            Digit 4
0011 0101  065   53   0x35  5                            Digit 5
0011 0110  066   54   0x36  6                            Digit 6
0011 0111  067   55   0x37  7                            Digit 7
0011 1000  070   56   0x38  8                            Digit 8
0011 1001  071   57   0x39  9                            Digit 9
0011 1010  072   58   0x3A  :                            Colon
0011 1011  073   59   0x3B  ;                            Semicolon
0011 1100  074   60   0x3C  <                            Less-than sign
0011 1101  075   61   0x3D  =                            Equals sign
0011 1110  076   62   0x3E  >                            Greater-than sign
0011 1111  077   63   0x3F  ?                            Question mark
0100 0000  0100  64   0x40  @                            At sign
0100 0001  0101  65   0x41  A                            Uppercase letter A
0100 0010  0102  66   0x42  B                            Uppercase letter B
0100 0011  0103  67   0x43  C                            Uppercase letter C
0100 0100  0104  68   0x44  D                            Uppercase letter D
0100 0101  0105  69   0x45  E                            Uppercase letter E
0100 0110  0106  70   0x46  F                            Uppercase letter F
0100 0111  0107  71   0x47  G                            Uppercase letter G
0100 1000  0110  72   0x48  H                            Uppercase letter H
0100 1001  0111  73   0x49  I                            Uppercase letter I
0100 1010  0112  74   0x4A  J                            Uppercase letter J
0100 1011  0113  75   0x4B  K                            Uppercase letter K
0100 1100  0114  76   0x4C  L                            Uppercase letter L
0100 1101  0115  77   0x4D  M                            Uppercase letter M
0100 1110  0116  78   0x4E  N                            Uppercase letter N
0100 1111  0117  79   0x4F  O                            Uppercase letter O
0101 0000  0120  80   0x50  P                            Uppercase letter P
0101 0001  0121  81   0x51  Q                            Uppercase letter Q
0101 0010  0122  82   0x52  R                            Uppercase letter R
0101 0011  0123  83   0x53  S                            Uppercase letter S
0101 0100  0124  84   0x54  T                            Uppercase letter T
0101 0101  0125  85   0x55  U                            Uppercase letter U
0101 0110  0126  86   0x56  V                            Uppercase letter V
0101 0111  0127  87   0x57  W                            Uppercase letter W
0101 1000  0130  88   0x58  X                            Uppercase letter X
0101 1001  0131  89   0x59  Y                            Uppercase letter Y
0101 1010  0132  90   0x5A  Z                            Uppercase letter Z
0101 1011  0133  91   0x5B  [                            Opening square bracket
0101 1100  0134  92   0x5C  \                            Backslash
0101 1101  0135  93   0x5D  ]                            Closing square bracket
0101 1110  0136  94   0x5E  ^                            Caret
0101 1111  0137  95   0x5F  _                            Underscore
0110 0000  0140  96   0x60  `                            Grave accent (opening single quote)
0110 0001  0141  97   0x61  a                            Lowercase letter a
0110 0010  0142  98   0x62  b                            Lowercase letter b
0110 0011  0143  99   0x63  c                            Lowercase letter c
0110 0100  0144  100  0x64  d                            Lowercase letter d
0110 0101  0145  101  0x65  e                            Lowercase letter e
0110 0110  0146  102  0x66  f                            Lowercase letter f
0110 0111  0147  103  0x67  g                            Lowercase letter g
0110 1000  0150  104  0x68  h                            Lowercase letter h
0110 1001  0151  105  0x69  i                            Lowercase letter i
0110 1010  0152  106  0x6A  j                            Lowercase letter j
0110 1011  0153  107  0x6B  k                            Lowercase letter k
0110 1100  0154  108  0x6C  l                            Lowercase letter l
0110 1101  0155  109  0x6D  m                            Lowercase letter m
0110 1110  0156  110  0x6E  n                            Lowercase letter n
0110 1111  0157  111  0x6F  o                            Lowercase letter o
0111 0000  0160  112  0x70  p                            Lowercase letter p
0111 0001  0161  113  0x71  q                            Lowercase letter q
0111 0010  0162  114  0x72  r                            Lowercase letter r
0111 0011  0163  115  0x73  s                            Lowercase letter s
0111 0100  0164  116  0x74  t                            Lowercase letter t
0111 0101  0165  117  0x75  u                            Lowercase letter u
0111 0110  0166  118  0x76  v                            Lowercase letter v
0111 0111  0167  119  0x77  w                            Lowercase letter w
0111 1000  0170  120  0x78  x                            Lowercase letter x
0111 1001  0171  121  0x79  y                            Lowercase letter y
0111 1010  0172  122  0x7A  z                            Lowercase letter z
0111 1011  0173  123  0x7B  {                            Opening curly brace
0111 1100  0174  124  0x7C  |                            Vertical bar
0111 1101  0175  125  0x7D  }                            Closing curly brace
0111 1110  0176  126  0x7E  ~                            Tilde
0111 1111  0177  127  0x7F  DEL (delete)                 Delete

Common ASCII code size rules:
  • Digits are smaller than letters. For example, "7" (55) is smaller than "A" (65).
  • The digit 0 is smaller than the digit 9, and the codes increase in sequence from 0 to 9. For example, "3" (51) is smaller than "9" (57).
  • The letter A is smaller than the letter Z, and the codes increase in order from A to Z. For example, "A" (65) is smaller than "Z" (90).
  • The uppercase form of a letter is 32 smaller than its lowercase form. For example, "A" (65) is 32 less than "a" (97).
  • The ASCII codes of several common characters: "A" is 65; "a" is 97; "0" is 48.
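These ordering rules can be checked directly with Python's built-in `ord` and `chr`. The snippet below is a small sketch; the helper `to_lower_ascii` is a hypothetical name used only to show the offset of 32 between cases.

```python
# Digits sort before uppercase letters, which sort before lowercase letters.
assert ord("7") < ord("A") < ord("a")

# Codes increase in sequence within digits and within letters.
assert [ord(d) for d in "0123456789"] == list(range(48, 58))
assert ord("A") < ord("Z")

# The uppercase and lowercase forms of a letter differ by exactly 32.
assert ord("a") - ord("A") == 32


def to_lower_ascii(ch: str) -> str:
    """Convert one uppercase ASCII letter to lowercase by adding 32 to its code."""
    if "A" <= ch <= "Z":
        return chr(ord(ch) + 32)
    return ch  # anything else is returned unchanged


print(ord("A"), ord("a"), ord("0"))  # prints: 65 97 48
print(to_lower_ascii("Q"))           # prints: q
```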

source:php.cn