Python’s basic data types

1. Integers

Python can handle integers of any size, including negative integers. In a Python program, integers are written exactly as they are in mathematics, for example: 1, 100, -8080, 0.

Since computers use binary, it is sometimes more convenient to use hexadecimal to represent integers. Hexadecimal is represented by the 0x prefix and 0-9, a-f, for example: 0xff00, 0xa5b4c3d2, etc.
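As a small illustration of the integer notation described above, the sketch below uses arbitrary sample values (the variable names are made up for this example):

```python
# Integers are written as in mathematics; hexadecimal literals use the 0x prefix.
big = 2 ** 100        # Python integers grow as large as needed
negative = -8080
hex_value = 0xff00    # the same integer as 65280 in decimal

print(big)            # 1267650600228229401496703205376
print(negative)       # -8080
print(hex_value)      # 65280
print(hex(65280))     # 0xff00 (integer converted back to a hexadecimal string)
```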

2. Floating point numbers

Floating-point numbers are numbers with a fractional part. They are called floating-point numbers because, in scientific notation, the position of the decimal point can move: 1.23 x 10^9 and 12.3 x 10^8 are the same value. Integers and floating-point numbers are stored differently inside the computer. Integer arithmetic is always exact (in Python 3 this includes floor division with //, while the / operator always returns a floating-point result), whereas floating-point operations may have rounding errors.
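The difference between exact integer arithmetic and floating-point rounding can be seen in a short sketch. It assumes Python 3, and the sample values are arbitrary:

```python
import math

floor_div = 10 // 3       # integer floor division: exact, gives 3
remainder = 10 % 3        # integer remainder: exact, gives 1
true_div = 10 / 3         # in Python 3, / always produces a float
float_sum = 0.1 + 0.2     # floating-point addition picks up a rounding error
sci = 1.23e9              # scientific notation for 1.23 x 10^9

print(floor_div, remainder)
print(type(true_div).__name__)         # float
print(float_sum == 0.3)                # False, because of the rounding error
print(math.isclose(float_sum, 0.3))    # True: compare floats with a tolerance
print(sci)
```

Comparing floats with math.isclose rather than == is the usual way to work around the rounding error.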

3. String

A string is any text enclosed in '' or "", such as 'abc' or "123". Note that the '' or "" is only a way of writing the string, not part of the string itself. The string 'abc' therefore has only 3 characters: a, b, c. Most other programming languages work the same way.
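A quick way to check that the quotes are not part of the string is to measure its length. The variable names here are only illustrative:

```python
single = 'abc'
double = "abc"

print(len(single))        # 3: the quotes are not counted
print(single == double)   # True: both notations produce the same string
```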

(1) Strings and string escaping in Python

As mentioned above, strings can be written with '' or "". But what if the string itself contains ' or "?

In that case, the special characters in the string need to be "escaped". Python strings are escaped with \, the same as in Java.

Commonly used escape characters include:

\n represents a newline
\t represents a tab character
\\ represents the \ character itself

Specific examples:
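A minimal sketch of these escapes, using made-up sample strings (the variable names are illustrative and do not come from the original):

```python
quoted = 'I\'m "OK"'          # \' puts a single quote inside a '...' string
two_lines = 'first\nsecond'   # \n starts a new line
tabbed = 'name\tvalue'        # \t inserts a tab character
backslash = 'C:\\temp'        # \\ is a single backslash character

print(quoted)
print(two_lines)
print(tabbed)
print(backslash)
print(len(backslash))         # 7: the doubled backslash counts as one character
```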

This raises a problem. If a string contains many characters that need escaping, escaping each one is tedious. To handle this, we can put the prefix r in front of the string to mark it as a raw string, in which the characters do not need to be escaped.

Note, however, that r'...' cannot span multiple lines, and it cannot cleanly represent a string that contains both ' and ", since a backslash placed before the delimiting quote remains part of the string.

To write a multi-line string, use '''...'''. You can also put r in front of a multi-line string to make it a raw multi-line string.
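The following sketch compares an escaped string with its raw equivalent and shows the triple-quoted forms. The sample text and variable names are assumptions chosen for illustration:

```python
escaped = '\\(~_~)/ \\(~_~)/'     # every backslash has to be doubled
raw = r'\(~_~)/ \(~_~)/'          # raw string: backslashes are kept as written

multi = '''Line 1
Line 2
Line 3'''                          # triple quotes allow real line breaks

raw_multi = r'''C:\new
D:\table'''                        # raw + multi-line: \n and \t stay literal

print(escaped == raw)              # True
print(multi)
print(raw_multi)
```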

(2) String encoding issue

Computers can only process numbers, so text must first be converted into numbers before it can be processed. The earliest computers were designed with 8 bits to a byte, so the largest integer a single byte can represent is 255 (binary 11111111 = decimal 255). The earliest encoding table, ASCII, fits in a single byte and covers uppercase and lowercase English letters, digits and some symbols (it uses only the values 0 to 127). For example, the encoding of the uppercase letter A is 65, and the encoding of the lowercase letter z is 122.
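The ASCII values mentioned above can be checked with the built-in ord and chr functions. This is only a small sketch with illustrative variable names:

```python
code_upper_a = ord('A')            # character -> number
code_lower_z = ord('z')
letter = chr(65)                   # number -> character
max_byte = 0b11111111              # binary literal for the largest one-byte value
ascii_bytes = 'ABC'.encode('ascii')

print(code_upper_a)                # 65
print(code_lower_z)                # 122
print(letter)                      # A
print(max_byte)                    # 255
print(list(ascii_bytes))           # [65, 66, 67]
```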

If you want to represent Chinese, obviously one byte is not enough. At least two bytes are needed, and it cannot conflict with ASCII encoding. Therefore, China has formulated the GB2312 encoding to encode Chinese.

Similarly, other languages such as Japanese and Korean also have this problem. In order to unify the encoding of all text, Unicode came into being. Unicode unifies all languages into a set of encodings, so there will no longer be garbled characters.

Unicode was originally designed around two bytes per character (very rare characters need more). For the original English characters, going from a single byte to two bytes only requires filling the high byte with 0.
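The zero high byte can be observed by encoding characters into a two-byte form. The sketch below assumes the 'utf-16-be' codec as a stand-in for "two bytes per character"; the text itself does not name a specific two-byte encoding:

```python
code_a = ord('A')                         # Unicode code point of A
code_zh = ord('中')                       # Unicode code point of 中

# Assumption: UTF-16 (big-endian) is used here as the two-byte representation.
a_bytes = 'A'.encode('utf-16-be')
zh_bytes = '中'.encode('utf-16-be')

print(code_a, hex(code_a))                # 65 0x41
print(code_zh, hex(code_zh))              # 20013 0x4e2d
print(a_bytes.hex())                      # 0041: high byte is 0
print(zh_bytes.hex())                     # 4e2d: both bytes are used
```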

Because Python was created before the Unicode standard was released, the earliest versions of Python supported only ASCII. An ordinary string such as 'ABC' was ASCII-encoded internally.

Python later added support for Unicode; in Python 2, a Unicode string is written as u'...'.

In Python 3, however, all strings are Unicode, so Python strings support multiple languages, and Chinese text can be written and displayed normally without the u'...' prefix.
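A short sketch of how a Python 3 string relates to encoded bytes, assuming Python 3 and its standard 'utf-8' and 'gb2312' codecs; the sample text is arbitrary:

```python
text = '中文'                          # a Python 3 str is Unicode, no u'' prefix needed

utf8_bytes = text.encode('utf-8')      # str -> bytes using UTF-8
gb_bytes = text.encode('gb2312')       # str -> bytes using GB2312
round_trip = utf8_bytes.decode('utf-8')

print(len(text))                       # 2 characters
print(len(utf8_bytes))                 # 6 bytes in UTF-8
print(len(gb_bytes))                   # 4 bytes in GB2312
print(round_trip == text)              # True
```

Decoding bytes with a different encoding from the one used to encode them is the typical source of garbled text.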

Because Python source code is itself a text file, a source file that contains Chinese must be saved in UTF-8 encoding. So that the Python interpreter reads the source as UTF-8, these two lines are usually written at the beginning of the file:

#!/usr/bin/env python3
# -*- coding: utf-8 -*-

The first comment line tells Linux/OS X that this file is an executable Python program; Windows ignores it.

The second comment line tells the Python interpreter to read the source code as UTF-8; otherwise, Chinese text written in the source may be output as garbled characters. (Python 3 already assumes UTF-8 when no declaration is present, so this line matters most for Python 2.)

Declaring UTF-8 does not by itself make your .py file UTF-8 encoded. You must also make sure the text editor actually saves the file as UTF-8 without BOM.

4. Boolean value

Boolean values work exactly as in Boolean algebra. A Boolean value is either True or False. In Python you can write True and False directly (note the capitalization), or obtain them as the result of Boolean operations.

Boolean values can be combined with and, or and not.

and is the AND operation: the result is True only when all operands are True.

or is the OR operation: the result is True as long as at least one operand is True.

not is the NOT operation: it is a unary operator that turns True into False and False into True.
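The three operators can be tried out in a few lines. The age variable and the threshold of 18 are hypothetical values chosen only for this sketch:

```python
both = True and False      # False: and needs every operand to be True
either = True or False     # True: or needs at least one True operand
negated = not True         # False: not flips the value

age = 20                   # hypothetical sample value
is_adult = age >= 18       # comparisons produce Boolean values

print(both, either, negated)
print(is_adult)            # True
print(is_adult and not both)
```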

5. Null value

Almost every programming language has a special value that means "no value", the null value. In Python it is written None.
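A brief sketch of where None typically shows up. The find_age function and its sample data are hypothetical, made up for this example:

```python
def find_age(name):
    """Hypothetical lookup that returns None when the name is unknown."""
    ages = {'alice': 30}
    return ages.get(name)      # dict.get returns None for a missing key


def no_return():
    pass                       # a function without a return statement returns None


missing = find_age('bob')

print(missing is None)         # True: test for None with "is"
print(find_age('alice'))       # 30
print(no_return() is None)     # True
print(None == 0)               # False: None is not the same as 0
```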
