Python String Summary: Worth Bookmarking!

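What Is a Python String?

A string is an object that contains a sequence of characters. A character is simply a string of length 1, so in Python a single character is also a string. Interestingly, Python has no separate character data type, whereas languages such as C, Kotlin, and Java do.

We can declare a Python string with single quotes, double quotes, triple quotes, or the str() function. The following snippet shows how to declare strings in Python: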

# A single-quote string
single_quote = 'a'  # A character in other programming languages; a string in Python
# Another single-quote string
another_single_quote = 'Programming teaches you patience.'
# A double-quote string
double_quote = "aa"
# Another double-quote string
another_double_quote = "It is impossible until it is done!"
# A triple-quote string
triple_quote = '''aaa'''
# Also a triple-quote string
another_triple_quote = """Welcome to the Python programming language. Ready, 1, 2, 3, Go!"""
# Using the str() function
string_function = str(123.45)  # str() converts the float data type to the string data type
# Another str() function
another_string_function = str(True)  # str() converts the Boolean data type to the string data type
# An empty string
empty_string = ''
# Also an empty string
second_empty_string = ""
# We are not done yet
third_empty_string = """"""  # This is also an empty string: ''''''

Another way to obtain a string in Python is the input() function. input() lets us feed values typed on the keyboard into a program. The values are read as strings, but we can convert them to other data types:

# Inputs into a Python program
input_float = input()    # Type in: 3.142
input_boolean = input()  # Type in: True
# Convert inputs into other data types
convert_float = float(input_float)     # converts the string data type to a float
convert_boolean = bool(input_boolean)  # converts the string data type to a bool (any non-empty string, even 'False', gives True)
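One caveat with the bool() conversion above: bool() on a string only checks whether the string is empty, so bool('False') is still True. The following is a minimal sketch of a stricter conversion. The helper name parse_bool and the list of accepted spellings are assumptions chosen for illustration, not part of Python itself.

```python
def parse_bool(text):
    """Convert typed text such as 'True' or 'false' to a bool (hypothetical helper)."""
    cleaned = text.strip().lower()
    if cleaned in ("true", "yes", "1"):
        return True
    if cleaned in ("false", "no", "0"):
        return False
    raise ValueError(f"Cannot interpret {text!r} as a boolean")

print(bool("False"))        # True, because the string is not empty
print(parse_bool("False"))  # False
print(float("3.142"))       # 3.142
```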

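We use the type() function to determine the data type of an object in Python; it returns the class of the object. When the object is a string, it returns the str class. Likewise, it returns dict, int, float, tuple, or bool when the object is a dictionary, integer, float, tuple, or Boolean, respectively. Let us use type() to determine the data types of the variables declared in the previous snippets:

# Data types/classes with type()
print(type(single_quote))          # <class 'str'>
print(type(another_triple_quote))  # <class 'str'>
print(type(empty_string))          # <class 'str'>
print(type(input_float))           # <class 'str'>
print(type(input_boolean))         # <class 'str'>
print(type(convert_float))         # <class 'float'>
print(type(convert_boolean))       # <class 'bool'>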
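When the goal is to test a type rather than print it, isinstance() is a common companion to type(). A small sketch, using a variable name (value) introduced here only for illustration:

```python
value = str(123.45)

# type() returns the exact class of the object
print(type(value) is str)                # True
# isinstance() also accepts a tuple of classes
print(isinstance(value, (int, float)))   # False
print(isinstance(True, int))             # True: bool is a subclass of int
```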

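The ASCII Table and Python String Characters

The American Standard Code for Information Interchange (ASCII) was designed to map characters or text to numbers, because numbers are easier to store in computer memory than text. ASCII encodes 128 characters, mainly from the English language, for processing information in computers and programming. The English characters encoded by ASCII include lowercase letters (a-z), uppercase letters (A-Z), digits (0-9), and symbols such as punctuation marks.

The ord() function converts a Python string of length 1 (one character) to its decimal representation on the ASCII table, while the chr() function converts the decimal representation back to a string. For example: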

import string
# Convert uppercase characters to their ASCII decimal numbers
ascii_upper_case = string.ascii_uppercase  # Output: ABCDEFGHIJKLMNOPQRSTUVWXYZ
for one_letter in ascii_upper_case[:5]:  # Loop through ABCDE
    print(ord(one_letter))

Output:

65
66
67
68
69

# Convert digit characters to their ASCII decimal numbers
ascii_digits = string.digits  # Output: 0123456789
for one_digit in ascii_digits[:5]:  # Loop through 01234
    print(ord(one_digit))

Output:

48
49
50
51
52

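In the snippets above, we looped through the strings ABCDE and 01234 and converted each character to its decimal representation on the ASCII table. We can also do the reverse with the chr() function, converting decimal numbers on the ASCII table to their Python string characters. For example:

decimal_rep_ascii = [37, 44, 63, 82, 100]
for one_decimal in decimal_rep_ascii:
    print(chr(one_decimal))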

Output:

%
,
?
R
d

On the ASCII table, the string characters in the output above map to their respective decimal numbers.
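
Because ord() and chr() are inverses, simple arithmetic on the decimal values can transform characters. The sketch below uses a hypothetical helper, to_upper_ascii, that relies on the fact that each lowercase ASCII letter sits 32 positions after its uppercase counterpart. It also shows that ord() is not limited to ASCII: it returns the Unicode code point, which matches ASCII for the first 128 characters.

```python
def to_upper_ascii(text):
    """Uppercase a-z by subtracting 32 from each code (illustrative helper)."""
    result = ""
    for ch in text:
        if "a" <= ch <= "z":
            result += chr(ord(ch) - 32)  # 'a' is 97, 'A' is 65
        else:
            result += ch
    return result

print(to_upper_ascii("abc, xyz!"))   # ABC, XYZ!
print(chr(ord("R")) == "R")          # True: chr() undoes ord()
print(ord("中"))                      # 20013, a code point beyond the 128 ASCII values
```

In everyday code, str.upper() is the idiomatic choice; the helper only illustrates the mapping.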

String Properties

Zero indexing: The first element of a string has index zero, and the last element has index len(string) - 1. For example:

immutable_string = "Accountability"
print(len(immutable_string))
print(immutable_string.index('A'))
print(immutable_string.index('y'))


Output:

14
0
13

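As a complement to the example above, the last element can also be reached with a negative index, and index() reports only the first match. A brief sketch (the variable names word and last are introduced here for illustration):

```python
word = "Accountability"
last = len(word) - 1

print(word[last])         # y
print(word[-1])           # y, negative indices count from the end
print(word.index("c"))    # 1, index() reports the first match only
print(word.rindex("c"))   # 2, rindex() reports the last match
```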

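Immutability: This means we cannot update the characters in a string. For example, we cannot delete an element from a string or assign a new element at any of its index positions. Attempting to update a string raises a TypeError: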

immutable_string = "Accountability"
# Assign a new element at index 0
immutable_string[0] = 'B'


Output:

---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
~\AppData\Local\Temp/ipykernel_11336/2351953155.py in <module>
      2 
      3 # Assign a new element at index 0
----> 4 immutable_string[0] = 'B'

TypeError: 'str' object does not support item assignment

However, we can reassign a new string to the immutable_string variable. Note that the two are not the same string, because they do not point to the same object in memory. Python does not update the old string object; it creates a new one, as the ids show:

immutable_string = "Accountability"
print(id(immutable_string))
immutable_string = "Bccountability"
print(id(immutable_string))
test_immutable = immutable_string
print(id(test_immutable))

Output:

2693751670576
2693751671024
2693751671024

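The first two ids differ even on the same computer, which means the two immutable_string values live at different addresses in memory (the exact numbers vary from run to run). We then assigned the last immutable_string variable to test_immutable, and we can see that test_immutable and the last immutable_string point to the same address.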
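
Since a string cannot be edited in place, the usual way to "change" one is to build a new string from pieces of the old one. A minimal sketch, with variable names (original, patched, replaced) chosen here for illustration:

```python
original = "Accountability"

# Build a new string from a replacement character plus a slice of the old one
patched = "B" + original[1:]
print(patched)    # Bccountability
print(original)   # Accountability (unchanged)

# str.replace() also returns a new string; the third argument limits it to the first match
replaced = original.replace("A", "B", 1)
print(replaced == patched)   # True
```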

Concatenation: Joining two or more strings together with the + symbol to get a new string. For example:

first_string = "Zhou"
second_string = "luobo"
third_string = "Learn Python"
fourth_string = first_string + second_string
print(fourth_string)
fifth_string = fourth_string + " " + third_string
print(fifth_string)


Output:

Zhouluobo
Zhouluobo Learn Python

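When many strings need to be joined, str.join() is a common alternative to repeated + operations. A short sketch, assuming the pieces are already collected in a list (the list name words is illustrative):

```python
words = ["Zhou", "luobo", "Learn", "Python"]

# join() concatenates every item, placing the separator between them
print("".join(words[:2]))           # Zhouluobo
print(" ".join(words))              # Zhou luobo Learn Python
print(", ".join(["a", "b", "c"]))   # a, b, c
```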

Repetition: A string can be repeated with the * symbol. For example:

print("Ha" * 3)

Output:

HaHaHa

Indexing and slicing: We have established that strings are zero-indexed, so we can access any element of a string by its index value. We can also take a subset of a string by slicing between two index values. For example:

main_string = "I learned English and Python with ZHouluobo. You can do it too!"
# Index 0
print(main_string[0])
# Index 1 (a space, so the printed line looks empty)
print(main_string[1])
# Check if Index 1 is whitespace
print(main_string[1].isspace())
# Slicing 1
print(main_string[0:17])
# Slicing 2
print(main_string[-18:])
# Slicing and concatenation
print(main_string[0:17] + ". " + main_string[-18:])

Output:

I
 
True
I learned English
You can do it too!
I learned English. You can do it too!
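Slices also accept omitted bounds and an optional step. The following sketch uses a short illustrative string (text) to make the results easy to check by eye:

```python
text = "Python"

print(text[:2])     # Py     (start defaults to 0)
print(text[2:])     # thon   (stop defaults to the end)
print(text[::2])    # Pto    (every second character)
print(text[::-1])   # nohtyP (a step of -1 reverses the string)
print(text[10:])    # prints an empty line; out-of-range slices do not raise
```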

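String Methods

str.split(sep=None, maxsplit=-1): The split method takes two parameters, sep and maxsplit. Called with its default values, it splits the string wherever there is whitespace. The method returns a list of strings: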

string = "Apple, Banana, Orange, Blueberry"
print(string.split())


Output:

['Apple,', 'Banana,', 'Orange,', 'Blueberry']

We can see that the string is not split cleanly, because the pieces still contain commas. We can use sep=',' to split wherever there is a comma:

print(string.split(sep=','))


Output:

['Apple', ' Banana', ' Orange', ' Blueberry']

This is better than the previous split, but some of the pieces now start with a space. We can remove it with sep=', ':

# Notice the whitespace after the comma
print(string.split(sep=', '))


Output:

['Apple', 'Banana', 'Orange', 'Blueberry']

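Now the string is split cleanly. Sometimes we do not want to split the maximum number of times; the maxsplit parameter lets us specify how many splits to perform: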

print(string.split(sep=', ', maxsplit=1))
print(string.split(sep=', ', maxsplit=2))


Output:

['Apple', 'Banana, Orange, Blueberry']
['Apple', 'Banana', 'Orange, Blueberry']

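Two details of split() are easy to miss: the default (sep=None) collapses runs of whitespace, while an explicit sep=' ' does not, and rsplit() applies maxsplit from the right. A brief sketch with illustrative strings (spaced and fruits are names introduced here):

```python
spaced = "Apple  Banana   Orange"   # irregular runs of spaces

print(spaced.split())      # ['Apple', 'Banana', 'Orange']
print(spaced.split(" "))   # ['Apple', '', 'Banana', '', '', 'Orange']

fruits = "Apple, Banana, Orange, Blueberry"
print(fruits.rsplit(", ", maxsplit=1))   # ['Apple, Banana, Orange', 'Blueberry']
```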

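str.splitlines(keepends=False): Sometimes we want to process a corpus that has different line breaks ('\n', '\n\n', '\r', '\r\n') at its boundaries, and we want to split it into lines rather than individual words. The splitlines method does this. When keepends=True, the line breaks are kept in the text; otherwise they are excluded:

import nltk  # You may have to `pip install nltk` to use this library.
# The Gutenberg corpus may also need a one-time download: nltk.download('gutenberg')
macbeth = nltk.corpus.gutenberg.raw('shakespeare-macbeth.txt')
print(macbeth.splitlines(keepends=True)[:5])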

Output:

['[The Tragedie of Macbeth by William Shakespeare 1603]\n', '\n', '\n', 'Actus Primus. Scoena Prima.\n', '\n']
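The same behavior can be checked without downloading a corpus. The sketch below uses a made-up string (text) that mixes three kinds of line breaks and contrasts splitlines() with split('\n'):

```python
text = "first line\nsecond line\r\nthird line\rfourth line"

print(text.splitlines())
# ['first line', 'second line', 'third line', 'fourth line']
print(text.splitlines(keepends=True))
# ['first line\n', 'second line\r\n', 'third line\r', 'fourth line']
print(text.split("\n"))
# ['first line', 'second line\r', 'third line\rfourth line']
```

Splitting on '\n' alone leaves stray '\r' characters behind, which is why splitlines() tends to be the safer choice for text of mixed origin.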

str.strip([chars]): We use the strip method to remove leading and trailing whitespace or characters from both sides of a string. For example:

# The string starts and ends with spaces
string = "    Apple Apple Apple no apple in the box apple apple    "
stripped_string = string.strip()
print(stripped_string)
left_stripped_string = (
    stripped_string
    .lstrip('Apple')
    .lstrip()
    .lstrip('Apple')
    .lstrip()
    .lstrip('Apple')
    .lstrip()
)
print(left_stripped_string)
capitalized_string = left_stripped_string.capitalize()
print(capitalized_string)
right_stripped_string = (
    capitalized_string
    .rstrip('apple')
    .rstrip()
    .rstrip('apple')
    .rstrip()
)
print(right_stripped_string)

Output:

Apple Apple Apple no apple in the box apple apple
no apple in the box apple apple
No apple in the box apple apple
No apple in the box

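One subtlety in the example above: the argument to lstrip('Apple') or rstrip('apple') is treated as a set of characters, not as a whole word. When the intent is to remove an exact prefix or suffix, removeprefix() and removesuffix() may be a better fit; note that they assume Python 3.9 or later. The file names below are made up for illustration.

```python
filename = "test_settings.py"

print(filename.lstrip("test_"))         # ings.py (strips every leading t, e, s, _)
print(filename.removeprefix("test_"))   # settings.py

script = "happy.py"
print(script.rstrip(".py"))             # ha (strips every trailing ., p, y)
print(script.removesuffix(".py"))       # happy
```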

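In the snippet above, we used the lstrip and rstrip methods, which remove leading and trailing whitespace or characters from the left and right sides of a string, respectively. We also used the capitalize method, which converts a string to sentence case.

str.zfill(width): The zfill method pads a string with leading 0s to reach the specified width. For example:

example = "0.8"  # len(example) is 3
example_zfill = example.zfill(5)  # len(example_zfill) is 5
print(example_zfill)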

Output:

000.8

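zfill() is aware of a leading sign, which distinguishes it from plain right-justification. A small sketch comparing it with rjust() and with an f-string format spec:

```python
print("42".zfill(5))         # 00042
print("-42".zfill(5))        # -0042, the sign stays in front of the padding
print("12345".zfill(3))      # 12345, strings already wide enough are unchanged
print("42".rjust(5, "0"))    # 00042
print("-42".rjust(5, "0"))   # 00-42, rjust() pads blindly on the left
print(f"{7:03d}")            # 007, the same padding via a format spec
```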

str.isalpha(): This method returns True if all the characters in the string are alphabetic; otherwise it returns False:

# Alphabet string
alphabet_one = "Learning"
print(alphabet_one.isalpha())
# Contains whitespace
alphabet_two = "Learning Python"
print(alphabet_two.isalpha())
# Contains a comma symbol
alphabet_three = "Learning,"
print(alphabet_three.isalpha())

Output:

True
False
False

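str.isalnum() returns True if all the characters in the string are alphanumeric; str.isdecimal() returns True if they are all decimal characters; str.isdigit() returns True if they are all digits; and str.isnumeric() returns True if they are all numeric characters.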
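
The last three methods overlap for ordinary digits and differ only for less common characters. The sketch below picks a few sample characters (an assumption made for illustration) to show where they diverge:

```python
samples = ["5", "²", "½", "abc1"]

for sample in samples:
    print(repr(sample), sample.isdecimal(), sample.isdigit(),
          sample.isnumeric(), sample.isalnum())

# Results per sample (isdecimal, isdigit, isnumeric, isalnum):
# '5'    -> True,  True,  True,  True
# '²'    -> False, True,  True,  True   (superscript two)
# '½'    -> False, False, True,  True   (vulgar fraction one half)
# 'abc1' -> False, False, False, True
```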

str.islower() returns True if all the cased characters in the string are lowercase; str.isupper() returns True if they are all uppercase; and str.istitle() returns True if the first letter of every word is capitalized:

# islower() example
string_one = "Artificial Neural Network"
print(string_one.islower())
string_two = string_one.lower()  # converts string to lowercase
print(string_two.islower())
# isupper() example
string_three = string_one.upper()  # converts string to uppercase
print(string_three.isupper())
# istitle() example
print(string_one.istitle())

Output:

False
True
True
True

str.endswith(suffix) returns True if the string ends with the specified suffix, and str.startswith(prefix) returns True if the string starts with the specified prefix. Both also accept a tuple of candidates:

sentences = ['Time to master data science', 'I love statistical computing', 'Eat, sleep, code']
# endswith() example
for one_sentence in sentences:
    print(one_sentence.endswith(('science', 'computing', 'Code')))

Output:

True
True
False

# startswith() example
for one_sentence in sentences:
    print(one_sentence.startswith(('Time', 'I ', 'Ea')))

Output:

True
True
True

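The third endswith() result above is False only because 'Code' and 'code' differ in case. One possible way to make the check case-insensitive is to normalize the string first, as in this sketch:

```python
sentences = ['Time to master data science', 'I love statistical computing', 'Eat, sleep, code']

# Normalize the case first so that 'code' also matches 'Code'
results = [s.lower().endswith(('science', 'computing', 'code')) for s in sentences]
print(results)   # [True, True, True]

# any() answers "does at least one sentence start with this prefix?"
print(any(s.startswith('Eat') for s in sentences))   # True
```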

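str.find(substring) returns the lowest index at which the substring is found in the string; otherwise it returns -1. str.rfind(substring) returns the highest index. str.index(substring) and str.rindex(substring) likewise return the lowest and highest index of the substring when it is found, but they raise a ValueError if the substring does not exist in the string: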

string = "programming"
# find() and rfind() examples
print(string.find('m'))
print(string.find('pro'))
print(string.rfind('m'))
print(string.rfind('game'))
# index() and rindex() examples
print(string.index('m'))
print(string.index('pro'))
print(string.rindex('m'))
print(string.rindex('game'))


Output:

6
0
7
-1
6
0
7
---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
~\AppData\Local\Temp/ipykernel_11336/3954098241.py in <module>
     11 print(string.index('pro'))# Output: 0
     12 print(string.rindex('m'))# Output: 7
---> 13 print(string.rindex('game'))# Output: ValueError: substring not found

ValueError: substring not found
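Two patterns follow naturally from the behavior above: using the optional start argument of find() to collect every match, and catching the ValueError from index(). The helper name find_all is hypothetical and exists only for this sketch.

```python
def find_all(text, substring):
    """Return every index where substring starts in text (hypothetical helper)."""
    positions = []
    start = text.find(substring)
    while start != -1:
        positions.append(start)
        start = text.find(substring, start + 1)  # resume just after the last hit
    return positions

print(find_all("programming", "m"))      # [6, 7]
print(find_all("programming", "game"))   # []

try:
    "programming".index("game")
except ValueError as error:
    print(f"Not found: {error}")         # Not found: substring not found
```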

str.maketrans(dict_map) creates a translation table from a dictionary mapping, and str.translate(table) replaces the characters in the string with their mapped values. For example:

example = "abcde"
mapped = {'a':'1', 'b':'2', 'c':'3', 'd':'4', 'e':'5'}
print(example.translate(example.maketrans(mapped)))

Output:

12345
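Besides a dictionary, str.maketrans() also accepts two equal-length strings, plus an optional third string of characters to delete. A short sketch with illustrative inputs:

```python
# Two equal-length strings: each character in the first maps to the one in the second
table = str.maketrans("abcde", "12345")
print("abcde".translate(table))       # 12345
print("a bad cab".translate(table))   # 1 214 312

# A third argument lists characters to delete
strip_vowels = str.maketrans("", "", "aeiou")
print("programming".translate(strip_vowels))   # prgrmmng
```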

String Operations

Looping Through a String

Strings are iterable, so they support looping with a for loop and with enumerate:

# For-loop example
word = "bank"
for letter in word:
    print(letter)

Output:

b
a
n
k

# Enumerate example
for idx, value in enumerate(word):
    print(idx, value)

Output:

0 b
1 a
2 n
3 k

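enumerate() becomes useful when the position matters as much as the character. The sketch below, using an illustrative word, collects the index of every matching letter and shows the optional start parameter:

```python
word = "banana"

# Collect the index of every 'a' by pairing indices with characters
positions = [idx for idx, letter in enumerate(word) if letter == "a"]
print(positions)   # [1, 3, 5]

# start=1 gives human-friendly numbering without changing the string
numbered = list(enumerate("bank", start=1))
print(numbered)    # [(1, 'b'), (2, 'a'), (3, 'n'), (4, 'k')]
```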

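Strings and Relational Operators

When two strings are compared with relational operators (>, <, ==, and so on), their elements are compared index by index using their ASCII decimal numbers (more precisely, their Unicode code points, which match ASCII for these characters). For example: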

print('a' > 'b')
print('abc' > 'b')


Output:

False
False

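In both cases the output is False. The relational operator first compares the ASCII decimal numbers of the elements at index 0 of the two strings. Since b is greater than a, it returns False; the ASCII decimal numbers of the remaining elements and the lengths of the strings do not matter in this case.

More generally, Python compares the elements from index 0 onward until it finds a pair with different ASCII decimal numbers; if one string runs out first, the shorter string is the smaller one. For example, with two strings of the same length: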

print('abd' > 'abc')


Output:

True

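A practical consequence of comparing by decimal value is that every uppercase ASCII letter sorts before every lowercase one. The sketch below, with a made-up list of names, shows the effect and one possible workaround using key=str.lower:

```python
print(ord('Z'), ord('a'))   # 90 97
print('Zebra' < 'apple')    # True: uppercase letters have smaller codes
print('abc' < 'abcd')       # True: a prefix is smaller than the longer string

names = ['bob', 'Alice', 'carol', 'Dave']
print(sorted(names))                  # ['Alice', 'Dave', 'bob', 'carol']
print(sorted(names, key=str.lower))   # ['Alice', 'bob', 'carol', 'Dave']
```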

Checking Membership of a String

The in operator checks whether a substring is a member of a string:

print('data' in 'dataquest')
print('gram' in 'programming')


Output:

True
True

Another way to check string membership, replace substrings, or match patterns is to use regular expressions:

import re
substring = 'gram'
string = 'programming'
replacement = '1234'
# Check membership
print(re.search(substring, string))
# Replace string
print(re.sub(substring, replacement, string))

Output:

<re.Match object; span=(3, 7), match='gram'>
pro1234ming
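In practice the Match object is usually inspected rather than printed, and a failed search returns None. The sketch below also uses re.escape(), which helps when the text to find contains characters that are special in a pattern; the version strings are made up for illustration.

```python
import re

string = 'programming'

match = re.search('gram', string)
if match:                   # a Match object is truthy
    print(match.group())    # gram
    print(match.span())     # (3, 7)

print(re.search('game', string))   # None when nothing matches

# re.escape() makes special characters such as '.' match literally
print(re.sub(re.escape('3.5'), 'X', 'v3.5 and v315'))   # vX and v315
print(re.sub('3.5', 'X', 'v3.5 and v315'))              # vX and vX
```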

String Formatting

f-strings and the str.format() method are used to format strings. Both use curly-brace {} placeholders. For example:

monday, tuesday, wednesday = "Monday", "Tuesday", "Wednesday"
format_string_one = "{} {} {}".format(monday, tuesday, wednesday)
print(format_string_one)
format_string_two = "{2} {1} {0}".format(monday, tuesday, wednesday)
print(format_string_two)
format_string_three = "{one} {two} {three}".format(one=tuesday, two=wednesday, three=monday)
print(format_string_three)
format_string_four = f"{monday} {tuesday} {wednesday}"
print(format_string_four)


Output:

Monday Tuesday Wednesday
Wednesday Tuesday Monday
Tuesday Wednesday Monday
Monday Tuesday Wednesday

f-strings are more readable, and they generally run faster than the str.format() method. For these reasons, f-strings are the preferred way to format strings.
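
Both styles share the same format-spec syntax after a colon inside the braces, which controls precision, width, and alignment. A brief sketch with illustrative values (pi, count, and label are names chosen here):

```python
pi = 3.14159
count = 42
label = "total"

print(f"{pi:.2f}")              # 3.14
print(f"{count:>6}|")           #     42|  (right-aligned in 6 columns)
print(f"{label:<8}|")           # total   |  (left-aligned in 8 columns)
print(f"{count:08.3f}")         # 0042.000
print(f"{1234567:,}")           # 1,234,567
print("{:.1%}".format(0.256))   # 25.6%
```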

Handling Quotation Marks and Apostrophes

The apostrophe (') delimits a string in Python. To tell Python that an apostrophe is part of the text rather than the end of the string, we must use the Python escape character (\). An apostrophe inside a string is therefore written as \'. Unlike apostrophes, quotation marks can be handled in several ways in Python, including the following:

# 1. Represent the string with single quotes ('') and the quoted statement with double quotes ("")
quotes_one = '"Friends don\'t let friends use minibatches larger than 32" - Yann LeCun'
print(quotes_one)
# 2. Represent the string with double quotes ("") and the quoted statement with escaped double quotes (\"statement\")
quotes_two = "\"Friends don\'t let friends use minibatches larger than 32\" - Yann LeCun"
print(quotes_two)
# 3. Represent the string with triple quotes ("""""") and the quoted statement with double quotes ("")
quote_three = """"Friends don\'t let friends use minibatches larger than 32" - Yann LeCun"""
print(quote_three)

Output:

"Friends don't let friends use minibatches larger than 32" - Yann LeCun
"Friends don't let friends use minibatches larger than 32" - Yann LeCun
"Friends don't let friends use minibatches larger than 32" - Yann LeCun

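The choice of delimiter affects only how the literal is written, not the resulting string. The sketch below writes the same short sentence (made up for illustration) three ways and confirms that the results are equal:

```python
single = 'It\'s a "quoted" word'      # escape the apostrophe inside single quotes
double = "It's a \"quoted\" word"     # escape the quotation marks inside double quotes
triple = """It's a "quoted" word"""   # no escapes needed inside triple quotes

print(single)                       # It's a "quoted" word
print(single == double == triple)   # True
print(len("\'"))                    # 1: the backslash is not part of the string
```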