python3.x - python 中的maketrans在utf-8檔中該怎麼使用

Question

我寫了一個處理文字的檔案就是把文字中所有的符號都替換掉，然後替換成空格。用的python中maketrans和translate。其中在使用對於ASCII編碼的檔案時是正常的，但對於utf-8檔案時，就報錯，提示maketrans中的參數不等長...

滿天的星座 · Answer

首先這兩個字串長度不相等， " 是一个字符， \ 也是一个字符
你可以用 len() 查看。
接著關於字串什麼的問題，最好說明 python 的版本

maketrans 參數長度不相等

 my_substitutions = the_text.maketrans(
        # If you find any of these
        "ABCDEFGHIJKLMNOPQRSTUVWXYZ0123456789!\"#$%&()*+,-./:;<=>?@[]^_`{|}~'\",
        # Replace them by these
        "abcdefghijklmnopqrstuvwxyz                                            ")

測試程式碼：

from string import translate, maketrans

def text_to_words(the_text):
    """ 
        Return a list of words with all punctuation removed,
        and all in lowercase.
    """
    my_substitutions = maketrans(
        # If you find any of these
        "ABCDEFGHIJKLMNOPQRSTUVWXYZ0123456789!\"#$%&()*+,-./:;<=>?@[]^_`{|}~'\",
        # Replace them by these
        "abcdefghijklmnopqrstuvwxyz                                          ")
    # Translate the text now.
    cleaned_text = the_text.translate(my_substitutions)
    wds = cleaned_text.split()
    return wds

text_to_words('ABCDEFGHIJKLMNOPQRSTUVWXYZ0123456789!\"#$%&()*+,-./:;<=>?@[]^_`{|}~\'\测试')

output

['abcdefghijklmnopqrstuvwxyz', '\xe6\xb5\x8b\xe8\xaf\x95']

這是 python2 的運作結果

php8，我來也

30分鐘學會網站佈局

尚觀Oracle入門到精通視頻教程

你的第一行UNI-APP程式碼

Flutter 從頭到應用程式啟動

兄弟連新版Linux視頻教程

AXURE 9影片教學（適用於產品經理互動產品設計UI）

零基礎PS影片教學

16天帶你入門UI視頻教程

PS技巧和切片技巧影片教學

阿裡雲環境搭建以及項目上線視頻教程

電腦網路概述－程式設計師必須掌握的基礎知識

程式設計師必備教學——HTTP協定講解

Websocket影片教學