A problem with Python's requests

Question

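Importing requests raises the error ''' UnicodeDecodeError: 'ascii' codec can't decode byte 0xc9 in position 1: ordinal not in range(128)'''
I looked it up, and this error usually comes from converting between character encodings. But when it is raised from inside a library, I don't understand the cause.
I'm a beginner, so I would appreciate any pointers. Thanks for your trouble.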

PHP中文网 · Answer

If you just want to experiment, consider using Python3 directly. Compared with Python2, Python3 has far fewer character encoding problems.

阿神 · Answer

Judging from your error message, the problem is probably the encoding of lanxi.py. First try opening a cmd console, running python, and doing the import there. It is unlikely to be a problem with requests itself.
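One way to test that guess is to scan the script for bytes outside the ASCII range. The sketch below is only an illustration: `find_non_ascii` is a hypothetical helper, and the file it scans is a throwaway temp file standing in for lanxi.py, with two made-up non-ASCII bytes in a comment.

```python
import os
import tempfile


def find_non_ascii(path):
    """Return (offset, byte value) for every non-ASCII byte in the file."""
    with open(path, "rb") as handle:
        data = bytearray(handle.read())  # bytearray yields ints on py2 and py3
    return [(offset, value) for offset, value in enumerate(data) if value > 127]


# Stand-in for the real script: one ASCII line, then a comment holding
# two non-ASCII bytes (assumed here for illustration).
fd, sample_path = tempfile.mkstemp(suffix=".py")
with os.fdopen(fd, "wb") as handle:
    handle.write(b"import requests\n# \xc9\xcf\n")

hits = find_non_ascii(sample_path)
os.remove(sample_path)
print(hits)
```

If the real script reports no hits, the offending bytes are more likely coming from somewhere else, such as a path or an environment variable.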

大家讲道理 · Answer

The folder path of the ssl encryption package contains special characters.
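A quick way to look for such a path is to check every entry on the module search path for non-ASCII characters. This is a rough sketch: `is_ascii_path` is a hypothetical helper and the two candidate paths are invented examples, not the asker's real directories.

```python
import sys


def is_ascii_path(path):
    """True when every character of the path is in the ASCII range."""
    return all(ord(char) < 128 for char in path)


# Hypothetical install locations, for illustration only.
candidates = [
    "C:/Python27/Lib/site-packages",
    u"C:/Users/\u5f20\u4e09/Python27/Lib/site-packages",
]
flagged = [path for path in candidates if not is_ascii_path(path)]
print(len(flagged))

# The same check against the interpreter's real module search path.
suspicious = [
    entry for entry in sys.path
    if isinstance(entry, str) and not is_ascii_path(entry)
]
print(len(suspicious))
```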

Switch to py3; py2 always has all sorts of encoding problems.

高洛峰 · Answer

result_path = result_path + p_path

If the variables in this code contain Chinese characters, print them out to check, or store them all as unicode:

result_path = u'xxx'
p_path = u'xxx'
Or convert the variables to unicode with the decode function.
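A minimal sketch of that second option, assuming the byte strings are UTF-8 encoded (the real encoding may differ, for example gbk on Windows). `to_text` is a hypothetical helper and the two path values are made up.

```python
def to_text(value, encoding="utf-8"):
    """Decode byte strings to text; pass text through unchanged."""
    if isinstance(value, bytes):
        return value.decode(encoding)
    return value


# Hypothetical values: one arrives as UTF-8 bytes, the other is already text.
result_path = u"/data/\u7ed3\u679c/".encode("utf-8")
p_path = u"report.txt"

# Both operands are text before the join, so no implicit ASCII decode happens.
joined = to_text(result_path) + to_text(p_path)
print(repr(joined))
```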

巴扎黑 · Answer

This is a character encoding issue.

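UnicodeDecodeError means that decoding some characters failed. This is not just a requests problem, nor just a Python problem. Every programming language has this "problem", which is to say you have to understand character encoding. You can look up the details of specific encodings yourself. What follows is for py2.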

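In py2, every string declared with plain quotes is of type str; only a string prefixed with u is unicode. Characters sent over network IO or read from and written to files are encoded as bytes, which is the str type. Once loaded for computation, they generally need to be decoded to unicode. In py2, calling str() on a unicode value is effectively value.encode('ascii'), and calling unicode() on a str is effectively value.decode('ascii').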

In [1]: s = u'你好'

In [2]: str(s)
---------------------------------------------------------------------------
UnicodeEncodeError                        Traceback (most recent call last)
 in ()
----> 1 str(s)

UnicodeEncodeError: 'ascii' codec can't encode characters in position 0-1: ordinal not in range(128)

In [3]: s.decode('ascii')
---------------------------------------------------------------------------
UnicodeEncodeError                        Traceback (most recent call last)
 in ()
----> 1 s.decode('ascii')

UnicodeEncodeError: 'ascii' codec can't encode characters in position 0-1: ordinal not in range(128)

In [4]: ss = '你好'

In [5]: unicode(ss)
---------------------------------------------------------------------------
UnicodeDecodeError                        Traceback (most recent call last)
 in ()
----> 1 unicode(ss)

UnicodeDecodeError: 'ascii' codec can't decode byte 0xe4 in position 0: ordinal not in range(128)

In [6]: ss.decode('ascii')
---------------------------------------------------------------------------
UnicodeDecodeError                        Traceback (most recent call last)
 in ()
----> 1 ss.decode('ascii')

UnicodeDecodeError: 'ascii' codec can't decode byte 0xe4 in position 0: ordinal not in range(128)

In [7]: ss.decode('utf-8')
Out[7]: u'\u4f60\u597d'

In [8]: ss.decode('gbk')
Out[8]: u'\u6d63\u72b2\u30bd'

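Because ss = '你好' contains non-ASCII characters, decoding it as ascii fails, while decoding it as utf-8 or gbk succeeds. Note that the gbk result is garbled: the bytes are really utf-8, so gbk decodes them without error but into the wrong characters. By the same logic, s = u'你好' cannot be encoded as ascii.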
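The same three decodes can be reproduced in a form that also runs on py3, where the byte string has to be built explicitly. This is just a sketch of the session above; the variable names are my own.

```python
raw = u"\u4f60\u597d".encode("utf-8")  # the six UTF-8 bytes of 你好

try:
    raw.decode("ascii")
    ascii_ok = True
except UnicodeDecodeError:
    ascii_ok = False

as_utf8 = raw.decode("utf-8")  # the intended text
as_gbk = raw.decode("gbk")     # no error, but the wrong characters

print(ascii_ok, as_utf8 == u"\u4f60\u597d", as_gbk == as_utf8)
```

A decode that does not raise is therefore not proof that the right codec was used.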

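Your problem above is most likely non-ASCII characters raising an error when they are decoded as ascii. In result_path + p_path, one of the two variables is a str that contains non-ASCII characters, while the other is unicode: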

In [1]: 'hello' + u'world'
Out[1]: u'helloworld'

In [2]: 'hello' + u'世界'
Out[2]: u'hello\u4e16\u754c'

In [3]: '你好' + u'世界'
---------------------------------------------------------------------------
UnicodeDecodeError                        Traceback (most recent call last)
 in ()
----> 1 '你好' + u'世界'

UnicodeDecodeError: 'ascii' codec can't decode byte 0xe4 in position 0: ordinal not in range(128)

In [4]: '你好' + '世界'
Out[4]: '\xe4\xbd\xa0\xe5\xa5\xbd\xe4\xb8\x96\xe7\x95\x8c'

In [5]: '你好' + '世界 world'
Out[5]: '\xe4\xbd\xa0\xe5\xa5\xbd\xe4\xb8\x96\xe7\x95\x8c world'

In [6]: '你好' + u'世界 world'
---------------------------------------------------------------------------
UnicodeDecodeError                        Traceback (most recent call last)
 in ()
----> 1 '你好' + u'世界 world'

UnicodeDecodeError: 'ascii' codec can't decode byte 0xe4 in position 0: ordinal not in range(128)

In [9]: '你好' + u'world'
---------------------------------------------------------------------------
UnicodeDecodeError                        Traceback (most recent call last)
 in ()
----> 1 '你好' + u'world'

UnicodeDecodeError: 'ascii' codec can't decode byte 0xe4 in position 0: ordinal not in range(128)

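The Chinese characters in '你好' are not ASCII. When a str is concatenated with a unicode string, the str is first decoded to unicode and then joined. In the last example, '你好' + u'world' actually executes '你好'.decode('ascii') + u'world', which is why it raises an error.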
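To make that hidden step visible, the sketch below spells out the implicit conversion as an ordinary function. `concat_like_py2` is a hypothetical name used only for illustration. On py3 the real `+` between bytes and str raises TypeError instead of decoding, so this only mimics the py2 behaviour.

```python
def concat_like_py2(byte_string, text):
    """Mimic py2's str + unicode: decode the bytes as ASCII, then join."""
    return byte_string.decode("ascii") + text


# Pure-ASCII bytes decode cleanly, so the join works.
ok = concat_like_py2(b"hello", u"world")

# Non-ASCII bytes fail at the decode step, before any joining happens.
try:
    concat_like_py2(u"\u4f60\u597d".encode("utf-8"), u"world")
    failed = False
except UnicodeDecodeError as error:
    failed = True
    bad_byte = bytearray(error.object)[error.start]  # first undecodable byte

print(ok, failed, hex(bad_byte))
```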

The fix is simple: use one character encoding consistently. The encoding py picks up from the environment is usually utf-8 on Linux, and it seems to be gbk on Windows. Whatever the platform, just use utf-8 everywhere.

In [10]: '你好'.decode('utf-8') + u'world'
Out[10]: u'\u4f60\u597dworld'
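One common way to keep the encoding consistent is to name it explicitly wherever bytes enter or leave the program. The sketch below does that for a file round trip with `io.open`, which accepts an `encoding` argument on both py2 and py3. The temp file and the sample text are placeholders.

```python
import io
import os
import tempfile

text = u"\u4f60\u597d world"

fd, path = tempfile.mkstemp(suffix=".txt")
os.close(fd)

# Encode explicitly on the way out...
with io.open(path, "w", encoding="utf-8") as handle:
    handle.write(text)

# ...and decode explicitly on the way in, rather than trusting the platform default.
with io.open(path, "r", encoding="utf-8") as handle:
    round_tripped = handle.read()

# The bytes on disk are exactly the UTF-8 encoding of the text.
with io.open(path, "rb") as handle:
    raw = handle.read()

os.remove(path)
print(round_tripped == text)
```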

In py3, every string declared with quotes is unicode. The py2 pair of str and unicode types no longer exists. Instead, str encodes to the bytes type, and bytes decodes to the string type. Converting between the two can still raise UnicodeDecodeError, so don't assume that everything will be fine just because you use py3. The key to solving the problem is understanding how encoding and decoding work; once you do, it is solved for good.

>>> s = '中文'
>>> s.encode('utf-8')
b'\xe4\xb8\xad\xe6\x96\x87'
>>> s.encode('ascii')
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
UnicodeEncodeError: 'ascii' codec can't encode characters in position 0-1: ordinal not in range(128)
>>> print(type(s.encode('utf-8')))
<class 'bytes'>
>>> print(type(s))
<class 'str'>
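The decode direction fails in py3 in the same way when the wrong codec is named. The sketch below shows that, along with the `errors="replace"` handler, which avoids the exception at the cost of losing the original characters. Variable names are illustrative.

```python
data = u"\u4e2d\u6587".encode("utf-8")  # the six UTF-8 bytes of 中文

try:
    data.decode("ascii")
    raised = False
except UnicodeDecodeError:
    raised = True

decoded = data.decode("utf-8")                  # correct codec, correct text
lossy = data.decode("ascii", errors="replace")  # one U+FFFD per undecodable byte

print(raised, len(decoded), len(lossy))
```

Replacing errors only hides the symptom; naming the correct codec is what actually recovers the text.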