Encode URL Parameters with Percent-Encoding in Python
When percent-encoding URL parameters so that requests are normalized correctly, Python 2's urllib.quote() function can fall short in two ways when called with its defaults.
Default Encoding Omission:
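Calling urllib.quote() with its defaults:
url = "http://example.com?p=" + urllib.quote(query)
This leaves / as it is instead of encoding it to %2F, because / is in the default set of safe characters. That causes problems with OAuth normalization, which expects reserved characters such as / to be percent-encoded.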
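The same default can be observed in Python 3, where urllib.parse.quote() also uses safe='/'. The following is a minimal sketch; the value of query is a made-up example and is not taken from the original question:

```python
from urllib.parse import quote

query = "a/b c&d"  # hypothetical parameter value, for illustration only

# With the default safe='/', the space and '&' are encoded
# but the slash is passed through unchanged.
encoded = quote(query)
url = "http://example.com?p=" + encoded
print(url)
```

The space and the ampersand are escaped, while the slash survives into the final URL.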
Unicode Support Deficiency:
Moreover, in Python 2 urllib.quote() cannot handle unicode objects that contain non-ASCII characters: passing one raises an exception instead of returning an encoded string.
Improved Encoding with urllib.parse.quote() and safe Parameter:
In Python 3, use urllib.parse.quote(), which encodes str input as UTF-8 by default and takes a safe parameter that controls which characters are left unencoded:
urllib.parse.quote(string, safe='/', encoding=None, errors=None)
The safe parameter lists additional ASCII characters that should not be encoded, and it defaults to '/'. Passing an empty string removes that exception, so / is encoded as well:
urllib.parse.quote('/test', safe='') # Encodes '/' to '%2F'
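For OAuth-style normalization, this call could be wrapped in a small helper. The sketch below is only an illustration and not part of any OAuth library: the name percent_encode is hypothetical, and safe='~' is an assumption added so that '~' stays literal on Python versions older than 3.7, where quote() would otherwise encode it.

```python
from urllib.parse import quote


def percent_encode(value: str) -> str:
    """Encode everything except RFC 3986 unreserved characters
    (letters, digits, '-', '.', '_' and '~')."""
    # safe='~' replaces the default safe='/', so '/' is encoded too.
    return quote(value, safe="~")


slash_result = percent_encode("/test")
mixed_result = percent_encode("a b/c~d")
print(slash_result)
print(mixed_result)
```

Keeping the call in one place makes it harder to forget the safe argument when several parameters have to be encoded.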
Fixing Unicode Handling in Python 2:
In Python 2, there was a Unicode handling bug with urllib.quote(). To work around it, manually encode the string as UTF-8 before applying the percent-encoding:
query = urllib.quote(u"Müller".encode('utf8'))
print urllib.unquote(query).decode('utf8')  # Outputs: Müller
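In Python 3 the manual encode and decode steps are not needed, because urllib.parse.quote() and urllib.parse.unquote() work on str and use UTF-8 by default. A short sketch of the same round trip, with variable names chosen only for this example:

```python
from urllib.parse import quote, unquote

name = "Müller"

# quote() encodes the str as UTF-8 before percent-encoding it.
encoded = quote(name, safe="")
# unquote() decodes the percent-escapes back to a str, again as UTF-8.
decoded = unquote(encoded)

print(encoded)
print(decoded)
```

The character ü is two bytes in UTF-8 (0xC3 0xBC), so it appears as two percent-escapes in the encoded form.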
Alternative: urllib.urlencode()
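To build a whole query string in one step, consider urlencode() (urllib.urlencode() in Python 2, urllib.parse.urlencode() in Python 3). It quotes each key and value with quote_plus(), so / is encoded as %2F, but spaces become + rather than %20. In Python 3 it encodes str values as UTF-8 automatically; in Python 2, unicode values should still be encoded to UTF-8 byte strings first:
encoded_params = urllib.urlencode({'p': query})  # '/' becomes '%2F', spaces become '+'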
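The difference between the two space encodings can be sketched in Python 3 as follows. The params dictionary is a made-up example, and the quote_via argument used here exists in urllib.parse.urlencode() from Python 3.5 onward:

```python
from urllib.parse import quote, urlencode

params = {"p": "a/b Müller"}  # hypothetical parameter, for illustration only

# Default behaviour: values go through quote_plus(), so ' ' becomes '+'.
form_style = urlencode(params)

# Passing quote_via=quote yields '%20' for spaces instead, which is
# what strict percent-encoding schemes such as OAuth expect.
strict_style = urlencode(params, quote_via=quote)

print(form_style)
print(strict_style)
```

In both cases the slash is encoded, because urlencode() passes an empty safe string to the quoting function by default.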