Stripping ANSI Escape Sequences in Python Strings
Many command-line tools and SSH sessions embed ANSI escape sequences in their output to control terminal behavior such as colors and cursor movement. These sequences get in the way when you want to parse or process the text itself. This article shows a Pythonic way to remove them and recover the plain text.
Problem:
Consider the following example string retrieved from an SSH command:
'ls\r\n\x1b[00m\x1b[01;31mexamplefile.zip\x1b[00m\r\n\x1b[01;31m'
The objective is to programmatically remove the ANSI escape sequences so that only the plain text remains, in particular the file name:
'examplefile.zip'
Solution:
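Python's re module provides a succinct solution to this problem. The following regular expression, written in verbose mode so each part can be commented, matches the 7-bit two-character escape sequences and the CSI control sequences, which covers color codes and cursor movement: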
    import re

    ansi_escape = re.compile(r'''
        \x1B  # ESC
        (?:   # 7-bit C1 Fe (except CSI)
            [@-Z\\-_]
        |     # or [ for CSI, followed by a control sequence
            \[
            [0-?]*  # Parameter bytes
            [ -/]*  # Intermediate bytes
            [@-~]   # Final byte
        )
    ''', re.VERBOSE)
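As a quick check of what each branch of the pattern covers, the sketch below runs findall on a made-up string that mixes a color code, two cursor-related CSI sequences, and a two-character Fe sequence. The sample string is an assumption for illustration, not output from a real tool, and the pattern is written in its compact one-line form.

```python
import re

# Compact form of the pattern; note the doubled backslash, which makes
# the second range run from "\" to "_" inside the character class.
ansi_escape = re.compile(r'\x1B(?:[@-Z\\-_]|\[[0-?]*[ -/]*[@-~])')

# Hypothetical sample: SGR color codes, erase-line (CSI 2 K),
# cursor-up (CSI 1 A), and the two-character sequence ESC M.
sample = '\x1b[01;31mred\x1b[0m \x1b[2K\x1b[1Aup \x1bMdone'

# The group is non-capturing, so findall returns the full matches.
matches = ansi_escape.findall(sample)
print(matches)

# Removing every match leaves only the visible text.
cleaned = ansi_escape.sub('', sample)
print(repr(cleaned))
```

Both the CSI branch and the two-character branch fire here, which suggests the pattern is not limited to color codes.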
To apply the regular expression and extract the desired text:
    import re

    # Compact, one-line form of the pattern
    ansi_escape = re.compile(r'\x1B(?:[@-Z\\-_]|\[[0-?]*[ -/]*[@-~])')

    sometext = 'ls\r\n\x1b[00m\x1b[01;31mexamplefile.zip\x1b[00m\r\n\x1b[01;31m'
    result = ansi_escape.sub('', sometext)
    print(repr(result))
Output:
'ls\r\nexamplefile.zip\r\n'
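The cleaned string still contains the echoed ls command and the \r\n line endings. If only the file name is wanted, one possible follow-up is to split the result into lines and drop the first one. This is a minimal sketch that assumes the first line is always the echoed command; the helper name strip_ansi is hypothetical.

```python
import re

ansi_escape = re.compile(r'\x1B(?:[@-Z\\-_]|\[[0-?]*[ -/]*[@-~])')

def strip_ansi(text):
    """Return text with ANSI escape sequences removed (hypothetical helper)."""
    return ansi_escape.sub('', text)

sometext = 'ls\r\n\x1b[00m\x1b[01;31mexamplefile.zip\x1b[00m\r\n\x1b[01;31m'

# splitlines() handles the \r\n endings; blank lines are skipped.
lines = [line for line in strip_ansi(sometext).splitlines() if line]

# Assumption: the first line is the echoed command, the rest is its output.
command, *output = lines
print(command)
print(output)
```

For the example string this leaves a one-element list holding examplefile.zip. Output from other commands may need a different way of separating the echo from the result.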