Splitting strings on any of multiple delimiters in Python (code attached)

不言
2018-11-27 15:50:15

This article shows how to split a string in Python when it contains any number of different delimiters, with code examples.

1. Requirements

We need to split a string into fields, but the delimiters (and the whitespace around them) are not consistent throughout the string.

2. Solution

The split() method of the string object handles only very simple cases: it does not support multiple delimiters, and it cannot cope with a variable amount of whitespace around them. When more flexibility is needed, use re.split():

import re

line = 'abc def ; ghi, jkl,mno, pkr'
# Delimiters: semicolon, comma, or whitespace, each optionally
# surrounded by any amount of extra whitespace
result = re.split(r'\s*[;,\s]\s*', line)
print(result)

Result:

['abc', 'def', 'ghi', 'jkl', 'mno', 'pkr']
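
For comparison, here is a minimal sketch of what str.split() does with the same line. It accepts only one separator at a time, so the other delimiters and the stray spaces stay in the fields (the variable names by_comma and by_space are chosen here for illustration):

```python
line = 'abc def ; ghi, jkl,mno, pkr'

# str.split() takes a single separator string; here only commas split the line.
by_comma = line.split(',')
print(by_comma)   # ['abc def ; ghi', ' jkl', 'mno', ' pkr']

# With no argument it splits on runs of whitespace only, so ';' and ','
# remain in the output.
by_space = line.split()
print(by_space)   # ['abc', 'def', ';', 'ghi,', 'jkl,mno,', 'pkr']
```

Neither call alone yields the six clean fields, which is why the solution above uses a regular expression.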

3. Analysis

re.split() is useful because the separator is given as a pattern that can match several alternatives. In the solution above, the delimiter is a semicolon, a comma, or a whitespace character, optionally surrounded by any amount of extra whitespace. As with str.split(), the result is a list of fields.
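
If the set of delimiter characters is only known at run time, the same kind of pattern can be built programmatically. The following is a rough sketch, not part of the original solution: split_on_any is a hypothetical helper name, and it assumes every delimiter is a single character.

```python
import re

def split_on_any(text, delimiter_chars):
    """Split text on any character in delimiter_chars, or on whitespace.

    Hypothetical helper for illustration; it is not part of the re module.
    """
    # re.escape() keeps characters such as ']' or '^' from being
    # interpreted as regex syntax inside the character class.
    char_class = '[' + ''.join(re.escape(c) for c in delimiter_chars) + r'\s]'
    pattern = r'\s*' + char_class + r'\s*'
    return re.split(pattern, text)

fields = split_on_any('abc def | ghi. jkl;mno', '|.;')
print(fields)   # ['abc', 'def', 'ghi', 'jkl', 'mno']
```

A character class is used here because it needs no grouping parentheses, so nothing extra ends up in the result list.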

When using re.split(), pay attention to whether the regular expression pattern contains a capturing group enclosed in parentheses.

If a capturing group is used, the matched delimiter text is also included in the result. For example:

import re

line = 'abc def ; ghi, jkl,mno, pkr'
# The parentheses form a capturing group, so each matched delimiter is kept
result = re.split(r'\s*(;|,|\s)\s*', line)
print(result)

Result:

['abc', ' ', 'def', ';', 'ghi', ',', 'jkl', ',', 'mno', ',', 'pkr']

Capturing the delimiters can be useful in some contexts. For example, you can use them afterwards to rebuild a cleaned-up output string:

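import re

line = 'abc def ; ghi, jkl,mno, pkr'
result = re.split(r'\s*(;|,|\s)\s*', line)

# Even indices hold the fields, odd indices hold the captured delimiters
values = result[::2]
# Pad with '' so that every value has a matching delimiter
delimiters = result[1::2] + ['']

print(values)
print(delimiters)

# Rebuild the line with the same delimiters but without the extra spaces
last = ''.join(v + d for v, d in zip(values, delimiters))
print(last)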

Result:

['abc', 'def', 'ghi', 'jkl', 'mno', 'pkr']
[' ', ';', ',', ',', ',', '']
abc def;ghi,jkl,mno,pkr

If you don't want the delimiters to appear in the result but still want to use parentheses to group parts of the regex pattern, use a non-capturing group, written as (?:...). For example:

import re

line = 'abc def ; ghi, jkl,mno, pkr'
# (?:...) groups the alternatives without capturing them
result = re.split(r'\s*(?:;|,|\s)\s*', line)

print(result)

Result:

['abc', 'def', 'ghi', 'jkl', 'mno', 'pkr']
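
One edge case the examples above do not hit: when the line starts or ends with a delimiter, re.split() returns empty strings at those positions. A minimal sketch of filtering them out follows; the input line and the name cleaned are made up for illustration.

```python
import re

# Hypothetical input with a leading space and a trailing delimiter
line = ' abc, def ; '
fields = re.split(r'\s*[;,\s]\s*', line)
print(fields)    # ['', 'abc', 'def', '']

# Keep only the non-empty fields
cleaned = [f for f in fields if f]
print(cleaned)   # ['abc', 'def']
```

The same filter also drops the empty fields produced by adjacent delimiters such as 'a,,b', so skip it if empty fields carry meaning in your data.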

Statement:
This article is reproduced from segmentfault.com.