
How to use the re module for regular expressions in Python: an introduction to the usage of the re module

不言
2018-09-15 14:15:44

This article explains how to use the re module for regular expressions in Python. It is offered as a reference for readers who need it, and I hope it will be helpful to you.

Regular expressions are one of the most commonly used tools for processing strings, and they appear throughout everyday code.

Regular expressions in Python differ slightly from those in other languages:

1. When replacing with re.sub, the replacement can be a function.

2. The split function takes the maximum number of splits as its third positional argument, which is an easy pitfall when you mean to pass flags.

3. The expression inside a lookbehind assertion must have a fixed width.

The rest of this article describes how to use the re module in detail. In practice, keeping the three differences above in mind covers most of what you need.

1, match

Description:

re.match attempts to match the pattern at the beginning of the string. If the pattern does not match at the starting position, match() returns None.

Syntax:

re.match(pattern, string, flags=0)

flags is an optional argument. Multiple flags can be combined with bitwise OR (|). For example, re.I | re.M sets both the I and M flags:

Modifier    Description
re.I        Make the match case-insensitive
re.L        Do locale-aware matching
re.M        Multi-line matching, affects ^ and $
re.S        Make . match all characters, including newlines
re.U        Parse characters according to the Unicode character set. This flag affects \w, \W, \b, \B.
re.X        This flag makes regular expressions easier to understand by giving you a more flexible format.
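
The effect of the flags is easiest to see side by side. The following is a minimal sketch; the sample text and the variable names are made up for illustration.

```python
import re

text = "First line\nsecond LINE"

# re.I: letters match regardless of case
ci = re.findall(r"line", text, flags=re.I)            # ['line', 'LINE']

# re.M: ^ and $ also match at the start and end of each line
starts = re.findall(r"^\w+", text, flags=re.M)        # ['First', 'second']

# re.S: '.' also matches the newline character
no_dotall = re.search(r"First.*LINE", text)           # None, '.' stops at '\n'
dotall = re.search(r"First.*LINE", text, flags=re.S)  # matches across the newline

# re.X: whitespace and comments inside the pattern are ignored
verbose = re.compile(r"""
    (\w+)   # a word
    \s+     # whitespace
    (\w+)   # another word
""", re.X)
words = verbose.match(text).groups()                  # ('First', 'line')

print(ci, starts, no_dotall, dotall is not None, words)
```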

If the match succeeds, re.match returns a match object. Use the match object's group(num) or groups() method to retrieve the matched text. group() or group(0) returns the text matched by the entire regular expression.

Example:

import re

s = 'abc123abc'
print(re.match('[a-z]+', s))                  # <re.Match object; span=(0, 3), match='abc'>
print(re.match('[a-z]+', s).group(0))         # abc
print(re.match(r'\d+', s))                    # None
print(re.match('[A-Z]+', s, re.I).group(0))   # abc
print(re.match('[a-z]+', s).span())           # (0, 3)
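
The description mentions group(num) and groups(), which only become interesting once the pattern contains capturing groups. A small sketch, assuming a pattern with two groups chosen for illustration:

```python
import re

s = "abc123abc"

# Two capturing groups: letters first, then digits
m = re.match(r"([a-z]+)(\d+)", s)
whole = m.group(0)       # 'abc123', the entire match
letters = m.group(1)     # 'abc', the first group
digits = m.group(2)      # '123', the second group
both = m.groups()        # ('abc', '123'), all groups as a tuple

# match() returns None when the pattern does not match at position 0,
# so test the result before calling group() on it
m2 = re.match(r"\d+", s)
leading_digits = m2.group(0) if m2 else None   # None

print(whole, letters, digits, both, leading_digits)
```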

2, search

Instructions:

re.search scans the entire string and returns the first successful match.

Syntax:

re.search(pattern, string, flags=0)

If the match succeeds, re.search returns a match object; otherwise it returns None. Use the match object's group(num) or groups() method to retrieve the matched text.

Example:

s = 'abc123abc'
print(re.search('[a-z]+', s).group())  # abc
print(re.search('[a-z]+', s).span())   # (0, 3)
print(re.search(r'\d+', s).group())    # 123
print(re.search(r'\d+', s).span())     # (3, 6)
print(re.search('xyz', s))             # None
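
The difference between match and search comes down to where the pattern is allowed to start. A short sketch contrasting the two on the same string:

```python
import re

s = "abc123abc"

at_start = re.match(r"\d+", s)     # None: the string does not start with a digit
anywhere = re.search(r"\d+", s)    # finds '123' at span (3, 6)
anchored = re.search(r"^\d+", s)   # None: ^ pins the search to the start

print(at_start, anywhere.group(), anywhere.span(), anchored)
```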

groupdict

groupdict returns a dictionary of all named subgroups of the match, keyed by group name.

print(re.search('[a-z]+', s).groupdict())          # {}
print(re.search(r'(?P<letter>[a-z]+)(?P<num>\d+)', s).groupdict())  # {'letter': 'abc', 'num': '123'}
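
Named groups can also be read one at a time, and a pattern can refer back to what a named group matched by writing (?P=name). The sketch below uses a made-up group name, word, and a made-up sample sentence.

```python
import re

m = re.search(r"(?P<letter>[a-z]+)(?P<num>\d+)", "abc123abc")
by_name = m.group("num")     # '123'
by_index = m.group(2)        # '123', named groups are still numbered
names = m.groupdict()        # {'letter': 'abc', 'num': '123'}

# (?P=word) must match the same text that the group named "word" matched,
# which makes it a simple way to find a repeated word
repeat = re.search(r"\b(?P<word>\w+) (?P=word)\b", "hello hello world")
doubled = repeat.group()     # 'hello hello'

print(by_name, by_index, names, doubled)
```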

3, sub and subn

Description:

re.sub is used to replace matching items in the string.

re.subn returns a tuple containing the replaced string and the number of replacements.

Syntax:

re.sub(pattern, repl, string, count=0, flags=0)

repl: the replacement string. It can also be a function that takes the match object and returns the replacement string.

count: the maximum number of substitutions to make. The default value is 0, which means replacing all matches.

Example:

s = 'abc123abc'
print(re.sub(r'\d+', 'NUM', s))                  # abcNUMabc
print(re.sub('[a-z]+', 'ALPHA', s, count=1))     # ALPHA123abc
# Multiply the matched number by 2
def double(matched):
    value = int(matched.group('value'))
    return str(value * 2)
# repl is a function
print(re.sub(r'(?P<value>\d+)', double, s))      # abc246abc
print(re.subn(r'\d+', 'NUM', s))                 # ('abcNUMabc', 1)
print(re.subn('[a-z]+', 'ALPHA', s))             # ('ALPHA123ALPHA', 2)
print(re.subn('[a-z]+', 'ALPHA', s, count=1))    # ('ALPHA123abc', 1)
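
A function as repl is most useful when the replacement depends on what was matched. A minimal sketch, assuming a hypothetical lookup table named prices that exists only for this illustration:

```python
import re

# Hypothetical lookup table, used only for this illustration
prices = {"apple": 3, "pear": 5}

def add_price(match):
    """Return the matched fruit name followed by its price."""
    name = match.group(0)
    return f"{name}({prices[name]})"

labelled = re.sub(r"apple|pear", add_price, "apple and pear")
# 'apple(3) and pear(5)'

# A lambda works the same way for short transformations
bumped = re.sub(r"\d+", lambda m: str(int(m.group()) + 1), "a1b22")
# 'a2b23'

print(labelled, bumped)
```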

4, compile

Description:

re.compile compiles a regular expression and returns a regular expression (Pattern) object, whose match() and search() methods can then be used.

Syntax:

re.compile(pattern, flags=0)

Example:

s = 'abc123abc'
p = re.compile(r'\d+')
print(p.match(s, 4, 5).group(0))    # 2 (matching starts at position 4 and stops at position 5)
print(p.search(s).group(0))         # 123
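
Compiling pays off when the same pattern is applied many times. The sketch below reuses one Pattern object over a made-up list of strings and shows the optional start position accepted by the pattern's search() method.

```python
import re

digits = re.compile(r"\d+")

# Reuse the compiled pattern across several strings
lines = ["abc123", "no digits", "x9y88"]
found = [digits.findall(line) for line in lines]
# [['123'], [], ['9', '88']]

# Pattern methods accept a start position; spans still refer to the full string
m = digits.search("abc123abc", 4)
tail = m.group()    # '23'
where = m.span()    # (4, 6)

print(found, tail, where)
```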

5, findall

Description:

re.findall searches for all substrings matched by the regular expression in the string and returns a list. If no match is found, an empty list is returned.

Syntax:

re.findall(pattern, string, flags=0)

Example:

s = 'abc123abc'
print(re.findall('[a-z]+', s))  # ['abc', 'abc']
print(re.findall('[h-n]+', s))  # []
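
One detail worth knowing is that the shape of the list returned by findall changes when the pattern contains capturing groups. A short sketch with patterns chosen for illustration:

```python
import re

s = "abc123abc"

no_group = re.findall(r"[a-z]+\d*", s)          # ['abc123', 'abc'], whole matches
one_group = re.findall(r"([a-z]+)\d", s)        # ['abc'], only the group's text
two_groups = re.findall(r"([a-z]+)(\d+)", s)    # [('abc', '123')], one tuple per match

# Non-capturing groups (?:...) keep the whole-match behaviour
non_capturing = re.findall(r"(?:[a-z]+)(?:\d+)", s)   # ['abc123']

print(no_group, one_group, two_groups, non_capturing)
```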

6, finditer

Description:

finditer is similar to findall: it searches the string for all substrings matched by the regular expression, but returns them as an iterator of match objects.

Syntax:

re.finditer(pattern, string, flags=0)

Example:

s = 'abc123def'
it = re.finditer('[a-z]+', s)
for match in it:
    print(match.group())    # prints abc, then def
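
Because finditer yields match objects rather than strings, the position of each match is available as well. A minimal sketch:

```python
import re

s = "abc123def"

# Collect each matched substring together with its (start, end) span
hits = [(m.group(), m.span()) for m in re.finditer(r"[a-z]+", s)]
# [('abc', (0, 3)), ('def', (6, 9))]

# The iterator is consumed on demand, so next() fetches just the first match
first_digit = next(re.finditer(r"\d", s)).group()   # '1'

print(hits, first_digit)
```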

7, split

Description:

The re.split method splits the string at each substring matched by the pattern and returns a list.

Syntax:

re.split(pattern, string, maxsplit=0, flags=0)

maxsplit: the maximum number of splits. maxsplit=1 splits once; the default of 0 places no limit on the number of splits.

Example:

# The output below is still case-sensitive. re.split takes four parameters
# (pattern, string, maxsplit, flags), so a third positional argument is taken
# as maxsplit: re.I is not treated as a flag here and has no effect on case.
print(re.split('a', '1A1a2A3', re.I))           # ['1A1', '2A3']
# To make re.I take effect, pass it as flags=re.I.
print(re.split('a', '1A1a2A3', flags=re.I))     # ['1', '1', '2', '3']
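
The pitfall is easier to see once you know that re.I has the integer value 2, so passing it positionally quietly means maxsplit=2. The sketch below passes both arguments by keyword, which avoids the ambiguity.

```python
import re

text = "1A1a2A3"

# re.I is an integer-valued flag; as a third positional argument it acts as maxsplit
flag_value = int(re.I)                                          # 2

unlimited = re.split("a", text, flags=re.I)                     # ['1', '1', '2', '3']
limited = re.split("a", text, maxsplit=2, flags=re.I)           # ['1', '1', '2A3']

print(flag_value, unlimited, limited)
```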

8, escape

Description:

re.escape escapes the special characters in a string.

Syntax:

re.escape(pattern)

Example:

print(re.escape('www.dxy.cn'))  # www\.dxy\.cn
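
Escaping matters when a literal string is used as a pattern, because an unescaped . matches any character. A small sketch with made-up input strings:

```python
import re

domain = "www.dxy.cn"

# Unescaped, each '.' matches any character, so this unrelated text matches
loose = re.search(domain, "wwwXdxyYcn")

# Escaped, the dots must be literal dots
strict = re.search(re.escape(domain), "wwwXdxyYcn")               # None
literal = re.search(re.escape(domain), "visit www.dxy.cn today")  # matches

print(loose is not None, strict, literal.group())
```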

9, named groups in regular expressions

"(?P<name>...)": defines a group and names it name.

"(?P=name)": refers to the string matched by the group whose name is name.

10, fixed-width requirement for lookbehind assertions

Both the positive lookbehind (?<=exp) and the negative lookbehind (?<!exp) require exp to have a fixed width:

(?<=aaa)      # valid
(?<=aaa|bbb)  # valid
(?<=aaa|bb)   # error
(?<=\d+)      # error
(?<=\d{3})    # valid
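
These cases can be checked by compiling each pattern and catching re.error. The helper name compiles and the sample sentence are assumptions made for this sketch.

```python
import re

def compiles(pattern):
    """Return True if the pattern compiles, False if re rejects it."""
    try:
        re.compile(pattern)
        return True
    except re.error:
        return False

results = {
    r"(?<=aaa)": compiles(r"(?<=aaa)"),           # True
    r"(?<=aaa|bbb)": compiles(r"(?<=aaa|bbb)"),   # True, both alternatives have width 3
    r"(?<=aaa|bb)": compiles(r"(?<=aaa|bb)"),     # False, widths 3 and 2 differ
    r"(?<=\d+)": compiles(r"(?<=\d+)"),           # False, variable width
    r"(?<=\d{3})": compiles(r"(?<=\d{3})"),       # True, always width 3
}

# Fixed-width lookbehinds in use: digits preceded (or not) by a dollar sign
with_dollar = re.findall(r"(?<=\$)\d+", "cost $30 or 45")        # ['30']
without_dollar = re.findall(r"(?<!\$)\b\d+", "cost $30 or 45")   # ['45']

print(results, with_dollar, without_dollar)
```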
