This article introduces regular expressions and the re module in Python, with code examples.
A regular expression (often abbreviated as regex, regexp, or RE in code) is a concept in computer science. A regular expression uses a single string to describe and match a set of strings that conform to a given syntax rule. In many text editors, regular expressions are used to find and replace text that matches a certain pattern.
Character | Function | Regular expression example | Matching example |
---|---|---|---|
. | Matches any character (except \n) | b.b | bab, b2b |
[] | Matches any character from the character set in [] | i [abCde]m | i am |
\d | Matches any decimal digit, same as [0-9] | w\dcschool | w3cschool |
\D | Matches any non-digit character | mou\Dh | mouth |
\s | Matches any whitespace character, same as [ \n\t\r\v\f] | i\slike | i like |
\S | Matches any non-whitespace character, the opposite of \s | n\Se | noe, n3e |
\w | Matches any word character (letter, digit, or underscore), same as [A-Za-z0-9_] | [A-Za-z]\w | |
\W | Matches any non-word character | [0-9]\W[A-Z] | 3 A |
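As a quick check of the character classes listed above, the following minimal sketch runs each example pattern against its sample text with re.fullmatch. The variable names (examples, results, dot_matches_newline) are chosen here for illustration and are not part of the original article.

```python
import re

# Each pair is (pattern, sample text) taken from the table above.
examples = [
    (r"b.b", "b2b"),               # . matches any character except a newline
    (r"i [abCde]m", "i am"),       # [...] matches one character from the set
    (r"w\dcschool", "w3cschool"),  # \d matches a decimal digit
    (r"mou\Dh", "mouth"),          # \D matches a non-digit
    (r"i\slike", "i like"),        # \s matches a whitespace character
    (r"n\Se", "n3e"),              # \S matches a non-whitespace character
    (r"[0-9]\W[A-Z]", "3 A"),      # \W matches a non-word character
]

# fullmatch succeeds only if the whole sample text matches the pattern.
results = {pattern: re.fullmatch(pattern, text) is not None
           for pattern, text in examples}

for pattern, matched in results.items():
    print(f"{pattern!r:20} matched: {matched}")

# By default, the dot does not match a newline character.
dot_matches_newline = re.fullmatch(r"b.b", "b\nb") is not None
print("dot matches newline:", dot_matches_newline)
```

Every pattern in the list should report a match, while the final check should report False because . skips \n unless a flag says otherwise.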
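Quantity matching

Character | Function | Regular expression example | Matching example |
---|---|---|---|
\* | Matches the previous character zero or more times | a* | aaa |
\+ | Matches the previous character one or more times | a+ | aaa |
? | Matches the previous character zero or one time | a? | a, or the empty string |
{m} | Matches the previous character exactly m times | [0-9]{5} | 12345 |
{m,} | Matches the previous character at least m times | a{5,} | aaaaa |
{m,n} | Matches the previous character from m to n times | | aaa |

Boundary matching

Character | Function |
---|---|
^ | Matches the beginning of the string |
$ | Matches the end of the string |
\b | Matches a word boundary |
\B | Matches a non-word boundary |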
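The quantifiers and boundary markers above can be tried out directly. This is only an illustrative sketch: the sample strings and the {2,4} bounds are assumptions picked for the demonstration, not values from the original article.

```python
import re

# Quantifiers: how many times the preceding item may repeat.
star = re.fullmatch(r"a*", "aaa")                # zero or more
plus_empty = re.fullmatch(r"a+", "")             # one or more: empty string fails
optional = re.fullmatch(r"a?", "")               # zero or one: empty string is fine
exactly_five = re.fullmatch(r"[0-9]{5}", "12345")
at_least_five = re.fullmatch(r"a{5,}", "aaaa")   # only four a's, so no match
between = re.fullmatch(r"a{2,4}", "aaa")         # hypothetical bounds 2 to 4

print(bool(star), bool(plus_empty), bool(optional))
print(bool(exactly_five), bool(at_least_five), bool(between))

# Boundaries: ^ and $ anchor to the string, \b and \B to word edges.
starts = re.search(r"^py", "python")
ends = re.search(r"on$", "python")
whole_word = re.findall(r"\bcat\b", "cat concat cat.")  # standalone words only
inside_word = re.findall(r"\Bcat", "cat concat")        # "cat" not at a word start

print(bool(starts), bool(ends))
print(whole_word)
print(inside_word)
```

Note that \bcat\b skips the "cat" inside "concat", while \Bcat finds only that one.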
Match groups

Character | Function |
---|---|
\| | Matches either the left or the right expression |
(ab) | Treats the characters in parentheses as a group |
\num | References the string matched by group number num |
(?P<name>) | Gives the group an alias (a named group) |
(?P=name) | References the string matched by the group whose alias is name |
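To make the grouping constructs more concrete, here is a small sketch that uses alternation, a numbered backreference, and a named group with a named backreference. The tag-matching pattern and the group names name and body are hypothetical examples chosen for illustration.

```python
import re

# | picks either side; (...) forms a group; \1 refers back to group 1.
alt = re.fullmatch(r"cat|dog", "dog")
repeated = re.search(r"(ab)\1", "xxababxx")   # "ab" followed by the same text again

print(alt.group())          # dog
print(repeated.group())     # abab
print(repeated.group(1))    # ab

# Named group plus named backreference: the closing tag must repeat
# whatever text the group called "name" matched in the opening tag.
tag = re.compile(r"<(?P<name>\w+)>(?P<body>.*)</(?P=name)>")
ok = tag.fullmatch("<b>bold</b>")
mismatch = tag.fullmatch("<b>bold</i>")

print(ok.group("name"), ok.group("body"))   # b bold
print(mismatch)                             # None
```

A named backreference such as (?P=name) behaves like \1, but stays readable when a pattern has many groups.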
re module

In Python, you can use the built-in re module to work with regular expressions.

Common functions and methods of the re module

re module function | Description |
---|---|
compile(pattern, flags=0) | Compiles the regular expression pattern with any optional flags, then returns a regular expression object |
match(pattern, string, flags=0) | Attempts to match the regular expression pattern at the beginning of string, with optional flags. If the match succeeds, a match object is returned; if it fails, None is returned |
search(pattern, string, flags=0) | Searches string for the first occurrence of the regular expression pattern, with optional flags. If the match succeeds, a match object is returned; if it fails, None is returned |
findall(pattern, string, flags=0) | Finds all non-overlapping occurrences of the regular expression pattern in string and returns them as a list |
split(pattern, string, maxsplit=0) | Splits string into a list, using matches of the regular expression pattern as separators, and returns the list. At most maxsplit splits are performed (by default, the string is split at every match) |
sub(pattern, repl, string, count=0, flags=0) | Replaces occurrences of the regular expression pattern in string with repl. Unless count is given, all occurrences are replaced |

Apart from compile, each of these is also available as a method of a compiled regular expression object, called without the pattern and flags arguments.
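The functions above can be compared side by side on one string. This is a minimal sketch; the sample text and variable names are assumptions made for the demonstration.

```python
import re

text = "apple 12, banana 7; cherry 345"   # hypothetical sample text

first = re.search(r"\d+", text)       # first occurrence anywhere in the string
at_start = re.match(r"\d+", text)     # only at the beginning, so None here
numbers = re.findall(r"\d+", text)    # all non-overlapping matches

# Split on a comma or semicolon followed by optional whitespace.
parts = re.split(r"[,;]\s*", text)
first_split = re.split(r"[,;]\s*", text, maxsplit=1)   # stop after one split

masked = re.sub(r"\d+", "#", text)                  # replace every match
masked_once = re.sub(r"\d+", "#", text, count=1)    # replace only the first

print(first.group())   # 12
print(at_start)        # None
print(numbers)         # ['12', '7', '345']
print(parts)           # ['apple 12', 'banana 7', 'cherry 345']
print(first_split)     # ['apple 12', 'banana 7; cherry 345']
print(masked)          # apple #, banana #; cherry #
print(masked_once)     # apple #, banana 7; cherry 345
```

Passing maxsplit and count as keyword arguments keeps the calls unambiguous, since both sit next to the flags parameter.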
Common methods of match objects

Match object method | Description |
---|---|
group(num=0) | Returns the entire match by default, or the specific subgroup numbered num |
groups() | Returns a tuple containing all matched subgroups; if the pattern contains no groups, an empty tuple is returned |
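The difference between group() and groups() is easiest to see on a pattern with several subgroups. The date string below is a made-up sample, and the variable names are illustrative only.

```python
import re

# Three subgroups: year, month, and day.
m = re.search(r"(\d{4})-(\d{2})-(\d{2})", "released on 2019-03-15")

whole = m.group()           # same as m.group(0): the entire match
year = m.group(1)           # first subgroup
month_day = m.group(2, 3)   # several subgroups at once, returned as a tuple
all_groups = m.groups()     # every subgroup, as a tuple

print(whole)        # 2019-03-15
print(year)         # 2019
print(month_day)    # ('03', '15')
print(all_groups)   # ('2019', '03', '15')

# A pattern without any groups yields an empty tuple from groups().
no_groups = re.search(r"\d+", "abc 42").groups()
print(no_groups)    # ()
```

Because a failed match returns None rather than a match object, it is worth checking the result before calling group() or groups() on it.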
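Common flags

Flag | Explanation |
---|---|
re.I | Makes the match case-insensitive (ignores case) |
re.S | By default, . (dot) matches any character except \n; with the re.S flag, . (dot) matches all characters, including \n |
re.M | Multi-line mode: ^ and $ also match at the start and end of each line |
re.U | Matches according to Unicode character properties (the default for str patterns in Python 3) |
re.X | Verbose mode: whitespace and # comments in the pattern are ignored, so the pattern can be laid out more readably |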
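A short sketch of how the flags change matching behaviour is given below. The sample strings and the phone-number-like pattern are assumptions used only for illustration.

```python
import re

# re.I: ignore case.
ci = re.findall(r"python", "Python PYTHON python", re.I)

# re.S: let the dot match a newline as well.
dot_default = re.search(r"a.b", "a\nb")
dot_all = re.search(r"a.b", "a\nb", re.S)

# re.M: ^ matches at the start of every line, not just the string.
lines = "one two\nthree four"
line_starts_default = re.findall(r"^\w+", lines)
line_starts = re.findall(r"^\w+", lines, re.M)

# re.X: whitespace and comments inside the pattern are ignored.
verbose = re.compile(r"""
    \d{3}   # three digits
    -       # a literal hyphen
    \d{4}   # four digits
""", re.X)
v = verbose.fullmatch("555-1234")

print(ci)                    # ['Python', 'PYTHON', 'python']
print(dot_default, bool(dot_all))
print(line_starts_default)   # ['one']
print(line_starts)           # ['one', 'three']
print(v.group())             # 555-1234
```

Flags can be combined with the | operator, for example re.I | re.M.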
Once a match succeeds, use the methods of the returned Match object (for example, group()) to obtain information and perform other operations as needed.
compile function

The compile function compiles a regular expression and generates a Pattern object. Its general usage is as follows:

    import re

    # Compile the regular expression into a Pattern object
    pattern = re.compile(r'\d+')
match method

The match method looks for a match at the head of the string (you can also specify a starting position). It matches once: as soon as a match is found, it is returned, rather than searching for all matches. Its general usage is as follows:

    match(string[, pos[, endpos]])

Here, string is the string to be matched, and pos and endpos are optional parameters that specify the start and end positions within the string. Their default values are 0 and len(string) (the string length) respectively. Therefore, when you do not specify pos and endpos, the match method matches at the head of the string. When the match succeeds, a Match object is returned; if there is no match, None is returned.
    >>> import re
    >>>
    >>> pattern = re.compile(r'\d+')  # the pattern matches one or more digits
    >>>
    >>> m = pattern.match("one2three4")  # match starts at the head; the first character is the letter o, so it fails
    >>> print(m)  # a failed match returns None
    None
    >>>
    >>> m = pattern.match("1two3four")  # the first character is a digit, so the match succeeds
    >>> print(m)
    <_sre.SRE_Match object; span=(0, 1), match='1'>
    >>>
    >>> m.group()  # group() returns the matched text
    '1'
    >>> m = pattern.match("onetwo3four56", 6, 12)  # start matching at index 6, where the digit 3 is
    >>> print(m)
    <_sre.SRE_Match object; span=(6, 7), match='3'>
    >>> m.group()
    '3'
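For contrast with match, the sketch below applies the search and findall methods of the same compiled Pattern object to the first sample string from the session above. The variable names are illustrative.

```python
import re

pattern = re.compile(r"\d+")
text = "one2three4"

m_match = pattern.match(text)        # anchored at position 0: no digit there
m_search = pattern.search(text)      # scans forward to the first digit
all_digits = pattern.findall(text)   # every match, as a list of strings
m_from_3 = pattern.match(text, 3)    # pos=3 starts matching at the digit 2

print(m_match)             # None
print(m_search.group())    # 2
print(m_search.span())     # (3, 4)
print(all_digits)          # ['2', '4']
print(m_from_3.group())    # 2
```

When the position of the text is not known in advance, search is usually the more convenient choice; match is suited to checking how a string begins.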