Summary of commonly used PHP regular expressions and grammatical annotations-PHP Tutorial-php.cn

Released: 2023-04-08 17:54:01

Basic syntax

Delimiters:

A delimiter marks the beginning and end of a regular expression. You can use '/', '#', or '{ }'. Because '{ }' is also part of regular expression syntax, using it as a delimiter is not recommended, to avoid confusion. The recommended usage is as follows:

$pattern = '/[0-9]/';  // I prefer this one; it looks cleaner
$pattern = '#[0-9]#';
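The choice of delimiter affects how much escaping a pattern needs. The following is a minimal sketch, assuming a made-up sample path: with '/' as the delimiter every literal slash in the pattern has to be escaped, while '#' avoids that.

```php
<?php
// Hypothetical sample input, not taken from the article.
$path = 'usr/local';

// With '/' as the delimiter, the literal slash must be escaped.
$withSlash = preg_match('/usr\/local/', $path);

// With '#' as the delimiter, the slash needs no escaping.
$withHash = preg_match('#usr/local#', $path);

var_dump($withSlash, $withHash); // int(1) int(1)
```

Both calls match the same text; the second pattern is simply easier to read when the subject contains slashes, such as paths or URLs.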

Atoms:

Visible atoms: characters in the Unicode table that are visible when typed, for example punctuation such as ; . / ?, English letters, and Chinese characters.
Invisible atoms: characters in the Unicode table that are not visible when typed, such as newline \n, tab \t, and the space character.
Generally only these three invisible atoms are used (a newline is usually matched together with other characters rather than on its own).
Tip: to match an operator character literally, put '\' in front of it. For example, to match a literal '+' sign, write '\+'.
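The escaping tip can be sketched as follows. The sample string and the choice of '+' as the operator are assumptions made for illustration.

```php
<?php
$subject = '1+1=2'; // hypothetical sample string

// Unescaped, '+' is a quantifier: '1+1=' means "one or more 1s, then 1, then =",
// which does not occur in the literal text "1+1=2".
$unescaped = preg_match('/1+1=/', $subject);

// Escaped, '\+' matches a literal plus sign.
$escaped = preg_match('/1\+1=/', $subject);

var_dump($unescaped); // int(0)
var_dump($escaped);   // int(1)
```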

Metacharacters

Ways of selecting atoms:

| matches one of two or more alternative branches
[] matches any single atom inside the square brackets
[^] matches any single character except the atoms inside the square brackets
Example: Duang|duang or [Dd]uang matches both Duang and duang
Range notation: [a-z] matches a character from a to z, and [0-9] matches a character from 0 to 9. Ranges can be combined, as in [a-z0-9]
. matches any character except a newline
\d matches any decimal digit, that is, [0-9]
\D matches any character that is not a decimal digit, that is, [^0-9], equivalent to [^\d]
\s matches a whitespace (invisible) atom, that is, [ \f\n\r\t\v]
\S matches a non-whitespace (visible) atom, that is, [^ \f\n\r\t\v], equivalent to [^\s]
\w matches any digit, letter, or underscore, that is, [0-9a-zA-Z_]
\W matches any character that is not a digit, letter, or underscore, that is, [^0-9a-zA-Z_], equivalent to [^\w]
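A few of the metacharacters above can be tried out with preg_match. This is a small sketch in which all subject strings are made up for illustration.

```php
<?php
// Each entry is 1 when the pattern matches and 0 when it does not.
$results = [
    'class'       => preg_match('/[Dd]uang/', 'duang'),    // character class
    'alternation' => preg_match('/Duang|duang/', 'Duang'), // branch selection
    'digits'      => preg_match('/\d\d/', 'room 42'),      // two decimal digits
    'negated'     => preg_match('/[^0-9]/', '12345'),      // no non-digit present
    'dot'         => preg_match('/a.c/', "a\nc"),          // '.' does not match a newline
    'wordSpace'   => preg_match('/\w\s\w/', 'a b'),        // word char, whitespace, word char
];

var_dump($results);
// class, alternation, digits and wordSpace are 1; negated and dot are 0
```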

Quantifiers

{n} means the preceding atom appears exactly n times
{n,} means the preceding atom appears at least n times
{n,m} means the preceding atom appears at least n times and at most m times
* matches zero or more times, that is, {0,}
+ matches one or more times, that is, {1,}
? matches zero or one time, that is, {0,1}
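The quantifiers can be checked one by one, as in the sketch below. The subject strings are illustrative, and the patterns are wrapped in the ^ and $ anchors (described in the boundary control section) so that the whole string has to satisfy the quantifier.

```php
<?php
$checks = [
    'exactly3' => preg_match('/^a{3}$/', 'aaa'),       // 1: exactly three a's
    'atLeast2' => preg_match('/^a{2,}$/', 'a'),        // 0: fewer than two a's
    'range'    => preg_match('/^a{1,2}$/', 'aaa'),     // 0: more than two a's
    'star'     => preg_match('/^ab*$/', 'a'),          // 1: zero b's is allowed
    'plus'     => preg_match('/^ab+$/', 'a'),          // 0: at least one b is required
    'optional' => preg_match('/^colou?r$/', 'color'),  // 1: the 'u' may be absent
];

var_dump($checks);
```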

Boundary control

^ matches the start of the string
$ matches the end of the string
Example: ^John matches "John" but not "123John", because the string is required to start with John
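The John example can be run directly. A minimal sketch, with the extra subject strings added for illustration:

```php
<?php
$startsOk   = preg_match('/^John/', 'John Smith');  // 1: starts with John
$startsBad  = preg_match('/^John/', '123John');     // 0: John is not at the start
$endsOk     = preg_match('/John$/', '123John');     // 1: ends with John
$wholeMatch = preg_match('/^John$/', 'John Smith'); // 0: the string is not exactly "John"

var_dump($startsOk, $startsBad, $endsOk, $wholeMatch);
```

Combining ^ and $ is a common way to require that the entire string matches the pattern.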

Pattern unit

() treats everything inside the parentheses as a single atom. For example, (X|x)iaomi matches both Xiaomi and xiaomi
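Besides grouping, parentheses also capture the text they match, which preg_match stores in its matches array. A short sketch, with sample subjects chosen for illustration:

```php
<?php
// The group (X|x) is one atom; its matched text lands in $m[1].
preg_match('/(X|x)iaomi/', 'I use a xiaomi phone', $m);
var_dump($m[0]); // string(6) "xiaomi"
var_dump($m[1]); // string(1) "x"

// A quantifier after a group applies to the whole group.
preg_match('/(ab)+/', 'ababab', $n);
var_dump($n[0]); // string(6) "ababab"
```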

Pattern modifiers

Greedy matching

When more than one match is possible, take the longest one (the default).

Lazy matching

When more than one match is possible, take the shortest one. To enable it, add 'U' after the closing delimiter of the regular expression, such as '/[0-9]+/U'.
Example:

$subject = "test__123123123";
preg_match('/test.+123/', $subject, $matches);  // greedy mode
var_dump($matches);
preg_match('/test.+123/U', $subject, $matches); // lazy mode
var_dump($matches);
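As a sketch of what the greedy and lazy calls return, the version below keeps each result in its own variable. It also tries the inline `+?` form, a standard PCRE lazy quantifier that the article does not mention, as an assumed alternative to the U modifier.

```php
<?php
$subject = "test__123123123";

// Greedy (default): '.+' takes as much as possible before the final "123".
preg_match('/test.+123/', $subject, $greedy);

// Lazy via the U modifier: '.+' stops at the first possible "123".
preg_match('/test.+123/U', $subject, $lazy);

// Lazy via '+?' on a single quantifier (not covered in the article).
preg_match('/test.+?123/', $subject, $lazyInline);

var_dump($greedy[0]);     // string(15) "test__123123123"
var_dump($lazy[0]);       // string(9) "test__123"
var_dump($lazyInline[0]); // string(9) "test__123"
```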

Common pattern modifiers:

U  lazy matching (quantifiers are no longer greedy by default)
i  ignore the case of English letters
x  ignore whitespace characters in the pattern
s  make the metacharacter '.' match every character, including newlines
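The i, x, and s modifiers can each be compared against the same pattern without the modifier. This is an illustrative sketch; the subject strings are made up.

```php
<?php
$checks = [
    // i: letter case is ignored.
    'withI'    => preg_match('/hello/i', 'HeLLo'),                // 1
    'withoutI' => preg_match('/hello/', 'HeLLo'),                 // 0
    // s: '.' also matches a newline.
    'withS'    => preg_match('/a.c/s', "a\nc"),                   // 1
    'withoutS' => preg_match('/a.c/', "a\nc"),                    // 0
    // x: whitespace inside the pattern is ignored, so it can be spaced out.
    'withX'    => preg_match('/ \d{3} - \d{4} /x', '555-1234'),   // 1
    'withoutX' => preg_match('/ \d{3} - \d{4} /', '555-1234'),    // 0
];

var_dump($checks);
```

Modifiers can be combined by writing them one after another, for example '/pattern/is'.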

Common functions

preg_match

Performs a regular expression match

preg_match ( string $pattern , string $subject [, array &$matches [, int $flags = 0 [, int $offset = 0 ]]] ) : int

pattern: the pattern to search for, as a string.
subject: the input string.
matches: if provided, it is filled with the search results as a one-dimensional array. $matches[0] holds the text that matched the full pattern, $matches[1] the text captured by the first group, and so on.
flags: can be set to PREG_OFFSET_CAPTURE, in which case every entry in the results becomes an array whose element 0 is the matched string and whose element 1 is its offset (position) in subject.
offset: by default the search starts at the beginning of the subject string; offset specifies a different starting position, in bytes.

Return value: 1 if the pattern matches, 0 if it does not, or false on error.
The function preg_match_all is similar, and its parameters are largely the same as those of preg_match.
Difference:

preg_match: stops after the first match; the matches result is a one-dimensional array.
preg_match_all: finds all matches; the matches result is a two-dimensional array, and the return value is the number of full matches.
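The difference between the two functions, along with the flags and offset parameters, can be sketched as follows. The subject string and variable names are made up for illustration.

```php
<?php
$subject = 'a1b22c333'; // hypothetical sample string

// preg_match stops at the first match: a one-dimensional result.
$count = preg_match('/\d+/', $subject, $first);
// $count is 1, $first is ['1']

// preg_match_all collects every match: a two-dimensional result.
$total = preg_match_all('/\d+/', $subject, $all);
// $total is 3, $all is [['1', '22', '333']]

// PREG_OFFSET_CAPTURE pairs each match with its byte offset.
preg_match('/\d+/', $subject, $withOffset, PREG_OFFSET_CAPTURE);
// $withOffset is [['1', 1]]

// offset = 2 starts the search at the third byte ('b').
preg_match('/\d+/', $subject, $fromOffset, 0, 2);
// $fromOffset is ['22']

var_dump($count, $first, $total, $all, $withOffset, $fromOffset);
```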

preg_replace

Performs a regular expression search and replacement, and the return value is the replaced string

preg_replace ( mixed $pattern , mixed $replacement , mixed $subject [, int $limit = -1 [, int &$count ]] ) : mixed

pattern: the pattern to search for. Can be a string or an array of strings.
replacement: the replacement string or array of strings.
subject: the string or array of strings to search and replace in.
limit: the maximum number of replacements for each pattern in each subject. The default is -1 (unlimited).
count: if provided, it is filled with the number of replacements performed.
The function preg_filter is similar and takes the same parameters as preg_replace.
Difference (only visible when subject is an array): preg_replace returns every subject, whether or not a replacement was made; preg_filter returns only the subjects in which a match was found.
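The difference between preg_replace and preg_filter on an array subject, plus the limit and count parameters, can be sketched like this. The sample data is made up for illustration.

```php
<?php
$subjects = ['apple 1', 'banana', 'cherry 22']; // hypothetical input

// preg_replace returns every element, changed or not.
$replaced = preg_replace('/\d+/', '#', $subjects);
// ['apple #', 'banana', 'cherry #']

// preg_filter returns only the elements where a match occurred (keys are kept).
$filtered = preg_filter('/\d+/', '#', $subjects);
// [0 => 'apple #', 2 => 'cherry #']

// limit = 2 stops after two replacements; $count reports how many were made.
$limited = preg_replace('/\d/', 'X', 'a1b2c3', 2, $count);
// $limited is 'aXbXc3', $count is 2

var_dump($replaced, $filtered, $limited, $count);
```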

preg_split

Splits a string by a regular expression

preg_split ( string $pattern , string $subject [, int $limit = -1 [, int $flags = 0 ]] ) : array

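pattern: the pattern to search for, as a string.
subject: the input string.
limit: at most limit substrings are returned, and the last substring contains the whole remainder of the string.
flags: any combination of the following flags:
-- 1. PREG_SPLIT_NO_EMPTY: only non-empty pieces are returned.
-- 2. PREG_SPLIT_DELIM_CAPTURE: text matched by parenthesized '()' groups in the delimiter pattern is captured and returned as well.
-- 3. PREG_SPLIT_OFFSET_CAPTURE: the string offset of each returned piece is attached to it.

PREG_SPLIT_DELIM_CAPTURE can be hard to understand, so here is an example: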

$subject = "1a23b";
// Split on digits and drop empty pieces.
$a = preg_split('/[\d]/', $subject, -1, PREG_SPLIT_NO_EMPTY);
var_dump($a);
// Same split, but the parenthesized digits are returned as well.
$a = preg_split('/([\d])/', $subject, -1, PREG_SPLIT_NO_EMPTY | PREG_SPLIT_DELIM_CAPTURE);
var_dump($a);

The output is as follows:

array (size=2)
0 => string 'a' (length=1)
1 => string 'b' (length=1)
array (size=5)
0 => string '1' (length=1)
1 => string 'a' (length=1)
2 => string '2' (length=1)
3 => string '3' (length=1)
4 => string 'b' (length=1)
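The limit parameter and the PREG_SPLIT_OFFSET_CAPTURE flag are not covered by the example above. A minimal sketch with made-up input strings:

```php
<?php
// Split on runs of whitespace or commas, dropping empty pieces.
$parts = preg_split('/[\s,]+/', 'php, regex  split', -1, PREG_SPLIT_NO_EMPTY);
// ['php', 'regex', 'split']

// limit = 2: at most two pieces, the last one holds the remainder.
$limited = preg_split('/,/', 'a,b,c,d', 2);
// ['a', 'b,c,d']

// PREG_SPLIT_OFFSET_CAPTURE: each piece is paired with its byte offset.
$withOffsets = preg_split('/,/', 'a,bc', -1, PREG_SPLIT_OFFSET_CAPTURE);
// [['a', 0], ['bc', 2]]

var_dump($parts, $limited, $withOffsets);
```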

preg_grep

Returns the array entries that match the pattern

preg_grep ( string $pattern , array $input [, int $flags = 0 ] ) : array

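pattern: the pattern to search for, as a string.
input: the input array.
flags: if not set, the entries that match the pattern are returned; if set to PREG_GREP_INVERT, the entries that do not match are returned.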
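A short sketch of preg_grep with and without PREG_GREP_INVERT. The input array is made up for illustration; note that the returned array keeps the keys of the input.

```php
<?php
$input = ['apple', '42', 'banana7', '2023']; // hypothetical input

// Entries made up entirely of digits.
$numeric = preg_grep('/^\d+$/', $input);
// [1 => '42', 3 => '2023']

// PREG_GREP_INVERT returns the entries that do NOT match.
$nonNumeric = preg_grep('/^\d+$/', $input, PREG_GREP_INVERT);
// [0 => 'apple', 2 => 'banana7']

var_dump($numeric, $nonNumeric);
```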

preg_quote

Escapes regular expression characters and returns the escaped string

preg_quote ( string $str [, string $delimiter = NULL ] ) : string

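str: the input string.
delimiter: an additional character to escape, typically the delimiter used by the pattern (most commonly '/').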
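A typical use of preg_quote is to embed arbitrary text in a pattern so that it is matched literally. The sketch below assumes a made-up input string and '/' as the pattern delimiter.

```php
<?php
// Hypothetical text that contains regex operators and the '/' delimiter.
$userInput = '1+1=2 (a/b)';

// Escape regex special characters, plus '/' because it is our delimiter.
$quoted = preg_quote($userInput, '/');
var_dump($quoted); // string(16) "1\+1\=2 \(a\/b\)"

// The quoted text now matches only itself, literally.
$found = preg_match('/' . $quoted . '/', 'check: 1+1=2 (a/b)');
var_dump($found); // int(1)
```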
