Detailed explanation of regular expression pattern modifiers

php中世界最好的语言
Release: 2018-03-30 13:35:38

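This article explains how regular expression pattern modifiers are used in PHP and what to watch out for when using them. The practical details follow; let's take a look.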

i (PCRE_CASELESS)

If this modifier is set, letters in the pattern match both upper and lower case letters.
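
A minimal sketch of the difference, assuming PHP 7 or later; the subject string and variable names are illustrative only and do not come from the original article.

```php
<?php
$subject = 'Hello PHP';

// Without i, the lowercase pattern does not match the uppercase "PHP".
$caseSensitive = preg_match('/php/', $subject);

// With i, letter case is ignored.
$caseInsensitive = preg_match('/php/i', $subject);

var_dump($caseSensitive, $caseInsensitive); // int(0), int(1)
```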

m (PCRE_MULTILINE)

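By default, PCRE treats the subject string as a single "line" of characters (even if it actually contains several newlines). The "start of line" metacharacter (^) matches only at the start of the string, while the "end of line" metacharacter ($) matches only at the end of the string, or before a terminating newline (unless the D modifier is set). This is the same as Perl. When this modifier is set, the "start of line" and "end of line" constructs match immediately after or immediately before any newline in the subject string, respectively, as well as at the very start and end. This is equivalent to Perl's /m modifier. If the subject string contains no "\n" characters, or the pattern contains no ^ or $, setting this modifier has no effect.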
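
The behaviour described above can be sketched as follows. This is only an illustration, assuming PHP 7 or later; the three-line subject string and the variable names are made up for the example.

```php
<?php
$subject = "first\nsecond\nthird";

// Without m, ^ matches only at the very start of the subject.
preg_match_all('/^\w+/', $subject, $singleLine);

// With m, ^ also matches immediately after each "\n".
preg_match_all('/^\w+/m', $subject, $multiLine);

print_r($singleLine[0]); // first
print_r($multiLine[0]);  // first, second, third
```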

s (PCRE_DOTALL)

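If this modifier is set, the dot metacharacter in the pattern matches all characters, including newlines. Without it, the dot does not match newlines. This modifier is equivalent to Perl's /s modifier. A negated character class such as [^a] always matches a newline character, regardless of this modifier.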
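
A short sketch of the three cases mentioned above, assuming PHP 7 or later; the subject string is an arbitrary two-line example.

```php
<?php
$subject = "line one\nline two";

// Without s, the dot refuses to match the "\n" between the two lines.
$withoutS = preg_match('/one.line/', $subject);

// With s, the dot matches any character, including "\n".
$withS = preg_match('/one.line/s', $subject);

// A negated class matches "\n" whether or not s is set.
$negatedClass = preg_match('/one[^a]line/', $subject);

var_dump($withoutS, $withS, $negatedClass); // int(0), int(1), int(1)
```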

x (PCRE_EXTENDED)

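If this modifier is set, whitespace data characters in the pattern are ignored unless they are escaped or inside a character class, and the characters between an unescaped # outside a character class and the next newline are ignored as well. This modifier is equivalent to Perl's /x modifier and makes it possible to include comments inside complicated patterns. Note that this applies only to data characters. Whitespace may never appear within special character sequences in a pattern, for example within the sequence (?( which introduces a conditional subpattern. (Translator's note: whitespace inside a special sequence defined by this syntax causes a compile error, for example when a space is placed inside the (?( sequence.)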
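
As an illustration of why this is useful, the sketch below spreads a date pattern over several commented lines. It assumes PHP 7 or later; the pattern and the sample text are hypothetical.

```php
<?php
// With x, layout whitespace and "# ..." comments are not part of the pattern.
$pattern = '/
    (\d{4})   # year
    -
    (\d{2})   # month
    -
    (\d{2})   # day
/x';

preg_match($pattern, 'Released 2018-03-30', $m);

print_r($m); // 2018-03-30, 2018, 03, 30
```

Note that a comment must not contain the pattern delimiter (here /), because PHP locates the closing delimiter before PCRE ever sees the comment.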

e (PREG_REPLACE_EVAL)

Warning

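This feature has been deprecated as of PHP 5.5.0 and was removed in PHP 7.0.0. Relying on it is strongly discouraged.

If this deprecated modifier is set, preg_replace() performs the normal substitution of backreferences in the replacement string, evaluates the result as PHP code (in the manner of eval), and uses the outcome of that evaluation as the actual replacement text. Single quotes, double quotes, backslashes (\) and NULL characters in substituted backreferences are escaped with backslashes.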

Caution

The addslashes() function is run on each matched backreference before the substitution takes place. As such, when the backreference is used as a quoted string, escaped characters are converted to literals. However, escaped characters that would normally not be converted retain their slashes. This makes the modifier very complicated to use.

Caution

Make sure that the replacement argument is a valid PHP code string; otherwise PHP will report a parse error on the line containing the preg_replace() call.

Caution

Use of this modifier is discouraged, as it can easily introduce security vulnerabilities:

$html = preg_replace(
    '(<h([1-6])>(.*?)</h\1>)e',
    '"<h$1>" . strtoupper("$2") . "</h$1>"',
    $html
);

The above example code can be easily exploited by passing in a string such as

<h1>{${eval($_GET[php_code])}}</h1>

This gives the attacker the ability to execute arbitrary PHP code and, with it, nearly complete access to your server.

To prevent this kind of remote code execution vulnerability the preg_replace_callback() function should be used instead:

$html = preg_replace_callback(
    '(<h([1-6])>(.*?)</h\1>)',
    function ($m) {
        return "<h$m[1]>" . strtoupper($m[2]) . "</h$m[1]>";
    },
    $html
);
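
To make the difference concrete, here is a self-contained sketch of the callback approach applied to the exploit string quoted above. It assumes PHP 7 or later; the $html value is a hypothetical stand-in for untrusted input, and $safe is an illustrative variable name.

```php
<?php
// Hypothetical untrusted input. Single quotes keep it a plain string.
$html = '<h1>{${eval($_GET[php_code])}}</h1><h2>intro</h2>';

// The callback receives the match as data; nothing is evaluated as PHP code.
$safe = preg_replace_callback(
    '(<h([1-6])>(.*?)</h\1>)',
    function ($m) {
        return "<h$m[1]>" . strtoupper($m[2]) . "</h$m[1]>";
    },
    $html
);

echo $safe; // <h1>{${EVAL($_GET[PHP_CODE])}}</h1><h2>INTRO</h2>
```

The payload is merely upper-cased like any other heading text, because the matched text never passes through eval.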

Note:

Only preg_replace() uses this modifier; other PCRE functions ignore it.

A (PCRE_ANCHORED)

If this modifier is set, the pattern is forced to be "anchored", that is, it is constrained to match only at the start of the subject string. The same effect can be achieved with appropriate constructs in the pattern itself, which is the only way to do it in Perl.
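
A small sketch of anchored versus unanchored matching, assuming PHP 7 or later; the subject strings and variable names are illustrative.

```php
<?php
// Unanchored: the digits are found in the middle of the subject.
$unanchored = preg_match('/\d+/', 'abc123');

// Anchored with A: the match must begin at the start of the subject.
$anchoredMiss = preg_match('/\d+/A', 'abc123');
$anchoredHit  = preg_match('/\d+/A', '123abc');

// The same constraint expressed inside the pattern with ^.
$explicit = preg_match('/^\d+/', 'abc123');

var_dump($unanchored, $anchoredMiss, $anchoredHit, $explicit); // 1, 0, 1, 0
```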

D (PCRE_DOLLAR_ENDONLY)

If this modifier is set, a dollar metacharacter in the pattern matches only at the very end of the subject string. Without this modifier, a dollar also matches immediately before the final character if that character is a newline (but not before any other newline). This modifier is ignored if the m modifier is set. There is no equivalent to this modifier in Perl.
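
The effect is easiest to see on a subject that ends with a newline. The following is a sketch assuming PHP 7 or later, with made-up variable names.

```php
<?php
$subject = "abc\n";

// Default: $ also matches just before a trailing "\n".
$withoutD = preg_match('/abc$/', $subject);

// With D: $ matches only at the very end, so the trailing "\n" blocks it.
$withD = preg_match('/abc$/D', $subject);

// With m set as well, D is ignored and $ matches before the "\n" again.
$withDandM = preg_match('/abc$/Dm', $subject);

var_dump($withoutD, $withD, $withDandM); // int(1), int(0), int(1)
```

This matters for input validation: without D, a pattern such as /^\w+$/ accepts a value with a trailing newline.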

S

When a pattern is going to be used several times, it is worth spending more time analyzing it in order to speed up matching. If this modifier is set, this extra analysis is performed. At present, studying a pattern is useful only for non-anchored patterns that do not have a single fixed starting character.

U (PCRE_UNGREEDY)

This modifier inverts the "greediness" of the quantifiers: they are not greedy by default, but become greedy if followed by ?. It is not compatible with Perl. Ungreedy matching can also be turned on with a (?U) modifier setting inside the pattern, or requested for a single quantifier by placing a question mark after it (e.g. .*?).

Note:

It is usually not possible to match more than pcre.backtrack_limit characters in ungreedy mode.
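
The inversion described above can be sketched as follows, assuming PHP 7 or later. The sample markup and variable names are illustrative, and # is used as the delimiter so that the slash in </b> needs no escaping.

```php
<?php
$subject = '<b>one</b><b>two</b>';

// Default: .* is greedy and runs to the last </b>.
preg_match('#<b>.*</b>#', $subject, $greedy);

// With U: .* becomes ungreedy and stops at the first </b>.
preg_match('#<b>.*</b>#U', $subject, $ungreedy);

// With U, a trailing ? flips the quantifier back to greedy.
preg_match('#<b>.*?</b>#U', $subject, $flipped);

// The inline (?U) setting has the same effect as the U modifier.
preg_match('#(?U)<b>.*</b>#', $subject, $inline);

echo $greedy[0], "\n";   // <b>one</b><b>two</b>
echo $ungreedy[0], "\n"; // <b>one</b>
echo $flipped[0], "\n";  // <b>one</b><b>two</b>
echo $inline[0], "\n";   // <b>one</b>
```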

X (PCRE_EXTRA)

This modifier turns on additional functionality of PCRE that is incompatible with Perl. Any backslash in a pattern that is followed by a letter with no special meaning causes an error, thus reserving these combinations for future expansion. By default, as in Perl, a backslash followed by a letter with no special meaning is treated as a literal. No other features are currently controlled by this modifier.

J (PCRE_INFO_JCHANGED)

The (?J) internal option setting changes the local PCRE_DUPNAMES option, which allows subpatterns with duplicate names. (Translator's note: before PHP 7.2.0 this could only be set through the internal option, and an external /J modifier produced an error. As of PHP 7.2.0, J is also supported as a modifier.)
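
A minimal sketch of the inline (?J) setting, assuming PHP 7 or later. The pattern, the group name year and the subject are hypothetical; the example only checks whether the pattern compiles and matches, since how duplicate names are reported in the match array has varied between PHP versions.

```php
<?php
$subject = '03/2018';

// With (?J), two groups may share the name "year".
$withJ = preg_match(
    '~(?J)(?<year>\d{4})-\d{2}|\d{2}/(?<year>\d{4})~',
    $subject,
    $m
);

// Without (?J), the duplicate name is a compile error: preg_match() returns
// false and raises a warning, which is silenced here with @.
$withoutJ = @preg_match(
    '~(?<year>\d{4})-\d{2}|\d{2}/(?<year>\d{4})~',
    $subject
);

var_dump($withJ, $m[0], $withoutJ); // int(1), "03/2018", bool(false)
```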

u (PCRE_UTF8)

This modifier turns on additional functionality of PCRE that is incompatible with Perl. Pattern strings are treated as UTF-8. This modifier is available from PHP 4.1.0 on Unix and from PHP 4.2.3 on Win32. UTF-8 validity of the pattern is checked since PHP 4.3.5.
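
The practical consequence is that, with u, the dot and other constructs work on whole characters instead of single bytes. The sketch below assumes PHP 7 or later (for the \u{...} string escape); the variable names are illustrative.

```php
<?php
// Two Chinese characters (U+6B63, U+5219), three bytes each in UTF-8.
$subject = "\u{6B63}\u{5219}";

// Without u, the dot consumes one byte at a time.
$byteMatches = preg_match_all('/./', $subject);

// With u, the dot consumes one UTF-8 character at a time.
$charMatches = preg_match_all('/./u', $subject);

// With u, an invalid UTF-8 subject makes the match fail with an error.
$invalid = preg_match('/./u', "\xFF");
$isUtf8Error = preg_last_error() === PREG_BAD_UTF8_ERROR;

var_dump($byteMatches, $charMatches, $invalid, $isUtf8Error); // 6, 2, false, true
```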

source:php.cn