A brief discussion on regular expressions in PHP

This article introduces regular expressions in PHP, from the basic terminology to lookarounds, capture groups, and lazy matching.

Introduction

Regular expressions are something everyone uses often during development. Most languages support them today, including JavaScript, Java, .NET, and PHP. Here I will share my understanding of regular expressions; if anything is inaccurate, please point it out.

Need-to-know terms: how many of the following do you know?

Δ Delimiter

Δ Character class

Δ Modifier

Δ Quantifier

Δ Caret

Δ Lookaround (lookahead, lookbehind)

Δ Back reference

Δ Lazy matching

Δ Comment

Δ Zero width

When to use them

When should we use regular expressions? Not for every string operation: for simple tasks, a regular expression costs more than PHP's plain string functions and hurts efficiency. When we need to parse complex text data, regular expressions are the better choice.

Advantages

Regular expressions can improve efficiency when dealing with complex string operations, and they also reduce the amount of code you have to write.

Disadvantages

Complex regular expressions make the code harder to read and understand. That is why we sometimes need to add comments inside a regular expression.

Basic patterns

¤ Delimiter: "/" is usually used to mark the start and end of the pattern; "#" can also be used.

When should you use "#"? Generally when the pattern itself contains many "/" characters, as in a URI, because each "/" would otherwise have to be escaped.

The code using the "/" delimiter is as follows.

// With "/" as the delimiter, every "/" inside the pattern is escaped as "\/".
$regex = '/^http:\/\/([\w.]+)\/([\w]+)\/([\w]+)\.html$/i';
$str = 'http://www.youku.com/show_page/id_ABCDEFG.html';
$matches = array();

if(preg_match($regex, $str, $matches)){
    var_dump($matches);
}

echo "\n";

After preg_match, $matches[0] contains the text that matched the entire pattern, and $matches[1], $matches[2], and so on contain the text captured by each parenthesized group.
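To make the layout of $matches concrete, here is a minimal sketch that reuses the pattern and URL from the example above and prints each element on its own line. The variable name $found is only illustrative.

```php
<?php
// Same pattern and URL as in the example above.
$regex = '/^http:\/\/([\w.]+)\/([\w]+)\/([\w]+)\.html$/i';
$str = 'http://www.youku.com/show_page/id_ABCDEFG.html';
$matches = array();

// preg_match returns 1 on a match, 0 on no match, and false on error.
$found = preg_match($regex, $str, $matches);

if ($found === 1) {
    echo $matches[0], "\n"; // the whole matched URL
    echo $matches[1], "\n"; // group 1: www.youku.com
    echo $matches[2], "\n"; // group 2: show_page
    echo $matches[3], "\n"; // group 3: id_ABCDEFG
}
```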

The code using the "#" delimiter is as follows. Here "/" does not need to be escaped.

$regex = '#^http://([\w.]+)/([\w]+)/([\w]+)\.html$#i';
$str = 'http://www.youku.com/show_page/id_ABCDEFG.html';
$matches = array();

if(preg_match($regex, $str, $matches)){
    var_dump($matches);
}

echo "\n";

¤ Modifier: changes the behavior of the regular expression.

In the pattern we saw above ('/^http:\/\/([\w.]+)\/([\w]+)\/([\w]+)\.html$/i'), the final "i" is a modifier that makes the match case-insensitive. Another one we often use is "x", which makes the pattern ignore whitespace.

Example code:

$regex = '/HELLO/';
$str = 'hello world';
$matches = array();

// Without i the match is case-sensitive, so this branch does not run.
if(preg_match($regex, $str, $matches)){
    echo 'Without i: matched!',"\n";
}

// Appending i to the pattern makes the match case-insensitive.
if(preg_match($regex.'i', $str, $matches)){
    echo 'With i: matched!',"\n";
}
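The "x" modifier mentioned above can be sketched in the same way. The date pattern and variable names below are hypothetical, chosen only for illustration: the same pattern is written once compactly and once spread over several lines, where x lets us add whitespace and "#" comments.

```php
<?php
// Hypothetical date pattern: compact form first, then laid out with the x modifier.
$compact = '/^(\d{4})-(\d{2})-(\d{2})$/';
$extended = '/
    ^ (\d{4})    # year
    - (\d{2})    # month
    - (\d{2})    # day
    $
/x';

$str = '2024-01-31';
preg_match($compact, $str, $a);
preg_match($extended, $str, $b);
var_dump($a === $b); // both patterns capture the same groups

// With x, an unescaped space in the pattern is ignored, so escape it to match one.
$ignored = preg_match('/hello world/x', 'hello world');
$escaped = preg_match('/hello\ world/x', 'hello world');
var_dump($ignored, $escaped);
```

The last two calls show the main pitfall: once x is on, a literal space has to be written as "\ " (or put inside a character class).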

¤ Character class: in [\w], the part enclosed in square brackets is a character class.

¤ Quantifier: in [\w]{3,5}, [\w]*, or [\w]+, the symbol following [\w] is a quantifier. Their meanings are as follows.

{3,5} means 3 to 5 characters, {3,} means 3 or more, {0,5} means at most 5, and {3} means exactly 3. Write {0,5} rather than {,5}: older PCRE versions treat {,5} as literal text.

* means zero or more.

+ means one or more.
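A small sketch can make the quantifier rules above easier to check. The helper function matches() is hypothetical and exists only for this illustration; each pattern is anchored with ^ and $ so the quantifier applies to the whole string.

```php
<?php
// Hypothetical helper: true when the pattern matches the string.
function matches(string $regex, string $str): bool
{
    return preg_match($regex, $str) === 1;
}

var_dump(matches('/^\w{3,5}$/', 'ab'));      // 2 characters: too short
var_dump(matches('/^\w{3,5}$/', 'abcd'));    // 4 characters: within 3 to 5
var_dump(matches('/^\w{3,5}$/', 'abcdef'));  // 6 characters: too long
var_dump(matches('/^\w{3,}$/', 'abcdef'));   // 3 or more
var_dump(matches('/^\w{3}$/', 'abcd'));      // exactly 3 required
var_dump(matches('/^\w*$/', ''));            // * accepts zero characters
var_dump(matches('/^\w+$/', ''));            // + needs at least one
```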

¤ Caret (^)

Inside a character class it means "reverse selection": [^n] matches any character except n.

Placed at the start of an expression, it means the string must start with what follows (/^n/i means "starts with n").

Note that "\" is usually called the "escape character". It is used to escape special symbols such as "." and "/".
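The two roles of the caret and the effect of escaping can be sketched as follows. The sample strings and variable names are made up for illustration.

```php
<?php
// ^ at the start of the pattern: anchor to the beginning of the string.
$startsWithN = preg_match('/^n/i', 'Name'); // "N" comes first (i ignores case)
$nInMiddle   = preg_match('/^n/i', 'one');  // "n" is not the first character

// ^ as the first character inside [...]: negate the class.
preg_match('/[^n]+/', 'nnabcn', $m);        // first run of characters that are not "n"

// Escaping: an unescaped "." matches any character, "\." only a literal dot.
$anyChar    = preg_match('/^1.5$/', '125');
$literalDot = preg_match('/^1\.5$/', '125');

var_dump($startsWithN, $nInMiddle, $m[0], $anyChar, $literalDot);
```

When the text to match comes from user input, PHP's preg_quote() function can add these escapes for you.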

Lookarounds: assert that certain characters are present or absent at a given position in the string.

Lookarounds are divided into two types: lookaheads (?=) and lookbehinds (?<=).

> Format:

Lookahead: (?=...) The corresponding negative form is (?!...)

Lookbehind: (?<=...) The corresponding negative form is (?<!...)

In each case the parentheses contain the characters to test for.

$regex = '/(?<=c)d(?=e)/';  /* d is immediately preceded by c and immediately followed by e */
$str = 'abcdefgk';
$matches = array();

if(preg_match($regex, $str, $matches)){
    var_dump($matches);
}

echo "\n";

Negative form:

$regex = '/(?<!c)d(?!e)/';  /* d is not immediately preceded by c and not immediately followed by e */
$str = 'abcdefgk';
$matches = array();

// The only d in $str is preceded by c and followed by e, so nothing matches here.
if(preg_match($regex, $str, $matches)){
    var_dump($matches);
}

echo "\n";
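Because the negative example above finds no match in 'abcdefgk', it may help to see lookarounds that do match. The following is only a sketch on a made-up sample string; the variable names are illustrative.

```php
<?php
$str = 'cost: $42 or 17 EUR'; // hypothetical sample text

// Positive lookbehind: digits that directly follow "$"; the "$" is not part of the match.
preg_match('/(?<=\$)\d+/', $str, $dollars);

// Negative lookbehind: a number that does not directly follow "$".
preg_match('/(?<!\$)\b\d+/', $str, $other);

// Negative lookahead: a "q" that is not followed by "u".
$iraq  = preg_match('/q(?!u)/', 'Iraq');
$queen = preg_match('/q(?!u)/', 'queen');

var_dump($dollars[0], $other[0], $iraq, $queen);
```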

> Character width: zero

Code to verify the zero width

$regex = '/HE(?=L)LO/i';
$str = 'HELLO';
$matches = array();

if(preg_match($regex, $str, $matches)){
    var_dump($matches);
}

echo "\n";

Nothing is printed!

$regex = '/HE(?=L)LLO/i';
$str = 'HELLO';
$matches = array();

if(preg_match($regex, $str, $matches)){
    var_dump($matches);
}

echo "\n";

This time a result is printed!

Explanation: (?=L) means that HE must be immediately followed by an L. But (?=L) itself consumes no characters. Contrast it with (L), which consumes one character.
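The difference between (?=L) and (L) can also be seen directly in the matched text. This is a minimal sketch; the variable names are illustrative.

```php
<?php
// The lookahead checks for "L" but leaves it out of the match.
preg_match('/HE(?=L)/', 'HELLO', $lookahead);

// A plain group consumes the "L", so it becomes part of the match.
preg_match('/HE(L)/', 'HELLO', $group);

var_dump($lookahead[0]); // matched text without the "L"
var_dump($group[0]);     // matched text including the "L"
```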

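Capturing data

Groups that do not specify a type are captured and can be used later.

> "Specifying a type" refers to the lookaround syntax and similar forms. So only groups whose opening parenthesis is not followed by a question mark are captured (named groups, covered below, are the exception).

> A reference to a group within the same expression is called a back reference.

> Syntax: \number (for example \1).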

// \1 must repeat exactly what the first group captured.
$regex = '/^(Chuanshanjia)[\w\s!]+\1$/';
$str = 'Chuanshanjia thank Chuanshanjia';
$matches = array();

if(preg_match($regex, $str, $matches)){
    var_dump($matches);
}

echo "\n";
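A common use of back references is finding a word that is accidentally typed twice. The following sketch assumes a made-up input string.

```php
<?php
$str = 'this is is a test'; // hypothetical input with a doubled word

// (\w+) captures a word; \1 requires the same text again after whitespace.
if (preg_match('/\b(\w+)\s+\1\b/', $str, $m)) {
    echo 'Repeated word: ', $m[1], "\n";
}
```

The \b word boundaries keep the pattern from matching inside longer words, such as the "is" at the end of "this".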

> Avoiding capture

Syntax: (?:pattern)

Advantage: it keeps the number of back references to a minimum, so the code is cleaner and clearer.
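The effect of (?:pattern) on the numbering of groups can be sketched as follows. The URL and variable names are chosen only for illustration.

```php
<?php
$str = 'https://www.php.net/manual'; // sample URL

// Capturing group around the scheme: the host ends up in $with[2].
preg_match('#(https?|ftp)://([\w.]+)#', $str, $with);

// Non-capturing group: the alternatives are still grouped but not stored,
// so the host is now $without[1].
preg_match('#(?:https?|ftp)://([\w.]+)#', $str, $without);

var_dump(count($with), count($without));
var_dump($with[2] === $without[1]);
```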

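> Named capture groups

Syntax: (?P<name>...) Reference syntax: (?P=name)

$regex = '/(?P<author>chuanshanjia)[\s]Is[\s](?P=author)/i';
$str = 'author:chuanshanjia Is chuanshanjia';
$matches = array();

if(preg_match($regex, $str, $matches)){
    var_dump($matches);
}

echo "\n";

Result: $matches[0] is "chuanshanjia Is chuanshanjia", and the captured text "chuanshanjia" is available both as $matches['author'] and as $matches[1].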

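Lazy matching (remember: it works in two steps; see the principle below)

Syntax: quantifier?

Principle: when "?" follows a quantifier, the quantifier first takes as few characters as possible: "*" takes 0, "+" takes 1, and {3,5} takes 3. It takes more only if the rest of the pattern cannot otherwise match.

First look at the following code samples:

Code 1.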

<?php
$regex = '/heL*/i';
$str = 'heLLLLLLLLLLLLLLLL';
if(preg_match($regex, $str, $matches)){
    var_dump($matches);
}

echo "\n";

Result 1: the greedy L* takes every L, so the entire string is matched.

Code 2.

<?php
$regex = '/heL*?/i';
$str = 'heLLLLLLLLLLLLLLLL';
if(preg_match($regex, $str, $matches)){
    var_dump($matches);
}

echo "\n";

Result 2: the lazy L*? takes zero L characters, so only "he" is matched.

Code 3, using "+"

<?php
$regex = '/heL+?/i';
$str = 'heLLLLLLLLLLLLLLLL';
if(preg_match($regex, $str, $matches)){
    var_dump($matches);
}

echo "\n";

Result 3: the lazy L+? takes one L, so "heL" is matched.

Code 4, using {3,10}

<?php
$regex = '/heL{3,10}?/i';
$str = 'heLLLLLLLLLLLLLLLL';
if(preg_match($regex, $str, $matches)){
    var_dump($matches);
}

echo "\n";

Result 4: the lazy L{3,10}? takes three L characters, so "heLLL" is matched.
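Lazy matching matters most when a pattern has something after the quantifier. The following sketch uses a made-up HTML fragment to contrast the greedy and lazy forms.

```php
<?php
$html = '<b>one</b> and <b>two</b>'; // hypothetical sample markup

// Greedy: .* runs to the last </b>, so both tags end up in a single match.
preg_match('#<b>.*</b>#', $html, $greedy);

// Lazy: .*? stops at the first </b> that lets the pattern succeed.
preg_match('#<b>.*?</b>#', $html, $lazy);

var_dump($greedy[0]);
var_dump($lazy[0]);
```

Here the lazy quantifier starts from zero characters and grows one character at a time until "</b>" can match, which is the two-step behavior described in the principle above.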

Comments in regular expressions

Syntax: (?# comment text)

Use: mainly for annotating complex expressions

Example code: a regular expression for a MySQL connection string

$regex = '/
    ^host=(?<!\.)([\d.]+)(?!\.)                 (?# host address)
\|
    ([\w!@#$%^&*()_+\-]+)                       (?# user name)
\|
    ([\w!@#$%^&*()_+\-]+)                       (?# password)
(?!\|)$/ix';

$str = 'host=192.168.10.221|root|123456';
$matches = array();

if(preg_match($regex, $str, $matches)){
    var_dump($matches);
}

echo "\n";

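Special characters

Special character	Explanation
*	0 or more times
+	1 or more times; can also be written {1,}
?	0 or 1 time
.	Matches any single character except a newline
\w	[a-zA-Z0-9_]
\s	Whitespace (space, tab, newline, carriage return, and similar), for example [ \t\n\r]
\d	[0-9]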
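A few of the entries in the table can be tried out with a short sketch. The sample strings are made up, and the "s" modifier used in the last call is not covered in this article: it makes "." match newlines as well.

```php
<?php
// \d+ : every run of digits.
preg_match_all('/\d+/', 'a1 b22 c333', $digits);

// \s+ : split on any run of whitespace (spaces, tabs, newlines).
$words = preg_split('/\s+/', "one \t two\nthree");

// "." does not match a newline unless the s modifier is added.
$dotPlain = preg_match('/a.b/', "a\nb");
$dotWithS = preg_match('/a.b/s', "a\nb");

var_dump($digits[0], $words, $dotPlain, $dotWithS);
```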

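Examples

1. Matching Chinese characters in PHP

<?php
$str = "PHP编程";
// The u modifier treats pattern and subject as UTF-8; \x{4e00}-\x{9fa5} covers common Chinese characters.
if (preg_match('/([0-9a-zA-Z\x{4e00}-\x{9fa5}]+)/u', $str, $matches)) {
    var_dump($matches);
    echo "\n";
}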
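As an alternative to spelling out a code point range, PCRE also supports Unicode script properties. The following is only a sketch, not the article's method: it assumes the file and the string are UTF-8 encoded and uses \p{Han} to pick out runs of Han characters.

```php
<?php
// Assumes this file and the string are UTF-8 encoded.
$str = 'PHP编程 and 正则';

// \p{Han} matches one Han character; the u modifier enables UTF-8 mode.
preg_match_all('/\p{Han}+/u', $str, $m);

var_dump($m[0]);
```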
