Metacharacters in PHP regular expressions

Metacharacters

Here is a problem: \d matches exactly one digit. What should I do if I want to match eight digits, ten digits, or any number of digits?

This is where metacharacters come in. An atom on its own matches only a single character, so atoms alone cannot describe a run of several characters.
Metacharacters modify atoms so that a pattern can do more.

Don’t be put off by the table below. Each entry becomes clear once we experiment with it step by step, and these metacharacters are what make patterns versatile.
It helps to write them on a small card as a memory aid.

Let’s take a look:

Metacharacter	Function description
*	Matches the preceding atom 0 or more times
+	Matches the preceding atom 1 or more times
?	Makes the preceding atom optional (0 or 1 time)
.	Matches any character except \n (strictly speaking, the dot counts as an atom)
|	Or (alternation). Note: it has the lowest priority
^	The match must begin at the start of the string, with what follows ^
$	The match must end at the end of the string, with what precedes $
\b	Word boundary
\B	Non-word boundary
{m}	Exactly m times
{n,m}	From n to m times
{m,}	At least m times, with no upper limit
()	Changes the priority or treats several characters as one unit; it also captures the matched text

+ matches the preceding character at least once.

<?php
$zz = '/\d+/';

$string = "迪奥和奥迪250都是我最爱";

// Afterwards, try a string with no 0-9 digits in it
//$string = "迪奥和奥迪都是我最爱";


if(preg_match($zz, $string, $matches)){
   echo 'Matched, the result is: ';
   var_dump($matches);
}else{
   echo 'No match';
}

?>

The match succeeds and captures 250, which demonstrates the + in \d+: \d matches a digit, and + requires the preceding atom to appear at least once.
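As a minimal sketch of the experiment above, the same pattern can be run against both the string with digits and the commented-out string without them. The variable names ($pattern, $withDigits, $withoutDigits) are chosen here for illustration and are not part of the original example.

```php
<?php
// Sketch: the same \d+ pattern against a string with digits and one without.
$pattern = '/\d+/';

$withDigits    = "迪奥和奥迪250都是我最爱";
$withoutDigits = "迪奥和奥迪都是我最爱";

// preg_match() returns 1 when the pattern is found and 0 when it is not.
$found = preg_match($pattern, $withDigits, $matches);
echo $found, ' ', $matches[0], "\n";   // 1 250

$foundNone = preg_match($pattern, $withoutDigits, $noMatches);
echo $foundNone, "\n";                 // 0
```

Because + needs at least one digit, the second string produces no match at all.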

* Matches the previous character 0 times or any number of times

<?php
$zz = '/\w*/';

$string = "!@!@!!@#@!$@#!";

// Afterwards, try a string that contains letters in the middle
//$string1 = "!@#!@#!abcABC#@#!";


if(preg_match($zz, $string, $matches)){
   echo 'Matched, the result is: ';
   var_dump($matches);
}else{
   echo 'No match';
}

?>

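Explanation: both $string and the commented-out $string1 match successfully. \w matches 0-9, A-Z, a-z and _, and * means the preceding \w does not have to be present at all; if it is present, it may appear once or many times. In both cases the match is an empty string at the very start, because the first character is not a word character and zero occurrences already satisfy \w*.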
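To make the difference between * and + visible, here is a small sketch that runs both quantifiers against the commented-out $string1. The names $star and $plus are illustrative only.

```php
<?php
$string1 = "!@#!@#!abcABC#@#!";

// \w* is satisfied by zero characters, so it succeeds at position 0
// with an empty match and never reaches "abcABC".
preg_match('/\w*/', $string1, $star);
var_dump($star[0]);   // string(0) ""

// \w+ needs at least one word character, so the engine keeps moving
// forward until it finds "abcABC".
preg_match('/\w+/', $string1, $plus);
var_dump($plus[0]);   // string(6) "abcABC"
```

A successful preg_match() with * therefore does not prove that any word character was found.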

? The previous character appears 0 or 1 times, optional

<?php

$zz = '/ABC\d?ABC/';

$string = "ABC1ABC";

// Afterwards, try several digits in the middle, and then no digit at all
//$string1 = "ABC888888ABC";
//$string2 = "ABCABC";


if(preg_match($zz, $string, $matches)){
   echo 'Matched, the result is: ';
   var_dump($matches);
}else{
   echo 'No match';
}

?>

$string and $string2 match successfully, but $string1 fails.
The pattern requires ABC on both sides with at most one 0-9 digit between them. The digit is optional, but there cannot be more than one.
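The three cases can be checked in one pass. This is only a sketch: the $subjects array and the $results map are names introduced here, while the strings themselves come from the example above.

```php
<?php
$pattern  = '/ABC\d?ABC/';
$subjects = ["ABC1ABC", "ABC888888ABC", "ABCABC"];

$results = [];
foreach ($subjects as $subject) {
    // 1 means the pattern was found, 0 means it was not
    $results[$subject] = preg_match($pattern, $subject);
    echo $subject, ' => ', $results[$subject], "\n";
}
// ABC1ABC => 1
// ABC888888ABC => 0
// ABCABC => 1
```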

. (dot) Matches all characters except \n

<?php

$zz = '/gg.+gg/';

$string = "ABC1ABC";


if(preg_match($zz, $string, $matches)){
   echo 'Matched, the result is: ';
   var_dump($matches);
}else{
   echo 'No match';
}

?>

With this $string the match fails: the pattern requires gg, then one or more characters other than \n, then gg, and ABC1ABC contains no gg.
A string such as ggABC1ABCgg would match, with .+ covering ABC1ABC.
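The following sketch shows the dot at work, including the \n exception. Only "ABC1ABC" comes from the example above; the other two strings are hypothetical test inputs made up for this illustration.

```php
<?php
$pattern = '/gg.+gg/';

$noGg      = "ABC1ABC";        // the string from the example above
$sameLine  = "ggABC1ABCgg";    // hypothetical: gg on both sides, one line
$withBreak = "ggABC\n1ABCgg";  // hypothetical: a newline between the gg pairs

$r1 = preg_match($pattern, $noGg);       // 0: there is no gg at all
$r2 = preg_match($pattern, $sameLine);   // 1: .+ covers "ABC1ABC"
$r3 = preg_match($pattern, $withBreak);  // 0: the dot cannot cross \n

echo $r1, $r2, $r3, "\n";                // 010
```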

|(vertical bar), or, the lowest priority

Let’s use an experiment to see how "or" matching and priority work

<?php

$zz = '/abc|bcd/';

$string1 = "abccd";
$string2 = "ggggbcd";

if (preg_match($zz, $string1, $matches)) {
   echo 'Matched, the result is: ';
   var_dump($matches);
} else {
   echo 'No match';
}

?>

Let’s see:

1. My original idea was to match abccd or abbcd. However, matching $string1 gives abc, and matching $string2 gives bcd.

2. This is an "or" match: the result is either abc or bcd. The | has a lower priority than characters written next to each other, so the pattern reads as "abc or bcd", not as "ab, then c or b, then cd".

Then the question is: what should I do if I want to match abccd or abbcd in the above example?

You need to use () to change the priority.

<?php

$zz = '/ab(c|b)cd/';

$string1 = "起来abccd阅兵";
$string2 = "ggggbcd";
$string3 = '中国abbcd未来';

if (preg_match($zz, $string1, $matches)) {
   echo 'Matched, the result is: ';
   var_dump($matches);
} else {
   echo 'No match';
}

?>

The result for $string1 is an array with two elements: index 0 holds the whole match, abccd, and index 1 holds c.

Conclusion:

1. It does match abccd or abbcd ($string1 or $string3).

2. But there is one more element in the $matches array, at index 1.

3. Whenever the content in () matches, the text it matched is placed in this array element with index 1.
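As a rough illustration of these conclusions, the grouped pattern can be run against all three strings while collecting both the whole match and the captured group. The $subjects and $captured names are introduced here for the sketch.

```php
<?php
$subjects = ["起来abccd阅兵", "ggggbcd", "中国abbcd未来"];

$captured = [];
foreach ($subjects as $subject) {
    if (preg_match('/ab(c|b)cd/', $subject, $matches)) {
        // $matches[0] is the whole match, $matches[1] the text inside ()
        $captured[] = [$matches[0], $matches[1]];
        echo $matches[0], ' / ', $matches[1], "\n";
    } else {
        $captured[] = null;
        echo "no match\n";
    }
}
// abccd / c
// no match
// abbcd / b
```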

^ (circumflex), must start with the string after ^

<?php

$zz = '/^猪哥好帅\w+/';

$string1 = "猪哥好帅abccdaaaasds";
// $string2 does not start with 猪哥好帅
$string2 = "帅abccdaaaasds";


if (preg_match($zz, $string1, $matches)) {
   echo 'Matched, the result is: ';
   var_dump($matches);
} else {
   echo 'No match';
}

?>

The experiment leads to the following conclusions:

1. $string1 matches successfully, but $string2 does not.

2. $string1 starts with the specified characters.

3. $string2 does not start with the characters after ^.

4. In plain words, this pattern means: the string starts with 猪哥好帅 ("Brother Zhu is so handsome"), followed by at least one character from a-zA-Z0-9_.

$ (dollar sign) must end with the character before $

<?php

$zz = '/\d+努力$/';

$string1 = "12321124333努力";
// $string2 does not end with 努力
$string2 = "12311124112313力";


if (preg_match($zz, $string1, $matches)) {
   echo 'Matched, the result is: ';
   var_dump($matches);
} else {
   echo 'No match';
}

?>

Let’s run it, look at the results and draw the conclusions:

1. $string1 matches successfully, but $string2 does not.

2. What comes before $ is \d+ followed by the Chinese word 努力 ("work hard").

3. So the string must end with this whole sequence. \d is a digit 0-9, and + requires at least one of them.
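^ and $ can also be combined so that the whole string has to fit the pattern. The sketch below compares the pattern from the example with a variant that adds ^. The string "abc123努力" and the variable names are hypothetical and only serve the illustration.

```php
<?php
$endOnly = '/\d+努力$/';    // pattern from the example above
$both    = '/^\d+努力$/';   // same pattern with ^ added (illustrative)

$subject = "abc123努力";    // hypothetical test string

$a = preg_match($endOnly, $subject);            // 1: "123努力" sits at the end
$b = preg_match($both, $subject);               // 0: the string starts with "abc"
$c = preg_match($both, "12321124333努力");      // 1: digits from start to 努力

echo $a, $b, $c, "\n";                          // 101
```

With only $, anything may come before the digits. With both anchors, nothing else is allowed in the string.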

\b and \B word boundary and non-word boundary

Let’s explain what a boundary is:

1. A word boundary is a position, not a character. It lies between a word character (\w) and a non-word character, or between a word character and the start or end of the string.

2. In English text, a word followed by a space has ended, so the position between its last letter and the space is a word boundary.

\b (word boundary) means the position must be at the start or end of a word.
\B (non-boundary) means the position must not be at the start or end of a word.

<?php

$zz = '/\w+\b/';

$string1 = "this is a apple";
$string2 = "thisis a apple";
$string3 = "thisisaapple";

if (preg_match($zz, $string1, $matches)) {
   echo 'Matched, the result is: ';
   var_dump($matches);
} else {
   echo 'No match';
}

?>

Conclusion:

1. $string1, $string2 and $string3 all match successfully.

2. For $string1 the match is this, because the space after it marks the boundary.

3. For $string2 the match is thisis, because the boundary comes right after thisis.

4. For $string3 the match is thisisaapple. The end of the string also counts as a boundary, so the match succeeds.

Let’s experiment with non-word boundaries:

<?php

$zz = '/\Bthis/';

$string1 = "hellothis9";

//$string2 = "hello this9";
//$string3 = "this9中国万岁";

if (preg_match($zz, $string1, $matches)) {
   echo 'Matched, the result is: ';
   var_dump($matches);
} else {
   echo 'No match';
}

?>

Summary:

1. $string1 matches successfully, but neither of the commented-out strings does.

2. Because \B comes right before this, the word this must not begin at a word boundary (after a space or at the start of the string).
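Here is a short sketch of both behaviours. It uses preg_match_all(), which collects every match instead of only the first one, with \b on both sides of \w+ to pick out whole words. The variable names are illustrative.

```php
<?php
// \b on both sides of \w+ selects each whole word.
$count = preg_match_all('/\b\w+\b/', "this is a apple", $words);
echo $count, ': ', implode(',', $words[0]), "\n";     // 4: this,is,a,apple

// \B requires that "this" does NOT begin at a word boundary.
$inside     = preg_match('/\Bthis/', "hellothis9");     // 1: "this" is inside a word
$afterSpace = preg_match('/\Bthis/', "hello this9");    // 0: "this" follows a space
$atStart    = preg_match('/\Bthis/', "this9中国万岁");  // 0: "this" starts the string

echo $inside, $afterSpace, $atStart, "\n";            // 100
```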

{m} can and can only appear m times

<?php

$zz = '/喝\d{3}酒/';

$string1 = "喝988酒";

//$string2 = "喝98811酒";

if (preg_match($zz, $string1, $matches)) {
   echo 'Matched, the result is: ';
   var_dump($matches);
} else {
   echo 'No match';
}

?>

Conclusion:
In the above example, \d{3} specifies that a 0-9 digit must appear exactly 3 times, no more and no fewer.

{n,m} can appear n to m times

<?php

$zz = '/喝\d{1,3}酒/';

$string1 = "喝9酒";

//$string2 = "喝988酒";

if (preg_match($zz, $string1, $matches)) {
   echo 'Matched, the result is: ';
   var_dump($matches);
} else {
   echo 'No match';
}

?>

Conclusion:
In the above example, \d{1,3} specifies that a 0-9 digit may appear once, twice or three times. Any other count fails.

{m,} At least m times, the maximum number is not limited

<?php

$zz = '/喝\d{2,}/';

$string1 = "喝9";

//$string2 = "喝98";
//$string3 = "喝98122121";


if (preg_match($zz, $string1, $matches)) {
   echo 'Matched, the result is: ';
   var_dump($matches);
} else {
   echo 'No match';
}

?>

Conclusion:
In the above example, \d{2,} specifies that at least two 0-9 digits must follow 喝, with no upper limit. Therefore $string1 fails to match, while $string2 and $string3 match successfully.
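The three brace forms can be compared side by side. This is a sketch only: the $patterns, $subjects and $table names are introduced here, and the test strings are adapted from the examples above so that every pattern is tried against one, three and five digits.

```php
<?php
$patterns = ['/喝\d{3}酒/', '/喝\d{1,3}酒/', '/喝\d{2,}/'];
$subjects = ["喝9酒", "喝988酒", "喝98811酒"];   // 1, 3 and 5 digits

$table = [];
foreach ($patterns as $pattern) {
    foreach ($subjects as $subject) {
        // preg_match() returns 1 on a match and 0 otherwise
        $table[$pattern][$subject] = preg_match($pattern, $subject);
    }
    echo $pattern, ': ', implode(' ', $table[$pattern]), "\n";
}
// /喝\d{3}酒/: 0 1 0
// /喝\d{1,3}酒/: 1 1 0
// /喝\d{2,}/: 0 1 1
```

Each column of 0s and 1s corresponds to one test string, which makes it easy to see where the allowed digit count starts and stops for each form.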
