Basic PHP Development Tutorial: Metacharacters in Regular Expressions

1. Metacharacters

A new requirement: \d matches a single digit. What should we do if we want to match eight digits, ten digits, or any number of digits?

This is where metacharacters come in. An atom on its own matches only one character, which becomes a problem as soon as we need to match several.
Metacharacters modify atoms so that they can do more.

Don't be put off by what follows. Everything becomes clear once we work through the experiments one at a time; the key point is that metacharacters make patterns more general.

Let's take a look:

## + Match the previous character at least 1 time

In the pattern \d+, \d matches a digit and + requires the preceding atom to appear at least once, so any run of one or more digits matches successfully.
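As a minimal sketch of the + quantifier, the snippet below runs \d+ against a string that contains digits and one that does not. The pattern delimiters, the variable names and both subject strings are assumptions made for illustration; they are not taken from the original example.

```php
<?php
// Hypothetical pattern and subject, chosen only for illustration.
$pattern = '/\d+/';
$subject = 'Order 20240815 has shipped';

// preg_match() returns 1 on a match, 0 on no match, false on error.
$found = preg_match($pattern, $subject, $matches);
echo $found === 1 ? "Matched: {$matches[0]}\n" : "No match\n";

// Without any digit, + cannot be satisfied and the result is 0.
$noDigits = preg_match($pattern, 'no digits here');
echo "Without digits: {$noDigits}\n";
```

Because + is greedy, the whole run of digits is captured in $matches[0] rather than just the first digit.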

## * Match the previous character 0 or more times

Note that both $string1 (commented out) and $string match successfully. \w matches 0-9, A-Z, a-z and _, and * means the preceding \w does not have to be present at all. If it is present, it can appear once or many times.
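The behaviour of * can be sketched as follows. The pattern /ID:\w*;/ and the three subject strings are hypothetical, picked so that the \w part is empty in one case and long in another.

```php
<?php
// Hypothetical pattern: "ID:", then zero or more word characters, then ";".
$pattern = '/ID:\w*;/';

// Zero word characters between "ID:" and ";" is still a match.
$empty = preg_match($pattern, 'ID:;', $m1);

// Many word characters also match.
$filled = preg_match($pattern, 'ID:abc_123;', $m2);

// A space is not a word character, so \w* cannot bridge the gap to ";".
$broken = preg_match($pattern, 'ID:abc 123;');

echo "empty={$empty} filled={$filled} broken={$broken}\n";
```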

## ? Match the previous character 0 or 1 times (optional)

$string and $string2 match successfully, but $string1 fails.

The pattern requires ABC on both sides with a digit (0-9) in between. The digit is optional, but there cannot be more than one.
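A small sketch of the optional digit described above. The pattern /ABC\d?ABC/ is inferred from the description (ABC on both sides, an optional digit in the middle), and the subject strings are assumptions chosen to show each outcome.

```php
<?php
// Inferred pattern: ABC, an optional single digit, then ABC again.
$pattern = '/ABC\d?ABC/';

$noDigit   = preg_match($pattern, 'ABCABC');   // the digit is absent
$oneDigit  = preg_match($pattern, 'ABC5ABC');  // exactly one digit
$twoDigits = preg_match($pattern, 'ABC55ABC'); // two digits are too many

echo "none={$noDigit} one={$oneDigit} two={$twoDigits}\n";
```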

## . (dot) Match any character except \n
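To illustrate the dot, the sketch below uses a hypothetical pattern /a.c/ on three made-up strings. It also shows the s pattern modifier, which is the standard PCRE way to let the dot match a newline as well.

```php
<?php
// Hypothetical pattern: "a", any single character except a newline, then "c".
$pattern = '/a.c/';

$letter  = preg_match($pattern, 'abc');   // "b" fills the dot
$dash    = preg_match($pattern, 'a-c');   // so does "-"
$newline = preg_match($pattern, "a\nc");  // a newline does not

// With the s modifier the dot matches newlines too.
$dotAll = preg_match('/a.c/s', "a\nc");

echo "letter={$letter} dash={$dash} newline={$newline} dotAll={$dotAll}\n";
```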

## 2. | (vertical bar) Alternation ("or"), the lowest precedence

We will use an experiment to see how the precedence of | affects matching.

Let's take a look:

1. My original intention was to match abccd or abbcd. However, when $string1 and $string2 are matched, the results are abc and bcd.

2. The alternation does work, but what it matches is abc or bcd. | has lower precedence than concatenation, so each side of | is a whole sequence of adjacent characters.
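The precedence problem described in the two points above can be reproduced with a short sketch. The pattern /abc|bcd/ and the two subjects are assumptions based on the results quoted in the text (abc and bcd), not the original code.

```php
<?php
// Assumed pattern: without grouping, | splits the whole pattern
// into two alternatives, "abc" and "bcd".
$pattern = '/abc|bcd/';

preg_match($pattern, 'abccd', $first);
preg_match($pattern, 'abbcd', $second);

// Only part of each string is matched, not the full five characters.
echo "first={$first[0]} second={$second[0]}\n";
```

In both cases only three characters are matched, which shows that the alternation applies to everything on each side of the bar.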

So what should we do if we want to match abccd or abbcd in the example above?

Use () to change the precedence. Grouping the alternation, for example ab(c|b)cd, limits | to the characters inside the parentheses.

Conclusion:

1. The pattern now matches abccd or abbcd ($string1 or $string3).

2. The match array has one extra element, at index 1.

3. Whenever the content inside () matches, the text it matched is stored in the array element at index 1 (index 0 holds the full match).
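The three conclusions above can be checked with a sketch like the following. The grouped pattern /ab(c|b)cd/ and the subject strings are assumptions; the original code may have used different text around the match.

```php
<?php
// Assumed pattern: the parentheses restrict | to a single character.
$pattern = '/ab(c|b)cd/';

$okC  = preg_match($pattern, 'abccd', $mC);
$okB  = preg_match($pattern, 'abbcd', $mB);
$miss = preg_match($pattern, 'abcd'); // one character short

// Index 0 is the full match, index 1 is what the group captured.
echo "full={$mC[0]} group={$mC[1]}\n";
echo "full={$mB[0]} group={$mB[1]}\n";
echo "miss={$miss}\n";
```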

## 3. ^ (circumflex) The string must start with what follows ^

The experiment leads to the following conclusions:

1. $string1 matches successfully; $string2 does not.

2. $string1 starts with the specified characters.

3. $string2 does not start with the characters after ^.

4. In plain words, this pattern means: start with Xiao Ming, followed by at least one character from a-z, A-Z, 0-9 or _.
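A rough sketch of the start anchor follows. The ASCII stand-in XiaoMing replaces the name used in the text, and both subject strings are invented for illustration.

```php
<?php
// Hypothetical pattern: the subject must begin with "XiaoMing",
// followed by at least one word character.
$pattern = '/^XiaoMing\w+/';

$atStart = preg_match($pattern, 'XiaoMing_01 is studying', $m);
$inside  = preg_match($pattern, 'Hello XiaoMing_01');

echo "atStart={$atStart} inside={$inside}\n";
```

The second string contains the same text, but not at the very beginning, so ^ rejects it.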

## 4. $ (dollar sign) The string must end with what precedes $

Running the example leads to the following conclusions:

1. $string1 matches successfully, but $string2 fails.

2. What precedes $ is \d+ followed by the Chinese word for "effort".

3. The string must therefore end with that whole sequence. \d is a digit from 0 to 9, and + means at least one of them.
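The end anchor can be sketched in the same way. Here the English word "points" stands in for the Chinese word in the original pattern, and the subject strings are assumptions.

```php
<?php
// Hypothetical pattern: one or more digits, then " points",
// and nothing after it.
$pattern = '/\d+ points$/';

$atEnd  = preg_match($pattern, 'You scored 95 points', $m);
$middle = preg_match($pattern, 'You scored 95 points today');

echo "atEnd={$atEnd} middle={$middle}\n";
```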

## 5. \b and \B Word boundaries and non-word boundaries

Let's explain what a boundary is:

1. A boundary is a position, not a character. The start and the end of the string being matched count as boundaries when a word character sits there.

2. this is an English word; when it is followed by a space, the word has ended and its boundary has been reached.

\b is a word boundary: the position must be at the start or end of a word.
\B is a non-boundary: the position must not be at the start or end of a word.

Conclusion:

1. $string1, $string2 and $string3 all match successfully.

2. In $string1, the space after this marks the boundary.

3. In $string2, the boundary comes after thisis.

4. In $string3, thisisaapple runs to the end of the whole string, which is also a boundary, so the match succeeds.
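The conclusions above are consistent with a pattern such as /\w+\b/, so the sketch below assumes it, together with three subject strings rebuilt from the words quoted in the text. A second hypothetical pattern, /\bis\b/, is added to show \b rejecting letters that sit inside a longer word.

```php
<?php
// Assumed pattern: a run of word characters that ends at a boundary.
$pattern = '/\w+\b/';

preg_match($pattern, 'this is a apple', $m1); // boundary at the space
preg_match($pattern, 'thisis a apple', $m2);  // boundary after "thisis"
preg_match($pattern, 'thisisaapple', $m3);    // boundary at end of string
echo "{$m1[0]} | {$m2[0]} | {$m3[0]}\n";

// Hypothetical contrast: "is" must stand alone as a word.
$alone  = preg_match('/\bis\b/', 'this is a apple', $m4, PREG_OFFSET_CAPTURE);
$buried = preg_match('/\bis\b/', 'thisisaapple');
echo "alone={$alone} at offset {$m4[0][1]}, buried={$buried}\n";
```

The offset shows that the match is the standalone word, not the "is" at the end of "this".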

Let’s experiment with non-word boundaries:

Summary:

$string1 matches successfully, but $string2 fails.

Because \B comes directly before this, this must not begin at a word boundary (after a space, or at the start of the string).
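A minimal sketch of \B, assuming the pattern /\Bthis/ described above. The subject strings are hypothetical and chosen so that this appears once inside a word, once after a space and once at the start of the string.

```php
<?php
// Assumed pattern: "this" must NOT start at a word boundary.
$pattern = '/\Bthis/';

$inside     = preg_match($pattern, 'hellothis9');  // preceded by a letter
$afterSpace = preg_match($pattern, 'hello this9'); // preceded by a space
$atStart    = preg_match($pattern, 'this9');       // start of the string

echo "inside={$inside} afterSpace={$afterSpace} atStart={$atStart}\n";
```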

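## 6. {m} Must appear exactly m times

Conclusion: In the example above, \d{3} requires a digit (0-9) to appear exactly 3 times, no more and no fewer. The count is only enforced when the rest of the pattern fixes what comes before and after the digits; on its own, \d{3} still finds three digits inside a longer run.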
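The exact count can be sketched as below. The anchors ^ and $ are my addition so that nothing else may surround the digits; the original pattern probably used other surrounding text, which is not shown in this section.

```php
<?php
// Hypothetical pattern: the whole subject must be exactly three digits.
$pattern = '/^\d{3}$/';

$two   = preg_match($pattern, '12');   // one too few
$three = preg_match($pattern, '123');  // exactly three
$four  = preg_match($pattern, '1234'); // one too many

// Without anchors, three digits inside a longer run still match.
$loose = preg_match('/\d{3}/', '1234', $m);

echo "two={$two} three={$three} four={$four} loose={$loose} ({$m[0]})\n";
```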

## 7. {n,m} Can appear n to m times

Conclusion: In the example, \d{1,3} allows a digit (0-9) to appear 1, 2 or 3 times. Any other count does not match.
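A short sketch of the range quantifier, again with ^ and $ added as an assumption so that the count applies to the whole subject. The subject strings are made up.

```php
<?php
// Hypothetical pattern: the whole subject must be one to three digits.
$pattern = '/^\d{1,3}$/';

$none  = preg_match($pattern, '');     // zero digits: below the minimum
$one   = preg_match($pattern, '7');    // minimum
$three = preg_match($pattern, '123');  // maximum
$four  = preg_match($pattern, '1234'); // above the maximum

echo "none={$none} one={$one} three={$three} four={$four}\n";
```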

## 8. {m,} At least m times, with no upper limit

Conclusion:
In the example above, \d{2,} requires a digit (0-9) to appear at least twice, with no limit on the maximum. Therefore $string1 fails to match, while $string2 and $string3 match successfully.
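The open-ended quantifier can be sketched like this. The anchors and the three subject strings are assumptions, arranged to mirror the outcome described above: the first fails and the other two succeed.

```php
<?php
// Hypothetical pattern: the whole subject must be two or more digits.
$pattern = '/^\d{2,}$/';

$string1 = preg_match($pattern, '5');       // only one digit
$string2 = preg_match($pattern, '55');      // the minimum of two
$string3 = preg_match($pattern, '5555555'); // no upper limit

echo "string1={$string1} string2={$string2} string3={$string3}\n";
```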
