
Understanding of \b in regular expressions

一个新手
2017-10-20 11:11:24

\b marks a word boundary. It matches the position at the beginning or end of a word, where a word is a run of \w characters (letters, digits or underscores). \b is a zero-width assertion: it consumes no characters and only requires that exactly one side of the position is a \w character, with the other side being a non-\w character or the start or end of the string. Written before a word character or string in the expression, it means that no \w character may come immediately before it; written after, it means that no \w character may come immediately after it. You can therefore put \b on one side only, or on both sides (meaning that no \w character may appear directly before or after the string).
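To make the "position, not character" idea concrete, here is a minimal sketch using Python's re module (the article names no language, so Python is an assumption, as is the sample string). The re.ASCII flag restricts \w to letters, digits and underscores, which is the definition used here.

```python
import re

# re.ASCII keeps \w to [A-Za-z0-9_], the definition used in this article.
BOUNDARY = re.compile(r"\b", re.ASCII)

text = "hi, nihao_1"

# Every match of \b is empty: it has a position but consumes no characters.
matches = list(BOUNDARY.finditer(text))
positions = [m.start() for m in matches]

print(positions)  # [0, 2, 4, 11]
```

The four positions are the start and end of "hi" and the start and end of "nihao_1". The underscore and the digit count as word characters, so there is no boundary inside "nihao_1".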

First, look at a correct use of \b:

Figure 1
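Note: 1. If the text wrapped in \b does not consist of letters, digits or underscores, the expression is usually misdefined and will not match where you expect, as shown in Figure 2. The reason is that \b needs a \w character on exactly one side of the position: for example, \b@\b cannot match an "@" standing alone between spaces, and only matches when the "@" sits directly between word characters.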

Figure 2
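Whether \b around a non-word character can match depends on what surrounds that character. The sketch below checks this with Python's re module; the pattern \b@\b and both sample strings are illustrative assumptions, not the content of Figure 2.

```python
import re

# \b needs a word character on exactly one side of the position.
AT_SIGN = re.compile(r"\b@\b", re.ASCII)

# "@" between spaces: both neighbours are non-word characters, so no boundary.
standalone = AT_SIGN.search("user @ host")

# "@" directly between word characters: a boundary exists on each side.
embedded = AT_SIGN.search("user@host")

print(standalone)       # None
print(embedded.span())  # (4, 5)
```

So wrapping a non-word character in \b does not find it "on its own"; it finds it only when it is glued to word characters on both sides.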

2. The match contains only the word itself, not the separators around it (spaces and other \W characters), because \b is zero-width. In Figure 1, the spaces on either side of the matched "Russell" are not part of the match (the test tool does not highlight them in blue).
Purpose: matching standalone words or parts of strings. Suppose the business rule is to find every place where the word "Russell" appears on its own. The expression \bRussell\b does not match Russell123abc, because Russell must not be followed directly by a letter, digit or underscore. Russell 123abc and Russell@123abc both match.
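The Russell cases can be checked with a short sketch in Python's re module (an assumed choice of language). The helper find_span and the fourth sample string are hypothetical additions for illustration.

```python
import re

WORD = re.compile(r"\bRussell\b", re.ASCII)

def find_span(text):
    """Return the (start, end) span of the first match, or None."""
    m = WORD.search(text)
    return m.span() if m else None

samples = ["Russell123abc", "Russell 123abc", "Russell@123abc", "Hi Russell !"]
spans = {s: find_span(s) for s in samples}

for sample, span in spans.items():
    print(repr(sample), span)
# 'Russell123abc' None
# 'Russell 123abc' (0, 7)
# 'Russell@123abc' (0, 7)
# 'Hi Russell !' (3, 10)
```

In the last sample the span covers exactly the seven letters of "Russell"; the spaces on either side are outside the match.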
3. Special case: a self-contradictory expression.


\b only delimits words made up of letters, digits or underscores.
If more of the expression follows \b after a word, the part immediately after \b must not be \w or anything that matches only \w characters. For example, the expression \bhi\bnihao requires that no \w character appears directly before or after hi, and at the same time requires that hi is immediately followed by "nihao". The definition of this expression is self-contradictory.
The reason: by definition, \b after hi only holds when the next character is a non-\w (\W) character or the end of the string. \b itself consumes nothing, so the expression must explicitly match that non-\w character after \b before any further \w content can match.
So a regular expression like \bhi\bnihao never matches. Target strings: hinihao, hi nihao, hi*nihao, hi @#$nihao, ...
\bhi\b allows spaces or special characters such as @#¥% around hi, but the expression also places nihao, which starts with the \w character n, immediately after hi. Both conditions cannot hold at once, so no target string will ever match this regular expression.
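A quick check of the contradictory pattern against the target strings listed above, sketched with Python's re module (the language is an assumption; the strings come from the article):

```python
import re

# \b after "hi" demands a non-word character next, but the pattern demands "n".
BROKEN = re.compile(r"\bhi\bnihao", re.ASCII)

targets = ["hinihao", "hi nihao", "hi*nihao", "hi @#$nihao"]
matched = [t for t in targets if BROKEN.search(t)]

print(matched)  # []
```

No input can help here: the failure is built into the pattern, because the same position would need "n" to be both a word character and a non-word character.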

Fix: account for the non-\w characters when defining the regular expression (write them into the expression), and change it to \bhi\b\W+nihao. Tested against these target strings:
hinihao
hi nihao
hi@nihao
hi!@#$ nihao

The last three all match
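The corrected expression can be verified the same way. This sketch uses Python's re module (an assumed language) and assumes the pattern is written without a literal space between \b and \W+, that is, as \bhi\b\W+nihao.

```python
import re

# \W+ consumes the separator characters that \b only asserts.
FIXED = re.compile(r"\bhi\b\W+nihao", re.ASCII)

targets = ["hinihao", "hi nihao", "hi@nihao", "hi!@#$ nihao"]
matched = [t for t in targets if FIXED.search(t)]

print(matched)  # ['hi nihao', 'hi@nihao', 'hi!@#$ nihao']
```

Once \W+ follows hi, the second \b is redundant, since a non-word character after "i" already implies a boundary. Keeping it is harmless and documents the intent.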
