The principle of regular expressions in js

一个新手 (Original)
2017-09-07 09:41:20

To use regular expressions efficiently, you first need to understand how they work. The following are the basic steps of regular expression processing.

Basic steps

Step 1: Compile

When you create a regular expression object (using a regex literal or the RegExp constructor), the browser validates the pattern and then converts it into a native code routine that performs the matching. If you assign the regex object to a variable, you can avoid repeating this step.
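A minimal sketch of this advice, assuming a hypothetical `countWords` helper that is not part of the original article: the regex is created once, stored in a variable, and reused on every call.

```javascript
// Created once and kept in a variable, so later calls reuse the same object.
const WORD = /\w+/g;

// Hypothetical helper: counts word-like runs in each line.
function countWords(lines) {
  return lines.map((line) => {
    const matches = line.match(WORD); // with the g flag, match() returns all matches or null
    return matches ? matches.length : 0;
  });
}

const wordCounts = countWords(["a b c", "", "hello world"]);
console.log(wordCounts); // [3, 0, 2]
```

How much this saves depends on the engine, since some engines cache compiled patterns internally.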

Step 2: Set the starting position

When a regex is put to use, the first step is to determine the position in the target string where the search should start. Initially this is the start of the string, or the position given by the regex's lastIndex property (which is honored only when the regex has the g or y flag). When control returns here from step 4 (because of a failed match attempt), the position is one character after the position where the last attempt started.
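The role of lastIndex can be seen in a short sketch. The sample string and variable names are assumptions chosen for illustration.

```javascript
const digits = /\d+/g;            // lastIndex is only honored with the g or y flag
const text = "a1 b22 c333";

digits.lastIndex = 3;             // start searching at index 3 ("b22 c333")
const first = digits.exec(text);
console.log(first[0], first.index, digits.lastIndex); // "22" 4 6

const second = digits.exec(text); // continues from lastIndex 6
console.log(second[0], second.index); // "333" 8

const third = digits.exec(text);  // nothing left: returns null and resets lastIndex to 0
console.log(third, digits.lastIndex); // null 0
```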

Browser makers optimize their regex engines by deciding in advance that certain steps can be skipped, which avoids a lot of pointless work. For example, if a regex starts with ^, IE and Chrome can usually determine that no match is possible after the start of the string, and so avoid foolishly searching the subsequent positions. Another example is a pattern that matches strings whose third letter is x: a smart implementation can search for x first and then move the starting position back by two characters.
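The "third letter is x" shortcut can be sketched in plain JavaScript. This is only an illustration of the idea with a hypothetical `findThirdLetterX` function, not actual engine code.

```javascript
// Hypothetical illustration: find where a match of "any two characters, then x"
// would start, by locating the x first instead of testing every position.
function findThirdLetterX(text) {
  // An x before index 2 cannot have two characters in front of it.
  const xIndex = text.indexOf("x", 2);
  return xIndex === -1 ? -1 : xIndex - 2; // step back two characters
}

// The straightforward regex search, for comparison.
function findWithRegex(text) {
  const match = /[\s\S][\s\S]x/.exec(text);
  return match ? match.index : -1;
}

console.log(findThirdLetterX("axbxcx"), findWithRegex("axbxcx")); // 1 1
console.log(findThirdLetterX("xx"), findWithRegex("xx"));         // -1 -1
```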

Step 3: Match each regex token

Once the regex knows where to start, it steps through the text and the regex pattern one token at a time. When a particular token fails to match, the regex tries to backtrack to an earlier point in the match attempt and follow other possible paths through the pattern.

Step 4: Match success or failure

If a complete match is found at the current position in the string, the regex declares success. If all possible paths through the regex have been tried without finding a match, the regex engine goes back to step 2 and tries again at the next character in the string. Only after this cycle has run for every character in the string (as well as the position after the last character) without a successful match does the regex declare overall failure.
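Steps 2 to 4 can be imitated from the outside with the sticky (y) flag, which pins a match attempt to lastIndex. The `searchFrom` function below is a hypothetical sketch. Real engines run this loop internally and far more efficiently.

```javascript
// Hypothetical sketch of the outer search loop. For simplicity it drops the
// original flags and keeps only y.
function searchFrom(pattern, text, startIndex = 0) {
  const anchored = new RegExp(pattern.source, "y");
  // Try every position, including the one just past the last character.
  for (let pos = startIndex; pos <= text.length; pos++) {
    anchored.lastIndex = pos;          // step 2: set the starting position
    const match = anchored.exec(text); // step 3: try to match the tokens here
    if (match) {
      return { index: pos, text: match[0] }; // step 4: success
    }
    // step 4: failure at this position, so move one character to the right
  }
  return null; // every position failed: overall match failure
}

const found = searchFrom(/b+/, "aabbbc");
console.log(found);      // { index: 2, text: "bbb" }
const emptyAtEnd = searchFrom(/$/, "abc");
console.log(emptyAtEnd); // { index: 3, text: "" }
const missing = searchFrom(/z/, "abc");
console.log(missing);    // null
```

The second call shows why the position after the last character must also be tried: `$` can only match there.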

Backtracking

When a regex matches a target string, it tests the components of the expression one by one, from left to right, to see whether a match can be found. For each quantifier and alternation, it must decide how to proceed. With a quantifier (such as *, +?, or {2,}), the regex must decide when to try to match additional characters; with alternation (via the | operator), it must choose one of the available options to try.

Each time the regex makes such a decision, it remembers the other options so that it can return to them later if necessary. If the chosen option succeeds, the regex continues through the pattern, and if the rest of the pattern also matches, the match is complete. But if the chosen option cannot find a match, or something later in the regex fails, the regex backtracks to the last decision point where untried options remain and chooses one of them. This continues until a match is found or until every possible combination of quantifier and alternation options in the regex has been tried without success. At that point the regex gives up, moves on to the next character in the string, and starts the process over.
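The decision-and-fallback behavior can be sketched with a toy matcher. It is hypothetical teaching code that supports only literal characters, "." and a greedy "*" (no alternation), and it is not how any real engine is implemented.

```javascript
// Counts how many times a chosen option failed and another had to be tried.
let backtracks = 0;

// Does pattern (from index pi) match text starting exactly at index ti?
function matchHere(pattern, pi, text, ti) {
  if (pi === pattern.length) return true; // whole pattern consumed: success
  const token = pattern[pi];
  if (pattern[pi + 1] === "*") {
    // Greedy quantifier: first consume as many characters as possible...
    let end = ti;
    while (end < text.length && (token === "." || text[end] === token)) end++;
    // ...then give characters back one at a time until the rest matches.
    for (let stop = end; stop >= ti; stop--) {
      if (matchHere(pattern, pi + 2, text, stop)) return true;
      backtracks++; // this choice failed, fall back to the next remembered option
    }
    return false;
  }
  if (ti < text.length && (token === "." || text[ti] === token)) {
    return matchHere(pattern, pi + 1, text, ti + 1);
  }
  return false;
}

// Try each starting position in turn, including the end of the string.
function toyTest(pattern, text) {
  for (let start = 0; start <= text.length; start++) {
    if (matchHere(pattern, 0, text, start)) return true;
  }
  return false;
}

const greedyResult = toyTest("a.*c", "abcxx");
console.log(greedyResult, backtracks); // true 3
```

Here ".*" first swallows "bcxx", then gives back one character at a time. Three options fail before "c" finally matches.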

Example

The following example comes from the "Repetition and backtracking" section of "High Performance JavaScript" and is a good illustration of how backtracking works.

var str = "<p>Para 1.</p>" +
    "<img src='smiley.jpg'>" +
    "<p>para 2.</p>" +
    "<p>p.</p>";

/<p>.*<\/p>/i.test(str);  // method 1: greedy quantifier
/<p>.*?<\/p>/i.test(str); // method 2: lazy quantifier
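Both methods return true from test(), so the difference only shows up in what they match. The following runnable sketch assumes a sample string of three paragraphs with an image tag in between. The `html` variable and the image file name are illustrative assumptions.

```javascript
// Assumed sample input: three paragraphs with an image tag after the first one.
const html = "<p>Para 1.</p>" +
    "<img src='smiley.jpg'>" +
    "<p>para 2.</p>" +
    "<p>p.</p>";

// Method 1 (greedy): .* runs to the end of the string, then backtracks
// until the final </p> can match.
const greedy = html.match(/<p>.*<\/p>/i)[0];

// Method 2 (lazy): .*? expands one character at a time and stops at the first </p>.
const lazy = html.match(/<p>.*?<\/p>/i)[0];

console.log(greedy === html); // true: the match spans all three paragraphs
console.log(lazy);            // <p>Para 1.</p>
```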

See the picture below.
[Figure: The principle of regular expressions in js]

