During testing, I found that this code cannot filter out the non-breaking space entity (i.e. &nbsp;) in web pages. So I modified it again:
Now let’s briefly explain the three regular expressions used (note that I am new to this, so my explanation may not be entirely correct and is for reference only):
First one: /<\/?[^>]*>/g
In js, a regular expression literal is written between two "/" characters, and the g after the closing "/" is the global flag. It means the pattern is applied to the entire string, instead of matching stopping after the first match.
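To make the effect of the g flag concrete, here is a minimal sketch. It assumes the first expression is /<\/?[^>]*>/g, and the sample string and variable names are made up for illustration.

```javascript
// Hypothetical sample input, not taken from the original page.
const flagSample = "<p>one</p><p>two</p>";

// Without g, replace() only touches the first match.
const firstOnly = flagSample.replace(/<\/?[^>]*>/, "");

// With g, replace() is applied to every match in the string.
const allTags = flagSample.replace(/<\/?[^>]*>/g, "");

console.log(firstOnly); // one</p><p>two</p>
console.log(allTags);   // onetwo
```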
<\/?[^>]*> is explained piece by piece. The second character "\" is an escape character, used to escape the "/" character that follows it. ? matches 0 or 1 occurrences of the character immediately before it. Note: this metacharacter is not supported by all software. So <\/? matches either the "</" or the "<" at the start of an html tag.
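Let’s talk about [^>]*>. [] defines a character class: it matches any single character from the set listed inside the brackets.
On its own, ^ matches the beginning of a line. For example, the regular expression ^When in matches at the start of the string "When in the course of human events", but does not match "What and When in the", because only text that starts with "When in" qualifies. As the first character inside brackets, however, ^ negates the class instead, so [^>] matches any single character other than >.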
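The two roles of ^ are easy to mix up, so here is a small sketch that tries both. The pattern names and the "a>b>c" string are assumptions chosen only for illustration.

```javascript
// Outside brackets, ^ anchors the match to the start of the string.
const startsWithWhenIn = /^When in/;
const anchorHit = startsWithWhenIn.test("When in the course of human events");
const anchorMiss = startsWithWhenIn.test("What and When in the");

// As the first character inside brackets, ^ negates the class:
// [^>] matches any single character except ">".
const notGreaterThan = /[^>]/g;
const nonGtChars = "a>b>c".match(notGreaterThan);

console.log(anchorHit);  // true
console.log(anchorMiss); // false
console.log(nonGtChars); // [ 'a', 'b', 'c' ]
```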
The meaning of * is: match 0 or more occurrences of the character just before it. For example, the regular expression .* matches any number of any characters. So [^>]* matches a run of characters other than >, and the final > matches the closing bracket of the tag. Put together, the whole expression matches complete tags such as:
<div>
</div>
<p>
</p>
Ordinary text between the tags, such as "The text I need", is not matched.
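As a rough check of the explanation above, the following sketch lists what the first expression matches and what is left once the matches are removed. It assumes the expression is /<\/?[^>]*>/g. The function name stripTags and the sample markup are hypothetical.

```javascript
// Hypothetical helper: removes anything that looks like an HTML tag.
function stripTags(html) {
  return html.replace(/<\/?[^>]*>/g, "");
}

// Hypothetical sample markup.
const markup = "<div><p>The text I need</p></div>";

// Each match runs from "<" (optionally followed by "/") up to the next ">".
const matchedTags = markup.match(/<\/?[^>]*>/g);
const remainingText = stripTags(markup);

console.log(matchedTags);   // [ '<div>', '<p>', '</p>', '</div>' ]
console.log(remainingText); // The text I need
```

This is only a pattern-based filter. It is not an HTML parser, so a literal ">" inside an attribute value would end the match early.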
The third one: /&nbsp;/ig. It searches directly for the literal characters &nbsp;. The ig after the closing "/" means a case-insensitive search in global mode: g stands for global and i stands for case-insensitive.
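A short sketch of the third expression in use, assuming it is /&nbsp;/ig and that each match is replaced with an empty string. The function name stripNbsp and the sample string are made up for illustration.

```javascript
// Hypothetical helper: removes &nbsp; entities regardless of letter case.
function stripNbsp(text) {
  return text.replace(/&nbsp;/ig, "");
}

// Hypothetical sample with the entity in three different casings.
const withEntities = "&nbsp;&NBSP;The text I need&Nbsp;";
const withoutEntities = stripNbsp(withEntities);

console.log(withoutEntities); // The text I need
```

If the spacing should be preserved, replacing each match with an ordinary space instead of an empty string is a reasonable alternative.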