Detailed explanation of Java regular expression knowledge

尚

Nov 29, 2019 pm 01:11 PM

Expression meaning:

1. Characters

x The character x. For example, a represents the character a.

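\\ The backslash character. In Java source code, write it as \\\\. (Note: the Java compiler first reduces the string literal \\\\ to the regular expression \\, and the regex engine then reads \\ as a single literal \. For the same reason, every regex escape that begins with \, including \\ itself, must have its backslash written twice in Java source, unless it is one of the escapes in this section that Java string literals also understand, such as \t or \n.)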

\0n The character with octal value 0n (0 <= n <= 7)

\0nn The character with octal value 0nn (0 <= n <= 7)

\0mnn The character with octal value 0mnn (0 <= m <= 3, 0 <= n <= 7)

\xhh The character with hexadecimal value 0xhh

\uhhhh The character with hexadecimal value 0xhhhh

\t Tab character ('\u0009')

\n New line (line feed) character ('\u000A')

\r Carriage return character ('\u000D')

\f Form feed character ('\u000C')

\a alarm (bell) character ('\u0007')

\e escape character ('\u001B')

\cx The control character corresponding to x
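The double-backslash rule above can be checked with a minimal sketch. The class name EscapeDemo and its helper methods are hypothetical names chosen for this illustration; only java.util.regex.Pattern.matches comes from the JDK.

```java
import java.util.regex.Pattern;

// Hypothetical demo class: shows how regex escapes are written in Java source.
class EscapeDemo {
    // The regex for one literal backslash needs four backslashes in source.
    static boolean isSingleBackslash(String s) {
        return Pattern.matches("\\\\", s);
    }

    // The regex escape for a tab, written with a doubled backslash.
    static boolean isTab(String s) {
        return Pattern.matches("\\t", s);
    }

    // The hexadecimal (x41) and Unicode (u0041) regex escapes both denote 'A'.
    static boolean isCapitalA(String s) {
        return Pattern.matches("\\x41", s) && Pattern.matches("\\u0041", s);
    }

    public static void main(String[] args) {
        System.out.println(isSingleBackslash("\\")); // true
        System.out.println(isTab("\t"));             // true
        System.out.println(isCapitalA("A"));         // true
    }
}
```

Writing "\t" with a single backslash would also work here, because the Java compiler then places a real tab character in the pattern and the regex engine matches it literally.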

2. Character classes

[abc] a, b or c (simple class). For example, [egd] means containing the characters e, g or d.

[^abc] Any character except a, b, or c (negative). For example, [^egd] means not containing the characters e, g, or d.

[a-zA-Z] a to z or A to Z, inclusive (range)

[a-d[m-p]] a to d or m to p: [a-dm-p] (union)

[a-z&&[def]] d, e or f (intersection)

[a-z&&[^bc]] a to z, except b and c: [ad-z] (subtraction)

[a-z&&[^m-p]] a to z, but not m to p: [a-lq-z] (subtraction)
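As a rough illustration of union, intersection, and subtraction, the following sketch tests single characters against the classes listed above. CharClassDemo and its matches helper are hypothetical names for this example.

```java
import java.util.regex.Pattern;

// Hypothetical demo class for character class union, intersection, subtraction.
class CharClassDemo {
    // True if the whole input matches the given regular expression.
    static boolean matches(String regex, String input) {
        return Pattern.matches(regex, input);
    }

    public static void main(String[] args) {
        System.out.println(matches("[a-d[m-p]]", "n"));    // union: true
        System.out.println(matches("[a-z&&[def]]", "e"));  // intersection: true
        System.out.println(matches("[a-z&&[def]]", "a"));  // intersection: false
        System.out.println(matches("[a-z&&[^bc]]", "b"));  // subtraction: false
        System.out.println(matches("[a-z&&[^m-p]]", "q")); // subtraction: true
    }
}
```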

3. Predefined character classes (note that the backslash must be written twice in Java source; for example, \d is written as \\d)

. Any character (may or may not match line terminators)

\d A digit: [0-9]

\D A non-digit: [^0-9]

\s A whitespace character: [ \t\n\x0B\f\r]

\S A non-whitespace character: [^\s]

\w Word characters: [a-zA-Z_0-9]

\W Non-word characters: [^\w]
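A short sketch of the predefined classes in use, with the backslashes doubled as the note above requires. PredefinedDemo and its methods are hypothetical names; String.replaceAll and Pattern.matches are standard JDK calls.

```java
import java.util.regex.Pattern;

// Hypothetical demo class for the predefined character classes.
class PredefinedDemo {
    // Removes every non-digit character (the D class) from the input.
    static String digitsOnly(String input) {
        return input.replaceAll("\\D", "");
    }

    // True if the input consists only of word characters (the w class).
    static boolean isWord(String input) {
        return Pattern.matches("\\w+", input);
    }

    // True if the input is exactly one whitespace character (the s class).
    static boolean isWhitespaceChar(String input) {
        return Pattern.matches("\\s", input);
    }

    public static void main(String[] args) {
        System.out.println(digitsOnly("Nov 29, 2019")); // 292019
        System.out.println(isWord("java_8"));           // true
        System.out.println(isWord("java 8"));           // false
        System.out.println(isWhitespaceChar("\t"));     // true
    }
}
```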

4. POSIX character classes (US-ASCII only) (note that the backslash must be written twice; for example, \p{Lower} is written as \\p{Lower})

\p{Lower} Lowercase alphabetic characters: [a-z].

\p{Upper} Uppercase alphabetic characters: [A-Z]

\p{ASCII} All ASCII: [\x00-\x7F]

\p{Alpha} Alphabetic characters: [\p{Lower}\p{Upper}]

\p{Digit} Decimal digits: [0-9]

\p{Alnum} Alphanumeric characters: [\p{Alpha}\p{Digit}]

\p{Punct} Punctuation: One of !"#$%&'()*+,-./:;<=>?@[\]^_`{|}~

\p{Graph} Visible characters: [\p{Alnum}\p{Punct}]

\p{Print} Printable Characters: [\p{Graph}\x20]

\p{Blank} Space or tab: [ \t]

\p{Cntrl} Control characters: [\x00-\x1F\x7F]

\p{XDigit} Hexadecimal digits: [0-9a-fA-F]

\p{Space} Whitespace characters: [ \t\n\x0B\f\r]
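The POSIX classes can be tried out with a sketch like the one below. PosixDemo and its matches helper are hypothetical names. The last check relies on the "US-ASCII only" behavior stated in the section heading, which holds as long as the UNICODE_CHARACTER_CLASS flag is not set.

```java
import java.util.regex.Pattern;

// Hypothetical demo class for the POSIX (US-ASCII only) character classes.
class PosixDemo {
    // True if the whole input matches the given regular expression.
    static boolean matches(String regex, String input) {
        return Pattern.matches(regex, input);
    }

    public static void main(String[] args) {
        // One uppercase letter followed by lowercase letters.
        System.out.println(matches("\\p{Upper}\\p{Lower}+", "Java")); // true
        // Hexadecimal digits only.
        System.out.println(matches("\\p{XDigit}+", "CAFE01"));        // true
        // A single punctuation character.
        System.out.println(matches("\\p{Punct}", "+"));               // true
        // A non-ASCII letter (e with acute accent) is not in the Alpha class.
        System.out.println(matches("\\p{Alpha}", "\u00e9"));          // false
    }
}
```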

5. java.lang.Character classes (simple Java character type)

\p{javaLowerCase} Equivalent to java.lang.Character.isLowerCase()

\p{javaUpperCase} Equivalent to java.lang.Character.isUpperCase()

\p{javaWhitespace} Equivalent to java.lang.Character.isWhitespace()

\p{javaMirrored} Equivalent to java.lang.Character.isMirrored()

6. Classes for Unicode blocks and categories

\p{InGreek} A character in the Greek block (simple block)

\p{Lu} Uppercase letters (simple category)

\p{Sc} Currency symbols

\P{InGreek} Any character except one in the Greek block (negation)

[\p{L}&&[^\p{Lu}]] Any letter except an uppercase letter (subtraction)
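The java.lang.Character classes from section 5 and the Unicode block and category classes from section 6 can be exercised together in a small sketch. UnicodeClassDemo and its matches helper are hypothetical names; the Greek letter alpha is written as a Unicode escape in the string literal to avoid source-encoding issues.

```java
import java.util.regex.Pattern;

// Hypothetical demo class for Character-based and Unicode classes.
class UnicodeClassDemo {
    // True if the whole input matches the given regular expression.
    static boolean matches(String regex, String input) {
        return Pattern.matches(regex, input);
    }

    public static void main(String[] args) {
        String alpha = "\u03b1"; // Greek small letter alpha
        System.out.println(matches("\\p{javaLowerCase}", "a"));        // true
        System.out.println(matches("\\p{InGreek}", alpha));            // true
        System.out.println(matches("\\P{InGreek}", alpha));            // false
        System.out.println(matches("\\p{Lu}", "A"));                   // true
        System.out.println(matches("\\p{Sc}", "$"));                   // true
        System.out.println(matches("[\\p{L}&&[^\\p{Lu}]]", "a"));      // true
        System.out.println(matches("[\\p{L}&&[^\\p{Lu}]]", "A"));      // false
    }
}
```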

7. Boundary matchers

^ The beginning of a line. Use ^ at the start of the regular expression. For example, ^(abc) matches a string starting with abc. Note that by default ^ matches only at the beginning of the entire input; to make it match at the beginning of every line, set MULTILINE when compiling, as in Pattern p = Pattern.compile(regex, Pattern.MULTILINE);

$ The end of a line. Use $ at the end of the regular expression. For example, (^bca).*(abc$) matches a line starting with bca and ending with abc.

\b A word boundary. For example, \b(abc) matches abc only where it starts at a word boundary, so it is found in abcjj but not in jjabc; to match abc at the end of a word, write (abc)\b

\B A non-word boundary. For example, \B(abc) matches abc only where it does not start at a word boundary, so it is found in jjabcjj and jjabc but not in abcjj

\A The beginning of the input

\G The end of the previous match (in the author's experience, rarely needed). For example, \\Gdog matches dog only where it begins exactly at the end of the previous match. On the first attempt there is no previous match, so \G anchors at the beginning of the input; if the input does not begin with dog, nothing matches.

\Z The end of the input, except for the final line terminator (if any)

A line terminator is a sequence of one or two characters that marks the end of a line of the input character sequence.

The following codes are recognized as line terminators:

- A newline (line feed) character ('\n'),

- A carriage return character followed immediately by a newline character ("\r\n"),

- A standalone carriage return character ('\r'),

- A next-line character ('\u0085'),

- A line separator character ('\u2028'), or

- A paragraph separator character ('\u2029').

\z End of input
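The boundary matchers in this section can be compared side by side with a sketch such as the following. BoundaryDemo, count, and found are hypothetical helper names for this illustration; the sample inputs are made up.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical demo class for the boundary matchers.
class BoundaryDemo {
    // Counts how many times the regex is found in the input.
    static int count(String regex, int flags, String input) {
        Matcher m = Pattern.compile(regex, flags).matcher(input);
        int n = 0;
        while (m.find()) {
            n++;
        }
        return n;
    }

    // True if the regex is found anywhere in the input.
    static boolean found(String regex, String input) {
        return Pattern.compile(regex).matcher(input).find();
    }

    public static void main(String[] args) {
        // Without MULTILINE, ^ matches only at the start of the input.
        System.out.println(count("^abc", 0, "abc\nabc"));                 // 1
        System.out.println(count("^abc", Pattern.MULTILINE, "abc\nabc")); // 2
        // Word boundary and non-word boundary before abc.
        System.out.println(found("\\babc", "abcjj")); // true
        System.out.println(found("\\babc", "jjabc")); // false
        System.out.println(found("\\Babc", "jjabc")); // true
        // The G anchor: each match must start where the previous one ended.
        System.out.println(count("\\Gdog", 0, "dogdog cat dog")); // 2
        // Upper-case Z ignores a final line terminator, lower-case z does not.
        System.out.println(found("abc\\Z", "abc\n")); // true
        System.out.println(found("abc\\z", "abc\n")); // false
    }
}
```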

When compiling a pattern, one or more flags can be set by combining them with the bitwise OR operator, for example:

Pattern pattern = Pattern.compile(patternString, Pattern.CASE_INSENSITIVE | Pattern.UNICODE_CASE);

Six of the supported flags are:

‐CASE_INSENSITIVE: Matches characters without regard to case. By default, this flag considers only US-ASCII characters.

‐UNICODE_CASE: When combined with CASE_INSENSITIVE, uses Unicode letter case for matching.

‐MULTILINE: ^ and $ match the beginning and end of each line, not just of the entire input.

‐UNIX_LINES: Treats only '\n' as a line terminator when matching ^ and $ in multiline mode.

‐DOTALL: The . symbol matches any character, including line terminators.

‐CANON_EQ: Takes the canonical equivalence of Unicode characters into account.
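A minimal sketch of how some of these flags change matching. FlagDemo and its matches helper are hypothetical names; the accented letters are written as Unicode escapes in the string literals.

```java
import java.util.regex.Pattern;

// Hypothetical demo class showing the effect of compile-time flags.
class FlagDemo {
    // True if the whole input matches the regex compiled with the given flags.
    static boolean matches(String regex, int flags, String input) {
        return Pattern.compile(regex, flags).matcher(input).matches();
    }

    public static void main(String[] args) {
        String lower = "\u00e9"; // e with acute accent, lowercase
        String upper = "\u00c9"; // E with acute accent, uppercase

        // ASCII letters are folded by CASE_INSENSITIVE alone.
        System.out.println(matches("abc", Pattern.CASE_INSENSITIVE, "ABC")); // true
        // Non-ASCII letters additionally need UNICODE_CASE.
        System.out.println(matches(lower, Pattern.CASE_INSENSITIVE, upper)); // false
        System.out.println(matches(lower,
                Pattern.CASE_INSENSITIVE | Pattern.UNICODE_CASE, upper));    // true
        // The dot matches a line terminator only with DOTALL.
        System.out.println(matches("a.b", 0, "a\nb"));                       // false
        System.out.println(matches("a.b", Pattern.DOTALL, "a\nb"));          // true
    }
}
```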

8. Greedy quantifiers

X? X, once or not at all

X* X, zero or more times

X+ X, one or more times

X{n} X, exactly n times

X{n,} X, at least n times

X{n,m} X, at least n times, but not more than m times

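9. Reluctant quantifiers

X?? X, once or not at all

X*? X, zero or more times

X+? X, one or more times

X{n}? X, exactly n times

X{n,}? X, at least n times

X{n,m}? X, at least n times, but not more than m times

10. Possessive quantifiers

X?+ X, once or not at all

X*+ X, zero or more times

X++ X, one or more times

X{n}+ X, exactly n times

X{n,}+ X, at least n times

X{n,m}+ X, at least n times, but not more than m times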

The difference between greedy, reluctant, and possessive quantifiers is as follows (it matters only when the match is not exact):

A greedy quantifier is considered "greedy" because on its first attempt it reads the entire string being matched. If that first attempt (the entire input string) fails, the matcher backs off by one character from the end of the string and tries again, repeating this process until a match is found or there are no more characters left to back off from. Depending on the quantifier used in the expression, the last thing it tries to match is 1 or 0 characters.

However, reluctant quantifiers take the opposite approach: they start at the beginning of the string being matched, and then progressively read one character at a time to search for a match. The last thing they try to match is the entire input string.

Finally, the possessive quantifier always reads the entire input string, trying one (and only one) match. Unlike the greedy quantifier, possessive never retreats.
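The three behaviors described above can be observed by running the same input through a greedy, a reluctant, and a possessive version of one pattern. QuantifierDemo, findAll, and the sample input are hypothetical choices for this sketch.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical demo class comparing greedy, reluctant, possessive quantifiers.
class QuantifierDemo {
    // Collects every match of the regex in the input, in order.
    static List<String> findAll(String regex, String input) {
        List<String> out = new ArrayList<>();
        Matcher m = Pattern.compile(regex).matcher(input);
        while (m.find()) {
            out.add(m.group());
        }
        return out;
    }

    public static void main(String[] args) {
        String input = "xfooxxxxxxfoo";
        // Greedy: reads everything, then backs off until foo fits at the end.
        System.out.println(findAll(".*foo", input));  // [xfooxxxxxxfoo]
        // Reluctant: grows one character at a time, so it stops at the first foo.
        System.out.println(findAll(".*?foo", input)); // [xfoo, xxxxxxfoo]
        // Possessive: reads everything and never backs off, so foo cannot match.
        System.out.println(findAll(".*+foo", input)); // []
    }
}
```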

11. Logical operators

XY X followed by Y

X|Y X or Y

(X) X, as a capturing group. For example (abc) means capturing abc as a whole

12. Back references

\n Whatever the nth capturing group matched

Capturing groups are numbered by counting their opening parentheses from left to right. For example, in the expression ((A)(B(C))), there are four such groups:

1 ((A)(B(C)))

2 (A)

3 (B(C))

4 (C)

The corresponding group can be referenced by \n in the expression. For example, (ab)34\1 matches ab34ab, and (ab)34(cd)\1\2 matches ab34cdabcd.
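Group numbering and back references can be verified with a sketch like this one. BackrefDemo and its groups helper are hypothetical names. Note that Matcher.group(0) always holds the entire match, so the array returned here has one more element than the number of capturing groups.

```java
import java.util.Arrays;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical demo class for capturing group numbering and back references.
class BackrefDemo {
    // Returns group 0 (the whole match) followed by every capturing group,
    // or null if the input does not match the regex as a whole.
    static String[] groups(String regex, String input) {
        Matcher m = Pattern.compile(regex).matcher(input);
        if (!m.matches()) {
            return null;
        }
        String[] result = new String[m.groupCount() + 1];
        for (int i = 0; i <= m.groupCount(); i++) {
            result[i] = m.group(i);
        }
        return result;
    }

    public static void main(String[] args) {
        // Groups are numbered by their opening parenthesis, left to right.
        System.out.println(Arrays.toString(groups("((A)(B(C)))", "ABC")));
        // prints [ABC, ABC, A, BC, C]

        // Back references repeat what a group captured.
        System.out.println(Pattern.matches("(ab)34\\1", "ab34ab"));             // true
        System.out.println(Pattern.matches("(ab)34(cd)\\1\\2", "ab34cdabcd"));  // true
        System.out.println(Pattern.matches("(ab)34\\1", "ab34cd"));             // false
    }
}
```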

13. Quotation

\ Nothing, but quotes the following character

\Q Nothing, but quotes all characters until \E. The text between \Q and \E is used unchanged (apart from the Java string escapes described in section 1). For example, the Java string literal ab\\Q{|}\\\\E is the regular expression ab\Q{|}\\\E, which matches the text ab{|} followed by two literal backslashes.

\E Nothing, but ends the quotation started by \Q
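The quotation constructs can be tried with the sketch below. QuoteDemo and its method names are hypothetical; Pattern.quote is a standard JDK method that wraps a string in the \Q and \E markers so that it is matched literally.

```java
import java.util.regex.Pattern;

// Hypothetical demo class for quoting with a backslash and with Q...E.
class QuoteDemo {
    // A single backslash quotes the next character: the dot is literal here.
    static boolean isLiteralDot(String input) {
        return Pattern.matches("a\\.b", input);
    }

    // The example from the text: everything between Q and E is literal,
    // including the two backslashes that precede the closing E marker.
    static boolean matchesQuotedExample(String input) {
        return Pattern.matches("ab\\Q{|}\\\\E", input);
    }

    // Pattern.quote builds the Q...E wrapper for an arbitrary string.
    static boolean matchesLiteral(String literal, String input) {
        return Pattern.matches(Pattern.quote(literal), input);
    }

    public static void main(String[] args) {
        System.out.println(isLiteralDot("a.b"));                // true
        System.out.println(isLiteralDot("axb"));                // false
        System.out.println(matchesQuotedExample("ab{|}\\\\"));  // true
        System.out.println(matchesLiteral("{|}", "{|}"));       // true
    }
}
```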

14. Special constructs (non-capturing)

(?:X) X, as a non-capturing group

(?idmsux-idmsux) Nothing, but turns the given match flags on or off. For example, in the expression (?i)abc(?-i)def, (?i) turns on case-insensitive matching, so abc is matched without regard to case; (?-i) then turns it off again, so def is matched case-sensitively.

The idmsux flags are described as follows:

‐i CASE_INSENSITIVE: Case-insensitive matching for the US-ASCII character set. (?i)

‐d UNIX_LINES: Turns on Unix lines mode, in which only \n is recognized as a line terminator. (?d)

‐m MULTILINE: Multiline mode. (?m) (On Unix a line break is \n; on Windows it is \r\n.)

‐s DOTALL: The . symbol matches any character, including line terminators. (?s)

‐u UNICODE_CASE: Unicode-aware case-insensitive matching, used together with i. (?u)

‐x COMMENTS: Permits comments in the pattern: whitespace in the pattern is ignored, and everything from "#" to the end of the line is treated as a comment. (?x) For example, (?x)abc#asfsdadsa matches the string abc

(?idmsux-idmsux:X) X, as a non-capturing group with the given flags turned on or off. As with the construct above, the earlier expression can be rewritten as (?i:abc)def or as (?i)abc(?-i:def)
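The inline flag constructs can be checked with a sketch along these lines. InlineFlagDemo and its matches helper are hypothetical names; the patterns are the ones used in the text.

```java
import java.util.regex.Pattern;

// Hypothetical demo class for inline (embedded) match flags.
class InlineFlagDemo {
    // True if the whole input matches the given regular expression.
    static boolean matches(String regex, String input) {
        return Pattern.matches(regex, input);
    }

    public static void main(String[] args) {
        // Case-insensitivity applies to abc only, not to def.
        System.out.println(matches("(?i)abc(?-i)def", "ABCdef")); // true
        System.out.println(matches("(?i)abc(?-i)def", "ABCDEF")); // false
        // The same scoping written with a flagged non-capturing group.
        System.out.println(matches("(?i:abc)def", "AbCdef"));     // true
        System.out.println(matches("(?i)abc(?-i:def)", "aBcdef")); // true
        // COMMENTS mode: the text after # is ignored.
        System.out.println(matches("(?x)abc#asfsdadsa", "abc"));  // true
        // DOTALL mode: the dot also matches a newline.
        System.out.println(matches("(?s)a.b", "a\nb"));           // true
    }
}
```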

(?=X) X, via zero-width positive lookahead. Matching continues only if subexpression X matches to the right of this position. For example, \w+(?=\d) matches word characters that are followed by a digit, without including that digit in the match.

(?!X) X, via zero-width negative lookahead. Matching continues only if subexpression X does not match to the right of this position. For example, \w+(?!\d) matches word characters that are not followed by a digit.

(?<=X) X, via zero-width positive lookbehind

(?<!X) X, via zero-width negative lookbehind

(?>X) X, as an independent non-capturing group (no backtracking)

The difference between (?:X) and (?>X) is that (?>X) does not backtrack. With an alternation such as (?>b|bc), once the group has matched b the matcher leaves the group and never returns to try bc, so a match that needs bc cannot be found. This can speed up matching.
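The lookaround and independent-group constructs can be illustrated with a sketch like the following. LookaroundDemo, first, and the sample inputs (including the abcd string used for the independent group) are hypothetical choices for this example, not taken from the original article.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical demo class for lookahead, lookbehind, and independent groups.
class LookaroundDemo {
    // Returns the first match of the regex in the input, or null if none.
    static String first(String regex, String input) {
        Matcher m = Pattern.compile(regex).matcher(input);
        return m.find() ? m.group() : null;
    }

    public static void main(String[] args) {
        // Positive lookahead: the digit must follow but is not consumed.
        System.out.println(first("a(?=\\d)", "a1"));  // a
        // Negative lookahead: a must not be followed by a digit.
        System.out.println(first("a(?!\\d)", "a1"));  // null
        System.out.println(first("a(?!\\d)", "ab"));  // a
        // Positive lookbehind: digits preceded by a dollar sign.
        System.out.println(first("(?<=\\$)\\d+", "price $42"));    // 42
        // Negative lookbehind: a number not preceded by a dollar sign.
        System.out.println(first("(?<!\\$)\\b\\d+", "$42 or 7"));  // 7
        // An ordinary group can backtrack from b to bc; an independent one cannot.
        System.out.println(Pattern.matches("a(?:b|bc)d", "abcd")); // true
        System.out.println(Pattern.matches("a(?>b|bc)d", "abcd")); // false
    }
}
```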
