Java regular expression: Exactly match words containing special characters (take 'C++' as an example)

Core Challenge: Special Characters and Exact Matching
When processing text, we often need to find specific words. However, when a word contains regular expression metacharacters, such as the + in "C++", using it directly as a pattern causes ambiguity. In a regular expression, + means "match the previous element one or more times". Therefore, to match a literal plus sign, it must be escaped.
Another challenge is ensuring the accuracy of the match. For example, we might want to match the complete word "C++" but avoid matching just "C" or other variations such as "C#". This requires word boundaries and more advanced assertion techniques to precisely limit the matching scope.
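To make the ambiguity concrete, here is a minimal sketch comparing an unescaped pattern with two escaped forms. The class name EscapeDemo and its helper method are made up for illustration. Note that in Java an unescaped "C++" still compiles, because ++ is read as a possessive "one or more" quantifier; it simply does not mean what we want.

```java
import java.util.regex.Pattern;

public class EscapeDemo {
    // Unescaped: "++" is read as a possessive "one or more" quantifier on "C".
    static final Pattern UNESCAPED = Pattern.compile("C++");
    // Escaped by hand: each plus sign is matched literally.
    static final Pattern ESCAPED = Pattern.compile("C\\+\\+");
    // Escaped by the library: Pattern.quote treats the whole string as a literal.
    static final Pattern QUOTED = Pattern.compile(Pattern.quote("C++"));

    /** Returns true if the pattern matches the entire input. */
    static boolean matchesFully(Pattern p, String input) {
        return p.matcher(input).matches();
    }

    public static void main(String[] args) {
        System.out.println(matchesFully(UNESCAPED, "CCC")); // true
        System.out.println(matchesFully(UNESCAPED, "C++")); // false
        System.out.println(matchesFully(ESCAPED, "C++"));   // true
        System.out.println(matchesFully(QUOTED, "C++"));    // true
    }
}
```

Pattern.quote is convenient when the search term comes from user input and may contain any metacharacter.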
Key regular expression components
In order to achieve accurate matching, we need to master the following core regular expression components:
- Character escape \: To match the literal meaning of a regular expression metacharacter (such as ., *, +, ?, |, (, ), [, ], {, }, ^, $, \), put a backslash \ in front of it. In a Java string literal, the backslash is itself an escape character, so it must be doubled. For example, the regex \+ that matches a literal plus sign is written as "\\+" in Java source code.
- Word boundary \b: \b is a zero-width assertion that matches the beginning or end of a word. A word consists of letters, numbers, or underscores. \b ensures that we are matching an independent "word" and not part of a larger word. For example, \bcat\b will match "cat" in "The cat sat", but not "cat" in "category".
- Non-word boundary \B: \B is the opposite of \b, which matches the position of a non-word boundary. This means that it matches a position that is either between two word characters, two non-word characters, or at the beginning/end of the string where that position does not constitute a word boundary. For example, cat\B will match "cat" in "category", but not "cat" in "cat and dog".
- Negative lookahead assertion (?!...): This is also a zero-width assertion. It succeeds only if the pattern inside the parentheses does not match at the current position. It is used to exclude specific suffixes without consuming any characters. For example, foo(?!bar) matches "foo" only when it is not followed by "bar".
- Case-insensitive (?i): This is an embedded flag placed at the beginning of a regular expression to make the rest of the pattern case-insensitive. For example, (?i)c matches both "c" and "C".
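The components above can be tried out one at a time. The following is a small sketch using the examples from the list; the class name ComponentsDemo and the helper found are illustrative names, not part of any library.

```java
import java.util.regex.Pattern;

public class ComponentsDemo {
    /** Returns true if the regex matches anywhere in the text. */
    static boolean found(String regex, String text) {
        return Pattern.compile(regex).matcher(text).find();
    }

    public static void main(String[] args) {
        // Word boundary: "cat" must stand alone.
        System.out.println(found("\\bcat\\b", "The cat sat")); // true
        System.out.println(found("\\bcat\\b", "category"));    // false
        // Non-word boundary: "cat" must be followed by another word character.
        System.out.println(found("cat\\B", "category"));       // true
        System.out.println(found("cat\\B", "cat and dog"));    // false
        // Negative lookahead: "foo" must not be followed by "bar".
        System.out.println(found("foo(?!bar)", "foobaz"));     // true
        System.out.println(found("foo(?!bar)", "foobar"));     // false
        // Embedded flag: case-insensitive matching.
        System.out.println(found("(?i)c", "C"));               // true
    }
}
```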
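To achieve an exact match: "C++"
To match the word "C++" exactly and ensure that it exists as a separate word, we can use the following regular expression (written as a Java string literal):
(?i)\\bC\\+\\+\\B
Explanation:
- (?i): Enables case-insensitive matching.
- \\b: Matches the starting boundary of the word, so that "C" is not the tail of a longer word.
- C: Matches the letter "C".
- \\+\\+: Matches two literal plus signs "++".
- \\B: Matches a non-word boundary after the second "+". Because "+" is not a word character, this position is a non-word boundary exactly when "C++" is followed by a non-word character (a space, punctuation) or by the end of the string. A trailing \\b would do the opposite and require a word character right after "++".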
Sample code:
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class RegexCppMatch {
    public static void main(String[] args) {
        String text1 = "Framework, c++ and Visual Studio IDEs.";
        String text2 = "C++ is a powerful language.";
        String text3 = "This is a C++IDE."; // C++ followed directly by a word character
        String text4 = "Learn C programming.";
        String text5 = "C++";

        // Match "C++" as an independent word: \b before "C", \B after the second "+"
        Pattern pCpp = Pattern.compile("(?i)\\bC\\+\\+\\B");
        System.out.println("--- matches 'C++' (as a separate word) ---");
        testMatch(pCpp, text1, "Text1: " + text1); // Expected match ("++" is followed by a space)
        testMatch(pCpp, text2, "Text2: " + text2); // Expected match ("++" is followed by a space)
        testMatch(pCpp, text3, "Text3: " + text3); // Expected no match ("++" is followed by the word character "I")
        testMatch(pCpp, text4, "Text4: " + text4); // Expected no match (there is no "++")
        testMatch(pCpp, text5, "Text5: " + text5); // Expected match ("++" is at the end of the string)

        // For comparison: a trailing \b instead of \B
        Pattern pCppWithWordBoundary = Pattern.compile("(?i)\\bC\\+\\+\\b");
        System.out.println("\n--- 'C++' with a trailing \\b (for comparison) ---");
        testMatch(pCppWithWordBoundary, text1, "Text1: " + text1); // Expected no match (no boundary between "+" and the space)
        testMatch(pCppWithWordBoundary, text2, "Text2: " + text2); // Expected no match (no boundary between "+" and the space)
        testMatch(pCppWithWordBoundary, text3, "Text3: " + text3); // Expected match (boundary between "+" and "I")
        testMatch(pCppWithWordBoundary, text5, "Text5: " + text5); // Expected no match (no boundary between "+" and the end of the string)
    }

    // Prints whether the pattern is found anywhere in the text
    private static void testMatch(Pattern p, String text, String description) {
        Matcher m = p.matcher(text);
        if (m.find()) {
            System.out.println(description + " -> Match successful");
        } else {
            System.out.println(description + " -> Match failed");
        }
    }
}
A note about \B: The original answer mentioned that the regular expression used to match "C++" is (?i).*\\bc\\+\\+\\B.*. The \B at the end is a non-word boundary, and it is needed because of how \b is defined. \b only matches between a word character and a non-word character, and "+" is a non-word character. After "C++", a position is therefore a word boundary only when the next character is a word character (such as the "I" in "C++IDE"). If "C++" is followed by a space, punctuation mark, or newline (as in "C++ and"), or sits at the end of the string, both sides of the position are non-word, so \b fails and \B succeeds. Since we usually want "C++" to be matched in contexts such as "C++ and", \bC\+\+\B is the pattern to use for a complete match of the word; \bC\+\+\b only applies to the unusual case where "C++" is glued to a following word. The leading and trailing .* are needed only with String.matches(), which must match the entire input; with Matcher.find() they can be omitted.
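Choosing between \b and \B depends on whether the term starts or ends with a word character. One way to sidestep that decision, shown here as an alternative sketch and not as the approach of the original answer, is to replace both boundaries with lookarounds and let Pattern.quote handle the escaping. The class name LiteralWordMatcher and its methods are hypothetical.

```java
import java.util.regex.Pattern;

public class LiteralWordMatcher {
    /**
     * Builds a case-insensitive pattern that finds the term literally,
     * provided it is not directly preceded or followed by a word character.
     */
    static Pattern forTerm(String term) {
        return Pattern.compile("(?<!\\w)" + Pattern.quote(term) + "(?!\\w)",
                Pattern.CASE_INSENSITIVE);
    }

    /** Returns true if the term occurs as a separate word in the text. */
    static boolean contains(String term, String text) {
        return forTerm(term).matcher(text).find();
    }

    public static void main(String[] args) {
        System.out.println(contains("C++", "Framework, c++ and Visual Studio IDEs.")); // true
        System.out.println(contains("C++", "This is a C++IDE."));                      // false
        System.out.println(contains("C++", "C++"));                                    // true
        System.out.println(contains("C#", "C# and C++"));                              // true
    }
}
```

The same builder works for terms such as "C#" or ".NET" without hand-written escapes, at the cost of a slightly longer pattern.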
Distinguishing match: "C" but not "C++"
If we need to match the independent letter "C" but exclude the case where it is immediately followed by "++", we can use a negative lookahead assertion:
(?i)\\bC\\b(?!\\+{2})
Explanation:
- (?i): Case-insensitive.
- \\bC\\b: Matches the independent letter "C". This makes sure "C" is a complete word, e.g. it matches the "C" in "C Language" but not the "C" in "Category".
- (?!\\+{2}): Negative lookahead assertion. It checks that the current position (right after \bC\b) is not followed by two literal plus signs "++". If "++" follows, the match fails at this position.
Sample code:
import java.util.regex.Matcher;
import java.util.regex.Pattern;
public class RegexCNotCppMatch {
    public static void main(String[] args) {
        String text1 = "Learn C programming.";
        String text2 = "C++ is a powerful language.";
        String text3 = "This is C and C++.";
        String text4 = "Only C.";
        String text5 = "CC++"; // C next to another C (not a standalone "C")
        // Matches "C" but not "C++"
        Pattern pCNotCpp = Pattern.compile("(?i)\\bC\\b(?!\\+{2})");
        System.out.println("\n--- matches 'C' but not 'C++' ---");
        testMatch(pCNotCpp, text1, "Text1: " + text1); // Expected match
        testMatch(pCNotCpp, text2, "Text2: " + text2); // Expected no match (the only "C" is followed by "++")
        testMatch(pCNotCpp, text3, "Text3: " + text3); // Expected match (the first, standalone "C")
        testMatch(pCNotCpp, text4, "Text4: " + text4); // Expected match
        testMatch(pCNotCpp, text5, "Text5: " + text5); // Expected no match (neither "C" is a standalone word)
    }
    // Same helper as in the previous example
    private static void testMatch(Pattern p, String text, String description) {
        Matcher m = p.matcher(text);
        System.out.println(description + (m.find() ? " -> Match successful" : " -> Match failed"));
    }
}
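As a closing illustration, the sketch below runs a "C++" pattern and a "C but not C++" pattern over the same text and counts the hits of each. The class name LanguageMentionCounter, the sample sentence, and the count helper are invented for this example.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class LanguageMentionCounter {
    // "C++" as a separate word: word boundary before "C", non-word boundary after "++".
    static final Pattern CPP = Pattern.compile("(?i)\\bC\\+\\+\\B");
    // A standalone "C" that is not followed by "++".
    static final Pattern C_ONLY = Pattern.compile("(?i)\\bC\\b(?!\\+{2})");

    /** Counts non-overlapping matches of the pattern in the text. */
    static int count(Pattern p, String text) {
        Matcher m = p.matcher(text);
        int n = 0;
        while (m.find()) {
            n++;
        }
        return n;
    }

    public static void main(String[] args) {
        String text = "I learned C first, then C++, and now I use c and c++ daily.";
        System.out.println("C++ mentions: " + count(CPP, text));
        System.out.println("C mentions:   " + count(C_ONLY, text));
    }
}
```

Because the two patterns are mutually exclusive at any given "C", each mention is counted by exactly one of them.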





