1. Introduction to Regular Expressions
What are Regular Expressions?
Regular expressions (regex) are patterns that describe sets of strings. They are used to match, search, extract, and replace text.
Why Use Regular Expressions?
- Text validation (emails, phone numbers, etc.)
- Text extraction (parsing data)
- Text replacement and manipulation
- Pattern searching in large texts
- Data cleaning and formatting
Java Regex Classes (package java.util.regex):
- java.util.regex.Pattern: Compiled regex patterns
- java.util.regex.Matcher: Engine that performs match operations
- java.util.regex.PatternSyntaxException: Unchecked exception for regex syntax errors
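The usual workflow with these classes is: compile a Pattern once, create a Matcher per input, then iterate over the matches. The following is a minimal sketch of that flow; the class name RegexWorkflowSketch and the helper findNumbers are hypothetical names chosen for illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class RegexWorkflowSketch {
    // Compile once; a Pattern is immutable and can be shared between threads.
    private static final Pattern NUMBER = Pattern.compile("\\d+");

    // Collects every run of digits found in the input, in order of appearance.
    static List<String> findNumbers(String input) {
        List<String> found = new ArrayList<>();
        Matcher matcher = NUMBER.matcher(input); // a Matcher holds state for one input
        while (matcher.find()) {                 // advance to the next match
            found.add(matcher.group());          // text of the current match
        }
        return found;
    }

    public static void main(String[] args) {
        System.out.println(findNumbers("Order 66 shipped 3 items")); // [66, 3]
    }
}
```

A Matcher is not thread-safe, so it is created per call here while the Pattern is shared.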
2. Basic Regex Syntax
| Pattern | Description | Example |
|---|---|---|
| . | Any character except line terminators (unless DOTALL is set) | a.c matches "abc", "a c", "a-c" |
| \d | Digit (0-9) | \d\d matches "12", "45" |
| \D | Non-digit | \D\D matches "ab", "@#" |
| \w | Word character (a-z, A-Z, 0-9, _) | \w\w matches "a1", "B_" |
| \W | Non-word character | \W\W matches "@ ", "!%" |
| \s | Whitespace character | a\sb matches "a b" |
| \S | Non-whitespace character | a\Sb matches "a-b", "aBb" |
| [abc] | Any of a, b, or c | [aeiou] matches vowels |
| [^abc] | Any character except a, b, or c | [^0-9] matches non-digits |
| [a-z] | Any lowercase letter | [a-z] matches "a" to "z" |
| [A-Z] | Any uppercase letter | [A-Z] matches "A" to "Z" |
| [0-9] | Any digit | [0-9] matches "0" to "9" |
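In Java source code each regex backslash has to be written twice, so the regex \d becomes the string literal "\\d". The sketch below checks a few rows of the table; the class CharacterClassSketch and the helper fullyMatches are hypothetical names used only for this illustration.

```java
import java.util.regex.Pattern;

public class CharacterClassSketch {
    // Pattern.matches returns true only if the whole input matches the regex.
    static boolean fullyMatches(String regex, String input) {
        return Pattern.matches(regex, input);
    }

    public static void main(String[] args) {
        System.out.println(fullyMatches("a.c", "a-c"));    // true
        System.out.println(fullyMatches("\\d\\d", "45"));  // true
        System.out.println(fullyMatches("\\w\\w", "B_"));  // true
        System.out.println(fullyMatches("a\\sb", "a b"));  // true
        System.out.println(fullyMatches("[^0-9]", "7"));   // false: 7 is a digit
        System.out.println(fullyMatches("a.c", "a\nc"));   // false: . skips line terminators by default
    }
}
```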
3. Quantifiers
| Quantifier | Description | Example |
|---|---|---|
| * | 0 or more times | a* matches "", "a", "aa" |
| + | 1 or more times | a+ matches "a", "aa" |
| ? | 0 or 1 time | a? matches "", "a" |
| {n} | Exactly n times | a{3} matches "aaa" |
| {n,} | n or more times | a{2,} matches "aa", "aaa" |
| {n,m} | Between n and m times (inclusive) | a{2,4} matches "aa", "aaa", "aaaa" |
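To see how the quantifiers in the table behave on a real input, the following sketch prints the first match of several patterns against the same string. The class QuantifierSketch and the helper firstMatch are hypothetical names for illustration.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class QuantifierSketch {
    // Returns the first substring matched by the regex, or null if there is none.
    static String firstMatch(String regex, String input) {
        Matcher m = Pattern.compile(regex).matcher(input);
        return m.find() ? m.group() : null;
    }

    public static void main(String[] args) {
        String input = "caaaat";
        System.out.println(firstMatch("a{2}", input));   // aa
        System.out.println(firstMatch("a{2,3}", input)); // aaa (greedy: takes as many as allowed)
        System.out.println(firstMatch("a{2,}", input));  // aaaa
        System.out.println(firstMatch("a+", input));     // aaaa
        System.out.println(firstMatch("ca?t", input));   // null: more than one 'a' between c and t
        // a* can match zero characters, so the first match is the empty string before 'c'
        System.out.println(firstMatch("a*", input).isEmpty()); // true
    }
}
```

The last line shows a common surprise: a pattern that can match the empty string always finds a match.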
4. Complete Code Examples
Example 1: Basic Pattern Matching
import java.util.regex.*;
import java.util.*;
public class BasicRegexExamples {
public static void main(String[] args) {
System.out.println("=== Basic Pattern Matching ===");
// 1. Simple pattern matching
String text = "The quick brown fox jumps over the lazy dog";
String pattern = "fox";
boolean found = Pattern.compile(pattern).matcher(text).find();
System.out.println("Pattern 'fox' found: " + found);
// 2. Case insensitive matching
Pattern caseInsensitive = Pattern.compile("THE", Pattern.CASE_INSENSITIVE);
Matcher matcher = caseInsensitive.matcher(text);
System.out.println("Case insensitive 'THE' found: " + matcher.find());
// 3. Using matches() for exact matching
String email = "user@example.com";
boolean isEmail = email.matches(".*@.*\\..*");
System.out.println("Is email format: " + isEmail);
// 4. Multiple patterns
String[] testStrings = {
"hello", "123", "hello123", "HELLO", "123hello"
};
Pattern wordPattern = Pattern.compile("[a-zA-Z]+");
Pattern digitPattern = Pattern.compile("\\d+");
for (String str : testStrings) {
boolean isWord = wordPattern.matcher(str).matches();
boolean isDigit = digitPattern.matcher(str).matches();
System.out.println("'" + str + "' - Word: " + isWord + ", Digit: " + isDigit);
}
// 5. Character classes
System.out.println("\n=== Character Classes ===");
String sample = "abc123 XYZ!@#";
// Find all digits
Matcher digitMatcher = Pattern.compile("\\d").matcher(sample);
System.out.print("Digits: ");
while (digitMatcher.find()) {
System.out.print(digitMatcher.group() + " ");
}
System.out.println();
// Find all words
Matcher wordMatcher = Pattern.compile("\\w+").matcher(sample);
System.out.print("Words: ");
while (wordMatcher.find()) {
System.out.print(wordMatcher.group() + " ");
}
System.out.println();
// Find non-words
Matcher nonWordMatcher = Pattern.compile("\\W+").matcher(sample);
System.out.print("Non-words: ");
while (nonWordMatcher.find()) {
System.out.print("'" + nonWordMatcher.group() + "' ");
}
System.out.println();
}
}
Example 2: Quantifiers and Grouping
import java.util.regex.*;
import java.util.*;
public class QuantifiersAndGrouping {
public static void main(String[] args) {
System.out.println("=== Quantifiers and Grouping ===");
// 1. Basic quantifiers
String[] numbers = {"1", "12", "123", "1234", "12345"};
Pattern twoToFourDigits = Pattern.compile("\\d{2,4}");
System.out.println("Numbers with 2-4 digits:");
for (String num : numbers) {
if (twoToFourDigits.matcher(num).matches()) {
System.out.println("✓ " + num);
} else {
System.out.println("✗ " + num);
}
}
// 2. Greedy vs Reluctant quantifiers
String html = "<div>content</div><p>more content</p>";
// Greedy - matches longest possible
Pattern greedy = Pattern.compile("<.*>");
Matcher greedyMatcher = greedy.matcher(html);
if (greedyMatcher.find()) {
System.out.println("Greedy match: " + greedyMatcher.group());
}
// Reluctant - matches shortest possible
Pattern reluctant = Pattern.compile("<.*?>");
Matcher reluctantMatcher = reluctant.matcher(html);
System.out.println("Reluctant matches:");
while (reluctantMatcher.find()) {
System.out.println(" " + reluctantMatcher.group());
}
// 3. Grouping with parentheses
String date = "2024-01-15";
Pattern datePattern = Pattern.compile("(\\d{4})-(\\d{2})-(\\d{2})");
Matcher dateMatcher = datePattern.matcher(date);
if (dateMatcher.matches()) {
System.out.println("\nDate parsing:");
System.out.println("Full date: " + dateMatcher.group(0));
System.out.println("Year: " + dateMatcher.group(1));
System.out.println("Month: " + dateMatcher.group(2));
System.out.println("Day: " + dateMatcher.group(3));
}
// 4. Named groups (Java 7+)
Pattern namedPattern = Pattern.compile("(?<year>\\d{4})-(?<month>\\d{2})-(?<day>\\d{2})");
Matcher namedMatcher = namedPattern.matcher(date);
if (namedMatcher.matches()) {
System.out.println("\nNamed groups:");
System.out.println("Year: " + namedMatcher.group("year"));
System.out.println("Month: " + namedMatcher.group("month"));
System.out.println("Day: " + namedMatcher.group("day"));
}
// 5. Alternation (OR)
Pattern colorPattern = Pattern.compile("red|blue|green");
String[] testColors = {"red car", "blue sky", "yellow sun", "green grass"};
System.out.println("\nColor matching:");
for (String test : testColors) {
if (colorPattern.matcher(test).find()) {
System.out.println("✓ " + test + " contains color");
} else {
System.out.println("✗ " + test + " no color found");
}
}
// 6. Complex grouping example
String logLine = "ERROR 2024-01-15 14:30:25 Database connection failed";
Pattern logPattern = Pattern.compile("(\\w+)\\s+(\\d{4}-\\d{2}-\\d{2})\\s+(\\d{2}:\\d{2}:\\d{2})\\s+(.+)");
Matcher logMatcher = logPattern.matcher(logLine);
if (logMatcher.matches()) {
System.out.println("\nLog parsing:");
System.out.println("Level: " + logMatcher.group(1));
System.out.println("Date: " + logMatcher.group(2));
System.out.println("Time: " + logMatcher.group(3));
System.out.println("Message: " + logMatcher.group(4));
}
}
}
Example 3: Real-World Validation Examples
import java.util.regex.*;
import java.util.*;
public class ValidationExamples {
// Validation methods
public static boolean isValidEmail(String email) {
String emailRegex = "^[a-zA-Z0-9_+&*-]+(?:\\.[a-zA-Z0-9_+&*-]+)*@(?:[a-zA-Z0-9-]+\\.)+[a-zA-Z]{2,7}$";
return Pattern.compile(emailRegex).matcher(email).matches();
}
public static boolean isValidPhone(String phone) {
// Supports: (123) 456-7890, 123-456-7890, 123.456.7890, 1234567890
String phoneRegex = "^(\\(\\d{3}\\)|\\d{3})[-.\\s]?\\d{3}[-.\\s]?\\d{4}$";
return Pattern.compile(phoneRegex).matcher(phone).matches();
}
public static boolean isValidPassword(String password) {
// At least 8 chars, 1 uppercase, 1 lowercase, 1 digit, 1 special char
String passwordRegex = "^(?=.*[a-z])(?=.*[A-Z])(?=.*\\d)(?=.*[@$!%*?&])[A-Za-z\\d@$!%*?&]{8,}$";
return Pattern.compile(passwordRegex).matcher(password).matches();
}
public static boolean isValidIPAddress(String ip) {
String ipRegex = "^((25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)\\.){3}(25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)$";
return Pattern.compile(ipRegex).matcher(ip).matches();
}
public static boolean isValidURL(String url) {
String urlRegex = "^(https?|ftp)://[^\\s/$.?#].[^\\s]*$";
return Pattern.compile(urlRegex).matcher(url).matches();
}
public static boolean isValidCreditCard(String card) {
// Basic credit card format (16 digits, possible spaces/dashes)
String cardRegex = "^(\\d{4}[-\\s]?){3}\\d{4}$";
return Pattern.compile(cardRegex).matcher(card).matches();
}
public static boolean isValidDate(String date) {
// YYYY-MM-DD format
String dateRegex = "^\\d{4}-(0[1-9]|1[0-2])-(0[1-9]|[12][0-9]|3[01])$";
return Pattern.compile(dateRegex).matcher(date).matches();
}
public static void main(String[] args) {
System.out.println("=== Real-World Validation Examples ===");
// Test emails
String[] emails = {
"user@example.com",
"invalid-email",
"first.last@example.co.uk",
"user+tag@example.org",
"missing@tld"
};
System.out.println("\nEmail Validation:");
for (String email : emails) {
System.out.printf("%-25s : %s%n", email, isValidEmail(email) ? "✓ Valid" : "✗ Invalid");
}
// Test phone numbers
String[] phones = {
"(123) 456-7890",
"123-456-7890",
"123.456.7890",
"1234567890",
"123-45-6789"
};
System.out.println("\nPhone Number Validation:");
for (String phone : phones) {
System.out.printf("%-20s : %s%n", phone, isValidPhone(phone) ? "✓ Valid" : "✗ Invalid");
}
// Test passwords
String[] passwords = {
"StrongPass1!",
"weak",
"NoSpecialChar1",
"nouppercase1!",
"NOLOWERCASE1!"
};
System.out.println("\nPassword Validation:");
for (String pwd : passwords) {
System.out.printf("%-20s : %s%n", pwd, isValidPassword(pwd) ? "✓ Valid" : "✗ Invalid");
}
// Test IP addresses
String[] ips = {
"192.168.1.1",
"255.255.255.255",
"300.400.500.600",
"127.0.0.1",
"1.2.3"
};
System.out.println("\nIP Address Validation:");
for (String ip : ips) {
System.out.printf("%-15s : %s%n", ip, isValidIPAddress(ip) ? "✓ Valid" : "✗ Invalid");
}
// Test credit cards
String[] cards = {
"1234-5678-9012-3456",
"1234 5678 9012 3456",
"1234567890123456",
"1234-5678-9012",
"abcd-efgh-ijkl-mnop"
};
System.out.println("\nCredit Card Validation:");
for (String card : cards) {
System.out.printf("%-25s : %s%n", card, isValidCreditCard(card) ? "✓ Valid" : "✗ Invalid");
}
}
}
Example 4: Text Extraction and Parsing
import java.util.regex.*;
import java.util.*;
public class TextExtractionExamples {
public static void main(String[] args) {
System.out.println("=== Text Extraction and Parsing ===");
// 1. Extract all numbers from text
String textWithNumbers = "The price is $25.99 and quantity is 150 units. Temperature: -5.5°C";
// Optional sign, digits, and an optional fractional part (a trailing '.' is not captured)
Pattern numberPattern = Pattern.compile("-?\\d+(?:\\.\\d+)?");
Matcher numberMatcher = numberPattern.matcher(textWithNumbers);
System.out.println("Numbers found:");
while (numberMatcher.find()) {
System.out.println(" " + numberMatcher.group());
}
// 2. Extract email addresses
String textWithEmails = """
Contact us at support@example.com or sales@example.com.
For complaints: complaints@example.org
Invalid: not-an-email@
""";
// Note: inside a character class, | is a literal pipe, so the TLD class is [A-Za-z]
Pattern emailPattern = Pattern.compile("\\b[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\\.[A-Za-z]{2,}\\b");
Matcher emailMatcher = emailPattern.matcher(textWithEmails);
System.out.println("\nEmail addresses found:");
while (emailMatcher.find()) {
System.out.println(" " + emailMatcher.group());
}
// 3. Extract HTML tags
String htmlContent = """
<html>
<head><title>My Page</title></head>
<body>
<h1>Welcome</h1>
<p>This is a <b>bold</b> text.</p>
<img src="image.jpg" alt="Example">
</body>
</html>
""";
Pattern tagPattern = Pattern.compile("<([a-zA-Z][a-zA-Z0-9]*)[^>]*>");
Matcher tagMatcher = tagPattern.matcher(htmlContent);
System.out.println("\nHTML tags found:");
Set<String> uniqueTags = new HashSet<>();
while (tagMatcher.find()) {
uniqueTags.add(tagMatcher.group(1));
}
uniqueTags.forEach(tag -> System.out.println(" <" + tag + ">"));
// 4. Extract words with specific patterns
String sentence = "The quick brown fox jumps over 123 lazy dogs. Programming in Java 17!";
// Extract words starting with capital letters
Pattern capitalWords = Pattern.compile("\\b[A-Z][a-z]*\\b");
Matcher capitalMatcher = capitalWords.matcher(sentence);
System.out.println("\nCapitalized words:");
while (capitalMatcher.find()) {
System.out.println(" " + capitalMatcher.group());
}
// 5. Parse CSV data
String csvLine = "John,30,\"Software Engineer\",\"New York, NY\",$75000";
// Complex pattern to handle quoted fields with commas
Pattern csvPattern = Pattern.compile("\"([^\"]*)\"|([^,]+)");
Matcher csvMatcher = csvPattern.matcher(csvLine);
System.out.println("\nCSV fields:");
List<String> fields = new ArrayList<>();
while (csvMatcher.find()) {
String field = csvMatcher.group(1) != null ? csvMatcher.group(1) : csvMatcher.group(2);
fields.add(field);
System.out.println(" '" + field + "'");
}
// 6. Extract dates in different formats
String textWithDates = """
Events: 2024-01-15, 15/01/2024, Jan 15, 2024, 01-15-2024
Invalid: 2024/13/45, 32-01-2024
""";
Pattern datePattern = Pattern.compile("\\b(\\d{4}-\\d{2}-\\d{2}|\\d{2}/\\d{2}/\\d{4}|\\w{3} \\d{1,2}, \\d{4})\\b");
Matcher dateMatcher = datePattern.matcher(textWithDates);
System.out.println("\nDates found:");
while (dateMatcher.find()) {
System.out.println(" " + dateMatcher.group());
}
// 7. Extract code comments
String javaCode = """
public class Example {
// This is a single-line comment
private int value; // inline comment
/*
* This is a multi-line comment
*/
public void method() {
System.out.println("Hello"); // another comment
}
}
""";
// Pattern for single-line comments
Pattern singleLineComment = Pattern.compile("//.*$", Pattern.MULTILINE);
Matcher commentMatcher = singleLineComment.matcher(javaCode);
System.out.println("\nSingle-line comments:");
while (commentMatcher.find()) {
System.out.println(" " + commentMatcher.group().trim());
}
}
}
Example 5: Text Replacement and Manipulation
import java.util.regex.*;
import java.util.*;
public class TextReplacementExamples {
public static void main(String[] args) {
System.out.println("=== Text Replacement and Manipulation ===");
// 1. Simple replacement
String text = "The color of the sky is blue. I like blue cars.";
String replaced = text.replaceAll("blue", "red");
System.out.println("Original: " + text);
System.out.println("Replaced: " + replaced);
// 2. Using groups in replacement
String nameFormat = "Doe, John";
String formattedName = nameFormat.replaceAll("(\\w+), (\\w+)", "$2 $1");
System.out.println("\nName formatting:");
System.out.println("Before: " + nameFormat);
System.out.println("After: " + formattedName);
// 3. Mask sensitive information
String creditCard = "Card: 1234-5678-9012-3456";
String maskedCard = creditCard.replaceAll("\\b\\d{4}-\\d{4}-\\d{4}-\\d{2}(\\d{2})\\b", "XXXX-XXXX-XXXX-XX$1");
System.out.println("\nCredit card masking:");
System.out.println("Before: " + creditCard);
System.out.println("After: " + maskedCard);
// 4. HTML tag removal
String htmlText = "<p>This is <b>bold</b> and <i>italic</i> text.</p>";
String plainText = htmlText.replaceAll("<[^>]*>", "");
System.out.println("\nHTML to plain text:");
System.out.println("HTML: " + htmlText);
System.out.println("Text: " + plainText);
// 5. URL extraction and conversion
String textWithUrls = "Visit https://example.com and http://sub.domain.org for more info.";
String linkedText = textWithUrls.replaceAll("(https?://[^\\s]+)", "<a href=\"$1\">$1</a>");
System.out.println("\nURL to HTML links:");
System.out.println("Before: " + textWithUrls);
System.out.println("After: " + linkedText);
// 6. Phone number formatting
String[] phoneNumbers = {
"1234567890",
"123-456-7890",
"(123)4567890",
"123.456.7890"
};
System.out.println("\nPhone number standardization:");
for (String phone : phoneNumbers) {
String standardized = phone.replaceAll("\\D", "") // Remove non-digits
.replaceAll("(\\d{3})(\\d{3})(\\d{4})", "($1) $2-$3");
System.out.println(phone + " → " + standardized);
}
// 7. Multiple spaces to single space
String textWithSpaces = "This   has    multiple     spaces   between   words.";
String cleanedText = textWithSpaces.replaceAll("\\s+", " ");
System.out.println("\nSpace normalization:");
System.out.println("Before: " + textWithSpaces);
System.out.println("After: " + cleanedText);
// 8. Case conversion using Matcher
String mixedCase = "convert THIS text to Title Case";
Pattern wordPattern = Pattern.compile("\\b\\w+\\b");
Matcher wordMatcher = wordPattern.matcher(mixedCase);
StringBuffer titleCase = new StringBuffer();
while (wordMatcher.find()) {
String word = wordMatcher.group();
String titleWord = word.substring(0, 1).toUpperCase() + word.substring(1).toLowerCase();
wordMatcher.appendReplacement(titleCase, titleWord);
}
wordMatcher.appendTail(titleCase);
System.out.println("\nTitle case conversion:");
System.out.println("Before: " + mixedCase);
System.out.println("After: " + titleCase);
// 9. Using Matcher for complex replacement
String template = "Hello {name}, your order {orderId} will be delivered to {address}.";
Map<String, String> values = Map.of(
"name", "John Doe",
"orderId", "ORD-12345",
"address", "123 Main St, City"
);
Pattern placeholder = Pattern.compile("\\{([^}]+)\\}");
Matcher templateMatcher = placeholder.matcher(template);
StringBuffer result = new StringBuffer();
while (templateMatcher.find()) {
String key = templateMatcher.group(1);
String replacement = values.getOrDefault(key, "[" + key + "]");
templateMatcher.appendReplacement(result, Matcher.quoteReplacement(replacement));
}
templateMatcher.appendTail(result);
System.out.println("\nTemplate replacement:");
System.out.println("Template: " + template);
System.out.println("Result: " + result);
}
}
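On Java 9 and later, the appendReplacement/appendTail loop used in Example 5 can usually be replaced by Matcher.replaceAll with a function argument. The sketch below redoes the template replacement that way; the class FunctionalReplaceSketch and the helper fill are hypothetical names, and the sample template is made up for illustration.

```java
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class FunctionalReplaceSketch {
    private static final Pattern PLACEHOLDER = Pattern.compile("\\{([^}]+)\\}");

    // Replaces each {key} with its value; unknown keys become [key].
    static String fill(String template, Map<String, String> values) {
        return PLACEHOLDER.matcher(template).replaceAll(match -> {
            String key = match.group(1);
            String replacement = values.getOrDefault(key, "[" + key + "]");
            // quoteReplacement keeps '$' and '\' in the value from being read as group references
            return Matcher.quoteReplacement(replacement);
        });
    }

    public static void main(String[] args) {
        String out = fill("Hello {name}, total: {total}, ref: {missing}",
                Map.of("name", "John Doe", "total", "$75"));
        System.out.println(out); // Hello John Doe, total: $75, ref: [missing]
    }
}
```

The function form avoids the mutable StringBuffer, and Matcher.quoteReplacement is still needed because the returned string is interpreted as a replacement string.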
Example 6: Advanced Regex Patterns
import java.util.regex.*;
import java.util.*;
public class AdvancedRegexExamples {
// Lookahead and lookbehind examples
public static void demonstrateLookarounds() {
System.out.println("=== Lookarounds ===");
String text = "apple banana applepie pineapple";
// Positive lookahead - apple followed by pie
Pattern lookahead = Pattern.compile("apple(?=pie)");
Matcher laMatcher = lookahead.matcher(text);
System.out.println("Positive lookahead (apple followed by pie):");
while (laMatcher.find()) {
System.out.println(" Found: '" + laMatcher.group() + "' at position " + laMatcher.start());
}
// Negative lookahead - apple NOT followed by pie
Pattern negativeLookahead = Pattern.compile("apple(?!pie)");
Matcher nlaMatcher = negativeLookahead.matcher(text);
System.out.println("Negative lookahead (apple NOT followed by pie):");
while (nlaMatcher.find()) {
System.out.println(" Found: '" + nlaMatcher.group() + "' at position " + nlaMatcher.start());
}
// Positive lookbehind - pie preceded by apple
Pattern lookbehind = Pattern.compile("(?<=apple)pie");
Matcher lbMatcher = lookbehind.matcher(text);
System.out.println("Positive lookbehind (pie preceded by apple):");
while (lbMatcher.find()) {
System.out.println(" Found: '" + lbMatcher.group() + "' at position " + lbMatcher.start());
}
}
// Backreferences example
public static void demonstrateBackreferences() {
System.out.println("\n=== Backreferences ===");
String[] testWords = {"abba", "hello", "1221", "abcba", "123321"};
Pattern palindromeLike = Pattern.compile("(\\w)(\\w)\\2\\1"); // 4-character palindrome
System.out.println("Words matching pattern (first char = last char, second char = third char):");
for (String word : testWords) {
if (palindromeLike.matcher(word).matches()) {
System.out.println(" ✓ " + word);
}
}
// Find repeated words
String text = "This is is a test test with repeated repeated words.";
Pattern repeatedWords = Pattern.compile("\\b(\\w+)\\s+\\1\\b");
Matcher rwMatcher = repeatedWords.matcher(text);
System.out.println("Repeated words:");
while (rwMatcher.find()) {
System.out.println(" '" + rwMatcher.group(1) + "' repeated at position " + rwMatcher.start());
}
}
// Unicode and character properties
public static void demonstrateUnicode() {
System.out.println("\n=== Unicode and Character Properties ===");
String mixedText = "Hello 世界 🌍 123!";
// Match any letter (including Unicode)
Pattern unicodeLetters = Pattern.compile("\\p{L}+");
Matcher ulMatcher = unicodeLetters.matcher(mixedText);
System.out.println("Unicode letters:");
while (ulMatcher.find()) {
System.out.println(" '" + ulMatcher.group() + "'");
}
// Match emojis and symbols
Pattern emojis = Pattern.compile("\\p{So}");
Matcher emojiMatcher = emojis.matcher(mixedText);
System.out.println("Emojis and symbols:");
while (emojiMatcher.find()) {
System.out.println(" '" + emojiMatcher.group() + "'");
}
}
// Complex password validation with detailed feedback
public static Map<String, Boolean> validatePasswordDetailed(String password) {
Map<String, Boolean> checks = new LinkedHashMap<>();
checks.put("At least 8 characters", password.length() >= 8);
checks.put("Contains uppercase", Pattern.compile("[A-Z]").matcher(password).find());
checks.put("Contains lowercase", Pattern.compile("[a-z]").matcher(password).find());
checks.put("Contains digit", Pattern.compile("\\d").matcher(password).find());
checks.put("Contains special character", Pattern.compile("[@$!%*?&]").matcher(password).find());
checks.put("No whitespace", !Pattern.compile("\\s").matcher(password).find());
return checks;
}
public static void main(String[] args) {
demonstrateLookarounds();
demonstrateBackreferences();
demonstrateUnicode();
// Password validation with detailed feedback
System.out.println("\n=== Detailed Password Validation ===");
String[] testPasswords = {
"Weak1!",
"StrongPassword123!",
"NoSpecialChar1",
" HasSpace1! ",
"ALLUPPERCASE!1"
};
for (String pwd : testPasswords) {
System.out.println("\nTesting: '" + pwd + "'");
Map<String, Boolean> results = validatePasswordDetailed(pwd);
results.forEach((check, passed) ->
System.out.println(" " + (passed ? "✓" : "✗") + " " + check));
boolean allPassed = results.values().stream().allMatch(Boolean::booleanValue);
System.out.println(" Overall: " + (allPassed ? "VALID" : "INVALID"));
}
// Complex pattern: Extract and validate time ranges
System.out.println("\n=== Time Range Validation ===");
String[] timeRanges = {
"09:00-17:00",
"13:30-14:45",
"25:00-18:00", // Invalid hour
"09:00-09:00", // Same time
"14:00-12:00" // End before start
};
Pattern timeRangePattern = Pattern.compile("(\\d{2}):(\\d{2})-(\\d{2}):(\\d{2})");
for (String range : timeRanges) {
Matcher trMatcher = timeRangePattern.matcher(range);
if (trMatcher.matches()) {
int startHour = Integer.parseInt(trMatcher.group(1));
int startMinute = Integer.parseInt(trMatcher.group(2));
int endHour = Integer.parseInt(trMatcher.group(3));
int endMinute = Integer.parseInt(trMatcher.group(4));
boolean validHours = startHour >= 0 && startHour < 24 && endHour >= 0 && endHour < 24;
boolean validMinutes = startMinute >= 0 && startMinute < 60 && endMinute >= 0 && endMinute < 60;
boolean validRange = startHour < endHour || (startHour == endHour && startMinute < endMinute);
System.out.printf("%-15s : %s%n", range,
(validHours && validMinutes && validRange) ? "✓ Valid" : "✗ Invalid");
} else {
System.out.printf("%-15s : ✗ Invalid format%n", range);
}
}
}
}
Example 7: Performance and Best Practices
import java.util.regex.*;
import java.util.*;
public class PerformanceBestPractices {
// Reuse compiled patterns for better performance
private static final Pattern EMAIL_PATTERN = Pattern.compile(
"^[a-zA-Z0-9_+&*-]+(?:\\.[a-zA-Z0-9_+&*-]+)*@(?:[a-zA-Z0-9-]+\\.)+[a-zA-Z]{2,7}$"
);
private static final Pattern PHONE_PATTERN = Pattern.compile(
"^(\\(\\d{3}\\)|\\d{3})[-.\\s]?\\d{3}[-.\\s]?\\d{4}$"
);
// Pre-compiled common patterns
private static final Map<String, Pattern> PRECOMPILED_PATTERNS = Map.of(
"DIGITS", Pattern.compile("\\d+"),
"WORDS", Pattern.compile("\\w+"),
"WHITESPACE", Pattern.compile("\\s+"),
"EMAIL", EMAIL_PATTERN,
"PHONE", PHONE_PATTERN
);
public static boolean isValidEmail(String email) {
return EMAIL_PATTERN.matcher(email).matches();
}
public static boolean isValidPhone(String phone) {
return PHONE_PATTERN.matcher(phone).matches();
}
public static void demonstratePerformance() {
System.out.println("=== Performance Considerations ===");
int iterations = 10000;
String testEmail = "test@example.com";
// Method 1: Compile pattern each time (SLOW)
long startTime1 = System.currentTimeMillis();
for (int i = 0; i < iterations; i++) {
boolean isValid = Pattern.compile(
"^[a-zA-Z0-9_+&*-]+(?:\\.[a-zA-Z0-9_+&*-]+)*@(?:[a-zA-Z0-9-]+\\.)+[a-zA-Z]{2,7}$"
).matcher(testEmail).matches();
}
long endTime1 = System.currentTimeMillis();
// Method 2: Use pre-compiled pattern (FAST)
long startTime2 = System.currentTimeMillis();
for (int i = 0; i < iterations; i++) {
boolean isValid = EMAIL_PATTERN.matcher(testEmail).matches();
}
long endTime2 = System.currentTimeMillis();
System.out.println("Time with recompilation: " + (endTime1 - startTime1) + "ms");
System.out.println("Time with pre-compiled: " + (endTime2 - startTime2) + "ms");
System.out.println("Performance improvement: " +
((endTime1 - startTime1) - (endTime2 - startTime2)) + "ms");
}
public static void demonstrateBestPractices() {
System.out.println("\n=== Best Practices ===");
// 1. Use specific character classes instead of .
String text = "Hello\nWorld";
// BAD - . doesn't match newline by default
Pattern badPattern = Pattern.compile("Hello.World");
System.out.println("Dot doesn't match newline: " + badPattern.matcher(text).find());
// GOOD - use specific pattern or DOTALL flag
Pattern goodPattern1 = Pattern.compile("Hello.World", Pattern.DOTALL);
Pattern goodPattern2 = Pattern.compile("Hello[\\s\\S]World");
System.out.println("With DOTALL flag: " + goodPattern1.matcher(text).find());
System.out.println("With [\\s\\S]: " + goodPattern2.matcher(text).find());
// 2. Use word boundaries for whole words
String sentence = "The cat in the cathedral";
// Without word boundary - matches "cat" in "cathedral"
Pattern noBoundary = Pattern.compile("cat");
// With word boundary - matches only whole word "cat"
Pattern withBoundary = Pattern.compile("\\bcat\\b");
System.out.println("\nWithout boundary matches: " + noBoundary.matcher(sentence).replaceAll("DOG"));
System.out.println("With boundary matches: " + withBoundary.matcher(sentence).replaceAll("DOG"));
// 3. Be careful with greedy quantifiers
String html = "<div>content</div><p>more</p>";
// GREEDY - matches too much
Pattern greedy = Pattern.compile("<.*>");
// RELUCTANT - matches correctly
Pattern reluctant = Pattern.compile("<.*?>");
System.out.println("\nGreedy match: " + greedy.matcher(html).replaceAll("[TAG]"));
System.out.println("Reluctant match: " + reluctant.matcher(html).replaceAll("[TAG]"));
// 4. Use non-capturing groups when you don't need the group
String date = "2024-01-15";
// Capturing groups (stores all groups)
Pattern capturing = Pattern.compile("(\\d{4})-(\\d{2})-(\\d{2})");
// Non-capturing groups (only stores what you need)
Pattern nonCapturing = Pattern.compile("(?:\\d{4})-(\\d{2})-(\\d{2})");
Matcher capMatcher = capturing.matcher(date);
Matcher nonCapMatcher = nonCapturing.matcher(date);
if (capMatcher.matches()) {
System.out.println("\nCapturing groups count: " + capMatcher.groupCount());
}
if (nonCapMatcher.matches()) {
System.out.println("Non-capturing groups count: " + nonCapMatcher.groupCount());
}
}
public static void commonMistakes() {
System.out.println("\n=== Common Mistakes ===");
// 1. Forgetting to escape special characters
String specialText = "1 + 1 = 2";
// WRONG - unescaped + is a quantifier (one or more spaces here), so this compiles but does not match the literal text
try {
Pattern wrong = Pattern.compile("1 + 1");
System.out.println("Unescaped +: " + wrong.matcher(specialText).find());
} catch (Exception e) {
System.out.println("Error with unescaped +: " + e.getMessage());
}
// RIGHT - escape special characters
Pattern right = Pattern.compile("1 \\+ 1");
System.out.println("Escaped +: " + right.matcher(specialText).find());
// 2. Incorrect character class usage
String testText = "price: $25.99";
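// WRONG - a + inside a character class is a literal '+', so only one character is matched
Pattern wrongCharClass = Pattern.compile("[$0-9+]");
// RIGHT - quantifier outside character class
Pattern rightCharClass = Pattern.compile("[$0-9]+");
Matcher wrongMatcher = wrongCharClass.matcher(testText);
Matcher rightMatcher = rightCharClass.matcher(testText);
System.out.println("Wrong char class: " + (wrongMatcher.find() ? wrongMatcher.group() : "no match"));
System.out.println("Right char class: " + (rightMatcher.find() ? rightMatcher.group() : "no match"));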
// 3. Not handling multiline input properly
String multilineText = "Line 1\nLine 2\nLine 3";
// Without MULTILINE flag - ^ and $ match beginning/end of entire input
Pattern noMultiline = Pattern.compile("^Line");
// With MULTILINE flag - ^ and $ match beginning/end of each line
Pattern withMultiline = Pattern.compile("^Line", Pattern.MULTILINE);
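// Count matches: find() alone returns true in both cases because the input starts with "Line"
System.out.println("\nWithout MULTILINE flag matches: " + noMultiline.matcher(multilineText).results().count());
System.out.println("With MULTILINE flag matches: " + withMultiline.matcher(multilineText).results().count());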
}
public static void main(String[] args) {
demonstratePerformance();
demonstrateBestPractices();
commonMistakes();
// Using pre-compiled patterns from map
System.out.println("\n=== Using Pre-compiled Patterns ===");
String testText = "Hello 123 World!";
PRECOMPILED_PATTERNS.forEach((name, pattern) -> {
Matcher matcher = pattern.matcher(testText);
List<String> matches = new ArrayList<>();
while (matcher.find()) {
matches.add(matcher.group());
}
System.out.println(name + ": " + matches);
});
}
}
5. Conclusion
Key Takeaways:
- Pattern Class: Compile regex patterns for reuse
- Matcher Class: Perform matching operations on input
- Groups: Use parentheses for capturing parts of matches
- Quantifiers: Control repetition of patterns
- Character Classes: Define sets of characters to match
Best Practices:
- ✅ Pre-compile frequently used patterns
- ✅ Use specific character classes instead of the dot (.)
- ✅ Use word boundaries (\b) for whole word matching
- ✅ Be careful with greedy quantifiers
- ✅ Test regex patterns thoroughly with various inputs
Common Pitfalls:
- ❌ Forgetting to escape special characters
- ❌ Using greedy quantifiers when reluctant is needed
- ❌ Not considering performance with complex patterns
- ❌ Ignoring multiline input requirements
- ❌ Not handling regex syntax exceptions
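Two of these pitfalls matter most when a pattern or search text comes from outside the program. Because PatternSyntaxException is unchecked, the compiler will not remind you to handle it. The sketch below shows one way to guard against both problems; the class SafeCompileSketch and its helpers tryCompile and containsLiteral are hypothetical names for illustration.

```java
import java.util.Optional;
import java.util.regex.Pattern;
import java.util.regex.PatternSyntaxException;

public class SafeCompileSketch {
    // Compiles a pattern that may come from user input; an empty result means invalid syntax.
    static Optional<Pattern> tryCompile(String regex) {
        try {
            return Optional.of(Pattern.compile(regex));
        } catch (PatternSyntaxException e) {
            // getDescription() and getIndex() report what went wrong and where
            System.err.println("Invalid regex: " + e.getDescription() + " near index " + e.getIndex());
            return Optional.empty();
        }
    }

    // Pattern.quote turns arbitrary text into a regex that matches it literally.
    static boolean containsLiteral(String text, String literal) {
        return Pattern.compile(Pattern.quote(literal)).matcher(text).find();
    }

    public static void main(String[] args) {
        System.out.println(tryCompile("\\d+").isPresent());        // true
        System.out.println(tryCompile("(unclosed").isPresent());   // false
        System.out.println(containsLiteral("1 + 1 = 2", "1 + 1")); // true
    }
}
```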
Performance Tips:
- Pre-compile patterns that are used repeatedly
- Use character classes instead of alternation when possible
- Avoid nested quantifiers in complex patterns
- Use possessive quantifiers (*+, ++, ?+) when backtracking isn't needed
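Possessive quantifiers never give back what they have consumed, which saves backtracking but can also make a pattern fail where the greedy form succeeds. A small sketch of the difference follows; the class PossessiveSketch and the helper matches are hypothetical names for illustration.

```java
import java.util.regex.Pattern;

public class PossessiveSketch {
    // True only if the whole input matches the regex.
    static boolean matches(String regex, String input) {
        return Pattern.compile(regex).matcher(input).matches();
    }

    public static void main(String[] args) {
        String quoted = "\"hello\"";
        // Greedy: .* first consumes the closing quote, then backtracks one character to release it.
        System.out.println(matches("\".*\"", quoted));      // true
        // Possessive: .*+ consumes the closing quote and never gives it back, so the match fails.
        System.out.println(matches("\".*+\"", quoted));     // false
        // Possessive is safe when the repeated class cannot match what follows it.
        System.out.println(matches("\"[^\"]*+\"", quoted)); // true
    }
}
```

As a rule of thumb, a possessive quantifier is a good fit when the repeated element and the element after it cannot match the same character.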
Final Thoughts:
Regular expressions are incredibly powerful for text processing but can become complex. Always:
- Test your patterns with various inputs
- Document complex regex patterns
- Consider readability and maintainability
- Use online regex testers for debugging
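One way to document a complex pattern in Java is the Pattern.COMMENTS flag, which ignores whitespace in the pattern and treats # as the start of a comment. The sketch below applies it to a date pattern similar to the one in Example 3; the class DocumentedPatternSketch and the helper monthOf are hypothetical names for illustration.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class DocumentedPatternSketch {
    // With Pattern.COMMENTS, whitespace is ignored and # starts a comment that runs to end of line.
    private static final Pattern ISO_DATE = Pattern.compile(
            "(?<year>\\d{4})                   # four-digit year\n"
          + "-(?<month>0[1-9]|1[0-2])          # month 01-12\n"
          + "-(?<day>0[1-9]|[12][0-9]|3[01])   # day 01-31 (no per-month check)\n",
            Pattern.COMMENTS);

    // Returns the month part of a YYYY-MM-DD date, or null if the format is wrong.
    static String monthOf(String date) {
        Matcher m = ISO_DATE.matcher(date);
        return m.matches() ? m.group("month") : null;
    }

    public static void main(String[] args) {
        System.out.println(monthOf("2024-01-15")); // 01
        System.out.println(monthOf("2024-13-01")); // null
    }
}
```

With this flag a literal space or # in the pattern has to be escaped, so it suits patterns that do not match whitespace directly.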
Mastering regular expressions will make you much more effective at text processing, validation, and data extraction tasks in Java!