Semgrep Java Rules: Comprehensive Guide for Java Code Analysis

Semgrep is a fast, open-source static analysis tool for finding bugs and enforcing code standards. This guide covers comprehensive Java rule development and integration.


Core Concepts

What is Semgrep?

  • Lightweight static analysis tool
  • Pattern-based code searching
  • Supports multiple languages including Java
  • Easy-to-write custom rules
  • CI/CD integration friendly

Key Features for Java:

  • AST-based pattern matching
  • Custom rule creation
  • Security vulnerability detection
  • Code quality enforcement
  • Best practices validation

Semgrep Rule Structure

1. Basic Rule Anatomy
rules:
  - id: rule-unique-identifier
    patterns:
      - pattern: pattern-to-match
    message: "Description of the issue"
    languages: [java]
    severity: ERROR  # one of ERROR, WARNING, INFO
    metadata:
      category: security  # for example security, correctness, performance
      technology:
        - java
        - spring

2. Rule Metadata
rules:
  - id: java-best-practices
    # A real rule also needs a pattern, patterns, or pattern-either key here.
    message: "Best practice violation"
    languages: [java]
    severity: WARNING
    metadata:
      category: best-practices
      confidence: HIGH
      likelihood: MEDIUM
      impact: LOW
      cwe: ["CWE-117"]  # Common Weakness Enumeration
      owasp: ["A1:2017"]  # OWASP Top 10
      references:
        - "https://example.com/best-practices"

Security Rules

1. SQL Injection Detection
rules:
  - id: java-sql-injection
    patterns:
      # Entries under "patterns" are ANDed, so alternatives go in pattern-either.
      - pattern-either:
          - pattern: $QUERY.execute($INPUT)
          - pattern: $QUERY.executeQuery($INPUT)
          - pattern: $QUERY.executeUpdate($INPUT)
      # Heuristic on the receiver's variable name.
      - metavariable-regex:
          metavariable: $QUERY
          regex: (.*Statement|.*Query)
      # The argument contains a string concatenation.
      - metavariable-regex:
          metavariable: $INPUT
          regex: .*\+.*
    message: "Potential SQL injection vulnerability. Use prepared statements instead of string concatenation."
    languages: [java]
    severity: ERROR
    metadata:
      category: security
      cwe: ["CWE-89"]
      owasp: ["A1:2017"]
      technology: ["java", "jdbc"]
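
To make the target of this rule concrete, here is a minimal sketch of the code shape it is meant to flag next to a parameterized alternative. The class name, table, and column are hypothetical, and the compliant method assumes the caller supplies an open JDBC connection.

```java
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.ResultSet;
import java.sql.SQLException;

final class UserQueries {
    // Non-compliant shape: user input becomes part of the SQL text.
    // Passing this string to Statement.executeQuery is what the rule targets.
    static String unsafeQuery(String userId) {
        return "SELECT * FROM users WHERE id = " + userId;
    }

    // Compliant shape: the SQL text is constant and the value is bound as a parameter.
    static boolean userExists(Connection conn, String userId) throws SQLException {
        try (PreparedStatement stmt = conn.prepareStatement("SELECT 1 FROM users WHERE id = ?")) {
            stmt.setString(1, userId);
            try (ResultSet rs = stmt.executeQuery()) {
                return rs.next();
            }
        }
    }
}
```

With concatenation, an input such as 1 OR 1=1 changes the meaning of the query; with a bound parameter it is treated as a plain value.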

2. Hardcoded Credentials
rules:
  - id: java-hardcoded-credentials
    patterns:
      - pattern: String $VAR = "...";
      # Ignore empty-string initializers.
      - pattern-not: String $VAR = "";
      - metavariable-regex:
          metavariable: $VAR
          regex: (?i).*(password|pwd|secret|key|token).*
    message: "Hardcoded credentials detected. Use secure configuration management."
    languages: [java]
    severity: ERROR
    metadata:
      category: security
      cwe: ["CWE-798"]
      owasp: ["A2:2017"]
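
One possible remediation, sketched under the assumption that secrets are delivered through environment variables: look the value up at runtime and fail fast when it is missing. The class name and the DB_PASSWORD variable are hypothetical.

```java
import java.util.Map;

final class Secrets {
    // Looks up a secret by name and fails fast when it is missing or blank.
    static String require(Map<String, String> env, String name) {
        String value = env.get(name);
        if (value == null || value.trim().isEmpty()) {
            throw new IllegalStateException("Missing required secret: " + name);
        }
        return value;
    }

    public static void main(String[] args) {
        // DB_PASSWORD is a hypothetical variable name used for illustration.
        String dbPassword = require(System.getenv(), "DB_PASSWORD");
        System.out.println("Loaded a secret of length " + dbPassword.length());
    }
}
```

Because dbPassword is initialized from a method call rather than a string literal, the rule above does not flag it.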

3. Insecure Random Number Generation
rules:
  - id: java-insecure-random
    pattern-either:
      - pattern: new Random(...)
      - pattern: new java.util.Random(...)
    message: "Insecure random number generator detected. Use SecureRandom for cryptographic operations."
    languages: [java]
    severity: WARNING
    metadata:
      category: security
      cwe: ["CWE-338"]
      technology: ["java"]
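
As an illustration of the replacement the message suggests, the following sketch generates a hex token with java.security.SecureRandom. The class and method names are hypothetical.

```java
import java.security.SecureRandom;

final class Tokens {
    // One shared instance; SecureRandom is safe to use from multiple threads.
    private static final SecureRandom RANDOM = new SecureRandom();

    // Returns numBytes random bytes encoded as lowercase hex.
    static String newToken(int numBytes) {
        byte[] bytes = new byte[numBytes];
        RANDOM.nextBytes(bytes);
        StringBuilder hex = new StringBuilder(numBytes * 2);
        for (byte b : bytes) {
            hex.append(String.format("%02x", b));
        }
        return hex.toString();
    }
}
```

A call such as Tokens.newToken(16) yields a 32-character token suitable for session identifiers or reset links, where java.util.Random would be predictable.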

4. XXE Injection
rules:
  - id: java-xxe-injection
    patterns:
      - pattern: DocumentBuilderFactory $FACTORY = DocumentBuilderFactory.newInstance();
      # Skip factories that are hardened somewhere in the same method.
      - pattern-not-inside: |
          $RET $METHOD(...) {
            ...
            $FACTORY.setFeature("http://xml.org/sax/features/external-general-entities", false);
            ...
          }
      - pattern-not-inside: |
          $RET $METHOD(...) {
            ...
            $FACTORY.setFeature("http://xml.org/sax/features/external-parameter-entities", false);
            ...
          }
    message: "XML External Entity (XXE) vulnerability. Disable external entities in DocumentBuilderFactory."
    languages: [java]
    severity: ERROR
    metadata:
      category: security
      cwe: ["CWE-611"]
      owasp: ["A4:2017"]
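
A minimal sketch of a factory configured the way this rule expects, using the two feature URIs from the rule. The class name is hypothetical, and wrapping checked exceptions in IllegalArgumentException is only one possible choice.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.Document;
import org.xml.sax.SAXException;

final class SafeXml {
    // Parses XML with external general and parameter entities disabled.
    static Document parse(String xml) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setFeature("http://xml.org/sax/features/external-general-entities", false);
            factory.setFeature("http://xml.org/sax/features/external-parameter-entities", false);
            factory.setExpandEntityReferences(false);
            return factory.newDocumentBuilder()
                    .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
        } catch (ParserConfigurationException | SAXException | IOException e) {
            throw new IllegalArgumentException("Invalid XML", e);
        }
    }
}
```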

5. Path Traversal
rules:
  - id: java-path-traversal
    patterns:
      - pattern-either:
          - pattern: new File($BASE + $USERINPUT)
          - pattern: Files.readAllBytes(Paths.get($BASE, $USERINPUT))
          - pattern: new FileInputStream($BASE + $USERINPUT)
      # Skip arguments that are plain string literals; they are not user input.
      - metavariable-regex:
          metavariable: $USERINPUT
          regex: ^[^"].*
    message: "Potential path traversal vulnerability. Validate and sanitize user input for file operations."
    languages: [java]
    severity: ERROR
    metadata:
      category: security
      cwe: ["CWE-22"]
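
The validation the message asks for could look roughly like the sketch below: resolve the user-supplied name against a base directory, normalize the result, and reject anything that ends up outside the base. The class and method names are hypothetical.

```java
import java.nio.file.Path;

final class SafePaths {
    // Resolves userInput under baseDir and rejects results that escape it.
    static Path resolveInside(Path baseDir, String userInput) {
        Path base = baseDir.toAbsolutePath().normalize();
        Path resolved = base.resolve(userInput).normalize();
        if (!resolved.startsWith(base)) {
            throw new IllegalArgumentException("Path escapes base directory: " + userInput);
        }
        return resolved;
    }
}
```

Normalizing before the startsWith check is what defeats inputs such as ../secret.txt; an absolute input is rejected as well, because resolve returns it unchanged.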

Code Quality Rules

1. Null Pointer Exception Prevention
rules:
  - id: java-potential-npe
    patterns:
      - pattern: $OBJ.$METHOD(...)
      # pattern-not-inside suppresses calls that sit within a null guard.
      - pattern-not-inside: |
          if ($OBJ != null) { ... }
      - pattern-not-inside: $OBJ != null && $REST
      - pattern-not-inside: |
          Objects.requireNonNull($OBJ);
          ...
      - metavariable-regex:
          metavariable: $OBJ
          regex: ^(?!this).*$
    message: "Potential NullPointerException. Add null check before method call."
    languages: [java]
    severity: WARNING
    metadata:
      category: correctness
      technology: ["java"]

2. Resource Leak Detection
rules:
  - id: java-resource-leak
    patterns:
      - pattern: $DECL $RESOURCE = new $TYPE(...);
      # Declared in a try-with-resources header.
      - pattern-not-inside: |
          try ($DECL $RESOURCE = new $TYPE(...)) { ... }
      # Closed explicitly later in the same block.
      - pattern-not-inside: |
          $DECL $RESOURCE = new $TYPE(...);
          ...
          $RESOURCE.close();
      - metavariable-regex:
          metavariable: $TYPE
          regex: (.*InputStream|.*OutputStream|.*Reader|.*Writer|.*Connection|.*Statement|.*ResultSet)
    message: "Resource may not be properly closed. Use try-with-resources or ensure proper cleanup."
    languages: [java]
    severity: WARNING
    metadata:
      category: reliability
      technology: ["java"]
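
For reference, the try-with-resources form that the message recommends might look like this sketch. The class name is hypothetical, and converting IOException to UncheckedIOException is just one way to keep the signature simple.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.Reader;
import java.io.UncheckedIOException;

final class FirstLine {
    // Reads the first line; the reader is closed automatically on every exit path.
    static String read(Reader source) {
        try (BufferedReader reader = new BufferedReader(source)) {
            return reader.readLine();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

Closing the BufferedReader also closes the wrapped Reader, so no explicit close() call or finally block is needed.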

3. String Concatenation in Loops
rules:
  - id: java-string-concat-in-loop
    patterns:
      # The assignment must sit inside a loop body.
      - pattern-either:
          - pattern-inside: |
              for (...) { ... }
          - pattern-inside: |
              while (...) { ... }
      - pattern-either:
          - pattern: $RESULT = $RESULT + $ITEM;
          - pattern: $RESULT += $ITEM;
    message: "String concatenation in loop detected. Use StringBuilder for better performance."
    languages: [java]
    severity: WARNING
    metadata:
      category: performance
      technology: ["java"]

4. Empty Catch Block
rules:
  - id: java-empty-catch-block
    # Semgrep matches code, not comments, so a catch body that holds
    # only a comment is matched as an empty block.
    pattern: |
      try {
        ...
      } catch ($EXCEPTION $E) {
      }
    message: "Empty catch block detected. At least log the exception."
    languages: [java]
    severity: WARNING
    metadata:
      category: correctness
      technology: ["java"]

5. System.exit() Usage
rules:
  - id: java-system-exit
    pattern-either:
      - pattern: System.exit(...)
      - pattern: Runtime.getRuntime().exit(...)
    message: "System.exit() call detected. Avoid terminating the JVM in application code."
    languages: [java]
    severity: ERROR
    metadata:
      category: reliability
      technology: ["java"]

Spring Framework Rules

1. Spring Transaction Management
rules:
  - id: spring-transactional-misuse
    patterns:
      - pattern: |
          @Transactional
          public void $METHOD(...) {
            ...
            $REPO.save(...);
            ...
          }
      # Exclude methods explicitly marked read-only.
      - pattern-not: |
          @Transactional(readOnly = true)
          public void $METHOD(...) { ... }
      - metavariable-regex:
          metavariable: $METHOD
          regex: (.*delete.*|.*remove.*|.*update.*)
    message: "Transactional method performing write operations should have proper transaction configuration."
    languages: [java]
    severity: WARNING
    metadata:
      category: correctness
      technology: ["java", "spring"]

2. Spring Security Configuration
rules:
  - id: spring-security-misconfiguration
    pattern-either:
      - pattern: $HTTP.csrf().disable()
      # Also match csrf().disable() further along a builder chain.
      - pattern: $HTTP. ... .csrf().disable()
    message: "CSRF protection is disabled. Ensure CSRF is properly configured for state-changing operations."
    languages: [java]
    severity: WARNING
    metadata:
      category: security
      cwe: ["CWE-352"]
      owasp: ["A8:2013"]
      technology: ["java", "spring-security"]

3. Spring Bean Injection
rules:
  - id: spring-field-injection
    pattern: |
      @Autowired
      private $TYPE $FIELD;
    message: "Field injection detected. Prefer constructor injection for better testability and immutability."
    languages: [java]
    severity: WARNING
    metadata:
      category: best-practices
      technology: ["java", "spring"]

4. Spring Cache Configuration
rules:
  - id: spring-cache-misuse
    patterns:
      - pattern: |
          @Cacheable
          public List<$TYPE> $METHOD(...) { ... }
      # The method name suggests a write operation.
      - metavariable-regex:
          metavariable: $METHOD
          regex: (.*update.*|.*delete.*|.*save.*|.*create.*)
    message: "Cacheable method performing write operations. Consider using @CacheEvict for write methods."
    languages: [java]
    severity: WARNING
    metadata:
      category: performance
      technology: ["java", "spring"]

Performance Rules

1. Inefficient Collection Usage
rules:
  - id: java-inefficient-collection-init
    # Matches only the no-argument constructor, so new ArrayList<>(size) is not flagged.
    pattern: new ArrayList<>()
    message: "ArrayList created without initial capacity. Specify an initial capacity when the expected size is known."
    languages: [java]
    severity: INFO
    metadata:
      category: performance
      technology: ["java"]

2. Object Allocation in Loops
rules:
  - id: java-object-allocation-in-loop
    patterns:
      - pattern-either:
          - pattern-inside: |
              while (...) { ... }
          - pattern-inside: |
              for (...) { ... }
      - pattern: new $TYPE(...)
      - metavariable-regex:
          metavariable: $TYPE
          regex: (.*DateFormat|.*Random)
    message: "Object allocation inside loop detected. Move object creation outside the loop."
    languages: [java]
    severity: WARNING
    metadata:
      category: performance
      technology: ["java"]

3. Redundant String Operations
rules:
  - id: java-redundant-string-operation
    # Typed metavariable: matches only when the receiver's type is String.
    pattern: (String $STR).toString()
    message: "Redundant toString() call on String object."
    languages: [java]
    severity: INFO
    metadata:
      category: performance
      technology: ["java"]

Testing Rules

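1. Test Quality Rules
rules:
  - id: java-test-assertion-missing
    patterns:
      - pattern: |
          @Test
          public void $TESTNAME(...) { ... }
      # Exclude tests whose body contains a recognized assertion or verification.
      # Extend this list with the assertion styles your project uses.
      - pattern-not: |
          @Test
          public void $TESTNAME(...) { ... Assertions.$CHECK(...); ... }
      - pattern-not: |
          @Test
          public void $TESTNAME(...) { ... Assert.$CHECK(...); ... }
      - pattern-not: |
          @Test
          public void $TESTNAME(...) { ... verify(...).$CALL(...); ... }
      - pattern-not: |
          @Test
          public void $TESTNAME(...) { ... Mockito.verify(...).$CALL(...); ... }
    message: "Test method missing assertions. Tests should verify expected behavior."
    languages: [java]
    severity: WARNING
    metadata:
      category: testing
      technology: ["java", "junit"]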

2. Flaky Test Detection
rules:
  - id: java-flaky-test
    patterns:
      - pattern-inside: |
          @Test
          public void $TESTNAME(...) { ... }
      - pattern-either:
          - pattern: Thread.sleep(...)
          - pattern: System.currentTimeMillis()
    message: "Potential flaky test detected. Avoid sleep and time-based logic in tests."
    languages: [java]
    severity: WARNING
    metadata:
      category: testing
      technology: ["java", "junit"]
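
One common way to remove time-based logic from tests, sketched here as an assumption rather than something the rule enforces, is to pass a java.time.Clock into the code under test so that a test can supply a fixed instant. The class and method names are hypothetical.

```java
import java.time.Clock;
import java.time.Duration;
import java.time.Instant;

final class Expiry {
    // True when the instant reported by the clock is later than createdAt + ttl.
    static boolean isExpired(Instant createdAt, Duration ttl, Clock clock) {
        return clock.instant().isAfter(createdAt.plus(ttl));
    }
}
```

Production code can pass Clock.systemUTC(), while a test passes Clock.fixed(...) and needs neither Thread.sleep nor System.currentTimeMillis.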

3. Test Setup Issues
rules:
  - id: java-test-setup-issues
    patterns:
      - pattern-inside: |
          @Test
          public void $TESTNAME(...) { ... }
      - pattern: new $SERVICE(...)
      - metavariable-regex:
          metavariable: $SERVICE
          regex: (.*Service|.*Repository|.*Controller)
    message: "Direct instantiation of Spring components in tests. Use dependency injection or mocking."
    languages: [java]
    severity: WARNING
    metadata:
      category: testing
      technology: ["java", "spring", "junit"]

Best Practices Rules

1. Logging Best Practices
rules:
  - id: java-logging-best-practices
    patterns:
      # Any logging call whose argument is built with +.
      - pattern: $LOGGER.$LEVEL($LEFT + $RIGHT)
      - metavariable-regex:
          metavariable: $LEVEL
          regex: ^(trace|debug|info|warn|error)$
    message: "String concatenation in log statements. Use parameterized logging for better performance."
    languages: [java]
    severity: INFO
    metadata:
      category: performance
      technology: ["java", "logging"]

2. Exception Handling
rules:
  - id: java-exception-handling
    pattern: |
      try {
        ...
      } catch (Exception $E) {
        throw new RuntimeException($E);
      }
    message: "Generic exception caught and rethrown as RuntimeException. Use specific exception types."
    languages: [java]
    severity: WARNING
    metadata:
      category: correctness
      technology: ["java"]

3. Optional Misuse
rules:
  - id: java-optional-misuse
    patterns:
      - pattern: Optional.of($VALUE)
      - metavariable-regex:
          metavariable: $VALUE
          regex: .*null.*
    message: "Optional.of() called with potentially null value. Use Optional.ofNullable() instead."
    languages: [java]
    severity: ERROR
    metadata:
      category: correctness
      technology: ["java"]
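
The difference matters because Optional.of throws NullPointerException for a null argument, while Optional.ofNullable returns an empty Optional. A small sketch with hypothetical names:

```java
import java.util.Optional;

final class Names {
    // Falls back to "anonymous" when the nickname is null or blank.
    static String displayName(String nickname) {
        return Optional.ofNullable(nickname)
                .map(String::trim)
                .filter(name -> !name.isEmpty())
                .orElse("anonymous");
    }
}
```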

4. Date Time API
rules:
  - id: java-legacy-date-api
    pattern-either:
      - pattern: new Date(...)
      - pattern: new SimpleDateFormat(...)
      - pattern: Calendar.getInstance(...)
    message: "Legacy date-time API detected. Use java.time package for new code."
    languages: [java]
    severity: WARNING
    metadata:
      category: best-practices
      technology: ["java"]
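
As a rough illustration of the java.time replacement, the sketch below parses and formats an ISO date without Date, Calendar, or SimpleDateFormat. The class and method names are hypothetical.

```java
import java.time.LocalDate;
import java.time.format.DateTimeFormatter;

final class Dates {
    // DateTimeFormatter is immutable and thread-safe, unlike SimpleDateFormat.
    private static final DateTimeFormatter ISO = DateTimeFormatter.ISO_LOCAL_DATE;

    // Returns the day after the given ISO date (yyyy-MM-dd).
    static String nextDay(String isoDate) {
        return LocalDate.parse(isoDate, ISO).plusDays(1).format(ISO);
    }
}
```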

Advanced Pattern Matching

1. Metavariable Patterns
rules:
  - id: java-method-call-pattern
    patterns:
      # "..." matches any number of arguments; a single $ARGS matches exactly one.
      - pattern: $OBJ.$METHOD(...)
      - metavariable-regex:
          metavariable: $METHOD
          regex: (save|update|delete|create)
      - metavariable-regex:
          metavariable: $OBJ
          regex: (.*Repository|.*DAO|.*Service)
    message: "Data modification method call detected. Ensure proper transaction boundaries."
    languages: [java]
    severity: INFO

2. Pattern-Either for Multiple Cases
rules:
  - id: java-multiple-exception-types
    pattern-either:
      - pattern: throw new RuntimeException(...)
      - pattern: throw new Exception(...)
      - pattern: throw new Throwable(...)
    message: "Generic exception thrown. Use specific exception types for better error handling."
    languages: [java]
    severity: WARNING

3. Pattern Insides
rules:
  - id: java-resource-try-with-resources
    patterns:
      # Context 1: the variable was declared earlier in the same block.
      - pattern-inside: |
          $TYPE $VAR = ...;
          ...
      # Context 2: the call sits inside a try/finally statement.
      - pattern-inside: |
          try {
            ...
          } finally {
            ...
          }
      - pattern: $VAR.close();
      - metavariable-regex:
          metavariable: $TYPE
          regex: (.*Closeable|.*AutoCloseable)
    message: "Manual resource cleanup detected. Use try-with-resources for automatic cleanup."
    languages: [java]
    severity: INFO
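
4. Focused Metavariables
rules:
  - id: java-focus-metavariable
    patterns:
      # method2() is called after method1() on the same object.
      - pattern-inside: |
          $FOCUS.method1();
          ...
      - pattern: $FOCUS.method2()
      # Report only the object expression instead of the whole call.
      - focus-metavariable: $FOCUS
    message: "Multiple method calls on the same object detected."
    languages: [java]
    severity: INFO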

Rule Configuration Files

1. Semgrep Configuration File
Semgrep does not read a project file that imports rule directories. Pass each rule file or directory with --config, and list the paths to skip in a .semgrepignore file at the repository root.

# Run every rule directory in a single scan
semgrep scan \
  --config rules/security/ \
  --config rules/quality/ \
  --config rules/performance/ \
  --config rules/custom/ \
  .

# .semgrepignore (gitignore-style syntax)
test/
generated/
build/
target/

2. Rule Categories Organization
rules/
├── security/
│   ├── sql-injection.yaml
│   ├── xxe.yaml
│   └── path-traversal.yaml
├── quality/
│   ├── null-safety.yaml
│   ├── resource-management.yaml
│   └── exception-handling.yaml
├── performance/
│   ├── collections.yaml
│   ├── strings.yaml
│   └── objects.yaml
├── spring/
│   ├── security.yaml
│   ├── transactions.yaml
│   └── dependency-injection.yaml
└── custom/
    ├── project-specific.yaml
    └── team-standards.yaml

CI/CD Integration

1. GitHub Actions
# .github/workflows/semgrep.yml
name: Semgrep Security Scan
on:
  push:
    branches: [main, develop]
  pull_request:
    branches: [main]
jobs:
  semgrep:
    name: Semgrep Scan
    runs-on: ubuntu-latest
    # Run the Semgrep CLI from its container image.
    container:
      image: semgrep/semgrep
    steps:
      - name: Checkout code
        uses: actions/checkout@v4
      - name: Semgrep Scan
        run: semgrep scan --config p/java --sarif --output semgrep-results.sarif
      - name: Upload SARIF results
        if: always()
        uses: github/codeql-action/upload-sarif@v3
        with:
          sarif_file: semgrep-results.sarif

2. GitLab CI
# .gitlab-ci.yml
semgrep:
  image: semgrep/semgrep
  script:
    # GitLab has no SARIF report type; emit the GitLab SAST format instead.
    - semgrep scan --config p/java --gitlab-sast --output gl-sast-report.json .
  artifacts:
    reports:
      sast: gl-sast-report.json
  only:
    - merge_requests
    - main
    - develop

3. Jenkins Pipeline
// Jenkinsfile
pipeline {
    agent any
    stages {
        stage('Semgrep Scan') {
            steps {
                // The image's working directory is /src, where the workspace is mounted.
                sh '''
                    docker run --rm -v "${WORKSPACE}:/src" \\
                        semgrep/semgrep:latest \\
                        semgrep scan --config p/java --json > semgrep-report.json
                '''
            }
            post {
                always {
                    archiveArtifacts artifacts: 'semgrep-report.json', fingerprint: true
                }
            }
        }
    }
}

Custom Rule Development

1. Rule Development Workflow
# 1. Create test cases
mkdir -p tests/java rules/security
cat > tests/java/test-case.java << 'EOF'
public class TestCase {
    public void vulnerableMethod(String input) {
        String query = "SELECT * FROM users WHERE id = " + input;
        // This should trigger the rule
    }

    public void safeMethod(String input) {
        String query = "SELECT * FROM users WHERE id = ?";
        // This should not trigger the rule
    }
}
EOF
# 2. Develop rule (the quoted 'EOF' keeps the shell from expanding $QUERY, $SQL, and $INPUT)
cat > rules/security/sql-injection-custom.yaml << 'EOF'
rules:
  - id: custom-sql-injection
    patterns:
      - pattern: String $QUERY = $SQL + $INPUT;
      # $SQL binds to the left operand, which must look like a SELECT statement.
      - metavariable-regex:
          metavariable: $SQL
          regex: (?i).*select.*from.*
    message: "Custom SQL injection rule triggered"
    languages: [java]
    severity: ERROR
EOF
# 3. Test rule
semgrep --config rules/security/sql-injection-custom.yaml tests/java/
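
2. Rule Testing Framework
Semgrep tests a rule against a target file that shares the rule file's base name. Comments in the target file mark lines that must match (ruleid) and lines that must not (ok); semgrep --test compares these annotations with the actual findings.

# rules/security/sql-injection-test.yaml
rules:
  - id: sql-injection-test
    patterns:
      - pattern: String $QUERY = $SQL + $INPUT;
      - metavariable-regex:
          metavariable: $SQL
          regex: (?i).*select.*from.*
    message: "SQL injection detected"
    languages: [java]
    severity: ERROR

// rules/security/sql-injection-test.java
public class SqlInjectionTest {
    void run(String userInput) {
        // ruleid: sql-injection-test
        String query = "SELECT * FROM users WHERE id = " + userInput;
        // ok: sql-injection-test
        String safe = "SELECT * FROM users WHERE id = ?";
    }
}

# Run the annotated tests
semgrep --test rules/security/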

Performance Optimization

1. Optimized Rule Patterns
rules:
  - id: optimized-rule
    pattern-either:
      # More specific pattern first
      - pattern: $CONN.prepareStatement("..." + $INPUT)
      # Broader pattern with constraints
      - patterns:
          - pattern: $CONN.$METHOD($QUERY)
          - metavariable-regex:
              metavariable: $METHOD
              regex: ^(execute|executeQuery|executeUpdate)$
          - metavariable-regex:
              metavariable: $QUERY
              regex: .*\+.*
    message: "Optimized SQL injection detection"
    languages: [java]
    severity: ERROR

2. Excluding False Positives
rules:
  - id: java-sql-injection-refined
    patterns:
      - pattern: $QUERY.execute($INPUT)
      # A string literal argument is safe.
      - pattern-not: $QUERY.execute("...")
      # Skip arguments named like constants (UPPER_SNAKE_CASE).
      - metavariable-regex:
          metavariable: $INPUT
          regex: ^(?![A-Z_]+$).*
    message: "SQL injection with reduced false positives"
    languages: [java]
    severity: ERROR

Best Practices for Rule Writing

  1. Start Specific: Begin with narrow patterns and broaden gradually
  2. Test Thoroughly: Create comprehensive test cases
  3. Minimize False Positives: Use pattern-not to exclude known safe patterns
  4. Provide Clear Messages: Include remediation guidance
  5. Categorize Properly: Use appropriate severity and metadata
  6. Performance Aware: Optimize patterns for faster scanning

# Example of a well-structured rule
rules:
  - id: java-best-practice-example
    patterns:
      - pattern-inside: |
          for (...) { ... }
      - pattern: $RESULT += $ITEM;
    message: |
      Inefficient string concatenation in loop detected.
      Problem: Using string concatenation in loops creates many intermediate string objects.
      Solution: Use StringBuilder for better performance.
      Example:
        // Bad
        String result = "";
        for (String item : items) {
          result += item;
        }
        // Good
        StringBuilder result = new StringBuilder();
        for (String item : items) {
          result.append(item);
        }
    languages: [java]
    severity: WARNING
    metadata:
      category: performance
      technology: ["java"]
      references:
        - "https://stackoverflow.com/questions/1532461/stringbuilder-vs-string-concatenation-in-tostring-in-java"

Conclusion

Semgrep Java rules provide:

  • Comprehensive security scanning for common vulnerabilities
  • Code quality enforcement across the codebase
  • Performance optimization detection
  • Framework-specific patterns for Spring, testing, etc.
  • Custom rule development for project-specific standards

By implementing the patterns and rules shown above, you can create a robust static analysis pipeline that catches issues early, enforces coding standards, and improves overall code quality and security. The combination of security rules, code quality checks, and performance optimizations creates a comprehensive safety net for Java development.
