When a Java application crashes with a segmentation fault or fatal error, it can be one of the most challenging scenarios for developers. The JVM itself—typically a rock-solid foundation—has encountered something it cannot handle. In these situations, traditional Java debugging tools are insufficient, and you need to dive deeper with native debugging tools like gdb (GNU Debugger).
This article explores how to analyze JVM crashes using gdb, from capturing crash dumps to interpreting core files and extracting meaningful information.
When Does the JVM Crash?
The JVM is a complex native application written in C/C++. It can crash due to:
- Native Memory Corruption: Bugs in JNI code or native libraries
- JVM Bugs: Rare defects in the JVM itself (regressions occasionally surface in freshly released versions)
- System Resource Exhaustion: Running out of memory, file descriptors, etc.
- Hardware Issues: Faulty memory, CPU problems, or disk errors
- Operating System Bugs: Kernel-level issues affecting the JVM
Common symptoms include:
- Segmentation fault (SIGSEGV)
- Bus error (SIGBUS)
- Fatal error logs with hs_err_pid<pid>.log files
- Abrupt process termination without Java stack traces
Prerequisites for JVM Crash Analysis
Essential Tools:
# On Ubuntu/Debian
sudo apt-get install gdb openjdk-17-dbg

# On RHEL/CentOS
sudo yum install gdb java-17-openjdk-debuginfo

# On Amazon Linux 2023
sudo dnf install gdb java-17-openjdk-debuginfo
Key Components:
- gdb: The GNU Debugger for analyzing core dumps and live processes
- Debug Symbols: JVM debug packages (openjdk-XX-dbg or java-XX-openjdk-debuginfo)
- Core Dump Configuration: Proper system setup for core dump generation
Configuring the System for Crash Analysis
Enable Core Dumps:
# Check current limits
ulimit -a

# Enable unlimited core dumps (current session)
ulimit -c unlimited

# Permanent configuration
echo "ulimit -c unlimited" >> ~/.bashrc
echo "/tmp/core.%e.%p" | sudo tee /proc/sys/kernel/core_pattern

# For systemd services, add to the service file:
# [Service]
# LimitCORE=infinity
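For JVMs run as systemd services, the shell-level ulimit has no effect; the limit belongs in the unit itself. A minimal drop-in sketch (the myapp service name and drop-in path are illustrative):

```ini
# /etc/systemd/system/myapp.service.d/coredump.conf
[Service]
LimitCORE=infinity
```

After adding the drop-in, run `sudo systemctl daemon-reload` and restart the service. On distributions that route crashes through systemd-coredump, past dumps can then be listed with `coredumpctl list`.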
JVM Crash Dump Options:
# Crash the JVM (and thus produce a core dump) on OutOfMemoryError
java -XX:+CrashOnOutOfMemoryError -jar app.jar

# Generate a core dump on any fatal error (enabled by default)
java -XX:+CreateCoredumpOnCrash -jar app.jar

# Explicit fatal error log location
java -XX:ErrorFile=/var/log/hs_err_pid%p.log -jar app.jar
Basic gdb Commands for JVM Analysis
Starting gdb with a Core Dump:
gdb /usr/bin/java core.1234
# or
gdb --core=core.1234 /usr/bin/java
Essential gdb Commands:
(gdb) bt                     # Backtrace - most important first command
(gdb) bt full                # Detailed backtrace with local variables
(gdb) info threads           # List all threads
(gdb) thread apply all bt    # Backtrace for all threads
(gdb) info registers         # Show CPU registers
(gdb) x/10i $pc              # Disassemble instructions at program counter
(gdb) print expr             # Print variable value
(gdb) where                  # Current stack trace (alias for bt)
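These commands can also be collected into a gdb command file and run non-interactively, which is handy for repeatable triage. A sketch (the jvm_crash.gdb file name is arbitrary):

```gdb
# jvm_crash.gdb -- run with: gdb -x jvm_crash.gdb /usr/bin/java core.1234
set pagination off
info threads
thread apply all bt
quit
```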
Step-by-Step Crash Analysis Workflow
Scenario 1: Live Process Crashed with Core Dump
# 1. Find the core dump
find / -name "core.*" -o -name "java.core.*" 2>/dev/null

# 2. Load core dump with debug symbols
gdb /usr/lib/jvm/java-17-openjdk/bin/java core.java.1234

# 3. Get comprehensive thread information
(gdb) info threads
(gdb) thread apply all bt full

# 4. Focus on the crashing thread
(gdb) thread 1
(gdb) bt full
Scenario 2: Analyzing a Running JVM
# Attach to running JVM process
sudo gdb -p 1234

# Get thread information
(gdb) info threads
(gdb) thread apply all bt

# Detach without killing the process
(gdb) detach
(gdb) quit
Interpreting JVM Crash Signatures
Common Crash Patterns:
1. SIGSEGV in Native Code:
Program received signal SIGSEGV, Segmentation fault.
[Switching to Thread 0x7f8a5b7fe700 (LWP 12345)]
0x00007f8a4a3b2150 in SomeNativeFunction () from /path/to/libnative.so
(gdb) bt
#0  0x00007f8a4a3b2150 in SomeNativeFunction () from /path/to/libnative.so
#1  0x00007f8a5a1c3e20 in Java_com_example_NativeClass_nativeMethod ()
#2  0x00007f8a6b2a1c40 in ?? ()
2. JVM Internal Crash:
Program received signal SIGILL, Illegal instruction.
0x00007f8a6a5c3d10 in VM_Version::get_processor_features() ()
(gdb) bt
#0  0x00007f8a6a5c3d10 in VM_Version::get_processor_features() ()
#1  0x00007f8a6a5c1a20 in VM_Version::initialize() ()
Advanced JVM-Specific gdb Commands
JVM Debug Symbols Commands:
# Ensure debug symbols are loaded
(gdb) info sharedlibrary
(gdb) set debug-file-directory /usr/lib/debug

# JVM-specific debugging (requires debug symbols and a selected frame
# where these variables are in scope)
(gdb) p *thread
(gdb) p *this
Examining JVM Memory:
(gdb) info proc mappings       # Show memory map
(gdb) x/100x 0x7f8a5a000000    # Examine memory at address
(gdb) x/10s 0x7f8a5a123456     # Examine as strings
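For a live process, the same mapping information gdb shows via `info proc mappings` is available directly from procfs, which is useful for correlating a crash address with the library mapped at that range (substitute the JVM's pid for `$$`, which here is just the current shell):

```shell
# Equivalent view from procfs for a live process (here: the current shell)
head -5 /proc/$$/maps
```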
Real-World Crash Analysis Examples
Example 1: JNI Code Crash
# Core dump shows crash in JNI code
gdb /usr/bin/java core.1234
(gdb) bt
#0  0x00007f345a2b1150 in process_buffer (env=0x7f3444007890, obj=0x7f3444012ab0,
    buffer=0x0, len=1024) at jni_native.c:45
#1  0x00007f345a2b12a0 in Java_com_myapp_NativeProcessor_process (env=0x7f3444007890,
    obj=0x7f3444012ab0, buffer=0x0, len=1024) at jni_native.c:89
(gdb) frame 0
(gdb) print buffer
$1 = (unsigned char *) 0x0    # NULL pointer dereference!
Analysis: The JNI code is trying to use a NULL buffer pointer, causing SIGSEGV.
Example 2: Heap Corruption
(gdb) bt
#0  0x00007f8e1a4c9d50 in G1ParScanThreadState::copy_to_survivor_space(oopDesc*, markWord, oopDesc*) ()
#1  0x00007f8e1a4c8b20 in G1ParScanThreadState::trim_queue() ()
#2  0x00007f8e1a4c7e10 in G1ParScanThreadState::steal() ()
Analysis: Crash during garbage collection, possibly due to heap corruption from native code.
Using gdb with hs_err_pid Files
The JVM generates hs_err_pid<pid>.log files containing valuable information:
# Extract key information from the hs_err file
grep -B 5 -A 10 "Problematic frame" hs_err_pid12345.log
grep "Current thread" hs_err_pid12345.log
grep -A 20 "Stack:" hs_err_pid12345.log
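To see what these greps operate on, here is a minimal synthetic hs_err header (a real file contains many more sections, and the addresses here are made up) together with a one-liner that pulls out the faulting program counter:

```shell
# Create a tiny synthetic hs_err file for demonstration purposes
cat > /tmp/hs_err_pid12345.log <<'EOF'
# A fatal error has been detected by the Java Runtime Environment:
#
#  SIGSEGV (0xb) at pc=0x00007f8a4a3b2150, pid=12345, tid=12401
#
# Problematic frame:
# C  [libnative.so+0x2150]  SomeNativeFunction+0x10
EOF

# Extract the faulting program counter from the header line
grep -o 'pc=0x[0-9a-f]*' /tmp/hs_err_pid12345.log | cut -d= -f2
# -> 0x00007f8a4a3b2150
```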
Correlate with gdb:
# Find the crashing program counter from the hs_err file; the header
# contains a line like "... at pc=0x00007f8a4a3b2150, pid=12345 ..."
CRASH_ADDR=$(grep -o 'pc=0x[0-9a-f]*' hs_err_pid12345.log | head -1 | cut -d= -f2)

# Examine that address in gdb; expand the shell variable on the command
# line, since gdb itself cannot see shell variables
gdb --core=core.12345 /usr/bin/java -ex "x/10i $CRASH_ADDR"
Automated Crash Analysis Script
Create a script for consistent crash analysis:
#!/bin/bash
# analyze_crash.sh
CORE_DUMP="$1"
PID=$(echo "$CORE_DUMP" | grep -o '[0-9]\+' | head -1)
echo "=== JVM Crash Analysis Report ==="
echo "Core dump: $CORE_DUMP"
echo "PID: $PID"
echo
# Check for hs_err file
HS_ERR_FILE="hs_err_pid${PID}.log"
if [ -f "$HS_ERR_FILE" ]; then
echo "Found hs_err file: $HS_ERR_FILE"
grep "Problematic frame" "$HS_ERR_FILE"
echo
fi
# Load core dump in gdb and extract information
gdb -batch -ex "thread apply all bt full" -ex "quit" \
/usr/bin/java "$CORE_DUMP" 2>/dev/null | \
head -100
echo "=== End of Report ==="
Usage:
chmod +x analyze_crash.sh
./analyze_crash.sh core.1234
Best Practices for JVM Crash Analysis
- Always Install Debug Symbols:
# Match JDK version with debug symbols
java -version
sudo apt-get install openjdk-17-dbg
- Configure Core Dumps Proactively:
# Add to JVM startup options
-XX:+CrashOnOutOfMemoryError
-XX:ErrorFile=/var/log/java/hs_err_pid%p.log
-XX:OnError="gdb -batch -ex 'thread apply all bt' -ex 'quit' /usr/bin/java %p"
- Preserve Evidence:
# Archive all crash artifacts
tar czf crash_analysis_$(date +%Y%m%d_%H%M%S).tar.gz \
    core.* hs_err_pid*.log /path/to/app.jar
- Reproduce in Development:
- Use the same JDK version and build
- Same system configuration
- Same application version and data
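A quick way to snapshot that configuration alongside the crash artifacts, so the failing environment can be matched later (the crash_environment.txt file name is arbitrary):

```shell
# Record the JDK, kernel, and OS release next to the crash artifacts
{
  echo "=== java -version ==="
  java -version 2>&1
  echo "=== uname ==="
  uname -srvm
  echo "=== os-release ==="
  head -2 /etc/os-release 2>/dev/null
} > crash_environment.txt
```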
Common Solutions to JVM Crashes
JNI-Related Crashes:
- Validate all native method parameters
- Use JNI_ABORT for read-only buffers
- Check for memory leaks in native code
- Verify pointer validity before dereferencing
Memory-Related Crashes:
- Monitor native memory usage with NMT
- Use -XX:MaxDirectMemorySize to limit direct buffers
- Check for native memory leaks in third-party libraries
Garbage Collection Crashes:
- Try different GC algorithms (e.g., -XX:+UseG1GC)
- Reduce heap size if experiencing memory fragmentation
- Update to latest JVM patch release
Conclusion
JVM crash analysis with gdb is a critical skill for Java developers and operators dealing with complex applications, especially those using JNI, native libraries, or running under heavy load. By mastering these techniques, you can:
- Quickly Identify Root Causes: From core dumps and crash logs
- Reduce Mean Time to Resolution (MTTR): With systematic analysis approaches
- Improve Application Stability: By identifying and fixing underlying issues
- Communicate Effectively: Provide detailed crash reports to library vendors or JVM teams
Remember that while gdb provides low-level insights, the best solution is often to prevent crashes through proper coding practices, comprehensive testing, and proactive monitoring of both Java and native components.