In modern cloud-native environments, applications need to communicate their status and readiness to handle traffic. Health checks, liveness probes, and readiness probes are critical mechanisms that enable platforms like Kubernetes to manage application lifecycle effectively. This article explores how to implement comprehensive health monitoring in Java applications.
Understanding the Types of Health Checks
- Liveness Probe: Indicates whether the application is running. If it fails, the platform restarts the container.
- Readiness Probe: Shows if the application is ready to receive traffic. If it fails, the platform stops sending requests.
- Health Check: A broader term encompassing overall application health, including dependencies.
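To make the liveness/readiness distinction concrete, here is a minimal stand-alone sketch that uses only the JDK's built-in HTTP server. The class name `ProbeServer`, the `/live` and `/ready` paths, and the `markReady` method are hypothetical choices for illustration; they are not part of Spring Boot or Kubernetes.

```java
import com.sun.net.httpserver.HttpExchange;
import com.sun.net.httpserver.HttpServer;
import java.io.IOException;
import java.io.OutputStream;
import java.net.InetSocketAddress;
import java.nio.charset.StandardCharsets;
import java.util.concurrent.atomic.AtomicBoolean;

/** Hypothetical probe server; names and paths are illustrative only. */
final class ProbeServer {
    private final AtomicBoolean ready = new AtomicBoolean(false);
    private final HttpServer server;

    ProbeServer(int port) throws IOException {
        server = HttpServer.create(new InetSocketAddress(port), 0);
        // Liveness: answers 200 for as long as the process can serve HTTP at all.
        server.createContext("/live", exchange -> respond(exchange, 200, "UP"));
        // Readiness: answers 503 until the application declares itself ready.
        server.createContext("/ready", exchange -> {
            boolean isReady = ready.get();
            respond(exchange, readinessStatus(isReady), isReady ? "UP" : "OUT_OF_SERVICE");
        });
    }

    /** Maps the readiness flag to the HTTP status a probe would see. */
    static int readinessStatus(boolean isReady) {
        return isReady ? 200 : 503;
    }

    void start() { server.start(); }
    void stop() { server.stop(0); }
    void markReady(boolean value) { ready.set(value); }

    private static void respond(HttpExchange exchange, int status, String body) throws IOException {
        byte[] bytes = body.getBytes(StandardCharsets.UTF_8);
        exchange.sendResponseHeaders(status, bytes.length);
        try (OutputStream out = exchange.getResponseBody()) {
            out.write(bytes);
        }
    }
}
```

A caller would construct `new ProbeServer(8081)`, call `start()`, and call `markReady(true)` once initialization finishes. Until then `/ready` returns 503 while `/live` already returns 200, which is the situation in which a platform keeps the container running but withholds traffic.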
Spring Boot Actuator: The Standard Approach
Spring Boot Actuator provides production-ready health monitoring out of the box.
1. Basic Setup
Add Dependencies:
<dependency>
    <groupId>org.springframework.boot</groupId>
    <artifactId>spring-boot-starter-actuator</artifactId>
</dependency>
<!-- For detailed health information -->
<dependency>
    <groupId>org.springframework.boot</groupId>
    <artifactId>spring-boot-starter-data-jpa</artifactId>
</dependency>
Configure in application.yml:
management:
  endpoints:
    web:
      exposure:
        include: health,info,metrics
  endpoint:
    health:
      show-details: when_authorized
      show-components: when_authorized
      probes:
        enabled: true  # Enable liveness and readiness probes
2. Default Health Endpoints
Spring Boot automatically provides:
- Overall Health: GET /actuator/health
- Liveness: GET /actuator/health/liveness
- Readiness: GET /actuator/health/readiness
Example Response:
{
  "status": "UP",
  "components": {
    "diskSpace": {
      "status": "UP",
      "details": {
        "total": 500107862016,
        "free": 350107862016,
        "threshold": 10485760
      }
    },
    "ping": {
      "status": "UP"
    }
  }
}
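The top-level status in a response like this is derived from the component statuses. The sketch below models Actuator's documented default behavior (severity order DOWN, OUT_OF_SERVICE, UP, UNKNOWN, with DOWN and OUT_OF_SERVICE mapped to HTTP 503) in plain Java. The `StatusAggregation` class is a hypothetical stand-in for illustration, not Spring's own implementation.

```java
import java.util.Comparator;
import java.util.List;
import java.util.Map;

/** Illustrative model of status aggregation; not Spring's actual classes. */
final class StatusAggregation {
    // Severity order, most severe first, following Actuator's documented default.
    private static final List<String> ORDER =
            List.of("DOWN", "OUT_OF_SERVICE", "UP", "UNKNOWN");

    /** Returns the most severe known status, or UNKNOWN when there is none. */
    static String aggregate(Map<String, String> componentStatuses) {
        return componentStatuses.values().stream()
                .filter(ORDER::contains)
                .min(Comparator.comparingInt(ORDER::indexOf))
                .orElse("UNKNOWN");
    }

    /** DOWN and OUT_OF_SERVICE map to 503 by default; everything else maps to 200. */
    static int httpStatus(String status) {
        return (status.equals("DOWN") || status.equals("OUT_OF_SERVICE")) ? 503 : 200;
    }
}
```

With the components from the example response, `aggregate(Map.of("diskSpace", "UP", "ping", "UP"))` yields `UP`. A single failing component is enough to turn the overall status to `DOWN` and the HTTP response to 503.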
Custom Health Indicators
1. Database Health Check
@Component
public class DatabaseHealthIndicator implements HealthIndicator {

    private final DataSource dataSource;

    public DatabaseHealthIndicator(DataSource dataSource) {
        this.dataSource = dataSource;
    }

    @Override
    public Health health() {
        // Simple validation query. FROM DUAL is Oracle/MySQL syntax, so adjust it
        // for other databases (for example, plain SELECT 1 on PostgreSQL).
        String sql = "SELECT 1 FROM DUAL";
        try (Connection connection = dataSource.getConnection();
             PreparedStatement statement = connection.prepareStatement(sql)) {
            statement.executeQuery();
            return Health.up()
                .withDetail("database", "Connected successfully")
                .withDetail("validationQuery", sql)
                .build();
        } catch (Exception e) {
            // Any failure to connect or query marks the database as down
            return Health.down()
                .withDetail("database", "Connection failed")
                .withDetail("error", e.getMessage())
                .build();
        }
    }
}
2. External Service Health Check
@Component
public class PaymentServiceHealthIndicator implements HealthIndicator {

    private final PaymentServiceClient paymentServiceClient;

    public PaymentServiceHealthIndicator(PaymentServiceClient paymentServiceClient) {
        this.paymentServiceClient = paymentServiceClient;
    }

    @Override
    public Health health() {
        try {
            boolean isHealthy = paymentServiceClient.healthCheck();
            if (isHealthy) {
                return Health.up()
                    .withDetail("paymentService", "Service is responding")
                    .withDetail("responseTime", "Within acceptable limits")
                    .build();
            } else {
                return Health.down()
                    .withDetail("paymentService", "Service is unhealthy")
                    .build();
            }
        } catch (Exception e) {
            return Health.down(e)
                .withDetail("paymentService", "Service is unreachable")
                .withDetail("error", e.getMessage())
                .build();
        }
    }
}
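A check that calls a remote service can hang if that service hangs, and a probe that never answers is treated as a failure by the platform. One way to bound it, sketched here in plain Java, is to run the check asynchronously and fall back to "unhealthy" on timeout or error. `BoundedCheck` and `runWithTimeout` are hypothetical names, and the supplier stands in for a call such as the payment client's health check.

```java
import java.time.Duration;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.TimeUnit;
import java.util.function.Supplier;

/** Hypothetical helper that bounds how long a health check may take. */
final class BoundedCheck {

    /**
     * Runs the check on another thread and returns false if it throws,
     * returns null, or does not finish within the timeout.
     * The underlying task is not cancelled; it is only no longer waited for.
     */
    static boolean runWithTimeout(Supplier<Boolean> check, Duration timeout) {
        Boolean result = CompletableFuture.supplyAsync(check)
                .completeOnTimeout(false, timeout.toMillis(), TimeUnit.MILLISECONDS)
                .exceptionally(error -> false)
                .join();
        return Boolean.TRUE.equals(result);
    }
}
```

Inside an indicator this might be used as `BoundedCheck.runWithTimeout(paymentServiceClient::healthCheck, Duration.ofSeconds(2))`, with the timeout chosen below the probe's own `timeoutSeconds`. Where the HTTP client supports its own connect and read timeouts, configuring those is usually the simpler option.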
3. Custom Business Logic Health Check
@Component
public class OrderProcessingHealthIndicator implements HealthIndicator {

    private final OrderService orderService;
    private final OrderProcessingConfig config;

    public OrderProcessingHealthIndicator(OrderService orderService,
                                          OrderProcessingConfig config) {
        this.orderService = orderService;
        this.config = config;
    }

    @Override
    public Health health() {
        long pendingOrders = orderService.countPendingOrders();
        long maxAllowedPending = config.getMaxPendingOrders();

        // Report DOWN once the backlog exceeds the configured limit
        if (pendingOrders > maxAllowedPending) {
            return Health.down()
                .withDetail("orderProcessing", "Backlog too high")
                .withDetail("pendingOrders", pendingOrders)
                .withDetail("maxAllowed", maxAllowedPending)
                .build();
        }

        return Health.up()
            .withDetail("orderProcessing", "Processing normally")
            .withDetail("pendingOrders", pendingOrders)
            .build();
    }
}
Kubernetes-Specific Configuration
1. Deployment YAML with Probes
apiVersion: apps/v1
kind: Deployment
metadata:
  name: order-service
spec:
  replicas: 3
  selector:
    matchLabels:
      app: order-service
  template:
    metadata:
      labels:
        app: order-service
    spec:
      containers:
        - name: order-service
          image: mycompany/order-service:1.0.0
          ports:
            - containerPort: 8080
          livenessProbe:
            httpGet:
              path: /actuator/health/liveness
              port: 8080
            initialDelaySeconds: 90  # Wait for app to start
            periodSeconds: 30        # Check every 30 seconds
            timeoutSeconds: 5        # Timeout after 5 seconds
            failureThreshold: 3      # 3 consecutive failures = restart
          readinessProbe:
            httpGet:
              path: /actuator/health/readiness
              port: 8080
            initialDelaySeconds: 30
            periodSeconds: 10
            timeoutSeconds: 3
            successThreshold: 1
            failureThreshold: 3
          startupProbe:
            httpGet:
              path: /actuator/health/readiness
              port: 8080
            initialDelaySeconds: 10
            periodSeconds: 5
            timeoutSeconds: 3
            failureThreshold: 10     # Allow up to 50 seconds for startup
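The probe settings combine into time budgets that are easy to misjudge. As a rough estimate that ignores probe timeouts and scheduling jitter, a probe tolerates about `periodSeconds * failureThreshold` seconds of consecutive failures, and a startup probe additionally waits `initialDelaySeconds` before its first attempt. The helper below is a hypothetical sketch of that arithmetic, applied to the values from the manifest.

```java
/** Hypothetical helper for estimating probe time budgets; an approximation only. */
final class ProbeTiming {

    /** Rough upper bound, in seconds, on how long a startup probe waits before giving up. */
    static int startupBudgetSeconds(int initialDelaySeconds, int periodSeconds, int failureThreshold) {
        return initialDelaySeconds + periodSeconds * failureThreshold;
    }

    /** Rough upper bound, in seconds, on how long failures persist before the platform acts. */
    static int detectionWindowSeconds(int periodSeconds, int failureThreshold) {
        return periodSeconds * failureThreshold;
    }

    public static void main(String[] args) {
        // Startup probe: 10s delay, 5s period, 10 failures allowed
        System.out.println(startupBudgetSeconds(10, 5, 10)); // prints 60
        // Liveness probe: 30s period, 3 failures allowed before a restart
        System.out.println(detectionWindowSeconds(30, 3));   // prints 90
        // Readiness probe: 10s period, 3 failures allowed before traffic stops
        System.out.println(detectionWindowSeconds(10, 3));   // prints 30
    }
}
```

Under these assumptions, a hung container may keep running for up to about 90 seconds before it is restarted, while an unready one stops receiving traffic within about 30 seconds.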
2. Custom Liveness and Readiness Logic
Application Configuration:
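@Configuration
public class HealthProbeConfiguration {

    // Spring Boot registers these indicators itself when probes are enabled;
    // declare them explicitly only if you need to customize them.
    // Both constructors require the ApplicationAvailability bean.

    @Bean
    public LivenessStateHealthIndicator livenessStateHealthIndicator(
            ApplicationAvailability availability) {
        return new LivenessStateHealthIndicator(availability);
    }

    @Bean
    public ReadinessStateHealthIndicator readinessStateHealthIndicator(
            ApplicationAvailability availability) {
        return new ReadinessStateHealthIndicator(availability);
    }
}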
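@Component
public class ApplicationReadinessChecker {

    // SLF4J logger (org.slf4j.Logger, org.slf4j.LoggerFactory)
    private static final Logger log =
        LoggerFactory.getLogger(ApplicationReadinessChecker.class);

    private volatile boolean isReady = false;
    private final List<StartupTask> startupTasks;

    public ApplicationReadinessChecker(List<StartupTask> startupTasks) {
        this.startupTasks = startupTasks;
        initializeApplication();
    }

    private void initializeApplication() {
        // Run initialization in the background so startup is not blocked
        CompletableFuture.runAsync(() -> {
            try {
                // Perform startup tasks
                for (StartupTask task : startupTasks) {
                    task.execute();
                }
                // Warm up caches
                warmUpCaches();
                // Verify critical dependencies
                verifyDependencies();
                isReady = true;
                log.info("Application is ready to handle traffic");
            } catch (Exception e) {
                // isReady stays false, so readiness keeps failing
                log.error("Application failed to initialize", e);
            }
        });
    }

    private void warmUpCaches() {
        // Placeholder: preload application caches here
    }

    private void verifyDependencies() {
        // Placeholder: throw if a critical dependency is unreachable
    }

    public boolean isReady() {
        return isReady;
    }
}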
@Component
public class CustomReadinessHealthIndicator implements HealthIndicator {

    private final ApplicationReadinessChecker readinessChecker;
    private final DataSource dataSource;

    public CustomReadinessHealthIndicator(ApplicationReadinessChecker readinessChecker,
                                          DataSource dataSource) {
        this.readinessChecker = readinessChecker;
        this.dataSource = dataSource;
    }

    @Override
    public Health health() {
        if (!readinessChecker.isReady()) {
            return Health.outOfService()
                .withDetail("reason", "Application initializing")
                .build();
        }

        // Check if we can serve traffic (database connection, etc.)
        try (Connection conn = dataSource.getConnection()) {
            return Health.up()
                .withDetail("status", "Ready to serve traffic")
                .build();
        } catch (Exception e) {
            return Health.outOfService()
                .withDetail("reason", "Database unavailable")
                .withDetail("error", e.getMessage())
                .build();
        }
    }
}
Advanced Health Monitoring Patterns
1. Circuit Breaker Health Integration
@Component
public class CircuitBreakerHealthIndicator implements HealthIndicator {

    private final CircuitBreakerRegistry circuitBreakerRegistry;

    public CircuitBreakerHealthIndicator(CircuitBreakerRegistry circuitBreakerRegistry) {
        this.circuitBreakerRegistry = circuitBreakerRegistry;
    }

    @Override
    public Health health() {
        Map<String, Object> details = new HashMap<>();
        boolean allHealthy = true;

        for (CircuitBreaker circuitBreaker : circuitBreakerRegistry.getAllCircuitBreakers()) {
            CircuitBreaker.State state = circuitBreaker.getState();
            CircuitBreaker.Metrics metrics = circuitBreaker.getMetrics();

            details.put(circuitBreaker.getName() + ".state", state);
            details.put(circuitBreaker.getName() + ".failureRate",
                metrics.getFailureRate());

            if (state == CircuitBreaker.State.OPEN) {
                allHealthy = false;
            }
        }

        return allHealthy ?
            Health.up().withDetails(details).build() :
            Health.down().withDetails(details).build();
    }
}
2. Performance-Based Health Checks
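@Component
public class PerformanceHealthIndicator implements HealthIndicator {

    // Also needs imports for io.micrometer.core.instrument.Timer,
    // java.util.Arrays and java.util.concurrent.TimeUnit.
    private static final String REQUESTS = "http.server.requests";

    private final MeterRegistry meterRegistry;
    private final PerformanceConfig config;

    public PerformanceHealthIndicator(MeterRegistry meterRegistry,
                                      PerformanceConfig config) {
        this.meterRegistry = meterRegistry;
        this.config = config;
    }

    @Override
    public Health health() {
        double errorRate = getErrorRate();
        double p99Latency = getP99Latency();

        Health.Builder healthBuilder = Health.up();
        boolean isHealthy = true;

        if (errorRate > config.getMaxErrorRate()) {
            healthBuilder.withDetail("errorRate", "Above threshold: " + errorRate);
            isHealthy = false;
        }
        if (p99Latency > config.getMaxLatencyMs()) {
            healthBuilder.withDetail("latency", "Above threshold: " + p99Latency + "ms");
            isHealthy = false;
        }

        healthBuilder
            .withDetail("currentErrorRate", errorRate)
            .withDetail("currentP99Latency", p99Latency + "ms");

        // Flip the status on the same builder so the details are kept
        return isHealthy ? healthBuilder.build() : healthBuilder.down().build();
    }

    private double getErrorRate() {
        // http.server.requests is recorded as a Timer, so read it through timers()
        // rather than counter(). Counts are cumulative since startup, and the
        // result is a fraction between 0 and 1.
        long total = meterRegistry.find(REQUESTS).timers().stream()
            .mapToLong(Timer::count).sum();
        long errors = meterRegistry.find(REQUESTS).tag("outcome", "SERVER_ERROR").timers().stream()
            .mapToLong(Timer::count).sum();
        return total == 0 ? 0.0 : (double) errors / total;
    }

    private double getP99Latency() {
        // Highest published p99 across all request timers, in milliseconds.
        // Returns 0 unless the 0.99 percentile is configured for this meter.
        return meterRegistry.find(REQUESTS).timers().stream()
            .flatMap(timer -> Arrays.stream(timer.takeSnapshot().percentileValues()))
            .filter(value -> value.percentile() == 0.99)
            .mapToDouble(value -> value.value(TimeUnit.MILLISECONDS))
            .max()
            .orElse(0.0);
    }
}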
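Request counters are cumulative, so a rate computed from lifetime totals reacts slowly to a recent burst of errors. A common alternative, sketched here in plain Java, is to compare successive readings and compute the rate over the interval between them. `WindowedErrorRate` is a hypothetical helper, and the caller is assumed to supply the cumulative totals, for example from the metrics registry.

```java
/** Hypothetical helper: error rate between two successive cumulative readings. */
final class WindowedErrorRate {
    private long lastTotal;
    private long lastErrors;

    /**
     * Accepts the current cumulative request and error counts and returns the
     * error rate (0.0 to 1.0) for the requests seen since the previous call.
     * Returns 0.0 when no new requests arrived.
     */
    synchronized double sample(long totalSoFar, long errorsSoFar) {
        long deltaTotal = totalSoFar - lastTotal;
        long deltaErrors = errorsSoFar - lastErrors;
        lastTotal = totalSoFar;
        lastErrors = errorsSoFar;
        return deltaTotal <= 0 ? 0.0 : (double) deltaErrors / deltaTotal;
    }
}
```

For example, readings of (100 requests, 5 errors) followed by (200 requests, 55 errors) give 0.05 for the first interval and 0.5 for the second, whereas the lifetime rate after the second reading would be only 0.275.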
Best Practices
- Keep Liveness Checks Lightweight: They should not depend on external services
- Make Readiness Checks Comprehensive: Include all critical dependencies
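- Set Appropriate Timeouts: Avoid overly aggressive timeouts, which cause restarts or traffic removal during brief slowdowns
- Use Startup Probes for Slow-Starting Applications: They hold off liveness and readiness checks until startup completes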
- Monitor Your Health Checks: Track health check failures and trends
- Secure Health Endpoints: Consider authentication for sensitive health information
- Provide Meaningful Details: Include relevant metrics and status information
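One way to keep frequently probed checks lightweight is to cache the result of an expensive check for a short time, so that repeated probe requests do not each hit the database or a remote service. The following is a minimal plain-Java sketch. `CachedCheck` is a hypothetical class, and the injected clock exists only to make the behavior easy to test.

```java
import java.time.Duration;
import java.util.function.LongSupplier;
import java.util.function.Supplier;

/** Hypothetical wrapper that reuses a check result until its time-to-live expires. */
final class CachedCheck<T> implements Supplier<T> {
    private final Supplier<T> delegate;
    private final long ttlNanos;
    private final LongSupplier nanoClock;
    private T cached;
    private long expiresAt;
    private boolean hasValue;

    CachedCheck(Supplier<T> delegate, Duration ttl, LongSupplier nanoClock) {
        this.delegate = delegate;
        this.ttlNanos = ttl.toNanos();
        this.nanoClock = nanoClock;
    }

    @Override
    public synchronized T get() {
        long now = nanoClock.getAsLong();
        // Re-run the real check only when there is no value yet or it has expired
        if (!hasValue || now - expiresAt >= 0) {
            cached = delegate.get();
            expiresAt = now + ttlNanos;
            hasValue = true;
        }
        return cached;
    }
}
```

In production code the clock would typically be `System::nanoTime`, for example `new CachedCheck<>(() -> expensiveCheck(), Duration.ofSeconds(5), System::nanoTime)`, where `expensiveCheck` stands for whatever the indicator actually does. The time-to-live should stay well below the probe period multiplied by the failure threshold, or stale results will delay failure detection.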
Conclusion
Implementing robust health checks, liveness, and readiness probes is essential for running Java applications in cloud environments. Spring Boot Actuator provides a solid foundation, while custom health indicators allow you to monitor business-specific metrics. By properly configuring these probes in Kubernetes, you enable the platform to automatically handle application failures, rolling deployments, and traffic management, leading to more resilient and self-healing systems.