Comprehensive Guide to Monitoring Java Applications in Kubernetes

Introduction

Running Java applications in Kubernetes introduces new complexities for monitoring. Traditional VM-based monitoring approaches fall short in dynamic containerized environments where pods are ephemeral and resources are shared. This guide covers the essential patterns, tools, and best practices for effectively monitoring your Java applications in Kubernetes, ensuring you can maintain performance, debug issues, and meet SLOs in production.


Mastering Java Application Monitoring in Kubernetes Environments

Monitoring Java applications in Kubernetes requires a multi-layered approach that combines application-level metrics, JVM insights, container resources, and Kubernetes cluster state. Let's explore the complete monitoring stack.

The Four Pillars of Kubernetes Monitoring for Java

  1. Application Metrics - Business logic, custom metrics
  2. JVM Metrics - Memory, GC, threads, classloading
  3. Container Metrics - CPU, memory, disk I/O
  4. Kubernetes Metrics - Pod status, replicas, services

1. Instrumenting Your Java Application

Using Micrometer for Application Metrics

Micrometer is the de facto standard for Java application metrics in Kubernetes. It provides a vendor-neutral interface that can export to various monitoring systems.

Maven Dependencies:

<dependency>
    <groupId>io.micrometer</groupId>
    <artifactId>micrometer-core</artifactId>
    <version>1.11.5</version>
</dependency>
<dependency>
    <groupId>io.micrometer</groupId>
    <artifactId>micrometer-registry-prometheus</artifactId>
    <version>1.11.5</version>
</dependency>

Spring Boot Configuration:

# application.yaml
management:
  endpoints:
    web:
      exposure:
        include: health,info,metrics,prometheus
  endpoint:
    health:
      show-details: always
      probes:
        enabled: true   # exposes /actuator/health/liveness and /readiness
    metrics:
      enabled: true
    prometheus:
      enabled: true
  metrics:
    export:
      prometheus:
        enabled: true
    distribution:
      percentiles-histogram:
        http.server.requests: true
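To see why percentiles-histogram matters: with it enabled, Micrometer exports cumulative histogram buckets, and Prometheus computes quantiles server-side from the _bucket series via histogram_quantile(). The sketch below shows the bucket interpolation that function performs; the bucket bounds and counts are made-up illustration values, not real exporter output.

```java
public class HistogramQuantile {

    // Estimates a quantile from cumulative histogram buckets, mirroring the
    // linear interpolation PromQL's histogram_quantile() applies to _bucket series.
    // upperBounds are bucket upper bounds (e.g. seconds); cumCounts are cumulative counts.
    static double quantile(double q, double[] upperBounds, long[] cumCounts) {
        long total = cumCounts[cumCounts.length - 1];
        double rank = q * total;
        double prevBound = 0.0;
        double prevCount = 0.0;
        for (int i = 0; i < upperBounds.length; i++) {
            if (cumCounts[i] >= rank) {
                double bucketCount = cumCounts[i] - prevCount;
                if (bucketCount == 0) {
                    return upperBounds[i];
                }
                // Interpolate linearly within the bucket that contains the rank
                return prevBound + (upperBounds[i] - prevBound) * (rank - prevCount) / bucketCount;
            }
            prevBound = upperBounds[i];
            prevCount = cumCounts[i];
        }
        return upperBounds[upperBounds.length - 1];
    }
}
```

Without the histogram buckets, only client-side precomputed percentiles are available, and those cannot be aggregated across pods.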

Custom Business Metrics:

import io.micrometer.core.annotation.Timed;
import io.micrometer.core.instrument.Counter;
import io.micrometer.core.instrument.Gauge;
import io.micrometer.core.instrument.MeterRegistry;
import io.micrometer.core.instrument.Timer;
import org.springframework.stereotype.Service;

import java.util.concurrent.atomic.AtomicInteger;

@Service
public class OrderService {

    private final Counter orderCreationCounter;
    private final Timer orderProcessingTimer;
    private final AtomicInteger activeOrders = new AtomicInteger(0);

    public OrderService(MeterRegistry meterRegistry) {
        // Counter for tracking order creation
        this.orderCreationCounter = Counter.builder("orders.created")
                .description("Total number of orders created")
                .tag("service", "order-service")
                .register(meterRegistry);
        // Timer for tracking order processing duration
        this.orderProcessingTimer = Timer.builder("orders.processing.time")
                .description("Time taken to process orders")
                .register(meterRegistry);
        // Gauge for active orders; the registry samples the AtomicInteger on each scrape
        Gauge.builder("orders.active", activeOrders, AtomicInteger::get)
                .description("Number of active orders being processed")
                .register(meterRegistry);
    }

    public Order createOrder(OrderRequest request) {
        activeOrders.incrementAndGet();
        return orderProcessingTimer.record(() -> {
            try {
                // Business logic here
                Order order = processOrder(request);
                orderCreationCounter.increment();
                return order;
            } finally {
                activeOrders.decrementAndGet();
            }
        });
    }

    @Timed(value = "orders.special.process", description = "Time to process special orders")
    public Order processSpecialOrder(OrderRequest request) {
        // Timed automatically by the @Timed annotation (requires a TimedAspect bean)
        return processOrder(request);
    }
}

2. JVM Monitoring Essentials

JVM Micrometer Configuration:

import io.micrometer.core.instrument.MeterRegistry;
import org.springframework.boot.actuate.autoconfigure.metrics.MeterRegistryCustomizer;
import org.springframework.context.annotation.Bean;
import org.springframework.context.annotation.Configuration;

@Configuration
public class JvmMonitoringConfig {

    @Bean
    MeterRegistryCustomizer<MeterRegistry> metricsCommonTags() {
        return registry -> registry.config().commonTags(
                "application", "order-service",
                "namespace", System.getenv().getOrDefault("NAMESPACE", "default"),
                "pod", System.getenv().getOrDefault("HOSTNAME", "unknown"));
    }
}
// JVM metrics (memory, GC, threads, class loading) are exposed automatically
// when Spring Boot Actuator is on the classpath; no extra binders are needed.

Critical JVM Metrics to Monitor:

  • jvm_memory_used_bytes - Heap and non-heap memory usage
  • jvm_gc_pause_seconds_sum / _count - GC pause duration and frequency
  • jvm_threads_live_threads - Live thread count
  • jvm_classes_loaded_classes - Loaded class count
  • process_cpu_usage - Process CPU utilization
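Under the hood these numbers come from the JVM's standard management interfaces; Micrometer's JVM binders read the same MXBeans. A stdlib-only sketch of where each metric family originates (no Micrometer dependency, illustrative method names):

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;

public class JvmMetricsProbe {

    // Current heap usage in bytes (what backs jvm_memory_used_bytes{area="heap"})
    static long heapUsedBytes() {
        return ManagementFactory.getMemoryMXBean().getHeapMemoryUsage().getUsed();
    }

    // Live thread count (what backs jvm_threads_live_threads)
    static int liveThreads() {
        return ManagementFactory.getThreadMXBean().getThreadCount();
    }

    // Accumulated GC pause time in ms across all collectors
    // (what jvm_gc_pause_seconds aggregates per collector and cause)
    static long totalGcTimeMillis() {
        long total = 0;
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            total += Math.max(0, gc.getCollectionTime());
        }
        return total;
    }

    public static void main(String[] args) {
        System.out.println("heap.used=" + heapUsedBytes());
        System.out.println("threads.live=" + liveThreads());
        System.out.println("gc.time.ms=" + totalGcTimeMillis());
    }
}
```

Knowing the MXBean behind each metric is useful when a dashboard value looks suspicious: the same data is visible via jcmd or JMX, so the two sources can be cross-checked.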

3. Kubernetes Deployment with Monitoring

Deployment with Proper Labels and Probes:

apiVersion: apps/v1
kind: Deployment
metadata:
  name: order-service
  labels:
    app: order-service
    version: v1
spec:
  replicas: 3
  selector:
    matchLabels:
      app: order-service
  template:
    metadata:
      labels:
        app: order-service
        version: v1
      annotations:
        prometheus.io/scrape: "true"
        prometheus.io/port: "8080"
        prometheus.io/path: "/actuator/prometheus"
    spec:
      containers:
        - name: order-service
          image: my-registry/order-service:1.0.0
          ports:
            - containerPort: 8080
          env:
            - name: JAVA_OPTS
              # -Xmx takes precedence over -XX:MaxRAM, so set the heap explicitly
              value: "-XX:+UseG1GC -Xms512m -Xmx512m"
            - name: MANAGEMENT_ENDPOINTS_WEB_EXPOSURE_INCLUDE
              value: "health,metrics,prometheus,info"
          resources:
            requests:
              memory: "768Mi"
              cpu: "250m"
            limits:
              memory: "1024Mi"
              cpu: "500m"
          livenessProbe:
            httpGet:
              path: /actuator/health/liveness
              port: 8080
            initialDelaySeconds: 60
            periodSeconds: 10
            timeoutSeconds: 5
          readinessProbe:
            httpGet:
              path: /actuator/health/readiness
              port: 8080
            initialDelaySeconds: 30
            periodSeconds: 5
            timeoutSeconds: 3
          startupProbe:
            httpGet:
              path: /actuator/health/readiness
              port: 8080
            failureThreshold: 30
            periodSeconds: 10
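Outside of Spring Boot, the endpoint behind the prometheus.io/scrape annotations is just an HTTP handler serving the Prometheus text exposition format. A minimal JDK-only sketch (hypothetical metric name; a real service would use the Micrometer registry instead of hand-building the payload):

```java
import com.sun.net.httpserver.HttpServer;
import java.io.OutputStream;
import java.net.InetSocketAddress;
import java.nio.charset.StandardCharsets;
import java.util.concurrent.atomic.AtomicLong;

public class MetricsEndpoint {

    static final AtomicLong requestsTotal = new AtomicLong();

    // Renders one counter in the Prometheus text exposition format (version 0.0.4)
    static String scrapePayload() {
        return "# HELP app_requests_total Total requests handled\n"
             + "# TYPE app_requests_total counter\n"
             + "app_requests_total " + requestsTotal.get() + "\n";
    }

    // Serves the payload on /metrics, the path a Prometheus scraper would poll
    public static HttpServer start(int port) throws Exception {
        HttpServer server = HttpServer.create(new InetSocketAddress(port), 0);
        server.createContext("/metrics", exchange -> {
            byte[] body = scrapePayload().getBytes(StandardCharsets.UTF_8);
            exchange.getResponseHeaders().add("Content-Type", "text/plain; version=0.0.4");
            exchange.sendResponseHeaders(200, body.length);
            try (OutputStream os = exchange.getResponseBody()) {
                os.write(body);
            }
        });
        server.start();
        return server;
    }

    public static void main(String[] args) throws Exception {
        requestsTotal.incrementAndGet();
        System.out.println(scrapePayload());
    }
}
```

This is why the annotation only needs a port and a path: any endpoint returning this text format is scrapeable.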

4. Custom Health Indicators

Spring Boot Health Indicators:

import org.springframework.boot.actuate.health.Health;
import org.springframework.boot.actuate.health.HealthIndicator;
import org.springframework.http.ResponseEntity;
import org.springframework.stereotype.Component;
import org.springframework.web.client.RestTemplate;

import javax.sql.DataSource;
import java.sql.Connection;

@Component
public class DatabaseHealthIndicator implements HealthIndicator {

    private final DataSource dataSource;

    public DatabaseHealthIndicator(DataSource dataSource) {
        this.dataSource = dataSource;
    }

    @Override
    public Health health() {
        try (Connection conn = dataSource.getConnection()) {
            if (conn.isValid(1)) { // timeout is in seconds, not milliseconds
                return Health.up()
                        .withDetail("database", "Available")
                        .withDetail("validationQuery", "SUCCESS")
                        .build();
            }
        } catch (Exception e) {
            return Health.down(e)
                    .withDetail("database", "Unavailable")
                    .build();
        }
        return Health.unknown().build();
    }
}

@Component
public class ExternalServiceHealthIndicator implements HealthIndicator {

    private final RestTemplate restTemplate;

    public ExternalServiceHealthIndicator(RestTemplate restTemplate) {
        this.restTemplate = restTemplate;
    }

    @Override
    public Health health() {
        try {
            ResponseEntity<String> response = restTemplate.getForEntity(
                    "http://payment-service/actuator/health", String.class);
            if (response.getStatusCode().is2xxSuccessful()) {
                return Health.up()
                        .withDetail("payment-service", "Available")
                        .build();
            } else {
                return Health.down()
                        .withDetail("payment-service", "Unhealthy")
                        .withDetail("statusCode", response.getStatusCode().value())
                        .build();
            }
        } catch (Exception e) {
            return Health.down(e)
                    .withDetail("payment-service", "Unreachable")
                    .build();
        }
    }
}
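Spring Boot aggregates all registered HealthIndicators into one overall status, which is what the probes consume. A framework-free sketch of that aggregation logic (hypothetical class and method names, for illustration only — not the Actuator API):

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.function.Supplier;

public class CompositeHealth {

    // Each check returns true when its dependency is healthy
    private final Map<String, Supplier<Boolean>> checks = new LinkedHashMap<>();

    public void register(String name, Supplier<Boolean> check) {
        checks.put(name, check);
    }

    // Overall status is UP only when every registered check passes;
    // a throwing check is treated as DOWN, mirroring how Actuator
    // maps indicator exceptions to a DOWN contribution
    public String status() {
        for (Map.Entry<String, Supplier<Boolean>> e : checks.entrySet()) {
            boolean up;
            try {
                up = e.getValue().get();
            } catch (Exception ex) {
                up = false;
            }
            if (!up) {
                return "DOWN(" + e.getKey() + ")";
            }
        }
        return "UP";
    }
}
```

The aggregation rule is why external-service checks belong in readiness, not liveness: one flaky downstream should stop traffic to the pod, not restart it.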

5. Distributed Tracing with Jaeger

Dependencies:

<dependency>
    <groupId>io.micrometer</groupId>
    <artifactId>micrometer-tracing-bridge-otel</artifactId>
</dependency>
<dependency>
    <groupId>io.opentelemetry</groupId>
    <artifactId>opentelemetry-exporter-otlp</artifactId>
</dependency>

Configuration:

management:
  tracing:
    sampling:
      probability: 1.0   # sample every request; lower this in production
  otlp:
    tracing:
      # OTLP over HTTP (Spring Boot's default exporter); Jaeger listens on 4318
      endpoint: http://jaeger-collector:4318/v1/traces

6. Logging Configuration for Kubernetes

Structured Logging with Logback:

<!-- logback-spring.xml -->
<configuration>
    <springProperty scope="context" name="appName" source="spring.application.name" defaultValue="java-app"/>
    <springProperty scope="context" name="namespace" source="kubernetes.namespace" defaultValue="default"/>
    <appender name="JSON" class="ch.qos.logback.core.ConsoleAppender">
        <encoder class="net.logstash.logback.encoder.LogstashEncoder">
            <customFields>{"app":"${appName}","namespace":"${namespace}","pod":"${HOSTNAME}"}</customFields>
        </encoder>
    </appender>
    <root level="INFO">
        <appender-ref ref="JSON"/>
    </root>
</configuration>

Java Logging with MDC:

import org.slf4j.Logger;
import org.slf4j.LoggerFactory;
import org.slf4j.MDC;
import org.springframework.http.ResponseEntity;
import org.springframework.web.bind.annotation.PostMapping;
import org.springframework.web.bind.annotation.RequestBody;
import org.springframework.web.bind.annotation.RestController;

@RestController
public class OrderController {

    private static final Logger logger = LoggerFactory.getLogger(OrderController.class);

    private final OrderService orderService;

    public OrderController(OrderService orderService) {
        this.orderService = orderService;
    }

    @PostMapping("/orders")
    public ResponseEntity<Order> createOrder(@RequestBody OrderRequest request) {
        // Add contextual information; the JSON encoder emits MDC entries as fields
        MDC.put("orderId", request.getOrderId());
        MDC.put("customerId", request.getCustomerId());
        logger.info("Creating new order: items={}, total={}",
                request.getItems().size(), request.getTotalAmount());
        try {
            Order order = orderService.createOrder(request);
            logger.info("Order created successfully");
            return ResponseEntity.ok(order);
        } catch (Exception e) {
            logger.error("Failed to create order", e);
            throw e;
        } finally {
            MDC.clear();
        }
    }
}
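What the LogstashEncoder ultimately produces is one JSON object per log event, with MDC entries flattened into top-level fields. A simplified stdlib sketch of that shape (no escaping or timestamps, illustration only — real encoders handle both):

```java
import java.util.Map;

public class JsonLogLine {

    // Builds a single JSON log line from a level, a message, and MDC-style
    // context fields, approximating the structure LogstashEncoder emits
    static String render(String level, String message, Map<String, String> mdc) {
        StringBuilder sb = new StringBuilder();
        sb.append("{\"level\":\"").append(level).append("\"");
        sb.append(",\"message\":\"").append(message).append("\"");
        for (Map.Entry<String, String> e : mdc.entrySet()) {
            sb.append(",\"").append(e.getKey()).append("\":\"").append(e.getValue()).append("\"");
        }
        return sb.append("}").toString();
    }

    public static void main(String[] args) {
        System.out.println(render("INFO", "Creating new order", Map.of("orderId", "42")));
    }
}
```

Because every MDC key becomes its own JSON field, log aggregators like Loki or Elasticsearch can index and filter on orderId or customerId directly, without regex parsing.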

7. Prometheus Queries for Java in Kubernetes

Critical Alerting Rules:

# prometheus-rules.yaml
apiVersion: monitoring.coreos.com/v1
kind: PrometheusRule
metadata:
  name: java-app-alerts
  labels:
    prometheus: k8s
    role: alert-rules
spec:
  groups:
    - name: java-app
      rules:
        - alert: HighContainerMemoryUsage
          expr: sum by (pod) (container_memory_usage_bytes{container="order-service"}) / sum by (pod) (kube_pod_container_resource_limits{container="order-service", resource="memory"}) > 0.8
          for: 5m
          labels:
            severity: warning
          annotations:
            summary: "High memory usage on {{ $labels.pod }}"
            description: "Container memory usage is above 80% of its limit for pod {{ $labels.pod }}"
        - alert: HighGCPauseTime
          expr: rate(jvm_gc_pause_seconds_sum{gc="G1 Young Generation"}[5m]) > 0.1
          for: 2m
          labels:
            severity: warning
          annotations:
            summary: "High GC pressure on {{ $labels.pod }}"
            description: "Garbage collection is consuming more than 10% of wall-clock time on {{ $labels.pod }}"
        - alert: ApplicationErrorRateHigh
          expr: sum by (pod) (rate(http_server_requests_seconds_count{status=~"5.."}[5m])) / sum by (pod) (rate(http_server_requests_seconds_count[5m])) > 0.05
          for: 3m
          labels:
            severity: critical
          annotations:
            summary: "High error rate on {{ $labels.pod }}"
            description: "5xx error rate is above 5% for pod {{ $labels.pod }}"
        - alert: PodRestartingFrequently
          expr: rate(kube_pod_container_status_restarts_total[1h]) > 0
          for: 0m
          labels:
            severity: warning
          annotations:
            summary: "Pod {{ $labels.pod }} is restarting frequently"
            description: "Container {{ $labels.container }} in pod {{ $labels.pod }} has restarted within the last hour"
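The ApplicationErrorRateHigh rule divides two counter rates. The same arithmetic, applied to two scrapes of a counter, can be sketched in Java (the sample values below are hypothetical, chosen only to show a 6% error ratio crossing the 5% threshold):

```java
public class ErrorRate {

    // Mirrors rate(errors[5m]) / rate(total[5m]) over one scrape interval:
    // counters only increase, so the per-interval rate is the delta between
    // two samples (the shared interval length cancels in the ratio)
    static double errorRatio(double errorsBefore, double errorsAfter,
                             double totalBefore, double totalAfter) {
        double errorDelta = errorsAfter - errorsBefore;
        double totalDelta = totalAfter - totalBefore;
        return totalDelta == 0 ? 0.0 : errorDelta / totalDelta;
    }

    public static void main(String[] args) {
        // 6 new 5xx responses out of 100 new requests -> 6% error rate
        System.out.println(errorRatio(0, 6, 0, 100));
    }
}
```

Working with deltas of counters (rather than raw values) is what makes the alert robust to pod restarts: Prometheus's rate() also compensates for counter resets.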

8. Grafana Dashboards for Java/Kubernetes

Key Dashboard Panels:

  • Application Performance: Request rate, error rate, latency percentiles
  • JVM Metrics: Heap usage, GC duration, thread states
  • Container Resources: CPU, memory, network I/O
  • Business Metrics: Orders processed, active users, cache hit rates

9. Kubernetes HPA with Custom Metrics

Horizontal Pod Autoscaler:

apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: order-service-hpa
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: order-service
  minReplicas: 2
  maxReplicas: 10
  metrics:
    - type: Resource
      resource:
        name: cpu
        target:
          type: Utilization
          averageUtilization: 70
    # Pods-type custom metrics require a metrics adapter
    # (e.g. prometheus-adapter) serving the custom.metrics.k8s.io API
    - type: Pods
      pods:
        metric:
          name: orders_processing_time_seconds
        target:
          type: AverageValue
          averageValue: "500m"
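For each metric, the HPA controller computes desiredReplicas = ceil(currentReplicas × currentMetric / targetMetric), takes the largest result across metrics, and clamps it to the min/max bounds. A small sketch of that formula (simplified — the real controller also applies tolerance and stabilization windows):

```java
public class HpaMath {

    // Kubernetes HPA scaling formula for one metric:
    // desired = ceil(currentReplicas * currentMetric / targetMetric),
    // clamped to [minReplicas, maxReplicas]
    static int desiredReplicas(int currentReplicas, double currentMetric,
                               double targetMetric, int minReplicas, int maxReplicas) {
        int desired = (int) Math.ceil(currentReplicas * (currentMetric / targetMetric));
        return Math.max(minReplicas, Math.min(maxReplicas, desired));
    }

    public static void main(String[] args) {
        // 3 pods averaging 0.9s processing time against a 0.5s target -> scale to 6
        System.out.println(desiredReplicas(3, 0.9, 0.5, 2, 10));
    }
}
```

The formula explains why target values matter more than thresholds: a target of 500m means the HPA continuously steers average processing time toward 0.5 seconds rather than reacting only when a limit is breached.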

Best Practices Summary

  1. Use Structured Logging - JSON format with proper context
  2. Export JVM Metrics - Enable comprehensive JVM monitoring
  3. Configure Proper Resource Limits - Prevent resource starvation
  4. Implement Meaningful Health Checks - Liveness, readiness, and startup probes
  5. Use Distributed Tracing - For microservices architectures
  6. Set Up Meaningful Alerts - Focus on symptoms, not causes
  7. Monitor Business Metrics - Connect technical metrics to business value
  8. Use Sidecar Pattern - For log aggregation when needed
  9. Label Everything - Consistent labels across metrics and logs
  10. Test Your Monitoring - Regularly verify alerts and dashboards

Conclusion

Monitoring Java applications in Kubernetes requires a holistic approach that combines application insights, JVM metrics, container resources, and Kubernetes cluster state. By implementing the patterns and tools discussed here—Micrometer for metrics, structured logging, proper health checks, and comprehensive alerting—you can gain deep visibility into your Java applications' behavior and performance in Kubernetes.

The key is to start with the fundamentals (application metrics and health checks) and gradually add more sophisticated monitoring (distributed tracing, custom business metrics) as your needs evolve. This layered approach ensures you can effectively troubleshoot issues, optimize performance, and deliver reliable Java applications in your Kubernetes environment.


Call to Action: Start by instrumenting one of your Java applications with Micrometer today. Export basic JVM metrics and set up a simple Grafana dashboard. What performance insights will you uncover about your application's behavior in Kubernetes?
