The RED Method: Essential Metrics for Monitoring Java Microservices

In a microservices architecture, traditional server-level metrics like CPU and memory usage are no longer sufficient to understand application health. You might have a JVM that looks perfectly healthy while your service is failing to handle requests. The RED Method provides a simple, powerful framework for monitoring microservices by focusing on what matters most: the requests they handle.

This article explores the RED Method, its implementation in Java using Micrometer and Prometheus, and how to use these metrics for effective observability and alerting.


What is the RED Method?

The RED Method was popularized by Tom Wilkie at Weaveworks. It stands for three key metrics you should monitor for every microservice:

  • Rate - The number of requests per second your service is handling
  • Errors - The number of failed requests per second
  • Duration - The amount of time these requests take, typically as a histogram or percentiles

These three metrics describe your service's health as its clients experience it: how much traffic arrives, how much of it fails, and how long it takes.
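
To make the three definitions concrete, here is a minimal sketch in plain Java (no Micrometer) that derives them from a list of observed requests. The RequestSample and RedSummary names are hypothetical, and the nearest-rank percentile is just one common definition, chosen here for simplicity.

```java
import java.util.List;

/** Hypothetical value type: one observed request. */
final class RequestSample {
    final double durationSeconds;
    final boolean failed;

    RequestSample(double durationSeconds, boolean failed) {
        this.durationSeconds = durationSeconds;
        this.failed = failed;
    }
}

/** Hypothetical helper that derives Rate, Errors, and Duration from raw samples. */
final class RedSummary {
    private RedSummary() {}

    /** Rate: requests per second over the observation window. */
    static double rate(List<RequestSample> samples, double windowSeconds) {
        return samples.size() / windowSeconds;
    }

    /** Errors: failed requests per second over the same window. */
    static double errorRate(List<RequestSample> samples, double windowSeconds) {
        long failures = samples.stream().filter(s -> s.failed).count();
        return failures / windowSeconds;
    }

    /** Duration: nearest-rank percentile of request durations, for q in (0, 1]. */
    static double durationPercentile(List<RequestSample> samples, double q) {
        if (samples.isEmpty()) {
            throw new IllegalArgumentException("no samples");
        }
        double[] sorted = samples.stream().mapToDouble(s -> s.durationSeconds).sorted().toArray();
        int rank = (int) Math.ceil(q * sorted.length); // 1-based rank
        return sorted[Math.max(rank, 1) - 1];
    }
}
```

In a real service you would not keep raw samples in memory like this; a metrics library maintains counters and histograms for you, and the monitoring backend performs these calculations.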

Why RED Matters for Microservices

  • Service-Centric: Focuses on service behavior rather than infrastructure
  • User-Focused: Measures what actually matters to end users
  • Standardized: Provides consistency across different services and teams
  • Actionable: Directly correlates with business impact and SLOs

Implementing RED Metrics in Java

1. Dependencies Setup

Add the necessary dependencies to your pom.xml:

<dependencies>
    <!-- Spring Boot Starter -->
    <dependency>
        <groupId>org.springframework.boot</groupId>
        <artifactId>spring-boot-starter-web</artifactId>
    </dependency>
    <dependency>
        <groupId>org.springframework.boot</groupId>
        <artifactId>spring-boot-starter-actuator</artifactId>
    </dependency>
    <!-- Micrometer for metrics -->
    <dependency>
        <groupId>io.micrometer</groupId>
        <artifactId>micrometer-core</artifactId>
    </dependency>
    <dependency>
        <groupId>io.micrometer</groupId>
        <artifactId>micrometer-registry-prometheus</artifactId>
    </dependency>
    <!-- For custom metrics -->
    <dependency>
        <groupId>io.micrometer</groupId>
        <artifactId>micrometer-observation</artifactId>
    </dependency>
</dependencies>

2. Application Configuration

Configure your application.yml to expose RED metrics:

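spring:
  application:
    name: order-service
management:
  endpoints:
    web:
      exposure:
        include: health,info,prometheus,metrics
  endpoint:
    prometheus:
      enabled: true
  metrics:
    export:
      prometheus:
        # Spring Boot 2.x key; Boot 3.x uses management.prometheus.metrics.export.enabled
        enabled: true
    distribution:
      # Publish histogram buckets so percentiles can be aggregated in Prometheus
      percentiles-histogram:
        http.server.requests: true
      # Client-side percentiles are configured per meter name
      percentiles:
        http.server.requests: 0.5, 0.95, 0.99
    tags:
      application: ${spring.application.name}
      environment: production
      region: us-east-1
logging:
  level:
    io.micrometer: DEBUG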

3. Automatic HTTP Metrics with Spring Boot

Spring Boot automatically collects RED metrics for HTTP endpoints via Micrometer. The metrics are exposed at /actuator/prometheus:

# RATE - Cumulative request count (apply rate() to get requests per second)
http_server_requests_seconds_count{
  method="GET",
  uri="/api/orders",
  status="200",
  application="order-service"
} 1500.0
# ERRORS - Cumulative count of failed requests (here, 500 responses)
http_server_requests_seconds_count{
  method="GET",
  uri="/api/orders",
  status="500",
  application="order-service"
} 23.0
# DURATION - Response time percentiles (Micrometer publishes them under the "quantile" label)
http_server_requests_seconds{
  method="GET",
  uri="/api/orders",
  status="200",
  application="order-service",
  quantile="0.95"
} 0.234

4. Custom RED Metrics for Business Operations

For non-HTTP operations or business-level metrics, implement custom RED metrics:

import io.micrometer.core.instrument.Counter;
import io.micrometer.core.instrument.MeterRegistry;
import io.micrometer.core.instrument.Timer;
import java.util.function.Supplier;
import org.springframework.stereotype.Component;

@Component
public class OrderServiceMetrics {

    private final MeterRegistry meterRegistry;
    private final Counter orderCreationRequests;
    private final Counter orderCreationErrors;
    private final Timer orderCreationDuration;

    public OrderServiceMetrics(MeterRegistry meterRegistry) {
        this.meterRegistry = meterRegistry;
        // RATE: Count of order creation attempts
        this.orderCreationRequests = Counter.builder("order.creation.requests")
            .description("Total number of order creation requests")
            .tags("application", "order-service", "operation", "create")
            .register(meterRegistry);
        // ERRORS: Count of failed order creations
        this.orderCreationErrors = Counter.builder("order.creation.errors")
            .description("Total number of failed order creation requests")
            .tags("application", "order-service", "operation", "create")
            .register(meterRegistry);
        // DURATION: Time taken to create orders
        this.orderCreationDuration = Timer.builder("order.creation.duration")
            .description("Time taken to create an order")
            .tags("application", "order-service", "operation", "create")
            .publishPercentiles(0.5, 0.95, 0.99)
            .register(meterRegistry);
    }

    // Wraps a synchronous operation: counts it, times it, and counts failures.
    // Takes a Supplier so the caller gets the operation's result back.
    public <T> T recordOrderCreation(Supplier<T> operation) {
        orderCreationRequests.increment();
        try {
            return orderCreationDuration.record(operation);
        } catch (RuntimeException e) {
            orderCreationErrors.increment();
            throw e;
        }
    }

    // Manual timing for flows the wrapper cannot cover (for example async code).
    // Counts the request here so the rate includes manually timed operations.
    public Timer.Sample startTimer() {
        orderCreationRequests.increment();
        return Timer.start(meterRegistry);
    }

    // Call exactly one of recordSuccess or recordError per sample.
    public void recordSuccess(Timer.Sample sample) {
        sample.stop(orderCreationDuration);
    }

    public void recordError(Timer.Sample sample) {
        sample.stop(orderCreationDuration);
        orderCreationErrors.increment();
    }
}

5. Service Implementation with RED Metrics

Apply the custom metrics in your service layer:

import io.micrometer.core.instrument.Timer;
import java.util.concurrent.CompletableFuture;
import org.slf4j.Logger;
import org.slf4j.LoggerFactory;
import org.springframework.stereotype.Service;
import org.springframework.transaction.annotation.Transactional;

@Service
@Transactional
public class OrderService {

    private static final Logger logger = LoggerFactory.getLogger(OrderService.class);

    private final OrderRepository orderRepository;
    private final InventoryService inventoryService;
    private final OrderServiceMetrics metrics;

    public OrderService(OrderRepository orderRepository,
                        InventoryService inventoryService,
                        OrderServiceMetrics metrics) {
        this.orderRepository = orderRepository;
        this.inventoryService = inventoryService;
        this.metrics = metrics;
    }

    public Order createOrder(OrderRequest orderRequest) {
        // Method 1: Using lambda wrapper
        return metrics.recordOrderCreation(() -> createOrderInternal(orderRequest));
    }

    public CompletableFuture<Order> createOrderAsync(OrderRequest orderRequest) {
        // Method 2: Manual timer sample for asynchronous flows
        Timer.Sample sample = metrics.startTimer();
        return inventoryService.checkAvailability(orderRequest.getItems())
            .thenCompose(available -> {
                if (!available) {
                    // Do not record the error here: whenComplete below sees this
                    // failure, and recording twice would double-count it
                    throw new InventoryException("Items not available");
                }
                Order order = new Order(orderRequest);
                return orderRepository.saveAsync(order);
            })
            .whenComplete((result, throwable) -> {
                // Single place where the outcome is recorded
                if (throwable != null) {
                    logger.warn("Async order creation failed", throwable);
                    metrics.recordError(sample);
                } else {
                    metrics.recordSuccess(sample);
                }
            });
    }

    private Order createOrderInternal(OrderRequest orderRequest) {
        // Business logic
        if (!inventoryService.checkAvailabilitySync(orderRequest.getItems())) {
            throw new InventoryException("Items not available");
        }
        Order order = new Order(orderRequest);
        return orderRepository.save(order);
    }

    public OrderStatus getOrderStatus(Long orderId) {
        // Method 3: Manual try/catch timing.
        // Note: this reuses the order-creation meters for brevity; a real service
        // should register separate meters for each operation.
        Timer.Sample sample = metrics.startTimer();
        try {
            Order order = orderRepository.findById(orderId)
                .orElseThrow(() -> new OrderNotFoundException(orderId));
            metrics.recordSuccess(sample);
            return order.getStatus();
        } catch (Exception e) {
            metrics.recordError(sample);
            throw e;
        }
    }
}
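
For asynchronous code, the important detail is that each request's outcome is recorded exactly once, in a single completion callback. The following is a minimal JDK-only sketch of that pattern; AsyncOutcomeRecorder is a hypothetical class, and its LongAdder fields merely stand in for Micrometer counters and a timer.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.atomic.LongAdder;
import java.util.function.Supplier;

/** Hypothetical recorder: LongAdder fields stand in for real meters. */
final class AsyncOutcomeRecorder {
    final LongAdder requests = new LongAdder();
    final LongAdder errors = new LongAdder();
    final LongAdder totalNanos = new LongAdder();

    /** Counts the request, then records duration and outcome once, on completion. */
    <T> CompletableFuture<T> record(Supplier<CompletableFuture<T>> operation) {
        requests.increment();
        long start = System.nanoTime();
        CompletableFuture<T> future;
        try {
            future = operation.get();
        } catch (RuntimeException e) {
            // The operation failed before it could return a future
            future = new CompletableFuture<>();
            future.completeExceptionally(e);
        }
        return future.whenComplete((result, failure) -> {
            totalNanos.add(System.nanoTime() - start);
            if (failure != null) {
                errors.increment();
            }
        });
    }
}
```

Because the intermediate stages never touch the counters, a failure raised anywhere in the chain is counted once when it reaches the final callback.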

6. REST Controller with Enhanced RED Metrics

Create a controller that leverages both automatic and custom metrics:

import java.util.Map;
import java.util.concurrent.ThreadLocalRandom;
import jakarta.validation.Valid; // javax.validation.Valid on Spring Boot 2.x
import org.slf4j.Logger;
import org.slf4j.LoggerFactory;
import org.springframework.http.HttpStatus;
import org.springframework.http.ResponseEntity;
import org.springframework.web.bind.annotation.*;
import org.springframework.web.server.ResponseStatusException;

@RestController
@RequestMapping("/api/orders")
public class OrderController {

    private static final Logger logger = LoggerFactory.getLogger(OrderController.class);

    // Custom metrics are recorded inside OrderService; Spring Boot records
    // the HTTP metrics for each endpoint automatically.
    private final OrderService orderService;

    public OrderController(OrderService orderService) {
        this.orderService = orderService;
    }

    @PostMapping
    public ResponseEntity<OrderResponse> createOrder(@Valid @RequestBody OrderRequest request) {
        try {
            Order order = orderService.createOrder(request);
            return ResponseEntity.status(HttpStatus.CREATED)
                .body(OrderResponse.from(order));
        } catch (InventoryException e) {
            logger.warn("Inventory check failed for order: {}", e.getMessage());
            throw new ResponseStatusException(HttpStatus.CONFLICT, e.getMessage(), e);
        }
    }

    @GetMapping("/{orderId}/status")
    public ResponseEntity<OrderStatusResponse> getOrderStatus(@PathVariable Long orderId) {
        OrderStatus status = orderService.getOrderStatus(orderId);
        return ResponseEntity.ok(new OrderStatusResponse(orderId, status));
    }

    @GetMapping("/metrics/demo")
    public ResponseEntity<Map<String, String>> generateMetrics() {
        // Demo endpoint to generate various metric scenarios
        ThreadLocalRandom random = ThreadLocalRandom.current();
        if (random.nextInt(100) < 10) { // 10% error rate
            throw new RuntimeException("Simulated error for metrics demo");
        }
        try {
            // Simulate variable processing time (0 to 999 ms)
            Thread.sleep(random.nextInt(1000));
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return ResponseEntity.ok(Map.of(
            "status", "success",
            "message", "Metric demo completed"
        ));
    }
}

Prometheus Queries for RED Metrics

1. Rate Queries

# Requests per second (Rate)
rate(http_server_requests_seconds_count{application="order-service", uri="/api/orders"}[5m])
# Business operation rate
rate(order_creation_requests_total{application="order-service"}[5m])
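
The exported counters only ever increase, so a per-second rate is the increase between scrapes divided by the elapsed time. A simplified sketch of that arithmetic follows; CounterRate is a hypothetical helper, and Prometheus's actual rate() additionally uses every sample in the window and extrapolates to the window boundaries.

```java
/** Hypothetical helper: per-second rate between two scrapes of a counter. */
final class CounterRate {
    private CounterRate() {}

    static double perSecond(double earlierValue, double laterValue, double elapsedSeconds) {
        if (elapsedSeconds <= 0) {
            throw new IllegalArgumentException("elapsedSeconds must be positive");
        }
        // A counter only goes down when the process restarts. Treat that as a
        // reset to zero, so the whole later value counts as the increase.
        double increase = laterValue >= earlierValue
            ? laterValue - earlierValue
            : laterValue;
        return increase / elapsedSeconds;
    }
}
```

For example, a counter that moves from 1500 to 1800 over a 5-minute window corresponds to 300 requests in 300 seconds, or 1 request per second.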

2. Error Queries

# Server error rate (5xx responses; use status=~"[45].." to include 4xx)
sum(rate(http_server_requests_seconds_count{application="order-service", status=~"5.."}[5m]))
# Error percentage (sum both sides first; dividing raw series matches labels
# one-to-one, which would divide each 5xx series by itself)
(sum(rate(http_server_requests_seconds_count{application="order-service", status=~"5.."}[5m]))
/
sum(rate(http_server_requests_seconds_count{application="order-service"}[5m]))) * 100
# Business error rate
rate(order_creation_errors_total{application="order-service"}[5m])
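
An error percentage is a ratio of two totals: the rate of 5xx requests over the rate of all requests, each summed across the per-status series. A small sketch of that aggregation, where ErrorRatio and its map of status code to requests per second are illustrative assumptions:

```java
import java.util.Map;

/** Hypothetical helper: server-error percentage across per-status request rates. */
final class ErrorRatio {
    private ErrorRatio() {}

    /** @param ratePerStatus requests per second keyed by HTTP status code, e.g. "200" */
    static double serverErrorPercent(Map<String, Double> ratePerStatus) {
        double total = 0.0;
        double errors = 0.0;
        for (Map.Entry<String, Double> entry : ratePerStatus.entrySet()) {
            total += entry.getValue();
            if (entry.getKey().startsWith("5")) {
                errors += entry.getValue();
            }
        }
        // No traffic means no observed errors
        return total == 0.0 ? 0.0 : errors * 100.0 / total;
    }
}
```

Summing before dividing is what makes the result a share of all traffic rather than a per-series ratio.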

3. Duration Queries

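# 95th percentile response time (precomputed per instance; cannot be aggregated)
http_server_requests_seconds{application="order-service", uri="/api/orders", quantile="0.95"}
# 95th percentile across all instances, computed from histogram buckets
histogram_quantile(0.95, sum by (le) (rate(http_server_requests_seconds_bucket{application="order-service", uri="/api/orders"}[5m])))
# Average response time
rate(http_server_requests_seconds_sum{application="order-service", uri="/api/orders"}[5m])
/
rate(http_server_requests_seconds_count{application="order-service", uri="/api/orders"}[5m])
# Business operation duration
order_creation_duration_seconds{application="order-service", quantile="0.95"}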
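
Because the configuration above enables percentiles-histogram, the service also exports cumulative latency buckets, and a percentile can be estimated from those bucket counts. The sketch below shows the usual approach, linear interpolation inside the bucket that contains the target rank. HistogramQuantile is a hypothetical helper that approximates what a histogram-based quantile function does; it is not Prometheus's exact implementation.

```java
/** Hypothetical helper: estimates a quantile from cumulative histogram buckets. */
final class HistogramQuantile {
    private HistogramQuantile() {}

    /**
     * @param q                target quantile in [0, 1]
     * @param upperBounds      ascending bucket upper bounds in seconds; the last
     *                         one may be Double.POSITIVE_INFINITY
     * @param cumulativeCounts number of observations <= upperBounds[i]
     */
    static double estimate(double q, double[] upperBounds, double[] cumulativeCounts) {
        if (q < 0 || q > 1) {
            throw new IllegalArgumentException("q must be in [0, 1]");
        }
        if (upperBounds.length == 0 || upperBounds.length != cumulativeCounts.length) {
            throw new IllegalArgumentException("need matching, non-empty bucket arrays");
        }
        double total = cumulativeCounts[cumulativeCounts.length - 1];
        if (total == 0) {
            return Double.NaN; // no observations
        }
        double rank = q * total;
        // Find the first bucket whose cumulative count reaches the rank
        int i = 0;
        while (i < upperBounds.length - 1 && cumulativeCounts[i] < rank) {
            i++;
        }
        if (Double.isInfinite(upperBounds[i])) {
            // Rank falls in the overflow bucket: report the highest finite bound
            return i == 0 ? Double.NaN : upperBounds[i - 1];
        }
        double lower = i == 0 ? 0.0 : upperBounds[i - 1];
        double below = i == 0 ? 0.0 : cumulativeCounts[i - 1];
        double inBucket = cumulativeCounts[i] - below;
        if (inBucket == 0) {
            return lower;
        }
        // Assume observations are spread evenly within the bucket
        return lower + (upperBounds[i] - lower) * (rank - below) / inBucket;
    }
}
```

The even-spread assumption is why bucket boundaries matter: the estimate can never be more precise than the width of the bucket that the percentile lands in.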

Alerting Rules Based on RED Metrics

Create Prometheus alerting rules to monitor your service health:

groups:
  - name: order_service_red_alerts
    rules:
      # High Error Rate Alert
      - alert: OrderServiceHighErrorRate
        expr: |
          (
            sum(rate(http_server_requests_seconds_count{application="order-service", status=~"5.."}[5m]))
            /
            sum(rate(http_server_requests_seconds_count{application="order-service"}[5m]))
          ) * 100 > 5
        for: 2m
        labels:
          severity: critical
          service: order-service
        annotations:
          summary: "High error rate in Order Service"
          description: "Error rate is {{ $value }}%, exceeding 5% threshold"
      # High Latency Alert
      - alert: OrderServiceHighLatency
        expr: |
          http_server_requests_seconds{application="order-service", quantile="0.95"} > 1.0
        for: 3m
        labels:
          severity: warning
          service: order-service
        annotations:
          summary: "High latency in Order Service"
          description: "95th percentile latency is {{ $value }}s"
      # Traffic Drop Alert (summed so that a single quiet endpoint does not trigger it)
      - alert: OrderServiceTrafficDrop
        expr: |
          sum(rate(http_server_requests_seconds_count{application="order-service"}[10m])) * 60 < 1
        for: 5m
        labels:
          severity: warning
          service: order-service
        annotations:
          summary: "Traffic drop detected in Order Service"
          description: "Request rate has dropped to {{ $value }} requests/minute"
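
The for: field in each rule delays firing until the expression has held continuously for that long, which filters out brief spikes. Here is a rough sketch of that pending-to-firing behavior; PendingAlert is a hypothetical class and is simplified relative to Prometheus itself.

```java
/** Hypothetical model of an alert rule with a "for" hold duration. */
final class PendingAlert {
    enum State { INACTIVE, PENDING, FIRING }

    private final double forSeconds;
    private double activeSince = Double.NaN;
    private State state = State.INACTIVE;

    PendingAlert(double forSeconds) {
        this.forSeconds = forSeconds;
    }

    /** Call once per rule evaluation with the current time and expression result. */
    State evaluate(double nowSeconds, boolean conditionHolds) {
        if (!conditionHolds) {
            // Any evaluation where the condition is false resets the clock
            state = State.INACTIVE;
            activeSince = Double.NaN;
            return state;
        }
        if (state == State.INACTIVE) {
            state = State.PENDING;
            activeSince = nowSeconds;
        }
        if (nowSeconds - activeSince >= forSeconds) {
            state = State.FIRING;
        }
        return state;
    }
}
```

With for: 2m and a 60-second evaluation interval, the condition has to be true on three consecutive evaluations (at 0, 60, and 120 seconds) before the alert fires.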

Grafana Dashboard for RED Metrics

Create a comprehensive RED dashboard with these panels:

Rate Panel

  • Requests per second by endpoint
  • Traffic growth trends
  • Peak/off-peak patterns

Errors Panel

  • Error rate percentage
  • Error types breakdown (4xx vs 5xx)
  • Error correlation with traffic spikes

Duration Panel

  • Response time percentiles (50th, 95th, 99th)
  • Latency distribution histogram
  • Duration trends over time

Best Practices for RED Method

  1. Consistent Tagging: Use consistent tags (application, environment, region) across all services
  2. Meaningful Percentiles: Track 50th, 95th, and 99th percentiles for duration
  3. SLO Alignment: Base alerts on your Service Level Objectives
  4. Cross-Service Correlation: Include upstream/downstream service information in metrics
  5. Business Context: Add business-specific metrics alongside technical RED metrics

Conclusion

The RED Method provides a simple yet powerful framework for monitoring Java microservices. By focusing on Rate, Errors, and Duration, you gain immediate insight into your service's health from the user's perspective.

Key benefits:

  • Early Problem Detection: Spot user-facing problems as soon as they start, often before anyone reports them
  • Standardized Monitoring: Consistent approach across all services
  • Actionable Metrics: Direct correlation with user experience
  • SLO Compliance: Easy alignment with service level objectives

Implementing RED metrics with Micrometer and Prometheus in your Java microservices creates a robust observability foundation that scales with your architecture. Combined with effective alerting and dashboards, it ensures you can maintain reliability and performance as your system grows in complexity.
