Intelligent Alert Management: Implementing Opsgenie Alert Routing in Java

In modern DevOps environments, managing alerts effectively is crucial for maintaining system reliability and ensuring the right people are notified at the right time. Opsgenie (now part of Atlassian) is a powerful incident management platform that provides robust alert routing capabilities. By implementing Opsgenie alert routing in your Java applications, you can ensure critical alerts are automatically routed to the appropriate teams based on content, severity, time, and other factors.

This article explains Opsgenie's alert routing concepts and shows how to implement content-based, time-based, and region-based routing logic in a Java application.


Understanding Opsgenie Alert Routing

Opsgenie alert routing determines where alerts go based on predefined rules. Key routing concepts include:

  • Routing Rules: Conditions that determine which team receives an alert
  • Teams: Groups of users who handle specific types of alerts
  • Schedules: Define who is on-call when
  • Escalations: Define what happens if alerts aren't acknowledged
  • Integrations: Sources that send alerts to Opsgenie

Setting Up Opsgenie Integration

1. Get Opsgenie API Credentials

  1. Log into your Opsgenie account
  2. Go to Settings → Integration List
  3. Add a new "API" integration or use an existing one
  4. Copy the API Key
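
The key copied in step 4 is sent with every request in an Authorization header that uses the GenieKey scheme, the same header the client code later in this article sets. Here is a minimal sketch of building that header value, assuming the key is supplied through an environment variable named OPSGENIE_API_KEY (the name the application.yml in this article falls back to). The class and method names are hypothetical:

```java
import java.util.Map;

/** Hypothetical helper: builds the Authorization header value for Opsgenie REST calls. */
final class OpsgenieAuth {

    private OpsgenieAuth() {}

    /** Reads the key from the given environment map and fails fast if it is missing. */
    static String authorizationHeader(Map<String, String> env) {
        String key = env.get("OPSGENIE_API_KEY"); // assumed variable name
        if (key == null || key.isBlank()) {
            throw new IllegalStateException("OPSGENIE_API_KEY is not set");
        }
        return "GenieKey " + key.trim();
    }
}
```

In an application this would typically be called as OpsgenieAuth.authorizationHeader(System.getenv()), which keeps the key out of source control.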

2. Maven Dependencies

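<properties>
    <opsgenie.client.version>2.16.0</opsgenie.client.version>
</properties>
<dependencies>
    <!-- Opsgenie Java SDK (optional: the examples in this article call the REST API directly) -->
    <dependency>
        <groupId>com.opsgenie.integration</groupId>
        <artifactId>opsgenie-client</artifactId>
        <version>${opsgenie.client.version}</version>
    </dependency>
    <!-- HTTP client for direct REST API calls -->
    <dependency>
        <groupId>org.apache.httpcomponents.client5</groupId>
        <artifactId>httpclient5</artifactId>
        <version>5.2.1</version>
    </dependency>
    <!-- JSON processing -->
    <dependency>
        <groupId>com.fasterxml.jackson.core</groupId>
        <artifactId>jackson-databind</artifactId>
        <version>2.15.2</version>
    </dependency>
    <!-- For Spring Boot applications (version managed by the Spring Boot parent or BOM) -->
    <dependency>
        <groupId>org.springframework.boot</groupId>
        <artifactId>spring-boot-starter-web</artifactId>
    </dependency>
    <!-- Lombok, for the @Slf4j annotation used in the examples -->
    <dependency>
        <groupId>org.projectlombok</groupId>
        <artifactId>lombok</artifactId>
        <scope>provided</scope>
    </dependency>
    <!-- Test support: JUnit 5, Mockito, @SpringBootTest -->
    <dependency>
        <groupId>org.springframework.boot</groupId>
        <artifactId>spring-boot-starter-test</artifactId>
        <scope>test</scope>
    </dependency>
</dependencies>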

Basic Alert Creation with Routing

1. Calling the Opsgenie REST API with Apache HttpClient

import com.fasterxml.jackson.databind.ObjectMapper;
import com.fasterxml.jackson.databind.node.ObjectNode;
import org.apache.hc.client5.http.classic.methods.HttpPost;
import org.apache.hc.client5.http.impl.classic.CloseableHttpClient;
import org.apache.hc.client5.http.impl.classic.HttpClients;
import org.apache.hc.core5.http.ContentType;
import org.apache.hc.core5.http.io.entity.StringEntity;

// Registered as a bean in OpsgenieConfig (section 3), so it carries no @Component;
// annotating it as well would register a second bean of the same type.
public class BasicOpsgenieClient {
    // US endpoint; EU-hosted accounts use https://api.eu.opsgenie.com/v2/alerts
    private static final String ALERTS_URL = "https://api.opsgenie.com/v2/alerts";

    private final ObjectMapper mapper = new ObjectMapper();
    private final String apiKey;

    public BasicOpsgenieClient(String apiKey) {
        this.apiKey = apiKey;
    }

    public void createAlert(String message, String team, String priority) {
        try {
            HttpPost request = new HttpPost(ALERTS_URL);
            request.setHeader("Authorization", "GenieKey " + apiKey);

            // Create alert payload
            ObjectNode alertPayload = mapper.createObjectNode();
            alertPayload.put("message", message);
            alertPayload.put("priority", priority);
            alertPayload.put("alias", "alert-" + System.currentTimeMillis());

            // Routing - send to specific team
            ObjectNode responder = mapper.createObjectNode();
            responder.put("name", team);
            responder.put("type", "team");
            alertPayload.set("responders", mapper.createArrayNode().add(responder));

            // Additional details
            ObjectNode details = mapper.createObjectNode();
            details.put("source", "java-application");
            details.put("environment", "production");
            alertPayload.set("details", details);

            // Passing the content type sets the Content-Type header and encodes the body as UTF-8
            request.setEntity(new StringEntity(alertPayload.toString(), ContentType.APPLICATION_JSON));

            try (CloseableHttpClient httpClient = HttpClients.createDefault()) {
                // The response handler lets the client release the connection for us
                int status = httpClient.execute(request, response -> response.getCode());
                // Alert creation is processed asynchronously, so success is 202 Accepted
                if (status == 202) {
                    System.out.println("Alert created successfully: " + message);
                } else {
                    System.err.println("Failed to create alert: " + status);
                }
            }
        } catch (Exception e) {
            throw new RuntimeException("Failed to send alert to Opsgenie", e);
        }
    }
}
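
The client above sets the alias to "alert-" plus the current time in milliseconds, so every call produces a new alias. Opsgenie uses the alias as its deduplication key for open alerts, which means repeated reports of the same problem only collapse into one alert when they share an alias. One possible approach, sketched here under the assumption that source plus message identifies a problem well enough, is to derive the alias from a hash of those two fields. The class name and the normalization rules are hypothetical:

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.Locale;

/** Hypothetical helper: derives a repeatable alias so identical alerts deduplicate. */
final class AlertAliases {

    private AlertAliases() {}

    static String stableAlias(String source, String message) {
        // Normalize case and whitespace so cosmetic differences do not change the alias
        String normalized = source.trim().toLowerCase(Locale.ROOT) + "|"
                + message.trim().toLowerCase(Locale.ROOT).replaceAll("\\s+", " ");
        try {
            MessageDigest digest = MessageDigest.getInstance("SHA-256");
            byte[] hash = digest.digest(normalized.getBytes(StandardCharsets.UTF_8));
            // The first 8 bytes (16 hex characters) keep the alias short
            StringBuilder alias = new StringBuilder("alert-");
            for (int i = 0; i < 8; i++) {
                alias.append(String.format("%02x", hash[i]));
            }
            return alias.toString();
        } catch (NoSuchAlgorithmException e) {
            // SHA-256 is a required algorithm on every Java platform
            throw new IllegalStateException("SHA-256 unavailable", e);
        }
    }
}
```

Messages that embed changing values such as timestamps or request IDs would still hash differently, so those parts would need to be stripped before hashing.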

Advanced Alert Routing Strategies

1. Content-Based Routing Service

// Registered as a bean in OpsgenieConfig (section 3), so it carries no @Service
@Slf4j
public class AlertRoutingService {
    private final BasicOpsgenieClient opsgenieClient;
    private final Map<String, String> teamMappings;

    public AlertRoutingService(BasicOpsgenieClient opsgenieClient) {
        this.opsgenieClient = opsgenieClient;
        this.teamMappings = initializeTeamMappings();
    }

    private Map<String, String> initializeTeamMappings() {
        Map<String, String> mappings = new HashMap<>();
        mappings.put("database", "database-team");
        mappings.put("payment", "payments-team");
        mappings.put("authentication", "security-team");
        mappings.put("api", "backend-team");
        mappings.put("frontend", "frontend-team");
        mappings.put("infrastructure", "platform-team");
        return mappings;
    }

    public void routeAlert(AlertEvent event) {
        String team = determineTargetTeam(event);
        String priority = determinePriority(event);
        log.info("Routing alert to team: {} with priority: {}", team, priority);
        opsgenieClient.createAlert(event.getMessage(), team, priority);
    }

    private String determineTargetTeam(AlertEvent event) {
        // Rule 1: Check alert tags (the first tag with a mapping wins)
        for (String tag : event.getTags()) {
            String mapped = teamMappings.get(tag.toLowerCase());
            if (mapped != null) {
                return mapped;
            }
        }

        // Rule 2: Check message content (plain substring matches, evaluated in order)
        String message = event.getMessage().toLowerCase();
        if (message.contains("database") || message.contains("sql") || message.contains("connection pool")) {
            return teamMappings.get("database");
        }
        if (message.contains("payment") || message.contains("transaction") || message.contains("stripe")) {
            return teamMappings.get("payment");
        }
        if (message.contains("login") || message.contains("auth") || message.contains("jwt")) {
            return teamMappings.get("authentication");
        }
        if (message.contains("kubernetes") || message.contains("pod") || message.contains("node")) {
            return teamMappings.get("infrastructure");
        }

        // Rule 3: Default to platform team
        return teamMappings.get("infrastructure");
    }

    private String determinePriority(AlertEvent event) {
        // A request without a severity falls back to the same default as the switch
        if (event.getSeverity() == null) {
            return "P3";
        }
        switch (event.getSeverity()) {
            case CRITICAL:
                return "P1";
            case HIGH:
                return "P2";
            case MEDIUM:
                return "P3";
            case LOW:
                return "P4";
            default:
                return "P3";
        }
    }
}
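// Supporting data classes
public class AlertEvent {
    private String message;
    private Severity severity;
    private List<String> tags;
    private String source;
    private Map<String, String> properties;

    public AlertEvent(String message, Severity severity, List<String> tags, String source) {
        this.message = message;
        this.severity = severity;
        // Copy into a mutable list: callers pass fixed-size lists such as Arrays.asList(...),
        // and addTag() would otherwise throw UnsupportedOperationException
        this.tags = tags != null ? new ArrayList<>(tags) : new ArrayList<>();
        this.source = source;
        this.properties = new HashMap<>();
    }

    // Used by the schedule-aware and regional routers to annotate an event
    public void addTag(String tag) {
        tags.add(tag);
    }

    // Getters and setters for all fields
}

public enum Severity {
    CRITICAL, HIGH, MEDIUM, LOW
}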
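
The message checks in determineTargetTeam are plain substring matches, so "pod" also matches "podcast" and "node" also matches "nodejs". A minimal sketch of one alternative follows: an ordered rule table that matches whole words only. The class name, the rule API, and the "default-team" fallback are all hypothetical, and whole-word matching means longer forms such as "authentication" have to be listed explicitly instead of relying on "auth":

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;
import java.util.Map;
import java.util.regex.Pattern;

/** Hypothetical ordered keyword table: the first rule whose keyword appears as a whole word wins. */
final class KeywordTeamRules {
    private final List<Map.Entry<Pattern, String>> rules = new ArrayList<>();
    private final String defaultTeam;

    KeywordTeamRules(String defaultTeam) {
        this.defaultTeam = defaultTeam;
    }

    /** Adds a rule; rules are evaluated in insertion order. */
    KeywordTeamRules add(String team, String... keywords) {
        List<String> quoted = new ArrayList<>();
        for (String keyword : keywords) {
            quoted.add(Pattern.quote(keyword.toLowerCase(Locale.ROOT)));
        }
        // \b anchors each alternative to word boundaries
        Pattern pattern = Pattern.compile("\\b(?:" + String.join("|", quoted) + ")\\b");
        rules.add(Map.entry(pattern, team));
        return this;
    }

    String resolve(String message) {
        String text = message.toLowerCase(Locale.ROOT);
        for (Map.Entry<Pattern, String> rule : rules) {
            if (rule.getKey().matcher(text).find()) {
                return rule.getValue();
            }
        }
        return defaultTeam;
    }
}
```

Keeping the rules in a table instead of an if-chain also makes it possible to load them from configuration later without touching the routing code.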

2. Time-Based and Schedule-Aware Routing

// Registered as a bean in OpsgenieConfig (section 3), so it carries no @Service
public class ScheduleAwareRoutingService {
    private final AlertRoutingService alertRoutingService;

    public ScheduleAwareRoutingService(AlertRoutingService alertRoutingService) {
        this.alertRoutingService = alertRoutingService;
    }

    public void routeWithScheduleAwareness(AlertEvent event) {
        // Check if it's outside business hours (this includes weekends)
        if (isOutsideBusinessHours()) {
            // Mark the alert for the on-call engineer instead of the whole team
            event.addTag("on-call");
            event.setMessage("[ON-CALL] " + event.getMessage());
            // Increase priority for after-hours alerts
            if (event.getSeverity() == Severity.MEDIUM) {
                event.setSeverity(Severity.HIGH);
            }
        }

        // Check if it's weekend
        if (isWeekend()) {
            event.addTag("weekend");
            event.setMessage("[WEEKEND] " + event.getMessage());
        }

        alertRoutingService.routeAlert(event);
    }

    private boolean isOutsideBusinessHours() {
        // Calendar.getInstance() uses the JVM's default time zone
        Calendar calendar = Calendar.getInstance();
        int hour = calendar.get(Calendar.HOUR_OF_DAY);
        // Business hours: Mon-Fri, 9 AM - 6 PM
        return isWeekend() || hour < 9 || hour >= 18;
    }

    private boolean isWeekend() {
        Calendar calendar = Calendar.getInstance();
        int dayOfWeek = calendar.get(Calendar.DAY_OF_WEEK);
        return dayOfWeek == Calendar.SATURDAY || dayOfWeek == Calendar.SUNDAY;
    }
}
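
The Calendar-based checks above depend on the real system clock and the JVM's default time zone, which makes them hard to unit test and ties the definition of "business hours" to wherever the service happens to run. As a sketch of an alternative, the same Mon-Fri 9 AM - 6 PM rule can be expressed with java.time and an injected Clock. The class name BusinessHours is hypothetical:

```java
import java.time.Clock;
import java.time.DayOfWeek;
import java.time.ZonedDateTime;

/** Hypothetical helper: business-hours checks driven by an injectable Clock. */
final class BusinessHours {
    private final Clock clock;

    /** The clock carries both the current instant and the time zone to evaluate it in. */
    BusinessHours(Clock clock) {
        this.clock = clock;
    }

    boolean isWeekend() {
        DayOfWeek day = ZonedDateTime.now(clock).getDayOfWeek();
        return day == DayOfWeek.SATURDAY || day == DayOfWeek.SUNDAY;
    }

    /** Business hours: Mon-Fri, 9 AM - 6 PM in the clock's zone. */
    boolean isOutsideBusinessHours() {
        int hour = ZonedDateTime.now(clock).getHour();
        return isWeekend() || hour < 9 || hour >= 18;
    }
}
```

Production code could pass Clock.system(ZoneId.of(...)) for the team's home zone, while tests pass Clock.fixed(...) to pin a specific moment.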

3. Spring Boot Configuration and Controller

@Configuration
public class OpsgenieConfig {
    @Value("${opsgenie.api.key}")
    private String apiKey;

    // These @Bean methods are the single place where the three classes are registered.
    // The classes themselves must not also carry @Component or @Service, otherwise
    // Spring sees duplicate bean definitions.
    @Bean
    public BasicOpsgenieClient opsgenieClient() {
        return new BasicOpsgenieClient(apiKey);
    }

    // matchIfMissing keeps routing enabled when opsgenie.enabled is not set at all
    @Bean
    @ConditionalOnProperty(name = "opsgenie.enabled", havingValue = "true", matchIfMissing = true)
    public AlertRoutingService alertRoutingService() {
        return new AlertRoutingService(opsgenieClient());
    }

    @Bean
    @ConditionalOnProperty(name = "opsgenie.enabled", havingValue = "true", matchIfMissing = true)
    public ScheduleAwareRoutingService scheduleAwareRoutingService() {
        return new ScheduleAwareRoutingService(alertRoutingService());
    }
}

@RestController
@RequestMapping("/api/alerts")
@Slf4j
public class AlertController {
    private final ScheduleAwareRoutingService routingService;

    public AlertController(ScheduleAwareRoutingService routingService) {
        this.routingService = routingService;
    }

    @PostMapping
    public ResponseEntity<String> createAlert(@RequestBody AlertRequest request) {
        try {
            log.info("Received alert request: {}", request.getMessage());
            AlertEvent event = new AlertEvent(
                request.getMessage(),
                request.getSeverity(),
                request.getTags(),
                request.getSource()
            );
            // Add custom properties
            if (request.getProperties() != null) {
                event.getProperties().putAll(request.getProperties());
            }
            routingService.routeWithScheduleAwareness(event);
            return ResponseEntity.accepted().body("Alert routed successfully");
        } catch (Exception e) {
            log.error("Failed to process alert request", e);
            return ResponseEntity.status(HttpStatus.INTERNAL_SERVER_ERROR)
                .body("Failed to process alert: " + e.getMessage());
        }
    }
}

// Request DTO
public class AlertRequest {
    private String message;
    private Severity severity;
    private List<String> tags;
    private String source;
    private Map<String, String> properties;
    // Constructors, getters, setters
}

4. Application Configuration

application.yml:

opsgenie:
  api:
    key: ${OPSGENIE_API_KEY:your-api-key-here}
  enabled: true
logging:
  level:
    com.yourcompany.opsgenie: DEBUG
management:
  endpoints:
    web:
      exposure:
        include: health,metrics,info

Advanced Routing Scenarios

1. Circuit Breaker Alert Routing

@Service
@Slf4j
public class CircuitBreakerAlertService {
    private static final long ALERT_COOLDOWN_MS = 300_000; // 5 minutes

    private final ScheduleAwareRoutingService routingService;
    private final Map<String, Long> lastAlertTime = new ConcurrentHashMap<>();

    public CircuitBreakerAlertService(ScheduleAwareRoutingService routingService) {
        this.routingService = routingService;
    }

    public void handleCircuitBreakerEvent(CircuitBreakerEvent event) {
        String alertKey = event.getServiceName() + "-" + event.getEventType();
        // Check cooldown to avoid alert storms
        if (shouldSendAlert(alertKey)) {
            AlertEvent alertEvent = createAlertFromCircuitBreakerEvent(event);
            routingService.routeWithScheduleAwareness(alertEvent);
            // Recorded only after routing succeeds, so a failed send is retried on the next event
            lastAlertTime.put(alertKey, System.currentTimeMillis());
        }
    }

    private boolean shouldSendAlert(String alertKey) {
        Long lastTime = lastAlertTime.get(alertKey);
        return lastTime == null ||
            (System.currentTimeMillis() - lastTime) > ALERT_COOLDOWN_MS;
    }

    private AlertEvent createAlertFromCircuitBreakerEvent(CircuitBreakerEvent event) {
        String message = String.format(
            "Circuit Breaker %s for service %s - Failures: %d",
            event.getEventType(),
            event.getServiceName(),
            event.getFailureCount()
        );
        // Wrapped in an ArrayList because the schedule-aware router adds tags later
        // and Arrays.asList(...) returns a fixed-size list
        List<String> tags = new ArrayList<>(Arrays.asList(
            "circuit-breaker",
            event.getServiceName(),
            event.getEventType().toString().toLowerCase()
        ));
        Severity severity = event.getEventType() == CircuitBreakerEventType.OPEN ?
            Severity.HIGH : Severity.MEDIUM;
        return new AlertEvent(message, severity, tags, "circuit-breaker-monitor");
    }
}

// Circuit breaker event classes
public class CircuitBreakerEvent {
    private String serviceName;
    private CircuitBreakerEventType eventType;
    private int failureCount;
    // Constructors, getters, setters
}

public enum CircuitBreakerEventType {
    OPEN, HALF_OPEN, CLOSED, FORCED_OPEN
}
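
The cooldown check above reads the map and writes it in separate steps, so two threads that handle the same event key at the same moment can both pass the check and both send an alert. A minimal sketch of an atomic variant, using ConcurrentHashMap.compute so that the check and the update happen as one operation per key. The class name AlertCooldown and the injected time source are hypothetical:

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;
import java.util.function.LongSupplier;

/** Hypothetical cooldown gate: at most one caller per key passes within each cooldown window. */
final class AlertCooldown {
    private final ConcurrentMap<String, Long> lastSent = new ConcurrentHashMap<>();
    private final long cooldownMs;
    private final LongSupplier clock; // e.g. System::currentTimeMillis

    AlertCooldown(long cooldownMs, LongSupplier clock) {
        this.cooldownMs = cooldownMs;
        this.clock = clock;
    }

    /** Returns true if the caller may send an alert for this key, and records the send time. */
    boolean tryAcquire(String key) {
        long now = clock.getAsLong();
        boolean[] allowed = {false};
        // compute() runs atomically per key, so check and update cannot interleave
        lastSent.compute(key, (k, previous) -> {
            if (previous == null || now - previous > cooldownMs) {
                allowed[0] = true;
                return now;
            }
            return previous;
        });
        return allowed[0];
    }
}
```

Unlike the service above, this variant records the timestamp before the alert is sent, so a failed send is not retried until the cooldown expires. Whether that trade-off is acceptable depends on how reliable the downstream call is.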

2. Multi-Region Alert Routing

@Service
public class MultiRegionAlertRouter {
    private final AlertRoutingService alertRoutingService;
    private final Map<String, String> regionTeams;

    public MultiRegionAlertRouter(AlertRoutingService alertRoutingService) {
        this.alertRoutingService = alertRoutingService;
        this.regionTeams = initializeRegionTeams();
    }

    private Map<String, String> initializeRegionTeams() {
        Map<String, String> teams = new HashMap<>();
        teams.put("us-east", "us-platform-team");
        teams.put("us-west", "us-platform-team");
        teams.put("eu-central", "eu-platform-team");
        teams.put("ap-southeast", "apac-platform-team");
        return teams;
    }

    public void routeRegionalAlert(RegionalAlertEvent event) {
        // Unknown regions fall back to the global team
        String team = regionTeams.getOrDefault(event.getRegion(), "global-platform-team");

        AlertEvent alertEvent = new AlertEvent(
            "[" + event.getRegion().toUpperCase() + "] " + event.getMessage(),
            event.getSeverity(),
            event.getTags(),
            event.getSource()
        );
        alertEvent.addTag("region:" + event.getRegion());
        alertEvent.getProperties().put("region", event.getRegion());
        // Record the regional team on the alert. Note that AlertRoutingService.routeAlert(),
        // as written earlier, still picks the team from tags and message content, so it has
        // to read the "team" property for this regional override to take effect.
        alertEvent.getProperties().put("team", team);
        alertRoutingService.routeAlert(alertEvent);
    }
}

public class RegionalAlertEvent extends AlertEvent {
    private String region;

    public RegionalAlertEvent(String message, Severity severity,
            List<String> tags, String source, String region) {
        super(message, severity, tags, source);
        this.region = region;
    }
    // Getters, setters
}
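
MultiRegionAlertRouter stores the regional team under the "team" property, but routeAlert chooses the team through determineTargetTeam, which looks only at tags and message content. One way to let the regional team win, sketched here with a hypothetical helper, is to check for an explicit "team" property first and fall back to content-based routing only when it is absent:

```java
import java.util.Map;
import java.util.function.Supplier;

/** Hypothetical helper: an explicit "team" property takes precedence over content-based routing. */
final class TeamResolver {
    static final String TEAM_PROPERTY = "team";

    private TeamResolver() {}

    /**
     * @param properties       the alert's custom properties (may be null)
     * @param contentBasedTeam supplies the tag/message-based team; evaluated only when needed
     */
    static String resolve(Map<String, String> properties, Supplier<String> contentBasedTeam) {
        String explicit = properties == null ? null : properties.get(TEAM_PROPERTY);
        if (explicit != null && !explicit.isBlank()) {
            return explicit.trim();
        }
        return contentBasedTeam.get();
    }
}
```

Inside AlertRoutingService this could be called as TeamResolver.resolve(event.getProperties(), () -> determineTargetTeam(event)). Since the controller copies request properties onto the event, any caller of the HTTP endpoint could then choose the team too, which may or may not be desirable.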

Testing the Alert Routing

import static org.mockito.ArgumentMatchers.anyString;
import static org.mockito.ArgumentMatchers.eq;
import static org.mockito.Mockito.times;
import static org.mockito.Mockito.verify;

@SpringBootTest
@TestPropertySource(properties = {
    "opsgenie.api.key=test-key",
    "opsgenie.enabled=true"
})
class AlertRoutingServiceTest {
    @Autowired
    private AlertRoutingService alertRoutingService;

    // Replaces the real client so no HTTP call leaves the test
    @MockBean
    private BasicOpsgenieClient opsgenieClient;

    @Test
    void testDatabaseAlertRouting() {
        // Given
        AlertEvent event = new AlertEvent(
            "Database connection pool exhausted",
            Severity.HIGH,
            Arrays.asList("database", "production"),
            "monitoring-system"
        );
        // When
        alertRoutingService.routeAlert(event);
        // Then
        verify(opsgenieClient, times(1))
            .createAlert(anyString(), eq("database-team"), eq("P2"));
    }

    @Test
    void testPaymentAlertRouting() {
        // Given
        AlertEvent event = new AlertEvent(
            "Payment gateway timeout",
            Severity.CRITICAL,
            Arrays.asList("payment", "stripe"),
            "payment-service"
        );
        // When
        alertRoutingService.routeAlert(event);
        // Then
        verify(opsgenieClient, times(1))
            .createAlert(anyString(), eq("payments-team"), eq("P1"));
    }
}

Best Practices for Alert Routing

  1. Avoid Alert Storms: Implement cooldown periods and deduplication
  2. Use Meaningful Tags: Tag alerts with relevant metadata for better routing
  3. Test Routing Rules: Regularly test that alerts route to the correct teams
  4. Monitor Alert Volume: Track alert rates to identify noisy alerts
  5. Implement Escalation Policies: Ensure critical alerts get attention
  6. Use Alert Templates: Standardize alert formats for consistency
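
For item 4, alert volume can be tracked with very little code before any metrics library is involved. The following is a minimal sketch that counts routed alerts per team in memory. The class name is hypothetical, and a real deployment would more likely export these counts to its existing metrics system:

```java
import java.util.Map;
import java.util.TreeMap;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.LongAdder;

/** Hypothetical in-memory counter of routed alerts per team. */
final class AlertVolumeTracker {
    private final ConcurrentHashMap<String, LongAdder> counts = new ConcurrentHashMap<>();

    /** Call once for every alert that is routed to the given team. */
    void record(String team) {
        counts.computeIfAbsent(team, k -> new LongAdder()).increment();
    }

    /** Returns a point-in-time copy of the counts, sorted by team name. */
    Map<String, Long> snapshot() {
        Map<String, Long> result = new TreeMap<>();
        counts.forEach((team, adder) -> result.put(team, adder.sum()));
        return result;
    }
}
```

Logging the snapshot at a fixed interval is enough to spot which team, and which alert source behind it, generates the most noise.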

Conclusion

Implementing Opsgenie alert routing in Java applications lets you direct alerts automatically to the right teams based on content, severity, time, and other contextual factors. The patterns and code examples in this guide are a starting point for an alerting system that can help reduce mean time to resolution (MTTR) and ensure critical issues receive appropriate attention.

The combination of content-based routing, schedule awareness, and regional routing provides a comprehensive foundation for managing alerts in complex, distributed systems. Remember to continuously refine your routing rules based on incident patterns and team feedback to optimize your alert management workflow.
