PromQL (Prometheus Query Language) is a functional query language for selecting and aggregating the time series data that Prometheus collects. A Java application can expose metrics for Prometheus to scrape and can also run PromQL queries against the Prometheus HTTP API, which gives insight into application performance, business metrics, and system behavior.
Core PromQL Concepts
Basic Data Types
- Instant Vector: A set of time series, each holding a single sample, all sharing the same timestamp
- Range Vector: A set of time series, each holding a range of samples over time
- Scalar: A simple numeric floating-point value
- String: A simple string value (currently unused in PromQL)
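When a query is sent through the Prometheus HTTP API, the response reports which of these types the expression evaluated to in its resultType field (vector, matrix, scalar, or string). The following is a minimal sketch of mapping that field to a Java enum; the enum and its method names are illustrative and not part of any Prometheus client library.

```java
import java.util.Locale;

/** Hypothetical helper: maps the "resultType" field of a Prometheus API response to a Java enum. */
enum PromResultType {
    INSTANT_VECTOR("vector"),
    RANGE_VECTOR("matrix"),
    SCALAR("scalar"),
    STRING("string");

    private final String apiName;

    PromResultType(String apiName) {
        this.apiName = apiName;
    }

    /** The name used in the JSON response. */
    String apiName() {
        return apiName;
    }

    /** Looks up the enum constant for a resultType string, ignoring case and surrounding whitespace. */
    static PromResultType fromApi(String resultType) {
        String normalized = resultType.trim().toLowerCase(Locale.ROOT);
        for (PromResultType type : values()) {
            if (type.apiName.equals(normalized)) {
                return type;
            }
        }
        throw new IllegalArgumentException("Unknown resultType: " + resultType);
    }
}
```

An instant query (/api/v1/query) on a plain selector typically yields vector, while a range query (/api/v1/query_range) yields matrix.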
Integration with Java Metrics
Dependencies
<!-- Micrometer Prometheus Registry -->
<dependency>
  <groupId>io.micrometer</groupId>
  <artifactId>micrometer-registry-prometheus</artifactId>
  <version>1.11.0</version>
</dependency>
<!-- Prometheus Java Client -->
<dependency>
  <groupId>io.prometheus</groupId>
  <artifactId>simpleclient</artifactId>
  <version>0.16.0</version>
</dependency>
<dependency>
  <groupId>io.prometheus</groupId>
  <artifactId>simpleclient_httpserver</artifactId>
  <version>0.16.0</version>
</dependency>
<!-- Statistics used by the trend and correlation examples (DescriptiveStatistics, SimpleRegression, PearsonsCorrelation) -->
<!-- The Prometheus Java client has no query artifact; queries go through the HTTP API with Spring's RestTemplate -->
<dependency>
  <groupId>org.apache.commons</groupId>
  <artifactId>commons-math3</artifactId>
  <version>3.6.1</version>
</dependency>
Basic PromQL Query Implementation
Example 1: PromQL Client for Java Applications
@Component
@Slf4j
public class PromQLClient {
private final PrometheusHttpClient prometheusClient;
private final ObjectMapper objectMapper;
public PromQLClient(@Value("${prometheus.url:http://localhost:9090}") String prometheusUrl,
ObjectMapper objectMapper) {
this.prometheusClient = new PrometheusHttpClient(prometheusUrl);
this.objectMapper = objectMapper;
}
public QueryResult executeQuery(String query) {
return executeQuery(query, null);
}
public QueryResult executeQuery(String query, Long time) {
try {
Map<String, String> params = new HashMap<>();
params.put("query", query);
if (time != null) {
params.put("time", time.toString());
}
String response = prometheusClient.get("/api/v1/query", params);
return objectMapper.readValue(response, QueryResult.class);
} catch (Exception e) {
log.error("Failed to execute PromQL query: {}", query, e);
throw new PromQLException("Query execution failed: " + e.getMessage(), e);
}
}
public QueryResult executeRangeQuery(String query, Long start, Long end, String step) {
try {
Map<String, String> params = new HashMap<>();
params.put("query", query);
params.put("start", start.toString());
params.put("end", end.toString());
params.put("step", step);
String response = prometheusClient.get("/api/v1/query_range", params);
return objectMapper.readValue(response, QueryResult.class);
} catch (Exception e) {
log.error("Failed to execute PromQL range query: {}", query, e);
throw new PromQLException("Range query execution failed: " + e.getMessage(), e);
}
}
public List<MetricSeries> queryMetricSeries(String metricName, Map<String, String> labels) {
StringBuilder queryBuilder = new StringBuilder(metricName);
if (labels != null && !labels.isEmpty()) {
queryBuilder.append("{");
List<String> labelFilters = labels.entrySet().stream()
.map(entry -> entry.getKey() + "=\"" + entry.getValue().replace("\\", "\\\\").replace("\"", "\\\"") + "\"") // escape backslashes and quotes
.collect(Collectors.toList());
queryBuilder.append(String.join(",", labelFilters));
queryBuilder.append("}");
}
QueryResult result = executeQuery(queryBuilder.toString());
return extractMetricSeries(result);
}
// Public so that other services can turn a raw QueryResult into MetricSeries
public List<MetricSeries> extractMetricSeries(QueryResult result) {
List<MetricSeries> series = new ArrayList<>();
if (result != null && result.getData() != null && result.getData().getResult() != null) {
for (QueryResult.ResultItem item : result.getData().getResult()) {
MetricSeries metricSeries = new MetricSeries();
metricSeries.setMetric(item.getMetric());
List<DataPoint> dataPoints = new ArrayList<>();
if (item.getValues() != null) {
for (List<Object> value : item.getValues()) {
DataPoint dataPoint = new DataPoint();
dataPoint.setTimestamp(((Number) value.get(0)).doubleValue()); // Jackson may yield Integer, Long, or Double
dataPoint.setValue(value.get(1).toString());
dataPoints.add(dataPoint);
}
} else if (item.getValue() != null) {
DataPoint dataPoint = new DataPoint();
dataPoint.setTimestamp(((Number) item.getValue().get(0)).doubleValue()); // Jackson may yield Integer, Long, or Double
dataPoint.setValue(item.getValue().get(1).toString());
dataPoints.add(dataPoint);
}
metricSeries.setDataPoints(dataPoints);
series.add(metricSeries);
}
}
return series;
}
// Response DTOs
@Data
@JsonIgnoreProperties(ignoreUnknown = true) // tolerate extra response fields such as "warnings"
public static class QueryResult {
private String status;
private Data data;
private String errorType;
private String error;
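@lombok.Data // fully qualified: inside QueryResult the nested class named Data shadows the Lombok annotation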
public static class Data {
private String resultType;
private List<ResultItem> result;
}
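@lombok.Data // fully qualified: inside QueryResult the nested class named Data shadows the Lombok annotation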
public static class ResultItem {
private Map<String, String> metric;
private List<Object> value;
private List<List<Object>> values;
}
}
@Data
public static class MetricSeries {
private Map<String, String> metric;
private List<DataPoint> dataPoints;
}
@Data
public static class DataPoint {
private double timestamp;
private String value;
}
}
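The label selector built in queryMetricSeries places label values between double quotes, so a value containing a quote or backslash could produce an invalid or unintended query. The following is a minimal sketch of an escaping helper; PromQlSelectors and its methods are hypothetical names, and the sketch assumes label values follow the escaping rules of double-quoted PromQL strings.

```java
import java.util.Map;
import java.util.stream.Collectors;

/** Hypothetical helper for building PromQL selectors with escaped label values. */
final class PromQlSelectors {
    private PromQlSelectors() {
    }

    /** Escapes backslash, double quote, and newline for use inside a double-quoted PromQL string. */
    static String escapeLabelValue(String value) {
        return value
            .replace("\\", "\\\\")
            .replace("\"", "\\\"")
            .replace("\n", "\\n");
    }

    /** Builds metric{label="value",...}; returns the bare metric name when there are no labels. */
    static String selector(String metricName, Map<String, String> labels) {
        if (labels == null || labels.isEmpty()) {
            return metricName;
        }
        return labels.entrySet().stream()
            .map(e -> e.getKey() + "=\"" + escapeLabelValue(e.getValue()) + "\"")
            .collect(Collectors.joining(",", metricName + "{", "}"));
    }
}
```

Label names are passed through unchanged here; if they can come from user input, they should be validated separately.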
// Custom exception
class PromQLException extends RuntimeException {
public PromQLException(String message) {
super(message);
}
public PromQLException(String message, Throwable cause) {
super(message, cause);
}
}
// HTTP Client for Prometheus
@Component
@Slf4j
public class PrometheusHttpClient {
private final String baseUrl;
private final RestTemplate restTemplate;
public PrometheusHttpClient(@Value("${prometheus.url:http://localhost:9090}") String baseUrl) {
this.baseUrl = baseUrl;
this.restTemplate = new RestTemplate();
this.restTemplate.setErrorHandler(new DefaultResponseErrorHandler() {
@Override
public boolean hasError(ClientHttpResponse response) throws IOException {
return response.getStatusCode().is5xxServerError();
}
});
}
public String get(String endpoint, Map<String, String> params) {
try {
UriComponentsBuilder builder = UriComponentsBuilder.fromHttpUrl(baseUrl + endpoint);
if (params != null) {
params.forEach(builder::queryParam);
}
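// Encode exactly once and pass a URI; RestTemplate treats a String as a URI template and would encode it a second time
URI uri = builder.build().encode().toUri();
log.debug("Executing Prometheus query: {}", uri);
ResponseEntity<String> response = restTemplate.getForEntity(uri, String.class);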
return response.getBody();
} catch (Exception e) {
log.error("HTTP request failed for endpoint: {}", endpoint, e);
throw new PromQLException("HTTP request failed: " + e.getMessage(), e);
}
}
}
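PromQL expressions contain characters such as braces, brackets, and quotes that must be percent-encoded exactly once before they are sent as the query parameter. The following is a minimal sketch of the same URL construction using only the JDK; PrometheusUris is a hypothetical helper, and the base URL in the usage note is just the default from the listing above.

```java
import java.net.URI;
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;
import java.util.Map;
import java.util.stream.Collectors;

/** Hypothetical helper: builds a Prometheus API URI with each parameter encoded once. */
final class PrometheusUris {
    private PrometheusUris() {
    }

    static URI build(String baseUrl, String endpoint, Map<String, String> params) {
        // URLEncoder applies form encoding: reserved characters become %XX and spaces become '+'
        String query = params.entrySet().stream()
            .map(e -> URLEncoder.encode(e.getKey(), StandardCharsets.UTF_8)
                + "=" + URLEncoder.encode(e.getValue(), StandardCharsets.UTF_8))
            .collect(Collectors.joining("&"));
        return URI.create(baseUrl + endpoint + (query.isEmpty() ? "" : "?" + query));
    }
}
```

For example, PrometheusUris.build("http://localhost:9090", "/api/v1/query", Map.of("query", "up{job=\"api\"}")) produces a URI whose query string carries the selector in percent-encoded form, ready to hand to an HTTP client that accepts a java.net.URI.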
Common PromQL Patterns for Java Applications
Example 2: Application Performance Queries
@Service
@Slf4j
public class ApplicationPerformanceQueries {
@Getter // generates getPromQLClient(), which MetricsController calls
private final PromQLClient promQLClient;
public ApplicationPerformanceQueries(PromQLClient promQLClient) {
this.promQLClient = promQLClient;
}
// CPU and Memory Usage
public List<MetricSeries> getJvmMemoryUsage(String application, String memoryPool) {
String query = String.format(
"jvm_memory_used_bytes{application=\"%s\", area=\"heap\"} / " +
"jvm_memory_max_bytes{application=\"%s\", area=\"heap\"} * 100",
application, application);
if (memoryPool != null) {
// Micrometer tags each memory pool with the "id" label (for example "G1 Old Gen")
query = String.format(
"jvm_memory_used_bytes{application=\"%s\", area=\"heap\", id=\"%s\"} / " +
"jvm_memory_max_bytes{application=\"%s\", area=\"heap\", id=\"%s\"} * 100",
application, memoryPool, application, memoryPool);
}
long now = System.currentTimeMillis() / 1000;
return promQLClient.extractMetricSeries(promQLClient.executeRangeQuery(
query, now - 3600, now, "30s")); // last hour at 30-second resolution
}
public List<MetricSeries> getGarbageCollectionTime(String application, String gcType) {
String query = String.format(
"rate(jvm_gc_pause_seconds_sum{application=\"%s\"}[5m]) / " +
"rate(jvm_gc_pause_seconds_count{application=\"%s\"}[5m])",
application, application);
if (gcType != null) {
query = String.format(
"rate(jvm_gc_pause_seconds_sum{application=\"%s\", gc=\"%s\"}[5m]) / " +
"rate(jvm_gc_pause_seconds_count{application=\"%s\", gc=\"%s\"}[5m])",
application, gcType, application, gcType);
}
return promQLClient.extractMetricSeries(promQLClient.executeQuery(query));
}
// HTTP Request Metrics
public List<MetricSeries> getRequestRate(String application, String endpoint) {
String query = String.format(
"rate(http_server_requests_seconds_count{application=\"%s\"}[5m])",
application);
if (endpoint != null) {
query = String.format(
"rate(http_server_requests_seconds_count{application=\"%s\", uri=\"%s\"}[5m])",
application, endpoint);
}
long now = System.currentTimeMillis() / 1000;
return promQLClient.extractMetricSeries(promQLClient.executeRangeQuery(
query, now - 1800, now, "15s")); // last 30 minutes at 15-second resolution
}
public List<MetricSeries> getErrorRate(String application) {
// Aggregate both sides with sum(): dividing the raw vectors would match each 5xx series only against itself
String query = String.format(
"sum(rate(http_server_requests_seconds_count{application=\"%s\", status=~\"5..\"}[5m])) / " +
"sum(rate(http_server_requests_seconds_count{application=\"%s\"}[5m])) * 100",
application, application);
long now = System.currentTimeMillis() / 1000;
return promQLClient.extractMetricSeries(promQLClient.executeRangeQuery(
query, now - 3600, now, "1m"));
}
public List<MetricSeries> getResponseTimePercentile(String application, double percentile) {
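// sum by (le) merges the buckets of all matching series into one histogram.
// %s avoids locale-specific decimal commas and keeps quantiles such as 0.999 intact.
String query = String.format(
"histogram_quantile(%s, " +
"sum by (le) (rate(http_server_requests_seconds_bucket{application=\"%s\"}[5m])))",
percentile, application);
long now = System.currentTimeMillis() / 1000;
return promQLClient.extractMetricSeries(promQLClient.executeRangeQuery(
query, now - 3600, now, "1m"));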
}
// Database Metrics
public List<MetricSeries> getDatabaseConnectionPoolUsage(String application) {
String query = String.format(
"hikaricp_connections_active{application=\"%s\"} / " +
"hikaricp_connections_max{application=\"%s\"} * 100",
application, application);
long now = System.currentTimeMillis() / 1000;
return promQLClient.extractMetricSeries(promQLClient.executeRangeQuery(
query, now - 1800, now, "30s"));
}
public List<MetricSeries> getDatabaseQueryRate(String application) {
String query = String.format(
"rate(hikaricp_connections_usage_seconds_count{application=\"%s\"}[5m])",
application);
long now = System.currentTimeMillis() / 1000;
return promQLClient.extractMetricSeries(promQLClient.executeRangeQuery(
query, now - 3600, now, "1m"));
}
// Custom Business Metrics
public List<MetricSeries> getBusinessTransactionRate(String application, String transactionType) {
String query = String.format(
"rate(business_transactions_total{application=\"%s\", type=\"%s\"}[5m])",
application, transactionType);
long now = System.currentTimeMillis() / 1000;
return promQLClient.extractMetricSeries(promQLClient.executeRangeQuery(
query, now - 7200, now, "2m")); // last 2 hours at 2-minute resolution
}
public List<MetricSeries> getBusinessTransactionDuration(String application, String transactionType) {
String query = String.format(
"rate(business_transaction_duration_seconds_sum{application=\"%s\", type=\"%s\"}[5m]) / " +
"rate(business_transaction_duration_seconds_count{application=\"%s\", type=\"%s\"}[5m])",
application, transactionType, application, transactionType);
long now = System.currentTimeMillis() / 1000;
return promQLClient.extractMetricSeries(promQLClient.executeRangeQuery(
query, now - 7200, now, "2m"));
}
// System Health Indicators
public Map<String, Object> getApplicationHealthSummary(String application) {
Map<String, Object> healthSummary = new HashMap<>();
try {
// Check error rate
List<MetricSeries> errorRate = getErrorRate(application);
double currentErrorRate = getLatestValue(errorRate);
healthSummary.put("errorRate", currentErrorRate);
// Check response time
List<MetricSeries> p95ResponseTime = getResponseTimePercentile(application, 0.95);
double currentP95 = getLatestValue(p95ResponseTime);
healthSummary.put("p95ResponseTime", currentP95);
// Check memory usage
List<MetricSeries> memoryUsage = getJvmMemoryUsage(application, null);
double currentMemoryUsage = getLatestValue(memoryUsage);
healthSummary.put("memoryUsage", currentMemoryUsage);
// Determine overall health status
String status = determineHealthStatus(currentErrorRate, currentP95, currentMemoryUsage);
healthSummary.put("status", status);
healthSummary.put("timestamp", System.currentTimeMillis());
} catch (Exception e) {
log.error("Failed to get health summary for application: {}", application, e);
healthSummary.put("status", "UNKNOWN");
healthSummary.put("error", e.getMessage());
}
return healthSummary;
}
private double getLatestValue(List<MetricSeries> series) {
if (series == null || series.isEmpty()) {
return 0.0;
}
MetricSeries firstSeries = series.get(0);
if (firstSeries.getDataPoints() == null || firstSeries.getDataPoints().isEmpty()) {
return 0.0;
}
// Get the most recent data point
DataPoint latestPoint = firstSeries.getDataPoints().get(firstSeries.getDataPoints().size() - 1);
return Double.parseDouble(latestPoint.getValue());
}
private String determineHealthStatus(double errorRate, double p95ResponseTime, double memoryUsage) {
if (errorRate > 10.0) { // More than 10% error rate
return "CRITICAL";
} else if (errorRate > 5.0) { // More than 5% error rate
return "WARNING";
} else if (p95ResponseTime > 5.0) { // P95 response time > 5 seconds
return "WARNING";
} else if (memoryUsage > 90.0) { // Memory usage > 90%
return "WARNING";
} else {
return "HEALTHY";
}
}
}
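The getLatestValue helper above parses sample values with Double.parseDouble. The Prometheus HTTP API transfers sample values as JSON strings, and special values arrive as "NaN", "+Inf", and "-Inf"; Java parses "NaN" but rejects the Inf spellings, which can occur when, for example, a quantile falls into the +Inf bucket. The following is a minimal sketch of a tolerant parser; SampleValues is a hypothetical helper name.

```java
/** Hypothetical helper: parses sample values as they appear in Prometheus API responses. */
final class SampleValues {
    private SampleValues() {
    }

    static double parse(String raw) {
        switch (raw) {
            case "+Inf":
            case "Inf":
                return Double.POSITIVE_INFINITY;
            case "-Inf":
                return Double.NEGATIVE_INFINITY;
            case "NaN":
                return Double.NaN;
            default:
                // Ordinary decimal or scientific notation
                return Double.parseDouble(raw);
        }
    }
}
```

Callers such as determineHealthStatus would still need to decide how NaN or infinite values should affect the reported status, since comparisons against NaN are always false.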
Advanced PromQL Analysis
Example 3: Trend Analysis and Anomaly Detection
@Service
@Slf4j
public class TrendAnalysisService {
private final PromQLClient promQLClient;
public TrendAnalysisService(PromQLClient promQLClient) {
this.promQLClient = promQLClient;
}
// Trend Analysis
public TrendAnalysisResult analyzeRequestTrend(String application, int days) {
long endTime = System.currentTimeMillis() / 1000;
long startTime = endTime - (days * 24 * 60 * 60);
// Get request rate data
String query = String.format(
"rate(http_server_requests_seconds_count{application=\"%s\"}[5m])",
application);
List<MetricSeries> series = promQLClient.extractMetricSeries(
promQLClient.executeRangeQuery(query, startTime, endTime, "1h"));
return analyzeTrend(series, "request_rate");
}
public TrendAnalysisResult analyzeMemoryTrend(String application, int days) {
long endTime = System.currentTimeMillis() / 1000;
long startTime = endTime - (days * 24 * 60 * 60);
// Get memory usage data
String query = String.format(
"jvm_memory_used_bytes{application=\"%s\", area=\"heap\"} / " +
"jvm_memory_max_bytes{application=\"%s\", area=\"heap\"} * 100",
application, application);
List<MetricSeries> series = promQLClient.extractMetricSeries(
promQLClient.executeRangeQuery(query, startTime, endTime, "1h"));
return analyzeTrend(series, "memory_usage");
}
private TrendAnalysisResult analyzeTrend(List<MetricSeries> series, String metricName) {
TrendAnalysisResult result = new TrendAnalysisResult();
result.setMetricName(metricName);
result.setAnalysisTime(System.currentTimeMillis());
if (series == null || series.isEmpty()) {
result.setStatus("NO_DATA");
return result;
}
MetricSeries dataSeries = series.get(0);
List<DataPoint> dataPoints = dataSeries.getDataPoints();
if (dataPoints == null || dataPoints.size() < 2) {
result.setStatus("INSUFFICIENT_DATA");
return result;
}
// Calculate basic statistics
DescriptiveStatistics stats = new DescriptiveStatistics();
dataPoints.forEach(point ->
stats.addValue(Double.parseDouble(point.getValue())));
// Latest value is the last data point, not the maximum
result.setCurrentValue(Double.parseDouble(dataPoints.get(dataPoints.size() - 1).getValue()));
result.setAverage(stats.getMean());
result.setMinValue(stats.getMin());
result.setMaxValue(stats.getMax());
result.setStandardDeviation(stats.getStandardDeviation());
// Calculate trend
double trend = calculateLinearTrend(dataPoints);
result.setTrend(trend);
result.setTrendDirection(trend > 0 ? "INCREASING" : trend < 0 ? "DECREASING" : "STABLE");
// Detect anomalies
List<Anomaly> anomalies = detectAnomalies(dataPoints, stats.getMean(), stats.getStandardDeviation());
result.setAnomalies(anomalies);
result.setAnomalyCount(anomalies.size());
// Set status based on analysis
result.setStatus(determineTrendStatus(result));
return result;
}
private double calculateLinearTrend(List<DataPoint> dataPoints) {
if (dataPoints.size() < 2) return 0.0;
SimpleRegression regression = new SimpleRegression();
for (int i = 0; i < dataPoints.size(); i++) {
DataPoint point = dataPoints.get(i);
regression.addData(i, Double.parseDouble(point.getValue()));
}
return regression.getSlope();
}
private List<Anomaly> detectAnomalies(List<DataPoint> dataPoints, double mean, double stdDev) {
List<Anomaly> anomalies = new ArrayList<>();
double threshold = mean + (2 * stdDev); // 2 standard deviations
for (DataPoint point : dataPoints) {
double value = Double.parseDouble(point.getValue());
if (value > threshold) {
Anomaly anomaly = new Anomaly();
anomaly.setTimestamp(point.getTimestamp());
anomaly.setValue(value);
anomaly.setDeviation(value - mean);
anomalies.add(anomaly);
}
}
return anomalies;
}
private String determineTrendStatus(TrendAnalysisResult result) {
if (result.getAnomalyCount() > 3) {
return "ANOMALOUS";
} else if (Math.abs(result.getTrend()) > result.getStandardDeviation() * 0.5) {
return "TRENDING";
} else {
return "STABLE";
}
}
// Correlation Analysis
public CorrelationResult analyzeCorrelation(String application, String metric1, String metric2) {
long endTime = System.currentTimeMillis() / 1000;
long startTime = endTime - (24 * 60 * 60); // 24 hours
// Get both metrics
List<MetricSeries> series1 = promQLClient.extractMetricSeries(
promQLClient.executeRangeQuery(
metric1 + "{application=\"" + application + "\"}",
startTime, endTime, "5m"));
List<MetricSeries> series2 = promQLClient.extractMetricSeries(
promQLClient.executeRangeQuery(
metric2 + "{application=\"" + application + "\"}",
startTime, endTime, "5m"));
return calculateCorrelation(series1, series2, metric1, metric2);
}
private CorrelationResult calculateCorrelation(List<MetricSeries> series1,
List<MetricSeries> series2,
String metric1, String metric2) {
CorrelationResult result = new CorrelationResult();
result.setMetric1(metric1);
result.setMetric2(metric2);
result.setAnalysisTime(System.currentTimeMillis());
if (series1 == null || series2 == null ||
series1.isEmpty() || series2.isEmpty()) {
result.setCorrelation(0.0);
result.setStrength("NO_DATA");
return result;
}
// Align data points by timestamp
List<double[]> alignedData = alignDataPoints(series1.get(0), series2.get(0));
if (alignedData.size() < 10) {
result.setCorrelation(0.0);
result.setStrength("INSUFFICIENT_DATA");
return result;
}
// Calculate Pearson correlation
double correlation = calculatePearsonCorrelation(alignedData);
result.setCorrelation(correlation);
result.setStrength(determineCorrelationStrength(Math.abs(correlation)));
return result;
}
// Returns {value1, value2} pairs for timestamps present in both series.
// A list is used because a map keyed by value1 would silently drop samples with repeated values.
private List<double[]> alignDataPoints(MetricSeries series1, MetricSeries series2) {
List<double[]> aligned = new ArrayList<>();
Map<Double, Double> points1 = createTimestampMap(series1);
Map<Double, Double> points2 = createTimestampMap(series2);
// Find common timestamps
for (Map.Entry<Double, Double> entry : points1.entrySet()) {
Double other = points2.get(entry.getKey());
if (other != null) {
aligned.add(new double[] {entry.getValue(), other});
}
}
return aligned;
}
private Map<Double, Double> createTimestampMap(MetricSeries series) {
Map<Double, Double> map = new HashMap<>();
if (series.getDataPoints() != null) {
for (DataPoint point : series.getDataPoints()) {
map.put(point.getTimestamp(), Double.parseDouble(point.getValue()));
}
}
return map;
}
private double calculatePearsonCorrelation(List<double[]> data) {
double[] x = data.stream().mapToDouble(pair -> pair[0]).toArray();
double[] y = data.stream().mapToDouble(pair -> pair[1]).toArray();
return new PearsonsCorrelation().correlation(x, y);
}
private String determineCorrelationStrength(double correlation) {
if (correlation >= 0.8) return "STRONG";
if (correlation >= 0.5) return "MODERATE";
if (correlation >= 0.3) return "WEAK";
return "NONE";
}
// Data classes for analysis results
@Data
public static class TrendAnalysisResult {
private String metricName;
private long analysisTime;
private double currentValue;
private double average;
private double minValue;
private double maxValue;
private double standardDeviation;
private double trend;
private String trendDirection;
private List<Anomaly> anomalies;
private int anomalyCount;
private String status;
}
@Data
public static class Anomaly {
private double timestamp;
private double value;
private double deviation;
}
@Data
public static class CorrelationResult {
private String metric1;
private String metric2;
private long analysisTime;
private double correlation;
private String strength;
}
}
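The trend and anomaly steps above rely on Apache Commons Math. To make the arithmetic explicit, the following is a minimal sketch in plain Java of a least-squares slope over the sample index and a standard-deviation outlier check. SeriesStats is a hypothetical name; unlike detectAnomalies above, this sketch flags deviations in both directions and uses the sample (n - 1) standard deviation.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical helper: plain-Java versions of the slope and outlier calculations. */
final class SeriesStats {
    private SeriesStats() {
    }

    /** Least-squares slope of y against the index 0..n-1, in value units per step. */
    static double slope(double[] y) {
        int n = y.length;
        if (n < 2) {
            return 0.0;
        }
        double meanX = (n - 1) / 2.0;
        double meanY = 0.0;
        for (double v : y) {
            meanY += v;
        }
        meanY /= n;
        double num = 0.0;
        double den = 0.0;
        for (int i = 0; i < n; i++) {
            num += (i - meanX) * (y[i] - meanY);
            den += (i - meanX) * (i - meanX);
        }
        return num / den;
    }

    /** Indexes of values more than k sample standard deviations away from the mean. */
    static List<Integer> outliers(double[] y, double k) {
        List<Integer> result = new ArrayList<>();
        int n = y.length;
        if (n < 2) {
            return result;
        }
        double mean = 0.0;
        for (double v : y) {
            mean += v;
        }
        mean /= n;
        double sumSq = 0.0;
        for (double v : y) {
            sumSq += (v - mean) * (v - mean);
        }
        double stdDev = Math.sqrt(sumSq / (n - 1));
        for (int i = 0; i < n; i++) {
            if (Math.abs(y[i] - mean) > k * stdDev) {
                result.add(i);
            }
        }
        return result;
    }
}
```

Because the slope is expressed per step, its magnitude depends on the step passed to the range query (1h in the examples above), which matters when comparing it against a threshold.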
Alerting and Notification
Example 4: PromQL-based Alerting System
@Service
@Slf4j
public class PromQLAlertingService {
private final PromQLClient promQLClient;
private final List<AlertRule> alertRules;
private final ScheduledExecutorService alertExecutor;
private final Map<String, AlertState> alertStates;
public PromQLAlertingService(PromQLClient promQLClient) {
this.promQLClient = promQLClient;
this.alertRules = loadAlertRules();
this.alertExecutor = Executors.newScheduledThreadPool(3);
this.alertStates = new ConcurrentHashMap<>();
startAlertMonitoring();
}
private List<AlertRule> loadAlertRules() {
// Load alert rules from configuration
return List.of(
new AlertRule("high_error_rate",
"sum(rate(http_server_requests_seconds_count{status=~\"5..\"}[5m])) / " +
"sum(rate(http_server_requests_seconds_count[5m])) * 100 > 5",
"Error rate exceeds 5%", "CRITICAL", 300),
new AlertRule("high_memory_usage",
"jvm_memory_used_bytes{area=\"heap\"} / " +
"jvm_memory_max_bytes{area=\"heap\"} * 100 > 90",
"JVM heap memory usage exceeds 90%", "WARNING", 180),
new AlertRule("high_response_time",
"histogram_quantile(0.95, sum by (le) (rate(http_server_requests_seconds_bucket[5m]))) > 2",
"95th percentile response time exceeds 2 seconds", "WARNING", 300),
new AlertRule("service_down",
"up == 0",
"Service is down", "CRITICAL", 60)
);
}
private void startAlertMonitoring() {
// Check alerts every 30 seconds
alertExecutor.scheduleAtFixedRate(this::checkAllAlerts, 0, 30, TimeUnit.SECONDS);
}
public void checkAllAlerts() {
for (AlertRule rule : alertRules) {
try {
checkAlert(rule);
} catch (Exception e) {
log.error("Failed to check alert rule: {}", rule.getName(), e);
}
}
}
private void checkAlert(AlertRule rule) {
QueryResult result = promQLClient.executeQuery(rule.getQuery());
if (result != null && result.getData() != null &&
result.getData().getResult() != null &&
!result.getData().getResult().isEmpty()) {
// Alert condition is true
handleAlertTriggered(rule, result);
} else {
// Alert condition is false
handleAlertResolved(rule);
}
}
private void handleAlertTriggered(AlertRule rule, QueryResult result) {
String alertKey = rule.getName();
AlertState currentState = alertStates.get(alertKey);
if (currentState == null) {
// First time alert triggered
currentState = new AlertState();
currentState.setRule(rule);
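currentState.setFirstTriggered(System.currentTimeMillis());
currentState.setLastTriggered(currentState.getFirstTriggered()); // otherwise lastTriggered stays 0 until the second trigger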
currentState.setTriggerCount(1);
alertStates.put(alertKey, currentState);
} else {
// Update existing alert state
currentState.setTriggerCount(currentState.getTriggerCount() + 1);
currentState.setLastTriggered(System.currentTimeMillis());
}
// Check if we should send notification (respect cooldown period)
long timeSinceLastNotification = System.currentTimeMillis() - currentState.getLastNotified();
if (timeSinceLastNotification > rule.getCooldownSeconds() * 1000) {
sendAlertNotification(rule, result, currentState);
currentState.setLastNotified(System.currentTimeMillis());
}
log.warn("Alert triggered: {} - {}", rule.getName(), rule.getDescription());
}
private void handleAlertResolved(AlertRule rule) {
String alertKey = rule.getName();
AlertState currentState = alertStates.get(alertKey);
if (currentState != null && currentState.isActive()) {
// Alert was previously active, now resolved
sendResolvedNotification(rule, currentState);
alertStates.remove(alertKey);
log.info("Alert resolved: {}", rule.getName());
}
}
private void sendAlertNotification(AlertRule rule, QueryResult result, AlertState state) {
AlertNotification notification = new AlertNotification();
notification.setAlertName(rule.getName());
notification.setSeverity(rule.getSeverity());
notification.setDescription(rule.getDescription());
notification.setTriggeredAt(System.currentTimeMillis());
notification.setTriggerCount(state.getTriggerCount());
notification.setQueryResult(result);
// Send notification via various channels
sendEmailNotification(notification);
sendSlackNotification(notification);
logAlertToFile(notification);
log.warn("Alert notification sent: {} (trigger count: {})",
rule.getName(), state.getTriggerCount());
}
private void sendResolvedNotification(AlertRule rule, AlertState state) {
ResolvedNotification notification = new ResolvedNotification();
notification.setAlertName(rule.getName());
notification.setDescription(rule.getDescription());
notification.setResolvedAt(System.currentTimeMillis());
notification.setDuration(System.currentTimeMillis() - state.getFirstTriggered());
// Send resolved notification
sendEmailNotification(notification);
log.info("Resolved notification sent: {}", rule.getName());
}
private void sendEmailNotification(Object notification) {
// Implementation for email notification
log.debug("Sending email notification: {}", notification);
}
private void sendSlackNotification(Object notification) {
// Implementation for Slack notification
log.debug("Sending Slack notification: {}", notification);
}
private void logAlertToFile(AlertNotification notification) {
// Implementation for file logging
log.debug("Logging alert to file: {}", notification);
}
public List<ActiveAlert> getActiveAlerts() {
return alertStates.values().stream()
.filter(AlertState::isActive)
.map(state -> {
ActiveAlert alert = new ActiveAlert();
alert.setRule(state.getRule());
alert.setFirstTriggered(state.getFirstTriggered());
alert.setLastTriggered(state.getLastTriggered());
alert.setTriggerCount(state.getTriggerCount());
return alert;
})
.collect(Collectors.toList());
}
// Data classes for alerting
@Data
public static class AlertRule {
private final String name;
private final String query;
private final String description;
private final String severity;
private final long cooldownSeconds;
}
@Data
public static class AlertState {
private AlertRule rule;
private long firstTriggered;
private long lastTriggered;
private long lastNotified;
private int triggerCount;
public boolean isActive() {
return firstTriggered > 0;
}
}
@Data
public static class AlertNotification {
private String alertName;
private String severity;
private String description;
private long triggeredAt;
private int triggerCount;
private QueryResult queryResult;
}
@Data
public static class ResolvedNotification {
private String alertName;
private String description;
private long resolvedAt;
private long duration;
}
@Data
public static class ActiveAlert {
private AlertRule rule;
private long firstTriggered;
private long lastTriggered;
private int triggerCount;
}
@PreDestroy
public void cleanup() {
alertExecutor.shutdown();
try {
if (!alertExecutor.awaitTermination(30, TimeUnit.SECONDS)) {
alertExecutor.shutdownNow();
}
} catch (InterruptedException e) {
alertExecutor.shutdownNow();
Thread.currentThread().interrupt();
}
}
}
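The cooldown check in handleAlertTriggered decides whether a repeated trigger should produce another notification. The following is a minimal sketch that isolates this decision so it can be tested without a scheduler; CooldownGate is a hypothetical name, and the caller supplies the clock value in milliseconds.

```java
/** Hypothetical helper: allows one notification per cooldown window. */
final class CooldownGate {
    private static final long NEVER = Long.MIN_VALUE; // sentinel: no notification sent yet

    private final long cooldownMillis;
    private long lastNotifiedMillis = NEVER;

    CooldownGate(long cooldownSeconds) {
        this.cooldownMillis = cooldownSeconds * 1000L;
    }

    /** Returns true and records the time if a notification may be sent now. */
    synchronized boolean tryNotify(long nowMillis) {
        if (lastNotifiedMillis != NEVER && nowMillis - lastNotifiedMillis <= cooldownMillis) {
            return false; // still inside the cooldown window
        }
        lastNotifiedMillis = nowMillis;
        return true;
    }
}
```

Passing the current time as a parameter, rather than reading System.currentTimeMillis() inside the method, is a design choice that keeps the logic deterministic in tests.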
Integration with Spring Boot Actuator
Example 5: Spring Boot Integration
@Configuration
@EnableConfigurationProperties(PrometheusProperties.class)
@Slf4j
public class PrometheusConfiguration {
@Bean
public MeterRegistryCustomizer<MeterRegistry> metricsCommonTags(
@Value("${spring.application.name:application}") String applicationName) {
return registry -> registry.config().commonTags(
"application", applicationName,
"environment", System.getProperty("environment", "development")
);
}
@Bean
public PrometheusMeterRegistry prometheusMeterRegistry(PrometheusProperties properties) {
PrometheusConfig config = new PrometheusConfig() {
@Override
public String get(String key) {
return null;
}
@Override
public boolean descriptions() {
return properties.isDescriptions();
}
};
CollectorRegistry collectorRegistry = new CollectorRegistry();
PrometheusMeterRegistry registry = new PrometheusMeterRegistry(config, collectorRegistry, Clock.SYSTEM);
// Add custom metrics
registerCustomMetrics(registry);
return registry;
}
// Note: PromQLClient, ApplicationPerformanceQueries, TrendAnalysisService, and PromQLAlertingService
// already carry @Component or @Service. Keep either those annotations or the @Bean methods below,
// not both, or each class would be registered twice and the alerting condition would not take effect.
@Bean
public PromQLClient promQLClient(PrometheusProperties properties) {
return new PromQLClient(properties.getUrl(), new ObjectMapper());
}
@Bean
public ApplicationPerformanceQueries performanceQueries(PromQLClient promQLClient) {
return new ApplicationPerformanceQueries(promQLClient);
}
@Bean
public TrendAnalysisService trendAnalysisService(PromQLClient promQLClient) {
return new TrendAnalysisService(promQLClient);
}
@Bean
@ConditionalOnProperty(name = "prometheus.alerting.enabled", havingValue = "true")
public PromQLAlertingService alertingService(PromQLClient promQLClient) {
return new PromQLAlertingService(promQLClient);
}
private void registerCustomMetrics(MeterRegistry registry) {
// Custom business metrics
Counter.builder("business_transactions_total")
.description("Total number of business transactions")
.tag("type", "order")
.register(registry);
Timer.builder("business_transaction_duration_seconds")
.description("Business transaction duration in seconds")
.register(registry);
// Gauge.builder takes the state object and the value function up front
Gauge.builder("active_users", this, config -> config.getActiveUserCount())
.description("Number of active users")
.register(registry);
DistributionSummary.builder("order_value")
.description("Distribution of order values")
.baseUnit("USD")
.register(registry);
}
private double getActiveUserCount() {
// Implementation to get active user count
return Math.random() * 1000;
}
}
// Configuration properties
@ConfigurationProperties(prefix = "prometheus")
@Data
public class PrometheusProperties {
private String url = "http://localhost:9090";
private boolean descriptions = true;
private Alerting alerting = new Alerting();
@Data
public static class Alerting {
private boolean enabled = false;
private long checkInterval = 30;
private Notification notification = new Notification();
}
@Data
public static class Notification {
private Email email = new Email();
private Slack slack = new Slack();
}
@Data
public static class Email {
private boolean enabled = false;
private String from;
private List<String> to = new ArrayList<>();
}
@Data
public static class Slack {
private boolean enabled = false;
private String webhookUrl;
private String channel;
}
}
// REST Controller for metrics and alerts
@RestController
@RequestMapping("/api/metrics")
@Slf4j
public class MetricsController {
private final ApplicationPerformanceQueries performanceQueries;
private final TrendAnalysisService trendAnalysisService;
private final PromQLAlertingService alertingService;
public MetricsController(ApplicationPerformanceQueries performanceQueries,
TrendAnalysisService trendAnalysisService,
@Autowired(required = false) PromQLAlertingService alertingService) {
this.performanceQueries = performanceQueries;
this.trendAnalysisService = trendAnalysisService;
this.alertingService = alertingService;
}
@GetMapping("/health/{application}")
public ResponseEntity<Map<String, Object>> getApplicationHealth(@PathVariable String application) {
try {
Map<String, Object> health = performanceQueries.getApplicationHealthSummary(application);
return ResponseEntity.ok(health);
} catch (Exception e) {
log.error("Failed to get application health for: {}", application, e);
return ResponseEntity.status(HttpStatus.INTERNAL_SERVER_ERROR).build();
}
}
@GetMapping("/trend/{application}")
public ResponseEntity<TrendAnalysisService.TrendAnalysisResult> getApplicationTrend(
@PathVariable String application,
@RequestParam(defaultValue = "7") int days) {
try {
TrendAnalysisService.TrendAnalysisResult trend =
trendAnalysisService.analyzeRequestTrend(application, days);
return ResponseEntity.ok(trend);
} catch (Exception e) {
log.error("Failed to get application trend for: {}", application, e);
return ResponseEntity.status(HttpStatus.INTERNAL_SERVER_ERROR).build();
}
}
@GetMapping("/alerts/active")
public ResponseEntity<List<PromQLAlertingService.ActiveAlert>> getActiveAlerts() {
if (alertingService == null) {
return ResponseEntity.ok(List.of());
}
try {
List<PromQLAlertingService.ActiveAlert> activeAlerts = alertingService.getActiveAlerts();
return ResponseEntity.ok(activeAlerts);
} catch (Exception e) {
log.error("Failed to get active alerts", e);
return ResponseEntity.status(HttpStatus.INTERNAL_SERVER_ERROR).build();
}
}
@PostMapping("/query")
public ResponseEntity<PromQLClient.QueryResult> executeQuery(@RequestBody QueryRequest request) {
try {
PromQLClient.QueryResult result;
if (request.isRangeQuery()) {
result = performanceQueries.getPromQLClient().executeRangeQuery(
request.getQuery(),
request.getStart(),
request.getEnd(),
request.getStep()
);
} else {
result = performanceQueries.getPromQLClient().executeQuery(
request.getQuery(),
request.getTime()
);
}
return ResponseEntity.ok(result);
} catch (Exception e) {
log.error("Failed to execute query: {}", request.getQuery(), e);
return ResponseEntity.status(HttpStatus.INTERNAL_SERVER_ERROR).build();
}
}
}
@Data
class QueryRequest {
private String query;
private boolean rangeQuery = false;
private Long start;
private Long end;
private String step;
private Long time;
}
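The `/query` endpoint above passes the request fields to the client unchecked, so a range query with a missing `start`, `end`, or `step` only fails once the call reaches Prometheus. A minimal sketch of a pre-check follows. The class `QueryRequestValidator` and its `validate` method are hypothetical names, not part of the original code, and the method takes the fields as plain parameters so the example runs without Spring or Lombok.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of the original code: checks the fields of a
// QueryRequest before the controller forwards them to Prometheus.
class QueryRequestValidator {

    // Returns a list of problems; an empty list means the request looks usable.
    static List<String> validate(String query, boolean rangeQuery,
                                 Long start, Long end, String step) {
        List<String> errors = new ArrayList<>();
        if (query == null || query.trim().isEmpty()) {
            errors.add("query must not be blank");
        }
        if (rangeQuery) {
            if (start == null || end == null) {
                errors.add("range queries need both start and end");
            } else if (end < start) {
                errors.add("end must not be before start");
            }
            if (step == null || step.trim().isEmpty()) {
                errors.add("range queries need a step");
            }
        }
        return errors;
    }

    public static void main(String[] args) {
        // An instant query only needs the query string.
        System.out.println(validate("up", false, null, null, null));
        // A range query without an end timestamp is reported as a problem.
        System.out.println(validate("up", true, 1700000000L, null, "15s"));
    }
}
```

A controller could call such a check first and answer with HTTP 400 and the error list when it is not empty, which keeps client mistakes out of the HTTP 500 path.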
Best Practices
Common PromQL Patterns for Java Applications
@Service
@Slf4j
public class PromQLBestPractices {
private final PromQLClient promQLClient;
public PromQLBestPractices(PromQLClient promQLClient) {
this.promQLClient = promQLClient;
}
// 1. Rate and Increase Patterns
public PromQLClient.QueryResult getRequestRateWithLabels(String application, Map<String, String> labels) {
// Collect the label matchers, always starting with the application label.
List<String> matchers = new ArrayList<>();
matchers.add("application=\"" + application + "\"");
if (labels != null) {
labels.forEach((name, value) -> matchers.add(name + "=\"" + value + "\""));
}
// Label values are inserted verbatim; escape quotes and backslashes if they can come from user input.
String query = "rate(http_server_requests_seconds_count{" + String.join(",", matchers) + "}[5m])";
// executeQuery returns a QueryResult, so this method exposes that type directly.
return promQLClient.executeQuery(query);
}
// 2. Histogram Quantiles for Latency
public List<PromQLClient.QueryResult> getLatencyQuantiles(String application, List<Double> quantiles) {
List<PromQLClient.QueryResult> results = new ArrayList<>();
for (Double quantile : quantiles) {
// %s keeps the full precision of the quantile (0.999 is not rounded to 1.00)
// and avoids locale-specific decimal separators such as "0,95".
// sum by (le) merges the buckets of all matching series before the quantile is estimated.
String query = String.format(
"histogram_quantile(%s, sum by (le) (rate(http_server_requests_seconds_bucket{application=\"%s\"}[5m])))",
quantile, application);
results.add(promQLClient.executeQuery(query));
}
return results;
}
// 3. Grouping and Aggregation
public PromQLClient.QueryResult getAggregatedMetrics(String application, String aggregation) {
// aggregation must be a PromQL aggregation operator such as sum, avg, min, or max.
String query = String.format(
"%s(rate(http_server_requests_seconds_count{application=\"%s\"}[5m]))",
aggregation, application);
return promQLClient.executeQuery(query);
}
// 4. Joining Multiple Metrics
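public PromQLClient.QueryResult getErrorRateByEndpoint(String application) {
// Aggregate both sides by uri before dividing. Without the aggregation, the division only
// pairs series with identical label sets (the 5xx series with themselves), so every result is 100.
String query = String.format(
"sum by (uri) (rate(http_server_requests_seconds_count{application=\"%s\", status=~\"5..\"}[5m])) / " +
"sum by (uri) (rate(http_server_requests_seconds_count{application=\"%s\"}[5m])) * 100",
application, application);
return promQLClient.executeQuery(query);
}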
// 5. Predicting Trends
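public PromQLClient.QueryResult predictMemoryUsage(String application) {
// Fits a linear regression over the last hour of heap samples and
// extrapolates the value 3600 seconds (one hour) into the future.
String query = String.format(
"predict_linear(jvm_memory_used_bytes{application=\"%s\", area=\"heap\"}[1h], 3600)",
application);
return promQLClient.executeQuery(query);
}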
}
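The patterns above build queries by concatenating label names and values into the selector. If any of those values can contain a double quote or a backslash, the resulting query is malformed or matches something unintended. The following is a minimal sketch of a selector builder that escapes values. `PromQLSelectors`, `escapeLabelValue`, and `selector` are hypothetical names introduced for illustration, and the builder restricts label names to the classic character set (letters, digits, underscores).

```java
import java.util.Map;
import java.util.TreeMap;
import java.util.regex.Pattern;
import java.util.stream.Collectors;

// Hypothetical helper (not part of the original article): builds a PromQL
// series selector while escaping label values.
class PromQLSelectors {

    // Classic label names: letters, digits and underscores, not starting with a digit.
    private static final Pattern LABEL_NAME = Pattern.compile("[a-zA-Z_][a-zA-Z0-9_]*");

    // Escapes backslashes, double quotes and newlines inside a double-quoted label value.
    static String escapeLabelValue(String value) {
        return value
                .replace("\\", "\\\\")
                .replace("\"", "\\\"")
                .replace("\n", "\\n");
    }

    // Builds metric{name="value",...}; labels are sorted by name so the output is stable.
    static String selector(String metric, Map<String, String> labels) {
        String matchers = new TreeMap<>(labels).entrySet().stream()
                .map(e -> {
                    if (!LABEL_NAME.matcher(e.getKey()).matches()) {
                        throw new IllegalArgumentException("Invalid label name: " + e.getKey());
                    }
                    return e.getKey() + "=\"" + escapeLabelValue(e.getValue()) + "\"";
                })
                .collect(Collectors.joining(","));
        return metric + "{" + matchers + "}";
    }

    public static void main(String[] args) {
        // The quotes inside the uri value are escaped instead of ending the matcher early.
        String query = "rate(" + selector("http_server_requests_seconds_count",
                Map.of("application", "order-service", "uri", "/api/\"orders\"")) + "[5m])";
        System.out.println(query);
    }
}
```

Sorting the labels is a design choice for reproducible query strings, which helps when queries are logged or cached; Prometheus itself does not care about matcher order.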
Conclusion
PromQL provides powerful capabilities for querying and analyzing Java application metrics:
Key Benefits:
- Real-time Analysis: Immediate insights into application performance
- Historical Trends: Long-term pattern recognition and capacity planning
- Alerting: Proactive notification of issues
- Correlation Analysis: Understanding relationships between different metrics
Common Use Cases:
- Performance Monitoring: Response times, throughput, error rates
- Resource Utilization: Memory, CPU, database connections
- Business Metrics: Transaction volumes, user activity, revenue
- Capacity Planning: Trend analysis and prediction
- Troubleshooting: Root cause analysis and correlation
Best Practices:
- Use time ranges that fit the analysis: a rate() window needs at least two samples, so keep it several times the scrape interval
- Leverage rate() and increase() for counter metrics
- Utilize histogram quantiles for latency analysis
- Implement meaningful alerting with proper cooldown periods
- Monitor query performance and optimize as needed
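To see why rate() and increase() are the right tools for counters, it helps to work through the arithmetic by hand. The sketch below is a deliberately simplified model: it sums the positive deltas between samples and treats any drop as a counter reset (for example after a JVM restart). It does not reproduce the extrapolation to the window boundaries that Prometheus applies. The class name `SimplifiedRate` and the sample values are made up for illustration.

```java
// Simplified model of increase() and rate() over one window of counter samples.
// Assumption: no boundary extrapolation, unlike the real Prometheus functions.
class SimplifiedRate {

    // Total growth of the counter across the samples, tolerating resets.
    static double increase(double[] values) {
        double total = 0.0;
        for (int i = 1; i < values.length; i++) {
            double delta = values[i] - values[i - 1];
            // A drop means the counter restarted from zero, so the new value is the growth.
            total += (delta >= 0) ? delta : values[i];
        }
        return total;
    }

    // Per-second growth: increase divided by the time between first and last sample.
    static double rate(double[] timestampsSeconds, double[] values) {
        if (values.length < 2) {
            return Double.NaN; // Choice made for this sketch: not enough samples in the window.
        }
        double elapsed = timestampsSeconds[timestampsSeconds.length - 1] - timestampsSeconds[0];
        return increase(values) / elapsed;
    }

    public static void main(String[] args) {
        double[] timestamps = {0, 15, 30, 45, 60};
        // The counter resets between the third and fourth sample.
        double[] requests = {100, 130, 160, 10, 40};
        System.out.println("increase = " + increase(requests));
        System.out.println("rate     = " + rate(timestamps, requests));
    }
}
```

With these samples the deltas are 30, 30, 10 (after the reset), and 30, so the increase is 100 requests over 60 seconds. Subtracting the first raw value from the last would instead give -60, which is why raw counter values should not be graphed or compared directly.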
PromQL transforms raw metrics into actionable insights, enabling comprehensive observability and proactive management of Java applications.