Safe Deployment Strategies: Implementing Progressive Rollout in Java

Progressive rollout is a deployment strategy that releases a new software version to users gradually, limiting how many of them are exposed to any problem it introduces. Unlike a big-bang deployment, it lets you validate a change against a small user segment before expanding to the entire user base. This article walks through an implementation for Java applications built on Spring Boot and Micrometer: a traffic router, a rollout manager with health checks and automatic rollback, and a REST API for operators.


Understanding Progressive Rollout

Key Concepts:

  • Gradual Exposure: Slowly increase traffic to new versions
  • Health Validation: Continuously monitor system health during rollout
  • Automatic Rollback: Revert changes when issues are detected
  • Multiple Dimensions: Rollout based on user segments, regions, or other attributes

Architecture Overview

A progressive rollout system typically includes:

  1. Feature Toggle Service: Manages feature availability
  2. Traffic Router: Directs requests to appropriate versions
  3. Metrics Collector: Gathers performance and business metrics
  4. Rollout Manager: Controls rollout progression and rollbacks
  5. Analysis Engine: Evaluates rollout health
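
One way to picture how these components cooperate is as a set of narrow interfaces driven by a single control step. The sketch below is illustrative only: the interface names mirror the list above, `RolloutManagerSketch` and its `tick` method are hypothetical, and the traffic router is left out because it only consumes the percentage this step produces.

```java
/** Hypothetical roles named after the component list; not a real library API. */
interface FeatureToggleService {
    boolean isEnabled(String feature);
}

interface MetricsCollector {
    /** Error rate of the new version, as a percentage (0-100). */
    double errorRatePercent(String service);
}

interface AnalysisEngine {
    boolean isHealthy(double errorRatePercent);
}

/** One control step: decide the next traffic percentage for the new version. */
final class RolloutManagerSketch {
    private final FeatureToggleService toggles;
    private final MetricsCollector metrics;
    private final AnalysisEngine analysis;
    private final double stepPercent;

    RolloutManagerSketch(FeatureToggleService toggles, MetricsCollector metrics,
                         AnalysisEngine analysis, double stepPercent) {
        this.toggles = toggles;
        this.metrics = metrics;
        this.analysis = analysis;
        this.stepPercent = stepPercent;
    }

    double tick(String service, double currentPercent) {
        if (!toggles.isEnabled(service)) {
            return 0.0; // feature switched off: no traffic to the new version
        }
        if (!analysis.isHealthy(metrics.errorRatePercent(service))) {
            return 0.0; // unhealthy: roll back automatically
        }
        // Healthy: advance by one step, never beyond 100%.
        return Math.min(100.0, currentPercent + stepPercent);
    }
}
```

Each role is a single-method interface, so the step can be exercised with lambdas before any real toggle store or metrics backend exists.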

Implementing Progressive Rollout in Java

1. Core Dependencies

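<properties>
    <micrometer.version>1.11.5</micrometer.version>
    <resilience4j.version>2.1.0</resilience4j.version>
</properties>
<dependencies>
    <!-- Metrics and monitoring -->
    <dependency>
        <groupId>io.micrometer</groupId>
        <artifactId>micrometer-core</artifactId>
        <version>${micrometer.version}</version>
    </dependency>
    <dependency>
        <groupId>io.micrometer</groupId>
        <artifactId>micrometer-registry-prometheus</artifactId>
        <version>${micrometer.version}</version>
    </dependency>
    <!-- Circuit breaker for rollout control -->
    <dependency>
        <groupId>io.github.resilience4j</groupId>
        <artifactId>resilience4j-spring-boot3</artifactId>
        <version>${resilience4j.version}</version>
    </dependency>
    <!-- Configuration management (version managed by the Spring Boot parent POM) -->
    <dependency>
        <groupId>org.springframework.boot</groupId>
        <artifactId>spring-boot-starter-actuator</artifactId>
    </dependency>
    <!-- Needed by the code below: REST controller, @NotBlank validation, Lombok annotations -->
    <dependency>
        <groupId>org.springframework.boot</groupId>
        <artifactId>spring-boot-starter-web</artifactId>
    </dependency>
    <dependency>
        <groupId>org.springframework.boot</groupId>
        <artifactId>spring-boot-starter-validation</artifactId>
    </dependency>
    <dependency>
        <groupId>org.projectlombok</groupId>
        <artifactId>lombok</artifactId>
        <optional>true</optional>
    </dependency>
</dependencies>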

2. Rollout Configuration Model

// Registered as a bean so Spring binds the "rollout.*" properties and can inject it.
@Component
@ConfigurationProperties(prefix = "rollout")
@Data
public class RolloutConfig {
private boolean enabled = false;
private Duration evaluationWindow = Duration.ofMinutes(5);
private int minimumRequests = 1000;
private Map<String, ServiceConfig> services = new HashMap<>();
@Data
public static class ServiceConfig {
private double initialTrafficPercentage = 1.0;
private double maxTrafficPercentage = 100.0;
private double incrementStep = 10.0;
private Duration stepDuration = Duration.ofMinutes(5);
private HealthCheckConfig healthCheck = new HealthCheckConfig();
private List<RolloutConstraint> constraints = new ArrayList<>();
}
@Data
public static class HealthCheckConfig {
private double errorRateThreshold = 2.0;
private double p99LatencyThreshold = 500.0;
private double businessMetricThreshold = -5.0;
private int consecutiveSuccessesRequired = 3;
}
@Data
public static class RolloutConstraint {
private ConstraintType type;
private String field;
private List<String> values;
private double percentage;
}
public enum ConstraintType {
USER_SEGMENT, REGION, ENVIRONMENT, CUSTOM_ATTRIBUTE
}
}
@Data
@AllArgsConstructor
public class RolloutDefinition {
private String id;
private String serviceName;
private String version;
private RolloutStatus status;
private Instant startTime;
private double currentTrafficPercentage;
private Map<String, Object> metadata;
private List<RolloutConstraint> constraints;
public RolloutDefinition(String serviceName, String version) {
this.serviceName = serviceName;
this.version = version;
this.id = generateId(serviceName, version);
this.status = RolloutStatus.PENDING;
this.startTime = Instant.now();
this.currentTrafficPercentage = 0.0;
this.metadata = new HashMap<>();
this.constraints = new ArrayList<>();
}
private String generateId(String serviceName, String version) {
return serviceName + "-" + version + "-" + Instant.now().toEpochMilli();
}
}
enum RolloutStatus {
PENDING, RUNNING, PAUSED, COMPLETED, ROLLED_BACK, FAILED
}
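
To see what the `ServiceConfig` defaults imply, it can help to list the traffic percentages a rollout passes through. The helper below is a hypothetical sketch (`RolloutSchedule` is not part of the classes above); it repeatedly adds `incrementStep` to `initialTrafficPercentage` and clamps the result at `maxTrafficPercentage`.

```java
import java.util.ArrayList;
import java.util.List;

final class RolloutSchedule {
    private RolloutSchedule() {}

    /** Lists every traffic percentage from the initial value up to and including max. */
    static List<Double> steps(double initial, double increment, double max) {
        if (increment <= 0) {
            throw new IllegalArgumentException("increment must be positive");
        }
        List<Double> result = new ArrayList<>();
        double current = Math.min(initial, max);
        result.add(current);
        while (current < max) {
            // The final step is clamped so the schedule never overshoots max.
            current = Math.min(current + increment, max);
            result.add(current);
        }
        return result;
    }
}
```

With the defaults (1.0, 10.0, 100.0) this yields eleven stages: 1%, 11%, 21%, and so on up to 91%, followed by a final clamped step to 100%.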

3. Advanced Traffic Routing

@Component
@Slf4j
public class ProgressiveRolloutRouter {
private final RolloutConfig config;
private final MeterRegistry meterRegistry;
private final Random random = new Random();
private final Map<String, RolloutDefinition> activeRollouts = new ConcurrentHashMap<>();
private final Map<String, RolloutMetrics> rolloutMetrics = new ConcurrentHashMap<>();
public ProgressiveRolloutRouter(RolloutConfig config, MeterRegistry meterRegistry) {
this.config = config;
this.meterRegistry = meterRegistry;
}
public <T> T route(String serviceName, String operation, String userId,
Supplier<T> oldVersion, Supplier<T> newVersion) {
return route(serviceName, operation, userId, null, oldVersion, newVersion);
}
public <T> T route(String serviceName, String operation, String userId,
Map<String, String> context, 
Supplier<T> oldVersion, Supplier<T> newVersion) {
if (!config.isEnabled()) {
return oldVersion.get();
}
RolloutDefinition rollout = activeRollouts.get(serviceName);
if (rollout == null || rollout.getStatus() != RolloutStatus.RUNNING) {
return oldVersion.get();
}
if (shouldRouteToNewVersion(serviceName, userId, context, rollout)) {
return executeWithMetrics(serviceName, operation, newVersion, "new", rollout.getVersion());
} else {
return executeWithMetrics(serviceName, operation, oldVersion, "old", "baseline");
}
}
private <T> T executeWithMetrics(String serviceName, String operation, 
Supplier<T> supplier, String versionType, String version) {
Timer.Sample sample = Timer.start(meterRegistry);
// Look the rollout up once; it may be removed concurrently by a completion or rollback.
RolloutDefinition active = activeRollouts.get(serviceName);
String rolloutId = active != null ? active.getId() : "none";
try {
T result = supplier.get();
recordSuccess(serviceName, operation, versionType, version, rolloutId);
return result;
} catch (Exception e) {
recordError(serviceName, operation, versionType, version, rolloutId, e);
throw e;
} finally {
sample.stop(Timer.builder("rollout.operation.duration")
.tags("service", serviceName, "operation", operation, 
"version_type", versionType, "version", version)
.register(meterRegistry));
}
}
private boolean shouldRouteToNewVersion(String serviceName, String userId, 
Map<String, String> context,
RolloutDefinition rollout) {
// Check traffic percentage
if (!isWithinTrafficPercentage(rollout)) {
return false;
}
// Check constraints
if (!satisfiesConstraints(userId, context, rollout)) {
return false;
}
return true;
}
private boolean isWithinTrafficPercentage(RolloutDefinition rollout) {
double randomValue = random.nextDouble() * 100;
return randomValue < rollout.getCurrentTrafficPercentage();
}
private boolean satisfiesConstraints(String userId, Map<String, String> context, 
RolloutDefinition rollout) {
if (rollout.getConstraints().isEmpty()) {
return true;
}
for (RolloutConstraint constraint : rollout.getConstraints()) {
if (!satisfiesConstraint(userId, context, constraint)) {
return false;
}
}
return true;
}
private boolean satisfiesConstraint(String userId, Map<String, String> context, 
RolloutConstraint constraint) {
switch (constraint.getType()) {
case USER_SEGMENT:
return isUserInSegment(userId, constraint);
case REGION:
return isInRegion(context, constraint);
case ENVIRONMENT:
return isInEnvironment(context, constraint);
case CUSTOM_ATTRIBUTE:
return satisfiesCustomAttribute(context, constraint);
default:
return true;
}
}
private boolean isUserInSegment(String userId, RolloutConstraint constraint) {
if (userId == null) return false;
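// Simple hash-based user segmentation. Math.floorMod always yields 0..99, whereas
// Math.abs(hashCode()) stays negative when hashCode() is Integer.MIN_VALUE.
int userHash = Math.floorMod(userId.hashCode(), 100);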
return userHash < constraint.getPercentage();
}
private boolean isInRegion(Map<String, String> context, RolloutConstraint constraint) {
String region = context != null ? context.get("region") : null;
return region != null && constraint.getValues().contains(region);
}
private boolean isInEnvironment(Map<String, String> context, RolloutConstraint constraint) {
String environment = context != null ? context.get("environment") : "production";
return constraint.getValues().contains(environment);
}
private boolean satisfiesCustomAttribute(Map<String, String> context, RolloutConstraint constraint) {
if (context == null || constraint.getField() == null) return false;
String value = context.get(constraint.getField());
return value != null && constraint.getValues().contains(value);
}
private void recordSuccess(String serviceName, String operation, 
String versionType, String version, String rolloutId) {
meterRegistry.counter("rollout.operation.success",
"service", serviceName, "operation", operation,
"version_type", versionType, "version", version,
"rollout_id", rolloutId).increment();
updateRolloutMetrics(serviceName, versionType, true);
}
private void recordError(String serviceName, String operation, 
String versionType, String version, String rolloutId, Exception error) {
meterRegistry.counter("rollout.operation.error",
"service", serviceName, "operation", operation,
"version_type", versionType, "version", version,
"rollout_id", rolloutId, "error_type", error.getClass().getSimpleName()).increment();
updateRolloutMetrics(serviceName, versionType, false);
}
private void updateRolloutMetrics(String serviceName, String versionType, boolean success) {
String key = serviceName + ":" + versionType;
RolloutMetrics metrics = rolloutMetrics.computeIfAbsent(key, k -> new RolloutMetrics());
if (success) {
metrics.recordSuccess();
} else {
metrics.recordError();
}
}
public void startRollout(RolloutDefinition rollout) {
ServiceConfig serviceConfig = config.getServices().get(rollout.getServiceName());
if (serviceConfig != null) {
rollout.setCurrentTrafficPercentage(serviceConfig.getInitialTrafficPercentage());
}
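// Drop counters left over from any earlier rollout of this service so that
// old errors do not count against the new rollout.
rolloutMetrics.remove(rollout.getServiceName() + ":new");
rolloutMetrics.remove(rollout.getServiceName() + ":old");
rollout.setStatus(RolloutStatus.RUNNING);
activeRollouts.put(rollout.getServiceName(), rollout);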
log.info("Started rollout {} for service {} at {}% traffic", 
rollout.getId(), rollout.getServiceName(), rollout.getCurrentTrafficPercentage());
}
public void updateTrafficPercentage(String serviceName, double percentage) {
RolloutDefinition rollout = activeRollouts.get(serviceName);
if (rollout != null) {
double oldPercentage = rollout.getCurrentTrafficPercentage();
rollout.setCurrentTrafficPercentage(percentage);
log.info("Updated traffic percentage for {}: {}% -> {}%", 
serviceName, oldPercentage, percentage);
}
}
public void completeRollout(String serviceName) {
RolloutDefinition rollout = activeRollouts.remove(serviceName);
if (rollout != null) {
rollout.setStatus(RolloutStatus.COMPLETED);
log.info("Completed rollout for service: {}", serviceName);
}
}
public void rollbackRollout(String serviceName) {
RolloutDefinition rollout = activeRollouts.remove(serviceName);
if (rollout != null) {
rollout.setStatus(RolloutStatus.ROLLED_BACK);
log.warn("Rolled back rollout for service: {}", serviceName);
}
}
public Optional<RolloutDefinition> getActiveRollout(String serviceName) {
return Optional.ofNullable(activeRollouts.get(serviceName));
}
public RolloutMetrics getMetrics(String serviceName, String versionType) {
return rolloutMetrics.get(serviceName + ":" + versionType);
}
}
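/**
 * Request counters for one service and version type. route() can run on many request
 * threads at once, so the counters use LongAdder (java.util.concurrent.atomic)
 * instead of plain long fields, whose ++ is not atomic.
 */
class RolloutMetrics {
    private final LongAdder successCount = new LongAdder();
    private final LongAdder errorCount = new LongAdder();
    private volatile Instant windowStart = Instant.now();

    public void recordSuccess() { successCount.increment(); }
    public void recordError() { errorCount.increment(); }

    public long getSuccessCount() { return successCount.sum(); }
    public long getErrorCount() { return errorCount.sum(); }
    public Instant getWindowStart() { return windowStart; }

    /** Error rate as a percentage (0-100) of all recorded requests. */
    public double getErrorRate() {
        long errors = errorCount.sum();
        long total = successCount.sum() + errors;
        return total > 0 ? (errors * 100.0) / total : 0.0;
    }

    public long getTotalRequests() {
        return successCount.sum() + errorCount.sum();
    }

    public void reset() {
        successCount.reset();
        errorCount.reset();
        windowStart = Instant.now();
    }
}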
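
`isWithinTrafficPercentage` draws a fresh random number on every request, so the same user can alternate between the old and new version from one call to the next. When a consistent experience matters, a common alternative is to derive a stable bucket from the user ID. The sketch below is one such variation, not part of the router above; `StickyBucketing`, its method names, and the idea of salting the hash with the rollout ID are assumptions made for illustration.

```java
import java.nio.charset.StandardCharsets;
import java.util.zip.CRC32;

final class StickyBucketing {
    private StickyBucketing() {}

    /** Maps a user to a stable bucket in the range 0..9999 for a given rollout. */
    static int bucketOf(String rolloutId, String userId) {
        CRC32 crc = new CRC32();
        // Salting with the rollout ID means different rollouts pick different early users.
        crc.update((rolloutId + ":" + userId).getBytes(StandardCharsets.UTF_8));
        // getValue() is an unsigned 32-bit value held in a long, so the remainder is non-negative.
        return (int) (crc.getValue() % 10_000);
    }

    /** True when the user falls inside the first trafficPercentage percent of buckets. */
    static boolean isInRollout(String rolloutId, String userId, double trafficPercentage) {
        return bucketOf(rolloutId, userId) < trafficPercentage * 100.0;
    }
}
```

Because a user's bucket never changes within a rollout, raising the percentage only adds users: anyone already routed to the new version stays on it.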

4. Rollout Manager with Health Analysis

@Service
@Slf4j
public class ProgressiveRolloutManager {
private final ProgressiveRolloutRouter router;
private final RolloutConfig config;
private final MeterRegistry meterRegistry;
private final ScheduledExecutorService scheduler;
private final Map<String, RolloutDefinition> rollouts = new ConcurrentHashMap<>();
private final Map<String, RolloutHealth> rolloutHealth = new ConcurrentHashMap<>();
public ProgressiveRolloutManager(ProgressiveRolloutRouter router,
RolloutConfig config,
MeterRegistry meterRegistry) {
this.router = router;
this.config = config;
this.meterRegistry = meterRegistry;
this.scheduler = Executors.newScheduledThreadPool(3);
}
public RolloutDefinition startRollout(String serviceName, String version, 
List<RolloutConstraint> constraints,
Map<String, Object> metadata) {
RolloutDefinition rollout = new RolloutDefinition(serviceName, version);
rollout.setConstraints(constraints != null ? constraints : new ArrayList<>());
rollout.setMetadata(metadata != null ? metadata : new HashMap<>());
rollouts.put(rollout.getId(), rollout);
router.startRollout(rollout);
// Schedule health monitoring
scheduler.scheduleAtFixedRate(() -> 
monitorRolloutHealth(rollout.getId()), 1, 1, TimeUnit.MINUTES);
// Schedule progressive traffic increase
scheduler.scheduleAtFixedRate(() -> 
progressRollout(rollout.getId()), 2, 2, TimeUnit.MINUTES);
log.info("Started progressive rollout: {}", rollout.getId());
return rollout;
}
private void monitorRolloutHealth(String rolloutId) {
RolloutDefinition rollout = rollouts.get(rolloutId);
if (rollout == null || rollout.getStatus() != RolloutStatus.RUNNING) {
return;
}
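RolloutHealth health = analyzeRolloutHealth(rollout);
// Carry the healthy streak forward. Each analysis builds a fresh RolloutHealth, so
// without this the count never exceeds 1 and progressRollout() never raises traffic.
RolloutHealth previous = rolloutHealth.get(rolloutId);
if (health.isHealthy() && health.getRequestCount() > 0 && previous != null) {
health.setConsecutiveHealthyChecks(previous.getConsecutiveHealthyChecks() + 1);
}
rolloutHealth.put(rolloutId, health);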
log.debug("Rollout health for {}: healthy={}, errorRate={}%", 
rolloutId, health.isHealthy(), health.getErrorRate());
if (!health.isHealthy() && shouldRollback(rollout, health)) {
performRollback(rollout, health);
}
}
private void progressRollout(String rolloutId) {
RolloutDefinition rollout = rollouts.get(rolloutId);
if (rollout == null || rollout.getStatus() != RolloutStatus.RUNNING) {
return;
}
RolloutHealth health = rolloutHealth.get(rolloutId);
if (health == null || !health.isHealthy()) {
return;
}
ServiceConfig serviceConfig = config.getServices().get(rollout.getServiceName());
if (serviceConfig == null) {
return;
}
double currentTraffic = rollout.getCurrentTrafficPercentage();
double maxTraffic = serviceConfig.getMaxTrafficPercentage();
double increment = serviceConfig.getIncrementStep();
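// Use the configured streak length rather than a hard-coded 3.
int requiredChecks = serviceConfig.getHealthCheck().getConsecutiveSuccessesRequired();
if (currentTraffic < maxTraffic && health.getConsecutiveHealthyChecks() >= requiredChecks) {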
double newTraffic = Math.min(currentTraffic + increment, maxTraffic);
router.updateTrafficPercentage(rollout.getServiceName(), newTraffic);
rollout.setCurrentTrafficPercentage(newTraffic);
log.info("Increased traffic for {} to {}%", rolloutId, newTraffic);
if (newTraffic >= maxTraffic) {
completeRollout(rollout);
}
}
}
private RolloutHealth analyzeRolloutHealth(RolloutDefinition rollout) {
RolloutMetrics newVersionMetrics = router.getMetrics(rollout.getServiceName(), "new");
RolloutMetrics oldVersionMetrics = router.getMetrics(rollout.getServiceName(), "old");
if (newVersionMetrics == null || newVersionMetrics.getTotalRequests() < config.getMinimumRequests()) {
return RolloutHealth.insufficientData();
}
double errorRate = newVersionMetrics.getErrorRate();
ServiceConfig serviceConfig = config.getServices().get(rollout.getServiceName());
double errorThreshold = serviceConfig != null ? 
serviceConfig.getHealthCheck().getErrorRateThreshold() : 5.0;
boolean healthy = errorRate <= errorThreshold;
String message = String.format("Error rate: %.2f%%", errorRate);
return new RolloutHealth(healthy, message, errorRate, 
newVersionMetrics.getTotalRequests());
}
private boolean shouldRollback(RolloutDefinition rollout, RolloutHealth health) {
ServiceConfig serviceConfig = config.getServices().get(rollout.getServiceName());
if (serviceConfig == null) return true;
return health.getErrorRate() > serviceConfig.getHealthCheck().getErrorRateThreshold();
}
private void performRollback(RolloutDefinition rollout, RolloutHealth health) {
router.rollbackRollout(rollout.getServiceName());
rollout.setStatus(RolloutStatus.ROLLED_BACK);
log.warn("Rolled back rollout {} due to health issues: {}", 
rollout.getId(), health.getMessage());
}
private void completeRollout(RolloutDefinition rollout) {
router.completeRollout(rollout.getServiceName());
rollout.setStatus(RolloutStatus.COMPLETED);
log.info("Completed rollout: {}", rollout.getId());
}
public Optional<RolloutDefinition> getRollout(String rolloutId) {
return Optional.ofNullable(rollouts.get(rolloutId));
}
public List<RolloutDefinition> getActiveRollouts() {
return rollouts.values().stream()
.filter(r -> r.getStatus() == RolloutStatus.RUNNING)
.collect(Collectors.toList());
}
public void pauseRollout(String rolloutId) {
RolloutDefinition rollout = rollouts.get(rolloutId);
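// Only a running rollout can be paused; a completed or rolled-back one must keep its status.
if (rollout != null && rollout.getStatus() == RolloutStatus.RUNNING) {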
rollout.setStatus(RolloutStatus.PAUSED);
log.info("Paused rollout: {}", rolloutId);
}
}
public void resumeRollout(String rolloutId) {
RolloutDefinition rollout = rollouts.get(rolloutId);
if (rollout != null && rollout.getStatus() == RolloutStatus.PAUSED) {
rollout.setStatus(RolloutStatus.RUNNING);
log.info("Resumed rollout: {}", rolloutId);
}
}
}
@Data
@AllArgsConstructor
class RolloutHealth {
private boolean healthy;
private String message;
private double errorRate;
private long requestCount;
private int consecutiveHealthyChecks;
public RolloutHealth(boolean healthy, String message, double errorRate, long requestCount) {
this.healthy = healthy;
this.message = message;
this.errorRate = errorRate;
this.requestCount = requestCount;
this.consecutiveHealthyChecks = healthy ? 1 : 0;
}
public static RolloutHealth insufficientData() {
return new RolloutHealth(true, "Insufficient data for analysis", 0.0, 0, 0);
}
}
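
The manager advances traffic only after several consecutive healthy evaluations, which requires the streak to survive from one check to the next. A minimal sketch of that gating logic, independent of Spring, is shown below. `HealthGate` and its method names are hypothetical, and resetting the streak after each traffic step is a design choice of this sketch rather than something the manager above does.

```java
final class HealthGate {
    private final int requiredConsecutiveSuccesses;
    private int streak;

    HealthGate(int requiredConsecutiveSuccesses) {
        if (requiredConsecutiveSuccesses < 1) {
            throw new IllegalArgumentException("requiredConsecutiveSuccesses must be >= 1");
        }
        this.requiredConsecutiveSuccesses = requiredConsecutiveSuccesses;
    }

    /** Records one evaluation and reports whether traffic may be increased now. */
    synchronized boolean record(boolean healthy) {
        streak = healthy ? streak + 1 : 0; // any unhealthy check resets the streak
        if (streak >= requiredConsecutiveSuccesses) {
            streak = 0; // start a fresh streak for the next traffic step
            return true;
        }
        return false;
    }
}
```

With a requirement of three, the gate opens on the third healthy check in a row and then demands three more before the following increase, so every traffic level is observed for a full streak.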

5. Spring Boot Integration

@RestController
@RequestMapping("/api/rollout")
@Slf4j
public class RolloutController {
private final ProgressiveRolloutManager rolloutManager;
private final ProgressiveRolloutRouter router;
public RolloutController(ProgressiveRolloutManager rolloutManager,
ProgressiveRolloutRouter router) {
this.rolloutManager = rolloutManager;
this.router = router;
}
@PostMapping
public ResponseEntity<RolloutDefinition> startRollout(
@Valid @RequestBody StartRolloutRequest request) { // @Valid enforces the @NotBlank constraints
RolloutDefinition rollout = rolloutManager.startRollout(
request.getServiceName(),
request.getVersion(),
request.getConstraints(),
request.getMetadata()
);
return ResponseEntity.accepted().body(rollout);
}
@GetMapping("/{rolloutId}")
public ResponseEntity<RolloutDefinition> getRollout(@PathVariable String rolloutId) {
return rolloutManager.getRollout(rolloutId)
.map(ResponseEntity::ok)
.orElse(ResponseEntity.notFound().build());
}
@PostMapping("/{rolloutId}/pause")
public ResponseEntity<Void> pauseRollout(@PathVariable String rolloutId) {
rolloutManager.pauseRollout(rolloutId);
return ResponseEntity.accepted().build();
}
@PostMapping("/{rolloutId}/resume")
public ResponseEntity<Void> resumeRollout(@PathVariable String rolloutId) {
rolloutManager.resumeRollout(rolloutId);
return ResponseEntity.accepted().build();
}
@PostMapping("/{rolloutId}/rollback")
public ResponseEntity<Void> rollbackRollout(@PathVariable String rolloutId) {
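rolloutManager.getRollout(rolloutId)
// Only roll back a rollout that is still in flight; an old rollout ID must not
// cancel a newer rollout of the same service.
.filter(rollout -> rollout.getStatus() == RolloutStatus.RUNNING
|| rollout.getStatus() == RolloutStatus.PAUSED)
.ifPresent(rollout -> router.rollbackRollout(rollout.getServiceName()));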
return ResponseEntity.accepted().build();
}
@GetMapping
public ResponseEntity<List<RolloutDefinition>> getActiveRollouts() {
return ResponseEntity.ok(rolloutManager.getActiveRollouts());
}
}
@Data
class StartRolloutRequest {
@NotBlank
private String serviceName;
@NotBlank
private String version;
private List<RolloutConstraint> constraints = new ArrayList<>();
private Map<String, Object> metadata = new HashMap<>();
}
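
To start a rollout through this API, a client posts a JSON body whose fields match `StartRolloutRequest`. The sketch below builds such a request with the JDK's `java.net.http` client. The host `localhost:8080` and the version `2.0.0` used in the usage note are placeholder assumptions, `StartRolloutRequestExample` is a hypothetical helper, and the request is only constructed here, not sent.

```java
import java.net.URI;
import java.net.http.HttpRequest;

final class StartRolloutRequestExample {
    private StartRolloutRequestExample() {}

    static HttpRequest build(String baseUrl, String serviceName, String version) {
        // Field names mirror StartRolloutRequest; constraints and metadata are optional.
        // Assumes the values contain no characters that need JSON escaping.
        String json = String.format(
                "{\"serviceName\":\"%s\",\"version\":\"%s\"}", serviceName, version);
        return HttpRequest.newBuilder(URI.create(baseUrl + "/api/rollout"))
                .header("Content-Type", "application/json")
                .POST(HttpRequest.BodyPublishers.ofString(json))
                .build();
    }
}
```

Calling `build("http://localhost:8080", "order-service", "2.0.0")` and sending the result with `HttpClient.newHttpClient().send(...)` should produce a 202 Accepted response from the controller above, with the new `RolloutDefinition` in the body.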

6. Service Integration Example

@Service
public class OrderService {
private final ProgressiveRolloutRouter router;
private final OrderRepositoryV1 oldRepository;
private final OrderRepositoryV2 newRepository;
public OrderService(ProgressiveRolloutRouter router,
OrderRepositoryV1 oldRepository,
OrderRepositoryV2 newRepository) {
this.router = router;
this.oldRepository = oldRepository;
this.newRepository = newRepository;
}
public Order findOrder(String orderId, String userId) {
Map<String, String> context = createContext(userId);
return router.route("order-service", "findOrder", userId, context,
() -> oldRepository.findOrder(orderId),
() -> newRepository.findOrder(orderId)
);
}
public Order createOrder(Order order, String userId) {
Map<String, String> context = createContext(userId);
return router.route("order-service", "createOrder", userId, context,
() -> oldRepository.createOrder(order),
() -> newRepository.createOrder(order)
);
}
private Map<String, String> createContext(String userId) {
Map<String, String> context = new HashMap<>();
context.put("region", getUserRegion(userId));
context.put("environment", getCurrentEnvironment());
context.put("user_tier", getUserTier(userId));
return context;
}
private String getUserRegion(String userId) {
// Determine user's region based on user ID or other data
// floorMod keeps the result in 0..2 even when hashCode() is Integer.MIN_VALUE.
int hash = Math.floorMod(userId.hashCode(), 3);
return switch (hash) {
case 0 -> "us-east";
case 1 -> "eu-west";
case 2 -> "ap-southeast";
default -> "global";
};
}
private String getCurrentEnvironment() {
// In real implementation, get from configuration
return "production";
}
private String getUserTier(String userId) {
// Determine user tier (free, premium, enterprise)
// floorMod keeps the result in 0..9 even when hashCode() is Integer.MIN_VALUE.
int hash = Math.floorMod(userId.hashCode(), 10);
if (hash < 7) return "free";
if (hash < 9) return "premium";
return "enterprise";
}
}

7. Configuration

application.yml:

rollout:
  enabled: true
  evaluation-window: 5m
  minimum-requests: 1000
  services:
    order-service:
      initial-traffic-percentage: 1.0
      max-traffic-percentage: 100.0
      increment-step: 10.0
      step-duration: 5m
      health-check:
        error-rate-threshold: 2.0
        p99-latency-threshold: 500.0
        consecutive-successes-required: 3
      constraints:
        - type: USER_SEGMENT
          percentage: 100.0
        - type: REGION
          values: ["us-east", "eu-west"]
        - type: ENVIRONMENT
          values: ["staging", "production"]
management:
  endpoints:
    web:
      exposure:
        include: health,metrics,prometheus,rollout
  # Spring Boot 3 property name (Boot 2.x used management.metrics.export.prometheus.enabled)
  prometheus:
    metrics:
      export:
        enabled: true
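
The `p99-latency-threshold` setting is bound into `HealthCheckConfig`, but the health analysis shown earlier compares only the error rate. If latency should also gate the rollout, the check could look roughly like the sketch below. `LatencyCheck` is a hypothetical helper that computes a nearest-rank percentile over raw samples and assumes the threshold is expressed in milliseconds; with Micrometer you would more likely read a published percentile from the timer instead of keeping raw samples.

```java
import java.util.Arrays;

final class LatencyCheck {
    private LatencyCheck() {}

    /** Nearest-rank percentile: the smallest sample with at least p percent of samples at or below it. */
    static double percentile(double[] samplesMillis, double p) {
        if (samplesMillis.length == 0) {
            throw new IllegalArgumentException("no samples");
        }
        if (p <= 0 || p > 100) {
            throw new IllegalArgumentException("p must be in (0, 100]");
        }
        double[] sorted = samplesMillis.clone(); // leave the caller's array untouched
        Arrays.sort(sorted);
        int rank = (int) Math.ceil(p * sorted.length / 100.0); // 1-based rank
        return sorted[Math.max(rank, 1) - 1];
    }

    /** True when the 99th percentile latency does not exceed the threshold. */
    static boolean withinThreshold(double[] samplesMillis, double p99ThresholdMillis) {
        return percentile(samplesMillis, 99.0) <= p99ThresholdMillis;
    }
}
```

A rollout would then count as healthy only when both the error-rate check and `withinThreshold(samples, 500.0)` pass.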

Testing the Implementation

class ProgressiveRolloutRouterTest {

    private ProgressiveRolloutRouter router;

    @BeforeEach
    void setUp() {
        RolloutConfig config = new RolloutConfig();
        config.setEnabled(true);
        // Use a real in-memory registry: a mocked MeterRegistry returns null from
        // config() and counter(), which breaks Timer.start() and the counters.
        router = new ProgressiveRolloutRouter(config, new SimpleMeterRegistry());
    }

    @Test
    void routesToOldVersionWhenNoRolloutIsActive() {
        String result = router.route("order-service", "testOp", "user-123",
                () -> "old", () -> "new");

        assertThat(result).isEqualTo("old");
    }

    @Test
    void routesAllTrafficToNewVersionAtOneHundredPercent() {
        router.startRollout(new RolloutDefinition("order-service", "2.0.0"));
        router.updateTrafficPercentage("order-service", 100.0);

        String result = router.route("order-service", "testOp", "user-123",
                () -> "old", () -> "new");

        assertThat(result).isEqualTo("new");
    }
}
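
Percentage-based routing is probabilistic, so a useful test checks the observed split over many requests rather than the outcome of a single call. The sketch below reproduces the router's `random.nextDouble() * 100 < percentage` decision in isolation, with a seeded `Random` so the run is repeatable. `SplitSimulation` is a hypothetical helper, not part of the classes above.

```java
import java.util.Random;

final class SplitSimulation {
    private SplitSimulation() {}

    /** Simulates routing decisions and returns the percentage sent to the new version. */
    static double observedPercentage(double trafficPercentage, int requests, long seed) {
        Random random = new Random(seed); // fixed seed keeps the simulation repeatable
        int routedToNew = 0;
        for (int i = 0; i < requests; i++) {
            // Same comparison the router uses for each incoming request.
            if (random.nextDouble() * 100 < trafficPercentage) {
                routedToNew++;
            }
        }
        return routedToNew * 100.0 / requests;
    }
}
```

An assertion with a tolerance, for example that 100,000 simulated requests at 25% land within one percentage point of 25%, avoids flaky failures while still catching an inverted or off-by-a-factor comparison.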

Best Practices

  1. Start Conservatively: Begin with 1% traffic and small increments
  2. Monitor Business Metrics: Track conversion rates, revenue impact
  3. Set Meaningful Timeouts: Ensure health checks complete promptly
  4. Implement Manual Controls: Allow operator intervention
  5. Log Comprehensive Data: Maintain audit trails of rollout decisions
  6. Test Rollback Procedures: Ensure smooth reversion when needed
  7. Use Multiple Dimensions: Combine user segments, regions, and other attributes
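
For the manual-control and audit-trail practices, it helps to record every change to a rollout as an immutable event. The following is a minimal in-memory sketch; `RolloutAuditLog`, its `Entry` type, and the action and actor names are hypothetical, and a real system would persist these entries rather than hold them in memory.

```java
import java.time.Instant;
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.concurrent.CopyOnWriteArrayList;

final class RolloutAuditLog {
    /** One immutable record of who did what to which rollout, and at what traffic level. */
    static final class Entry {
        final Instant at;
        final String rolloutId;
        final String action;
        final String actor;
        final double trafficPercentage;

        Entry(Instant at, String rolloutId, String action, String actor, double trafficPercentage) {
            this.at = at;
            this.rolloutId = rolloutId;
            this.action = action;
            this.actor = actor;
            this.trafficPercentage = trafficPercentage;
        }
    }

    // Safe for concurrent appends from scheduler threads and request threads.
    private final List<Entry> entries = new CopyOnWriteArrayList<>();

    void record(String rolloutId, String action, String actor, double trafficPercentage) {
        entries.add(new Entry(Instant.now(), rolloutId, action, actor, trafficPercentage));
    }

    /** Returns the entries for one rollout in the order they were recorded. */
    List<Entry> historyFor(String rolloutId) {
        List<Entry> result = new ArrayList<>();
        for (Entry entry : entries) {
            if (entry.rolloutId.equals(rolloutId)) {
                result.add(entry);
            }
        }
        return Collections.unmodifiableList(result);
    }
}
```

Recording the actor alongside the action distinguishes automatic steps taken by the scheduler from operator interventions such as a manual pause or rollback.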

Conclusion

Implementing progressive rollout in Java provides a robust framework for safe, controlled deployment of new features and services. This implementation offers:

  • Gradual traffic increase with configurable steps
  • Multi-dimensional routing based on user attributes and context
  • Error-rate health monitoring with automatic rollback
  • Flexible constraints for targeted rollouts
  • Real-time metrics collection for informed decisions

By adopting this progressive rollout framework, you limit how many users a bad release can reach, validate changes against real production traffic, and keep a fast path back to the previous version whenever a health check fails.
