Virtual threads, developed under Project Loom, became a final feature in Java 21 (JEP 444). They make high-throughput concurrent programming dramatically simpler, but achieving good performance still requires understanding their characteristics and applying appropriate tuning strategies. This article explores practical performance tuning techniques for virtual threads in Java.
Virtual Threads: A Quick Recap
Virtual threads are lightweight threads managed by the Java runtime rather than the operating system. Unlike platform threads, you can have millions of virtual threads simultaneously without overwhelming the system.
Key Characteristics:
- Lightweight: Small memory footprint (stack frames live on the heap and grow as needed, typically a few KB, vs ~1MB reserved per platform thread by default)
- Managed by JVM: Scheduled on carrier threads (platform threads)
- Blocking-Friendly: Cheap to block, enabling simple synchronous code
- Scalable: Designed for massive concurrency
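To make these characteristics concrete, here is a minimal sketch that starts a large number of virtual threads, lets each one block briefly, and waits for all of them. The class name `VirtualThreadBasics` and the helper `runBlockingTasks` are hypothetical names chosen for this illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.atomic.AtomicInteger;

public class VirtualThreadBasics {

    /** Starts {@code count} virtual threads that each block briefly; returns how many finished. */
    static int runBlockingTasks(int count) {
        AtomicInteger finished = new AtomicInteger();
        List<Thread> threads = new ArrayList<>(count);
        for (int i = 0; i < count; i++) {
            Thread t = Thread.ofVirtual().name("demo-" + i).start(() -> {
                try {
                    Thread.sleep(50); // blocking parks the virtual thread and frees its carrier
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                    return;
                }
                finished.incrementAndGet();
            });
            threads.add(t);
        }
        try {
            for (Thread t : threads) {
                t.join();
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while joining", e);
        }
        return finished.get();
    }

    public static void main(String[] args) {
        Thread probe = Thread.ofVirtual().unstarted(() -> {});
        // Virtual threads are always daemon threads
        System.out.println("virtual=" + probe.isVirtual() + ", daemon=" + probe.isDaemon());
        System.out.println("finished=" + runBlockingTasks(10_000)); // prints finished=10000
    }
}
```

Creating the same number of platform threads would usually be far more expensive, which is the practical meaning of "lightweight" here.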
Performance Tuning Strategies
1. Configure Carrier Thread Pool Size
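Virtual threads run on carrier threads from a dedicated ForkJoinPool (separate from the common pool). Its default parallelism is the number of available processors. Because a virtual thread releases its carrier when it blocks on I/O, the default suits most workloads; raising it mainly helps when threads are frequently pinned.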
import java.time.Duration;
import java.util.concurrent.Executors;

public class CarrierThreadTuning {
    // The scheduler reads its configuration once, when it is first initialized,
    // so pass these as JVM flags instead of calling System.setProperty() in main():
    //   java -Djdk.virtualThreadScheduler.parallelism=64 \
    //        -Djdk.virtualThreadScheduler.maxPoolSize=256 CarrierThreadTuning
    // Java 21 offers no public API for plugging in a custom scheduler.
    public static void main(String[] args) {
        try (var executor = Executors.newVirtualThreadPerTaskExecutor()) {
            for (int i = 0; i < 10_000; i++) {
                executor.submit(() -> {
                    // Task execution
                    performIOOperation();
                });
            }
        } // close() waits for all submitted tasks to finish
    }

    static void performIOOperation() {
        try {
            Thread.sleep(Duration.ofMillis(100)); // Simulate I/O
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }
}
Tuning Guidelines:
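- I/O-heavy workloads: Start from the default; blocked virtual threads release their carrier, so try 2-4x CPU cores only if measurements show carrier starvation (for example from pinning)
- CPU-heavy workloads: Keep near CPU core count
- Mixed workloads: Benchmark the default against 2x CPU cores and keep whichever measures better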
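Before changing the parallelism, it helps to check whether carriers are actually the bottleneck. The sketch below is a rough check, not a benchmark: it runs many blocking tasks and reports the wall-clock time next to the configured parallelism. `ParallelismCheck`, `runBlocking`, and `Result` are hypothetical names for this illustration.

```java
import java.time.Duration;
import java.util.concurrent.Executors;
import java.util.concurrent.atomic.AtomicInteger;

public class ParallelismCheck {

    record Result(int completed, long elapsedMillis) {}

    /** Runs {@code tasks} virtual threads that each block for {@code blockTime}. */
    static Result runBlocking(int tasks, Duration blockTime) {
        AtomicInteger completed = new AtomicInteger();
        long start = System.nanoTime();
        try (var executor = Executors.newVirtualThreadPerTaskExecutor()) {
            for (int i = 0; i < tasks; i++) {
                executor.submit(() -> {
                    try {
                        Thread.sleep(blockTime); // simulated blocking I/O
                        completed.incrementAndGet();
                    } catch (InterruptedException e) {
                        Thread.currentThread().interrupt();
                    }
                });
            }
        } // close() waits for every task
        long elapsedMillis = (System.nanoTime() - start) / 1_000_000;
        return new Result(completed.get(), elapsedMillis);
    }

    public static void main(String[] args) {
        int cores = Runtime.getRuntime().availableProcessors();
        // The property is null unless it was passed on the command line
        String configured = System.getProperty(
            "jdk.virtualThreadScheduler.parallelism", "default (" + cores + ")");
        Result result = runBlocking(5_000, Duration.ofMillis(100));
        System.out.println("parallelism=" + configured
            + ", completed=" + result.completed()
            + ", elapsedMs=" + result.elapsedMillis());
    }
}
```

If the elapsed time stays close to the blocking time of a single task, the carriers are not saturated and a larger pool is unlikely to help. Rerun with different `-Djdk.virtualThreadScheduler.parallelism` values to compare.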
2. Avoid Thread Local Abuse
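Virtual threads support ThreadLocal, but every thread gets its own copy. Caching expensive objects per thread therefore multiplies memory use across potentially millions of short-lived threads and defeats the purpose of the cache.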
public class ThreadLocalOptimization {
    // PROBLEMATIC: ThreadLocal with virtual threads (one SimpleDateFormat per thread)
    private static final ThreadLocal<SimpleDateFormat> DATE_FORMATTER =
        ThreadLocal.withInitial(() -> new SimpleDateFormat("yyyy-MM-dd"));

    // BETTER: Use immutable, thread-safe alternatives
    private static final DateTimeFormatter SAFE_DATE_FORMATTER =
        DateTimeFormatter.ofPattern("yyyy-MM-dd");

    // PREFER: ScopedValue for context propagation
    // (preview API in Java 21-24, needs --enable-preview; final in Java 25)
    private static final ScopedValue<UserContext> USER_CONTEXT = ScopedValue.newInstance();

    public void processRequest() {
        var userContext = new UserContext();
        try {
            // The binding is visible only inside run() and is removed automatically
            ScopedValue.where(USER_CONTEXT, userContext).run(() -> {
                handleRequest();
            });
        } finally {
            userContext.cleanup(); // Releases the context's own resources
        }
    }
}
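The cost of per-thread caching is easy to observe. The following sketch counts how many times a ThreadLocal initializer runs when each task gets its own virtual thread, next to a single shared `DateTimeFormatter`. `ThreadLocalCost` and `instancesCreatedBy` are hypothetical names for this illustration.

```java
import java.text.SimpleDateFormat;
import java.time.LocalDate;
import java.time.format.DateTimeFormatter;
import java.util.Date;
import java.util.concurrent.Executors;
import java.util.concurrent.atomic.AtomicInteger;

public class ThreadLocalCost {
    static final AtomicInteger INSTANCES = new AtomicInteger();

    // Per-thread cache: the initializer runs once for every thread that touches it
    static final ThreadLocal<SimpleDateFormat> PER_THREAD = ThreadLocal.withInitial(() -> {
        INSTANCES.incrementAndGet();
        return new SimpleDateFormat("yyyy-MM-dd");
    });

    // One immutable, thread-safe instance shared by all threads
    static final DateTimeFormatter SHARED = DateTimeFormatter.ofPattern("yyyy-MM-dd");

    /** Runs {@code tasks} virtual threads and returns how many SimpleDateFormat objects were created. */
    static int instancesCreatedBy(int tasks) {
        INSTANCES.set(0);
        try (var executor = Executors.newVirtualThreadPerTaskExecutor()) {
            for (int i = 0; i < tasks; i++) {
                executor.submit(() -> {
                    PER_THREAD.get().format(new Date(0L)); // allocates a new formatter per thread
                    SHARED.format(LocalDate.EPOCH);        // reuses the single shared formatter
                });
            }
        }
        return INSTANCES.get();
    }

    public static void main(String[] args) {
        System.out.println("formatters created: " + instancesCreatedBy(1_000)); // prints formatters created: 1000
    }
}
```

Because a thread-per-task executor never reuses a thread, the "cache" allocates one formatter per task, while the shared formatter is created once.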
3. Optimize Synchronization and Pinning
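On Java 21 through 23, a virtual thread that blocks inside a synchronized block or method is pinned to its carrier thread, which reduces throughput. JDK 24 (JEP 491) removes most of this pinning, but keeping critical sections short and free of I/O remains good practice.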
public class SynchronizationOptimization {
    private final Object lock = new Object();
    private int counter;

    // PROBLEMATIC: Synchronized method with I/O
    public synchronized String fetchData(String id) throws IOException, InterruptedException {
        // I/O operation while synchronized - causes pinning on Java 21-23!
        String data = httpClient.send(buildRequest(id), BodyHandlers.ofString()).body();
        return process(data);
    }

    // BETTER: Separate synchronization from I/O
    public String fetchDataOptimized(String id) throws IOException, InterruptedException {
        String data = httpClient.send(buildRequest(id), BodyHandlers.ofString()).body();
        // Synchronize only the minimal critical section
        synchronized (lock) {
            counter++;
            return process(data);
        }
    }

    // BEST: Use java.util.concurrent locks
    private final ReentrantLock reentrantLock = new ReentrantLock();

    public String fetchDataBest(String id) throws IOException, InterruptedException {
        String data = httpClient.send(buildRequest(id), BodyHandlers.ofString()).body();
        reentrantLock.lock();
        try {
            counter++;
            return process(data);
        } finally {
            reentrantLock.unlock();
        }
    }

    // MONITOR: Detect pinning with JVM options
    public static void main(String[] args) {
        // Add JVM flags to detect pinning:
        // -Djdk.tracePinnedThreads=full (Java 21-23; later releases report pinning
        //   through the JFR event jdk.VirtualThreadPinned)
        // -Djdk.virtualThreadScheduler.parallelism=1
        // -Djdk.virtualThreadScheduler.maxPoolSize=1 (a single carrier makes pinning obvious)
    }
}
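Sometimes a blocking call really has to happen inside the critical section. A minimal sketch of that case with `ReentrantLock` follows; waiting virtual threads park on the lock instead of holding a carrier. `LockInsteadOfSynchronized`, `updateWithBlockingCall`, and `run` are hypothetical names, and the 1 ms sleep stands in for real I/O.

```java
import java.util.concurrent.Executors;
import java.util.concurrent.locks.ReentrantLock;

public class LockInsteadOfSynchronized {
    private final ReentrantLock lock = new ReentrantLock();
    private int counter; // guarded by lock

    void updateWithBlockingCall() {
        lock.lock();
        try {
            Thread.sleep(1); // stand-in for I/O that must happen under the lock
            counter++;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        } finally {
            lock.unlock();
        }
    }

    int count() {
        lock.lock();
        try {
            return counter;
        } finally {
            lock.unlock();
        }
    }

    /** Runs {@code tasks} virtual threads against one shared instance and returns the final count. */
    static int run(int tasks) {
        var demo = new LockInsteadOfSynchronized();
        try (var executor = Executors.newVirtualThreadPerTaskExecutor()) {
            for (int i = 0; i < tasks; i++) {
                executor.submit(() -> {
                    demo.updateWithBlockingCall();
                });
            }
        }
        return demo.count();
    }

    public static void main(String[] args) {
        System.out.println("counter=" + run(200)); // prints counter=200
    }
}
```

The lock still serializes the critical section, so this only avoids pinning; it does not make the guarded work run concurrently.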
4. Batch and Buffer I/O Operations
Virtual threads handle blocking I/O well, but many small requests still pay per-request overhead that concurrency and batching can reduce.
import java.net.http.HttpClient;
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.Executors;

public class IOBatching {
    private final HttpClient httpClient = HttpClient.newHttpClient();

    // SUBOPTIMAL: Many small sequential requests
    public List<String> fetchUserDataSequential(List<String> userIds) {
        return userIds.stream()
            .map(id -> {
                try {
                    return fetchUserData(id);
                } catch (Exception e) {
                    throw new RuntimeException(e);
                }
            })
            .toList();
    }

    // BETTER: Concurrent requests with virtual threads
    public List<String> fetchUserDataConcurrent(List<String> userIds) {
        try (var executor = Executors.newVirtualThreadPerTaskExecutor()) {
            List<CompletableFuture<String>> futures = userIds.stream()
                .map(id -> CompletableFuture.supplyAsync(() -> fetchUserData(id), executor))
                .toList();
            return futures.stream()
                .map(CompletableFuture::join)
                .toList();
        }
    }

    // OPTIMAL: Batch API calls when possible
    public List<String> fetchUserDataBatched(List<String> userIds) {
        // Group into batches of 100
        int batchSize = 100;
        List<List<String>> batches = new ArrayList<>();
        for (int i = 0; i < userIds.size(); i += batchSize) {
            batches.add(userIds.subList(i, Math.min(i + batchSize, userIds.size())));
        }
        try (var executor = Executors.newVirtualThreadPerTaskExecutor()) {
            // Submit every batch first. Submitting and joining inside one lazy
            // stream pipeline would process the batches one at a time.
            List<CompletableFuture<List<String>>> futures = batches.stream()
                // Use batch API endpoint if available
                .map(batch -> CompletableFuture.supplyAsync(() -> fetchBatchUserData(batch), executor))
                .toList();
            return futures.stream()
                .flatMap(future -> future.join().stream())
                .toList();
        }
    }

    private String fetchUserData(String userId) {
        // Individual user fetch
        return "data-" + userId;
    }

    private List<String> fetchBatchUserData(List<String> userIds) {
        // Batch fetch implementation
        return userIds.stream().map(this::fetchUserData).toList();
    }
}
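The batching idea can be factored into a small reusable helper. This is a sketch under assumptions: `BatchedFetch`, `partition`, and `fetchInBatches` are hypothetical names, and the `batchCall` function stands in for whatever batch endpoint the real service exposes.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.function.Function;
import java.util.stream.IntStream;

public class BatchedFetch {

    /** Splits {@code items} into consecutive sublists of at most {@code batchSize} elements. */
    static <T> List<List<T>> partition(List<T> items, int batchSize) {
        if (batchSize <= 0) {
            throw new IllegalArgumentException("batchSize must be positive");
        }
        List<List<T>> batches = new ArrayList<>();
        for (int i = 0; i < items.size(); i += batchSize) {
            batches.add(items.subList(i, Math.min(i + batchSize, items.size())));
        }
        return batches;
    }

    /** Runs one virtual thread per batch and returns the results in input order. */
    static <T, R> List<R> fetchInBatches(List<T> ids, int batchSize,
                                         Function<List<T>, List<R>> batchCall) {
        List<R> results = new ArrayList<>(ids.size());
        try (var executor = Executors.newVirtualThreadPerTaskExecutor()) {
            List<Future<List<R>>> futures = new ArrayList<>();
            for (List<T> batch : partition(ids, batchSize)) {
                Callable<List<R>> task = () -> batchCall.apply(batch);
                futures.add(executor.submit(task)); // all batches are submitted before any is awaited
            }
            for (Future<List<R>> future : futures) {
                results.addAll(future.get()); // collecting in submission order preserves input order
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting for batches", e);
        } catch (ExecutionException e) {
            throw new IllegalStateException("batch call failed", e.getCause());
        }
        return results;
    }

    public static void main(String[] args) {
        List<Integer> ids = IntStream.rangeClosed(1, 250).boxed().toList();
        List<String> data = fetchInBatches(ids, 100,
            batch -> batch.stream().map(id -> "data-" + id).toList());
        System.out.println(data.size() + " results, first=" + data.get(0) + ", last=" + data.get(249));
        // prints 250 results, first=data-1, last=data-250
    }
}
```

Submitting all batches before waiting on any of them is what keeps the batches concurrent.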
5. Memory and Resource Management
Virtual threads have small stacks but can still cause memory issues in large numbers.
public class ResourceManagement {
    // PROBLEMATIC: Unbounded virtual thread creation
    public void processMessagesUnbounded(List<Message> messages) {
        messages.forEach(message -> {
            Thread.startVirtualThread(() -> processMessage(message));
        });
        // No control over resource consumption, and nothing waits for completion
    }

    // BETTER: Scope the threads to an ExecutorService so their lifetime is bounded
    public void processMessagesBounded(List<Message> messages) {
        try (var executor = Executors.newVirtualThreadPerTaskExecutor()) {
            List<CompletableFuture<Void>> futures = messages.stream()
                .map(message -> CompletableFuture.runAsync(() -> processMessage(message), executor))
                .toList();
            // Wait for all tasks with timeout. On timeout join() throws
            // CompletionException, and close() still waits for running tasks.
            CompletableFuture.allOf(futures.toArray(new CompletableFuture[0]))
                .orTimeout(30, TimeUnit.SECONDS)
                .join();
        }
    }

    // BEST: Use semaphores for resource limiting
    private final Semaphore dbConnectionSemaphore = new Semaphore(50);

    public void processWithResourceLimit(List<Message> messages) {
        try (var executor = Executors.newVirtualThreadPerTaskExecutor()) {
            List<CompletableFuture<Void>> futures = messages.stream()
                .map(message -> CompletableFuture.runAsync(() -> {
                    try {
                        // Acquire outside the try/finally so a failed acquire never releases
                        dbConnectionSemaphore.acquire();
                    } catch (InterruptedException e) {
                        Thread.currentThread().interrupt();
                        return;
                    }
                    try {
                        processWithDatabase(message);
                    } finally {
                        dbConnectionSemaphore.release();
                    }
                }, executor))
                .toList();
            CompletableFuture.allOf(futures.toArray(new CompletableFuture[0])).join();
        }
    }
}
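To confirm that a semaphore really caps concurrency, the following sketch records the highest number of tasks that were inside the guarded section at the same time. `BoundedConcurrency`, `run`, and `Stats` are hypothetical names, and the 5 ms sleep stands in for a database call.

```java
import java.util.concurrent.Executors;
import java.util.concurrent.Semaphore;
import java.util.concurrent.atomic.AtomicInteger;

public class BoundedConcurrency {

    record Stats(int processed, int maxConcurrent) {}

    /** Runs {@code tasks} virtual threads while allowing at most {@code permits} in the guarded section. */
    static Stats run(int tasks, int permits) {
        Semaphore limiter = new Semaphore(permits);
        AtomicInteger inFlight = new AtomicInteger();
        AtomicInteger maxSeen = new AtomicInteger();
        AtomicInteger processed = new AtomicInteger();

        try (var executor = Executors.newVirtualThreadPerTaskExecutor()) {
            for (int i = 0; i < tasks; i++) {
                executor.submit(() -> {
                    try {
                        limiter.acquire(); // waiting here parks the virtual thread cheaply
                    } catch (InterruptedException e) {
                        Thread.currentThread().interrupt();
                        return; // no permit was obtained, so nothing to release
                    }
                    try {
                        int now = inFlight.incrementAndGet();
                        maxSeen.accumulateAndGet(now, Math::max);
                        Thread.sleep(5); // simulated use of the limited resource
                        processed.incrementAndGet();
                    } catch (InterruptedException e) {
                        Thread.currentThread().interrupt();
                    } finally {
                        inFlight.decrementAndGet();
                        limiter.release();
                    }
                });
            }
        }
        return new Stats(processed.get(), maxSeen.get());
    }

    public static void main(String[] args) {
        Stats stats = run(1_000, 50);
        System.out.println("processed=" + stats.processed()
            + ", maxConcurrent=" + stats.maxConcurrent() + " (limit 50)");
    }
}
```

All tasks get their own virtual thread, yet the observed concurrency never exceeds the permit count, which is the behavior a connection-limited backend needs.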
6. Monitoring and Profiling
Effective monitoring is crucial for performance tuning.
import java.time.Duration;
import java.util.concurrent.Executors;
import java.util.concurrent.ThreadFactory;

public class VirtualThreadMonitoring {
    public static void main(String[] args) {
        // Virtual threads have no dedicated metrics switch; observe them through
        // JFR events (jdk.VirtualThreadStart, jdk.VirtualThreadEnd,
        // jdk.VirtualThreadPinned) and thread dumps.
        // Naming the threads makes them identifiable in that output.
        ThreadFactory virtualThreadFactory = Thread.ofVirtual()
            .name("worker-", 0)
            .uncaughtExceptionHandler((t, e) ->
                System.err.println("Uncaught exception in thread: " + t.getName() + ", error: " + e))
            .factory();
        try (var executor = Executors.newThreadPerTaskExecutor(virtualThreadFactory)) {
            for (int i = 0; i < 1000; i++) {
                final int taskId = i;
                // Exceptions thrown by submit() tasks are captured in the returned
                // Future; the handler above sees only exceptions that escape a thread
                executor.submit(() -> {
                    System.out.println("Executing task " + taskId + " on thread: " +
                        Thread.currentThread().getName());
                    performTask(taskId);
                });
            }
        }
        // Monitor with JFR (Java Flight Recorder)
        // jcmd <pid> JFR.start duration=60s filename=virtualthreads.jfr
    }

    static void performTask(int id) {
        try {
            // Simulate work with some I/O
            Thread.sleep(Duration.ofMillis(100));
            if (Thread.currentThread().isVirtual()) {
                // This is a virtual thread
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }
}
JVM Flags for Monitoring:
# Trace pinned threads (Java 21-23)
-Djdk.tracePinnedThreads=full
# JFR recording for virtual threads
jcmd <pid> JFR.start duration=60s filename=vt-profile.jfr settings=profile
# Thread dump that includes virtual threads
jcmd <pid> Thread.dump_to_file -format=json threads.json
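JFR can also be driven from code, which is convenient in tests or short experiments. The sketch below is one possible approach, assuming the `jdk.jfr` module is present in the runtime: it enables the virtual thread events, runs a few tasks, and counts the recorded start events. `VirtualThreadJfr` and `countStartEvents` are hypothetical names, and the 20 ms pinning threshold is only an example value.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.time.Duration;
import java.util.concurrent.Executors;
import jdk.jfr.Recording;
import jdk.jfr.consumer.RecordedEvent;
import jdk.jfr.consumer.RecordingFile;

public class VirtualThreadJfr {

    /** Runs {@code tasks} short virtual threads under a recording; returns the number of start events found. */
    static long countStartEvents(int tasks) {
        try (Recording recording = new Recording()) {
            // Start/End events are disabled by default, so enable them explicitly
            recording.enable("jdk.VirtualThreadStart");
            recording.enable("jdk.VirtualThreadEnd");
            // Report pinning that lasts at least 20 ms (example threshold)
            recording.enable("jdk.VirtualThreadPinned").withThreshold(Duration.ofMillis(20));
            recording.start();

            try (var executor = Executors.newVirtualThreadPerTaskExecutor()) {
                for (int i = 0; i < tasks; i++) {
                    executor.submit(() -> {
                        try {
                            Thread.sleep(10); // simulated I/O
                        } catch (InterruptedException e) {
                            Thread.currentThread().interrupt();
                        }
                    });
                }
            }

            recording.stop();
            Path file = Files.createTempFile("vt-profile", ".jfr");
            file.toFile().deleteOnExit();
            recording.dump(file);

            long started = 0;
            for (RecordedEvent event : RecordingFile.readAllEvents(file)) {
                if (event.getEventType().getName().equals("jdk.VirtualThreadStart")) {
                    started++;
                }
            }
            return started;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println("VirtualThreadStart events: " + countStartEvents(50));
    }
}
```

The same loop can filter for `jdk.VirtualThreadPinned` to find pinned sections, using the stack trace attached to each event.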
7. Database Connection Pool Tuning
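Virtual threads change connection pool requirements: with no thread pool capping concurrency, the connection pool becomes the effective limit on concurrent database work, so size it for what the database can sustain.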
public class DatabasePoolTuning {
    // Traditional sizing (platform threads)
    // pool-size = max-concurrent-requests
    // Virtual thread sizing (different approach)
    // pool-size = max-concurrent-database-connections-needed

    @Bean
    public DataSource dataSource() {
        HikariConfig config = new HikariConfig();
        // With virtual threads, far more tasks than connections may be waiting on
        // database I/O, so you might need MORE connections. Raise the limit only
        // as far as the database can sustain, and benchmark the change.
        config.setMaximumPoolSize(200); // Example value, increased from a typical 20-50
        // How long (ms) a thread waits for a connection; tune under load, since
        // many virtual threads may queue for the pool at once
        config.setConnectionTimeout(5000);
        // Smaller minimum idle since we can scale quickly
        config.setMinimumIdle(10);
        return new HikariDataSource(config);
    }

    // Use with virtual threads (userRepository is an injected repository, not shown)
    public List<User> processWithDatabase(List<String> userIds) {
        try (var executor = Executors.newVirtualThreadPerTaskExecutor()) {
            List<CompletableFuture<User>> futures = userIds.stream()
                .map(id -> CompletableFuture.supplyAsync(() ->
                    userRepository.findById(id), executor))
                .toList();
            return futures.stream()
                .map(CompletableFuture::join)
                .toList();
        }
    }
}
Performance Anti-Patterns to Avoid
public class VirtualThreadAntiPatterns {
    // ANTI-PATTERN 1: Using synchronized with I/O
    public synchronized void processRequest() {
        databaseCall(); // I/O in synchronized method - PINNING on Java 21-23!
        fileOperation(); // More I/O - MORE PINNING!
    }

    // ANTI-PATTERN 2: Creating unbounded numbers of virtual threads
    public void massiveCreation() {
        while (true) {
            Thread.startVirtualThread(this::blockingOperation);
            // No resource control - potential memory exhaustion
        }
    }

    // ANTI-PATTERN 3: Using ThreadLocal without cleanup
    public void leakyMethod() {
        threadLocal.set(expensiveObject);
        // Forgetting to remove() - memory leak with millions of virtual threads
    }

    // ANTI-PATTERN 4: CPU-bound work in virtual threads
    public void cpuIntensiveWork() {
        Thread.startVirtualThread(() -> {
            // No gain over a platform thread, and the task occupies its
            // carrier until it finishes because virtual threads are not time-sliced
            mathematicalComputation();
        });
    }
}
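For anti-pattern 4, a common alternative is to send CPU-bound work to a fixed pool of platform threads sized to the core count. A minimal sketch follows; `CpuBoundOffload`, `sumOfSquares`, and `computeAll` are hypothetical names, and the arithmetic is only a stand-in for real computation.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

public class CpuBoundOffload {

    /** Pure CPU work: 1^2 + 2^2 + ... + n^2. */
    static long sumOfSquares(int n) {
        long sum = 0;
        for (long i = 1; i <= n; i++) {
            sum += i * i;
        }
        return sum;
    }

    /** Computes every input on a platform-thread pool with one thread per core. */
    static List<Long> computeAll(List<Integer> inputs) {
        int cores = Runtime.getRuntime().availableProcessors();
        ExecutorService cpuPool = Executors.newFixedThreadPool(cores);
        try {
            List<Future<Long>> futures = new ArrayList<>();
            for (int n : inputs) {
                Callable<Long> task = () -> sumOfSquares(n);
                futures.add(cpuPool.submit(task));
            }
            List<Long> results = new ArrayList<>();
            for (Future<Long> future : futures) {
                results.add(future.get()); // results come back in input order
            }
            return results;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while computing", e);
        } catch (ExecutionException e) {
            throw new IllegalStateException("computation failed", e.getCause());
        } finally {
            cpuPool.shutdown();
        }
    }

    public static void main(String[] args) {
        System.out.println(computeAll(List.of(10, 100, 1000))); // prints [385, 338350, 333833500]
    }
}
```

In a real service the pool would normally be created once and shared, so that the total number of CPU-bound tasks running at a time stays near the core count.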
Benchmarking Virtual Threads
Always validate performance improvements with proper benchmarks:
import java.util.List;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.stream.IntStream;
import org.openjdk.jmh.annotations.*;

@State(Scope.Thread)
@BenchmarkMode(Mode.Throughput)
@OutputTimeUnit(TimeUnit.SECONDS)
public class VirtualThreadBenchmark {
    private ExecutorService virtualThreadExecutor;
    private ExecutorService platformThreadExecutor;

    @Setup
    public void setup() {
        virtualThreadExecutor = Executors.newVirtualThreadPerTaskExecutor();
        platformThreadExecutor = Executors.newFixedThreadPool(200);
    }

    @Benchmark
    public void virtualThreadsIO() throws Exception {
        performIOBoundWork(virtualThreadExecutor);
    }

    @Benchmark
    public void platformThreadsIO() throws Exception {
        performIOBoundWork(platformThreadExecutor);
    }

    // Submits 1000 tasks that each block for 10 ms and waits for all of them
    private void performIOBoundWork(ExecutorService executor) throws Exception {
        List<CompletableFuture<Void>> futures = IntStream.range(0, 1000)
            .mapToObj(i -> CompletableFuture.runAsync(() -> {
                try {
                    Thread.sleep(10); // Simulate I/O
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                }
            }, executor))
            .toList();
        CompletableFuture.allOf(futures.toArray(new CompletableFuture[0])).join();
    }

    @TearDown
    public void tearDown() {
        virtualThreadExecutor.close();
        platformThreadExecutor.shutdown();
    }
}
Conclusion
Virtual threads offer tremendous performance benefits for I/O-bound workloads, but optimal performance requires thoughtful tuning:
- Right-size carrier thread pools based on workload characteristics
- Minimize synchronization and prefer ReentrantLock over synchronized
- Use ScopedValue instead of ThreadLocal for context propagation
- Implement proper resource limits with semaphores and bounded executors
- Monitor for pinning and optimize synchronized sections
- Adjust connection pools to account for higher concurrency
- Profile continuously with JFR and metrics
By applying these tuning strategies, you can maximize the throughput and efficiency of your virtual thread-based applications while maintaining the simplicity of synchronous programming models.