Taming the Chaos: Using LongAdder for High-Contention Counters in Java

In the world of concurrent programming, one of the most common yet deceptively challenging tasks is implementing a simple counter. When multiple threads are constantly updating a shared value, you create a high-contention scenario, which can severely bottleneck your application's performance. For years, AtomicLong was the go-to solution for this, but Java 8 introduced a smarter tool for the job: LongAdder.

This article explores why LongAdder is often the superior choice for high-throughput, write-heavy counters.

The Problem: Contention with AtomicLong

First, let's understand the issue with the classic approach. AtomicLong uses a compare-and-swap (CAS) operation to update its value atomically. In a low-contention environment, this is very efficient. A thread reads the current value, calculates the new one, and attempts to update it. If another thread hasn't changed the value in the meantime, the update succeeds.

However, under high contention, this changes dramatically. When dozens or hundreds of threads are trying to update the same variable, every update targets the same cache line, which has to bounce between cores. CAS attempts fail more often, and each failed thread retries in a tight loop, burning CPU cycles on work that is thrown away. Throughput can degrade significantly as threads are added, even though the logic is technically thread-safe.

The core problem: All threads are fighting to update a single memory location.
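The retry behaviour described above can be made visible with a hand-written CAS loop that counts its own failures. This is a minimal sketch for illustration: the class and method names (CasRetryDemo, incrementCountingRetries, run) are made up here, and a real JVM may compile AtomicLong.incrementAndGet() to a single hardware fetch-and-add instruction rather than a loop like this one.

```java
import java.util.concurrent.atomic.AtomicLong;

class CasRetryDemo {
    // Increments `counter` with an explicit CAS loop and returns how many
    // attempts failed before one succeeded.
    static long incrementCountingRetries(AtomicLong counter) {
        long failures = 0;
        while (true) {
            long current = counter.get();
            if (counter.compareAndSet(current, current + 1)) {
                return failures;
            }
            failures++; // another thread won the race; re-read and try again
        }
    }

    // Runs `threads` workers, each doing `perThread` increments, and returns
    // { final counter value, total failed CAS attempts }.
    static long[] run(int threads, int perThread) {
        AtomicLong counter = new AtomicLong();
        AtomicLong totalFailures = new AtomicLong();
        Thread[] workers = new Thread[threads];
        for (int i = 0; i < threads; i++) {
            workers[i] = new Thread(() -> {
                long failures = 0;
                for (int j = 0; j < perThread; j++) {
                    failures += incrementCountingRetries(counter);
                }
                totalFailures.addAndGet(failures);
            });
            workers[i].start();
        }
        for (Thread t : workers) {
            try {
                t.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException(e);
            }
        }
        return new long[] { counter.get(), totalFailures.get() };
    }

    public static void main(String[] args) {
        long[] result = run(8, 1_000_000);
        System.out.println("count = " + result[0]
                + ", failed CAS attempts = " + result[1]);
    }
}
```

The final count is always exact. The number of failed attempts varies from run to run and tends to grow with the number of threads, which is the wasted work the section is describing.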

The Solution: LongAdder's Divide-and-Conquer Strategy

LongAdder, part of the java.util.concurrent.atomic package, addresses this by employing a brilliant "divide-and-conquer" strategy. Instead of maintaining a single counter, it uses an array of variables called cells.

The internal magic works like this:

Base Variable: Initially, updates are made to a base field, similar to AtomicLong.
Cell Creation on Contention: When a thread detects contention (i.e., its CAS operation on the base variable fails), LongAdder lazily creates an array of cells. Each Cell is an independent counter, padded to avoid false sharing.
Distributed Updates: From that point on, each thread updates the Cell selected by a per-thread hash value. Cells are not strictly one per thread: if two threads collide on the same Cell, a thread can rehash to a different one, and the array can grow (up to roughly the number of CPUs). This drastically reduces contention because threads are now mostly writing to different memory locations.
Summing for Results: When you need the total (using the sum() method), LongAdder adds together the base value and the values of all cells. The result is not an atomic snapshot: updates made while sum() is running may or may not be included.

In essence, LongAdder scatters the contention across multiple variables and gathers the result only when needed.
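To make the scatter-and-gather idea concrete, here is a deliberately simplified striped counter. It is not LongAdder's actual implementation, only a sketch under stated assumptions: the class name StripedCounterSketch is hypothetical, the number of stripes is fixed up front, there is no base field, no rehashing on collision, and no padding (so neighbouring stripes can still suffer false sharing).

```java
import java.util.concurrent.atomic.AtomicLongArray;

class StripedCounterSketch {
    private final AtomicLongArray stripes;
    private final int mask;

    // stripeCount must be a power of two so that (hash & mask) is a valid index.
    StripedCounterSketch(int stripeCount) {
        if (stripeCount <= 0 || Integer.bitCount(stripeCount) != 1) {
            throw new IllegalArgumentException("stripeCount must be a power of two");
        }
        this.stripes = new AtomicLongArray(stripeCount);
        this.mask = stripeCount - 1;
    }

    // Scatter: each thread hashes to one stripe and updates only that slot.
    void increment() {
        int index = Thread.currentThread().hashCode() & mask;
        stripes.getAndIncrement(index);
    }

    // Gather: add up every stripe. Not an atomic snapshot under concurrent updates.
    long sum() {
        long total = 0;
        for (int i = 0; i < stripes.length(); i++) {
            total += stripes.get(i);
        }
        return total;
    }

    // Helper for experimentation: returns the total after all workers finish.
    static long run(int threads, int perThread) {
        StripedCounterSketch counter = new StripedCounterSketch(8);
        Thread[] workers = new Thread[threads];
        for (int i = 0; i < threads; i++) {
            workers[i] = new Thread(() -> {
                for (int j = 0; j < perThread; j++) {
                    counter.increment();
                }
            });
            workers[i].start();
        }
        for (Thread t : workers) {
            try {
                t.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException(e);
            }
        }
        return counter.sum();
    }
}
```

Once every writer has finished, sum() returns the exact total, because each increment landed atomically in exactly one stripe. In real code, prefer LongAdder itself, which handles sizing, collisions, and padding for you.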

Code Comparison: AtomicLong vs. LongAdder

Let's see the difference in code.

Using AtomicLong (The Old Way)

import java.util.concurrent.atomic.AtomicLong;

public class AtomicCounter {
    private final AtomicLong count = new AtomicLong(0);

    public void increment() {
        count.incrementAndGet(); // Every call atomically updates the same shared value
    }

    public long getCount() {
        return count.get(); // Single volatile read
    }
}

Using LongAdder (The Modern Way)

import java.util.concurrent.atomic.LongAdder;

public class AdderCounter {
    private final LongAdder count = new LongAdder();

    public void increment() {
        count.increment(); // Updates base, or a cell chosen by hashing the calling thread
    }

    public long getCount() {
        return count.sum(); // Adds base and all cells; not an atomic snapshot
    }
}

While the usage looks almost identical, the performance characteristics under load are vastly different.

Performance Benchmark

Let's simulate a high-contention scenario. The following conceptual benchmark pits AtomicLong against LongAdder with 4 threads, each performing 10 million increments. Treat the figures as illustrative.

Implementation	Time (ms)	Relative Performance
AtomicLong	~4500	1x (Baseline)
LongAdder	~800	~5.6x Faster

Note: Actual results depend on the JVM, the number of CPU cores, and the level of contention. With a single thread or very light contention the two perform similarly, and LongAdder's advantage generally grows as more threads update the counter at once.
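If you want to try the comparison on your own hardware, a rough harness might look like the following. It is a sketch under several assumptions: the class and method names (CounterBenchSketch, timeNanos, and so on) are invented for this example, there is only one crude warm-up round, and timing relies on System.nanoTime(). For numbers you intend to rely on, a dedicated harness such as JMH is the better tool.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.atomic.AtomicLong;
import java.util.concurrent.atomic.LongAdder;

class CounterBenchSketch {
    // Starts `threads` workers that each call `increment` `perThread` times
    // and returns the elapsed wall-clock time in nanoseconds.
    static long timeNanos(int threads, int perThread, Runnable increment) {
        CountDownLatch startGate = new CountDownLatch(1);
        Thread[] workers = new Thread[threads];
        for (int i = 0; i < threads; i++) {
            workers[i] = new Thread(() -> {
                try {
                    startGate.await(); // release all workers at the same moment
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                    return;
                }
                for (int j = 0; j < perThread; j++) {
                    increment.run();
                }
            });
            workers[i].start();
        }
        long start = System.nanoTime();
        startGate.countDown();
        for (Thread t : workers) {
            try {
                t.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException(e);
            }
        }
        return System.nanoTime() - start;
    }

    // Sanity helpers: return the final count so correctness can be checked.
    static long countWithAtomicLong(int threads, int perThread) {
        AtomicLong counter = new AtomicLong();
        timeNanos(threads, perThread, counter::incrementAndGet);
        return counter.get();
    }

    static long countWithLongAdder(int threads, int perThread) {
        LongAdder counter = new LongAdder();
        timeNanos(threads, perThread, counter::increment);
        return counter.sum();
    }

    public static void main(String[] args) {
        int threads = 4;
        int perThread = 10_000_000;

        // Crude warm-up so the JIT has compiled the hot paths before timing.
        countWithAtomicLong(threads, 1_000_000);
        countWithLongAdder(threads, 1_000_000);

        AtomicLong atomic = new AtomicLong();
        long atomicNanos = timeNanos(threads, perThread, atomic::incrementAndGet);

        LongAdder adder = new LongAdder();
        long adderNanos = timeNanos(threads, perThread, adder::increment);

        System.out.println("AtomicLong: " + atomicNanos / 1_000_000 + " ms");
        System.out.println("LongAdder:  " + adderNanos / 1_000_000 + " ms");
    }
}
```

Try varying the thread count: the gap between the two usually changes a lot between one thread and many.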

When to Use LongAdder (And When Not To)

LongAdder is not a silver bullet. Its design involves a trade-off.

Use LongAdder when:

You have a highly contended, write-heavy counter (e.g., statistics counters, requests-per-second meters, event counts).
The primary operations are increment(), decrement(), and add().
You can tolerate the higher cost of reading the value (since sum() must traverse the cell array).
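A common shape for the statistics case is one LongAdder per key inside a ConcurrentHashMap. The sketch below assumes a hypothetical EndpointHitCounter class that counts hits per endpoint name; the class and method names are illustrative only.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.LongAdder;

class EndpointHitCounter {
    private final ConcurrentHashMap<String, LongAdder> hits = new ConcurrentHashMap<>();

    // Creates the adder on first use, then increments it. Safe to call
    // from many threads at once.
    void record(String endpoint) {
        hits.computeIfAbsent(endpoint, key -> new LongAdder()).increment();
    }

    // Returns the current total for the endpoint, or 0 if it was never recorded.
    long count(String endpoint) {
        LongAdder adder = hits.get(endpoint);
        return adder == null ? 0L : adder.sum();
    }

    public static void main(String[] args) {
        EndpointHitCounter counter = new EndpointHitCounter();
        counter.record("/login");
        counter.record("/login");
        counter.record("/search");
        System.out.println("/login  -> " + counter.count("/login"));
        System.out.println("/search -> " + counter.count("/search"));
    }
}
```

Writes stay cheap on the hot path, and the cost of summing is only paid when a report or dashboard actually reads a value.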

Stick with AtomicLong when:

Contention is low, or you have a read-heavy workload. AtomicLong provides a cheap get() operation for frequent reads.
You need atomic read-modify-write operations such as compareAndSet or incrementAndGet, or precise, sequential updates that depend on reading the immediate previous value. LongAdder offers no such operations, and sum() returns a moving total rather than an atomic snapshot. That makes it unsuitable for sequence generators, where every caller must receive a unique, strictly ordered value.
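The sequence-generator case is worth seeing in code, because it is the one place where swapping in LongAdder would be a bug rather than an optimization. The sketch below uses a hypothetical SequenceGeneratorSketch class built on AtomicLong.incrementAndGet(), which updates the value and returns the result in one atomic step.

```java
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicLong;

class SequenceGeneratorSketch {
    private final AtomicLong next = new AtomicLong();

    // Atomically increments and returns the new value, so no two callers
    // can receive the same ID. LongAdder has no equivalent: calling
    // increment() and then sum() is two separate steps, and other threads
    // can update the counter in between.
    long nextId() {
        return next.incrementAndGet();
    }

    // Hands out IDs from several threads and returns how many distinct
    // IDs were observed.
    static int distinctIds(int threads, int perThread) {
        SequenceGeneratorSketch generator = new SequenceGeneratorSketch();
        Set<Long> seen = ConcurrentHashMap.newKeySet();
        Thread[] workers = new Thread[threads];
        for (int i = 0; i < threads; i++) {
            workers[i] = new Thread(() -> {
                for (int j = 0; j < perThread; j++) {
                    seen.add(generator.nextId());
                }
            });
            workers[i].start();
        }
        for (Thread t : workers) {
            try {
                t.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException(e);
            }
        }
        return seen.size();
    }

    public static void main(String[] args) {
        System.out.println("distinct IDs: " + distinctIds(4, 100_000));
    }
}
```

The number of distinct IDs equals threads times perThread, because every call to nextId() returns a value no other call can return.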

Conclusion

LongAdder is a masterclass in practical concurrency optimization. By trading expensive, centralized CAS operations for cheaper, distributed updates and a somewhat more costly read operation, it fits the needs of modern, high-throughput applications where counters are updated far more often than they are read.

The next time you find yourself reaching for an AtomicLong to track metrics in a performance-critical, multi-threaded environment, pause and ask: "Is this write-heavy?" If the answer is yes, LongAdder is almost certainly the better choice.

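Further Reading: LongAccumulator is a more generalized version of LongAdder that lets you supply your own accumulation function (for example, a running maximum), not just addition. The function should be side-effect-free, and its result must not depend on the order in which values are applied, because that order is not guaranteed across threads.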
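As a small taste of LongAccumulator, here is a sketch that tracks a running maximum. The MaxTrackerSketch class and its method names are hypothetical; the constructor takes the accumulation function (Long::max here) and an identity value to start from.

```java
import java.util.concurrent.atomic.LongAccumulator;

class MaxTrackerSketch {
    // Long.MIN_VALUE is the identity for max: any recorded value replaces it.
    private final LongAccumulator max = new LongAccumulator(Long::max, Long.MIN_VALUE);

    // Folds the value into the running maximum. Safe to call concurrently.
    void record(long value) {
        max.accumulate(value);
    }

    // Returns the largest value recorded so far.
    long current() {
        return max.get();
    }

    public static void main(String[] args) {
        MaxTrackerSketch tracker = new MaxTrackerSketch();
        tracker.record(5);
        tracker.record(42);
        tracker.record(17);
        System.out.println("max = " + tracker.current()); // prints "max = 42"
    }
}
```

Max works well here because the result is the same regardless of the order in which threads apply their values.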