Write Amplification
Write amplification (WA) is one of the most important concepts in storage systems: a single logical write from your application can cause several physical writes inside the device or system.
The formula
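The write amplification factor (WAF) is the ratio of physical bytes written to logical bytes written:

WAF = Physical bytes written / Logical bytes written

A WAF of 1.0 is ideal. In practice, values range from 1.5 to 30 or more, depending on system and workload.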
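As a minimal sketch of the ratio in code: the function name and the counter values below are hypothetical, chosen only to illustrate the arithmetic, and do not come from any real device.

```python
def write_amplification_factor(physical_bytes: int, logical_bytes: int) -> float:
    """Return physical bytes written divided by logical bytes written."""
    if logical_bytes <= 0:
        raise ValueError("logical_bytes must be positive")
    return physical_bytes / logical_bytes


GIB = 1024 ** 3

# Hypothetical counters: the application wrote 100 GiB,
# while 250 GiB were physically written underneath it.
waf = write_amplification_factor(250 * GIB, 100 * GIB)
print(f"WAF: {waf:.2f}")  # WAF: 2.50
```

Both arguments must cover the same time window; otherwise the ratio is meaningless.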
Why it happens in SSDs
SSDs erase at block level (256KB–4MB) but write at page level (4KB). A page cannot be overwritten in place: it must be erased first, and erasure only happens a whole block at a time. Result: updating 4KB can force reading, erasing, and rewriting an entire block.
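To put a number on that worst case, the sketch below assumes a single small update forces the whole erase block to be rewritten. The function name is hypothetical, and real controllers usually do better by remapping pages and deferring erases to garbage collection, so treat the result as an upper bound rather than a typical value.

```python
KIB = 1024


def worst_case_waf(block_bytes: int, update_bytes: int) -> float:
    """WAF when one update of update_bytes forces a rewrite of the full erase block."""
    if update_bytes <= 0 or update_bytes > block_bytes:
        raise ValueError("update_bytes must be in the range 1..block_bytes")
    return block_bytes / update_bytes


# Block sizes from the range quoted above, with a 4 KiB page-sized update.
for block_kib in (256, 1024, 4096):
    waf = worst_case_waf(block_kib * KIB, 4 * KIB)
    print(f"{block_kib:>4} KiB block, 4 KiB update: worst-case WAF = {waf:.0f}")
```

For a 256 KiB block this gives 64, and for a 4 MiB block 1024.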
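```python
# Estimate WAF from /proc/diskstats write counters (sampled before and after a workload).
# bytes_per_op is an assumed parameter: the size of each logical write in bytes.
SECTOR_BYTES = 512  # diskstats sector counts use 512-byte units

def measure_waf(sectors_before, sectors_after, ops_before, ops_after, bytes_per_op):
    physical_bytes = (sectors_after - sectors_before) * SECTOR_BYTES
    logical_bytes = (ops_after - ops_before) * bytes_per_op
    # Both sides are in bytes, so the result is a dimensionless ratio.
    return physical_bytes / logical_bytes if logical_bytes > 0 else 0.0
```

WAF in LSM trees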
```python
# Rough estimate: data is rewritten about fan_out / 2 times at each level.
levels, fan_out = 4, 10
print(f"Estimated LSM WAF: {levels * fan_out / 2}x")  # 20.0x
```

Reducing WAF
| Technique | System | Effect |
|---|---|---|
| Sequential writes | SSD/HDD | 10–100x vs random |
| Over-provisioning | SSD | Less GC pressure |
| Tiered compaction | LSM | Fewer rewrites |
| Erasure coding | Distributed | 1.5x vs 3x replication |
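The last two rows can be put into rough numbers. The sketch below is illustrative only: the function names are hypothetical, the 6 data + 3 parity shard layout is an assumption chosen because it reproduces the 1.5x figure, and the tiered estimate assumes each byte is rewritten roughly once per level, which is a common approximation rather than a measured result.

```python
def replication_factor(copies: int) -> float:
    """Physical bytes written per logical byte with full replication."""
    if copies <= 0:
        raise ValueError("copies must be positive")
    return float(copies)


def erasure_coding_factor(data_shards: int, parity_shards: int) -> float:
    """Physical bytes written per logical byte with k data + m parity shards."""
    if data_shards <= 0 or parity_shards < 0:
        raise ValueError("need data_shards > 0 and parity_shards >= 0")
    return (data_shards + parity_shards) / data_shards


def leveled_waf(levels: int, fan_out: int) -> float:
    """Leveled-compaction estimate: levels * fan_out / 2."""
    return levels * fan_out / 2


def tiered_waf(levels: int) -> float:
    """Tiered-compaction estimate (assumption): one rewrite per level."""
    return float(levels)


print(f"3x replication:        {replication_factor(3):.1f}x")
print(f"Erasure coding (6+3):  {erasure_coding_factor(6, 3):.1f}x")
print(f"Leveled LSM (4 x 10):  {leveled_waf(4, 10):.1f}x")
print(f"Tiered LSM (4 levels): {tiered_waf(4):.1f}x")
```

Tiered compaction generally trades its lower write amplification for higher read and space amplification, so the smaller number is not free.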