Why I/O issues are hard to spot
I/O sits behind everything: asset streaming, file indexing, paging, update systems, and background services. When the storage path misbehaves, the symptoms appear elsewhere: frame pacing gets erratic, apps “freeze”, and CPU usage looks misleadingly low because threads are simply blocked waiting on storage.
The key is to treat storage as a latency problem, not just a throughput problem. A drive can post great numbers in a short benchmark and still produce nasty latency tails under sustained pressure.
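To make that concrete, here is a tiny synthetic illustration (the numbers are made up, not measured from any drive): a headline summary statistic can look healthy while the 99th percentile, which is what users actually feel, is orders of magnitude worse.

```python
import random
import statistics

# Synthetic latencies in milliseconds: mostly fast requests plus rare stalls.
# These numbers are illustrative only -- they are not measured from a device.
random.seed(0)
samples = [random.uniform(0.1, 0.3) for _ in range(990)]  # typical requests
samples += [random.uniform(50, 200) for _ in range(10)]   # occasional stalls

median = statistics.median(samples)
p99 = statistics.quantiles(samples, n=100)[98]

print(f"median: {median:.2f} ms")  # looks perfectly healthy
print(f"p99:    {p99:.2f} ms")     # this is the stutter users notice
```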
The usual suspects (and how they show up)
PCIe negotiation & link stability
Link speed/width changes, retraining events, or borderline slots can cause sudden drops under load. If performance collapses only under sustained use, look here.
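One way to check this on Linux is to poll the kernel's sysfs attributes for the drive's PCIe device while the workload runs. A minimal sketch, assuming a hypothetical device address (find the real one with `lspci`) and the standard `current_link_speed`/`current_link_width` attributes:

```python
import time
from pathlib import Path

# Hypothetical PCIe address of the NVMe drive -- find yours with `lspci`.
DEV = Path("/sys/bus/pci/devices/0000:01:00.0")

def link_state():
    # current_link_speed / current_link_width are standard Linux sysfs attributes.
    speed = (DEV / "current_link_speed").read_text().strip()
    width = (DEV / "current_link_width").read_text().strip()
    return speed, width

# Poll while a sustained workload runs; any change suggests link retraining
# or a marginal slot/riser rather than a "slow drive".
baseline = link_state()
print("baseline:", baseline)
for _ in range(60):
    time.sleep(5)
    state = link_state()
    if state != baseline:
        print("link change detected:", state)
```

If the speed or width drops mid-run, suspect the slot, riser, or signal integrity before blaming the drive itself.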
Controller + driver behavior
Drivers can change queue handling, power states, and error recovery. “Minor” updates sometimes shift latency more than throughput.
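Because these shifts are easy to miss, it helps to snapshot the relevant versions and knobs before and after an update so you can diff them later. A rough sketch, assuming a Linux system with the controller exposed as nvme0 (adjust for your machine):

```python
from pathlib import Path
import platform

# Hypothetical controller name -- adjust for your system (see /sys/class/nvme/).
CTRL = Path("/sys/class/nvme/nvme0")

def read(p: Path) -> str:
    try:
        return p.read_text().strip()
    except OSError:
        return "n/a"

# Snapshot the things that quietly change between "minor" updates.
snapshot = {
    "kernel": platform.release(),
    "model": read(CTRL / "model"),
    "firmware": read(CTRL / "firmware_rev"),
    # APST tuning knob exposed by the Linux nvme_core module.
    "ps_max_latency_us": read(Path("/sys/module/nvme_core/parameters/default_ps_max_latency_us")),
}
for key, value in snapshot.items():
    print(f"{key}: {value}")
```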
Thermal throttling & sustained writes
An NVMe drive can look fast for 30 seconds and then fall off a cliff once its write cache fills or the controller heats up. If you see periodic dips, check temperatures and cache-exhaustion behavior.
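A quick way to separate thermal throttling from cache exhaustion is to log the drive temperature while a long write runs and line it up with the throughput dips. A sketch assuming `nvme-cli` is installed and `/dev/nvme0` is your controller (the smart-log output format varies between versions, so the parsing here is deliberately loose):

```python
import subprocess
import time

# Requires nvme-cli and root; /dev/nvme0 is a placeholder for your controller.
def read_temperature() -> str:
    out = subprocess.run(
        ["nvme", "smart-log", "/dev/nvme0"],
        capture_output=True, text=True, check=True,
    ).stdout
    # Grab the first line that reports a temperature value.
    for line in out.splitlines():
        if line.lower().startswith("temperature"):
            return line.split(":", 1)[1].strip()
    return "unknown"

# Sample during a long write: a temperature ramp that coincides with the
# throughput dip points at throttling; a dip without a ramp points at the
# write cache running out.
for _ in range(30):
    print(time.strftime("%H:%M:%S"), read_temperature())
    time.sleep(10)
```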
What to measure (so it reflects real usage)
Most user-visible pain comes from latency spikes and stalls, not from low average throughput. Use sustained runs with mixed workloads, and watch the tails. If you only run a short sequential test, you’re measuring the cache, not the system. A minimal measurement sketch follows the list.
- Latency p95/p99 during mixed reads/writes, not only sequential.
- Stall frequency (micro-freezes) during real tasks: installs, updates, streaming.
- Error counters and controller resets (even rare ones matter).
- Sustained throughput after cache is exhausted and temps stabilize.
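As a starting point, here is a minimal sketch of tail-latency measurement: timed 4 KiB random reads against a pre-created test file (the path and sizes are placeholders). It ignores O_DIRECT, queue depth, and write mixes, so the page cache will flatter the numbers; purpose-built benchmark tools handle those details. The point is the shape: collect per-request latencies and look at the tail.

```python
import os
import random
import statistics
import time

# Hypothetical test file -- pre-create something much larger than RAM so the
# page cache cannot serve everything.
PATH = "testfile.bin"
BLOCK = 4096
SAMPLES = 20_000

fd = os.open(PATH, os.O_RDONLY)
size = os.fstat(fd).st_size
latencies_ms = []

for _ in range(SAMPLES):
    # Pick a block-aligned random offset and time a single 4 KiB read.
    offset = random.randrange(0, size - BLOCK) // BLOCK * BLOCK
    start = time.perf_counter()
    os.pread(fd, BLOCK, offset)
    latencies_ms.append((time.perf_counter() - start) * 1000)

os.close(fd)

quantiles = statistics.quantiles(latencies_ms, n=100)
p95, p99 = quantiles[94], quantiles[98]
print(f"p95: {p95:.3f} ms  p99: {p99:.3f} ms  max: {max(latencies_ms):.3f} ms")
```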
Quick checklist
When performance feels “weird”
- Verify PCIe link width/speed is stable under load.
- Run a sustained mixed workload, not only sequential.
- Check NVMe temps and throttling behavior.
Before you blame the drive
- Confirm firmware + driver versions and recent changes.
- Disable “power saving” toggles temporarily to compare (a sketch of what to record first follows this list).
- Check cables/slots if SATA or add-in cards are involved.
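To make the before/after comparison honest, record the current power-management policies before flipping anything. A sketch for Linux, reading the stock sysfs locations for PCIe ASPM and SATA link power management (exact paths can differ by kernel and distro):

```python
from glob import glob
from pathlib import Path

def read(path: str) -> str:
    try:
        return Path(path).read_text().strip()
    except OSError:
        return "n/a"

# Power-management policies that commonly trade latency for watts.
print("PCIe ASPM policy:", read("/sys/module/pcie_aspm/parameters/policy"))

# SATA link power management (one entry per host adapter).
for host in sorted(glob("/sys/class/scsi_host/host*/link_power_management_policy")):
    print(host, "->", read(host))
```

Run it once before and once after toggling the settings, and keep both outputs next to your latency measurements.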