I/O, PCIe & Storage Behavior

Storage problems rarely look like “storage problems.” They look like random stutters, app hangs, sudden latency spikes, or throughput that collapses under sustained load. This guide explains how PCIe, controllers, drivers, thermal limits, and queue behavior interact, and what to measure when something feels “off.”

Focus: Latency spikes
Signal: Queue & errors
Method: Sustained tests

Why I/O issues are hard to spot

I/O sits behind everything: asset streaming, file indexing, paging, update systems, and background services. When the storage path misbehaves, symptoms appear elsewhere: frame pacing becomes uneven, apps “freeze”, and CPU usage looks misleading, because threads blocked on I/O show up as idle time rather than load.

The key is to treat storage as a latency problem, not only a throughput problem. A drive can score high in short benchmarks while still producing nasty latency tails under sustained pressure.

The usual suspects (and how they show up)

PCIe negotiation & link stability

Link speed or width changes, retraining events, or electrically marginal slots can cause sudden drops under load. If performance collapses only under sustained use, check the link first.
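A minimal sketch of a link check, assuming a Linux system where the kernel exposes `current_link_speed`, `max_link_speed`, `current_link_width`, and `max_link_width` under `/sys/bus/pci/devices/<address>/`. The helper names and the device address in the comment are illustrative, not part of any tool.

```python
from pathlib import Path

LINK_ATTRIBUTES = (
    "current_link_speed", "max_link_speed",
    "current_link_width", "max_link_width",
)

def read_link_state(device_dir):
    """Read negotiated and maximum PCIe link speed/width from one sysfs device directory."""
    device = Path(device_dir)
    return {name: (device / name).read_text().strip() for name in LINK_ATTRIBUTES}

def link_is_degraded(state):
    """True if the negotiated link differs from what the device reports as its maximum."""
    return (state["current_link_speed"] != state["max_link_speed"]
            or state["current_link_width"] != state["max_link_width"])

# Hypothetical usage; replace the address with your drive's PCI address:
#   state = read_link_state("/sys/bus/pci/devices/0000:01:00.0")
#   print(state, "degraded" if link_is_degraded(state) else "ok")
```

A single reading at idle proves little, since some devices lower link speed to save power. Polling this while a sustained workload runs is what shows whether the link holds, and a change in either value mid-run is worth investigating.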

Controller + driver behavior

Drivers can change queue handling, power states, and error recovery. “Minor” updates sometimes shift latency more than throughput.

Thermal throttling & sustained writes

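An NVMe drive can look fast for 30 seconds and then fall off a cliff once its write cache fills or the controller starts throttling. If you see periodic dips, check temperatures and how the drive behaves after the cache is exhausted.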
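To make the "cliff" concrete, here is a small sketch that scans per-second throughput samples for the first sustained drop. The function name, the 5-sample window, and the 50% threshold are assumptions chosen for illustration. Tune them to your drive and workload.

```python
from statistics import median

def find_throughput_cliff(samples_mb_s, window=5, drop_ratio=0.5):
    """Return the index where throughput first stays below
    drop_ratio * baseline for `window` consecutive samples, or None.

    The baseline is the median of the first `window` samples.
    """
    if len(samples_mb_s) < 2 * window:
        return None  # not enough data to separate baseline from drop
    threshold = median(samples_mb_s[:window]) * drop_ratio
    run = 0
    for index, value in enumerate(samples_mb_s):
        if value < threshold:
            run += 1
            if run == window:
                return index - window + 1  # first sample of the sustained drop
        else:
            run = 0  # a short dip does not count as a cliff
    return None
```

Requiring several consecutive low samples keeps a single background hiccup from being reported as cache exhaustion. If the index it returns lines up with a temperature rise, suspect throttling. If it lines up with a fixed amount of data written, suspect the cache.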

What to measure (so it reflects real usage)

Most user-visible pain comes from latency spikes and stalls, not from average throughput. Use sustained runs, mixed workloads, and monitor the tails. If you only run a short sequential test, you’re measuring the cache, not the system.
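As a rough illustration of a sustained run, the sketch below writes incompressible blocks with `fsync` for a fixed duration and records the latency of each write. It is not a replacement for a proper benchmark tool such as fio. It covers writes only, because buffered reads would mostly be served from the page cache, and the function name and default sizes are assumptions.

```python
import os
import tempfile
import time

def sample_write_latency(directory, duration_s=60.0, block_size=1 << 20,
                         max_file_bytes=1 << 30):
    """Write fsync'd blocks for duration_s seconds; return per-write latencies in ms."""
    block = os.urandom(block_size)  # random data, so compression cannot flatter the drive
    latencies_ms = []
    fd, path = tempfile.mkstemp(dir=directory)
    try:
        written = 0
        deadline = time.monotonic() + duration_s
        while time.monotonic() < deadline:
            if written + block_size > max_file_bytes:
                os.lseek(fd, 0, os.SEEK_SET)  # wrap around to bound disk usage
                written = 0
            start = time.perf_counter()
            os.write(fd, block)
            os.fsync(fd)  # force the data out of the page cache before timing stops
            latencies_ms.append((time.perf_counter() - start) * 1000.0)
            written += block_size
    finally:
        os.close(fd)
        os.remove(path)  # clean up the scratch file
    return latencies_ms
```

Run it for minutes rather than seconds, on the drive under suspicion, and keep the raw samples. The interesting part is how the latencies change late in the run, not the average.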

Rule: If the test ends before the drive heats up or the cache fills, it’s a demo, not validation.

  • p95/p99 latency during mixed reads and writes, not only sequential transfers.
  • Stall frequency (micro-freezes) during real tasks: installs, updates, streaming.
  • Error counters and controller resets (even rare ones matter).
  • Sustained throughput after the cache is exhausted and temperatures stabilize.
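The first two items can be computed from any list of per-operation latencies. This is a minimal sketch using the nearest-rank percentile. The 100 ms stall threshold and the function names are assumptions, so pick a threshold that matches what users would actually notice.

```python
import math

def percentile(sorted_values, pct):
    """Nearest-rank percentile of a sorted, non-empty list (pct in 1..100)."""
    rank = math.ceil(pct * len(sorted_values) / 100)
    return sorted_values[max(rank, 1) - 1]

def summarize_latency(latencies_ms, stall_threshold_ms=100.0):
    """Summarize latency samples with mean, tail percentiles, max, and stall count."""
    ordered = sorted(latencies_ms)
    return {
        "mean_ms": sum(ordered) / len(ordered),
        "p95_ms": percentile(ordered, 95),
        "p99_ms": percentile(ordered, 99),
        "max_ms": ordered[-1],
        "stalls": sum(1 for value in ordered if value >= stall_threshold_ms),
    }

# 98 fast operations and 2 stalls: the mean hides what the tail shows.
summary = summarize_latency([1.0] * 98 + [250.0] * 2)
```

In this made-up sample the mean is under 6 ms and p95 is 1 ms, yet p99 is 250 ms and two operations count as stalls. That gap is why the tails belong in every report.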

Quick checklist

When performance feels “weird”

Verify PCIe link width/speed is stable under load.

Run a sustained mixed workload, not only sequential.

Check NVMe temps and throttling behavior.

Before you blame the drive

Confirm firmware + driver versions and recent changes.

Temporarily disable power-saving features (for example PCIe ASPM or NVMe power-state transitions) and compare results.

Check cables/slots if SATA or add-in cards are involved.

“Throughput sells. Latency is what you actually feel.”