NVMe

From Pulsed Media Wiki


NVMe (Non-Volatile Memory Express) is a protocol for accessing solid-state drives over the PCIe bus. It connects the SSD directly to the CPU, bypassing the SATA controller that was designed for slower hard disk drives.

NVMe was standardized in 2011 because SATA's 600 MB/s ceiling and single command queue could not keep up with flash memory speeds. Modern NVMe drives deliver 3,000-14,000 MB/s depending on the PCIe generation.

How NVMe differs from SATA

SATA was designed in the early 2000s for spinning hard drives. Its host interface, AHCI, exposes a single command queue holding at most 32 outstanding commands, which suits a disk with one set of moving heads but not flash. NVMe was designed from scratch for flash storage:

{| class="wikitable"
! Feature !! NVMe !! SATA
|-
| Bus || PCIe (direct to CPU) || SATA controller (extra hop)
|-
| Command queues || Up to 65,535 || 1
|-
| Queue depth || 65,536 commands per queue || 32 (with NCQ)
|-
| Max bandwidth || PCIe 3.0 x4: 3,500 MB/s; PCIe 4.0 x4: 7,000 MB/s || 600 MB/s
|-
| Latency || ~10-20 microseconds || ~50-100 microseconds
|-
| CPU overhead || Lower (streamlined protocol) || Higher (legacy AHCI stack)
|}

The queue difference matters most under concurrent I/O. An NVMe drive can keep thousands of requests in flight across many queues, while a SATA drive is limited to 32 outstanding commands in a single queue. For a single sequential read, the speed difference is "only" 6-12x. For random I/O under load, NVMe can be 50-100x faster.
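The sequential figures can be checked with simple arithmetic. The sketch below is illustrative only, and its function names are made up for this example. It first computes the theoretical PCIe x4 link limit from the per-lane transfer rate (8 GT/s for PCIe 3.0, 16 GT/s for PCIe 4.0) and the 128b/130b line encoding, then compares the drive speeds from the table with SATA's 600 MB/s ceiling. Real drives land somewhat below the link limit because of protocol overhead.

```bash
#!/usr/bin/env bash
# Theoretical PCIe x4 bandwidth in MB/s for a given per-lane rate in GT/s.
# PCIe 3.0 and later use 128b/130b encoding: 128 payload bits per 130 bits sent.
pcie_x4_mbps() {
    LC_ALL=C awk -v gts="$1" 'BEGIN { printf "%.0f\n", gts * 1000 * 128 / 130 / 8 * 4 }'
}

# Speedup of a drive (sequential MB/s) relative to the 600 MB/s SATA ceiling.
speedup_vs_sata() {
    LC_ALL=C awk -v mbps="$1" 'BEGIN { printf "%.1f\n", mbps / 600 }'
}

echo "PCIe 3.0 x4 link limit: $(pcie_x4_mbps 8) MB/s"
echo "PCIe 4.0 x4 link limit: $(pcie_x4_mbps 16) MB/s"
echo "3,500 MB/s drive vs SATA: $(speedup_vs_sata 3500)x"
echo "7,000 MB/s drive vs SATA: $(speedup_vs_sata 7000)x"
```

The two ratios come out at 5.8x and 11.7x, which is where the rounded 6-12x range comes from.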

Form factors

NVMe drives come in several physical shapes:

{| class="wikitable"
! Form factor !! Size !! Common use
|-
| M.2 (2280) || 22 mm x 80 mm, mounts on motherboard || Laptops, desktops, small servers
|-
| U.2 (2.5") || Standard 2.5-inch drive bay || Enterprise servers (hot-swappable)
|-
| PCIe add-in card || Full or half-height PCIe slot || Workstations, servers without M.2/U.2
|-
| E1.S / E3.S (EDSFF) || Enterprise data center form factor || High-density server deployments
|}

The M.2 form factor is used by both SATA and NVMe drives, and a given slot may support one protocol or both. Check the motherboard specifications before buying: not all M.2 slots support NVMe.

NVMe endurance

NVMe SSDs have a finite number of write cycles per flash cell. This is measured in TBW (Terabytes Written) or DWPD (Drive Writes Per Day over the warranty period).

A typical 1 TB consumer NVMe drive is rated for 300-600 TBW. Enterprise drives are rated for 1-3 DWPD, meaning a 1 TB enterprise drive can sustain 1-3 TB of writes per day over a 5-year warranty period.
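The two ratings convert into each other: TBW = DWPD x capacity (TB) x 365 x warranty years. The following is a minimal sketch of that conversion. The function names are illustrative, and the 5-year warranty applied to the consumer drive is an assumption made for the comparison.

```bash
#!/usr/bin/env bash
# DWPD -> TBW: writes per day * capacity (TB) * 365 days * warranty years.
dwpd_to_tbw() {   # args: dwpd capacity_tb years
    LC_ALL=C awk -v d="$1" -v c="$2" -v y="$3" 'BEGIN { printf "%.0f\n", d * c * 365 * y }'
}

# TBW -> DWPD for the same capacity and warranty period.
tbw_to_dwpd() {   # args: tbw capacity_tb years
    LC_ALL=C awk -v t="$1" -v c="$2" -v y="$3" 'BEGIN { printf "%.2f\n", t / (c * 365 * y) }'
}

echo "1 TB enterprise drive at 1 DWPD for 5 years: $(dwpd_to_tbw 1 1 5) TBW"
echo "1 TB consumer drive rated 600 TBW over 5 years: $(tbw_to_dwpd 600 1 5) DWPD"
```

With these inputs, 1 DWPD on a 1 TB drive works out to 1,825 TBW, while a 600 TBW consumer rating corresponds to roughly 0.33 DWPD.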

In server environments with constant writes (databases, caching, swap), endurance is the limiting factor rather than performance. Monitoring tools like smartctl report the percentage of life used:

<syntaxhighlight lang="bash">
# Check NVMe drive health
smartctl -a /dev/nvme0n1

# Look for "Percentage Used": 100% means rated endurance reached
</syntaxhighlight>

Drives continue to function past 100% used endurance, but the risk of failure increases. Replacement should be planned before reaching that point.
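To plan replacement ahead of time, the check can be scripted. The sketch below is one possible approach, not an established tool: it pulls the "Percentage Used" field out of `smartctl -a` output and warns above a threshold. The helper names and the default threshold of 80 are arbitrary choices for illustration, and the sample line is hand-written rather than captured from a real drive.

```bash
#!/usr/bin/env bash
# Extract the numeric "Percentage Used" value from smartctl -a output on stdin.
percentage_used() {
    awk -F: '/^Percentage Used/ { gsub(/[^0-9]/, "", $2); print $2; exit }'
}

# Report endurance status; the default threshold of 80 is an arbitrary example.
check_endurance() {   # args: percent [threshold]
    local used="$1" threshold="${2:-80}"
    if [ -z "$used" ]; then
        echo "UNKNOWN: no Percentage Used field found"
        return 1
    fi
    if [ "$used" -ge "$threshold" ]; then
        echo "WARNING: ${used}% of rated endurance used, plan replacement"
    else
        echo "OK: ${used}% of rated endurance used"
    fi
}

# Real usage (needs root and smartmontools):
#   check_endurance "$(smartctl -a /dev/nvme0n1 | percentage_used)"

# Demonstration with a hand-written sample line, not real drive output:
sample='Percentage Used:                    85%'
check_endurance "$(printf '%s\n' "$sample" | percentage_used)"
```

Run from cron or a monitoring agent, a check like this flags drives well before they reach 100%.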

NVMe in seedbox hosting

In Pulsed Media's infrastructure, NVMe drives serve as a performance cache tier rather than primary bulk storage. The bulk data (torrents, user files) lives on HDD arrays in RAID. NVMe drives provide fast swap space.

This setup gives the servers a three-tier memory and storage hierarchy:

  1. DRAM (RAM): ~80 nanosecond access, limited capacity
  2. NVMe swap: ~100 microsecond access, fast enough to be transparent for most workloads
  3. HDD: ~10 millisecond access, bulk storage
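The gaps between the tiers are easier to appreciate as ratios. A quick calculation using the approximate access times listed above (the variable names are illustrative):

```bash
#!/usr/bin/env bash
# Approximate access times from the list above, expressed in nanoseconds.
dram_ns=80
nvme_ns=$((100 * 1000))          # ~100 microseconds
hdd_ns=$((10 * 1000 * 1000))     # ~10 milliseconds

echo "NVMe swap is about $((nvme_ns / dram_ns))x slower than DRAM"
echo "HDD is about $((hdd_ns / nvme_ns))x slower than NVMe swap"
echo "HDD is about $((hdd_ns / dram_ns))x slower than DRAM"
```

NVMe sits roughly 1,250x behind DRAM but 100x ahead of an HDD, which is why it works as a middle tier.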

When the operating system runs low on RAM, it moves inactive memory pages to swap. With NVMe swap, this happens fast enough that the performance difference is barely noticeable for most operations. On an HDD, swapping causes visible freezes and slowdowns.
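How much memory has actually been paged out can be read from `/proc/swaps`, which lists each swap device with its size and usage in KiB. The helper below is a minimal sketch. The function name is made up for this example, and the device shown in the comment is hypothetical rather than taken from Pulsed Media's configuration.

```bash
#!/usr/bin/env bash
# Summarise swap devices from a /proc/swaps-style file (default: /proc/swaps).
# Columns are: Filename Type Size Used Priority, with sizes in KiB.
swap_summary() {
    LC_ALL=C awk 'NR > 1 && $3 > 0 {
        printf "%s: %d MiB used of %d MiB (%.0f%%)\n", $1, $4 / 1024, $3 / 1024, 100 * $4 / $3
    }' "${1:-/proc/swaps}"
}

# Prints one line per swap device, for example a hypothetical /dev/nvme0n1p2.
if [ -r /proc/swaps ]; then
    swap_summary
fi
```

`swapon --show` prints similar information in a ready-made table.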

This approach gives seedbox users the benefit of large HDD storage (tens of terabytes per server) with the responsiveness of SSD-backed caching, without the cost of an all-SSD array.

See also

On the blog: