AMD EPYC 7001 (Naples)

From Pulsed Media Wiki


The AMD EPYC 7001 series (codenamed "Naples") is AMD's first-generation server processor on the Zen architecture. Launched in June 2017, Naples broke Intel's decade-long server monopoly by delivering up to 32 cores, 128 PCIe 3.0 lanes, and 8 DDR4 memory channels at a fraction of the Xeon Platinum price. The 7551P single-socket variant at $2,100 directly competed with Intel's Xeon Platinum 8180 at $10,009 — comparable multi-threaded performance at one-fifth the cost.

Naples uses a Multi-Chip Module (MCM) design: 4 Zeppelin dies on a single substrate, each containing 8 Zen cores, 2 DDR4 channels, and 32 PCIe 3.0 lanes. The MCM approach enabled AMD to reach 32 cores on GlobalFoundries' 14nm process at a time when Intel was still struggling with 10nm.

Architecture

Parameter Detail
Codename Naples
Family Family 17h Models 00h–0Fh
Microarchitecture Zen 1
Process 14nm (GlobalFoundries)
Socket SP3 (LGA 4094)
Die design MCM — 4× Zeppelin dies per package
Max cores/threads 32 / 64
PCIe 128× PCIe 3.0 lanes (32 per die)
Memory 8-channel DDR4 (2 channels per die). Max 2 TB with LRDIMM.
Memory speed DDR4-2666 (1 DIMM per channel), DDR4-2400 (2 DIMMs per channel)
Interconnect Infinity Fabric (inter-die communication)
TDP range 120W–180W
Launch date June 2017
AGESA NaplesPI (latest known: 1.0.0.B)
Errata document AMD Pub #55449 Rev 1.21, August 2023

MCM topology

Each Zeppelin die is a complete compute unit: 8 cores (2 CCX of 4 cores each), 2 DDR4 channels, 32 PCIe 3.0 lanes, and Infinity Fabric links. The 4 dies communicate via Infinity Fabric in a fully connected topology (each die links directly to the other three), adding latency for cross-die memory access (NUMA). For workloads sensitive to memory locality, pinning processes to a single die avoids Infinity Fabric hops.
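
The pinning described above can be done with numactl. A minimal sketch, assuming a Linux system with the numactl package installed; `true` stands in for the real workload:

```shell
# Each Zeppelin die appears as its own NUMA node, so a fully populated
# Naples socket exposes nodes 0-3 (this sysfs file exists on any Linux box)
cat /sys/devices/system/node/online

# Pin a latency-sensitive workload to die 0's cores and memory so it never
# takes an Infinity Fabric hop ("true" is a placeholder for the real process)
if command -v numactl >/dev/null 2>&1; then
    numactl --cpunodebind=0 --membind=0 true
fi
```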

The MCM design means PCIe lanes are tied to specific dies. Slots on a motherboard connect to specific dies, and unpopulated memory channels on a die can affect that die's PCIe functionality (see PCIe errata below).
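
Which die owns a given device can be read from sysfs. A sketch using the standard Linux `numa_node` attribute (values 0–3 on a populated Naples board, -1 if the kernel could not determine locality):

```shell
# Print the owning NUMA node (i.e. Zeppelin die) for every PCIe device
for dev in /sys/bus/pci/devices/*; do
    [ -e "$dev/numa_node" ] || continue
    printf '%s node=%s\n' "${dev##*/}" "$(cat "$dev/numa_node")"
done
```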

SKU table

Dual-socket (standard)

Model Cores Threads Base (GHz) Boost (GHz) L3 cache TDP PCIe lanes DDR4 channels Launch MSRP (USD)
EPYC 7601 32 64 2.2 3.2 64 MB 180W 128 8 $4,200
EPYC 7571 32 64 2.1 2.7 64 MB 200W 128 8 ~$3,400
EPYC 7551 32 64 2.0 3.0 64 MB 180W 128 8 $3,400
EPYC 7501 32 64 2.0 3.0 64 MB 155W/170W 128 8 $2,100
EPYC 7451 24 48 2.3 3.2 64 MB 180W 128 8 $2,400
EPYC 7401 24 48 2.0 3.0 64 MB 155W/170W 128 8 $1,850
EPYC 7371 16 32 3.1 3.6 64 MB 180W 128 8 ~$2,300
EPYC 7351 16 32 2.4 2.9 64 MB 155W/170W 128 8 $750
EPYC 7301 16 32 2.2 2.7 64 MB 155W/170W 128 8 $575
EPYC 7281 16 32 2.1 2.7 32 MB 155W/170W 128 8 $650
EPYC 7261 8 16 2.5 2.9 64 MB 155W/170W 128 8 $475
EPYC 7251 8 16 2.1 2.9 32 MB 120W 128 8 $475

Single-socket (P-series)

P-series SKUs are licensed for single-socket systems only and sold at lower prices; they are otherwise feature-identical to their dual-socket counterparts.

Model Cores Threads Base (GHz) Boost (GHz) L3 cache TDP Launch MSRP (USD)
EPYC 7551P 32 64 2.0 3.0 64 MB 180W $2,100
EPYC 7401P 24 48 2.0 3.0 64 MB 155W/170W $1,075
EPYC 7351P 16 32 2.4 2.9 64 MB 155W/170W $750

The EPYC 7551P became the standout value SKU: 32 cores, 128 PCIe lanes, and 8 memory channels at $2,100 — the cheapest 32-core x86 server processor available at launch, and for several years afterward.

The EPYC 7571 and 7371 were later additions to the lineup. The 7571 trades clock speed for a 200W TDP envelope, while the 7371 targeted frequency-sensitive workloads at 3.1 GHz base — the highest base clock in the Naples lineup. The EPYC 7261 offered an unusual combination: only 8 cores but the full 64 MB L3 cache, making it attractive for cache-sensitive workloads at an entry-level price.

PCIe errata

AMD Pub #55449 Rev 1.21 (August 2023) documents 8 errata affecting PCIe functionality on all EPYC 7001 processors. These are silicon-level bugs present in all Naples CPUs, regardless of motherboard. None have silicon fixes — all are "no fix planned" except #1044 (fix status unclear).

These errata are the documented root cause of PCIe slot failures reported on Naples motherboards including the Gigabyte MZ31-AR0-00, ASRock Rack EPYCD8-2T, and Supermicro H11SSL-i.

Errata summary

Erratum Title Severity Workaround Fix planned
#1126 Gen3 link training hang Critical — dead slots BIOS/AGESA (may degrade link width) No
#1125 Reading root port registers hangs CPU Critical — diagnostic trap Avoid lspci -vvv No
#1146 DPC register logic inverted High — false port containment Set PCIE_RP_PIO_SYSERROR to 7_0707h No
#1083 Gen3 spurious correctable error Moderate — cascades via #1146 None needed (self-recovering) No
#1063 MSI with wrong Requestor ID Moderate — breaks IOMMU iommu=off or intremap=off No
#1044 L1.1/L1.2 entry hang Moderate Disable L1 substates; pcie_aspm=off Yes (unclear if shipped)
#1177 L1.2 entry hang (no silicon workaround) Moderate pcie_aspm=off No
#1080 Gen1 EDB token causes NAK Low None needed if AER disabled No

Errata details

#1126 — PCIe link may hang when attempting to switch to Gen3 mode

AMD description: "PCIe link may hang when attempting to switch to Gen3 mode."

Workaround: "The workaround will prevent the PCIe link from hanging. This workaround may cause the PCIe link to train to a degraded link width." Workaround is BIOS/AGESA-level.

Practical impact: When a PCIe slot tries to negotiate Gen3 (8 GT/s) speed, the link hangs permanently — the slot appears completely dead to the operating system. Forcing Gen2 via setpci on the root port is the diagnostic: if the slot comes alive at Gen2, erratum #1126 is confirmed. The BIOS/AGESA workaround is the long-term fix, but may reduce effective bandwidth (an x16 slot training as x8 or x4).

Erratum #1126 is the primary suspect for "dead slot" reports on Naples motherboards.

#1125 — Reading PCIe Error Source ID or MSI Message Control register may hang processor

AMD description: "Reading the PCIe Error Source Identification register in any PCIe root port, or reading or writing the MSI Message Control register in any PCIe root port, may cause the processor to hang."

Workaround: Avoid reading these registers.

Practical impact: lspci -vvv can hang the system. The -vvv flag causes lspci to read all capability registers, including the dangerous ones. This is a trap for anyone diagnosing PCIe problems — the standard diagnostic tool triggers a system hang. Use lspci -tv (topology view) or lspci -nn (numeric IDs) instead. Also affects monitoring software that performs deep PCI config space scans (some versions of hwinfo, for example).

#1146 — PCIe DPC RP PIO error reporting register logic inverted

AMD description: "The DPC Extended Capability RP PIO SysError (PCIERCCFG::PCIE_RP_PIO_SYSERROR) register incorrectly inverts the sense of its enable bits, so that a bit value of 0b enables the function and a bit value of 1b disables the function. [...] The reset value of PCIERCCFG::PCIE_RP_PIO_SYSERROR is 0000_0000h which, due to the inversion of the bits, enables RP PIO errors to be reported as a System Error."

Workaround: Program PCIERCCFG::PCIE_RP_PIO_SYSERROR[18:0] to 7_0707h to disable the DPC RP PIO System Error feature.

Practical impact: The default register value (all zeros) enables error reporting because of the inverted logic. On a fresh boot without the BIOS workaround, every RP PIO error triggers a System Error, which can kill the port via DPC (Downstream Port Containment). When erratum #1083 generates a correctable error, that error cascades through this misconfigured DPC register and shuts down the port. The slot appears dead, but is actually DPC-contained — a software state, not a hardware failure.
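
A quick way to tell DPC containment from a genuinely dead slot is to grep the kernel log. A sketch; exact message wording varies by kernel version:

```shell
# DPC-contained ports log containment events via the pcieport driver; a match
# here suggests the slot is software-contained (errata #1146/#1083), not dead
dmesg 2>/dev/null | grep -iE 'dpc|containment' || echo 'no DPC events logged'
```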

#1083 — PCIe Root Port Gen3 receiver may log spurious correctable error

AMD description: "In Gen 3 mode, the PCIe Root Port receiver may miss the TLP after a SKP if no IDL is sent before the SKP, causing the port to log a correctable error before the TLP is recovered. This scenario can only happen if the SKP Ordered Set contains 0xC0, causing spurious EDB error."

Workaround: None needed — the link self-recovers.

Practical impact: Generates "Bad TLP" or "correctable error" entries in dmesg. The link recovers automatically, but error counters increment. In isolation, this erratum is harmless. Combined with erratum #1146 (DPC register inversion), the correctable error can cascade into full port shutdown — making this pair the second most common cause of "dead slot" symptoms after erratum #1126.
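
Recent kernels (4.17+) expose per-device AER counters in sysfs, which makes the benign #1083 noise easy to quantify without parsing dmesg. A sketch:

```shell
# Per-device correctable-error counters; a rising BadTLP count with no
# functional symptoms matches erratum #1083 in isolation
for f in /sys/bus/pci/devices/*/aer_dev_correctable; do
    [ -r "$f" ] || continue
    echo "== ${f%/aer_dev_correctable} =="
    grep -E 'BadTLP|RxErr' "$f" || true
done
```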

#1063 — PCIe controller will generate MSI with incorrect Requestor ID

AMD description: "The PCIe controller will generate MSIs with an incorrect Requestor ID of 0x0 on internal interrupt events including: Hot-plug, PME (Power Management Event), AER (Advanced Error Reporting), DPC (Downstream Port Containment), Link Equalization, Link Bandwidth Notification."

Workaround: System software may contain the workaround. Alternatively: iommu=off or intremap=off kernel parameter.

Practical impact: All PCIe internal interrupt events carry the wrong Requestor ID. When IOMMU interrupt remapping is enabled, these interrupts are silently dropped because they fail the IOMMU's source validation check. Hot-plug events, error notifications, and bandwidth change notifications never reach the OS. Devices that require hot-plug enumeration may fail to appear. This erratum is a strong candidate for "dead slot" symptoms specifically when IOMMU is enabled.
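
Before blaming erratum #1063, confirm whether interrupt remapping is actually active on the system. A sketch using the kernel command line and log:

```shell
# Which IOMMU-related parameters the system booted with (empty output = none)
grep -o 'iommu=[^ ]*\|intremap=[^ ]*\|amd_iommu=[^ ]*' /proc/cmdline || true

# Whether the AMD IOMMU driver initialized (and with interrupt remapping)
dmesg 2>/dev/null | grep -i 'AMD-Vi' || true
```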

#1044 — PCIe controller may hang on entry into L1.1 or L1.2 power management substate

AMD description: "Under a highly specific and detailed set of internal timing conditions, the PCIe controller may hang on entry into either L1.1 or L1.2 power management substate. This failure occurs when L1 power management substate exit is triggered by a link partner asserting CLKREQ# prior to the completion of the L1 power management substates entry protocol."

Workaround: Disable L1.1 and L1.2 power management substates. Kernel parameter pcie_aspm=off.

Fix planned: Yes (unclear whether fix was shipped in later AGESA versions).

Practical impact: Affects NVMe SSDs and network cards with aggressive power management that use L1 substates. Kernel parameter pcie_aspm=off prevents L1 substate entry entirely, mitigating both this erratum and #1177.

#1177 — Processor entering L1.2 power management substate may hang

AMD description: "If the processor encounters an L1 power management substate exit triggered by a PCIe link partner while it is entering L1.2 power management substate, it may incorrectly assert CLKREQ# before waiting for the required period of 6 microseconds."

Workaround: None at silicon level. Kernel parameter pcie_aspm=off prevents L1.2 entry.

Practical impact: Related to erratum #1044 but distinct — no silicon-level workaround exists. The CLKREQ# timing violation can hang or reset the system if the PCIe link partner does not tolerate it. The same pcie_aspm=off parameter mitigates both errata by preventing L1 substate entry entirely.
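
On kernels 5.5 and later, L1 substates can also be disabled per device instead of globally, via sysfs link attributes. A sketch; the BDF is illustrative, substitute the affected endpoint, and writes require root:

```shell
# Disable only the L1.1/L1.2 substates on one device (mitigating #1044/#1177
# for that link) while leaving ASPM active elsewhere
dev=/sys/bus/pci/devices/0000:01:00.0
for attr in l1_1_aspm l1_2_aspm; do
    if [ -w "$dev/link/$attr" ]; then
        echo 0 > "$dev/link/$attr"
    fi
done
```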

#1080 — EDB token forwarded upstream when Gen1 link exits electrical idle

AMD description: "When the PCIe link is operating in Gen1 mode and enters electrical idle, the EDB token is mistakenly forwarded upstream. This unexpected EDB token may incorrectly trigger NAKs when the link exits to L0."

Workaround: If AER is not enabled, no workaround is required.

Practical impact: Low severity. Only affects PCIe Gen1 (2.5 GT/s) links. Most devices negotiate Gen2 or Gen3 and are unaffected. NAKs cause retransmission, resulting in slight performance impact but no data loss or functional failures.

Errata interaction map

Several Naples PCIe errata interact to create cascade failures that are difficult to diagnose individually:

  • #1083 → #1146 → dead slot: Erratum #1083 generates a correctable error. Erratum #1146's inverted DPC register escalates it to a System Error. DPC contains the port. Slot appears dead but is software-contained.
  • #1126 → dead slot: Gen3 link training hangs. Slot is genuinely non-functional at Gen3. Works at Gen2.
  • #1063 + IOMMU → dead slot: Wrong MSI Requestor ID causes IOMMU to drop enumeration interrupts. Slot fails to enumerate.
  • #1044 + #1177 → system hang: Both L1 substate errata can hang the processor during power state transitions. Mitigated by pcie_aspm=off.
  • #1125 → diagnostic trap: Attempting to diagnose any of the above with lspci -vvv hangs the system.

The same "dead slot" symptom can result from three different root causes, each requiring different diagnostics and different fixes. Multiple Naples motherboards from different manufacturers and different batches exhibit the same patterns because the errata are in the silicon, not the board.

Recommended kernel parameters for Naples systems

pcie_aspm=off        # Prevents L1 substate hangs (errata #1044, #1177)
# Optional, depending on symptoms:
iommu=off            # If dead slots appear only with IOMMU enabled (erratum #1063)
intremap=off         # Alternative to iommu=off — disables only interrupt remapping
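
To make these parameters persistent, append them to the bootloader configuration. A sketch for a GRUB-based distro, demonstrated on a temporary copy so it is safe to run; apply the same edit to the real /etc/default/grub as root, then run update-grub (or grub2-mkconfig):

```shell
# Scratch copy standing in for /etc/default/grub
cfg=$(mktemp)
echo 'GRUB_CMDLINE_LINUX="quiet"' > "$cfg"

# Append pcie_aspm=off to the existing GRUB_CMDLINE_LINUX value
sed -i 's/^GRUB_CMDLINE_LINUX="\(.*\)"/GRUB_CMDLINE_LINUX="\1 pcie_aspm=off"/' "$cfg"
cat "$cfg"    # GRUB_CMDLINE_LINUX="quiet pcie_aspm=off"
```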

Forcing Gen2 to diagnose erratum #1126

If a PCIe slot appears dead, forcing Gen2 (5 GT/s) on that slot's root port can confirm erratum #1126 as the cause. If the slot comes alive at Gen2, the Gen3 link training hang is confirmed.

First, identify the root port for the dead slot using lspci -tv (never -vvv). Then force Gen2 using setpci:

# Link Control 2 sits at offset 0x30 from the PCIe capability; its low nibble
# is the Target Link Speed. For the root port at bus:dev.fn (e.g. 00:03.1):
setpci -s 00:03.1 CAP_EXP+30.W          # Read current Link Control 2
setpci -s 00:03.1 CAP_EXP+30.W=0002     # Force Gen2 (5 GT/s) target speed
# Then set the Retrain Link bit (bit 5 of Link Control at offset 0x10) with a
# read-modify-write, so the port's other Link Control bits are preserved:
lnkctl=$(setpci -s 00:03.1 CAP_EXP+10.W)
setpci -s 00:03.1 CAP_EXP+10.W=$(printf '%04x' $((0x$lnkctl | 0x20)))

Replace 00:03.1 with the actual root port address from lspci -tv. If the device enumerates after forcing Gen2, the permanent fix is a BIOS/AGESA update that includes the #1126 workaround. Running at Gen2 halves the bandwidth (5 GT/s vs 8 GT/s) but is functional.
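
After the retrain, the negotiated speed can be read back from the Link Status register (offset 0x12 from the PCIe capability). A sketch; the low-nibble decode follows the PCIe specification, and the BDF is illustrative:

```shell
# Current Link Speed is the low nibble of Link Status:
# 1 = 2.5 GT/s (Gen1), 2 = 5 GT/s (Gen2), 3 = 8 GT/s (Gen3)
if command -v setpci >/dev/null 2>&1; then
    sta=$(setpci -s 00:03.1 CAP_EXP+12.W 2>/dev/null)
    if [ -n "$sta" ]; then
        echo "link speed code: $((0x$sta & 0xf))"
    fi
fi
```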

Generational comparison

Feature Naples (7001) Rome (7002) Milan (7003) Genoa (9004)
Launch 2017 2019 2021 2022
Architecture Zen 1 Zen 2 Zen 3 Zen 4
Process 14nm 7nm 7nm 5nm
Max cores 32 64 64 96
PCIe generation 3.0 3.0 4.0 5.0
PCIe lanes 128 128 128 128
DDR generation DDR4 DDR4 DDR4 DDR5
Memory channels 8 8 8 12
Socket SP3 SP3 SP3 SP5
Die design 4× Zeppelin MCM 1 IOD + up to 8 CCDs 1 IOD + up to 8 CCDs 1 IOD + up to 12 CCDs

Naples through Milan share the SP3 socket (LGA 4094), making motherboard upgrades possible within that range. Genoa moved to SP5, requiring new motherboards.

The shift from Naples to Rome was the most significant architectural change: Rome separated I/O functions into a dedicated 14nm I/O die (IOD) while moving compute to 7nm chiplets. This eliminated Naples' die-level PCIe dependency (where each die's PCIe lanes required local memory population) and doubled core count to 64.

Market disruption

Naples ended Intel's decade-long server monopoly. AMD's server market share trajectory tells the story:

Period AMD server market share Context
2016 0.4% Pre-EPYC (Opteron remnants)
Q2 2017 ~1% Naples launch
Q4 2018 ~3.2% Early adoption
Q4 2019 ~5.1% Rome (7002) launched
Q4 2020 ~8.9% Milan (7003) era
Q4 2022 ~17.6% Genoa (9004) era
Q1 2025 ~27.2% Current

Source: Mercury Research quarterly reports.

Naples offered more cores per dollar, more PCIe lanes per socket (128 vs Intel's 48), more memory channels (8 vs 6), and no artificial feature segmentation. Intel locked ECC memory, high PCIe lane counts, and multi-socket support behind different Xeon tiers. Every Naples SKU shipped with all features enabled.

Secondary market (2025–2026)

Naples processors are widely available on the secondary market at a fraction of their launch prices. Per-SKU pricing from eBay completed listings (2026):

SKU Cores eBay price range (USD) Typical price Launch MSRP Depreciation
EPYC 7601 32C $40–80 ~$55 $4,200 98.7%
EPYC 7571 32C $35–70 ~$50 ~$3,400 98.5%
EPYC 7551 32C $30–60 ~$40 $3,400 98.8%
EPYC 7551P 32C $35–65 ~$45 $2,100 97.9%
EPYC 7501 32C $25–50 ~$35 $2,100 98.3%
EPYC 7451 24C $25–55 ~$35 $2,400 98.5%
EPYC 7401P 24C $20–40 ~$30 $1,075 97.2%
EPYC 7401 24C $20–40 ~$28 $1,850 98.5%
EPYC 7371 16C $20–35 ~$25 ~$2,300 98.9%
EPYC 7351P 16C $15–30 ~$20 $750 97.3%
EPYC 7351 16C $15–30 ~$20 $750 97.3%
EPYC 7301 16C $12–25 ~$18 $575 96.9%
EPYC 7281 16C $15–25 ~$18 $650 97.2%
EPYC 7261 8C $12–20 ~$15 $475 96.8%
EPYC 7251 8C $10–20 ~$15 $475 96.8%

Pulsed Media sources EPYC 7451 (24C) from Damicon at approximately €110 per unit.

At these prices, a complete 32-core / 128-lane / 8-channel server can be built for $200–400 (CPU + motherboard). This makes Naples the cheapest path to high core count and PCIe lane density for homelabs, small business servers, NAS builds, and development environments.

Caution: Used Naples motherboards from the Chinese secondary market (particularly Rev 1.x boards) may have unknown thermal history, missing BMC serial stickers, and exposure to the PCIe errata documented above. Budget for BIOS recovery hardware (CH341A programmer, ~$5) when purchasing used boards.
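
For the BIOS recovery path, the usual tool is flashrom with its CH341A driver. A sketch; the filenames are placeholders, and the programmer's clip must be attached to the board's SPI flash chip with the board powered off:

```shell
# Guarded so the commands only run where flashrom (and the programmer) exist
if command -v flashrom >/dev/null 2>&1; then
    flashrom -p ch341a_spi -r backup.bin       # always read a backup first
    flashrom -p ch341a_spi -w known_good.bin   # then write a known-good image
fi
```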

Compatible motherboards

Motherboard Form factor DIMM slots SATA 10GbE Notes
Gigabyte MZ31-AR0-00 E-ATX 16 16 (SlimSAS) 2× SFP+ Most I/O density. Bifurcation broken.
ASRock Rack EPYCD8-2T ATX 8 8 2× RJ45 Well-documented, good community support.
Supermicro H11SSL-i ATX 8 8 None Supermicro ecosystem.
Tyan Tomcat SX S8026 ATX 8 10 2× SFP+ Rare.

Pulsed Media deployment

Pulsed Media uses EPYC processors on the M10G SSD and Dragon-R product lines. The 128 PCIe lanes and 8 memory channels give each VM enough I/O bandwidth for dedicated seedbox hosting without contention.

External references

  • AMD Pub #55449 Rev 1.21 (August 2023) — "Revision Guide for AMD Family 17h Models 00h-0Fh Processors" (official errata document)