AMD Quietly Deactivates Unused Feature in Zen 4 CPUs
While Zen 5 PCs are receiving Windows-focused updates from Microsoft, a quieter change is affecting older Zen 4 processors. A recent release seemingly disables a seldom-used feature, and the change likely won’t affect performance.
Loop Buffer Deactivated
The change was uncovered by the website Chips and Cheese through benchmarking and hardware diagnostics. The work involved carefully monitoring performance counters and provides a detailed look at what changed.
The Loop Buffer is essentially a miniature version of the uOp cache in the CPU frontend, and it holds instructions. When a program is caught in a loop of repetitive code, the processor can draw instructions directly from the Loop Buffer. That lets other parts of the frontend, such as the fetch and decode stages, sit idle, potentially saving some power. In practice, however, the feature appears to be rarely used.
The Loop Buffer can hold about 144 entries.
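To make the mechanism concrete, here is a minimal Python sketch of the idea. It is a toy model, not AMD's actual design: the function name simulate_frontend, its parameters, and the assumption that a loop is captured after its first pass are all hypothetical and chosen only for illustration. The 144-entry capacity is the approximate figure cited above.

```python
# Toy model of a loop buffer. This is an illustrative sketch only;
# real hardware capture conditions are more complex than shown here.

LOOP_BUFFER_ENTRIES = 144  # approximate capacity cited in the article


def simulate_frontend(loop_body_ops, iterations,
                      loop_buffer_enabled=True,
                      capacity=LOOP_BUFFER_ENTRIES):
    """Count where the operations of a looping program come from.

    Returns a dict with the number of operations delivered by the loop
    buffer and by the rest of the frontend (fetch, decode, uOp cache).
    """
    if loop_body_ops <= 0 or iterations <= 0:
        raise ValueError("loop_body_ops and iterations must be positive")

    counts = {"loop_buffer": 0, "frontend": 0}
    # The loop can only be captured if the feature is on and the whole
    # loop body fits in the buffer.
    fits = loop_buffer_enabled and loop_body_ops <= capacity

    for i in range(iterations):
        if fits and i > 0:
            # Assumption: the loop is captured after its first pass, so
            # later iterations replay from the buffer and the rest of
            # the frontend can idle.
            counts["loop_buffer"] += loop_body_ops
        else:
            counts["frontend"] += loop_body_ops
    return counts


if __name__ == "__main__":
    print(simulate_frontend(100, 1000))   # small loop, fits in the buffer
    print(simulate_frontend(200, 1000))   # loop too large for the buffer
    print(simulate_frontend(100, 1000, loop_buffer_enabled=False))
```

In this model, a loop larger than the buffer, or any loop once the feature is switched off, is served entirely by the rest of the frontend. The work still gets done either way, which is consistent with the feature being a power optimization rather than a performance one.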
Uncovering a Potential Glitch
Although AMD has not officially announced the change, disabling the feature suggests the company may have discovered an "erratum" that made it unreliable.
This may also be a preemptive measure against vulnerabilities.
The impact on performance is likely minimal, since the majority of scenarios didn’t utilize the feature. Chips and Cheese tested with the Loop Buffer disabled, using both SPEC CPU2017 and Cyberpunk 2077 benchmarks, and found that performance was mainly unaffected. However, there was a slight dip in Cyberpunk 2077, which occurred only on Zen 4 cores without 3D V-Cache.
A Story of Trade-offs
It’s likely that the Loop Buffer’s power savings weren’t very noticeable in real-world applications.
AMD now seems to have disabled it, focusing resources elsewhere.
One might wonder whether the feature could resurface in later architectures like Zen 6 or Zen 7 after refactoring and refinement. It wouldn’t be the first time a CPU feature has gone missing and returned later with improvements.
Not Entirely Unique
Intel has had a similar feature, named the Loop Stream Detector.
We rarely hear about these kinds of internal technical changes, and this one is a reminder that CPU design is a complex, ongoing process.