[llvm] [X86] Reduce znver3/4 LoopMicroOpBufferSize to practical loop unrolling values (PR #91340)

via llvm-commits llvm-commits at lists.llvm.org
Wed May 15 05:40:11 PDT 2024


ganeshgit wrote:

> Would people prefer we just drop the LoopMicroOpBufferSize entry from the znver3/4 models (same as znver1/2)? This prevents most loop unrolling and we then rely on the cpu's op cache higher decode rate to get higher performance (but we end up testing every loop).

I have at least one counterexample that gains from the LoopMicroOpBufferSize setting we currently have for znver3/4. Let us go with your approach of deriving the LoopMicroOpBufferSize value from the misprediction penalty.

https://github.com/llvm/llvm-project/pull/91340

