[llvm] [AArch64] Set MaxInterleaving to 4 for Neoverse V2 (PR #100385)

David Green via llvm-commits llvm-commits at lists.llvm.org
Fri Jul 26 03:11:43 PDT 2024


davemgreen wrote:

Running flang-new on SPEC should work. It might be worth making sure the other benchmarks in it are OK too.

I think this makes sense when you look at it from the point of view of the CPU. There are 4 vector pipelines, so it makes sense to start interleaving 4x, and other cores do the same. The other dimension is the code that you are running and the dynamic trip count of the important loops in that code. Certain domains tend to have higher trip counts, like HPC, ML and certain DSP routines. Benchmarks often have a super high trip count to make the benchmark meaningful. DSP/image processing often works on 16 x 16 tiles, and for certain other domains the trip counts are often very low, possibly closer to 1 than anything else.
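
As a rough illustration of the trip-count point, here is a minimal sketch of how a loop's iterations split between the interleaved vector body and the scalar remainder. The `LoopSplit` struct and `splitLoop` helper are hypothetical names made up for this example, not LLVM code, and the model is a simplification that ignores epilogue vectorization, tail folding and runtime checks.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper type, not part of LLVM: describes where the
// iterations of a loop end up for a given VF and interleave count.
struct LoopSplit {
  uint64_t VectorIterations; // times the interleaved vector body executes
  uint64_t VectorElements;   // elements processed by the vector body
  uint64_t ScalarRemainder;  // elements left for the scalar remainder loop
};

// Simplified model (an assumption for illustration): each pass through the
// vector body consumes VF * IC elements, and whatever does not fill a whole
// pass is handled by a scalar remainder loop.
LoopSplit splitLoop(uint64_t TripCount, unsigned VF, unsigned IC) {
  assert(VF > 0 && IC > 0 && "VF and IC must be non-zero");
  const uint64_t Step = static_cast<uint64_t>(VF) * IC;
  const uint64_t Iterations = TripCount / Step;
  const uint64_t Vectorized = Iterations * Step;
  return {Iterations, Vectorized, TripCount - Vectorized};
}
```

Assuming VF = 4 (for example four 32-bit lanes in a 128-bit vector), a 16-element row fills exactly one pass of a 4x-interleaved body, while a trip count of 12 never reaches that body at all and would still get one pass with 2x interleaving. A trip count of 1000 spends nearly all of its time in the vector body either way.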

So I'm not sure I trust the more benchmark-y results as much as the others. I think it makes sense to have a higher maximum interleave factor, but it might not always be best to use it. Having it higher for reductions might be more of a win than in other cases, for example.

I think this is a good idea so long as the performance is OK. Can you add Neoverse-V3 to the same switch? I think it should work similarly.
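
To show the shape of the request, a self-contained sketch of two cores sharing one switch case might look like the following. The `Core` enum, the `maxInterleaveFactorFor` function and the fallback value are all hypothetical stand-ins for illustration; this is not the actual subtarget code touched by the PR.

```cpp
#include <cassert>

// Hypothetical stand-in for the subtarget's processor-family enum.
enum class Core { Generic, NeoverseV2, NeoverseV3 };

// Hypothetical stand-in for the per-core tuning switch.
unsigned maxInterleaveFactorFor(Core C) {
  switch (C) {
  case Core::NeoverseV2:
  case Core::NeoverseV3: // the addition requested above: share the V2 setting
    return 4;
  default:
    return 2; // assumption: placeholder fallback for cores without an override
  }
}
```

Sharing the case label keeps the two cores in step if the value is tuned again later.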

https://github.com/llvm/llvm-project/pull/100385

