[llvm] [AArch64] Set MaxInterleaving to 4 for Neoverse V2 (PR #100385)

David Sherwood via llvm-commits llvm-commits at lists.llvm.org
Tue Aug 20 03:06:27 PDT 2024


david-arm wrote:

> > > One issue with interleaving is that epilogue vectorization only considers VF, but not VF x UF. There are a number of cases where epilogue vectorization would be beneficial on AArch64 when the VF < 16 but UF > 1. @juliannagele is currently looking into adjusting the cost model for that.
> > 
> > 
> > I hadn't realised that we didn't account for the UF already. That sounds like a good thing to fix, thanks for the info. @juliannagele @fhahn do you have a timeframe for when such a patch will be ready? I would like to avoid the regressions if we can, and would not want to end up relying on a patch that never materializes. Otherwise perhaps @sjoerdmeijer could increase the limit for cases where the UF is 4, and we can improve it further in the future where needed.
> 
> @davemgreen: in the meantime, I will prepare a patch to create a TTI hook that controls the `epilogue-vectorization-minimum-VF`. I would like to lower it to 8 for the V2 but I don't think it makes sense to do this for all targets.

Do you need to take interleaving into account when lowering the minimum? The reason I ask is that for large loops we may not interleave at all, so you'll have an epilogue even when the main vector loop has IC=1, VF=8. Maybe that's fine, but I can imagine some downsides in terms of increased code size and worse performance for some low trip counts.
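
To make the question concrete, here is a rough sketch of the two ways the threshold could be applied. It is a toy model and not LLVM code: the function name `considers_epilogue`, its parameters, and the simple `>=` comparison are assumptions made for illustration, and the real vectorizer applies further profitability checks. The values 16 and 8 are the current and proposed minimums mentioned in the quoted discussion.

```python
# Toy model only: hypothetical names, not the LoopVectorize implementation.

def considers_epilogue(vf: int, ic: int, min_vf: int, use_ic: bool) -> bool:
    """Return True if a vector epilogue would be considered.

    vf     -- vectorization factor of the main vector loop
    ic     -- interleave count (UF) of the main vector loop
    min_vf -- threshold, in the spirit of epilogue-vectorization-minimum-VF
    use_ic -- compare VF * IC against the threshold instead of VF alone
    """
    width = vf * ic if use_ic else vf
    return width >= min_vf


# VF-only check with a minimum of 16: the interleaved case is missed.
missed = considers_epilogue(vf=8, ic=4, min_vf=16, use_ic=False)

# VF-only check with the minimum lowered to 8: the interleaved case is
# caught, but so is a large loop that is not interleaved at all.
lowered_interleaved = considers_epilogue(vf=8, ic=4, min_vf=8, use_ic=False)
lowered_not_interleaved = considers_epilogue(vf=8, ic=1, min_vf=8, use_ic=False)

# VF * IC check with the minimum kept at 16: only the interleaved case passes.
scaled_interleaved = considers_epilogue(vf=8, ic=4, min_vf=16, use_ic=True)
scaled_not_interleaved = considers_epilogue(vf=8, ic=1, min_vf=16, use_ic=True)
```

In this model, lowering the minimum to 8 without looking at the interleave count also enables the epilogue for the IC=1, VF=8 loop, which is the case I am asking about.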

https://github.com/llvm/llvm-project/pull/100385

