[llvm] [SLP] Vectorize non-power-of-2 ops with padding. (PR #77790)

Tue Jan 16 12:58:12 PST 2024

alexey-bataev wrote:

> > As I said before, landing this alone causes perf regressions. I'm working on this and need to land a couple of patches to avoid them. I have been working on this feature for 3 years already (or more); most of my previous patches are related to non-power-of-2 support.
> 
> I did a perf evaluation of this change on SPEC2017 and some large internal benchmarks on AArch64, and the only regressions I was able to spot were due to AArch64 cost-model issues, where the cost of non-power-of-2 vector ops wasn't estimated accurately (one instance fixed by #78181).
> 
> With the initial level of support, do you think there are fundamental issues in SLPVectorizer that aren't related to target-specific cost models not providing accurate costs? To iterate on the latter, having limited support early would be helpful, as people can fix their cost models early and also iteratively as more features are enabled for non-power-of-2 vectors.

It conflicts with the work I have been doing for several years. I have a plan for implementing this feature and am moving forward step by step. Most of the newly added code, such as the padding, is not needed if everything is implemented correctly. This is not a complete solution, just a small hack to enable the feature; it may cause issues in the future and adds the extra burden of maintaining it and removing it later.

https://github.com/llvm/llvm-project/pull/77790