[PATCH] D118979: [AArch64] Set maximum VF with shouldMaximizeVectorBandwidth

Wed Feb 9 09:42:13 PST 2022

sdesmalen added a comment.

I'm missing a bit of rationale for this change. There is an interplay between having a wider VF or having a larger interleave factor. For 128bit vectors, an `add <4 x i64> %x, %y` will be legalized into two adds. Conceptually this is similar to vectorizing with `<2 x i64>` and having an interleave-factor of 2. I can imagine that interleaving in the loop-vectorizer leads to better code, because it avoids issues around type legalisation and may provide more opportunities for other IR passes to optimize the IR or move things around. If we always choose a wider VF I wonder if that may lead to poorer codegen because of type-legalization.

Is there a specific example where it's clearly an improvement to have a wider VF? And would choosing a larger unroll-factor help those cases?

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D118979/new/

https://reviews.llvm.org/D118979