[PATCH] D142015: [LV] Plan with and without FoldTailByMasking

Thu Jan 26 00:59:04 PST 2023

david-arm added a comment.

Hi @dmgreen, I'm sorry I've not looked into this in more detail, but from the description you gave (which is very detailed, thanks!) I do have one concern about choosing tail-folding by default in a tie. One major problem we have at the moment is that the active lane mask call is effectively free because, unless I'm mistaken, we don't add the cost of this intrinsic to the cost of the loop. On many targets, a simple IV increment (an add instruction) is likely to be cheaper than the codegen for the loop predicate. However, I do appreciate that targets that can't efficiently generate loop predicates probably also can't do masked loads and stores, so the tail-folded loop cost is likely to be high anyway. I wonder if it's better to be conservative and choose the non-tail-folded version in a tie until we've got a fairer comparison between different vectorisation styles?
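
To make the concern concrete, here is a minimal sketch of the tie-break being suggested. It is not the LoopVectorize cost model: `PlanCost`, `pickPlan` and the cost numbers are hypothetical names and values invented purely for illustration. It shows that when the loop-control cost of the tail-folded plan is left at zero (the "free" active lane mask), that plan can win or tie against a plan that pays for its IV increment, and that a conservative tie-break keeps the non-tail-folded plan.

```cpp
#include <cstdint>

// Hypothetical stand-in for the estimated cost of one vectorization plan.
// These fields do not correspond to real LLVM data structures.
struct PlanCost {
  std::uint64_t LoopBodyCost;    // estimated cost of the vector loop body
  std::uint64_t LoopControlCost; // IV add, or generating the loop predicate
  bool FoldTailByMasking;        // true for the tail-folded plan

  std::uint64_t total() const { return LoopBodyCost + LoopControlCost; }
};

// Returns the cheaper of two plans. On a tie, conservatively prefers the
// plan that does not fold the tail by masking.
inline const PlanCost &pickPlan(const PlanCost &A, const PlanCost &B) {
  if (A.total() != B.total())
    return A.total() < B.total() ? A : B;
  return A.FoldTailByMasking ? B : A;
}
```

With made-up costs, a plain plan of `{10, 1, false}` loses to a tail-folded plan of `{10, 0, true}` only because the mask is uncosted; once the mask is given a cost of 1, the two tie and the sketch picks the plain plan.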

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D142015/new/

https://reviews.llvm.org/D142015