[PATCH] D142015: [LV] Plan with and without FoldTailByMasking

David Sherwood via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Thu Jan 26 00:59:04 PST 2023


david-arm added a comment.

Hi @dmgreen, I'm sorry I've not looked into this in more detail, but from the description you gave (which is very detailed - thanks!) I do have one concern about choosing tail-folding by default in a tie. One major problem we have at the moment is that the active lane mask call is effectively free because, unless I'm mistaken, we don't add the cost of this intrinsic to the loop cost. For many targets, a simple IV increment (an add instruction) is likely to be cheaper than the codegen for the loop predicate. However, I do appreciate that targets that can't efficiently generate loop predicates probably also can't do masked loads and stores, so the tail-folded loop cost is likely to be high anyway. I wonder if it's better to be conservative and choose the non-tail-folded version in a tie until we've got a fairer comparison between different vectorisation styles?
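
To make the concern concrete, here is a rough sketch of the tie-break I have in mind. It is not the actual LoopVectorize code: PlanCost, Style and chooseStyle are hypothetical names made up for illustration, and the cost numbers in the note below are invented. ControlCost stands for the IV increment in the plain loop and for the lane-mask generation in the tail-folded loop, which I believe the cost model currently treats as zero.

```cpp
#include <cstdint>

// Hypothetical stand-in for the cost of one vectorization plan.
// This is NOT an LLVM type; the field names are assumptions.
struct PlanCost {
  std::uint64_t BodyCost;    // cost of the vectorized loop body
  std::uint64_t ControlCost; // IV increment, or lane-mask generation
  constexpr std::uint64_t total() const { return BodyCost + ControlCost; }
};

enum class Style { NoTailFolding, TailFolding };

// Choose tail-folding only when it is strictly cheaper, so that an
// exact tie conservatively falls back to the non-tail-folded loop.
constexpr Style chooseStyle(PlanCost Unfolded, PlanCost Folded) {
  return Folded.total() < Unfolded.total() ? Style::TailFolding
                                           : Style::NoTailFolding;
}
```

With the predicate costed at zero, chooseStyle(PlanCost{10, 0}, PlanCost{9, 0}) picks tail-folding. Charging even a small made-up cost for the mask, as in chooseStyle(PlanCost{10, 0}, PlanCost{9, 2}), flips the decision, which is why I would rather not let a tie go to the tail-folded version yet.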


CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D142015/new/

https://reviews.llvm.org/D142015


