[PATCH] D108136: [LoopVectorize] Permit vectorisation of more select(cmp(), X, Y) reduction patterns

David Sherwood via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Mon Sep 13 08:19:20 PDT 2021


david-arm added a comment.

In D108136#2996087 <https://reviews.llvm.org/D108136#2996087>, @dmgreen wrote:

> I have been looking at a similar area of the code recently for something unrelated. My first thought was that these are funny reductions, only reducing 2 values into 1. As in https://godbolt.org/z/531E7cPxY?
>
> Do you know how common this comes up, and if the loop in question will usually have a large enough iteration count to warrant the overheads of vectorization? I guess that's hard to say in general.
>
> Can you upload with full context?

Hi @dmgreen, I have only seen one example so far, but I do know that gcc vectorises this loop (although gcc's vectorised loop seems a bit messy). I tested some simple loops with large trip counts and saw significant performance improvements from vectorisation. I imagine that real-world loops would do more than just perform this type of reduction, and would likely combine it with other work.
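
For readers following along, here is a rough sketch of the kind of loop being discussed, assuming the pattern is a select between the running value and a loop-invariant value, guarded by a compare. The function names, the threshold `3`, and the lane count of 4 are all made up for illustration; the second function is only a hand-written model of how such a loop could be evaluated lane by lane, not what the patch actually emits.

```cpp
#include <array>
#include <cstddef>
#include <vector>

// Scalar form: r starts as `a` and becomes `b` once any element passes the
// compare. The result is only ever one of two values, `a` or `b`.
int select_cmp_scalar(const std::vector<int>& src, int a, int b) {
    int r = a;
    for (std::size_t i = 0; i < src.size(); ++i)
        r = (src[i] > 3) ? b : r;  // select(cmp(), b, r)
    return r;
}

// Hand-written model of a lane-wise evaluation (hypothetical width of 4):
// each lane records whether its compare was ever true, the lanes are ORed
// together after the loop, and a single final select picks `a` or `b`.
int select_cmp_lanes(const std::vector<int>& src, int a, int b) {
    constexpr std::size_t VF = 4;   // assumed vectorisation factor
    std::array<bool, VF> hit{};     // all lanes start as false
    std::size_t i = 0;
    for (; i + VF <= src.size(); i += VF)
        for (std::size_t lane = 0; lane < VF; ++lane)
            hit[lane] = hit[lane] || (src[i + lane] > 3);

    bool any = false;
    for (bool h : hit)              // horizontal OR across lanes
        any = any || h;
    for (; i < src.size(); ++i)     // scalar tail for leftover elements
        any = any || (src[i] > 3);

    return any ? b : a;
}
```

Both functions should agree for any input, since the final value depends only on whether the compare was true for at least one element.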


CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D108136/new/

https://reviews.llvm.org/D108136


