[PATCH] D130637: [LV] Don't predicate uniform mem op stores unnecessarily

Wed Jul 27 10:15:14 PDT 2022

fhahn added a comment.

> ; if the address is accessed at least once, we know the instruction doesn't need predicated. This change just extends it to handle uniform mem op stores as well.

I might be missing something, but I am not sure this logic can be directly extended to stores. I think the reasoning for uniform loads was that they load the same value in each iteration, so it is sufficient to load it once (and with tail folding the load is still guaranteed to execute at least once). Uniform stores (as per `isUniformMemOp`), however, store to the same address but not necessarily the same value in each iteration.

With tail folding, if the stores to the same address are executed unconditionally, the final stored value will be the one from the last lane of the last vector iteration. But couldn't that lane be masked out, in which case the value from the last active lane should be the final stored value? AFAICT this is happening in `llvm/test/Transforms/LoopVectorize/pr45679-fold-tail-by-masking.ll`?
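
To make the concern concrete, here is a rough scalar model of what I have in mind. It is only a sketch, not what LV actually emits: the function names, the assumed `VF` of 4, and the stored value `i * 10` are all made up for illustration. It compares the original scalar loop against a tail-folded loop whose uniform store is executed unconditionally, and against one where the store honors the mask.

```cpp
#include <array>
#include <cstddef>

// Assumed vectorization factor, purely for illustration.
constexpr std::size_t VF = 4;

// Scalar reference: each iteration stores a different value to one address.
int scalarLoop(std::size_t n) {
  int dst = -1;
  for (std::size_t i = 0; i < n; ++i)
    dst = static_cast<int>(i) * 10;
  return dst;
}

// Tail-folded loop with the uniform store executed unconditionally:
// the value of the last lane is stored even if that lane is masked out.
int tailFoldedUnpredicated(std::size_t n) {
  int dst = -1;
  for (std::size_t base = 0; base < n; base += VF) {
    std::array<int, VF> vals{};
    for (std::size_t lane = 0; lane < VF; ++lane)
      vals[lane] = static_cast<int>(base + lane) * 10;
    dst = vals[VF - 1]; // last lane, regardless of the mask
  }
  return dst;
}

// Tail-folded loop where the store honors the mask:
// only active lanes (base + lane < n) write, so the last active lane wins.
int tailFoldedPredicated(std::size_t n) {
  int dst = -1;
  for (std::size_t base = 0; base < n; base += VF) {
    for (std::size_t lane = 0; lane < VF; ++lane) {
      if (base + lane < n) // lane mask from tail folding
        dst = static_cast<int>(base + lane) * 10;
    }
  }
  return dst;
}
```

In this model, with a trip count of 6 the scalar and predicated versions leave the value from iteration 5 in memory, while the unpredicated version leaves the value computed for the masked-out lane 7. When the trip count is a multiple of `VF` (e.g. 8), all three agree.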

Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D130637/new/

https://reviews.llvm.org/D130637