[PATCH] D112552: [LoopVectorize] When tail-folding, don't always predicate uniform loads
Sander de Smalen via Phabricator via llvm-commits
llvm-commits at lists.llvm.org
Fri Oct 29 07:33:11 PDT 2021
sdesmalen accepted this revision.
sdesmalen added a comment.
This revision is now accepted and ready to land.
This looks good to me. We know the load will be executed to populate at least one of the lanes, so the scalar uniform load can be performed unconditionally within the vector body.
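To illustrate that reasoning, here is a minimal C++ model of a tail-folded loop. It is a sketch, not the code the vectorizer emits. The vectorization factor `VF`, the function names, and the loop body `dst[i] = *src` are assumptions made for illustration, loosely based on the `@uniform_load` test signature. The point is that the vector body is only entered when `i < n`, so lane 0 is always active and the uniform load needs no predicated block of its own.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>

// Assumed vectorization factor, chosen only for illustration.
constexpr std::size_t VF = 4;

// Scalar reference loop: every iteration stores the same (uniform) loaded value.
void uniform_load_scalar(int32_t *dst, const int32_t *src, std::size_t n) {
  for (std::size_t i = 0; i < n; ++i)
    dst[i] = *src;
}

// Hypothetical model of the tail-folded vector loop. dst and src are
// assumed not to alias, as in the test's noalias arguments.
void uniform_load_tail_folded(int32_t *dst, const int32_t *src, std::size_t n) {
  for (std::size_t i = 0; i < n; i += VF) {
    // Lane mask: lane l is active when element i + l is still in range.
    std::array<bool, VF> mask{};
    for (std::size_t l = 0; l < VF; ++l)
      mask[l] = (i + l) < n;

    // The body is only entered when i < n, so lane 0 is always active.
    assert(mask[0]);

    // Because at least one lane needs the value, the scalar load can be
    // done unconditionally here, with no extra predicated block.
    const int32_t v = *src;

    // Only the store has to respect the mask.
    for (std::size_t l = 0; l < VF; ++l)
      if (mask[l])
        dst[i + l] = v;
  }
}
```

When `n` is 0 the vector body is never entered, so the unconditional load inside it cannot touch `src` in a case where the scalar loop would not have.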
================
Comment at: llvm/test/Transforms/LoopVectorize/AArch64/tail-fold-uniform-memops.ll:10
+; we don't artificially create new predicated blocks for the load.
+define void @uniform_load(i32* noalias %dst, i32* noalias readonly %src, i64 %n) #0 {
+; CHECK-LABEL: @uniform_load(
----------------
nit: I'd suggest adding this test in D11261 as well, so that you can see the difference in this patch.
CHANGES SINCE LAST ACTION
https://reviews.llvm.org/D112552/new/
https://reviews.llvm.org/D112552