[PATCH] D115724: [InstCombine] Fold for masked scatters to a uniform address

Sander de Smalen via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Thu Dec 16 05:44:15 PST 2021


sdesmalen added inline comments.


================
Comment at: llvm/lib/Transforms/InstCombine/InstCombineCalls.cpp:379
+    // destination
+    if (auto *SplatValue = getSplatValue(II.getArgOperand(0))) {
+      Align Alignment = cast<ConstantInt>(II.getArgOperand(2))->getAlignValue();
----------------
This case still needs a check to ensure that `!ConstMask->isNullValue()`, because at least one lane must be active for this to be valid. (see my comment on your test to explain why this is currently untested)


================
Comment at: llvm/lib/Transforms/InstCombine/InstCombineCalls.cpp:385
+    }
+    // Vector splat address w/known mask -> scalar store
+    // If only the mask and destination are a splat, then we can extract
----------------
nit: Vector splat address with all-active mask -> scalar store of last lane.


================
Comment at: llvm/lib/Transforms/InstCombine/InstCombineCalls.cpp:386-388
+    // If only the mask and destination are a splat, then we can extract
+    // the value from the last lane of the source and do a scalar store
+    // to destination
----------------
These lines of comments can be dropped if you take the previous suggestion.


================
Comment at: llvm/lib/Transforms/InstCombine/InstCombineCalls.cpp:396
+      Value *RunTimeVF = VF.isScalable() ? Builder.CreateVScale(EC) : EC;
+      // LastLane = RunTimeVF - 1
+      Value *LastLane = Builder.CreateSub(RunTimeVF, Builder.getInt32(1));
----------------
nit: remove comment, the line of code below says as much.


================
Comment at: llvm/test/Transforms/InstCombine/masked_intrinsics.ll:285
+  %broadcast.splatvalue = shufflevector <4 x i16> %broadcast.value, <4 x i16> poison, <4 x i32> zeroinitializer
+  call void @llvm.masked.scatter.v4i16.v4p0i16(<4 x i16> %broadcast.splatvalue, <4 x i16*> %broadcast.splat, i32 2, <4 x i1> shufflevector (<4 x i1> insertelement (<4 x i1> poison, i1 true, i32 1), <4 x i1> poison, <4 x i32> zeroinitializer))
+  ret void
----------------
This is still an all-active mask though (contrary to the comment on this test), so this is not testing the code you're aiming to test.

Also, fixed-width vectors can be written like this: `<4 x i1> <i32 1, i32 1, i32 1, i32 0>` (for a mask that is not all-active, but also not all-inactive, which I think is what you're after)


================
Comment at: llvm/test/Transforms/InstCombine/vscale_masked_intrinsics.ll:16
+  %broadcast.splatvalue = shufflevector <vscale x 4 x i16> %broadcast.value, <vscale x 4 x i16> poison, <vscale x 4 x i32> zeroinitializer
+  call void @llvm.masked.scatter.nxv4i16.nxv4p0i16(<vscale x 4 x i16> %broadcast.splatvalue, <vscale x 4 x i16*> %broadcast.splat, i32 2, <vscale x 4 x i1> shufflevector (<vscale x 4 x i1> insertelement (<vscale x 4 x i1> poison, i1 true, i32 1), <vscale x 4 x i1> poison, <vscale x 4 x i32> zeroinitializer))
+  ret void
----------------
instead of passing an all true mask, you could simply insert 1 `i1 1` element into a zeroinitializer


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D115724/new/

https://reviews.llvm.org/D115724



More information about the llvm-commits mailing list