[PATCH] D115724: [InstCombine] Fold for masked scatters to a uniform address
Sander de Smalen via Phabricator via llvm-commits
llvm-commits at lists.llvm.org
Thu Dec 16 05:44:15 PST 2021
sdesmalen added inline comments.
================
Comment at: llvm/lib/Transforms/InstCombine/InstCombineCalls.cpp:379
+ // destination
+ if (auto *SplatValue = getSplatValue(II.getArgOperand(0))) {
+ Align Alignment = cast<ConstantInt>(II.getArgOperand(2))->getAlignValue();
----------------
This case still needs a check to ensure that `!ConstMask->isNullValue()`, because at least one lane must be active for this to be valid. (see my comment on your test to explain why this is currently untested)
================
Comment at: llvm/lib/Transforms/InstCombine/InstCombineCalls.cpp:385
+ }
+ // Vector splat address w/known mask -> scalar store
+ // If only the mask and destination are a splat, then we can extract
----------------
nit: Vector splat address with all-active mask -> scalar store of last lane.
================
Comment at: llvm/lib/Transforms/InstCombine/InstCombineCalls.cpp:386-388
+ // If only the mask and destination are a splat, then we can extract
+ // the value from the last lane of the source and do a scalar store
+ // to destination
----------------
These lines of comments can be dropped if you take the previous suggestion.
================
Comment at: llvm/lib/Transforms/InstCombine/InstCombineCalls.cpp:396
+ Value *RunTimeVF = VF.isScalable() ? Builder.CreateVScale(EC) : EC;
+ // LastLane = RunTimeVF - 1
+ Value *LastLane = Builder.CreateSub(RunTimeVF, Builder.getInt32(1));
----------------
nit: remove comment, the line of code below says as much.
================
Comment at: llvm/test/Transforms/InstCombine/masked_intrinsics.ll:285
+ %broadcast.splatvalue = shufflevector <4 x i16> %broadcast.value, <4 x i16> poison, <4 x i32> zeroinitializer
+ call void @llvm.masked.scatter.v4i16.v4p0i16(<4 x i16> %broadcast.splatvalue, <4 x i16*> %broadcast.splat, i32 2, <4 x i1> shufflevector (<4 x i1> insertelement (<4 x i1> poison, i1 true, i32 1), <4 x i1> poison, <4 x i32> zeroinitializer))
+ ret void
----------------
This is still an all-active mask though (contrary to the comment on this test), so this is not testing the code you're aiming to test.
Also, fixed-width vectors can be written like this: `<4 x i1> <i32 1, i32 1, i32 1, i32 0>` (for a mask that is not all-active, but also not all-inactive, which I think is what you're after)
================
Comment at: llvm/test/Transforms/InstCombine/vscale_masked_intrinsics.ll:16
+ %broadcast.splatvalue = shufflevector <vscale x 4 x i16> %broadcast.value, <vscale x 4 x i16> poison, <vscale x 4 x i32> zeroinitializer
+ call void @llvm.masked.scatter.nxv4i16.nxv4p0i16(<vscale x 4 x i16> %broadcast.splatvalue, <vscale x 4 x i16*> %broadcast.splat, i32 2, <vscale x 4 x i1> shufflevector (<vscale x 4 x i1> insertelement (<vscale x 4 x i1> poison, i1 true, i32 1), <vscale x 4 x i1> poison, <vscale x 4 x i32> zeroinitializer))
+ ret void
----------------
instead of passing an all true mask, you could simply insert 1 `i1 1` element into a zeroinitializer
Repository:
rG LLVM Github Monorepo
CHANGES SINCE LAST ACTION
https://reviews.llvm.org/D115724/new/
https://reviews.llvm.org/D115724
More information about the llvm-commits
mailing list