[PATCH] D115724: [InstCombine] Fold for masked scatters to a uniform address
Caroline via Phabricator via llvm-commits
llvm-commits at lists.llvm.org
Thu Dec 16 07:25:27 PST 2021
CarolineConcatto added inline comments.
================
Comment at: llvm/lib/Transforms/InstCombine/InstCombineCalls.cpp:379
+ // destination
+ if (auto *SplatValue = getSplatValue(II.getArgOperand(0))) {
+ Align Alignment = cast<ConstantInt>(II.getArgOperand(2))->getAlignValue();
----------------
sdesmalen wrote:
> This case still needs a check to ensure that `!ConstMask->isNullValue()`, because at least one lane must be active for this to be valid. (see my comment on your test to explain why this is currently untested)
As discussed internally this is done in line 371:
// If the mask is all zeros, a scatter does nothing.
if (ConstMask->isNullValue())
return eraseInstFromFunction(II);
================
Comment at: llvm/test/Transforms/InstCombine/masked_intrinsics.ll:285
+ %broadcast.splatvalue = shufflevector <4 x i16> %broadcast.value, <4 x i16> poison, <4 x i32> zeroinitializer
+ call void @llvm.masked.scatter.v4i16.v4p0i16(<4 x i16> %broadcast.splatvalue, <4 x i16*> %broadcast.splat, i32 2, <4 x i1> shufflevector (<4 x i1> insertelement (<4 x i1> poison, i1 true, i32 1), <4 x i1> poison, <4 x i32> zeroinitializer))
+ ret void
----------------
sdesmalen wrote:
> This is still an all-active mask though (contrary to the comment on this test), so this is not testing the code you're aiming to test.
>
> Also, fixed-width vectors can be written like this: `<4 x i1> <i32 1, i32 1, i32 1, i32 0>` (for a mask that is not all-active, but also not all-inactive, which I think is what you're after)
I have changed all the fixed vector to be as you said.
<i32 X, i32 X, i32 X, i32 X>, where X can be 0 or 1 depending on what is being tested.
================
Comment at: llvm/test/Transforms/InstCombine/vscale_masked_intrinsics.ll:16
+ %broadcast.splatvalue = shufflevector <vscale x 4 x i16> %broadcast.value, <vscale x 4 x i16> poison, <vscale x 4 x i32> zeroinitializer
+ call void @llvm.masked.scatter.nxv4i16.nxv4p0i16(<vscale x 4 x i16> %broadcast.splatvalue, <vscale x 4 x i16*> %broadcast.splat, i32 2, <vscale x 4 x i1> shufflevector (<vscale x 4 x i1> insertelement (<vscale x 4 x i1> poison, i1 true, i32 1), <vscale x 4 x i1> poison, <vscale x 4 x i32> zeroinitializer))
+ ret void
----------------
sdesmalen wrote:
> instead of passing an all true mask, you could simply insert 1 `i1 1` element into a zeroinitializer
I have updated the all zero mask to be:
<vscale x 4 x i1> zeroinitializer,
I don't know if I follow what you want here.
Repository:
rG LLVM Github Monorepo
CHANGES SINCE LAST ACTION
https://reviews.llvm.org/D115724/new/
https://reviews.llvm.org/D115724
More information about the llvm-commits
mailing list