[PATCH] D115724: [InstCombine] Fold for masked scatters to a uniform address
Sander de Smalen via Phabricator via llvm-commits
llvm-commits at lists.llvm.org
Tue Jan 4 05:48:17 PST 2022
sdesmalen added inline comments.
================
Comment at: llvm/lib/Transforms/InstCombine/InstCombineCalls.cpp:377-378
+ if (auto *SplatPtr = getSplatValue(II.getArgOperand(1))) {
+ // The value is a splat so we need a scalar store from the value to
+ // destination
+ if (auto *SplatValue = getSplatValue(II.getArgOperand(0))) {
----------------
nit: `// scatter(splat(value), splat(ptr), non-zero-mask) -> store value, ptr`
================
Comment at: llvm/lib/Transforms/InstCombine/InstCombineCalls.cpp:381
+ Align Alignment = cast<ConstantInt>(II.getArgOperand(2))->getAlignValue();
+ StoreInst *S = new StoreInst(SplatValue, SplatPtr, false, Alignment);
+ S->copyMetadata(II);
----------------
nit: `/*IsVolatile=*/ false`
================
Comment at: llvm/lib/Transforms/InstCombine/InstCombineCalls.cpp:385
+ }
+ // Vector splat address with all-active mask -> scalar store of last lane.
+ if (ConstMask->isAllOnesValue()) {
----------------
nit: `// scatter(vector, splat(ptr), splat(true)) -> store extract(vector, lastlane), ptr`
================
Comment at: llvm/lib/Transforms/InstCombine/InstCombineCalls.cpp:396
+ Builder.CreateExtractElement(II.getArgOperand(0), LastLane);
+ StoreInst *S = new StoreInst(Extract, SplatPtr, false, Alignment);
+ S->copyMetadata(II);
----------------
nit: `/*IsVolatile=*/ false`
================
Comment at: llvm/test/Transforms/InstCombine/masked_intrinsics.ll:274
+
+; The tests bellow have the pointer as a splat in the masked scatter
+
----------------
nit: The scatters in the tests below take a splatted pointer
================
Comment at: llvm/test/Transforms/InstCombine/masked_intrinsics.ll:276
+
+;; Value splat and mask is not not all active/one
+define void @valid_2value_inv_store_i16(i16* noalias %dst, i16 %val) #0 {
----------------
can you stick with only 1 terminology? i.e. just 'all active'.
================
Comment at: llvm/test/Transforms/InstCombine/masked_intrinsics.ll:277
+;; Value splat and mask is not not all active/one
+define void @valid_2value_inv_store_i16(i16* noalias %dst, i16 %val) #0 {
+; CHECK-LABEL: @valid_2value_inv_store_i16(
----------------
Like mentioned on D115726, I'd avoid using 'valid' and 'invalid', but instead do:
@scatter_v4i16_uniform_vals_uniform_ptrs_no_all_active_mask
================
Comment at: llvm/test/Transforms/InstCombine/masked_intrinsics.ll:304
+ %broadcast.splatvalue = shufflevector <4 x i16> %broadcast.value, <4 x i16> poison, <4 x i32> zeroinitializer
+ call void @llvm.masked.scatter.v4i16.v4p0i16(<4 x i16> %broadcast.splatvalue, <4 x i16*> %broadcast.splat, i32 2, <4 x i1> <i1 0, i1 0, i1 0, i1 0>)
+ ret void
----------------
This case isn't affected by your patch, so I would just remove it.
================
Comment at: llvm/test/Transforms/InstCombine/masked_intrinsics.ll:309
+
+;; The mask is all active/one
+define void @valid_maks_inv_store_i16(i16* noalias %dst, <4 x i16>* noalias readonly %src) #0 {
----------------
The mask is `<0, 0, 1, 1>` which is not all active.
Maybe place these under 'negative tests', since they cannot be optimised.
================
Comment at: llvm/test/Transforms/InstCombine/masked_intrinsics.ll:350-351
+;
+ %insert.elt = insertelement <4 x i16*> poison, i16* %dst, i32 1
+ %broadcast.splat = shufflevector <4 x i16*> %insert.elt, <4 x i16*> poison, <4 x i32> zeroinitializer
+ %wide.load = load <4 x i16>, <4 x i16>* %src, align 2
----------------
Just pass a `<4 x i16*>` as argument, it's a bit opaque that this is not a splat. (i.e. you're splatting element zero, which is not where `%dst` is inserted, but that's not an obvious pattern to spot)
================
Comment at: llvm/test/Transforms/InstCombine/vscale_masked_intrinsics.ll:8
+;; Value splat and mask is not not all active/one
+define void @valid_value_inv_store_i16(i16* noalias %dst, i16 %val, <vscale x 4 x i1> %mask) #0 {
+; CHECK-LABEL: @valid_value_inv_store_i16(
----------------
I think the tests for fixed-width vectors are largely sufficient, only for the two positive tests/cases I think you should add a scalable-vector test.
Repository:
rG LLVM Github Monorepo
CHANGES SINCE LAST ACTION
https://reviews.llvm.org/D115724/new/
https://reviews.llvm.org/D115724
More information about the llvm-commits
mailing list