[PATCH] D115724: [InstCombine] Fold for masked scatters to a uniform address

Sander de Smalen via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Tue Jan 4 05:48:17 PST 2022


sdesmalen added inline comments.


================
Comment at: llvm/lib/Transforms/InstCombine/InstCombineCalls.cpp:377-378
+  if (auto *SplatPtr = getSplatValue(II.getArgOperand(1))) {
+    // The value is a splat so we need a scalar store from the value to
+    // destination
+    if (auto *SplatValue = getSplatValue(II.getArgOperand(0))) {
----------------
nit: `// scatter(splat(value), splat(ptr), non-zero-mask) -> store value, ptr`


================
Comment at: llvm/lib/Transforms/InstCombine/InstCombineCalls.cpp:381
+      Align Alignment = cast<ConstantInt>(II.getArgOperand(2))->getAlignValue();
+      StoreInst *S = new StoreInst(SplatValue, SplatPtr, false, Alignment);
+      S->copyMetadata(II);
----------------
nit: `/*IsVolatile=*/ false`


================
Comment at: llvm/lib/Transforms/InstCombine/InstCombineCalls.cpp:385
+    }
+    // Vector splat address with all-active mask -> scalar store of last lane.
+    if (ConstMask->isAllOnesValue()) {
----------------
nit: `// scatter(vector, splat(ptr), splat(true)) -> store extract(vector, lastlane), ptr`


================
Comment at: llvm/lib/Transforms/InstCombine/InstCombineCalls.cpp:396
+          Builder.CreateExtractElement(II.getArgOperand(0), LastLane);
+      StoreInst *S = new StoreInst(Extract, SplatPtr, false, Alignment);
+      S->copyMetadata(II);
----------------
nit: `/*IsVolatile=*/ false`


================
Comment at: llvm/test/Transforms/InstCombine/masked_intrinsics.ll:274
+
+; The tests bellow have the pointer as a splat in the masked scatter
+
----------------
nit: The scatters in the tests below take a splatted pointer


================
Comment at: llvm/test/Transforms/InstCombine/masked_intrinsics.ll:276
+
+;; Value splat and  mask is not not all active/one
+define void @valid_2value_inv_store_i16(i16* noalias %dst, i16 %val) #0 {
----------------
can you stick with only 1 terminology? i.e. just 'all active'.


================
Comment at: llvm/test/Transforms/InstCombine/masked_intrinsics.ll:277
+;; Value splat and  mask is not not all active/one
+define void @valid_2value_inv_store_i16(i16* noalias %dst, i16 %val) #0 {
+; CHECK-LABEL: @valid_2value_inv_store_i16(
----------------
Like mentioned on D115726, I'd avoid using 'valid' and 'invalid', but instead do:

  @scatter_v4i16_uniform_vals_uniform_ptrs_no_all_active_mask



================
Comment at: llvm/test/Transforms/InstCombine/masked_intrinsics.ll:304
+  %broadcast.splatvalue = shufflevector <4 x i16> %broadcast.value, <4 x i16> poison, <4 x i32> zeroinitializer
+  call void @llvm.masked.scatter.v4i16.v4p0i16(<4 x i16> %broadcast.splatvalue, <4 x i16*> %broadcast.splat, i32 2, <4 x i1> <i1 0, i1 0, i1 0, i1 0>)
+  ret void
----------------
This case isn't affected by your patch, so I would just remove it.


================
Comment at: llvm/test/Transforms/InstCombine/masked_intrinsics.ll:309
+
+;; The mask is all active/one
+define void @valid_maks_inv_store_i16(i16* noalias %dst, <4 x i16>* noalias readonly %src) #0 {
----------------
The mask is `<0, 0, 1, 1>` which is not all active.

Maybe place these under 'negative tests', since they cannot be optimised.


================
Comment at: llvm/test/Transforms/InstCombine/masked_intrinsics.ll:350-351
+;
+  %insert.elt = insertelement <4 x i16*> poison, i16* %dst, i32 1
+  %broadcast.splat = shufflevector <4 x i16*> %insert.elt, <4 x i16*> poison, <4 x i32> zeroinitializer
+  %wide.load = load <4 x i16>, <4 x i16>* %src, align 2
----------------
Just pass a `<4 x i16*>` as argument, it's a bit opaque that this is not a splat. (i.e. you're splatting element zero, which is not where `%dst` is inserted, but that's not an obvious pattern to spot)


================
Comment at: llvm/test/Transforms/InstCombine/vscale_masked_intrinsics.ll:8
+;; Value splat and  mask is not not all active/one
+define void @valid_value_inv_store_i16(i16* noalias %dst, i16 %val, <vscale x 4 x i1> %mask) #0 {
+; CHECK-LABEL: @valid_value_inv_store_i16(
----------------
I think the tests for fixed-width vectors are largely sufficient, only for the two positive tests/cases I think you should add a scalable-vector test.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D115724/new/

https://reviews.llvm.org/D115724



More information about the llvm-commits mailing list