[PATCH] D122896: [InstCombine] Extend support for folding select + masked gathers

Peter Waller via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Mon Apr 4 04:15:37 PDT 2022


peterwaller-arm accepted this revision.
peterwaller-arm added a comment.
This revision is now accepted and ready to land.

Accept with some nits in the tests which need fixing.



================
Comment at: llvm/test/Transforms/InstCombine/select-masked_gather.ll:4-5
+
+; Fold zeroing of inactive lanes into the load's passthrough parameter.
+define <vscale x 2 x float> @masked_load_and_zero_inactive_1(<vscale x 2 x float*> %ptr, <vscale x 2 x i1> %mask) {
+; CHECK-LABEL: @masked_load_and_zero_inactive_1(
----------------
Test names refer to `load` but are testing gather. Worth differentiating for so they're easy to grep for. Please update references to 'load' in function names (also, in comments).


================
Comment at: llvm/test/Transforms/InstCombine/select-masked_gather.ll:123
+
+declare <vscale x 2 x float> @llvm.masked.gather.v4f32.p0v4f32(<vscale x 2 x float*>, i32 immarg, <vscale x 2 x i1>, <vscale x 2 x float>)
+declare <vscale x 2 x i32> @llvm.masked.gather.nxv2i32.nxv2p0i32(<vscale x 2 x i32*>, i32, <vscale x 2 x i1>, <vscale x 2 x i32>)
----------------
This intrinsic appears to be incorrectly named, since it's vscale x 2 x float, so it should be `nxv2f32`. That one is already available on line 125.


================
Comment at: llvm/test/Transforms/InstCombine/select-masked_gather.ll:124-125
+declare <vscale x 2 x float> @llvm.masked.gather.v4f32.p0v4f32(<vscale x 2 x float*>, i32 immarg, <vscale x 2 x i1>, <vscale x 2 x float>)
+declare <vscale x 2 x i32> @llvm.masked.gather.nxv2i32.nxv2p0i32(<vscale x 2 x i32*>, i32, <vscale x 2 x i1>, <vscale x 2 x i32>)
+declare <vscale x 2 x float> @llvm.masked.gather.nxv2f32.p0nxv2f32(<vscale x 2 x float*>, i32, <vscale x 2 x i1>, <vscale x 2 x float>)
----------------
These two intrinsics both have a have p0 in their titles but the arrangement of it is inconsistent. Looking at other tests, this part of the intrinsic name is often elided, so it would just be:
```
declare <vscale x 2 x i32> @llvm.masked.gather.nxv2i32(<vscale x 2 x i32*>, i32, <vscale x 2 x i1>, <vscale x 2 x i32>)
declare <vscale x 2 x float> @llvm.masked.gather.nxv2f32(<vscale x 2 x float*>, i32, <vscale x 2 x i1>, <vscale x 2 x float>)
```

or if you want to use the full name take a look at the CHECK lines in these tests which have e.g. `llvm.masked.gather.nxv2f32.nxv2p0f32` which looks to be the fully qualified way of writing it.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D122896/new/

https://reviews.llvm.org/D122896



More information about the llvm-commits mailing list