[llvm] [GISel] funnel shift combiner port from SelectionDAG ISel to GlobalISel (PR #135132)

Sun Apr 20 12:05:52 PDT 2025

================
@@ -105,3 +105,55 @@ define i16 @test_shl_i48_2(i48 %x, i48 %y) {
   %trunc = trunc i48 %shl to i16
   ret i16 %trunc
 }
+
+define i16 @test_fshl_i32(i32 %x, i32 %_, i32 %y) {
+; RV32-LABEL: test_fshl_i32:
+; RV32:       # %bb.0:
+; RV32-NEXT:    not a3, a2
+; RV32-NEXT:    sll a0, a0, a2
+; RV32-NEXT:    srli a1, a1, 1
+; RV32-NEXT:    srl a1, a1, a3
+; RV32-NEXT:    or a0, a0, a1
+; RV32-NEXT:    ret
+;
+; RV64-LABEL: test_fshl_i32:
+; RV64:       # %bb.0:
+; RV64-NEXT:    not a3, a2
+; RV64-NEXT:    sllw a0, a0, a2
+; RV64-NEXT:    srliw a1, a1, 1
+; RV64-NEXT:    srlw a1, a1, a3
+; RV64-NEXT:    or a0, a0, a1
+; RV64-NEXT:    ret
+
+    %fshl = call i32 @llvm.fshl.i32(i32 %x, i32 %_, i32 %y)
+    %shl = shl i32 %x, %y
+    %or = or i32 %fshl, %shl
+    %trunc = trunc i32 %or to i16
+    ret i16 %trunc
+}
+
+define i16 @test_fshr_i32(i32 %_, i32 %x, i32 %y) {
+; RV32-LABEL: test_fshr_i32:
+; RV32:       # %bb.0:
+; RV32-NEXT:    not a3, a2
+; RV32-NEXT:    slli a0, a0, 1
+; RV32-NEXT:    sll a0, a0, a3
+; RV32-NEXT:    srl a1, a1, a2
+; RV32-NEXT:    or a0, a0, a1
+; RV32-NEXT:    ret
+;
+; RV64-LABEL: test_fshr_i32:
+; RV64:       # %bb.0:
+; RV64-NEXT:    not a3, a2
+; RV64-NEXT:    slli a0, a0, 1
+; RV64-NEXT:    sllw a0, a0, a3
+; RV64-NEXT:    srlw a1, a1, a2
+; RV64-NEXT:    or a0, a0, a1
+; RV64-NEXT:    ret
+
+    %fshr = call i32 @llvm.fshr.i32(i32 %_, i32 %x, i32 %y)
+    %lshr = lshr i32 %x, %y
+    %or = or i32 %fshr, %lshr
+    %trunc = trunc i32 %or to i16
+    ret i16 %trunc
+}
----------------
mshockwave wrote:

Personally, I don't think you have to test i48, which is an illegal type for RISC-V. The majority of the instructions you saw in the i48 case are generated by the legalizer, which is unrelated to your patch. So for the purpose of testing your pattern, it'll be clearer and more succinct to include only legal types (i.e. i32 and i64).

> Also negative multi-use tests.

I think this is related to a check that is potentially missing from your pattern now: ideally, the pattern would replace three instructions
```
(G_FSHR $out1, $z, $x, $y),
(G_LSHR $out2, $x, $y),
(G_OR $root, $out1, $out2)
```
into just one:
```
(G_FSHR $root, $z, $x, $y)
```
But if `$out2` or `$out1` has more than one user, then you cannot remove the intermediate `G_LSHR` or `G_FSHR`, respectively. In that case you have to emit the following sequence to ensure correctness:
```
(G_FSHR $out1, $z, $x, $y),
(G_LSHR $out2, $x, $y),
(G_FSHR $root, $z, $x, $y)
```
But as you might have already noticed, this "new" sequence of 3 instructions is no better than the original sequence, which also has 3 instructions (on most architectures, G_FSHR will not be faster than G_OR). So when `$out2` or `$out1` has more than one user, it makes more sense to just _not_ do any optimization. I don't think your current pattern triggers conditionally based on the number of users of the intermediate values. You can check out how other patterns handle it: guarding an optimization pattern with a check that the intermediate values have only one use is pretty common. If you want to learn more, you can search for `hasOneUse` or `m_OneUse` in DAGCombiner to see some examples.

Finally, a negative test is a test that makes sure your optimization does _not_ trigger under certain conditions. In your case, it would be an input sequence in which the G_FSHR or G_LSHR has more than one user, and the CHECK lines should verify that the sequence is not optimized.
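
For example, negative multi-use tests could look roughly like the sketch below. The function names, the extra `ptr %p` argument, and the `store` used to create the second user are just my assumptions for illustration; any other second use would do. The CHECK lines are left out on purpose, since they should be regenerated with the usual update script rather than written by hand.

```llvm
; Standalone sketch: drop this declare if the test file already has one.
declare i32 @llvm.fshr.i32(i32, i32, i32)

; %lshr has a second user (the store), so the G_LSHR must stay.
define i32 @test_fshr_i32_lshr_multi_use(i32 %z, i32 %x, i32 %y, ptr %p) {
  %fshr = call i32 @llvm.fshr.i32(i32 %z, i32 %x, i32 %y)
  %lshr = lshr i32 %x, %y
  store i32 %lshr, ptr %p
  %or = or i32 %fshr, %lshr
  ret i32 %or
}

; %fshr has a second user (the store), so the G_FSHR must stay.
define i32 @test_fshr_i32_fshr_multi_use(i32 %z, i32 %x, i32 %y, ptr %p) {
  %fshr = call i32 @llvm.fshr.i32(i32 %z, i32 %x, i32 %y)
  %lshr = lshr i32 %x, %y
  store i32 %fshr, ptr %p
  %or = or i32 %fshr, %lshr
  ret i32 %or
}
```

In both functions the expected output should still contain the original shift, funnel shift, and `or`, which shows the combine did not fire.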

https://github.com/llvm/llvm-project/pull/135132