[llvm] [CGP] Despeculate ctlz/cttz with "illegal" integer types (PR #137197)
Sergei Barannikov via llvm-commits
llvm-commits at lists.llvm.org
Tue Apr 29 10:47:24 PDT 2025
================
@@ -441,33 +444,35 @@ define i64 @test_cttz_i64(i64 %a) nounwind {
;
; RV32M-LABEL: test_cttz_i64:
; RV32M: # %bb.0:
+; RV32M-NEXT: or a2, a0, a1
+; RV32M-NEXT: beqz a2, .LBB3_3
+; RV32M-NEXT: # %bb.1: # %cond.false
; RV32M-NEXT: lui a2, 30667
; RV32M-NEXT: addi a3, a2, 1329
; RV32M-NEXT: lui a2, %hi(.LCPI3_0)
; RV32M-NEXT: addi a2, a2, %lo(.LCPI3_0)
-; RV32M-NEXT: bnez a1, .LBB3_3
-; RV32M-NEXT: # %bb.1:
-; RV32M-NEXT: li a1, 32
-; RV32M-NEXT: beqz a0, .LBB3_4
-; RV32M-NEXT: .LBB3_2:
-; RV32M-NEXT: neg a1, a0
-; RV32M-NEXT: and a0, a0, a1
+; RV32M-NEXT: bnez a0, .LBB3_4
+; RV32M-NEXT: # %bb.2: # %cond.false
+; RV32M-NEXT: neg a0, a1
+; RV32M-NEXT: and a0, a1, a0
; RV32M-NEXT: mul a0, a0, a3
; RV32M-NEXT: srli a0, a0, 27
; RV32M-NEXT: add a0, a2, a0
; RV32M-NEXT: lbu a0, 0(a0)
+; RV32M-NEXT: addi a0, a0, 32
; RV32M-NEXT: li a1, 0
; RV32M-NEXT: ret
; RV32M-NEXT: .LBB3_3:
-; RV32M-NEXT: neg a4, a1
-; RV32M-NEXT: and a1, a1, a4
-; RV32M-NEXT: mul a1, a1, a3
-; RV32M-NEXT: srli a1, a1, 27
-; RV32M-NEXT: add a1, a2, a1
-; RV32M-NEXT: lbu a1, 0(a1)
-; RV32M-NEXT: bnez a0, .LBB3_2
+; RV32M-NEXT: li a1, 0
----------------
s-barannikov wrote:
This could be handled by RISCVRedundantCopyElimination (after some improvements, such as supporting AND/OR), but the pattern it would need to match is only created by TailDuplicatePass, which runs 14 passes after it.
```
bb.0 (%ir-block.0):
successors: %bb.1(0x30000000), %bb.3(0x50000000); %bb.1(37.50%), %bb.3(62.50%)
liveins: $x10, $x11
renamable $x12 = OR renamable $x10, renamable $x11
BNE killed renamable $x12, $x0, %bb.3
bb.1:
; predecessors: %bb.0
renamable $x11 = COPY $x0
renamable $x10 = ADDI $x0, 64
PseudoRET implicit $x10, implicit $x11
```
I think the extra `li` may not be a big problem here, since the result of `cttz` is usually truncated to 32 bits.
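
For readers following the diff, the new RV32M sequence corresponds roughly to the C++ sketch below: the zero check is done once up front, and only one 32-bit half goes through the de Bruijn table lookup. The function and table names are hypothetical and chosen for illustration only; the constant `0x077CB531` is what `lui 30667` / `addi 1329` materialize, and the table contents are assumed to be the usual ones paired with that constant (the actual `.LCPI3_0` contents are not shown in the hunk).

```cpp
#include <cassert>
#include <cstdint>

// Assumed contents of the constant-pool table (.LCPI3_0 is not shown in
// the diff); this is the standard table for the constant 0x077CB531.
static const uint8_t kDeBruijnTable[32] = {
    0,  1,  28, 2,  29, 14, 24, 3,  30, 22, 20, 15, 25, 17, 4,  8,
    31, 27, 13, 23, 21, 19, 16, 7,  26, 12, 18, 6,  11, 5,  10, 9};

// Trailing-zero count of a nonzero 32-bit value: isolate the lowest set
// bit (neg + and), multiply (mul), take the top 5 bits (srli 27), and
// index the table (lbu).
static unsigned cttz32_nonzero(uint32_t x) {
  assert(x != 0 && "zero must be handled by the caller");
  uint32_t lowest = x & (0u - x);
  return kDeBruijnTable[(lowest * 0x077CB531u) >> 27];
}

// Hypothetical helper mirroring the despeculated i64 cttz on a 32-bit
// target: lo/hi are the two halves of the 64-bit input (a0/a1).
unsigned cttz64_despeculated(uint32_t lo, uint32_t hi) {
  if ((lo | hi) == 0)   // or a2, a0, a1 ; beqz a2, ...
    return 64;          // the zero-input block (ADDI $x0, 64 in the MIR)
  if (lo != 0)          // bnez a0, ...
    return cttz32_nonzero(lo);
  return 32 + cttz32_nonzero(hi);  // addi a0, a0, 32
}
```

In the old sequence the lookup for the high half was computed speculatively before the low half was tested; in this form each path performs at most one lookup.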
https://github.com/llvm/llvm-project/pull/137197