[llvm] [InstCombine] Prefer to keep power-of-2 constants when combining ashr exact and slt/ult of a constant (PR #86111)

Alex Bradbury via llvm-commits llvm-commits at lists.llvm.org
Thu Mar 21 04:27:22 PDT 2024


https://github.com/asb created https://github.com/llvm/llvm-project/pull/86111

We have flexibility in what constant to use when combining an `ashr exact` with a slt or ult of a constant, and it's not possible to revisit this decision later in the compilation pipeline after the `ashr exact` is removed. Keeping a constant close to power-of-2 (pow2val + 1) should be no worse than neutral, and in some cases may allow better codegen later on for targets that can more cheaply generated power of 2 (which may be selectable if converting back to setle/setge) or near power of 2 constants.

Alive2: <https://alive2.llvm.org/ce/z/anqsyz> and
<https://alive2.llvm.org/ce/z/AL9P3o>

---
*Discussion*:

More test changes are obviously required if landing this, but I thought I'd put this out there to get feedback on the approach as I'm not sure there's any precedent on making a heuristic choice for constants in instcombine. If it were possible to do this later in the pipeline (e.g. riscvcodegenprepare or later) I would, but as noted above once you lose the `ashr exact` you no longer know it's legal to change the constant. I don't have a benchmark this makes a huge difference on, but came across it when chasing something else (and intuitively, it seems plausible that making this kind of compare materialisable with a single instruction may reduce register pressure in the right cases).

The motivating example was something like:
```
define dso_local signext i32 @foo(ptr noundef %0, ptr noundef %1) local_unnamed_addr #0 {
  %3 = ptrtoint ptr %1 to i64
  %4 = ptrtoint ptr %0 to i64
  %5 = sub i64 %3, %4
  %6 = sdiv exact i64 %5, 32
  %7 = icmp sle i64 %6, 64
  br i1 %7, label %8, label %9

8:                                                ; preds = %2
  call void @abort() #2
  unreachable

9:                                                ; preds = %2
  ret i32 0

}
```

You get this kind of code from pointer arithmetic like `end - beg <= CONST` in C.

With this patch this can be compiled to:
```
	sub	a1, a1, a0
	bseti	a0, zero, 11
	bge	a0, a1, .LBB0_2
# %bb.1:
	li	a0, 0
	ret
```

But without it you get an awkward to generate constant:
```
	sub	a1, a1, a0
	lui	a0, 1
	addiw	a0, a0, -2017
	bge	a0, a1, .LBB0_2
```
I'm arguing that producing pow2 or close to pow2 constants if you can is a good default that shouldn't hurt and may help - but perhaps there are cases I haven't considered?

>From 3548b6f0ff11cf4bc9e1d07396f16793322fe3c4 Mon Sep 17 00:00:00 2001
From: Alex Bradbury <asb at igalia.com>
Date: Thu, 21 Mar 2024 11:10:06 +0000
Subject: [PATCH] [InstCombine] Prefer to keep power-of-2 constants when
 combining ashr exact and slt/ult of a constant

We have flexibility in what constant to use when combining an `ashr
exact` with a slt or ult of a constant, and it's not possible to revisit
this decision later in the compilation pipeline after the `ashr exact`
is removed. Keeping a constant close to power-of-2 (pow2val + 1) should
be now worse than neutral, and in some cases may allow better codegen
later on for targets that can more cheaply generated power of 2 (which
may be selectable if converting back to setle/setge) or near power of 2
constants.

Alive2: <https://alive2.llvm.org/ce/z/anqsyz> and
<https://alive2.llvm.org/ce/z/AL9P3o>
---
 .../Transforms/InstCombine/InstCombineCompares.cpp    | 11 +++++++++++
 llvm/test/Transforms/InstCombine/icmp-shr-lt-gt.ll    |  4 ++--
 2 files changed, 13 insertions(+), 2 deletions(-)

diff --git a/llvm/lib/Transforms/InstCombine/InstCombineCompares.cpp b/llvm/lib/Transforms/InstCombine/InstCombineCompares.cpp
index db302d7e526844..016b16ec50f8ac 100644
--- a/llvm/lib/Transforms/InstCombine/InstCombineCompares.cpp
+++ b/llvm/lib/Transforms/InstCombine/InstCombineCompares.cpp
@@ -2482,10 +2482,21 @@ Instruction *InstCombinerImpl::foldICmpShrConstant(ICmpInst &Cmp,
   // those conditions rather than checking them. This is difficult because of
   // undef/poison (PR34838).
   if (IsAShr && Shr->hasOneUse()) {
+    if (IsExact && (Pred == CmpInst::ICMP_SLT || Pred == CmpInst::ICMP_ULT) &&
+        !C.isMinSignedValue() && (C - 1).isPowerOf2()) {
+      // When C - 1 is a power of two and the transform can be legally
+      // performed, prefer this form so the produced constant is close to a
+      // power of two.
+      // icmp slt/ult (ashr exact X, ShAmtC), C
+      // --> icmp slt/ult (C - 1) << ShAmtC) -1
+      APInt ShiftedC = (C - 1).shl(ShAmtVal) + 1;
+      return new ICmpInst(Pred, X, ConstantInt::get(ShrTy, ShiftedC));
+    }
     if (IsExact || Pred == CmpInst::ICMP_SLT || Pred == CmpInst::ICMP_ULT) {
       // When ShAmtC can be shifted losslessly:
       // icmp PRED (ashr exact X, ShAmtC), C --> icmp PRED X, (C << ShAmtC)
       // icmp slt/ult (ashr X, ShAmtC), C --> icmp slt/ult X, (C << ShAmtC)
+
       APInt ShiftedC = C.shl(ShAmtVal);
       if (ShiftedC.ashr(ShAmtVal) == C)
         return new ICmpInst(Pred, X, ConstantInt::get(ShrTy, ShiftedC));
diff --git a/llvm/test/Transforms/InstCombine/icmp-shr-lt-gt.ll b/llvm/test/Transforms/InstCombine/icmp-shr-lt-gt.ll
index 1b8efe4351c6dc..6cdf18d73c9e90 100644
--- a/llvm/test/Transforms/InstCombine/icmp-shr-lt-gt.ll
+++ b/llvm/test/Transforms/InstCombine/icmp-shr-lt-gt.ll
@@ -3379,7 +3379,7 @@ define i1 @ashrslt_01_01_exact(i4 %x) {
 
 define i1 @ashrslt_01_02_exact(i4 %x) {
 ; CHECK-LABEL: @ashrslt_01_02_exact(
-; CHECK-NEXT:    [[C:%.*]] = icmp slt i4 [[X:%.*]], 4
+; CHECK-NEXT:    [[C:%.*]] = icmp slt i4 [[X:%.*]], 3
 ; CHECK-NEXT:    ret i1 [[C]]
 ;
   %s = ashr exact i4 %x, 1
@@ -3389,7 +3389,7 @@ define i1 @ashrslt_01_02_exact(i4 %x) {
 
 define i1 @ashrslt_01_03_exact(i4 %x) {
 ; CHECK-LABEL: @ashrslt_01_03_exact(
-; CHECK-NEXT:    [[C:%.*]] = icmp slt i4 [[X:%.*]], 6
+; CHECK-NEXT:    [[C:%.*]] = icmp slt i4 [[X:%.*]], 5
 ; CHECK-NEXT:    ret i1 [[C]]
 ;
   %s = ashr exact i4 %x, 1



More information about the llvm-commits mailing list