[llvm] [ValueTracking] Handle non-canonical operand order in `isImpliedCondICmps` (PR #85575)

Mon Mar 18 15:52:58 PDT 2024

goldsteinn wrote:

> This patch seems to break minmax idiom: [dtcxzyw/llvm-opt-benchmark#421 (comment)](https://github.com/dtcxzyw/llvm-opt-benchmark/pull/421#discussion_r1527819200)

Investigated the regression a bit. A reduced case is:
```
target datalayout = "e-m:e-p270:32:32-p271:32:32-p272:64:64-i64:64-i128:128-f80:128-n8:16:32:64-S128"
target triple = "x86_64-unknown-linux-gnu"

declare void @use(i64)

define i32 @ParseCipherList(i64 %sub.ptr.lhs.cast) {
entry:
  %currLen = alloca i32, i32 0, align 4
  br label %do.body24

do.body24:  ; preds = %if.end34, %entry
  %conv29 = trunc i64 %sub.ptr.lhs.cast to i32
  store i32 %conv29, ptr %currLen, align 4
  %0 = load i32, ptr %currLen, align 4
  %cmp30 = icmp ugt i32 48, %0
  br i1 %cmp30, label %if.then32, label %if.end34

if.then32:  ; preds = %do.body24
  %1 = load i32, ptr %currLen, align 4
  br label %if.end34

if.end34:  ; preds = %if.then32, %do.body24
  %length.0 = phi i32 [ %1, %if.then32 ], [ 48, %do.body24 ]
  %conv35 = zext i32 %length.0 to i64
  call void @use(i64 %conv35)
  %cmp38 = icmp eq i64 %conv35, 49
  %cond = select i1 %cmp38, i32 0, i32 %length.0
  %idxprom = zext i32 %cond to i64
  %arrayidx40 = getelementptr [49 x i8], ptr null, i64 0, i64 %idxprom
  store i8 0, ptr %arrayidx40, align 1
  br label %do.body24
}
```

What ends up happening is, with the change, `earlycse` is able to simplify `%cond` so we get:
```
if.end34:                                         ; preds = %if.then32, %do.body24
  %length.0 = phi i32 [ %conv29, %if.then32 ], [ 48, %do.body24 ]
  %conv35 = zext i32 %length.0 to i64
  call void @use(i64 %conv35)
  %arrayidx40 = getelementptr [49 x i8], ptr null, i64 0, i64 %conv35
  store i8 0, ptr %arrayidx40, align 1
  br label %do.body24
```

Without the change `earlycse` produces:
```
  %length.0 = phi i32 [ %conv29, %if.then32 ], [ 48, %do.body24 ]
  %conv35 = zext i32 %length.0 to i64
  call void @use(i64 %conv35)
  %cmp38 = icmp eq i64 %conv35, 49
  %cond = select i1 %cmp38, i32 0, i32 %length.0
  %idxprom = zext i32 %cond to i64
  %arrayidx40 = getelementptr [49 x i8], ptr null, i64 0, i64 %idxprom
  store i8 0, ptr %arrayidx40, align 1
  br label %do.body24
```

The big difference then comes up when we reach `instcombine`. without the `earlycse` cleanup, we do essentially same simplification in `instcombine`/`instsimplify`:
```
  %length.0 = phi i32 [ %conv29, %if.then32 ], [ 48, %do.body24 ]
  %conv35 = zext nneg i32 %length.0 to i64
  call void @use(i64 %conv35)
  %idxprom = zext nneg i32 %length.0 to i64
  %arrayidx40 = getelementptr [49 x i8], ptr null, i64 0, i64 %idxprom
  store i8 poison, ptr %arrayidx40, align 1
  br label %do.body24
```

But we end up duplicating the `zext nneg i32 %length.0 to i64` instruction instead of folding it all to one instruction.
This makes the `phi` multi-use which disables `canEvaluateZExtd` which preserves the minmax pattern in the `phi` for when its later reduced to select by `simplifycfg`.

I don't think any part of how this regression got introduced is really fixable.

Ultimately the fix boils down to matching:
```
define i32 @ParseCipherList(i64 %sub.ptr.lhs.cast) local_unnamed_addr {
  %conv29 = trunc i64 %sub.ptr.lhs.cast to i32
  %cmp30 = icmp ult i32 %conv29, 48
  %spec.select = select i1 %cmp30, i64 %sub.ptr.lhs.cast, i64 48
  %conv35 = and i64 %spec.select, 63
  ...
```

As a min/max pattern which is a bit tough. Its not really possible w.o
the context of the `and` instruction (with the and we can simplify the
true arm of the select back to `%conv29` to recreate the pattern).

This is doable in `foldSelectInstWithICmp` before `canonicalizeSPF`
but I don't really see a way to make this anything but extremely brittle,
as I don't see a reasonable way to introduce the demanded bits aspect
(so we can match `trunc` of an arm) into `matchDecomposedSelectPattern`.

My feeling is this is basically wont fix. Do you see any clean way to
address this?

https://github.com/llvm/llvm-project/pull/85575