[llvm] [ValueTracking] Handle non-canonical operand order in `isImpliedCondICmps` (PR #85575)
via llvm-commits
llvm-commits at lists.llvm.org
Mon Mar 18 15:52:58 PDT 2024
goldsteinn wrote:
> This patch seems to break minmax idiom: [dtcxzyw/llvm-opt-benchmark#421 (comment)](https://github.com/dtcxzyw/llvm-opt-benchmark/pull/421#discussion_r1527819200)
Investigated the regression a bit. A reduced case is:
```
target datalayout = "e-m:e-p270:32:32-p271:32:32-p272:64:64-i64:64-i128:128-f80:128-n8:16:32:64-S128"
target triple = "x86_64-unknown-linux-gnu"
declare void @use(i64)
define i32 @ParseCipherList(i64 %sub.ptr.lhs.cast) {
entry:
%currLen = alloca i32, i32 0, align 4
br label %do.body24
do.body24: ; preds = %if.end34, %entry
%conv29 = trunc i64 %sub.ptr.lhs.cast to i32
store i32 %conv29, ptr %currLen, align 4
%0 = load i32, ptr %currLen, align 4
%cmp30 = icmp ugt i32 48, %0
br i1 %cmp30, label %if.then32, label %if.end34
if.then32: ; preds = %do.body24
%1 = load i32, ptr %currLen, align 4
br label %if.end34
if.end34: ; preds = %if.then32, %do.body24
%length.0 = phi i32 [ %1, %if.then32 ], [ 48, %do.body24 ]
%conv35 = zext i32 %length.0 to i64
call void @use(i64 %conv35)
%cmp38 = icmp eq i64 %conv35, 49
%cond = select i1 %cmp38, i32 0, i32 %length.0
%idxprom = zext i32 %cond to i64
%arrayidx40 = getelementptr [49 x i8], ptr null, i64 0, i64 %idxprom
store i8 0, ptr %arrayidx40, align 1
br label %do.body24
}
```
What ends up happening is, with the change, `earlycse` is able to simplify `%cond` so we get:
```
if.end34: ; preds = %if.then32, %do.body24
%length.0 = phi i32 [ %conv29, %if.then32 ], [ 48, %do.body24 ]
%conv35 = zext i32 %length.0 to i64
call void @use(i64 %conv35)
%arrayidx40 = getelementptr [49 x i8], ptr null, i64 0, i64 %conv35
store i8 0, ptr %arrayidx40, align 1
br label %do.body24
```
Without the change `earlycse` produces:
```
%length.0 = phi i32 [ %conv29, %if.then32 ], [ 48, %do.body24 ]
%conv35 = zext i32 %length.0 to i64
call void @use(i64 %conv35)
%cmp38 = icmp eq i64 %conv35, 49
%cond = select i1 %cmp38, i32 0, i32 %length.0
%idxprom = zext i32 %cond to i64
%arrayidx40 = getelementptr [49 x i8], ptr null, i64 0, i64 %idxprom
store i8 0, ptr %arrayidx40, align 1
br label %do.body24
```
The big difference then comes up when we reach `instcombine`. without the `earlycse` cleanup, we do essentially same simplification in `instcombine`/`instsimplify`:
```
%length.0 = phi i32 [ %conv29, %if.then32 ], [ 48, %do.body24 ]
%conv35 = zext nneg i32 %length.0 to i64
call void @use(i64 %conv35)
%idxprom = zext nneg i32 %length.0 to i64
%arrayidx40 = getelementptr [49 x i8], ptr null, i64 0, i64 %idxprom
store i8 poison, ptr %arrayidx40, align 1
br label %do.body24
```
But we end up duplicating the `zext nneg i32 %length.0 to i64` instruction instead of folding it all to one instruction.
This makes the `phi` multi-use which disables `canEvaluateZExtd` which preserves the minmax pattern in the `phi` for when its later reduced to select by `simplifycfg`.
I don't think any part of how this regression got introduced is really fixable.
Ultimately the fix boils down to matching:
```
define i32 @ParseCipherList(i64 %sub.ptr.lhs.cast) local_unnamed_addr {
%conv29 = trunc i64 %sub.ptr.lhs.cast to i32
%cmp30 = icmp ult i32 %conv29, 48
%spec.select = select i1 %cmp30, i64 %sub.ptr.lhs.cast, i64 48
%conv35 = and i64 %spec.select, 63
...
```
As a min/max pattern which is a bit tough. Its not really possible w.o
the context of the `and` instruction (with the and we can simplify the
true arm of the select back to `%conv29` to recreate the pattern).
This is doable in `foldSelectInstWithICmp` before `canonicalizeSPF`
but I don't really see a way to make this anything but extremely brittle,
as I don't see a reasonable way to introduce the demanded bits aspect
(so we can match `trunc` of an arm) into `matchDecomposedSelectPattern`.
My feeling is this is basically wont fix. Do you see any clean way to
address this?
https://github.com/llvm/llvm-project/pull/85575
More information about the llvm-commits
mailing list