[llvm] [InstCombine] Extend `foldICmpAddConstant` to disjoint `or`. (PR #75899)
    Mikhail Gudim via llvm-commits 
    llvm-commits at lists.llvm.org
       
    Thu Jan  4 01:13:42 PST 2024
    
    
  
mgudim wrote:
OK, I see what's going on. First, note that 9223372036854775792 has a 0 in bit 63, bits [0, 3] are zero, and all other bits are 1. Below is the simplified test case:
```
define i64 @foo(i64 %x, i64 %y, i1 %c) {
  %shlx_ = shl i64 %x, 8
  %shly_ = shl i64 %y, 4
  %add2 = add i64 %shly_, 16
  %select_ = select i1 %c, i64 %shlx_, i64 %add2
  %add_ = add i64 %select_, 15
  %cmp_ = icmp slt i64 %add_, 0
  br i1 %cmp_, label %t, label %f
t:
  unreachable
f:
  %and_ = and i64 %add_, 9223372036854775792
  ret i64 %and_
}
```
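As a quick sanity check of the bit pattern described above, here is a small Python sketch. The helper name `set_bits` is made up for illustration and is not part of LLVM.

```python
# The mask constant from the test case.
MASK = 9223372036854775792

def set_bits(value, width=64):
    """Return the sorted list of bit positions that are 1 in `value`."""
    return [i for i in range(width) if (value >> i) & 1]

# The mask is 2**63 - 16, i.e. 0x7ffffffffffffff0.
print(hex(MASK))
# Exactly bits 4..62 are set: bit 63 and bits 0..3 are zero.
print(set_bits(MASK) == list(range(4, 63)))
```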
Before my change, this is what the IR looks like when we reach `visitAnd` in `InstCombine`:
```
define i64 @foo(i64 %x, i64 %y, i1 %c) {
  %shlx_ = shl i64 %x, 8
  %shly_ = shl i64 %y, 4
  %add2 = add i64 %shly_, 16
  %select_ = select i1 %c, i64 %shlx_, i64 %add2
  %cmp_ = icmp slt i64 %select_, 0
  br i1 %cmp_, label %t, label %f
t:                                                ; preds = %0
  unreachable
f:                                                ; preds = %0
  %add_ = or disjoint i64 %select_, 15
  %and = and i64 %select_, 9223372036854775792
  ret i64 %and
}
```
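In other words, the compare is still a comparison against zero (`%cmp_ = icmp slt i64 %select_, 0`), so we know that if we reach `f` the highest bit of `%select_` is zero. The low four bits of `%select_` are zero as well (both arms of the select come from shifts by at least 4, and adding 16 does not touch them), so we can see that the `add` and the `and` are not needed.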
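To illustrate why the masking is redundant on the path to `f`, here is a minimal Python model of the two instructions on 64-bit patterns. It assumes, as argued above, that `%select_` has bit 63 and bits 0..3 clear; the function names are hypothetical and only model the arithmetic, not anything InstCombine does internally.

```python
import random

MASK = 0x7FFFFFFFFFFFFFF0  # 9223372036854775792

def or_then_and(x):
    """Model `or disjoint x, 15` followed by `and ..., MASK` on an i64 bit pattern."""
    return (x | 15) & MASK

def sample_select_value(rng):
    """A made-up %select_ value as seen in `f`: bit 63 clear, bits 0..3 clear."""
    return rng.getrandbits(59) << 4

rng = random.Random(0)
samples = [0, 16, MASK] + [sample_select_value(rng) for _ in range(1000)]

# Under the stated assumptions the or/and pair returns its input unchanged.
print(all(or_then_and(x) == x for x in samples))
```

If bit 63 could be set, the `and` would no longer be a no-op, which is why the fact derived from the branch condition matters.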
After my change:
```
define i64 @foo(i64 %x, i64 %y, i1 %c) {
  %shlx_ = shl i64 %x, 8
  %shly_ = shl i64 %y, 4
  %add2 = add i64 %shly_, 16
  %select_ = select i1 %c, i64 %shlx_, i64 %add2
  %cmp_ = icmp slt i64 %select_, -15
  br i1 %cmp_, label %t, label %f
t:                                                ; preds = %0
  unreachable
f:                                                ; preds = %0
  %add_ = or disjoint i64 %select_, 15
  %and = and i64 %add_, 9223372036854775792
  ret i64 %and
}
```
The compare instruction is now `%cmp_ = icmp slt i64 %select_, -15`, so when we analyze the `and` we can no longer deduce that the highest bit is zero.
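The sketch below models both forms of the compare on signed 64-bit values. It is only an illustration of my reading of the problem, not of how LLVM's analyses are implemented, and the helper names are made up. For values whose low four bits are zero the two compares agree, and both agree with a plain sign test. Taken on its own, though, "not less than -15" still admits negative values, so the sign bit only follows once the condition is combined with the known low bits.

```python
def to_signed(bits, width=64):
    """Interpret a `width`-bit pattern as a two's-complement signed integer."""
    bits &= (1 << width) - 1
    return bits - (1 << width) if bits >> (width - 1) else bits

def cmp_before(x):
    """Model `icmp slt (or disjoint x, 15), 0` for a signed i64 value x."""
    return to_signed((x & (2**64 - 1)) | 15) < 0

def cmp_after(x):
    """Model `icmp slt x, -15` for a signed i64 value x."""
    return x < -15

# Signed i64 values with bits 0..3 clear, including the extremes.
multiples_of_16 = [-2**63, -4096, -32, -16, 0, 16, 4096, 2**63 - 16]

# Both forms agree with each other and with a sign test on such values.
print(all(cmp_before(x) == cmp_after(x) == (x < 0) for x in multiples_of_16))

# Without the low-bit fact, the false edge of `x < -15` still allows negatives.
negatives_allowed = [v for v in range(-15, 0) if not cmp_after(v)]
print(len(negatives_allowed))
```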
There are a couple of ways to fix this, but I am not sure which one is best. I was thinking we could add some capabilities to `simplifyICmpWithZero`, or maybe to `foldICmpUsingKnownBits`?
@nikic What do you think?
https://github.com/llvm/llvm-project/pull/75899
    
    