[llvm] [ValueTracking] Extend LHS/RHS with matching operand to work without constants. (PR #85557)

via llvm-commits llvm-commits at lists.llvm.org
Tue Mar 19 11:45:33 PDT 2024


goldsteinn wrote:

> This patch seems to block SROA: [dtcxzyw/llvm-opt-benchmark#419 (comment)](https://github.com/dtcxzyw/llvm-opt-benchmark/pull/419#discussion_r1527863982).

Seems to boil down to simplifications happening earlier.

A reduced form:
```
define void @fun0() {
entry:
  %first111 = alloca [0 x [0 x [0 x ptr]]], i32 0, align 8
  store i64 0, ptr %first111, align 8
  %last = getelementptr i8, ptr %first111, i64 8
  call void @fun3(ptr %first111, ptr %last)
  ret void
}

define void @fun3(ptr %first, ptr %last, ptr %p_in) {
entry:
  %sub.ptr.lhs.cast = ptrtoint ptr %last to i64
  %sub.ptr.rhs.cast = ptrtoint ptr %first to i64
  %sub.ptr.sub = sub i64 %sub.ptr.lhs.cast, %sub.ptr.rhs.cast
  %call = ashr exact i64 %sub.ptr.sub, 3
  %call2 = load volatile i64, ptr %p_in, align 8
  %cmp = icmp ugt i64 %call, %call2
  br i1 %cmp, label %common.ret, label %if.else

common.ret:                                       ; preds = %if.else29, %if.else, %entry
  ret void

if.else:                                          ; preds = %entry
  %c_load.cast0.i = ptrtoint ptr %last to i64
  %c_load.cast.div0.i = ashr exact i64 %c_load.cast0.i, 3
  %cmp24.not = icmp ult i64 %c_load.cast.div0.i, %call
  br i1 %cmp24.not, label %if.else29, label %common.ret

if.else29:                                        ; preds = %if.else
  %n_is_c = call i1 @llvm.is.constant.i64(i64 %c_load.cast.div0.i)
  %cmp2 = icmp eq i64 %c_load.cast.div0.i, -1
  %or.cond1 = and i1 %n_is_c, %cmp2
  %add.ptr = getelementptr i64, ptr %first, i64 %c_load.cast.div0.i
  %.pre = ptrtoint ptr %add.ptr to i64
  %ptr.lhs.pre-phi = select i1 %or.cond1, i64 0, i64 %.pre
  %ptr.sub = sub i64 %ptr.lhs.pre-phi, %sub.ptr.rhs.cast
  call void @llvm.memmove.p0.p0.i64(ptr null, ptr %first, i64 %ptr.sub, i1 false)
  br label %common.ret
}

; Function Attrs: nocallback nofree nounwind willreturn memory(argmem: readwrite)
declare void @llvm.memmove.p0.p0.i64(ptr nocapture writeonly, ptr nocapture readonly, i64, i1 immarg) #0

; Function Attrs: convergent nocallback nofree nosync nounwind willreturn memory(none)
declare i1 @llvm.is.constant.i64(i64) #1

attributes #0 = { nocallback nofree nounwind willreturn memory(argmem: readwrite) }
attributes #1 = { convergent nocallback nofree nosync nounwind willreturn memory(none) }
```

Where we go awry is when we fold:
```
  %c_load.cast.div0.i = ashr exact i64 %c_load.cast0.i, 3
  %cmp24.not = icmp ult i64 %c_load.cast.div0.i, %call
```
->
```
  %cmp24.not = icmp ugt i64 %sub.ptr.sub, %c_load.cast0.i
```
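
In isolation, that matching-operand fold looks roughly like the following (a minimal sketch with hypothetical function names; both operands are `ashr exact` by the same amount, so comparing the shifted values is equivalent to comparing the unshifted ones):
```
define i1 @src(i64 %x, i64 %y) {
  ; exact implies the low 3 bits of %x and %y are zero, so the
  ; shifted comparison is equivalent to the unshifted one.
  %xs = ashr exact i64 %x, 3
  %ys = ashr exact i64 %y, 3
  %cmp = icmp ult i64 %xs, %ys
  ret i1 %cmp
}

define i1 @tgt(i64 %x, i64 %y) {
  ; The fold itself is sound, but the exact flags (and with them the
  ; low-bits-are-zero fact) disappear from the IR.
  %cmp = icmp ult i64 %x, %y
  ret i1 %cmp
}
```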

This eventually results in the following diff after inlining:
```
  %c_load.cast.div0.i.i = ashr exact i64 %c_load.cast0.i.i, 3
  %cmp24.not.i = icmp ult i64 %c_load.cast.div0.i.i, 1
```
vs
```
  %cmp24.not.i = icmp ugt i64 8, %c_load.cast0.i.i
```

Then finally:
```
  %cmp24.not.i = icmp eq ptr %c_load0.i.i, null
```
vs

```
  %cmp24.not.i = icmp ult ptr %c_load0.i.i, inttoptr (i64 8 to ptr)
```

Essentially, we throw away the information that the low 3 bits of the pointer are zero
before we have enough information to fully reduce the compare to something easy to
analyze.
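
A minimal sketch of what gets lost (hypothetical function names; the constants mirror the reduced case above):
```
define i1 @with_low_bits_known(i64 %x) {
  ; ashr exact implies the low 3 bits of %x are zero, so
  ; "%div ult 1" (i.e. %div == 0) implies %x == 0.
  %div = ashr exact i64 %x, 3
  %cmp = icmp ult i64 %div, 1
  ret i1 %cmp
}

define i1 @after_early_fold(i64 %x) {
  ; Once the fold fires first, we are left with "%x ult 8"; without the
  ; alignment fact this can no longer be narrowed to "%x == 0".
  %cmp = icmp ult i64 %x, 8
  ret i1 %cmp
}
```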

Looking into a fix...

https://github.com/llvm/llvm-project/pull/85557

