[LLVMdev] `llvm.$op.with.overflow`, InstCombine and ScalarEvolution

Thu Mar 26 14:54:21 PDT 2015

I've run into cases where, because not all of LLVM's optimizations
understand the semantics of the `llvm.$op.with.overflow` intrinsics,
canonicalizing compares to `llvm.$op.with.overflow` ends up preventing
optimization.

For instance, running the following snippet through `opt -indvars`
optimizes `%to.optimize` to `true`, but running it through
`opt -instcombine -indvars` does not.

```
declare void @side_effect(i1)

define void @foo(i8 %x, i8 %y) {
 entry:
  %sum = add i8 %x, %y
  %e = icmp ugt i8 %x, %sum
  br i1 %e, label %exit, label %loop

 loop:
  %idx = phi i8 [ %x, %entry ], [ %idx.inc, %loop ]
  %idx.inc = add i8 %idx, 1
  %to.optimize = icmp ule i8 %idx, %sum
  call void @side_effect(i1 %to.optimize)
  %c = icmp ule i8 %idx.inc, %sum
  br i1 %c, label %loop, label %exit

 exit:
  ret void
}
```

This happens because `-instcombine` does the following transform:

```
entry:
  %uadd = call { i8, i1 } @llvm.uadd.with.overflow.i8(i8 %x, i8 %y)
  %0 = extractvalue { i8, i1 } %uadd, 0
  %e = extractvalue { i8, i1 } %uadd, 1
  br i1 %e, label %exit, label %loop.preheader
```

and ScalarEvolution can no longer see through the `extractvalue` of
the call to `llvm.uadd.with.overflow.i8`, so it loses the fact that
`%x` is `ule` `%sum` on entry to the loop.

The right way to fix this depends on the intended purpose of the
`llvm.$op.with.overflow` intrinsics.  Three solutions I can think of:

 * if they're a convenience useful only for better codegen, can the
   transform that instcombine is doing currently (transforming
   compares to `extractvalue` of a `.with.overflow` intrinsic) be
   moved to CodeGenPrep?

 * if they're something more fundamental, then maybe they should be
   promoted to an instruction?  They've been around since at least
   LLVM 2.6 as far as I can tell.  Personally, I seriously doubt this
   is a good idea, given that the semantics of these intrinsics can be
   completely described by a DAG composed of existing instructions.

 * add rules to ScalarEvolution to have it understand these intrinsics
   (or maybe even expand them before `-indvars` -- I think
   `-liv-reduce` tries to do this in some cases), but I'd vote for
   keeping this complexity out of ScalarEvolution unless there are
   good reasons why the `.with.overflow` calls need to be introduced
   before codegenprep.
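
To make the "DAG composed of existing instructions" point concrete,
here is a rough sketch of how the unsigned and signed add variants
could be written without the intrinsics.  The function names
`@uadd_expanded` and `@sadd_expanded` are made up for illustration and
are not part of LLVM; this is not necessarily the form any existing
pass would emit.

```
; Hypothetical helpers, for illustration only.

define { i8, i1 } @uadd_expanded(i8 %x, i8 %y) {
  %sum = add i8 %x, %y
  ; Unsigned wrap happened iff the result is smaller than an operand.
  ; This is the same predicate as `icmp ugt i8 %x, %sum` above.
  %ov = icmp ult i8 %sum, %x
  %r0 = insertvalue { i8, i1 } undef, i8 %sum, 0
  %r1 = insertvalue { i8, i1 } %r0, i1 %ov, 1
  ret { i8, i1 } %r1
}

define { i8, i1 } @sadd_expanded(i8 %x, i8 %y) {
  %sum = add i8 %x, %y
  ; Signed wrap happened iff the result's sign differs from the sign
  ; of both operands, i.e. the sign bit of (x ^ sum) & (y ^ sum) is set.
  %xs = xor i8 %x, %sum
  %ys = xor i8 %y, %sum
  %both = and i8 %xs, %ys
  %ov = icmp slt i8 %both, 0
  %r0 = insertvalue { i8, i1 } undef, i8 %sum, 0
  %r1 = insertvalue { i8, i1 } %r0, i1 %ov, 1
  ret { i8, i1 } %r1
}
```

In this form the overflow bit is an ordinary `icmp`, which is the
kind of thing ScalarEvolution already reasons about in the first
snippet.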

What do you think?

-- Sanjoy