[llvm-dev] [RFC] How to manifest information in LLVM-IR, or, revisiting llvm.assume

Wed Dec 18 11:59:26 PST 2019

On Mon, Dec 16, 2019 at 17:17, Doerfert, Johannes via
llvm-dev <llvm-dev at lists.llvm.org> wrote:
> 1) Use named operand bundles to encode information.
>    If we want to encode that property XYZ holds for a value %V at a
>    certain program point, and the property depends on %N, we could
>    encode that as:
>      `llvm.assume() ["XYZ"(%V, %N)]`

What is the advantage of using operand bundles over directly using
arguments? That is, why not use

   call llvm.assume_fn.i32.i32(@llvm.assume.expression_#27, %i, %j)

Is it to avoid the overloads of llvm.assume_fn? If yes, why are
overloads of llvm.assume a problem?
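
To make the overload question concrete, here is a minimal sketch of how
the argument-based form multiplies declarations. It assumes the overload
name is formed by appending one ".<type>" suffix per argument, as in the
llvm.assume_fn.i32.i32 example above. overloadName is a hypothetical
helper for illustration only, not an LLVM API.

```cpp
#include <string>
#include <vector>

// Hypothetical helper, not part of LLVM: builds the name of one overload
// by appending a ".<type>" suffix for each argument type.
std::string overloadName(const std::string &base,
                         const std::vector<std::string> &argTypes) {
  std::string name = base;
  for (const std::string &ty : argTypes) {
    name += '.';  // separator before each type suffix
    name += ty;
  }
  return name;
}
```

Under that assumption, every distinct argument signature yields a
distinct name and therefore needs its own declaration in the module,
whereas the operand-bundle form keeps a single llvm.assume callee and
carries the values in the bundle.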

> 2) Outline assumption expression code (with side-effects).
>   If we potentially have side-effects, or we simply have a non-trivial
>   expression that requires to be lowered into instructions, we can
>   outline the assumption expression code and tie it to the
>   `llvm.assume` via another operand bundle property. It could look
>   something like this:
>     `__builtin_assume(foo(i) == bar(j));`
>   will cause us to generate
>     ```
>     /// Must return true!
>     static bool llvm.assume.expression_#27(int i, int j) {
>       return foo(i) == bar(j);
>     }
>     ```
>   and a `llvm.assume` call like this:
>     `llvm.assume() ["assume_fn"(@llvm.assume.expression_#27, %i, %j)]`
>   So we generate the expression in a new function which we (only) tie to
>   the `llvm.assume()` through the "assume_fn" operand bundle. This will
>   make sure we do not accidentally evaluate the code, or assume it is
>   evaluated and produced side-effects. We can still optimize the code
>   and use the information that we learn from it at the `llvm.assume`
>   site though.

I like the idea. If emitting assume_fns to the object file is an
issue, we could introduce a LinkageType that is never emitted to an
object file (and can only be used in an llvm.assume).

>  3) Use tokens to mark ranges.
>    We have tokens which can be used to tie two instructions together,
>    basically forming a range (with some conditions on the initial CFG).
>    If we tie two `llvm.assume` calls together we can say that the
>    information provided by the first holds for any point dominated by it
>    and post-dominated by the second.

Is this the same mechanism as for llvm.lifetime.start/end? It may not
be easy to keep them consistently (post-)dominating each other, see
http://lists.llvm.org/pipermail/llvm-dev/2017-March/111551.html .

Michael