[llvm-dev] [RFC] How to manifest information in LLVM-IR, or, revisiting llvm.assume

Wed Dec 18 12:09:47 PST 2019

On 12/18, Michael Kruse wrote:
> On Mon, Dec 16, 2019 at 17:17, Doerfert, Johannes via
> llvm-dev <llvm-dev at lists.llvm.org>:
> > 1) Use named operand bundles to encode information.
> >    If we want to encode property XYZ for a value %V holds at a certain
> >    program point and the property is dependent on %N we could encode
> >    that as:
> >      `llvm.assume() ["XYZ"(%V, %N)]`
> 
> What is the advantage of using operand bundles over directly using
> arguments? That is, why not use
> 
>    call llvm.assume_fn.i32.i32(@llvm.assume.expression_#27, %i, %j)
> 
> Is it to avoid the overloads of llvm.assume_fn? If yes, why are
> overloads of llvm.assume a problem?

The advantage is that we can encode all attributes and other "simple"
properties without outlining, in a form that is very easy to recognize.
While it is not as powerful, I'm also happy if we adopt something like:
  call llvm.assume.pi32(i32* align(8) dereferenceable(16) %ptr)

Purely outlining-based assumptions might also work.

> > 2) Outline assumption expression code (with side-effects).
> >   If we potentially have side-effects, or we simply have a non-trivial
> >   expression that needs to be lowered into instructions, we can
> >   outline the assumption expression code and tie it to the
> >   `llvm.assume` via another operand bundle property. It could look
> >   something like this:
> >     `__builtin_assume(foo(i) == bar(j));`
> >   will cause us to generate
> >     ```
> >     /// Must return true!
> >     static bool llvm.assume.expression_#27(int i, int j) {
> >       return foo(i) == bar(j);
> >     }
> >     ```
> >   and a `llvm.assume` call like this:
> >     `llvm.assume() ["assume_fn"(@llvm.assume.expression_#27, %i, %j)]`
> >   So we generate the expression in a new function which we (only) tie to
> >   the `llvm.assume()` through the "assume_fn" operand bundle. This will
> >   make sure we do not accidentally evaluate the code, or assume it is
> >   evaluated and produced side-effects. We can still optimize the code
> >   and use the information that we learn from it at the `llvm.assume`
> >   site though.
> 
> I like the idea. If emitting assume_fns to the object file is an
> issue, we could introduce a LinkageType that is never emitted to an
> object file (and can only be used in an llvm.assume).

That is a really cool idea.
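
To make the outlining idea in 2) a bit more concrete, here is a minimal
C++ sketch of the intended semantics. Everything in it is an assumption
made for illustration: `assume_fn` is a hypothetical stand-in for an
`llvm.assume` carrying an "assume_fn" bundle, `assume_expression_27`
mirrors `llvm.assume.expression_#27` (which is not a valid C++
identifier), and the bodies of `foo` and `bar` plus the `side_effects`
counter are made up. The point is only that the assume site refers to
the outlined function without ever calling it.

```cpp
#include <cassert>

// Counts observable side effects of the (made-up) callees foo and bar.
static int side_effects = 0;
static int foo(int i) { ++side_effects; return i * 2; }
static int bar(int j) { ++side_effects; return j + j; }

// Outlined assumption expression. It must return true whenever the
// assumption holds, but it is never meant to be executed.
static bool assume_expression_27(int i, int j) { return foo(i) == bar(j); }

// Hypothetical stand-in for `llvm.assume() ["assume_fn"(...)]`: it only
// refers to the outlined function and its arguments; it does not call it.
static void assume_fn(bool (*expr)(int, int), int i, int j) {
  assert(expr != nullptr);
  (void)expr;
  (void)i;
  (void)j;
}

// Code containing the assumption. Its observable behavior is the same as
// if the assume_fn line were deleted.
int use(int i, int j) {
  assume_fn(assume_expression_27, i, j);
  return i + j;
}
```

Calling `use(3, 3)` leaves `side_effects` untouched, whereas calling
`assume_expression_27(3, 3)` directly would bump it twice. That is
exactly the evaluation we want to rule out at the assume site.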

> >  3) Use tokens to mark ranges.
> >    We have tokens which can be used to tie two instructions together,
> >    basically forming a range (with some conditions on the initial CFG).
> >    If we tie two `llvm.assume` calls together we can say that the
> >    information provided by the first holds for any point dominated by it
> >    and post-dominated by the second.
> 
> Is this the same mechanism as for llvm.lifetime.start/end? It may not
> be easy to keep them consistently (post-)dominating each other, see
> http://lists.llvm.org/pipermail/llvm-dev/2017-March/111551.html .

Fair, but I argue that is fine as long as it means we only "lose" information.
We would still have the assume calls, so point-wise information would not
even be lost.
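
As a rough illustration of the range condition in 3), the following C++
sketch checks whether a block is dominated by one block and
post-dominated by another. It is not how LLVM computes dominance; it is
a simple iterative data-flow version on a hypothetical CFG given as
successor lists, and it assumes every block is reachable from the entry
and can reach the exit. All names (`Graph`, `dominators`,
`inAssumeRange`) are invented for this example.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

using Graph = std::vector<std::vector<int>>; // block -> successor blocks

// Reverse every edge. Dominators of the reversed graph (rooted at the
// exit) are the post-dominators of the original graph.
Graph reversed(const Graph &g) {
  Graph r(g.size());
  for (std::size_t from = 0; from < g.size(); ++from)
    for (int to : g[from])
      r[to].push_back(static_cast<int>(from));
  return r;
}

// Iterative data-flow: result[b][d] is true iff d dominates b.
std::vector<std::vector<bool>> dominators(const Graph &g, int root) {
  const std::size_t n = g.size();
  assert(root >= 0 && static_cast<std::size_t>(root) < n);
  const Graph preds = reversed(g);
  // Start from "everything dominates everything" and shrink.
  std::vector<std::vector<bool>> dom(n, std::vector<bool>(n, true));
  dom[root].assign(n, false);
  dom[root][root] = true;
  for (bool changed = true; changed;) {
    changed = false;
    for (std::size_t b = 0; b < n; ++b) {
      if (static_cast<int>(b) == root)
        continue;
      // Intersect the dominator sets of all predecessors, then add b.
      std::vector<bool> next(n, true);
      for (int p : preds[b])
        for (std::size_t d = 0; d < n; ++d)
          next[d] = next[d] && dom[p][d];
      next[b] = true;
      if (next != dom[b]) {
        dom[b] = next;
        changed = true;
      }
    }
  }
  return dom;
}

// True iff `block` is dominated by `begin` and post-dominated by `end`,
// i.e. it lies in the range spanned by the two tied assume calls.
bool inAssumeRange(const Graph &g, int entry, int exit, int begin, int end,
                   int block) {
  return dominators(g, entry)[block][begin] &&
         dominators(reversed(g), exit)[block][end];
}
```

For the diamond `{{1, 5}, {2, 3}, {4}, {4}, {5}, {}}` with the first
assume in block 1 and the second in block 4, blocks 2 and 3 are in the
range. If a transformation adds an early exit edge from block 2 to
block 5, block 2 drops out of the range while block 3 stays in it, which
matches the "we only lose information" argument above.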
