[llvm-dev] RFC: Add "operand bundles" to calls and invokes

Sun Aug 9 20:32:05 PDT 2015

We'd like to propose a scheme to attach "operand bundles" to call and
invoke instructions.  This is based on the offline discussion
mentioned in
http://lists.cs.uiuc.edu/pipermail/llvmdev/2015-July/088748.html.

# Motivation & Definition

Our motivation behind this is to track the state required for
deoptimization (described briefly later) through the LLVM pipeline as
a first-class IR citizen.  We want to do this in a way that is
generally useful.

An "operand bundle" is a set of SSA values (called "bundle operands")
tagged with a string (called the "bundle tag").  One or more such
bundles may be attached to a call or an invoke.  The intended use of
these values is to support "frame introspection"-like functionality
for managed languages.

# Abstract Syntax

The syntax of a call instruction will be changed to look like this:

<result> = [tail | musttail] call [cconv] [ret attrs] <ty> [<fnty>*]
    <fnptrval>(<function args>)  [operand_bundle*] [fn attrs]

where operand_bundle = tag '('[ value ] (',' value )* ')'
      value = normal SSA values
      tag = "< some name >"

In other words, after the function arguments we now have an optional
list of operand bundles, each of the form `"< bundle tag >"(values...)`.
There can be more than one operand bundle in a call.  Two operand
bundles in the same call instruction cannot have the same tag.
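
As a rough sketch of these rules in the proposed syntax (the names
`@callee` and `@tags`, and the tags `"first"` and `"second"`, are made
up for illustration and carry no special meaning):

```llvm
declare void @callee(i32)

define void @tags(i32 %a, i32 %b) {
entry:
  ; Well-formed: two bundles with two distinct tags.
  call void @callee(i32 0) "first"(i32 %a) "second"(i32 %b)

  ; Well-formed: per the grammar above, a bundle may have no operands.
  call void @callee(i32 1) "first"()

  ; Ill-formed under the uniqueness rule: the tag "first" appears twice.
  ;   call void @callee(i32 2) "first"(i32 %a) "first"(i32 %b)
  ret void
}
```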

We'd do something similar for invokes.  I'll omit the invoke syntax
from this RFC to keep things brief.

An example:

    define i32 @f(i32 %x) {
     entry:
      %t = add i32 %x, 1
      ret i32 %t
    }

    define void @g(i16 %val, i8* %ptr) {
     entry:
      call i32 @f(i32 10) "some-bundle"(i32 42) "debug"(i32 100)
      call i32 @f(i32 20) "some-bundle"(i16 %val, i8* %ptr)
      ret void
    }

Note 1: Operand bundles are *not* part of a function's signature, and
a given function may be called from multiple places with different
kinds of operand bundles.  This reflects the fact that the operand
bundles are conceptually a part of the *call*, not the callee being
dispatched to.

Note 2: There may be tag-specific requirements not mentioned here.
E.g. we may add a rule in the future that says operand bundles with
the tag `"integer-id"` must contain exactly one constant integer.

# IR Semantics

Bundle operands (SSA values that are part of some operand bundle) are
normal SSA values.  They need to dominate the call or invoke
instruction they're being passed into and can be optimized as usual.
For instance, LLVM is allowed (and strongly encouraged!) to PRE / LICM
a load feeding into an operand bundle if legal.
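
To illustrate the LICM case, here is a minimal before / after sketch
in the proposed syntax.  The names `@limit`, `@callee`, `@before` and
`@after` are hypothetical.  The sketch assumes `@limit` is constant
memory, so that the heap clobber implied by the `"unknown"` bundle
cannot change the loaded value and the hoist is legal:

```llvm
@limit = external constant i32

declare void @callee(i32)

; Before LICM: the load feeding the bundle runs on every iteration.
define void @before(i32 %n) {
entry:
  br label %loop

loop:
  %i = phi i32 [ 0, %entry ], [ %i.next, %loop ]
  %v = load i32, i32* @limit
  call void @callee(i32 %i) "unknown"(i32 %v)
  %i.next = add i32 %i, 1
  %done = icmp eq i32 %i.next, %n
  br i1 %done, label %exit, label %loop

exit:
  ret void
}

; After LICM: the load is hoisted out of the loop.  %v still dominates
; the call, so it remains a valid bundle operand.
define void @after(i32 %n) {
entry:
  %v = load i32, i32* @limit
  br label %loop

loop:
  %i = phi i32 [ 0, %entry ], [ %i.next, %loop ]
  call void @callee(i32 %i) "unknown"(i32 %v)
  %i.next = add i32 %i, 1
  %done = icmp eq i32 %i.next, %n
  br i1 %done, label %exit, label %loop

exit:
  ret void
}
```

If `@limit` were ordinary escaped heap memory instead, the calls in the
loop could clobber it and the load would have to stay where it is.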

Operand bundles are characterized by the `"< bundle tag >"` string
associated with them.

The overall strategy is:

 1. The semantics are as conservative as is reasonable for operand
    bundles with tags that LLVM does not have a special understanding
    of.  This way LLVM does not miscompile code by default.

 2. LLVM understands the semantics of operand bundles with certain
    specific tags more precisely, and can optimize them better.

This RFC talks mainly about (1).  We will discuss (2) as we add smarts
to LLVM about specific kinds of operand bundles.

The IR-level semantics of an operand bundle with an arbitrary tag are:

 1. The bundle operands passed in to a call escape in unknown ways
    before transferring control to the callee.  For instance:

      declare void @opaque_runtime_fn()

      define void @f(i32* %v) { }

      define i32 @g() {
        %t = call i32* @malloc(...)
        ;; "unknown" is a tag LLVM does not have any special knowledge of
        call void @f(i32* %t) "unknown"(i32* %t)

        store i32 42, i32* %t
        call void @opaque_runtime_fn()
        %r = load i32, i32* %t
        ret i32 %r
      }

    Normally (without the `"unknown"` bundle) it would be okay to
    optimize `@g` to return `42`.  But the `"unknown"` operand bundle
    escapes `%t`, and the call to `@opaque_runtime_fn` can therefore
    modify the location pointed to by `%t`.

 2. Calls and invokes with operand bundles have unknown read / write
    effect on the heap on entry and exit (even if the call target is
    `readnone` or `readonly`).  For instance:

      define void @f(i32* %v) { }

      define i32 @g() {
        %t = call i32* @malloc(...)
        %t.unescaped = call i32* @malloc(...)
        ;; "unknown" is a tag LLVM does not have any special knowledge of
        call void @f(i32* %t) "unknown"(i32* %t)
        %r = load i32, i32* %t
        ret i32 %r
      }

    Normally it would be okay to optimize `@g` to return `undef`, but
    the `"unknown"` bundle potentially clobbers `%t`.  Note that it
    clobbers `%t` only because it was *also escaped* by the
    `"unknown"` operand bundle -- it does not clobber `%t.unescaped`
    because it isn't reachable from the heap yet.

    However, it is okay to optimize

      define void @f(i32* %v) {
        store i32 10, i32* %v
        print(load i32, i32* %v)
      }

      define void @g() {
        %t = ...
        ;; "unknown" is a tag LLVM does not have any special knowledge of
        call void @f(i32* %t) "unknown"()
      }

    to

      define void @f(i32* %v) {
        store i32 10, i32* %v
        print(10)
      }

      define void @g() {
        %t = ...
        call void @f(i32* %t) "unknown"()
      }

    The arbitrary heap clobbering only happens on the boundaries of
    the call operation, and therefore we can still do store-load
    forwarding *within* `@f`.

Since we haven't specified any "pure" LLVM way of accessing the
contents of operand bundles, the client is required to model such
accesses as calls to opaque functions (or inline assembly).  This
ensures that things like IPSCCP work as intended.  E.g. it is legal to
optimize

   define i32 @f(i32* %v) { ret i32 10 }

   define void @g() {
     %t = call i32* @malloc(...)
     %v = call i32 @f(i32* %t) "unknown"(i32* %t)
     print(%v)
   }

to

   define i32 @f(i32* %v) { ret i32 10 }

   define void @g() {
     %t = call i32* @malloc(...)
     %v = call i32 @f(i32* %t) "unknown"(i32* %t)
     print(10)
   }

LLVM won't generally be able to inline through calls and invokes with
operand bundles -- the inliner does not know what to replace the
arbitrary heap accesses implied on function entry and exit with.
However, we intend to teach the inliner to inline through calls /
invokes with some specific kinds of operand bundles.

# Lowering

The lowering strategy will be special cased for each bundle tag.
There won't be any "generic" lowering strategy -- `llc` is expected to
abort if it sees an operand bundle that it does not understand.

There is no requirement that the operand bundles actually make it to
the backend.  Rewriting the operand bundles into "vanilla" LLVM IR at
some point in the pipeline (instead of teaching codegen to lower them)
is a perfectly reasonable lowering strategy.

# Example use cases

A couple of usage scenarios are very briefly described below:

## Deoptimization

This is our motivating use case.  Some managed environments expect to
be able to discover the state of the abstract virtual machine at
specific call sites.  LLVM will be able to support this requirement by
attaching a `"deopt"` operand bundle containing the state of the
abstract virtual machine (as a vector of SSA values) at the
appropriate call sites.  There is a straightforward way to extend the
inliner to work with `"deopt"` operand bundles.

`"deopt"` operand bundles will not have to be as pessimistic about
heap effects as the general "unknown operand bundle" case -- they only
imply a read from the entire heap on function entry or function exit,
depending on what kind of deoptimization state we're interested in.
They also don't imply escaping semantics.
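
A minimal sketch of what such a call site might look like in the
proposed syntax.  The function names and the layout of the bundle (a
bytecode offset followed by the live locals) are assumptions made for
illustration; this RFC does not fix how a frontend encodes the
abstract state:

```llvm
declare void @runtime_call()

define i32 @compiled_method(i32 %a, i32 %b) {
entry:
  %sum = add i32 %a, %b
  ; Hypothetical encoding of the abstract VM state at this call site:
  ; bytecode offset 7, then the values of three live locals.
  call void @runtime_call() "deopt"(i32 7, i32 %a, i32 %b, i32 %sum)
  ret i32 %sum
}
```

The bundle operands are ordinary SSA values, so `%sum` can be computed
by whatever optimized code LLVM produces, as long as the value is
available at the call.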

## Value injection

By passing in one or more `alloca`s to an `"injectable-value"` tagged
operand bundle, languages can allow the runtime to overwrite the
values of specific variables, while still preserving a significant
amount of optimization potential.
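
A rough sketch of how this could look in the proposed syntax.  The
names `@safepoint` and `@h` are hypothetical, and the exact semantics
of the `"injectable-value"` tag are not specified in this RFC; the
sketch only assumes the runtime may write through the `alloca` during
the call:

```llvm
declare void @safepoint()

define i32 @h(i32 %init, i32 %other) {
entry:
  %x = alloca i32
  store i32 %init, i32* %x
  ; The runtime may overwrite the variable held in %x during this call.
  call void @safepoint() "injectable-value"(i32* %x)
  ; The value must be reloaded here; forwarding %init from the store
  ; above would ignore an injected value.
  %x.val = load i32, i32* %x
  ; %other is not in the bundle, so it stays a plain SSA value.
  %r = add i32 %x.val, %other
  ret i32 %r
}
```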

Thoughts?
-- Sanjoy