[llvm-dev] RFC: Add "operand bundles" to calls and invokes

Wed Aug 12 12:24:26 PDT 2015

On 08/09/2015 08:32 PM, Sanjoy Das wrote:
> We'd like to propose a scheme to attach "operand bundles" to call and
> invoke instructions.  This is based on the offline discussion
> mentioned in
> http://lists.cs.uiuc.edu/pipermail/llvmdev/2015-July/088748.html.
I'm (obviously) in support of the overall proposal.  :)  A few details 
below.
>
> # Motivation & Definition
>
> Our motivation behind this is to track the state required for
> deoptimization (described briefly later) through the LLVM pipeline as
> a first-class IR citizen.  We want to do this in a way that is
> generally useful.
>
> An "operand bundle" is a set of SSA values (called "bundle operands")
> tagged with a string (called the "bundle tag").  One or more of such
> bundles may be attached to a call or an invoke.  The intended use of
> these values is to support "frame introspection"-like functionality
> for managed languages.
>
>
> # Abstract Syntax
>
> The syntax of a call instruction will be changed to look like this:
>
> <result> = [tail | musttail] call [cconv] [ret attrs] <ty> [<fnty>*]
>      <fnptrval>(<function args>)  [operand_bundle*] [fn attrs]
>
> where operand_bundle = tag '(' [ value (',' value)* ] ')'
>        value = normal SSA values
>        tag = "< some name >"
tag needs to be "some string name" or <future keyword>.  We also need to 
be clear about what the compatibility guarantees are. If I remember 
correctly, we discussed something along the following:
- string bundle names are entirely version locked to a particular revision 
of LLVM.  They are for experimentation and incremental development.  
There is no attempt to forward serialize them.  In particular, using a 
string name which is out of sync with the version of LLVM can result in 
miscompiles.
- keyword bundle names become first class parts of the IR, they are 
forward serialized, and fully supported.  Obviously, getting an 
experimental string bundle name promoted to a first class keyword bundle 
will require broad discussion and buy in.

We were deliberately trying to parallel the de facto policy around 
attributes vs string-attributes.
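
To make the grammar concrete, here is a minimal Python sketch of a parser for the bundle list as quoted above.  It is only an illustration: it handles string tags only (no keyword tags), assumes operands contain no nested parentheses, and the names `OperandBundle` and `parse_bundles` are made up for this example and are not part of any LLVM API.

```python
import re
from dataclasses import dataclass
from typing import List


@dataclass
class OperandBundle:
    """Hypothetical model of one bundle: a string tag plus its operands."""
    tag: str
    operands: List[str]


# Matches  "tag"(operand, operand, ...)  with no nested parentheses.
_BUNDLE_RE = re.compile(r'"([^"]*)"\s*\(([^()]*)\)')


def parse_bundles(text: str) -> List[OperandBundle]:
    """Parse the whitespace-separated bundle list that trails the call arguments."""
    text = text.strip()
    bundles = []
    pos = 0
    while pos < len(text):
        m = _BUNDLE_RE.match(text, pos)
        if m is None:
            raise ValueError(f"malformed operand bundle at offset {pos}")
        # An empty operand list, as in "unknown"(), is allowed.
        operands = [v.strip() for v in m.group(2).split(",") if v.strip()]
        bundles.append(OperandBundle(m.group(1), operands))
        pos = m.end()
        # Skip whitespace between consecutive bundles.
        while pos < len(text) and text[pos].isspace():
            pos += 1
    return bundles
```

Feeding it the bundle list from the first call in the RFC's example, `"some-bundle"(i32 42) "debug"(i32 100)`, yields two bundles with one operand each.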
>
> In other words, after the function arguments we now have an optional
> list of operand bundles of the form `"< bundle tag >"(bundle
> attributes, values...)`.  There can be more than one operand bundle in
> a call.  Two operand bundles in the same call instruction cannot have
> the same tag.
I don't think we need that last sentence.  It should be up to the bundle 
implementation if that's legal or not.  I don't have a strong preference 
here and we could easily relax this later.
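
As a rough sketch of what a per-tag policy could look like (in Python, with a hypothetical `duplicate_tags` helper and `allow_repeats` parameter that exist only for this example), a verifier check might reject repeated tags by default while letting individual bundle implementations opt out:

```python
from collections import Counter
from typing import AbstractSet, Iterable, List


def duplicate_tags(tags: Iterable[str],
                   allow_repeats: AbstractSet[str] = frozenset()) -> List[str]:
    """Return the tags that appear more than once and are not allowed to repeat."""
    counts = Counter(tags)
    return sorted(t for t, n in counts.items()
                  if n > 1 and t not in allow_repeats)
```

With an empty `allow_repeats` this is the rule as written in the RFC; relaxing it later only means adding tags to that set.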
>
> We'd do something similar for invokes.  I'll omit the invoke syntax
> from this RFC to keep things brief.
>
> An example:
>
>      define i32 @f(i32 %x) {
>       entry:
>        %t = add i32 %x, 1
>        ret i32 %t
>      }
>
>      define void @g(i16 %val, i8* %ptr) {
>       entry:
>        call i32 @f(i32 10) "some-bundle"(i32 42) "debug"(i32 100)
>        call i32 @f(i32 20) "some-bundle"(i16 %val, i8* %ptr)
>        ret void
>      }
>
> Note 1: Operand bundles are *not* part of a function's signature, and
> a given function may be called from multiple places with different
> kinds of operand bundles.  This reflects the fact that the operand
> bundles are conceptually a part of the *call*, not the callee being
> dispatched to.
>
> Note 2: There may be tag specific requirements not mentioned here.
> E.g. we may add a rule in the future that says operand bundles with
> the tag `"integer-id"` may only contain exactly one constant integer.
>
>
> # IR Semantics
>
> Bundle operands (SSA values part of some operand bundle) are normal
> SSA values.  They need to dominate the call or invoke instruction
> they're being passed into and can be optimized as usual.  For
> instance, LLVM is allowed (and strongly encouraged!) to PRE / LICM a
> load feeding into an operand bundle if legal.
>
> Operand bundles are characterized by the `"< bundle tag >"` string
> associated with them.
>
> The overall strategy is:
>
>   1. The semantics are as conservative as is reasonable for operand
>      bundles with tags that LLVM does not have a special understanding
>      of.  This way LLVM does not miscompile code by default.
>
>   2. LLVM understands the semantics of operand bundles with certain
>      specific tags more precisely, and can optimize them better.
>
> This RFC talks mainly about (1).  We will discuss (2) as we add smarts
> to LLVM about specific kinds of operand bundles.
>
> The IR-level semantics of an operand bundle with an arbitrary tag are:
>
>   1. The bundle operands passed in to a call escape in unknown ways
>      before transferring control to the callee.  For instance:
>
>        declare void @opaque_runtime_fn()
>
>        define void @f(i32* %v) { }
>
>        define i32 @g() {
>          %t = i32* @malloc(...)
>          ;; "unknown" is a tag LLVM does not have any special knowledge of
>          call void @f(i32* %t) "unknown"(i32* %t)
>
>          store i32 42, i32* %t
>          call void @opaque_runtime_fn();
>          ret (load i32, i32* %t)
>        }
>
>      Normally (without the `"unknown"` bundle) it would be okay to
>      optimize `@g` to return `42`.  But the `"unknown"` operand bundle
>      escapes `%t`, and the call to `@opaque_runtime_fn` can therefore
>      modify the location pointed to by `%t`.
>
>   2. Calls and invokes with operand bundles have unknown read / write
>      effect on the heap on entry and exit (even if the call target is
>      `readnone` or `readonly`).  For instance:
I don't think we actually need this.  I think it would be perfectly fine 
to require that the frontend ensure the called function is not readonly 
if its being readonly would be problematic for the call site.  I'm not 
really opposed to this generalization - I could see it being useful - 
but I'm worried about the amount of work involved.  A *lot* of the 
optimizer assumes that attributes on a call site are strictly less 
conservative than those on the underlying function.  Changing that could 
have a long bug tail.  I'd rather defer that work until someone defines an 
operand bundle type which requires it.  The motivating example 
(deoptimization) doesn't seem to require this.
>
>        define void @f(i32* %v) { }
>
>        define i32 @g() {
>          %t = i32* @malloc(...)
>          %t.unescaped = i32* @malloc(...)
>          ;; "unknown" is a tag LLVM does not have any special knowledge of
>          call void @f(i32* %t) "unknown"(i32* %t)
>          ret (load i32, i32* %t)
>        }
>
>      Normally it would be okay to optimize `@g` to return `undef`, but
>      the `"unknown"` bundle potentially clobbers `%t`.  Note that it
>      clobbers `%t` only because it was *also escaped* by the
>      `"unknown"` operand bundle -- it does not clobber `%t.unescaped`
>      because it isn't reachable from the heap yet.
>
>      However, it is okay to optimize
>
>        define void @f(i32* %v) {
>          store i32 10, i32* %v
>          print(load i32, i32* %v)
>        }
>
>        define void @g() {
>          %t = ...
>          ;; "unknown" is a tag LLVM does not have any special knowledge of
>          call void @f(i32* %t) "unknown"()
>        }
>
>      to
>
>        define void @f(i32* %v) {
>          store i32 10, i32* %v
>          print(10)
>        }
>
>        define void @g() {
>          %t = ...
>          call void @f(i32* %t) "unknown"()
>        }
>
>      The arbitrary heap clobbering only happens on the boundaries of
>      the call operation, and therefore we can still do store-load
>      forwarding *within* `@f`.
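
The two rules above can be restated as a toy model.  The Python sketch below is an assumption-laden illustration, not LLVM's alias analysis: `CallSite`, `bundle_operands` and `may_clobber` are hypothetical names, pointers are plain strings, and the callee is summarized by a single attribute string.

```python
from dataclasses import dataclass
from typing import FrozenSet, Tuple


@dataclass(frozen=True)
class CallSite:
    callee_attr: str                      # "readnone", "readonly", or "" for neither
    args: Tuple[str, ...]                 # pointer arguments, by name
    bundles: Tuple[Tuple[str, Tuple[str, ...]], ...] = ()  # (tag, operands) pairs


def bundle_operands(call: CallSite) -> FrozenSet[str]:
    """Rule 1: every bundle operand of an unknown tag escapes at the call."""
    return frozenset(op for _tag, ops in call.bundles for op in ops)


def may_clobber(call: CallSite, ptr: str,
                escaped: FrozenSet[str] = frozenset()) -> bool:
    """Can this call modify the memory that `ptr` points to?"""
    # Rule 2: a call with bundles may read or write any escaped location,
    # regardless of readnone / readonly on the callee.
    if call.bundles and (ptr in escaped or ptr in bundle_operands(call)):
        return True
    # Without that, fall back to what the callee itself is allowed to do.
    callee_may_write = call.callee_attr not in ("readnone", "readonly")
    return callee_may_write and (ptr in call.args or ptr in escaped)
```

For the second example, modelling `@f` as `readnone` with `%t` both passed as an argument and listed in the `"unknown"` bundle, the model reports that `%t` may be clobbered while `%t.unescaped` may not.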
>
> Since we haven't specified any "pure" LLVM way of accessing the
> contents of operand bundles, the client is required to model such
> accesses as calls to opaque functions (or inline assembly).
I'm a bit confused by this section.  By "client" do you mean frontend?  
And what are you trying to allow in the second sentence? The first 
sentence seems sufficient.
> This
> ensures that things like IPSCCP work as intended.  E.g. it is legal to
> optimize
>
>     define i32 @f(i32* %v) { ret i32 10 }
>
>     define void @g() {
>       %t = i32* @malloc(...)
>       %v = call i32 @f(i32* %t) "unknown"(i32* %t)
>       print(%v)
>     }
>
> to
>
>     define i32 @f(i32* %v) { ret i32 10 }
>
>     define void @g() {
>       %t = i32* @malloc(...)
>       %v = call i32 @f(i32* %t) "unknown"(i32* %t)
>       print(10)
>     }
To say this differently, an operand bundle at a call site cannot change 
the implementation of the called function.  This is not a mechanism for 
function interposition.
>
> LLVM won't generally be able to inline through calls and invokes with
> operand bundles -- the inliner does not know what to replace the
> arbitrary heap accesses implied on function entry and exit with.
> However, we intend to teach the inliner to inline through calls /
> invokes with some specific kinds of operand bundles.
>
>
> # Lowering
>
> The lowering strategy will be special cased for each bundle tag.
> There won't be any "generic" lowering strategy -- `llc` is expected to
> abort if it sees an operand bundle that it does not understand.
>
> There is no requirement that the operand bundles actually make it to
> the backend.  Rewriting the operand bundles into "vanilla" LLVM IR at
> some point in the pipeline (instead of teaching codegen to lower them)
> is a perfectly reasonable lowering strategy.
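
A minimal Python sketch of the "no generic lowering" rule, assuming a hypothetical table of per-tag handlers (`lower_call`, `UnknownBundleError` and the handler signature are invented for illustration and do not correspond to anything in `llc`):

```python
from typing import Callable, Dict, List, Tuple

Bundle = Tuple[str, List[str]]            # (tag, operands)
Lowering = Callable[[List[str]], str]     # operands -> lowered form


class UnknownBundleError(Exception):
    """Raised when no lowering is registered for a bundle tag."""


def lower_call(bundles: List[Bundle], lowerings: Dict[str, Lowering]) -> List[str]:
    """Lower each bundle with its tag-specific handler; abort on unknown tags."""
    lowered = []
    for tag, operands in bundles:
        handler = lowerings.get(tag)
        if handler is None:
            # There is deliberately no fallback: unknown tags are a hard error.
            raise UnknownBundleError(f"cannot lower operand bundle with tag {tag!r}")
        lowered.append(handler(operands))
    return lowered
```

An earlier pass that rewrites bundles into plain IR would simply leave `bundles` empty by the time this runs, so nothing reaches the error path.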
>
>
> # Example use cases
>
> A couple of usage scenarios are very briefly described below:
>
> ## Deoptimization
>
> This is our motivating use case.  Some managed environments expect to
> be able to discover the state of the abstract virtual machine at
> specific call sites.  LLVM will be able to support this requirement by
> attaching a `"deopt"` operand bundle containing the state of the
> abstract virtual machine (as a vector of SSA values) at the
> appropriate call sites.  There is a straightforward way to extend the
> inliner to work with `"deopt"` operand bundles.
>
> `"deopt"` operand bundles will not have to be as pessimistic about
> heap effects as the general "unknown operand bundle" case -- they only
> imply a read from the entire heap on function entry or function exit,
> depending on what kind of deoptimization state we're interested in.
> They also don't imply escaping semantics.
An alternate framing here which would remove the attribute case I was 
worried about would be to separate the memory and abstract state 
semantics of deoptimization.  If the deopt bundle only described the 
abstract state and it was up to the frontend to ensure the callee was at 
least readonly, we wouldn't need to model memory in the deopt bundle.  I 
think that's a much better starting place.
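
On the inliner point in the quoted text: the RFC does not spell out what the "straightforward" extension is.  One plausible reading, and it is only an assumption here, is that when a call carrying a `"deopt"` bundle is inlined, every call in the inlined body gets the caller's deopt state prepended to its own.  A Python sketch with a hypothetical `merge_deopt_state` helper:

```python
from typing import List


def merge_deopt_state(caller_state: List[str], callee_state: List[str]) -> List[str]:
    """Deopt operands for a call in the inlined body: caller frame first, then callee frame."""
    # Assumption: abstract frames are laid out outermost-first in the bundle.
    return list(caller_state) + list(callee_state)
```

Under this reading the bundle describes only abstract state, which fits the framing above where memory effects are left to the callee's attributes.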
>
>
> ## Value injection
>
> By passing in one or more `alloca`s to an `"injectable-value"` tagged
> operand bundle, languages can allow the runtime to overwrite the
> values of specific variables, while still preserving a significant
> amount of optimization potential.
To be clear, this was intended to model use cases like Python's ability 
to inject values into caller frames.
>
>
>
> Thoughts?
> -- Sanjoy