<html><head><style type='text/css'>p { margin: 0; }</style></head><body><div style='font-family: arial,helvetica,sans-serif; font-size: 10pt; color: #000000'><br><hr id="zwchr"><blockquote style="border-left: 2px solid rgb(16, 16, 255); margin-left: 5px; padding-left: 5px; color: rgb(0, 0, 0); font-weight: normal; font-style: normal; text-decoration: none; font-family: Helvetica,Arial,sans-serif; font-size: 12pt;"><b>From: </b>"David Majnemer" <david.majnemer@gmail.com><br><b>To: </b>"Sanjoy Das" <sanjoy@playingwithpointers.com><br><b>Cc: </b>"llvm-dev" <llvm-dev@lists.llvm.org>, "Philip Reames" <listmail@philipreames.com>, "Chandler Carruth" <chandlerc@gmail.com>, "Nick Lewycky" <nlewycky@google.com>, "Hal Finkel" <hfinkel@anl.gov>, "Chen Li" <meloli87@gmail.com>, "Russell Hadley" <rhadley@microsoft.com>, "Kevin Modzelewski" <kmod@dropbox.com>, "Swaroop Sridhar" <Swaroop.Sridhar@microsoft.com>, rudi@dropbox.com, "Pat Gavlin" <pagavlin@microsoft.com>, "Joseph Tremoulet" <jotrem@microsoft.com>, "Reid Kleckner" <rnk@google.com><br><b>Sent: </b>Monday, August 10, 2015 11:38:32 PM<br><b>Subject: </b>Re: RFC: Add "operand bundles" to calls and invokes<br><br><div dir="ltr"><br><div class="gmail_extra"><br><div class="gmail_quote">On Sun, Aug 9, 2015 at 11:32 PM, Sanjoy Das <span dir="ltr"><<a href="mailto:sanjoy@playingwithpointers.com" target="_blank">sanjoy@playingwithpointers.com</a>></span> wrote:<br><blockquote class="gmail_quote" style="margin: 0pt 0pt 0pt 0.8ex; border-left: 1px solid rgb(204, 204, 204); padding-left: 1ex;">We'd like to propose a scheme to attach "operand bundles" to call and<br>

invoke instructions.  This is based on the offline discussion<br>

mentioned in<br>

<a href="http://lists.cs.uiuc.edu/pipermail/llvmdev/2015-July/088748.html" rel="noreferrer" target="_blank">http://lists.cs.uiuc.edu/pipermail/llvmdev/2015-July/088748.html</a>.<br>

<br>

# Motivation & Definition<br>

<br>

Our motivation behind this is to track the state required for<br>

deoptimization (described briefly later) through the LLVM pipeline as<br>

a first-class IR citizen.  We want to do this in a way that is<br>

generally useful.<br>

<br>

An "operand bundle" is a set of SSA values (called "bundle operands")<br>

tagged with a string (called the "bundle tag").  One or more such<br>

bundles may be attached to a call or an invoke.  The intended use of<br>

these values is to support "frame introspection"-like functionality<br>

for managed languages.<br>

<br>

<br>

# Abstract Syntax<br>

<br>

The syntax of a call instruction will be changed to look like this:<br>

<br>

<result> = [tail | musttail] call [cconv] [ret attrs] <ty> [<fnty>*]<br>

    <fnptrval>(<function args>)  [operand_bundle*] [fn attrs]<br>

<br>

where operand_bundle = tag '('[ value ] (',' value )* ')'<br>

      value = normal SSA values<br>

      tag = "< some name >"<br>

<br>

In other words, after the function arguments we now have an optional<br>

list of operand bundles of the form `"< bundle tag >"(bundle<br>

operands...)`.  There can be more than one operand bundle in<br>

a call.  Two operand bundles in the same call instruction cannot have<br>

the same tag.<br>
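
<br>

As a minimal sketch (in Python, not LLVM's C++), the rule that no two bundles on<br>
a call may share a tag could be checked as below.  `OperandBundle`, `CallSite`<br>
and `verify_call_site` are hypothetical names used only for illustration; they<br>
are not part of this proposal or of any LLVM API:<br>
<pre>
```python
from dataclasses import dataclass, field
from typing import List, Tuple


@dataclass
class OperandBundle:
    tag: str                   # the "bundle tag"
    operands: Tuple[str, ...]  # bundle operands, kept as printed SSA values


@dataclass
class CallSite:
    callee: str
    args: Tuple[str, ...]
    bundles: List[OperandBundle] = field(default_factory=list)


def verify_call_site(call: CallSite) -> None:
    """Reject a call that carries two operand bundles with the same tag."""
    seen = set()
    for bundle in call.bundles:
        if bundle.tag in seen:
            raise ValueError(f"duplicate operand bundle tag: {bundle.tag!r}")
        seen.add(bundle.tag)
```
</pre>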

<br>

We'd do something similar for invokes.  I'll omit the invoke syntax<br>

from this RFC to keep things brief.<br>

<br>

An example:<br>

<br>

    define i32 @f(i32 %x) {<br>

     entry:<br>

      %t = add i32 %x, 1<br>

      ret i32 %t<br>

    }<br>

<br>

    define void @g(i16 %val, i8* %ptr) {<br>

     entry:<br>

      call i32 @f(i32 10) "some-bundle"(i32 42) "debug"(i32 100)<br>

      call i32 @f(i32 20) "some-bundle"(i16 %val, i8* %ptr)<br>

      ret void<br>

    }<br>

<br>

Note 1: Operand bundles are *not* part of a function's signature, and<br>

a given function may be called from multiple places with different<br>

kinds of operand bundles.  This reflects the fact that the operand<br>

bundles are conceptually a part of the *call*, not the callee being<br>

dispatched to.<br>

<br>

Note 2: There may be tag specific requirements not mentioned here.<br>

E.g. we may add a rule in the future that says operand bundles with<br>

the tag `"integer-id"` may only contain exactly one constant integer.<br>

<br>

<br>

# IR Semantics<br>

<br>

Bundle operands (SSA values part of some operand bundle) are normal<br>

SSA values.  They need to dominate the call or invoke instruction<br>

they're being passed into and can be optimized as usual.  For<br>

instance, LLVM is allowed (and strongly encouraged!) to PRE / LICM a<br>

load feeding into an operand bundle if legal.<br>

<br>

Operand bundles are characterized by the `"< bundle tag >"` string<br>

associated with them.<br>

<br>

The overall strategy is:<br>

<br>

 1. The semantics are as conservative as is reasonable for operand<br>

    bundles with tags that LLVM does not have a special understanding<br>

    of.  This way LLVM does not miscompile code by default.<br>

<br>

 2. LLVM understands the semantics of operand bundles with certain<br>

    specific tags more precisely, and can optimize them better.<br>

<br>

This RFC talks mainly about (1).  We will discuss (2) as we add smarts<br>

to LLVM about specific kinds of operand bundles.<br>

<br>

The IR-level semantics of an operand bundle with an arbitrary tag are:<br>

<br>

 1. The bundle operands passed in to a call escape in unknown ways<br>

    before transferring control to the callee.  For instance:<br>

<br>

      declare void @opaque_runtime_fn()<br>

<br>

      define void @f(i32* %v) { }<br>

<br>

      define i32 @g() {<br>

        %t = call i32* @malloc(...)<br>

        ;; "unknown" is a tag LLVM does not have any special knowledge of<br>

        call void @f(i32* %t) "unknown"(i32* %t)<br>

<br>

        store i32 42, i32* %t<br>

        call void @opaque_runtime_fn()<br>

        %r = load i32, i32* %t<br>

        ret i32 %r<br>

      }<br>

<br>

    Normally (without the `"unknown"` bundle) it would be okay to<br>

    optimize `@g` to return `42`.  But the `"unknown"` operand bundle<br>

    escapes `%t`, and the call to `@opaque_runtime_fn` can therefore<br>

    modify the location pointed to by `%t`.<br>

<br>

 2. Calls and invokes with operand bundles have unknown read / write<br>

    effect on the heap on entry and exit (even if the call target is<br>

    `readnone` or `readonly`).  For instance:<br>

<br>

      define void @f(i32* %v) { }<br>

<br>

      define i32 @g() {<br>

        %t = call i32* @malloc(...)<br>

        %t.unescaped = call i32* @malloc(...)<br>

        ;; "unknown" is a tag LLVM does not have any special knowledge of<br>

        call void @f(i32* %t) "unknown"(i32* %t)<br>

        %r = load i32, i32* %t<br>

        ret i32 %r<br>

      }<br>

<br>

    Normally it would be okay to optimize `@g` to return `undef`, but<br>

    the `"unknown"` bundle potentially clobbers `%t`.  Note that it<br>

    clobbers `%t` only because it was *also escaped* by the<br>

    `"unknown"` operand bundle -- it does not clobber `%t.unescaped`<br>

    because it isn't reachable from the heap yet.<br>

<br>

    However, it is okay to optimize<br>

<br>

      define void @f(i32* %v) {<br>

        store i32 10, i32* %v<br>

        print(load i32, i32* %v)<br>

      }<br>

<br>

      define void @g() {<br>

        %t = ...<br>

        ;; "unknown" is a tag LLVM does not have any special knowledge of<br>

        call void @f(i32* %t) "unknown"()<br>

      }<br>

<br>

    to<br>

<br>

      define void @f(i32* %v) {<br>

        store i32 10, i32* %v<br>

        print(10)<br>

      }<br>

<br>

      define void @g() {<br>

        %t = ...<br>

        call void @f(i32* %t) "unknown"()<br>

      }<br>

<br>

    The arbitrary heap clobbering only happens on the boundaries of<br>

    the call operation, and therefore we can still do store-load<br>

    forwarding *within* `@f`.<br>

<br>

Since we haven't specified any "pure" LLVM way of accessing the<br>

contents of operand bundles, the client is required to model such<br>

accesses as calls to opaque functions (or inline assembly).  This<br>

ensures that things like IPSCCP work as intended.  E.g. it is legal to<br>

optimize<br>

<br>

   define i32 @f(i32* %v) { ret i32 10 }<br>

<br>

   define void @g() {<br>

     %t = call i32* @malloc(...)<br>

     %v = call i32 @f(i32* %t) "unknown"(i32* %t)<br>

     print(%v)<br>

   }<br>

<br>

to<br>

<br>

   define i32 @f(i32* %v) { ret i32 10 }<br>

<br>

   define void @g() {<br>

     %t = call i32* @malloc(...)<br>

     %v = call i32 @f(i32* %t) "unknown"(i32* %t)<br>

     print(10)<br>

   }<br>

<br>

LLVM won't generally be able to inline through calls and invokes with<br>

operand bundles -- the inliner does not know what to replace the<br>

arbitrary heap accesses implied on function entry and exit with.<br>

However, we intend to teach the inliner to inline through calls /<br>

invokes with some specific kinds of operand bundles.<br>

<br>

<br>

# Lowering<br>

<br>

The lowering strategy will be special cased for each bundle tag.<br>

There won't be any "generic" lowering strategy -- `llc` is expected to<br>

abort if it sees an operand bundle that it does not understand.<br>

<br>

There is no requirement that the operand bundles actually make it to<br>

the backend.  Rewriting the operand bundles into "vanilla" LLVM IR at<br>

some point in the pipeline (instead of teaching codegen to lower them)<br>

is a perfectly reasonable lowering strategy.<br>
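
<br>

To illustrate the "no generic lowering" rule, here is a rough Python sketch of<br>
per-tag dispatch that fails on an unrecognized tag.  `LOWERINGS`,<br>
`lower_bundle` and the `deopt-record(...)` string are hypothetical stand-ins,<br>
not LLVM code; a real implementation would live in codegen or in an IR<br>
rewriting pass:<br>
<pre>
```python
from typing import Callable, Dict, Sequence

# Hypothetical lowering hook: takes the bundle operands and returns a
# textual stand-in for whatever the lowered form would be.
Lowering = Callable[[Sequence[str]], str]

LOWERINGS: Dict[str, Lowering] = {
    # Assumed handler for the "deopt" tag; the output format is made up.
    "deopt": lambda operands: "deopt-record(" + ", ".join(operands) + ")",
}


def lower_bundle(tag: str, operands: Sequence[str]) -> str:
    """Dispatch on the bundle tag; there is deliberately no generic fallback."""
    handler = LOWERINGS.get(tag)
    if handler is None:
        # Mirrors the expectation that llc aborts on bundles it does not understand.
        raise NotImplementedError(f"no lowering for operand bundle tag {tag!r}")
    return handler(operands)
```
</pre>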

<br>

<br>

# Example use cases<br>

<br>

A couple of usage scenarios are very briefly described below:<br>

<br>

## Deoptimization<br>

<br>

This is our motivating use case.  Some managed environments expect to<br>

be able to discover the state of the abstract virtual machine at specific call<br>

sites.  LLVM will be able to support this requirement by attaching a<br>

`"deopt"` operand bundle containing the state of the abstract virtual<br>

machine (as a vector of SSA values) at the appropriate call sites.<br>

There is a straightforward way<br>

to extend the inliner to work with `"deopt"` operand bundles.<br>

<br>

`"deopt"` operand bundles will not have to be as pessimistic about<br>

heap effects as the general "unknown operand bundle" case -- they only<br>

imply a read from the entire heap on function entry or function exit,<br>

depending on what kind of deoptimization state we're interested in.<br>

They also don't imply escaping semantics.<br>

<br>

<br>

## Value injection<br>

<br>

By passing in one or more `alloca`s to an `"injectable-value"` tagged<br>

operand bundle, languages can allow the runtime to overwrite the<br>

values of specific variables, while still preserving a significant<br>

amount of optimization potential.<br>

<br>

<br>

<br>

Thoughts?<br></blockquote><div><br></div><div id="DWT3847">This seems like a pretty useful, generic call-site annotation mechanism. </div></div></div></div></blockquote><br>Agreed. It seems like these would be useful for our existing patchpoints too (to record the live values for the associated stack map, instead of using extra intrinsic arguments for them).<br><br> -Hal<br><blockquote style="border-left: 2px solid rgb(16, 16, 255); margin-left: 5px; padding-left: 5px; color: rgb(0, 0, 0); font-weight: normal; font-style: normal; text-decoration: none; font-family: Helvetica,Arial,sans-serif; font-size: 12pt;"><div dir="ltr"><div class="gmail_extra"><div class="gmail_quote"><div> I believe that this has immediate application outside of the context of GC.</div><div><br></div><div>Our exception handling personality routine has a desire to know whether some code is inside a specific try or catch.  We can feed the value coming out of our EH pad back into the call-site, making it very clear which EH pad the call-site is associated with.</div><div> </div><blockquote class="gmail_quote" style="margin: 0pt 0pt 0pt 0.8ex; border-left: 1px solid rgb(204, 204, 204); padding-left: 1ex;">

<span class="HOEnZb"><font color="#888888">-- Sanjoy<br>

</font></span></blockquote></div><br></div></div>

</blockquote><br><br><br>-- <br><div><span name="x"></span>Hal Finkel<br>Assistant Computational Scientist<br>Leadership Computing Facility<br>Argonne National Laboratory<span name="x"></span><br></div></div></body></html>