<html><head><style type='text/css'>p { margin: 0; }</style></head><body><div style='font-family: arial,helvetica,sans-serif; font-size: 10pt; color: #000000'><br><hr id="zwchr"><blockquote style="border-left: 2px solid rgb(16, 16, 255); margin-left: 5px; padding-left: 5px; color: rgb(0, 0, 0); font-weight: normal; font-style: normal; text-decoration: none; font-family: Helvetica,Arial,sans-serif; font-size: 12pt;"><b>From: </b>"David Majnemer" <david.majnemer@gmail.com><br><b>To: </b>"Sanjoy Das" <sanjoy@playingwithpointers.com><br><b>Cc: </b>"llvm-dev" <llvm-dev@lists.llvm.org>, "Philip Reames" <listmail@philipreames.com>, "Chandler Carruth" <chandlerc@gmail.com>, "Nick Lewycky" <nlewycky@google.com>, "Hal Finkel" <hfinkel@anl.gov>, "Chen Li" <meloli87@gmail.com>, "Russell Hadley" <rhadley@microsoft.com>, "Kevin Modzelewski" <kmod@dropbox.com>, "Swaroop Sridhar" <Swaroop.Sridhar@microsoft.com>, rudi@dropbox.com, "Pat Gavlin" <pagavlin@microsoft.com>, "Joseph Tremoulet" <jotrem@microsoft.com>, "Reid Kleckner" <rnk@google.com><br><b>Sent: </b>Monday, August 10, 2015 11:38:32 PM<br><b>Subject: </b>Re: RFC: Add "operand bundles" to calls and invokes<br><br><div dir="ltr"><br><div class="gmail_extra"><br><div class="gmail_quote">On Sun, Aug 9, 2015 at 11:32 PM, Sanjoy Das <span dir="ltr"><<a href="mailto:sanjoy@playingwithpointers.com" target="_blank">sanjoy@playingwithpointers.com</a>></span> wrote:<br><blockquote class="gmail_quote" style="margin: 0pt 0pt 0pt 0.8ex; border-left: 1px solid rgb(204, 204, 204); padding-left: 1ex;">We'd like to propose a scheme to attach "operand bundles" to call and<br>
invoke instructions. This is based on the offline discussion<br>
mentioned in<br>
<a href="http://lists.cs.uiuc.edu/pipermail/llvmdev/2015-July/088748.html" rel="noreferrer" target="_blank">http://lists.cs.uiuc.edu/pipermail/llvmdev/2015-July/088748.html</a>.<br>
<br>
# Motivation & Definition<br>
<br>
Our motivation behind this is to track the state required for<br>
deoptimization (described briefly later) through the LLVM pipeline as<br>
a first-class IR citizen. We want to do this in a way that is<br>
generally useful.<br>
<br>
An "operand bundle" is a set of SSA values (called "bundle operands")<br>
tagged with a string (called the "bundle tag"). One or more such<br>
bundles may be attached to a call or an invoke. The intended use of<br>
these values is to support "frame introspection"-like functionality<br>
for managed languages.<br>
<br>
<br>
# Abstract Syntax<br>
<br>
The syntax of a call instruction will be changed to look like this:<br>
<br>
<result> = [tail | musttail] call [cconv] [ret attrs] <ty> [<fnty>*]<br>
<fnptrval>(<function args>) [operand_bundle*] [fn attrs]<br>
<br>
where operand_bundle = tag '('[ value ] (',' value )* ')'<br>
value = normal SSA values<br>
tag = "< some name >"<br>
<br>
In other words, after the function arguments we now have an optional<br>
list of operand bundles of the form `"< bundle tag >"(values...)`.<br>
There can be more than one operand bundle in<br>
a call. Two operand bundles in the same call instruction cannot have<br>
the same tag.<br>
<br>
We'd do something similar for invokes. I'll omit the invoke syntax<br>
from this RFC to keep things brief.<br>
<br>
An example:<br>
<br>
define i32 @f(i32 %x) {<br>
entry:<br>
%t = add i32 %x, 1<br>
ret i32 %t<br>
}<br>
<br>
define void @g(i16 %val, i8* %ptr) {<br>
entry:<br>
call i32 @f(i32 10) "some-bundle"(i32 42) "debug"(i32 100)<br>
call i32 @f(i32 20) "some-bundle"(i16 %val, i8* %ptr)<br>
ret void<br>
}<br>
<br>
Note 1: Operand bundles are *not* part of a function's signature, and<br>
a given function may be called from multiple places with different<br>
kinds of operand bundles. This reflects the fact that the operand<br>
bundles are conceptually a part of the *call*, not the callee being<br>
dispatched to.<br>
<br>
Note 2: There may be tag-specific requirements not mentioned here.<br>
E.g. we may add a rule in the future that says operand bundles with<br>
the tag `"integer-id"` may only contain exactly one constant integer.<br>
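<br>
As a sketch of what such a tag-specific rule could mean in practice, written in the syntax proposed above: the `"integer-id"` rule is only the hypothetical from Note 2, and `@callee` and `@caller` are made-up names.<br>
<pre>
```llvm
declare void @callee()

define void @caller(i32 %x) {
entry:
  ;; would satisfy the hypothetical rule: exactly one constant integer
  call void @callee() "integer-id"(i32 7)
  ;; would violate it: the operand is not a constant
  call void @callee() "integer-id"(i32 %x)
  ;; would violate it: two operands
  call void @callee() "integer-id"(i32 7, i32 8)
  ret void
}
```
</pre>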
<br>
<br>
# IR Semantics<br>
<br>
Bundle operands (SSA values that are part of some operand bundle) are normal<br>
SSA values. They need to dominate the call or invoke instruction<br>
they're being passed into and can be optimized as usual. For<br>
instance, LLVM is allowed (and strongly encouraged!) to PRE / LICM a<br>
load feeding into an operand bundle if legal.<br>
<br>
Operand bundles are characterized by the `"< bundle tag >"` string<br>
associated with them.<br>
<br>
The overall strategy is:<br>
<br>
1. The semantics are as conservative as is reasonable for operand<br>
bundles with tags that LLVM does not have a special understanding<br>
of. This way LLVM does not miscompile code by default.<br>
<br>
2. LLVM understands the semantics of operand bundles with certain<br>
specific tags more precisely, and can optimize them better.<br>
<br>
This RFC talks mainly about (1). We will discuss (2) as we add smarts<br>
to LLVM about specific kinds of operand bundles.<br>
<br>
The IR-level semantics of an operand bundle with an arbitrary tag are:<br>
<br>
1. The bundle operands passed in to a call escape in unknown ways<br>
before transferring control to the callee. For instance:<br>
<br>
declare void @opaque_runtime_fn()<br>
<br>
define void @f(i32* %v) { }<br>
<br>
define i32 @g() {<br>
%t = i32* @malloc(...)<br>
;; "unknown" is a tag LLVM does not have any special knowledge of<br>
call void @f(i32* %t) "unknown"(i32* %t)<br>
<br>
store i32 42, i32* %t<br>
call void @opaque_runtime_fn()<br>
ret (load i32, i32* %t)<br>
}<br>
<br>
Normally (without the `"unknown"` bundle) it would be okay to<br>
optimize `@g` to return `42`. But the `"unknown"` operand bundle<br>
escapes `%t`, and the call to `@opaque_runtime_fn` can therefore<br>
modify the location pointed to by `%t`.<br>
<br>
2. Calls and invokes with operand bundles have unknown read / write<br>
effect on the heap on entry and exit (even if the call target is<br>
`readnone` or `readonly`). For instance:<br>
<br>
define void @f(i32* %v) { }<br>
<br>
define i32 @g() {<br>
%t = i32* @malloc(...)<br>
%t.unescaped = i32* @malloc(...)<br>
;; "unknown" is a tag LLVM does not have any special knowledge of<br>
call void @f(i32* %t) "unknown"(i32* %t)<br>
ret (load i32, i32* %t)<br>
}<br>
<br>
Normally it would be okay to optimize `@g` to return `undef`, but<br>
the `"unknown"` bundle potentially clobbers `%t`. Note that it<br>
clobbers `%t` only because it was *also escaped* by the<br>
`"unknown"` operand bundle -- it does not clobber `%t.unescaped`<br>
because it isn't reachable from the heap yet.<br>
<br>
However, it is okay to optimize<br>
<br>
define void @f(i32* %v) {<br>
store i32 10, i32* %v<br>
print(load i32, i32* %v)<br>
}<br>
<br>
define void @g() {<br>
%t = ...<br>
;; "unknown" is a tag LLVM does not have any special knowledge of<br>
call void @f(i32* %t) "unknown"()<br>
}<br>
<br>
to<br>
<br>
define void @f(i32* %v) {<br>
store i32 10, i32* %v<br>
print(10)<br>
}<br>
<br>
define void @g() {<br>
%t = ...<br>
call void @f(i32* %t) "unknown"()<br>
}<br>
<br>
The arbitrary heap clobbering only happens on the boundaries of<br>
the call operation, and therefore we can still do store-load<br>
forwarding *within* `@f`.<br>
<br>
Since we haven't specified any "pure" LLVM way of accessing the<br>
contents of operand bundles, the client is required to model such<br>
accesses as calls to opaque functions (or inline assembly). This<br>
ensures that things like IPSCCP work as intended. E.g. it is legal to<br>
optimize<br>
<br>
define i32 @f(i32* %v) { ret i32 10 }<br>
<br>
define void @g() {<br>
%t = i32* @malloc(...)<br>
%v = call i32 @f(i32* %t) "unknown"(i32* %t)<br>
print(%v)<br>
}<br>
<br>
to<br>
<br>
define i32 @f(i32* %v) { ret i32 10 }<br>
<br>
define void @g() {<br>
%t = i32* @malloc(...)<br>
%v = call i32 @f(i32* %t) "unknown"(i32* %t)<br>
print(10)<br>
}<br>
<br>
LLVM won't generally be able to inline through calls and invokes with<br>
operand bundles -- the inliner does not know what to replace the<br>
arbitrary heap accesses implied on function entry and exit with.<br>
However, we intend to teach the inliner to inline through calls /<br>
invokes with some specific kinds of operand bundles.<br>
<br>
<br>
# Lowering<br>
<br>
The lowering strategy will be special cased for each bundle tag.<br>
There won't be any "generic" lowering strategy -- `llc` is expected to<br>
abort if it sees an operand bundle that it does not understand.<br>
<br>
There is no requirement that the operand bundles actually make it to<br>
the backend. Rewriting the operand bundles into "vanilla" LLVM IR at<br>
some point in the pipeline (instead of teaching codegen to lower them)<br>
is a perfectly reasonable lowering strategy.<br>
<br>
<br>
# Example use cases<br>
<br>
A couple of usage scenarios are very briefly described below:<br>
<br>
## Deoptimization<br>
<br>
This is our motivating use case. Some managed environments expect to<br>
be able to discover the state of the abstract virtual machine at specific call<br>
sites. LLVM will be able to support this requirement by attaching a<br>
`"deopt"` operand bundle containing the state of the abstract virtual<br>
machine (as a vector of SSA values) at the appropriate call sites.<br>
There is a straightforward way<br>
to extend the inliner to work with `"deopt"` operand bundles.<br>
<br>
`"deopt"` operand bundles will not have to be as pessimistic about<br>
heap effects as the general "unknown operand bundle" case -- they only<br>
imply a read from the entire heap on function entry or function exit,<br>
depending on what kind of deoptimization state we're interested in.<br>
They also don't imply escaping semantics.<br>
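<br>
A rough sketch of such a call site in the syntax proposed above. The layout of the bundle (a bytecode index followed by live values), the constant `12`, and the names `@h` and `@may_deoptimize` are assumptions made for illustration, not part of this proposal:<br>
<pre>
```llvm
declare void @may_deoptimize()

define i32 @h(i32 %a, i32 %b) {
entry:
  %sum = add i32 %a, %b
  ;; hypothetical abstract VM state: bytecode index 12, then two live values
  call void @may_deoptimize() "deopt"(i32 12, i32 %a, i32 %sum)
  ret i32 %sum
}
```
</pre>
The bundle operands here are ordinary SSA values, so `%sum` can still be optimized as usual before it reaches the call.<br>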
<br>
<br>
## Value injection<br>
<br>
By passing in one or more `alloca`s to an `"injectable-value"` tagged<br>
operand bundle, languages can allow the runtime to overwrite the<br>
values of specific variables, while still preserving a significant<br>
amount of optimization potential.<br>
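<br>
A minimal sketch of what this might look like, in the syntax proposed above. The exact semantics of `"injectable-value"` are not specified in this RFC, and `@k`, `@runtime_hook` and `@use` are hypothetical names:<br>
<pre>
```llvm
declare void @runtime_hook()
declare void @use(i32)

define void @k() {
entry:
  %x = alloca i32
  store i32 1, i32* %x
  ;; the runtime may overwrite the variable stored in %x during this call
  call void @runtime_hook() "injectable-value"(i32* %x)
  ;; so this load cannot simply be folded to 1
  %v = load i32, i32* %x
  call void @use(i32 %v)
  ret void
}
```
</pre>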
<br>
<br>
<br>
Thoughts?<br></blockquote><div><br></div><div id="DWT3847">This seems like a pretty useful, generic call-site annotation mechanism. </div></div></div></div></blockquote><br>Agreed. It seems like these would be useful for our existing patchpoints too (to record the live values for the associated stack map, instead of using extra intrinsic arguments for them).<br><br> -Hal<br><blockquote style="border-left: 2px solid rgb(16, 16, 255); margin-left: 5px; padding-left: 5px; color: rgb(0, 0, 0); font-weight: normal; font-style: normal; text-decoration: none; font-family: Helvetica,Arial,sans-serif; font-size: 12pt;"><div dir="ltr"><div class="gmail_extra"><div class="gmail_quote"><div> I believe that this has immediate application outside of the context of GC.</div><div><br></div><div>Our exception handling personality routine has a desire to know whether some code is inside a specific try or catch. We can feed the value coming out of our EH pad back into the call-site, making it very clear which EH pad the call-site is associated with.</div><div> </div><blockquote class="gmail_quote" style="margin: 0pt 0pt 0pt 0.8ex; border-left: 1px solid rgb(204, 204, 204); padding-left: 1ex;">
<span class="HOEnZb"><font color="#888888">-- Sanjoy<br>
</font></span></blockquote></div><br></div></div>
</blockquote><br><br><br>-- <br><div><span name="x"></span>Hal Finkel<br>Assistant Computational Scientist<br>Leadership Computing Facility<br>Argonne National Laboratory<span name="x"></span><br></div></div></body></html>