[llvm-dev] RFC: Add "operand bundles" to calls and invokes
Philip Reames via llvm-dev
llvm-dev at lists.llvm.org
Wed Aug 19 10:02:29 PDT 2015
On 08/19/2015 01:52 AM, Hal Finkel wrote:
>
> ------------------------------------------------------------------------
>
> *From: *"David Majnemer" <david.majnemer at gmail.com>
> *To: *"Sanjoy Das" <sanjoy at playingwithpointers.com>
> *Cc: *"llvm-dev" <llvm-dev at lists.llvm.org>, "Philip Reames"
> <listmail at philipreames.com>, "Chandler Carruth"
> <chandlerc at gmail.com>, "Nick Lewycky" <nlewycky at google.com>, "Hal
> Finkel" <hfinkel at anl.gov>, "Chen Li" <meloli87 at gmail.com>,
> "Russell Hadley" <rhadley at microsoft.com>, "Kevin Modzelewski"
> <kmod at dropbox.com>, "Swaroop Sridhar"
> <Swaroop.Sridhar at microsoft.com>, rudi at dropbox.com, "Pat Gavlin"
> <pagavlin at microsoft.com>, "Joseph Tremoulet"
> <jotrem at microsoft.com>, "Reid Kleckner" <rnk at google.com>
> *Sent: *Monday, August 10, 2015 11:38:32 PM
> *Subject: *Re: RFC: Add "operand bundles" to calls and invokes
>
>
>
> On Sun, Aug 9, 2015 at 11:32 PM, Sanjoy Das
> <sanjoy at playingwithpointers.com> wrote:
>
> We'd like to propose a scheme to attach "operand bundles" to
> call and
> invoke instructions. This is based on the offline discussion
> mentioned in
> http://lists.cs.uiuc.edu/pipermail/llvmdev/2015-July/088748.html.
>
> # Motivation & Definition
>
> Our motivation behind this is to track the state required for
> deoptimization (described briefly later) through the LLVM pipeline as
> a first-class IR citizen. We want to do this in a way that is
> generally useful.
>
> An "operand bundle" is a set of SSA values (called "bundle
> operands")
> tagged with a string (called the "bundle tag"). One or more
> of such
> bundles may be attached to a call or an invoke. The intended
> use of
> these values is to support "frame introspection"-like
> functionality
> for managed languages.
>
>
> # Abstract Syntax
>
> The syntax of a call instruction will be changed to look like
> this:
>
> <result> = [tail | musttail] call [cconv] [ret attrs] <ty> [<fnty>*]
>     <fnptrval>(<function args>) [operand_bundle*] [fn attrs]
>
> where operand_bundle = tag '(' [ value (',' value)* ] ')'
>       value = normal SSA values
>       tag = "< some name >"
>
> In other words, after the function arguments we now have an optional
> list of operand bundles of the form `"< bundle tag >"(values...)`.
> There can be more than one operand bundle in a call. Two operand
> bundles in the same call instruction cannot have the same tag.
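
To make the grammar concrete, here is a minimal sketch of a parser for the bundle list that follows the function arguments. It is written in Python for brevity and is not LLVM's parser; the name `parse_operand_bundles` is hypothetical, and it assumes values contain no nested parentheses or commas (so constant expressions are out of scope). It enforces the one rule stated above: no two bundles on the same call may share a tag.

```python
import re

# One bundle: a quoted tag followed by a parenthesized, possibly empty,
# comma-separated value list. Nested parentheses are not supported.
_BUNDLE = re.compile(r'\s*"([^"]*)"\(([^()]*)\)')


def parse_operand_bundles(text):
    """Parse a bundle list such as '"a"(i32 1) "b"()'.

    Returns a list of (tag, [value, ...]) pairs in source order.
    Raises ValueError on malformed input or on a repeated tag.
    """
    bundles = []
    seen = set()
    pos = 0
    while text[pos:].strip():
        m = _BUNDLE.match(text, pos)
        if m is None:
            raise ValueError(f"malformed operand bundle at offset {pos}")
        tag, body = m.group(1), m.group(2)
        if tag in seen:
            raise ValueError(f"duplicate bundle tag: {tag!r}")
        seen.add(tag)
        # An empty body means a bundle with no operands, e.g. "unknown"().
        values = [v.strip() for v in body.split(",")] if body.strip() else []
        if any(v == "" for v in values):
            raise ValueError(f"empty value in bundle {tag!r}")
        bundles.append((tag, values))
        pos = m.end()
    return bundles
```

Calling it on `'"some-bundle"(i16 %val, i8* %ptr)'` yields one bundle with two values, while `'"a"() "a"()'` is rejected because the tag repeats.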
>
> We'd do something similar for invokes. I'll omit the invoke
> syntax
> from this RFC to keep things brief.
>
> An example:
>
> define i32 @f(i32 %x) {
> entry:
>   %t = add i32 %x, 1
>   ret i32 %t
> }
>
> define void @g(i16 %val, i8* %ptr) {
> entry:
>   call i32 @f(i32 10) "some-bundle"(i32 42) "debug"(i32 100)
>   call i32 @f(i32 20) "some-bundle"(i16 %val, i8* %ptr)
>   ret void
> }
>
> Note 1: Operand bundles are *not* part of a function's
> signature, and
> a given function may be called from multiple places with different
> kinds of operand bundles. This reflects the fact that the operand
> bundles are conceptually a part of the *call*, not the callee
> being
> dispatched to.
>
> Note 2: There may be tag specific requirements not mentioned here.
> E.g. we may add a rule in the future that says operand bundles
> with
> the tag `"integer-id"` may only contain exactly one constant
> integer.
>
>
> # IR Semantics
>
> Bundle operands (SSA values that are part of some operand bundle) are
> normal SSA values. They need to dominate the call or invoke
> instruction they're being passed into and can be optimized as usual.
> For instance, LLVM is allowed (and strongly encouraged!) to PRE / LICM
> a load feeding into an operand bundle if legal.
>
> Operand bundles are characterized by the `"< bundle tag >"` string
> associated with them.
>
> The overall strategy is:
>
> 1. The semantics are as conservative as is reasonable for operand
> bundles with tags that LLVM does not have a special
> understanding
> of. This way LLVM does not miscompile code by default.
>
> 2. LLVM understands the semantics of operand bundles with certain
> specific tags more precisely, and can optimize them better.
>
> This RFC talks mainly about (1). We will discuss (2) as we
> add smarts
> to LLVM about specific kinds of operand bundles.
>
> The IR-level semantics of an operand bundle with an arbitrary
> tag are:
>
> 1. The bundle operands passed in to a call escape in unknown ways
> before transferring control to the callee. For instance:
>
> declare void @opaque_runtime_fn()
>
> define void @f(i32* %v) { ret void }
>
> define i32 @g() {
>   %t = call i32* @malloc(...)
>   ;; "unknown" is a tag LLVM does not have any special knowledge of
>   call void @f(i32* %t) "unknown"(i32* %t)
>
>   store i32 42, i32* %t
>   call void @opaque_runtime_fn()
>   %r = load i32, i32* %t
>   ret i32 %r
> }
>
> Normally (without the `"unknown"` bundle) it would be okay to
> optimize `@g` to return `42`. But the `"unknown"` operand
> bundle
> escapes `%t`, and the call to `@opaque_runtime_fn` can
> therefore
> modify the location pointed to by `%t`.
>
> 2. Calls and invokes with operand bundles have unknown read /
> write
> effect on the heap on entry and exit (even if the call
> target is
> `readnone` or `readonly`). For instance:
>
> define void @f(i32* %v) { ret void }
>
> define i32 @g() {
>   %t = call i32* @malloc(...)
>   %t.unescaped = call i32* @malloc(...)
>   ;; "unknown" is a tag LLVM does not have any special knowledge of
>   call void @f(i32* %t) "unknown"(i32* %t)
>   %r = load i32, i32* %t
>   ret i32 %r
> }
>
> Normally it would be okay to optimize `@g` to return
> `undef`, but
> the `"unknown"` bundle potentially clobbers `%t`. Note that it
> clobbers `%t` only because it was *also escaped* by the
> `"unknown"` operand bundle -- it does not clobber
> `%t.unescaped`
> because it isn't reachable from the heap yet.
>
> However, it is okay to optimize
>
> define void @f(i32* %v) {
>   store i32 10, i32* %v
>   %x = load i32, i32* %v
>   call void @print(i32 %x)
>   ret void
> }
>
> define void @g() {
>   %t = ...
>   ;; "unknown" is a tag LLVM does not have any special knowledge of
>   call void @f(i32* %t) "unknown"()
>   ret void
> }
>
> to
>
> define void @f(i32* %v) {
>   store i32 10, i32* %v
>   call void @print(i32 10)
>   ret void
> }
>
> define void @g() {
>   %t = ...
>   call void @f(i32* %t) "unknown"()
>   ret void
> }
>
> The arbitrary heap clobbering only happens on the
> boundaries of
> the call operation, and therefore we can still do store-load
> forwarding *within* `@f`.
>
> Since we haven't specified any "pure" LLVM way of accessing the
> contents of operand bundles, the client is required to model such
> accesses as calls to opaque functions (or inline assembly). This
> ensures that things like IPSCCP work as intended. E.g. it is
> legal to
> optimize
>
> define i32 @f(i32* %v) { ret i32 10 }
>
> define void @g() {
>   %t = call i32* @malloc(...)
>   %v = call i32 @f(i32* %t) "unknown"(i32* %t)
>   call void @print(i32 %v)
>   ret void
> }
>
> to
>
> define i32 @f(i32* %v) { ret i32 10 }
>
> define void @g() {
>   %t = call i32* @malloc(...)
>   %v = call i32 @f(i32* %t) "unknown"(i32* %t)
>   call void @print(i32 10)
>   ret void
> }
>
> LLVM won't generally be able to inline through calls and
> invokes with
> operand bundles -- the inliner does not know what to replace the
> arbitrary heap accesses implied on function entry and exit with.
> However, we intend to teach the inliner to inline through calls /
> invokes with some specific kinds of operand bundles.
>
>
> # Lowering
>
> The lowering strategy will be special cased for each bundle tag.
> There won't be any "generic" lowering strategy -- `llc` is
> expected to
> abort if it sees an operand bundle that it does not understand.
>
> There is no requirement that the operand bundles actually make
> it to
> the backend. Rewriting the operand bundles into "vanilla"
> LLVM IR at
> some point in the pipeline (instead of teaching codegen to
> lower them)
> is a perfectly reasonable lowering strategy.
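
One way to picture the "no generic lowering" rule is a per-tag dispatch table that fails hard on anything it does not recognize. The following is only a sketch under assumptions, in Python rather than LLVM's C++: `LOWERINGS`, `lower_deopt`, and `UnknownBundleError` are hypothetical names, and the `"deopt-state"` record is a placeholder for whatever a real lowering would emit.

```python
class UnknownBundleError(Exception):
    """Raised when a bundle tag has no registered lowering."""


def lower_deopt(values):
    # Placeholder lowering: a real one would record these values
    # somewhere the runtime can find them.
    return ("deopt-state", tuple(values))


# Each supported tag gets its own special-cased lowering.
# There is deliberately no fallback entry.
LOWERINGS = {"deopt": lower_deopt}


def lower_bundles(bundles):
    """Lower a list of (tag, values) pairs, aborting on unknown tags."""
    lowered = []
    for tag, values in bundles:
        handler = LOWERINGS.get(tag)
        if handler is None:
            raise UnknownBundleError(
                f"no lowering for operand bundle tag {tag!r}"
            )
        lowered.append(handler(values))
    return lowered
```

A pass that rewrites bundles into plain IR before codegen would simply leave `lower_bundles` with an empty list to process.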
>
>
> # Example use cases
>
> A couple of usage scenarios are very briefly described below:
>
> ## Deoptimization
>
> This is our motivating use case. Some managed environments expect to
> be able to discover the state of the abstract virtual machine at
> specific call sites. LLVM will be able to support this requirement by
> attaching a `"deopt"` operand bundle containing the state of the
> abstract virtual machine (as a vector of SSA values) at the
> appropriate call sites. There is a straightforward way to extend the
> inliner to work with `"deopt"` operand bundles.
>
> `"deopt"` operand bundles will not have to be as pessimistic about
> heap effects as the general "unknown operand bundle" case --
> they only
> imply a read from the entire heap on function entry or
> function exit,
> depending on what kind of deoptimization state we're
> interested in.
> They also don't imply escaping semantics.
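
The contrast between the conservative default and the `"deopt"` case can be written down as a small lookup. This is a rough model in Python, not anything that exists in LLVM; `BundleEffects`, `effects_for`, and `call_effects` are hypothetical names. It covers only the effects implied by the bundles themselves, not those of the callee.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class BundleEffects:
    operands_escape: bool  # bundle operands escape in unknown ways
    reads_heap: bool       # implied heap read at call entry / exit
    writes_heap: bool      # implied heap write at call entry / exit


# Conservative default for tags without special handling.
UNKNOWN = BundleEffects(operands_escape=True, reads_heap=True, writes_heap=True)

# Tags with more precise semantics. Only "deopt" comes from the RFC text:
# it reads the heap, but neither writes it nor escapes its operands.
KNOWN = {
    "deopt": BundleEffects(operands_escape=False, reads_heap=True, writes_heap=False),
}


def effects_for(tag):
    """Effects implied by one bundle; unrecognized tags get the default."""
    return KNOWN.get(tag, UNKNOWN)


def call_effects(tags):
    """Union of the effects implied by every bundle on one call site."""
    effects = [effects_for(t) for t in tags]
    return BundleEffects(
        operands_escape=any(e.operands_escape for e in effects),
        reads_heap=any(e.reads_heap for e in effects),
        writes_heap=any(e.writes_heap for e in effects),
    )
```

A call carrying both a `"deopt"` bundle and a bundle with an unrecognized tag ends up with the fully conservative effects, since the weaker guarantee wins.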
>
>
> ## Value injection
>
> By passing in one or more `alloca`s to an `"injectable-value"`
> tagged
> operand bundle, languages can allow the runtime to overwrite the
> values of specific variables, while still preserving a significant
> amount of optimization potential.
>
>
>
> Thoughts?
>
>
> This seems like a pretty useful, generic call-site annotation mechanism.
>
>
> Agreed. It seems like these would be useful for our existing
> patchpoints too (to record the live values for the associated stack
> map, instead of using extra intrinsic arguments for them).
That's specifically the intent. This mechanism will allow us to work
towards replacing (or at least greatly simplifying) both patchpoints and
statepoints.
>
> -Hal
>
> I believe that this has immediate application outside of the
> context of GC.
>
> Our exception handling personality routine has a desire to know
> whether some code is inside a specific try or catch. We can feed
> the value coming out of our EH pad back into the call-site, making
> it very clear which EH pad the call-site is associated with.
>
> -- Sanjoy
>
>
>
>
>
> --
> Hal Finkel
> Assistant Computational Scientist
> Leadership Computing Facility
> Argonne National Laboratory