[llvm-dev] RFC: Add "operand bundles" to calls and invokes

Sanjoy Das via llvm-dev llvm-dev at lists.llvm.org
Wed Aug 12 14:58:28 PDT 2015


> tag needs to be "some string name" or <future keyword>.  We also need to be
> clear about what the compatibility guarantees are. If I remember correctly,
> we discussed something along the following:
> - string bundle names are entirely version locked to particular revision of
> LLVM.  They are for experimentation and incremental development.  There is
> no attempt to forward serialize them.  In particular, using a string name
> which is out of sync with the version of LLVM can result in miscompiles.
> - keyword bundle names become first class parts of the IR, they are forward
> serialized, and fully supported.  Obviously, getting an experimental string
> bundle name promoted to a first class keyword bundle will require broad
> discussion and buy in.
>
> We were deliberately trying to parallel the defacto policy around attributes
> vs string-attributes.

Agreed.

>> In other words, after the function arguments we now have an optional
>> list of operand bundles of the form `"< bundle tag >"(bundle
>> attributes, values...)`.  There can be more than one operand bundle in
>> a call.  Two operand bundles in the same call instruction cannot have
>> the same tag.
>
> I don't think we need that last sentence.  It should be up to the bundle
> implementation if that's legal or not.  I don't have a strong preference
> here and we could easily relax this later.

I'll remove the restriction.  I think it is reasonable to have this
decided per bundle type, as you suggested.
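
For illustration, with that restriction dropped, the proposed syntax
would accept something like the following (the tags and operands here
are made up), and whether repeating a tag is meaningful would be up to
that bundle type's implementation:

  call void @func(i32 %x) "tag-a"(i32 42) "tag-a"(i32 %y) "tag-b"(float 1.0)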

>>   2. Calls and invokes with operand bundles have unknown read / write
>>      effect on the heap on entry and exit (even if the call target is
>>      `readnone` or `readonly`).  For instance:
>
> I don't think we actually need this.  I think it would be perfectly fine to
> require that the frontend ensure the called function is not readonly if its
> being readonly would be problematic for the call site.  I'm not really
> opposed to this generalization - I could see it being useful - but I'm
> worried about the amount of work involved.  A *lot* of the optimizer assumes
> that attributes on a call site are strictly less conservative than the
> underlying function.  Changing that could have a long bug tail.  I'd rather
> defer that work until someone defines an operand bundle type which requires
> it.  The motivating example (deoptimization) doesn't seem to require this.

If we're doing late poll placement and if certain functions are
"frameless" in the abstract machine, then we will need this for
deoptimization.

The case I'm thinking of is:

  define void @foo() {
   ;; Can be just about any kind of uncounted loop that is readnone
   entry:
    br label %inf_loop

   inf_loop:
    br label %inf_loop
  }

  define void @caller() {
   entry:
    store i32 42, i32* @global
    call void @foo() "deopt"(i32 100)
    store i32 46, i32* @global
    ret void
  }

Right now `@foo` is `readnone`, so the first store of `i32 42` can be
DSE'ed.  However, if we later insert a poll inside `@foo`, that poll
will have to be given a JVM state, which we can no longer do soundly,
since a store that the abstract machine would have performed has been
elided.
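
To make that concrete, here is a rough sketch of what `@foo` could
look like after late poll insertion (the `@safepoint_poll` callee and
the bundle contents are illustrative, not part of this proposal):

  define void @foo() {
   entry:
    br label %inf_loop

   inf_loop:
    ;; This poll may deoptimize @caller, so it needs an abstract machine
    ;; state in which @global already holds 42 -- but DSE has deleted the
    ;; store that would have made the physical heap agree with that state.
    call void @safepoint_poll() "deopt"( ... )
    br label %inf_loop
  }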

[ moved here, because this is related ]
>> `"deopt"` operand bundles will not have to be as pessimistic about
>> heap effects as the general "unknown operand bundle" case -- they only
>> imply a read from the entire heap on function entry or function exit,
>> depending on what kind of deoptimization state we're interested in.
>> They also don't imply escaping semantics.
>
> An alternate framing here which would remove the attribute case I was
> worried about about would be to separate the memory and abstract state
> semantics of deoptimization.  If the deopt bundle only described the
> abstract state and it was up to the frontend to ensure the callee was at
> least readonly, we wouldn't need to model memory in the deopt bundle.  I
> think that's a much better starting place.

Semantically, I think we need the state of the heap to be consistent
at method call boundaries, not within a method boundary.  For
instance, consider this:

  ;; @global is 0 to start with

  define void @f() readonly {
    ;; do whatever
    call void @read_only_safepoint_poll() readonly "deopt"( ... deopt state local to @f ...)
  }

  define void @g() {
    call void @f() "deopt"( ... deopt state local to @g ...)
    if (*@global == 42) { side_effect(); }
    store i32 42, i32* @global
  }

If we do not have the reads-everything-on-exit property, then this is
a valid transform:

  define void @f() readonly {
    ;; do whatever
    call void @read_only_safepoint_poll() readonly "deopt"( ... deopt state local to @f ...)
    if (*@global == 42) { side_effect(); }
    store i32 42, i32* @global
  }

  define void @g() {
    call void @f() "deopt"( ... deopt state local to @g ...)
  }

If we *don't* inline `@f` into `@g`, and `@f` wants to deoptimize `@g`
(and only `@g`) after halting the thread at
`read_only_safepoint_poll`, we're in trouble.  `@f` will execute the
store to `@global` before returning, and the deoptimized `@g` will
call `side_effect` when it shouldn't have.  (Note: I put the `if
(*@global == 42)` to make the problem more obvious, but in practice I
think doing the same store twice is also problematic).  Another way to
state this is that even though the state of the heap was consistent at
the call to `read_only_safepoint_poll`, it will not be consistent when
`@f` returns.  Therefore we cannot use a "deopt `@g` on return with
vmstate xyz" scheme, unless we model the operand bundle as reading the
entire heap on return of `@f` (this would force the state of the heap
to be consistent at the point where we actually use the vmstate).
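
One way to picture the reads-everything-on-exit property (purely a
mental model, not proposed IR; `@read_whole_heap` is made up) is that
the call in `@g` behaves as if it were immediately followed by a read
of the entire heap:

  define void @g() {
    call void @f() "deopt"( ... deopt state local to @g ...)
    ;; As if the bundle read every location here, at the point @f
    ;; returns; the store of 42 below cannot be moved up into @f past
    ;; this read, so the heap stays consistent with @g's vmstate.
    call void @read_whole_heap()
    if (*@global == 42) { side_effect(); }
    store i32 42, i32* @global
  }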

There is an analogous case where we have to model the deopt operand
bundle as reads-everything-on-entry: when we deoptimize on entry.
IOW, something like this:

  ; @global starts off as 0

  define void @side_exit() readonly {
    call void @deoptimize_my_caller()
    return
  }

  define void @store_field(ref) {
   (*@global)++;
 lbl:
   if (ref == nullptr) {
     call void @side_exit() ;; vm_state = at label lbl
     unreachable
   } else {
     ref->field = 42;
   }
  }

could be transformed to

  define void @side_exit() readonly {
    (*@global)++;
    call void @deoptimize_my_caller()
    return
  }

  define void @store_field(ref) {
 lbl:
   if (ref == nullptr) {
     call void @side_exit() ;; vm_state = at label lbl
     unreachable
   } else {
     (*@global)++;
     ref->field = 42;
   }
  }

Now if `ref` is null and we do not inline `@side_exit` then we will
end up incrementing `@global` twice.

In practice I think we can work around these issues by marking
`@side_exit` and `@f` as external, so that inter-procedural code
motion does not happen, but

 a. that would be a workaround; the semantic issues would still exist, and
 b. LLVM is still free to specialize external functions.

As a meta point, I think the right way to view operand bundles is as
something that *happens* before and after a call / invoke, not as a
set of values being passed around.  For that reason, do you think they
should be renamed to something else?

>> Since we haven't specified any "pure" LLVM way of accessing the
>> contents of operand bundles, the client is required to model such
>> accesses as calls to opaque functions (or inline assembly).
>
> I'm a bit confused by this section.  By "client" do you mean frontend?  And
> what are you trying to allow in the second sentence? The first sentence
> seems sufficient.
>>
>> This
>> ensures that things like IPSCCP work as intended.  E.g. it is legal to
>> optimize
>>
> To say this differently, an operand bundle at a call site cannot change the
> implementation of the called function.  This is not a mechanism for function
> interposition.

I was really trying to say "whatever the optimizer directly
understands about the IR is correct", so you're right, this is about
disallowing arbitrary function interposition.
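
To sketch what that means (my own illustrative example, not the one
elided from the RFC text above): because a bundle cannot change the
callee's implementation, IPSCCP-style reasoning about the callee
remains valid at bundled call sites:

  define i32 @callee() {
   entry:
    ret i32 10
  }

  define i32 @caller() {
   entry:
    ;; The "deopt" bundle cannot interpose on @callee, so it is still
    ;; correct to conclude that %v is 10 here.
    %v = call i32 @callee() "deopt"(i32 0)
    ret i32 %v
  }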

-- Sanjoy

