[llvm-dev] RFC: Add guard intrinsics to LLVM

Sanjoy Das via llvm-dev llvm-dev at lists.llvm.org
Wed Feb 17 16:41:44 PST 2016


Replies inline.

At a high level, it feels like we'll eventually need a new instruction
to represent the kind of control flow a guard entails (to be clear: we
should probably still start with an intrinsic) -- they are fairly
well-behaved, i.e. readonly, nounwind etc. as far as the immediate
"physical" caller is concerned, but not so as far as its caller's
callers are concerned.

On Wed, Feb 17, 2016 at 3:40 PM, Philip Reames
<listmail at philipreames.com> wrote:

>> one very important difference -- `@llvm.guard_on(i1 <false>)` is well
>> defined (and not UB).  `@llvm.guard_on` on a false predicate bails to
>> the interpreter and that is always safe (but slow), and so
>> `@llvm.guard_on(i1 false)` is basically a `noreturn` call that
>> unconditionally transitions the current compilation to the
>> interpreter.
>
> It's also worth noting that @llvm.guard_on(i1 true) is useful and well
> defined as well.  When lowering, such a guard simply disappears, but it can
> be useful to keep around in the IR during optimization.  It gives a well
> defined point for widening transforms to apply with a well known
> deoptimization state already available.

Yes!  Actually, I had exactly this in an earlier version of this
writeup, which I removed to make the (already long) RFC shorter.
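For concreteness, a widening transform could fold a dominated check
into an earlier guard, leaving a trivially-true guard behind to anchor
the deoptimization state (sketch only; the "deopt" state is elided):

```
  ;; before widening
  call void @llvm.guard_on(i1 %a) [ "deopt"(...) ]
  ...
  call void @llvm.guard_on(i1 %b) [ "deopt"(...) ]

  ;; after widening %b into the first guard
  %wide = and i1 %a, %b
  call void @llvm.guard_on(i1 %wide) [ "deopt"(...) ]
  ...
  call void @llvm.guard_on(i1 true) [ "deopt"(...) ]
```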

> I'd suggest a small change to Sanjoy's declaration.  I think we should allow
> additional arguments to the guard, not just the condition.  What exactly
> those arguments mean would be up to the runtime, but various runtimes might
> want to provide additional arguments to the OSR mechanism.

We'll still have to make a call on the signature of the intrinsic (or
are you suggesting a varargs intrinsic?).

I suppose we could also have a family of intrinsics, each taking one
argument of variable type.
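E.g. something like (hypothetical names, following the usual naming
convention for overloaded intrinsics):

```
  declare void @llvm.guard_on.i32(i1 %pred, i32 %arg)
  declare void @llvm.guard_on.i64(i1 %pred, i64 %arg)
  declare void @llvm.guard_on.p0i8(i1 %pred, i8* %arg)
```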

>> Bailing out to the interpreter involves re-creating the state of the
>> interpreter frames as-if the compilee had been executing in the
>> interpreter all along.  This state is represented and maintained using
>> a `"deopt"` operand bundle attached to the call to `@llvm.guard_on`.
>> The verifier will reject calls to `@llvm.guard_on` without a `"deopt"`
>> operand bundle.
>
> This introduces a very subtle point.  The side exit only effects the
> *function* which contains the guard.  A caller of that function in the same
> module may be returned to by either the function itself, or the interpreter
> after running the continuation implied by the guard.  This introduces a
> complication for IPA/IPO; any guard (really, any side exit, of which guards
> are one form) has to be treated as a possible return point from the callee
> with an unknowable return value and memory state.

This is a really good point.  It has strong implications for the
guard's memory effects as well -- even though a guard can be seen as
readonly in its containing function, things that call the containing
function have to see the guard as read-write.  IOW @foo below is
read/write, even though v0 can be forwarded to v1:

```
int @foo(int cond) {
  int v0 = this->field;
  guard_on(cond);          // readonly as far as @foo is concerned
  int v1 = this->field;    // v0 can be forwarded to v1
  return v0 + v1;
}
```

As you point out, we're also introducing a newish kind of control flow
here.  It is not fundamentally new, since longjmp does something
similar (but not quite the same).

I hate to say this, but perhaps we're really looking at (eventually) a
new instruction here, and not just a new intrinsic.

>> `@llvm.guard_on` cannot be `invoke`ed (that would be
>> meaningless anyway, since the method it would've "thrown" into is
>> about to go away).
>
> I disagree with this bit.  It needlessly complicates the inliner. Allowing
> an invoke of a guard which SimplifyCFG converts to calls just like it would
> a nothrow function seems much cleaner.

SGTM.

>> The observable behavior of `@llvm.guard_on` is specified as:
>>
>> ```
>>    void @llvm.guard_on(i1 %pred) {
>>    entry:
>>      %unknown_cond = < unknown source >
>>      %cond = and i1 %unknown_cond, %pred
>>      br i1 %cond, label %left, label %right
>>
>>    left:
>>      call void @bail_to_interpreter() [ "deopt"() ] noreturn
>>      unreachable
>>
>>    right:
>>      ret void
>>    }
>> ```
>>
>> So, precisely speaking, `@llvm.guard_on` is guaranteed to bail to the
>> interpreter if `%pred` is false, but it **may** bail to the
>> interpreter if `%pred` is true.  It is this bit that lets us soundly
>> widen `%pred`, since all we're doing is "refining" `< unknown source >`.
>
> Unless I'm misreading this, it looks like Sanjoy got the IR in the
> specification wrong.  The intrinsic is specified to side exit if the
> condition is false (so that it's true in the caller after the guard), not
> the other way around.  The text description appears correct.

Yes, and thanks for catching that.  The branch should have been "br
i1 %cond, label %right, label %left".
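That is, the relevant part of the specification should read:

```
      %cond = and i1 %unknown_cond, %pred
      br i1 %cond, label %right, label %left
```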

>> `@bail_to_interpreter` does not return to the current compilation, but
>> it returns to the `"deopt"` continuation that it has been given (once
>> inlined, the empty `"deopt"()` continuation will be fixed up to have
>> the right continuation).
>
> This "bail_to_interpreter" is the more general form of side exit I
> mentioned above.

How is it more general?

> As a starting point, I'd likely do this just before code gen prep with some
> custom sinking logic to pull instructions only used on the failing path into
> the newly introduced block.  Long term, we'd probably want to do the same
> thing over MI, but mucking with control flow at that layer is a bit more
> complicated.
>
> Rather than framing this as inlining, I'd frame it as expansion to a well
> known body (pretty much the one Sanjoy gives above).  The
> @bail_to_interpreter construct could be lowered directly to a function call
> to some well known symbol name (which JIT users can intercept and bind to
> whatever they want.)  Something like __llvm_side_exit seems like a
> reasonable choice.

SGTM.
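So, to spell out what I think you're proposing: a guard like

```
  call void @llvm.guard_on(i1 %p) [ "deopt"(i32 %x) ]
```

would get expanded late (just before codegen, as you suggest) into
explicit control flow, with the failing path calling the well known
symbol (sketch; symbol name per your suggestion):

```
  br i1 %p, label %cont, label %side_exit

side_exit:
  call void @__llvm_side_exit() [ "deopt"(i32 %x) ] noreturn
  unreachable

cont:
  ...
```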

>> with this one having the semantics that it always throws an exception
>> if `%predicate` fails.  Only the non-widening optimizations for
>> `@llvm.guard_on` will apply to `@llvm.exception_on`.
>
> Not sure we actually need this.  A valid implementation of the side exit
> handler (which would do the OSR for us) is to call a runtime routine which
> generates and throws the exception.  The only bit we might need is a
> distinction between widenable and non-widenable guards.

Yes, and that's the only distinction I'm trying to make here.

>> ## memory effects (unresolved)
>>
>> [I haven't come up with a good model for the memory effects of
>>   `@llvm.guard_on`, suggestions are very welcome.]
>>
>> I'd really like to model `@llvm.guard_on` as a readonly function,
>> since it does not write to memory if it returns; and e.g. forwarding
>> loads across a call to `@llvm.guard_on` should be legal.
>>
>> However, I'm in a quandary around representing the "may never return"
>> aspect of `@llvm.guard_on`: I have to make it illegal to, say, hoist a
>> load from `%ptr` across a guard on `%ptr != null`.
>
> Modeling this as memory dependence just seems wrong.  We already have to
> model control dependence on functions which may throw.  I don't think
> there's anything new here.

I am trying to model this as control dependence, but the difficult bit
is to do that while still maintaining that the call does not clobber
any memory.  I'm worried that there may be reasons (practical or
theoretical) why "readonly" functions always have to terminate and be
nothrow.

> The only unusual bit is that we're going to want to teach AliasAnalysis that
> the guard does not write to any memory location (to allow forwarding) while
> still preserving the control dependence.

So you're saying that we model the guard as otherwise read/write (thus
sidestepping the readonly non-returning quandary) but teach
AliasAnalysis that it doesn't clobber any memory?  That would work.

We can also use the same tool to solve the "may return to its caller's
caller with arbitrary heap state" issue by teaching AA that a guard
does not alias with reads in its own (physical) function, but clobbers
the heap for other (physical) functions.

Notation: I'm differentiating between physical functions == functions
that create actual stack frames, and inlined functions == logical Java
functions that don't create separate physical frames.  Inlining IR
from one Java-level function into another usually creates a physical
function that contains more than one logical function.
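To illustrate with the earlier example: in a physical caller of @foo,
AA would have to treat the call as clobbering the heap, since @foo may
"return" via the interpreter after it has run arbitrary code (same
pseudocode style as before):

```
int @bar(int cond) {
  int t0 = this->field;
  int r = foo(cond);     // may return through the interpreter
  int t1 = this->field;  // t0 can *not* be forwarded to t1
  return t0 + t1 + r;
}
```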

>>
>> There are couple
>> of ways I can think of dealing with this, none of them are both easy
>> and neat:
>>
>>   - State that since `@llvm.guard_on` could have had an infinite loop
>>     in it, it may never return. Unfortunately, the LLVM IR has some
>>     rough edges on readonly infinite loops (since C++ disallows that),
>>     so LLVM will require some preparatory work before we can do this
>>     soundly.
>>
>>   - State that `@llvm.guard_on` can unwind, and thus has non-local
>>     control flow.  This can actually work (and is pretty close to
>>     the actual semantics), but is somewhat of a hack since
>>     `@llvm.guard_on` doesn't _really_ throw an exception.
>
> Er, careful.  Semantically, the guard *might* throw an exception. It could
> be that's what the interpreter does when evaluating the continuation implied
> by the guard and any of our callers have to account for the fact the
> function which contains the guard might throw.  The easiest way to ensure
> that is to model the guard call as possibly throwing.

Yes, it does not throw an exception into its own caller, but may throw
into its caller's caller.

-- Sanjoy

