<div dir="ltr">Hi Sanjoy,<div><br></div><div>A quick question here. With the bailing to the interpreter support that you're envisioning ("deopt operand bundle"), it appears to overlap quite a bit with the existing stack maps. What's the story/interaction/etc here? I agree that a simpler control flow is great when bailing to the interpreter - doing it with phi nodes is a recipe for pain and long compile times.</div><div><br></div><div>Thanks!</div><div><br></div><div>-eric<br><br><div class="gmail_quote"><div dir="ltr">On Tue, Feb 16, 2016 at 6:06 PM Sanjoy Das via llvm-dev <<a href="mailto:llvm-dev@lists.llvm.org">llvm-dev@lists.llvm.org</a>> wrote:<br></div><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">This is a proposal to add guard intrinsics to LLVM.<br>

<br>

Couple of glossary items: when I say "interpreter" I mean "the most<br>

conservative tier in the compilation stack" which can be an actual<br>

interpreter, a "splat compiler" or even a regular JIT that doesn't<br>

make optimistic assumptions.  By "bailing out to the interpreter" I<br>

mean "side exit" as defined by Philip in<br>

<a href="http://www.philipreames.com/Blog/2015/05/20/deoptimization-terminology/" rel="noreferrer" target="_blank">http://www.philipreames.com/Blog/2015/05/20/deoptimization-terminology/</a><br>

<br>

<br>

# high level summary<br>

<br>

Guard intrinsics are intended to represent "if (!cond) leave();" like<br>

control flow in a structured manner.  This kind of control flow is<br>

abundant in IR coming from safe languages due to things like range<br>

checks and null checks (and overflow checks, in some cases).  From a<br>

high level, there are two distinct motivations for introducing guard<br>

intrinsics:<br>

<br>

 - To keep control flow as simple and "straight line"-like as is<br>

   reasonable<br>

<br>

 - To allow certain kinds of "widening" transforms that cannot be<br>

   soundly done in an explicit "check-and-branch" control flow<br>

   representation<br>

<br>

## straw man specification<br>

<br>

Signature:<br>

<br>

```<br>

declare void @llvm.guard_on(i1 %predicate) ;; requires [ "deopt"(...) ]<br>

```<br>

<br>

Semantics:<br>

<br>

`@llvm.guard_on` bails out to the interpreter (i.e. "deoptimizes" the<br>

currently execution function) if `%predicate` is `false`, meaning that<br>

after `@llvm.guard_on(i1 %t)` has executed `%t` can be assumed to be<br>

true.  In this way, it is close to `@llvm.assume` or an `assert`, with<br>

one very important difference -- `@llvm.guard_on(i1 <false>)` is well<br>

defined (and not UB).  `@llvm.guard_on` on a false predicate bails to<br>

the interpreter and that is always safe (but slow), and so<br>

`@llvm.guard_on(i1 false)` is basically a `noreturn` call that<br>

unconditionally transitions the current compilation to the<br>

interpreter.<br>

<br>

Bailing out to the interpreter involves re-creating the state of the<br>

interpreter frames as-if the compilee had been executing in the<br>

interpreter all along.  This state is represented and maintained using<br>

a `"deopt"` operand bundle attached to the call to `@llvm.guard_on`.<br>

The verifier will reject calls to `@llvm.guard_on` without a `"deopt"`<br>

operand bundle.  `@llvm.guard_on` cannot be `invoke`ed (that would be<br>

meaningless anyway, since the method it would've "thrown" into is<br>

about to go away).<br>

<br>

<br>

Example:<br>

<br>

```<br>

  ...<br>

  %rangeCheck0 = icmp ult i32 %arg0, 10<br>

  call void @llvm.guard_on(i1 %rangeCheck0) [ "deopt"(/* deopt state 0 */) ]<br>

  %rangeCheck1 = icmp ult i32 %arg0, 12<br>

  call void @llvm.guard_on(i1 %rangeCheck1) [ "deopt"(/* deopt state 1 */) ]<br>

  ...<br>

```<br>

<br>

<br>

# details: motivation & alternatives<br>

<br>

As specified in the high level summary, there are two key motivations.<br>

<br>

The first, more obvious one is that we want the CFG to be less<br>

complex, even if that involves "hiding" control flow in guard<br>

intrinsics.  I expect this to benefit both compile time and<br>

within-a-block optimization.<br>

<br>

The second, somewhat less obvious motivation to use guard intrinsics<br>

instead of explicit control flow is to allow check widening.<br>

<br>

## check widening<br>

<br>

Consider:<br>

<br>

```<br>

  ...<br>

  %rangeCheck0 = icmp ult i32 6, %len   ;; for a[6]<br>

  call void @llvm.guard_on(i1 %rangeCheck0) [ "deopt"(/* deopt state 0 */) ]<br>

  call void @printf("hello world")<br>

  %rangeCheck1 = icmp ult i32 7, %len   ;; for a[7]<br>

  call void @llvm.guard_on(i1 %rangeCheck1) [ "deopt"(/* deopt state 1 */) ]<br>

  access a[6] and a[7]<br>

  ...<br>

```<br>

<br>

we'd like to optimize it to<br>

<br>

```<br>

  ...<br>

  %rangeCheckWide = icmp ult i32 7, %len<br>

  call void @llvm.guard_on(i1 %rangeCheckWide) [ "deopt"(/* deopt state 0 */) ]<br>

  call void @printf("hello world")<br>

  ;; %rangeCheck1 = icmp ult i32 7, %len  ;; not needed anymore<br>

  ;; call void @llvm.guard_on(i1 %rangeCheck1) [ "deopt"(/* deopt state 1 */) ]<br>

  access a[6] and a[7]<br>

  ...<br>

```<br>

<br>

This way we do a range check only on `7` -- if `7` is within bounds,<br>

then we know `6` is too.  This transform is sound only because we know<br>

that the guard on `7 ult %len` will not simply throw an exception if<br>

the said predicate fails, but will bail out to the interpreter with<br>

the abstract state `/* deopt state 0 */`.  In fact, if `%len` is `7`,<br>

the pre-transform program is supposed to print `"hello world"` and<br>

*then* throw an exception, and bailing out to the interpreter with `/*<br>

deopt state 0 */` will do exactly that.<br>

<br>

In other words, we're allowed to do speculative and aggressive<br>

transforms that make a guard fail that wouldn't have in the initial<br>

program.  This is okay because a failed guard only bails to the<br>

interpreter, and the interpreter always Does The Right Thing(TM).  In<br>

fact, it is correct (though unwise) to replace every guard's predicate<br>

with `false`.<br>

<br>

## the problem with check widening and explicit control flow<br>

<br>

Widening is difficult to get right in an explicit "check-and-branch"<br>

representation.  For instance, the above example in a<br>

"check-and-branch" representation would be (in pseudo C, and excluding<br>

the printf):<br>

<br>

```<br>

  ...<br>

  if (!(6 < %len)) { call @side_exit() [ "deopt"(P) ]; unreachable; }<br>

  if (!(7 < %len)) { call @side_exit() [ "deopt"(Q) ]; unreachable; }<br>

  ...<br>

```<br>

<br>

The following transform is invalid:<br>

<br>

```<br>

  ...<br>

  if (!(7 < %len)) { call @side_exit() [ "deopt"(P) ]; unreachable; }<br>

  if (!(true)) { call @side_exit() [ "deopt"(Q) ]; unreachable; }<br>

  ...<br>

```<br>

<br>

since we do not know if the first side exit had been optimized under<br>

the assumption `!(6 < %len)` (via JumpThreading etc.).  E.g. the<br>

"original" IR could have been<br>

<br>

```<br>

  ...<br>

  if (!(6 < %len)) { call @side_exit() [ "deopt"(!(6 < %len)) ]; unreachable; }<br>

  if (!(7 < %len)) { call @side_exit() [ "deopt"(Q) ]; unreachable; }<br>

  ...<br>

```<br>

<br>

which got optimized to<br>

<br>

```<br>

  ...<br>

  if (!(6 < %len)) { call @side_exit() [ "deopt"(true) ]; unreachable; }<br>

  if (!(7 < %len)) { call @side_exit() [ "deopt"(Q) ]; unreachable; }<br>

  ...<br>

```<br>

<br>

before the widening transform. The widening transform will now<br>

effectively pass in an incorrect value for `!(6 < %len)`.<br>

<br>

This isn't to say it is impossible to do check widening in a explicit<br>

control flow representation, just that is more natural to do it with<br>

guards.<br>

<br>

<br>

# details: semantics<br>

<br>

## as-if control flow<br>

<br>

The observable behavior of `@llvm.guard_on` is specified as:<br>

<br>

```<br>

  void @llvm.guard_on(i1 %pred) {<br>

  entry:<br>

    %unknown_cond = < unknown source ><br>

    %cond = and i1 %unknown_cond, %pred<br>

    br i1 %cond, label %left, label %right<br>

<br>

  left:<br>

    call void @bail_to_interpreter() [ "deopt"() ] noreturn<br>

    unreachable<br>

<br>

  right:<br>

    ret void<br>

  }<br>

```<br>

<br>

So, precisely speaking, `@llvm.guard_on` is guaranteed to bail to the<br>

interpreter if `%pred` is false, but it **may** bail to the<br>

interpreter if `%pred` is true.  It is this bit that lets us soundly<br>

widen `%pred`, since all we're doing is "refining" `< unknown source >`.<br>

<br>

`@bail_to_interpreter` does not return to the current compilation, but<br>

it returns to the `"deopt"` continuation that is has been given (once<br>

inlined, the empty "deopt"() continuation will be fixed up to have the right<br>

continuation).<br>

<br>

<br>

## applicable optimizations<br>

<br>

Apart from widening, most of the optimizations we're interested in are<br>

what's allowed by an equivalent `@llvm.assume`.  Any conditional<br>

branches dominated by a guard on the same condition can be folded,<br>

multiple guards on the same condition can be CSE'ed, loop invariant<br>

guards can be hoisted out of loops etc.<br>

<br>

Ultimately, we'd like to recover the same quality of optimization as<br>

we currently get from the "check-and-branch" representation.  With the<br>

"check-and-branch" representation, the optimizer is able to sink<br>

stores and computation into the slow path. This is something it cannot<br>

do in the guard_on representation, and we'll have to lower the<br>

guard_on representation to the "check-and-branch" representation at a<br>

suitable point in the pipeline to get this kind of optimization.<br>

<br>

## lowering<br>

<br>

At some point in the compilation pipeline, we will have to lower<br>

`@llvm.guard_on` into explicit control flow, by "inlining" "an<br>

implementation" of `@llvm.guard_on` (or by some other means).  I don't<br>

have a good sense on when in the pipeline this should be done -- the<br>

answer will depend on what we find as we make LLVM more aggressive<br>

around optimizing guards.<br>

<br>

## environments without support for bailing to the interpreter<br>

<br>

Our VM has deeply integrated support for deoptimizations, but not all<br>

language runtimes do.  If there is a need for non deoptimizing guards, we<br>

can consider introducing a variant of `@llvm.guard_on`:<br>

<br>

```<br>

declare void @llvm.exception_on(i1 %predicate, i32 %exceptionKind)<br>

```<br>

<br>

with this one having the semantics that it always throws an exception<br>

if `%predicate` fails.  Only the non-widening optimizations for<br>

`@llvm.guard_on` will apply to `@llvm.exception_on`.<br>

<br>

## memory effects (unresolved)<br>

<br>

[I haven't come up with a good model for the memory effects of<br>

 `@llvm.guard_on`, suggestions are very welcome.]<br>

<br>

I'd really like to model `@llvm.guard_on` as a readonly function,<br>

since it does not write to memory if it returns; and e.g. forwarding<br>

loads across a call to `@llvm.guard_on` should be legal.<br>

<br>

However, I'm in a quandary around representing the "may never return"<br>

aspect of `@llvm.guard_on`: I have to make it illegal to, say, hoist a<br>

load form `%ptr` across a guard on `%ptr != null`.  There are couple<br>

of ways I can think of dealing with this, none of them are both easy<br>

and neat:<br>

<br>

 - State that since `@llvm.guard_on` could have had an infinite loop<br>

   in it, it may never return. Unfortunately, the LLVM IR has some<br>

   rough edges on readonly infinite loops (since C++ disallows that),<br>

   so LLVM will require some preparatory work before we can do this<br>

   soundly.<br>

<br>

 - State that `@llvm.guard_on` can unwind, and thus has non-local<br>

   control flow.  This can actually work (and is pretty close to<br>

   the actual semantics), but is somewhat of hack since<br>

   `@llvm.guard_on` doesn't _really_ throw an exception.<br>

<br>

  - Special case `@llvm.guard_on` (yuck!).<br>

<br>

What do you think?<br>

<br>

-- Sanjoy<br>

_______________________________________________<br>

LLVM Developers mailing list<br>

<a href="mailto:llvm-dev@lists.llvm.org" target="_blank">llvm-dev@lists.llvm.org</a><br>

<a href="http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev" rel="noreferrer" target="_blank">http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev</a><br>

</blockquote></div></div></div>