[llvm-dev] RFC: Add guard intrinsics to LLVM

Sun Feb 21 12:24:48 PST 2016

Hi Andy,

Thanks for replying, responses inline below:

On Fri, Feb 19, 2016 at 11:12 AM, Andrew Trick <atrick at apple.com> wrote:
> This clearly doesn't need operand bundles, but using an intrinsic
> would permit special code motion semantics. It could be hoisted and
> merged with other traps, but the condition could never be widened
> beyond the union of the original conditions. Unlike deoptimizing
> guards, it would need to be sequenced with memory barriers, but could

By memory barrier, do you mean things like fences?

> otherwise be hoisted as readnone.

Can you clarify this a little bit?  Are you talking about things like:

  *ptr = 42
  @trap_if(%foo)
=>
  @trap_if(%foo)
  *ptr = 42

?  I.e. if a failing guard postdominates a point "P" in the program,
then the guard can be failed early at P.

(We'd want a stronger condition than postdomination, since we won't
 want to hoist

    while (!*cond) { }
    @trap_if(*cond)
  =>
    @trap_if(*cond)
    while (!*cond) { }
)

[snip]

> Of course it would be nice to consider these intrinsics
> readonly. (One thing we never fixed with patchpoints is the memory
> effects and that was probably hurting performance.) But has this bug
> been fixed yet: http://llvm.org/pr18912 Optimizer should consider
> "maythrow" calls (those without "nounwind) as having side effects? I
> vaguely remember some work being done to improve the situation, but
> last I checked LICM was still violating it, and who knows what else?

The specific case in the bug was fixed by David in
http://reviews.llvm.org/rL256728.  But I agree with your concern, that
this notion of "readnone calls are always okay to remove" may have
leaked into other parts of LLVM.

> BTW, if you do want readonly semantics, why would you want readonly
> to be implicit instead of explicit?

I don't understand the question. :) What's explicit vs. implicit here?

> I think you would need to mark @guard_on as may-unwind AND fix any
> lingering assumptions in LLVM that readonly calls are nounwind. (Loads
> can be CSE's across unwind calls but can't be hoisted across them).

Yes, but a failed guard doesn't exactly "unwind".  I.e. if A invokes B
and B fails a guard, then the guard may not always unwind to the A->B
callsite's landingpad, but can also return to its normal no-exception
successor; unless you inline B into A, in which case "B"s guard (now
physically present in the A+B function) will return from A if it
fails.  In other words, we'll have to re-purpose the term "unwind" to
mean either "throwing an exception" or "failed a guard".  I'm fine
with this, but it is a choice we need to make explicitly.

> Another good point mentioned later in the thread is that a readonly
> callee should *not* imply a readonly @guard_on or other "deopt" call
> site. The simple solution for this is to disallow "deopt" of readonly
> calls.

Assuming we're talking about the same thing, the concern was more
along the lines of: you cannot do IPA on a callsite that calls
something that contains a potential deoptimization point, be it either
for a @guard_on call, or a "normal" deoptimization safepoint.  This is
because if you have:

```
void @something()

void callee() {
  call @something() [ "deopt"(state0) ]
  *ptr = 100;
}

void caller() {
  call @callee()  // whether this is a "deopt" call site or a
                  // a normal call site doesn't matter
  int k = *ptr;
}
```

you cannot, say, fold `k` to `100` in @caller() because the deopt
state, state0, if resumed to may actually write something else to
*ptr.

Making the call to @something() read/write/may unwind does not solve
the problem -- even if the call to @something() wrote to *ptr (or
threw an exception), we could still fold k to 100.  The novel control
flow here is that `@caller`, if deoptimized with `state0`, will
execute some arbitrary code, and **return** to @caller at the callsite
to @callee.

(Btw: things are a little different inside our JVM because of the way
we register dependencies, but I'm trying to make a LLVM-centric
argument here.)

At this point, I think it is most straightforward to do a slight
variant of what Philip suggested to me offline: if a function can
deoptimize, it needs to be marked as "mayBeOverridden", preferably by
introducing a new function attribute or by setting the linkage type to
one of those checked by GlobalValue::mayBeOverridden.  This is the
responsibility of the frontend, and resuming to the "deopt"
continuation in a physical frame where the enclosing function was not
marked mayBeOverridden is UB.

-- Sanjoy