[LLVMdev] RFC: implicit null checks in llvm

Thu Apr 23 13:46:14 PDT 2015

> Why not allow non-nounwind calls, in other words, an intrinsic call that may
> throw?
>
> In most languages with implicit null checks, there are far more functions
> that do field accesses and method calls than there are functions that catch
> exceptions. The common case is that the frame with the load will have
> nothing to do other than propagate the exception to the parent frame, and we
> should allow the runtime to handle that efficiently.
>
> Essentially, in this model, the signal handler is responsible for
> identifying the signal as a null pointer exception (i.e. SIGSEGVs on a small
> pointer value with a PC in code known to use this EH personality) and
> transitioning to the exception handling machinery in the language runtime.

I have no problem with semantically allowing unwinding calls to these
intrinsics that get your runtime to unwind the stack and propagate an
exception object.  But this proposal is specifically limited to the
LLVM-side changes needed to generate the appropriate side-tables for
implicit exceptions.  I think getting signal handlers and runtimes to
throw exceptions will be quite a bit of work beyond that.
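
To make the side-table idea concrete, here is a rough sketch of the
lookup a runtime's fault handler might do once it has the faulting PC.
The entry layout and the names (ImplicitNullCheckEntry, findEntry) are
hypothetical and only for illustration; the proposal does not fix a
table format.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical side-table entry: one per implicit null check site.
struct ImplicitNullCheckEntry {
  std::uint64_t FaultingPC; // address of the load/store that may fault
  std::uint64_t HandlerPC;  // address to resume at if it does fault
};

// Returns the entry whose FaultingPC equals PC, or nullptr if PC is not a
// known implicit null check site. Table must be sorted by FaultingPC.
const ImplicitNullCheckEntry *
findEntry(const std::vector<ImplicitNullCheckEntry> &Table, std::uint64_t PC) {
  assert(std::is_sorted(
      Table.begin(), Table.end(),
      [](const ImplicitNullCheckEntry &A, const ImplicitNullCheckEntry &B) {
        return A.FaultingPC < B.FaultingPC;
      }));
  // Binary search for the first entry with FaultingPC >= PC.
  auto It = std::lower_bound(
      Table.begin(), Table.end(), PC,
      [](const ImplicitNullCheckEntry &E, std::uint64_t P) {
        return E.FaultingPC < P;
      });
  if (It == Table.end() || It->FaultingPC != PC)
    return nullptr;
  return &*It;
}
```

A fault whose PC has no entry is not an implicit null check and would
have to be treated as a genuine crash.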

One thing to note: it is usually impractical for managed languages to
use implicit null pointer exceptions unless they have some way to
"heal" the implicit null check sites into explicit null checks as they
fail.  Unless you have a way to quickly converge your program to a
point where no implicit null checks actually fail at runtime, checking
for null pointers via virtual memory tricks is a pessimization: every
failing check pays for a hardware fault and its handling instead of a
compare and branch.

The healing can be done via code patching, invalidation, etc., as Andy
mentioned.
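
As a rough sketch of the bookkeeping behind such healing, a runtime
could count faults per site and request a rewrite once a site has
failed often enough.  The class name, the threshold policy, and the
use of std::unordered_map are my assumptions for illustration; a real
runtime would need storage that is safe to touch from its fault path.

```cpp
#include <cstdint>
#include <unordered_map>

// Hypothetical per-site bookkeeping kept by the runtime.
class NullCheckHealer {
public:
  explicit NullCheckHealer(unsigned Threshold) : Threshold(Threshold) {}

  // Records one fault at the site identified by FaultingPC. Returns true
  // exactly once per site, on the fault that reaches Threshold, to signal
  // that the site should be patched or recompiled with an explicit check.
  bool recordFault(std::uint64_t FaultingPC) {
    unsigned &Count = FaultCounts[FaultingPC]; // starts at 0 for new sites
    ++Count;
    return Count == Threshold;
  }

private:
  unsigned Threshold;
  // Not async-signal-safe; used here only to keep the sketch short.
  std::unordered_map<std::uint64_t, unsigned> FaultCounts;
};
```

With a low threshold, sites that see nulls in practice converge to
explicit checks quickly, while sites that never fault keep the cheaper
implicit form.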

> The landingpad personality normally controls what kind of EH tables are
> emitted, so if you want something other than the __gxx_personality_v0 LSDA
> table, you could invent your own personality and use that to control what
> gets emitted. This might be useful for interoperating with existing language
> runtimes.

That's a great idea.  Do you think it is reasonable to standardize a
"__llvm_implicit_null_check" personality function that emits
information into the stackmaps sections instead?

> Does it really have to be a per-target pseudo? The way I see it, we can
> handle this all in selection dag. All we need to do is emit the before
> label, the load/store operation, and the end label, and establish control
> dependence between them all to prevent folding. Does that seem reasonable,
> or is this an overly simplistic perspective? :-)

That was my impression also. :)

> Would you be OK with simply documenting that these intrinsics are
> optimization-hostile, in the same way that early safepoint insertion is?
> There are some language constructs (__try / __except) that allow catching
> memory faults like this. Such constructs are rare and don't really need to
> be optimized. I just want to make sure that mid-level optimizations don't
> actively break these.

Personally, I'm okay with your proposal, but I will wait for Andy to
respond on this.

> I agree, with the caveat above. Mid-level passes shouldn't actively break
> these intrinsics.

Agreed.  Mid-level passes should treat these as opaque calls.

-- Sanjoy