[llvm-dev] The semantics of nonnull attribute

Juneyoung Lee via llvm-dev llvm-dev at lists.llvm.org
Mon Feb 17 17:45:28 PST 2020


Hello all,

LangRef says it is undefined behavior to pass null to a nonnull argument
(`call f(nonnull null);`), but the semantics is too strong for a few
existing optimizations.
To support these, we can relax the semantics so `f(nonnull null)` is
equivalent to `f(poison)`, but (A) it again blocks another set of
optimizations, and (B) this makes the semantics of nonnull deviate from
other attributes like dereferenceable.
I found that there was a similar discussion about this issue in the past as
well, but seems it is not settled yet.
What should the semantics of nonnull be?
I listed a few optimizations that are relevant with this issue.


1. Propagating nonnull attribute to callee's arg (
https://godbolt.org/z/-cVsVP )

g(i8* ptr) {
f(nonnull ptr);
}
=>
g(i8* nonnull ptr) {
f(nonnull ptr);
}

This is correct if f(nonnull null) is UB. If ptr == null, f(nonnull null)
should have raised UB, so ptr shouldn't be null.
However, this transformation is incorrect if f(nonnull null) is equivalent
to f(poison).
If f was an empty function, f(nonnull null) never raises UB regardless of
ptr. So we can't guarantee ptr != null at other uses of ptr.


2. InstCombine (https://godbolt.org/z/HDQ7rD ):

%ptr_inb = gep inbounds %any_ptr, 1
f(%ptr_inb)
=>
%ptr_inb = .. (identical)
f(nonnull %ptr_inb)

This optimization is incorrect if `f(nonnull null)` is UB. The reason is as
follows.
If `gep inbounds %any_ptr, 1` yields poison, the source is `f(poison)`
whereas the optimized one is `f(nonnull poison)`.
`f(nonnull poison)` should be UB because `f(nonnull null)` is UB. So, the
transformation introduced UB.
This optimization is correct if both `f(nonnull null)` and `f(nonnull
poison)` are equivalent to `f(poison)`.


3. https://reviews.llvm.org/D69477

f(nonnull ptr);
use(ptr);
=>
llvm.assume(ptr != null);
use(ptr);
f(nonnull ptr);

If f(nonnull null) is f(poison), this is incorrect. If ptr was null, the
added llvm.assume(ptr != null) raises UB whereas the source may not raise
UB at all. (e.g. assume that f() was an empty function)
If f(nonnull null) is UB, this is correct.


4. Dead argument elimination (from https://reviews.llvm.org/D70749 )

f(nonnull ptr); // f’s argument is dead
=>
f(nonnull undef);

This is incorrect if f(nonnull null) is UB. To make this correct, nonnull
should be dropped. This becomes harder to fix if nonnull was attached at
the signature of a function (not at the callee site).
It is correct if f(nonnull null) is f(poison).

Actually the D70749's thread had an end-to-end miscompilation example due
to the interaction between this DAE and the optimization 3 (insertion of
llvm.assume).

Thanks,
Juneyoung Lee
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.llvm.org/pipermail/llvm-dev/attachments/20200218/87d18a9f/attachment.html>


More information about the llvm-dev mailing list