[llvm-dev] The semantics of nonnull attribute

Philip Reames via llvm-dev llvm-dev at lists.llvm.org
Tue Feb 18 13:49:00 PST 2020


I'm really not sure I agree there's a problem with the current semantics 
at all.

The root of your argument appears to be that passing poison to a 
function argument is "harmless" if that argument is unused in the 
callee.  This seems reasonable to me, but it brings with it a 
restriction on any IPO pass which is non-obvious.  It essentially 
requires that the IPA argument inference framework prove both that a 
property holds, and that if that property didn't hold UB would be 
triggered.  (I think this is similar in spirit to the "used" discussion 
downthread.)   In general, it is much easier to prove the former 
property than the later.

I would simply state that per the current semantics, any IPO/IPA 
transform which doesn't do so is buggy.  That's not pretty, but it's 
valid and I believe matches (most of?) the current implementation.

Conversely, the power for the frontend to annotate such facts on the 
function boundary is one of the key advantages higher level semantics 
give over inference.  I would be very reluctant to give that up.

To be clear, I'm completely open to proposals to change the current 
semantics for ease of implementation on the IPO/IPA side. I just think 
it's important to distinguish between "current semantics are wrong" and 
"current semantics are inconvenient".

Philip

On 2/17/20 5:45 PM, Juneyoung Lee wrote:
> Hello all,
>
> LangRef says it is undefined behavior to pass null to a nonnull 
> argument (`call f(nonnull null);`), but the semantics is too strong 
> for a few existing optimizations.
> To support these, we can relax the semantics so `f(nonnull null)` is 
> equivalent to `f(poison)`, but (A) it again blocks another set of 
> optimizations, and (B) this makes the semantics of nonnull deviate 
> from other attributes like dereferenceable.
> I found that there was a similar discussion about this issue in the 
> past as well, but seems it is not settled yet.
> What should the semantics of nonnull be?
> I listed a few optimizations that are relevant with this issue.
>
>
> 1. Propagating nonnull attribute to callee's arg 
> (https://godbolt.org/z/-cVsVP )
>
> g(i8* ptr) {
> f(nonnull ptr);
> }
> =>
> g(i8* nonnull ptr) {
> f(nonnull ptr);
> }
>
> This is correct if f(nonnull null) is UB. If ptr == null, f(nonnull 
> null) should have raised UB, so ptr shouldn't be null.
> However, this transformation is incorrect if f(nonnull null) is 
> equivalent to f(poison).
> If f was an empty function, f(nonnull null) never raises UB regardless 
> of ptr. So we can't guarantee ptr != null at other uses of ptr.
>
>
> 2. InstCombine (https://godbolt.org/z/HDQ7rD ):
>
> %ptr_inb = gep inbounds %any_ptr, 1
> f(%ptr_inb)
> =>
> %ptr_inb = .. (identical)
> f(nonnull %ptr_inb)
>
> This optimization is incorrect if `f(nonnull null)` is UB. The reason 
> is as follows.
> If `gep inbounds %any_ptr, 1` yields poison, the source is `f(poison)` 
> whereas the optimized one is `f(nonnull poison)`.
> `f(nonnull poison)` should be UB because `f(nonnull null)` is UB. So, 
> the transformation introduced UB.
> This optimization is correct if both `f(nonnull null)` and `f(nonnull 
> poison)` are equivalent to `f(poison)`.
>
>
> 3. https://reviews.llvm.org/D69477
>
> f(nonnull ptr);
> use(ptr);
> =>
> llvm.assume(ptr != null);
> use(ptr);
> f(nonnull ptr);
>
> If f(nonnull null) is f(poison), this is incorrect. If ptr was null, 
> the added llvm.assume(ptr != null) raises UB whereas the source may 
> not raise UB at all. (e.g. assume that f() was an empty function)
> If f(nonnull null) is UB, this is correct.
>
>
> 4. Dead argument elimination (from https://reviews.llvm.org/D70749 )
>
> f(nonnull ptr); // f’s argument is dead
> =>
> f(nonnull undef);
>
> This is incorrect if f(nonnull null) is UB. To make this correct, 
> nonnull should be dropped. This becomes harder to fix if nonnull was 
> attached at the signature of a function (not at the callee site).
> It is correct if f(nonnull null) is f(poison).
>
> Actually the D70749's thread had an end-to-end miscompilation example 
> due to the interaction between this DAE and the optimization 3 
> (insertion of llvm.assume).
>
> Thanks,
> Juneyoung Lee


More information about the llvm-dev mailing list