[llvm-dev] The semantics of nonnull attribute

Johannes Doerfert via llvm-dev llvm-dev at lists.llvm.org
Tue Feb 18 15:04:48 PST 2020


On 02/18, Philip Reames via llvm-dev wrote:
> I'm really not sure I agree there's a problem with the current semantics at
> all.
>
> The root of your argument appears to be that passing poison to a function
> argument is "harmless" if that argument is unused in the callee.  This seems
> reasonable to me, but it brings with it a restriction on any IPO pass which
> is non-obvious.  It essentially requires that the IPA argument inference
> framework prove both that a property holds, and that if that property didn't
> hold UB would be triggered.  (I think this is similar in spirit to the
> "used" discussion downthread.)   In general, it is much easier to prove the
> former property than the later.
> 
> I would simply state that per the current semantics, any IPO/IPA transform
> which doesn't do so is buggy.  That's not pretty, but it's valid and I
> believe matches (most of?) the current implementation.

There is a good example (below) that shows how we misinterpret this
already and I anticipate more as we improve our attribute deduction
capabilities.


> Conversely, the power for the frontend to annotate such facts on the
> function boundary is one of the key advantages higher level semantics give
> over inference.  I would be very reluctant to give that up.

You would not loose anything. As we discussed in the other part of the
thread:
  `attr` + `not_poison` (aka. `used`) => current UB semantics
All we get is a two step process which frontends can avoid by always
using the attributes together.


> To be clear, I'm completely open to proposals to change the current
> semantics for ease of implementation on the IPO/IPA side. I just think it's
> important to distinguish between "current semantics are wrong" and "current
> semantics are inconvenient".

Juneyoung lists nice examples of transformations that are "buggy"
according to the current semantics. I think one of the strongest
arguments for a change is his case 2. You can also write it as:

void @bar();
void @foo(nonnull %a, %offset) {
  %g = gep inbounds %a, %offset
  call void @bar(%g)             ; <- You cannot mark %g at the call
                                 ;    site as nonnull according to the
                                 ;    current lang-ref rules as a
                                 ;    violation of `inbounds` results
                                 ;    in a poison value and not UB.
                                 ;    Now, `call void @bar(nonnull %g)`
                                 ;    would introduce UB where there was
                                 ;    none. We still do that though,
                                 ;    e.g., in instcombine. The helper
                                 ;    `isKnownNonZero` will return true
                                 ;    as well, which actually means,
                                 ;    non-zero or poison.
}

Cheers,
  Johannes


> Philip
> 
> On 2/17/20 5:45 PM, Juneyoung Lee wrote:
> > Hello all,
> > 
> > LangRef says it is undefined behavior to pass null to a nonnull argument
> > (`call f(nonnull null);`), but the semantics is too strong for a few
> > existing optimizations.
> > To support these, we can relax the semantics so `f(nonnull null)` is
> > equivalent to `f(poison)`, but (A) it again blocks another set of
> > optimizations, and (B) this makes the semantics of nonnull deviate from
> > other attributes like dereferenceable.
> > I found that there was a similar discussion about this issue in the past
> > as well, but seems it is not settled yet.
> > What should the semantics of nonnull be?
> > I listed a few optimizations that are relevant with this issue.
> > 
> > 
> > 1. Propagating nonnull attribute to callee's arg
> > (https://godbolt.org/z/-cVsVP )
> > 
> > g(i8* ptr) {
> > f(nonnull ptr);
> > }
> > =>
> > g(i8* nonnull ptr) {
> > f(nonnull ptr);
> > }
> > 
> > This is correct if f(nonnull null) is UB. If ptr == null, f(nonnull
> > null) should have raised UB, so ptr shouldn't be null.
> > However, this transformation is incorrect if f(nonnull null) is
> > equivalent to f(poison).
> > If f was an empty function, f(nonnull null) never raises UB regardless
> > of ptr. So we can't guarantee ptr != null at other uses of ptr.
> > 
> > 
> > 2. InstCombine (https://godbolt.org/z/HDQ7rD ):
> > 
> > %ptr_inb = gep inbounds %any_ptr, 1
> > f(%ptr_inb)
> > =>
> > %ptr_inb = .. (identical)
> > f(nonnull %ptr_inb)
> > 
> > This optimization is incorrect if `f(nonnull null)` is UB. The reason is
> > as follows.
> > If `gep inbounds %any_ptr, 1` yields poison, the source is `f(poison)`
> > whereas the optimized one is `f(nonnull poison)`.
> > `f(nonnull poison)` should be UB because `f(nonnull null)` is UB. So,
> > the transformation introduced UB.
> > This optimization is correct if both `f(nonnull null)` and `f(nonnull
> > poison)` are equivalent to `f(poison)`.
> > 
> > 
> > 3. https://reviews.llvm.org/D69477
> > 
> > f(nonnull ptr);
> > use(ptr);
> > =>
> > llvm.assume(ptr != null);
> > use(ptr);
> > f(nonnull ptr);
> > 
> > If f(nonnull null) is f(poison), this is incorrect. If ptr was null, the
> > added llvm.assume(ptr != null) raises UB whereas the source may not
> > raise UB at all. (e.g. assume that f() was an empty function)
> > If f(nonnull null) is UB, this is correct.
> > 
> > 
> > 4. Dead argument elimination (from https://reviews.llvm.org/D70749 )
> > 
> > f(nonnull ptr); // f’s argument is dead
> > =>
> > f(nonnull undef);
> > 
> > This is incorrect if f(nonnull null) is UB. To make this correct,
> > nonnull should be dropped. This becomes harder to fix if nonnull was
> > attached at the signature of a function (not at the callee site).
> > It is correct if f(nonnull null) is f(poison).
> > 
> > Actually the D70749's thread had an end-to-end miscompilation example
> > due to the interaction between this DAE and the optimization 3
> > (insertion of llvm.assume).
> > 
> > Thanks,
> > Juneyoung Lee
> _______________________________________________
> LLVM Developers mailing list
> llvm-dev at lists.llvm.org
> https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev

-- 

Johannes Doerfert
Researcher

Argonne National Laboratory
Lemont, IL 60439, USA

jdoerfert at anl.gov
-------------- next part --------------
A non-text attachment was scrubbed...
Name: signature.asc
Type: application/pgp-signature
Size: 228 bytes
Desc: not available
URL: <http://lists.llvm.org/pipermail/llvm-dev/attachments/20200218/31df7e1d/attachment.sig>


More information about the llvm-dev mailing list