[llvm-dev] The semantics of nonnull attribute

Nuno Lopes via llvm-dev llvm-dev at lists.llvm.org
Tue Feb 18 10:22:48 PST 2020


> I forgot to explicitly say that in my first email, but whatever we decide for `nonnull` needs to be applied to basically all other attributes as well. `nonnull` is in no way special (IMHO).

I think that not all attributes are the same. For example, "dereferenceable(n)" is quite strong. It allows e.g.:
f(dereferenceable(4) %p) {
  loop() {
    %v = load %p
    use(%v)
  }
}
=>
f(dereferenceable(4) %p) {
  %v = load %p
  loop() {
    use(%v)
  }
}

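For this transformation to be correct, the function call itself must trigger UB if the pointer passed as argument is not dereferenceable. Otherwise, upon inlining, the hoisted load would introduce UB that the original program did not have (e.g., when the loop body never executes).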
Other attributes are not as severe. Ideally we would have a consistent solution so we don't need to remember which are the nice attributes.
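
To make the hoisting argument concrete, here is a minimal sketch in Python rather than LLVM IR. It is a toy model, not LLVM's actual semantics: the names (MEMORY, enter_f, call_is_ub, hoisting_is_sound) are made up for illustration, UB is modeled as an exception, and the loop is assumed to be allowed to run zero times.

```python
class UndefinedBehavior(Exception):
    """Stands in for UB in this toy model."""

MEMORY = {0x1000: 7}  # the only dereferenceable address in the model

def load(p):
    # A load through a non-dereferenceable pointer is UB.
    if p not in MEMORY:
        raise UndefinedBehavior("load through a non-dereferenceable pointer")
    return MEMORY[p]

def enter_f(p, call_is_ub):
    # dereferenceable(4) on the parameter: if violating the attribute is
    # immediate UB (call_is_ub=True), the UB happens at the call itself.
    if call_is_ub and p not in MEMORY:
        raise UndefinedBehavior("dereferenceable(4) violated at the call")

def f_original(p, trips, call_is_ub):
    enter_f(p, call_is_ub)
    return [load(p) for _ in range(trips)]  # load inside the loop

def f_hoisted(p, trips, call_is_ub):
    enter_f(p, call_is_ub)
    v = load(p)                             # load hoisted above the loop
    return [v for _ in range(trips)]

def outcome(fn, *args):
    try:
        return fn(*args)
    except UndefinedBehavior:
        return "UB"

def hoisting_is_sound(p, trips, call_is_ub):
    # The target may do anything if the source is UB; otherwise it must match.
    src = outcome(f_original, p, trips, call_is_ub)
    tgt = outcome(f_hoisted, p, trips, call_is_ub)
    return src == "UB" or src == tgt

if __name__ == "__main__":
    print(hoisting_is_sound(None, 0, call_is_ub=True))
    print(hoisting_is_sound(None, 0, call_is_ub=False))
```

In this model the hoisting is only justified when the call is already UB for a bad pointer; with a zero-trip loop and a non-UB call, the hoisted version faults where the original did not.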

Nuno


> I found that there was a similar discussion about this issue in the
> past as well, but it seems it is not settled yet.
> What should the semantics of nonnull be?
> I listed a few optimizations that are relevant to this issue.
> 
> 
> 1. Propagating nonnull attribute to callee's arg ( 
> https://godbolt.org/z/-cVsVP )
> 
> g(i8* ptr) {
> f(nonnull ptr);
> }
> =>
> g(i8* nonnull ptr) {
> f(nonnull ptr);
> }
> 
> This is correct if f(nonnull null) is UB. If ptr == null, f(nonnull 
> null) should have raised UB, so ptr shouldn't be null.
> However, this transformation is incorrect if f(nonnull null) is 
> equivalent to f(poison).
> If f was an empty function, f(nonnull null) never raises UB regardless 
> of ptr. So we can't guarantee ptr != null at other uses of ptr.
> 
> 
> 2. InstCombine (https://godbolt.org/z/HDQ7rD ):
> 
> %ptr_inb = gep inbounds %any_ptr, 1
> f(%ptr_inb)
> =>
> %ptr_inb = .. (identical)
> f(nonnull %ptr_inb)
> 
> This optimization is incorrect if `f(nonnull null)` is UB. The reason 
> is as follows.
> If `gep inbounds %any_ptr, 1` yields poison, the source is `f(poison)` 
> whereas the optimized one is `f(nonnull poison)`.
> `f(nonnull poison)` should be UB because `f(nonnull null)` is UB. So, 
> the transformation introduced UB.
> This optimization is correct if both `f(nonnull null)` and `f(nonnull 
> poison)` are equivalent to `f(poison)`.
> 
> 
> 3. https://reviews.llvm.org/D69477
> 
> f(nonnull ptr);
> use(ptr);
> =>
> llvm.assume(ptr != null);
> use(ptr);
> f(nonnull ptr);
> 
> If f(nonnull null) is f(poison), this is incorrect. If ptr was null, 
> the added llvm.assume(ptr != null) raises UB whereas the source may 
> not raise UB at all. (e.g. assume that f() was an empty function) If 
> f(nonnull null) is UB, this is correct.
> 
> 
> 4. Dead argument elimination (from https://reviews.llvm.org/D70749 )
> 
> f(nonnull ptr); // f’s argument is dead
> =>
> f(nonnull undef);
> 
> This is incorrect if f(nonnull null) is UB. To make this correct, 
> nonnull should be dropped. This becomes harder to fix if nonnull was
> attached to the signature of a function (not at the call site).
> It is correct if f(nonnull null) is f(poison).
> 
> Actually, the D70749 thread had an end-to-end miscompilation example
> due to the interaction between this DAE and optimization 3
> (insertion of llvm.assume).
> 
> Thanks,
> Juneyoung Lee
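
The four cases in the quoted message can be tabulated with a small Python sketch. This is a toy model under stated assumptions, not LLVM's semantics: f is taken to be an empty function (as in the examples), each optimization is checked only on the single witness input mentioned in the text, undef is treated as "may be null", and all names (pass_nonnull, call_empty_f, refines, opt1..opt4) are hypothetical.

```python
UB, POISON, NULL, UNDEF, VALID, OK = "UB", "poison", "null", "undef", "valid", "ok"

def pass_nonnull(arg, sem):
    """Value a nonnull parameter receives, or UB, under the chosen semantics."""
    if arg in (NULL, POISON, UNDEF):  # undef may be chosen to be null
        return UB if sem == "ub" else POISON
    return arg

def call_empty_f(arg, nonnull, sem):
    """Call an empty f: the only possible UB comes from the attribute."""
    if nonnull and pass_nonnull(arg, sem) == UB:
        return UB
    return OK

def assume_nonnull(ptr):
    """llvm.assume(ptr != null): UB unless the pointer is a valid non-null one."""
    return OK if ptr == VALID else UB

def refines(src, tgt):
    """tgt is allowed if src is UB, if both agree, or if src is poison and tgt is not UB."""
    return src == UB or src == tgt or (src == POISON and tgt != UB)

def opt1(sem):  # propagate nonnull to g's own parameter; witness: ptr = null
    src = UB if call_empty_f(NULL, True, sem) == UB else NULL  # g still observes ptr
    p = pass_nonnull(NULL, sem)                                 # g(nonnull null)
    tgt = UB if UB in (p, call_empty_f(p, True, sem)) else p
    return src, tgt

def opt2(sem):  # InstCombine adds nonnull; witness: the gep inbounds is poison
    return call_empty_f(POISON, False, sem), call_empty_f(POISON, True, sem)

def opt3(sem):  # insert llvm.assume(ptr != null) first; witness: ptr = null
    return call_empty_f(NULL, True, sem), assume_nonnull(NULL)

def opt4(sem):  # dead argument elimination; witness: ptr is valid, becomes undef
    return call_empty_f(VALID, True, sem), call_empty_f(UNDEF, True, sem)

def verdicts():
    return {opt.__name__: {sem: refines(*opt(sem)) for sem in ("ub", "poison")}
            for opt in (opt1, opt2, opt3, opt4)}

if __name__ == "__main__":
    for name, row in verdicts().items():
        print(name, row)
```

Under these assumptions, optimizations 1 and 3 are justified only by the "UB" reading and optimizations 2 and 4 only by the "poison" reading, which is the tension the list describes.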



-- 

Johannes Doerfert
Researcher

Argonne National Laboratory
Lemont, IL 60439, USA

jdoerfert at anl.gov


