[llvm-dev] GEP with a null pointer base

David Blaikie via llvm-dev llvm-dev at lists.llvm.org
Thu Jul 20 11:22:54 PDT 2017

On Wed, Jul 19, 2017 at 10:17 AM Peter Lawrence via llvm-dev <
llvm-dev at lists.llvm.org> wrote:

> Chandler,
>                The only thing David made clear that wasn’t already clear
> is that he believes UB to be “comparatively rare”, which is in agreement
> with what Hal already said which is that he does not expect deleting
> UB will be of benefit to for example SPEC benchmarks.
> Given that it is “comparatively rare”, why all the effort to delete it ?
> And why make deleting it the default rather than warning about it ?

There seems to be some confusion/misunderstanding here. My best
understanding is that when David said this:

"The cases where the compiler can statically prove that undefined behaviour
is present are comparatively rare."

What he was referring to/describing was a contrast with the optimizations
described prior to that.

It's something like this:

UB-based optimizations don't prove UB is present - they optimize on the
assumption that it is not present due to some unproven (by the compiler,
but assumed to be known by the developer) invariants in the program.

Think about a simple case like array bounds - the compiler emits an
unconditional load from memory because it assumes the developer correctly
validated the bounds or otherwise constructed the program so that
out-of-bounds indexes never reach that piece of code. This is quite common -
UB is assumed not to happen, and the compiler optimizes on that assumption.
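
To make the array bounds case concrete, here is a minimal sketch. The names
(kTable, lookup, lookup_or) are hypothetical and only for illustration; they
are not taken from any real codebase.

```cpp
#include <cstddef>

// Hypothetical table used only for this illustration.
static const int kTable[4] = {10, 20, 30, 40};

// No bounds check here: the compiler emits a plain load and assumes the
// caller keeps `i` in range. Calling lookup(4) would be UB, but nothing
// inside this function proves that such a call ever happens.
int lookup(std::size_t i) {
    return kTable[i];
}

// The developer validates the bound before the index reaches lookup().
// This is the invariant the compiler assumes but does not prove when it
// compiles lookup() on its own.
int lookup_or(std::size_t i, int fallback) {
    return i < 4 ? lookup(i) : fallback;
}
```

The point is that lookup() is compiled on the assumption that every caller
behaves like lookup_or(), not on any proof that UB is present or absent.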

What is less common is for the compiler to be able to (in reasonable time)
prove that UB /does/ happen. In many cases this would require complex
interprocedural analysis: the array is defined in one function, maybe with
a complex dynamic bound, then passed to another function and indexed using
a non-trivial dynamic expression. Statically proving whether that access is
in bounds is complex/expensive and so basically not done by any compiler. So
any cases that are caught by the compiler are relatively trivial ("oh, you
declared a const null pointer value, then dereferenced it within the same
function", etc.) and so don't happen very often, because they're also fairly
obvious to developers.
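
As a rough sketch of that contrast (the function names here are hypothetical,
and whether any particular compiler diagnoses the first function is not
something this example asserts):

```cpp
// The "relatively trivial" case: the null pointer and its dereference are
// both visible inside one function, so UB on every call is easy to prove.
// This function is only meant to be read, never called.
int provably_bad() {
    int *const p = nullptr;
    return *p;  // UB whenever this line executes
}

// The common case: whether `p` is null depends on every caller, so the
// compiler does not try to prove UB. It just assumes `p` is valid.
int deref(const int *p) {
    return *p;
}
```

The second shape is what shows up all over real code; the first is the kind
of thing a developer would usually spot before the compiler does.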

Does that help explain the difference/distinction being drawn here?

- Dave

> Peter
> On Jul 13, 2017, at 2:15 PM, Chandler Carruth <chandlerc at gmail.com> wrote:
> On Thu, Jul 13, 2017 at 5:13 PM Peter Lawrence via llvm-dev <
> llvm-dev at lists.llvm.org> wrote:
>> David,
>>           Here is the definition accepted by Hal of what we’re doing
>> > 1. Sometimes there are abstraction penalties in C++ code
>> > 2. That can be optimized away after template instantiation, function
>> >    inlining, etc
>> > 3. When they for example exhibit this pattern
>> >       if (A) {
>> >               stuff;
>> >       } else {
>> >               other stuff including “undefined behavior”;
>> >       }
>> > 4. Where the compiler assumes “undefined behavior” doesn’t actually
>> >    happen because in the C language standard it is the user’s
>> >    responsibility to avoid it
>> > 5. Therefore in this example the compiler can a) delete the else-clause
>> >     b) delete the if-cond, c) assume A is true and propagate that
>> >     information
>> We are actively deleting undefined behavior, and the question is why
>> given that doing so potentially masks a real source code bug.
>> At the very least deleting undefined behavior should not be the default.
> You are asserting this (again), but others have clearly stated that they
> disagree. David gave detailed and clear reasons why. Continuing to re-state
> positions is not productive.
> -Chandler