[PATCH] D87149: [InstCombine] erase instructions leading up to unreachable

Wed Sep 9 11:06:25 PDT 2020

jdoerfert added a comment.

In D87149#2263777 <https://reviews.llvm.org/D87149#2263777>, @stephan.yichao.zhao wrote:

> In D87149#2263692 <https://reviews.llvm.org/D87149#2263692>, @nikic wrote:
>
>>> Although executing __builtin_unreachable is undefined, removing the code before it deletes their side effects.
>>
>> I believe this behavior is correct. LangRef explicitly states that execution continues past a volatile store. As such, unreachable must be reached, which is undefined behavior. As such, we are free to optimize as we wish, including removing a preceding volatile store.
>
> I think the question is whether "reaching an unreachable instruction is undefined" means that the behavior when a program hits __builtin_unreachable is undefined and that the behavior before __builtin_unreachable is hit is also undefined.
>
> The behavior when a program hits __builtin_unreachable is definitely undefined, because a runtime can abort or hang or keep on going. The LangRef does not define it.

A program execution either has defined behavior or it does not. If you cannot derive defined behavior for your *entire* execution, you do not have defined behavior for any part of it.
The fact that you see some side effects that you might expect before "UB is hit" is coincidental. Take this code:

  if (A)
    printf("A is not null!\n");

  *A = 0;

If you execute this with `A = nullptr` you cause UB. The compiler knows that, so it may assume `A` is non-null and remove the conditional. Thus, when you run this with `A = nullptr`, you will see the message before, you know, something else happens (= UB).
You can invert the conditional, i.e., use `!A`, to make a side effect disappear instead; the explanation is the same.
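
To make the example above concrete, here is a minimal sketch in C of the code as written next to the form an optimizer would be allowed to produce. The function names (`store_as_written`, `store_as_optimized`) and the return values are hypothetical and only there so the two versions can be compared; whether a particular compiler actually performs this rewrite is not guaranteed.

```c
#include <stdio.h>

/* As written: the null check guards only the message, not the store. */
int store_as_written(int *A) {
    int printed = 0;
    if (A) {
        printf("A is not null!\n");
        printed = 1;
    }
    *A = 0; /* UB if A is null, so the compiler may assume A != NULL */
    return printed;
}

/* A form the optimizer may legally produce (assumption: illustrative only).
 * Because the store makes every execution with A == NULL undefined,
 * the check is treated as always true and dropped. */
int store_as_optimized(int *A) {
    printf("A is not null!\n");
    *A = 0;
    return 1;
}
```

For every non-null `A` the two functions behave identically. They differ only for `A == NULL`, which is exactly the execution that has no defined behavior in the first place.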

> However, is the behavior before __builtin_unreachable is hit also undefined? Does LLVM have any existing transformations or optimizations that change behavior before an undefined instruction?
> I found some blogs: https://blog.llvm.org/posts/2011-05-13-what-every-c-programmer-should-know/ and https://blog.regehr.org/archives/213 They do not answer the question explicitly, although I found the first link says "If you're using an LLVM-based compiler, you can dereference a "volatile" null pointer to get a crash if that's what you're looking for, **since volatile loads and stores are generally not touched by the optimizer**."
>
> If dereferencing a null volatile pointer has a defined crash behavior, it is not safe to remove volatile accesses before unreachable, because execution may never reach the unreachable if it crashes first.

I do not believe this is defined at all. A volatile access might not return and might throw, but it is, IMHO, never *known* to "crash" (w/o target knowledge).
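
For reference, the pattern under discussion can be sketched in C as follows. This is only an illustration under assumptions: `device_reg` and `write_then_unreachable` are hypothetical names, and `__builtin_unreachable` is the GCC/Clang builtin mentioned in the thread.

```c
/* Hypothetical volatile location, standing in for e.g. a device register. */
volatile int device_reg;

int write_then_unreachable(int bad) {
    if (bad) {
        device_reg = 1;          /* volatile store right before unreachable */
        __builtin_unreachable(); /* reaching this point is UB */
    }
    return 42;
}
```

Any call with `bad != 0` has undefined behavior as a whole, so under the reasoning above the compiler may drop the volatile store along with the rest of that branch. Only calls with `bad == 0` have defined behavior, and those never execute the store anyway.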

Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D87149/new/

https://reviews.llvm.org/D87149