[PATCH] D87149: [InstCombine] erase instructions leading up to unreachable

Wed Sep 9 18:41:15 PDT 2020

chandlerc added a comment.

(Continuing the discussion here, as I'm not aware of a better place; feel free to point me to a different thread if there is one. I don't actively follow llvm-dev at this point, so sorry if I've missed a post there.)

I don't have super strong opinions about exactly what semantics LLVM implements.

I have somewhat strong opinions about the semantics that users of C++ need from volatile memory accesses, and would like for Clang to have a clear way to lower that into LLVM IR.

The semantics I think users need are that the memory access is observable *outside* of the C++ abstract machine. That needs to include things like an emulator tracing memory accesses, IPC, MMIO, hardware exception handling (in the kernel), signal handlers (in userspace), and other external observation of memory accesses.

One specific example from the above: I think that C and C++ volatile accesses need to be usable to reliably trigger a fault, due to failed virtual address translation or bad permissions on the mapped page, on Linux and similar virtual-memory operating systems. This is, for example, heavily used by kernels and other systems-level code. It is even used in userspace for fast bounds checking and, in some cases, deoptimization. On many systems (certainly Linux userspace) these accesses should reliably trigger a signal handler. Signal handlers are not, AFAIK, guaranteed to return.

Given that, it seems very hard to argue that a volatile memory access must not trap, and must not unwind the stack -- signal handlers are allowed to do that, and it seems like a long-held valid use case for volatile access to memory is to trigger a specific signal handler. Perhaps this use case just didn't come up in the prior discussions?
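To make the use case concrete, here is a minimal sketch of a userspace fault probe, assuming a POSIX system such as Linux. The names `probe_read`, `on_fault`, and `map_page` are hypothetical and come from neither the patch nor LLVM. The point it illustrates is that the handler leaves via `siglongjmp` instead of returning, so when the volatile load faults, execution never proceeds past it.

```cpp
#include <cassert>
#include <cstddef>
#include <setjmp.h>
#include <signal.h>
#include <sys/mman.h>
#include <unistd.h>

// Hypothetical helper names, for illustration only.
static sigjmp_buf g_recover;

// The handler does not return to the faulting instruction; it jumps
// straight back to the sigsetjmp call in probe_read.
static void on_fault(int) { siglongjmp(g_recover, 1); }

// Returns true if reading *addr faults, false if the read completes.
static bool probe_read(const volatile char* addr) {
  struct sigaction sa = {}, old_segv = {}, old_bus = {};
  sa.sa_handler = on_fault;
  sigemptyset(&sa.sa_mask);
  sigaction(SIGSEGV, &sa, &old_segv);
  sigaction(SIGBUS, &sa, &old_bus);

  bool faulted;
  if (sigsetjmp(g_recover, 1) == 0) {
    char c = *addr;  // volatile load: may fault and never "return"
    (void)c;
    faulted = false;
  } else {
    faulted = true;  // reached only via siglongjmp from on_fault
  }

  // Restore whatever handlers were installed before the probe.
  sigaction(SIGSEGV, &old_segv, nullptr);
  sigaction(SIGBUS, &old_bus, nullptr);
  return faulted;
}

// Maps one anonymous page with the given protection (assumes Linux-style
// MAP_ANONYMOUS is available).
static void* map_page(int prot) {
  const std::size_t page = static_cast<std::size_t>(sysconf(_SC_PAGESIZE));
  void* p = mmap(nullptr, page, prot, MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
  assert(p != MAP_FAILED);
  return p;
}
```

Calling `probe_read` on a page mapped with `PROT_NONE` is expected to report a fault, and on a page mapped with `PROT_READ` to report none. Whether a given compiler is obliged to preserve this behavior is exactly the question under discussion here.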

Another use case I have is very specifically to make a store not dead to the compiler if it is reachable. Benchmark utilities have relied on this usage of volatile for many years, and it would seem painful to try to re-teach people to use a different facility.
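The benchmark idiom referred to above usually looks something like the following sketch. The names `g_sink`, `sum_squares`, and `run_benchmark` are hypothetical, and real benchmark libraries may use different mechanisms; this only shows the plain `volatile` variant.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical names, for illustration only.
// Storing results here keeps the computation from being treated as dead,
// because each volatile store is an observable side effect.
static volatile std::uint64_t g_sink;

// The work being measured: sum of i*i for i in [0, n).
static std::uint64_t sum_squares(std::uint64_t n) {
  std::uint64_t acc = 0;
  for (std::uint64_t i = 0; i < n; ++i) acc += i * i;
  return acc;
}

// Runs the measured work `iterations` times. Without the volatile store,
// the result would be unused and the whole loop could be optimized away.
static void run_benchmark(int iterations) {
  assert(iterations > 0);
  for (int i = 0; i < iterations; ++i) {
    g_sink = sum_squares(1000);
  }
}
```

The store to `g_sink` is never read back by the benchmark itself; it exists only so that the compiler has to materialize the result on every iteration.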

The last use case I have is to make a store not dead to the compiler even if it is followed by something unreachable. This comes up in a few rare cases in benchmarks, and in a few other places. It is perhaps the least important use case. But it happens to be easily satisfied by the combination of the other two: volatile accesses may intentionally trigger signal handlers, and a volatile store is not considered "dead" even if its value is unused.

I will point out that, of all the modern compilers I could test, only Clang trunk deletes a volatile store when it is followed by clearly unreachable code: https://compiler-explorer.com/z/EGTofn

I think Clang deviating in the semantics it provides for volatile stores is a serious mistake and bug. So my argument for reverting would be: "we need to fix Clang to use some other lowering of volatile stores, given the clarified LLVM semantics, before landing patches that implement them", as otherwise we will regress users of Clang. If folks want to revisit the LLVM semantics in light of this, as I said, I don't have an especially strong opinion either way. My mild opinion is that it would be more useful for `volatile` accesses in LLVM to have a superset of the possible behaviors (and thus restrictions on optimizations) placed upon a call to an unknown function, in that execution might not proceed past the store. That would allow Clang and other frontends to use it for lowering these kinds of operations.

Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D87149/new/

https://reviews.llvm.org/D87149