[Openmp-commits] [PATCH] D62393: [OPENMP][NVPTX]Mark parallel level counter as volatile.
Alexey Bataev via Phabricator via Openmp-commits
openmp-commits at lists.llvm.org
Thu Jun 13 10:33:24 PDT 2019
ABataev added a comment.
In D62393#1542065 <https://reviews.llvm.org/D62393#1542065>, @__simt__ wrote:
> In D62393#1542009 <https://reviews.llvm.org/D62393#1542009>, @ABataev wrote:
> > No, I'm not relying on the non-optimization of atomics. I need both, volatile semantics and atomic. So, the compiler could not optimize out memory accesses and the access would be atomic.
> I'm going to try to improve the precision on that statement, if you'll allow me.
> Since the address at `&parallelLevel` is necessarily backed by real memory (I mean RAM, and not PCI registers) because it is an instance introduced by C++ code, then many optimizations are not observable regardless of whatever decorations are applied to the declaration. For example `x = *ptr; y = *ptr;` can always be simplified to `x = *ptr; y = x;` when you know that `ptr` points to memory (and some other conditions, like the set of all `y = ...` simplified is finite). There is a long list of such simplifications based on "run in finite isolation" substitutions (`*ptr = x; y = *ptr;` => `*ptr = x; y = x;`... etc...).
> To be clear: there is nothing you can do to prevent ptxas from performing optimizations such as these, even on accesses marked volatile, so the value of the predicate "could not optimize memory accesses" is just hardcoded to false, full stop.
> The specific optimizations you don't want, are the optimizations that aren't valid on `std::atomic<T>` with `memory_order_relaxed`, for example performing substitutions across fences, barriers, and in potentially infinite loops. And again, the final conclusion is completely valid, you disable those optimizations with `*.relaxed.sys` or `*.volatile` in PTX. You just don't disable all optimizations.
Yes, this is what I need and what I'm trying to explain. Thanks, Olivier, you did it better than me.
> In the end the mental model is simple: think as though you had `std::atomic<T>` available to you, the conclusions you draw and the code you write with that will be correct.
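To make that mental model concrete, here is a minimal host-side C++ sketch, not the code from this patch. The names `parallel_level`, `go`, `worker` and `run_workers` are hypothetical stand-ins chosen for illustration. It assumes the counter behaves like a `std::atomic<unsigned>` accessed with `memory_order_relaxed`: adjacent loads may still be folded together, but the load in the wait loop has to be re-read on every iteration.

```cpp
#include <atomic>
#include <cassert>
#include <thread>
#include <vector>

// Hypothetical stand-in for the parallel level counter (illustration only).
std::atomic<unsigned> parallel_level{0};
// Hypothetical start flag that the workers spin on.
std::atomic<bool> go{false};

// Each worker waits for `go`, then enters and leaves one nesting level
// `iterations` times.
void worker(int iterations) {
  // Relaxed atomic load in a potentially infinite loop: the compiler is
  // expected to re-read `go` each iteration rather than hoist the load.
  while (!go.load(std::memory_order_relaxed)) {
    std::this_thread::yield();
  }
  for (int i = 0; i < iterations; ++i) {
    // Atomic read-modify-write operations: no increment or decrement is lost.
    parallel_level.fetch_add(1, std::memory_order_relaxed);
    parallel_level.fetch_sub(1, std::memory_order_relaxed);
  }
}

// Runs `num_threads` workers and returns the final counter value.
unsigned run_workers(int num_threads, int iterations) {
  assert(num_threads > 0);
  parallel_level.store(0, std::memory_order_relaxed);
  go.store(false, std::memory_order_relaxed);

  std::vector<std::thread> threads;
  for (int t = 0; t < num_threads; ++t) {
    threads.emplace_back(worker, iterations);
  }
  go.store(true, std::memory_order_relaxed);
  for (auto &t : threads) {
    t.join();  // join() synchronizes with the completion of each worker
  }
  return parallel_level.load(std::memory_order_relaxed);
}
```

Calling `run_workers(4, 1000)` should return 0, since every increment is paired with a decrement and the read-modify-write operations are atomic. With a plain `bool go`, the wait loop would be a data race and the compiler would be free to read the flag once and spin forever, which is the kind of substitution the relaxed atomic rules out.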