[Openmp-commits] [PATCH] D112156: [OpenMP] Ensure broken assumptions print once, not thousands of times.

Johannes Doerfert via Phabricator via Openmp-commits openmp-commits at lists.llvm.org
Thu Oct 21 07:17:04 PDT 2021


jdoerfert added inline comments.


================
Comment at: openmp/libomptarget/DeviceRTL/include/Utils.h:80
+/// guarded such that release mode execution will not be impacted.
+template <auto ID> struct SingletonFlag {
+
----------------
JonChesterfield wrote:
> Are we c++17 in the deviceRTL? Can presumably be whatever clang trunk has implemented
Runs w/o warnings for me. I'll use an integer type.


================
Comment at: openmp/libomptarget/DeviceRTL/include/Utils.h:99
+  }
+};
+
----------------
JonChesterfield wrote:
> JonChesterfield wrote:
> > Or we could use zero for the initial value and write 1 into it. Seq_cst seems unnecessary - acq_rel? Global ordering on GPUs is expensive and not obviously required here, don't care which thread wins the race to print a debug message
> Actually worse than that - seq cst induces a global order. To make that work, we probably change thread execution ordering. Want to minimise assert's influence on dynamic control flow
> Or we could use zero for the initial value and write 1 into it.

We could, but then we would pay a cost even if this code is dead. I'm fine either way.

> ...

This is used in front of a __builtin_trap, or at least guarded by a debug/configuration kind. It will not be fast but it doesn't have to be.


================
Comment at: openmp/libomptarget/DeviceRTL/src/Synchronization.cpp:263
+uint32_t atomic::exchange(uint32_t *Addr, uint32_t V, int Ordering) {
+  return impl::atomicExchange(Addr, V, Ordering);
+}
----------------
JonChesterfield wrote:
> Normally takes two orderings
No it does not. This is not cas.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D112156/new/

https://reviews.llvm.org/D112156



More information about the Openmp-commits mailing list