[Openmp-commits] [PATCH] D46185: [OpenMP] Allow nvptx sm_30 to be used as an offloading device

Jonas Hahnfeld via Phabricator via Openmp-commits openmp-commits at lists.llvm.org
Fri Apr 27 08:28:07 PDT 2018


Hahnfeld added a comment.

In https://reviews.llvm.org/D46185#1081111, @grokos wrote:

> I don't think this code does atomic max. Actually, I don't think it does anything at all. First off, `old_value` and `elem` are variables which do not change inside the while-loop. So the first loop condition is either always true or always false. Secondly, `atomicCAS` returns the value that existed at address `Buffer` (regardless of whether or not the CAS operation succeeded) and not a boolean result. Also, `old_value` must be updated before each CAS attempt because some other thread may have changed it.
>
> A correct implementation would look somewhat like this:
>
>   uint64_t old_value = *Buffer;
>   uint64_t expected_old_value;
>   do {
>     expected_old_value = old_value;
>     old_value = atomicCAS((unsigned long long *) Buffer,
>                           (unsigned long long) expected_old_value,
>                           (unsigned long long) max(expected_old_value, elem));
>   } while (expected_old_value != old_value);
>
>
> What this code does is: check that the value at `Buffer` is still what I expect it to be, i.e. no other thread has changed it, and replace it with `max(old_value, elem)`.


Fully agree with George's analysis here. The proposed implementation may be tweaked to early exit if `elem < *Buffer`.

Anyway, why do you want to support `sm_30`? Is there a particular hardware that you want to support? Why does AMD care?!?


Repository:
  rOMP OpenMP

https://reviews.llvm.org/D46185





More information about the Openmp-commits mailing list