[Openmp-commits] [PATCH] D64217: [OpenMP][NFCI] Cleanup the target state queue implementation
Alexey Bataev via Phabricator via Openmp-commits
openmp-commits at lists.llvm.org
Fri Jul 5 19:29:09 PDT 2019
ABataev added inline comments.
Comment at: openmp/libomptarget/deviceRTLs/nvptx/src/state-queue.h:29
- ElementType elements[SIZE];
- volatile ElementType *elementQueue[SIZE];
- volatile uint32_t head;
> ABataev wrote:
> > jdoerfert wrote:
> > > ABataev wrote:
> > > > I would like to keep volatile modifier at least for cuda 8. Without volatile it may produce incorrect code. I would keep it until we drop support for cuda 8.
> > > > Plus, I would suggest to test it very seriously for other versions of cuda. Does it really works correctly? Ptxas may use some incorrect optimizations without volatile. Though I like the idea of removing them.
> > > These members are accessed only through *atomic* accesses. Why do we would require volatile in addition?
> > Without `volatile` they were not initialized properly on cuda 8. This, again, seems to me like some kind of a bug in ptxas for cuda 8. Not sure about this problem in cuda 9, it requires some additional testing.
> I actually did not notice that there is apparently no initialization of these values (or did I miss it?). If there is none, than that is the problem which needs fixing, e.g., `uint32_t Head = 0`.
They are initialized, but without volatile qualifier they are not initialized in the compiled code.
rG LLVM Github Monorepo
CHANGES SINCE LAST ACTION
More information about the Openmp-commits