[Openmp-commits] [PATCH] D101976: [OpenMP] Unified entry point for SPMD & generic kernels in the device RTL
Jon Chesterfield via Phabricator via Openmp-commits
openmp-commits at lists.llvm.org
Wed May 12 12:16:35 PDT 2021
JonChesterfield accepted this revision.
JonChesterfield added a comment.
This revision is now accepted and ready to land.
I'm not certain what this 'aligned' limitation for nvptx syncthreads is, but I can't think of a corresponding one for amdgcn. So we may not need the LDS barrier construction there, and it'll be much faster if we don't.
This was reported as working on amdgpu by a third party against an earlier trunk build, but sadly the current trunk seems to have regressed (I'm debugging that offline). So I have no reason to believe this doesn't work, and some reason to believe it does. Objection withdrawn.
The code itself always looked fine; I was only nervous about the changes to concurrency primitives in nvptx.
Repository:
rG LLVM Github Monorepo
CHANGES SINCE LAST ACTION
https://reviews.llvm.org/D101976/new/
https://reviews.llvm.org/D101976