[Openmp-commits] [PATCH] D108398: [libomptarget] Specialize amdgpu devicertl on wave size for gfx10
Jon Chesterfield via Phabricator via Openmp-commits
openmp-commits at lists.llvm.org
Fri Aug 27 04:23:51 PDT 2021
JonChesterfield added reviewers: ronlieb, dpalermo, dhruvachak.
JonChesterfield added inline comments.
================
Comment at: openmp/libomptarget/deviceRTLs/amdgcn/src/target_impl.h:48
+namespace detail {
+template <unsigned> struct UnsignedToType;
----------------
As a performance optimisation, this is probably in the noise.
However, it will eliminate all the warp32 vs wave64 differences in the deviceRTL, which makes gfx10 a useful datapoint when debugging code that works on nvptx and fails on amdgpu. That is, if gfx10 works, it suggests the bug is related to wave size. If gfx10 fails, it suggests the bug is not related to wave size.
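
For readers without the diff to hand, here is a minimal sketch of how a size-to-type mapping such as UnsignedToType could be used to pick the lane mask type from the wave size. Only the declaration `template <unsigned> struct UnsignedToType;` in `namespace detail` comes from the patch context quoted above. The two specialisations, the `SKETCH_WAVESIZE` macro, `WaveSize`, `LaneMask` and `allLanesActive` are assumptions made for illustration and may not match what the patch actually does.

```cpp
#include <cstdint>
#include <type_traits>

namespace detail {
// Declaration as quoted in the review. Leaving the primary template
// undefined means an unsupported size is a compile-time error.
template <unsigned> struct UnsignedToType;

// Assumed specialisations: one mask bit per lane in the wave.
template <> struct UnsignedToType<32> { using type = std::uint32_t; };
template <> struct UnsignedToType<64> { using type = std::uint64_t; };
} // namespace detail

// Hypothetical build-time switch, e.g. 32 for a gfx10 build, 64 otherwise.
#ifndef SKETCH_WAVESIZE
#define SKETCH_WAVESIZE 64
#endif

constexpr unsigned WaveSize = SKETCH_WAVESIZE;

// Lane mask type follows the wave size, so callers never hard-code 32 or 64.
using LaneMask = detail::UnsignedToType<WaveSize>::type;

// Hypothetical helper: a mask with every lane of a wave of the given size set.
template <unsigned Size>
constexpr typename detail::UnsignedToType<Size>::type allLanesActive() {
  using T = typename detail::UnsignedToType<Size>::type;
  return static_cast<T>(~T{0});
}
```

With a mapping like this, a wave32 build would use the same 32-bit mask width as nvptx, which is the property the comment above relies on for narrowing down wave-size bugs.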
Repository:
rG LLVM Github Monorepo
CHANGES SINCE LAST ACTION
https://reviews.llvm.org/D108398/new/
https://reviews.llvm.org/D108398