[Openmp-commits] [PATCH] D108398: [libomptarget] Specialize amdgpu devicertl on wave size for gfx10

Fri Aug 27 04:23:51 PDT 2021

JonChesterfield added reviewers: ronlieb, dpalermo, dhruvachak.
JonChesterfield added inline comments.

================
Comment at: openmp/libomptarget/deviceRTLs/amdgcn/src/target_impl.h:48

+namespace detail {
+template <unsigned> struct UnsignedToType;
----------------
As a performance optimisation, the gain from this is probably in the noise.

However, it will eliminate all the warp32 vs wave64 differences in the deviceRTL, which makes gfx10 a useful datapoint when debugging code that works on nvptx but fails on amdgpu. That is, if the code works on gfx10 in wave32 mode, that suggests the bug is related to wave size. If it still fails, that suggests the bug is not related to wave size.
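
For illustration, a minimal sketch of how a trait like UnsignedToType could select the lane mask type from the wave size. Only the primary template declaration comes from the patch context quoted above; the two specializations, the lanemask_t alias and the all_lanes helper are assumptions made for this example, not the contents of the patch.

```cpp
#include <cstdint>

namespace detail {
// Primary template is declared but not defined, so an unsupported
// wave size is a compile-time error rather than a silent fallback.
template <unsigned> struct UnsignedToType;

// Assumed specializations: one mask bit per lane in the wave.
template <> struct UnsignedToType<32> { using type = uint32_t; };
template <> struct UnsignedToType<64> { using type = uint64_t; };
} // namespace detail

// Hypothetical alias: the integer type wide enough for one bit per lane.
template <unsigned WaveSize>
using lanemask_t = typename detail::UnsignedToType<WaveSize>::type;

// Hypothetical helper: a mask with every lane bit set.
template <unsigned WaveSize>
constexpr lanemask_t<WaveSize> all_lanes() {
  return ~static_cast<lanemask_t<WaveSize>>(0);
}

static_assert(sizeof(lanemask_t<32>) == 4, "wave32 mask is 32 bits");
static_assert(sizeof(lanemask_t<64>) == 8, "wave64 mask is 64 bits");
```

With a scheme along these lines, a wave32 build uses the same 32-bit mask width as nvptx, so only the wave64 build differs.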

Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D108398/new/

https://reviews.llvm.org/D108398