[Openmp-commits] [PATCH] D108708: [openmp][amdgpu] Initial gfx10 offloading implementation
Jon Chesterfield via Phabricator via Openmp-commits
openmp-commits at lists.llvm.org
Wed Aug 25 09:15:27 PDT 2021
JonChesterfield added inline comments.
================
Comment at: llvm/include/llvm/Frontend/OpenMP/OMPGridValues.h:102
+
+template <uint32_t wavesize> constexpr const GV &getAMDGPUGridValues() {
+ static_assert(wavesize == 32 || wavesize == 64, "");
----------------
Want to resolve this at compile time for the deviceRTL. Hoping to think of a prettier way to spell it. Currently all but one field is the same for the two tables, which is probably suboptimal for performance.
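For readers following along, a minimal sketch of the compile-time selection described above. The cut-down GV struct, the two table names, and the field values are assumptions for illustration, not copied from OMPGridValues.h; only the template signature and the wavesize check come from the patch.

```cpp
#include <cstdint>

// Hypothetical, cut-down stand-in for llvm::omp::GV (the real struct has
// more fields).
struct GV {
  uint32_t GV_Warp_Size;
  uint32_t GV_Max_WG_Size;
};

// Illustrative values only. The two tables differ in a single field, which
// mirrors the "all but one field" remark in the comment above.
constexpr GV SketchGridValues64 = {64, 1024};
constexpr GV SketchGridValues32 = {32, 1024};

// The wave size is a template parameter, so the table is chosen during
// compilation and an unsupported size is rejected by the static_assert.
template <uint32_t wavesize> constexpr const GV &getAMDGPUGridValues() {
  static_assert(wavesize == 32 || wavesize == 64, "unsupported wavefront size");
  return wavesize == 32 ? SketchGridValues32 : SketchGridValues64;
}
```

With this shape, getAMDGPUGridValues<32>().GV_Warp_Size is a constant expression, so the device runtime should not need a runtime branch on wave size.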
================
Comment at: openmp/libomptarget/deviceRTLs/amdgcn/CMakeLists.txt:111
+set(mcpus gfx700 gfx701 gfx801 gfx803 gfx900 gfx902 gfx906 gfx908 gfx1010 gfx1031)
if (DEFINED LIBOMPTARGET_AMDGCN_GFXLIST)
set(mcpus ${LIBOMPTARGET_AMDGCN_GFXLIST})
----------------
I've got a gfx1010 locally and @dpalermo has a gfx1031.
================
Comment at: openmp/libomptarget/deviceRTLs/amdgcn/src/target_impl.h:36
INLINE constexpr const llvm::omp::GV &getGridValue() {
+ return llvm::omp::getAMDGPUGridValues<__AMDGCN_WAVEFRONT_SIZE>();
----------------
Compiling for device code means a macro (__AMDGCN_WAVEFRONT_SIZE) is available to pick between the two.
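A rough sketch of how a macro can feed the template argument. SKETCH_WAVEFRONT_SIZE, warpSizeFor, and getWarpSize are hypothetical names, and the fallback value of 64 is an assumption made only so the snippet also compiles on a host, where __AMDGCN_WAVEFRONT_SIZE is not defined.

```cpp
#include <cstdint>

// Use the device macro when the compiler provides it. The host fallback of
// 64 is an assumption for this sketch, not something the patch does.
#if defined(__AMDGCN_WAVEFRONT_SIZE)
#define SKETCH_WAVEFRONT_SIZE __AMDGCN_WAVEFRONT_SIZE
#else
#define SKETCH_WAVEFRONT_SIZE 64
#endif

// Hypothetical helper standing in for the per-wavesize table lookup.
template <uint32_t wavesize> constexpr uint32_t warpSizeFor() {
  static_assert(wavesize == 32 || wavesize == 64, "unsupported wavefront size");
  return wavesize;
}

// Device-side wrapper: the macro is expanded by the preprocessor, so the
// template instantiation is fixed per compilation target.
constexpr uint32_t getWarpSize() {
  return warpSizeFor<SKETCH_WAVEFRONT_SIZE>();
}
```

Because the macro is fixed per target, each build of the device runtime would instantiate exactly one of the two variants.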
Repository:
rG LLVM Github Monorepo
CHANGES SINCE LAST ACTION
https://reviews.llvm.org/D108708/new/
https://reviews.llvm.org/D108708