[libc-commits] [PATCH] D156731: [libc] Add basic wrappers for the backend address spaces

Mon Jul 31 12:39:43 PDT 2023

jhuber6 created this revision.
jhuber6 added reviewers: jdoerfert, tianshilei1992, arsenm, JonChesterfield, sivachandra, michaelrj.
Herald added subscribers: libc-commits, mattd, asavonic, kerbowa, arichardson, jvesely.
Herald added projects: libc-project, All.
jhuber6 requested review of this revision.
Herald added a subscriber: wdng.

The GPU makes use of different address spaces. We generally work with
global memory, thread private memory, and thread shared memory. This
patch simply adds a few preliminary wrappers to map these concepts to
the numerical values the backend uses. Obviously casts between these
will need to be checked by the user.


Repository:
  rG LLVM Github Monorepo

https://reviews.llvm.org/D156731

Files:
  libc/src/__support/GPU/amdgpu/utils.h
  libc/src/__support/GPU/generic/utils.h
  libc/src/__support/GPU/nvptx/utils.h


Index: libc/src/__support/GPU/nvptx/utils.h
===================================================================

--- libc/src/__support/GPU/nvptx/utils.h
+++ libc/src/__support/GPU/nvptx/utils.h
@@ -19,6 +19,12 @@
 /// The number of threads that execute in lock-step in a warp.
 constexpr const uint64_t LANE_SIZE = 32;
 
+/// Type aliases to the address spaces used by the NVPTX backend.
+template <typename T> using Private = [[clang::address_space(5)]] T;
+template <typename T> using Constant = [[clang::address_space(4)]] T;
+template <typename T> using Shared = [[clang::address_space(3)]] T;
+template <typename T> using Global = [[clang::address_space(1)]] T;
+
 /// Returns the number of CUDA blocks in the 'x' dimension.
 LIBC_INLINE uint32_t get_num_blocks_x() {
   return __nvvm_read_ptx_sreg_nctaid_x();
Index: libc/src/__support/GPU/generic/utils.h
===================================================================
--- libc/src/__support/GPU/generic/utils.h
+++ libc/src/__support/GPU/generic/utils.h
@@ -18,6 +18,11 @@
 
 constexpr const uint64_t LANE_SIZE = 1;
 
+template <typename T> using Private = T;
+template <typename T> using Constant = T;
+template <typename T> using Shared = T;
+template <typename T> using Global = T;
+
 LIBC_INLINE uint32_t get_num_blocks_x() { return 1; }
 
 LIBC_INLINE uint32_t get_num_blocks_y() { return 1; }
Index: libc/src/__support/GPU/amdgpu/utils.h
===================================================================
--- libc/src/__support/GPU/amdgpu/utils.h
+++ libc/src/__support/GPU/amdgpu/utils.h
@@ -20,6 +20,13 @@
 /// The number of threads that execute in lock-step in a lane.
 constexpr const uint64_t LANE_SIZE = __AMDGCN_WAVEFRONT_SIZE;
 
+/// Type aliases to the address spaces used by the AMDGPU backend. We use
+/// 'Shared' instead of 'Local' to maintain consistency with NVPTX.
+template <typename T> using Private = [[clang::address_space(5)]] T;
+template <typename T> using Constant = [[clang::address_space(4)]] T;
+template <typename T> using Shared = [[clang::address_space(3)]] T;
+template <typename T> using Global = [[clang::address_space(1)]] T;
+
 /// Returns the number of workgroups in the 'x' dimension of the grid.
 LIBC_INLINE uint32_t get_num_blocks_x() {
   return __builtin_amdgcn_grid_size_x() / __builtin_amdgcn_workgroup_size_x();


-------------- next part --------------
A non-text attachment was scrubbed...
Name: D156731.545760.patch
Type: text/x-patch
Size: 2326 bytes
Desc: not available
URL: <http://lists.llvm.org/pipermail/libc-commits/attachments/20230731/4a2b7e73/attachment.bin>