[llvm] [AMDGPU] Introduce "amdgpu-sw-lower-lds" pass to lower LDS accesses. (PR #87265)
Matt Arsenault via llvm-commits
llvm-commits at lists.llvm.org
Mon Aug 12 11:51:30 PDT 2024
================
@@ -0,0 +1,1335 @@
+//===-- AMDGPUSwLowerLDS.cpp -----------------------------------------===//
+//
+// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.
+// See https://llvm.org/LICENSE.txt for license information.
+// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
+//
+//===----------------------------------------------------------------------===//
+//
+// This pass lowers the local data store, LDS, uses in kernel and non-kernel
+// functions in module to use dynamically allocated global memory.
+// Packed LDS Layout is emulated in the global memory.
+// The lowered memory instructions from LDS to global memory are then
+// instrumented for address sanitizer, to catch addressing errors.
+//
+// Replacement of Kernel LDS accesses:
+// For a kernel, LDS access can be static or dynamic which are direct
+// (accessed within kernel) and indirect (accessed through non-kernels).
+// All these LDS accesses corresponding to kernel will be packed together,
+// where all static LDS accesses will be allocated first and then dynamic
+// LDS follows. The total size with alignment is calculated. A new LDS global
+// will be created for the kernel called "SW LDS" and it will have the
+// attribute "amdgpu-lds-size" attached with value of the size calculated.
+// All the LDS accesses in the module will be replaced by GEP with offset
+// into the "Sw LDS".
+// A new "llvm.amdgcn.<kernel>.dynlds" is created per kernel accessing
+// the dynamic LDS. This will be marked used by kernel and will have
+// MD_absolue_symbol metadata set to total static LDS size, Since dynamic
+// LDS allocation starts after all static LDS allocation.
+//
+// A device global memory equal to the total LDS size will be allocated.
+// At the prologue of the kernel, a single work-item from the
+// work-group, does a "malloc" and stores the pointer of the
+// allocation in "SW LDS".
+//
+// To store the offsets corresponding to all LDS accesses, another global
+// variable is created which will be called "SW LDS metadata" in this pass.
+// - SW LDS Global:
+// It is LDS global of ptr type with name
+// "llvm.amdgcn.sw.lds.<kernel-name>".
+// - Metadata Global:
+// It is of struct type, with n members. n equals the number of LDS
+// globals accessed by the kernel(direct and indirect). Each member of
+// struct is another struct of type {i32, i32, i32}. First member
+// corresponds to offset, second member corresponds to size of LDS global
+// being replaced and third represents the total aligned size. It will
+// have name "llvm.amdgcn.sw.lds.<kernel-name>.md". This global will have
+// an intializer with static LDS related offsets and sizes initialized.
+// But for dynamic LDS related entries, offsets will be intialized to
+// previous static LDS allocation end offset. Sizes for them will be zero
+// initially. These dynamic LDS offset and size values will be updated
+// with in the kernel, since kernel can read the dynamic LDS size
----------------
arsenm wrote:
```suggestion
// within the kernel, since kernel can read the dynamic LDS size
```
https://github.com/llvm/llvm-project/pull/87265
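
For readers following the quoted header comment, here is a rough sketch of how the packed layout and the {i32, i32, i32} metadata entries it describes might be computed. This is not the pass's actual code: `LDSEntry`, `alignUp`, `packLDS`, and the single common alignment parameter are all hypothetical names and simplifications, and the address sanitizer instrumentation is left out entirely.

```cpp
#include <cstdint>
#include <vector>

// Hypothetical mirror of one {i32, i32, i32} metadata entry.
struct LDSEntry {
  uint32_t Offset;      // byte offset into the emulated LDS block
  uint32_t Size;        // size of the LDS global being replaced
  uint32_t AlignedSize; // size rounded up to the chosen alignment
};

// Round Value up to the next multiple of Align (Align must be a power of two).
inline uint32_t alignUp(uint32_t Value, uint32_t Align) {
  return (Value + Align - 1) & ~(Align - 1);
}

// Assumption: static entries are packed first, in order. Each dynamic entry
// then starts at the end of the static allocation with size zero, to be
// updated at run time inside the kernel.
inline std::vector<LDSEntry> packLDS(const std::vector<uint32_t> &StaticSizes,
                                     unsigned NumDynamic, uint32_t Align) {
  std::vector<LDSEntry> Entries;
  uint32_t Offset = 0;
  for (uint32_t Size : StaticSizes) {
    uint32_t Aligned = alignUp(Size, Align);
    Entries.push_back({Offset, Size, Aligned});
    Offset += Aligned;
  }
  for (unsigned I = 0; I < NumDynamic; ++I)
    Entries.push_back({Offset, 0, 0});
  return Entries;
}
```

With static sizes 10, 16, and 3 and an alignment of 8, the static entries land at offsets 0, 16, and 32, and a dynamic entry would start at offset 40.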