[llvm] [AMDGPU] Introduce "amdgpu-sw-lower-lds" pass to lower LDS accesses. (PR #87265)

Matt Arsenault via llvm-commits llvm-commits at lists.llvm.org
Mon Aug 12 11:51:30 PDT 2024


================
@@ -0,0 +1,1335 @@
+//===-- AMDGPUSwLowerLDS.cpp -----------------------------------------===//
+//
+// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.
+// See https://llvm.org/LICENSE.txt for license information.
+// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
+//
+//===----------------------------------------------------------------------===//
+//
+// This pass lowers the local data store, LDS, uses in kernel and non-kernel
+// functions in module to use dynamically allocated global memory.
+// Packed LDS Layout is emulated in the global memory.
+// The lowered memory instructions from LDS to global memory are then
+// instrumented for address sanitizer, to catch addressing errors.
+//
+// Replacement of Kernel LDS accesses:
+//    For a kernel, LDS access can be static or dynamic which are direct
+//    (accessed within kernel) and indirect (accessed through non-kernels).
+//    All these LDS accesses corresponding to kernel will be packed together,
+//    where all static LDS accesses will be allocated first and then dynamic
+//    LDS follows. The total size with alignment is calculated. A new LDS global
+//    will be created for the kernel called "SW LDS" and it will have the
+//    attribute "amdgpu-lds-size" attached with value of the size calculated.
+//    All the LDS accesses in the module will be replaced by GEP with offset
+//    into the "SW LDS".
+//    A new "llvm.amdgcn.<kernel>.dynlds" is created per kernel accessing
+//    the dynamic LDS. This will be marked used by kernel and will have
+//    MD_absolute_symbol metadata set to total static LDS size, since dynamic
+//    LDS allocation starts after all static LDS allocation.
+//
+//    A device global memory equal to the total LDS size will be allocated.
+//    At the prologue of the kernel, a single work-item from the
+//    work-group, does a "malloc" and stores the pointer of the
+//    allocation in "SW LDS".
+//
+//    To store the offsets corresponding to all LDS accesses, another global
+//    variable is created which will be called "SW LDS metadata" in this pass.
+//    - SW LDS Global:
+//        It is LDS global of ptr type with name
+//        "llvm.amdgcn.sw.lds.<kernel-name>".
+//    - Metadata Global:
+//        It is of struct type, with n members. n equals the number of LDS
+//        globals accessed by the kernel(direct and indirect). Each member of
+//        struct is another struct of type {i32, i32, i32}. First member
+//        corresponds to offset, second member corresponds to size of LDS global
+//        being replaced and third represents the total aligned size. It will
+//        have name "llvm.amdgcn.sw.lds.<kernel-name>.md". This global will have
+//        an initializer with static LDS related offsets and sizes initialized.
+//        But for dynamic LDS related entries, offsets will be initialized to
+//        previous static LDS allocation end offset. Sizes for them will be zero
+//        initially. These dynamic LDS offset and size values will be updated
+//        with in the kernel, since kernel can read the dynamic LDS size
----------------
arsenm wrote:

```suggestion
//        within the kernel, since kernel can read the dynamic LDS size
```
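
For readers following the quoted header comment, the initial contents of the metadata global can be sketched roughly as below. This is only an illustration under stated assumptions, not code from the patch: `LdsGlobal`, `MetadataEntry`, `alignUp`, and `buildInitialMetadata` are hypothetical names, and the "total aligned size" is assumed here to be the size rounded up to the global's own alignment, which may differ from what the pass actually computes.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical description of one LDS global (not a type from the patch).
struct LdsGlobal {
  uint32_t Size;   // size in bytes; unknown (0) for dynamic LDS at compile time
  uint32_t Align;  // required alignment, assumed to be a power of two
  bool IsDynamic;  // true for dynamic LDS
};

// Mirrors the {i32, i32, i32} member described in the header comment:
// offset, size of the replaced LDS global, and total aligned size.
struct MetadataEntry {
  uint32_t Offset;
  uint32_t Size;
  uint32_t AlignedSize;
};

// Round Value up to the next multiple of Align.
inline uint32_t alignUp(uint32_t Value, uint32_t Align) {
  assert(Align != 0 && (Align & (Align - 1)) == 0 && "Align must be a power of two");
  return (Value + Align - 1) & ~(Align - 1);
}

// Build the initializer: static LDS is packed first, dynamic LDS follows.
std::vector<MetadataEntry>
buildInitialMetadata(const std::vector<LdsGlobal> &Globals) {
  std::vector<MetadataEntry> Entries(Globals.size());
  uint32_t Offset = 0;

  // Pass 1: lay out all static LDS globals back to back.
  for (std::size_t I = 0; I < Globals.size(); ++I) {
    const LdsGlobal &G = Globals[I];
    if (G.IsDynamic)
      continue;
    Offset = alignUp(Offset, G.Align);
    uint32_t Aligned = alignUp(G.Size, G.Align); // assumption, see lead-in
    Entries[I] = {Offset, G.Size, Aligned};
    Offset += Aligned;
  }

  // Pass 2: dynamic LDS entries start at the end of the static allocation,
  // with sizes left at zero until the kernel fills them in at run time.
  for (std::size_t I = 0; I < Globals.size(); ++I)
    if (Globals[I].IsDynamic)
      Entries[I] = {Offset, 0, 0};

  return Entries;
}
```

In this sketch the dynamic entries share the static end offset, matching the comment's statement that dynamic LDS allocation starts after all static LDS allocation.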

https://github.com/llvm/llvm-project/pull/87265

