[PATCH] D91516: [AMDGPU][WIP] Lower Function Local LDS Variables.

Matt Arsenault via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Tue Jan 26 19:34:28 PST 2021


arsenm added inline comments.


================
Comment at: llvm/lib/Target/AMDGPU/AMDGPULowerFunctionLocalLDS.cpp:125-128
+  void pairUpKernelWithLDSList() {
+    for (auto *K : Kernels)
+      pairUpKernelWithLDSList(K);
+  }
----------------
Should just inline this function


================
Comment at: llvm/lib/Target/AMDGPU/AMDGPULowerFunctionLocalLDS.cpp:168
+  // paths from kernels to these callees.
+  void pairUpKernelWithCalleeList() {
+    for (auto *K : Kernels)
----------------
Should just inline this function


================
Comment at: llvm/lib/Target/AMDGPU/AMDGPULowerFunctionLocalLDS.cpp:182
+  // Pair up each LDS with the function within which the LDS is defined.
+  void pairUpLDSWithItsAssociatedFunction() {
+    for (auto *LDS : LDSGlobals)
----------------
Should just inline this function


================
Comment at: llvm/lib/Target/AMDGPU/AMDGPULowerFunctionLocalLDS.cpp:205
+  std::set<GlobalVariable *> LDSSet;
+  auto Callees = KernelToCallee[K];
+
----------------
Copy here, just directly use this in the for loop


================
Comment at: llvm/lib/Target/AMDGPU/AMDGPULowerFunctionLocalLDS.cpp:111
+    for (auto *LDS : LDSGlobals)
+      if (LDSToFunction.find(LDS) == LDSToFunction.end())
+        ToBeRemovedLDSList.insert(LDS);
----------------
hsmhsm wrote:
> arsenm wrote:
> > .contains
> The data structures, ValueMap<>,  SmallPtrSet<> do not have member function -  `.contains()`. W.r.t std::set<>, this member function is supported in C++20.
There is an llvm::is_contained. Also why use std::set? You randomly switch set types around here


================
Comment at: llvm/lib/Target/AMDGPU/AMDGPULowerFunctionLocalLDS.cpp:410-411
+
+// Create a reverse map from function to LDS set which maps a given function F
+// to a set of LDS which are defined within F.
+void LowerFunctionLocalLDSImpl::createFunctionToLDSMap() {
----------------
hsmhsm wrote:
> arsenm wrote:
> > This concept doesn't quite work for the IR. The same global can appear in multiple functions
> My understanding is that - scope of the shared variable is function/statement block scope. It is not available to access outside this scope. It is just that we implement it as global, just like how the local static variables are implemented in C/C++?
> 
> Can you give an example of the use-case that you are claiming?
The IR has absolutely no concept of these scopes. The global variables have global scope and no restriction on where their uses can appear. Whether or not this directly corresponds to a direct language feature is unimportant. Some IPO transforms can push global variable references into other functions.

The example is just two functions that refer to the same variable:


```
@lds = ...

define void @func0() {
  store i32 0, i32* @lds
  ret void
}

define void @func1() {
  store i32 0, i32* @lds
  ret void
}
```


================
Comment at: llvm/lib/Target/AMDGPU/AMDGPULowerFunctionLocalLDS.cpp:502
+    if (GV.getAddressSpace() == AMDGPUAS::LOCAL_ADDRESS &&
+        !GV.hasExternalLinkage())
+      LDSGlobals.insert(&GV);
----------------
hsmhsm wrote:
> arsenm wrote:
> > Probably should skip declarations. Also not sure about the linkage check
> The linkage test is required to ignore the `dynamic shared variables` like the one defined as `extern __shared__ int dy_sm[];`  where size of `dy_sm` is not known at compile time, but is passed as one of the kernel execution configuration parameters at run time.
That's a function of it having 0 size, not the linkage


================
Comment at: llvm/lib/Target/AMDGPU/AMDGPULowerFunctionLocalLDS.cpp:561-571
+// Entry-point function.
+bool LowerFunctionLocalLDSImpl::lower() {
+  // Build necessary data structures which will be later used in the lowering
+  // process.
+  if (!initialize())
+    return false;
+
----------------
hsmhsm wrote:
> arsenm wrote:
> > Don't understand the point of this stub function
> This is a driver function, it looks like a stub now, since implementation is not complete yet. Once this patch is accepted, next step is to (1) define kernel specific LDS layouts (2) create 2D offset table and (3) add new implicit argument.
If it's going to be split, I'd rather see the full stack for the review


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D91516/new/

https://reviews.llvm.org/D91516



More information about the llvm-commits mailing list