[PATCH] D91516: [AMDGPU] Support for device scope shared variables

Fri Jan 8 03:04:47 PST 2021

hsmhsm added inline comments.

================
Comment at: llvm/lib/Target/AMDGPU/AMDGPUDeviceScopeSharedVariable.cpp:1486
+
+  // Perform all the necessary processing required.
+  processDeviceScopeSharedVariables(M, Kernels, LDSGlobals, LDSToFunction,
----------------
arsenm wrote:
> Pointless comment
Taken care

================
Comment at: llvm/lib/Target/AMDGPU/AMDGPUDeviceScopeSharedVariable.cpp:1505
+                      << ", skipping handling device of scope shared variables"
+                      << "\n");
+    return false;
----------------
arsenm wrote:
> Can merge this \n with the previous string
Taken care.

Further, as mentioned in the review comments below, I removed all the redundant code within this function, and since we had then left with single call, I just inlined this call.

================
Comment at: llvm/lib/Target/AMDGPU/AMDGPUDeviceScopeSharedVariable.cpp:1510
+  // Collect all the amdgpu kernels defined within the current module.
+  SetVector<Function *> Kernels;
+  for (auto &F : M.functions())
----------------
arsenm wrote:
> Don't see why this is a SetVector, the ordering shouldn't matter
I intentionally used SetVector in order to make sure that the TRAVERSAL ORDER is preserved during multiple traversals of this set in different places. 

Is it harm to use SetVector here? Is it causes noticible performance issues? If not, why not we use it when it would provide additional benifit of preserving TRAVERSAL ORDER?

================
Comment at: llvm/lib/Target/AMDGPU/AMDGPUDeviceScopeSharedVariable.cpp:1511
+  SetVector<Function *> Kernels;
+  for (auto &F : M.functions())
+    if ((F.getCallingConv() == CallingConv::AMDGPU_KERNEL) &&
----------------
arsenm wrote:
> Braces
taken care

================
Comment at: llvm/lib/Target/AMDGPU/AMDGPUDeviceScopeSharedVariable.cpp:1512
+  for (auto &F : M.functions())
+    if ((F.getCallingConv() == CallingConv::AMDGPU_KERNEL) &&
+        !F.isDeclaration())
----------------
arsenm wrote:
> Probably should handle all entry function CCs
I assume, the only kind entry functions are KERNELS (for hip programming language). And, this pass is targetted only for HIP. With this context, is there anything that we are missing?

================
Comment at: llvm/lib/Target/AMDGPU/AMDGPUDeviceScopeSharedVariable.cpp:1519
+                      << ", skipping handling of device scope shared variables"
+                      << "\n");
+    return false;
----------------
arsenm wrote:
> Can merge this \n with the previous string
taken care

================
Comment at: llvm/lib/Target/AMDGPU/AMDGPUDeviceScopeSharedVariable.cpp:1527-1529
+  LLVM_DEBUG(dbgs() << "===== Handling device scope shared variables in the "
+                       "module "
+                    << M.getName() << "\n");
----------------
arsenm wrote:
> Don't need this, the pass banner will be redundant
taken care

================
Comment at: llvm/lib/Target/AMDGPU/AMDGPUDeviceScopeSharedVariable.cpp:1531-1533
+  // TODO: We only want to handle HIP kernels, and no kernels from from other
+  // programming languages, like OpenCL, OpenMP, etc. Do we need to add a
+  // condition here for it, and skip running the pass for non-HIP kernels?
----------------
arsenm wrote:
> Language doesn't matter, only the IR. This comment should be removed
taken care

================
Comment at: llvm/lib/Target/AMDGPU/AMDGPUDeviceScopeSharedVariable.cpp:1534-1539
+  if (skipModule(M)) {
+    LLVM_DEBUG(dbgs() << "Skipping handling of device scope shared variables "
+                         "in the module "
+                      << M.getName() << "\n");
+    return false;
+  }
----------------
arsenm wrote:
> This is required for correctness, this can't be skipped
taken care

================
Comment at: llvm/lib/Target/AMDGPU/AMDGPUDeviceScopeSharedVariable.cpp:1534-1539
+  if (skipModule(M)) {
+    LLVM_DEBUG(dbgs() << "Skipping handling of device scope shared variables "
+                         "in the module "
+                      << M.getName() << "\n");
+    return false;
+  }
----------------
hsmhsm wrote:
> arsenm wrote:
> > This is required for correctness, this can't be skipped
> taken care
taken care

================
Comment at: llvm/lib/Target/AMDGPU/AMDGPUDeviceScopeSharedVariable.cpp:1543-1545
+  LLVM_DEBUG(dbgs() << "===== Done with handling device scope shared variables "
+                       "in the module "
+                    << M.getName() << "\n");
----------------
arsenm wrote:
> Don't need this debug printing
taken care

================
Comment at: llvm/lib/Target/AMDGPU/AMDGPUTargetMachine.h:43
   static bool EnableFixedFunctionABI;
+  static bool EnableDeviceScopeSharedVariable;

----------------
arsenm wrote:
> I don't think this needs to be exposed here, there's no reason other places would need to inspect this
This is needed in the function `AMDGPUAlwaysInlinePass.cpp`. Please look at the changes being made to this file.

Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D91516/new/

https://reviews.llvm.org/D91516