[PATCH] D91516: [AMDGPU] Support for device scope shared variables

Fri Jan 8 08:15:42 PST 2021

hsmhsm added inline comments.

================
Comment at: llvm/lib/Target/AMDGPU/AMDGPUDeviceScopeSharedVariable.cpp:336
+      BasePtr->getType()->getPointerElementType(), BasePtr, LInst,
+      Twine("dssv.gep2.") + Twine(Suffix), const_cast<Instruction *>(&EI));
+
----------------
arsenm wrote:
> There should be no const_casts
But  GetElementPtrInst::CreateInBounds() receives non_cast `Instruction` parameter.

================
Comment at: llvm/lib/Target/AMDGPU/AMDGPUDeviceScopeSharedVariable.cpp:478
+
+  NewParams.push_back(BasePtrTy);
+  NewParams.push_back(KernNumTy);
----------------
arsenm wrote:
> I think we could avoid passing the base pointer since it should always be 0. I would have to think about how to best guarantee this to codegen though
How do you avoid it? I am not getting. Could you be more explicit here?

================
Comment at: llvm/lib/Target/AMDGPU/AMDGPUDeviceScopeSharedVariable.cpp:1510
+  // Collect all the amdgpu kernels defined within the current module.
+  SetVector<Function *> Kernels;
+  for (auto &F : M.functions())
----------------
arsenm wrote:
> hsmhsm wrote:
> > arsenm wrote:
> > > Don't see why this is a SetVector, the ordering shouldn't matter
> > I intentionally used SetVector in order to make sure that the TRAVERSAL ORDER is preserved during multiple traversals of this set in different places. 
> > 
> > Is it harm to use SetVector here? Is it causes noticible performance issues? If not, why not we use it when it would provide additional benifit of preserving TRAVERSAL ORDER?
> > 
> Traversal order shouldn't matter though
Agree, but what is the harm in preserving traversal order? If you insist, I will change, but just curious to know, if it is really a serious comment to be taken care.

================
Comment at: llvm/lib/Target/AMDGPU/AMDGPUTargetMachine.h:43
   static bool EnableFixedFunctionABI;
+  static bool EnableDeviceScopeSharedVariable;

----------------
arsenm wrote:
> hsmhsm wrote:
> > arsenm wrote:
> > > I don't think this needs to be exposed here, there's no reason other places would need to inspect this
> > This is needed in the function `AMDGPUAlwaysInlinePass.cpp`. Please look at the changes being made to this file.
> This should be a required pass without an option to disable it that the pass would need to be aware of. Is this just for debug/bringup?
Current idea and common consensus  (as I understand it) is to guard this feature by a flag, and disable it by default.  HIP application programmer need to pass the option to enable this feature, at least in the begining. May be in future, we can think of getting rid of the option.

Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D91516/new/

https://reviews.llvm.org/D91516