[PATCH] D35974: AMDGPU: Add pass to replace out arguments

Stanislav Mekhanoshin via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Thu Jul 27 22:39:09 PDT 2017

rampitec added inline comments.

Comment at: lib/Target/AMDGPU/AMDGPURewriteOutArguments.cpp:120
+  // TODO: It might be useful for any out arguments, not just privates.
+  if (!ArgTy || (ArgTy->getAddressSpace() != DL->getAllocaAddrSpace() &&
arsenm wrote:
> rampitec wrote:
> > I'm not sure you actually can do it on a non-private. Other address spaces are externally visible and stores to them shall not be easily eliminated. The exception is a non-aliasing LDS variable with function local scope.
> The memory access isn't actually eliminated here, it's just moved to the stub function. and relies on other passes to fix it so it is OK. The hope with private is it the temporary alloca will be SROAable 
This is true, but I do not believe other passes would be able to eliminate it. It looks like just an overhead.

Comment at: lib/Target/AMDGPU/AMDGPURewriteOutArguments.cpp:300
+  RetAttrs.addAttribute(Attribute::NoAlias);
+  NewFunc->removeAttributes(AttributeList::ReturnIndex, RetAttrs);
+  // TODO: How to preserve metadata?
arsenm wrote:
> rampitec wrote:
> > Consider also marking it always inline.
> The stub function is marked as always inline. The new function here steals the body of the old function, the original function is then marked always inline
I mean stub gets original attributes. The new function gets original attributes + always inline. That way it will be always inlined into stub if otherwise not simplified. I.e. if at the end call to a stub will not be replaced with a call to the new function we will have no extra call.

You can also check the heuristic isWrapperOnlyCall() in the AMDInline.cpp from HSAIL, which basically implements the same logic: inline everything into a wrapper/stub.


More information about the llvm-commits mailing list