[PATCH] D18340: AMDGPU: allow specifying a workgroup size that needs to fit in a compute unit
Matt Arsenault via llvm-commits
llvm-commits at lists.llvm.org
Tue Apr 12 15:07:46 PDT 2016
arsenm added inline comments.
================
Comment at: lib/Target/AMDGPU/AMDGPUPromoteAlloca.cpp:484
@@ -483,2 +483,3 @@
void AMDGPUPromoteAlloca::handleAlloca(AllocaInst &I) {
+ const Function& ContainingFunction = *I.getParent()->getParent();
if (!I.isStaticAlloca())
----------------
& should be on RHS and this should be moved down to uses
================
Comment at: lib/Target/AMDGPU/SIMachineFunctionInfo.cpp:51
@@ -50,2 +50,3 @@
ReturnsVoid(true),
+ MaximumWorkGroupSize(256),
LDSWaveSpillSize(0),
----------------
Should be set to 0 here
================
Comment at: lib/Target/AMDGPU/SIMachineFunctionInfo.cpp:125
@@ +124,3 @@
+
+ if (AMDGPU::isCompute(F->getCallingConv()))
+ MaximumWorkGroupSize = AMDGPU::getMaximumWorkGroupSize(*F);
----------------
Why is this function needed? You can just compare to F->getCallingConv directly
================
Comment at: lib/Target/AMDGPU/SIRegisterInfo.cpp:130
@@ +129,3 @@
+ // Try to place it in a hole after PrivateSegmentbufferReg.
+ if(Idx & 3)
+ Idx -= 1;
----------------
Space after if
================
Comment at: lib/Target/AMDGPU/SIRegisterInfo.cpp:131-133
@@ +130,5 @@
+ if(Idx & 3)
+ Idx -= 1;
+ else
+ Idx -= 5;
+ return AMDGPU::SGPR_32RegClass.getRegister(Idx);
----------------
Magic numbers
================
Comment at: lib/Target/AMDGPU/Utils/AMDGPUBaseInfo.cpp:128
@@ +127,3 @@
+unsigned getMaximumWorkGroupSize(const Function &F) {
+ return getIntegerAttribute(F, "amdgpu-maximum-work-group-size", 256);
+}
----------------
I would also probably reduce this to -max- rather than -maximum- since most other maximums do this
================
Comment at: test/CodeGen/AMDGPU/large-work-group-promote-alloca.ll:25
@@ +24,3 @@
+
+attributes #63 = { nounwind "amdgpu-maximum-work-group-size"="63"}
+
----------------
Attributes should be moved to the bottom. Also run the test through opt with no arguments to simplify the sets
http://reviews.llvm.org/D18340
More information about the llvm-commits
mailing list