[PATCH] D18340: AMDGPU: allow specifying a workgroup size that needs to fit in a compute unit

Matt Arsenault via llvm-commits llvm-commits at lists.llvm.org
Tue Apr 12 15:07:46 PDT 2016


arsenm added inline comments.

================
Comment at: lib/Target/AMDGPU/AMDGPUPromoteAlloca.cpp:484
@@ -483,2 +483,3 @@
 void AMDGPUPromoteAlloca::handleAlloca(AllocaInst &I) {
+  const Function& ContainingFunction = *I.getParent()->getParent();
   if (!I.isStaticAlloca())
----------------
& should be on RHS and this should be moved down to uses

================
Comment at: lib/Target/AMDGPU/SIMachineFunctionInfo.cpp:51
@@ -50,2 +50,3 @@
     ReturnsVoid(true),
+    MaximumWorkGroupSize(256),
     LDSWaveSpillSize(0),
----------------
Should be set to 0 here

================
Comment at: lib/Target/AMDGPU/SIMachineFunctionInfo.cpp:125
@@ +124,3 @@
+
+  if (AMDGPU::isCompute(F->getCallingConv()))
+    MaximumWorkGroupSize = AMDGPU::getMaximumWorkGroupSize(*F);
----------------
Why is this function needed? You can just compare to F->getCallingConv directly

================
Comment at: lib/Target/AMDGPU/SIRegisterInfo.cpp:130
@@ +129,3 @@
+  // Try to place it in a hole after PrivateSegmentbufferReg.
+  if(Idx & 3)
+	  Idx -= 1;
----------------
Space after if

================
Comment at: lib/Target/AMDGPU/SIRegisterInfo.cpp:131-133
@@ +130,5 @@
+  if(Idx & 3)
+	  Idx -= 1;
+  else
+	  Idx -= 5;
+  return AMDGPU::SGPR_32RegClass.getRegister(Idx);
----------------
Magic numbers

================
Comment at: lib/Target/AMDGPU/Utils/AMDGPUBaseInfo.cpp:128
@@ +127,3 @@
+unsigned getMaximumWorkGroupSize(const Function &F) {
+  return getIntegerAttribute(F, "amdgpu-maximum-work-group-size", 256);
+}
----------------
I would also probably reduce this to -max- rather than -maximum- since most other maximums do this

================
Comment at: test/CodeGen/AMDGPU/large-work-group-promote-alloca.ll:25
@@ +24,3 @@
+
+attributes #63 = { nounwind "amdgpu-maximum-work-group-size"="63"}
+
----------------
Attributes should be moved to the bottom. Also run the test through opt with no arguments to simplify the sets


http://reviews.llvm.org/D18340





More information about the llvm-commits mailing list