[PATCH] D18340: AMDGPU: allow specifying a workgroup size that needs to fit in a compute unit

Thu Mar 24 10:44:41 PDT 2016

nhaehnle added inline comments.

================
Comment at: lib/Target/AMDGPU/AMDGPUPromoteAlloca.cpp:487-492
@@ -485,8 +486,8 @@
 
   // FIXME: This is the maximum work group size.  We should try to get
   // value from the reqd_work_group_size function attribute if it is
   // available.
-  unsigned WorkGroupSize = 256;
+  unsigned WorkGroupSize = AMDGPU::getMaximumWorkGroupSize(ContainingFunction);
   int AllocaSize =
       WorkGroupSize * Mod->getDataLayout().getTypeAllocSize(AllocaTy);
 
----------------
bnieuwenhuizen wrote:
> nhaehnle wrote:
> > Fix the comment :)
> We don't technically handle that function attribute yet. I can move it somewhere more appropiate, like getMaxWorkGroupSize or also make getMaxWorkGroupSize detect and use reqd_work_group_size.
> 
> I noticed though that we have all the information for reqd_work_group_size in mesa (unless we want to support ARB_compute_variable_group_size). Maybe it is worth it to switch completely  to reqd_work_group_size.
I see what you mean. The semantics are also slightly different. It does makes sense to keep a comment about this.

================
Comment at: lib/Target/AMDGPU/SIRegisterInfo.cpp:62
@@ +61,3 @@
+
+}
+
----------------
bnieuwenhuizen wrote:
> nhaehnle wrote:
> > I was confused by that closing brace for a moment. I'm not sure what the LLVM coding style says if anything, but personally I find something like `// anonymous namespace` helpful.
> Seems like The LLVM coding style wants static instead of anonymous namespaces for functions (http://llvm.org/docs/CodingStandards.html#anonymous-namespaces). I will change that.
Thanks. With that changed, and arguably with a small alignDown helper, the patch LGTM.


http://reviews.llvm.org/D18340