[PATCH] D29473: [AMDGPU] Unroll preferences improvements

Matt Arsenault via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Thu Feb 2 16:31:16 PST 2017

arsenm added inline comments.

Comment at: lib/Target/AMDGPU/AMDGPUTargetTransformInfo.cpp:93-96
+  if (ST->getGeneration() >= AMDGPUSubtarget::VOLCANIC_ISLANDS)
     return 256;
+  else if (ST->getGeneration() >= AMDGPUSubtarget::SOUTHERN_ISLANDS)
+    return 128;
rampitec wrote:
> arsenm wrote:
> > I don't think we should change this away from the hardware sizes. I think this hook is only used by the vectorizers, which we don't use. We should define a separate constant for the alloca heuristic.
> The numbers here were just incorrect. SI to CI have 128 registers. So it makes sense to take the real register file size into consideration, which is target dependent. As a TODO, we need to limit it further if we have occupancy attributes.
No, there have always been 256 VGPRs.
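
To illustrate the separation suggested above (keep the hardware query at the hardware size and give the alloca heuristic its own constant), here is a minimal sketch. Everything in it is a hypothetical stand-in: the Generation enum, both function names, and the 128 budget are placeholders for illustration, not the real AMDGPUSubtarget or TTI API and not a tuned value. The pre-SI result of 0 is likewise a placeholder, since this thread only discusses SI and later.

```cpp
// Hypothetical stand-in for the subtarget generation; not the real
// AMDGPUSubtarget enum.
enum class Generation { R600, SouthernIslands, SeaIslands, VolcanicIslands };

// Hardware-facing query: reports the architectural VGPR count.
// 256 for SI and later is the figure stated in this thread; the pre-SI
// value is a placeholder because it is out of scope here.
constexpr unsigned getNumberOfVGPRs(Generation Gen) {
  return Gen >= Generation::SouthernIslands ? 256u : 0u;
}

// Separate, heuristic-only constant (assumed placeholder value). Changing it
// does not affect what the hardware query reports.
constexpr unsigned AllocaHeuristicVGPRBudget = 128;

// Budget used by the (hypothetical) alloca heuristic: never more than the
// hardware provides, otherwise the tunable constant.
constexpr unsigned getAllocaHeuristicBudget(Generation Gen) {
  return getNumberOfVGPRs(Gen) < AllocaHeuristicVGPRBudget
             ? getNumberOfVGPRs(Gen)
             : AllocaHeuristicVGPRBudget;
}

// The hardware size stays at 256 regardless of the heuristic budget.
static_assert(getNumberOfVGPRs(Generation::SouthernIslands) == 256,
              "hardware query must report the architectural size");
static_assert(getAllocaHeuristicBudget(Generation::VolcanicIslands) <=
                  getNumberOfVGPRs(Generation::VolcanicIslands),
              "heuristic budget must not exceed the hardware size");
```

With this split, a later limit based on occupancy attributes would only need to touch the heuristic budget, not the hardware query.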


