[PATCH] D122850: [AMDGPU] Fix regression with vectorization limiting

Thu Mar 31 16:22:49 PDT 2022

rampitec updated this revision to Diff 419586.
rampitec added a comment.

I have realized that the RCID passed into getNumberOfRegisters(unsigned RCID) is in fact not an RCID, but a boolean distinguishing vector from scalar registers. We could implement getRegisterClassForType() to change that, but we cannot reasonably distinguish between VGPRs and SGPRs anyway. In the end the result is clamped to just 4, so it is easier to remove all of these calculations and simply return 4.
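
To illustrate the reasoning, here is a minimal standalone sketch, not the code from the patch. It assumes the argument only ever takes a scalar value (0) or a vector value (1). The function names and the register counts 104 and 256 are hypothetical placeholders chosen for illustration. It shows that a computation whose result is always clamped to 4 can be replaced by returning 4 directly.

```cpp
#include <algorithm>
#include <cassert>
#include <initializer_list>

// Hypothetical stand-ins for the two values the hook effectively receives:
// a vector/scalar flag rather than a real register class ID.
constexpr unsigned ScalarClass = 0;
constexpr unsigned VectorClass = 1;

// Hypothetical "before" shape: derive a register count from the flag, then
// clamp it to 4. The counts 104 and 256 are placeholders, not values taken
// from the patch.
unsigned numberOfRegistersClamped(unsigned ClassID) {
  unsigned Available = (ClassID == VectorClass) ? 256u : 104u;
  return std::min(Available, 4u);
}

// "After" shape described in the comment: skip the calculation, return 4.
unsigned numberOfRegistersSimplified(unsigned /*ClassID*/) { return 4; }

// Both variants agree for every value the flag can take in this sketch.
void checkVariantsAgree() {
  for (unsigned ClassID : {ScalarClass, VectorClass})
    assert(numberOfRegistersClamped(ClassID) ==
           numberOfRegistersSimplified(ClassID));
}
```

Calling checkVariantsAgree() from a test or from main() confirms that, under these assumptions, the simplification does not change the returned value.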

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D122850/new/

https://reviews.llvm.org/D122850

Files:
  llvm/lib/Target/AMDGPU/AMDGPUTargetTransformInfo.cpp
  llvm/lib/Target/AMDGPU/AMDGPUTargetTransformInfo.h
  llvm/test/Transforms/LoopVectorize/AMDGPU/packed-fp32.ll
  llvm/test/Transforms/LoopVectorize/AMDGPU/packed-math.ll

-------------- next part --------------
A non-text attachment was scrubbed...
Name: D122850.419586.patch
Type: text/x-patch
Size: 17274 bytes
Desc: not available
URL: <http://lists.llvm.org/pipermail/llvm-commits/attachments/20220331/ea9996c6/attachment.bin>