[PATCH] D145586: [AMDGPU] Tweak PromoteAlloca limits

Mon Apr 3 06:47:14 PDT 2023

arsenm added inline comments.

================
Comment at: llvm/lib/Target/AMDGPU/AMDGPUPromoteAlloca.cpp:420-425
   // could also be promoted but we don't currently handle this case
   if (!VectorTy || VectorTy->getNumElements() > 16 ||
       VectorTy->getNumElements() < 2) {
     LLVM_DEBUG(dbgs() << "  Cannot convert type to vector\n");
     return false;
   }
----------------
This whole size logic doesn't make sense when viewed in total. The element count check should be considered along with the size. I also don't think using fractions of the register budget makes sense when deciding for an individual alloca. I think it would be easier to follow if we used a 32-bit element count limit of 16 for budgets < 64, and 32 for budgets > 64.

Thinking about byte sizes doesn't really make sense either. For sub-32-bit elements, don't we end up promoting them to 32-bit element vectors anyway?
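
A rough sketch of the kind of check I have in mind, not the actual pass code. The helper names (`maxPromotedDwords`, `fitsPromotionLimit`) are hypothetical. Treating a budget of exactly 64 as the larger tier is an assumption, and so is counting each sub-32-bit element as one full 32-bit element:

```cpp
// Hypothetical helper, not part of AMDGPUPromoteAlloca.cpp.
// Returns the maximum number of 32-bit elements a promoted alloca may use.
// Assumption: a budget of exactly 64 registers gets the larger limit.
constexpr unsigned maxPromotedDwords(unsigned RegisterBudget) {
  return RegisterBudget < 64 ? 16 : 32;
}

// Hypothetical check that considers element count and element size together.
// The vector is measured in 32-bit units rather than bytes. Assumption: an
// element narrower than 32 bits still occupies one whole 32-bit slot.
constexpr bool fitsPromotionLimit(unsigned NumElements, unsigned ElementBits,
                                  unsigned RegisterBudget) {
  if (NumElements < 2)
    return false; // single-element vectors are not handled, as in the patch
  unsigned DwordsPerElement = (ElementBits + 31) / 32; // round up to dwords
  return NumElements * DwordsPerElement <= maxPromotedDwords(RegisterBudget);
}
```

With this shape, a 16 x i64 vector counts as 32 dwords and is rejected under a small budget, while a 32 x i16 vector counts as 32 dwords and is accepted under a large one.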

Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D145586/new/

https://reviews.llvm.org/D145586