[llvm] [AMDGPU] Implement LSR cost model for GFX9+ (PR #184138)
Matt Arsenault via llvm-commits
llvm-commits at lists.llvm.org
Wed Mar 4 02:19:56 PST 2026
================
@@ -1703,3 +1704,51 @@ GCNTTIImpl::getInstructionUniformity(const Value *V) const {
   return InstructionUniformity::Default;
 }
+
+InstructionCost GCNTTIImpl::getScalingFactorCost(Type *Ty, GlobalValue *BaseGV,
+                                                 StackOffset BaseOffset,
+                                                 bool HasBaseReg, int64_t Scale,
+                                                 unsigned AddrSpace) const {
+  if (HasBaseReg && Scale != 0) {
+    // gfx1250+ can fold base+scale*index into the instruction when scale
+    // equals the memory access size (scale_offset bit). This is supported
+    // for global/constant/flat/scratch but NOT for LDS or GDS.
+    // GDS does not exist on gfx1250+, but we exclude REGION_ADDRESS for
+    // correctness since the address space is still representable in IR.
+    if (getST()->hasScaleOffset() && Ty && Ty->isSized() &&
+        AddrSpace != AMDGPUAS::LOCAL_ADDRESS &&
+        AddrSpace != AMDGPUAS::REGION_ADDRESS) {
+      TypeSize StoreSize = getDataLayout().getTypeStoreSize(Ty);
+      if (!StoreSize.isScalable() &&
----------------
arsenm wrote:
It still needs to not crash on a scalable vector.
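
For readers following the thread, here is a minimal standalone sketch of the ordering concern, not the patch's code. It assumes a hypothetical `MockTypeSize` that only imitates the two `llvm::TypeSize` queries used in the hunk (`isScalable()` and `getFixedValue()`), and a hypothetical helper `scaleMatchesStoreSize`. The point it illustrates is that the scalable check has to run before anything asks for a fixed value, so a scalable vector type simply reports "no fold" rather than asserting.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stand-in for llvm::TypeSize (assumption, not the real class):
// a known-minimum size plus a flag saying whether the actual size is a
// runtime multiple of that minimum.
struct MockTypeSize {
  uint64_t MinValue;
  bool Scalable;

  bool isScalable() const { return Scalable; }

  // Asking for a fixed value of a scalable size is a programming error;
  // in this mock it trips an assertion.
  uint64_t getFixedValue() const {
    assert(!Scalable && "fixed value requested for a scalable size");
    return MinValue;
  }
};

// Hypothetical helper: returns true only when the store size is fixed and
// Scale equals it. The scalable check comes first, so getFixedValue() is
// never reached for a scalable size.
bool scaleMatchesStoreSize(MockTypeSize StoreSize, int64_t Scale) {
  if (StoreSize.isScalable())
    return false;
  return Scale > 0 &&
         static_cast<uint64_t>(Scale) == StoreSize.getFixedValue();
}
```

With this shape, a fixed 4-byte access with `Scale == 4` is accepted, while a scalable size with the same minimum is rejected without touching the asserting accessor.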
https://github.com/llvm/llvm-project/pull/184138