[llvm] [AMDGPU] Implement LSR cost model for GFX9+ (PR #184138)
Matt Arsenault via llvm-commits
llvm-commits at lists.llvm.org
Fri Mar 6 06:17:16 PST 2026
================
@@ -1703,3 +1704,51 @@ GCNTTIImpl::getInstructionUniformity(const Value *V) const {
   return InstructionUniformity::Default;
 }
+
+InstructionCost GCNTTIImpl::getScalingFactorCost(Type *Ty, GlobalValue *BaseGV,
+                                                 StackOffset BaseOffset,
+                                                 bool HasBaseReg, int64_t Scale,
+                                                 unsigned AddrSpace) const {
+  if (HasBaseReg && Scale != 0) {
+    // gfx1250+ can fold base+scale*index when scale matches the memory access
+    // size (scale_offset bit). Supported for flat/global/constant/scratch
+    // (VMEM, max 128 bits) and constant_32bit (SMRD, capped to 128 bits here).
+    if (getST()->hasScaleOffset() && Ty && Ty->isSized() &&
+        (AddrSpace == AMDGPUAS::FLAT_ADDRESS ||
+         AddrSpace == AMDGPUAS::GLOBAL_ADDRESS ||
+         AddrSpace == AMDGPUAS::CONSTANT_ADDRESS ||
+         AddrSpace == AMDGPUAS::CONSTANT_ADDRESS_32BIT ||
+         AddrSpace == AMDGPUAS::PRIVATE_ADDRESS)) {
+      unsigned StoreSize = getDataLayout().getTypeStoreSize(Ty).getFixedValue();
+      if (StoreSize <= 16 && static_cast<int64_t>(StoreSize) == Scale)
----------------
arsenm wrote:
Use TypeSize::isKnownLE for the size check and avoid asserting on scalable types; getFixedValue() asserts when the store size is scalable.
https://github.com/llvm/llvm-project/pull/184138
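
One possible reading of the suggestion above, sketched outside LLVM so it compiles on its own. `SketchTypeSize` is a hypothetical, simplified stand-in for `llvm::TypeSize` (only the fixed/scalable flag and the three members used here are modeled), and `scaleMatchesAccessSize` is an illustrative helper name, not something from the patch. The point is that a "known less-or-equal" comparison against a fixed 16-byte bound rejects scalable sizes without hitting the assertion in `getFixedValue()`.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical, simplified stand-in for llvm::TypeSize: a quantity plus a
// flag saying whether it is multiplied by an unknown runtime vscale.
struct SketchTypeSize {
  uint64_t Quantity;
  bool Scalable;

  static constexpr SketchTypeSize getFixed(uint64_t N) { return {N, false}; }
  static constexpr SketchTypeSize getScalable(uint64_t N) { return {N, true}; }

  // Safe for both fixed and scalable sizes; never asserts.
  constexpr uint64_t getKnownMinValue() const { return Quantity; }

  // The accessor used in the hunk under review: only valid for fixed sizes.
  uint64_t getFixedValue() const {
    assert(!Scalable && "fixed value requested for a scalable size");
    return Quantity;
  }

  // True only when LHS <= RHS holds for every possible vscale. A scalable
  // LHS compared against a fixed RHS is therefore never "known LE".
  static constexpr bool isKnownLE(SketchTypeSize LHS, SketchTypeSize RHS) {
    return (!LHS.Scalable || RHS.Scalable) && LHS.Quantity <= RHS.Quantity;
  }
};

// Illustrative helper (assumed name): does Scale equal the store size of an
// access that is at most 16 bytes wide?
inline bool scaleMatchesAccessSize(SketchTypeSize StoreSize, int64_t Scale) {
  // Scalable sizes fail this check instead of tripping an assertion.
  if (!SketchTypeSize::isKnownLE(StoreSize, SketchTypeSize::getFixed(16)))
    return false;
  // Past the check above the size is fixed, so the known minimum is exact.
  return static_cast<int64_t>(StoreSize.getKnownMinValue()) == Scale;
}
```

With this shape, a scalable store size simply makes the fold unavailable; whether the real patch should instead bail out earlier for scalable types is a separate choice the comment does not settle.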