[llvm] [AMDGPU] Implement LSR cost model for GFX9+ (PR #184138)

Fri Mar 6 06:17:16 PST 2026

================
@@ -1703,3 +1704,51 @@ GCNTTIImpl::getInstructionUniformity(const Value *V) const {
 
   return InstructionUniformity::Default;
 }
+
+InstructionCost GCNTTIImpl::getScalingFactorCost(Type *Ty, GlobalValue *BaseGV,
+                                                 StackOffset BaseOffset,
+                                                 bool HasBaseReg, int64_t Scale,
+                                                 unsigned AddrSpace) const {
+  if (HasBaseReg && Scale != 0) {
+    // gfx1250+ can fold base+scale*index when scale matches the memory access
+    // size (scale_offset bit). Supported for flat/global/constant/scratch
+    // (VMEM, max 128 bits) and constant_32bit (SMRD, capped to 128 bits here).
+    if (getST()->hasScaleOffset() && Ty && Ty->isSized() &&
+        (AddrSpace == AMDGPUAS::FLAT_ADDRESS ||
+         AddrSpace == AMDGPUAS::GLOBAL_ADDRESS ||
+         AddrSpace == AMDGPUAS::CONSTANT_ADDRESS ||
+         AddrSpace == AMDGPUAS::CONSTANT_ADDRESS_32BIT ||
+         AddrSpace == AMDGPUAS::PRIVATE_ADDRESS)) {
+      unsigned StoreSize = getDataLayout().getTypeStoreSize(Ty).getFixedValue();
+      if (StoreSize <= 16 && static_cast<int64_t>(StoreSize) == Scale)
----------------
arsenm wrote:

Use isKnownLE for the size check and avoid asserting on scalable types (getFixedValue() asserts when the size is scalable).
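
To illustrate the suggestion, here is a minimal self-contained sketch of the comparison semantics. It does not use the LLVM headers: `MiniTypeSize` and `scaleMatchesAccess` are hypothetical stand-ins invented for this example, and `isKnownLE` here only models what I understand `TypeSize::isKnownLE` to do (true only when the relation holds for every runtime scaling factor). The actual patch would presumably compare the `TypeSize` returned by `getTypeStoreSize` against a fixed 16 bytes.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stand-in for llvm::TypeSize: either an exact byte count or a
// minimum byte count that is multiplied by an unknown runtime factor >= 1.
struct MiniTypeSize {
  uint64_t MinValue;
  bool Scalable;
};

// Models "known less-or-equal": true only if LHS <= RHS for every possible
// runtime scaling factor. A scalable LHS is never known to fit a fixed RHS.
constexpr bool isKnownLE(MiniTypeSize LHS, MiniTypeSize RHS) {
  return (!LHS.Scalable || RHS.Scalable) && LHS.MinValue <= RHS.MinValue;
}

// Sketch of the reviewed condition: the access is at most 16 bytes and the
// scale equals the access size. Scalable sizes are rejected by the isKnownLE
// check, so MinValue is an exact size by the time it is compared to Scale.
constexpr bool scaleMatchesAccess(MiniTypeSize StoreSize, int64_t Scale) {
  return isKnownLE(StoreSize, MiniTypeSize{16, false}) &&
         static_cast<int64_t>(StoreSize.MinValue) == Scale;
}

// A 4-byte access with scale 4 qualifies; a scalable size does not, and the
// check returns false instead of asserting.
static_assert(scaleMatchesAccess(MiniTypeSize{4, false}, 4), "fixed size fits");
static_assert(!scaleMatchesAccess(MiniTypeSize{4, true}, 4), "scalable rejected");
```

The point of this shape is that the scalable case falls out of the comparison itself, so no separate isScalable() guard or assertion is needed before reading the size.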

https://github.com/llvm/llvm-project/pull/184138