[llvm] [AMDGPU] Sink uniform buffer address offsets into soffset (PR #169230)
Juan Manuel Martinez Caamaño via llvm-commits
llvm-commits at lists.llvm.org
Mon Nov 24 01:32:30 PST 2025
================
@@ -2046,6 +2056,86 @@ bool AMDGPUCodeGenPrepareImpl::visitSqrt(IntrinsicInst &Sqrt) {
return true;
}
+/// Sink uniform addends in buffer address calculations into soffset.
+///
+/// Transforms buffer loads/stores with voffset = add(uniform, divergent)
+/// into voffset = divergent, soffset = uniform for better address coalescing
+/// Only applies to raw buffer operations with soffset initially zero.
+bool AMDGPUCodeGenPrepareImpl::visitBufferIntrinsic(IntrinsicInst &I) {
+ Intrinsic::ID IID = I.getIntrinsicID();
+ bool IsLoad = (IID == Intrinsic::amdgcn_raw_buffer_load ||
+ IID == Intrinsic::amdgcn_raw_buffer_load_format ||
+ IID == Intrinsic::amdgcn_raw_ptr_buffer_load ||
+ IID == Intrinsic::amdgcn_raw_ptr_buffer_load_format);
+ bool IsStore = (IID == Intrinsic::amdgcn_raw_buffer_store ||
+ IID == Intrinsic::amdgcn_raw_buffer_store_format ||
+ IID == Intrinsic::amdgcn_raw_ptr_buffer_store ||
+ IID == Intrinsic::amdgcn_raw_ptr_buffer_store_format);
+
+ if (!IsLoad && !IsStore)
+ return false;
+
+ // Buffer intrinsic operand layout (same for vector and pointer descriptor):
+ // Load: (rsrc, voffset, soffset, cachepolicy)
+ // Store: (vdata, rsrc, voffset, soffset, cachepolicy)
+ const unsigned VOffsetIdx = IsStore ? 2 : 1;
+ const unsigned SOffsetIdx = IsStore ? 3 : 2;
+
+ Value *VOffset = I.getArgOperand(VOffsetIdx);
+ Value *SOffset = I.getArgOperand(SOffsetIdx);
+
+ // Only optimize when soffset is currently zero
+ if (!match(SOffset, m_Zero()))
+ return false;
----------------
jmmartinez wrote:
How would this code evolve if we want to handle soffsets that are not zero, or more complex voffsets like `(((non_uniform + uniform_a) + uniform_b) + 8)`?
In that example, 8 is left in place so that it can later be sunk into the constant part. `uniform_a` and `uniform_b` also stay in the voffset, even though it would have been possible to move at least `uniform_b` to the soffset.
I understand that, as a first step, we can put this transformation here, but I would not be surprised if it ended up in its own pass (or in a pass that does "uniform reassociate" more generally).
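
To make the reassociation idea concrete, here is a minimal, self-contained sketch that flattens an add chain and buckets its terms into divergent, uniform, and constant parts. It is a toy model, not LLVM code: `Expr`, `SplitOffset`, `makeLeaf`, `constant`, `add`, and `splitOffset` are hypothetical names invented for illustration, and the sketch ignores everything a real pass would have to check (wrapping flags, use counts, the legal immediate range, and buffer bounds-checking behaviour).

```cpp
#include <cassert>
#include <cstdint>
#include <memory>
#include <string>
#include <utility>
#include <vector>

// Toy stand-in for IR values. This is NOT llvm::Value.
struct Expr {
  enum Kind { Const, Uniform, Divergent, Add };
  Kind kind = Const;
  std::string name;               // set for Uniform/Divergent leaves
  int64_t value = 0;              // set for Const leaves
  std::shared_ptr<Expr> lhs, rhs; // set for Add nodes
};
using ExprPtr = std::shared_ptr<Expr>;

ExprPtr makeLeaf(Expr::Kind k, std::string name) {
  auto e = std::make_shared<Expr>();
  e->kind = k;
  e->name = std::move(name);
  return e;
}

ExprPtr constant(int64_t v) {
  auto e = std::make_shared<Expr>();
  e->kind = Expr::Const;
  e->value = v;
  return e;
}

ExprPtr add(ExprPtr a, ExprPtr b) {
  auto e = std::make_shared<Expr>();
  e->kind = Expr::Add;
  e->lhs = std::move(a);
  e->rhs = std::move(b);
  return e;
}

// Result of splitting one offset expression.
struct SplitOffset {
  std::vector<std::string> divergent; // terms that must stay in voffset
  std::vector<std::string> uniform;   // terms that could move to soffset
  int64_t imm = 0;                    // folded constant part
};

// Walk the add chain and bucket every leaf by its kind.
void collect(const ExprPtr &e, SplitOffset &out) {
  assert(e && "expression must not be null");
  switch (e->kind) {
  case Expr::Add:
    collect(e->lhs, out);
    collect(e->rhs, out);
    break;
  case Expr::Const:
    out.imm += e->value;
    break;
  case Expr::Uniform:
    out.uniform.push_back(e->name);
    break;
  case Expr::Divergent:
    out.divergent.push_back(e->name);
    break;
  }
}

SplitOffset splitOffset(const ExprPtr &e) {
  SplitOffset s;
  collect(e, s);
  return s;
}
```

For the expression above, `splitOffset` would put `non_uniform` in `divergent`, both `uniform_a` and `uniform_b` in `uniform`, and 8 in `imm`. Whether such a split is actually legal or profitable on a given target is exactly the open question.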
https://github.com/llvm/llvm-project/pull/169230