[llvm] [AMDGPU] Sink uniform buffer address offsets into soffset (PR #169230)

Mon Nov 24 06:22:27 PST 2025

PrasoonMishra wrote:

> The claim here is that for raw buffer operations there is no semantic reason to separate out the voffset/soffset/immediate parts of the offset. So a more comprehensive implementation would be to canonicalize them as early as possible with a single offset operand which is the sum of the three offsets provided by the programmer. Then instruction selection can split that back into divergent/uniform/constant parts like we already do for lots of addressing modes.

I agree that, semantically, raw buffer addressing is just the sum voffset + soffset + imm. However, we keep the three parts separate to enable later optimizations, such as promoting to SMEM when everything is uniform. This makes late splitting tricky, because the split must preserve uniformity information:
```
// Original
voffset = (divergent_tid * stride) + uniform_val1;
soffset = uniform_val2;
// Backend knows soffset is uniform, so it can promote to s_buffer_load if voffset == 0

// After merging:
merged_offset = (divergent_tid * stride) + uniform_val1 + uniform_val2;
// Problem: merged_offset is divergent, so it cannot be promoted even though most of it is uniform.
```
The merged offset would also need to carry alignment info for SMEM promotion (see #138975). Without both the uniformity and the alignment information, merging prevents s_buffer_load promotion.
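
To make the uniformity point concrete, here is a toy C++ model of it. This is only a sketch, not LLVM code: `Kind`, `Term`, `SplitOffset`, `isSumUniform` and `splitByUniformity` are hypothetical names invented for illustration, and the real backend would have to get uniformity from its own analysis rather than from a tag on each addend. It shows that a merged sum is divergent as soon as one addend is divergent, and that a late split can only rebuild voffset/soffset/imm if the per-addend classification is still available.

```cpp
#include <cstdint>
#include <vector>

// Hypothetical toy model, not LLVM code: each addend of the offset is
// tagged with how it varies across the lanes of a wave.
enum class Kind { Divergent, Uniform, Constant };

struct Term {
  Kind kind;
  int64_t value; // value seen by one example lane (illustration only)
};

// Result of splitting a flat sum back into the three buffer offset parts.
struct SplitOffset {
  int64_t voffset = 0;           // sum of divergent addends
  int64_t soffset = 0;           // sum of uniform, non-constant addends
  int64_t imm = 0;               // sum of constant addends
  bool hasDivergentPart = false; // true if any addend was divergent
};

// A merged sum is uniform only if every addend is uniform or constant.
inline bool isSumUniform(const std::vector<Term> &terms) {
  for (const Term &t : terms)
    if (t.kind == Kind::Divergent)
      return false;
  return true;
}

// Late split: only possible because each addend still carries its Kind.
inline SplitOffset splitByUniformity(const std::vector<Term> &terms) {
  SplitOffset out;
  for (const Term &t : terms) {
    switch (t.kind) {
    case Kind::Divergent:
      out.voffset += t.value;
      out.hasDivergentPart = true;
      break;
    case Kind::Uniform:
      out.soffset += t.value;
      break;
    case Kind::Constant:
      out.imm += t.value;
      break;
    }
  }
  return out;
}
```

For the example above, `isSumUniform` is false for `{divergent_tid * stride, uniform_val1, uniform_val2}`, while `splitByUniformity` still places both uniform values in `soffset`. Alignment is not modelled here at all; it would have to be tracked separately.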

https://github.com/llvm/llvm-project/pull/169230