[llvm-branch-commits] [llvm] [AMDGPU][SDAG] Only fold flat offsets if they are inbounds (PR #132353)

Fri Mar 21 01:03:14 PDT 2025

================
@@ -13,9 +13,9 @@ define protected amdgpu_kernel void @InferNothing(i32 %a, ptr %b, double %c) {
 ; CHECK-NEXT:    s_lshl_b64 s[2:3], s[6:7], 3
 ; CHECK-NEXT:    s_add_u32 s0, s2, s0
 ; CHECK-NEXT:    s_addc_u32 s1, s3, s1
-; CHECK-NEXT:    v_mov_b32_e32 v3, s1
-; CHECK-NEXT:    v_add_co_u32_e64 v2, vcc, -8, s0
-; CHECK-NEXT:    v_addc_co_u32_e32 v3, vcc, -1, v3, vcc
+; CHECK-NEXT:    s_add_u32 s0, s0, -8
+; CHECK-NEXT:    s_addc_u32 s1, s1, -1
+; CHECK-NEXT:    v_pk_mov_b32 v[2:3], s[0:1], s[0:1] op_sel:[0,1]
----------------
ritter-x2a wrote:

@arsenm regarding your question here from the old PR:

> Not sure what happened here (and why do we have a codegen test in test/Transforms?)

Before the patch, the negative constant offset is matched as part of the flat instruction. Since negative offsets are illegal on gfx908, the constant offset is then split into a part that fits into the immediate (0) and a part that doesn't (-8).
After the patch, the constant offset is no longer matched as part of the flat instruction, because the address computation produced by SeparateConstOffsetFromGEP is not marked inbounds. The offset is instead selected as a separate addition, which apparently results in different code.
We could get the same behavior as before by manually applying SeparateConstOffsetFromGEP to the test input and adding inbounds flags to the resulting GEPs, but that doesn't seem central to what the test checks.
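
To illustrate the pre-patch split described above, here is a minimal Python sketch. It is not the backend's actual code: the function name `split_flat_offset` is hypothetical, and the 12-bit unsigned immediate width is an assumption about flat instructions on this target. It only shows why an offset of -8 ends up as an immediate of 0 plus a remainder of -8 that has to be added separately.

```python
def split_flat_offset(offset, imm_bits=12):
    """Split a constant byte offset into (immediate, remainder).

    Hypothetical model: the immediate field is assumed to be unsigned and
    `imm_bits` wide, so a negative offset cannot use it at all.
    The invariant is immediate + remainder == offset.
    """
    if offset < 0:
        # Nothing fits in an unsigned field; the whole offset must be
        # added to the base address by separate instructions.
        return 0, offset
    # Keep the low bits in the immediate and leave the rest as remainder.
    immediate = offset & ((1 << imm_bits) - 1)
    return immediate, offset - immediate


if __name__ == "__main__":
    print(split_flat_offset(-8))
    print(split_flat_offset(16))
```

For the offset in this test, the sketch yields an immediate of 0 and a remainder of -8, which corresponds to the explicit 64-bit add of -8 visible in both the old and the new check lines.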

We could also move this test to test/CodeGen/AMDGPU, but I think that would be something for a separate PR.

https://github.com/llvm/llvm-project/pull/132353