[llvm-branch-commits] [llvm] [AMDGPU] Support one immediate folding for global load (PR #178608)
via llvm-branch-commits
llvm-branch-commits at lists.llvm.org
Wed Feb 4 23:50:02 PST 2026
================
@@ -2037,13 +2037,36 @@ bool AMDGPUDAGToDAGISel::SelectGlobalSAddr(SDNode *N, SDValue Addr,
LHS = Addr.getOperand(0);
if (!LHS->isDivergent()) {
- // add (i64 sgpr), (*_extend (i32 vgpr))
RHS = Addr.getOperand(1);
- ScaleOffset = SelectScaleOffset(N, RHS, Subtarget->hasSignedGVSOffset());
+
if (SDValue ExtRHS = matchExtFromI32orI32(
RHS, Subtarget->hasSignedGVSOffset(), CurDAG)) {
+ // add (i64 sgpr), (*_extend (scale (i32 vgpr)))
SAddr = LHS;
VOffset = ExtRHS;
+ if (NeedIOffset && !ImmOffset &&
+ CurDAG->isBaseWithConstantOffset(ExtRHS)) {
+ // add (i64 sgpr), (*_extend (add (scale (i32 vgpr)), (i32 imm)))
----------------
ruiling wrote:
One possible way to make this correct is checking that overflow would not happen like in the recent commit. I looked into the case, the address calculation went through a long sequences of optimization in dag-combine and finally looks like the pattern we matched here. I don't know if there is a better place to re-associate earlier. Do you think it is ok to solve the problem like the change?
https://github.com/llvm/llvm-project/pull/178608