[all-commits] [llvm/llvm-project] c21153: [AMDGPU][SDAG] Only fold flat offsets if they are ...
Fabian Ritter via All-commits
all-commits at lists.llvm.org
Tue Mar 18 10:24:37 PDT 2025
Branch: refs/heads/users/ritter-x2a/03-17-_amdgpu_sdag_only_fold_flat_offsets_if_they_are_inbounds
Home: https://github.com/llvm/llvm-project
Commit: c21153d0c9ee9e0f23f1e1f2a4b7c5add7f09ce5
https://github.com/llvm/llvm-project/commit/c21153d0c9ee9e0f23f1e1f2a4b7c5add7f09ce5
Author: Fabian Ritter <fabian.ritter at amd.com>
Date: 2025-03-18 (Tue, 18 Mar 2025)
Changed paths:
M llvm/include/llvm/CodeGen/SelectionDAG.h
M llvm/lib/CodeGen/SelectionDAG/DAGCombiner.cpp
M llvm/lib/CodeGen/SelectionDAG/SelectionDAG.cpp
M llvm/lib/Target/AMDGPU/AMDGPUISelDAGToDAG.cpp
M llvm/test/CodeGen/AMDGPU/atomics_cond_sub.ll
M llvm/test/CodeGen/AMDGPU/cgp-addressing-modes-flat.ll
M llvm/test/CodeGen/AMDGPU/flat-atomicrmw-fadd.ll
M llvm/test/CodeGen/AMDGPU/flat-atomicrmw-fmax.ll
M llvm/test/CodeGen/AMDGPU/flat-atomicrmw-fmin.ll
M llvm/test/CodeGen/AMDGPU/flat-atomicrmw-fsub.ll
M llvm/test/CodeGen/AMDGPU/flat_atomics.ll
M llvm/test/CodeGen/AMDGPU/flat_atomics_i32_system.ll
M llvm/test/CodeGen/AMDGPU/flat_atomics_i64.ll
M llvm/test/CodeGen/AMDGPU/flat_atomics_i64_noprivate.ll
M llvm/test/CodeGen/AMDGPU/flat_atomics_i64_system_noprivate.ll
M llvm/test/CodeGen/AMDGPU/fold-gep-offset.ll
M llvm/test/CodeGen/AMDGPU/llvm.amdgcn.atomic.dec.ll
M llvm/test/CodeGen/AMDGPU/llvm.amdgcn.atomic.inc.ll
M llvm/test/CodeGen/AMDGPU/loop-prefetch-data.ll
M llvm/test/CodeGen/AMDGPU/memintrinsic-unroll.ll
M llvm/test/CodeGen/AMDGPU/memory-legalizer-flat-agent.ll
M llvm/test/CodeGen/AMDGPU/memory-legalizer-flat-singlethread.ll
M llvm/test/CodeGen/AMDGPU/memory-legalizer-flat-system.ll
M llvm/test/CodeGen/AMDGPU/memory-legalizer-flat-wavefront.ll
M llvm/test/CodeGen/AMDGPU/memory-legalizer-flat-workgroup.ll
M llvm/test/CodeGen/AMDGPU/offset-split-flat.ll
M llvm/test/CodeGen/AMDGPU/promote-constOffset-to-imm.ll
M llvm/test/Transforms/InferAddressSpaces/AMDGPU/flat_atomic.ll
Log Message:
-----------
[AMDGPU][SDAG] Only fold flat offsets if they are inbounds
For flat memory instructions where the address is supplied as a base
address register with an immediate offset, the memory aperture test
ignores the immediate offset. Currently, ISel does not respect that,
which leads to miscompilations where valid input programs crash when the
address computation relies on the immediate offset to get the base
address in the proper memory aperture. Global or scratch instructions
are not affected.
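The mismatch described above can be illustrated with a toy model. This is only a sketch: the aperture bounds, the `Aperture` enum, and all function names are hypothetical and do not correspond to real hardware values or LLVM APIs. It shows a base address just below an aperture whose full address (base plus offset) lies inside it, so a test that looks only at the base register picks the wrong aperture.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical address classes; real hardware distinguishes more cases.
enum class Aperture { Global, Private };

// Hypothetical aperture range [PrivateBase, PrivateBase + PrivateSize).
constexpr std::uint64_t PrivateBase = 0x10000000;
constexpr std::uint64_t PrivateSize = 0x00010000;

// Classify a single address against the hypothetical aperture.
constexpr Aperture classify(std::uint64_t Addr) {
  return (Addr >= PrivateBase && Addr - PrivateBase < PrivateSize)
             ? Aperture::Private
             : Aperture::Global;
}

// What the input program means: the full address decides the aperture.
constexpr Aperture intendedAperture(std::uint64_t Base, std::uint64_t Offset) {
  return classify(Base + Offset);
}

// What the log message describes for flat instructions with an immediate
// offset: the aperture test looks at the base register only.
constexpr Aperture apertureWithFoldedOffset(std::uint64_t Base,
                                            std::uint64_t /*Offset*/) {
  return classify(Base);
}

// Small self-check of the two interesting cases.
inline void runChecks() {
  // Base is below the aperture, Base + 16 is inside it: the two disagree.
  assert(intendedAperture(PrivateBase - 8, 16) == Aperture::Private);
  assert(apertureWithFoldedOffset(PrivateBase - 8, 16) == Aperture::Global);
  // Base and Base + 16 are both inside the aperture: the two agree.
  assert(intendedAperture(PrivateBase + 256, 16) ==
         apertureWithFoldedOffset(PrivateBase + 256, 16));
}
```

In this model, folding the offset is harmless only when the base already lies in the same aperture as the full address.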
This patch only selects flat instructions with immediate offsets from
address computations with the inbounds flag: If the address computation
does not leave the bounds of the allocated object, it cannot leave the
bounds of the memory aperture and is therefore safe to handle with an
immediate offset.
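The selection rule in the paragraph above can be sketched as follows. This is a simplified model, not the code in AMDGPUISelDAGToDAG.cpp: `PtrAdd`, `FlatOperands`, and `selectFlatAddress` are hypothetical names, and checks a real selector would also need (for example, whether the offset is encodable as an immediate) are left out.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stand-in for an address computation Base + Offset that may
// carry an inbounds flag.
struct PtrAdd {
  std::uint64_t Base;
  std::int64_t Offset;
  bool InBounds;
};

// Hypothetical operands of the selected flat instruction.
struct FlatOperands {
  std::uint64_t AddrReg;  // value placed in the address register
  std::int64_t ImmOffset; // immediate offset field
};

// Fold the offset into the immediate only if the add is inbounds.
// Otherwise compute the full address into the register and use offset 0,
// so that an aperture test on the register sees the final address.
constexpr FlatOperands selectFlatAddress(const PtrAdd &N) {
  if (N.InBounds)
    return {N.Base, N.Offset};
  return {N.Base + static_cast<std::uint64_t>(N.Offset), 0};
}

// Small self-check of both branches.
inline void runChecks() {
  constexpr FlatOperands Folded = selectFlatAddress({0x1000, 16, true});
  assert(Folded.AddrReg == 0x1000 && Folded.ImmOffset == 16);

  constexpr FlatOperands Unfolded = selectFlatAddress({0x1000, 16, false});
  assert(Unfolded.AddrReg == 0x1010 && Unfolded.ImmOffset == 0);
}
```

Both results address the same byte; they differ only in which value the address register holds.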
It also adds the inbounds flag to DAG nodes resulting from
transformations:
- Address computations resulting from getObjectPtrOffset. As far as I
can tell, this function is only used to compute addresses within
accessed memory ranges, e.g., for loads and stores that are split
during legalization.
- Reassociated inbounds adds. If both involved operations are inbounds,
then so are the operations after the transformation.
- Address computations in the SelectionDAG lowering of the
memcpy/move/set intrinsics. Base and result of the address arithmetic
there are accessed, so the operation must be inbounds.
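The flag propagation for reassociated adds (second item above) can be sketched with a small model. `OffsetAdd` and `reassociate` are hypothetical names used only for illustration; the actual change lives in DAGCombiner.cpp and operates on SelectionDAG nodes and their flags.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stand-in for "pointer + constant offset" with an inbounds flag.
struct OffsetAdd {
  std::int64_t Offset;
  bool InBounds;
};

// Models (P + Inner.Offset) + Outer.Offset  ->  P + (Inner.Offset + Outer.Offset).
// The combined add is marked inbounds only if both original adds were.
constexpr OffsetAdd reassociate(OffsetAdd Inner, OffsetAdd Outer) {
  return {Inner.Offset + Outer.Offset, Inner.InBounds && Outer.InBounds};
}

// Small self-check of the flag rule.
inline void runChecks() {
  assert(reassociate({8, true}, {16, true}).InBounds);
  assert(!reassociate({8, true}, {16, false}).InBounds);
  assert(!reassociate({8, false}, {16, true}).InBounds);
  assert(reassociate({8, true}, {16, true}).Offset == 24);
}
```

Dropping the flag when either input lacks it is the conservative choice: a missing inbounds flag only prevents the offset fold, while a wrongly added one could reintroduce the miscompilation.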
It might make sense to separate these changes into their own PR, but I
don't see a way to test them without adding a use of the inbounds SDAG
flag.
Affected tests:
- CodeGen/AMDGPU/fold-gep-offset.ll: Offsets are no longer wrongly
folded; added new positive tests where we still do fold them.
- Transforms/InferAddressSpaces/AMDGPU/flat_atomic.ll: Offset folding
doesn't seem integral to this test, so the test input is not changed.
- CodeGen/AMDGPU/loop-prefetch-data.ll: loop-reduce prefers to base the
addresses for memory accesses on the potentially OOB addresses used
for prefetching; that might be a separate issue to look into.
- Added memset tests to CodeGen/AMDGPU/memintrinsic-unroll.ll to make
sure that offsets in the memset DAG lowering are still folded
properly.
- All others: Added inbounds flags to GEPs in the input so that the
output stays the same.
A similar patch for GlobalISel will follow.
Fixes SWDEV-516125.