[llvm] [AMDGPU] Enable unaligned scratch accesses (PR #110219)
Fabian Ritter via llvm-commits
llvm-commits at lists.llvm.org
Mon Sep 30 05:28:09 PDT 2024
================
@@ -1831,26 +1831,17 @@ bool SITargetLowering::allowsMisalignedMemoryAccessesImpl(
Subtarget->hasUnalignedDSAccessEnabled();
}
- if (AddrSpace == AMDGPUAS::PRIVATE_ADDRESS) {
- bool AlignedBy4 = Alignment >= Align(4);
- if (IsFast)
- *IsFast = AlignedBy4;
-
- return AlignedBy4 ||
- Subtarget->enableFlatScratch() ||
- Subtarget->hasUnalignedScratchAccess();
- }
-
// FIXME: We have to be conservative here and assume that flat operations
// will access scratch. If we had access to the IR function, then we
// could determine if any private memory was used in the function.
- if (AddrSpace == AMDGPUAS::FLAT_ADDRESS &&
- !Subtarget->hasUnalignedScratchAccess()) {
+ if (AddrSpace == AMDGPUAS::PRIVATE_ADDRESS ||
+ AddrSpace == AMDGPUAS::FLAT_ADDRESS) {
bool AlignedBy4 = Alignment >= Align(4);
if (IsFast)
*IsFast = AlignedBy4;
- return AlignedBy4;
+ return AlignedBy4 || Subtarget->enableFlatScratch() ||
----------------
ritter-x2a wrote:
Note that current trunk doesn't do that: see the old [line 1840](https://github.com/llvm/llvm-project/pull/110219/files#diff-e78d2fbd64648d787707fd3d4e7e5b5d2f00fb9c09972937718a11237933c597L1840). Removing `Subtarget->enableFlatScratch()` from the OR would make code generation more pessimistic for targets where flat scratch is enabled but unaligned scratch access is not explicitly enabled, e.g., gfx12. Should we change that anyway?
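
To make the trade-off concrete, here is a minimal standalone sketch of the two return conditions under discussion. It is not the LLVM code: `ScratchFeatures`, `allowedOnTrunk`, `allowedWithoutFlatScratchTerm`, and `FlatScratchOnly` are hypothetical names, and the two booleans only stand in for `Subtarget->enableFlatScratch()` and `Subtarget->hasUnalignedScratchAccess()`. The first function mirrors the return expression of the removed `PRIVATE_ADDRESS` branch shown in the hunk; the second drops the flat-scratch term.

```cpp
#include <cstdint>

// Hypothetical stand-in for the two subtarget queries that appear in the hunk.
struct ScratchFeatures {
  bool FlatScratch;            // models Subtarget->enableFlatScratch()
  bool UnalignedScratchAccess; // models Subtarget->hasUnalignedScratchAccess()
};

// Return expression of the removed PRIVATE_ADDRESS branch (old trunk state).
constexpr bool allowedOnTrunk(uint64_t AlignInBytes, ScratchFeatures F) {
  bool AlignedBy4 = AlignInBytes >= 4;
  return AlignedBy4 || F.FlatScratch || F.UnalignedScratchAccess;
}

// The same expression with the enableFlatScratch() term removed from the OR.
constexpr bool allowedWithoutFlatScratchTerm(uint64_t AlignInBytes,
                                             ScratchFeatures F) {
  bool AlignedBy4 = AlignInBytes >= 4;
  return AlignedBy4 || F.UnalignedScratchAccess;
}

// Assumed configuration for the case described in the comment: flat scratch
// enabled, unaligned scratch access not explicitly enabled.
constexpr ScratchFeatures FlatScratchOnly{true, false};

// A byte-aligned private access is accepted with the flat-scratch term ...
static_assert(allowedOnTrunk(1, FlatScratchOnly), "accepted on trunk");
// ... and rejected once that term is dropped, which is the pessimization.
static_assert(!allowedWithoutFlatScratchTerm(1, FlatScratchOnly),
              "rejected without the flat-scratch term");
```

Accesses aligned to 4 bytes or more are accepted by both variants, so the two only differ for under-aligned accesses on configurations like `FlatScratchOnly`.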
https://github.com/llvm/llvm-project/pull/110219