[llvm] [AMDGPU] Enable unaligned scratch accesses (PR #110219)
Matt Arsenault via llvm-commits
llvm-commits at lists.llvm.org
Mon Sep 30 06:55:59 PDT 2024
================
@@ -1831,26 +1831,17 @@ bool SITargetLowering::allowsMisalignedMemoryAccessesImpl(
            Subtarget->hasUnalignedDSAccessEnabled();
   }
-  if (AddrSpace == AMDGPUAS::PRIVATE_ADDRESS) {
-    bool AlignedBy4 = Alignment >= Align(4);
-    if (IsFast)
-      *IsFast = AlignedBy4;
-
-    return AlignedBy4 ||
-           Subtarget->enableFlatScratch() ||
-           Subtarget->hasUnalignedScratchAccess();
-  }
-
   // FIXME: We have to be conservative here and assume that flat operations
   // will access scratch. If we had access to the IR function, then we
   // could determine if any private memory was used in the function.
-  if (AddrSpace == AMDGPUAS::FLAT_ADDRESS &&
-      !Subtarget->hasUnalignedScratchAccess()) {
+  if (AddrSpace == AMDGPUAS::PRIVATE_ADDRESS ||
+      AddrSpace == AMDGPUAS::FLAT_ADDRESS) {
     bool AlignedBy4 = Alignment >= Align(4);
     if (IsFast)
       *IsFast = AlignedBy4;
-    return AlignedBy4;
+    return AlignedBy4 || Subtarget->enableFlatScratch() ||
----------------
arsenm wrote:
Dropped from the or. The unaligned access mode is orthogonal to which scratch ABI is in use.
IIRC the difficulty is that Linux always enables unaligned access and Windows does not. As an incorrect way of dealing with this, the subtarget constructor assumes unaligned access is enabled for amdhsa (which will be wrong if executed on the compatibility layer on top of PAL on Windows), and not for amdpal. I don't remember what we do for unknown / mesa3d.
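
To make the point above concrete, here is a rough standalone sketch, not the actual patch. It assumes that once `enableFlatScratch()` is dropped from the or, what remains is "aligned by 4, or unaligned scratch access is enabled" (the hunk above is cut off, so that is an inference). `ScratchSubtargetModel`, `allowsMisalignedScratch` and `assumedUnalignedDefault` are hypothetical names for illustration only; they are not LLVM APIs.

```cpp
#include <cstdint>
#include <optional>
#include <string_view>

// Hypothetical stand-in for the two subtarget queries seen in the diff.
struct ScratchSubtargetModel {
  bool UnalignedScratchAccess; // models hasUnalignedScratchAccess()
  bool FlatScratch;            // models enableFlatScratch(); unused on purpose
};

struct MisalignedResult {
  bool Allowed; // models the function's return value
  bool Fast;    // models *IsFast
};

// Assumed post-change logic for private/flat accesses: the scratch ABI
// (flat scratch or not) plays no role, only the unaligned access mode does.
inline MisalignedResult
allowsMisalignedScratch(const ScratchSubtargetModel &ST,
                        std::uint64_t AlignInBytes) {
  const bool AlignedBy4 = AlignInBytes >= 4;
  return {AlignedBy4 || ST.UnalignedScratchAccess, AlignedBy4};
}

// Assumed OS-based default described in the comment: enabled for amdhsa,
// disabled for amdpal. The behavior for unknown / mesa3d is not stated,
// so this sketch returns no value for anything else.
inline std::optional<bool> assumedUnalignedDefault(std::string_view OS) {
  if (OS == "amdhsa")
    return true;
  if (OS == "amdpal")
    return false;
  return std::nullopt;
}
```

Under these assumptions, a subtarget with flat scratch enabled but the unaligned access mode disabled would reject a 1-byte-aligned private access, whereas the removed code would have accepted it.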
https://github.com/llvm/llvm-project/pull/110219