[llvm] [AMDGPU] Align loop headers to prevent instruction fetch split on GFX950 (PR #181999)
Matt Arsenault via llvm-commits
llvm-commits at lists.llvm.org
Thu Feb 19 02:13:53 PST 2026
================
@@ -18811,6 +18825,30 @@ Align SITargetLowering::getPrefLoopAlignment(MachineLoop *ML) const {
   return CacheLineAlign;
 }
+unsigned SITargetLowering::getMaxPermittedBytesForAlignment(
+    MachineBasicBlock *MBB) const {
+  // GFX950: Limit padding to 4 bytes (one s_nop) for blocks where an 8-byte
+  // instruction could be split by the 32-byte fetch window boundary.
+  // See getPrefLoopAlignment() for context.
+  if (needsFetchWindowAlignment(MBB))
+    return 4;
+  return TargetLowering::getMaxPermittedBytesForAlignment(MBB);
+}
+
+bool SITargetLowering::needsFetchWindowAlignment(
+    const MachineBasicBlock *MBB) const {
----------------
arsenm wrote:
```suggestion
    const MachineBasicBlock &MBB) const {
```
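As a rough illustration of what the suggestion implies, taking the block by const reference means the caller quoted above would dereference its pointer. This is a minimal standalone sketch, not the patch itself: `Block`, `HasWideFirstInst`, and `DefaultMaxBytes` are hypothetical stand-ins for `MachineBasicBlock`, the real predicate logic, and the base-class fallback.

```cpp
// Hypothetical stand-in for MachineBasicBlock; the field is illustrative only.
struct Block {
  bool HasWideFirstInst;
};

// Hypothetical stand-in for the TargetLowering fallback value.
constexpr unsigned DefaultMaxBytes = 0;

// Reference parameter: the predicate cannot be handed a null block.
bool needsFetchWindowAlignment(const Block &MBB) {
  return MBB.HasWideFirstInst;
}

// The caller still receives a pointer, so it dereferences at the call site.
unsigned getMaxPermittedBytesForAlignment(Block *MBB) {
  if (needsFetchWindowAlignment(*MBB))
    return 4; // one s_nop, per the quoted comment
  return DefaultMaxBytes;
}
```

The reference signature documents that the argument is never null and keeps the null check, if one is needed at all, in the caller.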
https://github.com/llvm/llvm-project/pull/181999
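For context on the 4-byte limit in the quoted comment, the arithmetic can be sketched as follows. This assumes instruction offsets are always multiples of 4 bytes and uses the 32-byte window and 4-byte `s_nop` sizes stated in the comment. The function names are hypothetical and do not come from the patch.

```cpp
#include <cstdint>

// Sizes taken from the quoted comment; names are illustrative.
constexpr std::uint64_t FetchWindowBytes = 32;
constexpr std::uint64_t NopBytes = 4;

// True if an instruction of SizeBytes starting at Offset crosses a
// fetch-window boundary, i.e. its first and last byte fall in different windows.
constexpr bool straddlesFetchWindow(std::uint64_t Offset,
                                    std::uint64_t SizeBytes) {
  return Offset / FetchWindowBytes !=
         (Offset + SizeBytes - 1) / FetchWindowBytes;
}

// Padding that moves a straddling instruction to the start of the next window.
// Assumes Offset is 4-byte aligned and SizeBytes is 8.
constexpr std::uint64_t paddingToAvoidSplit(std::uint64_t Offset,
                                            std::uint64_t SizeBytes) {
  return straddlesFetchWindow(Offset, SizeBytes) ? NopBytes : 0;
}

static_assert(straddlesFetchWindow(28, 8), "offset 28 crosses into next window");
static_assert(!straddlesFetchWindow(24, 8), "offset 24 fits in one window");
static_assert(!straddlesFetchWindow(28 + paddingToAvoidSplit(28, 8), 8),
              "one 4-byte nop removes the split");
```

Under those assumptions, an 8-byte instruction is split only when it starts 28 bytes into a window, so a single 4-byte pad is always enough.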