[llvm] [AMDGPU] Decrease default NSA threshold from 3 to 2 (PR #116624)
Jay Foad via llvm-commits
llvm-commits at lists.llvm.org
Mon Nov 18 22:55:43 PST 2024
================
@@ -154,10 +152,10 @@ define <4 x float> @waterfall_loop(<8 x i32> %vgpr_srd) {
; CHECK-NEXT: v_readlane_b32 s17, v16, 1
; CHECK-NEXT: v_readlane_b32 s18, v16, 2
; CHECK-NEXT: v_readlane_b32 s19, v16, 3
-; CHECK-NEXT: buffer_load_dword v0, off, s[0:3], s32 offset:4 ; 4-byte Folded Reload
-; CHECK-NEXT: buffer_load_dword v1, off, s[0:3], s32 offset:8 ; 4-byte Folded Reload
+; CHECK-NEXT: buffer_load_dword v0, off, s[0:3], s32 offset:8 ; 4-byte Folded Reload
+; CHECK-NEXT: buffer_load_dword v1, off, s[0:3], s32 offset:4 ; 4-byte Folded Reload
; CHECK-NEXT: s_waitcnt vmcnt(0)
-; CHECK-NEXT: image_sample v0, v[0:1], s[8:15], s[16:19] dmask:0x1 dim:SQ_RSRC_IMG_2D
+; CHECK-NEXT: image_sample v0, [v0, v1], s[8:15], s[16:19] dmask:0x1 dim:SQ_RSRC_IMG_2D
----------------
jayfoad wrote:
SIShrinkInstructions would convert this to sequential form post-regalloc, but the test is using -O0 so it is not run.
https://github.com/llvm/llvm-project/pull/116624
More information about the llvm-commits
mailing list