[llvm] [AMDGPU] Lazily emit waitcnts on function entry (PR #73122)
Matt Arsenault via llvm-commits
llvm-commits at lists.llvm.org
Tue Nov 28 06:00:15 PST 2023
================
@@ -4,8 +4,8 @@
define float @test_fmed3_f32_known_nnan_ieee_true(float %a) #0 {
; GFX10-LABEL: test_fmed3_f32_known_nnan_ieee_true:
; GFX10: ; %bb.0:
-; GFX10-NEXT: s_waitcnt vmcnt(0) expcnt(0) lgkmcnt(0)
; GFX10-NEXT: v_mul_f32_e64 v0, v0, 2.0 clamp
+; GFX10-NEXT: s_waitcnt vmcnt(0) expcnt(0) lgkmcnt(0)
----------------
arsenm wrote:
We can't guarantee that v0 isn't being written by a pending load, so the s_waitcnt can't be moved after the v_mul_f32_e64 that reads and writes v0.
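
To make the hazard concrete, here is a minimal toy model in Python. It is only a sketch: `ToyGPU`, `issue_load`, `waitcnt_vmcnt0`, `v_mul_clamp`, `callee`, and `run` are hypothetical names invented for illustration, not the hardware or the waitcnt insertion pass. It assumes the caller issued a load into v0 and called the function without waiting for it.

```python
# Toy model of the hazard; every name here is hypothetical.
class ToyGPU:
    def __init__(self):
        self.vgpr = {"v0": 0.0}   # stale contents before the load lands
        self.pending = []         # outstanding loads as (register, value)

    def issue_load(self, reg, value):
        # The load is issued, but its result has not been written yet.
        self.pending.append((reg, value))

    def waitcnt_vmcnt0(self):
        # Stand-in for s_waitcnt vmcnt(0): all outstanding loads land.
        for reg, value in self.pending:
            self.vgpr[reg] = value
        self.pending.clear()

    def v_mul_clamp(self, dst, src, k):
        # Stand-in for v_mul_f32 ... clamp: dst = clamp(src * k, 0.0, 1.0).
        self.vgpr[dst] = min(max(self.vgpr[src] * k, 0.0), 1.0)


def callee(gpu, wait_first):
    if wait_first:
        # Old check lines: wait on entry, then multiply.
        gpu.waitcnt_vmcnt0()
        gpu.v_mul_clamp("v0", "v0", 2.0)
    else:
        # New check lines: multiply, then wait.
        gpu.v_mul_clamp("v0", "v0", 2.0)
        gpu.waitcnt_vmcnt0()
    return gpu.vgpr["v0"]


def run(wait_first):
    gpu = ToyGPU()
    gpu.issue_load("v0", 0.25)  # assumed: caller's load is still in flight
    return callee(gpu, wait_first)
```

In this model, waiting first yields the expected 0.5, while waiting after the multiply reads the stale v0 and then lets the late load overwrite the result.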
https://github.com/llvm/llvm-project/pull/73122