[llvm] [AMDGPU] Lazily emit waitcnts on function entry (PR #73122)

Matt Arsenault via llvm-commits llvm-commits at lists.llvm.org
Tue Nov 28 06:00:15 PST 2023


================
@@ -4,8 +4,8 @@
 define float @test_fmed3_f32_known_nnan_ieee_true(float %a) #0 {
 ; GFX10-LABEL: test_fmed3_f32_known_nnan_ieee_true:
 ; GFX10:       ; %bb.0:
-; GFX10-NEXT:    s_waitcnt vmcnt(0) expcnt(0) lgkmcnt(0)
 ; GFX10-NEXT:    v_mul_f32_e64 v0, v0, 2.0 clamp
+; GFX10-NEXT:    s_waitcnt vmcnt(0) expcnt(0) lgkmcnt(0)
----------------
arsenm wrote:

We can't guarantee that v0 isn't being written by a pending load, so the s_waitcnt can't be moved after the v_mul_f32 that uses v0.
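
To make the hazard concrete, here is a minimal toy model in Python. It is a sketch under stated assumptions, not the hardware or the waitcnt insertion pass: the `ToyWave` class and its methods are hypothetical, and it assumes worst-case timing in which an outstanding load only lands when the wave waits on it. The caller has a load into v0 still in flight when the callee starts.

```python
class ToyWave:
    """Hypothetical toy model of one wave; not real hardware behaviour."""

    def __init__(self):
        self.vgpr = {}      # register name -> current value
        self.pending = []   # outstanding loads as (dest register, value)

    def issue_load(self, reg, value):
        # The load is issued but its result has not been written yet.
        self.pending.append((reg, value))

    def waitcnt_vmcnt0(self):
        # Assumption: worst case, every outstanding load lands only here.
        for reg, value in self.pending:
            self.vgpr[reg] = value
        self.pending.clear()

    def v_mul_clamp(self, dst, src, k):
        # Models v_mul_f32 with the clamp modifier (result clamped to [0, 1]).
        self.vgpr[dst] = min(max(self.vgpr[src] * k, 0.0), 1.0)


def callee(wave, wait_first):
    """Body of the test function, with the wait before or after the multiply."""
    if wait_first:
        wave.waitcnt_vmcnt0()
        wave.v_mul_clamp("v0", "v0", 2.0)
    else:
        wave.v_mul_clamp("v0", "v0", 2.0)
        wave.waitcnt_vmcnt0()
    return wave.vgpr["v0"]


def run(wait_first):
    wave = ToyWave()
    wave.vgpr["v0"] = 0.0          # stale contents of v0
    wave.issue_load("v0", 0.25)    # caller's load of the argument, still pending
    return callee(wave, wait_first)


if __name__ == "__main__":
    print("wait first:", run(True))
    print("wait last: ", run(False))
```

In this model, waiting first multiplies the loaded argument as intended. Waiting last multiplies the stale value, and the pending load then overwrites the result in v0, so the function returns the wrong value either way the race resolves.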

https://github.com/llvm/llvm-project/pull/73122

