[llvm] [AMDGPU] Lazily emit waitcnts on function entry (PR #73122)

Jay Foad via llvm-commits llvm-commits at lists.llvm.org
Tue Dec 5 09:03:05 PST 2023


================
@@ -4,8 +4,8 @@
 define float @test_fmed3_f32_known_nnan_ieee_true(float %a) #0 {
 ; GFX10-LABEL: test_fmed3_f32_known_nnan_ieee_true:
 ; GFX10:       ; %bb.0:
-; GFX10-NEXT:    s_waitcnt vmcnt(0) expcnt(0) lgkmcnt(0)
 ; GFX10-NEXT:    v_mul_f32_e64 v0, v0, 2.0 clamp
+; GFX10-NEXT:    s_waitcnt vmcnt(0) expcnt(0) lgkmcnt(0)
----------------
jayfoad wrote:

Good point. This was a major thinko. I am now marking live-ins to the entry block as possibly depending on any of the counters.
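
To illustrate the idea, here is a minimal toy sketch, not the actual SIInsertWaitcnts code. It assumes a simplified model in which each instruction only lists the registers it reads, every live-in of the entry block starts out as possibly pending on any counter, and a single conservative wait clears everything. The names `Inst` and `insertEntryWait` are hypothetical and exist only for this example.

```cpp
#include <set>
#include <string>
#include <vector>

// Hypothetical toy instruction: its text plus the registers it reads.
struct Inst {
  std::string Text;
  std::vector<std::string> Uses;
};

// Lazily insert one conservative wait before the first instruction that
// reads a register which may still depend on an outstanding counter.
// Simplification: writes to pending registers are ignored in this model.
std::vector<std::string> insertEntryWait(const std::vector<Inst> &Body,
                                         const std::set<std::string> &LiveIns) {
  // Assumption: every entry live-in may depend on any of the counters.
  std::set<std::string> Pending = LiveIns;
  std::vector<std::string> Out;
  for (const Inst &I : Body) {
    bool NeedsWait = false;
    for (const std::string &Reg : I.Uses)
      if (Pending.count(Reg))
        NeedsWait = true;
    if (NeedsWait) {
      Out.push_back("s_waitcnt vmcnt(0) expcnt(0) lgkmcnt(0)");
      Pending.clear(); // waiting on all counters resolves every register
    }
    Out.push_back(I.Text);
  }
  return Out;
}
```

In this model, with `v0` as a live-in, the wait lands before `v_mul_f32_e64 v0, v0, 2.0 clamp` rather than after it, while instructions that do not touch a live-in can still run ahead of the wait.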

https://github.com/llvm/llvm-project/pull/73122

