[llvm] [AMDGPU] Lazily emit waitcnts on function entry (PR #73122)
Jay Foad via llvm-commits
llvm-commits at lists.llvm.org
Tue Dec 5 09:03:05 PST 2023
================
@@ -4,8 +4,8 @@
define float @test_fmed3_f32_known_nnan_ieee_true(float %a) #0 {
; GFX10-LABEL: test_fmed3_f32_known_nnan_ieee_true:
; GFX10: ; %bb.0:
-; GFX10-NEXT: s_waitcnt vmcnt(0) expcnt(0) lgkmcnt(0)
; GFX10-NEXT: v_mul_f32_e64 v0, v0, 2.0 clamp
+; GFX10-NEXT: s_waitcnt vmcnt(0) expcnt(0) lgkmcnt(0)
----------------
jayfoad wrote:
Good point. This was a major thinko. I am now marking live-ins to the entry block as possibly depending on any of the counters.
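
To illustrate the idea of treating entry-block live-ins as possibly pending on any counter, here is a minimal toy sketch. It is not the code from the PR or from the real AMDGPU waitcnt insertion pass: `Inst`, `insertEntryWaitLazily`, and the string-based register names are hypothetical, and all counters are collapsed into a single "pending" set. Under those assumptions, the full wait is deferred until the first instruction that reads a live-in register, so an instruction reading `v0` gets the wait before it rather than after.

```cpp
#include <cassert>
#include <set>
#include <string>
#include <vector>

// Hypothetical toy instruction: its printed text plus the registers it reads.
struct Inst {
  std::string text;
  std::vector<std::string> uses;
};

// Toy model (assumption, not the real pass): every live-in register starts out
// "pending" on all counters. The full wait is emitted lazily, immediately
// before the first instruction that reads a pending register.
std::vector<std::string>
insertEntryWaitLazily(const std::vector<std::string> &liveIns,
                      const std::vector<Inst> &block) {
  std::set<std::string> pending(liveIns.begin(), liveIns.end());
  std::vector<std::string> out;
  for (const Inst &I : block) {
    assert(!I.text.empty() && "toy instructions must have printable text");
    bool needsWait = false;
    for (const std::string &reg : I.uses)
      if (pending.count(reg))
        needsWait = true;
    if (needsWait) {
      out.push_back("s_waitcnt vmcnt(0) expcnt(0) lgkmcnt(0)");
      pending.clear(); // a full wait resolves every pending register
    }
    out.push_back(I.text);
  }
  return out;
}
```

In this model, an instruction that reads no live-in register can still be emitted ahead of the wait, which is the "lazy" part; only the first read of a live-in forces the wait to appear.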
https://github.com/llvm/llvm-project/pull/73122