[llvm] [AMDGPU] Implement vop3p complex pattern optimization for gisel (PR #130234)
Diana Picus via llvm-commits
llvm-commits at lists.llvm.org
Fri Mar 21 03:01:02 PDT 2025
================
@@ -68,8 +68,7 @@ define float @v_fdot2_neg_c(<2 x half> %a, <2 x half> %b, float %c) {
; GFX906-LABEL: v_fdot2_neg_c:
; GFX906: ; %bb.0:
; GFX906-NEXT: s_waitcnt vmcnt(0) expcnt(0) lgkmcnt(0)
-; GFX906-NEXT: v_xor_b32_e32 v2, 0x80000000, v2
-; GFX906-NEXT: v_dot2_f32_f16 v0, v0, v1, v2
+; GFX906-NEXT: v_dot2_f32_f16 v0, v0, v1, v2 neg_lo:[0,0,1] neg_hi:[0,0,1]
----------------
rovka wrote:
Why do we negate both halves? The IR is only doing fneg on a float, not on <2 x half>.
https://github.com/llvm/llvm-project/pull/130234
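
To make the question concrete, here is a minimal sketch of the bit-level difference between negating a 32-bit float and negating each half of a packed <2 x half>. The 0x80000000 mask is the one in the removed v_xor_b32 line. The 0x80008000 mask and the helper names are assumptions introduced for illustration only. The sketch makes no claim about how the hardware interprets neg_lo/neg_hi on src2 of v_dot2_f32_f16; it only shows why the two operations are not interchangeable on raw bits.

```python
import struct

# Sign-bit mask taken from the removed v_xor_b32 line: flips bit 31 only.
SCALAR_FNEG_MASK = 0x80000000
# Hypothetical mask for illustration: the sign bits of both 16-bit halves
# (bits 31 and 15), i.e. what a per-half negation of <2 x half> would flip.
PACKED_FNEG_MASK = 0x80008000


def f32_bits(x: float) -> int:
    """Return the IEEE 754 binary32 bit pattern of x as an unsigned int."""
    return struct.unpack("<I", struct.pack("<f", x))[0]


def bits_f32(b: int) -> float:
    """Reinterpret a 32-bit pattern as an IEEE 754 binary32 value."""
    return struct.unpack("<f", struct.pack("<I", b & 0xFFFFFFFF))[0]


def fneg_f32(x: float) -> float:
    """Scalar fneg: flip only the float's sign bit (bit 31)."""
    return bits_f32(f32_bits(x) ^ SCALAR_FNEG_MASK)


def fneg_as_packed_halves(x: float) -> float:
    """Flip bits 31 and 15, as if the register held two half values."""
    return bits_f32(f32_bits(x) ^ PACKED_FNEG_MASK)


if __name__ == "__main__":
    print(fneg_f32(1.5))               # sign flipped, magnitude unchanged
    print(fneg_as_packed_halves(1.5))  # bit 15 of the mantissa also changes
```

Bit 15 of a binary32 value belongs to the mantissa, so a literal per-half sign flip would change the magnitude of the float as well as its sign.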