[llvm] [AMDGPU] Implement vop3p complex pattern optmization for gisel (PR #130234)

via llvm-commits llvm-commits at lists.llvm.org
Mon Mar 24 03:10:45 PDT 2025


================
@@ -68,8 +68,7 @@ define float @v_fdot2_neg_c(<2 x half> %a, <2 x half> %b, float %c) {
 ; GFX906-LABEL: v_fdot2_neg_c:
 ; GFX906:       ; %bb.0:
 ; GFX906-NEXT:    s_waitcnt vmcnt(0) expcnt(0) lgkmcnt(0)
-; GFX906-NEXT:    v_xor_b32_e32 v2, 0x80000000, v2
-; GFX906-NEXT:    v_dot2_f32_f16 v0, v0, v1, v2
+; GFX906-NEXT:    v_dot2_f32_f16 v0, v0, v1, v2 neg_lo:[0,0,1] neg_hi:[0,0,1]
----------------
Shoreshen wrote:

Hi @rovka, to fix the case where the neg is applied to a float instead of a <2 x half>:
1. Separated the NEG status for HI and LO.
2. As shown by [this](https://alive2.llvm.org/ce/z/LuC9ve), negating a float is equivalent to negating the high half of a <2 x half>.
3. Added test cases (e.g. `v_fmul_v2f16_partial_neg`) for float neg.

However, this case does not take effect, since every LLT that is not <2 x Scalar Type> is blocked for safety (here, c is a float).
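
The equivalence in item 2 can be checked at the bit level. The snippet below is only an illustrative sketch, not code from the PR: it assumes the 32-bit register is viewed as two 16-bit lanes, with lane 0 in the low bits and lane 1 in the high bits, and all helper names are hypothetical. It shows that XOR-ing a float with `0x80000000` (as the removed `v_xor_b32_e32` did) produces the same bits as flipping only the sign bit of the high half.

```python
import struct

SIGN32 = 0x80000000  # sign bit of a 32-bit float
SIGN16 = 0x8000      # sign bit of a 16-bit half


def float_to_bits(x: float) -> int:
    """Return the IEEE-754 binary32 bit pattern of x."""
    return struct.unpack("<I", struct.pack("<f", x))[0]


def fneg_f32_bits(bits: int) -> int:
    """Negate a float by flipping bit 31 of its bit pattern."""
    return bits ^ SIGN32


def split_halves(bits: int) -> tuple[int, int]:
    """View 32 bits as (low half, high half), 16 bits each."""
    return bits & 0xFFFF, (bits >> 16) & 0xFFFF


def join_halves(lo: int, hi: int) -> int:
    """Pack two 16-bit halves back into one 32-bit word."""
    return (hi << 16) | lo


def fneg_high_half(bits: int) -> int:
    """Flip only the sign bit of the high half, leaving the low half alone."""
    lo, hi = split_halves(bits)
    return join_halves(lo, hi ^ SIGN16)


# Both operations agree for every sampled value.
for value in (1.0, -2.5, 0.0, 3.14159):
    bits = float_to_bits(value)
    assert fneg_f32_bits(bits) == fneg_high_half(bits)
```

The two agree because bit 31 of the 32-bit word is also bit 15 of the high half, so no other bit is touched in either view.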

https://github.com/llvm/llvm-project/pull/130234

