[llvm] [SelectionDAG] Handle `fneg`/`fabs`/`fcopysign` in `SimplifyDemandedBits` (PR #139239)
Matt Arsenault via llvm-commits
llvm-commits at lists.llvm.org
Fri May 9 03:45:01 PDT 2025
================
@@ -776,20 +776,16 @@ define <2 x half> @add_select_fabs_negk_negk_v2f16(<2 x i32> %c, <2 x half> %x)
; CI-LABEL: add_select_fabs_negk_negk_v2f16:
; CI: ; %bb.0:
; CI-NEXT: s_waitcnt vmcnt(0) expcnt(0) lgkmcnt(0)
-; CI-NEXT: v_cmp_eq_u32_e32 vcc, 0, v0
-; CI-NEXT: v_cndmask_b32_e64 v0, -1.0, -2.0, vcc
-; CI-NEXT: v_cmp_eq_u32_e32 vcc, 0, v1
-; CI-NEXT: v_cndmask_b32_e64 v1, -1.0, -2.0, vcc
; CI-NEXT: v_cvt_f16_f32_e32 v3, v3
; CI-NEXT: v_cvt_f16_f32_e32 v2, v2
-; CI-NEXT: v_cvt_f16_f32_e32 v0, v0
-; CI-NEXT: v_cvt_f16_f32_e32 v1, v1
+; CI-NEXT: v_cmp_eq_u32_e32 vcc, 0, v1
+; CI-NEXT: v_cndmask_b32_e64 v1, -1.0, -2.0, vcc
; CI-NEXT: v_cvt_f32_f16_e32 v3, v3
; CI-NEXT: v_cvt_f32_f16_e32 v2, v2
-; CI-NEXT: v_cvt_f32_f16_e64 v0, |v0|
-; CI-NEXT: v_cvt_f32_f16_e64 v1, |v1|
-; CI-NEXT: v_add_f32_e32 v0, v0, v2
-; CI-NEXT: v_add_f32_e32 v1, v1, v3
+; CI-NEXT: v_cmp_eq_u32_e32 vcc, 0, v0
+; CI-NEXT: v_cndmask_b32_e64 v0, -1.0, -2.0, vcc
+; CI-NEXT: v_sub_f32_e32 v1, v3, v1
+; CI-NEXT: v_sub_f32_e32 v0, v2, v0
----------------
arsenm wrote:
For AMDGPU we can commute fsub by switching to add and using a source modifier at the cost of encoding size (although I don't think we implement that particular fold). We also don't have any memory folding
https://github.com/llvm/llvm-project/pull/139239
More information about the llvm-commits
mailing list