[llvm] [RISCV] Fold (fmv_x_h/w (load)) to an integer load. (PR #109900)
Philip Reames via llvm-commits
llvm-commits at lists.llvm.org
Wed Sep 25 07:49:09 PDT 2024
================
@@ -516,41 +516,33 @@ define void @fabs_v8f16(ptr %x) {
; ZVFHMIN-RV32-NEXT: vle16.v v8, (a0)
; ZVFHMIN-RV32-NEXT: mv a1, sp
; ZVFHMIN-RV32-NEXT: vse16.v v8, (a1)
-; ZVFHMIN-RV32-NEXT: flh fa5, 2(sp)
-; ZVFHMIN-RV32-NEXT: flh fa4, 0(sp)
-; ZVFHMIN-RV32-NEXT: flh fa3, 4(sp)
-; ZVFHMIN-RV32-NEXT: fmv.x.h a1, fa5
-; ZVFHMIN-RV32-NEXT: fmv.x.h a2, fa4
-; ZVFHMIN-RV32-NEXT: lui a3, 8
-; ZVFHMIN-RV32-NEXT: fmv.x.h a4, fa3
-; ZVFHMIN-RV32-NEXT: flh fa5, 6(sp)
-; ZVFHMIN-RV32-NEXT: addi a3, a3, -1
-; ZVFHMIN-RV32-NEXT: and a2, a2, a3
-; ZVFHMIN-RV32-NEXT: vmv.v.x v8, a2
-; ZVFHMIN-RV32-NEXT: fmv.x.h a2, fa5
-; ZVFHMIN-RV32-NEXT: flh fa5, 10(sp)
-; ZVFHMIN-RV32-NEXT: and a1, a1, a3
+; ZVFHMIN-RV32-NEXT: lhu a1, 2(sp)
+; ZVFHMIN-RV32-NEXT: lui a2, 8
+; ZVFHMIN-RV32-NEXT: lhu a3, 0(sp)
+; ZVFHMIN-RV32-NEXT: addi a2, a2, -1
+; ZVFHMIN-RV32-NEXT: and a1, a1, a2
+; ZVFHMIN-RV32-NEXT: lhu a4, 4(sp)
+; ZVFHMIN-RV32-NEXT: and a3, a3, a2
+; ZVFHMIN-RV32-NEXT: vmv.v.x v8, a3
; ZVFHMIN-RV32-NEXT: vslide1down.vx v8, v8, a1
-; ZVFHMIN-RV32-NEXT: and a4, a4, a3
-; ZVFHMIN-RV32-NEXT: fmv.x.h a1, fa5
-; ZVFHMIN-RV32-NEXT: flh fa5, 8(sp)
+; ZVFHMIN-RV32-NEXT: and a4, a4, a2
+; ZVFHMIN-RV32-NEXT: lhu a1, 6(sp)
; ZVFHMIN-RV32-NEXT: vslide1down.vx v8, v8, a4
-; ZVFHMIN-RV32-NEXT: and a2, a2, a3
-; ZVFHMIN-RV32-NEXT: vslide1down.vx v8, v8, a2
-; ZVFHMIN-RV32-NEXT: fmv.x.h a2, fa5
-; ZVFHMIN-RV32-NEXT: flh fa5, 12(sp)
-; ZVFHMIN-RV32-NEXT: and a1, a1, a3
-; ZVFHMIN-RV32-NEXT: and a2, a2, a3
-; ZVFHMIN-RV32-NEXT: vmv.v.x v9, a2
-; ZVFHMIN-RV32-NEXT: fmv.x.h a2, fa5
-; ZVFHMIN-RV32-NEXT: flh fa5, 14(sp)
+; ZVFHMIN-RV32-NEXT: lhu a3, 10(sp)
+; ZVFHMIN-RV32-NEXT: lhu a4, 8(sp)
+; ZVFHMIN-RV32-NEXT: and a1, a1, a2
----------------
preames wrote:
Off topic, but we might be able to improve the `and` chains here.
1) With bclri, I think we can avoid the constant materialization.
2) It looks like we do this for every lane; we could possibly move the masking into a single vector op.
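For context on point 1: the `lui a2, 8` / `addi a2, a2, -1` pair materializes the constant 0x7fff, and the subsequent `and` clears bit 15 (the fp16 sign bit). A single `bclri rd, rs1, 15` from the Zbs extension clears that bit directly, with no constant materialization. A minimal sketch of the bit-level equivalence (not LLVM code, just the arithmetic):

```python
# Three-instruction sequence from the test output:
#   lui  a2, 8        # a2 = 8 << 12 = 0x8000
#   addi a2, a2, -1   # a2 = 0x7fff
#   and  a1, a1, a2   # a1 &= 0x7fff
def mask_via_and(x: int) -> int:
    return x & 0x7FFF              # mask with materialized 0x7fff

# Single Zbs instruction:
#   bclri a1, a1, 15  # clear bit 15 of a1
def mask_via_bclri(x: int) -> int:
    return x & ~(1 << 15)          # clear only bit 15, as bclri would

# For any 16-bit input the two agree:
for v in (0x0000, 0x7FFF, 0x8000, 0xABCD, 0xFFFF):
    assert mask_via_and(v) == mask_via_bclri(v)
```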
https://github.com/llvm/llvm-project/pull/109900