[llvm] [ARM] Switch to soft promoting half types. (PR #80440)
Harald van Dijk via llvm-commits
llvm-commits at lists.llvm.org
Fri Feb 2 13:40:01 PST 2024
================
@@ -2,78 +2,113 @@
define arm_aapcs_vfpcc { <8 x half>, <8 x half> } @f1() {
; CHECK-LABEL: _f1
-; CHECK: vpush {d8}
-; CHECK-NEXT: vmov.f64 d8, #5.000000e-01
-; CHECK-NEXT: vmov.i32 d8, #0x0
-; CHECK-NEXT: vmov.i32 d0, #0x0
-; CHECK-NEXT: vmov.i32 d1, #0x0
-; CHECK-NEXT: vmov.i32 d2, #0x0
-; CHECK-NEXT: vmov.i32 d3, #0x0
-; CHECK-NEXT: vmov.i32 d4, #0x0
-; CHECK-NEXT: vmov.i32 d5, #0x0
-; CHECK-NEXT: vmov.i32 d6, #0x0
-; CHECK-NEXT: vmov.i32 d7, #0x0
-; CHECK-NEXT: vmov.f32 s1, s16
-; CHECK-NEXT: vmov.f32 s3, s16
-; CHECK-NEXT: vmov.f32 s5, s16
-; CHECK-NEXT: vmov.f32 s7, s16
-; CHECK-NEXT: vmov.f32 s9, s16
-; CHECK-NEXT: vmov.f32 s11, s16
-; CHECK-NEXT: vmov.f32 s13, s16
-; CHECK-NEXT: vmov.f32 s15, s16
-; CHECK-NEXT: vpop {d8}
+; CHECK: vpush {d8, d9, d10, d11}
+; CHECK-NEXT: vmov.i32 q8, #0x0
+; CHECK-NEXT: vmov.u16 r0, d16[0]
+; CHECK-NEXT: vmov d4, r0, r0
+; CHECK-NEXT: vmov.u16 r0, d16[1]
+; CHECK-NEXT: vmov d8, r0, r0
+; CHECK-NEXT: vmov.u16 r0, d16[2]
+; CHECK-NEXT: vmov d5, r0, r0
+; CHECK-NEXT: vmov.u16 r0, d16[3]
+; CHECK-NEXT: vmov d9, r0, r0
+; CHECK-NEXT: vmov.u16 r0, d17[0]
+; CHECK-NEXT: vmov d6, r0, r0
+; CHECK-NEXT: vmov.u16 r0, d17[1]
+; CHECK-NEXT: vmov d10, r0, r0
+; CHECK-NEXT: vmov.u16 r0, d17[2]
+; CHECK-NEXT: vmov d7, r0, r0
+; CHECK-NEXT: vmov.u16 r0, d17[3]
+; CHECK-NEXT: vmov d11, r0, r0
+; CHECK: vmov.f32 s0, s8
+; CHECK: vmov.f32 s1, s16
+; CHECK: vmov.f32 s2, s10
+; CHECK: vmov.f32 s3, s18
+; CHECK: vmov.f32 s4, s12
+; CHECK: vmov.f32 s5, s20
+; CHECK: vmov.f32 s6, s14
+; CHECK: vmov.f32 s7, s22
+; CHECK: vmov.f32 s9, s16
+; CHECK: vmov.f32 s11, s18
+; CHECK: vmov.f32 s13, s20
+; CHECK: vmov.f32 s15, s22
+; CHECK: vpop {d8, d9, d10, d11}
----------------
hvdijk wrote:
Oh, there are two reasons we don't optimise this. Firstly, `v8f16` is not a legal type, and this combine only fires for legal types. Secondly, null constants are special-cased so that the extract is folded even if the vector has multiple uses, but null FP constants do not count as null constants. We get better codegen with
```diff
diff --git a/llvm/lib/CodeGen/SelectionDAG/DAGCombiner.cpp b/llvm/lib/CodeGen/SelectionDAG/DAGCombiner.cpp
index b17724cd0720..c707de4ad94b 100644
--- a/llvm/lib/CodeGen/SelectionDAG/DAGCombiner.cpp
+++ b/llvm/lib/CodeGen/SelectionDAG/DAGCombiner.cpp
@@ -22241,8 +22241,7 @@ SDValue DAGCombiner::visitEXTRACT_VECTOR_ELT(SDNode *N) {
// extract_vector_elt (build_vector x, y), 1 -> y
if (((IndexC && VecOp.getOpcode() == ISD::BUILD_VECTOR) ||
- VecOp.getOpcode() == ISD::SPLAT_VECTOR) &&
- TLI.isTypeLegal(VecVT)) {
+ VecOp.getOpcode() == ISD::SPLAT_VECTOR)) {
assert((VecOp.getOpcode() != ISD::BUILD_VECTOR ||
VecVT.isFixedLengthVector()) &&
"BUILD_VECTOR used for scalable vectors");
@@ -22252,7 +22251,7 @@ SDValue DAGCombiner::visitEXTRACT_VECTOR_ELT(SDNode *N) {
EVT InEltVT = Elt.getValueType();
if (VecOp.hasOneUse() || TLI.aggressivelyPreferBuildVectorSources(VecVT) ||
- isNullConstant(Elt)) {
+ isNullConstant(Elt) || isNullFPConstant(Elt)) {
// Sometimes build_vector's scalar input types do not match result type.
if (ScalarVT == InEltVT)
return Elt;
```
But presumably there is a reason why things are done the way they are. Let me merge this PR as is, and see which improvements work in a follow-up PR.
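
To make the effect of the two hunks above concrete, here is a minimal standalone sketch of the guard before and after the patch. It is a toy model, not DAGCombiner code: `IntConst`, `FPConst`, `Other`, `Elt`, `oldGuard`, `newGuard` and the `*Toy` predicates are hypothetical names made up for illustration, and the assumption that a "null FP constant" means positive zero only (not `-0.0`) should be checked against the real `isNullFPConstant`.

```cpp
#include <cassert>
#include <cmath>
#include <variant>

// Toy stand-ins for a build_vector/splat operand. These are NOT the real
// SelectionDAG node classes; they only model "integer constant",
// "FP constant" and "anything else".
struct IntConst { long long value; };
struct FPConst { double value; };
struct Other {};
using Elt = std::variant<IntConst, FPConst, Other>;

// Models isNullConstant: true only for an integer constant equal to 0.
bool isNullConstantToy(const Elt &e) {
  if (const auto *c = std::get_if<IntConst>(&e))
    return c->value == 0;
  return false;
}

// Models isNullFPConstant, assuming it accepts +0.0 and rejects -0.0.
bool isNullFPConstantToy(const Elt &e) {
  if (const auto *c = std::get_if<FPConst>(&e))
    return c->value == 0.0 && !std::signbit(c->value);
  return false;
}

// Guard before the patch: the vector type must be legal, and the operand is
// forwarded only for a single-use vector, a target that prefers build_vector
// sources, or an integer zero.
bool oldGuard(bool typeLegal, bool oneUse, bool prefersSources, const Elt &e) {
  return typeLegal && (oneUse || prefersSources || isNullConstantToy(e));
}

// Guard after the patch: the legality check is gone and FP zero is accepted
// alongside integer zero.
bool newGuard(bool oneUse, bool prefersSources, const Elt &e) {
  return oneUse || prefersSources || isNullConstantToy(e) ||
         isNullFPConstantToy(e);
}

// The situation in f1 as described above: an illegal vector type, a zero FP
// element, and a vector with more than one use.
void demo() {
  Elt fpZero = FPConst{0.0};
  assert(!oldGuard(/*typeLegal=*/false, /*oneUse=*/false,
                   /*prefersSources=*/false, fpZero));
  assert(newGuard(/*oneUse=*/false, /*prefersSources=*/false, fpZero));
}
```

In this model either change alone is not enough for the `f1` case: with only the legality check dropped, the multi-use FP zero still fails the inner condition, and with only `isNullFPConstant` added, the illegal `v8f16` type still blocks the fold.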
https://github.com/llvm/llvm-project/pull/80440