[PATCH] D118343: [DAGCombiner] Fix invalid size request in combineRepeatedFPDivisors
Cullen Rhodes via Phabricator via llvm-commits
llvm-commits at lists.llvm.org
Thu Jan 27 03:15:46 PST 2022
c-rhodes created this revision.
c-rhodes added reviewers: sdesmalen, david-arm, dmgreen.
Herald added subscribers: ecnelises, pengfei, hiraditya, kristof.beyls.
c-rhodes requested review of this revision.
Herald added a project: LLVM.
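combineRepeatedFPDivisors called getVectorNumElements, which is not a
valid size request for scalable vectors. Use getVectorMinNumElements
instead.

On AArch64 the combine fires for the <vscale x 4 x float> case because
its minimum element count (4) is above the fdiv threshold (3), but the
resulting codegen (splat + vector fdiv + vector fmul) is worse than for
the <vscale x 2 x double> case (splat + vector fdiv), where the minimum
element count (2) is below the threshold and the combine does not fire.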
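The early-exit arithmetic can be sketched as a small standalone C++
model. This is only an illustration: combineFires and its parameters are
hypothetical names, not LLVM API, and the threshold of 3 is the AArch64
value quoted above. It assumes one use of the splat divisor, as in the
tests in this patch.

```cpp
// Hypothetical stand-in for the early-exit test in
// combineRepeatedFPDivisors; not the real LLVM code.
// Returns true when the combine would go ahead.
constexpr bool combineFires(unsigned divisorUses, unsigned minNumElts,
                            unsigned minUses) {
  // Mirrors: if (!MinUses || (uses * NumElts) < MinUses) return SDValue();
  return minUses != 0 && divisorUses * minNumElts >= minUses;
}

// <vscale x 4 x float>: 1 use * 4 min elements = 4 >= 3, so it fires.
static_assert(combineFires(1, 4, 3), "nxv4f32 should be combined");

// <vscale x 2 x double>: 1 use * 2 min elements = 2 < 3, so it bails.
static_assert(!combineFires(1, 2, 3), "nxv2f64 should not be combined");

// A threshold of 0 means the target does not want the transform.
static_assert(!combineFires(8, 4, 0), "disabled when threshold is 0");
```

The checks are compile-time only, so building the snippet is enough to
confirm the two cases described above.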
If scalarizeBinOpOfSplats can turn the result into a scalar FP division
it may be cheaper, but that appears to be predicated on the
isExtractVecEltCheap TLI hook, which is implemented for x86 but not
AArch64. Perhaps combineRepeatedFPDivisors should bail out for vectors
unless the division can be converted into a scalar FP division.
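One conservative reading of that suggestion is sketched below as a
standalone C++ model. It is not part of this patch and not LLVM code:
DivisorUse, effectiveUses and shouldCombine are hypothetical names, and
scalarDivReachable merely stands in for "the target reports cheap element
extraction, so the splat fdiv can become a scalar fdiv". Under this
reading the splat factor only counts towards the threshold when the
scalar form is reachable.

```cpp
// Hypothetical description of one repeated divisor; not an LLVM type.
struct DivisorUse {
  bool isSplatVector;      // divisor is a splat of a scalar
  bool scalarDivReachable; // assumed: cheap extract, scalar fdiv possible
  unsigned uses;           // number of fdivs sharing the divisor
  unsigned minNumElts;     // minimum element count of the vector type
};

// Count the splat factor only if the division can become scalar.
constexpr unsigned effectiveUses(const DivisorUse &d) {
  return (d.isSplatVector && d.scalarDivReachable) ? d.uses * d.minNumElts
                                                   : d.uses;
}

constexpr bool shouldCombine(const DivisorUse &d, unsigned minUses) {
  return minUses != 0 && effectiveUses(d) >= minUses;
}

// nxv4f32 splat, one use, no cheap extract: 1 < 3, so bail out.
static_assert(!shouldCombine(DivisorUse{true, false, 1, 4}, 3), "bail");

// Same shape where a scalar fdiv is reachable: 1 * 4 >= 3, so combine.
static_assert(shouldCombine(DivisorUse{true, true, 1, 4}, 3), "combine");

// Three real uses of a non-splat divisor still meet the threshold.
static_assert(shouldCombine(DivisorUse{false, false, 3, 1}, 3), "combine");
```

With this policy the <vscale x 4 x float> test above would keep a single
vector fdiv on a target without cheap extraction, while divisors that are
genuinely reused three or more times would still be combined.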
Repository:
rG LLVM Github Monorepo
https://reviews.llvm.org/D118343
Files:
llvm/lib/CodeGen/SelectionDAG/DAGCombiner.cpp
llvm/test/CodeGen/AArch64/fdiv-combine-vec.ll
Index: llvm/test/CodeGen/AArch64/fdiv-combine-vec.ll
===================================================================
--- /dev/null
+++ llvm/test/CodeGen/AArch64/fdiv-combine-vec.ll
@@ -0,0 +1,36 @@
+; NOTE: Assertions have been autogenerated by utils/update_llc_test_checks.py
+; RUN: llc -mtriple=aarch64 < %s | FileCheck %s
+
+define <vscale x 4 x float> @splat_fdiv_nxv4f32(<vscale x 4 x float> %vx, float %y) #0 {
+; CHECK-LABEL: splat_fdiv_nxv4f32:
+; CHECK: // %bb.0: // %entry
+; CHECK-NEXT: // kill: def $s1 killed $s1 def $z1
+; CHECK-NEXT: fmov z2.s, #1.00000000
+; CHECK-NEXT: ptrue p0.s
+; CHECK-NEXT: mov z1.s, s1
+; CHECK-NEXT: fdivr z1.s, p0/m, z1.s, z2.s
+; CHECK-NEXT: fmul z0.s, z0.s, z1.s
+; CHECK-NEXT: ret
+entry:
+ %vy.ins = insertelement <vscale x 4 x float> poison, float %y, i64 0
+ %splat = shufflevector <vscale x 4 x float> %vy.ins, <vscale x 4 x float> poison, <vscale x 4 x i32> zeroinitializer
+ %div = fdiv fast <vscale x 4 x float> %vx, %splat
+ ret <vscale x 4 x float> %div
+}
+
+define <vscale x 2 x double> @splat_fdiv_nxv2f64(<vscale x 2 x double> %vx, double %y) #0 {
+; CHECK-LABEL: splat_fdiv_nxv2f64:
+; CHECK: // %bb.0: // %entry
+; CHECK-NEXT: // kill: def $d1 killed $d1 def $z1
+; CHECK-NEXT: ptrue p0.d
+; CHECK-NEXT: mov z1.d, d1
+; CHECK-NEXT: fdiv z0.d, p0/m, z0.d, z1.d
+; CHECK-NEXT: ret
+entry:
+ %vy.ins = insertelement <vscale x 2 x double> poison, double %y, i64 0
+ %splat = shufflevector <vscale x 2 x double> %vy.ins, <vscale x 2 x double> poison, <vscale x 2 x i32> zeroinitializer
+ %div = fdiv fast <vscale x 2 x double> %vx, %splat
+ ret <vscale x 2 x double> %div
+}
+
+attributes #0 = { "target-features"="+sve" }
Index: llvm/lib/CodeGen/SelectionDAG/DAGCombiner.cpp
===================================================================
--- llvm/lib/CodeGen/SelectionDAG/DAGCombiner.cpp
+++ llvm/lib/CodeGen/SelectionDAG/DAGCombiner.cpp
@@ -14537,7 +14537,7 @@
   unsigned NumElts = 1;
   EVT VT = N->getValueType(0);
   if (VT.isVector() && DAG.isSplatValue(N1))
-    NumElts = VT.getVectorNumElements();
+    NumElts = VT.getVectorMinNumElements();
   if (!MinUses || (N1->use_size() * NumElts) < MinUses)
     return SDValue();