[PATCH] D58015: [SelectionDAG][AArch64] Legalize VECREDUCE

Sander de Smalen via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Mon Mar 11 04:27:13 PDT 2019


sdesmalen accepted this revision.
sdesmalen added a comment.
This revision is now accepted and ready to land.

Thanks for making these changes, I think the patch looks good now!



================
Comment at: lib/CodeGen/SelectionDAG/LegalizeVectorTypes.cpp:4264
+  unsigned OldElts = N->getOperand(0).getValueType().getVectorNumElements();
+  return TLI.expandVecReduce(Reduce.getNode(), DAG, OldElts);
+}
----------------
nikic wrote:
> sdesmalen wrote:
> > Do we want to expand the operation here, or just widen the input operand (inserted with zeros as you suggest in the comment)?
> > If a target just wants the operation to be widened, but does have support for reductions on the widened vectors, we probably prefer that over the default lowering that expandVecReduce provides.
> I agree that this is the better approach in general and have implemented it now. It's slightly more complicated than just padding with zeros, as the neutral element is different for the different reductions.
> 
> Doing this gives some pretty bad code for cases like `v1i8`, where we end up inserting 7 zero elements before performing the reduction. To avoid this, I've added a DAGCombine to convert a VECREDUCE of one element into an EXTRACT_VECTOR_ELT, but generating good code for cases like `v9i8` would need more work.
Thanks for this!


================
Comment at: llvm/lib/CodeGen/SelectionDAG/LegalizeIntegerTypes.cpp:3918
+  // The VECREDUCE result size may be larger than the element size, so
+  // we can simplify change the result type.
+  SDLoc dl(N);
----------------
nit: simplify change -> simply change?


================
Comment at: llvm/lib/CodeGen/SelectionDAG/LegalizeVectorTypes.cpp:4307
+  unsigned WideElts = WideVT.getVectorNumElements();
+  for (unsigned Idx = OrigElts; Idx < WideElts; Idx++)
+    Op = DAG.getNode(ISD::INSERT_VECTOR_ELT, dl, WideVT, Op, NeutralElem,
----------------
I wanted to suggest doing a build_vector of all NeutralElems, and then a INSERT_SUBVECTOR, so you can benefit from cheap 'splat immediate' instructions, but unfortunately you then run into missing pieces of legalisation of INSERT_SUBVECTOR.


CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D58015/new/

https://reviews.llvm.org/D58015





More information about the llvm-commits mailing list