[PATCH] D133116: [AArch64][SVE] Optimise repeated floating-point complex patterns to splat

Matt Devereau via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Thu Nov 10 07:53:32 PST 2022


MattDevereau added inline comments.


================
Comment at: llvm/lib/Target/AArch64/AArch64ISelLowering.cpp:20053
+      Sequence.push_back(InsertVecElt.getOperand(1));
+      InsertVecElt = InsertVecElt.getOperand(0);
+    } else if (InsertVecElt.getOpcode() == ISD::SCALAR_TO_VECTOR) {
----------------
peterwaller-arm wrote:
> Nothing here considers the insertion index of the insert_vector_elt, and although there could be a canonicalization which puts them in order, I'm not certain can rely on the order of the operations to give you the full sequence (e.g. what happens if some insertions are missing?).
I've added logic to check the ordering with `NumElements`. I've added tests `dupq_f32_repeat_complex_unordered`, `dupq_f32_repeat_complex_rev_unordered`and `dupq_f16_repeat_complex_omit_pair` to assert this


================
Comment at: llvm/lib/Target/AArch64/AArch64ISelLowering.cpp:20064
+  for (unsigned i = 0; i < Sequence.size() / 2; i++) {
+    if (Sequence[i] != Sequence[i + 2])
+      return SDValue();
----------------
peterwaller-arm wrote:
> This condition doesn't look right: it doesn't consider all of the elements of the sequence, only N/2+2 of them.
I've changed this to `RSequence.size() - 2;`.  I've added tests `dupq_f16_repeat_complex_mismatched_front`, `dupq_f16_repeat_complex_mismatched_end` and `dupq_f16_repeat_complex_mismatched_middle` to assert this.


================
Comment at: llvm/lib/Target/AArch64/AArch64ISelLowering.cpp:20078
+      DAG.getNode(ISD::INSERT_VECTOR_ELT, DL, VecTy, DAG.getUNDEF(VecTy),
+                  Sequence[0], DAG.getConstant(1, DL, MVT::i64));
+  SDValue WideVecBitcast =
----------------
peterwaller-arm wrote:
> This considers Sequence[0] but not Sequence[1], so I believe it's getting the right codegen by coincidence.
> 
> Also worth highlighting in a comment that Sequence is in reversed order.
I think the updated tests now reflect this with 2 kill messages:
`; CHECK-NEXT:    // kill: def $s0 killed $s0 def $z0`
`; CHECK-NEXT:    // kill: def $s1 killed $s1 def $q1`


CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D133116/new/

https://reviews.llvm.org/D133116



More information about the llvm-commits mailing list