[PATCH] D133116: [AArch64][SVE] Optimise repeated floating-point complex patterns to splat

Peter Waller via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Thu Nov 10 04:08:54 PST 2022


peterwaller-arm requested changes to this revision.
peterwaller-arm added inline comments.
This revision now requires changes to proceed.


================
Comment at: llvm/lib/Target/AArch64/AArch64ISelLowering.cpp:20053
+      Sequence.push_back(InsertVecElt.getOperand(1));
+      InsertVecElt = InsertVecElt.getOperand(0);
+    } else if (InsertVecElt.getOpcode() == ISD::SCALAR_TO_VECTOR) {
----------------
Nothing here considers the insertion index of the insert_vector_elt, and although there could be a canonicalization which puts them in order, I'm not certain can rely on the order of the operations to give you the full sequence (e.g. what happens if some insertions are missing?).


================
Comment at: llvm/lib/Target/AArch64/AArch64ISelLowering.cpp:20064
+  for (unsigned i = 0; i < Sequence.size() / 2; i++) {
+    if (Sequence[i] != Sequence[i + 2])
+      return SDValue();
----------------
This condition doesn't look right: it doesn't consider all of the elements of the sequence, only N/2+2 of them.


================
Comment at: llvm/lib/Target/AArch64/AArch64ISelLowering.cpp:20078
+      DAG.getNode(ISD::INSERT_VECTOR_ELT, DL, VecTy, DAG.getUNDEF(VecTy),
+                  Sequence[0], DAG.getConstant(1, DL, MVT::i64));
+  SDValue WideVecBitcast =
----------------
This considers Sequence[0] but not Sequence[1], so I believe it's getting the right codegen by coincidence.

Also worth highlighting in a comment that Sequence is in reversed order.


================
Comment at: llvm/test/CodeGen/AArch64/sve-intrinsics-perm-select.ll:590
 ; CHECK:       // %bb.0:
-; CHECK-NEXT:    // kill: def $s0 killed $s0 def $q0
-; CHECK-NEXT:    mov v2.16b, v0.16b
----------------
This kill indicates that q0 contains live data, and it's lost after the change. This could allow the compiler to reorder instructions ignorant of the fact that the data going into the mov via v0 is live. Looking at the code I can see you're not passing %x along, so that needs to be fixed.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D133116/new/

https://reviews.llvm.org/D133116



More information about the llvm-commits mailing list