[PATCH] D143283: [AArch64][SVE]: custom lower AVGFloor/AVGCeil.

Mon Feb 20 03:41:09 PST 2023

sdesmalen added inline comments.

================
Comment at: llvm/lib/CodeGen/SelectionDAG/TargetLowering.cpp:1034
+
+  if ((ExtOpA.getOpcode() == ISD::ZERO_EXTEND ||
+       ExtOpA.getOpcode() == ISD::SIGN_EXTEND) &&
----------------
This is probably better added as a separate DAGCombine that folds away the sign-extend-in-regs explicitly, because that is already performed by the avgfloor operation.

  avgfloors(sextinreg(x), sextinreg(y)) -> avgfloors(x, y)

================
Comment at: llvm/lib/Target/AArch64/AArch64ISelLowering.cpp:13402
+      Node->getOpcode() == ISD::AVGFLOORS) {
+    if (OpBC && OpBC->isOne())
+      tmp = DAG.getNode(ISD::AND, dl, VT, OpA, ConstantOne);
----------------
You seem to be trying to optimise the case where one of the operands is a vector of ones, but I don't see any motivating tests for this?
I would also expect these cases to be folded away already by existing DAGCombines.

================
Comment at: llvm/test/CodeGen/AArch64/sve-avg_floor_ceil.ll:62
 ; CHECK:       // %bb.0: // %entry
-; CHECK-NEXT:    and z0.d, z0.d, #0xffffffff
-; CHECK-NEXT:    adr z0.d, [z0.d, z1.d, uxtw]
-; CHECK-NEXT:    lsr z0.d, z0.d, #1
+; CHECK-NEXT:    lsr z2.d, z1.d, #1
+; CHECK-NEXT:    lsr z3.d, z0.d, #1
----------------
This doesn't look like an improvement, we don't really want to do this transform if it makes the resulting code worse. Do you know why this results in worse code?

Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D143283/new/

https://reviews.llvm.org/D143283