[PATCH] D143283: [AArch64][SVE]: custom lower AVGFloor/AVGCeil.

Dave Green via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Fri Feb 24 02:39:51 PST 2023


dmgreen added a comment.

Hello. Sorry for the delay in looking at this but I wasn't sure exactly what you were trying to do, and I've never been a huge fan of DAG combines that create the wrong node just to expand it later. It looks like for legal types this can lead to a nice decrease in instruction count though.

For smaller types I'm not sure that checking for individual opcodes for extension will work well. They could already be extending loads for example. I've not thought about it too much yet, but As far as I understand from the original hadd (/avg) work it would probably be better to be checking that the top bits are known 0 for unsigned, and that there are >1 sign bits for the signed cases.

In any case, it would be good to see alive proofs for the transforms you are making.



================
Comment at: llvm/lib/Target/AArch64/AArch64ISelLowering.cpp:6090
   case ISD::AVGFLOORS:
-    return LowerToPredicatedOp(Op, DAG, AArch64ISD::HADDS_PRED);
+    if (Subtarget->hasSVE2())
+      return LowerToPredicatedOp(Op, DAG, AArch64ISD::HADDS_PRED);
----------------
This can be moved into the start of LowerAVG to make this function a little more regular. (I know it's not super regular already, but other parts already do the same thing).


================
Comment at: llvm/lib/Target/AArch64/AArch64ISelLowering.cpp:13391
+    if (ISD::isConstantSplatVector(Node.getOperand(1).getNode(), SplatVal))
+      if (SplatVal.isMask() && SplatVal.countPopulation() <
+                                   Node->getValueType(0).getScalarSizeInBits())
----------------
Can it use countTrailingOnes as opposed to countPopulation? Although I think these functions might be better removed and use computeKnownBits checks instead.


================
Comment at: llvm/lib/Target/AArch64/AArch64ISelLowering.cpp:13405
+
+SDValue AArch64TargetLowering::LowerAVGFloor_AVGCeil(SDValue Node,
+                                                     SelectionDAG &DAG) const {
----------------
This can just be called LowerAVG


================
Comment at: llvm/lib/Target/AArch64/AArch64ISelLowering.cpp:13427
+
+  // check if it's better to emit the original code:
+  if (IsZeroExtended(OpA)) {
----------------
check should be Check


================
Comment at: llvm/lib/Target/AArch64/AArch64ISelLowering.cpp:13428
+  // check if it's better to emit the original code:
+  if (IsZeroExtended(OpA)) {
+    // in this case omiting the original code is better than custom lowering
----------------
Why is it OK to only check one of the operands?


================
Comment at: llvm/lib/Target/AArch64/AArch64ISelLowering.cpp:13437
+
+  else if (IsSignExtended(OpA)) {
+    // in this case omiting the original code is better than custom lowering
----------------
Doesn't need the else after the return, which might help simplify if this is checking the sign bits.


================
Comment at: llvm/test/CodeGen/AArch64/sve-avg-floor-ceil.ll:3
 ; RUN: llc < %s -mtriple=aarch64-none-linux-gnu  -mattr=+sve | FileCheck %s -check-prefixes=CHECK,SVE
 ; RUN: llc < %s -mtriple=aarch64-none-linux-gnu  -mattr=+sve2 | FileCheck %s -check-prefixes=CHECK,SVE2
 ;
----------------
There is already a test for hadds in sve in sve2-hadd. This test looks like a copy of it. It would probably be better to just use the existing test with an extra run line for SVE1 targets.


================
Comment at: llvm/test/CodeGen/AArch64/sve-avg_floor_ceil.ll:4
+
+define <vscale x 2 x i64> @hadds_v2i64(<vscale x 2 x i64> %s0, <vscale x 2 x i64> %s1) {
+; CHECK-LABEL: hadds_v2i64:
----------------
hassnaa-arm wrote:
> sdesmalen wrote:
> > Can you add these tests to the precursory patch, so that this patch only shows the diff?
> Adding them cause a crash.
> Crash message is: "LLVM ERROR: Scalarization of scalable vectors is not supported."
> It crashes while legalising the result type of nxv2i128
Like Eli said it would be good to fix vscale x 1 vectorization so this wasn't a problem any more.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D143283/new/

https://reviews.llvm.org/D143283



More information about the llvm-commits mailing list