[PATCH] D123782: [AArch64] Generate AND in place of CSEL for Table Based CTTZ lowering in -O3

Fri Apr 22 03:51:08 PDT 2022

dmgreen added a comment.

>> As far as I understand this should be testing for select (icmp eq X, 0), 0, cttz X and converting it to and(cttz X, #bw-1). If so there are more conditions that need to be checked.
>
> What other conditions do you think should be added other than checking for 0 and cttz?

It looks like this checks for the select/CSEL, and the icmp/SUBS with a 0, but not the "eq" yet, and not the "0" in the select/csel. It also doesn't check that the X in the icmp and the X in the cttz are the same?

It would be worth adding test cases for the cases it should not fold.

================
Comment at: llvm/lib/Target/AArch64/AArch64ISelLowering.cpp:17460
+  SDValue N2 = N->getOperand(2);
+  bool isZero = cast<ConstantSDNode>(N2.getNode())->isZero();
+  if (N1.getOpcode() == ISD::CTTZ && isZero) {
----------------
craig.topper wrote:
> rahular-rrlogic wrote:
> > dmgreen wrote:
> > > As far as I can tell this is checking that the condition code is 0? It would be better to check that it is equal to AArch64CC::EQ.
> > No, this is a mistake. I was intending to check if the operands are 0 and cttz. I will change that. Is a check for the condition being EQ really required, though?
> There's a function called `isNullConstant` that can be used here.
It should probably be something like this, if the value is a condition code.
```
  AArch64CC CC = N->getConstantOperandVal(2);
  if (CC == AArch64CC::EQ)
```

It sounds good to use it for the zero of the sub though.

================
Comment at: llvm/lib/Target/AArch64/AArch64ISelLowering.cpp:17464-17465
+    if (N1.getOpcode() == ISD::CTTZ) {
+      SDValue NumBitsMinusOne =
+          DAG.getConstant(31, SDLoc(N), N1.getValueType());
+      return DAG.getNode(ISD::AND, SDLoc(N), N1.getValueType(), N1,
----------------
Why is it hard-coding 31, as opposed to checking the size of the type? Why can't we get here with an i64?

================
Comment at: llvm/test/CodeGen/AArch64/table-based-cttz.ll:27
+  %1 = icmp eq i64 %x, 0
+  %2 = trunc i64 %0 to i32
+  %3 = select i1 %1, i32 0, i32 %2
----------------
Can you add an i64 case without the truncate down to an i32?

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D123782/new/

https://reviews.llvm.org/D123782