[PATCH] D123782: [AArch64] Generate AND in place of CSEL for Table Based CTTZ lowering in -O3
Dave Green via Phabricator via llvm-commits
llvm-commits at lists.llvm.org
Fri Apr 22 03:51:08 PDT 2022
dmgreen added a comment.
>> As far as I understand this should be testing for select (icmp eq X, 0), 0, cttz X and converting it to and(cttz X, #bw-1). If so there are more conditions that need to be checked.
>
> What other conditions do you think should be added other than checking for 0 and cttz?
It looks like this checks for the select/CSEL, and the icmp/SUBS with a 0, but not the "eq" yet, and not the "0" in the select/csel. It also doesn't check that the X in the icmp and the X in the cttz are the same?
It would be worth adding test cases for the cases it should not fold.
================
Comment at: llvm/lib/Target/AArch64/AArch64ISelLowering.cpp:17460
+ SDValue N2 = N->getOperand(2);
+ bool isZero = cast<ConstantSDNode>(N2.getNode())->isZero();
+ if (N1.getOpcode() == ISD::CTTZ && isZero) {
----------------
craig.topper wrote:
> rahular-rrlogic wrote:
> > dmgreen wrote:
> > > As far as I can tell this is checking that the condition code is 0? It would be better to check that it is equal to AArch64CC::EQ.
> > No, this is a mistake. I was intending to check if the operands are 0 and cttz. I will change that. Is a check for the condition being EQ really required, though?
> There's a function called `isNullConstant` that can be used here.
It should probably be something like this, if the value is a condition code.
```
AArch64CC CC = N->getConstantOperandVal(2);
if (CC == AArch64CC::EQ)
```
It sounds good to use it for the zero of the sub though.
================
Comment at: llvm/lib/Target/AArch64/AArch64ISelLowering.cpp:17464-17465
+ if (N1.getOpcode() == ISD::CTTZ) {
+ SDValue NumBitsMinusOne =
+ DAG.getConstant(31, SDLoc(N), N1.getValueType());
+ return DAG.getNode(ISD::AND, SDLoc(N), N1.getValueType(), N1,
----------------
Why is it hard-coding 31, as opposed to checking the size of the type? Why can't we get here with an i64?
================
Comment at: llvm/test/CodeGen/AArch64/table-based-cttz.ll:27
+ %1 = icmp eq i64 %x, 0
+ %2 = trunc i64 %0 to i32
+ %3 = select i1 %1, i32 0, i32 %2
----------------
Can you add an i64 case without the truncate down to an i32?
CHANGES SINCE LAST ACTION
https://reviews.llvm.org/D123782/new/
https://reviews.llvm.org/D123782
More information about the llvm-commits
mailing list