[PATCH] D159283: Add intrinsic to count trailing zero elements in a vector
Paul Walker via Phabricator via llvm-commits
llvm-commits at lists.llvm.org
Fri Oct 27 03:28:36 PDT 2023
paulwalker-arm added a comment.
I few comments but otherwise this patch looks good. I agree with not making `CTTZ_ELTS` common, it's really just a target specific convenience and given there's no during code generation use case I don't think it's worth the cost of having to implement all the plumbing required for a common node. We can cross that bridge if/when the operation proves its worth.
================
Comment at: llvm/docs/LangRef.rst:18347-18348
+This is an overloaded intrinsic. You can use ```llvm.experimental.cttz.elts```
+on any vector of integer elements, both fixed width and scalable. Not all
+targets support all bit widths or vector types, however.
+
----------------
Is this last part true? The cases not directly handled by the target are converted to common DAG nodes so there's nothing here that should be functionally broken for other targets?
================
Comment at: llvm/docs/LangRef.rst:18377
+The '``llvm.experimental.cttz.elts``' intrinsic counts the trailing (least
+significant) zero elements in a vector. If ``src == 0`` then the result is
+the number of elements in the input vector.
----------------
I think you can drop the 'then'?
================
Comment at: llvm/lib/CodeGen/SelectionDAG/SelectionDAGBuilder.cpp:7525
+ // of the result is VF-1.
+ if (cast<ConstantSDNode>(getValue(I.getOperand(1)))->getZExtValue() != 0)
+ CR = CR.subtract(APInt(64, 1));
----------------
Perhaps just `!cast<ConstantInt>(I.getOperand(1))->isZero()`?
================
Comment at: llvm/lib/CodeGen/SelectionDAG/SelectionDAGBuilder.cpp:7550
+ EVT RetTy = TLI.getValueType(DAG.getDataLayout(), I.getType());
+ SDValue Ret = DAG.getNode(ISD::ZERO_EXTEND, DL, RetTy, Sub);
+
----------------
For the case where `I.getType()->getScalarSizeInBits()` is a non-power-of-two we're going to round up to the next power-of-two bit-width and so I think it's possible for `RetTy` to be smaller than `NewEltTy`? If true this should use `getZExtOrTrunc()`.
================
Comment at: llvm/lib/Target/AArch64/AArch64ISelLowering.cpp:1793
+bool AArch64TargetLowering::shouldExpandCttzElements(EVT VT) const {
+ if (!Subtarget->hasSVE() || VT != MVT::nxv16i1)
+ return true;
----------------
Can this be `hasSVEorSME()`?
This function could be written as
```
return !Subtarget->hasSVE() || VT != MVT::nxv16i1;
```
================
Comment at: llvm/lib/Target/AArch64/AArch64ISelLowering.cpp:5352-5353
+
+ if (Ty == MVT::i64)
+ return NewCttzElts;
+
----------------
Up to you but you likely don't need this given `getZExtOrTrunc` will do the right thing (i.e. nothing) for a nop cast.
CHANGES SINCE LAST ACTION
https://reviews.llvm.org/D159283/new/
https://reviews.llvm.org/D159283
More information about the llvm-commits
mailing list