[PATCH] D159283: Add intrinsic to count trailing zero elements in a vector

Paul Walker via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Fri Oct 27 03:28:36 PDT 2023


paulwalker-arm added a comment.

I few comments but otherwise this patch looks good.  I agree with not making `CTTZ_ELTS` common, it's really just a target specific convenience and given there's no during code generation use case I don't think it's worth the cost of having to implement all the plumbing required for a common node.  We can cross that bridge if/when the operation proves its worth.



================
Comment at: llvm/docs/LangRef.rst:18347-18348
+This is an overloaded intrinsic. You can use ```llvm.experimental.cttz.elts```
+on any vector of integer elements, both fixed width and scalable. Not all
+targets support all bit widths or vector types, however.
+
----------------
Is this last part true?  The cases not directly handled by the target are converted to common DAG nodes so there's nothing here that should be functionally broken for other targets?


================
Comment at: llvm/docs/LangRef.rst:18377
+The '``llvm.experimental.cttz.elts``' intrinsic counts the trailing (least
+significant) zero elements in a vector. If ``src == 0`` then the result is
+the number of elements in the input vector.
----------------
I think you can drop the 'then'?


================
Comment at: llvm/lib/CodeGen/SelectionDAG/SelectionDAGBuilder.cpp:7525
+    // of the result is VF-1.
+    if (cast<ConstantSDNode>(getValue(I.getOperand(1)))->getZExtValue() != 0)
+      CR = CR.subtract(APInt(64, 1));
----------------
Perhaps just `!cast<ConstantInt>(I.getOperand(1))->isZero()`?


================
Comment at: llvm/lib/CodeGen/SelectionDAG/SelectionDAGBuilder.cpp:7550
+    EVT RetTy = TLI.getValueType(DAG.getDataLayout(), I.getType());
+    SDValue Ret = DAG.getNode(ISD::ZERO_EXTEND, DL, RetTy, Sub);
+
----------------
For the case where `I.getType()->getScalarSizeInBits()` is a non-power-of-two we're going to round up to the next power-of-two bit-width and so I think it's possible for `RetTy` to be smaller than `NewEltTy`?  If true this should use `getZExtOrTrunc()`.


================
Comment at: llvm/lib/Target/AArch64/AArch64ISelLowering.cpp:1793
+bool AArch64TargetLowering::shouldExpandCttzElements(EVT VT) const {
+  if (!Subtarget->hasSVE() || VT != MVT::nxv16i1)
+    return true;
----------------
Can this be `hasSVEorSME()`?

This function could be written as
```
return !Subtarget->hasSVE() || VT != MVT::nxv16i1;
```


================
Comment at: llvm/lib/Target/AArch64/AArch64ISelLowering.cpp:5352-5353
+
+    if (Ty == MVT::i64)
+      return NewCttzElts;
+
----------------
Up to you but you likely don't need this given `getZExtOrTrunc` will do the right thing (i.e. nothing) for a nop cast.


CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D159283/new/

https://reviews.llvm.org/D159283



More information about the llvm-commits mailing list