[llvm] [AArch64] Extend efficient lowering of experimental.cttz.elts (PR #92114)
Paul Walker via llvm-commits
llvm-commits at lists.llvm.org
Fri May 17 04:33:54 PDT 2024
================
@@ -5838,9 +5840,21 @@ SDValue AArch64TargetLowering::LowerINTRINSIC_WO_CHAIN(SDValue Op,
return SDValue();
}
case Intrinsic::experimental_cttz_elts: {
- SDValue NewCttzElts =
- DAG.getNode(AArch64ISD::CTTZ_ELTS, dl, MVT::i64, Op.getOperand(1));
+ SDValue CttzOp = Op.getOperand(1);
+ EVT VT = CttzOp.getValueType();
+
+ if (!VT.isScalableVector()) {
+ // Retrieve original fixed-width vector from ISD::TRUNCATE Node.
+ assert(CttzOp.getOpcode() == ISD::TRUNCATE && "Expected ISD::TRUNCATE!");
----------------
paulwalker-arm wrote:
Sorry, I hit the wrong button and thus posted my review comments early whilst still investigating the code.
I now understand where the truncate comes from but, as before, it is not something that can be relied upon. For AArch64, `getBooleanContents` requires fixed-length masks to be all zeros or all ones. Given the change to `shouldExpandCttzElements`, we can be sure the operand type will be an i1 vector (which you can and should assert for), so I think you can just sign-extend to the relevant type (see `getTypeToTransformTo`) and pass the result to `convertFixedMaskToScalableVector`. As before, I'd expect this sign-extend to be optimised away.
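To illustrate why a sign-extend fits here, the following is a minimal standalone model, not LLVM code. It assumes a fixed-length i1 mask can be pictured as an array of `bool`, and the names `BoolMask`, `signExtendMask` and `cttzElts` are hypothetical helpers invented for this sketch. Sign-extending each i1 lane yields lanes that are either all zeros or all ones, which is the form described above, and the count of trailing zero elements is unchanged by the extension.

```cpp
#include <array>
#include <cstddef>
#include <cstdint>

// Hypothetical stand-in for a fixed-length i1 mask such as <8 x i1>.
template <std::size_t N>
using BoolMask = std::array<bool, N>;

// Model of sign-extending each i1 lane to an i8 lane:
// true becomes 0xFF (all ones), false becomes 0x00 (all zeros).
template <std::size_t N>
std::array<std::uint8_t, N> signExtendMask(const BoolMask<N> &Mask) {
  std::array<std::uint8_t, N> Out{};
  for (std::size_t I = 0; I < N; ++I)
    Out[I] = Mask[I] ? 0xFF : 0x00;
  return Out;
}

// Model of counting trailing zero elements: the index of the first
// non-zero lane, or N when every lane is zero.
template <std::size_t N>
std::size_t cttzElts(const std::array<std::uint8_t, N> &Lanes) {
  for (std::size_t I = 0; I < N; ++I)
    if (Lanes[I] != 0)
      return I;
  return N;
}
```

In this model, a mask whose first active lane is lane 2 gives a count of 2 both before and after extension, which is why the extension is only a change of representation.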
https://github.com/llvm/llvm-project/pull/92114