[llvm] [AArch64] Extend efficient lowering of experimental.cttz.elts (PR #92114)

Fri May 17 03:49:34 PDT 2024

================
@@ -5838,9 +5840,21 @@ SDValue AArch64TargetLowering::LowerINTRINSIC_WO_CHAIN(SDValue Op,
     return SDValue();
   }
   case Intrinsic::experimental_cttz_elts: {
-    SDValue NewCttzElts =
-        DAG.getNode(AArch64ISD::CTTZ_ELTS, dl, MVT::i64, Op.getOperand(1));
+    SDValue CttzOp = Op.getOperand(1);
+    EVT VT = CttzOp.getValueType();
+
+    if (!VT.isScalableVector()) {
+      // Retrieve original fixed-width vector from ISD::TRUNCATE Node.
+      assert(CttzOp.getOpcode() == ISD::TRUNCATE && "Expected ISD::TRUNCATE!");
----------------
paulwalker-arm wrote:

This is not a safe assumption to make. Also, "masks" have a special format, dictated by `getBooleanContents`, which `convertFixedMaskToScalableVector` is allowed to depend on.

Putting this together, I think you should emit a suitable compare against zero to construct a mask that can be passed to `convertFixedMaskToScalableVector`. I'd expect DAG combine to do a good job of removing this compare for the truncate scenario you're expecting.
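
To illustrate the point outside of SelectionDAG, here is a minimal standalone sketch (not LLVM code). It assumes a mask format where every lane is either zero or all ones, and it assumes a lane counts as active when it is non-zero. The names `Lanes`, `normalizeMask` and `countTrailingZeroElts` are hypothetical and exist only for this example. The consumer relies on the canonical format, so feeding it an un-normalized vector gives the wrong answer, while comparing against zero first makes it safe.

```cpp
#include <array>
#include <cstddef>
#include <cstdint>

// Hypothetical model: one byte per lane of a fixed-width vector.
template <std::size_t N>
using Lanes = std::array<std::uint8_t, N>;

// Compare every lane against zero. The result is in the assumed canonical
// mask format: 0xFF for an active lane, 0x00 for an inactive lane.
template <std::size_t N>
Lanes<N> normalizeMask(const Lanes<N> &In) {
  Lanes<N> Out{};
  for (std::size_t I = 0; I < N; ++I)
    Out[I] = In[I] != 0 ? 0xFF : 0x00;
  return Out;
}

// A consumer that depends on the canonical format: it only recognises
// all-ones lanes as active. Returns the index of the first active lane,
// or N when no lane is active.
template <std::size_t N>
std::size_t countTrailingZeroElts(const Lanes<N> &Mask) {
  for (std::size_t I = 0; I < N; ++I)
    if (Mask[I] == 0xFF)
      return I;
  return N;
}
```

With lanes `{0, 0, 1, 0}`, calling `countTrailingZeroElts` directly misses the active lane, whereas calling it on `normalizeMask(...)` finds it at index 2. When the input is already in canonical form the compare is redundant, which is the case the reviewer expects DAG combine to clean up.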

https://github.com/llvm/llvm-project/pull/92114