[llvm] [AArch64] Allow lowering of more types to GET_ACTIVE_LANE_MASK (PR #140062)
David Sherwood via llvm-commits
llvm-commits at lists.llvm.org
Thu May 15 08:06:02 PDT 2025
================
@@ -3248,6 +3251,22 @@ void DAGTypeLegalizer::SplitVecRes_PARTIAL_REDUCE_MLA(SDNode *N, SDValue &Lo,
Hi = DAG.getNode(Opcode, DL, ResultVT, AccHi, Input1Hi, Input2Hi);
}
+void DAGTypeLegalizer::SplitVecRes_GET_ACTIVE_LANE_MASK(SDNode *N, SDValue &Lo,
+ SDValue &Hi) {
+ SDLoc DL(N);
+ SDValue Op0 = N->getOperand(0);
+ SDValue Op1 = N->getOperand(1);
+ EVT OpVT = Op0.getValueType();
+
+ EVT LoVT, HiVT;
+ std::tie(LoVT, HiVT) = DAG.GetSplitDestVTs(N->getValueType(0));
+
+ Lo = DAG.getNode(ISD::GET_ACTIVE_LANE_MASK, DL, LoVT, Op0, Op1);
+ SDValue LoElts = DAG.getElementCount(DL, OpVT, LoVT.getVectorElementCount());
+ SDValue HiStartVal = DAG.getNode(ISD::UADDSAT, DL, OpVT, Op0, LoElts);
----------------
david-arm wrote:
According to the documentation for the node:
```
// GET_ACTIVE_LANE_MASK - this corresponds to the llvm.get.active.lane.mask
// intrinsic. It creates a mask representing active and inactive vector
// lanes, active while Base + index < Trip Count. As with the intrinsic,
// the operands Base and Trip Count have the same scalar integer type and
// the internal addition of Base + index cannot overflow. However, the ISD
// node supports result types which are wider than i1, where the high
// bits conform to getBooleanContents similar to the SETCC operator.
```
I assume here that `the internal addition of Base + index cannot overflow` is a requirement on the lowering, i.e. we have to generate code that ensures the addition does not overflow because the operation demands it, rather than a guarantee that the inputs are such that it can never overflow? If it's the former then the UADDSAT makes sense, but if it's the latter then presumably we don't need the UADDSAT?
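
To make the question above concrete, here is a small standalone scalar model (not LLVM code) of what splitting the mask does under the first reading, where the lowering itself has to avoid the overflow. The helper names `activeLaneMask`, `uaddsat` and `splitMask` and the 32-bit operand type are assumptions made for illustration only; the sketch does not settle which reading of the documentation is correct.

```cpp
#include <cstdint>
#include <limits>
#include <vector>

// Hypothetical scalar model: lane I is active iff Base + I < TripCount, with
// Base + I evaluated in 64 bits so the internal addition never wraps.
std::vector<bool> activeLaneMask(uint32_t Base, uint32_t TripCount,
                                 unsigned NumLanes) {
  std::vector<bool> Mask(NumLanes);
  for (unsigned I = 0; I < NumLanes; ++I)
    Mask[I] = static_cast<uint64_t>(Base) + I < TripCount;
  return Mask;
}

// Unsigned saturating add, modelling what ISD::UADDSAT computes on i32.
uint32_t uaddsat(uint32_t A, uint32_t B) {
  uint32_t Sum = A + B; // wraps modulo 2^32 on overflow
  return Sum < A ? std::numeric_limits<uint32_t>::max() : Sum;
}

// Model of the split: the low half starts at Base, the high half starts at
// Base + NumLanes / 2, computed either with saturation or with a plain
// wrapping add. The two half masks are concatenated into one result.
std::vector<bool> splitMask(uint32_t Base, uint32_t TripCount,
                            unsigned NumLanes, bool Saturate) {
  unsigned Half = NumLanes / 2;
  uint32_t HiBase = Saturate ? uaddsat(Base, Half) : Base + Half;
  std::vector<bool> Lo = activeLaneMask(Base, TripCount, Half);
  std::vector<bool> Hi = activeLaneMask(HiBase, TripCount, Half);
  Lo.insert(Lo.end(), Hi.begin(), Hi.end());
  return Lo;
}
```

With `Base = UINT32_MAX - 2`, `TripCount = UINT32_MAX` and 8 lanes, the saturating variant reproduces the unsplit mask, whereas the wrapping variant computes a high-half start value of 1 and wrongly marks the upper four lanes active. Under the second reading, such inputs would never reach the node, and a plain add would be enough.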
https://github.com/llvm/llvm-project/pull/140062