[PATCH] D137547: [AArch64][SVE] Use PTRUE instruction for get_active_lane_mask intrinsic if the range is appropriate for predicator constant
Cullen Rhodes via Phabricator via llvm-commits
llvm-commits at lists.llvm.org
Thu Nov 17 04:47:02 PST 2022
c-rhodes added inline comments.
================
Comment at: llvm/lib/Target/AArch64/AArch64ISelLowering.cpp:4754
+ isa<ConstantSDNode>(Op.getOperand(2))) {
+ int64_t MinSVEVectorSize = Subtarget->getMinSVEVectorSizeInBits()
+ ? Subtarget->getMinSVEVectorSizeInBits()
----------------
unsigned
================
Comment at: llvm/lib/Target/AArch64/AArch64ISelLowering.cpp:4754-4756
+ int64_t MinSVEVectorSize = Subtarget->getMinSVEVectorSizeInBits()
+ ? Subtarget->getMinSVEVectorSizeInBits()
+ : 128;
----------------
c-rhodes wrote:
> unsigned
`std::max(Subtarget->getMinSVEVectorSizeInBits(), 128u)`
================
Comment at: llvm/lib/Target/AArch64/AArch64ISelLowering.cpp:4758
+ unsigned ElementSize = 128 / Op.getValueType().getVectorMinNumElements();
+ unsigned Dist = cast<ConstantSDNode>(Op.getOperand(2))->getZExtValue() -
+ cast<ConstantSDNode>(Op.getOperand(1))->getZExtValue();
----------------
c-rhodes wrote:
> `Op.getConstantOperandVal(2) - Op.getConstantOperandVal(1);`
`NumActiveElems`?
================
Comment at: llvm/lib/Target/AArch64/AArch64ISelLowering.cpp:4758-4759
+ unsigned ElementSize = 128 / Op.getValueType().getVectorMinNumElements();
+ unsigned Dist = cast<ConstantSDNode>(Op.getOperand(2))->getZExtValue() -
+ cast<ConstantSDNode>(Op.getOperand(1))->getZExtValue();
+ if (getSVEPredPatternFromNumElements(Dist) != None &&
----------------
`Op.getConstantOperandVal(2) - Op.getConstantOperandVal(1);`
================
Comment at: llvm/lib/Target/AArch64/AArch64ISelLowering.cpp:4761
+ if (getSVEPredPatternFromNumElements(Dist) != None &&
+ Dist < (MinSVEVectorSize / ElementSize))
+ return getPTrue(DAG, dl, Op.getValueType(), Dist);
----------------
this should be `<=`
================
Comment at: llvm/lib/Target/AArch64/AArch64ISelLowering.cpp:4762
+ Dist < (MinSVEVectorSize / ElementSize))
+ return getPTrue(DAG, dl, Op.getValueType(), Dist);
+ }
----------------
this should be the encoded pattern returned by `getSVEPredPatternFromNumElements`, see elsewhere in ISelLowering for examples.
================
Comment at: llvm/test/CodeGen/AArch64/active_lane_mask.ll:478
+define <vscale x 4 x i1> @lane_mask_nxv4i1_imm3() {
+; CHECK-LABEL: lane_mask_nxv4i1_imm3:
----------------
these tests aren't sufficient, it would be good to have tests for
* the upper bound of active elements vl256, this can be tested with a `<vscale x 16 x i1>` predicate and minimum SVE vector width of 2048 specified via `vscale_range(16, 16)` function attribute. Also add a test for this vector width that exceeds 256 active elements.
* A non-zero `%base` argument for active.lane.mask.
================
Comment at: llvm/test/CodeGen/AArch64/active_lane_mask.ll:495
+entry:
+ %active.lane.mask = call <vscale x 4 x i1> @llvm.get.active.lane.mask.nxv4i1.i64(i64 0, i64 4)
+ ret <vscale x 4 x i1> %active.lane.mask
----------------
active.lane.mask is an unsigned less-than comparison so this generates a predicate with 4 active lanes, which for a minimum vector width of 128 and esize of 32 can be represented with `ptrue p0.s, vl4`?
CHANGES SINCE LAST ACTION
https://reviews.llvm.org/D137547/new/
https://reviews.llvm.org/D137547
More information about the llvm-commits
mailing list