[PATCH][AVX512] Enable intrinsics for vpcmpeq
Demikhovsky, Elena
elena.demikhovsky at intel.com
Tue Sep 30 00:46:28 PDT 2014
I agree with Adam. Can we convert VSELECT to AND in td?
- Elena
From: Adam Nemet [mailto:anemet at apple.com]
Sent: Monday, September 29, 2014 20:27
To: Robert Khasanov
Cc: Demikhovsky, Elena; llvm-commits at cs.uiuc.edu
Subject: Re: [PATCH][AVX512] Enable intrinsics for vpcmpeq
On Sep 29, 2014, at 7:22 AM, Robert Khasanov <rob.khasanov at gmail.com<mailto:rob.khasanov at gmail.com>> wrote:
Hi Adam,
2014-09-28 11:14 GMT+04:00 Adam Nemet <anemet at apple.com<mailto:anemet at apple.com>>:
On Sep 26, 2014, at 8:04 AM, Robert Khasanov <rob.khasanov at gmail.com<mailto:rob.khasanov at gmail.com>> wrote:
Hi Adam,
Thanks for review.
I agree with you comment about reusing getVectorMaskingNode. See new patches attached.
Is 003 the correct version now?
yes.
However, I don't understand your comment about ISD::INSERT_SUBVECTOR. Could you please clarify your suggestion?
You're making this op legal for v8i1, so we will now generate things like:
(v8i1 (insert_subvector (v8i1 foo), (v4i1 bar), (iPTR 0)))
Do we have instructions to match these in the TD file?
We generate things like (v8i1 (insert_subvector undef, (v4i1 src), (iPTR 0))).
For this we have following rules in TD (see 003 patch):
+ def : Pat<(v8i1 (insert_subvector undef, (v4i1 VK4:$src), (iPTR 0))),
+ (v8i1 (COPY_TO_REGCLASS VK4:$src, VK8))>;
+ def : Pat<(v8i1 (insert_subvector undef, (v2i1 VK2:$src), (iPTR 0))),
+ (v8i1 (COPY_TO_REGCLASS VK2:$src, VK8))>;
OK, I missed this one.
You still owe me :) the sample DAG in the comment to:
+ SDValue Mask = Op.getOperand(3);
+ EVT BitcastVT = EVT::getVectorVT(*DAG.getContext(), MVT::i1,
+ Mask.getValueType().getSizeInBits());
SDValue Cmp = DAG.getNode(IntrData->Opc0, dl, MaskVT,
Op.getOperand(1), Op.getOperand(2));
- SDValue Res = getVectorMaskingNode(Cmp, Op.getOperand(3),
- DAG.getTargetConstant(0, MaskVT), DAG);
+ SDValue CmpMask = getVectorMaskingNode(Cmp, Op.getOperand(3),
+ DAG.getTargetConstant(0, MaskVT), DAG);
+ SDValue Res = DAG.getNode(ISD::INSERT_SUBVECTOR, dl, BitcastVT,
+ DAG.getUNDEF(BitcastVT), CmpMask,
+ DAG.getIntPtrConstant(0));
Other than that LGTM, thanks for splitting it up like this.
One more question more of as a follow-on discussion. Do we actually have to generate AND for masking compares? I understand that that is what they get canonicalized into but perhaps we can still generate vselect uniformly.
Adam
Thanks,
Robert
Adam
Thanks,
Robert
2014-09-26 4:19 GMT+04:00 Adam Nemet <anemet at apple.com<mailto:anemet at apple.com>>:
On Sep 25, 2014, at 12:43 AM, Robert Khasanov <rob.khasanov at gmail.com<mailto:rob.khasanov at gmail.com>> wrote:
The last patch is incorrect. See this version instead of previous one.
Sorry for that.
2014-09-25 11:40 GMT+04:00 Robert Khasanov <rob.khasanov at gmail.com<mailto:rob.khasanov at gmail.com>>:
Hi Elena, Adam,
In these patches I enable intrinsics for vpcmpeq{bwdq} instructions.
Since result of the instructions is mask, I enable new intrinsics type CMP_MASK and assume to enable CMP_MASK_CC in future for vpcmp{bwdq} instructions (due to additional CC argument).
Also I extended argument types when intrinsics generated through TableGen (IIT_V64) to support 64 packed data.
In last patch I enabled INSERT_SUBVECTOR for v8i1 to legalizer to support converting v2i1 and v4i1 to v8i1 and then bitcasting to i8.
Please let me know if it looks good.
Hi Robert,
+ case CMP_MASK: {
+ EVT VT = Op.getOperand(1).getValueType();
+ EVT MaskVT = EVT::getVectorVT(*DAG.getContext(), MVT::i1,
+ VT.getVectorNumElements());
+ SDValue Cmp = DAG.getNode(IntrData->Opc0, dl, MaskVT,
+ Op.getOperand(1), Op.getOperand(2));
+ SDValue Res;
+ if (isAllOnes(Op.getOperand(3))) {
+ Res = Cmp;
+ } else {
+ Res = DAG.getNode(ISD::AND, dl, MaskVT, Cmp,
+ DAG.getNode(ISD::BITCAST, dl, MaskVT,
+ Op.getOperand(3)));
+ }
+ return DAG.getNode(ISD::BITCAST, dl, Op.getValueType(), Res);
+ }
It would be good to reuse (and extend if needed) the helper getVectorMaskingNode for this. That should capture what it takes from a expression to become a masked expression in one place (just like AVX512_masking in td). Obviously the BITCAST at the end does not belong there there but the information that masked compare is canonicalized to an AND rather than a VSELECT is useful.
As you see I also don't check for the mask value to be AllOnes in getVectorMaskingNode. We can add that for now if you need but I think that should be an orthogonal DAGCombiner transformation to cover more cases.
+ setOperationAction(ISD::INSERT_SUBVECTOR, MVT::v8i1, Legal);
Can you just enable it like that? I understand that you need it to lower the intrinsics but you may need to add some more patterns to TD to recognize this in the general case.
- Res = DAG.getNode(ISD::AND, dl, MaskVT, Cmp,
- DAG.getNode(ISD::BITCAST, dl, MaskVT,
- Op.getOperand(3)));
+ RMask = DAG.getNode(ISD::AND, dl, MaskVT, Cmp,
+ DAG.getNode(ISD::EXTRACT_SUBVECTOR, dl, MaskVT,
+ DAG.getNode(ISD::BITCAST, dl, BitcastVT, Mask),
+ DAG.getIntPtrConstant(0)));
}
- return DAG.getNode(ISD::BITCAST, dl, Op.getValueType(), Res);
+ return DAG.getNode(ISD::BITCAST, dl, Op.getValueType(),
+ DAG.getNode(ISD::INSERT_SUBVECTOR, dl, BitcastVT,
+ DAG.getUNDEF(BitcastVT), RMask,
+ DAG.getIntPtrConstant(0)));
This could use a comment with an example DAG that we're trying to create.
Thanks,
Adam
Robert
---------------------------------------------------------------------
Intel Israel (74) Limited
This e-mail and any attachments may contain confidential material for
the sole use of the intended recipient(s). Any review or distribution
by others is strictly prohibited. If you are not the intended
recipient, please contact the sender and delete all copies.
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.llvm.org/pipermail/llvm-commits/attachments/20140930/62142e35/attachment.html>
More information about the llvm-commits
mailing list