[PATCH][AVX512] Enable intrinsics for vpcmpeq

Robert Khasanov rob.khasanov at gmail.com
Tue Oct 28 09:30:55 PDT 2014


Removed special case for cmp instructions in getVectorMaskingNode. Now cmp
intrinsics lower as other intrinsics through VSELECT, and then VSELECT
tranforms to AND in PerformSELECTCombine.
See rev 220779

2014-09-30 15:54 GMT+04:00 Robert Khasanov <rob.khasanov at gmail.com>:

> Committed rev 218667, 218668, 218669.
>
> 2014-09-29 21:26 GMT+04:00 Adam Nemet <anemet at apple.com>:
>
>>
>> On Sep 29, 2014, at 7:22 AM, Robert Khasanov <rob.khasanov at gmail.com>
>> wrote:
>>
>> Hi Adam,
>>
>> 2014-09-28 11:14 GMT+04:00 Adam Nemet <anemet at apple.com>:
>>
>>>
>>> On Sep 26, 2014, at 8:04 AM, Robert Khasanov <rob.khasanov at gmail.com>
>>> wrote:
>>>
>>> Hi Adam,
>>>
>>> Thanks for review.
>>> I agree with you comment about reusing getVectorMaskingNode. See new
>>> patches attached.
>>>
>>>
>>> Is 003 the correct version now?
>>>
>>>
>> yes.
>>
>>> However, I don't understand your comment about ISD::INSERT_SUBVECTOR.
>>> Could you please clarify your suggestion?
>>>
>>>
>>> You’re making this op legal for v8i1, so we will now generate things
>>> like:
>>>
>>>     (v8i1 (insert_subvector (v8i1 foo), (v4i1 bar), (iPTR 0)))
>>>
>>> Do we have instructions to match these in the TD file?
>>>
>>>
>> We generate things like (v8i1 (insert_subvector undef, (v4i1 src), (iPTR
>> 0))).
>> For this we have following rules in TD (see 003 patch):
>>
>> +  def : Pat<(v8i1 (insert_subvector undef, (v4i1 VK4:$src), (iPTR 0))),
>> +            (v8i1 (COPY_TO_REGCLASS VK4:$src, VK8))>;
>> +  def : Pat<(v8i1 (insert_subvector undef, (v2i1 VK2:$src), (iPTR 0))),
>> +            (v8i1 (COPY_TO_REGCLASS VK2:$src, VK8))>;
>>
>>
>> OK, I missed this one.
>>
>> You still owe me :) the sample DAG in the comment to:
>>
>> +      SDValue Mask = Op.getOperand(3);
>> +      EVT BitcastVT = EVT::getVectorVT(*DAG.getContext(), MVT::i1,
>> +
>> Mask.getValueType().getSizeInBits());
>>        SDValue Cmp = DAG.getNode(IntrData->Opc0, dl, MaskVT,
>>                                  Op.getOperand(1), Op.getOperand(2));
>> -      SDValue Res = getVectorMaskingNode(Cmp, Op.getOperand(3),
>> -                                         DAG.getTargetConstant(0,
>> MaskVT), DAG);
>> +      SDValue CmpMask = getVectorMaskingNode(Cmp, Op.getOperand(3),
>> +                                        DAG.getTargetConstant(0,
>> MaskVT), DAG);
>> +      SDValue Res = DAG.getNode(ISD::INSERT_SUBVECTOR, dl, BitcastVT,
>> +                                DAG.getUNDEF(BitcastVT), CmpMask,
>> +                                DAG.getIntPtrConstant(0));
>>
>>
>> Other than that LGTM, thanks for splitting it up like this.
>>
>> One more question more of as a follow-on discussion.  Do we actually have
>> to generate AND for masking compares?  I understand that that is what they
>> get canonicalized into but perhaps we can still generate vselect uniformly.
>>
>> Adam
>>
>>
>>
>> Thanks,
>> Robert
>>
>> Adam
>>>
>>> Thanks,
>>> Robert
>>>
>>> 2014-09-26 4:19 GMT+04:00 Adam Nemet <anemet at apple.com>:
>>>
>>>>
>>>> On Sep 25, 2014, at 12:43 AM, Robert Khasanov <rob.khasanov at gmail.com>
>>>> wrote:
>>>>
>>>> The last patch is incorrect. See this version instead of previous one.
>>>> Sorry for that.
>>>>
>>>> 2014-09-25 11:40 GMT+04:00 Robert Khasanov <rob.khasanov at gmail.com>:
>>>>
>>>>> Hi Elena, Adam,
>>>>>
>>>>> In these patches I enable intrinsics for vpcmpeq{bwdq} instructions.
>>>>> Since result of the instructions is mask, I enable new intrinsics type
>>>>> CMP_MASK and assume to enable CMP_MASK_CC in future for vpcmp{bwdq}
>>>>> instructions (due to additional CC argument).
>>>>> Also I extended argument types when intrinsics generated through
>>>>> TableGen (IIT_V64) to support 64 packed data.
>>>>> In last patch I enabled INSERT_SUBVECTOR for v8i1 to legalizer to
>>>>> support converting v2i1 and v4i1 to v8i1 and then bitcasting to i8.
>>>>>
>>>>> Please let me know if it looks good.
>>>>>
>>>>
>>>> Hi Robert,
>>>>
>>>> +    case CMP_MASK: {
>>>> +      EVT VT = Op.getOperand(1).getValueType();
>>>> +      EVT MaskVT = EVT::getVectorVT(*DAG.getContext(), MVT::i1,
>>>> +                                    VT.getVectorNumElements());
>>>> +      SDValue Cmp = DAG.getNode(IntrData->Opc0, dl, MaskVT,
>>>> +                                Op.getOperand(1), Op.getOperand(2));
>>>> +      SDValue Res;
>>>> +      if (isAllOnes(Op.getOperand(3))) {
>>>> +        Res = Cmp;
>>>> +      } else {
>>>> +        Res = DAG.getNode(ISD::AND, dl, MaskVT, Cmp,
>>>> +                          DAG.getNode(ISD::BITCAST, dl, MaskVT,
>>>> +                                      Op.getOperand(3)));
>>>> +      }
>>>> +      return DAG.getNode(ISD::BITCAST, dl, Op.getValueType(), Res);
>>>> +    }
>>>>
>>>> It would be good to reuse (and extend if needed) the helper
>>>> getVectorMaskingNode for this.  That should capture what it takes from a
>>>> expression to become a masked expression in one place (just like
>>>> AVX512_masking in td).  Obviously the BITCAST at the end does not belong
>>>> there there but the information that masked compare is canonicalized to an
>>>> AND rather than a VSELECT is useful.
>>>>
>>>> As you see I also don’t check for the mask value to be AllOnes in
>>>> getVectorMaskingNode.  We can add that for now if you need but I think that
>>>> should be an orthogonal DAGCombiner transformation to cover more cases.
>>>>
>>>> +    setOperationAction(ISD::INSERT_SUBVECTOR,   MVT::v8i1, Legal);
>>>>
>>>> Can you just enable it like that?  I understand that you need it to
>>>> lower the intrinsics but you may need to add some more patterns to TD to
>>>> recognize this in the general case.
>>>>
>>>> -        Res = DAG.getNode(ISD::AND, dl, MaskVT, Cmp,
>>>> -                          DAG.getNode(ISD::BITCAST, dl, MaskVT,
>>>> -                                      Op.getOperand(3)));
>>>> +        RMask = DAG.getNode(ISD::AND, dl, MaskVT, Cmp,
>>>> +                            DAG.getNode(ISD::EXTRACT_SUBVECTOR, dl,
>>>> MaskVT,
>>>> +                              DAG.getNode(ISD::BITCAST, dl, BitcastVT,
>>>> Mask),
>>>> +                              DAG.getIntPtrConstant(0)));
>>>>        }
>>>> -      return DAG.getNode(ISD::BITCAST, dl, Op.getValueType(), Res);
>>>> +      return DAG.getNode(ISD::BITCAST, dl, Op.getValueType(),
>>>> +                         DAG.getNode(ISD::INSERT_SUBVECTOR, dl,
>>>> BitcastVT,
>>>> +                                     DAG.getUNDEF(BitcastVT), RMask,
>>>> +                                     DAG.getIntPtrConstant(0)));
>>>>
>>>> This could use a comment with an example DAG that we’re trying to
>>>> create.
>>>>
>>>> Thanks,
>>>> Adam
>>>>
>>>>
>>>>
>>>>> Robert
>>>>>
>>>>
>>>>
>>>>
>>>>
>>>>
>>>
>>>
>>>
>>>
>>>
>>>
>>
>>
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.llvm.org/pipermail/llvm-commits/attachments/20141028/e5077fc3/attachment.html>


More information about the llvm-commits mailing list