[PATCH] D20598: [X86] Detect SAD patterns and emit psadbw instructions on X86 redux

Michael Kuperstein via llvm-commits llvm-commits at lists.llvm.org
Fri May 27 09:56:32 PDT 2016


Thanks!
And I'd rather get INSERT_SUBVECTOR to work right (what I meant was that
it's probably broken in general :-) )- I think it's more natural here than
explicitly zeroing out the high lanes. If I see that that doesn't work out,
I'll make the change you suggest.

On Fri, May 27, 2016 at 9:39 AM, Wei Mi <wmi at google.com> wrote:

> On Fri, May 27, 2016 at 9:20 AM, Michael Kuperstein <mkuper at google.com>
> wrote:
> > mkuper added a comment.
> >
> > I agree the spill code is complete nonsense, but I don't think it should
> block this patch, for two reasons:
> >
> > 1. We seem to have a real problem with inserts into oversize vectors
> (e.g. <16 x i32> on SSE2), but I'm not entirely sure it's a high priority.
> I'm not sure how much of those the vectorizer actually generates -
> although, for reductions, that may be more common, I haven't checked.
> >
>
> I don't think it is a general problem. No such stack saves/restores
> are generated if we turn off detectSADPattern for sad16_i8. I think
> the problem is because we use INSERT_SUBVECTOR. I tried the following
> code then the stack saves/restores were gone.
>
>   if (VT.getSizeInBits() > ResVT.getSizeInBits()) {
>
>     // Update part of elements of the reduction vector. This is done by
> first
>     // extracting a sub-vector from it, updating this sub-vector, and
> inserting
>     // it back.
>     SDValue SubPhi = DAG.getNode(ISD::EXTRACT_SUBVECTOR, DL, ResVT, Phi,
>                                  DAG.getIntPtrConstant(0, DL));
>     SDValue Res = DAG.getNode(ISD::ADD, DL, ResVT, Sad, SubPhi);
>     SmallVector<SDValue, 16> Ops(VT.getSizeInBits() /
> ResVT.getSizeInBits(), DAG.getConstant(0, DL, ResVT));
>     Ops[0] = Res;
>     return DAG.getNode(ISD::CONCAT_VECTORS, DL, VT, Ops);
>   }
>
> > 2. More importantly, even with the spills, it still looks like a large
> net improvement. E.g. the SSE2 code for the sad16_i8 loop before this patch
> is:
> > ...
> > Are you ok with me committing as is, and filing a PR on the
> insert_subvector issue?
> >
>
> I think it is fine. Thanks.
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.llvm.org/pipermail/llvm-commits/attachments/20160527/2b665419/attachment.html>


More information about the llvm-commits mailing list