[PATCH] D11678: [CodeGen] Fixes absdiff intrinsic: LangRef doc/test case improvement and corresponding code change

Mon Aug 3 05:25:50 PDT 2015

> With larger data type do you mean promoting the given data type to larger
type ex: MVT:i32 to MVT:i64

> And then doing the expansion?

Yes, that was my suggestion.

On Mon, 3 Aug 2015 at 13:23 Shahid, Asghar-ahmad <
Asghar-ahmad.Shahid at amd.com> wrote:

> Hi James,
>
>
>
> That is right, for Uint_max the current comparison will not be proper.
>
>
>
> With larger data type do you mean promoting the given data type to larger
> type ex: MVT:i32 to MVT:i64
>
> And then doing the expansion?
>
>
>
> Regards,
>
> Shahid
>
>
>
> *From:* James Molloy [mailto:james at jamesmolloy.co.uk]
> *Sent:* Monday, August 03, 2015 1:02 AM
> *To:* reviews+D11678+public+e92bec0f352bb617 at reviews.llvm.org; Shahid,
> Asghar-ahmad; james.molloy at arm.com; hfinkel at anl.gov; mzolotukhin at apple.com
> *Cc:* llvm-commits at cs.uiuc.edu
> *Subject:* Re: [PATCH] D11678: [CodeGen] Fixes *absdiff* intrinsic:
> LangRef doc/test case improvement and corresponding code change
>
>
>
> I think uabsdiff needs to be expanded using a larger data type. If we have
> uabsdiff(uint_max, uint_max) , a signed comparison won't return the right
> result unless the bitwidth is expanded, right?
>
> James
>
> On Sun, 2 Aug 2015 at 08:20, Shahid <Asghar-ahmad.Shahid at amd.com> wrote:
>
> ashahid added inline comments.
>
> ================
> Comment at: docs/LangRef.rst:10387-10390
> @@ -10386,6 +10386,6 @@
>
>      %sub = sub nsw <4 x i32> %a, %b
> -    %ispos = icmp sgt <4 x i32> %sub, <i32 -1, i32 -1, i32 -1, i32 -1>
> +    %ispos = icmp sge <4 x i32> %sub, zeroinitializer
>      %neg = sub nsw <4 x i32> zeroinitializer, %sub
>      %1 = select <4 x i1> %ispos, <4 x i32> %sub, <4 x i32> %neg
>
> ----------------
> mzolotukhin wrote:
> > What's the difference between `llvm.uabsdiff` and `llvm.sabsdiff` then?
> The difference is the presence of NSW flag in case of llvm.sabsdiff.
>
> ================
> Comment at: lib/CodeGen/SelectionDAG/LegalizeVectorOps.cpp:737
> @@ -736,5 +736,3 @@
>        TLI.getSetCCResultType(DAG.getDataLayout(), *DAG.getContext(), VT),
> Tmp2,
> -      DAG.getConstant(0, dl, VT),
> -      DAG.getCondCode(Op->getOpcode() == ISD::SABSDIFF ? ISD::SETLT
> -                                                       : ISD::SETULT));
> +      DAG.getConstant(0, dl, VT), DAG.getCondCode(ISD::SETGE));
>    Tmp1 = DAG.getNode(ISD::VSELECT, dl, VT, Tmp4, Tmp1, Tmp2);
> ----------------
> mzolotukhin wrote:
> > AFAIU, this should be `ISD::SETLE`.
> My bad... intention was to use Tmp1 instead of Tmp2. I will use the proper
> variable names to reflect the operations.
>
> ================
> Comment at: test/CodeGen/X86/absdiff_128.ll:150-152
> @@ +149,5 @@
> +; CHECK-DAG:  psubd  %xmm1, [[SRC:%xmm[0-9]+]]
> +; CHECK:      pxor
> +; CHECK-DAG:  pxor    %xmm3, [[DST:%xmm[0-9]+]]
> +; CHECK-NEXT: psubd  [[SRC]], [[DST]]
> +; CHECK-DAG:  pcmpgtd  %xmm3, [[SRC:%xmm[0-9]+]]
> ----------------
> mzolotukhin wrote:
> > This `CHECK-DAG` doesn't make much sense, since it's limited by `CHECK`
> and `CHECK-NEXT` from both sides.
> >
> > Moreover, I think the right way to make the tests less bristle is to not
> check for everything, but just look for key instructions.
> > For example, we definitely expect to see `psubd`, then, maybe after
> several other instructions, we want to see `pcmpgt`, then we want to see
> `pand`, `pandn`, and `por`. Thus, I'd write this test something like this:
> > ```
> > CHECK: psubd
> > CHECK: pcmpgt
> > CHECK-DAG: pand // BTW, why do you have two `pandn` here?
> > CHECK-DAG: pandn
> > CHECK: por
> > CHECK: ret
> > ```
> Thanks, this is a very important input.
>
> For ISD::SETGE, X86 swaps the operand, consequently, in context of VSELECT
> it uses two "pandn".
>
>
>
> ================
> Comment at: test/CodeGen/X86/absdiff_128.ll:206
> @@ +205,3 @@
> +; CHECK-NEXT:   pcmpgtd
> +; CHECK-NEXT:   pshufd  {{.*}}    # xmm5 = xmm4[0,0,2,2]
> +; CHECK-NEXT:   pcmpeqd
> ----------------
> mzolotukhin wrote:
> > If we don't want to match any specific register here, we need to get rid
> of comments `# xmm5 = xmm4...` too.
> Ok
>
> ================
> Comment at: test/CodeGen/X86/absdiff_256.ll:33
> @@ +32,3 @@
> +; CHECK-DAG:         psubd   %xmm6, [[SRC:%xmm[0-9]+]]
> +; CHECK-DAG:         pxor    %xmm5, [[DST:%xmm[0-9]+]]
> +; CHECK-NEXT:        psubd   [[SRC]], [[DST]]
> ----------------
> mzolotukhin wrote:
> > This is still fragile.
> > Imagine that register allocator for some strange reason begins to use
> `xmm5` instead of `xmm6` and vice versa - this test will immediately fail.
> >
> > Also, if you want to match `pxor %xmmN, %xmmN`, the correct way to write
> the regexp for it would be:
> > ```
> > pxor [[SOMENAME:%xmm[0-9]+]], [[SOMENAME]]
> > ```
> > This will ensure that `pxor` operates on the same register.
> Ok
>
>
> Repository:
>   rL LLVM
>
> http://reviews.llvm.org/D11678
>
>
>
>
> _______________________________________________
> llvm-commits mailing list
> llvm-commits at cs.uiuc.edu
> http://lists.cs.uiuc.edu/mailman/listinfo/llvm-commits
>
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.llvm.org/pipermail/llvm-commits/attachments/20150803/9d429db2/attachment.html>

[PATCH] D11678: [CodeGen] Fixes *absdiff* intrinsic: LangRef doc/test case improvement and corresponding code change

[PATCH] D11678: [CodeGen] Fixes absdiff intrinsic: LangRef doc/test case improvement and corresponding code change