[PATCH] D11678: [CodeGen] Fixes *absdiff* intrinsic: LangRef doc/test case improvement and corresponding code change

Shahid, Asghar-ahmad Asghar-ahmad.Shahid at amd.com
Tue Aug 4 01:56:29 PDT 2015


Hi Mikhail,

AFAIU, I think we need to use "ugt" as the "modulo 2^n" result of the difference for unsigned overflow
will be wrapped to zero.

But I too would be happy to be corrected if I'm wrong:)

Regards,
Shahid

From: Mikhail Zolotukhin [mailto:mzolotukhin at apple.com]
Sent: Tuesday, August 04, 2015 3:02 AM
To: James Molloy
Cc: reviews+D11678+public+e92bec0f352bb617 at reviews.llvm.org; Shahid, Asghar-ahmad; james.molloy at arm.com; hfinkel at anl.gov; llvm-commits at cs.uiuc.edu
Subject: Re: [PATCH] D11678: [CodeGen] Fixes *absdiff* intrinsic: LangRef doc/test case improvement and corresponding code change

Hi James,

According to LLVM Language reference manual, arithmetic operations do perform truncation:
If the difference has unsigned overflow, the result returned is the mathematical result modulo 2n, where n is the bit width of the result.

So, I think that if we use "uge" for unsigned, we're actually fine here. But I'd be happy to be corrected if I'm wrong here:)

Michael

On Aug 3, 2015, at 1:09 PM, James Molloy <james at jamesmolloy.co.uk<mailto:james at jamesmolloy.co.uk>> wrote:

Hi Michael,

I think the complexity comes from the subtraction having as its domain two unsigned integers- so it's range must be a larger signed integer.

Signed comparison for unsigned values is clearly wrong as you say, but I could contruct a testcase that shows incorrect behaviour with an unsigned comparison too. I think the only correct behaviour is to extend the inputs first and truncate the result.

But I've been wrong before :)

James
On Mon, 3 Aug 2015 at 21:05, Michael Zolotukhin <mzolotukhin at apple.com<mailto:mzolotukhin at apple.com>> wrote:
mzolotukhin added inline comments.

================
Comment at: docs/LangRef.rst:10387-10390
@@ -10386,6 +10386,6 @@

     %sub = sub nsw <4 x i32> %a, %b
-    %ispos = icmp sgt <4 x i32> %sub, <i32 -1, i32 -1, i32 -1, i32 -1>
+    %ispos = icmp sge <4 x i32> %sub, zeroinitializer
     %neg = sub nsw <4 x i32> zeroinitializer, %sub
     %1 = select <4 x i1> %ispos, <4 x i32> %sub, <4 x i32> %neg

----------------
ashahid wrote:
> mzolotukhin wrote:
> > What's the difference between `llvm.uabsdiff` and `llvm.sabsdiff` then?
> The difference is the presence of NSW flag in case of llvm.sabsdiff.
I still don't think it's correct. NSW is just a hint to optimizers, but it doesn't add any additional logic. It does assert that the expression won't overflow, but the operations we execute are still the same. That is, currently the only difference between signed and unsigned version is that for signed version we could get an undefined behavior in some cases. This is clearly incorrect, because we should get different results without undefined behavior in some cases (e.g. `<-1,-1,-1,-1>` and `<1,1,1,1>` - it should give `<254,254,254,254>` for `uabsdiff.v4i8` and `<2,2,2,2>` for `sabsdiff.v4i8`).

What really should be the difference, as far is I understand, is condition code in the comparison:
```
%ispos = icmp sge <4 x i32> %sub, zeroinitializer
```
As far as I understand, we should use `uge` for unsigned and `sge` for signed case.



Repository:
  rL LLVM

http://reviews.llvm.org/D11678




_______________________________________________
llvm-commits mailing list
llvm-commits at cs.uiuc.edu<mailto:llvm-commits at cs.uiuc.edu>
http://lists.cs.uiuc.edu/mailman/listinfo/llvm-commits

-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.llvm.org/pipermail/llvm-commits/attachments/20150804/848e0460/attachment.html>


More information about the llvm-commits mailing list