[PATCH] D11678: [CodeGen] Fixes absdiff intrinsic: LangRef doc/test case improvement and corresponding code change

Mon Aug 3 14:42:04 PDT 2015

Hi,

So I think that the semantics of the absdiff intrinsic should be that there
be no truncation of an intermediate result. That would, I think, match up
with all the implementations I know of.

I am also happy to be corrected, it's now late my time :)

James
On Mon, 3 Aug 2015 at 22:28, Mikhail Zolotukhin <mzolotukhin at apple.com>
wrote:

> Hi James,
>
> According to LLVM Language reference manual, arithmetic operations do
> perform truncation:
> If the difference has unsigned overflow, the result returned is the
> mathematical result modulo 2n, where n is the bit width of the result.
>
> So, I think that if we use "uge" for unsigned, we're actually fine here.
> But I'd be happy to be corrected if I'm wrong here:)
>
> Michael
>
> On Aug 3, 2015, at 1:09 PM, James Molloy <james at jamesmolloy.co.uk> wrote:
>
> Hi Michael,
>
> I think the complexity comes from the subtraction having as its domain two
> unsigned integers- so it's range must be a larger signed integer.
>
> Signed comparison for unsigned values is clearly wrong as you say, but I
> could contruct a testcase that shows incorrect behaviour with an unsigned
> comparison too. I think the only correct behaviour is to extend the inputs
> first and truncate the result.
>
> But I've been wrong before :)
>
> James
> On Mon, 3 Aug 2015 at 21:05, Michael Zolotukhin <mzolotukhin at apple.com>
> wrote:
>
>> mzolotukhin added inline comments.
>>
>> ================
>> Comment at: docs/LangRef.rst:10387-10390
>> @@ -10386,6 +10386,6 @@
>>
>>      %sub = sub nsw <4 x i32> %a, %b
>> -    %ispos = icmp sgt <4 x i32> %sub, <i32 -1, i32 -1, i32 -1, i32 -1>
>> +    %ispos = icmp sge <4 x i32> %sub, zeroinitializer
>>      %neg = sub nsw <4 x i32> zeroinitializer, %sub
>>      %1 = select <4 x i1> %ispos, <4 x i32> %sub, <4 x i32> %neg
>>
>> ----------------
>> ashahid wrote:
>> > mzolotukhin wrote:
>> > > What's the difference between `llvm.uabsdiff` and `llvm.sabsdiff`
>> then?
>> > The difference is the presence of NSW flag in case of llvm.sabsdiff.
>> I still don't think it's correct. NSW is just a hint to optimizers, but
>> it doesn't add any additional logic. It does assert that the expression
>> won't overflow, but the operations we execute are still the same. That is,
>> currently the only difference between signed and unsigned version is that
>> for signed version we could get an undefined behavior in some cases. This
>> is clearly incorrect, because we should get different results without
>> undefined behavior in some cases (e.g. `<-1,-1,-1,-1>` and `<1,1,1,1>` - it
>> should give `<254,254,254,254>` for `uabsdiff.v4i8` and `<2,2,2,2>` for
>> `sabsdiff.v4i8`).
>>
>> What really should be the difference, as far is I understand, is
>> condition code in the comparison:
>> ```
>> %ispos = icmp sge <4 x i32> %sub, zeroinitializer
>> ```
>> As far as I understand, we should use `uge` for unsigned and `sge` for
>> signed case.
>>
>>
>>
>> Repository:
>>   rL LLVM
>>
>> http://reviews.llvm.org/D11678
>>
>>
>>
>>
>> _______________________________________________
>> llvm-commits mailing list
>> llvm-commits at cs.uiuc.edu
>> http://lists.cs.uiuc.edu/mailman/listinfo/llvm-commits
>>
>
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.llvm.org/pipermail/llvm-commits/attachments/20150803/535078e6/attachment.html>

[PATCH] D11678: [CodeGen] Fixes *absdiff* intrinsic: LangRef doc/test case improvement and corresponding code change

[PATCH] D11678: [CodeGen] Fixes absdiff intrinsic: LangRef doc/test case improvement and corresponding code change