[PATCH] D11678: [CodeGen] Fixes *absdiff* intrinsic: LangRef doc/test case improvement and corresponding code change
Michael Zolotukhin
mzolotukhin at apple.com
Mon Aug 3 12:55:43 PDT 2015
mzolotukhin added inline comments.
================
Comment at: docs/LangRef.rst:10387-10390
@@ -10386,6 +10386,6 @@
%sub = sub nsw <4 x i32> %a, %b
- %ispos = icmp sgt <4 x i32> %sub, <i32 -1, i32 -1, i32 -1, i32 -1>
+ %ispos = icmp sge <4 x i32> %sub, zeroinitializer
%neg = sub nsw <4 x i32> zeroinitializer, %sub
%1 = select <4 x i1> %ispos, <4 x i32> %sub, <4 x i32> %neg
----------------
ashahid wrote:
> mzolotukhin wrote:
> > What's the difference between `llvm.uabsdiff` and `llvm.sabsdiff` then?
> The difference is the presence of NSW flag in case of llvm.sabsdiff.
I still don't think it's correct. NSW is just a hint to optimizers, but it doesn't add any additional logic. It does assert that the expression won't overflow, but the operations we execute are still the same. That is, currently the only difference between the signed and unsigned versions is that the signed version could hit undefined behavior in some cases. This is clearly incorrect, because the two versions should give different results, without any undefined behavior, for some inputs (e.g. for `<-1,-1,-1,-1>` and `<1,1,1,1>`, `uabsdiff.v4i8` should give `<254,254,254,254>` and `sabsdiff.v4i8` should give `<2,2,2,2>`).
What the difference really should be, as far as I understand, is the condition code in the comparison:
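To make the example concrete, here is a minimal Python sketch of the results I'd expect from the two intrinsics on i8 lanes. The helper names (`to_signed`, `to_unsigned`, `ref_uabsdiff`, `ref_sabsdiff`) are made up for illustration and are not part of LLVM; it only models the arithmetic, not the lowering, and it ignores signed differences that don't fit in the lane.

```python
def to_unsigned(x, bits=8):
    """Keep the low `bits` bits of x, read as an unsigned integer."""
    return x & ((1 << bits) - 1)


def to_signed(x, bits=8):
    """Read the low `bits` bits of x as a two's-complement signed integer."""
    x = to_unsigned(x, bits)
    return x - (1 << bits) if x >= 1 << (bits - 1) else x


def ref_uabsdiff(a, b, bits=8):
    # Lanes are interpreted as unsigned before taking |a - b|.
    return [abs(to_unsigned(x, bits) - to_unsigned(y, bits)) for x, y in zip(a, b)]


def ref_sabsdiff(a, b, bits=8):
    # Lanes are interpreted as signed before taking |a - b|.
    # Differences that overflow the lane width are not modelled here.
    return [abs(to_signed(x, bits) - to_signed(y, bits)) for x, y in zip(a, b)]


a = [-1, -1, -1, -1]
b = [1, 1, 1, 1]
print(ref_uabsdiff(a, b))  # [254, 254, 254, 254]
print(ref_sabsdiff(a, b))  # [2, 2, 2, 2]
```

The same bit patterns give different answers purely because of how the lanes are interpreted, which a flag like NSW cannot express.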
```
%ispos = icmp sge <4 x i32> %sub, zeroinitializer
```
As far as I understand, we should use `uge` for the unsigned case and `sge` for the signed case.
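As a rough check of this suggestion, the sketch below models the sub/icmp/select expansion from the LangRef snippet on i8 lanes in Python, with the comparison predicate as the only thing that varies. The names `expand_absdiff` and `as_signed_i8` are hypothetical, and wrapping 8-bit arithmetic is assumed throughout (i.e. the NSW flag is ignored).

```python
MASK = 0xFF  # model each lane as an i8


def as_signed_i8(x):
    """Read an 8-bit pattern as a two's-complement signed value."""
    return x - 256 if x & 0x80 else x


def expand_absdiff(a, b, signed):
    """Model the expansion: %sub, %ispos (sge or uge), %neg, select."""
    out = []
    for x, y in zip(a, b):
        sub = (x - y) & MASK      # %sub = sub %a, %b (wrapping)
        neg = (0 - sub) & MASK    # %neg = sub zeroinitializer, %sub
        if signed:
            ispos = as_signed_i8(sub) >= 0  # icmp sge %sub, zeroinitializer
        else:
            ispos = sub >= 0                # icmp uge %sub, zeroinitializer
        out.append(sub if ispos else neg)   # select %ispos, %sub, %neg
    return out


a = [-1, -1, -1, -1]
b = [1, 1, 1, 1]
print(expand_absdiff(a, b, signed=False))  # [254, 254, 254, 254]
print(expand_absdiff(a, b, signed=True))   # [2, 2, 2, 2]
```

One thing this model makes visible: an unsigned comparison of `%sub` against zero holds for every lane, so the `uge` variant always selects the wrapped difference. Whether the unsigned expansion should compare the operands directly instead is worth checking separately.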
Repository:
rL LLVM
http://reviews.llvm.org/D11678