<html xmlns:v="urn:schemas-microsoft-com:vml" xmlns:o="urn:schemas-microsoft-com:office:office" xmlns:w="urn:schemas-microsoft-com:office:word" xmlns:m="http://schemas.microsoft.com/office/2004/12/omml" xmlns="http://www.w3.org/TR/REC-html40">
<head>
<meta http-equiv="Content-Type" content="text/html; charset=us-ascii">
<meta name="Generator" content="Microsoft Word 14 (filtered medium)">
<style><!--
/* Font Definitions */
@font-face
{font-family:Calibri;
panose-1:2 15 5 2 2 2 4 3 2 4;}
@font-face
{font-family:Tahoma;
panose-1:2 11 6 4 3 5 4 4 2 4;}
@font-face
{font-family:"Lucida Sans Unicode";
panose-1:2 11 6 2 3 5 4 2 2 4;}
/* Style Definitions */
p.MsoNormal, li.MsoNormal, div.MsoNormal
{margin:0in;
margin-bottom:.0001pt;
font-size:12.0pt;
font-family:"Times New Roman","serif";}
a:link, span.MsoHyperlink
{mso-style-priority:99;
color:blue;
text-decoration:underline;}
a:visited, span.MsoHyperlinkFollowed
{mso-style-priority:99;
color:purple;
text-decoration:underline;}
p.MsoAcetate, li.MsoAcetate, div.MsoAcetate
{mso-style-priority:99;
mso-style-link:"Balloon Text Char";
margin:0in;
margin-bottom:.0001pt;
font-size:8.0pt;
font-family:"Tahoma","sans-serif";}
span.apple-converted-space
{mso-style-name:apple-converted-space;}
span.EmailStyle18
{mso-style-type:personal-reply;
font-family:"Calibri","sans-serif";
color:#1F497D;}
span.BalloonTextChar
{mso-style-name:"Balloon Text Char";
mso-style-priority:99;
mso-style-link:"Balloon Text";
font-family:"Tahoma","sans-serif";}
.MsoChpDefault
{mso-style-type:export-only;
font-size:10.0pt;}
@page WordSection1
{size:8.5in 11.0in;
margin:1.0in 1.0in 1.0in 1.0in;}
div.WordSection1
{page:WordSection1;}
--></style><!--[if gte mso 9]><xml>
<o:shapedefaults v:ext="edit" spidmax="1026" />
</xml><![endif]--><!--[if gte mso 9]><xml>
<o:shapelayout v:ext="edit">
<o:idmap v:ext="edit" data="1" />
</o:shapelayout></xml><![endif]-->
</head>
<body lang="EN-US" link="blue" vlink="purple">
<div class="WordSection1">
<p class="MsoNormal"><span style="font-size:11.0pt;font-family:"Calibri","sans-serif";color:#1F497D">Hi James,<o:p></o:p></span></p>
<p class="MsoNormal"><span style="font-size:11.0pt;font-family:"Calibri","sans-serif";color:#1F497D"><o:p> </o:p></span></p>
<p class="MsoNormal"><span style="font-size:11.0pt;font-family:"Calibri","sans-serif";color:#1F497D">Thanks for the clarification, I recall my last statement, it was a result of complete misunderstanding of the<o:p></o:p></span></p>
<p class="MsoNormal"><span style="font-size:11.0pt;font-family:"Calibri","sans-serif";color:#1F497D">reference manual’s statements though it was quite clear.<o:p></o:p></span></p>
<p class="MsoNormal"><span style="font-size:11.0pt;font-family:"Calibri","sans-serif";color:#1F497D"><o:p> </o:p></span></p>
<p class="MsoNormal"><span style="font-size:11.0pt;font-family:"Calibri","sans-serif";color:#1F497D">As agreed earlier I will expand the given domain type to a larger type and truncate the range type finally.<o:p></o:p></span></p>
<p class="MsoNormal"><span style="font-size:11.0pt;font-family:"Calibri","sans-serif";color:#1F497D"><o:p> </o:p></span></p>
<p class="MsoNormal"><span style="font-size:11.0pt;font-family:"Calibri","sans-serif";color:#1F497D">My idea is to consider promotion of below types<o:p></o:p></span></p>
<p class="MsoNormal"><span style="font-size:11.0pt;font-family:"Calibri","sans-serif";color:#1F497D">i8 -> i16<o:p></o:p></span></p>
<p class="MsoNormal"><span style="font-size:11.0pt;font-family:"Calibri","sans-serif";color:#1F497D">i16 -> i32<o:p></o:p></span></p>
<p class="MsoNormal"><span style="font-size:11.0pt;font-family:"Calibri","sans-serif";color:#1F497D">i32 -> i64<o:p></o:p></span></p>
<p class="MsoNormal"><span style="font-size:11.0pt;font-family:"Calibri","sans-serif";color:#1F497D">i64 -> i128<o:p></o:p></span></p>
<p class="MsoNormal"><span style="font-size:11.0pt;font-family:"Calibri","sans-serif";color:#1F497D"><o:p> </o:p></span></p>
<p class="MsoNormal"><span style="font-size:11.0pt;font-family:"Calibri","sans-serif";color:#1F497D">Does it make sense?<o:p></o:p></span></p>
<p class="MsoNormal"><span style="font-size:11.0pt;font-family:"Calibri","sans-serif";color:#1F497D"><o:p> </o:p></span></p>
<p class="MsoNormal"><span style="font-size:11.0pt;font-family:"Calibri","sans-serif";color:#1F497D">Regards,<o:p></o:p></span></p>
<p class="MsoNormal"><span style="font-size:11.0pt;font-family:"Calibri","sans-serif";color:#1F497D">Shahid<o:p></o:p></span></p>
<p class="MsoNormal"><span style="font-size:11.0pt;font-family:"Calibri","sans-serif";color:#1F497D"><o:p> </o:p></span></p>
<div style="border:none;border-left:solid blue 1.5pt;padding:0in 0in 0in 4.0pt">
<div>
<div style="border:none;border-top:solid #B5C4DF 1.0pt;padding:3.0pt 0in 0in 0in">
<p class="MsoNormal"><b><span style="font-size:10.0pt;font-family:"Tahoma","sans-serif"">From:</span></b><span style="font-size:10.0pt;font-family:"Tahoma","sans-serif""> James Molloy [mailto:James.Molloy@arm.com]
<br>
<b>Sent:</b> Tuesday, August 04, 2015 2:46 PM<br>
<b>To:</b> Shahid, Asghar-ahmad<br>
<b>Cc:</b> Mikhail Zolotukhin; James Molloy; reviews+D11678+public+e92bec0f352bb617@reviews.llvm.org; Hal Finkel; llvm-commits<br>
<b>Subject:</b> Re: [PATCH] D11678: [CodeGen] Fixes *absdiff* intrinsic: LangRef doc/test case improvement and corresponding code change<o:p></o:p></span></p>
</div>
</div>
<p class="MsoNormal"><o:p> </o:p></p>
<p class="MsoNormal">Hi, <o:p></o:p></p>
<div>
<p class="MsoNormal"><o:p> </o:p></p>
</div>
<div>
<p class="MsoNormal">This all depends on how you define your intrinsic. It’s currently ambiguous - we have to choose what uabsdiff should do.<o:p></o:p></p>
</div>
<div>
<p class="MsoNormal"><o:p> </o:p></p>
</div>
<div>
<p class="MsoNormal">Given this expression:<o:p></o:p></p>
</div>
<div>
<p class="MsoNormal"><o:p> </o:p></p>
</div>
<div>
<p class="MsoNormal">call i8 @llvm.uabsdiff(i8 1, i8 130)<o:p></o:p></p>
</div>
<div>
<p class="MsoNormal"><o:p> </o:p></p>
</div>
<div>
<p class="MsoNormal">What should be the result?<o:p></o:p></p>
</div>
<div>
<p class="MsoNormal"><o:p> </o:p></p>
</div>
<div>
<p class="MsoNormal">If we expand, we get:<o:p></o:p></p>
</div>
<div>
<p class="MsoNormal"><o:p> </o:p></p>
</div>
<div>
<p class="MsoNormal">%1 = sub i8 1, i8 130 ; pure result is -129, but has to be truncated to fit into i8.<o:p></o:p></p>
</div>
<div>
<p class="MsoNormal"><o:p> </o:p></p>
</div>
<div>
<p class="MsoNormal">Neither unsigned nor signed comparisons will get the right result here. The result of the subtract after truncation is 127 in both signed and unsigned representations, which is greater than zero and is definitely not 129!<span style="color:#1F497D"><o:p></o:p></span></p>
</div>
<div>
<p class="MsoNormal"><o:p> </o:p></p>
</div>
<div>
<p class="MsoNormal">Instead, if we expand with range expansion:<o:p></o:p></p>
</div>
<div>
<p class="MsoNormal"><o:p> </o:p></p>
</div>
<div>
<p class="MsoNormal">%1 = zext i8 1 to i16 ; zext because we’re treating the inputs as unsigned<o:p></o:p></p>
</div>
<div>
<p class="MsoNormal">%2 = zext i8 130 to i16<o:p></o:p></p>
</div>
<div>
<p class="MsoNormal">%3 = sub i16 %1, %1 ; 0xff7f = -129<o:p></o:p></p>
</div>
<div>
<p class="MsoNormal">%4 = icmp sgt i16 %3, 0 ; false<o:p></o:p></p>
</div>
<div>
<p class="MsoNormal">%5 = sub i16 0, %3 ; 129<o:p></o:p></p>
</div>
<div>
<p class="MsoNormal">%6 = select i1 %4, i16 %3, %5 ; %5 = 129<o:p></o:p></p>
</div>
<div>
<p class="MsoNormal">%7 = trunc i16 %6 to i8 ; 129<o:p></o:p></p>
</div>
<div>
<p class="MsoNormal"><o:p> </o:p></p>
</div>
<div>
<p class="MsoNormal">So the end result is correct and in range, if we promote the intermediate calculations to a larger bit width.<o:p></o:p></p>
</div>
<div>
<p class="MsoNormal"><o:p> </o:p></p>
</div>
<div>
<p class="MsoNormal">So we need to decide: what should the behaviour of uabsdiff be? Should it behave as if there is larger intermediate precision or not?<o:p></o:p></p>
</div>
<div>
<p class="MsoNormal"><o:p> </o:p></p>
</div>
<div>
<p class="MsoNormal">I vote yes, because all the actual absdiff implementations I know of implement this behaviour.<o:p></o:p></p>
</div>
<div>
<p class="MsoNormal"><o:p> </o:p></p>
</div>
<div>
<p class="MsoNormal">Cheers,<o:p></o:p></p>
</div>
<div>
<p class="MsoNormal"><o:p> </o:p></p>
</div>
<div>
<p class="MsoNormal">James<o:p></o:p></p>
</div>
<div>
<p class="MsoNormal"><o:p> </o:p></p>
</div>
<div>
<div>
<div>
<p class="MsoNormal">On 4 Aug 2015, at 09:56, Shahid, Asghar-ahmad <<a href="mailto:Asghar-ahmad.Shahid@amd.com">Asghar-ahmad.Shahid@amd.com</a>> wrote:<o:p></o:p></p>
</div>
<p class="MsoNormal"><br>
<br>
<o:p></o:p></p>
<div>
<div>
<p class="MsoNormal"><span style="font-size:11.0pt;font-family:"Calibri","sans-serif";color:#1F497D">Hi Mikhail,</span><o:p></o:p></p>
</div>
<div>
<p class="MsoNormal"><span style="font-size:11.0pt;font-family:"Calibri","sans-serif";color:#1F497D"> </span><o:p></o:p></p>
</div>
<div>
<p class="MsoNormal"><span style="font-size:11.0pt;font-family:"Calibri","sans-serif";color:#1F497D">AFAIU, I think we need to use “ugt” as the “modulo 2^n” result of the difference for unsigned overflow</span><o:p></o:p></p>
</div>
<div>
<p class="MsoNormal"><span style="font-size:11.0pt;font-family:"Calibri","sans-serif";color:#1F497D">will be wrapped to zero.</span><o:p></o:p></p>
</div>
<div>
<p class="MsoNormal"><span style="font-size:11.0pt;font-family:"Calibri","sans-serif";color:#1F497D"> </span><o:p></o:p></p>
</div>
<div>
<p class="MsoNormal"><span style="font-size:11.0pt;font-family:"Calibri","sans-serif";color:#1F497D">But I too would be happy to be corrected if I'm wrong:)</span><o:p></o:p></p>
</div>
<div>
<p class="MsoNormal"><span style="font-size:11.0pt;font-family:"Calibri","sans-serif";color:#1F497D"> </span><o:p></o:p></p>
</div>
<div>
<p class="MsoNormal"><span style="font-size:11.0pt;font-family:"Calibri","sans-serif";color:#1F497D">Regards,</span><o:p></o:p></p>
</div>
<div>
<p class="MsoNormal"><span style="font-size:11.0pt;font-family:"Calibri","sans-serif";color:#1F497D">Shahid</span><o:p></o:p></p>
</div>
<div>
<p class="MsoNormal"><span style="font-size:11.0pt;font-family:"Calibri","sans-serif";color:#1F497D"> </span><o:p></o:p></p>
</div>
<div style="border:none;border-left:solid blue 1.5pt;padding:0in 0in 0in 4.0pt">
<div>
<div style="border:none;border-top:solid #B5C4DF 1.0pt;padding:3.0pt 0in 0in 0in">
<div>
<p class="MsoNormal"><b><span style="font-size:10.0pt;font-family:"Tahoma","sans-serif"">From:</span></b><span class="apple-converted-space"><span style="font-size:10.0pt;font-family:"Tahoma","sans-serif""> </span></span><span style="font-size:10.0pt;font-family:"Tahoma","sans-serif"">Mikhail
Zolotukhin [<a href="mailto:mzolotukhin@apple.com"><span style="color:purple">mailto:mzolotukhin@apple.com</span></a>]<span class="apple-converted-space"> </span><br>
<b>Sent:</b><span class="apple-converted-space"> </span>Tuesday, August 04, 2015 3:02 AM<br>
<b>To:</b><span class="apple-converted-space"> </span>James Molloy<br>
<b>Cc:</b><span class="apple-converted-space"> </span><a href="mailto:reviews+D11678+public+e92bec0f352bb617@reviews.llvm.org"><span style="color:purple">reviews+D11678+public+e92bec0f352bb617@reviews.llvm.org</span></a>; Shahid, Asghar-ahmad;<span class="apple-converted-space"> </span><a href="mailto:james.molloy@arm.com"><span style="color:purple">james.molloy@arm.com</span></a>;<span class="apple-converted-space"> </span><a href="mailto:hfinkel@anl.gov"><span style="color:purple">hfinkel@anl.gov</span></a>;<span class="apple-converted-space"> </span><a href="mailto:llvm-commits@cs.uiuc.edu"><span style="color:purple">llvm-commits@cs.uiuc.edu</span></a><br>
<b>Subject:</b><span class="apple-converted-space"> </span>Re: [PATCH] D11678: [CodeGen] Fixes *absdiff* intrinsic: LangRef doc/test case improvement and corresponding code change</span><o:p></o:p></p>
</div>
</div>
</div>
<div>
<p class="MsoNormal"> <o:p></o:p></p>
</div>
<div>
<p class="MsoNormal">Hi James,<o:p></o:p></p>
</div>
<div>
<div>
<p class="MsoNormal"> <o:p></o:p></p>
</div>
</div>
<div>
<div>
<p class="MsoNormal">According to LLVM Language reference manual, arithmetic operations do perform truncation:<o:p></o:p></p>
</div>
</div>
<div>
<div>
<p class="MsoNormal"><span style="font-size:10.5pt;font-family:"Lucida Sans Unicode","sans-serif";background:white">If the difference has unsigned overflow, the result returned is the mathematical result modulo 2</span><sup><span style="font-family:"Lucida Sans Unicode","sans-serif"">n</span></sup><span style="font-size:10.5pt;font-family:"Lucida Sans Unicode","sans-serif";background:white">,
where n is the bit width of the result.</span><o:p></o:p></p>
</div>
</div>
<div>
<div>
<p class="MsoNormal"> <o:p></o:p></p>
</div>
</div>
<div>
<div>
<p class="MsoNormal">So, I think that if we use "uge" for unsigned, we're actually fine here. But I'd be happy to be corrected if I'm wrong here:)<o:p></o:p></p>
</div>
</div>
<div>
<div>
<p class="MsoNormal"> <o:p></o:p></p>
</div>
</div>
<div>
<div>
<p class="MsoNormal">Michael<o:p></o:p></p>
</div>
</div>
<div>
<div>
<p class="MsoNormal"> <o:p></o:p></p>
</div>
<div>
<blockquote style="margin-top:5.0pt;margin-bottom:5.0pt">
<div>
<div>
<p class="MsoNormal">On Aug 3, 2015, at 1:09 PM, James Molloy <<a href="mailto:james@jamesmolloy.co.uk"><span style="color:purple">james@jamesmolloy.co.uk</span></a>> wrote:<o:p></o:p></p>
</div>
</div>
<div>
<p class="MsoNormal"> <o:p></o:p></p>
</div>
<div>
<div>
<p class="MsoNormal">Hi Michael,<br>
<br>
I think the complexity comes from the subtraction having as its domain two unsigned integers- so it's range must be a larger signed integer.<br>
<br>
Signed comparison for unsigned values is clearly wrong as you say, but I could contruct a testcase that shows incorrect behaviour with an unsigned comparison too. I think the only correct behaviour is to extend the inputs first and truncate the result.<span class="apple-converted-space"> </span><br>
<br>
But I've been wrong before :)<br>
<br>
James<o:p></o:p></p>
</div>
<div>
<div>
<div>
<p class="MsoNormal">On Mon, 3 Aug 2015 at 21:05, Michael Zolotukhin <<a href="mailto:mzolotukhin@apple.com"><span style="color:purple">mzolotukhin@apple.com</span></a>> wrote:<o:p></o:p></p>
</div>
</div>
<blockquote style="border:none;border-left:solid #CCCCCC 1.0pt;padding:0in 0in 0in 6.0pt;margin-left:4.8pt;margin-top:5.0pt;margin-right:0in;margin-bottom:5.0pt">
<div>
<p class="MsoNormal">mzolotukhin added inline comments.<br>
<br>
================<br>
Comment at: docs/LangRef.rst:10387-10390<br>
@@ -10386,6 +10386,6 @@<br>
<br>
%sub = sub nsw <4 x i32> %a, %b<br>
- %ispos = icmp sgt <4 x i32> %sub, <i32 -1, i32 -1, i32 -1, i32 -1><br>
+ %ispos = icmp sge <4 x i32> %sub, zeroinitializer<br>
%neg = sub nsw <4 x i32> zeroinitializer, %sub<br>
%1 = select <4 x i1> %ispos, <4 x i32> %sub, <4 x i32> %neg<br>
<br>
----------------<br>
ashahid wrote:<br>
> mzolotukhin wrote:<br>
> > What's the difference between `llvm.uabsdiff` and `llvm.sabsdiff` then?<br>
> The difference is the presence of NSW flag in case of llvm.sabsdiff.<br>
I still don't think it's correct. NSW is just a hint to optimizers, but it doesn't add any additional logic. It does assert that the expression won't overflow, but the operations we execute are still the same. That is, currently the only difference between
signed and unsigned version is that for signed version we could get an undefined behavior in some cases. This is clearly incorrect, because we should get different results without undefined behavior in some cases (e.g. `<-1,-1,-1,-1>` and `<1,1,1,1>` - it
should give `<254,254,254,254>` for `uabsdiff.v4i8` and `<2,2,2,2>` for `sabsdiff.v4i8`).<br>
<br>
What really should be the difference, as far is I understand, is condition code in the comparison:<br>
```<br>
%ispos = icmp sge <4 x i32> %sub, zeroinitializer<br>
```<br>
As far as I understand, we should use `uge` for unsigned and `sge` for signed case.<br>
<br>
<br>
<br>
Repository:<br>
rL LLVM<br>
<br>
<a href="https://urldefense.proofpoint.com/v2/url?u=http-3A__reviews.llvm.org_D11678&d=AwMFAg&c=8hUWFZcy2Z-Za5rBPlktOQ&r=mQ4LZ2PUj9hpadE3cDHZnIdEwhEBrbAstXeMaFoB9tg&m=ikW87KaUFRTVYg1DZv34LbjQhfTYNd44D4VoY7kTjyA&s=UvWLnqRIQaoMLguX6EmG9XRGi5W1hdBGbR09KbaUQes&e=" target="_blank"><span style="color:purple">http://reviews.llvm.org/D11678</span></a><br>
<br>
<br>
<br>
<br>
_______________________________________________<br>
llvm-commits mailing list<br>
<a href="mailto:llvm-commits@cs.uiuc.edu" target="_blank"><span style="color:purple">llvm-commits@cs.uiuc.edu</span></a><br>
<a href="http://lists.cs.uiuc.edu/mailman/listinfo/llvm-commits" target="_blank"><span style="color:purple">http://lists.cs.uiuc.edu/mailman/listinfo/llvm-commits</span></a><o:p></o:p></p>
</div>
</blockquote>
</div>
</div>
</blockquote>
</div>
</div>
</div>
</div>
</div>
<p class="MsoNormal"><o:p> </o:p></p>
</div>
<p class="MsoNormal"><br>
<span style="font-size:10.0pt;font-family:"Arial","sans-serif";color:black">-- IMPORTANT NOTICE: The contents of this email and any attachments are confidential and may also be privileged. If you are not the intended recipient, please notify the sender immediately
and do not disclose the contents to any other person, use it for any purpose, or store or copy the information in any medium. Thank you.<br>
<br>
ARM Limited, Registered office 110 Fulbourn Road, Cambridge CB1 9NJ, Registered in England & Wales, Company No: 2557590<br>
ARM Holdings plc, Registered office 110 Fulbourn Road, Cambridge CB1 9NJ, Registered in England & Wales, Company No: 2548782</span><o:p></o:p></p>
</div>
</div>
</body>
</html>