<html xmlns:v="urn:schemas-microsoft-com:vml" xmlns:o="urn:schemas-microsoft-com:office:office" xmlns:w="urn:schemas-microsoft-com:office:word" xmlns:m="http://schemas.microsoft.com/office/2004/12/omml" xmlns="http://www.w3.org/TR/REC-html40">
<head>
<meta http-equiv="Content-Type" content="text/html; charset=utf-8">
<meta name="Generator" content="Microsoft Word 15 (filtered medium)">
<style><!--
/* Font Definitions */
@font-face
{font-family:"Cambria Math";
panose-1:2 4 5 3 5 4 6 3 2 4;}
@font-face
{font-family:Calibri;
panose-1:2 15 5 2 2 2 4 3 2 4;}
/* Style Definitions */
p.MsoNormal, li.MsoNormal, div.MsoNormal
{margin:0in;
margin-bottom:.0001pt;
font-size:11.0pt;
font-family:"Calibri",sans-serif;}
a:link, span.MsoHyperlink
{mso-style-priority:99;
color:blue;
text-decoration:underline;}
a:visited, span.MsoHyperlinkFollowed
{mso-style-priority:99;
color:purple;
text-decoration:underline;}
p.msonormal0, li.msonormal0, div.msonormal0
{mso-style-name:msonormal;
mso-margin-top-alt:auto;
margin-right:0in;
mso-margin-bottom-alt:auto;
margin-left:0in;
font-size:11.0pt;
font-family:"Calibri",sans-serif;}
span.EmailStyle18
{mso-style-type:personal-reply;
font-family:"Calibri",sans-serif;
color:windowtext;}
.MsoChpDefault
{mso-style-type:export-only;
font-family:"Calibri",sans-serif;}
@page WordSection1
{size:8.5in 11.0in;
margin:1.0in 1.0in 1.0in 1.0in;}
div.WordSection1
{page:WordSection1;}
--></style><!--[if gte mso 9]><xml>
<o:shapedefaults v:ext="edit" spidmax="1026" />
</xml><![endif]--><!--[if gte mso 9]><xml>
<o:shapelayout v:ext="edit">
<o:idmap v:ext="edit" data="1" />
</o:shapelayout></xml><![endif]-->
</head>
<body lang="EN-US" link="blue" vlink="purple">
<div class="WordSection1">
<p class="MsoNormal">Strict fp_to_f16 is influenced by the rounding mode, but only when the conversion isn’t exact. So in strict mode you could assume that any bitcast/store has an exact operand and use an arbitrary chain, I guess. That’s pretty fragile, though;
it’s probably simpler to change legalization to soften them.<o:p></o:p></p>
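<p class="MsoNormal">To illustrate the exactness point, here is a minimal Python sketch (a toy model, not LLVM code). The helper round_to_half and its mode names are hypothetical; it assumes a positive input in the normal f16 range and uses exact rational arithmetic to round to f16 precision under three rounding modes. An input that f16 can represent exactly gives the same result in every mode, while an inexact one does not.</p>
<pre>
```python
from fractions import Fraction
import math

HALF_FRACTION_BITS = 10  # explicit significand bits in IEEE binary16


def round_to_half(x, mode):
    """Round a positive Fraction in the normal f16 range to f16 precision.

    Hypothetical model for illustration only: no subnormals, overflow, or NaN.
    """
    # frexp normalizes the mantissa into [0.5, 1), so the leading bit
    # of x has weight 2**(exp - 1).
    exponent = math.frexp(float(x))[1] - 1
    ulp = Fraction(2) ** (exponent - HALF_FRACTION_BITS)
    scaled = x / ulp
    if mode == "nearest-even":
        steps = round(scaled)       # round() on a Fraction breaks ties to even
    elif mode == "toward-zero":
        steps = math.floor(scaled)  # x is positive, so floor truncates
    elif mode == "upward":
        steps = math.ceil(scaled)
    else:
        raise ValueError(mode)
    return steps * ulp


MODES = ("nearest-even", "toward-zero", "upward")

exact_input = Fraction(3, 2)                     # 1.5 is representable in f16
inexact_input = Fraction(1) + Fraction(1, 4096)  # 1 + 2**-12 fits f32 but not f16

exact_results = {round_to_half(exact_input, m) for m in MODES}
inexact_results = {round_to_half(inexact_input, m) for m in MODES}

print(sorted(exact_results))
print(sorted(inexact_results))
```
</pre>
<p class="MsoNormal">The first set has a single element and the second has two, which is why a conversion with an exact operand would not need to be ordered against rounding-mode changes.</p>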
<p class="MsoNormal"><o:p> </o:p></p>
<p class="MsoNormal">-Eli<o:p></o:p></p>
<p class="MsoNormal"><o:p> </o:p></p>
<div style="border:none;border-left:solid blue 1.5pt;padding:0in 0in 0in 4.0pt">
<div>
<div style="border:none;border-top:solid #E1E1E1 1.0pt;padding:3.0pt 0in 0in 0in">
<p class="MsoNormal"><b>From:</b> Craig Topper &lt;craig.topper@gmail.com&gt; <br>
<b>Sent:</b> Tuesday, December 10, 2019 3:35 PM<br>
<b>To:</b> Eli Friedman &lt;efriedma@quicinc.com&gt;<br>
<b>Cc:</b> llvm-dev &lt;llvm-dev@lists.llvm.org&gt;; Tim Northover &lt;t.p.northover@gmail.com&gt;<br>
<b>Subject:</b> [EXT] Re: TypePromoteFloat loses intermediate rounding operations<o:p></o:p></p>
</div>
</div>
<p class="MsoNormal"><o:p> </o:p></p>
<div>
<div>
<p class="MsoNormal">Thanks Eli.<o:p></o:p></p>
</div>
<div>
<p class="MsoNormal"><o:p> </o:p></p>
</div>
<div>
<p class="MsoNormal">I forgot to bring up the strict FP questions, which I was working on when I found this. If we're in a strict FP function, do the fp_to_f16/f16_to_fp nodes emitted by promoting load/store/bitcast need to be the strict versions of fp_to_f16/f16_to_fp?
And if so, where do we get the chain, especially for the bitcast case, which isn't a chained node?<o:p></o:p></p>
</div>
<div>
<p class="MsoNormal"><o:p> </o:p></p>
</div>
<p class="MsoNormal"><br clear="all">
<o:p></o:p></p>
<div>
<div>
<p class="MsoNormal">~Craig<o:p></o:p></p>
</div>
</div>
<p class="MsoNormal"><o:p> </o:p></p>
</div>
<p class="MsoNormal"><o:p> </o:p></p>
<div>
<div>
<p class="MsoNormal">On Tue, Dec 10, 2019 at 3:18 PM Eli Friedman &lt;<a href="mailto:efriedma@quicinc.com">efriedma@quicinc.com</a>&gt; wrote:<o:p></o:p></p>
</div>
<blockquote style="border:none;border-left:solid #CCCCCC 1.0pt;padding:0in 0in 0in 6.0pt;margin-left:4.8pt;margin-right:0in">
<div>
<div>
<p class="MsoNormal" style="mso-margin-top-alt:auto;mso-margin-bottom-alt:auto">We could fix the legalization without touching the other handling by just inserting an fp_to_f16/f16_to_fp pair after each arithmetic operation that requires it. One advantage
to that approach is that it’s easier to take the obvious shortcut for fast-math.<o:p></o:p></p>
<p class="MsoNormal" style="mso-margin-top-alt:auto;mso-margin-bottom-alt:auto"> <o:p></o:p></p>
<p class="MsoNormal" style="mso-margin-top-alt:auto;mso-margin-bottom-alt:auto">The “promote-to-larger” strategy doesn’t really round correctly in general, but it works for specific combinations of type and operation. For example, for f16 fadd in the default rounding
mode, “a+b” is exactly equivalent to “(_Float16)((float)a+(float)b)”. I’m not sure whether this holds for all f16 operations, or how much we care if it doesn’t.<o:p></o:p></p>
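<p class="MsoNormal" style="mso-margin-top-alt:auto;mso-margin-bottom-alt:auto">The fadd claim above can be spot-checked with a short Python sketch. It is only an emulation under stated assumptions: to_half, to_float and half_from_bits are hypothetical helpers built on the standard struct module's binary16 ("e") and binary32 ("f") formats, which round to nearest even, and the double-precision sum of two finite halves is taken as the exact reference (both operands are multiples of 2**-24 with magnitude below 2**16, so that sum is exact).</p>
<pre>
```python
import math
import random
import struct


def to_half(x):
    """Round a Python float to IEEE binary16 (round-to-nearest-even)."""
    try:
        return struct.unpack("=e", struct.pack("=e", x))[0]
    except OverflowError:
        # struct refuses values that round past the largest finite half.
        return math.copysign(math.inf, x)


def to_float(x):
    """Round a Python float to IEEE binary32."""
    return struct.unpack("=f", struct.pack("=f", x))[0]


def half_from_bits(bits):
    """Reinterpret a 16-bit pattern as a half value."""
    return struct.unpack("=e", struct.pack("=H", bits))[0]


rng = random.Random(0)  # fixed seed so the sample is reproducible
mismatches = 0
checked = 0
while checked != 20000:
    a = half_from_bits(rng.randrange(2 ** 16))
    b = half_from_bits(rng.randrange(2 ** 16))
    if not (math.isfinite(a) and math.isfinite(b)):
        continue  # skip NaN and infinity inputs
    checked += 1
    exact_sum = a + b            # exact in double precision
    direct = to_half(exact_sum)  # a single rounding: the true f16 fadd
    # Promoted form: extend both operands, add in f32, truncate to f16.
    promoted = to_half(to_float(to_float(a) + to_float(b)))
    if direct != promoted:
        mismatches += 1

print(mismatches)
```
</pre>
<p class="MsoNormal" style="mso-margin-top-alt:auto;mso-margin-bottom-alt:auto">A random sample is not a proof, but it should report no mismatches for a single fadd in round-to-nearest. It says nothing about other operations or other rounding modes.</p>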
<p class="MsoNormal" style="mso-margin-top-alt:auto;mso-margin-bottom-alt:auto">There aren’t any calling convention implications here for ARM targets; not sure about other targets. On 32-bit ARM, clang explicitly coerces half values to a legal type. And half
is always legal on AArch64 (unless you force soft-float, but at that point we don’t care).<o:p></o:p></p>
<p class="MsoNormal" style="mso-margin-top-alt:auto;mso-margin-bottom-alt:auto"> <o:p></o:p></p>
<p class="MsoNormal" style="mso-margin-top-alt:auto;mso-margin-bottom-alt:auto">-Eli<o:p></o:p></p>
<p class="MsoNormal" style="mso-margin-top-alt:auto;mso-margin-bottom-alt:auto"> <o:p></o:p></p>
<div style="border:none;border-left:solid blue 1.5pt;padding:0in 0in 0in 4.0pt">
<div>
<div style="border:none;border-top:solid #E1E1E1 1.0pt;padding:3.0pt 0in 0in 0in">
<p class="MsoNormal" style="mso-margin-top-alt:auto;mso-margin-bottom-alt:auto"><b>From:</b> Craig Topper &lt;<a href="mailto:craig.topper@gmail.com" target="_blank">craig.topper@gmail.com</a>&gt;
<br>
<b>Sent:</b> Tuesday, December 10, 2019 12:18 PM<br>
<b>To:</b> llvm-dev &lt;<a href="mailto:llvm-dev@lists.llvm.org" target="_blank">llvm-dev@lists.llvm.org</a>&gt;; Eli Friedman &lt;<a href="mailto:efriedma@quicinc.com" target="_blank">efriedma@quicinc.com</a>&gt;; Tim Northover &lt;<a href="mailto:t.p.northover@gmail.com" target="_blank">t.p.northover@gmail.com</a>&gt;<br>
<b>Subject:</b> [EXT] TypePromoteFloat loses intermediate rounding operations<o:p></o:p></p>
</div>
</div>
<p class="MsoNormal" style="mso-margin-top-alt:auto;mso-margin-bottom-alt:auto"> <o:p></o:p></p>
<div>
<div>
<p class="MsoNormal" style="mso-margin-top-alt:auto;mso-margin-bottom-alt:auto">For the following C code:<o:p></o:p></p>
</div>
<div>
<p class="MsoNormal" style="mso-margin-top-alt:auto;mso-margin-bottom-alt:auto"> <o:p></o:p></p>
</div>
<div>
<div>
<div>
<p class="MsoNormal" style="mso-margin-top-alt:auto;mso-margin-bottom-alt:auto;background:#FFFFFE">
<span style="color:black">__fp16 x, y, z, w;</span><o:p></o:p></p>
</div>
<p class="MsoNormal" style="mso-margin-top-alt:auto;mso-margin-bottom-alt:auto;background:#FFFFFE">
<span style="color:black"> </span><o:p></o:p></p>
<div>
<p class="MsoNormal" style="mso-margin-top-alt:auto;mso-margin-bottom-alt:auto;background:#FFFFFE">
<span style="color:blue">void</span><span style="color:black"> foo() {</span><o:p></o:p></p>
</div>
<div>
<p class="MsoNormal" style="mso-margin-top-alt:auto;mso-margin-bottom-alt:auto;background:#FFFFFE">
<span style="color:black">x = y + z;</span><o:p></o:p></p>
</div>
<div>
<p class="MsoNormal" style="mso-margin-top-alt:auto;mso-margin-bottom-alt:auto;background:#FFFFFE">
<span style="color:black">x = x + w;</span><o:p></o:p></p>
</div>
<div>
<p class="MsoNormal" style="mso-margin-top-alt:auto;mso-margin-bottom-alt:auto;background:#FFFFFE">
<span style="color:black">}</span><o:p></o:p></p>
</div>
<div>
<p class="MsoNormal" style="mso-margin-top-alt:auto;mso-margin-bottom-alt:auto;background:#FFFFFE">
<span style="color:black"> </span><o:p></o:p></p>
</div>
<div>
<p class="MsoNormal" style="mso-margin-top-alt:auto;mso-margin-bottom-alt:auto;background:#FFFFFE">
<span style="color:black">clang produces IR that extends each operand to float and truncates each sum back to half before assigning it to x, like this:</span><o:p></o:p></p>
</div>
<div>
<p class="MsoNormal" style="mso-margin-top-alt:auto;mso-margin-bottom-alt:auto;background:#FFFFFE">
<span style="color:black"> </span><o:p></o:p></p>
</div>
<div>
<div>
<div>
<p class="MsoNormal" style="mso-margin-top-alt:auto;mso-margin-bottom-alt:auto;background:#FFFFFE">
<span style="color:blue">define</span><span style="color:black"> </span><span style="color:teal">dso_local</span><span style="color:black">
</span><span style="color:teal">void</span><span style="color:black"> </span><span style="color:teal">@foo</span><span style="color:black">()
</span><span style="color:#09885A">#0</span><span style="color:black"> !</span><span style="color:teal">dbg</span><span style="color:black"> !</span><span style="color:#09885A">18</span><span style="color:black"> {</span><o:p></o:p></p>
</div>
<div>
<p class="MsoNormal" style="mso-margin-top-alt:auto;mso-margin-bottom-alt:auto;background:#FFFFFE">
<span style="color:#CD3131">%1</span><span style="color:black"> </span><span style="color:#CD3131">=</span><span style="color:black">
</span><span style="color:blue">load</span><span style="color:black"> </span><span style="color:teal">half</span><span style="color:black">,
</span><span style="color:teal">half</span><span style="color:black">* </span><span style="color:teal">@y</span><span style="color:black">,
</span><span style="color:teal">align</span><span style="color:black"> </span><span style="color:#09885A">2</span><span style="color:black">, !</span><span style="color:teal">dbg</span><span style="color:black"> !</span><span style="color:#09885A">21</span><o:p></o:p></p>
</div>
<div>
<p class="MsoNormal" style="mso-margin-top-alt:auto;mso-margin-bottom-alt:auto;background:#FFFFFE">
<span style="color:#CD3131">%2</span><span style="color:black"> </span><span style="color:#CD3131">=</span><span style="color:black">
</span><span style="color:blue">fpext</span><span style="color:black"> </span><span style="color:teal">half</span><span style="color:black">
</span><span style="color:#CD3131">%</span><span style="color:#09885A">1</span><span style="color:black">
</span><span style="color:teal">to</span><span style="color:black"> </span><span style="color:teal">float</span><span style="color:black">, !</span><span style="color:teal">dbg</span><span style="color:black"> !</span><span style="color:#09885A">21</span><o:p></o:p></p>
</div>
<div>
<p class="MsoNormal" style="mso-margin-top-alt:auto;mso-margin-bottom-alt:auto;background:#FFFFFE">
<span style="color:#CD3131">%3</span><span style="color:black"> </span><span style="color:#CD3131">=</span><span style="color:black">
</span><span style="color:blue">load</span><span style="color:black"> </span><span style="color:teal">half</span><span style="color:black">,
</span><span style="color:teal">half</span><span style="color:black">* </span><span style="color:teal">@z</span><span style="color:black">,
</span><span style="color:teal">align</span><span style="color:black"> </span><span style="color:#09885A">2</span><span style="color:black">, !</span><span style="color:teal">dbg</span><span style="color:black"> !</span><span style="color:#09885A">22</span><o:p></o:p></p>
</div>
<div>
<p class="MsoNormal" style="mso-margin-top-alt:auto;mso-margin-bottom-alt:auto;background:#FFFFFE">
<span style="color:#CD3131">%4</span><span style="color:black"> </span><span style="color:#CD3131">=</span><span style="color:black">
</span><span style="color:blue">fpext</span><span style="color:black"> </span><span style="color:teal">half</span><span style="color:black">
</span><span style="color:#CD3131">%</span><span style="color:#09885A">3</span><span style="color:black">
</span><span style="color:teal">to</span><span style="color:black"> </span><span style="color:teal">float</span><span style="color:black">, !</span><span style="color:teal">dbg</span><span style="color:black"> !</span><span style="color:#09885A">22</span><o:p></o:p></p>
</div>
<div>
<p class="MsoNormal" style="mso-margin-top-alt:auto;mso-margin-bottom-alt:auto;background:#FFFFFE">
<span style="color:#CD3131">%5</span><span style="color:black"> </span><span style="color:#CD3131">=</span><span style="color:black">
</span><span style="color:blue">fadd</span><span style="color:black"> </span><span style="color:teal">float</span><span style="color:black">
</span><span style="color:#CD3131">%</span><span style="color:#09885A">2</span><span style="color:black">,
</span><span style="color:#CD3131">%</span><span style="color:#09885A">4</span><span style="color:black">, !</span><span style="color:teal">dbg</span><span style="color:black"> !</span><span style="color:#09885A">23</span><o:p></o:p></p>
</div>
<div>
<p class="MsoNormal" style="mso-margin-top-alt:auto;mso-margin-bottom-alt:auto;background:#FFFFFE">
<span style="color:#CD3131">%6</span><span style="color:black"> </span><span style="color:#CD3131">=</span><span style="color:black">
</span><span style="color:blue">fptrunc</span><span style="color:black"> </span><span style="color:teal">float</span><span style="color:black">
</span><span style="color:#CD3131">%</span><span style="color:#09885A">5</span><span style="color:black">
</span><span style="color:teal">to</span><span style="color:black"> </span><span style="color:teal">half</span><span style="color:black">, !</span><span style="color:teal">dbg</span><span style="color:black"> !</span><span style="color:#09885A">21</span><o:p></o:p></p>
</div>
<div>
<p class="MsoNormal" style="mso-margin-top-alt:auto;mso-margin-bottom-alt:auto;background:#FFFFFE">
<span style="color:blue">store</span><span style="color:black"> </span><span style="color:teal">half</span><span style="color:black">
</span><span style="color:#CD3131">%</span><span style="color:#09885A">6</span><span style="color:black">,
</span><span style="color:teal">half</span><span style="color:black">* </span><span style="color:teal">@x</span><span style="color:black">,
</span><span style="color:teal">align</span><span style="color:black"> </span><span style="color:#09885A">2</span><span style="color:black">, !</span><span style="color:teal">dbg</span><span style="color:black"> !</span><span style="color:#09885A">24</span><o:p></o:p></p>
</div>
<div>
<p class="MsoNormal" style="mso-margin-top-alt:auto;mso-margin-bottom-alt:auto;background:#FFFFFE">
<span style="color:#CD3131">%7</span><span style="color:black"> </span><span style="color:#CD3131">=</span><span style="color:black">
</span><span style="color:blue">load</span><span style="color:black"> </span><span style="color:teal">half</span><span style="color:black">,
</span><span style="color:teal">half</span><span style="color:black">* </span><span style="color:teal">@x</span><span style="color:black">,
</span><span style="color:teal">align</span><span style="color:black"> </span><span style="color:#09885A">2</span><span style="color:black">, !</span><span style="color:teal">dbg</span><span style="color:black"> !</span><span style="color:#09885A">25</span><o:p></o:p></p>
</div>
<div>
<p class="MsoNormal" style="mso-margin-top-alt:auto;mso-margin-bottom-alt:auto;background:#FFFFFE">
<span style="color:#CD3131">%8</span><span style="color:black"> </span><span style="color:#CD3131">=</span><span style="color:black">
</span><span style="color:blue">fpext</span><span style="color:black"> </span><span style="color:teal">half</span><span style="color:black">
</span><span style="color:#CD3131">%</span><span style="color:#09885A">7</span><span style="color:black">
</span><span style="color:teal">to</span><span style="color:black"> </span><span style="color:teal">float</span><span style="color:black">, !</span><span style="color:teal">dbg</span><span style="color:black"> !</span><span style="color:#09885A">25</span><o:p></o:p></p>
</div>
<div>
<p class="MsoNormal" style="mso-margin-top-alt:auto;mso-margin-bottom-alt:auto;background:#FFFFFE">
<span style="color:#CD3131">%9</span><span style="color:black"> </span><span style="color:#CD3131">=</span><span style="color:black">
</span><span style="color:blue">load</span><span style="color:black"> </span><span style="color:teal">half</span><span style="color:black">,
</span><span style="color:teal">half</span><span style="color:black">* </span><span style="color:teal">@w</span><span style="color:black">,
</span><span style="color:teal">align</span><span style="color:black"> </span><span style="color:#09885A">2</span><span style="color:black">, !</span><span style="color:teal">dbg</span><span style="color:black"> !</span><span style="color:#09885A">26</span><o:p></o:p></p>
</div>
<div>
<p class="MsoNormal" style="mso-margin-top-alt:auto;mso-margin-bottom-alt:auto;background:#FFFFFE">
<span style="color:#CD3131">%10</span><span style="color:black"> </span><span style="color:#CD3131">=</span><span style="color:black">
</span><span style="color:blue">fpext</span><span style="color:black"> </span><span style="color:teal">half</span><span style="color:black">
</span><span style="color:#CD3131">%</span><span style="color:#09885A">9</span><span style="color:black">
</span><span style="color:teal">to</span><span style="color:black"> </span><span style="color:teal">float</span><span style="color:black">, !</span><span style="color:teal">dbg</span><span style="color:black"> !</span><span style="color:#09885A">26</span><o:p></o:p></p>
</div>
<div>
<p class="MsoNormal" style="mso-margin-top-alt:auto;mso-margin-bottom-alt:auto;background:#FFFFFE">
<span style="color:#CD3131">%11</span><span style="color:black"> </span><span style="color:#CD3131">=</span><span style="color:black">
</span><span style="color:blue">fadd</span><span style="color:black"> </span><span style="color:teal">float</span><span style="color:black">
</span><span style="color:#CD3131">%</span><span style="color:#09885A">8</span><span style="color:black">,
</span><span style="color:#CD3131">%</span><span style="color:#09885A">10</span><span style="color:black">, !</span><span style="color:teal">dbg</span><span style="color:black"> !</span><span style="color:#09885A">27</span><o:p></o:p></p>
</div>
<div>
<p class="MsoNormal" style="mso-margin-top-alt:auto;mso-margin-bottom-alt:auto;background:#FFFFFE">
<span style="color:#CD3131">%12</span><span style="color:black"> </span><span style="color:#CD3131">=</span><span style="color:black">
</span><span style="color:blue">fptrunc</span><span style="color:black"> </span><span style="color:teal">float</span><span style="color:black">
</span><span style="color:#CD3131">%</span><span style="color:#09885A">11</span><span style="color:black">
</span><span style="color:teal">to</span><span style="color:black"> </span><span style="color:teal">half</span><span style="color:black">, !</span><span style="color:teal">dbg</span><span style="color:black"> !</span><span style="color:#09885A">25</span><o:p></o:p></p>
</div>
<div>
<p class="MsoNormal" style="mso-margin-top-alt:auto;mso-margin-bottom-alt:auto;background:#FFFFFE">
<span style="color:blue">store</span><span style="color:black"> </span><span style="color:teal">half</span><span style="color:black">
</span><span style="color:#CD3131">%</span><span style="color:#09885A">12</span><span style="color:black">,
</span><span style="color:teal">half</span><span style="color:black">* </span><span style="color:teal">@x</span><span style="color:black">,
</span><span style="color:teal">align</span><span style="color:black"> </span><span style="color:#09885A">2</span><span style="color:black">, !</span><span style="color:teal">dbg</span><span style="color:black"> !</span><span style="color:#09885A">28</span><o:p></o:p></p>
</div>
<div>
<p class="MsoNormal" style="mso-margin-top-alt:auto;mso-margin-bottom-alt:auto;background:#FFFFFE">
<span style="color:blue">ret</span><span style="color:black"> </span><span style="color:teal">void</span><span style="color:black">, !</span><span style="color:teal">dbg</span><span style="color:black"> !</span><span style="color:#09885A">29</span><o:p></o:p></p>
</div>
<div>
<p class="MsoNormal" style="mso-margin-top-alt:auto;mso-margin-bottom-alt:auto;background:#FFFFFE">
<span style="color:black">}</span><o:p></o:p></p>
</div>
</div>
</div>
<div>
<p class="MsoNormal" style="mso-margin-top-alt:auto;mso-margin-bottom-alt:auto;background:#FFFFFE">
<span style="color:black"> </span><o:p></o:p></p>
</div>
<div>
<p class="MsoNormal" style="mso-margin-top-alt:auto;mso-margin-bottom-alt:auto;background:#FFFFFE">
<span style="color:black">InstCombine then comes along and gets rid of all of the fpext and fptrunc instructions, leaving:</span><o:p></o:p></p>
</div>
<div>
<p class="MsoNormal" style="mso-margin-top-alt:auto;mso-margin-bottom-alt:auto;background:#FFFFFE">
<span style="color:black"> </span><o:p></o:p></p>
</div>
<div>
<div>
<div>
<p class="MsoNormal" style="mso-margin-top-alt:auto;mso-margin-bottom-alt:auto;background:#FFFFFE">
<span style="color:blue">define</span><span style="color:black"> </span><span style="color:teal">dso_local</span><span style="color:black">
</span><span style="color:teal">void</span><span style="color:black"> </span><span style="color:teal">@foo</span><span style="color:black">()
</span><span style="color:teal">local_unnamed_addr</span><span style="color:black">
</span><span style="color:#09885A">#0</span><span style="color:black"> !</span><span style="color:teal">dbg</span><span style="color:black"> !</span><span style="color:#09885A">18</span><span style="color:black"> {</span><o:p></o:p></p>
</div>
<div>
<p class="MsoNormal" style="mso-margin-top-alt:auto;mso-margin-bottom-alt:auto;background:#FFFFFE">
<span style="color:#CD3131">%1</span><span style="color:black"> </span><span style="color:#CD3131">=</span><span style="color:black">
</span><span style="color:blue">load</span><span style="color:black"> </span><span style="color:teal">half</span><span style="color:black">,
</span><span style="color:teal">half</span><span style="color:black">* </span><span style="color:teal">@y</span><span style="color:black">,
</span><span style="color:teal">align</span><span style="color:black"> </span><span style="color:#09885A">2</span><span style="color:black">, !</span><span style="color:teal">dbg</span><span style="color:black"> !</span><span style="color:#09885A">21</span><span style="color:black">,
!</span><span style="color:teal">tbaa</span><span style="color:black"> !</span><span style="color:#09885A">22</span><o:p></o:p></p>
</div>
<div>
<p class="MsoNormal" style="mso-margin-top-alt:auto;mso-margin-bottom-alt:auto;background:#FFFFFE">
<span style="color:#CD3131">%2</span><span style="color:black"> </span><span style="color:#CD3131">=</span><span style="color:black">
</span><span style="color:blue">load</span><span style="color:black"> </span><span style="color:teal">half</span><span style="color:black">,
</span><span style="color:teal">half</span><span style="color:black">* </span><span style="color:teal">@z</span><span style="color:black">,
</span><span style="color:teal">align</span><span style="color:black"> </span><span style="color:#09885A">2</span><span style="color:black">, !</span><span style="color:teal">dbg</span><span style="color:black"> !</span><span style="color:#09885A">26</span><span style="color:black">,
!</span><span style="color:teal">tbaa</span><span style="color:black"> !</span><span style="color:#09885A">22</span><o:p></o:p></p>
</div>
<div>
<p class="MsoNormal" style="mso-margin-top-alt:auto;mso-margin-bottom-alt:auto;background:#FFFFFE">
<span style="color:#CD3131">%3</span><span style="color:black"> </span><span style="color:#CD3131">=</span><span style="color:black">
</span><span style="color:blue">fadd</span><span style="color:black"> </span><span style="color:teal">half</span><span style="color:black">
</span><span style="color:#CD3131">%</span><span style="color:#09885A">1</span><span style="color:black">,
</span><span style="color:#CD3131">%</span><span style="color:#09885A">2</span><span style="color:black">, !</span><span style="color:teal">dbg</span><span style="color:black"> !</span><span style="color:#09885A">21</span><o:p></o:p></p>
</div>
<div>
<p class="MsoNormal" style="mso-margin-top-alt:auto;mso-margin-bottom-alt:auto;background:#FFFFFE">
<span style="color:#CD3131">%4</span><span style="color:black"> </span><span style="color:#CD3131">=</span><span style="color:black">
</span><span style="color:blue">load</span><span style="color:black"> </span><span style="color:teal">half</span><span style="color:black">,
</span><span style="color:teal">half</span><span style="color:black">* </span><span style="color:teal">@w</span><span style="color:black">,
</span><span style="color:teal">align</span><span style="color:black"> </span><span style="color:#09885A">2</span><span style="color:black">, !</span><span style="color:teal">dbg</span><span style="color:black"> !</span><span style="color:#09885A">27</span><span style="color:black">,
!</span><span style="color:teal">tbaa</span><span style="color:black"> !</span><span style="color:#09885A">22</span><o:p></o:p></p>
</div>
<div>
<p class="MsoNormal" style="mso-margin-top-alt:auto;mso-margin-bottom-alt:auto;background:#FFFFFE">
<span style="color:#CD3131">%5</span><span style="color:black"> </span><span style="color:#CD3131">=</span><span style="color:black">
</span><span style="color:blue">fadd</span><span style="color:black"> </span><span style="color:teal">half</span><span style="color:black">
</span><span style="color:#CD3131">%</span><span style="color:#09885A">3</span><span style="color:black">,
</span><span style="color:#CD3131">%</span><span style="color:#09885A">4</span><span style="color:black">, !</span><span style="color:teal">dbg</span><span style="color:black"> !</span><span style="color:#09885A">28</span><o:p></o:p></p>
</div>
<div>
<p class="MsoNormal" style="mso-margin-top-alt:auto;mso-margin-bottom-alt:auto;background:#FFFFFE">
<span style="color:blue">store</span><span style="color:black"> </span><span style="color:teal">half</span><span style="color:black">
</span><span style="color:#CD3131">%</span><span style="color:#09885A">5</span><span style="color:black">,
</span><span style="color:teal">half</span><span style="color:black">* </span><span style="color:teal">@x</span><span style="color:black">,
</span><span style="color:teal">align</span><span style="color:black"> </span><span style="color:#09885A">2</span><span style="color:black">, !</span><span style="color:teal">dbg</span><span style="color:black"> !</span><span style="color:#09885A">29</span><span style="color:black">,
!</span><span style="color:teal">tbaa</span><span style="color:black"> !</span><span style="color:#09885A">22</span><o:p></o:p></p>
</div>
<div>
<p class="MsoNormal" style="mso-margin-top-alt:auto;mso-margin-bottom-alt:auto;background:#FFFFFE">
<span style="color:blue">ret</span><span style="color:black"> </span><span style="color:teal">void</span><span style="color:black">, !</span><span style="color:teal">dbg</span><span style="color:black"> !</span><span style="color:#09885A">30</span><o:p></o:p></p>
</div>
<div>
<p class="MsoNormal" style="mso-margin-top-alt:auto;mso-margin-bottom-alt:auto;background:#FFFFFE">
<span style="color:black">}</span><o:p></o:p></p>
</div>
</div>
</div>
<div>
<p class="MsoNormal" style="mso-margin-top-alt:auto;mso-margin-bottom-alt:auto;background:#FFFFFE">
<span style="color:black"> </span><o:p></o:p></p>
</div>
<div>
<p class="MsoNormal" style="mso-margin-top-alt:auto;mso-margin-bottom-alt:auto;background:#FFFFFE">
<span style="color:black"> </span><o:p></o:p></p>
</div>
<div>
<p class="MsoNormal" style="mso-margin-top-alt:auto;mso-margin-bottom-alt:auto;background:#FFFFFE">
<span style="color:black">Then SelectionDAG type legalization comes along and creates this as the final assembly:</span><o:p></o:p></p>
</div>
<div>
<p class="MsoNormal" style="mso-margin-top-alt:auto;mso-margin-bottom-alt:auto;background:#FFFFFE">
<span style="color:black"> </span><o:p></o:p></p>
</div>
<div>
<div>
<div>
<p class="MsoNormal" style="mso-margin-top-alt:auto;mso-margin-bottom-alt:auto;background:#FFFFFE">
<span style="color:blue">pushq</span><span style="color:black"> </span><span style="color:#4864AA">%rax</span><o:p></o:p></p>
</div>
<div>
<p class="MsoNormal" style="mso-margin-top-alt:auto;mso-margin-bottom-alt:auto;background:#FFFFFE">
<span style="color:blue">.cfi_def_cfa_offset</span><span style="color:black"> </span>
<span style="color:#09885A">16</span><o:p></o:p></p>
</div>
<div>
<p class="MsoNormal" style="mso-margin-top-alt:auto;mso-margin-bottom-alt:auto;background:#FFFFFE">
<span style="color:blue">movzwl</span><span style="color:black"> </span><span style="color:teal">y</span><span style="color:black">(</span><span style="color:#4864AA">%rip</span><span style="color:black">),
</span><span style="color:#4864AA">%edi</span><o:p></o:p></p>
</div>
<div>
<p class="MsoNormal" style="mso-margin-top-alt:auto;mso-margin-bottom-alt:auto;background:#FFFFFE">
<span style="color:blue">callq</span><span style="color:black"> </span><span style="color:teal">__gnu_h2f_ieee</span><o:p></o:p></p>
</div>
<div>
<p class="MsoNormal" style="mso-margin-top-alt:auto;mso-margin-bottom-alt:auto;background:#FFFFFE">
<span style="color:blue">movss</span><span style="color:black"> </span><span style="color:#4864AA">%xmm0</span><span style="color:black">,
</span><span style="color:#09885A">4</span><span style="color:black">(</span><span style="color:#4864AA">%rsp</span><span style="color:black">)
</span><span style="color:green"># 4-byte Spill</span><o:p></o:p></p>
</div>
<div>
<p class="MsoNormal" style="mso-margin-top-alt:auto;mso-margin-bottom-alt:auto;background:#FFFFFE">
<span style="color:blue">movzwl</span><span style="color:black"> </span><span style="color:teal">z</span><span style="color:black">(</span><span style="color:#4864AA">%rip</span><span style="color:black">),
</span><span style="color:#4864AA">%edi</span><o:p></o:p></p>
</div>
<div>
<p class="MsoNormal" style="mso-margin-top-alt:auto;mso-margin-bottom-alt:auto;background:#FFFFFE">
<span style="color:blue">callq</span><span style="color:black"> </span><span style="color:teal">__gnu_h2f_ieee</span><o:p></o:p></p>
</div>
<div>
<p class="MsoNormal" style="mso-margin-top-alt:auto;mso-margin-bottom-alt:auto;background:#FFFFFE">
<span style="color:blue">addss</span><span style="color:black"> </span><span style="color:#09885A">4</span><span style="color:black">(</span><span style="color:#4864AA">%rsp</span><span style="color:black">),
</span><span style="color:#4864AA">%xmm0</span><span style="color:black"> </span>
<span style="color:green"># 4-byte Folded Reload</span><o:p></o:p></p>
</div>
<div>
<p class="MsoNormal" style="mso-margin-top-alt:auto;mso-margin-bottom-alt:auto;background:#FFFFFE">
<span style="color:blue">movss</span><span style="color:black"> </span><span style="color:#4864AA">%xmm0</span><span style="color:black">,
</span><span style="color:#09885A">4</span><span style="color:black">(</span><span style="color:#4864AA">%rsp</span><span style="color:black">)
</span><span style="color:green"># 4-byte Spill</span><o:p></o:p></p>
</div>
<div>
<p class="MsoNormal" style="mso-margin-top-alt:auto;mso-margin-bottom-alt:auto;background:#FFFFFE">
<span style="color:blue">movzwl</span><span style="color:black"> </span><span style="color:teal">w</span><span style="color:black">(</span><span style="color:#4864AA">%rip</span><span style="color:black">),
</span><span style="color:#4864AA">%edi</span><o:p></o:p></p>
</div>
<div>
<p class="MsoNormal" style="mso-margin-top-alt:auto;mso-margin-bottom-alt:auto;background:#FFFFFE">
<span style="color:blue">callq</span><span style="color:black"> </span><span style="color:teal">__gnu_h2f_ieee</span><o:p></o:p></p>
</div>
<div>
<p class="MsoNormal" style="mso-margin-top-alt:auto;mso-margin-bottom-alt:auto;background:#FFFFFE">
<span style="color:blue">addss</span><span style="color:black"> </span><span style="color:#09885A">4</span><span style="color:black">(</span><span style="color:#4864AA">%rsp</span><span style="color:black">),
</span><span style="color:#4864AA">%xmm0</span><span style="color:black"> </span>
<span style="color:green"># 4-byte Folded Reload</span><o:p></o:p></p>
</div>
<div>
<p class="MsoNormal" style="mso-margin-top-alt:auto;mso-margin-bottom-alt:auto;background:#FFFFFE">
<span style="color:blue">callq</span><span style="color:black"> </span><span style="color:teal">__gnu_f2h_ieee</span><o:p></o:p></p>
</div>
<div>
<p class="MsoNormal" style="mso-margin-top-alt:auto;mso-margin-bottom-alt:auto;background:#FFFFFE">
<span style="color:blue">movw</span><span style="color:black"> </span><span style="color:#4864AA">%ax</span><span style="color:black">,
</span><span style="color:teal">x</span><span style="color:black">(</span><span style="color:#4864AA">%rip</span><span style="color:black">)</span><o:p></o:p></p>
</div>
<div>
<p class="MsoNormal" style="mso-margin-top-alt:auto;mso-margin-bottom-alt:auto;background:#FFFFFE">
<span style="color:blue">popq</span><span style="color:black"> </span><span style="color:#4864AA">%rax</span><o:p></o:p></p>
</div>
</div>
</div>
<div>
<p class="MsoNormal" style="mso-margin-top-alt:auto;mso-margin-bottom-alt:auto;background:#FFFFFE">
<span style="color:black"> </span><o:p></o:p></p>
</div>
<div>
<p class="MsoNormal" style="mso-margin-top-alt:auto;mso-margin-bottom-alt:auto;background:#FFFFFE">
<span style="color:black"> </span><o:p></o:p></p>
</div>
<div>
<p class="MsoNormal" style="mso-margin-top-alt:auto;mso-margin-bottom-alt:auto;background:#FFFFFE">
<span style="color:black">I expected SelectionDAG to produce something equivalent to the original clang code, with 4 total extends to f32 and 2 truncates. Instead we got 3 extends and 1 truncate, so we lost the intermediate rounding between the 2 adds that
was in the original clang IR.</span><o:p></o:p></p>
</div>
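<div>
<p class="MsoNormal" style="mso-margin-top-alt:auto;mso-margin-bottom-alt:auto;background:#FFFFFE">
<span style="color:black">As a minimal sketch of why the missing truncate matters, the Python snippet below models both sequences with the standard struct module's binary16 and binary32 formats. The function names and the input values are illustrative assumptions, not taken from the test case above.</span><o:p></o:p></p>
</div>

```python
import struct


def round_f16(value):
    """Round to the nearest IEEE binary16 value (ties to even).

    Assumes the value is in half range; Python raises OverflowError otherwise.
    """
    return struct.unpack("e", struct.pack("e", value))[0]


def round_f32(value):
    """Round a Python float (binary64) to the nearest binary32 value."""
    return struct.unpack("f", struct.pack("f", value))[0]


def add_with_intermediate_rounding(a, b, c):
    # Models 4 extends and 2 truncates: each half add is
    # extend, f32 add, truncate back to half.
    first = round_f16(round_f32(a + b))
    return round_f16(round_f32(first + c))


def add_without_intermediate_rounding(a, b, c):
    # Models 3 extends and 1 truncate: the first sum stays in f32
    # and only the final result is truncated to half.
    wide = round_f32(round_f32(a + b) + c)
    return round_f16(wide)


# Hypothetical inputs, all exactly representable in half precision.
a, b, c = 2048.0, 1.0, 1.0
print(add_with_intermediate_rounding(a, b, c))     # 2048.0
print(add_without_intermediate_rounding(a, b, c))  # 2050.0
```

<div>
<p class="MsoNormal" style="mso-margin-top-alt:auto;mso-margin-bottom-alt:auto;background:#FFFFFE">
<span style="color:black">With these inputs the two sequences disagree: 2049 is a tie between 2048 and 2050 in half precision and rounds to even at each intermediate step, while the unrounded f32 sum reaches 2050 exactly.</span><o:p></o:p></p>
</div>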
<div>
<p class="MsoNormal" style="mso-margin-top-alt:auto;mso-margin-bottom-alt:auto;background:#FFFFFE">
<span style="color:black"> </span><o:p></o:p></p>
</div>
<div>
<p class="MsoNormal" style="mso-margin-top-alt:auto;mso-margin-bottom-alt:auto;background:#FFFFFE">
<span style="color:black">I believe this occurs because the TypePromoteFloat legalization converts all arithmetic operations to their f32 equivalents but does not place conversions to/from half around them. Instead, fp_to_fp16 and fp16_to_fp nodes are only generated
at loads, stores, bitcasts, and probably a few other places; basically only the places where the 16-bit size is needed to make the operation possible. What we have is an implementation very similar to integer promotion, but that doesn't work for
FP because we lose the intermediate rounding.</span><o:p></o:p></p>
</div>
<div>
<p class="MsoNormal" style="mso-margin-top-alt:auto;mso-margin-bottom-alt:auto;background:#FFFFFE">
<span style="color:black"> </span><o:p></o:p></p>
</div>
<div>
<p class="MsoNormal" style="mso-margin-top-alt:auto;mso-margin-bottom-alt:auto;background:#FFFFFE">
<span style="color:black">It seems like what we should do instead is insert fp16_to_fp and fp_to_fp16 in the libcall and arithmetic op handling, and use i16 to connect the legalized pieces together, similar to how we use integer types when softening operations.
I'm not sure if there would still be rounding issues with this, but it seems closer to matching the IR.</span><o:p></o:p></p>
</div>
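<div>
<p class="MsoNormal" style="mso-margin-top-alt:auto;mso-margin-bottom-alt:auto;background:#FFFFFE">
<span style="color:black">A rough model of that scheme, assuming each promoted add extends its i16 operands, adds in f32, and truncates back to an i16 bit pattern. The Python helper names below merely mirror the node names and are hypothetical; this is a sketch of the data flow, not LLVM code.</span><o:p></o:p></p>
</div>

```python
import struct


def fp16_to_fp(bits):
    """Reinterpret a 16-bit integer pattern as binary16 and extend it."""
    return struct.unpack("e", struct.pack("H", bits))[0]


def fp_to_fp16(value):
    """Round to binary16 (ties to even) and return the 16-bit pattern.

    Assumes the value is in half range; Python raises OverflowError otherwise.
    """
    return struct.unpack("H", struct.pack("e", value))[0]


def round_f32(value):
    """Round a Python float (binary64) to the nearest binary32 value."""
    return struct.unpack("f", struct.pack("f", value))[0]


def soft_promoted_fadd(lhs_bits, rhs_bits):
    # Hypothetical per-operation legalization: extend both i16 operands,
    # add in f32, then truncate so the result is again an i16 pattern.
    wide = round_f32(fp16_to_fp(lhs_bits) + fp16_to_fp(rhs_bits))
    return fp_to_fp16(wide)


# Values between operations are plain integers, so every add rounds to half.
a_bits = fp_to_fp16(2048.0)  # 0x6800
one_bits = fp_to_fp16(1.0)   # 0x3C00
first = soft_promoted_fadd(a_bits, one_bits)
second = soft_promoted_fadd(first, one_bits)
print(hex(second), fp16_to_fp(second))  # 0x6800 2048.0
```

<div>
<p class="MsoNormal" style="mso-margin-top-alt:auto;mso-margin-bottom-alt:auto;background:#FFFFFE">
<span style="color:black">Because only i16 patterns flow between operations in this model, there is no wider value left for a later step to reuse, which is the property the per-operation conversions are meant to provide.</span><o:p></o:p></p>
</div>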
<div>
<p class="MsoNormal" style="mso-margin-top-alt:auto;mso-margin-bottom-alt:auto;background:#FFFFFE">
<span style="color:black"> </span><o:p></o:p></p>
</div>
<div>
<p class="MsoNormal" style="mso-margin-top-alt:auto;mso-margin-bottom-alt:auto;background:#FFFFFE">
<span style="color:black">Unfortunately, I think this would have the side effect of changing half arguments and return types to i16 instead of float, which would be an ABI change. At least on some targets, __fp16 can't be used as an argument or return type, so
maybe that won't be a real problem?</span><o:p></o:p></p>
</div>
<div>
<p class="MsoNormal" style="mso-margin-top-alt:auto;mso-margin-bottom-alt:auto;background:#FFFFFE">
<span style="color:black"> </span><o:p></o:p></p>
</div>
<div>
<p class="MsoNormal" style="mso-margin-top-alt:auto;mso-margin-bottom-alt:auto;background:#FFFFFE">
<span style="color:black">Anyone else have any thoughts on this?</span><o:p></o:p></p>
</div>
<div>
<p class="MsoNormal" style="mso-margin-top-alt:auto;mso-margin-bottom-alt:auto;background:#FFFFFE">
<span style="color:black"> </span><o:p></o:p></p>
</div>
<div>
<p class="MsoNormal" style="mso-margin-top-alt:auto;mso-margin-bottom-alt:auto;background:#FFFFFE">
<span style="color:#222222;background:white">~Craig</span><o:p></o:p></p>
</div>
</div>
</div>
</div>
</div>
</div>
</div>
</blockquote>
</div>
</div>
</div>
</body>
</html>