<html><body>
<p><font size="2" face="sans-serif">Hi Olivier,</font><br>
<font size="2" face="sans-serif"> I think we discussed this last Thursday? My feeling is that each use of the multiply can be considered separately: if a use can be combined with the multiply, we should combine it. The multiply itself should be left in place and removed by a later dead code elimination pass once nothing uses it. This is what TOBEY does. If you want me to explain the XL method in more detail, come talk to me.</font><br>
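<br>
<font size="2" face="sans-serif"> A minimal sketch of that per-use approach, on a toy instruction list rather than real LLVM or TOBEY data structures. The names Op, Inst, Program, fuseEachUse and findDeadCode are all hypothetical, and Fused stands in for whichever fused multiply-add variant the target would pick (the sign handling needed for fsub is not modelled):</font><br>
<br>

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Toy IR (hypothetical, not an LLVM API): operands are indices into the program.
enum class Op { Input, FMul, FAdd, FSub, Fused, Store };

struct Inst {
    Op op;
    std::vector<std::size_t> operands;
};

using Program = std::vector<Inst>;

// Step 1: look at each fadd/fsub on its own. If one of its operands is an
// fmul, rewrite that user to read the multiply's operands directly.
// The multiply is never deleted here, whatever its other users are.
void fuseEachUse(Program& p) {
    for (Inst& user : p) {
        if (user.op != Op::FAdd && user.op != Op::FSub) continue;
        assert(user.operands.size() == 2);
        for (std::size_t i = 0; i < 2; ++i) {
            const Inst& def = p[user.operands[i]];
            if (def.op != Op::FMul) continue;
            std::size_t a = def.operands[0];
            std::size_t b = def.operands[1];
            std::size_t addend = user.operands[1 - i];
            user.operands = {a, b, addend};
            user.op = Op::Fused;
            break;
        }
    }
}

// Step 2: a separate, later pass marks every instruction whose result is no
// longer read by any live instruction. Inputs and stores are always kept.
std::vector<bool> findDeadCode(const Program& p) {
    std::vector<bool> dead(p.size(), false);
    bool changed = true;
    while (changed) {
        changed = false;
        for (std::size_t i = 0; i < p.size(); ++i) {
            if (dead[i] || p[i].op == Op::Input || p[i].op == Op::Store) continue;
            bool used = false;
            for (std::size_t j = 0; j < p.size() && !used; ++j) {
                if (dead[j]) continue;
                for (std::size_t operand : p[j].operands)
                    if (operand == i) used = true;
            }
            if (!used) { dead[i] = true; changed = true; }
        }
    }
    return dead;
}
```

<br>
<font size="2" face="sans-serif"> If every user of the multiply gets fused, the second pass finds the multiply dead; if some other user remains, the multiply simply stays and only the fusable users change.</font><br>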
<br>
<font size="2" face="sans-serif"> Kevin</font><br>
<font size="2" face="sans-serif">----------------------------------------------<br>
Kevin O'Brien<br>
Manager, Advanced Compiler Technology<br>
IBM T.J. Watson Research Center, Yorktown Heights, NY</font><br>
<br>
<table width="100%" border="0" cellspacing="0" cellpadding="0">
<tr valign="top"><td width="1%">
<ul style="padding-left: 4pt"><font size="1" color="#5F5F5F" face="sans-serif">From:</font></ul>
</td><td width="100%">
<font size="1" face="sans-serif">Olivier H Sallenave/Watson/IBM</font></td></tr>
<tr valign="top"><td width="1%">
<ul style="padding-left: 4pt"><font size="1" color="#5F5F5F" face="sans-serif">To:</font></ul>
</td><td width="100%">
<font size="1" face="sans-serif">llvmdev@cs.uiuc.edu</font></td></tr>
<tr valign="top"><td width="1%">
<ul style="padding-left: 4pt"><font size="1" color="#5F5F5F" face="sans-serif">Cc:</font></ul>
</td><td width="100%" valign="middle">
<font size="1" face="sans-serif">Samuel F Antao/Watson/IBM@IBMUS, Kevin K O'Brien/Watson/IBM@IBMUS</font></td></tr>
<tr valign="top"><td width="1%">
<ul style="padding-left: 4pt"><font size="1" color="#5F5F5F" face="sans-serif">Date:</font></ul>
</td><td width="100%">
<font size="1" face="sans-serif">08/26/2014 11:12 AM</font></td></tr>
<tr valign="top"><td width="1%">
<ul style="padding-left: 4pt"><font size="1" color="#5F5F5F" face="sans-serif">Subject:</font></ul>
</td><td width="100%">
<font size="1" face="sans-serif">Multiply-add combining</font></td></tr>
</table>
<hr width="100%" size="2" align="left" noshade style="color:#8091A5; "><br>
<br>
<font size="2" face="sans-serif">Hi,</font><br>
<br>
<font size="2" face="sans-serif">I tried to compile the following using -ffp-contract=fast:</font><br>
<br>
<font size="2" face="sans-serif"> %mul = fmul double %sub5, %x</font><br>
<font size="2" face="sans-serif"> %add = fadd double %add6, %mul</font><br>
<font size="2" face="sans-serif"> %sub = fsub double %sub5, %mul</font><br>
<br>
<font size="2" face="sans-serif">I expected fadd and fsub to be contracted with fmul, which didn't happen.</font><br>
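<br>
<font size="2" face="sans-serif">For illustration, here is a small standalone sketch of what the contracted form would compute, using std::fma from the C++ standard library. The struct and function names are hypothetical, and writing the fsub as fma(-sub5, x, sub5) is one possible way to express it, not necessarily what a given backend emits:</font><br>
<br>

```cpp
#include <cmath>

struct Results {
    double add;
    double sub;
};

// Uncontracted: mirrors the three IR instructions one for one.
Results uncontracted(double sub5, double x, double add6) {
    double mul = sub5 * x;   // %mul = fmul double %sub5, %x
    return {add6 + mul,      // %add = fadd double %add6, %mul
            sub5 - mul};     // %sub = fsub double %sub5, %mul
}

// Contracted: each use of the product becomes its own fused operation,
// so the product is rounded once together with the add or subtract.
Results contracted(double sub5, double x, double add6) {
    return {std::fma(sub5, x, add6),    // add6 + sub5 * x
            std::fma(-sub5, x, sub5)};  // sub5 - sub5 * x
}
```

<br>
<font size="2" face="sans-serif">The two versions agree whenever the product is exactly representable; in general the fused results may differ in the last bit because the intermediate product is not rounded.</font><br>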
<br>
<font size="2" face="sans-serif">When looking in DAGCombiner.cpp, it appears the result of the fmul needs to be used only once, which isn't the case here as it is used by both the fadd and the fsub:</font><br>
<br>
<font size="2" face="sans-serif"> // fold (fadd (fmul x, y), z) -> (fma x, y, z)</font><br>
<font size="2" face="sans-serif"> if (N0.getOpcode() == ISD::FMUL && N0.hasOneUse())</font><br>
<font size="2" face="sans-serif"> return DAG.getNode(ISD::FMA, SDLoc(N), VT, N0.getOperand(0),</font><br>
<font size="2" face="sans-serif"> N0.getOperand(1), N1);</font><br>
<br>
<font size="2" face="sans-serif">This heuristic looks a little conservative. Could we instead check that every instruction using the result of the fmul is combinable (i.e., that each one is either an fadd or an fsub)?</font><br>
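<br>
<font size="2" face="sans-serif">Roughly, the check I have in mind could look like the sketch below. It uses a toy node type, not the real SDNode interface: Op, Node and allUsesCombinable are hypothetical names, and the real code would walk the uses of N0 in the DAG instead.</font><br>
<br>

```cpp
#include <cassert>
#include <vector>

// Toy stand-in for a DAG node (hypothetical, not the LLVM SDNode class).
enum class Op { FMul, FAdd, FSub, Other };

struct Node {
    Op op;
    std::vector<const Node*> users;  // nodes that consume this node's result
};

// Returns true when the node is an fmul with at least one user and every
// user is an fadd or fsub, i.e. every use could be folded into a fused op.
bool allUsesCombinable(const Node& mul) {
    if (mul.op != Op::FMul || mul.users.empty()) return false;
    for (const Node* user : mul.users) {
        assert(user != nullptr);
        if (user->op != Op::FAdd && user->op != Op::FSub) return false;
    }
    return true;
}
```

<br>
<font size="2" face="sans-serif">With the example above, the fmul has two users (the fadd and the fsub), so this check would pass where hasOneUse() fails.</font><br>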
<br>
<br>
<font size="2" face="sans-serif">Thanks in advance,</font><br>
<font size="2" face="sans-serif">Olivier</font><br>
<br>
</body></html>