<html>
<head>
<base href="https://bugs.llvm.org/">
</head>
<body><table border="1" cellspacing="0" cellpadding="8">
<tr>
<th>Bug ID</th>
<td><a class="bz_bug_link
bz_status_NEW "
title="NEW - narrow truncated FP math with 'reassoc' fast-math-flag FMF"
href="https://bugs.llvm.org/show_bug.cgi?id=43847">43847</a>
</td>
</tr>
<tr>
<th>Summary</th>
<td>narrow truncated FP math with 'reassoc' fast-math-flag FMF
</td>
</tr>
<tr>
<th>Product</th>
<td>libraries
</td>
</tr>
<tr>
<th>Version</th>
<td>trunk
</td>
</tr>
<tr>
<th>Hardware</th>
<td>PC
</td>
</tr>
<tr>
<th>OS</th>
<td>All
</td>
</tr>
<tr>
<th>Status</th>
<td>NEW
</td>
</tr>
<tr>
<th>Severity</th>
<td>enhancement
</td>
</tr>
<tr>
<th>Priority</th>
<td>P
</td>
</tr>
<tr>
<th>Component</th>
<td>Scalar Optimizations
</td>
</tr>
<tr>
<th>Assignee</th>
<td>unassignedbugs@nondot.org
</td>
</tr>
<tr>
<th>Reporter</th>
<td>spatel+llvm@rotateright.com
</td>
</tr>
<tr>
<th>CC</th>
<td>llvm-bugs@lists.llvm.org
</td>
</tr></table>
<p>
<div>
<pre>Forking this off from the non-FMF example in <a class="bz_bug_link
bz_status_NEW "
title="NEW - narrow casted FMA op?"
href="show_bug.cgi?id=43841">bug 43841</a> - it's the same source
code:
float fmaf(float x, float y, float z) {
  return (((double)x * (double)y) + (double)z);
}
But let's compile with a flag that gives us wide license to rearrange
floating-point math:
$ clang -O2 fma2.c -S -o - -emit-llvm -funsafe-math-optimizations
define float @fmaf(float %x, float %y, float %z) {
  %conv = fpext float %x to double
  %conv1 = fpext float %y to double
  %mul = fmul reassoc nsz arcp double %conv, %conv1
  %conv2 = fpext float %z to double
  %add = fadd reassoc nsz arcp double %mul, %conv2
  %conv3 = fptrunc double %add to float
  ret float %conv3
}
------------------------------------------------------------------------------
We should be able to remove all of the cast ops (fptrunc/fpext) in this IR and
narrow the math ops (fmul/fadd) to float type. Related transforms are already
implemented in InstCombine.
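To illustrate why this narrowing needs FMF license instead of being a free
simplification, here is a small C sketch (not part of the original report).
The names fma_wide and fma_narrow are hypothetical. fma_narrow only
approximates what the narrowed IR would compute, and its volatile temporary is
an assumption of the sketch, used to keep the compiler from fusing the
multiply and the add:

```c
/* Source from the report: widen to double, multiply, add, truncate to float. */
float fma_wide(float x, float y, float z) {
  return (float)(((double)x * (double)y) + (double)z);
}

/* Approximation of the narrowed form: all math stays in float.
   The volatile temporary is an assumption of this sketch: it forces the
   product to be rounded to float before the add, so the compiler cannot
   contract the two operations into a single fused multiply-add. */
float fma_narrow(float x, float y, float z) {
  volatile float prod = x * y;
  return prod + z;
}
```

Assuming IEEE-754 arithmetic with the default round-to-nearest-even mode, the
inputs x = y = 0x1.001p0f and z = -1.0f should give 0x1.0008p-11f from fma_wide
but 0x1p-11f from fma_narrow, because the float product rounds away its lowest
bit. The two forms are not bit-identical, so the fold has to be gated on FMF.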
Notes:
1. "-funsafe-math-optimizations" translates to "reassoc nsz arcp" in IR. Only
"reassoc" matters for this example, but clang does not seem to have a flag
wired up that enables it on its own.
2. We're moving to a fast-math-flags model where all FP values can carry FMF.
But this example shows that we either cannot or do not yet apply the flags to
the cast instructions.
3. Ideally, we will implement #2 to allow pattern-matching from the trailing
"fptrunc", but even without that, we could allow this fold:
fmul reassoc (fpext X), (fpext Y) --> fpext (fmul X, Y)</pre>
</div>
</p>
</body>
</html>