<html>
<head>
<base href="https://llvm.org/bugs/" />
</head>
<body><table border="1" cellspacing="0" cellpadding="8">
<tr>
<th>Bug ID</th>
<td><a class="bz_bug_link
bz_status_NEW "
title="NEW --- - [PPC] Inefficient code for floating point comparison"
href="https://llvm.org/bugs/show_bug.cgi?id=30701">30701</a>
</td>
</tr>
<tr>
<th>Summary</th>
<td>[PPC] Inefficient code for floating point comparison
</td>
</tr>
<tr>
<th>Product</th>
<td>libraries
</td>
</tr>
<tr>
<th>Version</th>
<td>trunk
</td>
</tr>
<tr>
<th>Hardware</th>
<td>PC
</td>
</tr>
<tr>
<th>OS</th>
<td>Windows NT
</td>
</tr>
<tr>
<th>Status</th>
<td>NEW
</td>
</tr>
<tr>
<th>Severity</th>
<td>normal
</td>
</tr>
<tr>
<th>Priority</th>
<td>P
</td>
</tr>
<tr>
<th>Component</th>
<td>Backend: PowerPC
</td>
</tr>
<tr>
<th>Assignee</th>
<td>unassignedbugs@nondot.org
</td>
</tr>
<tr>
<th>Reporter</th>
<td>amehsan@ca.ibm.com
</td>
</tr>
<tr>
<th>CC</th>
<td>llvm-bugs@lists.llvm.org
</td>
</tr>
<tr>
<th>Classification</th>
<td>Unclassified
</td>
</tr></table>
<p>
<div>
<pre>The changes in <a href="https://reviews.llvm.org/D23614">https://reviews.llvm.org/D23614</a> expose a problem where we
generate two fcmpu instructions for a single compare. See <a href="show_bug.cgi?id=30701#c1">comment 1</a> for a complete example.
The problem is the following:
We have this fp compare in IR:
%t1 = fcmp ult double %t0, 0.000000e+00
Which is lowered to
t10: ch = br_cc t0, setult:ch, t2, ConstantFP:f64&lt;0.000000e+00&gt;,
BasicBlock:ch&lt;good 0x10024bbced0&gt;
We also have the following lines of code in PPC backend:
// Comparisons that require checking two conditions.
setCondCodeAction(ISD::SETULT, MVT::f32, Expand);
setCondCodeAction(ISD::SETULT, MVT::f64, Expand);
So the target-independent expansion converts the above check to the following:
t12: i1 = setcc t2, ConstantFP:f64&lt;0.000000e+00&gt;, setlt:ch
t14: i1 = setcc t2, ConstantFP:f64&lt;0.000000e+00&gt;, setuo:ch
t15: i1 = or t12, t14
From this we generate two fcmpu instructions. We probably need to check two bits
of CR, so going through the target-independent expansion is fine. We need the OR
instruction to be generated, but we should recognize that once the first setcc
has been converted to fcmpu, a second fcmpu is not needed.
Before the patch mentioned above (when we generated 0 using a load instead
of xor), we generated only one fcmpu even though there were two setcc nodes in
the selection DAG right before instruction selection began.</pre>
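<pre>A minimal sketch of the desired behaviour, not the backend's actual code. It
assumes that a single fcmpu sets exactly one of four CR bits (less than,
greater than, equal, unordered). The names fcmpu_bits and setult are
hypothetical helpers used only for illustration. It shows that setult needs
one compare plus an OR of two of the resulting bits.

```python
import math


def fcmpu_bits(a, b):
    """Model (assumption) of the CR field written by one FP compare."""
    unordered = math.isnan(a) or math.isnan(b)
    return {
        "lt": (not unordered) and a < b,
        "gt": (not unordered) and a > b,
        "eq": (not unordered) and a == b,
        "un": unordered,
    }


def setult(a, b):
    """Unordered-or-less-than, using a single modelled compare."""
    cr = fcmpu_bits(a, b)        # one compare feeds both conditions
    return cr["lt"] or cr["un"]  # the OR produced by the expansion


print(setult(-1.0, 0.0), setult(float("nan"), 0.0), setult(1.0, 0.0))
```

In this model both the setlt and the setuo results are read from the same
compare, so the OR is the only extra operation.</pre>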
</div>
</p>
</body>
</html>