<html>
<head>
<base href="http://llvm.org/bugs/" />
</head>
<body><table border="1" cellspacing="0" cellpadding="8">
<tr>
<th>Bug ID</th>
<td><a class="bz_bug_link
bz_status_NEW "
title="NEW --- - Misoptimization of combining movl+addl into leal"
href="http://llvm.org/bugs/show_bug.cgi?id=20776">20776</a>
</td>
</tr>
<tr>
<th>Summary</th>
<td>Misoptimization of combining movl+addl into leal
</td>
</tr>
<tr>
<th>Product</th>
<td>libraries
</td>
</tr>
<tr>
<th>Version</th>
<td>trunk
</td>
</tr>
<tr>
<th>Hardware</th>
<td>PC
</td>
</tr>
<tr>
<th>OS</th>
<td>Linux
</td>
</tr>
<tr>
<th>Status</th>
<td>NEW
</td>
</tr>
<tr>
<th>Severity</th>
<td>normal
</td>
</tr>
<tr>
<th>Priority</th>
<td>P
</td>
</tr>
<tr>
<th>Component</th>
<td>Backend: X86
</td>
</tr>
<tr>
<th>Assignee</th>
<td>unassignedbugs@nondot.org
</td>
</tr>
<tr>
<th>Reporter</th>
<td>wujingyue@gmail.com
</td>
</tr>
<tr>
<th>CC</th>
<td>llvmbugs@cs.uiuc.edu
</td>
</tr>
<tr>
<th>Classification</th>
<td>Unclassified
</td>
</tr></table>
<p>
<div>
<pre>The symptom is similar to <a class="bz_bug_link
bz_status_NEW "
title="NEW --- - TwoAddressInstructionPass fails to optimize mov+add to lea"
href="show_bug.cgi?id=20701">http://llvm.org/bugs/show_bug.cgi?id=20701</a> where the
backend misses the opportunity of combining addl+movl into leal. However, I
feel the root cause is different, so I decided to file a separate bug.
The test case is reduced from loop-strength-reduce8.ll
@G = external global float*
declare i32* @foo()
define i32* @bar(i1 %cond) {
entry:
%v0 = call i32* @foo() ; v0 = eax
%v1 = bitcast i32* %v0 to float* ; v1 = v0
br i1 %cond, label %then, label %merge
then:
%v2 = getelementptr float* %v1, i64 16
store float* %v2, float** @G ; just use %v2. doesn't have to be a store
br label %merge
merge:
ret i32* %v0
}
Running "llc -mtriple=i386-apple-darwin" on it gives the following machine
code:
## BB#0: ## %entry
subl $12, %esp
Ltmp0:
.cfi_def_cfa_offset 16
calll L_foo$stub
testb $1, 16(%esp)
je LBB0_2
## BB#1: ## %then
movl %eax, %ecx
addl $64, %ecx
movl L_G$non_lazy_ptr, %edx
movl %ecx, (%edx)
LBB0_2: ## %merge
addl $12, %esp
retl
where
movl %eax, %ecx
addl $64, %ecx
could have been combined into
leal 64(%eax), %ecx
The only place I am aware of that could combine movl+addl into leal is in
TwoAddressInstructionPass at around Line 1157.
if (!regBKilled || isProfitableToConv3Addr(regA, regB)) {
if (convertInstTo3Addr(...)) {
But both heuristics (i.e., !regBKilled and isProfitableToConv3Addr) failed.
The pseudo machine code before the TwoAddressInstructionPass is
vreg0 = %eax
vreg1 = vreg0
if (cond) {
vreg2 = vreg1 + 64
use(vreg2)
}
%eax = vreg0
isProfitableToConv3Addr failed because vreg1 is not a direct copy of %eax.
!regBKilled failed because "vreg2 = vreg1 + 64" is the last use of vreg1, and
the pass seems to think RegisterCoalescer would coalesce vreg1 and vreg2 and
end up with simply vreg1/vreg2 += 64. However, while RegisterCoalescer does
coalesce them later, it cannot coalesce vreg0 and vreg2 because vreg0 is used
after the if-then:
vreg0 = %eax
if (cond) {
vreg2 = vreg0
vreg2 += 64
use(vreg2)
}
%eax = vreg0
leaving
vreg2 = vreg0
vreg2 += 64
not combined.
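To make the reasoning above concrete, here is a toy Python model of the
decision. It is only a sketch: the instruction list mirrors the pseudo machine
code above, the helper names are made up for illustration, liveness is
approximated by linear order (control flow is ignored), and "profitable" is
reduced to "the source is a direct copy of a physical register", which is my
reading of the check rather than the actual LLVM logic.

```python
# Toy model of the pseudo machine code: (dest, opcode, sources).
PROGRAM = [
    ("vreg0", "copy", ["%eax"]),
    ("vreg1", "copy", ["vreg0"]),
    ("vreg2", "add64", ["vreg1"]),
    (None, "use", ["vreg2"]),
    ("%eax", "copy", ["vreg0"]),
]

def is_killed_at(program, index, reg):
    """True if no later instruction reads reg (linear-order simplification)."""
    return all(reg not in srcs for _, _, srcs in program[index + 1:])

def is_direct_copy_of_physreg(program, reg):
    """True if reg is defined by a copy whose source is a physical register."""
    return any(dest == reg and op == "copy" and srcs[0].startswith("%")
               for dest, op, srcs in program)

def would_convert_to_three_address(program, index):
    """Mirror the quoted condition: !regBKilled || isProfitableToConv3Addr."""
    _, _, srcs = program[index]
    reg_b = srcs[0]
    reg_b_killed = is_killed_at(program, index, reg_b)
    profitable = is_direct_copy_of_physreg(program, reg_b)
    return (not reg_b_killed) or profitable

# The add is the last use of vreg1, and vreg1 is a copy of vreg0, not %eax.
print(would_convert_to_three_address(PROGRAM, 2))
```

In this model both checks fail for the add, so it stays in two-address form.
If the add read vreg0 directly, vreg0 would still be live afterwards and the
first check alone would trigger the conversion.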
I am not sure which part of the backend should be responsible for this
misoptimization. Bob Wilson mentioned it could be an issue with
RegisterCoalescer, but Coalescer seems optimal on this particular example.
Should TwoAddressInstructionPass use a better heuristic? Is it a phase-ordering
issue, where part of TwoAddressInstructionPass should run after
RegisterCoalescer? Or should we run a peephole-optimization pass looking for
the movl+addl pattern after register coalescing?
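For the peephole option, a rough text-level sketch in Python of the rewrite I
have in mind is below. This is not LLVM code: the function name and regexes are
hypothetical, it only handles a register-to-register movl followed immediately
by an addl of an immediate to the same destination, and it assumes EFLAGS is
dead after the addl (addl sets flags, leal does not).

```python
import re

# Hypothetical patterns for AT-and-T syntax; register operands only.
MOV = re.compile(r"movl\s+(%\w+),\s*(%\w+)$")
ADD = re.compile(r"addl\s+\$(-?\d+),\s*(%\w+)$")

def fold_mov_add(lines):
    """Fold 'movl %src, %dst' + 'addl $imm, %dst' into 'leal imm(%src), %dst'.

    Assumes the flags written by addl are not read afterwards.
    """
    out = []
    i = 0
    while i != len(lines):
        mov = MOV.match(lines[i].strip())
        add = None
        if i + 1 != len(lines):
            add = ADD.match(lines[i + 1].strip())
        if mov and add and mov.group(2) == add.group(2):
            src, dst, imm = mov.group(1), mov.group(2), add.group(1)
            out.append(f"leal {imm}({src}), {dst}")
            i += 2  # both instructions were consumed
        else:
            out.append(lines[i].strip())
            i += 1
    return out

then_block = [
    "movl %eax, %ecx",
    "addl $64, %ecx",
    "movl L_G$non_lazy_ptr, %edx",
    "movl %ecx, (%edx)",
]
for line in fold_mov_add(then_block):
    print(line)
```

A real pass would work on MachineInstrs and would have to check that the flags
are dead, which this sketch simply takes for granted.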
Any thoughts?
Thanks,
Jingyue</pre>
</div>
</p>
</body>
</html>