[LLVMbugs] [Bug 20776] New: Misoptimization of combining movl+addl into leal

bugzilla-daemon at llvm.org bugzilla-daemon at llvm.org
Wed Aug 27 12:21:12 PDT 2014


            Bug ID: 20776
           Summary: Misoptimization of combining movl+addl into leal
           Product: libraries
           Version: trunk
          Hardware: PC
                OS: Linux
            Status: NEW
          Severity: normal
          Priority: P
         Component: Backend: X86
          Assignee: unassignedbugs at nondot.org
          Reporter: wujingyue at gmail.com
                CC: llvmbugs at cs.uiuc.edu
    Classification: Unclassified

The symptom is similar to http://llvm.org/bugs/show_bug.cgi?id=20701 where the
backend misses the opportunity to combine movl+addl into leal. However, I
believe the root cause is different, so I decided to file a separate bug.

The test case is reduced from loop-strength-reduce8.ll:

@G = external global float*

declare i32* @foo()

define i32* @bar(i1 %cond) {
entry:
  %v0 = call i32* @foo() ; v0 = eax
  %v1 = bitcast i32* %v0 to float* ; v1 = v0
  br i1 %cond, label %then, label %merge

then:
  %v2 = getelementptr float* %v1, i64 16
  store float* %v2, float** @G ; just use %v2. doesn't have to be a store
  br label %merge

merge:
  ret i32* %v0
}

Running "llc -mtriple=i386-apple-darwin" on it gives the following machine
code:

## BB#0:                                ## %entry
        subl    $12, %esp
        .cfi_def_cfa_offset 16
        calll   L_foo$stub
        testb   $1, 16(%esp)
        je      LBB0_2
## BB#1:                                ## %then
        movl    %eax, %ecx
        addl    $64, %ecx
        movl    L_G$non_lazy_ptr, %edx
        movl    %ecx, (%edx)
LBB0_2:                                 ## %merge
        addl    $12, %esp
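For readers wondering where the $64 comes from: the getelementptr index of 16 is counted in float elements, and the backend scales it to a byte offset. A minimal sketch of that arithmetic, where gep_byte_offset and FLOAT_SIZE_BYTES are names made up for this illustration and are not LLVM APIs:

```python
# Illustration only: these names are hypothetical, not part of LLVM.
FLOAT_SIZE_BYTES = 4  # an IR `float` is a 32-bit IEEE single


def gep_byte_offset(index, element_size):
    """Byte offset of a single-index getelementptr over a flat element type."""
    return index * element_size


# `getelementptr float* %v1, i64 16` becomes the immediate in `addl $64, %ecx`.
offset = gep_byte_offset(16, FLOAT_SIZE_BYTES)
```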


The pair

movl %eax, %ecx
addl $64, %ecx

could have been combined into

leal 64(%eax), %ecx
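
The rewrite above can be sketched as a purely textual peephole over AT&T-syntax lines. This is a toy under stated assumptions, not how LLVM implements anything: fold_mov_add is a hypothetical helper, it only handles 32-bit register operands with an immediate addend, and it ignores the fact that addl updates EFLAGS while leal does not, so a real pass would also have to prove the flags are dead.

```python
import re

# Hypothetical textual peephole, not an LLVM API.
MOV = re.compile(r"^\s*movl\s+(%\w+),\s*(%\w+)\s*$")
ADD = re.compile(r"^\s*addl\s+\$(-?\d+),\s*(%\w+)\s*$")


def fold_mov_add(lines):
    """Fold 'movl %src, %dst' + 'addl $imm, %dst' into 'leal imm(%src), %dst'."""
    out = []
    i = 0
    while i < len(lines):
        mov = MOV.match(lines[i])
        add = ADD.match(lines[i + 1]) if i + 1 < len(lines) else None
        # Only fold when the add targets the register the mov just wrote.
        if mov and add and mov.group(2) == add.group(2):
            src, dst, imm = mov.group(1), mov.group(2), add.group(1)
            out.append(f"leal {imm}({src}), {dst}")
            i += 2
        else:
            out.append(lines[i])
            i += 1
    return out


folded = fold_mov_add(["movl %eax, %ecx", "addl $64, %ecx"])
```

Lines that do not match the pattern, such as the load of L_G$non_lazy_ptr, pass through unchanged.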

The only place I am aware of that could combine movl+addl into leal is in
TwoAddressInstructionPass at around Line 1157. 

if (!regBKilled || isProfitableToConv3Addr(regA, regB)) {
  if (convertInstTo3Addr(...)) {

But both heuristics (i.e., !regBKilled and isProfitableToConv3Addr) failed. 

The pseudo machine code before the TwoAddressInstructionPass is

vreg0 = %eax
vreg1 = vreg0
if (cond) {
  vreg2 = vreg1 + 64
}
%eax = vreg0

isProfitableToConv3Addr failed because vreg1 is not a direct copy of %eax.
!regBKilled failed because "vreg2 = vreg1 + 64" is the last use of vreg1, and
the pass seems to think RegisterCoalescer would coalesce vreg1 and vreg2 and
end up with simply vreg1/vreg2 += 64. However, while RegisterCoalescer does
coalesce them later, it cannot coalesce vreg0 and vreg2 because vreg0 is used
after the if-then. The result is:

vreg0 = %eax
if (cond) {
  vreg2 = vreg0
  vreg2 += 64
}
%eax = vreg0


so the pair

vreg2 = vreg0
vreg2 += 64

is not combined.
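
To make the interference argument concrete, here is a toy model of the coalescing decision. It is only a sketch and not RegisterCoalescer's actual algorithm: control flow is flattened into the taken path, each instruction is reduced to (dest, sources), and can_coalesce is a hypothetical helper that refuses to merge a copy when the source is still read after the destination has been overwritten.

```python
# Toy model, not LLVM code: one (dest, sources) tuple per instruction.
def can_coalesce(prog, copy_index):
    """Can the copy at copy_index have its dest and source merged?"""
    dst, srcs = prog[copy_index]
    (src,) = srcs  # a copy has exactly one source
    later = list(enumerate(prog))[copy_index + 1:]
    writes_to_dst = [i for i, (d, _) in later if d == dst]
    reads_of_src = [i for i, (_, s) in later if src in s]
    if not writes_to_dst or not reads_of_src:
        return True
    # The source must be dead by the time the dest is first overwritten.
    return max(reads_of_src) <= min(writes_to_dst)


# After TwoAddressInstructionPass, before coalescing (taken path).
before_coalescing = [
    ("vreg0", ("%eax",)),
    ("vreg1", ("vreg0",)),   # vreg1 = vreg0
    ("vreg2", ("vreg1",)),   # vreg2 = vreg1
    ("vreg2", ("vreg2",)),   # vreg2 += 64
    ("G", ("vreg2",)),       # the store that uses vreg2
    ("%eax", ("vreg0",)),    # %eax = vreg0
]

# After vreg1 has been merged away, as in the pseudo code above.
after_coalescing = [
    ("vreg0", ("%eax",)),
    ("vreg2", ("vreg0",)),   # vreg2 = vreg0
    ("vreg2", ("vreg2",)),   # vreg2 += 64
    ("G", ("vreg2",)),
    ("%eax", ("vreg0",)),
]

vreg1_vreg2_ok = can_coalesce(before_coalescing, 2)
vreg0_vreg2_ok = can_coalesce(after_coalescing, 1)
```

In this model vreg1 and vreg2 can be merged because vreg1 is never read again, whereas vreg0 and vreg2 cannot because vreg0 is read by the final copy into %eax after vreg2 has been modified.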

I am not sure which part of the backend should be responsible for this
misoptimization. Bob Wilson mentioned it could be an issue with
RegisterCoalescer, but the coalescer seems to do the best it can on this
particular example. Should TwoAddressInstructionPass use a better heuristic?
Is it a phase-ordering issue, i.e. should part of TwoAddressInstructionPass
run after RegisterCoalescer? Or should we run a peephole-optimization pass
after register coalescing that looks for the movl+addl pattern?

Any thoughts?  

