[llvm-bugs] [Bug 25389] New: Inline asm expr argument order affects optimization

via llvm-bugs llvm-bugs at lists.llvm.org
Mon Nov 2 21:24:28 PST 2015


https://llvm.org/bugs/show_bug.cgi?id=25389

            Bug ID: 25389
           Summary: Inline asm expr argument order affects optimization
           Product: new-bugs
           Version: 3.7
          Hardware: PC
                OS: All
            Status: NEW
          Severity: normal
          Priority: P
         Component: new bugs
          Assignee: unassignedbugs at nondot.org
          Reporter: irony42 at me.com
                CC: llvm-bugs at lists.llvm.org
    Classification: Unclassified

I've encountered a case where LLVM makes an incorrect (or at least
inconsistent) optimization choice.

The following two LLVM IR functions are functionally equivalent and textually
almost identical. They differ only in their second-to-last statement, where the
first two operands of the asm statement (and their constraints) are swapped.
However, when compiled to x86 using the commands below, the first produces
significantly better code than the second.

Function A: Optimized well, compiles to "mov ecx, dword ptr [ecx]; add ecx,
11223344; ret" 
echo 'define void @main() naked minsize noinline nounwind {
%regs = tail call { i32, i32, i32, i32, i32, i32, i32 } asm sideeffect "",
"={eax},={ecx},={edx},={ebx},={esi},={edi},={ebp}"()
%eax = extractvalue { i32, i32, i32, i32, i32, i32, i32 } %regs, 0
%ecx = extractvalue { i32, i32, i32, i32, i32, i32, i32 } %regs, 1
%edx = extractvalue { i32, i32, i32, i32, i32, i32, i32 } %regs, 2
%ebx = extractvalue { i32, i32, i32, i32, i32, i32, i32 } %regs, 3
%esi = extractvalue { i32, i32, i32, i32, i32, i32, i32 } %regs, 4
%edi = extractvalue { i32, i32, i32, i32, i32, i32, i32 } %regs, 5
%ebp = extractvalue { i32, i32, i32, i32, i32, i32, i32 } %regs, 6
%x.ptr = inttoptr i32 %ecx to i32*
%x = load i32, i32* %x.ptr
%y = add i32 %x, 11223344
tail call void asm sideeffect "",
"{eax},{ecx},{edx},{ebx},{esi},{edi},{ebp}"(i32 %eax, i32 %y, i32 %edx, i32
%ebx, i32 %esi, i32 %edi, i32 %ebp)
ret void
}' | llvm-as | opt -Os | \
llc -march=x86 -mcpu=core2 -mattr=-rdrnd -O2 -filetype=obj \
-x86-asm-syntax=intel -enable-pie -relocation-model=pic - -o - | \
llvm-objdump -disassemble -x86-asm-syntax=intel -

Function B: Not optimized well. It should compile to the same code as Function
A, but the resulting code unnecessarily spills to the stack: "mov dword ptr
[esp], eax; mov eax, 11223344; add eax, dword ptr [ecx]; mov ecx, dword ptr
[esp]; ret"
echo 'define void @main() naked minsize noinline nounwind {
%regs = tail call { i32, i32, i32, i32, i32, i32, i32 } asm sideeffect "",
"={eax},={ecx},={edx},={ebx},={esi},={edi},={ebp}"()
%eax = extractvalue { i32, i32, i32, i32, i32, i32, i32 } %regs, 0
%ecx = extractvalue { i32, i32, i32, i32, i32, i32, i32 } %regs, 1
%edx = extractvalue { i32, i32, i32, i32, i32, i32, i32 } %regs, 2
%ebx = extractvalue { i32, i32, i32, i32, i32, i32, i32 } %regs, 3
%esi = extractvalue { i32, i32, i32, i32, i32, i32, i32 } %regs, 4
%edi = extractvalue { i32, i32, i32, i32, i32, i32, i32 } %regs, 5
%ebp = extractvalue { i32, i32, i32, i32, i32, i32, i32 } %regs, 6
%x.ptr = inttoptr i32 %ecx to i32*
%x = load i32, i32* %x.ptr
%y = add i32 %x, 11223344
tail call void asm sideeffect "",
"{ecx},{eax},{edx},{ebx},{esi},{edi},{ebp}"(i32 %y, i32 %eax, i32 %edx, i32
%ebx, i32 %esi, i32 %edi, i32 %ebp)
ret void
}' | llvm-as | opt -Os | \
llc -march=x86 -mcpu=core2 -mattr=-rdrnd -O2 -filetype=obj \
-x86-asm-syntax=intel -enable-pie -relocation-model=pic - -o - | \
llvm-objdump -disassemble -x86-asm-syntax=intel -


I believe that the above demonstrates a bug (probably in some optimization pass
within the x86 backend) wherein certain optimizations fail to get applied
depending on the order of arguments to an asm expression.

These test cases can be reduced by removing edx through ebp (leaving only eax
and ecx); in that reduced case, Function A stays the same whereas Function B
uses an extra scratch register. I did not reduce them that far because an
unnecessary spill to the stack seemed clearer evidence of a bug than merely
using an extra register.
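
For reference, the reduction described above could be written out roughly as
follows. This is only a sketch under stated assumptions: the helper name
emit_ir and the file names reduced_a.ll and reduced_b.ll are hypothetical, and
the IR is simply the functions above with edx through ebp removed, using the
same LLVM 3.7 load syntax.

```shell
#!/bin/sh
# Sketch of the reduced test cases (eax and ecx only).
# emit_ir is a hypothetical helper, not part of LLVM:
#   $1 = input constraint string of the final asm statement
#   $2 = operand list of the final asm statement
emit_ir() {
  cat <<EOF
define void @main() naked minsize noinline nounwind {
%regs = tail call { i32, i32 } asm sideeffect "", "={eax},={ecx}"()
%eax = extractvalue { i32, i32 } %regs, 0
%ecx = extractvalue { i32, i32 } %regs, 1
%x.ptr = inttoptr i32 %ecx to i32*
%x = load i32, i32* %x.ptr
%y = add i32 %x, 11223344
tail call void asm sideeffect "", "$1"($2)
ret void
}
EOF
}

# Reduced Function A: operands in eax, ecx order.
emit_ir '{eax},{ecx}' 'i32 %eax, i32 %y' > reduced_a.ll
# Reduced Function B: same operands, ecx listed first.
emit_ir '{ecx},{eax}' 'i32 %y, i32 %eax' > reduced_b.ll
```

Each file can then be redirected into llvm-as (for example "llvm-as <
reduced_b.ll") in place of the echo in the commands above, with the rest of the
pipeline unchanged.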

I'm using LLVM 3.7, although with the substitution s/load i32,/load/ (the older
load syntax) it should reproduce on at least 3.5 and 3.6 as well.
