[llvm-dev] Help required regarding IPRA and Local Function optimization
vivek pandya via llvm-dev
llvm-dev at lists.llvm.org
Thu Jun 30 09:32:15 PDT 2016
One more interesting thing I have noticed is the following:
In the sqlite3 code, consider three functions: sqlite3Update, sqlite3Select
and sqlite3WhereBegin. sqlite3WhereBegin is called by both sqlite3Update and
sqlite3Select, but according to the CallGraphSCC order sqlite3Update is
code-generated first. In that case, during the RegMask propagation phase, the
default regmask is used for the call site of sqlite3WhereBegin, and later
sqlite3WhereBegin is optimized not to save callee-saved registers. This
should obviously not happen.
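The ordering problem described above can be sketched with a small Python model. This is a toy illustration, not LLVM code: the helper names codegen and unsafe_call_sites, the dictionary layout, and the flag marking sqlite3WhereBegin as a local function are all assumptions made for the example. A caller that is processed before its callee falls back to the default mask (callee-saved registers preserved), while the callee, being local, later drops its save/restore code.

```python
# Toy model of RegMask propagation order; every name here is hypothetical.
CALLEE_SAVED = frozenset({"rbx", "rbp", "r12", "r13", "r14", "r15"})


def codegen(order, calls, body_clobbers, local):
    """Process functions in `order`, tracking clobbered callee-saved registers."""
    published = {}  # function -> callee-saved registers it really clobbers
    assumed = {}    # (caller, callee) -> clobbers the caller assumed at the call
    for fn in order:
        clobbers = set(body_clobbers.get(fn, ()))
        for callee in calls.get(fn, ()):
            # Callee not processed yet: fall back to the default mask,
            # which promises that no callee-saved register is clobbered.
            assumed[(fn, callee)] = published.get(callee, frozenset())
            clobbers |= assumed[(fn, callee)]
        if not local.get(fn, False):
            # A non-local function keeps the ABI and saves/restores what it uses.
            clobbers -= CALLEE_SAVED
        published[fn] = frozenset(clobbers)
    return assumed, published


def unsafe_call_sites(assumed, published):
    """Call sites where the callee clobbers more than the caller assumed."""
    result = []
    for (caller, callee), regs in assumed.items():
        missed = published[callee] - regs
        if missed:
            result.append((caller, callee, sorted(missed)))
    return sorted(result)


calls = {"sqlite3Update": ["sqlite3WhereBegin"],
         "sqlite3Select": ["sqlite3WhereBegin"]}
body_clobbers = {"sqlite3WhereBegin": {"r12"}}
local = {"sqlite3WhereBegin": True}  # assumption for this sketch

bad_order = ["sqlite3Update", "sqlite3WhereBegin", "sqlite3Select"]
good_order = ["sqlite3WhereBegin", "sqlite3Update", "sqlite3Select"]

print(unsafe_call_sites(*codegen(bad_order, calls, body_clobbers, local)))
print(unsafe_call_sites(*codegen(good_order, calls, body_clobbers, local)))
```

In this model only the call from sqlite3Update is flagged, because it was processed before the precise mask of sqlite3WhereBegin existed. With a strictly bottom-up order no call site is flagged.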
Here is the assembly code printed by the lldb dis command at the runtime
failure; after careful observation I have identified one bug:
...
0x10002d8ff <+1855>: movl -0x74(%rbp), %r13d
0x10002d903 <+1859>: movq -0x30(%rbp), %r12 ; this contains the address of a structure
0x10002d907 <+1863>: movq -0x38(%rbp), %r14
0x10002d90b <+1867>: movq -0x58(%rbp), %r15
0x10002d90f <+1871>: leaq -0x150(%rbp), %rdi
0x10002d916 <+1878>: movq -0x50(%rbp), %rsi
0x10002d91a <+1882>: callq 0x10001a940 ; sqlite3ExprResolveNames at sqlite3.c:47419; this function preserves callee-saved regs
0x10002d91f <+1887>: testl %eax, %eax
0x10002d921 <+1889>: je 0x10002d92c ; <+1900> at sqlite3.c:66485
0x10002d923 <+1891>: movq -0x70(%rbp), %rdx
0x10002d927 <+1895>: jmp 0x10002d2b1 ; <+241> at sqlite3.c:66299
0x10002d92c <+1900>: xorl %eax, %eax
0x10002d92e <+1902>: movq %rax, -0xe0(%rbp)
0x10002d935 <+1909>: xorl %ecx, %ecx
0x10002d937 <+1911>: xorl %r8d, %r8d
0x10002d93a <+1914>: movq %r14, %rdi
0x10002d93d <+1917>: movq -0xd8(%rbp), %rsi
0x10002d944 <+1924>: movq -0x50(%rbp), %rdx
0x10002d948 <+1928>: callq 0x100030600 ; sqlite3WhereBegin at sqlite3.c:69859; this function does not save any callee-saved regs, and its code uses R12
0x10002d94d <+1933>: movq %rax, %r14
0x10002d950 <+1936>: testq %r14, %r14
0x10002d953 <+1939>: je 0x10002e0d3 ; <+3859> at sqlite3.c:66699
0x10002d959 <+1945>: movq -0x48(%rbp), %rax
0x10002d95d <+1949>: cmpb $0x0, 0x69(%rax)
0x10002d961 <+1953>: movl $0xa, %eax
0x10002d966 <+1958>: movl $0x26, %esi
0x10002d96b <+1963>: cmovnel %eax, %esi
0x10002d96e <+1966>: movq %r12, %rdi ; here the value of R12 has been clobbered, so a wrong address is passed as the parameter, and because of that a bad memory access error is raised while executing sqlite3VdbeAddOp2.
0x10002d971 <+1969>: movq -0x68(%rbp), %rdx
0x10002d975 <+1973>: movl %r13d, %ecx
0x10002d978 <+1976>: callq 0x100019720 ; sqlite3VdbeAddOp2 at sqlite3.c:37297
...
Here is the lldb dis result for sqlite3VdbeAddOp3:
0x100019500 <+0>: pushq %rbp
0x100019501 <+1>: movq %rsp, %rbp
0x100019504 <+4>: pushq %r15
0x100019506 <+6>: pushq %r14
0x100019508 <+8>: pushq %r13
0x10001950a <+10>: pushq %r12
0x10001950c <+12>: pushq %rbx
0x10001950d <+13>: pushq %rax
0x10001950e <+14>: movl %ecx, %r12d
0x100019511 <+17>: movl %edx, %r13d
0x100019514 <+20>: movl %esi, %r15d
0x100019517 <+23>: movq %rdi, %rbx
-> 0x10001951a <+26>: movl 0x18(%rbx), %r14d
Please correct me if anything is wrong, and please provide some help.
-Vivek
2016-06-30 14:21 GMT+05:30 vivek pandya <vivekvpandya at gmail.com>:
> Hello Mentors,
>
> I am currently trying to find a bug in the Local Function related
> optimization that causes runtime failures in some test cases. Because those
> test cases contain very large functions with recursion and object-oriented
> code, I am not able to find the pattern that causes the failure. So I tried
> the following simple case to understand the expected behavior of this
> optimization.
>
> Consider the following code:
>
> define void @bar() #0 {
> call void asm sideeffect "movl %ecx, %r15d", "~{r15}"() #0
> call void @foo()
> call void asm sideeffect "movl %r15d, %ebx", "~{rbx}"() #0
> ret void
> }
>
> define internal void @foo() #0 {
> call void asm sideeffect "movl %r14d, %r15d", "~{r15}"() #0
> ret void
> }
>
> and the assembly code generated for it when IPRA is enabled:
>
> .section __TEXT,__text,regular,pure_instructions
> .macosx_version_min 10, 12
> .p2align 4, 0x90
> _foo: ## @foo
> .cfi_startproc
> ## BB#0:
> ## InlineAsm Start
> movl %r14d, %r15d
> ## InlineAsm End
> retq
> .cfi_endproc
>
> .globl _bar
> .p2align 4, 0x90
> _bar: ## @bar
> .cfi_startproc
> ## BB#0:
> pushq %r15
> Ltmp0:
> .cfi_def_cfa_offset 16
> pushq %rbx
> Ltmp1:
> .cfi_def_cfa_offset 24
> pushq %rax
> Ltmp2:
> .cfi_def_cfa_offset 32
> Ltmp3:
> .cfi_offset %rbx, -24
> Ltmp4:
> .cfi_offset %r15, -16
> ## InlineAsm Start
> movl %ecx, %r15d
> ## InlineAsm End
> callq _foo
> ## InlineAsm Start
> movl %r15d, %ebx
> ## InlineAsm End
> addq $8, %rsp
> popq %rbx
> popq %r15
> retq
> .cfi_endproc
>
>
> .subsections_via_symbols
>
> Now foo clobbers R15 (which is callee saved), but as foo is a local function,
> IPRA will mark R15 as clobbered and foo will not save/restore R15 in its
> prologue/epilogue. For the above code to work correctly, a save and restore
> of R15 is expected around the call site of foo in bar, but I am not able to
> find a pass in LLVM which does that. In fact, if I am not wrong, the RegMasks
> of call sites are used by the register allocators through
> LiveIntervals::checkRegMaskInterference, and because of that, if R15 is
> marked as clobbered by call _foo, then R15 will not be used for a live range
> which spans call _foo. (That itself is another concern, because it may
> result in virtual register spills due to a lack of available registers,
> since setting the callee-saved registers to none is propagated through the
> regmasks.)
>
> Here are my questions related to this example:
> 1) Is there any pass or code in LLVM which is responsible for caller-side
> saving of physical registers? From looking at InlineSpiller.cpp, it is
> responsible for VReg spilling.
> 2) If such a pass exists, then why is R15 not saved around call _foo?
> 3) Why is _bar saving %rax in the above code?
>
> Please help!
>
> Sincerely,
> Vivek
>
>
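The RegMask interference idea from the quoted message can be illustrated with a short Python sketch. It is only a model under stated assumptions: the register numbering, the helper names make_regmask, clobbers and usable_across, and the two example masks are hypothetical. The only convention borrowed from LLVM is that a set bit in a register mask means the register is preserved across the call.

```python
# Toy register numbering; real LLVM register numbers differ (assumption).
REGS = ["rax", "rbx", "rcx", "rdx", "rsi", "rdi", "rbp",
        "r8", "r9", "r10", "r11", "r12", "r13", "r14", "r15"]
INDEX = {name: i for i, name in enumerate(REGS)}

# Callee-saved general-purpose registers of the x86-64 System V ABI
# (rsp is left out of this toy model).
SYSV_CALLEE_SAVED = ["rbx", "rbp", "r12", "r13", "r14", "r15"]


def make_regmask(preserved):
    """Build a list of 32-bit words; a set bit means 'preserved by the call'."""
    words = [0] * ((len(REGS) + 31) // 32)
    for name in preserved:
        i = INDEX[name]
        words[i // 32] |= 1 << (i % 32)
    return words


def clobbers(regmask, name):
    """True if a call carrying this mask may overwrite the register."""
    i = INDEX[name]
    return not (regmask[i // 32] & (1 << (i % 32)))


def usable_across(call_masks, candidates):
    """Registers that may hold a value whose live range spans all the calls."""
    return [r for r in candidates
            if not any(clobbers(m, r) for m in call_masks)]


default_mask = make_regmask(SYSV_CALLEE_SAVED)
# Hypothetical mask for the call to _foo once R15 is marked as clobbered.
foo_mask = make_regmask([r for r in SYSV_CALLEE_SAVED if r != "r15"])

print(usable_across([default_mask], SYSV_CALLEE_SAVED))
print(usable_across([foo_mask], SYSV_CALLEE_SAVED))
```

With the second mask, r15 drops out of the set of registers that can carry a value across the call, which is the reduced availability the message worries about.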