[llvm-dev] Help required regarding IPRA and Local Function optimization

vivek pandya via llvm-dev llvm-dev at lists.llvm.org
Sat Jul 2 04:27:54 PDT 2016


On Sat, Jul 2, 2016 at 5:07 AM, Quentin Colombet <qcolombet at apple.com>
wrote:

> Hi Vivek,
>
> I believe your reduced test case is broken.
>
> > On Jun 30, 2016, at 1:51 AM, vivek pandya <vivekvpandya at gmail.com>
> > wrote:
> >
> > Hello Mentors,
> >
> > I am currently hunting a bug in the local-function optimization that
> > causes runtime failures in some test cases. Those test cases contain very
> > large functions with recursion and object-oriented code, so I have not
> > been able to find the pattern that triggers the failure. I therefore
> > tried the following simple case to understand the behavior expected from
> > this optimization.
> >
> > Consider the following code:
> >
> > define void @bar() #0 {
> >   call void asm sideeffect "movl      %ecx, %r15d", "~{r15}"() #0
> >   call void @foo()
> >   call void asm sideeffect "movl      %r15d, %ebx", "~{rbx}"() #0
> >   ret void
> > }
> >
> > define internal void @foo() #0 {
> >   call void asm sideeffect "movl      %r14d, %r15d", "~{r15}"() #0
> >   ret void
> > }
> >
> > and the assembly generated for it when IPRA is enabled:
> >
> >       .section        __TEXT,__text,regular,pure_instructions
> >       .macosx_version_min 10, 12
> >       .p2align        4, 0x90
> > _foo:                                   ## @foo
> >       .cfi_startproc
> > ## BB#0:
> >       ## InlineAsm Start
> >       movl    %r14d, %r15d
> >       ## InlineAsm End
> >       retq
> >       .cfi_endproc
> >
> >       .globl  _bar
> >       .p2align        4, 0x90
> > _bar:                                   ## @bar
> >       .cfi_startproc
> > ## BB#0:
> >       pushq   %r15
> > Ltmp0:
> >       .cfi_def_cfa_offset 16
> >       pushq   %rbx
> > Ltmp1:
> >       .cfi_def_cfa_offset 24
> >       pushq   %rax
> > Ltmp2:
> >       .cfi_def_cfa_offset 32
> > Ltmp3:
> >       .cfi_offset %rbx, -24
> > Ltmp4:
> >       .cfi_offset %r15, -16
> >       ## InlineAsm Start
> >       movl    %ecx, %r15d
> >       ## InlineAsm End
> >       callq   _foo
> >       ## InlineAsm Start
> >       movl    %r15d, %ebx
> >       ## InlineAsm End
> >       addq    $8, %rsp
> >       popq    %rbx
> >       popq    %r15
> >       retq
> >       .cfi_endproc
> >
> >
> > .subsections_via_symbols
> >
> > Now foo clobbers R15 (which is callee saved), but because foo is a local
> > function, IPRA marks R15 as clobbered and foo gets no save/restore of R15
> > in its prologue/epilogue. For the code above to work correctly, I would
> > expect a save and restore of R15 around the call site of foo in bar, but I
> > cannot find a pass in LLVM that does that. In fact, if I am not wrong, the
> > RegMasks of the call site are used by the register allocators through
> > LiveIntervals::checkRegMaskInterference, so if R15 is marked as clobbered
> > by call _foo, R15 will not be used for a live range that spans call _foo.
> > (That is a separate concern, because it may cause virtual register spills
> > for lack of available registers: once the callee-saved registers are set
> > to none, that is propagated through the regmasks.)
> >
> > Here are my questions related to this example:
> > 1) Is there any pass or code in LLVM that is responsible for caller-side
> > saving of physical registers? From looking at InlineSpiller.cpp, it is
> > responsible for VReg spilling.
>
> If you save caller-saved registers "by hand" (e.g., with inline assembly),
> you are supposed to control their live ranges yourself.
> What I am saying is that if you want support from the compiler, you need
> to give it this freedom, and your test case does not provide that.
> i.e., if you want the compiler to help, you would need to save r15 in a
> virtual register, and use this virtual register in the next inline asm
> statement.
> E.g. (do not try to run that code, the syntax is probably wrong, but I
> wanted to illustrate the idea)
>
> define void @bar() #0 {
>   call void asm sideeffect "movl        %ecx, %r15d; movl %r15d, $r", i32 %tmpVal, "~{r15}"() #0
>   call void @foo()
>   call void asm sideeffect "movl $r, %r15d; movl       %r15d, %ebx", "~{rbx}"() #0
>   ret void
> }
>
> > 2) If such a pass exists, why is R15 not saved around call _foo?
>
> R15 is not live in your example. I mean, inline asm statements are opaque
> for the compiler and it cannot track the liveness from the strings :). The
> only thing it knows is what you tell it: you clobber r15 in one
> instruction and rbx in another. It does not know that the second one uses
> r15 from the first one.
>
> > 3) Why is _bar saving %rax in the above code?
>
> That’s an optimization :). We actually need to do sub $8 (probably to
> realign the stack), but since sub and push are equally expensive, we do push.
>
Thanks Quentin, I got your point. I will update the test case accordingly.
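
For reference, a minimal sketch of what an updated test case might look like
if the value is carried across the call in a virtual register instead of a
hard-coded r15. The name %saved, the "=r" / "r" operand constraints, and the
attribute group body are assumptions for illustration and were not part of
the original test case. Because %saved is live across the call to @foo, the
register allocator should see the call's regmask and keep the value out of
any register that foo is recorded as clobbering.

```llvm
define void @bar() #0 {
  ; Output operand $0 ("=r") is a virtual register chosen by the allocator;
  ; r15 is only scratch here and is declared as clobbered.
  %saved = call i32 asm sideeffect "movl %ecx, %r15d; movl %r15d, $0", "=r,~{r15}"() #0
  call void @foo()
  ; Input operand $0 ("r") is the value saved before the call.
  call void asm sideeffect "movl $0, %r15d; movl %r15d, %ebx", "r,~{r15},~{rbx}"(i32 %saved) #0
  ret void
}

define internal void @foo() #0 {
  call void asm sideeffect "movl %r14d, %r15d", "~{r15}"() #0
  ret void
}

; Assumed attribute group; the original post does not show its definition.
attributes #0 = { nounwind }
```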

Sincerely,
Vivek


> Cheers,
> -Quentin
> >
> > Please help!
> >
> > Sincerely,
> > Vivek
> >
>
>