[LLVMdev] bx instruction getting generated in arm assembly for O1
    David Chisnall 
    David.Chisnall at cl.cam.ac.uk
       
    Tue Nov 25 08:04:38 PST 2014
    
    
  
Again, this looks correct - the only difference is that the first version (the clang 3.5 output) is better optimised.  In the 3.4.2 output, r11 is spilled because it is being used to store the stack pointer (the mov r11, sp).  The following:
>  blx r0
>  pop {r11, pc}
Is restoring r11 and jumping to the saved link register (and adjusting the stack pointer: you've got to love AArch32 assembly, where a jump, stack pointer adjustment, and register reload is a single instruction).  If r11 is not spilled, then we're left with:
push {lr}
...
blx r0
pop {pc}
And this is equivalent to simply:
bx r0
So, again, what is the bug that your test is testing for?  Or are you just checking that clang 3.5 really is doing tail-call optimisation in trivial cases?
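
To make the tail-position point concrete, here is a minimal C sketch. The first function is the test case from this thread (with an explicit void parameter list); answer() and indirect_call_plus_one() are hypothetical names added only for illustration. Whether a given compiler and optimisation level actually emits bx for the first function is up to the compiler, so treat the comments as a description of what is permitted, not a guarantee.

```c
/* Global function pointer, as in the original test case. */
int (*indirect_func)(void);

/* The call is in tail position: nothing is left to do in this
   function once the callee returns, so the compiler may replace
   "call, then return" (blx + pop) with a plain jump (bx). */
int indirect_call(void)
{
    return indirect_func();
}

/* Hypothetical contrast case: the addition happens after the callee
   returns, so control must come back here and the call cannot be
   turned into a plain jump. */
int indirect_call_plus_one(void)
{
    return indirect_func() + 1;
}

/* Hypothetical callee used only to exercise the two wrappers. */
int answer(void)
{
    return 42;
}
```

Pointing indirect_func at answer and calling both wrappers gives the same results whichever instruction sequence is chosen for indirect_call, which is why the two assembly listings are equivalent from the caller's point of view.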
David
On 25 Nov 2014, at 07:42, MAYUR PANDEY <mayur.p at samsung.com> wrote:
> Hi Jonathan,
> The assembly generated in case of clang-3.5 is
> indirect_call:
>  .fnstart
> .Leh_func_begin0:
>  ldr r0, .LCPI0_0
>  ldr r1, .LCPI0_1
> .LPC0_0:
>  add r0, pc, r0
>  ldr r0, [r1, r0]
>  ldr r0, [r0]
>  bx r0
>  .align 2
> .LCPI0_0:
>  .long _GLOBAL_OFFSET_TABLE_-(.LPC0_0+8)
> .LCPI0_1:
>  .long indirect_func(GOT)
> .Ltmp0:
>  .size indirect_call, .Ltmp0-indirect_call
> .Leh_func_end0:
>  .fnend
>  
> with clang-3.4.2 the assembly generated is:
> indirect_call:
>  push {r11, lr}
>  ldr r0, .LCPI0_0
>  mov r11, sp
>  ldr r1, .LCPI0_1
> .LPC0_0:
>  add r0, pc, r0
>  ldr r0, [r1, r0]
>  ldr r0, [r0]
>  blx r0
>  pop {r11, pc}
>  .align 2
> .LCPI0_0:
>  .long _GLOBAL_OFFSET_TABLE_-(.LPC0_0+8)
> .LCPI0_1:
>  .long indirect_func(GOT)
> .Ltmp0:
>  .size indirect_call, .Ltmp0-indirect_call
>  
> Both assemblies are generated with O1 optimization. The assembly generated with the trunk version of clang is similar to that of 3.5.
>  
> Thanks,
> Mayur
>  
> ------- Original Message -------
> Sender : Jonathan Roelofs<jonathan at codesourcery.com>
> Date : Nov 25, 2014 10:15 (GMT+09:00)
> Title : Re: [LLVMdev] bx instruction getting generated in arm assembly for O1
>  
> 
> 
> On 11/24/14 8:00 AM, MAYUR PANDEY wrote:
> > Hi,
> >
> > For the following test:
> >
> > int (*indirect_func)();
> >
> > int indirect_call()
> > {
> >       return indirect_func();
> > }
> >
> > when generating the assembly with clang-3.5, for -march=armv5te,  there is a
> > difference in the assemblies generated with O0 and O1:
> >
> > In the assembly generated with O0, we are getting the "blx" instruction whereas
> > with O1 we get "bx" (in 3.4.2 we used to get "blx" for both O0 and O1).
> Can you post the asm that you're seeing for this function?
> 
> There's a related case to this on armv4t which Iain has a patch for, that I 
> think we forgot about... The problem there is that armv4t doesn't have blx at 
> all, so should be generating a sequence like: 'mov r0, ...; bx _Ltmp; _Ltmp: bl r0'.
> >
> > Is this because of this patch:  [llvm] r214959 - ARM: do not generate BLX
> > instructions on Cortex-M CPUs
> I doubt it. armv5te isn't a cortex-m processor.
> 
> 
> Cheers,
> 
> Jon
> >
> > Or am I missing something?
> >
> > Thanks,
> >
> > Mayur
> >
> >
> >
> >
> > _______________________________________________
> > LLVM Developers mailing list
> > LLVMdev at cs.uiuc.edu         http://llvm.cs.uiuc.edu
> > http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev
> >
> 
> -- 
> Jon Roelofs
> jonathan at codesourcery.com
> CodeSourcery / Mentor Embedded
>  
>  