[LLVMdev] [PATCH] Unwanted r11 in push/pop on Cortex-M.
Andrea Mucignat
andrea at nestlabs.com
Sun Oct 20 17:06:56 PDT 2013
To recap, this is what I was trying to solve:
This C code:
int bar(int a, int b, int c, int d, int e, int f);
int foo(int a, int b, int c, int d, int e )
{
int x = 3*a;
return bar3(a,b,c,d,e,x);
}
Produced the following assembly output:
foo:
push {r11, lr}
sub sp, #8
bl bar
add sp, #8
pop {r11, pc}
The part I didn't like is that push/pop become 4 bytes instructions because
the register that we picked to "align" the stack pointer does not belong to
the low register set (r0-r7).
After a bit of digging, the following patch seems to take care of it.
Basically what happens is:
- the C function in the example doesn't have any local to save
- the only spilled register is LR
- the Prolog/Epilog Inserted calls the target specific
ARMFrameLowering::processFunctionBeforeCalleeSavedScan
- at this point the we need to fix the fact that SP would not be EABI
compliant because only 1 register has been spilled so far
- inside processFunctionBeforeCalleeSavedScan, there a check: if
TargetAlign == 8 && (NumGPRSpills & 1) then we align the stack by pushing
one of the _unspilled_ registers.
The problem is that this "unspilled" register list starts with r11 (for
reasons I'm not clear about), so I decided to skip all the registers r8-r15.
The part I'm not sure about is: what if r4-r7 are all spilled already, is
the SP going to be aligned by some other routine later on (maybe by adding
a "sub sp, sp, #4")?
Any comment?
Thanks,
Andrea
diff -aruN llvm-source/lib/Target/ARM/ARMFrameLowering.cpp
llvm-source-new/lib/Target/ARM/ARMFrameLowering.cpp
--- llvm-source/lib/Target/ARM/ARMFrameLowering.cpp 2013-07-29
10:50:07.000000000 -0700
+++ llvm-source-new/lib/Target/ARM/ARMFrameLowering.cpp 2013-10-20
16:33:59.905251552 -0700
@@ -1313,7 +1313,7 @@
for (unsigned i = 0, e = UnspilledCS1GPRs.size(); i != e; ++i) {
unsigned Reg = UnspilledCS1GPRs[i];
// Don't spill high register if the function is thumb1
- if (!AFI->isThumb1OnlyFunction() ||
+ if (!AFI->isThumbFunction() ||
isARMLowRegister(Reg) || Reg == ARM::LR) {
MRI.setPhysRegUsed(Reg);
if (!MRI.isReserved(Reg))
On Tue, Oct 15, 2013 at 10:15 AM, Andrea Mucignat <andrea at nestlabs.com>wrote:
> Umesh,
>
> From Target/ARM/ARMRegisterInfo.td, it looks like FP for Thumb should be
> r7 instead of r11.
>
> // Register classes.
> //
> // pc == Program Counter
> // lr == Link Register
> // sp == Stack Pointer
> // r12 == ip (scratch)
> // r7 == Frame Pointer (thumb-style backtraces)
> // r9 == May be reserved as Thread Register
> // r11 == Frame Pointer (arm-style backtraces)
> // r10 == Stack Limit
>
> Regardless, I'm compiling with -fomit-frame-pointer, so it should not
> really matter, right?
>
> Andrea
>
>
>
> On Tue, Oct 15, 2013 at 8:33 AM, Umesh Kalappa <umesh.kalappa0 at gmail.com>wrote:
>
>> Hi andrea,
>> R11 treated as frame pointer at arm backend , which is fixed again .
>>
>> Thanks
>> Umesh
>>
>>
>> On Tuesday, October 15, 2013, Andrea Mucignat <andrea at nestlabs.com>
>> wrote:
>> > Umesh,
>> > Makes some sort of sense to me, OTOH:
>> > If instead of choosing r11 as a "dummy" to align the stack we had
>> chosen some other register in the range r0-r7 then we could have emitted
>> the PUSH encoding T1 (2 bytes opcode) as opposed to the encoding T2 (which
>> is a 4 bytes opcode).
>> > A
>> >
>> > On Tue, Oct 15, 2013 at 2:59 AM, Umesh Kalappa <
>> umesh.kalappa0 at gmail.com> wrote:
>> >
>> > Hi Andrea,
>> >
>> > That is because the LR is the fixed register as per the
>> >
>> http://infocenter.arm.com/help/topic/com.arm.doc.ihi0042e/IHI0042E_aapcs.pdf
>> > and out_char() function is not the leaf function ,Hence compiler
>> > tends to save and restore the LR and the save and restore of
>> > register r11 is to align stack for 8 bytes as per ARM EABI.
>> >
>> >
>> >
>> > Thanks
>> >
>> > ~Umesh
>> >
>> >
>> > On Tuesday, October 15, 2013, Umesh Kalappa <umesh.kalappa0 at gmail.com>
>> wrote:
>> >> ---------- Forwarded message ----------
>> >> From: Andrea Mucignat <andrea at nestlabs.com>
>> >> Date: Tue, Oct 15, 2013 at 7:05 AM
>> >> Subject: [LLVMdev] Unwanted push/pop on Cortex-M.
>> >> To: llvmdev at cs.uiuc.edu
>> >>
>> >>
>> >> Hi,
>> >>
>> >> I have this code:
>> >>
>> >> void platform_putchar(int, char);
>> >> void out_char( char ch );
>> >>
>> >> void out_char( char ch )
>> >> {
>> >> platform_putchar (0, ch);
>> >> }
>> >>
>> >> I'm compiling with the following clang invocation:
>> >>
>> >> $
>> /usr/local/vendor/toolchains/llvm/3.3/armv7m/bin/armv7m-none-eabi-clang
>> >> -mcpu=cortex-m4 -mfloat-abi=soft -mthumb -nostdinc -ffreestanding
>> >> -ffunction-sections -fdata-sections -fno-exceptions -fno-rtti
>> >> -fomit-frame-pointer -momit-leaf-frame-pointer -nostdinc -v -Os -S -o
>> >> a.s a.c
>> >> vendor-clang version 3.3 (tags/RELEASE_33/final) (based on LLVM 3.3)
>> >> Target: armv7m-none--eabi
>> >> Thread model: posix
>> >>
>> "/usr/local/vendor-20130805-b8d59d2/toolchains/llvm/3.3/armv7m/bin/armv7m-none-eabi-clang"
>> >> -cc1 -triple thumbv7em-none--eabi -S -disable-free -main-file-name a.c
>> >> -mrelocation-model static -fmath-errno -mconstructor-aliases
>> >> -target-abi aapcs -target-cpu cortex-m4 -msoft-float -mfloat-abi soft
>> >> -target-feature +soft-float -target-feature +soft-float-abi
>> >> -target-feature -neon -target-linker-version 2.22
>> >> -momit-leaf-frame-pointer -v -ffunction-sections -fdata-sections
>> >> -coverage-file /tmp/a.s -nostdsysteminc -nobuiltininc -resource-dir
>> >>
>> /usr/local/vendor-20130805-b8d59d2/toolchains/llvm/3.3/armv7m/bin/../lib/clang/3.3
>> >> -Os -fno-dwarf-directory-asm -fdebug-compilation-dir /tmp
>> >> -ferror-limit 19 -fmessage-length 207 -ffreestanding -mstackrealign
>> >> -fno-rtti -fno-signed-char -fobjc-runtime=gcc
>> >> -fobjc-default-synthesize-properties -fdiagnostics-show-option
>> >> -fcolor-diagnostics -backend-option -vectorize-loops -o a.s -x c a.c
>> >> clang -cc1 version 3.3 based upon LLVM 3.3 default target
>> armv7m-none-eabi
>> >> #include "..." search starts here:
>> >> End of search list.
>> >>
>> >> $ cat a.s
>> >> .syntax unified
>> >> .eabi_attribute 6, 10
>> >> .eabi_attribute 9, 2
>> >> .eabi_attribute 10, 5
>> >> .fpu vfpv4
>> >> .eabi_attribute 20, 1
>> >> .eabi_attribute 21, 1
>> >> .eabi_attribute 23, 3
>> >> .eabi_attribute 24, 1
>> >> .eabi_attribute 25, 1
>> >> .eabi_attribute 44, 1
>> >> .file "a.c"
>> >> .section .text.out_char,"ax",%progbits
>> >> .globl out_char
>> >> .align 2
>> >> .type out_char,%function
>> >> .code 16
>> >> .thumb_func
>> >> out_char:
>> >> push.w {r11, lr}
>> >> mov r1, r0
>> >> movs r0, #0
>> >> bl platform_putchar
>> >> pop.w {r11, pc}
>> >> .Ltmp0:
>> >> .size out_char, .Ltmp0-out_char
>> >>
>> >> The one question I have is:
>> >> why can't this out_char function be compiled into:
>> >>
>> >> out_char:
>> >> mov r1, r0
>> >> mov r0, #0
>> >> bl platform_putchar
>> >> bx lr
>> >>
>> >> What's the clang/llvm module responsible for generating the prologue
>> >> and epilogue for this function?
>> >> I looked into CodeGen/PrologEpilogInserter, ARMFrameLowering.cpp
>> >> Thumb1FrameLowering.cpp but it's not 100% clear how the code gen
>> >> decides what the stack frame look like.
>> >>
>> >> Here's the -emit-llvm output
>>
>
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.llvm.org/pipermail/llvm-dev/attachments/20131020/49c8df94/attachment.html>
More information about the llvm-dev
mailing list