[llvm-dev] [ARM] Register pressure with -mthumb forces register reload before each call

Prathamesh Kulkarni via llvm-dev llvm-dev at lists.llvm.org
Tue Apr 7 13:07:05 PDT 2020


On Tue, 31 Mar 2020 at 22:03, Prathamesh Kulkarni
<prathamesh.kulkarni at linaro.org> wrote:
>
> Hi,
> Compiling the attached test case, which is a reduced version of
> uECC_shared_secret from the tinycrypt library [1], with
> --target=arm-linux-gnueabi -march=armv6-m -Oz -S
> results in the register holding the function's address being reloaded
> before every blx:
>
>         ldr     r3, .LCPI0_0
>         blx     r3
>         mov     r0, r6
>         mov     r1, r5
>         mov     r2, r4
>         ldr     r3, .LCPI0_0
>         blx     r3
>         ldr     r3, .LCPI0_0
>         mov     r0, r6
>         mov     r1, r5
>         mov     r2, r4
>         blx     r3
>
> .LCPI0_0:
>         .long   foo
>
> From the regalloc dump (attached), AFAIU, what happens in the greedy
> allocator is that virtual registers %0 to %3 are all live across the
> first two calls to foo. Thus %0, %1 and %2 get assigned r6, r5 and r4
> respectively, and %3, which holds foo's address, doesn't have any
> register left.
> Since its live range has the least weight, it does not evict any
> existing interval and gets split instead. Eventually we have the
> following allocation:
>
> [%0 -> $r6] tGPR
> [%1 -> $r5] tGPR
> [%2 -> $r4] tGPR
> [%6 -> $r3] tGPR
> [%11 -> $r3] tGPR
> [%16 -> $r3] tGPR
> [%17 -> $r3] tGPR
>
> where %6, %11, %16 and %17 are all derived from %3.
> Since r3 is a call-clobbered register, the compiler is forced to
> reload foo's address each time before blx.
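
The shape of code that leads to this situation can be sketched in C as below. This is not the attached test case, only a guess at its structure: the names caller and foo_calls, the signature of foo, and its body are all hypothetical, and noinline is a GCC/Clang attribute used here to keep the calls out of line. The point is that the three pointer arguments stay live across the calls, which is what ties up r4-r6 in the allocation above.

```c
/* Hypothetical stand-in for the callee; the real foo is not shown in the mail. */
static int foo_calls;

__attribute__((noinline)) void foo(int *a, int *b, int *c)
{
    *a += *b + *c;   /* arbitrary side effect so the calls cannot be dropped */
    foo_calls++;
}

/* Three calls to the same function in one basic block, with the same three
 * arguments. The arguments must be kept in call-preserved registers between
 * the calls, leaving no such register for foo's address. */
void caller(int *a, int *b, int *c)
{
    foo(a, b, c);
    foo(a, b, c);
    foo(a, b, c);
}
```

Building this with the flags quoted above is one way to look at the generated calls; it has not been verified that this exact sketch reproduces the output shown.
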
>
> To fix this, I thought of the following approaches:
> (a) For the Thumb-1 ISA, disable the heuristic in
> ARMTargetLowering::LowerCall that prefers an indirect call when there
> are at least 3 calls to the same function in a basic block.
>
> (b) In ARMTargetLowering::LowerCall, add another constraint, such as the
> number of arguments, as a proxy for register pressure on Thumb-1, but
> that's bound to trip up other cases.
>
> (c) Give higher priority to allocating the virtual register used for
> indirect calls? However, if that
> results in spilling some other register, it would defeat the
> purpose of saving code size. I suppose ideally we want to trigger the
> indirect-call heuristic only when we know beforehand that it
> will not result in spilling. But I am not sure whether it's possible to
> estimate that during isel?
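
The code-size trade-off behind the heuristic can be sketched with a small byte-count model. The instruction sizes are the standard Thumb-1 encodings: bl is 4 bytes, blx with a register operand and a PC-relative ldr are 2 bytes each, and the literal-pool entry holding the address is 4 bytes. The function names are hypothetical, and the model deliberately ignores argument moves, literal-pool alignment padding, and any push/pop needed to free a call-preserved register.

```c
#include <stddef.h>

/* Thumb-1 encoding sizes in bytes. */
enum {
    BL_BYTES      = 4,  /* direct call: bl foo */
    BLX_BYTES     = 2,  /* indirect call: blx rN */
    LDR_LIT_BYTES = 2,  /* ldr rN, .LCPI0_0 */
    LITERAL_BYTES = 4   /* .long foo in the literal pool */
};

/* n direct calls. */
size_t direct_bytes(size_t n)
{
    return n * BL_BYTES;
}

/* Address loaded once and kept in a register across all n calls. */
size_t indirect_once_bytes(size_t n)
{
    return LDR_LIT_BYTES + LITERAL_BYTES + n * BLX_BYTES;
}

/* Address reloaded before every call, as in the assembly above. */
size_t indirect_reload_bytes(size_t n)
{
    return LITERAL_BYTES + n * (LDR_LIT_BYTES + BLX_BYTES);
}
```

Under these assumptions, three calls cost 12 bytes as direct calls, 12 bytes with a single load, and 16 bytes with a reload before each call. The indirect form only starts to win at four calls (16 vs. 14 bytes), and the reloading form is always 4 bytes larger than plain bl, which is why a spill or reload cancels the intended saving.
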
>
> I would be grateful for suggestions on how to proceed further.
ping?

Thanks,
Prathamesh
>
> [1] https://github.com/intel/tinycrypt/blob/master/lib/source/ecc_dh.c#L139
>
> Thanks,
> Prathamesh

