[LLVMdev] [ARM] [PIC] optimizing the loading of hidden global variable

Wed Mar 12 10:43:59 PDT 2014

Hi,

When I compile code with -fvisibility=hidden -fPIC for ARM, I find
that LLVM generates less optimized code than GCC.

For example:

test.cpp:

void init(void *);

int g0[100];
int g1[100];
int g2[100];

void foo() {
  init(&g0);
  init(&g1);
  init(&g2);
}

Clang emits one GOT entry for each global variable (GV) and two
instructions to get its address:

        ldr     r0, .LCPI0_2
        add     r0, r0, r4
        bl      _Z4initPv(PLT)

GCC does this only for the first GV. The addresses of the remaining GVs
are computed directly from it:

        ldr     r4, .L2
.LPIC0:
        add     r4, pc, r4      @ get &g0 via GOT_PC relative
        mov     r0, r4
        bl      _Z4initPv(PLT)
        add     r0, r4, #400    @ get &g1
        bl      _Z4initPv(PLT)
        add     r0, r4, #800    @ get &g2
        ldmfd   sp!, {r4, lr}
        b       _Z4initPv(PLT)
.L3:
        .align  2
.L2:
        .word   .LANCHOR0-(.LPIC0+8)    @ 1 GOT offset entry
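
To illustrate where the #400 and #800 offsets come from, here is a minimal
sketch that is not part of the original test case. It assumes a hypothetical
aggregate named Globals that keeps the three arrays in one object, so each
array sits at a fixed byte offset (100 * sizeof(int), i.e. 400 with 4-byte
int) from a single base address. The init() stub and the seen[] bookkeeping
are also assumptions, added only so the example is self-contained. Whether a
given Clang version actually folds the three address computations into one
base plus offsets for this form has not been verified here.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical aggregate (an assumption, not in the original test case):
// one object means one base address, with each array at a fixed offset.
struct Globals {
    int g0[100];
    int g1[100];
    int g2[100];
};

Globals globals;  // build with -fvisibility=hidden -fPIC, as in the original

// Stub standing in for the external init(); it records each address passed.
const void *seen[3];
int seen_count = 0;

void init(void *p) {
    assert(seen_count < 3);  // the stub only has room for three addresses
    seen[seen_count++] = p;
}

void foo() {
    init(&globals.g0);
    init(&globals.g1);
    init(&globals.g2);
}

// Byte distance between two recorded addresses.
std::ptrdiff_t byte_distance(const void *from, const void *to) {
    return static_cast<const char *>(to) - static_cast<const char *>(from);
}
```

After calling foo(), byte_distance(seen[0], seen[1]) equals 100 * sizeof(int)
and byte_distance(seen[0], seen[2]) equals 200 * sizeof(int), matching the
immediates in the GCC output above when int is 4 bytes.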

This looks like a missed optimization opportunity for LLVM, in both code
size and performance. Any ideas? If so, I can open a bug and try to fix it.

Thanks,

Weiming

Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum, hosted by
The Linux Foundation
