[LLVMdev] [ARM] [PIC] optimizing the loading of hidden global variable
Weiming Zhao
weimingz at codeaurora.org
Wed Mar 12 10:43:59 PDT 2014
Hi,
When I'm compiling code with -fvisibility=hidden -fPIC for ARM, I find
that LLVM generates less optimized code than GCC.
For example:
test.cpp:
void init(void *);
int g0[100];
int g1[100];
int g2[100];
void foo() {
  init(&g0);
  init(&g1);
  init(&g2);
}
Clang will emit 1 GOT entry for each GV and 2 instructions to get the
address:
ldr r0, .LCPI0_2
add r0, r0, r4
bl _Z4initPv(PLT)
GCC does this only for the first GV. The addresses of the remaining GVs are
computed directly as offsets from it:
ldr r4, .L2
.LPIC0:
add r4, pc, r4 -> get &g0 via GOT_PC Relative
mov r0, r4
bl _Z4initPv(PLT)
add r0, r4, #400 -> get &g1
bl _Z4initPv(PLT)
add r0, r4, #800 -> get &g2
ldmfd sp!, {r4, lr}
b _Z4initPv(PLT)
.L3:
.align 2
.L2:
.word .LANCHOR0-(.LPIC0+8) -> 1 GOT offset entry
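The #400 and #800 immediates in the GCC output are just the byte offsets of g1 and g2 from a common base (100 * sizeof(int), with 4-byte int on ARM). A rough source-level sketch of the same idea follows. It assumes the three arrays can be grouped into one aggregate. The names Globals, g, seen and offset_of_call are hypothetical, and init() here is a local stand-in for the external function. Whether a given compiler then loads the base address only once is not guaranteed; the sketch only shows that the offsets are fixed at compile time.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical aggregate: placing the arrays in one object gives them
// fixed, compile-time-known offsets from a single base address.
struct Globals {
    int g0[100];
    int g1[100];
    int g2[100];
};

Globals g;  // hypothetical single global replacing g0, g1, g2

// Stand-in for the external init(void *); it records each pointer it
// receives so the offsets can be inspected afterwards.
static std::vector<void *> seen;
void init(void *p) { seen.push_back(p); }

void foo() {
    init(&g.g0);  // base address
    init(&g.g1);  // base + 100 * sizeof(int)
    init(&g.g2);  // base + 200 * sizeof(int)
}

// Byte distance of the i-th recorded pointer from the first one.
std::ptrdiff_t offset_of_call(std::size_t i) {
    return static_cast<char *>(seen[i]) - static_cast<char *>(seen[0]);
}
```

After calling foo(), offset_of_call(1) and offset_of_call(2) give the byte offsets that GCC folds into the add instructions above.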
It seems it's a missed optimization opportunity for LLVM, in both code size
and performance. Any ideas? If so, I can open a bug and try to fix it.
Thanks,
Weiming
Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum, hosted by
The Linux Foundation