[cfe-dev] Vtable code generation options
Serhii Huralnik via cfe-dev
cfe-dev at lists.llvm.org
Mon Aug 21 08:35:36 PDT 2017
Hey,
First of all, sorry if this mail should have gone to some other list, but
it seems that no one except the Clang people can help with my issue. I'm
working on OS X, but the issue is probably common to ELF-based systems too.
It is about how virtual tables are generated and which linker relocations
are needed to make them work. I have a very simple example that creates a
vtable with one slot for a pure virtual function:
class Foo {
public:
    virtual void bar() = 0;
};
class Baz : public Foo {
public:
    virtual void bar() override;
};
void Baz::bar() {}
void xyz() { Baz().bar(); }
The compiler produces the following vtable layout for Foo:
__ZTV3Foo:
    .quad 0
    .quad __ZTI3Foo
    .quad ___cxa_pure_virtual
It is easy to see that the linker will emit an absolute relocation for the
symbol ___cxa_pure_virtual. In most cases this relocation will be processed
by the dynamic linker. Let's now assume that our loadable module has a lot
of similar vtables for various classes. Each slot for a pure virtual method
will cause a new absolute relocation against the same symbol, and we end up
with a binary that contains a huge number of relocation entries that refer
to the same symbol. E.g. here are a few relocation entries from an Android
dynamic library built with clang-3.6 from the Android NDK:
...
00332fe4 00030502 R_ARM_ABS32 00000000 __cxa_pure_virtual
00332fe8 00030502 R_ARM_ABS32 00000000 __cxa_pure_virtual
00332fec 00030502 R_ARM_ABS32 00000000 __cxa_pure_virtual
00332ff0 00030502 R_ARM_ABS32 00000000 __cxa_pure_virtual
00332ff4 00030502 R_ARM_ABS32 00000000 __cxa_pure_virtual
00332ff8 00030502 R_ARM_ABS32 00000000 __cxa_pure_virtual
00332ffc 00030502 R_ARM_ABS32 00000000 __cxa_pure_virtual
003330f4 00030502 R_ARM_ABS32 00000000 __cxa_pure_virtual
003330f8 00030502 R_ARM_ABS32 00000000 __cxa_pure_virtual
003330fc 00030502 R_ARM_ABS32 00000000 __cxa_pure_virtual
...
I guess this situation is pretty common for large projects. If the dynamic
linker is not clever enough to detect it, it will waste time looking up the
same symbol over and over.
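To make the cost concrete, here is a toy model of the concern above. It is
only a sketch: Reloc, SymbolTable and both count_* functions are hypothetical
names invented for illustration and do not correspond to any real dynamic
linker. It compares a loader that does one symbol lookup per relocation entry
with one that remembers symbols it has already resolved.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <unordered_map>
#include <vector>

// Toy model only: a relocation just names the symbol it refers to.
struct Reloc {
    std::string symbol;
};

// Hypothetical stand-in for the loader's symbol search; counts invocations.
struct SymbolTable {
    std::unordered_map<std::string, unsigned long> addresses;
    std::size_t lookups = 0;

    unsigned long resolve(const std::string& name) {
        ++lookups;
        auto it = addresses.find(name);
        assert(it != addresses.end() && "toy table must define the symbol");
        return it->second;
    }
};

// Naive loader: one full lookup per relocation entry.
std::size_t count_lookups_naive(const std::vector<Reloc>& relocs,
                                SymbolTable table) {
    for (const Reloc& r : relocs) table.resolve(r.symbol);
    return table.lookups;
}

// Caching loader: resolves each distinct symbol only once.
std::size_t count_lookups_cached(const std::vector<Reloc>& relocs,
                                 SymbolTable table) {
    std::unordered_map<std::string, unsigned long> cache;
    for (const Reloc& r : relocs) {
        if (cache.find(r.symbol) == cache.end())
            cache.emplace(r.symbol, table.resolve(r.symbol));
    }
    return table.lookups;
}

// Ten entries against the same symbol, like the listing above.
std::vector<Reloc> ten_pure_virtual_relocs() {
    return std::vector<Reloc>(10, Reloc{"__cxa_pure_virtual"});
}

// A table that defines the one symbol used here (address is made up).
SymbolTable sample_table() {
    SymbolTable t;
    t.addresses["__cxa_pure_virtual"] = 0x1000;
    return t;
}
```

In this model the naive loader performs as many lookups as there are
relocation entries, while the caching one performs a single lookup per
distinct symbol. Whether a given real dynamic linker caches like this is
exactly the open question.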
Is the compiler able to avoid emitting an absolute relocation in this case?
Perhaps it could introduce a lightweight shim function that in turn calls
__cxa_pure_virtual via a jump slot. These shims could be mergeable, so the
final binary would contain only one shim instance, and each vtable would
need only a relative relocation without an expensive symbol lookup, which
would now be needed only once. Also, as far as I can see, such an approach
should not incur any significant speed or size regression for the generated
code, since the pure virtual stub won't be called often (probably no more
than once, if at all). At the same time, dynamic linking may become faster.
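A hand-written sketch of the shim idea, to show its shape. This is not
something Clang is known to generate: pure_virtual_shim, Slot and kModelSlots
are hypothetical names, the slot array is only a stand-in for the pure
virtual entries of real vtables (the offset-to-top and RTTI entries are
omitted), and the visibility attribute assumes a GCC or Clang toolchain.

```cpp
#include <cassert>
#include <cstddef>
#include <cxxabi.h>

// Hypothetical module-local shim. Hidden visibility keeps it out of the
// dynamic symbol table, so references to it can be resolved inside the
// module; only its call to __cxa_pure_virtual needs the external symbol.
extern "C" __attribute__((visibility("hidden"))) void pure_virtual_shim() {
    abi::__cxa_pure_virtual();  // does not return
}

using Slot = void (*)();

// Stand-ins for the pure virtual slots of three abstract classes.
// Every slot points at the one local shim instead of the runtime function.
static const Slot kModelSlots[] = {
    pure_virtual_shim,
    pure_virtual_shim,
    pure_virtual_shim,
};

// Counts how many model slots point at the shim.
std::size_t shim_slot_count() {
    std::size_t n = 0;
    for (Slot s : kModelSlots) {
        assert(s != nullptr);
        if (s == pure_virtual_shim) ++n;
    }
    return n;
}
```

The expectation, under the assumptions above, is that in
position-independent code the slots need only relative relocations, and the
symbol lookup for __cxa_pure_virtual happens once for the shim's jump slot.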
Does Clang support a similar approach at the moment? Or are there downsides
to the described approach that prevent it from being implemented at all?
--
Best regards
Serhii