[cfe-dev] GCC and Clang produce undefined references to functions with vague linkage

Thu Jun 28 13:00:19 PDT 2012

On Jun 28, 2012, at 12:12 PM, Joe Buck wrote:
> On Thu, Jun 28, 2012 at 02:13:47PM -0400, Rafael Espíndola wrote:
> [ problem with visibility for bar::~bar for testcase ]
>> $ cat test.h
>> struct foo {
>>  virtual ~foo();
>> };
>> struct bar : public foo {
>>  virtual void zed();
>> };
>> $ cat def.cpp
>> #include "test.h"
>> void bar::zed() {
>> }
>> $ cat undef.cpp
>> #include "test.h"
>> void f() {
>>  foo *x(new bar);
>>  delete x;
>> }
>> 
> ...
>> 
>> I can see two ways of solving this and would like for both clang and
>> gcc to implement the same:
>> 
>> [1] * Make sure the destructor is emitted everywhere. That is, clang and
>> gcc have a bug in producing an undefined reference to _ZN3barD0Ev.
>> [2] * Make it clear that the file exporting the vtable has to export the
>> symbols used in it. That is, the Itanium c++ abi needs clarification
>> and so does gcc's lto plugin (and the llvm patch I am working on).
> 
> I think that the second solution wins because it allows for the production
> of less object code, and it is consistent with the rationale for the
> vtable optimization rule (the vtable is emitted by the file that has the
> definition for the first non-inline virtual function; simply do the same
> for the auto-generated virtual destructor).  The first solution requires
> making one copy per compilation unit and eliminating the duplicates at
> link time.

But that's pervasively true in C++: the linker has to eliminate duplicates
all the time.  Idiomatic C++ code ends up plunking down hundreds, if
not thousands, of inline functions in every single translation unit.  This is
already a massive burden for linking C++ programs, particularly in debug
builds.  Adding a few extra symbols when the optimizer determines that
it can devirtualize, but declines to inline, is essentially irrelevant.

In fact, it is particularly unimportant because it's very likely that this duplicate
will be in the same DSO as the vtable.  That means that the first solution
imposes some extra work on the static linker alone (but again, only when
devirtualizing a call to a function we don't want to inline!) while preserving
our ability to reduce work for the dynamic linker (since calls do not rely
on address equality of the function across translation units).  The second
solution is an irrevocable guarantee that every symbol mentioned in
a strong vtable *must* be exported to the whole world.

Also recall that these symbols can already be emitted in arbitrary
other translation units, so we cannot change the ABI to say that these
symbols are *only* emitted in the file defining the vtable.

Finally, both the language standard and the ABI are clearly designed
around an assumption that every translation unit that needs an inline
function will emit it.
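
To make that concrete, here is a rough single-file condensation of the
quoted test case. It is a sketch only: the definition of foo::~foo(), the
`destroyed` counter, and the run_f() helper are my additions for
illustration and are not part of the original files. The point is that
bar's destructor is implicitly declared, so it is an inline function with
vague linkage, and the translation unit containing f() needs it.

```cpp
// Hypothetical counter, added only so the destructor call is observable.
static int destroyed = 0;

struct foo {
  virtual ~foo();
};
// Assumed definition; the quoted test.h only declares this destructor.
foo::~foo() { ++destroyed; }

struct bar : public foo {
  // First non-inline virtual function of bar, so the vtable for bar is
  // emitted in whichever file defines zed() (def.cpp in the test case).
  virtual void zed();
  // ~bar() is implicitly declared: virtual (because ~foo() is) and inline.
};
void bar::zed() {}

// Same body as f() in undef.cpp. The delete goes through bar's deleting
// destructor, which is _ZN3barD0Ev under the Itanium mangling.
void f() {
  foo *x(new bar);
  delete x;
}

// Hypothetical helper: runs f() and reports how many times ~foo() ran.
int run_f() {
  destroyed = 0;
  f();
  return destroyed;
}
```

With the test case split across def.cpp and undef.cpp as quoted, the open
question is whether undef.cpp may refer to _ZN3barD0Ev without emitting
its own copy.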

John.