[libcxx-commits] [PATCH] D127444: [libc++] Use ABI tags instead of internal linkage to provide per-TU insulation

Eric Fiselier via Phabricator via libcxx-commits libcxx-commits at lists.llvm.org
Mon Jun 13 09:47:27 PDT 2022


EricWF requested changes to this revision.
EricWF added a comment.
This revision now requires changes to proceed.

Hmm, so I'm not sure this is the best idea. At worst, I think it'll introduce ODR bugs if we use it on anything other than inline functions [2], and when applied only to inline functions it helps only in a very very limited number of cases [1].

[1]. Since most changes to a function signature affect the mangling as well, there's no additional benefit from further changing the mangling using `__abi_tag__`. The one case it would catch is if we change only the return type of a non-template function (since the return type is mangled into function templates).
 And we shouldn't be changing the return type of user-facing functions, like ever.

[2] All of the following examples break when the attribute is applied to a class type.

  // RUN: clang++ -c -o test1.o -DTAG=1 test.cpp && clang++ -o a.out -DTAG=2 test.cpp test1.o && ./a.out
  #include <typeinfo>
  #include <cassert>
  #include <cstdio>
  
  #ifndef TAG
  #error TAG must be defined
  #endif
  #define TO_STR2(X) #X
  #define TO_STR(X) TO_STR2(X)
  
  #define HIDDEN __attribute__((__abi_tag__(TO_STR(TAG))))
  
  struct HIDDEN  MyType {
    virtual ~MyType() = default;
    static  int data;
  };
  
  inline  int MyType::data = 42;
  
  int* get_data();
  const std::type_info* get_info();
  void throw_type();
  
  __attribute__((weak)) void passes_param(MyType); // weak only used to demonstrate linker failure at runtime rather than linktime.
  
  #if TAG == 1
  int* get_data() {
    return &MyType::data;
  }
  void throw_type() {
    MyType t{};
    throw t;
  }
  
  const std::type_info* get_info() {
    return &typeid(MyType);
  }
  void passes_param(MyType) {
    printf("INFO: Passed to MyType at TAG=1");
  }
  #else
  int main() {
    int num_errors = 0;
    if (&passes_param) {
      printf("SUCCESS: passes_param(MyType) has definition\n");
    } else {
      ++num_errors;
      printf("ERROR: passes_param(MyType) is undefined\n");
    }
    if (get_data() == &MyType::data) {
      printf("SUCCESS: data at TAG=1 == data at TAG=2\n");
    } else {
  
      printf("ERROR: data at TAG=1 != data at TAG=2\n");
      ++num_errors;
    }
    if (*get_info() == typeid(MyType)) {
  
      printf("typeid(MyType)@TAG=1 == typeid(MyType)@TAG=2\n");
    } else {
      printf("ERROR: typeid(MyType)@TAG=1 == typeid(MyType)@TAG=2\n");
      ++num_errors ;
    }
    try {
      throw_type();
      assert(false);
    } catch (MyType const& t) {
      printf("INFO: exception MyType at TAG=1 caught as MyType at TAG=2\n");
    } catch (...) {
      printf("ERROR: exception MyType at TAG=1 not caught\n");
      ++num_errors;
    }
    printf("%d errors...\n", num_errors);
    return num_errors;
  }
  #endif





================
Comment at: libcxx/include/__config:687
+// on two levels:
+// 1. The symbol is given hidden visibility, which ensures that users won't start exporting
+//    symbols from their dynamic library by means of using the libc++ headers. This ensures
----------------
Symbols that are linkonce_odr and that get emitted inside a users dylib are never linked to from outside the dylib. So they're already hermetic in that way.



================
Comment at: libcxx/include/__config:691
+//
+// 2. The symbol is given an ABI tag that changes with each version of libc++. This ensures
+//    that no ODR violation can arise from mixing two TUs compiled with different versions
----------------
This actually seems like a rather big footgun to me. At least if it's ever applied to anything other than an inline function.

In the case where it's applied to either (A) a type, or (B) a data member, then, Instead of ensuring that no ODR violations arise, this all but ensures ODR violations do arise. As a result any behavior that depends on uniqueness (typeid, exceptions, static data members, vtables, type mangling, anything with a unique address) will now have one definition per libc++ version, and valid behavior break.



Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D127444/new/

https://reviews.llvm.org/D127444



More information about the libcxx-commits mailing list