[clang] [ItaniumCXXABI] Mark RTTI type name as global unnamed_addr (PR #111343)
Richard Smith via cfe-commits
cfe-commits at lists.llvm.org
Tue Oct 8 11:10:34 PDT 2024
zygoloid wrote:
> But, I have a question: now that it has ensured the uniqueness of typeinfo's address, why does the implementation still compare the type equality by the address of type name string?
The uniqueness of the address of the `type_info` object itself is not guaranteed. The reason is that we sometimes [generate `type_info` objects for pointers to incomplete types](https://itanium-cxx-abi.github.io/cxx-abi/abi.html#:~:text=When%20it%20is,the%20type_info%20addresses.) as part of exception handling, so we can end up with multiple `type_info` objects for the same type.
> For symbols with internal linkage or hidden visibility, I don't think there would be problems if they were allowed to be merged. For example, considering there are 2 translation units 0.cpp and 1.cpp, and we defined internal `class A {}` in these 2 translation units, since they are all internal symbols, I think comparing `0.cpp::A` with `1.cpp::A` is undefined behavior. Because there at least one symbol was referenced outside of the current visibility scope.
Such comparisons can happen in valid C++ code. For example:
```c++
// a.cc
namespace { class A {}; }
void throwA() { throw A(); }
```
```c++
// b.cc
namespace { class A {}; }
void throwA();
int main() {
try { throwA(); }
catch (A) { return 1; }
catch (...) { return 2; }
}
```
This program is valid and `main` is required to return 2 -- the `A` in `a.cc` and the `A` in `b.cc` are two different types. During exception handling, we will compare them by comparing their `type_info`, which means we will compare the addresses of the name strings, so we need two different addresses.
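To make the address requirement concrete, here is a minimal sketch of the two comparison strategies. `FakeTypeInfo`, `sameTypeByAddress`, `sameTypeByString`, and the two name arrays are hypothetical stand-ins invented for illustration, not the actual runtime or libc++ code. The string used is what I believe the Itanium type name for `(anonymous namespace)::A` to be.

```c++
#include <cassert>
#include <cstring>

// Hypothetical stand-in for a type_info object: for this illustration only
// the pointer to the type name string matters.
struct FakeTypeInfo {
  const char *name;
};

// Address-based comparison: equal only if both point at the same string.
static bool sameTypeByAddress(const FakeTypeInfo &a, const FakeTypeInfo &b) {
  return a.name == b.name;
}

// String-based comparison: equal whenever the bytes match.
static bool sameTypeByString(const FakeTypeInfo &a, const FakeTypeInfo &b) {
  return std::strcmp(a.name, b.name) == 0;
}

// Two distinct objects with identical contents, modelling the name strings
// emitted for the A in a.cc and the A in b.cc.
static const char nameFromA[] = "N12_GLOBAL__N_11AE";
static const char nameFromB[] = "N12_GLOBAL__N_11AE";

static void checkNameAddressComparison() {
  FakeTypeInfo thrown{nameFromA};
  FakeTypeInfo caught{nameFromB};

  // Distinct storage: the address comparison keeps the two types apart.
  assert(!sameTypeByAddress(thrown, caught));
  // The bytes are identical, so a string comparison cannot tell them apart.
  assert(sameTypeByString(thrown, caught));

  // If the two strings were merged into one, both type_info objects would
  // point at the same address and the types would wrongly compare equal.
  FakeTypeInfo mergedCaught{nameFromA};
  assert(sameTypeByAddress(thrown, mergedCaught));
}
```

In this model, merging `nameFromA` and `nameFromB` is exactly what would make `catch (A)` in `b.cc` match the exception thrown from `a.cc`.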
> For dynamic loading, if there are two same symbols in different DSOs, the symbol would be interposed.
Not in the cases I mentioned. (I think we can *probably* get away with marking type name strings as `unnamed_addr` if the type has external linkage, because we don't expect `unnamed_addr` to have any effect after static linking. But it's not clear to me that that's actually guaranteed by the LLVM semantics, or whether it would be permissible to make use of `unnamed_addr` in some cross-DSO LTO situation.)
You *can* still apply `unnamed_addr` in the cases where the target ABI rule is that `type_info` comparisons will always use or fall back to a string comparison. Per the libc++ implementation, that's the case on Apple arm64 targets. You can detect this using `classifyRTTIUniqueness`.
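A rough sketch of why the fallback makes merging harmless. `NameComparison` and `namesDenoteSameType` are hypothetical names for illustration only; this is not the libc++ implementation and it does not model how `classifyRTTIUniqueness` decides which types get which rule. It only shows that, once a string comparison is the fallback, the answer no longer depends on whether two equal strings share an address.

```c++
#include <cassert>
#include <cstring>

// Hypothetical model of the two comparison rules discussed above.
enum class NameComparison {
  AddressOnly,       // names are assumed unique: compare pointers only
  AddressThenString  // compare pointers, then fall back to the contents
};

static bool namesDenoteSameType(const char *lhs, const char *rhs,
                                NameComparison policy) {
  if (lhs == rhs)
    return true;  // same string object, so same name under either rule
  if (policy == NameComparison::AddressOnly)
    return false; // different addresses are treated as different types
  return std::strcmp(lhs, rhs) == 0; // fallback: compare the bytes
}

// Two copies of the same name in distinct storage ("1B" would be the type
// name string for a class B at global scope).
static const char copy1[] = "1B";
static const char copy2[] = "1B";

static void checkFallbackIsMergeInsensitive() {
  // With the fallback, unmerged (copy1 vs copy2) and merged (copy1 vs copy1)
  // give the same answer, so merging cannot change behaviour.
  assert(namesDenoteSameType(copy1, copy2, NameComparison::AddressThenString) ==
         namesDenoteSameType(copy1, copy1, NameComparison::AddressThenString));

  // Without the fallback the answer flips when the strings are merged,
  // which is why the address is significant under that rule.
  assert(namesDenoteSameType(copy1, copy2, NameComparison::AddressOnly) !=
         namesDenoteSameType(copy1, copy1, NameComparison::AddressOnly));
}
```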
I think it's also correct and safe to apply `local_unnamed_addr` to these type name strings in all cases. Merging with another string literal from the same compilation should always be OK.
https://github.com/llvm/llvm-project/pull/111343