[PATCH] D153771: [BOLT][Instrumentation] Fix hash table memory corruption and append-pid option

Denis Revunov via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Mon Jul 3 06:06:10 PDT 2023


treapster added a comment.

At first i couldn't reproduce with cmake configured to use clang, but with gcc the runtime library does indeed break. Turns out, the definition for

  void *operator new(size_t, void *) noexcept;

which was added in the problematic commit is not provided by gcc, and in runtimelib it's use is replaced with a call and a relocation to undefined symbol(demangled to `operator new(unsigned long, void*)`:

  0000000000008352  0000005700000004 R_X86_64_PLT32         0000000000000000 _ZnwmPv - 4

For some reason JITLink does not catch it, and after linking the runtime lib to the binary, the call is just left there with zero immediate which causes segfault.
If we provide a definition for operator new, the problem goes away. Now why i did not provide a definition:

- That <https://stackoverflow.com/a/3156822> answer says we only need definition and doesn't mention declaration
- The page on cppreference <https://en.cppreference.com/w/cpp/memory/new/operator_new> does not list (9) and (10) as replacable and in Notes section only allows definition in class scope, or in global scope with non-void pointer type.
- The page <https://timsong-cpp.github.io/cppwp/n4868/new.delete.placement> from that <https://stackoverflow.com/a/68902962> answer explicitly says: `These functions are reserved; a C++ program may not define functions that displace the versions in the C++ standard library`.

So it led me to believe that these forms of new are implicitly defined in the compiler and it is UB to define them in global scope. But apparently GCC requires them(or at least the one in question) to be explicitly defined. There is also ambiguity in whether we can call it "displacing the versions in the standard library" when we're not using standard library - whether UB arises from the definition alone or from definition when using standard library. We can probably assume the latter and get away with it, but it is not super clear. So, C++ is being C++ again.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D153771/new/

https://reviews.llvm.org/D153771



More information about the llvm-commits mailing list