[PATCH] D85628: [HotColdSplitting] Add command line options for supplying cold function names via user input.

Jay Feldblum via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Wed Aug 12 17:02:54 PDT 2020


yfeldblum added a comment.

In D85628#2214314 <https://reviews.llvm.org/D85628#2214314>, @hiraditya wrote:

> If there was a way to provide handwritten profile/coverage file, maybe that would work in absence of profile information?

I am not sure I see a need for profiles here. We just need `__cxa_guard_acquire` to be marked cold, and for the compiler to infer that the rest of a block containing a call to something marked cold is itself cold. (Apparently HCS does this?)

The compiler can also infer coldness of hidden functions which are only called by cold code.
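
To make the two kinds of coldness concrete, here is a minimal sketch, assuming the GCC/Clang `cold` attribute syntax. The names `on_first_use`, `expensive_setup`, `lookup`, and `cold_calls` are hypothetical and only illustrate the shape; whether a given pass actually propagates coldness this way is up to the compiler.

```cpp
#include <cassert>

static int cold_calls = 0;  // hypothetical instrumentation, not part of the proposal

// Explicitly marked cold, standing in for how __cxa_guard_acquire would be treated.
__attribute__((cold, noinline)) void on_first_use() {
    assert(cold_calls == 0 && "the slow path is expected to run once");
    ++cold_calls;
}

// Internal linkage and reachable only from the cold block below, so a compiler
// could infer that this function is cold as well.
static int expensive_setup(int seed) { return seed * 31 + 7; }

int lookup(int key) {
    static int cached = 0;      // constant-initialized, so no guard is needed here
    static bool ready = false;  // hand-written stand-in for a guard (not thread-safe)
    if (!ready) {
        // Everything in this block sits next to a call to a cold function,
        // which is the basis for inferring that the block is cold.
        on_first_use();
        cached = expensive_setup(key);
        ready = true;
    }
    return cached + key;
}
```

The first call to `lookup` runs the cold block, and later calls take only the hot path.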

But let's take `std::string` as an example. Let's say we have a function with a local static variable of type `std::string`. The goal is to have the slow path, which would otherwise sit inline in the function, outlined to the cold section, and for that slow path to be minimal in size. So we just emit a call to the `std::string` ctor, which is compiled normally since it is inline and not hidden, instead of inlining the ctor into the slow path. We can assume that the `std::string` ctor is ODR-used *somewhere* in the resulting binary, so we can also assume that forcing a reference to it will not increase overall code size.
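
Roughly, the shape I have in mind looks like the hand-written approximation below. This is a sketch under assumptions, not what the compiler emits: `raw`, `slot`, `init_greeting`, `greeting_lowered`, and `slow_path_runs` are hypothetical names, the hand-written guard is not thread-safe (the real one goes through `__cxa_guard_acquire`), and whether the ctor call stays out of line is still up to the optimizer.

```cpp
#include <cassert>
#include <new>
#include <string>

// What the user writes: a function with a local static std::string.
const std::string& greeting() {
    static const std::string value("hello");
    return value;
}

// Hand-lowered approximation of the desired result.
alignas(std::string) static unsigned char raw[sizeof(std::string)];
static std::string* slot = nullptr;  // doubles as the guard in this sketch
static int slow_path_runs = 0;       // hypothetical instrumentation

// Outlined slow path: as small as possible, and marked cold (GCC/Clang syntax).
__attribute__((cold, noinline)) static void init_greeting() {
    assert(slot == nullptr);
    ++slow_path_runs;
    // The intent is a plain call to the shared std::string ctor here rather
    // than an inlined copy of its body. The destructor is never run in this
    // sketch, unlike for a real local static.
    slot = new (raw) std::string("hello");
}

// Hot path: one predictable branch, then the reference.
const std::string& greeting_lowered() {
    if (__builtin_expect(slot == nullptr, 0)) init_greeting();
    return *slot;
}
```

Both functions return the same string; the difference is only in where the initialization code ends up.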

Whether the compiler emits a perf-optimized or a size-optimized definition for the `std::string` ctor may be influenced by profiles. But that seems to me like a separate question that doesn't pertain to the specific topic of handling local static variables.

But for another type, say `hidden_foo`, which is defined in an anonymous namespace and which is constructed at only one site, with that site inferred to be cold, we can take a different approach and possibly inline the ctor. Here, if there is only one site, minimal code size would (likely) imply inlining the ctor; but if there are two sites, minimal code size would (likely) imply emitting a ctor definition and then calling it twice.
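
For illustration, the two cases might look like the sketch below. `hidden_bar` and the function names are hypothetical, added only so that the one-site and two-site cases can live in the same translation unit; the comments describe the likely size-minimal choice, not something the compiler is guaranteed to do.

```cpp
#include <cassert>

namespace {

// Internal linkage: no other translation unit can reference these ctors.
struct hidden_foo {
    int a;
    int b;
    explicit hidden_foo(int x) : a(x), b(2 * x) { assert(x >= 0); }
};

struct hidden_bar {
    int v;
    explicit hidden_bar(int x) : v(x * x) {}
};

}  // namespace

// One construction site: inlining the ctor into the cold slow path is likely
// smallest, since no separate ctor definition is needed at all.
int one_site() {
    static hidden_foo f(3);
    return f.a + f.b;
}

// Two construction sites: emitting one ctor definition and calling it from
// both cold slow paths is likely smaller than inlining it twice.
int first_site() {
    static hidden_bar b(4);
    return b.v;
}

int second_site() {
    static hidden_bar b(5);
    return b.v;
}
```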


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D85628/new/

https://reviews.llvm.org/D85628


