[PATCH] D87840: [lld] Add a new known text prefix - ".text.split."

Thu Sep 17 12:21:28 PDT 2020

snehasish added a comment.

Carrying over the discussion from D87813 <https://reviews.llvm.org/D87813> since it's more appropriate here:

@MaskRay

> For your future lld patch: why can't the split sections be placed in .text.cold? This just affects how -z keep-text-section-prefix groups input sections into output sections.

The intent here is to create a new "known" output section prefix which holds the split parts of functions. The rationale for this is outlined below.
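As a rough illustration of what a "known" prefix means for -z keep-text-section-prefix, the sketch below maps an input section name to an output section name. It is a simplified stand-in, not lld's actual implementation: the helper names `hasSectionPrefix` and `outputTextSection` are hypothetical, and the prefix list is illustrative (only `.text.unlikely` and `.text.split.` come from this discussion).

```cpp
#include <array>
#include <string_view>

// Hypothetical helper (not lld's real code). `prefix` is assumed to end in '.'.
// Matches names that start with the prefix, or that equal the prefix without
// its trailing dot (e.g. ".text.split" for the prefix ".text.split.").
inline bool hasSectionPrefix(std::string_view prefix, std::string_view name) {
  return name.substr(0, prefix.size()) == prefix ||
         name == prefix.substr(0, prefix.size() - 1);
}

// Hypothetical sketch: choose the output section for a text input section.
// With the option enabled, sections carrying a known prefix keep it;
// everything else is folded into ".text".
inline std::string_view outputTextSection(std::string_view name,
                                          bool keepTextSectionPrefix) {
  // Illustrative list; the last entry is the prefix proposed in this patch.
  static constexpr std::array<std::string_view, 5> knownPrefixes = {
      ".text.hot.", ".text.unlikely.", ".text.startup.", ".text.exit.",
      ".text.split."};
  if (keepTextSectionPrefix)
    for (std::string_view prefix : knownPrefixes)
      if (hasSectionPrefix(prefix, name))
        return prefix.substr(0, prefix.size() - 1); // drop the trailing '.'
  return ".text";
}
```

Under these assumptions, an input section named `.text.split.foo` lands in a `.text.split` output section when the option is on, and in `.text` otherwise.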

@tmsriram

> Is there a good reason to put this into .text.split? Does having a new output section give you more leverage on how to manage the mapping of such code?

Yes, placing them in a separate section does provide additional leverage. In particular, your suggestion of placing the split parts in .text would benefit from hugepages. However, it may result in suboptimal icache performance, since the split parts would be interspersed with other code.

Interaction with hugepages
--------------------------

Placing the split parts in a separate section allows us to experiment with whether keeping them on hugepages is beneficial. For example, we may choose to unlock for FDO targets but keep them on hugepages for AFDO to mitigate profile quality issues.

Loss of locality
----------------

From our experiments, we found that keeping them in a separate section (as opposed to placing them in .text.unlikely or .text) improved metrics such as icache and itlb misses. For itlb misses, this is because the split parts are placed on hugepages rather than regular pages. For icache misses, we posit that without a separate section the split parts are distributed across the section, reducing locality. The loss here is significant and can be up to a 5% difference (Search B, L2i miss). Note that in this experiment we apply an aggressive 99% threshold for splitting out cold blocks.

| Metric    | .text.unlikely, Search A | .text.unlikely, Search B | .text.split, Search A | .text.split, Search B |
| --------- | ------------------------ | ------------------------ | --------------------- | --------------------- |
| l1i_miss  | 3.83                     | -1.70                    | 0.65                  | -5.17                 |
| l2_miss   | 5.48                     | 6.93                     | 0.64                  | 1.68                  |
| itlb_miss | -32.27                   | -10.25                   | -35.41                | -15.90                |
| stlb_miss | -59.39                   | -42.36                   | -67.56                | -62.20                |

Ease of monitoring hotness
--------------------------

Another benefit of placing them in a separate section is ease of monitoring by collecting sampling data from production. We can keep an eye on the hotness of the split output section to ensure that our tuning thresholds for splitting are optimal for the fleet. While I believe it is still possible to monitor the split parts produced by function splitting without a separate section (by disambiguating symbols using the ".cold" suffix), it is more tedious.
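To make the monitoring idea concrete, here is a minimal sketch of measuring the hotness of one output section from sampled program counters. Everything in it is an assumption for illustration: the `SectionRange` type, the `sampleShare` function, and the idea that samples arrive as a flat list of addresses are hypothetical and do not describe any particular profiling pipeline.

```cpp
#include <cstddef>
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical description of one output section's address range.
struct SectionRange {
  std::string name;
  std::uint64_t start; // inclusive
  std::uint64_t end;   // exclusive
};

// Fraction of sampled PCs that fall inside the given output section.
// Returns 0.0 when there are no samples.
inline double sampleShare(const std::vector<std::uint64_t> &samplePcs,
                          const SectionRange &section) {
  if (samplePcs.empty())
    return 0.0;
  std::size_t hits = 0;
  for (std::uint64_t pc : samplePcs)
    if (pc >= section.start && pc < section.end)
      ++hits;
  return static_cast<double>(hits) / static_cast<double>(samplePcs.size());
}
```

With a dedicated output section, a single range check per sample is enough. The symbol-based alternative would need to symbolize each sample and test for the ".cold" suffix.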

Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D87840/new/

https://reviews.llvm.org/D87840