<div dir="ltr">Arguably, since this is going to be very CPU-dependent, perhaps it should be part of TTI?<div><br></div><div>-eric</div></div><br><div class="gmail_quote"><div dir="ltr" class="gmail_attr">On Thu, Oct 8, 2020 at 8:47 PM Snehasish Kumar via Phabricator <<a href="mailto:reviews@reviews.llvm.org">reviews@reviews.llvm.org</a>> wrote:<br></div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">snehasish created this revision.<br>
snehasish added a reviewer: tmsriram.<br>
Herald added subscribers: llvm-commits, hiraditya.<br>
Herald added a project: LLVM.<br>
snehasish requested review of this revision.<br>
<br>
Based on internal testing at Google we found that setting the profile<br>
summary cutoff threshold to 999950 yields the best results in terms of<br>
itlb and icache metrics (as observed on Intel CPUs).<br>
<br>
*default* = Split out code if no profile count available for block<br>
*size-%*  = The fraction of bytes split out of .text and .text.hot<br>
*itlb*    = Misses per kilo instructions (MPKI) for itlb<br>
*icache*  = Misses per kilo instructions (MPKI) for L1 icache<br>
<br>
Search1<br>
<br>
| cutoff  | size-%  | itlb      | icache  |<br>
| ------- | ------- | --------- | ------- |<br>
| default | 42.5861 | 0.0822151 | 2.46363 |<br>
| 999999  | 44.9350 | 0.0767194 | 2.44416 |<br>
| 999950  | 50.0660 | 0.075744  | 2.4091  |<br>
| 999500  | 56.9158 | 0.082564  | 2.4188  |<br>
| 995000  | 63.8625 | 0.0814927 | 2.42832 |<br>
| 990000  | 71.7314 | 0.106906  | 2.57785 |<br>
<br>
<br>
Search2<br>
<br>
| cutoff  | size-% | itlb     | icache  |<br>
| ------- | ------ | -------- | ------- |<br>
| default | 2.8845 | 0.626712 | 4.73245 |<br>
| 999999  | 3.3291 | 0.602309 | 4.70045 |<br>
| 999950  | 3.8577 | 0.587842 | 4.71632 |<br>
| 999500  | 4.4170 | 0.63577  | 4.68351 |<br>
| 995000  | 5.1020 | 0.657969 | 4.82272 |<br>
| 990000  | 5.7153 | 0.719122 | 5.39496 |<br>
<br>
<br>
Repository:<br>
  rG LLVM Github Monorepo<br>
<br>
<a href="https://reviews.llvm.org/D89085" rel="noreferrer" target="_blank">https://reviews.llvm.org/D89085</a><br>
<br>
Files:<br>
  llvm/lib/CodeGen/MachineFunctionSplitter.cpp<br>
  llvm/test/CodeGen/X86/machine-function-splitter.ll<br>
<br>
<br>
Index: llvm/test/CodeGen/X86/machine-function-splitter.ll<br>
===================================================================<br>
--- llvm/test/CodeGen/X86/machine-function-splitter.ll<br>
+++ llvm/test/CodeGen/X86/machine-function-splitter.ll<br>
@@ -1,5 +1,5 @@<br>
 ; RUN: llc < %s -mtriple=x86_64-unknown-linux-gnu -split-machine-functions | FileCheck %s -check-prefix=MFS-DEFAULTS<br>
-; RUN: llc < %s -mtriple=x86_64-unknown-linux-gnu -split-machine-functions -mfs-count-threshold=2000 | FileCheck %s --dump-input=always -check-prefix=MFS-OPTS1<br>
+; RUN: llc < %s -mtriple=x86_64-unknown-linux-gnu -split-machine-functions -mfs-psi-cutoff=0 -mfs-count-threshold=2000 | FileCheck %s --dump-input=always -check-prefix=MFS-OPTS1<br>
 ; RUN: llc < %s -mtriple=x86_64-unknown-linux-gnu -split-machine-functions -mfs-psi-cutoff=950000 | FileCheck %s -check-prefix=MFS-OPTS2<br>
<br>
 define void @foo1(i1 zeroext %0) nounwind !prof !14 !section_prefix !15 {<br>
Index: llvm/lib/CodeGen/MachineFunctionSplitter.cpp<br>
===================================================================<br>
--- llvm/lib/CodeGen/MachineFunctionSplitter.cpp<br>
+++ llvm/lib/CodeGen/MachineFunctionSplitter.cpp<br>
@@ -43,7 +43,7 @@<br>
     PercentileCutoff("mfs-psi-cutoff",<br>
                      cl::desc("Percentile profile summary cutoff used to "<br>
                               "determine cold blocks. Unused if set to zero."),<br>
-                     cl::init(0), cl::Hidden);<br>
+                     cl::init(999950), cl::Hidden);<br>
<br>
 static cl::opt<unsigned> ColdCountThreshold(<br>
     "mfs-count-threshold",<br>
<br>
<br>
</blockquote></div>
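<div dir="ltr">For readers unfamiliar with the cutoff values in the quoted tables: they appear to be expressed in parts per million of total profile count, so 999950 would mean "the hottest 99.995% of the profile". The sketch below is a minimal, standalone illustration of how such a percentile cutoff can be turned into a count threshold. It is not the LLVM implementation; the function names minCountForCutoff and isColdInSketch are hypothetical, and the per-million scale and the strict "less than" comparison are assumptions made for the example.</div>

```cpp
#include <algorithm>
#include <cstdint>
#include <functional>
#include <vector>

// Hypothetical helper (not an LLVM API). Given per-block profile counts and a
// cutoff in parts per million, return the smallest count that is still needed
// to cover that fraction of the total profile weight, hottest blocks first.
uint64_t minCountForCutoff(std::vector<uint64_t> counts, uint64_t cutoffPpm) {
  const uint64_t kScale = 1000000; // assumed scale: cutoff is per million

  // Visit blocks from hottest to coldest.
  std::sort(counts.begin(), counts.end(), std::greater<uint64_t>());

  uint64_t total = 0;
  for (uint64_t c : counts)
    total += c;

  // floor(total * cutoffPpm / kScale), split into quotient and remainder
  // parts so the intermediate product stays small for realistic inputs.
  uint64_t desired = (total / kScale) * cutoffPpm +
                     (total % kScale) * cutoffPpm / kScale;

  uint64_t cumulative = 0;
  uint64_t minCount = 0;
  for (uint64_t c : counts) {
    if (cumulative >= desired)
      break;
    minCount = c;      // last count admitted into the hot set
    cumulative += c;
  }
  return minCount;
}

// In this sketch a block is cold when its count is strictly below the
// threshold; the real pass may use a different comparison.
bool isColdInSketch(uint64_t blockCount, uint64_t threshold) {
  return blockCount < threshold;
}
```

<div dir="ltr">With the made-up counts {1000000, 500, 40, 10, 0}, a cutoff of 999999 gives a threshold of 10, 999950 gives 500, and 990000 gives 1000000. Lowering the cutoff therefore classifies more blocks as cold, which matches the direction of the size-% column in the quoted tables.</div>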