<div dir="ltr">Arguably, since this is going to be very CPU-dependent, perhaps it should be part of TTI?<div><br></div><div>-eric</div></div><br><div class="gmail_quote"><div dir="ltr" class="gmail_attr">On Thu, Oct 8, 2020 at 8:47 PM Snehasish Kumar via Phabricator &lt;<a href="mailto:reviews@reviews.llvm.org">reviews@reviews.llvm.org</a>&gt; wrote:<br></div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">snehasish created this revision.<br>
snehasish added a reviewer: tmsriram.<br>
Herald added subscribers: llvm-commits, hiraditya.<br>
Herald added a project: LLVM.<br>
snehasish requested review of this revision.<br>
<br>
Based on internal testing at Google, we found that setting the profile<br>
summary cutoff threshold to 999950 yields the best results in terms of<br>
itlb and icache metrics (as observed on Intel CPUs).<br>
<br>
*default* = Split out code if no profile count available for block<br>
*size-%* = The fraction of bytes split out of .text and .text.hot<br>
*itlb* = Misses per kilo instructions (MPKI) for itlb<br>
*icache* = Misses per kilo instructions (MPKI) for L1 icache<br>
<br>
Search1<br>
<br>
| cutoff | size-% | itlb | icache |<br>
| ------- | ------- | -------- | ------- |<br>
| default | 42.5861 | 0.0822151 | 2.46363 |<br>
| 999999 | 44.9350 | 0.0767194 | 2.44416 |<br>
| 999950 | 50.0660 | 0.075744 | 2.4091 |<br>
| 999500 | 56.9158 | 0.082564 | 2.4188 |<br>
| 995000 | 63.8625 | 0.0814927 | 2.42832 |<br>
| 990000 | 71.7314 | 0.106906 | 2.57785 |<br>
<br>
Search2<br>
<br>
| cutoff | size-% | itlb | icache |<br>
| ------- | ------ | -------- | ------- |<br>
| default | 2.8845 | 0.626712 | 4.73245 |<br>
| 999999 | 3.3291 | 0.602309 | 4.70045 |<br>
| 999950 | 3.8577 | 0.587842 | 4.71632 |<br>
| 999500 | 4.4170 | 0.63577 | 4.68351 |<br>
| 995000 | 5.1020 | 0.657969 | 4.82272 |<br>
| 990000 | 5.7153 | 0.719122 | 5.39496 |<br>
<br>
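To make the cutoff column in the tables above concrete, here is a minimal sketch of how a<br>
percentile cutoff can be turned into a cold-block test. It assumes the cutoff is expressed in<br>
parts per million (so 999950 stands for 99.995% of the total profile weight). The names<br>
countThresholdForCutoff and isColdAtCutoff are hypothetical, and this is a simplified model for<br>
illustration, not the code in MachineFunctionSplitter.cpp or the profile summary analysis.<br>
<br>
```cpp
#include <algorithm>
#include <cstdint>
#include <functional>
#include <vector>

// Assumption: cutoffs are given in parts per million, so 999950 == 99.995%.
constexpr uint64_t kScale = 1000000;

// Hypothetical helper: walks the block counts from hottest to coldest and
// returns the count of the block at which the running sum first covers
// `cutoff` / kScale of the total weight. Returns 0 for an empty profile.
// Note: the products below can overflow for very large real-world counts;
// that is ignored here to keep the sketch short.
uint64_t countThresholdForCutoff(std::vector<uint64_t> counts,
                                 uint64_t cutoff) {
  std::sort(counts.begin(), counts.end(), std::greater<uint64_t>());

  uint64_t total = 0;
  for (uint64_t c : counts)
    total += c;

  uint64_t running = 0;
  for (uint64_t c : counts) {
    running += c;
    // Integer form of: running / total >= cutoff / kScale.
    if (running * kScale >= total * cutoff)
      return c;
  }
  return 0;
}

// Hypothetical helper: in this model a block is cold when its count lies
// strictly below the threshold, i.e. it is not needed to reach the cutoff.
bool isColdAtCutoff(uint64_t blockCount, uint64_t threshold) {
  return blockCount < threshold;
}
```
<br>
In this model, lowering the cutoff raises the count threshold, so more blocks are classified as<br>
cold. That matches the direction of the size-% column, which grows as the cutoff drops from<br>
999999 to 990000.<br>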
<br>
Repository:<br>
rG LLVM Github Monorepo<br>
<br>
<a href="https://reviews.llvm.org/D89085" rel="noreferrer" target="_blank">https://reviews.llvm.org/D89085</a><br>
<br>
Files:<br>
llvm/lib/CodeGen/MachineFunctionSplitter.cpp<br>
llvm/test/CodeGen/X86/machine-function-splitter.ll<br>
<br>
<br>
Index: llvm/test/CodeGen/X86/machine-function-splitter.ll<br>
===================================================================<br>
--- llvm/test/CodeGen/X86/machine-function-splitter.ll<br>
+++ llvm/test/CodeGen/X86/machine-function-splitter.ll<br>
@@ -1,5 +1,5 @@<br>
; RUN: llc < %s -mtriple=x86_64-unknown-linux-gnu -split-machine-functions | FileCheck %s -check-prefix=MFS-DEFAULTS<br>
-; RUN: llc < %s -mtriple=x86_64-unknown-linux-gnu -split-machine-functions -mfs-count-threshold=2000 | FileCheck %s --dump-input=always -check-prefix=MFS-OPTS1<br>
+; RUN: llc < %s -mtriple=x86_64-unknown-linux-gnu -split-machine-functions -mfs-psi-cutoff=0 -mfs-count-threshold=2000 | FileCheck %s --dump-input=always -check-prefix=MFS-OPTS1<br>
; RUN: llc < %s -mtriple=x86_64-unknown-linux-gnu -split-machine-functions -mfs-psi-cutoff=950000 | FileCheck %s -check-prefix=MFS-OPTS2<br>
<br>
define void @foo1(i1 zeroext %0) nounwind !prof !14 !section_prefix !15 {<br>
Index: llvm/lib/CodeGen/MachineFunctionSplitter.cpp<br>
===================================================================<br>
--- llvm/lib/CodeGen/MachineFunctionSplitter.cpp<br>
+++ llvm/lib/CodeGen/MachineFunctionSplitter.cpp<br>
@@ -43,7 +43,7 @@<br>
PercentileCutoff("mfs-psi-cutoff",<br>
cl::desc("Percentile profile summary cutoff used to "<br>
"determine cold blocks. Unused if set to zero."),<br>
- cl::init(0), cl::Hidden);<br>
+ cl::init(999950), cl::Hidden);<br>
<br>
static cl::opt<unsigned> ColdCountThreshold(<br>
"mfs-count-threshold",<br>
<br>
<br>
</blockquote></div>