[PATCH] D57125: [HotColdSplit] Introduce a cost model to control splitting behavior

Vedant Kumar via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Wed Jan 23 15:48:27 PST 2019


vsk created this revision.
vsk added reviewers: tejohnson, junbuml, hiraditya, fhahn, sebpop.
Herald added subscribers: kristof.beyls, javed.absar.

The main goal of the model is to avoid *increasing* function size, as
that would eliminate any memory locality benefit from splitting.
Splitting increases function size when:

- There are too many inputs or outputs to the cold region. Argument materialization and reloads of outputs have a cost.
- The cold region has too many distinct exit blocks, causing a large switch to be formed in the caller.
- The code size of the split-out region is less than the cost of setting up a call to it.

A secondary goal is to prevent excessive overall binary size growth.
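
To make the trade-off above concrete, here is a minimal standalone sketch
of a benefit-versus-penalty check. It is not the code in this patch: the
struct, the function names, the per-item weights, and the way the
threshold enters the comparison are all assumptions made for
illustration.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical summary of a candidate cold region. None of these names
// are taken from the patch.
struct ColdRegionSummary {
  int CodeSizeCost;          // estimated size of the code to be outlined
  std::size_t NumInputs;     // values passed into the outlined function
  std::size_t NumOutputs;    // values the caller reloads after the call
  std::size_t NumExitBlocks; // distinct successors outside the region
};

// Illustrative weights, chosen only to make the example concrete.
constexpr int CallSetupCost = 2;
constexpr int CostPerInput = 1;
constexpr int CostPerOutput = 2;
constexpr int CostPerExtraExit = 1;

// Benefit: the code that leaves the hot function.
int outliningBenefit(const ColdRegionSummary &R) { return R.CodeSizeCost; }

// Penalty: the code the caller gains in exchange.
int outliningPenalty(const ColdRegionSummary &R) {
  int Penalty = CallSetupCost;
  Penalty += CostPerInput * static_cast<int>(R.NumInputs);
  Penalty += CostPerOutput * static_cast<int>(R.NumOutputs);
  // Each exit beyond the first adds a case to the switch in the caller.
  if (R.NumExitBlocks > 1)
    Penalty += CostPerExtraExit * static_cast<int>(R.NumExitBlocks - 1);
  return Penalty;
}

// Assumption: split only when the net benefit exceeds the threshold.
bool shouldSplit(const ColdRegionSummary &R, int Threshold) {
  assert(Threshold >= 0 && "threshold is expected to be non-negative");
  return outliningBenefit(R) - outliningPenalty(R) > Threshold;
}
```

Under these made-up weights, a region with cost 10, one input, and a
single exit is split at threshold 2, while a region with cost 3, two
inputs, one output, and three exits is rejected.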

With the cost model in place, I experimented to find a splitting
threshold that works well in practice. To make warm & cold code easily
separable for analysis purposes, I moved split functions to a "__cold"
section. I experimented with thresholds in the range [0, 4] and set the
default to the threshold that minimized geomean __text size.
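
The selection step described above can be sketched as follows. This is
an illustration only, not the script used for the measurements: the
function names and the input layout (one vector of per-program sizes per
threshold) are assumptions.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Geometric mean computed as exp(mean(log(x))), which avoids overflowing
// the running product. All sizes must be strictly positive.
double geomean(const std::vector<double> &Sizes) {
  assert(!Sizes.empty() && "need at least one size");
  double LogSum = 0.0;
  for (double S : Sizes) {
    assert(S > 0.0 && "geomean is undefined for non-positive sizes");
    LogSum += std::log(S);
  }
  return std::exp(LogSum / static_cast<double>(Sizes.size()));
}

// SizesPerThreshold[T] holds the per-program section sizes measured with
// threshold T. Returns the threshold with the smallest geomean; on a tie
// the lowest threshold wins.
std::size_t
bestThreshold(const std::vector<std::vector<double>> &SizesPerThreshold) {
  assert(!SizesPerThreshold.empty() && "need at least one configuration");
  std::size_t Best = 0;
  double BestMean = geomean(SizesPerThreshold[0]);
  for (std::size_t T = 1; T < SizesPerThreshold.size(); ++T) {
    double Mean = geomean(SizesPerThreshold[T]);
    if (Mean < BestMean) {
      BestMean = Mean;
      Best = T;
    }
  }
  return Best;
}
```

Because a geometric mean cannot include zero-sized entries, a sketch like
this would have to skip programs with an empty section before calling
geomean().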

Experiment data from building LNT+externals for X86 (N = 639 programs,
all sizes in bytes):

| Configuration | __text geom size | __cold geom size | TEXT geom size |
| *-Os*         | 1736.3           | 0, n=0           | 10961.6        |
| -Os, thresh=0 | 1740.53          | 124.482, n=134   | 11014          |
| -Os, thresh=1 | 1734.79          | 57.8781, n=90    | 10978.6        |
| -Os, thresh=2 | 1733.85          | 65.6604, n=61    | 10977.6        |
| -Os, thresh=3 | 1733.85          | 65.3071, n=61    | 10977.6        |
| -Os, thresh=4 | 1735.08          | 67.5156, n=54    | 10965.7        |
| *-Oz*         | 1554.4           | 0, n=0           | 10153          |
| -Oz, thresh=2 | 1552.2           | 65.633, n=61     | 10176          |
| *-O3*         | 2563.37          | 0, n=0           | 13105.4        |
| -O3, thresh=2 | 2559.49          | 71.1072, n=61    | 13162.4        |

Picking thresh=2 reduces the geomean __text section size by about 0.14%
at -Os and -Oz and 0.15% at -O3. It grows the geomean TEXT segment size
by roughly 0.15% at -Os, 0.23% at -Oz, and 0.43% at -O3. Note that TEXT
size is page-aligned, whereas section sizes are byte-aligned.
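
For readers who want to reproduce the percentages or see why page
alignment matters, here is a small sketch. The helper names are
hypothetical, and any page size passed in is an example value rather than
a statement about a particular target.

```cpp
#include <cassert>
#include <cstdint>

// Relative change from Before to After, in percent. Negative values mean
// the size shrank.
double percentChange(double Before, double After) {
  assert(Before > 0.0 && "baseline must be positive");
  return (After - Before) / Before * 100.0;
}

// Round Size up to the next multiple of PageSize, which must be a power
// of two. A segment grows in whole pages, so a change of a few bytes in
// one section either leaves the segment size unchanged or adds a page.
std::uint64_t alignToPage(std::uint64_t Size, std::uint64_t PageSize) {
  assert(PageSize != 0 && (PageSize & (PageSize - 1)) == 0 &&
         "page size must be a power of two");
  return (Size + PageSize - 1) & ~(PageSize - 1);
}
```

For example, percentChange(1736.3, 1733.85) gives the -Os __text
reduction from the table above, and with an example page size of 4096,
sizes of 4000 and 4096 bytes both occupy one page while 4097 bytes
occupies two.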

Experiment data from building LNT+externals for ARM64 (N = 558 programs,
all sizes in bytes):

| Configuration | __text geom size | __cold geom size | TEXT geom size |
| *-Os*         | 1763.96          | 0, n=0           | 42934.9        |
| -Os, thresh=2 | 1760.9           | 76.6755, n=61    | 42934.9        |

Picking thresh=2 reduces the geomean __text section size by 0.17% at
-Os and causes no growth in the TEXT segment.

Measurements were done with D57082 <https://reviews.llvm.org/D57082> applied.


https://reviews.llvm.org/D57125

Files:
  llvm/lib/Transforms/IPO/HotColdSplitting.cpp
  llvm/test/Transforms/HotColdSplit/X86/extraction-subregion-breaks-phis.ll
  llvm/test/Transforms/HotColdSplit/X86/outline-expensive.ll
  llvm/test/Transforms/HotColdSplit/addr-taken.ll
  llvm/test/Transforms/HotColdSplit/apply-noreturn-bonus.ll
  llvm/test/Transforms/HotColdSplit/apply-penalty-for-inputs.ll
  llvm/test/Transforms/HotColdSplit/apply-penalty-for-outputs.ll
  llvm/test/Transforms/HotColdSplit/apply-successor-penalty.ll
  llvm/test/Transforms/HotColdSplit/outline-disjoint-diamonds.ll
  llvm/test/Transforms/HotColdSplit/resume.ll
  llvm/test/Transforms/HotColdSplit/split-cold-2.ll

-------------- next part --------------
A non-text attachment was scrubbed...
Name: D57125.183201.patch
Type: text/x-patch
Size: 15849 bytes
Desc: not available
URL: <http://lists.llvm.org/pipermail/llvm-commits/attachments/20190123/4260dfa0/attachment.bin>

