[all-commits] [llvm/llvm-project] 0a7bf3: [CodeLayout] Faster basic block reordering, ext-ts...

spupyrev via All-commits all-commits at lists.llvm.org
Fri Oct 6 11:58:38 PDT 2023


  Branch: refs/heads/main
  Home:   https://github.com/llvm/llvm-project
  Commit: 0a7bf3aad692c5bb591cac605a19980b00325d50
      https://github.com/llvm/llvm-project/commit/0a7bf3aad692c5bb591cac605a19980b00325d50
  Author: spupyrev <spupyrev at users.noreply.github.com>
  Date:   2023-10-06 (Fri, 06 Oct 2023)

  Changed paths:
    M llvm/lib/Transforms/Utils/CodeLayout.cpp

  Log Message:
  -----------
  [CodeLayout] Faster basic block reordering, ext-tsp (#68275)

Aggressive inlining might produce huge functions with >10K basic 
blocks. Since BFI treats _all_ blocks and jumps as "hot", having 
non-negative (but perhaps small) weight, the current implementation can
be slow, taking minutes to produce a layout. This change introduces a
few modifications that significantly (up to 50x on some instances) 
speed up the computation. Some notable changes:
- reduced the maximum chain size to 512 (from the prior 4096);
- introduced MaxMergeDensityRatio param to avoid merging chains with
very different densities;
- dropped a couple of params that seem unnecessary.
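
As a rough illustration of the first two bullets, here is a minimal sketch of
how a size cap and a density-ratio filter could prune candidate merges. It is
not the code from CodeLayout.cpp: the ChainSummary type, the worthMerging
helper, the definition of density as execution count per byte, and the exact
comparisons are all assumptions made for this example.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical summary of a chain of basic blocks; not the type used in
// CodeLayout.cpp.
struct ChainSummary {
  double ExecutionCount; // total weight of the blocks in the chain
  uint64_t SizeInBytes;  // total code size of the chain
  size_t NumBlocks;      // number of basic blocks in the chain

  // Assumption: density means execution count per byte of code.
  double density() const {
    assert(SizeInBytes > 0 && "a chain must contain some code");
    return ExecutionCount / static_cast<double>(SizeInBytes);
  }
};

// Returns true if merging chains A and B is still worth evaluating.
bool worthMerging(const ChainSummary &A, const ChainSummary &B,
                  size_t MaxChainSize, double MaxMergeDensityRatio) {
  // Size cap: skip merges that would produce an overly long chain.
  if (A.NumBlocks + B.NumBlocks > MaxChainSize)
    return false;

  // Density filter: skip merges of chains with very different densities.
  const double DensityA = A.density();
  const double DensityB = B.density();
  const double MinDensity = std::min(DensityA, DensityB);
  const double MaxDensity = std::max(DensityA, DensityB);
  if (MinDensity <= 0.0)
    return false; // avoid dividing by zero for a chain that never executes
  return MaxDensity / MinDensity <= MaxMergeDensityRatio;
}
```

With a cap of 512 blocks and an illustrative ratio limit of 100 (the limit is
chosen only for this example), a chain of density 10 can still merge with one
of density 5, but not with one of density 0.001. Both filters shrink the set
of chain pairs whose merge gain has to be computed.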

Looking at some "offline" metrics (e.g., the number of created 
fall-throughs), there shouldn't be problems; in fact, I do see some
metrics go up. But it might be hard/impossible to measure a perf 
difference for such small changes. I did test the performance of a clang-14 
binary and did not record any perf or i-cache-related differences.

My 5 benchmarks, with ext-tsp runtime (the lower the better) and 
"tsp-score" (the higher the better).
**Before**:

- benchmark 1:
  reordering running time is 2486 milliseconds
  score: 125503458 (128.3102%)
- benchmark 2:
  reordering running time is 3443 milliseconds
  score: 12613997277 (129.7495%)
- benchmark 3:
  reordering running time is 1978 milliseconds
  score: 1315881613 (105.8991%)
- benchmark 4:
  reordering running time is 7364 milliseconds
  score: 89513906284 (100.3413%)
- benchmark 5:
  reordering running time is 372605 milliseconds
  score: 21292505965077 (99.9979%)

**After**:
- benchmark 1:
  reordering running time is 2498 milliseconds
  score: 125510418 (128.3173%)

- benchmark 2:
  reordering running time is 3201 milliseconds
  score: 12614502162 (129.7547%)

- benchmark 3:
  reordering running time is 2137 milliseconds
  score: 1315938168 (105.9036%)

- benchmark 4:
  reordering running time is 6242 milliseconds
  score: 89518095837 (100.3460%)

- benchmark 5:
  reordering running time is 5819 milliseconds
  score: 21292295939119 (99.9969%)
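
To make the runtime comparison easier to read, the sketch below computes the
per-benchmark speedup as the "before" time divided by the "after" time. The
numbers are copied from the two lists above, paired by position; the array
names and the speedup helper are introduced here only for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Reordering running times in milliseconds, copied from the lists above.
const std::vector<double> BeforeMs = {2486, 3443, 1978, 7364, 372605};
const std::vector<double> AfterMs = {2498, 3201, 2137, 6242, 5819};

// Speedup for a 1-based benchmark index; values below 1.0 mean a slowdown.
double speedup(size_t Benchmark) {
  assert(Benchmark >= 1 && Benchmark <= BeforeMs.size());
  return BeforeMs[Benchmark - 1] / AfterMs[Benchmark - 1];
}
```

Evaluated this way, benchmark 5 is about 64 times faster, benchmarks 2 and 4
improve modestly, and benchmarks 1 and 3 come out slightly below 1.0, that is,
marginally slower.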



