[llvm] [BOLT] Optimize basic block loops to avoid n^2 loop (PR #156243)

Jakub Beránek via llvm-commits llvm-commits at lists.llvm.org
Tue Sep 23 01:34:07 PDT 2025


Kobzol wrote:

I ran it on Rust's CI.

This is the log for LLVM:
<details>
<summary>LLVM</summary>

```
2025-09-23T07:36:44.0535321Z [2025-09-23T07:36:44.052Z INFO  opt_dist::exec] Executing `/rustroot/bin/llvm-bolt /tmp/.tmpZ4BJv1 -data /tmp/tmp-multistage/opt-artifacts/LLVM-bolt.profdata -o /checkout/obj/build/x86_64-unknown-linux-gnu/stage2/lib/libLLVM.so.21.1-rust-1.92.0-nightly -reorder-blocks=ext-tsp -reorder-functions=cdsort -split-functions -split-strategy=cdsplit -split-all-cold -jump-tables=move -icf=all -update-debug-sections -dyno-stats --time-rewrite --time-opts [at /checkout/obj]`
2025-09-23T07:36:44.0567506Z BOLT-INFO: shared object or position-independent executable detected
2025-09-23T07:36:44.0571599Z BOLT-INFO: Target architecture: x86_64
2025-09-23T07:36:44.0572022Z BOLT-INFO: BOLT version: <unknown>
2025-09-23T07:36:44.0572636Z BOLT-INFO: first alloc address is 0x0
2025-09-23T07:36:44.0573216Z BOLT-INFO: creating new program header table at address 0x7c00000, offset 0x7c00000
2025-09-23T07:36:44.0573772Z BOLT-INFO: enabling relocation mode
2025-09-23T07:36:44.4354680Z BOLT-INFO: enabling lite mode
2025-09-23T07:36:44.8656995Z BOLT-WARNING: split function detected on input : d_type.cold. The support is limited in relocation mode
2025-09-23T07:36:47.8034711Z BOLT-WARNING: Failed to analyze 1171 relocations
2025-09-23T07:36:47.8390721Z BOLT-INFO: pre-processing profile using branch profile reader
2025-09-23T07:36:58.2078225Z BOLT-WARNING: 1 collisions detected while hashing binary objects. Use -v=1 to see the list.
2025-09-23T07:36:59.5066972Z BOLT-INFO: 14891 out of 127004 functions in the binary (11.7%) have non-empty execution profile
2025-09-23T07:36:59.5067725Z BOLT-INFO: 240 functions with profile could not be optimized
2025-09-23T07:36:59.5068200Z BOLT-INFO: profile for 1 objects was ignored
2025-09-23T07:37:00.2101723Z BOLT-INFO: profile quality metrics for the hottest 1000 functions (reporting top 5% values): function CFG discontinuity 0.00%; call graph flow conservation gap 0.00%; CFG flow conservation gap 0.00% (weighted) 0.00% (worst); exception handling usage 0.00% (of total BBEC) 0.00% (of total InvokeEC)
2025-09-23T07:37:00.8293612Z BOLT-INFO: validate-mem-refs updated 1 object references
2025-09-23T07:37:00.8687575Z BOLT-INFO: 593325 instructions were shortened
2025-09-23T07:37:00.9457522Z BOLT-INFO: removed 1712 empty blocks
2025-09-23T07:37:01.4707672Z BOLT-INFO: ICF folded 1673 out of 127312 functions in 4 passes. 12 functions had jump tables.
2025-09-23T07:37:01.4708686Z BOLT-INFO: Removing all identical functions will save 292.82 KB of code space. Folded functions were called 2701464704 times based on profile.
2025-09-23T07:37:02.7062384Z BOLT-INFO: basic block reordering modified layout of 7814 functions (52.47% of profiled, 6.22% of total)
2025-09-23T07:37:02.7441908Z BOLT-INFO: UCE removed 4 blocks and 166 bytes of code
2025-09-23T07:37:03.4401755Z BOLT-INFO: splitting separates 10641700 hot bytes from 8501746 cold bytes (55.59% of split functions is hot).
2025-09-23T07:37:03.4585762Z BOLT-INFO: 164 Functions were reordered by LoopInversionPass
2025-09-23T07:38:02.2842104Z BOLT-INFO: splitting separates 5422296 hot bytes from 8471894 cold bytes (39.03% of split functions is hot).
2025-09-23T07:38:02.6756469Z BOLT-INFO: program-wide dynostats after all optimizations before SCTC and FOP:
2025-09-23T07:38:02.6756924Z 
2025-09-23T07:38:02.6757067Z         235104516805 : executed forward branches
2025-09-23T07:38:02.6757524Z          33013952055 : taken forward branches
2025-09-23T07:38:02.6757926Z          65383434196 : executed backward branches
2025-09-23T07:38:02.6758337Z          39036323064 : taken backward branches
2025-09-23T07:38:02.6758724Z          14888197913 : executed unconditional branches
2025-09-23T07:38:02.6759109Z          19895859861 : all function calls
2025-09-23T07:38:02.6759459Z           5214498636 : indirect calls
2025-09-23T07:38:02.6759787Z           3961463863 : PLT calls
2025-09-23T07:38:02.6760121Z        1784588195112 : executed instructions
2025-09-23T07:38:02.6760504Z         428273852366 : executed load instructions
2025-09-23T07:38:02.6760889Z         189118609977 : executed store instructions
2025-09-23T07:38:02.6761265Z           2678563147 : taken jump table branches
2025-09-23T07:38:02.6761651Z                    0 : taken unknown indirect branches
2025-09-23T07:38:02.6762013Z         315376148914 : total branches
2025-09-23T07:38:02.6762340Z          86938473032 : taken branches
2025-09-23T07:38:02.6762706Z         228437675882 : non-taken conditional branches
2025-09-23T07:38:02.6763104Z          72050275119 : taken conditional branches
2025-09-23T07:38:02.6763479Z         300487951001 : all conditional branches
2025-09-23T07:38:02.6763717Z 
2025-09-23T07:38:02.6763896Z         210977195564 : executed forward branches (-10.3%)
2025-09-23T07:38:02.6764493Z          17127806877 : taken forward branches (-48.1%)
2025-09-23T07:38:02.6764922Z          89510755437 : executed backward branches (+36.9%)
2025-09-23T07:38:02.6765354Z          40550966508 : taken backward branches (+3.9%)
2025-09-23T07:38:02.6766052Z           9850607756 : executed unconditional branches (-33.8%)
2025-09-23T07:38:02.6766480Z          19895859861 : all function calls (=)
2025-09-23T07:38:02.6766830Z           5214498636 : indirect calls (=)
2025-09-23T07:38:02.6767176Z           3961463863 : PLT calls (=)
2025-09-23T07:38:02.6767539Z        1771510302677 : executed instructions (-0.7%)
2025-09-23T07:38:02.6767949Z         428273852366 : executed load instructions (=)
2025-09-23T07:38:02.6768429Z         189118609977 : executed store instructions (=)
2025-09-23T07:38:02.6768834Z           2678563147 : taken jump table branches (=)
2025-09-23T07:38:02.6769234Z                    0 : taken unknown indirect branches (=)
2025-09-23T07:38:02.6769619Z         310338558757 : total branches (-1.6%)
2025-09-23T07:38:02.6769986Z          67529381141 : taken branches (-22.3%)
2025-09-23T07:38:02.6770399Z         242809177616 : non-taken conditional branches (+6.3%)
2025-09-23T07:38:02.6770904Z          57678773385 : taken conditional branches (-19.9%)
2025-09-23T07:38:02.6771317Z         300487951001 : all conditional branches (=)
2025-09-23T07:38:02.6771583Z 
2025-09-23T07:38:02.8707041Z BOLT-INFO: SCTC: patched 117 tail calls (113 forward) tail calls (4 backward) from a total of 117 while removing 5 double jumps and removing 120 basic blocks totalling 594 bytes of code. CTCs total execution count is 9486579 and the number of times CTCs are taken is 5413745
2025-09-23T07:38:09.1294010Z BOLT-INFO: setting __hot_start to 0x7e00000
2025-09-23T07:38:09.1294477Z BOLT-INFO: setting __hot_end to 0x8b1a287
2025-09-23T07:38:10.7843550Z ===-------------------------------------------------------------------------===
2025-09-23T07:38:10.7844184Z                                  Rewrite passes
2025-09-23T07:38:10.7844663Z ===-------------------------------------------------------------------------===
2025-09-23T07:38:10.7845839Z   Total Execution Time: 1241.3852 seconds (79.3465 wall clock)
2025-09-23T07:38:10.7846220Z 
2025-09-23T07:38:10.7846499Z    ---User Time---   --System Time--   --User+System--   ---Wall Time---  --- Name ---
2025-09-23T07:38:10.7847151Z   1192.1434 ( 98.8%)  34.0971 ( 97.0%)  1226.2405 ( 98.8%)  64.2015 ( 80.9%)  run optimization passes
2025-09-23T07:38:10.7847759Z    4.5596 (  0.4%)   0.4438 (  1.3%)   5.0035 (  0.4%)   5.0036 (  6.3%)  disassemble functions
2025-09-23T07:38:10.7848284Z    4.0559 (  0.3%)   0.2574 (  0.7%)   4.3133 (  0.3%)   4.3133 (  5.4%)  emit and link
2025-09-23T07:38:10.7848814Z    3.2442 (  0.3%)   0.1594 (  0.5%)   3.4036 (  0.3%)   3.4037 (  4.3%)  discover file objects
2025-09-23T07:38:10.7849368Z    1.4483 (  0.1%)   0.0560 (  0.2%)   1.5043 (  0.1%)   1.5043 (  1.9%)  pre-process profile data
2025-09-23T07:38:10.7849923Z    0.4897 (  0.0%)   0.0000 (  0.0%)   0.4897 (  0.0%)   0.4897 (  0.6%)  process profile data
2025-09-23T07:38:10.7850456Z    0.2412 (  0.0%)   0.1367 (  0.4%)   0.3779 (  0.0%)   0.3779 (  0.5%)  read special sections
2025-09-23T07:38:10.7851005Z    0.0261 (  0.0%)   0.0000 (  0.0%)   0.0261 (  0.0%)   0.0261 (  0.0%)  read debug info
2025-09-23T07:38:10.7851548Z    0.0130 (  0.0%)   0.0001 (  0.0%)   0.0131 (  0.0%)   0.0131 (  0.0%)  process metadata pre-CFG
2025-09-23T07:38:10.7852116Z    0.0130 (  0.0%)   0.0001 (  0.0%)   0.0131 (  0.0%)   0.0131 (  0.0%)  process profile data pre-CFG
2025-09-23T07:38:10.7852699Z    0.0002 (  0.0%)   0.0000 (  0.0%)   0.0002 (  0.0%)   0.0002 (  0.0%)  update metadata post-emit
2025-09-23T07:38:10.7853235Z    0.0000 (  0.0%)   0.0000 (  0.0%)   0.0000 (  0.0%)   0.0000 (  0.0%)  discover storage
2025-09-23T07:38:10.7853773Z    0.0000 (  0.0%)   0.0000 (  0.0%)   0.0000 (  0.0%)   0.0000 (  0.0%)  process section metadata
2025-09-23T07:38:10.7854510Z    0.0000 (  0.0%)   0.0000 (  0.0%)   0.0000 (  0.0%)   0.0000 (  0.0%)  process metadata post-CFG
2025-09-23T07:38:10.7855085Z    0.0000 (  0.0%)   0.0000 (  0.0%)   0.0000 (  0.0%)   0.0000 (  0.0%)  finalize metadata pre-emit
2025-09-23T07:38:10.7855637Z    0.0000 (  0.0%)   0.0000 (  0.0%)   0.0000 (  0.0%)   0.0000 (  0.0%)  update debug info
2025-09-23T07:38:10.7856168Z   1206.2346 (100.0%)  35.1506 (100.0%)  1241.3852 (100.0%)  79.3465 (100.0%)  Total
2025-09-23T07:38:10.7856509Z 
2025-09-23T07:38:10.7856711Z ===-------------------------------------------------------------------------===
2025-09-23T07:38:10.7857160Z                           Binary Function Pass Manager
2025-09-23T07:38:10.7857590Z ===-------------------------------------------------------------------------===
2025-09-23T07:38:10.7858152Z   Total Execution Time: 1226.2137 seconds (64.1746 wall clock)
2025-09-23T07:38:10.7858474Z 
2025-09-23T07:38:10.7858750Z    ---User Time---   --System Time--   --User+System--   ---Wall Time---  --- Name ---
2025-09-23T07:38:10.7859365Z   1162.7142 ( 97.5%)  32.0898 ( 94.1%)  1194.8040 ( 97.4%)  57.2111 ( 89.1%)  split-functions
2025-09-23T07:38:10.7859918Z    1.7854 (  0.1%)   0.0000 (  0.0%)   1.7854 (  0.1%)   1.7855 (  2.8%)  reorder-functions
2025-09-23T07:38:10.7860487Z   12.6568 (  1.1%)   0.0000 (  0.0%)  12.6568 (  1.0%)   0.8433 (  1.3%)  reorder-blocks
2025-09-23T07:38:10.7861025Z    3.5497 (  0.3%)   1.5576 (  4.6%)   5.1073 (  0.4%)   0.8314 (  1.3%)  identical-code-folding
2025-09-23T07:38:10.7861644Z    0.7751 (  0.1%)   0.0000 (  0.0%)   0.7751 (  0.1%)   0.7751 (  1.2%)  profile-quality-stats
2025-09-23T07:38:10.7862170Z    0.5383 (  0.0%)   0.0000 (  0.0%)   0.5383 (  0.0%)   0.5383 (  0.8%)  fix-branches
2025-09-23T07:38:10.7862731Z    0.3774 (  0.0%)   0.0000 (  0.0%)   0.3774 (  0.0%)   0.3774 (  0.6%)  print dyno-stats after optimizations
2025-09-23T07:38:10.7863297Z    0.3492 (  0.0%)   0.0000 (  0.0%)   0.3492 (  0.0%)   0.3492 (  0.5%)  validate-mem-refs
2025-09-23T07:38:10.7863865Z    0.3344 (  0.0%)   0.0000 (  0.0%)   0.3344 (  0.0%)   0.3344 (  0.5%)  set dyno-stats before optimizations
2025-09-23T07:38:10.7864475Z    0.1949 (  0.0%)   0.0000 (  0.0%)   0.1949 (  0.0%)   0.1949 (  0.3%)  simplify-conditional-tail-calls
2025-09-23T07:38:10.7865062Z    0.1915 (  0.0%)   0.0000 (  0.0%)   0.1915 (  0.0%)   0.1915 (  0.3%)  validate-internal-calls
2025-09-23T07:38:10.7865579Z    5.1667 (  0.4%)   0.0000 (  0.0%)   5.1667 (  0.4%)   0.1470 (  0.2%)  aligner
2025-09-23T07:38:10.7866074Z    0.1213 (  0.0%)   0.0000 (  0.0%)   0.1213 (  0.0%)   0.1213 (  0.2%)  inst-lowering
2025-09-23T07:38:10.7866578Z    0.0843 (  0.0%)   0.0000 (  0.0%)   0.0843 (  0.0%)   0.0842 (  0.1%)  strip-rep-ret
2025-09-23T07:38:10.7867092Z    0.0822 (  0.0%)   0.0000 (  0.0%)   0.0822 (  0.0%)   0.0822 (  0.1%)  lower-annotations
2025-09-23T07:38:10.7867604Z    0.4988 (  0.0%)   0.4240 (  1.2%)   0.9228 (  0.1%)   0.0427 (  0.1%)  normalize CFG
2025-09-23T07:38:10.7868114Z    0.7246 (  0.1%)   0.0082 (  0.0%)   0.7328 (  0.1%)   0.0383 (  0.1%)  finalize-functions
2025-09-23T07:38:10.7868648Z    0.8009 (  0.1%)   0.0000 (  0.0%)   0.8009 (  0.1%)   0.0381 (  0.1%)  eliminate-unreachable
2025-09-23T07:38:10.7869194Z    0.5729 (  0.0%)   0.0000 (  0.0%)   0.5729 (  0.0%)   0.0365 (  0.1%)  shorten-instructions
2025-09-23T07:38:10.7869719Z    0.0351 (  0.0%)   0.0000 (  0.0%)   0.0351 (  0.0%)   0.0351 (  0.1%)  clean-mc-state
2025-09-23T07:38:10.7870222Z    0.3754 (  0.0%)   0.0000 (  0.0%)   0.3754 (  0.0%)   0.0343 (  0.1%)  remove-nops
2025-09-23T07:38:10.7870727Z    0.0218 (  0.0%)   0.0001 (  0.0%)   0.0219 (  0.0%)   0.0219 (  0.0%)  assign-sections
2025-09-23T07:38:10.7871247Z    0.1237 (  0.0%)   0.0170 (  0.0%)   0.1407 (  0.0%)   0.0185 (  0.0%)  loop-inversion-opt
2025-09-23T07:38:10.7871785Z    0.0146 (  0.0%)   0.0000 (  0.0%)   0.0146 (  0.0%)   0.0145 (  0.0%)  estimate-edge-counts
2025-09-23T07:38:10.7872301Z    0.0145 (  0.0%)   0.0000 (  0.0%)   0.0145 (  0.0%)   0.0145 (  0.0%)  print-stats
2025-09-23T07:38:10.7872835Z    0.0135 (  0.0%)   0.0000 (  0.0%)   0.0135 (  0.0%)   0.0135 (  0.0%)  patch-entries
2025-09-23T07:38:10.7873360Z    0.0000 (  0.0%)   0.0000 (  0.0%)   0.0000 (  0.0%)   0.0000 (  0.0%)  retpoline-insertion
2025-09-23T07:38:10.7873869Z    0.0000 (  0.0%)   0.0000 (  0.0%)   0.0000 (  0.0%)   0.0000 (  0.0%)  inlining
2025-09-23T07:38:10.7874358Z    0.0000 (  0.0%)   0.0000 (  0.0%)   0.0000 (  0.0%)   0.0000 (  0.0%)  reorder-data
2025-09-23T07:38:10.7874883Z    0.0000 (  0.0%)   0.0000 (  0.0%)   0.0000 (  0.0%)   0.0000 (  0.0%)  PLT call optimization
2025-09-23T07:38:10.7875415Z    0.0000 (  0.0%)   0.0000 (  0.0%)   0.0000 (  0.0%)   0.0000 (  0.0%)  tail duplication
2025-09-23T07:38:10.7875972Z    0.0000 (  0.0%)   0.0000 (  0.0%)   0.0000 (  0.0%)   0.0000 (  0.0%)  frame-optimizer
2025-09-23T07:38:10.7876478Z    0.0000 (  0.0%)   0.0000 (  0.0%)   0.0000 (  0.0%)   0.0000 (  0.0%)  peepholes
2025-09-23T07:38:10.7876979Z    0.0000 (  0.0%)   0.0000 (  0.0%)   0.0000 (  0.0%)   0.0000 (  0.0%)  alloc-combiner
2025-09-23T07:38:10.7877616Z    0.0000 (  0.0%)   0.0000 (  0.0%)   0.0000 (  0.0%)   0.0000 (  0.0%)  indirect-call-promotion
2025-09-23T07:38:10.7878209Z   1192.1171 (100.0%)  34.0967 (100.0%)  1226.2137 (100.0%)  64.1746 (100.0%)  Total
2025-09-23T07:38:10.7878551Z 
2025-09-23T07:38:10.7878737Z ===-------------------------------------------------------------------------===
2025-09-23T07:38:10.7879171Z                                   CG breakdown
2025-09-23T07:38:10.7879629Z ===-------------------------------------------------------------------------===
2025-09-23T07:38:10.7880128Z   Total Execution Time: 0.8442 seconds (0.8442 wall clock)
2025-09-23T07:38:10.7880432Z 
2025-09-23T07:38:10.7880643Z    ---User Time---   --User+System--   ---Wall Time---  --- Name ---
2025-09-23T07:38:10.7881157Z    0.8442 (100.0%)   0.8442 (100.0%)   0.8442 (100.0%)  Callgraph construction
2025-09-23T07:38:10.7881627Z    0.8442 (100.0%)   0.8442 (100.0%)   0.8442 (100.0%)  Total
2025-09-23T07:38:10.7881902Z 
```

</details>

And here is the log for the Rust compiler's shared library:
<details>
<summary>rustc</summary>

```
2025-09-23T08:08:24.1087661Z [2025-09-23T08:08:24.108Z INFO  opt_dist::exec] Executing `/rustroot/bin/llvm-bolt /tmp/.tmp7rsDA1 -data /tmp/tmp-multistage/opt-artifacts/rustc-bolt.profdata -o /checkout/obj/build/x86_64-unknown-linux-gnu/stage2/lib/librustc_driver-37c25f9240306b8c.so -reorder-blocks=ext-tsp -reorder-functions=cdsort -split-functions -split-strategy=cdsplit -split-all-cold -jump-tables=move -icf=all -update-debug-sections -dyno-stats --time-rewrite --time-opts [at /checkout/obj]`
2025-09-23T08:08:24.1162714Z BOLT-INFO: shared object or position-independent executable detected
2025-09-23T08:08:24.1167794Z BOLT-INFO: Target architecture: x86_64
2025-09-23T08:08:24.1168182Z BOLT-INFO: BOLT version: <unknown>
2025-09-23T08:08:24.1168539Z BOLT-INFO: first alloc address is 0x0
2025-09-23T08:08:24.1169073Z BOLT-INFO: creating new program header table at address 0x5000000, offset 0x5000000
2025-09-23T08:08:24.1169599Z BOLT-INFO: enabling relocation mode
2025-09-23T08:08:24.3930106Z BOLT-INFO: enabling lite mode
2025-09-23T08:08:25.0832384Z BOLT-WARNING: split function detected on input : d_type.cold. The support is limited in relocation mode
2025-09-23T08:08:27.2217568Z BOLT-WARNING: Failed to analyze 216 relocations
2025-09-23T08:08:27.2420728Z BOLT-INFO: pre-processing profile using branch profile reader
2025-09-23T08:08:39.2601621Z BOLT-WARNING: 10 collisions detected while hashing binary objects. Use -v=1 to see the list.
2025-09-23T08:08:40.7672287Z BOLT-INFO: 14020 out of 73549 functions in the binary (19.1%) have non-empty execution profile
2025-09-23T08:08:40.7673057Z BOLT-INFO: 496 functions with profile could not be optimized
2025-09-23T08:08:40.7673519Z BOLT-INFO: profile for 1 objects was ignored
2025-09-23T08:08:41.5617188Z BOLT-INFO: profile quality metrics for the hottest 1000 functions (reporting top 5% values): function CFG discontinuity 0.00%; call graph flow conservation gap 0.00%; CFG flow conservation gap 0.00% (weighted) 0.00% (worst); exception handling usage 0.00% (of total BBEC) 0.00% (of total InvokeEC)
2025-09-23T08:08:42.4075121Z BOLT-INFO: 830299 instructions were shortened
2025-09-23T08:08:42.4612088Z BOLT-INFO: removed 1400 empty blocks
2025-09-23T08:08:42.4612532Z BOLT-INFO: merged 3 duplicate CFG edges
2025-09-23T08:08:43.1085506Z BOLT-INFO: ICF folded 71 out of 73966 functions in 3 passes. 13 functions had jump tables.
2025-09-23T08:08:43.1086714Z BOLT-INFO: Removing all identical functions will save 33.68 KB of code space. Folded functions were called 83349861 times based on profile.
2025-09-23T08:08:44.2241731Z BOLT-INFO: basic block reordering modified layout of 8895 functions (63.45% of profiled, 12.04% of total)
2025-09-23T08:08:45.6003056Z BOLT-INFO: splitting separates 19348618 hot bytes from 9477858 cold bytes (67.12% of split functions is hot).
2025-09-23T08:08:45.6137762Z BOLT-INFO: 118 Functions were reordered by LoopInversionPass
2025-09-23T08:17:09.2383905Z BOLT-INFO: splitting separates 12219489 hot bytes from 7571490 cold bytes (61.74% of split functions is hot).
2025-09-23T08:17:09.6368551Z BOLT-INFO: program-wide dynostats after all optimizations before SCTC and FOP:
2025-09-23T08:17:09.6370124Z 
2025-09-23T08:17:09.6370507Z         159303000032 : executed forward branches
2025-09-23T08:17:09.6370968Z          13512704701 : taken forward branches
2025-09-23T08:17:09.6371346Z          22109772505 : executed backward branches
2025-09-23T08:17:09.6371727Z          15474886893 : taken backward branches
2025-09-23T08:17:09.6372118Z           7312349150 : executed unconditional branches
2025-09-23T08:17:09.6372699Z          10414141994 : all function calls
2025-09-23T08:17:09.6373049Z           5212041960 : indirect calls
2025-09-23T08:17:09.6373384Z            157126340 : PLT calls
2025-09-23T08:17:09.6373718Z        1299789320113 : executed instructions
2025-09-23T08:17:09.6374097Z         327477301380 : executed load instructions
2025-09-23T08:17:09.6374499Z         183484733372 : executed store instructions
2025-09-23T08:17:09.6374959Z           3252587661 : taken jump table branches
2025-09-23T08:17:09.6375346Z                    0 : taken unknown indirect branches
2025-09-23T08:17:09.6375717Z         188725121687 : total branches
2025-09-23T08:17:09.6376050Z          36299940744 : taken branches
2025-09-23T08:17:09.6376498Z         152425180943 : non-taken conditional branches
2025-09-23T08:17:09.6376899Z          28987591594 : taken conditional branches
2025-09-23T08:17:09.6377281Z         181412772537 : all conditional branches
2025-09-23T08:17:09.6377523Z 
2025-09-23T08:17:09.6377706Z         150062838844 : executed forward branches (-5.8%)
2025-09-23T08:17:09.6378130Z           7662406296 : taken forward branches (-43.3%)
2025-09-23T08:17:09.6378553Z          31348251587 : executed backward branches (+41.8%)
2025-09-23T08:17:09.6378981Z          14962033590 : taken backward branches (-3.3%)
2025-09-23T08:17:09.6379428Z           6073487992 : executed unconditional branches (-16.9%)
2025-09-23T08:17:09.6379854Z          10414141994 : all function calls (=)
2025-09-23T08:17:09.6380214Z           5212041960 : indirect calls (=)
2025-09-23T08:17:09.6380562Z            157126340 : PLT calls (=)
2025-09-23T08:17:09.6380925Z        1293805780658 : executed instructions (-0.5%)
2025-09-23T08:17:09.6381339Z         327477301380 : executed load instructions (=)
2025-09-23T08:17:09.6381748Z         183484733372 : executed store instructions (=)
2025-09-23T08:17:09.6382150Z           3252587661 : taken jump table branches (=)
2025-09-23T08:17:09.6382549Z                    0 : taken unknown indirect branches (=)
2025-09-23T08:17:09.6382926Z         187484578423 : total branches (-0.7%)
2025-09-23T08:17:09.6383286Z          28697927878 : taken branches (-20.9%)
2025-09-23T08:17:09.6383697Z         158786650545 : non-taken conditional branches (+4.2%)
2025-09-23T08:17:09.6384153Z          22624439886 : taken conditional branches (-22.0%)
2025-09-23T08:17:09.6384584Z         181411090431 : all conditional branches (-0.0%)
2025-09-23T08:17:09.6384859Z 
2025-09-23T08:17:09.8918378Z BOLT-INFO: SCTC: patched 33 tail calls (33 forward) tail calls (0 backward) from a total of 33 while removing 0 double jumps and removing 33 basic blocks totalling 165 bytes of code. CTCs total execution count is 1454562 and the number of times CTCs are taken is 1450251
2025-09-23T08:17:18.0775912Z BOLT-INFO: setting _end to 0x7ad9420
2025-09-23T08:17:18.0928802Z BOLT-INFO: setting __hot_start to 0x5200000
2025-09-23T08:17:18.0929438Z BOLT-INFO: setting __hot_end to 0x69c5e9b
2025-09-23T08:17:19.7001113Z ===-------------------------------------------------------------------------===
2025-09-23T08:17:19.7001661Z                                  Rewrite passes
2025-09-23T08:17:19.7002090Z ===-------------------------------------------------------------------------===
2025-09-23T08:17:19.7002612Z   Total Execution Time: 4849.9231 seconds (527.8837 wall clock)
2025-09-23T08:17:19.7003190Z 
2025-09-23T08:17:19.7003481Z    ---User Time---   --System Time--   --User+System--   ---Wall Time---  --- Name ---
2025-09-23T08:17:19.7004150Z   2411.0373 ( 99.3%)  2421.3456 (100.0%)  4832.3830 ( 99.6%)  510.3416 ( 96.7%)  run optimization passes
2025-09-23T08:17:19.7004733Z    6.2426 (  0.3%)   0.2655 (  0.0%)   6.5081 (  0.1%)   6.5083 (  1.2%)  emit and link
2025-09-23T08:17:19.7005265Z    5.2722 (  0.2%)   0.5675 (  0.0%)   5.8397 (  0.1%)   5.8411 (  1.1%)  disassemble functions
2025-09-23T08:17:19.7006101Z    2.6959 (  0.1%)   0.1529 (  0.0%)   2.8488 (  0.1%)   2.8489 (  0.5%)  discover file objects
2025-09-23T08:17:19.7006765Z    1.4865 (  0.1%)   0.0403 (  0.0%)   1.5268 (  0.0%)   1.5269 (  0.3%)  pre-process profile data
2025-09-23T08:17:19.7007318Z    0.5127 (  0.0%)   0.0000 (  0.0%)   0.5127 (  0.0%)   0.5127 (  0.1%)  process profile data
2025-09-23T08:17:19.7007852Z    0.1676 (  0.0%)   0.1084 (  0.0%)   0.2759 (  0.0%)   0.2760 (  0.1%)  read special sections
2025-09-23T08:17:19.7008378Z    0.0114 (  0.0%)   0.0000 (  0.0%)   0.0114 (  0.0%)   0.0114 (  0.0%)  read debug info
2025-09-23T08:17:19.7008986Z    0.0084 (  0.0%)   0.0000 (  0.0%)   0.0084 (  0.0%)   0.0084 (  0.0%)  process metadata pre-CFG
2025-09-23T08:17:19.7009552Z    0.0084 (  0.0%)   0.0000 (  0.0%)   0.0084 (  0.0%)   0.0084 (  0.0%)  process profile data pre-CFG
2025-09-23T08:17:19.7010162Z    0.0000 (  0.0%)   0.0000 (  0.0%)   0.0000 (  0.0%)   0.0000 (  0.0%)  discover storage
2025-09-23T08:17:19.7010696Z    0.0000 (  0.0%)   0.0000 (  0.0%)   0.0000 (  0.0%)   0.0000 (  0.0%)  process section metadata
2025-09-23T08:17:19.7011256Z    0.0000 (  0.0%)   0.0000 (  0.0%)   0.0000 (  0.0%)   0.0000 (  0.0%)  update metadata post-emit
2025-09-23T08:17:19.7011824Z    0.0000 (  0.0%)   0.0000 (  0.0%)   0.0000 (  0.0%)   0.0000 (  0.0%)  process metadata post-CFG
2025-09-23T08:17:19.7012387Z    0.0000 (  0.0%)   0.0000 (  0.0%)   0.0000 (  0.0%)   0.0000 (  0.0%)  finalize metadata pre-emit
2025-09-23T08:17:19.7012931Z    0.0000 (  0.0%)   0.0000 (  0.0%)   0.0000 (  0.0%)   0.0000 (  0.0%)  update debug info
2025-09-23T08:17:19.7013477Z   2427.4429 (100.0%)  2422.4802 (100.0%)  4849.9231 (100.0%)  527.8837 (100.0%)  Total
2025-09-23T08:17:19.7013821Z 
2025-09-23T08:17:19.7014007Z ===-------------------------------------------------------------------------===
2025-09-23T08:17:19.7014456Z                           Binary Function Pass Manager
2025-09-23T08:17:19.7014889Z ===-------------------------------------------------------------------------===
2025-09-23T08:17:19.7015393Z   Total Execution Time: 4832.3549 seconds (510.3134 wall clock)
2025-09-23T08:17:19.7015715Z 
2025-09-23T08:17:19.7015979Z    ---User Time---   --System Time--   --User+System--   ---Wall Time---  --- Name ---
2025-09-23T08:17:19.7016604Z   2372.4561 ( 98.4%)  2411.8820 ( 99.6%)  4784.3382 ( 99.0%)  502.1853 ( 98.4%)  split-functions
2025-09-23T08:17:19.7017165Z    1.9135 (  0.1%)   0.0000 (  0.0%)   1.9135 (  0.0%)   1.9135 (  0.4%)  reorder-functions
2025-09-23T08:17:19.7017703Z    4.9072 (  0.2%)   6.8726 (  0.3%)  11.7798 (  0.2%)   1.0785 (  0.2%)  identical-code-folding
2025-09-23T08:17:19.7018235Z    0.8752 (  0.0%)   0.0000 (  0.0%)   0.8752 (  0.0%)   0.8752 (  0.2%)  fix-branches
2025-09-23T08:17:19.7018760Z    0.8417 (  0.0%)   0.0000 (  0.0%)   0.8417 (  0.0%)   0.8445 (  0.2%)  profile-quality-stats
2025-09-23T08:17:19.7019295Z   10.8591 (  0.5%)   0.0000 (  0.0%)  10.8591 (  0.2%)   0.5774 (  0.1%)  reorder-blocks
2025-09-23T08:17:19.7019816Z    0.4994 (  0.0%)   0.0000 (  0.0%)   0.4994 (  0.0%)   0.4994 (  0.1%)  validate-mem-refs
2025-09-23T08:17:19.7020386Z    0.3845 (  0.0%)   0.0000 (  0.0%)   0.3845 (  0.0%)   0.3845 (  0.1%)  print dyno-stats after optimizations
2025-09-23T08:17:19.7020953Z   10.1393 (  0.4%)   2.5906 (  0.1%)  12.7299 (  0.3%)   0.3762 (  0.1%)  finalize-functions
2025-09-23T08:17:19.7021519Z    0.3394 (  0.0%)   0.0000 (  0.0%)   0.3394 (  0.0%)   0.3394 (  0.1%)  set dyno-stats before optimizations
2025-09-23T08:17:19.7022194Z    0.2545 (  0.0%)   0.0000 (  0.0%)   0.2545 (  0.0%)   0.2545 (  0.0%)  simplify-conditional-tail-calls
2025-09-23T08:17:19.7022784Z    0.2519 (  0.0%)   0.0000 (  0.0%)   0.2519 (  0.0%)   0.2519 (  0.0%)  validate-internal-calls
2025-09-23T08:17:19.7023321Z    0.1601 (  0.0%)   0.0000 (  0.0%)   0.1601 (  0.0%)   0.1601 (  0.0%)  inst-lowering
2025-09-23T08:17:19.7023844Z    0.1249 (  0.0%)   0.0000 (  0.0%)   0.1249 (  0.0%)   0.1249 (  0.0%)  lower-annotations
2025-09-23T08:17:19.7024353Z    3.9011 (  0.2%)   0.0000 (  0.0%)   3.9011 (  0.1%)   0.1101 (  0.0%)  aligner
2025-09-23T08:17:19.7024889Z    0.1056 (  0.0%)   0.0000 (  0.0%)   0.1056 (  0.0%)   0.1056 (  0.0%)  strip-rep-ret
2025-09-23T08:17:19.7025398Z    0.0489 (  0.0%)   0.0000 (  0.0%)   0.0489 (  0.0%)   0.0489 (  0.0%)  clean-mc-state
2025-09-23T08:17:19.7025945Z    0.9365 (  0.0%)   0.0000 (  0.0%)   0.9365 (  0.0%)   0.0400 (  0.0%)  eliminate-unreachable
2025-09-23T08:17:19.7026494Z    0.7548 (  0.0%)   0.0000 (  0.0%)   0.7548 (  0.0%)   0.0363 (  0.0%)  shorten-instructions
2025-09-23T08:17:19.7027053Z    0.5528 (  0.0%)   0.0000 (  0.0%)   0.5528 (  0.0%)   0.0294 (  0.0%)  normalize CFG
2025-09-23T08:17:19.7027558Z    0.4854 (  0.0%)   0.0000 (  0.0%)   0.4854 (  0.0%)   0.0243 (  0.0%)  remove-nops
2025-09-23T08:17:19.7028068Z    0.0154 (  0.0%)   0.0000 (  0.0%)   0.0154 (  0.0%)   0.0154 (  0.0%)  assign-sections
2025-09-23T08:17:19.7028630Z    0.1778 (  0.0%)   0.0000 (  0.0%)   0.1778 (  0.0%)   0.0136 (  0.0%)  loop-inversion-opt
2025-09-23T08:17:19.7029144Z    0.0085 (  0.0%)   0.0000 (  0.0%)   0.0085 (  0.0%)   0.0085 (  0.0%)  print-stats
2025-09-23T08:17:19.7029669Z    0.0084 (  0.0%)   0.0000 (  0.0%)   0.0084 (  0.0%)   0.0084 (  0.0%)  estimate-edge-counts
2025-09-23T08:17:19.7030197Z    0.0077 (  0.0%)   0.0000 (  0.0%)   0.0077 (  0.0%)   0.0077 (  0.0%)  patch-entries
2025-09-23T08:17:19.7030725Z    0.0000 (  0.0%)   0.0000 (  0.0%)   0.0000 (  0.0%)   0.0000 (  0.0%)  retpoline-insertion
2025-09-23T08:17:19.7031283Z    0.0000 (  0.0%)   0.0000 (  0.0%)   0.0000 (  0.0%)   0.0000 (  0.0%)  indirect-call-promotion
2025-09-23T08:17:19.7031838Z    0.0000 (  0.0%)   0.0000 (  0.0%)   0.0000 (  0.0%)   0.0000 (  0.0%)  PLT call optimization
2025-09-23T08:17:19.7032355Z    0.0000 (  0.0%)   0.0000 (  0.0%)   0.0000 (  0.0%)   0.0000 (  0.0%)  inlining
2025-09-23T08:17:19.7032867Z    0.0000 (  0.0%)   0.0000 (  0.0%)   0.0000 (  0.0%)   0.0000 (  0.0%)  tail duplication
2025-09-23T08:17:19.7033378Z    0.0000 (  0.0%)   0.0000 (  0.0%)   0.0000 (  0.0%)   0.0000 (  0.0%)  peepholes
2025-09-23T08:17:19.7033877Z    0.0000 (  0.0%)   0.0000 (  0.0%)   0.0000 (  0.0%)   0.0000 (  0.0%)  reorder-data
2025-09-23T08:17:19.7034390Z    0.0000 (  0.0%)   0.0000 (  0.0%)   0.0000 (  0.0%)   0.0000 (  0.0%)  frame-optimizer
2025-09-23T08:17:19.7034909Z    0.0000 (  0.0%)   0.0000 (  0.0%)   0.0000 (  0.0%)   0.0000 (  0.0%)  alloc-combiner
2025-09-23T08:17:19.7035446Z   2411.0097 (100.0%)  2421.3452 (100.0%)  4832.3549 (100.0%)  510.3134 (100.0%)  Total
2025-09-23T08:17:19.7035797Z 
2025-09-23T08:17:19.7035986Z ===-------------------------------------------------------------------------===
2025-09-23T08:17:19.7036419Z                                   CG breakdown
2025-09-23T08:17:19.7036838Z ===-------------------------------------------------------------------------===
2025-09-23T08:17:19.7037419Z   Total Execution Time: 1.1101 seconds (1.1101 wall clock)
2025-09-23T08:17:19.7037725Z 
2025-09-23T08:17:19.7037936Z    ---User Time---   --User+System--   ---Wall Time---  --- Name ---
2025-09-23T08:17:19.7038452Z    1.1101 (100.0%)   1.1101 (100.0%)   1.1101 (100.0%)  Callgraph construction
2025-09-23T08:17:19.7038921Z    1.1101 (100.0%)   1.1101 (100.0%)   1.1101 (100.0%)  Total
2025-09-23T08:17:19.7039198Z 
```

</details>
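
As a reading aid for the two `Binary Function Pass Manager` tables, here is a minimal Python sketch that pulls the wall-clock column out of `--time-opts` rows and reports one pass's share of the total. It is not part of the CI tooling; the helper names (`wall_times`, `share`) are hypothetical, and it assumes rows have the `value ( pct%)` shape seen in the logs above, with wall time as the last such column.

```python
import re

# One "seconds ( pct%)" cell, e.g. "57.2111 ( 89.1%)" or "64.1746 (100.0%)".
CELL = re.compile(r"(\d+\.\d+)\s*\(\s*\d+\.\d+%\)")

def wall_times(lines):
    """Map pass name -> wall-clock seconds for timer-table rows.

    Assumption: the last "seconds ( pct%)" cell on a row is wall time and
    everything after it is the pass name, as in the logs above.
    """
    result = {}
    for line in lines:
        cells = list(CELL.finditer(line))
        if not cells:
            continue  # not a timer-table row
        last = cells[-1]
        name = line[last.end():].strip()
        result[name] = float(last.group(1))
    return result

def share(times, name, total="Total"):
    """Percentage of the total wall time spent in the pass `name`."""
    return 100.0 * times[name] / times[total]

# Two rows copied from the LLVM log above.
rows = [
    "2025-09-23T07:38:10.7859365Z   1162.7142 ( 97.5%)  32.0898 ( 94.1%)"
    "  1194.8040 ( 97.4%)  57.2111 ( 89.1%)  split-functions",
    "2025-09-23T07:38:10.7878209Z   1192.1171 (100.0%)  34.0967 (100.0%)"
    "  1226.2137 (100.0%)  64.1746 (100.0%)  Total",
]
times = wall_times(rows)
print(f"split-functions: {share(times, 'split-functions'):.1f}% of wall time")
```

Run on the rows above, this reproduces the 89.1% wall-time share that the LLVM table reports for `split-functions`. The same calculation on the rustc table (502.1853 of 510.3134 seconds) gives 98.4%.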

https://github.com/llvm/llvm-project/pull/156243

