[llvm] [BOLT] Optimize basic block loops to avoid n^2 loop (PR #156243)
Jakub Beránek via llvm-commits
llvm-commits at lists.llvm.org
Tue Sep 23 01:34:07 PDT 2025
Kobzol wrote:
I ran it on Rust's CI.
This is the log for LLVM:
<details>
<summary>LLVM</summary>
```
2025-09-23T07:36:44.0535321Z [2025-09-23T07:36:44.052Z INFO opt_dist::exec] Executing `/rustroot/bin/llvm-bolt /tmp/.tmpZ4BJv1 -data /tmp/tmp-multistage/opt-artifacts/LLVM-bolt.profdata -o /checkout/obj/build/x86_64-unknown-linux-gnu/stage2/lib/libLLVM.so.21.1-rust-1.92.0-nightly -reorder-blocks=ext-tsp -reorder-functions=cdsort -split-functions -split-strategy=cdsplit -split-all-cold -jump-tables=move -icf=all -update-debug-sections -dyno-stats --time-rewrite --time-opts [at /checkout/obj]`
2025-09-23T07:36:44.0567506Z BOLT-INFO: shared object or position-independent executable detected
2025-09-23T07:36:44.0571599Z BOLT-INFO: Target architecture: x86_64
2025-09-23T07:36:44.0572022Z BOLT-INFO: BOLT version: <unknown>
2025-09-23T07:36:44.0572636Z BOLT-INFO: first alloc address is 0x0
2025-09-23T07:36:44.0573216Z BOLT-INFO: creating new program header table at address 0x7c00000, offset 0x7c00000
2025-09-23T07:36:44.0573772Z BOLT-INFO: enabling relocation mode
2025-09-23T07:36:44.4354680Z BOLT-INFO: enabling lite mode
2025-09-23T07:36:44.8656995Z BOLT-WARNING: split function detected on input : d_type.cold. The support is limited in relocation mode
2025-09-23T07:36:47.8034711Z BOLT-WARNING: Failed to analyze 1171 relocations
2025-09-23T07:36:47.8390721Z BOLT-INFO: pre-processing profile using branch profile reader
2025-09-23T07:36:58.2078225Z BOLT-WARNING: 1 collisions detected while hashing binary objects. Use -v=1 to see the list.
2025-09-23T07:36:59.5066972Z BOLT-INFO: 14891 out of 127004 functions in the binary (11.7%) have non-empty execution profile
2025-09-23T07:36:59.5067725Z BOLT-INFO: 240 functions with profile could not be optimized
2025-09-23T07:36:59.5068200Z BOLT-INFO: profile for 1 objects was ignored
2025-09-23T07:37:00.2101723Z BOLT-INFO: profile quality metrics for the hottest 1000 functions (reporting top 5% values): function CFG discontinuity 0.00%; call graph flow conservation gap 0.00%; CFG flow conservation gap 0.00% (weighted) 0.00% (worst); exception handling usage 0.00% (of total BBEC) 0.00% (of total InvokeEC)
2025-09-23T07:37:00.8293612Z BOLT-INFO: validate-mem-refs updated 1 object references
2025-09-23T07:37:00.8687575Z BOLT-INFO: 593325 instructions were shortened
2025-09-23T07:37:00.9457522Z BOLT-INFO: removed 1712 empty blocks
2025-09-23T07:37:01.4707672Z BOLT-INFO: ICF folded 1673 out of 127312 functions in 4 passes. 12 functions had jump tables.
2025-09-23T07:37:01.4708686Z BOLT-INFO: Removing all identical functions will save 292.82 KB of code space. Folded functions were called 2701464704 times based on profile.
2025-09-23T07:37:02.7062384Z BOLT-INFO: basic block reordering modified layout of 7814 functions (52.47% of profiled, 6.22% of total)
2025-09-23T07:37:02.7441908Z BOLT-INFO: UCE removed 4 blocks and 166 bytes of code
2025-09-23T07:37:03.4401755Z BOLT-INFO: splitting separates 10641700 hot bytes from 8501746 cold bytes (55.59% of split functions is hot).
2025-09-23T07:37:03.4585762Z BOLT-INFO: 164 Functions were reordered by LoopInversionPass
2025-09-23T07:38:02.2842104Z BOLT-INFO: splitting separates 5422296 hot bytes from 8471894 cold bytes (39.03% of split functions is hot).
2025-09-23T07:38:02.6756469Z BOLT-INFO: program-wide dynostats after all optimizations before SCTC and FOP:
2025-09-23T07:38:02.6756924Z
2025-09-23T07:38:02.6757067Z 235104516805 : executed forward branches
2025-09-23T07:38:02.6757524Z 33013952055 : taken forward branches
2025-09-23T07:38:02.6757926Z 65383434196 : executed backward branches
2025-09-23T07:38:02.6758337Z 39036323064 : taken backward branches
2025-09-23T07:38:02.6758724Z 14888197913 : executed unconditional branches
2025-09-23T07:38:02.6759109Z 19895859861 : all function calls
2025-09-23T07:38:02.6759459Z 5214498636 : indirect calls
2025-09-23T07:38:02.6759787Z 3961463863 : PLT calls
2025-09-23T07:38:02.6760121Z 1784588195112 : executed instructions
2025-09-23T07:38:02.6760504Z 428273852366 : executed load instructions
2025-09-23T07:38:02.6760889Z 189118609977 : executed store instructions
2025-09-23T07:38:02.6761265Z 2678563147 : taken jump table branches
2025-09-23T07:38:02.6761651Z 0 : taken unknown indirect branches
2025-09-23T07:38:02.6762013Z 315376148914 : total branches
2025-09-23T07:38:02.6762340Z 86938473032 : taken branches
2025-09-23T07:38:02.6762706Z 228437675882 : non-taken conditional branches
2025-09-23T07:38:02.6763104Z 72050275119 : taken conditional branches
2025-09-23T07:38:02.6763479Z 300487951001 : all conditional branches
2025-09-23T07:38:02.6763717Z
2025-09-23T07:38:02.6763896Z 210977195564 : executed forward branches (-10.3%)
2025-09-23T07:38:02.6764493Z 17127806877 : taken forward branches (-48.1%)
2025-09-23T07:38:02.6764922Z 89510755437 : executed backward branches (+36.9%)
2025-09-23T07:38:02.6765354Z 40550966508 : taken backward branches (+3.9%)
2025-09-23T07:38:02.6766052Z 9850607756 : executed unconditional branches (-33.8%)
2025-09-23T07:38:02.6766480Z 19895859861 : all function calls (=)
2025-09-23T07:38:02.6766830Z 5214498636 : indirect calls (=)
2025-09-23T07:38:02.6767176Z 3961463863 : PLT calls (=)
2025-09-23T07:38:02.6767539Z 1771510302677 : executed instructions (-0.7%)
2025-09-23T07:38:02.6767949Z 428273852366 : executed load instructions (=)
2025-09-23T07:38:02.6768429Z 189118609977 : executed store instructions (=)
2025-09-23T07:38:02.6768834Z 2678563147 : taken jump table branches (=)
2025-09-23T07:38:02.6769234Z 0 : taken unknown indirect branches (=)
2025-09-23T07:38:02.6769619Z 310338558757 : total branches (-1.6%)
2025-09-23T07:38:02.6769986Z 67529381141 : taken branches (-22.3%)
2025-09-23T07:38:02.6770399Z 242809177616 : non-taken conditional branches (+6.3%)
2025-09-23T07:38:02.6770904Z 57678773385 : taken conditional branches (-19.9%)
2025-09-23T07:38:02.6771317Z 300487951001 : all conditional branches (=)
2025-09-23T07:38:02.6771583Z
2025-09-23T07:38:02.8707041Z BOLT-INFO: SCTC: patched 117 tail calls (113 forward) tail calls (4 backward) from a total of 117 while removing 5 double jumps and removing 120 basic blocks totalling 594 bytes of code. CTCs total execution count is 9486579 and the number of times CTCs are taken is 5413745
2025-09-23T07:38:09.1294010Z BOLT-INFO: setting __hot_start to 0x7e00000
2025-09-23T07:38:09.1294477Z BOLT-INFO: setting __hot_end to 0x8b1a287
2025-09-23T07:38:10.7843550Z ===-------------------------------------------------------------------------===
2025-09-23T07:38:10.7844184Z Rewrite passes
2025-09-23T07:38:10.7844663Z ===-------------------------------------------------------------------------===
2025-09-23T07:38:10.7845839Z Total Execution Time: 1241.3852 seconds (79.3465 wall clock)
2025-09-23T07:38:10.7846220Z
2025-09-23T07:38:10.7846499Z ---User Time--- --System Time-- --User+System-- ---Wall Time--- --- Name ---
2025-09-23T07:38:10.7847151Z 1192.1434 ( 98.8%) 34.0971 ( 97.0%) 1226.2405 ( 98.8%) 64.2015 ( 80.9%) run optimization passes
2025-09-23T07:38:10.7847759Z 4.5596 ( 0.4%) 0.4438 ( 1.3%) 5.0035 ( 0.4%) 5.0036 ( 6.3%) disassemble functions
2025-09-23T07:38:10.7848284Z 4.0559 ( 0.3%) 0.2574 ( 0.7%) 4.3133 ( 0.3%) 4.3133 ( 5.4%) emit and link
2025-09-23T07:38:10.7848814Z 3.2442 ( 0.3%) 0.1594 ( 0.5%) 3.4036 ( 0.3%) 3.4037 ( 4.3%) discover file objects
2025-09-23T07:38:10.7849368Z 1.4483 ( 0.1%) 0.0560 ( 0.2%) 1.5043 ( 0.1%) 1.5043 ( 1.9%) pre-process profile data
2025-09-23T07:38:10.7849923Z 0.4897 ( 0.0%) 0.0000 ( 0.0%) 0.4897 ( 0.0%) 0.4897 ( 0.6%) process profile data
2025-09-23T07:38:10.7850456Z 0.2412 ( 0.0%) 0.1367 ( 0.4%) 0.3779 ( 0.0%) 0.3779 ( 0.5%) read special sections
2025-09-23T07:38:10.7851005Z 0.0261 ( 0.0%) 0.0000 ( 0.0%) 0.0261 ( 0.0%) 0.0261 ( 0.0%) read debug info
2025-09-23T07:38:10.7851548Z 0.0130 ( 0.0%) 0.0001 ( 0.0%) 0.0131 ( 0.0%) 0.0131 ( 0.0%) process metadata pre-CFG
2025-09-23T07:38:10.7852116Z 0.0130 ( 0.0%) 0.0001 ( 0.0%) 0.0131 ( 0.0%) 0.0131 ( 0.0%) process profile data pre-CFG
2025-09-23T07:38:10.7852699Z 0.0002 ( 0.0%) 0.0000 ( 0.0%) 0.0002 ( 0.0%) 0.0002 ( 0.0%) update metadata post-emit
2025-09-23T07:38:10.7853235Z 0.0000 ( 0.0%) 0.0000 ( 0.0%) 0.0000 ( 0.0%) 0.0000 ( 0.0%) discover storage
2025-09-23T07:38:10.7853773Z 0.0000 ( 0.0%) 0.0000 ( 0.0%) 0.0000 ( 0.0%) 0.0000 ( 0.0%) process section metadata
2025-09-23T07:38:10.7854510Z 0.0000 ( 0.0%) 0.0000 ( 0.0%) 0.0000 ( 0.0%) 0.0000 ( 0.0%) process metadata post-CFG
2025-09-23T07:38:10.7855085Z 0.0000 ( 0.0%) 0.0000 ( 0.0%) 0.0000 ( 0.0%) 0.0000 ( 0.0%) finalize metadata pre-emit
2025-09-23T07:38:10.7855637Z 0.0000 ( 0.0%) 0.0000 ( 0.0%) 0.0000 ( 0.0%) 0.0000 ( 0.0%) update debug info
2025-09-23T07:38:10.7856168Z 1206.2346 (100.0%) 35.1506 (100.0%) 1241.3852 (100.0%) 79.3465 (100.0%) Total
2025-09-23T07:38:10.7856509Z
2025-09-23T07:38:10.7856711Z ===-------------------------------------------------------------------------===
2025-09-23T07:38:10.7857160Z Binary Function Pass Manager
2025-09-23T07:38:10.7857590Z ===-------------------------------------------------------------------------===
2025-09-23T07:38:10.7858152Z Total Execution Time: 1226.2137 seconds (64.1746 wall clock)
2025-09-23T07:38:10.7858474Z
2025-09-23T07:38:10.7858750Z ---User Time--- --System Time-- --User+System-- ---Wall Time--- --- Name ---
2025-09-23T07:38:10.7859365Z 1162.7142 ( 97.5%) 32.0898 ( 94.1%) 1194.8040 ( 97.4%) 57.2111 ( 89.1%) split-functions
2025-09-23T07:38:10.7859918Z 1.7854 ( 0.1%) 0.0000 ( 0.0%) 1.7854 ( 0.1%) 1.7855 ( 2.8%) reorder-functions
2025-09-23T07:38:10.7860487Z 12.6568 ( 1.1%) 0.0000 ( 0.0%) 12.6568 ( 1.0%) 0.8433 ( 1.3%) reorder-blocks
2025-09-23T07:38:10.7861025Z 3.5497 ( 0.3%) 1.5576 ( 4.6%) 5.1073 ( 0.4%) 0.8314 ( 1.3%) identical-code-folding
2025-09-23T07:38:10.7861644Z 0.7751 ( 0.1%) 0.0000 ( 0.0%) 0.7751 ( 0.1%) 0.7751 ( 1.2%) profile-quality-stats
2025-09-23T07:38:10.7862170Z 0.5383 ( 0.0%) 0.0000 ( 0.0%) 0.5383 ( 0.0%) 0.5383 ( 0.8%) fix-branches
2025-09-23T07:38:10.7862731Z 0.3774 ( 0.0%) 0.0000 ( 0.0%) 0.3774 ( 0.0%) 0.3774 ( 0.6%) print dyno-stats after optimizations
2025-09-23T07:38:10.7863297Z 0.3492 ( 0.0%) 0.0000 ( 0.0%) 0.3492 ( 0.0%) 0.3492 ( 0.5%) validate-mem-refs
2025-09-23T07:38:10.7863865Z 0.3344 ( 0.0%) 0.0000 ( 0.0%) 0.3344 ( 0.0%) 0.3344 ( 0.5%) set dyno-stats before optimizations
2025-09-23T07:38:10.7864475Z 0.1949 ( 0.0%) 0.0000 ( 0.0%) 0.1949 ( 0.0%) 0.1949 ( 0.3%) simplify-conditional-tail-calls
2025-09-23T07:38:10.7865062Z 0.1915 ( 0.0%) 0.0000 ( 0.0%) 0.1915 ( 0.0%) 0.1915 ( 0.3%) validate-internal-calls
2025-09-23T07:38:10.7865579Z 5.1667 ( 0.4%) 0.0000 ( 0.0%) 5.1667 ( 0.4%) 0.1470 ( 0.2%) aligner
2025-09-23T07:38:10.7866074Z 0.1213 ( 0.0%) 0.0000 ( 0.0%) 0.1213 ( 0.0%) 0.1213 ( 0.2%) inst-lowering
2025-09-23T07:38:10.7866578Z 0.0843 ( 0.0%) 0.0000 ( 0.0%) 0.0843 ( 0.0%) 0.0842 ( 0.1%) strip-rep-ret
2025-09-23T07:38:10.7867092Z 0.0822 ( 0.0%) 0.0000 ( 0.0%) 0.0822 ( 0.0%) 0.0822 ( 0.1%) lower-annotations
2025-09-23T07:38:10.7867604Z 0.4988 ( 0.0%) 0.4240 ( 1.2%) 0.9228 ( 0.1%) 0.0427 ( 0.1%) normalize CFG
2025-09-23T07:38:10.7868114Z 0.7246 ( 0.1%) 0.0082 ( 0.0%) 0.7328 ( 0.1%) 0.0383 ( 0.1%) finalize-functions
2025-09-23T07:38:10.7868648Z 0.8009 ( 0.1%) 0.0000 ( 0.0%) 0.8009 ( 0.1%) 0.0381 ( 0.1%) eliminate-unreachable
2025-09-23T07:38:10.7869194Z 0.5729 ( 0.0%) 0.0000 ( 0.0%) 0.5729 ( 0.0%) 0.0365 ( 0.1%) shorten-instructions
2025-09-23T07:38:10.7869719Z 0.0351 ( 0.0%) 0.0000 ( 0.0%) 0.0351 ( 0.0%) 0.0351 ( 0.1%) clean-mc-state
2025-09-23T07:38:10.7870222Z 0.3754 ( 0.0%) 0.0000 ( 0.0%) 0.3754 ( 0.0%) 0.0343 ( 0.1%) remove-nops
2025-09-23T07:38:10.7870727Z 0.0218 ( 0.0%) 0.0001 ( 0.0%) 0.0219 ( 0.0%) 0.0219 ( 0.0%) assign-sections
2025-09-23T07:38:10.7871247Z 0.1237 ( 0.0%) 0.0170 ( 0.0%) 0.1407 ( 0.0%) 0.0185 ( 0.0%) loop-inversion-opt
2025-09-23T07:38:10.7871785Z 0.0146 ( 0.0%) 0.0000 ( 0.0%) 0.0146 ( 0.0%) 0.0145 ( 0.0%) estimate-edge-counts
2025-09-23T07:38:10.7872301Z 0.0145 ( 0.0%) 0.0000 ( 0.0%) 0.0145 ( 0.0%) 0.0145 ( 0.0%) print-stats
2025-09-23T07:38:10.7872835Z 0.0135 ( 0.0%) 0.0000 ( 0.0%) 0.0135 ( 0.0%) 0.0135 ( 0.0%) patch-entries
2025-09-23T07:38:10.7873360Z 0.0000 ( 0.0%) 0.0000 ( 0.0%) 0.0000 ( 0.0%) 0.0000 ( 0.0%) retpoline-insertion
2025-09-23T07:38:10.7873869Z 0.0000 ( 0.0%) 0.0000 ( 0.0%) 0.0000 ( 0.0%) 0.0000 ( 0.0%) inlining
2025-09-23T07:38:10.7874358Z 0.0000 ( 0.0%) 0.0000 ( 0.0%) 0.0000 ( 0.0%) 0.0000 ( 0.0%) reorder-data
2025-09-23T07:38:10.7874883Z 0.0000 ( 0.0%) 0.0000 ( 0.0%) 0.0000 ( 0.0%) 0.0000 ( 0.0%) PLT call optimization
2025-09-23T07:38:10.7875415Z 0.0000 ( 0.0%) 0.0000 ( 0.0%) 0.0000 ( 0.0%) 0.0000 ( 0.0%) tail duplication
2025-09-23T07:38:10.7875972Z 0.0000 ( 0.0%) 0.0000 ( 0.0%) 0.0000 ( 0.0%) 0.0000 ( 0.0%) frame-optimizer
2025-09-23T07:38:10.7876478Z 0.0000 ( 0.0%) 0.0000 ( 0.0%) 0.0000 ( 0.0%) 0.0000 ( 0.0%) peepholes
2025-09-23T07:38:10.7876979Z 0.0000 ( 0.0%) 0.0000 ( 0.0%) 0.0000 ( 0.0%) 0.0000 ( 0.0%) alloc-combiner
2025-09-23T07:38:10.7877616Z 0.0000 ( 0.0%) 0.0000 ( 0.0%) 0.0000 ( 0.0%) 0.0000 ( 0.0%) indirect-call-promotion
2025-09-23T07:38:10.7878209Z 1192.1171 (100.0%) 34.0967 (100.0%) 1226.2137 (100.0%) 64.1746 (100.0%) Total
2025-09-23T07:38:10.7878551Z
2025-09-23T07:38:10.7878737Z ===-------------------------------------------------------------------------===
2025-09-23T07:38:10.7879171Z CG breakdown
2025-09-23T07:38:10.7879629Z ===-------------------------------------------------------------------------===
2025-09-23T07:38:10.7880128Z Total Execution Time: 0.8442 seconds (0.8442 wall clock)
2025-09-23T07:38:10.7880432Z
2025-09-23T07:38:10.7880643Z ---User Time--- --User+System-- ---Wall Time--- --- Name ---
2025-09-23T07:38:10.7881157Z 0.8442 (100.0%) 0.8442 (100.0%) 0.8442 (100.0%) Callgraph construction
2025-09-23T07:38:10.7881627Z 0.8442 (100.0%) 0.8442 (100.0%) 0.8442 (100.0%) Total
2025-09-23T07:38:10.7881902Z
```
</details>
And here is the log for the Rust compiler's shared library:
<details>
<summary>rustc</summary>
```
2025-09-23T08:08:24.1087661Z [2025-09-23T08:08:24.108Z INFO opt_dist::exec] Executing `/rustroot/bin/llvm-bolt /tmp/.tmp7rsDA1 -data /tmp/tmp-multistage/opt-artifacts/rustc-bolt.profdata -o /checkout/obj/build/x86_64-unknown-linux-gnu/stage2/lib/librustc_driver-37c25f9240306b8c.so -reorder-blocks=ext-tsp -reorder-functions=cdsort -split-functions -split-strategy=cdsplit -split-all-cold -jump-tables=move -icf=all -update-debug-sections -dyno-stats --time-rewrite --time-opts [at /checkout/obj]`
2025-09-23T08:08:24.1162714Z BOLT-INFO: shared object or position-independent executable detected
2025-09-23T08:08:24.1167794Z BOLT-INFO: Target architecture: x86_64
2025-09-23T08:08:24.1168182Z BOLT-INFO: BOLT version: <unknown>
2025-09-23T08:08:24.1168539Z BOLT-INFO: first alloc address is 0x0
2025-09-23T08:08:24.1169073Z BOLT-INFO: creating new program header table at address 0x5000000, offset 0x5000000
2025-09-23T08:08:24.1169599Z BOLT-INFO: enabling relocation mode
2025-09-23T08:08:24.3930106Z BOLT-INFO: enabling lite mode
2025-09-23T08:08:25.0832384Z BOLT-WARNING: split function detected on input : d_type.cold. The support is limited in relocation mode
2025-09-23T08:08:27.2217568Z BOLT-WARNING: Failed to analyze 216 relocations
2025-09-23T08:08:27.2420728Z BOLT-INFO: pre-processing profile using branch profile reader
2025-09-23T08:08:39.2601621Z BOLT-WARNING: 10 collisions detected while hashing binary objects. Use -v=1 to see the list.
2025-09-23T08:08:40.7672287Z BOLT-INFO: 14020 out of 73549 functions in the binary (19.1%) have non-empty execution profile
2025-09-23T08:08:40.7673057Z BOLT-INFO: 496 functions with profile could not be optimized
2025-09-23T08:08:40.7673519Z BOLT-INFO: profile for 1 objects was ignored
2025-09-23T08:08:41.5617188Z BOLT-INFO: profile quality metrics for the hottest 1000 functions (reporting top 5% values): function CFG discontinuity 0.00%; call graph flow conservation gap 0.00%; CFG flow conservation gap 0.00% (weighted) 0.00% (worst); exception handling usage 0.00% (of total BBEC) 0.00% (of total InvokeEC)
2025-09-23T08:08:42.4075121Z BOLT-INFO: 830299 instructions were shortened
2025-09-23T08:08:42.4612088Z BOLT-INFO: removed 1400 empty blocks
2025-09-23T08:08:42.4612532Z BOLT-INFO: merged 3 duplicate CFG edges
2025-09-23T08:08:43.1085506Z BOLT-INFO: ICF folded 71 out of 73966 functions in 3 passes. 13 functions had jump tables.
2025-09-23T08:08:43.1086714Z BOLT-INFO: Removing all identical functions will save 33.68 KB of code space. Folded functions were called 83349861 times based on profile.
2025-09-23T08:08:44.2241731Z BOLT-INFO: basic block reordering modified layout of 8895 functions (63.45% of profiled, 12.04% of total)
2025-09-23T08:08:45.6003056Z BOLT-INFO: splitting separates 19348618 hot bytes from 9477858 cold bytes (67.12% of split functions is hot).
2025-09-23T08:08:45.6137762Z BOLT-INFO: 118 Functions were reordered by LoopInversionPass
2025-09-23T08:17:09.2383905Z BOLT-INFO: splitting separates 12219489 hot bytes from 7571490 cold bytes (61.74% of split functions is hot).
2025-09-23T08:17:09.6368551Z BOLT-INFO: program-wide dynostats after all optimizations before SCTC and FOP:
2025-09-23T08:17:09.6370124Z
2025-09-23T08:17:09.6370507Z 159303000032 : executed forward branches
2025-09-23T08:17:09.6370968Z 13512704701 : taken forward branches
2025-09-23T08:17:09.6371346Z 22109772505 : executed backward branches
2025-09-23T08:17:09.6371727Z 15474886893 : taken backward branches
2025-09-23T08:17:09.6372118Z 7312349150 : executed unconditional branches
2025-09-23T08:17:09.6372699Z 10414141994 : all function calls
2025-09-23T08:17:09.6373049Z 5212041960 : indirect calls
2025-09-23T08:17:09.6373384Z 157126340 : PLT calls
2025-09-23T08:17:09.6373718Z 1299789320113 : executed instructions
2025-09-23T08:17:09.6374097Z 327477301380 : executed load instructions
2025-09-23T08:17:09.6374499Z 183484733372 : executed store instructions
2025-09-23T08:17:09.6374959Z 3252587661 : taken jump table branches
2025-09-23T08:17:09.6375346Z 0 : taken unknown indirect branches
2025-09-23T08:17:09.6375717Z 188725121687 : total branches
2025-09-23T08:17:09.6376050Z 36299940744 : taken branches
2025-09-23T08:17:09.6376498Z 152425180943 : non-taken conditional branches
2025-09-23T08:17:09.6376899Z 28987591594 : taken conditional branches
2025-09-23T08:17:09.6377281Z 181412772537 : all conditional branches
2025-09-23T08:17:09.6377523Z
2025-09-23T08:17:09.6377706Z 150062838844 : executed forward branches (-5.8%)
2025-09-23T08:17:09.6378130Z 7662406296 : taken forward branches (-43.3%)
2025-09-23T08:17:09.6378553Z 31348251587 : executed backward branches (+41.8%)
2025-09-23T08:17:09.6378981Z 14962033590 : taken backward branches (-3.3%)
2025-09-23T08:17:09.6379428Z 6073487992 : executed unconditional branches (-16.9%)
2025-09-23T08:17:09.6379854Z 10414141994 : all function calls (=)
2025-09-23T08:17:09.6380214Z 5212041960 : indirect calls (=)
2025-09-23T08:17:09.6380562Z 157126340 : PLT calls (=)
2025-09-23T08:17:09.6380925Z 1293805780658 : executed instructions (-0.5%)
2025-09-23T08:17:09.6381339Z 327477301380 : executed load instructions (=)
2025-09-23T08:17:09.6381748Z 183484733372 : executed store instructions (=)
2025-09-23T08:17:09.6382150Z 3252587661 : taken jump table branches (=)
2025-09-23T08:17:09.6382549Z 0 : taken unknown indirect branches (=)
2025-09-23T08:17:09.6382926Z 187484578423 : total branches (-0.7%)
2025-09-23T08:17:09.6383286Z 28697927878 : taken branches (-20.9%)
2025-09-23T08:17:09.6383697Z 158786650545 : non-taken conditional branches (+4.2%)
2025-09-23T08:17:09.6384153Z 22624439886 : taken conditional branches (-22.0%)
2025-09-23T08:17:09.6384584Z 181411090431 : all conditional branches (-0.0%)
2025-09-23T08:17:09.6384859Z
2025-09-23T08:17:09.8918378Z BOLT-INFO: SCTC: patched 33 tail calls (33 forward) tail calls (0 backward) from a total of 33 while removing 0 double jumps and removing 33 basic blocks totalling 165 bytes of code. CTCs total execution count is 1454562 and the number of times CTCs are taken is 1450251
2025-09-23T08:17:18.0775912Z BOLT-INFO: setting _end to 0x7ad9420
2025-09-23T08:17:18.0928802Z BOLT-INFO: setting __hot_start to 0x5200000
2025-09-23T08:17:18.0929438Z BOLT-INFO: setting __hot_end to 0x69c5e9b
2025-09-23T08:17:19.7001113Z ===-------------------------------------------------------------------------===
2025-09-23T08:17:19.7001661Z Rewrite passes
2025-09-23T08:17:19.7002090Z ===-------------------------------------------------------------------------===
2025-09-23T08:17:19.7002612Z Total Execution Time: 4849.9231 seconds (527.8837 wall clock)
2025-09-23T08:17:19.7003190Z
2025-09-23T08:17:19.7003481Z ---User Time--- --System Time-- --User+System-- ---Wall Time--- --- Name ---
2025-09-23T08:17:19.7004150Z 2411.0373 ( 99.3%) 2421.3456 (100.0%) 4832.3830 ( 99.6%) 510.3416 ( 96.7%) run optimization passes
2025-09-23T08:17:19.7004733Z 6.2426 ( 0.3%) 0.2655 ( 0.0%) 6.5081 ( 0.1%) 6.5083 ( 1.2%) emit and link
2025-09-23T08:17:19.7005265Z 5.2722 ( 0.2%) 0.5675 ( 0.0%) 5.8397 ( 0.1%) 5.8411 ( 1.1%) disassemble functions
2025-09-23T08:17:19.7006101Z 2.6959 ( 0.1%) 0.1529 ( 0.0%) 2.8488 ( 0.1%) 2.8489 ( 0.5%) discover file objects
2025-09-23T08:17:19.7006765Z 1.4865 ( 0.1%) 0.0403 ( 0.0%) 1.5268 ( 0.0%) 1.5269 ( 0.3%) pre-process profile data
2025-09-23T08:17:19.7007318Z 0.5127 ( 0.0%) 0.0000 ( 0.0%) 0.5127 ( 0.0%) 0.5127 ( 0.1%) process profile data
2025-09-23T08:17:19.7007852Z 0.1676 ( 0.0%) 0.1084 ( 0.0%) 0.2759 ( 0.0%) 0.2760 ( 0.1%) read special sections
2025-09-23T08:17:19.7008378Z 0.0114 ( 0.0%) 0.0000 ( 0.0%) 0.0114 ( 0.0%) 0.0114 ( 0.0%) read debug info
2025-09-23T08:17:19.7008986Z 0.0084 ( 0.0%) 0.0000 ( 0.0%) 0.0084 ( 0.0%) 0.0084 ( 0.0%) process metadata pre-CFG
2025-09-23T08:17:19.7009552Z 0.0084 ( 0.0%) 0.0000 ( 0.0%) 0.0084 ( 0.0%) 0.0084 ( 0.0%) process profile data pre-CFG
2025-09-23T08:17:19.7010162Z 0.0000 ( 0.0%) 0.0000 ( 0.0%) 0.0000 ( 0.0%) 0.0000 ( 0.0%) discover storage
2025-09-23T08:17:19.7010696Z 0.0000 ( 0.0%) 0.0000 ( 0.0%) 0.0000 ( 0.0%) 0.0000 ( 0.0%) process section metadata
2025-09-23T08:17:19.7011256Z 0.0000 ( 0.0%) 0.0000 ( 0.0%) 0.0000 ( 0.0%) 0.0000 ( 0.0%) update metadata post-emit
2025-09-23T08:17:19.7011824Z 0.0000 ( 0.0%) 0.0000 ( 0.0%) 0.0000 ( 0.0%) 0.0000 ( 0.0%) process metadata post-CFG
2025-09-23T08:17:19.7012387Z 0.0000 ( 0.0%) 0.0000 ( 0.0%) 0.0000 ( 0.0%) 0.0000 ( 0.0%) finalize metadata pre-emit
2025-09-23T08:17:19.7012931Z 0.0000 ( 0.0%) 0.0000 ( 0.0%) 0.0000 ( 0.0%) 0.0000 ( 0.0%) update debug info
2025-09-23T08:17:19.7013477Z 2427.4429 (100.0%) 2422.4802 (100.0%) 4849.9231 (100.0%) 527.8837 (100.0%) Total
2025-09-23T08:17:19.7013821Z
2025-09-23T08:17:19.7014007Z ===-------------------------------------------------------------------------===
2025-09-23T08:17:19.7014456Z Binary Function Pass Manager
2025-09-23T08:17:19.7014889Z ===-------------------------------------------------------------------------===
2025-09-23T08:17:19.7015393Z Total Execution Time: 4832.3549 seconds (510.3134 wall clock)
2025-09-23T08:17:19.7015715Z
2025-09-23T08:17:19.7015979Z ---User Time--- --System Time-- --User+System-- ---Wall Time--- --- Name ---
2025-09-23T08:17:19.7016604Z 2372.4561 ( 98.4%) 2411.8820 ( 99.6%) 4784.3382 ( 99.0%) 502.1853 ( 98.4%) split-functions
2025-09-23T08:17:19.7017165Z 1.9135 ( 0.1%) 0.0000 ( 0.0%) 1.9135 ( 0.0%) 1.9135 ( 0.4%) reorder-functions
2025-09-23T08:17:19.7017703Z 4.9072 ( 0.2%) 6.8726 ( 0.3%) 11.7798 ( 0.2%) 1.0785 ( 0.2%) identical-code-folding
2025-09-23T08:17:19.7018235Z 0.8752 ( 0.0%) 0.0000 ( 0.0%) 0.8752 ( 0.0%) 0.8752 ( 0.2%) fix-branches
2025-09-23T08:17:19.7018760Z 0.8417 ( 0.0%) 0.0000 ( 0.0%) 0.8417 ( 0.0%) 0.8445 ( 0.2%) profile-quality-stats
2025-09-23T08:17:19.7019295Z 10.8591 ( 0.5%) 0.0000 ( 0.0%) 10.8591 ( 0.2%) 0.5774 ( 0.1%) reorder-blocks
2025-09-23T08:17:19.7019816Z 0.4994 ( 0.0%) 0.0000 ( 0.0%) 0.4994 ( 0.0%) 0.4994 ( 0.1%) validate-mem-refs
2025-09-23T08:17:19.7020386Z 0.3845 ( 0.0%) 0.0000 ( 0.0%) 0.3845 ( 0.0%) 0.3845 ( 0.1%) print dyno-stats after optimizations
2025-09-23T08:17:19.7020953Z 10.1393 ( 0.4%) 2.5906 ( 0.1%) 12.7299 ( 0.3%) 0.3762 ( 0.1%) finalize-functions
2025-09-23T08:17:19.7021519Z 0.3394 ( 0.0%) 0.0000 ( 0.0%) 0.3394 ( 0.0%) 0.3394 ( 0.1%) set dyno-stats before optimizations
2025-09-23T08:17:19.7022194Z 0.2545 ( 0.0%) 0.0000 ( 0.0%) 0.2545 ( 0.0%) 0.2545 ( 0.0%) simplify-conditional-tail-calls
2025-09-23T08:17:19.7022784Z 0.2519 ( 0.0%) 0.0000 ( 0.0%) 0.2519 ( 0.0%) 0.2519 ( 0.0%) validate-internal-calls
2025-09-23T08:17:19.7023321Z 0.1601 ( 0.0%) 0.0000 ( 0.0%) 0.1601 ( 0.0%) 0.1601 ( 0.0%) inst-lowering
2025-09-23T08:17:19.7023844Z 0.1249 ( 0.0%) 0.0000 ( 0.0%) 0.1249 ( 0.0%) 0.1249 ( 0.0%) lower-annotations
2025-09-23T08:17:19.7024353Z 3.9011 ( 0.2%) 0.0000 ( 0.0%) 3.9011 ( 0.1%) 0.1101 ( 0.0%) aligner
2025-09-23T08:17:19.7024889Z 0.1056 ( 0.0%) 0.0000 ( 0.0%) 0.1056 ( 0.0%) 0.1056 ( 0.0%) strip-rep-ret
2025-09-23T08:17:19.7025398Z 0.0489 ( 0.0%) 0.0000 ( 0.0%) 0.0489 ( 0.0%) 0.0489 ( 0.0%) clean-mc-state
2025-09-23T08:17:19.7025945Z 0.9365 ( 0.0%) 0.0000 ( 0.0%) 0.9365 ( 0.0%) 0.0400 ( 0.0%) eliminate-unreachable
2025-09-23T08:17:19.7026494Z 0.7548 ( 0.0%) 0.0000 ( 0.0%) 0.7548 ( 0.0%) 0.0363 ( 0.0%) shorten-instructions
2025-09-23T08:17:19.7027053Z 0.5528 ( 0.0%) 0.0000 ( 0.0%) 0.5528 ( 0.0%) 0.0294 ( 0.0%) normalize CFG
2025-09-23T08:17:19.7027558Z 0.4854 ( 0.0%) 0.0000 ( 0.0%) 0.4854 ( 0.0%) 0.0243 ( 0.0%) remove-nops
2025-09-23T08:17:19.7028068Z 0.0154 ( 0.0%) 0.0000 ( 0.0%) 0.0154 ( 0.0%) 0.0154 ( 0.0%) assign-sections
2025-09-23T08:17:19.7028630Z 0.1778 ( 0.0%) 0.0000 ( 0.0%) 0.1778 ( 0.0%) 0.0136 ( 0.0%) loop-inversion-opt
2025-09-23T08:17:19.7029144Z 0.0085 ( 0.0%) 0.0000 ( 0.0%) 0.0085 ( 0.0%) 0.0085 ( 0.0%) print-stats
2025-09-23T08:17:19.7029669Z 0.0084 ( 0.0%) 0.0000 ( 0.0%) 0.0084 ( 0.0%) 0.0084 ( 0.0%) estimate-edge-counts
2025-09-23T08:17:19.7030197Z 0.0077 ( 0.0%) 0.0000 ( 0.0%) 0.0077 ( 0.0%) 0.0077 ( 0.0%) patch-entries
2025-09-23T08:17:19.7030725Z 0.0000 ( 0.0%) 0.0000 ( 0.0%) 0.0000 ( 0.0%) 0.0000 ( 0.0%) retpoline-insertion
2025-09-23T08:17:19.7031283Z 0.0000 ( 0.0%) 0.0000 ( 0.0%) 0.0000 ( 0.0%) 0.0000 ( 0.0%) indirect-call-promotion
2025-09-23T08:17:19.7031838Z 0.0000 ( 0.0%) 0.0000 ( 0.0%) 0.0000 ( 0.0%) 0.0000 ( 0.0%) PLT call optimization
2025-09-23T08:17:19.7032355Z 0.0000 ( 0.0%) 0.0000 ( 0.0%) 0.0000 ( 0.0%) 0.0000 ( 0.0%) inlining
2025-09-23T08:17:19.7032867Z 0.0000 ( 0.0%) 0.0000 ( 0.0%) 0.0000 ( 0.0%) 0.0000 ( 0.0%) tail duplication
2025-09-23T08:17:19.7033378Z 0.0000 ( 0.0%) 0.0000 ( 0.0%) 0.0000 ( 0.0%) 0.0000 ( 0.0%) peepholes
2025-09-23T08:17:19.7033877Z 0.0000 ( 0.0%) 0.0000 ( 0.0%) 0.0000 ( 0.0%) 0.0000 ( 0.0%) reorder-data
2025-09-23T08:17:19.7034390Z 0.0000 ( 0.0%) 0.0000 ( 0.0%) 0.0000 ( 0.0%) 0.0000 ( 0.0%) frame-optimizer
2025-09-23T08:17:19.7034909Z 0.0000 ( 0.0%) 0.0000 ( 0.0%) 0.0000 ( 0.0%) 0.0000 ( 0.0%) alloc-combiner
2025-09-23T08:17:19.7035446Z 2411.0097 (100.0%) 2421.3452 (100.0%) 4832.3549 (100.0%) 510.3134 (100.0%) Total
2025-09-23T08:17:19.7035797Z
2025-09-23T08:17:19.7035986Z ===-------------------------------------------------------------------------===
2025-09-23T08:17:19.7036419Z CG breakdown
2025-09-23T08:17:19.7036838Z ===-------------------------------------------------------------------------===
2025-09-23T08:17:19.7037419Z Total Execution Time: 1.1101 seconds (1.1101 wall clock)
2025-09-23T08:17:19.7037725Z
2025-09-23T08:17:19.7037936Z ---User Time--- --User+System-- ---Wall Time--- --- Name ---
2025-09-23T08:17:19.7038452Z 1.1101 (100.0%) 1.1101 (100.0%) 1.1101 (100.0%) Callgraph construction
2025-09-23T08:17:19.7038921Z 1.1101 (100.0%) 1.1101 (100.0%) 1.1101 (100.0%) Total
2025-09-23T08:17:19.7039198Z
```
</details>
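To read the `--time-opts` tables in these logs quickly, a small script can pick out the pass with the largest wall-clock time. This is only a sketch and not part of BOLT: the names `parse_timing_row` and `slowest_pass` are hypothetical, and the regular expression assumes the row layout seen in the logs above (a series of `<seconds> (<percent>%)` columns with wall time last, followed by the pass name).

```python
import re

# One "<seconds> (<percent>%)" column, e.g. "502.1853 ( 98.4%)".
# Assumption: this matches the column layout printed in the logs above.
COLUMN = re.compile(r"(\d+\.\d+)\s+\(\s*(\d+\.\d+)%\)")


def parse_timing_row(line):
    """Return (name, wall_seconds, wall_percent), or None if not a timing row."""
    columns = list(COLUMN.finditer(line))
    if not columns:
        return None
    wall = columns[-1]  # wall time is the last numeric column in these tables
    name = line[wall.end():].strip()
    if not name:
        return None
    return name, float(wall.group(1)), float(wall.group(2))


def slowest_pass(log_text):
    """Return the non-Total row with the largest wall time, or None."""
    rows = [parse_timing_row(line) for line in log_text.splitlines()]
    rows = [row for row in rows if row is not None and row[0] != "Total"]
    return max(rows, key=lambda row: row[1]) if rows else None
```

Fed the "Binary Function Pass Manager" table from the rustc log, this should report `split-functions` with 502.1853 seconds of wall time (98.4%).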
https://github.com/llvm/llvm-project/pull/156243