[llvm] [PGO] Add llvm.loop.estimated_trip_count metadata (PR #152775)

Joel E. Denny via llvm-commits llvm-commits at lists.llvm.org
Thu Aug 14 14:17:47 PDT 2025


jdenny-ornl wrote:

> I assume that this patch is intended to be NFC-ish, but it does cause codegen changes, and as-is it's hard to tell whether that's because of the new loop metadata that is present everywhere, or due to divergence between branch weights and the metadata. (A sample to look at would be libclamav_nsis_LZMADecode.c from llvm-test-suite.)

Thanks.  I took a look at libclamav_nsis_LZMADecode.c, using the [results you previously sent](https://llvm-compile-time-tracker.com/compare.php?from=3e579d93ab50952628a51bda05f3a39f6a5a631c&to=f7b65011de519b1bd987892475db61f99dde44ce&stat=instructions%3Au) as a reference.  For -O3, that shows that the previously landed version of this PR reduced the instruction count by 25.84%.

Following the instructions on the About page there, I used `valgrind --tool=callgrind` to try to reproduce the results.

For the current PR, I saw the following summary:

```
==2593500== I   refs:      909,055,484
```

I commented out all runs of PGOEstimateTripCountsPass and saw:

```
==2600705== I   refs:      1,203,023,203
```

So running PGOEstimateTripCountsPass still has roughly the same impact as in the previous results: 24.4% reduction.

I then tried to determine what in PGOEstimateTripCountsPass causes the change.  In setLoopEstimatedTripCount's addStringMetadataToLoop call, I renamed the loop metadata from `llvm.loop.estimated_trip_count` to `foo`, which I believe should be semantically ignored by the rest of LLVM, including getLoopEstimatedTripCount:

```
==2674918== I   refs:      909,054,727
```

So, it appears that it is the up-front loop metadata creation that produces the change rather than any changes related to estimated trip counts or branch weights.

To rule out anything else from the PR, I went back to the main branch and added a new function pass into the pipeline where this PR adds PGOEstimateTripCountsPass.  The new pass merely calls addStringMetadataToLoop to add the bogus "foo" loop metadata to all loops.  For any function containing loops, it preserves only CFGAnalyses and LazyCallGraphAnalysis, and it preserves all analyses otherwise.

```
==2797012== I   refs:      909,223,730
```

I then removed the addStringMetadataToLoop call from that pass to confirm that it is not the analysis invalidation that causes the change:

```
==2800546== I   refs:      1,206,601,523
```
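
For anyone who wants to double-check the arithmetic, here is a minimal sketch that compares the five `I refs` totals quoted above. The run labels and the helper name `reduction_percent` are made up for illustration and are not part of the PR or of callgrind. Each run is compared against the baseline from its own branch.

```python
# I refs totals copied from the callgrind summaries quoted above.
# The labels are illustrative descriptions, not names used by any tool.
PR_RUNS = {
    "PR as-is": 909_055_484,
    "PR, metadata renamed to foo": 909_054_727,
}
PR_BASELINE = 1_203_023_203  # PR with all PGOEstimateTripCountsPass runs commented out

MAIN_RUNS = {
    "main + pass adding foo metadata": 909_223_730,
}
MAIN_BASELINE = 1_206_601_523  # main + same pass without the addStringMetadataToLoop call


def reduction_percent(baseline: int, value: int) -> float:
    """Return the percentage by which `value` is lower than `baseline`."""
    return 100.0 * (baseline - value) / baseline


def report(title: str, baseline: int, runs: dict) -> None:
    """Print each run's total and its reduction relative to the baseline."""
    print(f"{title} (baseline {baseline:,d})")
    for label, irefs in runs.items():
        pct = reduction_percent(baseline, irefs)
        print(f"  {label:34s} {irefs:>13,d}  {pct:5.1f}% fewer")


if __name__ == "__main__":
    report("PR branch", PR_BASELINE, PR_RUNS)
    report("main branch", MAIN_BASELINE, MAIN_RUNS)
```

In this comparison, every run that attaches loop metadata (whatever its name) lands roughly a quarter below the baseline of its branch, while the two runs without metadata are within about 0.3% of each other.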

> I think we should at least start by not having the pass and only doing on the fly updates.

Do you think we should remove the pass or just disable it by default, as was previously suggested?  I am leery of the latter because I am afraid it will just end up being unused code, unless someone else commits to continuing to investigate it.

https://github.com/llvm/llvm-project/pull/152775

