[clang-tools-extra] Bfi precision (PR #66285)
Matthias Braun via cfe-commits
cfe-commits at lists.llvm.org
Thu Oct 26 11:45:33 PDT 2023
MatzeB wrote:
I think I traced the clang regression to `InlineCostCallAnalyzer::isColdCallSite` returning `true` in more cases. Previously it did not, because of what looks like an accidental loss of precision:
The example I analyzed starts out as a single-block function with a call resulting in this BFI:
```
- entry: float = 1.0, int = 8
```
It seems that when we inline calls we just scale the callee's BFI values to the callsite. Given that the entry block only had a value of `int = 8`, we have basically no precision left at the low end. So after inlining a bit more, the BFI in my example looks like this:
```
- entry: float = 1.0, int = 8
- if.then.i: float = 0.0, int = 0
- if.else.i: float = 0.0, int = 7
- if.end15.sink.split.i: float = 0.0, int = 0
- if.end15.i: float = 0.0, int = 8
- if.then19.i: float = 0.0, int = 5
- _ZN4llvm12DenseMapBaseINS_8DenseMapIPNS_10StructTypeEPNS_12StructLayoutENS_12DenseMapInfoIS3_vEENS_6detail12DenseMapPairIS3_S5_EEEES3_S5_S7_SA_E20InsertIntoBucketImplIS3_EEPSA_RKS3_RKT_SE_.exit: float = 0.0, int = 8
```
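To make the precision loss concrete, here is a minimal sketch of that kind of rescaling. It is not LLVM's code: `scaleToCallSite` is a hypothetical helper, and the callee frequencies (entry `1000`, blocks `1` and `875`) are illustrative values chosen so that the ratios roughly match the blocks above.

```cpp
#include <cstdint>

// Hypothetical helper, not LLVM's implementation: rescale a callee block
// frequency so that the callee's entry frequency maps onto the call site
// frequency. Integer division truncates. This sketch assumes the product
// fits in 64 bits.
constexpr uint64_t scaleToCallSite(uint64_t CalleeBlockFreq,
                                   uint64_t CalleeEntryFreq,
                                   uint64_t CallSiteFreq) {
  return CalleeBlockFreq * CallSiteFreq / CalleeEntryFreq;
}

// A block that runs about 0.1% of the time collapses to 0 when the call
// site frequency is only 8.
static_assert(scaleToCallSite(1, 1000, 8) == 0, "rare block rounds to 0");

// A block that runs 87.5% of the time keeps a coarse value.
static_assert(scaleToCallSite(875, 1000, 8) == 7, "hot block survives");
```

With only the values 0 through 8 available, anything rarer than one in eight is indistinguishable from "never executed".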
Note that some block frequencies got scaled down to `0`. More importantly, no matter what we do, it is impossible to have any code marked as "cold" in this situation, because the cold code threshold is computed as 2% of the entry block frequency. With an entry block value of `int = 8`, the 2% cold threshold is the integer `0`, and no block frequency can be below that threshold.
In comparison, with the new conversion scheme we end up with:
```
- entry: float = 1.0, int = 18014398509481984
- if.then.i: float = 0.0, int = 9002696048640
- if.else.i: float = 0.0, int = 18005395813433344
- if.end15.sink.split.i: float = 0.0, int = 18000892999733
- if.end15.i: float = 0.0, int = 18014398509481984
- if.then19.i: float = 0.0, int = 11258999068426240
- _ZN4llvm12DenseMapBaseINS_8DenseMapIPNS_10StructTypeEPNS_12StructLayoutENS_12DenseMapInfoIS3_vEENS_6detail12DenseMapPairIS3_S5_EEEES3_S5_S7_SA_E20InsertIntoBucketImplIS3_EEPSA_RKS3_RKT_SE_.exit: float = 0.0, int = 18014398509481984
```
for the same situation, and blocks like `if.end15.sink.split.i` are now classified as cold code.
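The threshold arithmetic for both cases can be sketched as follows. This is a simplified model, not the actual `isColdCallSite` code: `coldThreshold` and `isBelowColdThreshold` are hypothetical names, and the truncating integer division and the strict `<` comparison are assumptions. The frequencies are the ones from the dumps above.

```cpp
#include <cstdint>

// 2% of the entry frequency, as described in the message above.
constexpr uint64_t ColdPercent = 2;

// Hypothetical helper: cold threshold using truncating integer math.
constexpr uint64_t coldThreshold(uint64_t EntryFreq) {
  return EntryFreq * ColdPercent / 100;
}

// Hypothetical helper: a block counts as cold if it is strictly below
// the threshold (assumed comparison).
constexpr bool isBelowColdThreshold(uint64_t BlockFreq, uint64_t EntryFreq) {
  return BlockFreq < coldThreshold(EntryFreq);
}

// Old scheme: entry = 8, so the threshold is 0 and nothing can be cold,
// not even a block whose frequency was scaled down to 0.
static_assert(coldThreshold(8) == 0, "threshold truncates to 0");
static_assert(!isBelowColdThreshold(0, 8), "0 is not below 0");

// New scheme: entry = 18014398509481984 (2^54).
static_assert(coldThreshold(18014398509481984ULL) == 360287970189639ULL,
              "2% of 2^54, truncated");
// if.end15.sink.split.i is now below the threshold.
static_assert(isBelowColdThreshold(18000892999733ULL, 18014398509481984ULL),
              "rare block is cold");
// if.then19.i is still well above it.
static_assert(!isBelowColdThreshold(11258999068426240ULL,
                                    18014398509481984ULL),
              "frequent block is not cold");
```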
https://github.com/llvm/llvm-project/pull/66285