[PATCH] D52845: Update entry count for cold calls
Wei Mi via Phabricator via llvm-commits
llvm-commits at lists.llvm.org
Fri Oct 12 16:52:17 PDT 2018
wmi added a comment.
Sorry for the late reply. I didn't notice the thread.
I admit the problem that the patch describes introduces some impreciseness to the profile annotated, but a problem is that the patch only updates the entry counts of outlined functions, but the profiles inside of those outlined functions are not updated. Ideally if the entry count of a outline function is doubled, maybe the callsite count inside of the function also should be doubled too?
I am not saying it is necessary to keep all the profiles consistent, after all it is about cold place. But I like to know whether the patch is trying to solve a theoretical problem or motivated by a real case.
Thanks,
Wei.
================
Comment at: lib/Transforms/IPO/SampleProfile.cpp:866-867
+ auto pair = notInlinedCounts.try_emplace(Callee, 0);
+ const auto *FS = Pair.getSecond();
+ pair.first->second += FS->getEntrySamples();
+ }
----------------
Can be merged into one line.
================
Comment at: lib/Transforms/IPO/SampleProfile.cpp:1603-1609
+ uint64_t CallCount = pair.getSecond();
+ uint64_t CalleeCount = 0;
+ if (auto ECount = Callee->getEntryCount())
+ CalleeCount = ECount.getCount();
+
+ CalleeCount += CallCount;
+ Callee->setEntryCount(CalleeCount);
----------------
Maybe merged to:
ECount = Callee->getEntryCount()
Callee->setEntryCount(pair.getSecond() + ECount ? ECount.getCount() : 0);
================
Comment at: test/Transforms/SampleProfile/Inputs/inline-cold.prof:1-9
+main:225715:10
+ 2.1: 5553
+ 3: 5391
+ 3.1: _Z3sumii:586
+ 0: 5279
+ 1: 5279
+ 2: 5279
----------------
Better to make the function total count equal to the sum of the counts inside of the function because such inconsistency can cause trouble sometimes.
================
Comment at: test/Transforms/SampleProfile/inline-cold.ll:30-31
+ store i32 %y, i32* %y.addr, align 4
+ %0 = load i32, i32* %x.addr, align 4, !dbg !11
+ %1 = load i32, i32* %y.addr, align 4, !dbg !11
+ %add = add nsw i32 %0, %1, !dbg !11
----------------
%0, %1... should be renamed in testcase.
Repository:
rL LLVM
https://reviews.llvm.org/D52845
More information about the llvm-commits
mailing list