[PATCH] D52845: Update entry count for cold calls

Wei Mi via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Fri Oct 12 16:52:17 PDT 2018


wmi added a comment.

Sorry for the late reply. I didn't notice the thread.

I admit the problem that the patch describes introduces some impreciseness to the profile annotated, but a problem is that the patch only updates the entry counts of outlined functions, but the profiles inside of those outlined functions are not updated. Ideally if the entry count of a outline function is doubled, maybe the callsite count inside of the function also should be doubled too?

I am not saying it is necessary to keep all the profiles consistent, after all it is about cold place. But I like to know whether the patch is trying to solve a theoretical problem or motivated by a real case.

Thanks,
Wei.



================
Comment at: lib/Transforms/IPO/SampleProfile.cpp:866-867
+    auto pair = notInlinedCounts.try_emplace(Callee, 0);
+    const auto *FS = Pair.getSecond();
+    pair.first->second += FS->getEntrySamples();
+  }
----------------
Can be merged into one line.


================
Comment at: lib/Transforms/IPO/SampleProfile.cpp:1603-1609
+    uint64_t CallCount = pair.getSecond();
+    uint64_t CalleeCount = 0;
+    if (auto ECount = Callee->getEntryCount())
+      CalleeCount = ECount.getCount();
+
+    CalleeCount += CallCount;
+    Callee->setEntryCount(CalleeCount);
----------------
Maybe merged to:
ECount = Callee->getEntryCount()
Callee->setEntryCount(pair.getSecond() + ECount ? ECount.getCount() : 0);


================
Comment at: test/Transforms/SampleProfile/Inputs/inline-cold.prof:1-9
+main:225715:10
+ 2.1: 5553
+ 3: 5391
+ 3.1: _Z3sumii:586
+  0: 5279
+  1: 5279
+  2: 5279
----------------
Better to make the function total count equal to the sum of the counts inside of the function because such inconsistency can cause trouble sometimes.


================
Comment at: test/Transforms/SampleProfile/inline-cold.ll:30-31
+  store i32 %y, i32* %y.addr, align 4
+  %0 = load i32, i32* %x.addr, align 4, !dbg !11
+  %1 = load i32, i32* %y.addr, align 4, !dbg !11
+  %add = add nsw i32 %0, %1, !dbg !11
----------------
%0, %1... should be renamed in testcase.


Repository:
  rL LLVM

https://reviews.llvm.org/D52845





More information about the llvm-commits mailing list