[PATCH] D115538: [CSSPGO][llvm-profgen] Fix external address issues of perf reader (leading external LBR part)
Lei Wang via Phabricator via llvm-commits
llvm-commits at lists.llvm.org
Sun Dec 12 10:57:20 PST 2021
wlei added inline comments.
================
Comment at: llvm/test/tools/llvm-profgen/inline-noprobe2.test:86
;CHECK: 4.3: 0
-;CHECK: 5.1: 10
-;CHECK: 5.3: 10
-;CHECK: 6: 10
-;CHECK: 6.1: 12
-;CHECK: 6.3: 10
+;CHECK: 5.1: 11
+;CHECK: 5.3: 11
----------------
Here it's indeed fixed by this patch.
================
Comment at: llvm/test/tools/llvm-profgen/inline-noprobe2.test:97
;CHECK: 65499: 0
+;CHECK: quick_sort:903:25
+;CHECK: 1: 24
----------------
wenlei wrote:
> Ok, so this happens with existing test cases too. I didn't realize the entire function profile for small function could be missing due to external addresses even in contrived tests. Did you verify what is the external address (what function from what library) leads to samples involving `quick_sort` being ignored?
Sorry for the confusing. This is not related with this patch, I deleted this part by mistake in https://reviews.llvm.org/D115013. When I fixed this test case, I realized it and tried to sneak in this diff :)
================
Comment at: llvm/tools/llvm-profgen/PerfReader.cpp:258-259
+ // external address(kernel, dynamic library), the call stack is truncated
+ // since we can not optimize code outside of the binary. The remaining LBRs
+ // continue to be unwinded using a context-less frame stack.
+ if (IsLeaf && State.CurrentLeafFrame->isExternalFrame()) {
----------------
wenlei wrote:
> A bit confused by the comment on "unwinded using a context-less frame stack" - I thought after we strip out the leading external frame and external LBR, we would still use the remaining stack as context, hence it's not context-less.
Good point. Based on this, we don't need the code here. We can have CALL/Return/JMP to external address and current unwinder all covered this, we can just reuse them. One exception is return to external code, that should be handed by the next patch. Removed this code.
================
Comment at: llvm/tools/llvm-profgen/PerfReader.cpp:261
+ if (IsLeaf && State.CurrentLeafFrame->isExternalFrame()) {
+ State.InstPtr.update(State.getCurrentLBRSource());
+ State.switchToFrame(State.getCurrentLBRSource());
----------------
wenlei wrote:
> I think we can do some sanity check here:
> 1. assert that we have a paired LBR here? ie. LBR target is external.
> 2. also assert that the next frame on stack aligns with LBR source?
>assert that we have a paired LBR here? ie. LBR target is external.
Sounds good, added assertion in `validateInitialState`.
> also assert that the next frame on stack aligns with LBR source?
If I understand correctly, that assumes the LBR source should be a CALL? The tricky part is we don't only have CALL into external address, we can have RETURN, JMP. One common cases is the PLT, internal call --> PLT--> external address. PLT code is a JMP.
How about to give a warning if the following CALL unwinding hit a mismatched frame.
================
Comment at: llvm/tools/llvm-profgen/PerfReader.cpp:262
+ State.InstPtr.update(State.getCurrentLBRSource());
+ State.switchToFrame(State.getCurrentLBRSource());
+ State.advanceLBR();
----------------
wenlei wrote:
> Should this be `pushFrame`/`popFrame` instead of `switchToFrame` depending on the instruction at the source of LBR?
> Basically we are unwinding through the internal->external LBR here, which should be either a call or return, right?
Yeah, same to above comment. the current unwinder should already covered pushFrame/popFrame. One difference is the Return. For returning to external address, we don't have a target callsite, hence we can't push a valid frame. But this case should be handled by `https://reviews.llvm.org/D115550` which used an artificial frame to truncate the call stack.
> which should be either a call or return.
You see, it appeared that the PLT code is a JMP.
================
Comment at: llvm/tools/llvm-profgen/PerfReader.cpp:624
+ NumTopExternalFrame++;
+ CallStack.emplace_back(UnwindState::ProfiledFrame::ExternalAddr);
+ }
----------------
wenlei wrote:
> If only leaf frames are external, can we keep the remaining frames for unwinding?
Sounds good, tried to keep the remaining internal call stack for unwinding.
================
Comment at: llvm/tools/llvm-profgen/PerfReader.cpp:979
+ else {
+ NumTotalSample++;
parseSample(TraceIt);
----------------
wenlei wrote:
> nit: move this into `parseSample`
fixed, thanks!
================
Comment at: llvm/tools/llvm-profgen/PerfReader.cpp:1165
+ emitWarningSummary(NumTopExternalFrame, NumTotalSample,
+ "of samples have top external frame in call stack.");
----------------
wenlei wrote:
> nit: top->leaf to make it explicit, for message and variable name.
Fixed
Repository:
rG LLVM Github Monorepo
CHANGES SINCE LAST ACTION
https://reviews.llvm.org/D115538/new/
https://reviews.llvm.org/D115538
More information about the llvm-commits
mailing list