[PATCH] D115538: [CSSPGO][llvm-profgen] Fix external address issues of perf reader (leading external LBR part)

Lei Wang via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Mon Dec 13 20:45:18 PST 2021


wlei added inline comments.


================
Comment at: llvm/tools/llvm-profgen/PerfReader.cpp:261
+    if (IsLeaf && State.CurrentLeafFrame->isExternalFrame()) {
+      State.InstPtr.update(State.getCurrentLBRSource());
+      State.switchToFrame(State.getCurrentLBRSource());
----------------
wenlei wrote:
> wlei wrote:
> > wenlei wrote:
> > > I think we can do some sanity check here:
> > > 1. assert that we have a paired LBR here? ie. LBR target is external.
> > > 2. also assert that the next frame on stack aligns with LBR source?
> > >assert that we have a paired LBR here? ie. LBR target is external.
> > Sounds good, added assertion in `validateInitialState`.
> > 
> > > also assert that the next frame on stack aligns with LBR source?
> > 
> > If I understand correctly, that assumes the LBR source should be a CALL? The tricky part is we don't only have CALL into external address, we can have RETURN, JMP.  One common cases is the PLT,  internal call --> PLT--> external address. PLT code is a JMP.
> > 
> > How about to give a warning if the following CALL unwinding hit a mismatched frame.
> > 
> > 
> > 
> > 
> > 
> > 
> > 
> > 
> > 
> > 
> > 
> > 
> > 
> > 
> > 
> On 2nd thought, you're right that if the transfer is not a call, there will be mismatch, PLT is one, tail call jump can be another. 
> 
> Thought for PLT, perhaps we can treat PLT as external too, so anything involved PLT will be stripped too.
> 
> I was hoping that we can have a sense of how intrusive this kind of mismatch is. Hopefully it's rare. Warning is good but perhaps PLT case can be handled? 
PLT case can be handled, we can have a diff to make PLT code as external code. then we can have some statistic base on this. I will have a separated diff for PLT stuffs.


================
Comment at: llvm/tools/llvm-profgen/PerfReader.cpp:624
+        NumTopExternalFrame++;
+        CallStack.emplace_back(UnwindState::ProfiledFrame::ExternalAddr);
+      }
----------------
hoy wrote:
> wlei wrote:
> > wenlei wrote:
> > > If only leaf frames are external, can we keep the remaining frames for unwinding?
> > Sounds good, tried to keep the remaining internal call stack for unwinding.
> Maybe we can keep the whole call stack by replacing consecutive external frames with an artificial frame introduced in the other patch. This should be synchronized with the LBR parser where we replace consecutive external LBRs with an artificial branch. 
Good point. Changed to both use `external branch`, update in the next patch.


================
Comment at: llvm/tools/llvm-profgen/PerfReader.cpp:62
   auto *ParentFrame = State.getParentFrame();
+  if (ParentFrame->Address != Source && State.isLeadingFrameExternal())
+    NumMismatchedFrame++;
----------------
wenlei wrote:
> Actually do we need the check on `isLeadingFrameExternal`? Should we always expect parent frame address and LBR source to align for any calls? 
> 
> Additionally, should there a one instruction shift between frame address and source/call address (see `getCallAddrFromFrameAddr`)?
> 
> And when mismatch happens, should we abort unwinding? It seems that at this point it's guaranteed to generate incorrect context? 
> 
> How you seen such mismatch happening? 
> Should we always expect parent frame address and LBR source to align for any calls?

you see there is a comment to explain "// TODO: Currently we just assume all the addr that can't match the
 2nd frame is in prolog/epilog.", so before we saw frame in prolog/epilog mismatch the call stack.

> Additionally, should there a one instruction shift between frame address and source/call address (see getCallAddrFromFrameAddr)?

We already fix this mismatch in `extractCallstack` see 
```
    // We need to translate return address to call address for non-leaf frames.
    if (!CallStack.empty()) {
      auto CallAddr = Binary->getCallAddrFromFrameAddr(FrameAddr);
      if (!CallAddr) {
      ....
```

> And when mismatch happens, should we abort unwinding? It seems that at this point it's guaranteed to generate incorrect context?

Yeah, before for the frame in  prolog/epilog, we didn't abort. If mismatch happen, we still can process using a context-less stack, but that will make things even complicated.

> How you seen such mismatch happening?


I haven't tested it. I will add a warning for both leading frame issue and frame   prolog/epilog. Maybe we can see how many cases of them to decide if we need to abort.








================
Comment at: llvm/tools/llvm-profgen/PerfReader.cpp:242
   UnwindState State(Sample, Binary);
+  NumTotalLBR += State.getLBRStackSize();
 
----------------
wenlei wrote:
> nit: this is more like total branches. usually LBR record is considered the full 16/32 entires of branch pairs in LBR. same applies to the warning message. 
thanks for the suggestion.


================
Comment at: llvm/tools/llvm-profgen/PerfReader.cpp:525
+      // Do not ignore the entire samples, the remaining LBR can still be
+      // unwinded using a context-less stack.
+      continue;
----------------
hoy wrote:
> nit: unwinded -> unwound
fixed, thanks!


================
Comment at: llvm/tools/llvm-profgen/PerfReader.cpp:533
+        if (LBRStack.empty()) {
+          NumLeadingExternalLBR++;
+          // Record this LBR since current source and next LBR' target is still
----------------
hoy wrote:
> nit: NumLeadingExternalLBR -> NumLeadingOutgoingLBR?
fixed, thanks!


================
Comment at: llvm/tools/llvm-profgen/PerfReader.h:217
+
+// The special frame address.
+enum SpecialFrameAddr {
----------------
hoy wrote:
> nit: address -> addresses
> 
> Also enum class is preferred.
Fixed, thanks

> Also enum class is preferred.

It appeared that an complication issue to initialize the static const variable in ProfiledFrame


================
Comment at: llvm/tools/llvm-profgen/PerfReader.h:302
   void advanceLBR() { LBRIndex++; }
+  bool isLeadingFrameExternal() const {
+    if (!LBRStack.empty() && LBRStack[0].Target == ExternalAddr)
----------------
wenlei wrote:
> nit: LeadingFrame -> LeafFrame. LeafLBR -> LastLBR. Leaf or root is used for frame, but there's no leaf or root for LBR, which isn't hierarchical. If there're other inconsistent usage of the terms, we can fix them too.  
> 
> Additionally, why not check on CurrentLeafFrame instead given the function name says "Frame"?
Thanks for the suggestion. The warning is moved to the next patch.


================
Comment at: llvm/tools/llvm-profgen/PerfReader.h:305
+      return true;
+    return false;
+  }
----------------
hoy wrote:
> nit: use a single return statement `return !LBRStack.empty() && LBRStack[0].Target == ExternalAddr` .
The warning is moved to the next patch, thanks!


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D115538/new/

https://reviews.llvm.org/D115538



More information about the llvm-commits mailing list