[PATCH] D118177: [llvm-profgen] Decouple artificial branch from LBR parser and fix external address related issues

Lei Wang via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Thu Apr 28 15:06:14 PDT 2022


wlei added inline comments.


================
Comment at: llvm/tools/llvm-profgen/PerfReader.cpp:86
+    // then return from the external funtion. This doesn't damage the call
+    // stack, so the state should remain valid, just skip recording this range.
+    NumPairedExtAddr++;
----------------
wenlei wrote:
> > This doesn't damage the call stack, so the state should remain valid, just skip recording this range.
> 
> There's an implicit assumption that is we also filter out external stack frames which matches external LBRs. Perhaps worth pointing out very briefly.
> 
Comments refined.


================
Comment at: llvm/tools/llvm-profgen/PerfReader.cpp:508-514
+  emitWarningSummary(Unwinder.NumPairedExtAddr * 2, Unwinder.NumTotalBranches,
+                     "of branches containing paired external address.");
+
+  emitWarningSummary(Unwinder.NumUnpairedExtAddr, Unwinder.NumTotalBranches,
+                     "of branches containing external address but doesn't have "
+                     "another external address to pair, likely due to "
+                     "interrupt jmp or invalid perf script input.");
----------------
wenlei wrote:
> I think we need to be a bit more disciplined about adding these warnings now. As you can see, for any llvm-profgen run, we're always printing out a bunch of warnings, which can be over whelming to users (non-compiler devs). 
> 
> It's ok to add these two in this change, but we probably need a follow up change to restructure those warnings: 1) try to see if we can trim some, 2) perhaps hide some less important ones under something like a verbose mode, or different warning level.
Agreed the less important warnings should be trimmed . Will follow up with a patch.


================
Comment at: llvm/tools/llvm-profgen/PerfReader.cpp:865
+  uint64_t ExternalOffset = Binary->virtualAddrToOffset(ExternalAddr);
+  uint64_t EndOffeset = ExternalOffset;
   for (const LBREntry &LBR : Sample->LBRStack) {
----------------
wenlei wrote:
> The notion of ExternalOffset is a bit weird -- it translates a special constant into something special but not a const...  Wondering if it could be avoid, e.g. replace `X == ExternalOffset` with `Binary->offsetIsCode(X)`? 
Changed to use `Binary->offsetIsCode`


================
Comment at: llvm/tools/llvm-profgen/ProfiledBinary.h:81
   // isn't a real call context the compiler will see.
-  ExternalAddr = 1,
+  ExternalAddr = 0x1000000000000000ULL,
 };
----------------
wenlei wrote:
> hoy wrote:
> > wlei wrote:
> > > hoy wrote:
> > > > wlei wrote:
> > > > > wenlei wrote:
> > > > > > wlei wrote:
> > > > > > > hoy wrote:
> > > > > > > > This could be a potential legal address. Use UINT64_MAX instead?
> > > > > > > Okay, Before I didn't use UINT64_MAX because there can be add operations to address value and adding value to UINT64_MAX caused overflow, we might always keep in mind of this. I changed to UINT64_MAX and fixed one overflow issue, also wondering if we have a better number for this so that overflow can't be issues in the future.
> > > > > > Technically, both and 1 and UINT64_MAX can be legal address, but practically that probably will never happen. While overflow is an issue with UINT64_MAX, I think 1 is fine?
> > > > > This is the address not offset, it will be minus the base address, this is more common than "add" overflow. Both 1 and UINT64_MAX can overflow, that's why I choose the value in the middle 0x1000000000000000ULL. But I'm fine with either value.
> > > > How can UINT64_MAX overflow? Do we add anything to it? 
> > > > 
> > > > 
> > > > I guess with value 1 we would need special handling when converting it to offset right?
> > > > 
> > > > Actually IIRC, the upper half of the 64-bit mem space, i.e, FFFF8000'00000000 through FFFFFFFF'FFFFFFFF, is used as kernel space on Linux, so using UINT64_MAX to represent external address should be fine.
> > > > 
> > > Uh.. when it detect bogus trace ` LeafAddr >= LBRLeaf + 0x100`, I already fixed this.
> > > 
> > > yeah, UINT64_MAX overflow are very unlikely, it should be fine.
> > > 
> > > 1 is also okay I guess, minus a number, then become a negative value, but still doesn't conflict to binary offset.
> > > 
> > I see. Then either value works. 
> I don't have strong opinion either, but I'd suggest keep 1 if UINT64_MAX isn't better than 1. Make changes only if needed.
Sounds good to use 1


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D118177/new/

https://reviews.llvm.org/D118177



More information about the llvm-commits mailing list