[PATCH] D118177: [llvm-profgen] Decouple artificial branch from LBR parser and fix external address related issues

Wenlei He via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Thu Apr 28 09:55:14 PDT 2022


wenlei added inline comments.


================
Comment at: llvm/tools/llvm-profgen/PerfReader.cpp:86
+    // then return from the external funtion. This doesn't damage the call
+    // stack, so the state should remain valid, just skip recording this range.
+    NumPairedExtAddr++;
----------------
> This doesn't damage the call stack, so the state should remain valid, just skip recording this range.

There's an implicit assumption that is we also filter out external stack frames which matches external LBRs. Perhaps worth pointing out very briefly.



================
Comment at: llvm/tools/llvm-profgen/PerfReader.cpp:508-514
+  emitWarningSummary(Unwinder.NumPairedExtAddr * 2, Unwinder.NumTotalBranches,
+                     "of branches containing paired external address.");
+
+  emitWarningSummary(Unwinder.NumUnpairedExtAddr, Unwinder.NumTotalBranches,
+                     "of branches containing external address but doesn't have "
+                     "another external address to pair, likely due to "
+                     "interrupt jmp or invalid perf script input.");
----------------
I think we need to be a bit more disciplined about adding these warnings now. As you can see, for any llvm-profgen run, we're always printing out a bunch of warnings, which can be over whelming to users (non-compiler devs). 

It's ok to add these two in this change, but we probably need a follow up change to restructure those warnings: 1) try to see if we can trim some, 2) perhaps hide some less important ones under something like a verbose mode, or different warning level.


================
Comment at: llvm/tools/llvm-profgen/PerfReader.cpp:865
+  uint64_t ExternalOffset = Binary->virtualAddrToOffset(ExternalAddr);
+  uint64_t EndOffeset = ExternalOffset;
   for (const LBREntry &LBR : Sample->LBRStack) {
----------------
The notion of ExternalOffset is a bit weird -- it translates a special constant into something special but not a const...  Wondering if it could be avoid, e.g. replace `X == ExternalOffset` with `Binary->offsetIsCode(X)`? 


================
Comment at: llvm/tools/llvm-profgen/ProfiledBinary.h:81
   // isn't a real call context the compiler will see.
-  ExternalAddr = 1,
+  ExternalAddr = 0x1000000000000000ULL,
 };
----------------
hoy wrote:
> wlei wrote:
> > hoy wrote:
> > > wlei wrote:
> > > > wenlei wrote:
> > > > > wlei wrote:
> > > > > > hoy wrote:
> > > > > > > This could be a potential legal address. Use UINT64_MAX instead?
> > > > > > Okay, Before I didn't use UINT64_MAX because there can be add operations to address value and adding value to UINT64_MAX caused overflow, we might always keep in mind of this. I changed to UINT64_MAX and fixed one overflow issue, also wondering if we have a better number for this so that overflow can't be issues in the future.
> > > > > Technically, both and 1 and UINT64_MAX can be legal address, but practically that probably will never happen. While overflow is an issue with UINT64_MAX, I think 1 is fine?
> > > > This is the address not offset, it will be minus the base address, this is more common than "add" overflow. Both 1 and UINT64_MAX can overflow, that's why I choose the value in the middle 0x1000000000000000ULL. But I'm fine with either value.
> > > How can UINT64_MAX overflow? Do we add anything to it? 
> > > 
> > > 
> > > I guess with value 1 we would need special handling when converting it to offset right?
> > > 
> > > Actually IIRC, the upper half of the 64-bit mem space, i.e, FFFF8000'00000000 through FFFFFFFF'FFFFFFFF, is used as kernel space on Linux, so using UINT64_MAX to represent external address should be fine.
> > > 
> > Uh.. when it detect bogus trace ` LeafAddr >= LBRLeaf + 0x100`, I already fixed this.
> > 
> > yeah, UINT64_MAX overflow are very unlikely, it should be fine.
> > 
> > 1 is also okay I guess, minus a number, then become a negative value, but still doesn't conflict to binary offset.
> > 
> I see. Then either value works. 
I don't have strong opinion either, but I'd suggest keep 1 if UINT64_MAX isn't better than 1. Make changes only if needed.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D118177/new/

https://reviews.llvm.org/D118177



More information about the llvm-commits mailing list