[PATCH] D153771: [BOLT][Instrumentation] Fix hash table memory corruption and append-pid option

Rafael Auler via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Tue Jun 27 13:56:44 PDT 2023


rafauler added inline comments.


================
Comment at: bolt/runtime/instr.cpp:254
   uint64_t Val;
+  void report(const char *Msg = nullptr) {
+    char Buf[BufSize];
----------------
treapster wrote:
> rafauler wrote:
> > report -> dump
> > 
> > see Graph::dump() for an easier way to write (and read) this code
> > 
> > also include a call to this function inside a DEBUG() macro to showcase how/where you want this printed. If you don't want to always print it in debug, leave it commented out, so at least we know how to use it when we need it.
> It is done this way because we want to construct a string once and print it atomically in a single call to write(). If write is called more than once, we get garbage when threads and processes are involved.
Sounds good, can you write a comment explaining that? (regarding the single call to write() )


================
Comment at: bolt/test/runtime/instrumentation-indirect-2.c:65-80
+CHECK-ALL-DAG: func1
+CHECK-ALL-DAG: func2
+CHECK-ALL-DAG: func3
+CHECK-ALL-DAG: func4
+CHECK-ALL-DAG: func5
+CHECK-ALL-DAG: func6
+CHECK-ALL-DAG: func7
----------------
treapster wrote:
> rafauler wrote:
> > rafauler wrote:
> > > For some reason this test is failing on shared build.
> > > 
> > > Build BOLT with BUILD_SHARED_LIBS=On to check that.
> > The reason this doesn't work is because  "func1" is matching to "func10"
> > 
> > Either remove func10-func16 or rename them with letters. e.g.: funca, funcb, funcc, funcd, funcf, funcg
> I tried matching {{\bfunc1\b}} and similar constructs and it didn't help. I now changed numbers to letters and it still seems to fail sometimes..
You're right, unfortunately it seems to be failing some times. That's annoying. I gave up on the DAG thing and tried this:

https://pastebin.com/rgn1hKr9

And it seems to be working consistently. That's one option forward, if you're OK with it.


================
Comment at: bolt/test/runtime/instrumentation-indirect-2.c:96-97
+# process and vice versa.
+RUN: cat %t.$child_pid.fdata | FileCheck %s --check-prefix=CHECK-CHILD
+RUN: cat %t.$par_pid.fdata | FileCheck %s --check-prefix=CHECK-PARENT
+
----------------
treapster wrote:
> rafauler wrote:
> > This can accidentally match other entries in the profile, such as the activity in funcX calling printf(). I imagine what we want to match is the indirect call itself, right? For that I would rather use:
> > 
> > 
> >   RUN: llvm-bolt %t.exe -data %t.child.fdata \
> >   RUN:   -print-finalized -print-only=main -o /dev/null | FileCheck %s --check-prefix=CHECK_CHILD
> > 
> > And then match this string:
> >     {{.*}}:   callq   *%rax # CallProfile: 8 (0 misses) :
> >         { func1: 1 (0 misses) },
> >         { func3: 1 (0 misses) },
> > ...
> > 
> > 
> > Actually, I ran BOLT like that and surprisingly the profile is incorrect and it is printing this:
> > .LFT15 (8 instructions, align : 1)
> >   Exec Count : 8
> >   CFI State : 3
> >   Predecessors: .Ltmp10
> >     000000d1:   movslq  -0x94(%rbp), %rax
> >     000000d8:   movq    -0x90(%rbp,%rax,8), %rax
> >     000000e0:   movl    -0x98(%rbp), %edi
> >     000000e6:   callq   *%rax # CallProfile: 8 (0 misses) :
> >         { <unknown>: 1 (0 misses) },
> >         { <unknown>: 1 (0 misses) },
> >         { <unknown>: 1 (0 misses) },
> >         { <unknown>: 1 (0 misses) },
> >         { <unknown>: 1 (0 misses) },
> >         { <unknown>: 1 (0 misses) },
> >         { <unknown>: 1 (0 misses) },
> >         { <unknown>: 1 (0 misses) }
> >     000000e8:   movl    -0x94(%rbp), %eax
> >     000000ee:   addl    $0x2, %eax
> >     000000f1:   movl    %eax, -0x94(%rbp)
> >     000000f7:   jmp     .Ltmp10
> >   Successors: .Ltmp10 (mispreds: 0, count: 8)
> > 
> > 
> > 
> > I'm not sure why yet.
> If we match instructions, it unnecessarily becomes arch-specific. I think we need to figure out why indirect calls are not recorded in profile and construct a regex to match them there.
Sounds good.  Matching against the CallProfile annotation (ignoring the instruction opcode) shouldn't be arch-specific, though. But I'm also OK with regex matching fdata, whichever you prefer.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D153771/new/

https://reviews.llvm.org/D153771



More information about the llvm-commits mailing list