[PATCH] D28075: MergeFunctions: Preserve debug info in thunks, under option -mergefunc-preserve-debug-info

Wed Jan 4 16:52:01 PST 2017

appcs added a comment.

  Thank you for your review, Adrian.

  True, under -mergefunc-preserve-debug-info the existing debug info (from the entry block) for the merged function's arguments is
  preserved.

  Call sites of the merged function occurring from within the TU of the merged function's definition will point directly to the shared
  implementation and call sites of the merged function that are external to the TU of the merged function's definition call the thunk
  for the merged function (which tail call's the shared implementation passing forward the incoming arguments). This is existing
  behaviour under -mergefunc and remains unchanged under -mergefunc-preserve-debug-info (except for the tail call part).

  So, given two source files, thunk-debugability.c & thunk-debugability-aux.c with content as follows:

  thunk-debugability.c {
        1 int sumA(int *a, int size) {
        2   int i, s;
        3   for (i = 0, s = 0; i < size; i++)
        4     s += a[i];
        5   return s;
        6 }
        7
        8 int sumB(int *a, int size) {
        9   int i, s;
       10   for (i = 0, s = 0; i < size; i++)
       11     s += a[i];
       12   return s;
       13 }
       14
       15 extern int sumExternalTest(int *p, int size);
       16
       17 int main(void) {
       18
       19   int a[8] = { 1, 2, 3, 4, 5, 6, 7, 8 };
       20
       21   int x, y;
       22
       23   x = sumA(a, 8);
       24   y = sumB(a, 8);
       25   sumExternalTest(a, 8);
       26
       27   return (x == y) ? 0 : -1;
       28 }
  }

  thunk-debugability-aux.c {
        1 extern int sumA(int *a, int size);
        2 extern int sumB(int *a, int size);
        3
        4 int sumExternalTest(int *p, int size) {
        5
        6   int x, y;
        7
        8   x = sumA(p, 8);
        9   y = sumB(p, 8);
       10
       11   return (x == y ? x : -1);
       12 }
  }

  Under -mergefunc (i.e. existing behaviour) {
    sumA() [shared implementation] remains as per the definition given by the user
    sumB() [merged function] as defined by the user is erased and a new sumB() is created containing a tail call to sumA()
    The call made by main() to sumB() is replaced by a call to sumA()
    The call to sumB() made by the external caller sumExternalTest() will call the thunk sumB() and in turn, sumB() will tail call sumA()
  }

  Under -mergefunc -mergefunc-preserve-debug-info (i.e. new behaviour) {
    sumA() [shared implementation] remains as per the definition given by the user
    sumB() [merged function] as defined by the user is retained, but is transformed as follows: {
      The debug info for the arguments in the entry block is preserved
      A call is made to sumA() passing forward the arguments received by sumB()
      Debug info is created for the call to sumA() and its return value
      The call to sumA() is not marked as a tail call [PS: Needs elaboration/clarification]
      The rest of the CFG for sumB() is erased
    }
    The call made by main() to sumB() is replaced by a call to sumA()
    The call to sumB() made by the external caller sumExternalTest() will call the thunk sumB() and in turn, sumB() will call sumA()
  }

  The rationale for the new behaviour is that under -mergefunc-preserve-debug-info it is possible to step from the call site of the
  merged function into its thunk and from there, into the shared implementation with the backtrace truly indicating the execution 
  flow. Once that new flow (which is the essence of -mergefunc modulo the tail call, which is just optimization) is understood and
  debugged, the user can recompile with -mergefunc-preserve-debug-info removed, if the need be.

  With the above example source code (compiled at -O0 -mergefunc -mergefunc-preserve-debug-info):

  Backtraces: {

    With call to sumA() not marked as a tail call in thunk sumB(): {

      Setting a breakpoint at a thunked function call site within the TU of its definition: {
        (gdb) break thunk-debugability.c:24
        Breakpoint 1 at 0x400609: file ./thunk-debugability.c, line 24.
        (gdb) run
        Starting program: /auto/compiler-migration/anmparal/code/upstreaming/MFI/thunk-debugability.mfig.exe

        Breakpoint 1, main () at ./thunk-debugability.c:24
        24        y = sumB(a, 8);
        (gdb) step
        sumA (a=0x7fffffffd6d0, size=8) at ./thunk-debugability.c:3
        3         for (i = 0, s = 0; i < size; i++)
        (gdb) bt
        #0  sumA (a=0x7fffffffd6d0, size=8) at ./thunk-debugability.c:3
        #1  0x000000000040060e in main () at ./thunk-debugability.c:24
        (gdb)
      }

      Setting a breakpoint at a thunked function call site outside the TU of its definition: {
        (gdb) break thunk-debugability-aux.c:9
        Breakpoint 2 at 0x400507: file ./thunk-debugability-aux.c, line 9.
        (gdb) run
        The program being debugged has been started already.
        Start it from the beginning? (y or n) y
        Starting program: /auto/compiler-migration/anmparal/code/upstreaming/MFI/thunk-debugability.mfig.exe

        Breakpoint 2, sumExternalTest (p=0x7fffffffd6d0, size=8) at ./thunk-debugability-aux.c:9
        9         y = sumB(p, 8);
        (gdb) step
        sumB (a=0x7fffffffd6d0, size=8) at ./thunk-debugability.c:8
        8       int sumB(int *a, int size)
        (gdb) bt
        #0  sumB (a=0x7fffffffd6d0, size=8) at ./thunk-debugability.c:8
        #1  0x0000000000400510 in sumExternalTest (p=0x7fffffffd6d0, size=8) at ./thunk-debugability-aux.c:9
        #2  0x000000000040061f in main () at ./thunk-debugability.c:25
        (gdb) step
        sumA (a=0x7fffffffd6d0, size=8) at ./thunk-debugability.c:3
        3         for (i = 0, s = 0; i < size; i++)
        (gdb) bt
        #0  sumA (a=0x7fffffffd6d0, size=8) at ./thunk-debugability.c:3
        #1  0x00000000004005a4 in sumB (a=0x7fffffffd6d0, size=8) at ./thunk-debugability.c:8
        #2  0x0000000000400510 in sumExternalTest (p=0x7fffffffd6d0, size=8) at ./thunk-debugability-aux.c:9
        #3  0x000000000040061f in main () at ./thunk-debugability.c:25
        (gdb)
      }
    }

    With call to sumA() marked as a tail call in thunk sumB(): {

      Setting a breakpoint at a thunked function call site within the TU of its definition: {
        (gdb) break thunk-debugability.c:24
        Breakpoint 1 at 0x400609: file ./thunk-debugability.c, line 24.
        (gdb) run
        Starting program: /auto/compiler-migration/anmparal/code/upstreaming/MFI/thunk-debugability.mfig.exe

        Breakpoint 1, main () at ./thunk-debugability.c:24
        24        y = sumB(a, 8);
        (gdb) step
        sumA (a=0x7fffffffd6d0, size=8) at ./thunk-debugability.c:3
        3         for (i = 0, s = 0; i < size; i++)
        (gdb) bt
        #0  sumA (a=0x7fffffffd6d0, size=8) at ./thunk-debugability.c:3
        #1  0x000000000040060e in main () at ./thunk-debugability.c:24
        (gdb)
      }

      Setting a breakpoint at a thunked function call site outside the TU of its definition: {
        (gdb) break thunk-debugability-aux.c:9
        Breakpoint 2 at 0x400507: file ./thunk-debugability-aux.c, line 9.
        (gdb) run
        Starting program: /auto/compiler-migration/anmparal/code/upstreaming/MFI/thunk-debugability.mfig.exe

        Breakpoint 2, sumExternalTest (p=0x7fffffffd6d0, size=8) at ./thunk-debugability-aux.c:9
        9         y = sumB(p, 8);
        (gdb) step
        sumB (a=0x7fffffffd6d0, size=8) at ./thunk-debugability.c:8
        8       int sumB(int *a, int size)
        (gdb) bt
        #0  sumB (a=0x7fffffffd6d0, size=8) at ./thunk-debugability.c:8
        #1  0x0000000000400510 in sumExternalTest (p=0x7fffffffd6d0, size=8) at ./thunk-debugability-aux.c:9
        #2  0x000000000040061f in main () at ./thunk-debugability.c:25
        (gdb) step
        sumA (a=0x7fffffffd6d0, size=8) at ./thunk-debugability.c:1
        1       int sumA(int *a, int size)
        (gdb) bt
        #0  sumA (a=0x7fffffffd6d0, size=8) at ./thunk-debugability.c:1
        #1  0x0000000000400510 in sumExternalTest (p=0x7fffffffd6d0, size=8) at ./thunk-debugability-aux.c:9
        #2  0x000000000040061f in main () at ./thunk-debugability.c:25
        (gdb)
      }
    }
  }

  So, these two scenarios have identical behaviour: {
    With call to sumA() not marked as a tail call in thunk sumB() and setting a breakpoint at a thunked function call site within the TU of its definition.
    With call to sumA()     marked as a tail call in thunk sumB() and setting a breakpoint at a thunked function call site within the TU of its definition.
  }

  Whereas, in: {
    With call to sumA() not marked as a tail call in thunk sumB() and setting a breakpoint at a thunked function call site outside the TU of its definition.
    With call to sumA()     marked as a tail call in thunk sumB() and setting a breakpoint at a thunked function call site outside the TU of its definition.
  }
  - the behaviour differs: When the call to sumA() is not marked as a tail call, the execution flow is clear (from call site, to thunk,
  to shared implementation)  but when the call to sumA() is marked as a tail call, note how across the step, sumB() is ("suddenly")
  replaced by sumA()
  Which behaviour do we feel is better for the debug experience?

  I need to clarify: It is not true that the call to sumA() will be auto-converted into a tail call at -O<non-zero>
  I apologize for making a misleading statement; I did not realize that I was passing an already -O<non-zero>'ed LLVM IR file to
  opt -O<non-zero> -mergefunc -mergefunc-preserve-debug-info

  So, do we want to conditionally tail call the shared implementation?
  [debug]      At -O0          -mergefunc -mergefunc-preserve-debug-info - it is an actual call
  [production] At -O<non-zero> -mergefunc -mergefunc-preserve-debug-info - it is a  tail   call
  - or do we just unconditionally keep the tail call, as-(currently)-is?

  --

  Thank you and @fhahn for the review; Please let me know which way we want to resolve the question of the tail call and
  I'll re-spin, with the test reworked with the above (keeping a single TU) example code, incorporating the other feedback.

Anmol.

https://reviews.llvm.org/D28075