[PATCH] D28075: MergeFunctions: Preserve debug info in thunks, under option -mergefunc-preserve-debug-info

Thu Jan 5 18:21:42 PST 2017

appcs added a comment.

In https://reviews.llvm.org/D28075#637188, @aprantl wrote:

> > Call sites of the merged function occurring from within the TU of the merged function's definition will point directly to the shared implementation and call sites of the merged function that are external to the TU of the merged function's definition call the thunk for the merged function (which tail call's the shared implementation passing forward the incoming arguments). This is existing behaviour under -mergefunc and remains unchanged under -mergefunc-preserve-debug-info (except for the tail call part).
>
> Is this the right thing to do, though? I would think that for better debugability users would always prefer the version with the thunk even in the current TU. Then again, subsequent optimization passes would likely inline the thunk and thus potentially undo most of the effect. Have you experimented with this variant? Would the debug info from the thunk survive?

Thank you; I agree that this greatly enhances the debug experience. Automatically modifying the call site with a direct call to the shared implementation instead of (what is now) the thunk, when within the TU, is confusing when debugging. Stepping into the thunk from the call site and from within the thunk, into the shared implementation keeps the debug experience uniform for internal and external call sites of functions that become thunks. (If you ask me, this is similar in spirit to not marking the thunk's call to the shared implementation as a tail call to help debugging - the full back trace is the same in both cases and otherwise, there is a certain element of surprise in both cases. Basically, we are saying that under -mergefunc-preserve-debug-info we trade off optimization partly (at -O0) to aid debugability, modifying the transformation slightly, to make the execution flow explicit). The same question remains for thunk call sites within the TU; do we do this:

[debug]         At -O0                   -mergefunc -mergefunc-preserve-debug-info - call thunk even though within the same TU
[production] At -O<non-zero> -mergefunc -mergefunc-preserve-debug-info - call shared implementation [as-(currently)-is]

MergeFunctions runs pretty much at the end of the sequence of optimization phases (only prior to Loop sinking and Instruction Simplify), so inlining has already happened at the call-sites.

I did verify that (when the thunk does not tail call the shared implementation), the preserved debug info survives even when merge-functions is done prior to inlining: {

{

   1 int sumA(int *a, int size) {
   2   int i, s;
   3   for (i = 0, s = 0; i < size; i++)
   4     s += a[i];
   5   return s;
   6 }
   7 
   8 int sumB(int *a, int size) {
   9   int i, s;
  10   for (i = 0, s = 0; i < size; i++)
  11     s += a[i];
  12   return s;
  13 }
  14 
  15 int main(void) {
  16 
  17   int a[8] = { 1, 2, 3, 4, 5, 6, 7, 8 };
  18 
  19   int x;
  20 
  21   x = sumB(a, 8);
  22 
  23   return 0;
  24 }

}

define i32 @sumB(i32* %a, i32 %size) #0 !dbg !44 {

  ...// inlined code from sumA()
  store i32* %a, i32** %a.addr, align 8
  call void @llvm.dbg.declare(metadata i32** %a.addr, metadata !50, metadata !13), !dbg !51
  store i32 %size, i32* %size.addr, align 4
  call void @llvm.dbg.declare(metadata i32* %size.addr, metadata !52, metadata !13), !dbg !53
  ...// rest of inlined code from sumA()

}

define i32 @main() #0 !dbg !69 {

  entry:
   ...// inlined code from sumB()
   %a.addr.i = alloca i32*, align 8
   call void @llvm.dbg.declare(metadata i32** %a.addr.i, metadata !50, metadata !13), !dbg !78
   %size.addr.i = alloca i32, align 4
   call void @llvm.dbg.declare(metadata i32* %size.addr.i, metadata !52, metadata !13), !dbg !79
   ...// rest of inlined code from sumB()

}

...

!50 = !DILocalVariable(name: "a", arg: 1, scope: !44, file: !7, line: 8, type: !11)
 !51 = !DILocation(line: 8, column: 15, scope: !44)
 !52 = !DILocalVariable(name: "size", arg: 2, scope: !44, file: !7, line: 8, type: !10)
 !53 = !DILocation(line: 8, column: 22, scope: !44)
 ...
 !78 = !DILocation(line: 8, column: 15, scope: !44, inlinedAt: !74)
 !79 = !DILocation(line: 8, column: 22, scope: !44, inlinedAt: !74)

  ...

}

Also, when the thunk does tail call the shared implementation, the exact same LLVM IR as above is generated (i.e. as when the thunk does not tail call the shared implementation).

https://reviews.llvm.org/D28075