[PATCH] D81784: [ELF] Resolve relocations in .debug_* referencing (discarded symbols or ICF folded section symbols) to tombstone values

Fri Jun 26 12:02:37 PDT 2020

dblaikie added a comment.

In D81784#2116925 <https://reviews.llvm.org/D81784#2116925>, @jhenderson wrote:

> Whilst merging this change into our downstream port, our private tests flagged up a behaviour difference between our old behaviour and the version this patch implements. On further investigation, I think it is a bug in the upstream patch rather than our merge or tests.
>
> Consider the ICF case with two duplicate functions:
>
>   int foo(int x)
>   {
>   	return 42 + x;
>   }
>  
>   int bar(int x)
>   {
>   	return 42 + x;
>   }
>  
>   int main(){
>   	int x = 0;
>   	x += foo(x);
>   	x += bar(x);
>   	return x;
>   }
>
>
> In this case, LLD `--icf=all` will result in foo and bar being combined, with foo being the "kept" function. Prior to this change, references from debug data to the folded function (i.e. bar) were updated to point at the kept function. This would of course mean there was duplicate debug data for various addresses. In general, this may not have been ideal, but in one case I think it is actually necessary, specifically in the case of .debug_line. If you run gdb on the above example, having linked it using LLD, you can set a breakpoint at lines within bar (e.g. line 8 in my example), and it will be hit both when foo is executed and when bar is executed. This is reasonable behaviour, given the debug data doesn't include call site information the debugger can use here. With this patch, the reference in .debug_line to bar is patched to -1. This means that placing a breakpoint at line 8 will result in a breakpoint at address 0x6 (in my local case), rather than in the middle of foo. This in turn means the breakpoint will never be hit. From a user's point of view, if they don't know where their function is being folded into (which would probably be the norm), the breakpoint not working properly is a serious regression. Of course, if it is placed, the program will jump to a different function than expected (and may even break unexpectedly), but I think personally that this is less confusing, especially as that is the old behaviour anyway.
>
> I think we need to add a special case for patching .debug_line, so that duplicates are not patched to -1, but rather to the symbol they are a duplicate of (dead references should still be patched to -1). This would match our downstream's old behaviour.

I have mixed feelings about this - I assume any strategy for this would likely also trigger for the case I've seen more often, which is two inline function definitions that are identical and rather than one DW_TAG_subprogram getting to point to the final copy, both DW_TAG_subprograms in different CUs point to the final copy (if that copy is the same length, I think is the test in other linkers - but maybe it's if their contents are identical)?

Is this a change in lld's behavior? Or has lld always resolved identical-but-deduplicated functions to only the one subprogram, or to all subprograms?

Or is your workaround (the one you have in your fork) /only/ for the line table, but not for DW_AT_ranges on the CU or for the low_pc of the subprogram? That would seem pretty quirky - having addresses covered by the line table that aren't covered by the ranges of the CU? But also if it is applied to teh CU too, then you have CUs with overlapping ranges (@probinson I think expressed some concern about overlapping CUs being undesirable previously)

Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D81784/new/

https://reviews.llvm.org/D81784