[PATCH] D55309: ThinLTO: Do not import debug info for imported global constants

Tue Dec 18 09:06:20 PST 2018

On Thu, Dec 6, 2018 at 9:22 AM Teresa Johnson <tejohnson at google.com> wrote:

> On Wed, Dec 5, 2018 at 1:52 PM David Blaikie <dblaikie at gmail.com> wrote:
>
>>
>>
>> On Wed, Dec 5, 2018 at 11:37 AM Teresa Johnson via Phabricator <
>> reviews at reviews.llvm.org> wrote:
>>
>>> tejohnson accepted this revision.
>>> tejohnson added a comment.
>>> This revision is now accepted and ready to land.
>>>
>>> LGTM to address regression, especially since the debug info already
>>> isn't maintained for global variables that are const propagated. David, as
>>> a follow on it would be good to understand if/when there are cases where
>>> the debug info should be imported for these variables (i.e. if they aren't
>>> in fact constant propagated after internalization).
>>>
>>
>> Sure - would you prefer that in the form of an email discussion,
>> documentation, further elaborated comments (might get a bit verbose, but I
>> did try to cover some of the tricks there)?
>>
>
> That's a good question. I suppose an email discussion around cases where
> we will now be losing information after importing and what we could do to
> fix/improve that would be good. FIXME comments might be good too?
>

I included a somewhat verbose, but hopefully descriptive comment about some
of the issues in the committed version of this patch (
https://reviews.llvm.org/D55309?vs=176755&id=176875#toc ).

Basically for now - so long as this only imports constants and so long as
the original module still gets linked in (even if the importing causes the
original module to no longer be "needed" as such), I don't think we lose
anything - the debug info emitted in the original module will describe the
constant (assuming the constant isn't established in some other module -
eg: if the constant is uninitialized but then initialized in one other
random module - but then optimizations remove that write operation and
instead import the constant value to all users - now the DWARF will
describe the memory location of the variable, but it'll remain
uninitialized forever)

>
>
>> (as an aside: We did some tricks to ensure that ThinLTO+Split DWARF only
>> produces a single CU (for a few reasons - Split DWARF wasn't really well
>> defined with multiple CUs, GDB couldn't cope with it, and especially it
>> didn't have any well defined way of doing cross-CU references to describe
>> the importing anyway), but looking at how much this regression grew debug
>> info - I wonder if anyone (even those not using Split DWARF) would really
>> want all the little CU fragments to be emitted as separate CUs in each
>> ThinLTO object file - it looks not great, even if it is somewhat more
>> accurate to the source code (as best it can in the medium term at least) -
>> DWARF isn't really built for all these tiny CUs - though if Apple folks
>> haven't been complaining about the debug info size for their ThinLTO use...
>> maybe that's good enough?)
>>
>
> That's a good question. I haven't heard any complaints, but not everyone
> will notice. My understanding of Apples debug info situation is that they
> essentially split it out and link it separately, so it might not be as
> noticeably problematic there?
>
> On the flip side, I wonder if it would be noticeably bad to always avoid
> all that extra debug info if it isn't significantly more useful to have
> them.
>

The classic example where it's "bad" is where cross-module inlining
ultimately inlines a module/file-local (anonymous namespace or file-scoped
"static") function. In perfect DWARF, CU A would refer to CU B for an
inlined function - that way, when a debugger is running, it knows which
function called 'foo' to find when the user asks for it.

You could imagine:
A.cpp:
  static void f3() {
  }
  void f1() {
    f2();
    f3();
  }
B.cpp:
  static void f3() {
    ...
  }
  void f2() {
    f3();
  }

If you aren't using Fission with ThinLTO, but say you're still using
ThinLTO, you get two object files from the backend actions - but the A.o
file has two CUs, because some stuff from B was imported into it.
The two CUs describe all of A.cpp and some of B.cpp, and where f1 calls f2,
a reference from f1 in the CU for A.cpp refers to f2 in B.cpp.

This way, the debugger, when evaluating the expression 'f3' in f1, it knows
to find A.cpp's f1, but in f2, even the f2 that's inlined into A.o, it
knows to find B.cpp's f3 - assuming that was also inlined, so it was
described by the B.cpp CU in A.o... - but yeah, that's a further problem,
because the debugger doesn't know that B.cpp's CU in A.o and B.cpp's CU in
B.o are the same CU, really... - if f3 isn't inlined into A.o, the debugger
still won't find it (& if it is inlined - then what can the user do with f3
anyway - they can't take its address, they can't call it, etc... - but at
least when it does happen, by keeping it in a separate CU it doesn't
collide with A.cpp's real f3, so the user can still call/interact with
/that/ f3 correctly)

With Fission+ThinLTO, A.o has only one CU, with everything that was inlined
into A.o in that CU. So you can end up with two f3 functions, completely
unrelated to each other, in the same CU - and then the debugger wouldn't
have a great time interacting with either of them. (might pick one
randomly, etc)

>
> Teresa
>
>
>>
>>>
>>>
>>>
>>> ================
>>> Comment at: lib/Linker/IRMover.cpp:1065
>>>      ValueMap.MD()[CU->getRawRetainedTypes()].reset(nullptr);
>>> +    ValueMap.MD()[CU->getRawGlobalVariables()].reset(nullptr);
>>>
>>> ----------------
>>> Please add a comment here (the old one that was removed isn't quite
>>> accurate now, but could probably be modified).
>>>
>>>
>>> Repository:
>>>   rL LLVM
>>>
>>> CHANGES SINCE LAST ACTION
>>>   https://reviews.llvm.org/D55309/new/
>>>
>>> https://reviews.llvm.org/D55309
>>>
>>>
>>>
>>>
>
> --
> Teresa Johnson |  Software Engineer |  tejohnson at google.com |
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.llvm.org/pipermail/llvm-commits/attachments/20181218/af083bd5/attachment-0001.html>