[llvm-dev] DWARF Fission + ThinLTO

Wed May 3 14:09:05 PDT 2017

> On May 3, 2017, at 2:00 PM, David Blaikie <dblaikie at gmail.com> wrote:
> 
> So Dehao and I have been dealing with some of the nitty-gritty details of debug info with ThinLTO, specifically with Fission (Split DWARF).
> 
> This applies to LTO as well, so I won't single out ThinLTO here.
> 
> 1) Multiple CUs in a .dwo file
> Clang/LLVM produces a CU for each original source file - these CUs are kept through IR linking (thin or full) and produced as distinct CUs in the resulting DWARF.
> This preserves semantics as much as possible - e.g., file-local functions and types (those in anonymous namespaces or declared file-static) can co-exist even if their names collide.
> GDB does good things with this, except with Fission.

Can you lay out the terminology you are using here? Is Fission as used here different/more than creating .dwo files?

> Binutils DWP produces usable DWPs from DWO files with multiple CUs.
> 
> 2) Cross-CU references
> This is where it gets trickier.
> LLVM produces cross-CU references (DW_FORM_ref_addr) to refer to types or functions defined in one CU used from another. This only happens with (thin or plain) LTO. LLVM's had this for a while.
> This helps fully express interesting cases outlined in (1) - it's possible that a file-local function is inlined into another CU - by using ref_addr the semantics described in (1) can be preserved, while also explaining the inlining that has occurred. (similar cases can arise with intra-CU type references too)
> GDB handles this, except with Fission. (With Fission it doesn't even get this far, and even with patches to handle (1)+Fission it's still not enough - it needs more work.)
> Binutils DWP and the DWP format in general... maybe can't cope with this.
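
A minimal sketch of why section-relative references are the awkward case here, under stated assumptions. DW_FORM_ref4 is an offset from the start of the containing CU, while DW_FORM_ref_addr is an offset from the start of the whole .debug_info section. The `CU` class, the helper names, and all offsets below are hypothetical, and the sketch assumes that a packaging step may place each CU's contribution at a new offset without rewriting reference values.

```python
from dataclasses import dataclass


@dataclass
class CU:
    name: str
    section_offset: int  # where this CU's contribution starts in the section
    size: int


def resolve_ref4(cu, value):
    """CU-relative reference: add the containing CU's start offset."""
    return cu.section_offset + value


def resolve_ref_addr(value):
    """Section-relative reference: the value already is the section offset."""
    return value


def owning_cu(cus, absolute_offset):
    """Return the name of the CU whose contribution covers the offset."""
    for cu in cus:
        if cu.section_offset <= absolute_offset < cu.section_offset + cu.size:
            return cu.name
    return None


# Hypothetical layout inside one .dwo file: two CUs back to back.
cus_dwo = [CU("a.cpp", 0x000, 0x100), CU("b.cpp", 0x100, 0x80)]

# Hypothetical layout after packaging: the same CUs at different offsets.
cus_dwp = [CU("a.cpp", 0x400, 0x100), CU("b.cpp", 0x900, 0x80)]

# A reference from a.cpp to a DIE 0x20 bytes into b.cpp.
REF_ADDR_VALUE = 0x120  # section-relative, valid for the .dwo layout only
REF4_VALUE = 0x20       # CU-relative, independent of where the CU is placed

print(owning_cu(cus_dwo, resolve_ref_addr(REF_ADDR_VALUE)))
print(owning_cu(cus_dwp, resolve_ref_addr(REF_ADDR_VALUE)))
print(owning_cu(cus_dwp, resolve_ref4(cus_dwp[1], REF4_VALUE)))
```

In this toy model the CU-relative value still resolves after the contributions move, while the unchanged section-relative value no longer lands inside any CU. Resolving it would take extra information about where the referenced CU ended up.
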
> 
> Talking about (2)+DWP:

You mean (2)+DWO+DWP?

> It looks like this may require adjusting at least the semantics, if not the syntax of the indexes in the DWP file. I've started a thread on dwarf-discuss to start trying to hash that out.
> 
> 
> 
> So, to cut a long story short: Fission+LTO is currently unusable and may take a while to make the DWARF side of this usable.
> 
> In the interim, I'd like to propose adding a flag to LLVM to support something not very nice: When merging modules (or importing things into a module in ThinLTO), do not maintain separate CUs but redirect the cu pointer in any DISubprogram (& probably DIGlobalVariable too - I forget if ThinLTO can import them) metadata to refer to the original CU in the destination module. (in the full LTO case, it'd also be necessary to import any retained types)
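
A minimal sketch of the effect of the proposed flag, modelled in plain Python rather than against the real LLVM metadata classes. `CompileUnit`, `Subprogram`, `Module`, `import_subprogram`, and the `merge_cus` parameter are hypothetical names for illustration only. They are not LLVM APIs or an existing flag.

```python
from dataclasses import dataclass, field


@dataclass(eq=False)
class CompileUnit:
    source_file: str


@dataclass(eq=False)
class Subprogram:
    name: str
    unit: CompileUnit  # stands in for the "cu pointer" on a DISubprogram


@dataclass
class Module:
    primary_cu: CompileUnit
    subprograms: list = field(default_factory=list)

    def compile_units(self):
        """Distinct CUs in the module, in first-seen order."""
        seen = [self.primary_cu]
        for sp in self.subprograms:
            if not any(sp.unit is cu for cu in seen):
                seen.append(sp.unit)
        return seen


def import_subprogram(dest, sp, merge_cus):
    """Copy a subprogram into dest.

    With merge_cus=False the imported copy keeps its original CU, so dest
    ends up describing several CUs. With merge_cus=True the CU pointer is
    redirected to dest's own CU, so dest still describes exactly one CU.
    """
    unit = dest.primary_cu if merge_cus else sp.unit
    imported = Subprogram(sp.name, unit)
    dest.subprograms.append(imported)
    return imported


def demo(merge_cus):
    """Import b.cpp's file-local 'helper' into a module built from a.cpp."""
    dest = Module(CompileUnit("a.cpp"))
    dest.subprograms.append(Subprogram("helper", dest.primary_cu))
    foreign = Subprogram("helper", CompileUnit("b.cpp"))
    import_subprogram(dest, foreign, merge_cus)
    return [cu.source_file for cu in dest.compile_units()]


print(demo(merge_cus=False))
print(demo(merge_cus=True))
```

With the redirect, the destination module ends up with a single CU that contains two subprograms named `helper`. That avoids multiple CUs per .dwo and cross-CU references, at the cost of the file-local distinction described in (1).
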
> 
> This flag could be passed to the task doing the merging/importing, or could be passed to the frontend compile and stashed in per-CU metadata (where it would be respected when importing/merging from that CU).
> 
> The intent would be to support this flag until the DWARF functionality can be decided on and implemented - and kept for some time after that for backwards compatibility for those who want to use Fission but haven't got the latest toolchains with the fixes/improvements in them.
> 
> 
> 
> How's this sound to everyone? Reasonable? Unreasonable? Need more details about the impact of these changes, etc? (I have examples with DWARF output, GDB behavior, etc)

If you can post an example, that would help.

thanks,
adrian