[PATCH] D16440: [ThinLTO] Link in only necessary DICompileUnit operands
Duncan P. N. Exon Smith via llvm-commits
llvm-commits at lists.llvm.org
Tue Jan 26 19:14:00 PST 2016
> On 2016-Jan-26, at 08:57, Mehdi AMINI <mehdi.amini at apple.com> wrote:
> joker.eph added a comment.
> It seems that
>> Why aren't these correlated by the metadata ID instead? I.e. !93 instead of !"_ZTSN11xercesc_2_516XMLPlatformUtilsE" in the DISubprogram scope? Then no TypeIdentifierMap would be needed.
> I think it is to break cycles. String are a convenient way to break dependencies :)
Regarding the type map, the solution I have in mind is:
1. Create a new "context", with more limited scope than LLVMContext, meant
strictly for modules that are going to be linked together. Let's
call it `DIContext` for the sake of discussion (maybe `LinkingContext`
would be better though?).
2. Each module has a `DIContext` (which itself points at the `LLVMContext`).
3. Add an explicit map, `MDString*` -> `DIType*`, that stores the types
with an `identifier:` field (i.e., UUID) in the `DIContext`, for
uniquing via ODR for C++, or the clang module UUIDs).
4. Remove `DITypeRef`s, making them `DIType*` again (and reverting
`DIScopeRef`s to `DIScope*`).
I don't think this is too hard to do; even the testcase updates in (4)
should be easily scriptable.
For (3), there's an option to just move the metadata to the `LinkingContext`
(rather than adding a `DICompositeType` uniquing map there). This makes
sense to me.
Why do we need a new context?
- Storing this on the Module precludes us from "uniquing" these (via the
new map) during BitcodeReader time.
- Uniquing by UUID on the LLVMContext is invalid, since it's possible to
have completely unrelated Modules in the same context. If both
modules have C++ types by the same name (say, `class MyType`), then
we get invalid debug info in one of the modules. Since these modules
are unrelated in this hypothetical example (and we have no intention
of linking them), ODR doesn't apply, so we don't get an out there.
> I discussed with Duncan yesterday, the ultimate plan he had last year seemed to be to reverse all the links: Subprograms would point to the CompileUnit instead of the opposite for instance. And ultimately there wouldn't be a need for the named metadata dbg.cu.
Right, I think we should reverse links (as pcc started to do with
Function -> DISubprogram) from the IR toward the compile unit, and
then following pointers gives you "the right stuff".
1. Add DISubprogram -> DICompileUnit and drop the reverse.
2. Add GlobalVariable -> DIGlobalVariable and drop the reverse
(this needs a new IR features, metadata attachments to GVs).
3. Add DIGlobalVariable -> DICompileUnit and drop the reverse.
4. Create an explicit in-memory type map, restore pointers, and
drop the "retained type" map.
Maybe we can't get rid of enums and macros -- I'm not sure -- in
which case we might still need !llvm.dbg.cu? I hope not though. If
you have no code in the executable that came from a translation unit,
why would you still want the macros from that translation unit?
More information about the llvm-commits