[LLVMdev] Proposal: type uniquing of debug info for LTO

Fri Jun 21 11:56:59 PDT 2013

On Fri, Jun 21, 2013 at 11:50 AM, Manman Ren <mren at apple.com> wrote:
>
> On Jun 21, 2013, at 11:35 AM, Eric Christopher wrote:
>
>> On Thu, Jun 20, 2013 at 10:52 PM, Manman Ren <mren at apple.com> wrote:
>>>
>>> A summary of options for issue #3:
>>> 3> To actually access the MDNode referenced via the hash value, we need to perform a lookup from the hash value to find the corresponding MDNode.
>>> The questions are where to store this map and how to keep it up-to-date when a MDNode is replaced.
>>> ---------------------
>>> Option a) a map in DwarfDebug, AsmWriter and DIBuilder, modify access functions in DI classes to pass in the map.
>>> May need to modify printing functions in AsmWriter.
>>
>> You may not need the map in DIBuilder if you piggy back on the
>> existing map in CGDebugInfo.cpp. The AsmWriter stuff is unfortunate
>> for just debug dumps of the IR. A way around this might be nice, but
>> I've not looked into it.
>>
>>> ---------------------
>>> Option b) I am going to expand David's description with more details, correct me if I am wrong.
>>
>> I'll let David continue this part of the thread with you.
>>
>>>
>>> I personally prefer option b over a since it is much cleaner.
>>>
>>
>> Option b is more general and should probably work. I suggested a
>> originally because I know it would work and the impact is limited to
>> debug info. Also it doesn't involve touching the generic metadata
>> interface or code and could be done quickly and incrementally. b may
>> also require further changes to the debug info infrastructure to deal
>> with lookup, etc so may end up being as much churn as a. b is more
>> useful if we think other users of metadata may want/need similar
>> functionality in the future.
>>
>> I'm fine with b as a choice, but it's going to involve more planning
>> and code review and involve people outside of you, me and Dave :)
>
> Got it. David, are you okay with the details I added about option b?
> Let's start a new thread with title "Proposal: extend Metadata (MDNode) to support MDHash" or something like that.

I'd like to hear a rationale for wanting b first. I don't know (and
can think of) any other users of this functionality right now.

> We can add a weight to MDHash (other names are fine with me), to support multiple MDNodes with the same hash, but different weight.
>

Wait, what? This sounds like a bad idea.

-eric

> Thanks,
> Manman
>
>>
>> -eric
>>
>>
>>> Thanks,
>>> Manman
>>>
>>> On Jun 20, 2013, at 5:39 PM, David Blaikie wrote:
>>>
>>>> On Thu, Jun 20, 2013 at 5:25 PM, Manman Ren <mren at apple.com> wrote:
>>>>>
>>>>> On Jun 20, 2013, at 5:18 PM, David Blaikie <dblaikie at gmail.com> wrote:
>>>>>
>>>>> On Thu, Jun 20, 2013 at 5:13 PM, Manman Ren <mren at apple.com> wrote:
>>>>>
>>>>>
>>>>> On Jun 20, 2013, at 4:52 PM, David Blaikie wrote:
>>>>>
>>>>> On Thu, Jun 20, 2013 at 4:45 PM, Manman Ren <mren at apple.com> wrote:
>>>>>
>>>>>
>>>>> On Jun 20, 2013, at 3:55 PM, Manman Ren <mren at apple.com> wrote:
>>>>>
>>>>>
>>>>> On Jun 20, 2013, at 2:58 PM, Eric Christopher wrote:
>>>>>
>>>>> Hi Manman,
>>>>>
>>>>> On Thu, Jun 20, 2013 at 2:51 PM, Manman Ren <mren at apple.com> wrote:
>>>>>
>>>>>
>>>>> The intent of this proposal is to speedup compilation of "-flto -g" for c++
>>>>> programs.
>>>>> This is based on discussions with Adrian, David and Eric.
>>>>>
>>>>>
>>>>> Thanks for bringing this back to the list. The original thread was
>>>>> getting quite long.
>>>>>
>>>>> ---------------------------
>>>>> Problem:
>>>>> A single class can be used in multiple source files and the DI (Debug Info)
>>>>> class is included in multiple bc files. The duplication of
>>>>> class definitions causes blow-up in # of MDNodes, # of DIEs, leading to
>>>>> large memory requirement.
>>>>>
>>>>> As an example, SPEC xalancbmk requires 7GB of memory when compiled with
>>>>> -flto -g.
>>>>> With a preliminary implementation of type uniquing, the memory usage will be
>>>>> down to 2GB.
>>>>>
>>>>> In order to unique types, we have to break cycles in the MDNodes.
>>>>>
>>>>> A simple struct definition
>>>>> struct Base {
>>>>> int a;
>>>>> };
>>>>> can cause cycles in MDNodes:
>>>>> !12 = metadata !{i32 786451, metadata !13, null, metadata !"Base", i32 1,
>>>>> i64 32, i64 32, i32 0, i32 0, null, metadata !14, i32 0, null, null} ; [
>>>>> DW_TAG_structure_type ] [Base] [line 1, size 32, align 32, offset 0] [from ]
>>>>> !14 = metadata !{metadata !15, metadata !16}
>>>>> !15 = metadata !{i32 786445, metadata !13, metadata !12, metadata !"a", i32
>>>>> 2, i64 32, i64 32, i64 0, i32 0, metadata !8} ; [ DW_TAG_member ] [a] [line
>>>>> 2, size 32, align 32, offset 0] [from int]
>>>>> !16 = metadata !{i32 786478, metadata !13, metadata !12, metadata !"Base",
>>>>> metadata !"Base", metadata !"", i32 1, metadata !17, i1 false, i1 false, i32
>>>>> 0, i32 0, null, i32 320, i1 false, null, null, i32 0, metadata !20, i32 1} ;
>>>>> [ DW_TAG_subprogram ] [line 1] [Base]
>>>>>
>>>>> Cycles: !12 -- !14 -- !15 -- !12
>>>>>       !12 -- !14 -- !16 -- !12
>>>>>
>>>>> These cycles make it hard to unique the same struct used in two bc files.
>>>>>
>>>>> ---------------------------
>>>>> How to fix:
>>>>>
>>>>> We attach a hash value to types to help type uniquing and we also replace
>>>>> references to types with their hash values.
>>>>> For the above struct "Base", we now have the following MDNodes:
>>>>> !4 = metadata !{i32 786451, metadata !5, null, metadata !"Base", i32 1, i64
>>>>> 32, i64 32, i32 0, i32 0, null, metadata !6, i32 0, i32 0, null, i32
>>>>> 915398439} ; [ DW_TAG_structure_type ] [Base] [line 1, size 32, align 32,
>>>>> offset 0] [from ]
>>>>> !6 = metadata !{metadata !7, metadata !9}
>>>>> !7 = metadata !{i32 786445, metadata !5, i32 915398439, metadata !"a", i32
>>>>> 2, i64 32, i64 32, i64 0, i32 0, metadata !8} ; [ DW_TAG_member ] [a] [line
>>>>> 2, size 32, align 32, offset 0] [from int]
>>>>> !9 = metadata !{i32 786478, metadata !5, i32 915398439, metadata !"Base",
>>>>> metadata !"Base", metadata !"", i32 1, metadata !10, i1 false, i1 false, i32
>>>>> 0, i32 0, null, i32 320, i1 false, null, null, i32 0, metadata !13, i32 1} ;
>>>>> [ DW_TAG_subprogram ] [line 1] [Base]
>>>>>
>>>>> Note that the cycles are gone and !4 has a hash value of 915398439, and the
>>>>> references to !4 are replaced with 915398439.
>>>>> Thanks Eric for suggesting replacing MD reference with a hash value.
>>>>>
>>>>>
>>>>> In particular I recommended this:
>>>>>
>>>>> a) For C++ odr replace it with the "hash" that's just a string
>>>>> representing the type name.
>>>>> b) For Internal C++ types and all C types replace it with a string
>>>>> that's a concatenation of the type name and the name of the compile
>>>>> unit.
>>>>>
>>>>> Yes, that is what we agreed on over email.
>>>>>
>>>>>
>>>>>
>>>>> There are a few issues:
>>>>> 1> How to generate the hash for a given type?
>>>>> With C++'s ODR, it should be enough by using the context and the name for
>>>>> non-internal c++ types.
>>>>> For internal c++ types and types of other languages, hash will not be used.
>>>>>
>>>>>
>>>>> Explain this?
>>>>>
>>>>>
>>>>> For a while, I am going to support both hash and MD reference, once
>>>>> everything is working with hash,
>>>>> I will update all debug info testing cases, turn -gtype-uniquing on, and
>>>>> remove the other path.
>>>>>
>>>>> For internal c++ types, initially they will follow the path of using MD
>>>>> references without a hash.
>>>>>
>>>>>
>>>>> My current implementation is to add a few static member functions in MDNode
>>>>> to profile DI nodes differently.
>>>>> + /// If the array of Vals is for debug info, profile it specially and
>>>>> return true.
>>>>> + /// If the DI node has a hash value, generate the profile using only the
>>>>> hash value and the declaration flag.
>>>>> +  static bool profileDebugInfoNode(ArrayRef<Value*> Vals, FoldingSetNodeID
>>>>> &ID);
>>>>>
>>>>> + /// If the MDNode is for debug info, profile it specially and return true.
>>>>> + /// If the DI node has a hash value, generate the profile using only the
>>>>> hash value and the declaration flag.
>>>>> +  static bool profileDebugInfoNode(const MDNode *M, FoldingSetNodeID &ID);
>>>>>
>>>>> + /// Given a hash value and a flag, generate the profile for later lookup.
>>>>> +  static bool profileDebugInfoNode(unsigned Hash, bool Declaration,
>>>>> FoldingSetNodeID &ID);
>>>>>
>>>>> These static functions are called in Metadata.cpp:
>>>>> void MDNode::Profile(FoldingSetNodeID &ID) const {
>>>>> +  if (profileDebugInfoNode(this, ID))
>>>>> +    return;
>>>>> +
>>>>>
>>>>> There are other examples of these in MDNode for handling of specific
>>>>> metadata.
>>>>> /// Methods for metadata merging.
>>>>> static MDNode *getMostGenericTBAA(MDNode *A, MDNode *B);
>>>>> static MDNode *getMostGenericFPMath(MDNode *A, MDNode *B);
>>>>> static MDNode *getMostGenericRange(MDNode *A, MDNode *B);
>>>>>
>>>>> Comments are welcome on whether this violates any layering rule.
>>>>>
>>>>>
>>>>> As I've said many times in email, I don't think this is a good idea
>>>>> and would prefer either a or b below. a) is a much simpler solution.
>>>>>
>>>>> Any reason that why it is not a good idea?
>>>>>
>>>>>
>>>>> Other choices are:
>>>>> a> Keep a map in DwarfDebug
>>>>> Keep in mind that the map is used at many stages, and it has to be in sync
>>>>> with MDNodeSet.
>>>>> b> Generalize MDNode to be aware of hash (David can provide more details)
>>>>> c> Extend MDNode to DINode and modify streamers (bitcode reader|writer, ll
>>>>> reader|writer) to be aware of DINode
>>>>> We can provide DINode::get(…) to create a DINode. DINode can have its own
>>>>> Profile function.
>>>>> Other suggestions are welcome.
>>>>>
>>>>>
>>>>> a or b please.
>>>>>
>>>>> Option a> will require a DwarfDebug pointer in every stage of the compiler,
>>>>> and passing the map to the DI classes.
>>>>> A rough estimation is around 100 places.
>>>>> Is it reasonable to pass a DwarfDebug pointer to DIBuilder and llvm linker?
>>>>> Also the map needs to be in sync with MDNodeSet, maybe using ValueHandle can
>>>>> solve the problem.
>>>>>
>>>>>
>>>>> What about putting the map in LLVMContextImpl?
>>>>> It already has a few things specifically for debug info:
>>>>> std::vector<DebugRecVH> ScopeRecords;
>>>>> DenseMap<std::pair<MDNode*, MDNode*>, int> ScopeInlinedAt;
>>>>> …
>>>>>
>>>>> I remember David mentioned it once and I forgot about the conclusion.
>>>>>
>>>>>
>>>>> I mentioned it only as speculation as to how you were implementing it
>>>>> already (but you were doing the profile-changing stuff).
>>>>>
>>>>> I don't think it should be necessary to have the map (in option (a))
>>>>> in such a central location as LLVMContext. It should be usable just
>>>>> from DwarfDebug for generation, and DIBuilder can have its own,
>>>>> separate map to do similar things during DI building.
>>>>>
>>>>>
>>>>> We also need the map during llvm linking since linking will create new
>>>>> MDNodes.
>>>>>
>>>>>
>>>>> I don't understand what you mean by this. IR linking shouldn't need to
>>>>> do anything debug-info-specific, it should just be the normal IR
>>>>> approach.
>>>>>
>>>>> The declaration-v-definition resolution can be done at codegen-time.
>>>>> We'll have to walk all the lists of retained types anyway to build the
>>>>> map, so we can do declaration-v-definition (keep definitions over
>>>>> declarations when we see both) at that point.
>>>>>
>>>>>
>>>>> Yes, you are right that at codegen-time, we can generate the map from the
>>>>> lists
>>>>> of retained types.
>>>>>
>>>>> But dumping the linked ll file requires the map when outputting comments of
>>>>> the MDNode :]
>>>>
>>>> Depending on which things we print out, but yes, in some cases
>>>> (derived types) we do print out the type referenced. I assume the
>>>> AsmPrinter can build such a map, then. (in fact, with a few clients
>>>> like this, it might be nice to build a bit of an abstraction around it
>>>> rather than just using a raw map - something that has a ctor (or I
>>>> suppose it could be a factory function) that reads in the right
>>>> metadata, walks the compile units, etc, and builds the mapping)
>>>>
>>>>>
>>>>> Thanks,
>>>>> Manman
>>>>>
>>>>>
>>>>> So any other opinion on putting it in LLVMContext other than it being
>>>>> central?
>>>>>
>>>>> Thanks,
>>>>> Manman
>>>>>
>>>>>
>>>>>
>>>>> Thanks,
>>>>> Manman
>>>>>
>>>>>
>>>>> More details for option b from David
>>>>>
>>>>> < The alternative I have in mind is a more complete version of what
>>>>> < you're proposing - a full MD feature, not an MD feature that's just
>>>>> < barely enough to support the needs of debug info. What we could do is
>>>>> < allow the insertion of these MDHash things you spoke about but take it
>>>>> < a step further and have MDNode::getOperand walk through the hash &
>>>>> < give the value (in this way, DebugInfo wouldn't have to change at all
>>>>> < to handle hashes - if the Metadata APIs are going to be aware of the
>>>>> < hashes anyway, they might as well provide this convenience
>>>>> < functionality) the metadata feature would also have to have some
>>>>> < blessed top-level named metadata that would have a list of hash+MDNode
>>>>> < to keep those MDNodes alive (so you wouldn't have to stuff all the
>>>>> < types in the retained types list - metadata would provide the full
>>>>> < support, not just half of it).
>>>>>
>>>>>
>>>>>
>>>>> Transition from current DI Metadata:
>>>>> To have a smooth transition, I will add a flag "-gtype-hashing" for the type
>>>>> uniquing work and turn it on by default when we are ready.
>>>>>
>>>>>
>>>>> I'd prefer just make the change to have the front end emit the "hash"
>>>>> (it's not really a hash, it's just a string).
>>>>>
>>>>> Are you saying no transition period? A single patch to have correct handling
>>>>> of "hash" and to update all existing testing cases?
>>>>>
>>>>>
>>>>> -----------------------------
>>>>> Patches:
>>>>> Expect the following patches:
>>>>> 1> add flag -gtype-hashing
>>>>> 2> add hash field to DI types
>>>>> 3> modify DIBuilder to use hash instead of MD reference
>>>>> 4> related to issue 3
>>>>>
>>>>>
>>>>> These can all be a single patch since it shouldn't be very large if we
>>>>> go with a) above. If we go with b) then the MDNode work should be done
>>>>> in isolation first and then the debug info on top of it.
>>>>>
>>>>> What is wrong with smaller patches?
>>>>> My estimation for all the above with a) is about 30K + testing cases.
>>>>>
>>>>>
>>>>> 5> backend change (in DwarfDebug|CompileUnit) to support types shared among
>>>>> compile units
>>>>> requires gdwarf-2 gdwarf-3 gdwarf-4 support for issues related to ref_addr
>>>>>
>>>>>
>>>>> #5 can and should be done before the rest of them.
>>>>>
>>>>> I prefer to submit patches according to the flow of the compiler, starting
>>>>> from the frontend, then IR, then backend.
>>>>> The testing cases will be added for front end, llvm-link and backend.
>>>>> Any reason why #5 should be done first?
>>>>>
>>>>>
>>>>> All changes should be local to debug info classes except patch #4.
>>>>>
>>>>>
>>>>> What's patch #4?
>>>>>
>>>>> Patch #4 above: related to issue 3 (changes corresponding to how to solve
>>>>> issue #3)
>>>>>
>>>>> -Manman
>>>>>
>>>>>
>>>>> -eric
>>>>>
>>>>>
>>>>>
>>>>> _______________________________________________
>>>>> LLVM Developers mailing list
>>>>> LLVMdev at cs.uiuc.edu         http://llvm.cs.uiuc.edu
>>>>> http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev
>>>>>
>>>>>
>>>>>
>>>>> _______________________________________________
>>>>> llvm-commits mailing list
>>>>> llvm-commits at cs.uiuc.edu
>>>>> http://lists.cs.uiuc.edu/mailman/listinfo/llvm-commits
>>>>>
>>>>>
>>>
>