[Type Uniquing Patch] Modify functions in DebugInfo.h to take an extra map argument.

Eric Christopher echristo at gmail.com
Thu Jul 18 18:31:27 PDT 2013


> Somehow I can't put inline comments.

Beta software? ;)

>From David:

>
> Rather than having foo.getThing(map) we'd have something like
> lookup(foo.getThing()) - we'd need a return type from getThing that
> represent as the type identifier (the thing we keep calling a 'hash' but it
> isn't), we could just use stringref or something more specific.
>

>From you:
> Is lookup here a global function like "llvm::isSubprogramContext" or a
> member function of the DIMap?
> If the former, then it should be lookup(StringRef TypeName, DIMap &), right?
> We also need to modify foo.getThing() to return a StringRef instead of a
> DIType, right?
> In terms of lines of change, two options should be similar.
>

I was thinking the former yes. It should probably return a StringRef,
or an DITypeHandle, or something. Leave the actual choice there to
you.

> About the Verify functions, the patch modifies from Verify() to Verify(DIMap
> &). Is this interface change okay?

Again, I was hoping not to change the signature of these functions if possible.

> What about isUnsignedDIType, getOriginalTypeSize, isBlockByrefVariable, and
> isSubprogramContext? Should they take an additional DIMap?
> -    bool isUnsignedDIType();
> +    bool isUnsignedDIType(const DITypeHashMap &Map);
>
> Are the changes okay for
> -  bool isArtificial()                const {
> +  bool isArtificial(const DITypeHashMap &Map) const {
> and other member functions in DbgVariable?
>

>From David:

> As we discussed before, we shouldn't need conditional lookup (so we
> shouldn't be in a state where 'getThing' returns something that might be a
> type identifier or might be the type itself). We should just migrate one
> field at a time (or a bunch or all of them, but not switching the consuming
> code before the production code, otherwise were just adding a bunch of
> dead/untested code)
>

>From you:

> There are some cases where a thing can be either a StringRef or a
> DIDescriptor such as getContext() or getElement() of a DIArray.
> What type should getElement() or getContext() return? A Value*?
>
> Downside for what is implemented in the patch:
> What if someone just wants the StringRef? (the patch added
> getNameDerivedFrom, Value* getType)
>
> Downside for the suggested lookup:
> Verify takes a DIMap, getThing does not
> getThing returns StringRef, getContext, getElement returns a Value
> It requires updating the testing cases since the thing must be a StringRef.
>
> The first two items are not really downsides, I am just worried about the
> last item, see below.
>

I think your plan above there will work. It'll come down to which one
is cleaner ultimately. Our feeling was that it'd be cleaner not
passing the map everywhere and doing the lookup separately to keep the
concerns separate of "field" and "underlying value".

The below is thinking out loud:

As far as using Value * here that seems reasonable... or add a new
class of DIType that is a "reference to" type that just contains the
string and can be passed around like everything else, then lookup just
needs to happen when we want to change/emit it. How's that sound?

> There are additional code changes required to migrate a single field:
> 1> update the backend to support one MDNode (DIE) used in two different CUs.
> 2> modify DIBuilder to use StringRef instead of DIType
> 3> to generate the unique String name in clang.

#3 is definitely a requirement for this. The "name" of a type
absolutely must be in the IR in place of any DIType fields. I'm not
quite sure what you mean by #1 and #2.

>      Any suggestion on how to generate a unique name for internal types
> defined within a lexical scope?
>      int a(int i) {
>         if (i) {
>            struct A {
>>            };
>         }
>      }

Yep. Walk up the scope appending the name of enclosing
functions/namespace/etc to the name.

>
> The existing testing cases need major cleanup, they are not in verifiable
> states. As I mentioned earlier, there are 110 testing cases that failed to
> verify.
> It is hard to correctly update the testing cases since we don't have the
> source code to regenerate the debug info from.

Yep. This is why we started putting in the C code if we could.

> The only way I can think of is to go back to the state before David's series
> of changes (assuming that state is correct, which I am not sure about), then
> re-apply the changes manually or using script.
> I think David submitted > 5 patches to change the format of the DI MDNodes.
>
> Before cleaning up the testing cases, I don't think it is a good idea to
> perform another series of update to the testing cases.
>

That's fair, but it seems like we don't need a lot of update. Just
about any format change is going to involve a lot of changing of debug
info test cases. It's unfortunate but true.

> If we want to go with the lookup option, we need to combine all above
> (additional code changes and testing cases update) in a single patch.
> I am not against the lookup option, but does not like submitting a big
> patch: 80K code change + 229k added testing cases + xxx existing testing
> cases.
>

Gah. You sure now after your patch of earlier?

> Let me know your thoughts on how to move this forward.
>

Hope this helps.

-eric

> Thanks,
> Manman
>
> On Jul 15, 2013, at 10:13 PM, David Blaikie wrote:
>
>
> On Jul 15, 2013 10:06 PM, "Eric Christopher" <echristo at gmail.com> wrote:
>>
>> As a note, pretty sick right now. I'll get to this shortly. As a quick
>> look I think we don't want to pass in the map, but rather look it up
>> in the uses?
>
> Manman- just to expand on this, since Eric and I have tossed the idea around
> a bit...
>
> Rather than having foo.getThing(map) we'd have something like
> lookup(foo.getThing()) - we'd need a return type from getThing that
> represent as the type identifier (the thing we keep calling a 'hash' but it
> isn't), we could just use stringref or something more specific.
>
> Does that make sense? What do you think (we didn't really have a strong
> feeling, but I understand/can agree wit Eric erring this way)
>
> As we discussed before, we shouldn't need conditional lookup (so we
> shouldn't be in a state where 'getThing' returns something that might be a
> type identifier or might be the type itself). We should just migrate one
> field at a time (or a bunch or all of them, but not switching the consuming
> code before the production code, otherwise were just adding a bunch of
> dead/untested code)
>
>>
>> -eric
>>
>> On Fri, Jul 12, 2013 at 3:13 PM, Manman Ren <mren at apple.com> wrote:
>> >
>> > Thanks,
>> > Manman
>> >
>> > On Jul 12, 2013, at 2:44 PM, Eric Christopher wrote:
>> >
>> >> Getting to it shortly, week has been roughly crazy.
>> >>
>> >> -eric
>> >>
>> >> On Thu, Jul 11, 2013 at 9:24 PM, Manman Ren <mren at apple.com> wrote:
>> >>>
>> >>> Private ping :)
>> >>>
>> >>> Did you get a chance to look at the patch?
>> >>>
>> >>> Things on my list:
>> >>> 1> refactor a helper function to query module flag
>> >>> 2> usage of getDICompositeType on dragonegg
>> >>> 3> try to use assert in r185847
>> >>>
>> >>> Thanks,
>> >>> Manman
>> >>>
>> >>> On Jul 9, 2013, at 9:48 AM, Manman Ren wrote:
>> >>>
>> >>>>
>> >>>> Modify access functions that can return a DIType to take an extra map
>> >>>> argument.
>> >>>> Add helper function
>> >>>> +  void collectDITypeHashMap(const NamedMDNode *CU_Nodes,
>> >>>> DITypeHashMap &Map);
>> >>>> to update DITypeHashMap by going through RetainedTypes of each CU.
>> >>>> Testing cases are added to verify llvm-link can correctly unique
>> >>>> types if the input .ll files use string instead of DIType.
>> >>>>
>> >>>> To make sure we can dump MDNode comments even without a
>> >>>> DITypeHashMap, isDebugInfo is used instead of
>> >>>> Verify in WriteMDNodeComment. Neither is accurate for checking
>> >>>> whether MDNode is indeed debug info.
>> >>>>
>> >>>> Thanks,
>> >>>> Manman
>> >>>>
>> >>>> <type_unique_di.patch><type_unique_test.patch>
>> >>>
>> >
>
>




More information about the llvm-commits mailing list