[Type Uniquing Patch] Modify functions in DebugInfo.h to take an extra map argument.
Manman Ren
mren at apple.com
Thu Jul 18 22:05:07 PDT 2013
On Jul 18, 2013, at 9:58 PM, David Blaikie wrote:
>
> On Jul 18, 2013 6:31 PM, "Eric Christopher" <echristo at gmail.com> wrote:
> >
> > > Somehow I can't put inline comments.
> >
> > Beta software? ;)
> >
> > From David:
> >
> > >
> > > Rather than having foo.getThing(map) we'd have something like
> > > lookup(foo.getThing()) - we'd need a return type from getThing that
> > > represent as the type identifier (the thing we keep calling a 'hash' but it
> > > isn't), we could just use stringref or something more specific.
> > >
> >
> > From you:
> > > Is lookup here a global function like "llvm::isSubprogramContext" or a
> > > member function of the DIMap?
> > > If the former, then it should be lookup(StringRef TypeName, DIMap &), right?
> > > We also need to modify foo.getThing() to return a StringRef instead of a
> > > DIType, right?
> > > In terms of lines of change, two options should be similar.
> > >
> >
> > I was thinking the former yes.
>
> I was picturing just a helper member function of dwarfdebug so it could be written in that context as lookup(foo.getThing()), but short of that, if we actually have a dimap type, it probably makes sense as a member of that. If we don't then it should be fine as a free function.
>
> > It should probably return a StringRef,
> > or an DITypeHandle, or something. Leave the actual choice there to
> > you.
>
> A typedef might be healthy here at least, so we can change it later of we need to.
>
> >
> > > About the Verify functions, the patch modifies from Verify() to Verify(DIMap
> > > &). Is this interface change okay?
> >
> > Again, I was hoping not to change the signature of these functions if possible.
>
> Agreed. Let's drop anything from verify that needs type resolution, unless there's a good reason not to. We should move verification to a single pass at some point which could use resolution there,
>
> >
> > > What about isUnsignedDIType, getOriginalTypeSize, isBlockByrefVariable, and
> > > isSubprogramContext? Should they take an additional DIMap?
> > > - bool isUnsignedDIType();
> > > + bool isUnsignedDIType(const DITypeHashMap &Map);
> > >
> > > Are the changes okay for
> > > - bool isArtificial() const {
> > > + bool isArtificial(const DITypeHashMap &Map) const {
> > > and other member functions in DbgVariable?
> > >
> >
> > From David:
> >
> > > As we discussed before, we shouldn't need conditional lookup (so we
> > > shouldn't be in a state where 'getThing' returns something that might be a
> > > type identifier or might be the type itself). We should just migrate one
> > > field at a time (or a bunch or all of them, but not switching the consuming
> > > code before the production code, otherwise were just adding a bunch of
> > > dead/untested code)
> > >
> >
> > From you:
> >
> > > There are some cases where a thing can be either a StringRef or a
> > > DIDescriptor such as getContext() or getElement() of a DIArray.
> > > What type should getElement() or getContext() return? A Value*?
> > >
> > > Downside for what is implemented in the patch:
> > > What if someone just wants the StringRef? (the patch added
> > > getNameDerivedFrom, Value* getType)
> > >
> > > Downside for the suggested lookup:
> > > Verify takes a DIMap, getThing does not
> > > getThing returns StringRef, getContext, getElement returns a Value
> > > It requires updating the testing cases since the thing must be a StringRef.
> > >
> > > The first two items are not really downsides, I am just worried about the
> > > last item, see below.
> > >
> >
> > I think your plan above there will work. It'll come down to which one
> > is cleaner ultimately. Our feeling was that it'd be cleaner not
> > passing the map everywhere and doing the lookup separately to keep the
> > concerns separate of "field" and "underlying value".
> >
> > The below is thinking out loud:
> >
> > As far as using Value * here that seems reasonable... or add a new
> > class of DIType that is a "reference to" type that just contains the
> > string and can be passed around like everything else, then lookup just
> > needs to happen when we want to change/emit it. How's that sound?
>
> Gut reaction: value* is top general and deriving from ditype is incorrect, since the whole api would be invalid as the type couldnt be implicitly resolved on demand (adding an extra state to this special ditype, which seems unfortunate). I think a simple type with a 'resolve(map)' function (could be written as a non member, in which case the type would need members to differentiate and return the ditype or string/ditypehandle) that returns the ditype one way or another, internally if use a pointerunion of string value or mdnode, perhaps? (Yeah, just a value pointer would do, with dyn cast, I suppose, but at least it'd be hidden away in that type)
>
> >
> > > There are additional code changes required to migrate a single field:
> > > 1> update the backend to support one MDNode (DIE) used in two different CUs.
> > > 2> modify DIBuilder to use StringRef instead of DIType
> > > 3> to generate the unique String name in clang.
> >
> > #3 is definitely a requirement for this. The "name" of a type
> > absolutely must be in the IR in place of any DIType fields. I'm not
> > quite sure what you mean by #1 and #2.
>
> 1 and 2 make sense to me, I think, I'll try to explain them to you in person, as I understand them.
>
> But these changes can be made incrementally - the backend and frontend need to change the schema atomically (including updating test cases), but once per field not for all the fields together (yes, this means thrashing the test cases as I did in the past, by that's OK). But 2 can then be done separately for each of those changes - adding the merging support.
>
>
Attaching patches for 1, 2 and 3.
1 is needed at least for performance, to avoid creating multiple DIEs for a single MDNode that is shared across CUs.
We used to have a map in DwarfCompileUnit which maps from a MDNode to a DIE.
I am not sure whether it is required for correctness.
2 can be incremental if we are going to update a batch of fields at a time.
I will get back to you on other issues soon.
Thanks
Manman
> >
> > > Any suggestion on how to generate a unique name for internal types
> > > defined within a lexical scope?
> > > int a(int i) {
> > > if (i) {
> > > struct A {
> > > …
> > > };
> > > }
> > > }
> >
> > Yep. Walk up the scope appending the name of enclosing
> > functions/namespace/etc to the name.
>
> We should have mangling support for a such names (as they can be used for template instantiations), plus adding the cu filename for anything with internal linkage.
>
> >
> > >
> > > The existing testing cases need major cleanup, they are not in verifiable
> > > states. As I mentioned earlier, there are 110 testing cases that failed to
> > > verify.
> > > It is hard to correctly update the testing cases since we don't have the
> > > source code to regenerate the debug info from.
> >
> > Yep. This is why we started putting in the C code if we could.
>
> Yes. We can't leave the test cases to rot. We must update them as the schema changes or we lose a lot of test coverage because it no longer tests what users are using.
>
> If/when we implement a full verifier we will find lots of outdated/incomplete/broken tests and we will have to update or delete them. I suggest we do that incrementally, verifying new parts and updating tests cases go pass each time. Yes, this will be a lot of work.
>
> >
> > > The only way I can think of is to go back to the state before David's series
> > > of changes (assuming that state is correct, which I am not sure about), then
> > > re-apply the changes manually or using script.
> > > I think David submitted > 5 patches to change the format of the DI MDNodes.
> > >
> > > Before cleaning up the testing cases, I don't think it is a good idea to
> > > perform another series of update to the testing cases.
>
> We can't let the test cases rot. If we need to add stronger verification now, then we need to update the existing tests to pass such verification.
>
> > >
> >
> > That's fair, but it seems like we don't need a lot of update. Just
> > about any format change is going to involve a lot of changing of debug
> > info test cases. It's unfortunate but true.
> >
> > > If we want to go with the lookup option, we need to combine all above
> > > (additional code changes and testing cases update) in a single patch.
> > > I am not against the lookup option, but does not like submitting a big
> > > patch: 80K code change + 229k added testing cases + xxx existing testing
> > > cases.
>
> Yeah, I don't see why this is true. Why can't we move one type field over at a time?
>
> And if we need to add stronger verification (not sure why that need is happening now, but I'm pk if it is, we should do it as soon as we can) then, as I said, we should do so incrementally - verifying one feature, updating testsm etc.
>
> > >
> >
> > Gah. You sure now after your patch of earlier?
> >
> > > Let me know your thoughts on how to move this forward.
> > >
> >
> > Hope this helps.
> >
> > -eric
> >
> > > Thanks,
> > > Manman
> > >
> > > On Jul 15, 2013, at 10:13 PM, David Blaikie wrote:
> > >
> > >
> > > On Jul 15, 2013 10:06 PM, "Eric Christopher" <echristo at gmail.com> wrote:
> > >>
> > >> As a note, pretty sick right now. I'll get to this shortly. As a quick
> > >> look I think we don't want to pass in the map, but rather look it up
> > >> in the uses?
> > >
> > > Manman- just to expand on this, since Eric and I have tossed the idea around
> > > a bit...
> > >
> > > Rather than having foo.getThing(map) we'd have something like
> > > lookup(foo.getThing()) - we'd need a return type from getThing that
> > > represent as the type identifier (the thing we keep calling a 'hash' but it
> > > isn't), we could just use stringref or something more specific.
> > >
> > > Does that make sense? What do you think (we didn't really have a strong
> > > feeling, but I understand/can agree wit Eric erring this way)
> > >
> > > As we discussed before, we shouldn't need conditional lookup (so we
> > > shouldn't be in a state where 'getThing' returns something that might be a
> > > type identifier or might be the type itself). We should just migrate one
> > > field at a time (or a bunch or all of them, but not switching the consuming
> > > code before the production code, otherwise were just adding a bunch of
> > > dead/untested code)
> > >
> > >>
> > >> -eric
> > >>
> > >> On Fri, Jul 12, 2013 at 3:13 PM, Manman Ren <mren at apple.com> wrote:
> > >> >
> > >> > Thanks,
> > >> > Manman
> > >> >
> > >> > On Jul 12, 2013, at 2:44 PM, Eric Christopher wrote:
> > >> >
> > >> >> Getting to it shortly, week has been roughly crazy.
> > >> >>
> > >> >> -eric
> > >> >>
> > >> >> On Thu, Jul 11, 2013 at 9:24 PM, Manman Ren <mren at apple.com> wrote:
> > >> >>>
> > >> >>> Private ping :)
> > >> >>>
> > >> >>> Did you get a chance to look at the patch?
> > >> >>>
> > >> >>> Things on my list:
> > >> >>> 1> refactor a helper function to query module flag
> > >> >>> 2> usage of getDICompositeType on dragonegg
> > >> >>> 3> try to use assert in r185847
> > >> >>>
> > >> >>> Thanks,
> > >> >>> Manman
> > >> >>>
> > >> >>> On Jul 9, 2013, at 9:48 AM, Manman Ren wrote:
> > >> >>>
> > >> >>>>
> > >> >>>> Modify access functions that can return a DIType to take an extra map
> > >> >>>> argument.
> > >> >>>> Add helper function
> > >> >>>> + void collectDITypeHashMap(const NamedMDNode *CU_Nodes,
> > >> >>>> DITypeHashMap &Map);
> > >> >>>> to update DITypeHashMap by going through RetainedTypes of each CU.
> > >> >>>> Testing cases are added to verify llvm-link can correctly unique
> > >> >>>> types if the input .ll files use string instead of DIType.
> > >> >>>>
> > >> >>>> To make sure we can dump MDNode comments even without a
> > >> >>>> DITypeHashMap, isDebugInfo is used instead of
> > >> >>>> Verify in WriteMDNodeComment. Neither is accurate for checking
> > >> >>>> whether MDNode is indeed debug info.
> > >> >>>>
> > >> >>>> Thanks,
> > >> >>>> Manman
> > >> >>>>
> > >> >>>> <type_unique_di.patch><type_unique_test.patch>
> > >> >>>
> > >> >
> > >
> > >
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.llvm.org/pipermail/llvm-commits/attachments/20130718/b644d613/attachment.html>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: type_unique_be.patch
Type: application/octet-stream
Size: 9467 bytes
Desc: not available
URL: <http://lists.llvm.org/pipermail/llvm-commits/attachments/20130718/b644d613/attachment.obj>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.llvm.org/pipermail/llvm-commits/attachments/20130718/b644d613/attachment-0001.html>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: type_unique_builder.patch
Type: application/octet-stream
Size: 15069 bytes
Desc: not available
URL: <http://lists.llvm.org/pipermail/llvm-commits/attachments/20130718/b644d613/attachment-0001.obj>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.llvm.org/pipermail/llvm-commits/attachments/20130718/b644d613/attachment-0002.html>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: type_unique_clang.patch
Type: application/octet-stream
Size: 3743 bytes
Desc: not available
URL: <http://lists.llvm.org/pipermail/llvm-commits/attachments/20130718/b644d613/attachment-0002.obj>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.llvm.org/pipermail/llvm-commits/attachments/20130718/b644d613/attachment-0003.html>
More information about the llvm-commits
mailing list