[LLVMdev] Proposal: type uniquing of debug info for LTO

Tue Jun 25 11:35:47 PDT 2013

On Tue, Jun 25, 2013 at 10:13 AM, Manman Ren <mren at apple.com> wrote:
>
> On Jun 25, 2013, at 9:15 AM, David Blaikie <dblaikie at gmail.com> wrote:
>
> On Tue, Jun 25, 2013 at 8:59 AM, Manman Ren <mren at apple.com> wrote:
>
>
> Any suggestion on how to move this forward?
> My feeling is that we should not call Verify from so many files.
> If you agree, I can try to clean up the Verify functions first.
>
>
> Yes, we probably shouldn't be using Verify much. I haven't looked at
> all the use cases though.
>
>
> Verify called from the following 12 files other than DebugInfo DIBuilder
> Dwarf
> Index: tools/opt/opt.cpp
> Index: lib/Target/NVPTX/NVPTXAsmPrinter.cpp
> Index: lib/Transforms/Utils/Local.cpp
> Index: lib/Transforms/Instrumentation/GCOVProfiling.cpp
> Index: lib/Transforms/IPO/StripSymbols.cpp
> Index: lib/Transforms/IPO/DeadArgumentElimination.cpp
> Index: lib/CodeGen/SelectionDAG/FastISel.cpp
> Index: lib/CodeGen/SelectionDAG/SelectionDAGDumper.cpp
> Index: lib/CodeGen/SelectionDAG/FunctionLoweringInfo.cpp
> Index: lib/CodeGen/SelectionDAG/SelectionDAGBuilder.cpp
> Index: lib/CodeGen/MachineInstr.cpp
> Index: lib/IR/AsmWriter.cpp
>
> A sample patch which catches all uses of Verify outside Dwarfxxx.cpp
> DebugInfo.cpp DIBuilder.cpp:

I don't really understand this patch.

1) does Verify really need the map at all? Could we look at the cases
where Verify would use the map & decide whether we could just not
bother with those cases?
2) even if we were going to have such an API, why not have a default
value for the parameter so we don't have to update existing callers?
3) point below \/ is that we should probably remove these Verify calls
(except for assembly printing) - debug values & debug info metadata
should be reasonably assumed to be valid if it's accessed through the
relevant entry points (navigating from debug info for a
function/variable, from a dbg_value/declare intrinsic, etc... we
shouldn't be conditionalizing based on the verification of these
values) - at some point we'd want to add an up-front verifier that
checks these invariants in one go (& doesn't bother with it when the
IR was generated from a trusted front end, in the same way don't use
the IR verifier in those cases).

>
>
> We are calling Verify to make sure a MDNode is indeed a specific DI node.
>
>
>
> If we're using it to differentiate nodes (which I think I've done in
> one or two places) that's probably a bit dodgy & we should do
> something more explicit (based on tag values, etc).
>
> I think the trickiest use of it (& even then it's not quite
> sufficient) is in the assembly comment printing stuff - I'm not sure
> there's a good alternative there. Treating any metadata that happens
> to have an integer first operand that happens to be a tag value is...
> unreliable at best. Not sure what to do about that.
>
> Same feeling here. But we can only parse the content of the MDNode to know
> whether it is a DI node and
> if yes, what type of DI node it is.

Sure - but the question is how much parsing do we need to do. When I
removed the debug info versioning we started getting a lot more
conflicts with accidentally annotating non-debug metadata as debug
metadata (you can see this firing on some of the line info debug
nodes, for example). One possibility is we could restore something
like that - a 'unique' number to make it sufficiently sane to just
examine the number to decide if we're looking at a debug info node.

I'm not sure if that's the right answer.

In any case this problem should only apply to assembly printing - it
doesn't explain why other parts of the codebase need to use Verify at
all. Those parts of the codebase actually trying to handle debug info
should be handling valid debug info to begin with - we shouldn't need
to call "Verify" all over the place & behave differently (at most it
should be an assert - but it should probably just move entirely to a
verifier like the IR verifier we already have).

>
>
> If you prefer to communicate via lists, let me know.
>
>
> Generally, yes. Discussion should happen in the open unless there's a
> need for it to be private.
>
> As to the MDNode interface, I think debug info is the biggest user, other
> usage is quite simple, such as TBAA.
>
>
> I'm not sure what you're referring to here.
>
>
> I am referring to the fact nobody else had comments on our proposal of
> modifying MDNode :]
> That probably means we are the ones who care the most about MDNode?

Probably - but it helps to explain to the community what we're doing &
why we're doing it so it's not a surprise when they start seeing
commits.

- David

>
> Thanks,
> Manman
>
>
>
> Thanks,
> Manman
>
> On Jun 21, 2013, at 10:01 PM, Manman Ren wrote:
>
>
> Some scoping for option a:
>
> There are a few Verify functions
> DISubprogram::Verify
> DICompositeType::Verify
> DIType::Verify
> that try to access the context link:
> if (getContext() && !getContext().Verify())
>   return false;
>
> And the Verify functions are called from 10+ files.
>
> If we are oaky with not calling getContext().Verify(), option a) appears
> much better to me.
> For printing | debugging purpose, we agree that not being able to trace the
> pointer is okay.
>
> Let me know your thoughts,
> Manman
>
> On Jun 21, 2013, at 5:01 PM, Eric Christopher wrote:
>
> On Fri, Jun 21, 2013 at 4:59 PM, David Blaikie <dblaikie at gmail.com> wrote:
>
> On Fri, Jun 21, 2013 at 4:50 PM, Manman Ren <mren at apple.com> wrote:
>
>
> On Jun 21, 2013, at 4:41 PM, David Blaikie <dblaikie at gmail.com> wrote:
>
> The reason I want a flag is to avoid the need to update the existing testing
> cases while this is a work in progress.
> I believe migrating one field means updating the existing testing cases?
>
>
> It would, yes - and that's how you'd get coverage to know your change
> was stable. You'll have to update all the tests eventually anyway &
> doing so incrementally isn't substantially more expensive in my
> experience. Perhaps there's something unique to this migration that
> would make it so?
>
>
> I am going to update the existing testing cases locally to have the test
> coverage, but I don't want to check in the changes often.
> Once we turn the flag on by default and remove the other path, I will submit
> the changes on the testing cases.
>
>
> I'd rather skip the flag & have the test coverage in the open along
> with the changes. Having changes without testing is bad. I don't see a
> benefit to you keeping the test coverage locally.
>
> Another point to have a flag is to check in the patches in steps: DIBuilder
> changes, changes related to the map, and DwarfDebug changes.
> Without the flag, when I migrate the first field, I have to make sure it
> works from frontend all the way to the backend.
>
>
> True enough, but I don't think that's a great drawback.
>
> Though technically, if you really wanted, you could start at the
> backend & make the implementation conditional (check the type of the
> field, if it's an MDNode - use that, if it's a string, do the lookup)
> - implement that, then port DIBuilder, then remove the conditionality.
> Though I don't like it a lot & I think the changes will be small
> enough that frontend to backend will be still be understandable
> patches.
>
>
> All of this.
>
> -eric
>
>
> _______________________________________________
> llvm-commits mailing list
> llvm-commits at cs.uiuc.edu
> http://lists.cs.uiuc.edu/mailman/listinfo/llvm-commits
>
>
>