[LLVMdev] First-class debug info IR: MDLocation

Wed Nov 5 15:47:24 PST 2014

Hi Duncan,

Here's a bit of a long delayed reply, I've been thinking about this for a
bit now and have a bit (I think) of a more firm idea on how I see this.

> Always takes a patch to draw people in :).
>
>
Agreed. I've been looking at the patches as they've been going in. I think
it's mostly making me hate the Value->User hierarchy.

> > Using `Value` instead of `MDNode`
> > =================================
> >
> > A number of APIs expect `MDNode` -- previously, the only referenceable
> > type of metadata -- but this patch (and the ones that will follow) have
> > referenceable metadata that *do not* inherit from `MDNode`.  Metadata
> > APIs such as `Instruction::getMetadata()` and
> > `NamedMDNode::getOperand()` need to return non-`MDNode` metadata.
> >
> > To me, this change is a red flag
>
> This bothers me too -- which is why I highlighted it -- but I don't
> see any alternatives.  It seems like a natural fallout of the rest of
> the proposal.
>
>
This hurts a lot. It's by far the worst part of this currently.

I think ideally I'd almost rather see Metadata not inherit from Value.
Metadata doesn't need to be typed, it describes typed things. It isn't or
doesn't produce a value, it's just a side structure that we find worth
serializing. A container of some bits that can reference IR when necessary.

The problematic thing is the User-ness of Metadata. It definitely wants to
point to Values/Users, it definitely can use a list of where it's used for
things like RAUW - which we still use for temporary MDNodes right now. I
think that aspect is a good thing, but it doesn't necessarily need to be
blazingly fast.

> `MDUser`/`CustomMD` inherits from `User` so that its subclasses can
> leverage the use-list infrastructure (8B per operand, fast RAUW).
>
> > I also have to ask because I can't currently see it: what does debug
> info being metadata buy us?
>
> I suppose it buys us:
>
>   - the guarantee that debug info doesn't modify optimizations or code
>     generation, and
>
>   - the flexibility for optimizations to ignore/drop it when they're
>     not smart enough to update it.
>

Agreed. I think this is important.

> > How much code is simplified by that, and at what cost?
>
> I think that's hard to quantify.
>
>
Quite a bit IMO. If we could get rid of dbg_declare and dbg_value it'd be
even more. Ideally, as we've discussed in the past, these could be metadata
on instructions or <things that turn into lexical blocks>. Some work would
need to be done to handle this, but we'd get rid of the instructions out of
the IR.

> I suppose the obvious alternative is to rewrite debug info from the
> ground up, without using metadata at all.  I considered this, but
> didn't find a compelling argument for it.  Main arguments against it:
> it would be harder to implement incrementally, and it would increase
> the amount of non-code IR.
>
> Moreover, once we have specific subclasses and bitcode support for
> debug info types, moving away from metadata (or even the `Value`
> hierarchy entirely) would be an incremental step.
>
> Do you have any specific alternatives mind?
>

How about the above discussion? I think it might end up being a bit more
work in the short term, but it moves metadata to a separate hierarchy. I
don't think a fast RAUW is too important (call it once per type - I think
we RAUW much more on instructions/constants/etc and so speed is an
advantage there). Module already has a separate set of iterators for named
metadata so that side of things isn't too bad.

Thoughts?

-eric
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.llvm.org/pipermail/llvm-dev/attachments/20141105/8eea9b2a/attachment.html>