[LLVMdev] [RFC] Separating Metadata from the Value hierarchy

Duncan P. N. Exon Smith dexonsmith at apple.com
Sun Nov 9 17:02:43 PST 2014


TL;DR: If you use metadata (especially if it's out-of-tree), check the
numbered list of lost functionality below to see whether I'm trying to
break your compiler permanently.

In response to my recent commits (e.g., [1]) that changed API from
`MDNode` to `Value`, Eric had a really interesting idea [2] -- split
metadata entirely from the `Value` hierarchy, and drop general support
for metadata RAUW (replace-all-uses-with).

After hacking together some example code, this seems to me to be
overwhelmingly the right direction.  See the attached metadata-v2.patch for my
sketch of what the current metadata primitives might look like in the
new hierarchy (includes LLVMContextImpl uniquing support).

The initial motivation was to avoid the API weaknesses caused by having
non-`MDNode` metadata that inherits from `Value`.  In particular,
instead of changing API from `MDNode` to `Value`, change it to a base
class called `Metadata`, which sheds the underused and expensive `Value`
base class entirely.

The implications are broader: an enormous reduction in complexity (and
overhead) for metadata.
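
To make the shape of this concrete, here is a minimal, self-contained
sketch (not the attached patch).  All class layouts below are
assumptions for illustration only: a `Metadata` base that carries just
a kind tag, with no type and no use-list, and an API that accepts
`Metadata` rather than `Value`.  `countStrings()` is a hypothetical
helper, not an existing LLVM function.

```cpp
#include <cassert>
#include <string>
#include <utility>
#include <vector>

// Hypothetical lightweight base class: only a kind tag, no use-list,
// so there is nothing to maintain for general RAUW.
class Metadata {
public:
  enum Kind { StringKind, NodeKind };
  Kind getKind() const { return K; }

protected:
  explicit Metadata(Kind K) : K(K) {}

private:
  Kind K;
};

// Stand-in for a metadata string.
class MDString : public Metadata {
  std::string Str;

public:
  explicit MDString(std::string S)
      : Metadata(StringKind), Str(std::move(S)) {}
  const std::string &getString() const { return Str; }
};

// Stand-in for a metadata node; operands may be null.
class MDNode : public Metadata {
  std::vector<const Metadata *> Ops;

public:
  explicit MDNode(std::vector<const Metadata *> Ops)
      : Metadata(NodeKind), Ops(std::move(Ops)) {}
  unsigned getNumOperands() const {
    return static_cast<unsigned>(Ops.size());
  }
  const Metadata *getOperand(unsigned I) const { return Ops[I]; }
};

// An API written against Metadata instead of Value: counts the strings
// reachable from MD (assumes the graph is acyclic).
unsigned countStrings(const Metadata &MD) {
  if (MD.getKind() == Metadata::StringKind)
    return 1;
  const auto &N = static_cast<const MDNode &>(MD);
  unsigned Count = 0;
  for (unsigned I = 0, E = N.getNumOperands(); I != E; ++I)
    if (const Metadata *Op = N.getOperand(I))
      Count += countStrings(*Op);
  return Count;
}
```

The point of the sketch is only that callers no longer need anything
from `Value` to walk or inspect metadata.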

This change implies a minor (major?) loss of functionality for metadata,
but Eric and I suspect that the only hard-to-fix user is debug info
itself, whose IR infrastructure I'm rewriting anyway.

Here is what we'd lose:

 1. No more arbitrary RAUW of metadata.

    We'd keep support for RAUW of temporary MDNodes for use as forward
    declarations (when parsing assembly or constructing cyclic graphs),
    but we'd drop RAUW support for all other metadata.

    Note that we'd also keep support for RAUW of `Value` operands of
    metadata.

    If the RAUW of an operand causes a uniquing collision, uniquing
    would be dropped for that node.  This matches the current behaviour
    when an operand goes to null.

    Upgrade path: none.

 2. No more function-local metadata.

    AFAICT, function-local metadata is *only* used for indirect
    references to instructions and arguments in `@llvm.dbg.value` and
    `@llvm.dbg.declare` intrinsics.  The first argument of the following
    is an example:

        call void @llvm.dbg.value(metadata !{i32 %val}, metadata !0,
                                  metadata !1)

    Note that the debug info people uniformly seem to dislike the status
    quo, since it's awkward to get from a `Value` to the corresponding
    intrinsic.

    Upgrade path: Instead of using an intrinsic that references a
    function-local value through an `MDNode`, attach metadata to the
    corresponding argument or instruction, or to the terminating
    instruction of the basic block.  (This requires new support for
    attaching metadata to function arguments, which I'll have to add for
    debug info anyway.)
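
As an illustration of the uniquing rule in item 1, here is a toy
sketch of what "uniquing would be dropped for that node" could look
like.  Everything here is an assumption for illustration (`Context`,
`Node`, `replaceOperand()` and the map-based uniquing table are
hypothetical, and are not the LLVMContextImpl code in the patch).

```cpp
#include <algorithm>
#include <cassert>
#include <functional>
#include <map>
#include <memory>
#include <utility>
#include <vector>

struct Value { int Id; }; // hypothetical stand-in for llvm::Value

using OpList = std::vector<const Value *>;

// Total order on operand lists, using std::less for the pointers.
struct OpsLess {
  bool operator()(const OpList &A, const OpList &B) const {
    return std::lexicographical_compare(A.begin(), A.end(), B.begin(),
                                        B.end(),
                                        std::less<const Value *>());
  }
};

class Context;

class Node {
  friend class Context;
  OpList Ops;
  bool Uniqued = true;
  explicit Node(OpList Ops) : Ops(std::move(Ops)) {}

public:
  bool isUniqued() const { return Uniqued; }
  const Value *getOperand(unsigned I) const { return Ops[I]; }
};

class Context {
  std::map<OpList, Node *, OpsLess> UniquingMap;
  std::vector<std::unique_ptr<Node>> Storage; // owns every node

public:
  // Return the unique node for Ops, creating it if necessary.
  Node *get(const OpList &Ops) {
    auto It = UniquingMap.find(Ops);
    if (It != UniquingMap.end())
      return It->second;
    Storage.push_back(std::unique_ptr<Node>(new Node(Ops)));
    Node *N = Storage.back().get();
    UniquingMap[Ops] = N;
    return N;
  }

  // Rewrite one operand in place.  If the new operand list collides
  // with an existing uniqued node, N simply stops being uniqued.
  void replaceOperand(Node *N, unsigned I, const Value *New) {
    if (N->Uniqued)
      UniquingMap.erase(N->Ops); // key is about to change
    N->Ops[I] = New;
    if (!N->Uniqued)
      return;
    if (!UniquingMap.insert({N->Ops, N}).second)
      N->Uniqued = false; // collision: drop uniquing for N
  }
};
```

Note that the colliding node is left alone rather than merged into the
existing one; merging would need exactly the general RAUW that item 1
proposes to remove.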

Is this going to break your compiler?  How?  Why is your use case worth
supporting?

[1]: http://lists.cs.uiuc.edu/pipermail/llvm-commits/Week-of-Mon-20141103/242667.html
    "r221167 - IR: MDNode => Value: Instruction::getAllMetadataOtherThanDebugLoc()"
[2]: http://lists.cs.uiuc.edu/pipermail/llvmdev/2014-November/078581.html
    "Re: First-class debug info IR: MDLocation"

-------------- next part --------------
A non-text attachment was scrubbed...
Name: metadata-v2.patch
Type: application/octet-stream
Size: 31022 bytes
Desc: not available
URL: <http://lists.llvm.org/pipermail/llvm-dev/attachments/20141109/4d4fffcf/attachment.obj>

