[LLVMdev] [RFC] Separating Metadata from the Value hierarchy

Duncan P. N. Exon Smith dexonsmith at apple.com
Wed Nov 12 13:00:19 PST 2014


If you don't care about function-local metadata and debug info
intrinsics, skip ahead to the section on assembly syntax in case you
have comments on that.

> On 2014-Nov-09, at 17:02, Duncan P. N. Exon Smith <dexonsmith at apple.com> wrote:
> 
> 2. No more function-local metadata.
> 
>    AFAICT, function-local metadata is *only* used for indirect
>    references to instructions and arguments in `@llvm.dbg.value` and
>    `@llvm.dbg.declare` intrinsics.  The first argument of the following
>    is an example:
> 
>        call void @llvm.dbg.value(metadata !{i32 %val}, metadata !0,
>                                  metadata !1)
> 
>    Note that the debug info people uniformly seem to dislike the status
>    quo, since it's awkward to get from a `Value` to the corresponding
>    intrinsic.
> 
>    Upgrade path: Instead of using an intrinsic that references a
>    function-local value through an `MDNode`, attach metadata to the
>    corresponding argument or instruction, or to the terminating
>    instruction of the basic block.  (This requires new support for
>    attaching metadata to function arguments, which I'll have to add for
>    debug info anyway.)

llvm::Argument attachments are hard
===================================

I've been looking at prototyping metadata attachments to
`llvm::Argument`, which is key to replacing debug info intrinsics.

It's a fairly big piece of new IR, and comes with its own subtle
semantic decisions.  What do you do with metadata attached to arguments
when you inline a function?  If the arguments are remapped to other
instructions (or arguments), they may have metadata of the same kind
attached.  Does one win?  Which one?  Or are they merged?  What if the
arguments get remapped to constants?  What about when a function is
cloned?

While the rest of this metadata-is-not-a-value proposal is effectively
NFC, this `Argument` part could introduce problems.  If I rewrite debug
info intrinsics as argument attachments and then immediately split
`Metadata` from `Value`, any semantic subtleties will be difficult to
diagnose in the noise of the rest of the changes.

While I was looking at this as a shortcut to avoid porting
function-local metadata, I think it introduces more points of failure
than problems it solves.

Limited function-local metadata
-------------------------------

Instead, I propose porting a limited form of function-local metadata,
whose use is severely restricted but covers our current use cases (keep
reading for details).  We can defer replacing debug info intrinsics
until the infrastructure has settled down and is stable.

Assembly syntax
===============

This is a good time to talk about assembly syntax, since it will
demonstrate what I'm thinking for function-local metadata.

Assembly syntax is important.  It's our view into the IR.  If metadata
is typeless (and not a `Value`), that should be reflected in the
assembly syntax.

Old syntax
----------

There are four places metadata can be used/reference in the IR.

 1. Operands of `MDNode`.

        !0 = metadata !{metadata !"string", metadata !1, i32* @global)

    Notice that the `@global` argument is not metadata: it's an
    `llvm::Constant`.  In the new IR, these will be wrapped in a
    `ValueAsMetadata` instance.

 2. Operands of `NamedMDNode` (yes, they're different).

        !named = metadata !{metadata !0, metadata !1}

    These operands are always `MDNode`.

 3. Attachments to instructions.

        %inst = load i32* @global, !dbg !0

    Notice that we already skip the `metadata` type here.

 4. Arguments to intrinsics.

        call void @llvm.dbg(metadata !{i32 %inst}, metadata !0)

    The first argument is subtle -- that's a function-local `MDNode`
    with `%inst` as its only operand.

    In the new IR, the second operand will be a `MetadataAsValue`
    instance that contains a reference to the `MDNode` numbered `!0`.

New syntax
----------

Types only make sense when an operand can be an `llvm::Value`.  Let's
remove them where they don't make sense.

I propose the following syntax for the above examples, using a new
keyword, `value`:

 1. Operands of `MDNode`.  Drop `metadata`, since metadata doesn't have
    types.  Use `value` to indicate a wrapped `llvm::Value`.

        !0 = !{!"string", !1, value i32* @global)

 2. Operands of `NamedMDNode`.  Drop `metadata`, since metadata doesn't
    have types.

        !named = !{!0, !1}

 3. Attachments to instructions.  No change!

        %inst = load i32* @global, !dbg !0

 4. Arguments to intrinsics.  Keep `metadata`, since here it's wrapped
    into an `llvm::Value` (which has a type).  Use `value` to indicate a
    metadata-wrapped value.

        call void @llvm.dbg(metadata value i32 %inst, metadata !0)

    Notice that the first argument doesn't use an `MDNode` anymore.

Restrictions on function-local metadata
=======================================

In the new IR, function-local metadata (say, `LocalValueAsMetadata`)
*cannot* be used as an operand to metadata -- the only legal place for
it is in a `MetadataAsValue` instance.  This prevents the additional
complexity from poisoning the rest of the metadata hierarchy.

Effectively, this restricts function-local metadata to direct operands
of intrinsics.





More information about the llvm-dev mailing list