[LLVMdev] [RFC] Separating Metadata from the Value hierarchy

Philip Reames listmail at philipreames.com
Thu Nov 13 13:34:25 PST 2014


On 11/12/2014 01:00 PM, Duncan P. N. Exon Smith wrote:
> If you don't care about function-local metadata and debug info
> intrinsics, skip ahead to the section on assembly syntax in case you
> have comments on that.
>
>> On 2014-Nov-09, at 17:02, Duncan P. N. Exon Smith <dexonsmith at apple.com> wrote:
>>
>> 2. No more function-local metadata.
>>
>>     AFAICT, function-local metadata is *only* used for indirect
>>     references to instructions and arguments in `@llvm.dbg.value` and
>>     `@llvm.dbg.declare` intrinsics.  The first argument of the following
>>     is an example:
>>
>>         call void @llvm.dbg.value(metadata !{i32 %val}, metadata !0,
>>                                   metadata !1)
>>
>>     Note that the debug info people uniformly seem to dislike the status
>>     quo, since it's awkward to get from a `Value` to the corresponding
>>     intrinsic.
>>
>>     Upgrade path: Instead of using an intrinsic that references a
>>     function-local value through an `MDNode`, attach metadata to the
>>     corresponding argument or instruction, or to the terminating
>>     instruction of the basic block.  (This requires new support for
>>     attaching metadata to function arguments, which I'll have to add for
>>     debug info anyway.)
>
> llvm::Argument attachments are hard
> ===================================
>
> I've been looking at prototyping metadata attachments to
> `llvm::Argument`, which is key to replacing debug info intrinsics.
>
> It's a fairly big piece of new IR, and comes with its own subtle
> semantic decisions.  What do you do with metadata attached to arguments
> when you inline a function?  If the arguments are remapped to other
> instructions (or arguments), they may have metadata of the same kind
> attached.  Does one win?  Which one?  Or are they merged?  What if the
> arguments get remapped to constants?  What about when a function is
> cloned?
>
> While the rest of this metadata-is-not-a-value proposal is effectively
> NFC, this `Argument` part could introduce problems.  If I rewrite debug
> info intrinsics as argument attachments and then immediately split
> `Metadata` from `Value`, any semantic subtleties will be difficult to
> diagnose in the noise of the rest of the changes.
>
> While I was looking at this as a shortcut to avoid porting
> function-local metadata, I think it introduces more points of failure
> than problems it solves.
>
> Limited function-local metadata
> -------------------------------
>
> Instead, I propose porting a limited form of function-local metadata,
> whose use is severely restricted but covers our current use cases (keep
> reading for details).  We can defer replacing debug info intrinsics
> until the infrastructure has settled down and is stable.
This seems entirely reasonable.

Long term, supporting metadata on arguments would be useful, but we 
should also have a broader discussion about the role of attributes and 
metadata before we do that.
>
> Assembly syntax
> ===============
>
> This is a good time to talk about assembly syntax, since it will
> demonstrate what I'm thinking for function-local metadata.
>
> Assembly syntax is important.  It's our view into the IR.  If metadata
> is typeless (and not a `Value`), that should be reflected in the
> assembly syntax.
>
> Old syntax
> ----------
>
> There are four places metadata can be used/referenced in the IR.
>
>   1. Operands of `MDNode`.
>
>          !0 = metadata !{metadata !"string", metadata !1, i32* @global}
>
>      Notice that the `@global` argument is not metadata: it's an
>      `llvm::Constant`.  In the new IR, these will be wrapped in a
>      `ValueAsMetadata` instance.
>
>   2. Operands of `NamedMDNode` (yes, they're different).
>
>          !named = metadata !{metadata !0, metadata !1}
>
>      These operands are always `MDNode`.
>
>   3. Attachments to instructions.
>
>          %inst = load i32* @global, !dbg !0
>
>      Notice that we already skip the `metadata` type here.
>
>   4. Arguments to intrinsics.
>
>          call void @llvm.dbg(metadata !{i32 %inst}, metadata !0)
>
>      The first argument is subtle -- that's a function-local `MDNode`
>      with `%inst` as its only operand.
>
>      In the new IR, the second operand will be a `MetadataAsValue`
>      instance that contains a reference to the `MDNode` numbered `!0`.
>
> New syntax
> ----------
>
> Types only make sense when an operand can be an `llvm::Value`.  Let's
> remove them where they don't make sense.
Hm, how does this interact with range metadata?  Currently, the types of
the values making up the range have to match the type of the instruction
they're attached to.  This seems like it could be a change in behaviour.
Thinking about it, it doesn't seem like a bad change, but it is a
change.  Are there other cases like this?
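
For reference, the constraint I mean looks roughly like this in today's
syntax (a sketch only; `%p` and the `!0` numbering are placeholders I
made up):

    %v = load i8* %p, !range !0
    ...
    !0 = metadata !{i8 0, i8 2}

The `i8` on the two range bounds has to agree with the `i8` result of
the load.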
>
> I propose the following syntax for the above examples, using a new
> keyword, `value`:
>
>   1. Operands of `MDNode`.  Drop `metadata`, since metadata doesn't have
>      types.  Use `value` to indicate a wrapped `llvm::Value`.
>
>          !0 = !{!"string", !1, value i32* @global}
>
>   2. Operands of `NamedMDNode`.  Drop `metadata`, since metadata doesn't
>      have types.
>
>          !named = !{!0, !1}
>
>   3. Attachments to instructions.  No change!
>
>          %inst = load i32* @global, !dbg !0
>
>   4. Arguments to intrinsics.  Keep `metadata`, since here it's wrapped
>      into an `llvm::Value` (which has a type).  Use `value` to indicate a
>      metadata-wrapped value.
>
>          call void @llvm.dbg(metadata value i32 %inst, metadata !0)
>
>      Notice that the first argument doesn't use an `MDNode` anymore.
>
> Restrictions on function-local metadata
> =======================================
>
> In the new IR, function-local metadata (say, `LocalValueAsMetadata`)
> *cannot* be used as an operand to metadata -- the only legal place for
> it is in a `MetadataAsValue` instance.  This prevents the additional
> complexity from poisoning the rest of the metadata hierarchy.
>
> Effectively, this restricts function-local metadata to direct operands
> of intrinsics.
This seems entirely reasonable.
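
To check my understanding, here is how I read the restriction in the
proposed syntax.  This reuses the `@llvm.dbg` shorthand and the `%inst`
and `!0` names from the examples above; the second call is my own
illustration of what I assume would be rejected, not something from the
proposal:

    ; OK: the function-local value is a direct operand of the intrinsic
    call void @llvm.dbg(metadata value i32 %inst, metadata !0)

    ; Not OK (as I read it): the function-local value is an operand of
    ; a metadata node rather than of the intrinsic itself
    call void @llvm.dbg(metadata !{value i32 %inst}, metadata !0)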

Philip


