[LLVMdev] [RFC] Separating Metadata from the Value hierarchy
    Adrian Prantl 
    aprantl at apple.com
       
    Thu Nov 13 11:17:06 PST 2014
    
    
  
> On Nov 12, 2014, at 1:00 PM, Duncan P. N. Exon Smith <dexonsmith at apple.com> wrote:
> 
> If you don't care about function-local metadata and debug info
> intrinsics, skip ahead to the section on assembly syntax in case you
> have comments on that.
> 
>> On 2014-Nov-09, at 17:02, Duncan P. N. Exon Smith <dexonsmith at apple.com> wrote:
>> 
>> 2. No more function-local metadata.
>> 
>>   AFAICT, function-local metadata is *only* used for indirect
>>   references to instructions and arguments in `@llvm.dbg.value` and
>>   `@llvm.dbg.declare` intrinsics.  The first argument of the following
>>   is an example:
>> 
>>       call void @llvm.dbg.value(metadata !{i32 %val}, metadata !0,
>>                                 metadata !1)
>> 
>>   Note that the debug info people uniformly seem to dislike the status
>>   quo, since it's awkward to get from a `Value` to the corresponding
>>   intrinsic.
>> 
>>   Upgrade path: Instead of using an intrinsic that references a
>>   function-local value through an `MDNode`, attach metadata to the
>>   corresponding argument or instruction, or to the terminating
>>   instruction of the basic block.  (This requires new support for
>>   attaching metadata to function arguments, which I'll have to add for
>>   debug info anyway.)
> 
> llvm::Argument attachments are hard
> ===================================
> 
> I've been looking at prototyping metadata attachments to
> `llvm::Argument`, which is key to replacing debug info intrinsics.
> 
> It's a fairly big piece of new IR, and comes with its own subtle
> semantic decisions.  What do you do with metadata attached to arguments
> when you inline a function?  If the arguments are remapped to other
> instructions (or arguments), they may have metadata of the same kind
> attached.  Does one win?  Which one?  Or are they merged?  What if the
> arguments get remapped to constants?  What about when a function is
> cloned?
> 
> While the rest of this metadata-is-not-a-value proposal is effectively
> NFC, this `Argument` part could introduce problems.  If I rewrite debug
> info intrinsics as argument attachments and then immediately split
> `Metadata` from `Value`, any semantic subtleties will be difficult to
> diagnose in the noise of the rest of the changes.
> 
> While I was looking at this as a shortcut to avoid porting
> function-local metadata, I think it introduces more points of failure
> than problems it solves.
One thing to consider is that there are cases were we describe function arguments without referencing the argument in the intrinsic at all. Currently, if you compile 
  $ cat struct.c
  struct s { int a; int b; };
  int foo(struct s s1) { return s1.a; }
  $ clang -g -O1 -arch x86_64 -S -emit-llvm struct.c
we cannot preserve the debug info for the argument so it is turned into an intrinsic describing an undef value.
  ; Function Attrs: nounwind readnone ssp uwtable
  define i32 @foo(i64 %s1.coerce) #0 {
  entry:
    %s1.sroa.0.0.extract.trunc = trunc i64 %s1.coerce to i32
    tail call void @llvm.dbg.declare(metadata !18, metadata !14, metadata !19), !dbg !20
    ret i32 %s1.sroa.0.0.extract.trunc, !dbg !21
  }
  !18 = metadata !{%struct.s* undef}
Note that it is critical for this DIVariable to make it into the debug info even if it is undefined, or the function arguments won’t match the function signature.
-- adrian
> 
> Limited function-local metadata
> -------------------------------
> 
> Instead, I propose porting a limited form of function-local metadata,
> whose use is severely restricted but covers our current use cases (keep
> reading for details).  We can defer replacing debug info intrinsics
> until the infrastructure has settled down and is stable.
> 
> Assembly syntax
> ===============
> 
> This is a good time to talk about assembly syntax, since it will
> demonstrate what I'm thinking for function-local metadata.
> 
> Assembly syntax is important.  It's our view into the IR.  If metadata
> is typeless (and not a `Value`), that should be reflected in the
> assembly syntax.
> 
> Old syntax
> ----------
> 
> There are four places metadata can be used/reference in the IR.
> 
> 1. Operands of `MDNode`.
> 
>        !0 = metadata !{metadata !"string", metadata !1, i32* @global)
> 
>    Notice that the `@global` argument is not metadata: it's an
>    `llvm::Constant`.  In the new IR, these will be wrapped in a
>    `ValueAsMetadata` instance.
> 
> 2. Operands of `NamedMDNode` (yes, they're different).
> 
>        !named = metadata !{metadata !0, metadata !1}
> 
>    These operands are always `MDNode`.
> 
> 3. Attachments to instructions.
> 
>        %inst = load i32* @global, !dbg !0
> 
>    Notice that we already skip the `metadata` type here.
> 
> 4. Arguments to intrinsics.
> 
>        call void @llvm.dbg(metadata !{i32 %inst}, metadata !0)
> 
>    The first argument is subtle -- that's a function-local `MDNode`
>    with `%inst` as its only operand.
> 
>    In the new IR, the second operand will be a `MetadataAsValue`
>    instance that contains a reference to the `MDNode` numbered `!0`.
> 
> New syntax
> ----------
> 
> Types only make sense when an operand can be an `llvm::Value`.  Let's
> remove them where they don't make sense.
> 
> I propose the following syntax for the above examples, using a new
> keyword, `value`:
> 
> 1. Operands of `MDNode`.  Drop `metadata`, since metadata doesn't have
>    types.  Use `value` to indicate a wrapped `llvm::Value`.
> 
>        !0 = !{!"string", !1, value i32* @global)
> 
> 2. Operands of `NamedMDNode`.  Drop `metadata`, since metadata doesn't
>    have types.
> 
>        !named = !{!0, !1}
> 
> 3. Attachments to instructions.  No change!
> 
>        %inst = load i32* @global, !dbg !0
> 
> 4. Arguments to intrinsics.  Keep `metadata`, since here it's wrapped
>    into an `llvm::Value` (which has a type).  Use `value` to indicate a
>    metadata-wrapped value.
> 
>        call void @llvm.dbg(metadata value i32 %inst, metadata !0)
> 
>    Notice that the first argument doesn't use an `MDNode` anymore.
> 
> Restrictions on function-local metadata
> =======================================
> 
> In the new IR, function-local metadata (say, `LocalValueAsMetadata`)
> *cannot* be used as an operand to metadata -- the only legal place for
> it is in a `MetadataAsValue` instance.  This prevents the additional
> complexity from poisoning the rest of the metadata hierarchy.
> 
> Effectively, this restricts function-local metadata to direct operands
> of intrinsics.
> 
    
    
More information about the llvm-dev
mailing list